Package 'discent'

Title: Estimation of Deme Inbreeding Spatial Coefficients with Gradient Descent
Description: TODO
Authors: Nick Brazeau [aut, cre], Bob Verity [aut]
Maintainer: Nick Brazeau <[email protected]>
License: MIT + file LICENSE
Version: 0.5.0
Built: 2026-05-22 05:52:58 UTC
Source: https://github.com/nickbrazeau/discent


Calculate Hessian Matrix from Loss Function

Description

Calculate the Hessian, its eigenvalues and eigenvectors, and its condition number from the loss function using central finite differences

Usage

calculate_hessian_eigen(mod, discdat)

Arguments

mod

class of DISCresult; output from disc

discdat

dataframe; The genetic-geographic data by deme (K). Must contain columns: smpl1, smpl2, deme1, deme2, gendist, geodist

Value

A list containing:

  • Hessian: Hessian matrix

  • Eigen: Eigenvalues and eigenvectors from the Hessian

  • KappaH: Condition number of the Hessian matrix
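
As a rough illustration of the quantities listed above, the following sketch applies central finite differences to a toy quadratic loss. The names toy_loss, fd_hessian, and the step size h are hypothetical and are not part of the package; the condition number is assumed here to be the ratio of the largest to the smallest absolute eigenvalue, which may differ from how KappaH is computed internally.

```r
# Toy loss standing in for the DISC loss (hypothetical, illustration only)
toy_loss <- function(theta) {
  theta[1]^2 + 10 * theta[2]^2 + theta[1] * theta[2]
}

# Central finite-difference approximation to the Hessian of f at theta
fd_hessian <- function(f, theta, h = 1e-4) {
  k <- length(theta)
  H <- matrix(0, k, k)
  for (i in seq_len(k)) {
    for (j in seq_len(k)) {
      ei <- replace(numeric(k), i, h)  # step of size h along coordinate i
      ej <- replace(numeric(k), j, h)  # step of size h along coordinate j
      H[i, j] <- (f(theta + ei + ej) - f(theta + ei - ej) -
                    f(theta - ei + ej) + f(theta - ei - ej)) / (4 * h^2)
    }
  }
  (H + t(H)) / 2  # symmetrize to remove rounding asymmetry
}

H <- fd_hessian(toy_loss, c(0.5, -0.2))
eig <- eigen(H, symmetric = TRUE)

# Assumed definition: ratio of largest to smallest absolute eigenvalue
kappa_H <- max(abs(eig$values)) / min(abs(eig$values))
```

A large condition number suggests the loss surface is much flatter in some directions than in others, so some parameters are likely to be poorly identified.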


Identify Deme Inbreeding Spatial Coefficients in Continuous Space

Description

This function estimates deme-specific inbreeding coefficients and a global migration rate from genetic and geographic distance data using an isolation-by-distance model. The model assumes that genetic similarity between samples decreases exponentially with geographic distance, modulated by deme-specific inbreeding coefficients.

Usage

disc(
  discdat,
  start_params = NULL,
  learningrate = 0.001,
  b1 = 0.9,
  b2 = 0.999,
  e = 1e-08,
  steps = 1000,
  thin = 1,
  report_progress = TRUE,
  return_verbose = FALSE,
  diagnostics = TRUE
)

Arguments

discdat

dataframe; The genetic-geographic data by deme (K). Must contain columns: smpl1, smpl2, deme1, deme2, gendist, geodist

start_params

named double vector; vector of start parameters. Names must match deme names, plus one parameter named "m" for migration rate

learningrate

double; Learning rate (alpha) for gradient descent optimization. Default: 0.001

b1

double; Exponential decay rate for first moment estimate in Adam optimizer. Default: 0.9

b2

double; Exponential decay rate for second moment estimate in Adam optimizer. Default: 0.999

e

double; Small constant for numerical stability in Adam optimizer. Default: 1e-8

steps

integer; Number of optimization steps. Default: 1000

thin

integer; Thinning interval for stored iterations (1 = store all). Default: 1

report_progress

logical; Whether to display progress bar during optimization. Default: TRUE

return_verbose

logical; Whether to return full optimization trajectory (TRUE) or just final results (FALSE). Full trajectory can be memory intensive. Default: FALSE

diagnostics

logical; Whether to return diagnostics (described below). Default: TRUE
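
To show how learningrate, b1, b2, e, and steps interact, here is a minimal sketch of a generic Adam update applied to a toy quadratic. The function adam_minimize and the gradient toy_grad are hypothetical; this is the textbook Adam rule, not necessarily the exact update that disc performs internally.

```r
# Generic Adam optimizer (illustrative; argument names mirror those of disc)
adam_minimize <- function(grad, theta, learningrate = 0.001, b1 = 0.9,
                          b2 = 0.999, e = 1e-8, steps = 1000) {
  m1 <- numeric(length(theta))  # first moment (running mean of gradients)
  m2 <- numeric(length(theta))  # second moment (running mean of squared gradients)
  for (t in seq_len(steps)) {
    g <- grad(theta)
    m1 <- b1 * m1 + (1 - b1) * g
    m2 <- b2 * m2 + (1 - b2) * g^2
    m1_hat <- m1 / (1 - b1^t)   # bias correction for the zero initialization
    m2_hat <- m2 / (1 - b2^t)
    theta <- theta - learningrate * m1_hat / (sqrt(m2_hat) + e)
  }
  theta
}

# Gradient of the toy loss sum((theta - c(1, -2))^2), minimized at c(1, -2)
toy_grad <- function(theta) 2 * (theta - c(1, -2))

fit <- adam_minimize(toy_grad, theta = c(0, 0), learningrate = 0.05, steps = 2000)
```

Because each Adam step is roughly learningrate in size early in the run, a small learningrate combined with few steps can stop well short of the optimum.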

Details

The input dataframe must have exactly these column names in order:

  • smpl1, smpl2: Sample identifiers

  • deme1, deme2: Deme (location) identifiers

  • gendist: Pairwise genetic distance in (0,1)

  • geodist: Pairwise geographic distance

The start_params vector must contain:

  • One parameter per unique deme (named with deme identifiers)

  • One parameter named "m" for the migration rate

  • All F parameters must be in (0,1) (inbreeding coefficients)

The model assumes: E[r_{ij}] = \frac{F_i + F_j}{2} \exp(-d_{ij}/m), where r_{ij} is genetic relatedness, F_i is deme i's inbreeding coefficient, d_{ij} is geographic distance, and m is the migration rate parameter.
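
The model equation can be evaluated directly. The helper expected_relatedness below is a hypothetical name used only for illustration, and the parameter values are arbitrary.

```r
# Expected relatedness under the isolation-by-distance model:
# the mean of the two deme inbreeding coefficients, decayed exponentially with distance
expected_relatedness <- function(Fi, Fj, d, m) {
  ((Fi + Fj) / 2) * exp(-d / m)
}

# Two samples from the same deme (distance 0): no decay, so the result equals F
r_same <- expected_relatedness(0.2, 0.2, d = 0, m = 800)

# Two samples from different demes separated by exactly m distance units:
# the mean F of 0.3 is scaled by exp(-1)
r_far <- expected_relatedness(0.2, 0.4, d = 800, m = 800)
```

In this form, m acts as a length scale: at a distance of m, expected relatedness has fallen to exp(-1), about 37%, of its zero-distance value.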

Value

A list of class "DISCresult" containing:

  • Final_Fis: Final inbreeding coefficient estimates

  • Final_m: Final migration rate estimate

  • deme_key: Mapping of deme names to indices

  • cost: Final cost function value(s)

If return_verbose = TRUE, additional elements include:

  • fi_run: F parameter trajectory over iterations

  • m_run: Migration parameter trajectory

  • ci_gradtraj: C_i gradient trajectory

  • b_gradtraj: \beta gradient trajectory

  • m_gradtraj: Migration gradient trajectory

  • ci_1moment, ci_2moment: C_i parameter Adam moments

  • b_1moment, b_2moment: \beta parameter Adam moments

  • m_1moment, m_2moment: Migration parameter Adam moments

If diagnostics = TRUE, additional diagnostics elements are provided from calculate_hessian_eigen, which include:

  • Hessian: Hessian matrix

  • Eigen: Eigenvalues and eigenvectors from the Hessian

  • KappaH: Condition number of the Hessian matrix


Expit (Inverse Logit) Transformation

Description

Converts logit-scale values back to probability scale using the standard expit formula: \text{expit}(p) = \frac{1}{1 + e^{-p}}

Usage

expit(p)

Arguments

p

numeric vector of values in logit space

Details

This function is the inverse of logit. It transforms values from the logit scale (real line) back to probabilities (0,1).

Value

Numeric vector of values in (0,1) (probability scale)
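
As a quick check of the formula, the sketch below defines a stand-in named expit_sketch (hypothetical, so that it does not depend on the package being installed) and compares it with base R's stats::plogis, which computes the same logistic function.

```r
# Stand-in for the expit formula 1 / (1 + exp(-p)); not the package function itself
expit_sketch <- function(p) 1 / (1 + exp(-p))

x <- c(-5, 0, 5)
probs <- expit_sketch(x)

# Base R's logistic distribution function should give the same values
same_as_plogis <- isTRUE(all.equal(probs, plogis(x)))
```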

See Also

logit


Simulated Identity by Descent Data for DISC Analysis

Description

Simulated Identity by Descent Data for DISC Analysis

Usage

IBD_simulation_data

Format

A dataframe with 45 rows and 6 columns representing pairwise genetic and geographic distances:

smpl1, smpl2

Integer sample identifiers (1-10)

deme1, deme2

Factor deme (location) identifiers (1-3)

gendist

Numeric genetic distances in (0,1) based on simulated identity by descent

geodist

Numeric geographic distances (500, 1000) representing spatial separation

Details

This dataset contains pairwise comparisons between samples from 3 demes:

  • Deme 1-2 pairs: Geographic distance = 500

  • Deme 1-3 pairs: Geographic distance = 1000

  • Deme 2-3 pairs: Geographic distance = 1500

The data follows an isolation-by-distance model where genetic similarity decreases exponentially with geographic distance, modulated by deme-specific inbreeding coefficients.

Source

Simulated toy dataset for testing and demonstration purposes. Generated using an exponential decay relationship between genetic relatedness and geographic distance. Not intended for real-world analysis - use for testing DISC functions and understanding expected input format.

Examples

# Load the dataset
data("IBD_simulation_data")

# View structure
str(IBD_simulation_data)

# Basic usage with disc function
## Not run: 
start_params <- c("1" = 0.2, "2" = 0.3, "3" = 0.4, "m" = 800)
result <- disc(IBD_simulation_data, start_params, steps = 100)

## End(Not run)

Check if DISCresult S3 Class

Description

Tests whether an object is of class DISCresult

Usage

is.DISCresult(x)

Arguments

x

Object to test

Value

Logical value: TRUE if object is of class DISCresult, FALSE otherwise


Logit Transformation

Description

Transforms probability values to the logit scale using the standard logit formula: \text{logit}(p) = \log\left(\frac{p}{1-p}\right)

Usage

logit(p)

Arguments

p

numeric vector of values in (0,1) (probability scale)

Details

This function transforms probabilities (0,1) to the logit scale (real line). Values of 0 and 1 will produce -Inf and +Inf respectively. The transformation is used internally in DISC to ensure inbreeding coefficients remain within (0,1).

Value

Numeric vector of values on the real line (logit scale)
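
The sketch below illustrates why optimizing on the logit scale keeps a coefficient inside (0,1): any real-valued step, however large, maps back to a valid probability. The functions logit_sketch and expit_sketch are hypothetical stand-ins for the package functions, and the step of 10 is an arbitrary value chosen for illustration.

```r
# Stand-ins for the package's logit and expit (illustration only)
logit_sketch <- function(p) log(p / (1 - p))
expit_sketch <- function(x) 1 / (1 + exp(-x))

# Round trip: expit undoes logit on the open interval (0,1)
p <- c(0.1, 0.5, 0.9)
round_trip <- expit_sketch(logit_sketch(p))

# Boundary values map to -Inf and +Inf
bounds <- logit_sketch(c(0, 1))

# A large unconstrained step on the logit scale still maps back inside (0,1)
stepped <- expit_sketch(logit_sketch(0.9) + 10)
```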

See Also

expit


Print DISCresult S3 Class

Description

S3 method for printing DISCresult objects with summary information

Usage

## S3 method for class 'DISCresult'
print(x, ...)

Arguments

x

DISCresult object to print

...

Further arguments passed to or from other methods

Value

Invisibly returns the input object. Called for side effect of printing


Draw from Beta-binomial distribution

Description

Draw from Beta-binomial distribution.

Usage

rbetabinom(n = 1, k = 10, alpha = 1, beta = 1)

Arguments

n

number of draws.

k

number of binomial trials.

alpha

first shape parameter of beta distribution.

beta

second shape parameter of beta distribution.
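
One common way to draw from a beta-binomial distribution is as a two-stage mixture: draw a success probability from the beta distribution, then draw a binomial count given that probability. The function rbetabinom_sketch below is a hypothetical illustration of that construction using base R; the package's own rbetabinom may be implemented differently.

```r
# Two-stage beta-binomial draw using base R (illustrative sketch)
rbetabinom_sketch <- function(n = 1, k = 10, alpha = 1, beta = 1) {
  p <- rbeta(n, shape1 = alpha, shape2 = beta)  # one success probability per draw
  rbinom(n, size = k, prob = p)                 # binomial count given that probability
}

set.seed(1)
draws <- rbetabinom_sketch(n = 1000, k = 10, alpha = 2, beta = 2)

# The theoretical mean is k * alpha / (alpha + beta), which is 5 here
draw_mean <- mean(draws)
```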


DISCent Forward Simulator

Description

Uses the DISCent equation, r_{ij} = \frac{F_i + F_j}{2} e^{-d_{ij}/m}, to simulate the expected genetic distances between sample pairs within demes. We assume a beta-binomial model to create realistic "noise" among the pairs, where Y|p \sim \text{Binomial}(n, p) and p \sim \text{Beta}(\mu \phi, (1 - \mu) \phi). Here \mu is the average relatedness (defined by r_{ij}) and \phi is a concentration parameter for that relatedness.

Usage

run_forward_disc(
  true_params,
  geodist_matrix,
  samples_per_deme,
  overdispersion = 200
)

Arguments

true_params

named vector; True F_i and m values to simulate. At least one element must be named "m".

geodist_matrix

named matrix of geodistances; must be an n x n geodistance matrix, with n corresponding to the number of demes and deme names matching the true_params names

samples_per_deme

integer vector; number of pairs within each deme

overdispersion

numeric; overdispersion parameter in the beta-binomial model; Default: 200
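
The beta-binomial noise model from the Description can be sketched for a single deme pair as follows. This is not the package's simulator: n_trials is a hypothetical number of binomial trials, the F, distance, and m values are arbitrary, and treating overdispersion as the concentration parameter \phi is an assumption.

```r
set.seed(42)

# Expected relatedness for one deme pair (arbitrary F_i, F_j, distance, and m)
mu <- ((0.2 + 0.4) / 2) * exp(-500 / 800)

phi <- 200       # concentration; assumed to correspond to overdispersion = 200
n_trials <- 100  # hypothetical number of binomial trials per pair
n_pairs <- 500   # number of simulated sample pairs

# Stage 1: pair-specific relatedness scattered around mu
p <- rbeta(n_pairs, mu * phi, (1 - mu) * phi)

# Stage 2: binomial counts given p, rescaled to a proportion in [0, 1]
y <- rbinom(n_pairs, size = n_trials, prob = p)
sim_r <- y / n_trials
```

Under this parameterization the beta distribution has mean \mu regardless of \phi, and larger \phi values concentrate the simulated values more tightly around \mu.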

Value

A dataframe of class "DISCsim" containing columns:

  • Final_Fis: Final inbreeding coefficient estimates

  • Final_m: Final migration rate estimate

  • deme_key: Mapping of deme names to indices

  • cost: Final cost function value(s)


Summary of DISCresult S3 Class

Description

S3 method for summarizing DISCresult objects

Usage

## S3 method for class 'DISCresult'
summary(object, ...)

Arguments

object

DISCresult object from disc function

...

Further arguments passed to or from other methods

Value

A list containing tidied final inbreeding coefficients and migration rate


Tidy Output Generic Method

Description

Generic method for tidying output from DISC analysis

Usage

tidyout(x)

Arguments

x

Object to tidy (typically a DISCresult)

Value

Method-specific tidied output


Tidy DISCresult Output

Description

S3 method for tidying DISCresult objects into user-friendly format

Usage

## S3 method for class 'DISCresult'
tidyout(x)

Arguments

x

DISCresult object to tidy

Value

A list with two elements:

  • Final_Fis: Dataframe with deme names and final inbreeding coefficients

  • Final_M: Final migration rate estimate