| Title: | Estimation of Deme Inbreeding Spatial Coefficients with Gradient Descent |
|---|---|
| Description: | TODO |
| Authors: | Nick Brazeau [aut, cre], Bob Verity [aut] |
| Maintainer: | Nick Brazeau <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.5.0 |
| Built: | 2026-05-22 05:52:58 UTC |
| Source: | https://github.com/nickbrazeau/discent |
Calculate the Hessian, eigenvalues/eigenvectors, and condition number of the loss function using central finite differences
calculate_hessian_eigen(mod, discdat)
mod |
class of DISCresult; output from |
discdat |
dataframe; The genetic-geographic data by deme (K). Must contain columns:
|
A list containing:
Hessian: Hessian matrix
Eigen: Eigenvalues and eigenvectors from the Hessian
KappaH: Condition number of the Hessian matrix
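To make the three returned quantities concrete, here is a minimal base-R sketch of a central finite-difference Hessian, its eigendecomposition, and the resulting condition number. The helper `fd_hessian`, the step size `h`, and the toy loss are hypothetical illustrations; they are not the package's internal implementation.

```r
# Hypothetical helper: Hessian of a scalar function f at x,
# approximated with central finite differences of step h.
fd_hessian <- function(f, x, h = 1e-4) {
  p <- length(x)
  H <- matrix(0, p, p)
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      ei <- replace(numeric(p), i, h)  # step along coordinate i
      ej <- replace(numeric(p), j, h)  # step along coordinate j
      H[i, j] <- (f(x + ei + ej) - f(x + ei - ej) -
                  f(x - ei + ej) + f(x - ei - ej)) / (4 * h^2)
    }
  }
  (H + t(H)) / 2  # symmetrise to remove rounding asymmetry
}

# Toy quadratic loss (an assumption for illustration); exact Hessian is diag(2, 20)
toy_loss <- function(x) x[1]^2 + 10 * x[2]^2

H <- fd_hessian(toy_loss, c(0.5, -0.3))
eig <- eigen(H, symmetric = TRUE)

# Condition number: ratio of largest to smallest absolute eigenvalue
kappa_H <- max(abs(eig$values)) / min(abs(eig$values))
```

A large condition number indicates that the loss surface is much flatter in some directions than in others, which generally slows gradient-based optimization.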
This function estimates deme-specific inbreeding coefficients and a global migration rate from genetic and geographic distance data using an isolation-by-distance model. The model assumes that genetic similarity between samples decreases exponentially with geographic distance, modulated by deme-specific inbreeding coefficients.
disc( discdat, start_params = NULL, learningrate = 0.001, b1 = 0.9, b2 = 0.999, e = 1e-08, steps = 1000, thin = 1, report_progress = TRUE, return_verbose = FALSE, diagnostics = TRUE )
discdat |
dataframe; The genetic-geographic data by deme (K). Must contain columns:
|
start_params |
named double vector; vector of start parameters. Names must match deme names, plus one parameter named "m" for migration rate |
learningrate |
double; Learning rate (alpha) for gradient descent optimization. Default: 0.001 |
b1 |
double; Exponential decay rate for first moment estimate in Adam optimizer. Default: 0.9 |
b2 |
double; Exponential decay rate for second moment estimate in Adam optimizer. Default: 0.999 |
e |
double; Small constant for numerical stability in Adam optimizer. Default: 1e-8 |
steps |
integer; Number of optimization steps. Default: 1000 |
thin |
integer; Thinning interval for stored iterations (1 = store all). Default: 1 |
report_progress |
logical; Whether to display progress bar during optimization. Default: TRUE |
return_verbose |
logical; Whether to return full optimization trajectory (TRUE) or just final results (FALSE). Full trajectory can be memory intensive. Default: FALSE |
diagnostics |
logical; Whether to return diagnostics (described below). Default: TRUE |
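The roles of `learningrate`, `b1`, `b2`, `e`, and `steps` can be illustrated with a minimal base-R sketch of the standard Adam update applied to a toy one-parameter problem. The function `adam_minimise` and the toy gradient are hypothetical; this is not the package's optimizer code, only the textbook update that these argument names correspond to.

```r
# Hypothetical helper: standard Adam update on a generic gradient function.
adam_minimise <- function(grad, theta, learningrate = 0.001, b1 = 0.9,
                          b2 = 0.999, e = 1e-8, steps = 1000) {
  m1 <- numeric(length(theta))  # first moment (running mean of gradients)
  m2 <- numeric(length(theta))  # second moment (running mean of squared gradients)
  for (t in seq_len(steps)) {
    g <- grad(theta)
    m1 <- b1 * m1 + (1 - b1) * g
    m2 <- b2 * m2 + (1 - b2) * g^2
    m1_hat <- m1 / (1 - b1^t)   # bias correction for the zero initialisation
    m2_hat <- m2 / (1 - b2^t)
    theta <- theta - learningrate * m1_hat / (sqrt(m2_hat) + e)
  }
  theta
}

# Toy problem (an assumption for illustration): minimise (theta - 3)^2
toy_grad <- function(theta) 2 * (theta - 3)
fit <- adam_minimise(toy_grad, theta = 0, learningrate = 0.05, steps = 2000)
```

In this sketch `e` only guards against division by zero, while `b1` and `b2` control how quickly the two moment estimates forget older gradients.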
The input dataframe must have exactly these column names in order:
smpl1, smpl2: Sample identifiers
deme1, deme2: Deme (location) identifiers
gendist: Pairwise genetic distance in (0,1)
geodist: Pairwise geographic distance
The start_params vector must contain:
One parameter per unique deme (named with deme identifiers)
One parameter named "m" for the migration rate
All F parameters must be in (0,1) (inbreeding coefficients)
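The input requirements above can be sketched in base R as follows. The data values, the deme names "A" and "B", and the starting values are made up for illustration; only the column names and the `"m"` parameter name come from the documentation.

```r
# Toy input in the documented column order (values are hypothetical)
toy_discdat <- data.frame(
  smpl1   = c(1, 1, 2),
  smpl2   = c(2, 3, 3),
  deme1   = c("A", "A", "A"),
  deme2   = c("A", "B", "B"),
  gendist = c(0.30, 0.10, 0.12),
  geodist = c(0, 500, 500)
)

# One start value per unique deme, plus one named "m" for the migration rate
demes <- sort(unique(c(toy_discdat$deme1, toy_discdat$deme2)))
start_params <- c(setNames(rep(0.2, length(demes)), demes), m = 800)

# Checks mirroring the documented requirements
expected_cols <- c("smpl1", "smpl2", "deme1", "deme2", "gendist", "geodist")
stopifnot(
  identical(names(toy_discdat), expected_cols),
  setequal(names(start_params), c(demes, "m")),
  all(start_params[demes] > 0 & start_params[demes] < 1)
)
```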
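The model assumes:
$$r_{ij} = \left(\frac{F_i + F_j}{2}\right) \exp\left(-\frac{d_{ij}}{m}\right)$$
where $r_{ij}$ is genetic relatedness, $F_i$ is deme i's inbreeding coefficient,
$d_{ij}$ is geographic distance, and $m$ is the migration rate parameter.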
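As an illustration of exponential decay with distance, the sketch below assumes the model takes the form r_ij = ((F_i + F_j) / 2) * exp(-d_ij / m). Both that functional form and the helper name `expected_relatedness` are assumptions made for this example, and the parameter values are arbitrary.

```r
# Hypothetical helper: expected relatedness under the assumed model form
expected_relatedness <- function(F_i, F_j, d_ij, m) {
  ((F_i + F_j) / 2) * exp(-d_ij / m)
}

# Relatedness for one pair of demes at increasing geographic distances
r <- expected_relatedness(F_i = 0.2, F_j = 0.4,
                          d_ij = c(0, 500, 1000), m = 800)
```

Under this form, relatedness at distance zero equals the mean of the two inbreeding coefficients, and a larger `m` makes relatedness decay more slowly with distance.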
A list of class "DISCresult" containing:
Final_Fis: Final inbreeding coefficient estimates
Final_m: Final migration rate estimate
deme_key: Mapping of deme names to indices
cost: Final cost function value(s)
If return_verbose = TRUE, additional elements include:
fi_run: F parameter trajectory over iterations
m_run: Migration parameter trajectory
ci_gradtraj: C_i gradient trajectory
b_gradtraj: gradient trajectory
m_gradtraj: Migration gradient trajectory
ci_1moment, ci_2moment: C_i parameter Adam moments
b_1moment, b_2moment: parameter Adam moments
m_1moment, m_2moment: Migration parameter Adam moments
If diagnostics = TRUE, additional diagnostics elements are provided from calculate_hessian_eigen, which include:
Hessian: Hessian matrix
Eigen: Eigenvalues and eigenvectors from the Hessian
KappaH: Condition number of the Hessian matrix
Converts logit-scale values back to the probability scale using the standard
expit formula: $\operatorname{expit}(p) = \frac{1}{1 + e^{-p}}$
expit(p)
p |
numeric vector of values in logit space |
This function is the inverse of logit. It transforms values
from the logit scale (real line) back to probabilities (0,1).
Numeric vector of values in (0,1) (probability scale)
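A minimal base-R sketch of the standard expit transformation is shown below. The name `expit_sketch` is hypothetical, chosen to avoid implying this is the package's own source; `plogis()` from the stats package computes the same quantity and is used here as a cross-check.

```r
# Hypothetical stand-in for the standard expit (inverse logit) transformation
expit_sketch <- function(x) 1 / (1 + exp(-x))

x <- c(-5, 0, 5)
p <- expit_sketch(x)

# Cross-check against the logistic distribution function in base R
same_as_plogis <- isTRUE(all.equal(p, plogis(x)))
```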
Simulated Identity by Descent Data for DISC Analysis
IBD_simulation_data
A dataframe with 45 rows and 6 columns representing pairwise genetic and geographic distances:
smpl1, smpl2: Integer sample identifiers (1-10)
deme1, deme2: Factor deme (location) identifiers (1-3)
gendist: Numeric genetic distances in (0,1) based on simulated identity by descent
geodist: Numeric geographic distances (500, 1000) representing spatial separation
This dataset contains pairwise comparisons between samples from 3 demes:
Deme 1-2 pairs: Geographic distance = 500
Deme 1-3 pairs: Geographic distance = 1000
Deme 2-3 pairs: Geographic distance = 1500
The data follows an isolation-by-distance model where genetic similarity decreases exponentially with geographic distance, modulated by deme-specific inbreeding coefficients.
Simulated toy dataset for testing and demonstration purposes. Generated using an exponential decay relationship between genetic relatedness and geographic distance. Not intended for real-world analysis - use for testing DISC functions and understanding expected input format.
    # Load the dataset
    data("IBD_simulation_data")

    # View structure
    str(IBD_simulation_data)

    # Basic usage with disc function
    ## Not run:
    start_params <- c("1" = 0.2, "2" = 0.3, "3" = 0.4, "m" = 800)
    result <- disc(IBD_simulation_data, start_params, steps = 100)
    ## End(Not run)
Tests whether an object is of class DISCresult
is.DISCresult(x)
x |
Object to test |
Logical value: TRUE if object is of class DISCresult, FALSE otherwise
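A class test of this kind is commonly written with `inherits()`. The sketch below uses a hypothetical helper `is_disc_result_sketch` and a hand-built object purely to show the behaviour; whether the package implements its check this way is an assumption.

```r
# Hypothetical equivalent of a class test, built on base R's inherits()
is_disc_result_sketch <- function(x) inherits(x, "DISCresult")

# Hand-built object carrying the class attribute (for illustration only)
fake_result <- structure(list(Final_m = 800), class = "DISCresult")

check_true  <- is_disc_result_sketch(fake_result)
check_false <- is_disc_result_sketch(list(Final_m = 800))
```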
Transforms probability values to the logit scale using the standard
logit formula: $\operatorname{logit}(p) = \log\left(\frac{p}{1 - p}\right)$
logit(p)
p |
numeric vector of values in (0,1) (probability scale) |
This function transforms probabilities (0,1) to the logit scale (real line). Values of 0 and 1 will produce -Inf and +Inf respectively. The transformation is used internally in DISC so that inbreeding coefficients, estimated on the logit scale, remain within (0,1) when transformed back.
Numeric vector of values on the real line (logit scale)
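The boundary behaviour and the inverse relationship described above can be checked with a short base-R sketch. The name `logit_sketch` is hypothetical and stands in for the standard formula log(p / (1 - p)); it is not copied from the package source.

```r
# Hypothetical stand-in for the standard logit transformation
logit_sketch <- function(p) log(p / (1 - p))

p <- c(0.1, 0.5, 0.9)
z <- logit_sketch(p)

# Applying the inverse (expit) recovers the original probabilities
p_back <- 1 / (1 + exp(-z))

# The endpoints map to the ends of the real line
edges <- logit_sketch(c(0, 1))
```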
S3 method for printing DISCresult objects with summary information
## S3 method for class 'DISCresult' print(x, ...)
x |
DISCresult object to print |
... |
Further arguments passed to or from other methods |
Invisibly returns the input object. Called for side effect of printing
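The "print for side effect, return invisibly" convention can be sketched with a toy S3 class. The class name `ToyResult`, its fields, and the printed text are hypothetical; the sketch only demonstrates the pattern, not the actual output of the DISCresult method.

```r
# Hypothetical S3 print method following the usual convention
print.ToyResult <- function(x, ...) {
  cat("Toy result with", length(x$Final_Fis), "demes; m =", x$Final_m, "\n")
  invisible(x)  # return the input so calls can be chained or assigned
}

toy <- structure(list(Final_Fis = c(0.2, 0.3), Final_m = 800),
                 class = "ToyResult")

out <- print(toy)                   # prints a one-line summary
vis <- withVisible(print(toy))      # records whether the result was visible
```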
Draw from Beta-binomial distribution.
rbetabinom(n = 1, k = 10, alpha = 1, beta = 1)
n |
number of draws. |
k |
number of binomial trials. |
alpha |
first shape parameter of beta distribution. |
beta |
second shape parameter of beta distribution. |
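A beta-binomial draw is conventionally generated in two stages: draw a success probability from a beta distribution, then draw a binomial count with that probability. The sketch below shows this with base R; the name `rbetabinom_sketch` is hypothetical and the package's own implementation may differ.

```r
# Hypothetical two-stage beta-binomial sampler using base R generators
rbetabinom_sketch <- function(n = 1, k = 10, alpha = 1, beta = 1) {
  p <- rbeta(n, shape1 = alpha, shape2 = beta)  # per-draw success probability
  rbinom(n, size = k, prob = p)                 # successes out of k trials
}

set.seed(1)
draws <- rbetabinom_sketch(n = 1000, k = 10, alpha = 2, beta = 2)
```

With equal shape parameters the draws are centred on k / 2, and smaller shape values spread them further toward 0 and k than a plain binomial would.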
Uses the DISCent equation to simulate expected genetic distances between sample pairs within demes. We assume a beta-binomial
model to create realistic "noise" amongst the pairs, parameterized by the average relatedness (defined by the DISCent equation) and
a concentration parameter of that relatedness.
run_forward_disc( true_params, geodist_matrix, samples_per_deme, overdispersion = 200 )
true_params |
named vector; True F_i and M values to simulate. At least one element must be named "m". |
geodist_matrix |
named matrix of geodistances; must be an n x n geodistance matrix, with n corresponding to the number of demes and deme names matching the true_params names |
samples_per_deme |
integer vector; number of pairs within each deme |
overdispersion |
numeric; overdispersion parameter in the beta-binomial model; Default: 200 |
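The argument requirements above can be sketched by constructing matching inputs in base R. The deme names and distances below follow the three-deme layout of IBD_simulation_data, while the `samples_per_deme` values are arbitrary choices for illustration. The final call is left commented out because it requires the package to be installed.

```r
# True parameters: one F value per deme plus the migration rate "m"
true_params <- c("1" = 0.2, "2" = 0.3, "3" = 0.4, m = 800)
demes <- setdiff(names(true_params), "m")

# Symmetric n x n distance matrix with row and column names matching the demes
geodist_matrix <- matrix(
  c(   0,  500, 1000,
     500,    0, 1500,
    1000, 1500,    0),
  nrow = 3, byrow = TRUE, dimnames = list(demes, demes)
)

# Hypothetical choice: five pairs within each deme
samples_per_deme <- c(5L, 5L, 5L)

# Sanity checks mirroring the documented requirements
stopifnot(
  isSymmetric(geodist_matrix),
  identical(rownames(geodist_matrix), demes),
  length(samples_per_deme) == length(demes)
)

# sim <- run_forward_disc(true_params, geodist_matrix, samples_per_deme)
```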
A dataframe of class "DISCsim" containing columns:
Final_Fis: Final inbreeding coefficient estimates
Final_m: Final migration rate estimate
deme_key: Mapping of deme names to indices
cost: Final cost function value(s)
S3 method for summarizing DISCresult objects
## S3 method for class 'DISCresult' summary(object, ...)
object |
DISCresult object from |
... |
Further arguments passed to or from other methods |
A list containing tidied final inbreeding coefficients and migration rate
Generic method for tidying output from DISC analysis
tidyout(x)
x |
Object to tidy (typically a DISCresult) |
Method-specific tidied output
S3 method for tidying DISCresult objects into user-friendly format
## S3 method for class 'DISCresult' tidyout(x)
x |
DISCresult object to tidy |
A list with two elements:
Final_Fis: Dataframe with deme names and final inbreeding coefficients
Final_M: Final migration rate estimate