Hi all,
I recently started using dadi-cli to infer demographic history from ddRAD-seq data. We generated site frequency spectra (SFS) with easySFS and used them as input to dadi-cli for a set of five species. For each species, we are testing both two-population and three-population demographic scenarios. We fitted the following set of Portik models (100 optimisation runs per model) and calculated the AIC for each model as
AIC = 2 * (number of parameters) - 2 * (log-likelihood). We plan to select the model with the lowest AIC as the best-supported scenario.
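To make the calculation concrete, here is a minimal Python sketch of the AIC comparison described above. The log-likelihoods are made-up placeholder values (not our results), and the parameter counts are my assumption of the free parameters in each model, so please treat it as an illustration only.

```python
def aic(log_likelihood, n_params):
    """AIC = 2 * (number of parameters) - 2 * (log-likelihood)."""
    return 2 * n_params - 2 * log_likelihood

# Placeholder values for illustration only: (best log-likelihood, number of parameters).
# Neither the likelihoods nor the parameter counts are taken from our actual runs.
fits = {
    "no_mig": (-1250.0, 3),
    "sym_mig": (-1235.0, 4),
    "asym_mig": (-1234.5, 5),
}

# AIC for each model, using the best-likelihood run of that model.
scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}

# The model with the lowest AIC is treated as the best supported.
best = min(scores, key=scores.get)

# Delta AIC relative to the best model shows how close the alternatives are.
delta = {name: score - scores[best] for name, score in scores.items()}

for name in sorted(scores, key=scores.get):
    print(f"{name}: AIC = {scores[name]:.1f}, dAIC = {delta[name]:.1f}")
```

With these placeholder numbers, the extra migration parameter in asym_mig buys too little likelihood to beat sym_mig, which is the kind of trade-off the AIC penalty is meant to capture.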
Models used:
2D models: no_mig, sym_mig, asym_mig, anc_sym_mig, anc_asym_mig, sec_contact_sym_mig, sec_contact_asym_mig
3D models: split_nomig, split_symmig_all, split_symmig_adjacent
Parameter bounds used:
nu: 1e-3 to 100
m: 0 to 10
T: 0 to 20
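As a rough sketch of what "close to the boundaries" could mean for these bounds, the snippet below flags fitted values that fall within 1% of the allowed range from either bound. The 1% threshold, the helper name `near_bound`, and the fitted values are all hypothetical, and this is not necessarily the criterion dadi-cli itself applies.

```python
# Bounds as listed above: (lower, upper) for each parameter type.
BOUNDS = {"nu": (1e-3, 100.0), "m": (0.0, 10.0), "T": (0.0, 20.0)}

def near_bound(value, lower, upper, frac=0.01):
    """Return 'lower' or 'upper' if value is within frac of the range from a bound.

    Hypothetical helper; the 1% default is an arbitrary illustrative threshold.
    """
    span = upper - lower
    if value - lower <= frac * span:
        return "lower"
    if upper - value <= frac * span:
        return "upper"
    return None

# Made-up fitted values for illustration only (not our results).
fitted = {"nu": 2.5, "m": 0.004, "T": 0.05}

flags = {name: near_bound(value, *BOUNDS[name]) for name, value in fitted.items()}

for name, flag in flags.items():
    print(f"{name} = {fitted[name]}: {flag or 'not near a bound'}")
```

Because the lower bound for m and T is 0, a genuinely small migration rate or a recent split would be flagged by a check like this even when the fit is reasonable.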
I have the following questions and would appreciate some input:
- During the runs, we encountered the following warnings:
“WARNING:Inference:Model is < 0 where data is not masked.
WARNING:Inference:Number of affected entries is 25. Sum of data in those entries is 89.7342:”
Should this be concerning? If so, how can we overcome this?
- When I checked for convergence, only a few of the runs had converged. Would you recommend increasing the number of optimisation replicates, and if so, what would be a reasonable number to aim for?
- We got the warning:
“WARNING: The converged parameters are close to the boundaries”
The values appear to have converged close to the lower bounds (near 0). We expect our system to have diverged recently. Should we be concerned about this warning?
- Does the AIC-based model selection we follow make sense, or should we use another method to determine the best-fit model?
- We are using VCFs thinned to retain SNPs at least 10 kb apart. Is this appropriate for our analysis, or would you suggest LD pruning or using the unthinned VCF instead?
Thanks & Regards,
Rayis.