Hi Ryan,
I've been using dadi to model the initial expansion and gene flow of a recent invasion, starting with three populations. I have SNP data for the three populations and estimates, in generations, of how long they have (probably) been there. I wanted a model that could examine likely initial expansion events as well as migration after the founding of each population. I have scripts for each scenario, and have been comparing AIC values calculated from the output log-likelihoods (LL).
Specifically, I first tested different expansion scenarios with the nuPOP parameters and found the best-fitting expansion scenario. From there, I added migration on top of the expansion with the m_POP parameters in my scripts.
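As a minimal sketch of the AIC comparison described above: AIC is 2k - 2·LL, where k is the number of free parameters. The log-likelihoods and parameter counts below are made-up placeholders, not results from this analysis, and the comparison assumes the SNPs are independent enough for the likelihood to be treated as a full likelihood.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2*ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits, for illustration only (NOT real output).
fits = {
    "expansion_only": {"ll": -1520.0, "k": 5},
    "expansion_plus_mig": {"ll": -1517.5, "k": 6},
}

# Score every model, pick the lowest AIC, and report differences from the best.
scores = {name: aic(f["ll"], f["k"]) for name, f in fits.items()}
best = min(scores, key=scores.get)
delta = {name: score - scores[best] for name, score in scores.items()}

for name in sorted(scores, key=scores.get):
    print(f"{name}: AIC={scores[name]:.1f}, dAIC={delta[name]:.1f}")
```

Because the migration model nests the expansion-only model, each added m_POP parameter has to improve LL by more than 1 unit just to lower the AIC at all.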
1. Question about migration - I have included a script that contains both the expansion scenario and one of the migration scenarios I am testing on top of the expansion results. Is this a reasonable and informative way to approach expansion and migration? Because the data are from a single time point, how does dadi distinguish initial expansion events from migration that occurs later, given that the resulting changes in shared allele frequencies can be quite subtle for recent events? I'm curious because I just got a couple of results suggesting migration in the same direction as my likely expansion scenario (I tested expansion alone first).
2. Question about seeding - I realized that I was not setting a random seed for the different replicates of the model, and I am thinking of adding np.random.seed() to the run_opt() portion. Any advice on this approach? I appreciate it.
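One possible way to handle the seeding in question 2 is sketched below, under some assumptions: `perturbed_start` is a hypothetical helper (not part of dadi or of the attached script) that randomly perturbs the starting parameters by up to a factor of 2, and each replicate gets its own seed derived from a recorded base seed. The key point is that the seed must differ between replicates; reusing one fixed seed inside every replicate would give identical starting points.

```python
import numpy as np


def perturbed_start(p0, seed, fold=1.0):
    """Perturb starting parameters by up to `fold` factors of 2.

    Hypothetical helper for illustration. It uses a per-replicate
    Generator, so the same seed always gives the same starting point.
    """
    rng = np.random.default_rng(seed)
    p0 = np.asarray(p0, dtype=float)
    return p0 * 2.0 ** (fold * (2.0 * rng.random(p0.shape) - 1.0))


base_seed = 12345          # arbitrary; record it alongside the results
p0 = [1.0, 0.5, 0.1]       # placeholder starting values, not real estimates

# One distinct, reproducible starting point per replicate.
starts = [perturbed_start(p0, base_seed + rep) for rep in range(3)]
for rep, start in enumerate(starts):
    print(rep, base_seed + rep, start)
```

If the optimization code draws from NumPy's global random state instead, calling `np.random.seed(base_seed + rep)` at the start of each replicate should serve the same purpose, as long as the replicate index is part of the seed.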
Thank you for all of your help and for making such an interesting model!
-Cameron
On Nov 4, 2025, at 1:47 PM, Cameron Grey <camero...@gmail.com> wrote:
To view this discussion visit https://groups.google.com/d/msgid/dadi-user/b4fcad17-e40e-47b4-8845-75b4de67f348n%40googlegroups.com.
<scenario2_3popGOM_migBAtoBZ.py>
Hi Ryan,
I appreciate your feedback on the earlier discussion. We’ve reevaluated our approach and are starting with relatively simple expansion-only demographic models to infer source populations and directionality among four populations.
I wanted to get your opinion on our SNP filtering strategy. My understanding is that any filtering that preferentially removes rare alleles can bias the SFS, so I’m avoiding MAF and HWE filtering. We’re working with ddRAD-seq SNPs (and later WGS SNPs) processed with STACKS, and I plan to randomly retain one SNP per RAD locus to ensure independence.
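To make the one-SNP-per-locus step concrete, here is a minimal sketch. It assumes the SNPs are available as `(locus_id, position)` pairs; that layout, the function name, and the toy data are illustrative assumptions, not STACKS output.

```python
import random
from collections import defaultdict


def one_snp_per_locus(snps, seed=0):
    """Randomly keep a single SNP from each locus.

    `snps` is a list of (locus_id, position) tuples (assumed layout).
    A fixed seed makes the random choice reproducible.
    """
    rng = random.Random(seed)
    by_locus = defaultdict(list)
    for locus_id, pos in snps:
        by_locus[locus_id].append((locus_id, pos))
    # Iterate loci in sorted order so the result does not depend on input order.
    return [rng.choice(group) for _, group in sorted(by_locus.items())]


# Toy data: three loci with 2, 1 and 3 SNPs.
snps = [(1, 12), (1, 40), (2, 7), (3, 5), (3, 66), (3, 90)]
kept = one_snp_per_locus(snps)
print(kept)
```

Choosing at random, rather than always taking the first SNP in each locus, avoids systematically favoring SNPs at a particular position in the read.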
For missing data, I’m considering allowing ~30–50% missingness per SNP, while removing individuals with very high missing data (>40–50%). Does that seem like a reasonable balance for dadi analyses?
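A rough sketch of the two missing-data thresholds, assuming a genotype matrix with individuals in rows, SNPs in columns, and missing calls coded as -1 (all assumptions for illustration). It drops high-missingness individuals first and then recomputes per-SNP missingness on the remaining samples; the order is a design choice, not something prescribed by dadi.

```python
import numpy as np


def filter_missing(geno, max_ind_missing=0.5, max_snp_missing=0.5):
    """Filter an (individuals x SNPs) matrix; missing genotypes are < 0.

    Individuals are removed first, then SNP missingness is recomputed
    on the retained individuals before SNPs are removed.
    """
    keep_ind = (geno < 0).mean(axis=1) <= max_ind_missing
    geno = geno[keep_ind]
    keep_snp = (geno < 0).mean(axis=0) <= max_snp_missing
    return geno[:, keep_snp], keep_ind, keep_snp


# Toy data: 4 individuals x 4 SNPs, -1 = missing.
geno = np.array([
    [0, 1, 2, 0],
    [-1, -1, -1, 0],   # 75% missing: individual is dropped
    [0, -1, 1, 1],
    [1, -1, 0, 2],
])
filtered, keep_ind, keep_snp = filter_missing(geno)
print(filtered.shape, keep_ind, keep_snp)
```

In the toy data the second SNP is missing in two of the three retained individuals, so it is removed even though it would have sat exactly at the 50% threshold before the individual filter was applied.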
I also wanted to ask about paralog filtering. I’ve seen this suggested as a potential solution in some dadi discussions, and I was wondering whether this is something you typically recommend, and if so, whether it’s best handled via depth/heterozygosity-based filters rather than HWE.
Any guidance you’re willing to share would be much appreciated. Thanks again for taking the time to respond on these threads.
To view this discussion visit https://groups.google.com/d/msgid/dadi-user/6a10c2d8-8a75-42d3-b25c-5a5e572e42bbn%40googlegroups.com.