Split Time Interpretation

Sarah Babaei

Mar 9, 2026, 4:56:54 PM
to dadi-user
Hello,

I was wondering if I could get some feedback on how I'm interpreting the optimized parameter for time since split. 

I used the Portik 2D (2 population) pipeline, and the parameter is described as: T1: Time in the past of split (in units of 2*Na generations). The value is 0.1759. 

Based on other threads in this Google group and GitHub questions, it seems that Time (years) = T1(parameter) x 2Nref x mutationrate x L.

Nref seems to be calculated using 4Nref=theta(from dadi optimized model)/(mutationrate x L).

To calculate L, I did the following: My data are from RADseq. After running Stacks de novo and populations, filtering to write only 1 SNP per locus and to a maximum heterozygosity of 0.8, I retained 14,249 SNPs. My loci are on average 60 bp long, so I calculated L = 14,249 x 60. I then down-projected this data in easySFS before plugging it into dadi, but as I understand it this doesn't affect my L value.

Apologies for the long question, any advice on whether I'm doing this right or wrong would be super helpful!

Thank you,
Sarah

Ryan Gutenkunst

Mar 10, 2026, 7:20:52 PM
to dadi...@googlegroups.com
Hello Sarah,

Corrections…

Time (years) = T1(parameter) x 2Nref x generation time

The calculation of Nref is correct.

Down projecting doesn’t affect your L value, but filtering to select only 1 SNP per locus does. If that filter, for example, removes 1/2 of your SNPs, then that reduces L by 1/2.

Best,
Ryan
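
For readers following along, here is a minimal sketch of the conversion as corrected above: Nref from 4*Nref = theta/(mu*L), then Time (years) = T1 x 2Nref x generation time. Only T1 = 0.1759 comes from this thread. The theta, mutation rate, L, and generation time values are hypothetical placeholders, and the function names are made up for illustration; they are not part of dadi.

```python
def nref_from_theta(theta, mu, L):
    """Reference population size from theta = 4 * Nref * mu * L."""
    return theta / (4.0 * mu * L)


def split_time_years(T, nref, generation_time):
    """Convert a dadi time parameter (units of 2*Nref generations) to years."""
    return T * 2.0 * nref * generation_time


T1 = 0.1759            # optimized split time reported in the thread
theta = 400.0          # hypothetical: optimized theta from the dadi fit
mu = 1e-8              # hypothetical: per-site, per-generation mutation rate
L = 1.0e6              # hypothetical: effective sequence length in bp
generation_time = 2.0  # hypothetical: years per generation

nref = nref_from_theta(theta, mu, L)
years = split_time_years(T1, nref, generation_time)
print(f"Nref = {nref:.0f}, split time = {years:.0f} years")
```

Note that the mutation rate and L enter only through Nref; they do not multiply the time a second time.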

Sarah Babaei

Mar 11, 2026, 12:23:33 PM
to dadi-user
Hi Ryan,

Thank you for the reply, I really appreciate it! 

I'm still a little confused with calculating L, apologies. If I retained 14,249 SNPs post filtering, should I calculate L by multiplying it by the average size of my loci (assembled during de novo)? I see how filtering would reduce my L, but wouldn't that be taken into consideration since I'm using the final number of SNPs in the calculation? Or would I have to figure out how many SNPs I lost during all my filtering steps and divide it, like so: L=(size of locus) * ((final snps)/(total snps, before filtering))? 

Additionally, I wasn't sure if I should use the average size of locus or the average genotyped sites per locus, as my stacks populations output states:
Kept 555806 loci, composed of 33558642 sites; 10460 of those sites were filtered, 14249 variant sites remained.
Mean genotyped sites per locus: 24.65bp (stderr 0.04).
Would 33558642 be the (total snps, before filtering) to use in my L calculation?

Apologies again for the many questions!

Thank you,
Sarah

Ryan Gutenkunst

Mar 11, 2026, 7:43:06 PM
to dadi-user
Hi Sarah,

For L, what we’re trying to calculate is the amount of sequence from which SNPs could have entered the SFS you’re analyzing. So we want to use genotyped sites. And then we account for any additional filtering on top of that.

I’ve not used Stacks before, but my interpretation is that the total number of bases that were genotyped was 24.65 * 555806 = 13.7e6. Then 14249 SNPs remained, after filtering out 10460. So the effective L would be 13.7e6 * 14249/(14249+10460) = 7.9e6. It’s a bit confusing, but I’m assuming that those 10460 sites that were filtered were variant sites. You should check the Stacks documentation to confirm that.

Best,
Ryan
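
The effective L arithmetic above can be sketched in a few lines. The numbers are taken from the Stacks populations output quoted earlier in the thread. The function name is hypothetical, and the calculation rests on the same unconfirmed assumption Ryan flags: that the 10460 filtered sites were all variant sites.

```python
def effective_sequence_length(n_loci, mean_genotyped_sites, snps_kept, snps_filtered):
    """Genotyped bases scaled by the fraction of SNPs that survived filtering."""
    genotyped_bases = n_loci * mean_genotyped_sites
    # Assumption: every filtered site was a variant site (check the Stacks docs).
    retained_fraction = snps_kept / (snps_kept + snps_filtered)
    return genotyped_bases * retained_fraction


L_eff = effective_sequence_length(
    n_loci=555806,               # "Kept 555806 loci"
    mean_genotyped_sites=24.65,  # "Mean genotyped sites per locus"
    snps_kept=14249,             # "14249 variant sites remained"
    snps_filtered=10460,         # "10460 of those sites were filtered"
)
print(f"Effective L = {L_eff:.2e} bp")
```

If the filtered sites turn out to include invariant positions, the retained fraction would need to be computed from counts of variant sites only.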
