Troubleshooting Ne inflation in gl.stairway2

Krishna Pavan Kumar Komanduri

Mar 12, 2026, 3:22:06 AM
to dartR
Hi dartR community,

I am using gl.stairway2 to estimate recent demographic history and test for a population bottleneck in the past 100 years.

gl.stairway2(My_genlight,
             stairway.path = stairway.path.exe,
             mu = 8e-10,
             gentime = 1,
             run = TRUE,
             nreps = 200,
             parallel = 4,
             L = 1.8e6,
             minbinsize = 5)

I am currently using the following parameters:

- Generation time: 1 year (since my species reaches sexual maturity at 0.57 years and breeds the following season).
- Mutation rate: 8e-10 (potential frog mutation rate derived from literature).
- L (callable sites): 1.8e6
L was calculated using: [loci (monomorphic + polymorphic) * mean read length (60 bp based on TrimmedSequence mean)]

However, I am getting very high values for both the estimated Ne and time (years/generations). This appears to be a scaling issue with L, because the values reach reasonable scales when L is increased to 1e7.
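A rough way to see why changing L rescales the output: for a fixed site frequency spectrum, the per-site theta that is inferred is approximately proportional to 1/L, and since theta = 4 * Ne * mu, the reported Ne shrinks by the same factor when L grows. The sketch below only illustrates that proportionality under this assumption. `rescale_ne` and the Ne value of 5e6 are hypothetical and are not gl.stairway2 output.

```r
# Assumption: for a fixed SFS, the estimated Ne is inversely proportional to L.
# rescale_ne() is a hypothetical helper, not part of dartR.
rescale_ne <- function(ne_old, L_old, L_new) {
  ne_old * L_old / L_new
}

ne_small_L <- 5e6  # hypothetical Ne obtained with L = 1.8e6
ne_large_L <- rescale_ne(ne_small_L, L_old = 1.8e6, L_new = 1e7)
ne_large_L  # 9e5
```

Under the same assumption, the time axis would be rescaled by the same factor, which would match seeing both Ne and time drop together when L is increased.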

To put my current L value into context, the genome size of the closest species, Litoria verreauxii alpina, is 2.77 Gb. An L of 1.8e6 would mean a coverage of roughly 0.06% of the entire genome.
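The arithmetic behind these numbers can be checked with a few lines of R. The values are taken from this post. `n_loci` is back-calculated from L and the mean read length, so treat it as an assumption rather than a reported count.

```r
# Values from the post; n_loci is back-calculated, not reported.
mean_read_length <- 60           # bp, mean TrimmedSequence length
L <- 1.8e6                       # callable sites used in gl.stairway2
n_loci <- L / mean_read_length   # implied number of loci (30000)

genome_size <- 2.77e9            # bp, Litoria verreauxii alpina
fraction_surveyed <- L / genome_size

round(100 * fraction_surveyed, 3)  # percent of the genome: 0.065
```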

If I understand correctly, DArT pre-filters SNPs before sharing the data. So I was wondering whether calculating L directly from the received genlight object inherently underestimates the true sequence space surveyed. I noticed the Kioloa tutorial on Effective Population Size advises against this type of calculation and recommends using the length of the chromosome.

Question: If calculating L directly from the genlight object is not appropriate for DArT data, could anyone recommend how to find a more realistic L value or how to estimate the length of the chromosome for species without a sequenced genome?

I would really appreciate any insights. Thank you!

Regards,
Krishna


P.S. I am new to population genetic analysis and dartRverse, so apologies for any oversight.

Jose Luis Mijangos

Mar 17, 2026, 2:50:28 AM
to dartR
Hi Krishna,

Have a look at the Ne session of our latest workshop:
https://green-striped-gecko.github.io/kioloa2/inst/tutorials/W07/W07.html 

Bernd suggests using L = nLoc * 75 * 200 (loci × tag length ~75 bp × ~200 to account for sampling fraction).
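As a minimal sketch, the rule of thumb above can be written as a small function. `estimate_L` is a hypothetical helper name, and the locus count of 30000 is only an illustrative value (the number implied by L = 1.8e6 at 60 bp per locus), not a count from a real dataset.

```r
# Rule of thumb from this thread: L = nLoc * 75 * 200.
# estimate_L() is a hypothetical helper, not a dartR function.
estimate_L <- function(n_loci, tag_length = 75, sampling_factor = 200) {
  n_loci * tag_length * sampling_factor
}

# With a real genlight object, n_loci could come from adegenet::nLoc(My_genlight).
L_rule_of_thumb <- estimate_L(30000)  # illustrative locus count
L_rule_of_thumb  # 4.5e8
```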

or using Watterson’s estimator:
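One way Watterson's estimator could be used to back out L is sketched below. This is an assumption about the approach, not necessarily what the workshop tutorial does. It relies on the standard expectation E[S] = theta_site * L * a_n, with theta_site = 4 * Ne * mu and a_n the sum of 1/i for i from 1 to n - 1, where n is the number of sampled chromosomes. It needs an independent guess of Ne, and `watterson_L` and the example inputs are hypothetical.

```r
# Hypothetical helper: solve E[S] = 4 * Ne * mu * L * a_n for L.
# S       number of segregating sites (SNPs)
# n_chrom number of sampled chromosomes (2 x number of diploid individuals)
# ne      an independent guess of Ne (assumption)
# mu      per-site, per-generation mutation rate
watterson_L <- function(S, n_chrom, ne, mu) {
  a_n <- sum(1 / seq_len(n_chrom - 1))  # harmonic number a_n
  S / (4 * ne * mu * a_n)
}

# Illustrative inputs only; none of these are values from a real dataset.
watterson_L(S = 10000, n_chrom = 40, ne = 1000, mu = 8e-10)
```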


The important message is that even if L is not completely accurate, the shape of the historical population size trajectory is preserved.

Alternatively, you can calibrate L: estimate the contemporary Ne of your dataset, then try different values of L until the Ne in the most recent generations of the historical population size plot agrees with the contemporary Ne.
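If one assumes that the stairway Ne scales inversely with L, the calibration can start from a single rescaling rather than pure trial and error. The sketch below is an illustration under that assumption. `calibrate_L` and all Ne values are hypothetical.

```r
# Assumption: stairway Ne is inversely proportional to L for a fixed SFS.
# calibrate_L() is a hypothetical helper, not part of dartR.
calibrate_L <- function(L_used, ne_stairway_recent, ne_contemporary) {
  L_used * ne_stairway_recent / ne_contemporary
}

L_calibrated <- calibrate_L(
  L_used = 1.8e6,              # L from the first gl.stairway2 run
  ne_stairway_recent = 50000,  # hypothetical Ne at the most recent time step
  ne_contemporary = 500        # hypothetical contemporary (e.g. LD-based) Ne
)
L_calibrated  # 1.8e8
```

The calibrated value would then be passed back as `L` in a new gl.stairway2 run to check that the most recent Ne now agrees with the contemporary estimate.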

Cheers,
Luis

Bernd.Gruber

Mar 17, 2026, 3:41:51 AM
to da...@googlegroups.com, dartR
Hi

I definitely recommend calibrating against the LD estimate (i.e. contemporary Ne) if possible.

Watterson's estimator also does not work too well, and the "times 75" calculation should also be a fairly strong underestimate.

The problem is the filtering: you would need to know the number of callable bases passing your filter, and that is pretty difficult without the raw data.

Cheers Bernd 