I’m trying to calculate Tajima’s D across the genome using GBS data that has been processed with Stacks. So far I have tried to calculate it per site using pi (from the Stacks output), letting the sample size vary with the amount of missing data at each site. However, using the standard calculations for theta, a1, a2, e1 and e2, and with S = 1 at every site, my D values are very high across the genome, with 4s and 5s occurring frequently.
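For reference, the standard calculation (Tajima 1989) can be sketched as below. This is only a minimal sketch, not Stacks code: the function name `tajimas_d` and its arguments are illustrative assumptions. It assumes n is the number of sampled chromosomes with data (not individuals) and that pi is the mean number of pairwise differences summed over the sites being considered.

```python
import math

def tajimas_d(n, S, pi):
    """Tajima's D (hypothetical helper, not part of Stacks).

    n  -- number of sampled chromosomes with data (assumed, not individuals)
    S  -- number of segregating sites in the region
    pi -- mean pairwise differences, summed over the region
    """
    # The variance term is zero for n < 4, and D is undefined when S = 0.
    if n < 4 or S < 1:
        return float("nan")

    # Harmonic sums over 1..n-1
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))

    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))

    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)

    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    variance = e1 * S + e2 * S * (S - 1)
    if variance <= 0:
        return float("nan")

    # Difference between the two theta estimators (pi and S / a1),
    # scaled by its expected standard deviation.
    return (pi - S / a1) / math.sqrt(variance)
```

For a window rather than a single site, S would be the count of segregating sites in the window and pi the sum of the per-site pi values, for example `tajimas_d(n=20, S=5, pi=1.8)`. Which n to use when sample size differs between sites in the same window is left open here.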
I’ve looked at other software, such as VCFtools and PopGenome, but they tie sample size to the total number of individuals, so genome-wide results from sliding windows become difficult to correct for missing data. I’m aware that such statistics have taken a knock (http://onlinelibrary.wiley.com/doi/10.1111/mec.12276/full), but I’m still keen to try. Is it possible to emulate the Hohenlohe et al. 2010 paper using the current Stacks output, so that sample size is determined within each sliding window, without having to run the analysis independently of Stacks? Is it even worth it?
Cheers,
Craig