Folded vs. unfolded SFS for population genetic summary statistics

Joshua Penalba

Sep 15, 2016, 8:06:46 AM
to dadi-user
Hi,

I just wanted to clarify some assumptions behind both the single-population and multi-population summary statistics (Watterson's theta, pi, Tajima's D, Fst, segregating sites). I'm hoping that I didn't miss this in the manual. I'm currently inputting a folded SFS from ANGSD into dadi to do my analyses. Do the calculations of these statistics assume an unfolded SFS? If so, since I don't have an outgroup to properly polarize my SNPs, would there be a "good practice" alternative? Thanks in advance for the response.

Best,
Josh

Gutenkunst, Ryan N - (rgutenk)

Sep 16, 2016, 7:30:58 PM
to dadi...@googlegroups.com
Hello Josh,

The number of segregating sites doesn’t depend on whether the data are folded, so that’s okay. Watterson’s theta only depends on the number of segregating sites, so it’s okay too. Calculation of pi depends on x(1-x), where x is the allele frequency; that product is unchanged when x is swapped with 1-x, so it is symmetric with respect to folding and should be okay. Tajima’s D depends only on the other quantities (pi and the number of segregating sites), so it’s also okay. It’s less obvious for Fst, but that also doesn’t depend on folding.
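
For a single population, the argument above can be checked numerically. The following is a minimal sketch in plain Python using the textbook formulas; it is not dadi's own implementation, and the helper names (`fold`, `seg_sites`, `watterson_theta`, `pi`, `tajima_d`) and the example spectrum are made up for illustration. It assumes the SFS is a list indexed by allele count 0..n and that folding adds entry n-i onto entry i. Fst is not covered here.

```python
import math

def fold(sfs):
    """Fold an SFS indexed by allele count 0..n onto minor-allele counts."""
    n = len(sfs) - 1
    folded = [0.0] * (n + 1)
    for i, count in enumerate(sfs):
        folded[min(i, n - i)] += count
    return folded

def seg_sites(sfs):
    """Number of segregating sites: every entry except the two fixed classes."""
    return sum(sfs[1:-1])

def harmonic(n, power=1):
    """Sum of 1/i**power for i = 1..n-1."""
    return sum(1.0 / i**power for i in range(1, n))

def watterson_theta(sfs):
    n = len(sfs) - 1
    return seg_sites(sfs) / harmonic(n)

def pi(sfs):
    """Mean pairwise differences; the weight i*(n-i) is symmetric in i and n-i."""
    n = len(sfs) - 1
    pairs = n * (n - 1) / 2.0
    return sum(c * i * (n - i) for i, c in enumerate(sfs)) / pairs

def tajima_d(sfs):
    """Tajima's D with the standard normalizing constants (Tajima 1989)."""
    n = len(sfs) - 1
    s = seg_sites(sfs)
    a1, a2 = harmonic(n), harmonic(n, 2)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi(sfs) - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical unfolded spectrum for n = 6 chromosomes (illustrative counts only).
unfolded = [0, 10, 5, 3, 2, 1, 0]
folded = fold(unfolded)

for stat in (seg_sites, watterson_theta, pi, tajima_d):
    print(stat.__name__, stat(unfolded), stat(folded))
```

Each line printed should show the same value for the unfolded and folded spectra, up to floating-point rounding.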

Best,
Ryan

--
Ryan Gutenkunst
Assistant Professor of Molecular and Cellular Biology, University of Arizona
phone: (520) 626-0569, office: LSS 325, web: http://gutengroup.mcb.arizona.edu

Joshua Penalba

Sep 19, 2016, 8:01:34 AM
to dadi-user
Fantastic! That's a relief to hear. Thanks Ryan!