Format of the folded SFS for fastsimcoal

Martin Stoffel

Feb 5, 2018, 3:52:13 AM
to fastsimcoal
Dear fastsimcoal users and Laurent,

I am using RADseq data for demographic inference with fastsimcoal and am struggling to understand the exact format of the folded SFS.

When I have k diploid individuals (2k gene copies), the folded SFS has k+1 entries.
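
To illustrate where the k+1 comes from, here is a minimal Python sketch of folding a 1D SFS. It assumes diploid samples, so an unfolded SFS over n = 2k gene copies has n+1 entries; the function name `fold_sfs` is made up for this example and is not part of fastsimcoal or angsd.

```python
def fold_sfs(unfolded):
    """Fold an unfolded 1D SFS with n + 1 entries (n = number of gene copies).

    Hypothetical helper for illustration only.
    """
    n = len(unfolded) - 1
    folded = []
    for i in range(n // 2 + 1):
        j = n - i
        # Classes i and n - i are merged; the middle class (i == j, which
        # only exists when n is even) is counted once.
        folded.append(unfolded[i] if i == j else unfolded[i] + unfolded[j])
    return folded


# k = 3 diploid individuals -> n = 6 gene copies -> 7 unfolded entries
# (toy counts, not real data).
print(len(fold_sfs([10, 5, 3, 2, 1, 4, 6])))
```

With 7 unfolded entries the folded spectrum has 4 entries, i.e. k+1 for k = 3.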

However, since I have no information about the ancestral allele, I think the first entry of the SFS, d0_0, is not actually informative, as it only reflects the monomorphic sites in my data. It therefore makes sense to me to use angsd to filter for polymorphic sites and compute an SFS whose first entry is 0:

1 observations
d0_0 d0_1 d0_2 d0_3
0 82.04235 1490.948 557.6909 
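
For anyone wanting to generate a block like the one above programmatically, here is a rough Python sketch. It only reproduces the three-line layout shown in this post (an observations line, a header line, a values line); `format_obs_sfs` and its parameters are hypothetical names of my own, not part of fastsimcoal or angsd.

```python
def format_obs_sfs(counts, pop=0, drop_monomorphic=True):
    """Return a 1D SFS as text in the three-line layout shown above.

    Hypothetical helper: counts[0] is the monomorphic class and is
    written as 0 when drop_monomorphic is True.
    """
    counts = list(counts)
    if drop_monomorphic:
        counts[0] = 0
    # Header entries follow the d<pop>_<i> naming used in the post.
    header = " ".join(f"d{pop}_{i}" for i in range(len(counts)))
    values = " ".join(str(c) for c in counts)
    return f"1 observations\n{header}\n{values}\n"


# Values taken from the SFS in this post.
print(format_obs_sfs([0, 82.04235, 1490.948, 557.6909]), end="")
```

Setting the first entry to 0 here is just the manual approach described above, done in code.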


I then run fsc26 with the following command:

fsc26 -t nes.tpl -n 10000 -m -e nes.est -M -L 40 -q --foldedSFS

The --foldedSFS flag shouldn't make a difference for the 1D SFS, though.


My main question is whether the folded SFS can start with 0, and if not, why it makes sense to include monomorphic sites.

Thanks,
Martin

Martin Stoffel

Feb 15, 2018, 3:04:18 AM
to fastsimcoal
Found an answer:

fastsimcoal has two options for ignoring monomorphic sites and singletons, respectively: --removeZeroSFS and --nosingleton.

cheers