Dear fastsimcoal users and Laurent,
I am using RADseq data for demographic inference with fastsimcoal and am struggling to understand the exact format of the folded SFS.
With k individuals, the folded SFS has k+1 entries.
However, since I have no information about the ancestral allele, I think the first entry of the SFS, d0_0, is not informative, as it only reflects the number of monomorphic sites in my data. It therefore makes sense to me to use angsd to filter for polymorphic sites and compute an SFS whose first entry is 0:
1 observations
d0_0 d0_1 d0_2 d0_3
0 82.04235 1490.948 557.6909
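To make the folding and masking concrete, here is a minimal Python sketch. It is not the angsd pipeline itself: the function names are hypothetical, the input counts are made up for illustration, and the output simply mirrors the layout of the file shown above. Whether fsc26 expects exactly this layout for a folded SFS is part of the question, not something the sketch settles.

```python
def fold_sfs(unfolded):
    """Fold an unfolded 1D SFS with n+1 entries (n = sampled gene copies)."""
    n = len(unfolded) - 1
    folded = []
    for i in range(n // 2 + 1):
        j = n - i
        # Without the ancestral state, classes i and n-i cannot be told apart,
        # so their counts are pooled (the middle class is kept as is).
        folded.append(unfolded[i] if i == j else unfolded[i] + unfolded[j])
    return folded


def mask_monomorphic(folded):
    """Return a copy with the monomorphic class (entry 0) set to 0."""
    return [0] + list(folded[1:])


def format_obs(sfs, pop=0):
    """Format a 1D SFS in the three-line layout shown above."""
    header = " ".join(f"d{pop}_{i}" for i in range(len(sfs)))
    values = " ".join(str(x) for x in sfs)
    return f"1 observations\n{header}\n{values}"


if __name__ == "__main__":
    # Made-up unfolded counts for 3 diploid individuals (6 gene copies).
    unfolded = [9000, 40, 25, 15, 10, 6, 4]
    print(format_obs(mask_monomorphic(fold_sfs(unfolded))))
```

With 6 gene copies the folded spectrum has 4 entries (d0_0 to d0_3), which matches the k+1 length for k = 3 individuals if they are diploid.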
I then run fsc26 with the following command:
fsc26 -t nes.tpl -n 10000 -m -e nes.est -M -L 40 -q --foldedSFS
I assume that --foldedSFS shouldn't make a difference for the 1D SFS.
My main question is whether the folded SFS can start with 0, and if not, why it makes sense to include monomorphic sites.
Thanks,
Martin