Hi All,
Stacks will calculate Fst values from every polymorphic site between
every pair of populations. The "write_single_snp" option does not apply
to the Fst statistics (at least, we haven't applied it yet), as we
implemented that option only to support exports to file formats (like
Structure) that do not want linked SNPs.
The filters in populations that require a locus to be present in X
populations or Y individuals in a population are applied generally to
the dataset immediately after all the data is read from disk. This means
they will be applied to all downstream statistics including Fst.
However, these filters apply to a RAD locus itself, not to each SNP in a
RAD locus.
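
As a rough illustration of locus-level filtering (not the Stacks code itself), here is a minimal Python sketch. The names `locus_passes`, `min_pops` and `min_inds` are hypothetical, the data are made up, and the way the two thresholds combine is an assumption made only for this example. The point is that a locus is kept or dropped as a whole, and its SNPs simply follow it.

```python
def locus_passes(present, min_pops, min_inds):
    """present maps population name -> number of individuals with this locus.

    Assumption for this sketch: a population "counts" if the locus was found
    in at least min_inds of its individuals, and the locus is kept if at
    least min_pops populations count.
    """
    pops_ok = [pop for pop, n in present.items() if n >= min_inds]
    return len(pops_ok) >= min_pops


# Hypothetical loci: per-population presence counts plus SNP columns.
loci = {
    "locus_1": {"present": {"pop1": 8, "pop2": 9, "pop3": 2}, "snps": [12, 57]},
    "locus_2": {"present": {"pop1": 1, "pop2": 9, "pop3": 0}, "snps": [33]},
}

# The filter is evaluated once per locus, never per SNP.
kept = {name: rec for name, rec in loci.items()
        if locus_passes(rec["present"], min_pops=2, min_inds=5)}

# Every SNP of a kept locus survives; every SNP of a dropped locus is gone.
kept_snps = [(name, col) for name, rec in kept.items() for col in rec["snps"]]
print(kept_snps)
```
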
The Genepop data will be output for your entire dataset. So, if a site
is polymorphic between populations 1 and 2, but not between 2 and 3, it
will appear in Stacks' Fst calculation for 1 and 2, but not in the Fst
calculation for 2 and 3. However, it will be output for 1, 2, and 3 in
the Genepop output, which does not make any such distinction (the site
just has to be polymorphic in one of the populations).
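
The difference can be sketched in a few lines of Python. The allele counts below are hypothetical, and `is_polymorphic` is an illustrative helper, not a Stacks function. It treats a site as polymorphic for a set of populations if more than one allele is seen once their counts are pooled.

```python
from itertools import combinations


def is_polymorphic(*pop_counts):
    """True if more than one allele is observed across the given populations."""
    pooled = [sum(counts) for counts in zip(*pop_counts)]
    return sum(1 for c in pooled if c > 0) > 1


# Hypothetical counts of [allele A, allele B] at one site in three populations.
site = {1: [6, 4], 2: [10, 0], 3: [10, 0]}

# Per-pair view: the site only enters the comparisons where it varies.
in_pairwise_fst = {
    (a, b): is_polymorphic(site[a], site[b])
    for a, b in combinations(sorted(site), 2)
}

# Whole-dataset view: one decision for all populations at once.
in_genepop = is_polymorphic(*site.values())

print(in_pairwise_fst)  # the (2, 3) pair is the one that drops this site
print(in_genepop)
```
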
Finally, as you know, there are many ways to calculate Fst. Stacks
provides two methods, one described in Hohenlohe, Bassham, et al. 2010,
PLoS Genetics that involves sums of binomial coefficients and Pi, and a
second method, based on AMOVA, described in Weir's Genetic Data
Analysis II (which is currently the preferred method). I don't know what
Genepop uses,
but I would guess it is different, so you wouldn't expect Fst to match
exactly.
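
For a by-hand check of the first method, here is a minimal Python sketch of the formula as I read it in Hohenlohe et al. 2010: Fst = 1 - [sum_j C(n_j,2) * pi_j] / [pi_all * sum_j C(n_j,2)], with pi = 1 - sum_i C(n_i,2) / C(n,2) computed from allele counts. The function names are hypothetical, and this is not guaranteed to match Stacks' implementation term for term (Stacks may apply corrections or filters that are not reproduced here).

```python
from math import comb


def pi(allele_counts):
    """Diversity at one site: 1 - sum_i C(n_i, 2) / C(n, 2)."""
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    return 1.0 - sum(comb(c, 2) for c in allele_counts) / comb(n, 2)


def fst_pairwise(counts_a, counts_b):
    """Fst at one site for two populations, from per-population allele counts.

    Returns None when the site is not polymorphic across the pair
    (pi_all == 0), since the ratio is undefined there.
    """
    pops = (counts_a, counts_b)
    pooled = [a + b for a, b in zip(counts_a, counts_b)]
    pi_all = pi(pooled)
    if pi_all == 0:
        return None
    # Each population is weighted by C(n_j, 2), its number of allele pairs.
    weights = [comb(sum(c), 2) for c in pops]
    within = sum(w * pi(c) for w, c in zip(weights, pops))
    return 1.0 - within / (pi_all * sum(weights))


# Hypothetical counts: a fixed difference between the two populations.
print(fst_pairwise([10, 0], [0, 10]))  # 1.0
```

Feeding in the allele counts for one of your own loci and comparing the result against the Stacks output is the kind of check I have in mind.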
But this is an empirical question. The allele frequencies for a
particular locus can be observed directly in the web interface, and the
various Fst calculations can be made by hand in, say, R and checked
against the values in Stacks' Fst output. The populations program also
includes a
diagnostic flag, --log_fst_comp, which will cause the intermediate
values for the Fst calculation to be output to a file, so they can be
verified.
I think if you compare one or two loci from your data set that give
different Fst measures in Stacks and Genepop in this way it will become
clear why they are different.
Best,
julian
Thierry Gosselin wrote: