Hello all!
I am trying to understand how projecting down a one-population SFS deals with missing genotype information. From what I understand, when you project down, every SNP with calls for at least as many individuals as you project down to is used, and SNPs with fewer calls are discarded.
When I sum the SFS produced by projecting to the actual sample size, I get, as expected, the number of SNPs in our dataset that are called in all individuals. However, if I project down to 60% of the actual sample size, I get more SNPs, but not as many as when I count how many SNPs in our dataset have genotype calls for at least 60% of individuals. Is this expected? That is, are all sites with at least 60% calls still being used, even though the method doesn't necessarily produce an SFS that sums to that number of SNPs?
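To make the question concrete, here is a minimal sketch of projection on made-up data. It assumes projection works by hypergeometric sampling, i.e. each site is treated as a draw of n chromosomes without replacement from the chromosomes that were called. That is my understanding of the method, not any library's actual code; the function names and the toy site list are hypothetical, and counts are in chromosomes rather than individuals.

```python
from math import comb

def project_site(derived, called, n):
    """Project one site down to n chromosomes (hypothetical helper).

    Returns a list of n + 1 probabilities: entry j is the chance of seeing
    j derived alleles when n chromosomes are drawn without replacement from
    `called` chromosomes, of which `derived` carry the derived allele.
    Returns None when the site has too few calls and would be discarded.
    """
    if called < n:
        return None
    total = comb(called, n)
    return [comb(derived, j) * comb(called - derived, n - j) / total
            for j in range(n + 1)]

def project_sfs(sites, n):
    """Sum the per-site projections into one spectrum of length n + 1."""
    sfs = [0.0] * (n + 1)
    for derived, called in sites:
        contrib = project_site(derived, called, n)
        if contrib is None:
            continue  # fewer than n called chromosomes: site dropped
        for j, p in enumerate(contrib):
            sfs[j] += p
    return sfs

# Made-up toy data: (derived allele count, chromosomes with calls)
sites = [(1, 10), (3, 10), (2, 8), (1, 6), (4, 5)]
n = 6

sfs = project_sfs(sites, n)
usable = sum(1 for _, called in sites if called >= n)
total = sum(sfs)              # all bins, including 0 and n
segregating = sum(sfs[1:n])   # bins 1 .. n-1 only

print("usable sites:", usable)
print("sum over all bins:", total)
print("sum over segregating bins:", segregating)
```

In this toy example the full projected spectrum sums to the number of usable sites, but part of that total lands in the 0 and n bins, because a rare allele can be missed entirely when a site is subsampled. The sum over the segregating bins is therefore smaller than the count of usable SNPs.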
Thank you all for your help!
Zach