Dear Dr. Excoffier,

I hope this email finds you well. I am getting in touch because I have some questions about fastsimcoal (fsc).

I am currently trying to estimate parameters from SFS data derived from ddRAD-seq. My purpose is not only to estimate demographic parameters but also to select the best model. I saw a previous post (link) in the Google group, where you said pruning SNPs is necessary for model selection but problematic for parameter estimation.

Here are two questions about this post.

1) Why is SNP pruning problematic for model selection?
I read Excoffier et al. (2013, PLoS Genet), but I can't understand the reason. Software that generates the SFS, such as easySFS, assumes unlinked SNPs, so pruning SNPs seems necessary to reduce linkage disequilibrium.
The reasoning is that pruning is done by computing LD under the assumption that all individuals come from a single population in HW equilibrium. If you have some genetic structure, that structure will itself create LD, so a series of SNPs associated with high FST between populations will show high levels of LD and might be removed by pruning, even though they are informative about your structure.
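To make the point above concrete, here is a minimal sketch in Python (not part of fastsimcoal or easySFS). It computes the LD expected between two loci when samples from several populations are pooled, assuming the loci are in linkage equilibrium within each population. The function name `pooled_ld`, its parameters, and the allele frequencies in the example are hypothetical and chosen only for illustration.

```python
def pooled_ld(p_a, p_b, weights=None):
    """Expected LD (D, r^2) between loci A and B in a pooled sample.

    p_a[k], p_b[k]: allele frequencies at loci A and B in population k.
    weights[k]: share of the pooled sample from population k (default: equal).
    Assumption: within each population the two loci are independent,
    so the AB haplotype frequency there is simply p_a[k] * p_b[k].
    """
    n = len(p_a)
    if weights is None:
        weights = [1.0 / n] * n

    # Allele and haplotype frequencies in the pooled sample
    pa = sum(w * a for w, a in zip(weights, p_a))
    pb = sum(w * b for w, b in zip(weights, p_b))
    pab = sum(w * a * b for w, a, b in zip(weights, p_a, p_b))

    d = pab - pa * pb                      # LD coefficient D
    denom = pa * (1 - pa) * pb * (1 - pb)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


if __name__ == "__main__":
    # Two strongly differentiated populations (high FST at both loci)
    print(pooled_ld([0.9, 0.1], [0.9, 0.1]))
    # Same frequencies in both populations: no structure, no LD
    print(pooled_ld([0.5, 0.5], [0.3, 0.3]))
```

With strongly differentiated populations the pooled r^2 is substantial even though the loci are independent within each population, which is why an LD-based filter applied to the pooled sample would tend to discard exactly the high-FST SNPs.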
2) What is the best scheme to perform parameter estimation as well as model selection?
In my understanding, the best scheme is to select the optimal model WITHOUT pruning SNPs and to estimate parameters with pruned SNPs. Yet this seems very time-consuming. Could you tell me a more efficient way, if there is one?

Best regards,
Yasuto