Dear Laurent and all,
I want to use fastsimcoal2 to infer the demography of five bacterial populations. I have been following the ANGSD manual to generate the observed SFS and for the related procedures, and I would greatly appreciate your expertise in clarifying a few points.
1. Generation of observed SFS
Following the ANGSD manual, I first generated the SFS for each population, using an outgroup as the reference genome. I then created a 5D-SFS with the following command:
`realSFS -P 4 C1.ref.saf.idx C2.ref.saf.idx C3.ref.saf.idx C4.ref.saf.idx C5.ref.saf.idx > 5D.sfs`
Could you kindly confirm whether the resulting 5D.sfs file is valid for use with the --multiSFS option?
2. Sample sizes for haploid bacteria
I have different numbers of genomes for populations C1 to C5 (35, 7, 13, 3, and 2, respectively). Since the genomes are haploid, is it appropriate to enter these numbers in the .tpl file? If so, should I use NPOPi/2 to represent the estimated effective population size?
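To make questions 1 and 2 more concrete, here is a small Python sketch of the sanity check I have in mind. It assumes that a joint SFS has (k + 1) categories per population, where k is the number of sampled chromosomes, and it compares the case where each genome counts as one chromosome with the case where each genome is treated as a diploid individual. Which of the two applies to my 5D.sfs is exactly what I am unsure about, and the helper `count_values` is only a hypothetical way of counting the entries in the file.

```python
from math import prod

# Number of genomes per population, C1..C5, as listed above.
genome_counts = [35, 7, 13, 3, 2]


def joint_sfs_entries(chrom_counts):
    """Number of cells in a joint SFS with (k + 1) categories per population."""
    return prod(k + 1 for k in chrom_counts)


def count_values(path):
    """Hypothetical helper: count whitespace-separated values in an SFS file."""
    with open(path) as handle:
        return len(handle.read().split())


# Case 1: each haploid genome contributes one chromosome.
entries_haploid = joint_sfs_entries(genome_counts)

# Case 2 (assumption): each genome is treated as a diploid individual.
entries_diploid = joint_sfs_entries([2 * n for n in genome_counts])

print(entries_haploid)  # 48384
print(entries_diploid)  # 1006425
```

Comparing `count_values("5D.sfs")` with these two numbers should show which sample sizes the file actually corresponds to.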
3. The split order of populations
Because of the high recombination rates, it is difficult to establish the population split order from a phylogenomic tree. I would therefore like to explore alternative topologies to find the one that best fits my data. My guess is that alternative topologies can be specified in the 'historical events' section of the .tpl file. Could you offer any examples or guidelines on how to approach this?
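For scale, here is a rough Python count of how many alternative models this could involve. It assumes strictly bifurcating splits with no admixture, and uses the standard formulas (2n-3)!! for rooted labeled topologies and n!(n-1)!/2^(n-1) for ranked histories, where the order of the split times also matters.

```python
from math import factorial


def rooted_topologies(n):
    """Rooted, labeled, bifurcating topologies for n populations: (2n-3)!!."""
    count = 1
    for odd in range(3, 2 * n - 2, 2):
        count *= odd
    return count


def ranked_histories(n):
    """Topologies with ordered split times: n! * (n-1)! / 2^(n-1)."""
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)


n_pops = 5
print(rooted_topologies(n_pops))  # 105
print(ranked_histories(n_pops))   # 180
```

Testing all of these exhaustively seems impractical, so any advice on how to narrow down the candidate topologies would be very welcome.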
Your insights and assistance on these matters are highly appreciated.
Thank you very much for your time and support.
Warm regards,
Xiaojun