Dear all,
I am studying the role of gene flow in the speciation of two sister species. Polyploidy has been suggested for both species, and chromosome counts suggest they are decaploid (although this is uncertain, as the sister genus is not known). I am working with a non-model group, so I don’t have a reference genome, and I used de novo assembly of RADseq loci to generate a VCF file. I have several population pairs, each consisting of two populations (one from each species) with 10 individuals per population.
For my demographic analyses I followed D. Portik's pipeline to evaluate the fit of ‘no migration’, ‘ancient migration’ and ‘secondary contact’ models. For one of the population pairs, I ran the analysis on a diploidized dataset and also on a dataset that accounts for polyploidy.
For the polyploid dataset, I used the polyRAD pipeline to call SNPs assuming my species are autodecaploid (based on phylogenetic and genetic structure analyses that do not suggest the existence of two subgenomes). Then I applied quality filters to each dataset (diploidized and decaploid) for minor allele frequency, missing data and depth of coverage. I ran the ‘standard’ dadi approach for each dataset, but for the decaploid dataset I accounted for the larger sample size (10 allele copies per individual instead of 2).
The diploidized dataset contains about 11800 SNPs, with a maximum of about 10500 segregating sites. The polyploid dataset contains about 5600 SNPs; projecting down to 50 (in each population) I get 1703 segregating sites, versus 1706 without projection. I chose to project down to 50 to save computational resources, since the number of segregating sites hardly changes.
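To make the projection step concrete, here is a minimal pure-Python sketch of hypergeometric down-projection for a one-dimensional SFS, which is the idea behind projecting a spectrum to a smaller sample size. It is an illustration only, not dadi's own code: the function names and the toy spectrum are hypothetical, and the real data here are a two-population joint SFS, where the same weighting would be applied along each axis.

```python
from math import comb


def project_sfs(sfs, n_to):
    """Project a 1D SFS from n_from = len(sfs) - 1 chromosomes down to n_to.

    sfs[i] is the number of sites where the counted allele is seen i times
    among n_from sampled chromosomes. Each entry is redistributed over the
    smaller sample using hypergeometric sampling probabilities.
    """
    n_from = len(sfs) - 1
    if n_to > n_from:
        raise ValueError("cannot project up to a larger sample size")
    projected = [0.0] * (n_to + 1)
    for i, count in enumerate(sfs):
        if count == 0:
            continue
        # j copies can be drawn only if enough of each allele is available
        lowest = max(0, n_to - (n_from - i))
        highest = min(i, n_to)
        for j in range(lowest, highest + 1):
            weight = comb(i, j) * comb(n_from - i, n_to - j) / comb(n_from, n_to)
            projected[j] += count * weight
    return projected


def segregating_sites(sfs):
    """Sum of the polymorphic entries (excludes the two fixed corner bins)."""
    return sum(sfs[1:-1])


# Hypothetical toy spectrum: 4 chromosomes, 6 singleton sites.
toy_sfs = [0, 6, 0, 0, 0]
toy_projected = project_sfs(toy_sfs, 2)  # [3.0, 3.0, 0.0]
```

In this toy case half of the singletons fall into the monomorphic bin after projection, so the number of segregating sites drops from 6 to 3 while the total number of sites is preserved. How much is lost in practice depends on how many rare variants the unprojected spectrum contains.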
Based on AIC, both the diploidized dataset and the decaploid dataset projected down to 50 support the ‘symmetric migration’ model.
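For readers less familiar with the comparison, the following sketch shows how an AIC ranking of this kind is computed, using AIC = 2k - 2 ln L. The model names, log-likelihoods and parameter counts are made-up placeholders for illustration, not results from this analysis.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) for each model.
fits = {
    "no_mig": (-1250.0, 3),
    "sym_mig": (-1200.0, 4),
    "anc_sym_mig": (-1230.0, 5),
    "sec_contact_sym_mig": (-1225.0, 5),
}

scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
best = min(scores, key=scores.get)
# Delta AIC relative to the best model; the best model has delta 0.
delta = {name: score - scores[best] for name, score in scores.items()}

for name in sorted(delta, key=delta.get):
    print(f"{name}: AIC={scores[name]:.1f}, dAIC={delta[name]:.1f}")
```

Because the likelihood dadi reports for linked SNPs is a composite likelihood, AIC differences computed this way are best read as a relative ranking rather than as formal support values.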
With this in mind, I would like to raise some concerns about my analyses:
I am unsure whether using the ‘standard’ dadi approach for an autopolyploid dataset is fine, or whether it violates some assumptions of the method.
Also, since the SNP calling already involves some uncertainty, I would like to know whether my SFS for the polyploid dataset looks correct. It is quite different from the SFS of the diploidized dataset and from those I have seen from other people and in publications.
I enclose a decaploid SFS plot projected down to 50, one without any projection, and the diploidized SFS plot.
Would you have any opinion on how to interpret the SFS? Would you have any recommendations on whether or not to project down in the case of the decaploid dataset?
Any advice is greatly appreciated. Thank you
Juan Manuel
decaploid SFS plot without any projection
decaploid SFS plot projecting down to 50
diploidized SFS plot