Dear Ryan and dadi users,
I am working on cryptic mosquito species and trying to find the best demographic model for each population. My populations range from 12 to 72 individuals. Other indices, such as a negative genome-wide Tajima’s D and a skewed allele frequency spectrum, strongly suggest that the different populations are expanding, so I was expecting to find a model that confirms this trend.
I have tried all the models implemented in dadi.Demographics1D with different initial parameters, and I have several questions regarding segregating sites, model selection and parameter interpretation. I have read almost all the related topics on this forum but did not find answers to my questions.
1. Segregating sites: In the manual you indicate that “As a rule of thumb, we often choose our projection to maximize the number of segregating sites in our final fs (assessed via fs.S()), although we have not formally tested whether this maximizes statistical power.” I have tried different sample sizes; the number of segregating sites varies a little, but it does not seem to affect the model selection. Is it better to choose, for each population, the sample size that maximizes the number of segregating sites, or can I project down to the same sample size for almost all the populations? For the group of 72 individuals, the best fs.S() is for a sample size of 104. I have never seen such a high sample size in the analyses I have found. Does it sound correct?
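To make the rule of thumb quoted above concrete, here is a minimal sketch of why the expected number of segregating sites can peak at an intermediate projection size when some sites have missing calls. It assumes that a site is dropped when fewer chromosomes were called than the projection size, and that the remaining sites are projected hypergeometrically. The site list is invented toy data, and the helper names (prob_segregating, expected_segregating_sites, best_projection) are hypothetical, not part of dadi.

```python
from math import comb

def prob_segregating(n_called, derived, m):
    """Probability that a site is still polymorphic after projecting
    from n_called chromosomes down to m chromosomes."""
    if m > n_called:
        return 0.0  # assumption: sites with too few calls are dropped
    total = comb(n_called, m)
    p_all_ancestral = comb(n_called - derived, m) / total
    p_all_derived = comb(derived, m) / total
    return 1.0 - p_all_ancestral - p_all_derived

def expected_segregating_sites(sites, m):
    """Expected number of segregating sites at projection size m.
    sites is a list of (chromosomes called, derived allele count)."""
    return sum(prob_segregating(n, k, m) for n, k in sites)

def best_projection(sites, candidates):
    """Candidate projection size with the most expected segregating sites."""
    return max(candidates, key=lambda m: expected_segregating_sites(sites, m))

# Invented toy data: 20 singletons called in 10 chromosomes,
# 100 singletons called in only 6 chromosomes.
sites = [(10, 1)] * 20 + [(6, 1)] * 100

for m in range(2, 11, 2):
    print(m, round(expected_segregating_sites(sites, m), 2))
print("best projection:", best_projection(sites, range(2, 11, 2)))
```

In this toy case, projecting to the full sample size discards every site with missing calls, while projecting far down loses rare variants, so the maximum falls in between.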
2. Model selection: For some populations, all the models implemented in dadi.Demographics1D give about the same log-likelihood (LL). The AIC values are also very close. But when I look at the residual plots (see document attached), they seem to show a trend for all the models except one (bottlegrowth). Can I choose this model based on its LL (the highest) and its residual plot, and then run simulations with ms under this model?
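For reference, the AIC comparison described above can be sketched as follows, using AIC = 2k - 2LL. The LL values are invented placeholders, not results from this analysis. The parameter counts assume the usual dadi 1D model signatures (two_epoch and growth: nu, T; bottlegrowth: nuB, nuF, T; three_epoch: nuB, nuF, TB, TF) and leave out theta, which is shared by all models.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical (LL, number of free parameters) per model.
# These numbers are placeholders for illustration only.
fits = {
    "two_epoch": (-1502.3, 2),
    "growth": (-1502.9, 2),
    "bottlegrowth": (-1501.8, 3),
    "three_epoch": (-1501.7, 4),
}

aic_by_model = {name: aic(ll, k) for name, (ll, k) in fits.items()}
best = min(aic_by_model.values())

# Print models from lowest to highest AIC with the difference to the best.
for name, value in sorted(aic_by_model.items(), key=lambda item: item[1]):
    print(f"{name:12s} AIC = {value:8.1f}  delta = {value - best:5.1f}")
```

With placeholder values this close together, the model with the highest LL is not necessarily the one with the lowest AIC, because each extra parameter adds 2 to the AIC.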
3. Parameters: If I choose the bottlegrowth model, how should I interpret the parameters? nuB is equal to 5.7 and nuF is equal to 1.5. My understanding is that the population had an instantaneous size change in which the population size was multiplied by 5.7. Then the population started to grow exponentially, increasing the population size again by 1.5 times the ancestral population size. Does this make sense? I have seen a recent publication with the same kind of parameters for a bottlegrowth model: the first parameter, nuB, was > 1, meaning expansion, and the nuF parameter was < 1, corresponding to a bottleneck. To me the model looked more like a growth-bottleneck model. Is it possible to interpret the parameters in this manner?
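A minimal sketch of the size trajectory these values imply, assuming the bottlegrowth model is defined as an instantaneous change to nuB times the ancestral size followed by an exponential change that reaches nuF times the ancestral size after time T. The duration T is not given above, so the value used here is a placeholder, and the function name bottlegrowth_size is hypothetical.

```python
import math

def bottlegrowth_size(t, nuB, nuF, T):
    """Population size relative to the ancestral size, t time units after
    the instantaneous change (0 <= t <= T), under exponential change
    from nuB at t = 0 to nuF at t = T."""
    return nuB * math.exp(math.log(nuF / nuB) * t / T)

nuB, nuF = 5.7, 1.5  # values quoted in the question
T = 0.1              # placeholder duration, not taken from the question

# Relative size at a few time points between the change and the present.
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    size = bottlegrowth_size(fraction * T, nuB, nuF, T)
    print(f"t = {fraction:.2f} * T  relative size = {size:.2f}")
```

Under this assumption, both nuB and nuF are ratios to the ancestral size, so nuB = 5.7 with nuF = 1.5 would describe a jump to 5.7 times the ancestral size followed by an exponential decline to 1.5 times the ancestral size.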
Sorry for the long email, and thank you very much for your help and for this forum, which helps me a lot with dadi,
Best,
Caroline