Hi Ryan,
I am running three 2D models:
- no migration
- symmetric migration
- secondary contact
The optimization scheme consists of 3 rounds. The first two rounds have 25 replicates each, and the last has 50. Each replicate can run for up to 25 iterations, and the starting parameter values in rounds 1, 2, and 3 are generated by 3-, 2-, and 1-fold perturbations, respectively.
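To make the scheme concrete, here is a minimal Python sketch of the settings described above. It does not call the optimizer. The names `SCHEME`, `perturb`, and `starting_points` are hypothetical, and the perturbation rule (scaling each parameter by a random factor between 2^-fold and 2^fold) is an assumption about how the fold perturbation behaves, not a copy of the actual implementation.

```python
import numpy as np

# One entry per round: (replicates, max iterations per replicate, fold).
# Values are taken from the description of the scheme above.
SCHEME = [
    (25, 25, 3),
    (25, 25, 2),
    (50, 25, 1),
]

def perturb(params, fold, rng):
    """Scale each parameter by a random factor between 2**-fold and 2**fold.

    Assumption: this reflects my understanding of a "fold" perturbation;
    the real routine may differ in detail.
    """
    params = np.asarray(params, dtype=float)
    exponents = fold * (2.0 * rng.random(params.shape) - 1.0)
    return params * 2.0 ** exponents

def starting_points(initial, scheme=SCHEME, seed=0):
    """Return one list of perturbed starting vectors per round.

    Simplification: every round perturbs the same `initial` vector here,
    whereas in the real runs later rounds start from the best parameters
    of the previous round.
    """
    rng = np.random.default_rng(seed)
    return [
        [perturb(initial, fold, rng) for _ in range(reps)]
        for reps, _max_iter, fold in scheme
    ]
```

With these settings there are 100 replicates in total, and the starting values become progressively less dispersed as the fold decreases from 3 to 1.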
Based on the AIC, the secondary contact scenario is the best fitting model. However, when visually comparing the observed and the simulated SFS, I am not sure I am satisfied with these results. The simulated SFS under different models look rather similar, and even the secondary contact model fails to explain very prominent features in the observed SFS. What is particularly intriguing to me is the relatively high abundance of alleles at medium frequencies (visible as a bump at the center of the SFS and in the residual plot). This is also visible in the 1D frequency spectra of each population.
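For reference, this is how I understand the AIC comparison, written as a short Python sketch. The function names are hypothetical, and the log-likelihoods and parameter counts have to come from the best replicate of each model. The sketch uses the standard definition AIC = 2k - 2 ln L.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood

def delta_aic(aic_by_model):
    """Difference between each model's AIC and the lowest AIC in the set."""
    best = min(aic_by_model.values())
    return {name: value - best for name, value in aic_by_model.items()}
```

Since the AIC only ranks the candidate models against each other, the model with a delta of 0 can still fit the observed SFS poorly in absolute terms, which is why I am also looking at the residuals.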
To improve the fit of these models, I considered running the optimization for longer, but I would be surprised if this resolved the excess of middle-frequency variants in the residual plots, particularly because the top five replicates seem to broadly converge. Hence, I think a better way forward would be to test additional models, although I wanted to keep the models as simple as possible.
Do you agree that the fit is relatively poor and that the best way to improve it is by testing more complex models? If so, are there any particular demographic events that you would add (such as bottlenecks)?
Thank you very much in advance for your help and for your guidance!