Hi all,
I am struggling to get convergence in my Stacey analysis. I have two datasets, one with 63 specimens and one with 27, each with four markers (two nuclear and two mitochondrial; I have linked the trees for the mitochondrial markers). For both I have set the substitution models to HKY+G and used a strict clock to simplify the model. After 2 billion generations, the ESS values for the first dataset (63 specimens) are getting close to 100 for all parameters (some are much higher), but for the second dataset (27 specimens) some ESS values are still below 50 after 2 billion generations. I am running several independent runs of 500 million generations each and looking at the combined log files in Tracer.
Several questions:
1 - How high do the ESS values need to be for me to trust the tree? I am only interested in tree topology and species delimitation. Which parameters are most important to have high ESS values for?
2 - For the smallest dataset, the ESS values from the different runs do not seem to be "adding up". In each of the four runs (500 million generations each), the ESS for a given parameter is between 20 and 40, but when I look at the combined value in Tracer it is only 45. Why is this?
3 - If the run has not converged by 2 billion generations, will it never converge?
4 - I have some missing data in my dataset. Could it help the analysis if I remove some of the specimens with missing data? That would mean less data in total but fewer question marks in the matrix. Which is more important?
5 - Are there ways I can further simplify the model? I have two partitions for COI (1 and 2 codon partition). Should I link these (their trees are already linked, both to each other and to the other mitochondrial marker)? PartitionFinder suggested keeping them as separate partitions. How do I know if I have simplified too much?
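To make question 2 more concrete, here is a small Python sketch of what I mean by the ESS not "adding up". It uses simulated traces, not my actual log files, and the ESS estimator is a simple autocorrelation-based one that I wrote for illustration, so it will not necessarily match the numbers Tracer reports. All names in it (simulate_trace, ess, the offsets) are made up for this example. It compares four runs that sample around the same value with the same four runs shifted so that each sits at a different value:

```python
import math
import random


def simulate_trace(n, phi, seed):
    """Simulate an autocorrelated trace (AR(1) process) centred on 0."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - phi * phi))
    trace = []
    for _ in range(n):
        trace.append(x)
        x = phi * x + rng.gauss(0.0, 1.0)
    return trace


def autocov(centered, lag):
    """Autocovariance of a mean-centred trace at the given lag."""
    n = len(centered)
    return sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n


def ess(trace):
    """Rough ESS: n divided by the estimated autocorrelation time.

    Autocorrelations are summed in pairs of lags, stopping when a pair
    sums to zero or less. This is an illustrative estimator only.
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [v - mean for v in trace]
    var = autocov(centered, 0)
    if var == 0.0:
        return float(n)
    tau = 1.0
    lag = 1
    while lag + 1 <= n // 2:
        pair = (autocov(centered, lag) + autocov(centered, lag + 1)) / var
        if pair <= 0.0:
            break
        tau += 2.0 * pair
        lag += 2
    return n / tau


# Four hypothetical runs of 500 samples each, all sampling around 0.
runs_same = [simulate_trace(500, 0.9, seed) for seed in range(4)]

# The same runs, but each shifted to a different value, as if every run
# were stuck in its own region of parameter space (offsets are arbitrary).
offsets = [0.0, 10.0, 20.0, 30.0]
runs_diff = [[v + off for v in run] for run, off in zip(runs_same, offsets)]

per_run = [ess(run) for run in runs_diff]
combined_same = ess([v for run in runs_same for v in run])
combined_diff = ess([v for run in runs_diff for v in run])

print("per-run ESS:", [round(e, 1) for e in per_run])
print("sum of per-run ESS:", round(sum(per_run), 1))
print("combined ESS, runs agree:", round(combined_same, 1))
print("combined ESS, runs disagree:", round(combined_diff, 1))
```

In this toy case the per-run ESS is the same in both scenarios, because shifting a trace does not change its autocorrelation, but the combined ESS is far below the sum of the per-run values when the runs sit at different values. I do not know whether this is what is happening in my analysis, but it is the kind of behaviour I am asking about.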
Thanks in advance for any input/ideas!
Mari