Dear BEAST users,
I have a methodological/philosophical question regarding model choice and convergence.
I'm currently running a rather large analysis of 200 taxa and 6 genes, split into 14 gene partitions. I used Modeltest to estimate a substitution model for each partition. After running the analysis for different chain lengths (>100 million x 8 runs), all but two partitions reach convergence. I specified a GTR model for these two partitions, and the particular parameters that fail to converge include .ag, .cg, .freq, pinv, etc., with ESS values between 100 and 200, and several below 100. Everything else converges for both the independent and the combined runs: same tree topology, same divergence dates, same everything.
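For anyone who wants to check ESS values like these outside of Tracer, here is a minimal Python sketch of a generic autocorrelation-based ESS estimate. It is not necessarily the exact algorithm Tracer uses, and the function names `effective_sample_size` and `ar1_trace` are made up for illustration. The synthetic AR(1) trace only stands in for a poorly mixing parameter; in practice the input would be one post-burn-in column from the BEAST log file.

```python
import random


def effective_sample_size(trace):
    """Estimate ESS as N / (1 + 2 * sum of autocorrelations).

    The sum is truncated at the first non-positive autocorrelation,
    which is a simple, commonly used cutoff (an assumption here).
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered) / n
    if var == 0:
        return float(n)  # constant trace: nothing to estimate
    tau = 1.0  # integrated autocorrelation time
    for lag in range(1, n // 2):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0:
            break
        tau += 2 * rho
    return n / tau


def ar1_trace(n, phi, seed=0):
    """Hypothetical stand-in for a logged parameter: an AR(1) series."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out


if __name__ == "__main__":
    well_mixed = ar1_trace(2000, phi=0.0)
    sticky = ar1_trace(2000, phi=0.95)
    print("ESS, uncorrelated trace:", round(effective_sample_size(well_mixed)))
    print("ESS, autocorrelated trace:", round(effective_sample_size(sticky)))
```

The strongly autocorrelated trace gets a much lower ESS than the uncorrelated one of the same length, which is the pattern the GTR parameters above are showing.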
My question is whether it is better to purposely misspecify a simpler model (e.g. HKY instead of GTR) in order to reach convergence, or to keep the empirically determined, parameter-rich model even though convergence is poor (the website suggests that convergence on every parameter may not be necessary). Preliminary runs under simpler models change the divergence estimates by a few hundred thousand years (this is a Hawaiian system, so that's not totally trivial). Given this result, I'm not sure which choice is more prudent.
I have tried playing around with different priors on the GTR parameters, but this either hasn't changed much relative to previous runs or the run fails to initialize altogether.
Any insights are greatly appreciated!
Thanks,
-Gordon