Higher posterior probabilities with an incorrect substitution model?


Rutger Wilschut

Jun 20, 2012, 4:11:52 AM
to beast-users
Hello everybody,

When I run jModelTest on my chloroplast dataset of the genus
Mosannona, GTR+G is selected as the best-fitting substitution
model. Accidentally, I performed BEAST analyses with a GTR model
without gamma. Remarkably, these analyses show a considerably higher
posterior support value (0.91 instead of 0.74) for one of the basal
nodes of interest than the same analyses with a GTR+G model.

jModelTest gives a -lnL of 11314.459 for GTR+G and 11362.524 for
GTR.

My question is which is more likely: that the analyses with the
GTR model produce an incorrectly high support value, or that jModelTest
fails to find the model that best describes the data.

Has anyone had a similar experience in BEAST? Comments are welcome!

Kind regards,

Rutger

Alexei Drummond

Jun 20, 2012, 4:42:30 AM
to beast...@googlegroups.com
Dear Rutger,

This is not unexpected. Models with fewer parameters generally admit less uncertainty, so if you use a simpler model within a set of nested models (e.g. GTR instead of GTR+G) you will often obtain estimates that have less uncertainty associated with them. If you had tried Jukes-Cantor you would probably have gotten even higher posterior probabilities for the clades in the majority consensus tree. The posterior probabilities reported in each analysis are conditional on the model being correct. So the tradeoff is that you may get higher support for the clades estimated by less parameter-rich models, but those clades are more likely to be wrong if the simpler model is further from the true evolutionary process. You will often be trading accuracy for precision.

In your case, if you have the same topology then this may not be a problem, but you should also consider that the estimates of the divergence times are also conditional on the model chosen. The best approach would be to actually admit that you don't know what model to use and average over the substitution model space as well. Without that option in BEAST 1, a formal model selection approach like jModelTest is a much better idea than choosing the model that gives you the highest posterior probabilities for the clades.

Cheers
Alexei
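
For readers who want to see what the formal model selection step amounts to here, below is a minimal sketch (not taken from jModelTest itself) that compares the two models using the likelihoods quoted in the question. It assumes the reported values read as -lnL = 11314.459 for GTR+G and 11362.524 for GTR, and that GTR+G differs from GTR by exactly one free parameter (the gamma shape). The function name delta_aic is hypothetical.

```python
def delta_aic(neg_lnl_simple, neg_lnl_rich, extra_params):
    """Return AIC(simple) - AIC(rich) for two nested models.

    AIC = 2k + 2 * (-lnL), so only the difference in parameter count
    between the two models matters. A positive result favours the
    richer model.
    """
    return 2.0 * (neg_lnl_simple - neg_lnl_rich) - 2.0 * extra_params


# Values as read from the original post (assumed interpretation of the
# separators): -lnL for GTR and for GTR+G.
neg_lnl_gtr = 11362.524
neg_lnl_gtr_g = 11314.459

# Twice the log-likelihood difference (the likelihood-ratio statistic).
lr_statistic = 2.0 * (neg_lnl_gtr - neg_lnl_gtr_g)

# GTR+G adds one parameter (the gamma shape) to GTR.
aic_gain = delta_aic(neg_lnl_gtr, neg_lnl_gtr_g, extra_params=1)

print(f"2 * delta lnL = {lr_statistic:.2f}")
print(f"AIC(GTR) - AIC(GTR+G) = {aic_gain:.2f}")
```

Under those assumptions the AIC difference comes out at roughly 94 in favour of GTR+G, which is a large margin for a single extra parameter. This only ranks the two models against each other; it says nothing about which clade support value is closer to the truth.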