Compare tree phylogenies calculated using different models

Michal Malszycki

Jun 17, 2019, 8:40:08 AM
to IQ-TREE
Hello,

I have two trees calculated using different models (LG+I+R15 and LG+C20+G). The C20 consensus tree has a higher log-likelihood (-702686.401 compared to -705517.838), even though ModelFinder showed a much better fit for the R15 model.
I would like to determine whether this difference is meaningful. How could I do it?
I guess I could calculate AIC by hand, and it would be a good indicator, but for this I need the number of parameters used in these models. How could I find out the number of parameters?

Thank you in advance,
Sincerely
Michal Malszycki

Minh Bui

Jun 25, 2019, 11:15:17 AM
to IQ-TREE, Michal Malszycki
Hi Michal,

On 17 Jun 2019, at 8:40 am, Michal Malszycki <michal.m...@gmail.com> wrote:

> Hello,
>
> I have two trees calculated using different models (LG+I+R15 and LG+C20+G). The C20 consensus tree has a higher log-likelihood (-702686.401 compared to -705517.838), even though ModelFinder showed a much better fit for the R15 model.

Did you include LG+C20+G model in the model selection, or are they the log-likelihoods from two different runs?

But given this huge gain in log-likelihood, I’d use the LG+C20+G model.

> I would like to determine whether this difference is meaningful. How could I do it?

You can look at the AIC/BIC scores of the two models. The model with the lower BIC score should be preferred. In the literature, a BIC difference of >10 units can be considered “significant”.
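
As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function name `compare_bic` and the two example scores are hypothetical placeholders and are not values from this thread; the threshold of 10 is the one mentioned above.

```python
def compare_bic(name_a, bic_a, name_b, bic_b, threshold=10.0):
    """Return (preferred_model, delta_bic, exceeds_threshold).

    The model with the lower BIC is preferred; the difference is only
    flagged as clear-cut when it is larger than `threshold` units.
    """
    delta = abs(bic_a - bic_b)
    preferred = name_a if bic_a < bic_b else name_b
    return preferred, delta, delta > threshold


# Placeholder scores for illustration only, NOT taken from the thread.
preferred, delta, clear = compare_bic("model_A", 1520.4, "model_B", 1498.9)
print(f"{preferred} preferred, delta BIC = {delta:.1f}, above threshold: {clear}")
```

Both scores should come from fits to the same alignment, otherwise the comparison is not meaningful.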

> I guess I could calculate AIC by hand, and it would be a good indicator, but for this I need the number of parameters used in these models. How could I find out the number of parameters?

Ah OK, you already know the answer. You can look at the .iqtree file. It reports the log-likelihood, the number of parameters and even AIC, AICc and BIC. Something like this:

Log-likelihood of the tree: -22669.6911 (s.e. 344.6960)
Unconstrained log-likelihood (without tree): -11402.5956
Number of free parameters (#branches + #model parameters): 43
Akaike information criterion (AIC) score: 45425.3822
Corrected Akaike information criterion (AICc) score: 45427.3187
Bayesian information criterion (BIC) score: 45666.1779
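
For anyone who wants to check these numbers by hand, the sketch below parses the six lines above and recomputes the scores with the standard formulas AIC = 2k - 2lnL, AICc = AIC + 2k(k+1)/(n-k-1) and BIC = k ln(n) - 2lnL. It assumes that the sample size n is the number of alignment sites; the report excerpt does not state n, so the code infers it from the gap between AIC and AICc. The function names are illustrative and are not part of IQ-TREE.

```python
import math
import re

# The report excerpt quoted above, copied verbatim.
REPORT = """\
Log-likelihood of the tree: -22669.6911 (s.e. 344.6960)
Unconstrained log-likelihood (without tree): -11402.5956
Number of free parameters (#branches + #model parameters): 43
Akaike information criterion (AIC) score: 45425.3822
Corrected Akaike information criterion (AICc) score: 45427.3187
Bayesian information criterion (BIC) score: 45666.1779
"""


def parse_report(text):
    """Extract lnL, parameter count and the three scores as floats."""
    patterns = {
        "lnL": r"Log-likelihood of the tree:\s*(-?[\d.]+)",
        "k": r"Number of free parameters.*:\s*(\d+)",
        "AIC": r"\(AIC\) score:\s*([\d.]+)",
        "AICc": r"\(AICc\) score:\s*([\d.]+)",
        "BIC": r"\(BIC\) score:\s*([\d.]+)",
    }
    return {name: float(re.search(p, text).group(1)) for name, p in patterns.items()}


def aic(lnl, k):
    return 2 * k - 2 * lnl


def aicc(lnl, k, n):
    return aic(lnl, k) + 2 * k * (k + 1) / (n - k - 1)


def bic(lnl, k, n):
    return k * math.log(n) - 2 * lnl


def implied_sample_size(aic_score, aicc_score, k):
    """Solve the AICc correction term for n (assumption: n = alignment sites)."""
    return round(k + 1 + 2 * k * (k + 1) / (aicc_score - aic_score))


r = parse_report(REPORT)
k = int(r["k"])
n = implied_sample_size(r["AIC"], r["AICc"], k)
print("implied n:", n)
print("AIC :", round(aic(r["lnL"], k), 4), "reported", r["AIC"])
print("AICc:", round(aicc(r["lnL"], k, n), 4), "reported", r["AICc"])
print("BIC :", round(bic(r["lnL"], k, n), 4), "reported", r["BIC"])
```

To apply this to a real run, replace `REPORT` with the contents of the run's .iqtree file; the regular expressions assume the line wording shown in the excerpt.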

Cheers,
Minh



Michał Małszycki

Jun 26, 2019, 5:52:07 AM
to Minh Bui, IQ-TREE
Thank you for the answer. I now know everything I need.
To answer your question: in my case ModelFinder found -lnL of 711875.431 for R10 and 731206.120 for C20 (with the information criteria strongly favouring R10). The dataset consisted of 4100 sequences of a single gene (260 amino acids after pruning). In the end, C20 produced a much more complicated phylogeny with lower node support (probably because the alignment was too short).
Sincerely,
Michał Małszycki

Minh Bui

Jun 26, 2019, 11:40:37 PM
to Michał Małszycki, IQ-TREE
Ah ok, I didn’t know that you have only one short gene. In that case, please do not use C20 or other CAT models, which are only designed for phylogenomic alignments. For short alignments, such models are over-parameterised.
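
To give a sense of scale for this dataset, the sketch below counts only the branch-length parameters, which the .iqtree report includes in its "number of free parameters". It assumes a fully resolved, unrooted tree (2n - 3 branches for n sequences) and uses the 4100 sequences and 260 sites mentioned earlier in the thread. It says nothing about the C20 model parameters themselves; it only illustrates how few sites there are per estimated quantity.

```python
def unrooted_branch_count(n_taxa):
    """Number of branches in a fully resolved unrooted tree: 2n - 3."""
    if n_taxa < 3:
        raise ValueError("an unrooted binary tree needs at least 3 taxa")
    return 2 * n_taxa - 3


n_sequences = 4100  # from the thread
n_sites = 260       # amino-acid columns, from the thread

branches = unrooted_branch_count(n_sequences)
print("branch lengths to estimate:", branches)
print("branch lengths per alignment site:", round(branches / n_sites, 1))
```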

Cheers,
Minh