degrees of freedom and testing mixture model fit


Anthony Redmond

Mar 7, 2018, 11:11:11 AM
to IQ-TREE
Hi Minh and others,

Unless I am mistaken, when comparing mixture models, some more complex models have the same number of degrees of freedom as simpler ones: e.g. LG+C60 and LG+C20 have an equal number of degrees of freedom, while both have fewer degrees of freedom than, say, LG+C20+F (if one does not use -mwopt).

Suppose one wishes to use the precomputed mixture weights for phylogenetic analysis (i.e. not using -mwopt, which might be appropriate for a 'short' alignment with a complex evolutionary process). If ModelFinder returns LG+C50, for example, as the best model from the set LG+C10 - LG+C60, can we be sure that it in fact fits better than the models LG+C10 - LG+C40?

Perhaps I'm overlooking something obvious!
Thank you,

Anthony


Minh Bui

Mar 7, 2018, 5:26:11 PM
to IQ-TREE, Anthony Redmond
Hi Anthony,

On 8 Mar 2018, at 3:11 am, Anthony Redmond <anthon...@gmail.com> wrote:

> Hi Minh and others,
>
> Unless I am mistaken, when comparing mixture models, some more complex models have the same number of degrees of freedom as simpler ones: e.g. LG+C60 and LG+C20 have an equal number of degrees of freedom, while both have fewer degrees of freedom than, say, LG+C20+F (if one does not use -mwopt).

Yes, that’s right. To reiterate, C20 and C60 have pre-defined mixture weights, which are not regarded as free parameters. LG+C20 and LG+C60 each apply their own pre-defined mixture weight vector, and that’s why they have the same number of degrees of freedom. If you use -mwopt (for mixture weight optimisation), then the weights are free parameters estimated from the current data, and thus LG+C20 has more degrees of freedom than it does with fixed weights (and the number of free weights then grows with the number of classes).

Note that LG+C20+F automatically turns on -mwopt, because the weight for the +F component (the 21st class in the mixture) is not known.
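
A minimal sketch of the counting described above, written in Python. The helper `free_weight_parameters` is hypothetical and is not part of IQ-TREE. It counts only the mixture weights and assumes that k optimised weights contribute k - 1 free parameters because they must sum to 1. Branch lengths, rate parameters and the +F frequencies themselves are left out, and IQ-TREE's own accounting may differ in detail.

```python
def free_weight_parameters(n_classes: int, optimize_weights: bool) -> int:
    """Count free mixture-weight parameters for a mixture with n_classes.

    Assumption: pre-defined (fixed) weights contribute no free parameters,
    while optimised weights contribute n_classes - 1, since they sum to 1.
    """
    if n_classes < 1:
        raise ValueError("a mixture needs at least one class")
    return n_classes - 1 if optimize_weights else 0


# LG+C20 and LG+C60 with pre-defined weights: no free weight parameters.
c20_fixed = free_weight_parameters(20, optimize_weights=False)
c60_fixed = free_weight_parameters(60, optimize_weights=False)

# The same models with weight optimisation (the situation under -mwopt).
c20_opt = free_weight_parameters(20, optimize_weights=True)
c60_opt = free_weight_parameters(60, optimize_weights=True)

# LG+C20+F: 21 classes, weights optimised because the +F weight is unknown.
c20_f = free_weight_parameters(21, optimize_weights=True)

print(c20_fixed, c60_fixed, c20_opt, c60_opt, c20_f)
```

Under these assumptions the fixed-weight models tie at zero weight parameters, which is why LG+C20 and LG+C60 end up with the same number of degrees of freedom.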


> Suppose one wishes to use the precomputed mixture weights for phylogenetic analysis (i.e. not using -mwopt, which might be appropriate for a 'short' alignment with a complex evolutionary process). If ModelFinder returns LG+C50, for example, as the best model from the set LG+C10 - LG+C60, can we be sure that it in fact fits better than the models LG+C10 - LG+C40?

I would say yes. Since these models have the same number of degrees of freedom, they incur the same penalty under criteria such as AIC or BIC, so the one with the highest likelihood will be selected.
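
To illustrate why equal degrees of freedom reduce model selection to a likelihood comparison, here is a small Python sketch using the standard AIC and BIC formulas. The log-likelihoods, parameter count and alignment length below are made-up placeholder values chosen only for illustration. They are not results from any real analysis.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


# Hypothetical values: every model shares the same number of free parameters
# because the mixture weights are fixed.
n_params = 37
n_sites = 500
log_likelihoods = {
    "LG+C10": -10250.0,
    "LG+C20": -10190.0,
    "LG+C50": -10120.0,
    "LG+C60": -10135.0,
}

best_by_lnl = max(log_likelihoods, key=log_likelihoods.get)
best_by_aic = min(log_likelihoods, key=lambda m: aic(log_likelihoods[m], n_params))
best_by_bic = min(
    log_likelihoods, key=lambda m: bic(log_likelihoods[m], n_params, n_sites)
)

print(best_by_lnl, best_by_aic, best_by_bic)
```

Because the penalty term is the same constant for every model, all three rankings agree, whatever the shared parameter count happens to be.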


> Perhaps I'm overlooking something obvious!

No, you didn’t overlook anything.

Cheers
Minh


Anthony Redmond

Mar 8, 2018, 10:21:58 AM
to IQ-TREE
Thanks Minh,

That was helpful.
I guess my concern was that in a case where, say, LG+C60 gives only a very slight improvement in fit over LG+C10, or even LG, there is a lot of hidden complexity in the real difference between the models that goes unpenalised when the mixture weights are not free parameters. But I suppose this is really more of a philosophical point.

Thanks again,

Anthony
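
One way to put a number on the "hidden complexity" mentioned above is to ask how much log-likelihood gain the larger model would need if its weights were counted as free parameters. The Python sketch below uses the standard AIC and BIC formulas. The helper names and the alignment length of 1000 sites are hypothetical, and counting k - 1 parameters for k weights is an assumption rather than a statement about how IQ-TREE counts.

```python
import math


def bic_break_even_gain(extra_params: int, n_sites: int) -> float:
    """Log-likelihood gain at which extra_params break even under BIC.

    From BIC = k ln(n) - 2 ln L, the larger model wins only if its
    log-likelihood gain exceeds extra_params * ln(n) / 2.
    """
    return 0.5 * extra_params * math.log(n_sites)


def aic_break_even_gain(extra_params: int) -> float:
    """Log-likelihood gain at which extra_params break even under AIC."""
    return float(extra_params)


# Assumption: if weights were free, C60 would add 59 and C10 would add 9.
extra = (60 - 1) - (10 - 1)

# Hypothetical alignment length, for illustration only.
gain_bic = bic_break_even_gain(extra, n_sites=1000)
gain_aic = aic_break_even_gain(extra)

print(extra, gain_aic, round(gain_bic, 2))
```

With fixed weights this threshold is zero, so any improvement in likelihood, however slight, is enough for the larger model to be preferred.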