Which criterion is better, AIC, BIC, or AICc, based on real run data?

Runhua Lei

May 29, 2015, 3:32:59 PM
to partiti...@googlegroups.com
Hi Rob:
I did read through your paper and the one that you mentioned.
I ran six different combinations of AIC, AICc, and BIC with linked or unlinked branch lengths. It seemed to me that unlinked AIC was the better choice for partitioning. However, it is very interesting to see how the different criteria produce totally different schemes (8 partitions to 23 partitions). I ran PartitionFinder with the default settings. Could something be wrong with my settings or preferences? What do you think?

Thank you very much,

All the best,

Runhua

Run            AICc          lnL            Number of params   Number of subsets
linkedAICc     285197.6468   -142282.1487   310                23
unlinkedAICc   284895.0972   -141529.743    864                8

Run            AIC           lnL            Number of params   Number of subsets
linkedAIC      285187.2469   -142282.6235   311                23
unlinkedAIC    284787.4859   -141529.743    864                8

Run            BIC           lnL            Number of params   Number of subsets
linkedBIC      286823.5273   -142543.0236   181                9
unlinkedBIC    287929.9599   -142409.8873   324                3

Paul Frandsen

May 29, 2015, 3:54:29 PM
to partiti...@googlegroups.com
Hi Runhua,

It is totally reasonable for the different methods to generate different numbers of subsets. The numbers that you list fall in line with what I would expect considering how the different methods work. Runs using unlinked branch lengths require many more parameters to be estimated for each subset, so every additional subset incurs a much higher penalty in the calculation of AIC(c)/BIC. Fewer total subsets would therefore be expected in those runs.
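
For reference, the standard definitions of the three criteria show where that penalty comes from. Here k is the number of estimated parameters, ln L is the log-likelihood, and n is the sample size (assumed here to be the number of alignment sites; n is not given in this thread). The worked line only checks the unlinkedAIC row from the tables earlier in the thread.

```latex
\begin{align*}
\mathrm{AIC}  &= 2k - 2\ln L \\
\mathrm{AICc} &= \mathrm{AIC} + \frac{2k(k+1)}{n - k - 1} \\
\mathrm{BIC}  &= k \ln n - 2\ln L
\end{align*}
% Worked check, unlinkedAIC row (k = 864, ln L = -141529.743):
% AIC = 2(864) - 2(-141529.743) = 1728 + 283059.486 = 284787.486
```

In all three, lower is better, and every extra parameter has to buy enough likelihood to pay for its penalty term.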

To decide which scheme to use, you should first pick your preferred information criterion (Rob suggested AICc for good reason). Then you should select the scheme with the best score associated with that criterion. If you do decide to use AICc, then the best scheme is the one generated with unlinked branch lengths. However, keep in mind that if you choose this scheme, and you are planning on estimating a phylogenetic tree, you should use software that allows you to estimate a tree with unlinked branch lengths among subsets. You can usually find this out by consulting the software manual/documentation.
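
As a rough sketch of that selection step, the snippet below picks the run with the lowest AICc from the two AICc rows posted earlier in the thread. The dictionary layout and the `best_scheme` helper are made up for illustration and are not part of PartitionFinder.

```python
# Values copied from the AICc rows posted earlier in this thread.
# The dictionary layout and helper name are hypothetical, for illustration only.
aicc_runs = {
    "linkedAICc": {"score": 285197.6468, "lnL": -142282.1487, "params": 310, "subsets": 23},
    "unlinkedAICc": {"score": 284895.0972, "lnL": -141529.743, "params": 864, "subsets": 8},
}


def best_scheme(runs):
    """Return the name of the run with the lowest (best) criterion score."""
    return min(runs, key=lambda name: runs[name]["score"])


best = best_scheme(aicc_runs)
# Score difference between the two runs under the same criterion.
delta = aicc_runs["linkedAICc"]["score"] - aicc_runs["unlinkedAICc"]["score"]
print(best, round(delta, 4))
```

Scores are only comparable within one criterion, so the AIC and BIC runs would each need their own comparison.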

My best,

Paul

--
Bioinformatics and Genomics
Office of Research Information Services
Office of the CIO
Smithsonian Institution

Runhua Lei

Jun 1, 2015, 1:31:00 PM
to partiti...@googlegroups.com
Hi Paul:
Thanks for your explanation.
I talked with quite a lot of people at IPS about mitogenome data and partitioning. They strongly suggested that the mitogenome is a single locus and that an unlinked model may not fit my data.
Do you think it is reasonable to choose the best scheme from the linked AICc run?
Thanks,

Runhua

Paul Frandsen

Jun 2, 2015, 9:00:56 PM
to partiti...@googlegroups.com
Hi Runhua,

I think you are confusing what the linked and unlinked branch lengths mean. In a model with unlinked branch lengths, the underlying branching pattern is the same, but each branch is given the 'freedom' to have its own length for each subset. This is opposed to the linked model, in which the branches of each subset are scaled in tandem by a single rate multiplier. This has nothing to do with whether it is one locus or not; it just assumes that different portions of the dataset are best fit with different model parameters, e.g. 3rd codon positions will most often evolve very differently from 2nd codon positions and should be modeled differently even though they are from the same locus. As Rob pointed out, unlinked branch lengths are just a rather crude method to account for heterotachy. Generally they add so many parameters that they are rarely favored. However, with your dataset, the unlinked branch length model had a better AICc score. This could be *because* it is a mitogenome, not in spite of it. Mitogenomes can exhibit some weird behavior, heterotachy being one example.
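
To give a feel for how quickly unlinked branch lengths add parameters, here is a minimal counting sketch. It relies on the fact that an unrooted, fully bifurcating tree with T taxa has 2T - 3 branches. The counting convention for the linked case (one shared set of branch lengths plus one rate multiplier per additional subset) is an assumption and may not match PartitionFinder's exact bookkeeping, and the 50 taxa are a made-up figure, not the size of this dataset.

```python
def n_branches(n_taxa: int) -> int:
    """Number of branches in an unrooted, fully bifurcating tree."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted tree")
    return 2 * n_taxa - 3


def branch_length_params(n_taxa: int, n_subsets: int, linked: bool) -> int:
    """Count branch-length parameters only (substitution-model parameters excluded).

    Assumed convention, for illustration:
      linked   -> one shared set of branch lengths plus one relative-rate
                  multiplier for each subset beyond the first
      unlinked -> a separate set of branch lengths for every subset
    """
    branches = n_branches(n_taxa)
    if linked:
        return branches + (n_subsets - 1)
    return branches * n_subsets


# Hypothetical example: 50 taxa and 8 subsets (not taken from this dataset).
linked_k = branch_length_params(50, 8, linked=True)
unlinked_k = branch_length_params(50, 8, linked=False)
print(linked_k, unlinked_k)
```

Under this convention each new subset costs one parameter when linked but a full set of branch lengths when unlinked, which is why unlinked runs usually settle on fewer subsets.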

Hope that helps.

Paul