CAT vs. GAMMA

db...@cam.ac.uk

Jun 24, 2015, 7:06:05 AM
to ra...@googlegroups.com
Hello,

I have a question about the GAMMA and CAT approximations for the modelling of among-site rate heterogeneity. In the RAxML manual it is stated that the CAT approximation, relative to the GAMMA approximation, confers a four-fold reduction in memory consumption and computational requirement. However, I am confused about the basis of this reduction. In my understanding, the GAMMA approximation leads to a four-fold increase -- relative to homogeneous rate models -- in computational requirement due to the computation of four likelihood values at each site -- one for each discrete rate.

For the CAT model though, if each site is being optimised with respect to a number of rate categories -- assumed to be greater in number than four -- why is the CAT model not more computationally expensive than GAMMA? The difference from my perspective is optimising a site rate with respect to four categories (GAMMA), and optimising it with respect to more than four categories (CAT). If you look at the 'Materials and Methods' section from the FastTree2 article, for example, they use a Bayesian method to optimise the site with respect to 20 different categories, representing 20 different posterior probability calculations for each site:

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0009490#s4

My feeling is that for GAMMA, the four-fold calculations are performed at every step in the navigation of tree space, whereas for CAT the optimisation is performed once, and then a single rate (per site) used for the navigation through tree space. Can anybody confirm this?

Best,

David

Alexandros Stamatakis

Jun 24, 2015, 3:35:12 PM
to ra...@googlegroups.com
Hi David,

> I have a question about the GAMMA and CAT approximations for the
> modelling of among-site rate heterogeneity. In the RAxML manual it is
> stated that the CAT approximation, relative to the GAMMA approximation,
> confers a four-fold reduction in
> memory consumption and computational requirement. However, I am confused
> about the basis of this reduction. In my understanding, the GAMMA
> approximation leads to a four-fold increase -- relative to homogeneous
> rate models -- in computational requirement due to the computation of
> four likelihood values at each site -- one for each discrete rate.

that is correct ...

> For the CAT model though, if each site is being optimised with respect
> to a number of rate categories -- assumed to be greater in number than
> four -- why is the CAT model not more computationally expensive than
> GAMMA? The difference from my perspective is optimising a site rate with
> respect to four categories (GAMMA), and optimising it with respect to
> more than four categories (CAT). If you look at the 'Materials and
> Methods' section from the FastTree2 article, for example, they use a
> Bayesian method to optimise the site with respect to 20 different
> categories, representing 20 different posterior probability calculations
> for each site:
>
> http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0009490#s4

Yes, but after that step they just use the best-fit rate for each site and
don't integrate the likelihood over all 4 discrete rates as GAMMA does ...
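
Written as formulas, the contrast looks roughly like this. The notation is assumed here for illustration and is not taken from the RAxML manual: L_i(r) is the likelihood of site i computed with rate multiplier r, r_1..r_4 are the four discrete GAMMA rates, and c(i) is the single category assigned to site i.

```latex
L_i^{\mathrm{GAMMA}} = \frac{1}{4} \sum_{k=1}^{4} L_i(r_k),
\qquad
L_i^{\mathrm{CAT}} = L_i\bigl(r_{c(i)}\bigr)
```

So GAMMA needs four per-site likelihood terms for every evaluation, while CAT needs one.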

> My feeling is that for GAMMA, the four-fold calculations are performed
> at every step in the navigation of tree space, whereas for CAT the
> optimisation is performed once, and then a single rate (per site) used
> for the navigation through tree space. Can anybody confirm this?

that's correct ...
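
A toy sketch of that cost difference, under loud assumptions: two sequences, a Jukes-Cantor model, illustrative rate values, and a made-up pool of candidate CAT rates. None of this is RAxML or FastTree code; it only counts per-site likelihood evaluations. The branch lengths in `steps` stand in for successive steps of the tree search.

```python
import math

# Illustrative discrete-GAMMA rates (four categories, mean ~1); assumed values.
GAMMA_RATES = [0.0334, 0.2519, 0.8203, 2.8944]
# Hypothetical pool of candidate per-site rates for the CAT-style assignment.
CAT_CANDIDATES = [0.05 * 2 ** k for k in range(8)]

def jc_site_likelihood(x, y, t, rate):
    """Jukes-Cantor likelihood of one site for two sequences at distance t."""
    e = math.exp(-4.0 * rate * t / 3.0)
    p = 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e
    return 0.25 * p

def gamma_log_likelihood(seq_a, seq_b, t):
    """Average over all four rates at every site; returns (lnL, evaluations)."""
    lnl, evals = 0.0, 0
    for x, y in zip(seq_a, seq_b):
        site = sum(jc_site_likelihood(x, y, t, r) for r in GAMMA_RATES)
        evals += len(GAMMA_RATES)
        lnl += math.log(site / len(GAMMA_RATES))
    return lnl, evals

def assign_cat_rates(seq_a, seq_b, t):
    """One-off step: pick the best-fitting candidate rate for each site."""
    return [max(CAT_CANDIDATES, key=lambda r: jc_site_likelihood(x, y, t, r))
            for x, y in zip(seq_a, seq_b)]

def cat_log_likelihood(seq_a, seq_b, t, site_rates):
    """One fixed rate per site; returns (lnL, evaluations)."""
    lnl, evals = 0.0, 0
    for x, y, r in zip(seq_a, seq_b, site_rates):
        lnl += math.log(jc_site_likelihood(x, y, t, r))
        evals += 1
    return lnl, evals

seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGAAC"
steps = [0.05, 0.1, 0.2, 0.4]  # stand-ins for successive search steps

site_rates = assign_cat_rates(seq_a, seq_b, t=0.1)  # done once, up front
gamma_evals = sum(gamma_log_likelihood(seq_a, seq_b, t)[1] for t in steps)
cat_evals = sum(cat_log_likelihood(seq_a, seq_b, t, site_rates)[1] for t in steps)
print(gamma_evals, cat_evals)
```

With 10 sites and 4 steps this counts 160 evaluations for the GAMMA-style loop and 40 for the CAT-style loop. The one-off assignment costs sites times candidates, but it is not repeated at every step.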

alexis

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University
of Arizona at Tucson

www.exelixis-lab.org

D. Bradley

Jun 24, 2015, 5:16:21 PM
to ra...@googlegroups.com
Hi Alexis,

Thank you for getting back to me. This makes sense now.

Best,

David