Your understanding is correct: -c number_of_categories always refers
to the number of categories used in the CAT (a.k.a. PSR, per-site rates)
approximation of rate heterogeneity.
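
To make the scope of -c concrete, here is a minimal sketch that assembles a RAxML call under GTRCAT. The helper name raxml_cat_command, the file name family.phy, the run name, and the seed are made up for illustration, and the binary name raxmlHPC is an assumption that depends on how RAxML was built on your machine. As far as I know, 25 is the documented default for -c.

```python
import shlex


def raxml_cat_command(alignment, run_name, categories=25, seed=12345):
    """Build an argument list for a RAxML search under GTRCAT.

    -c only sets the number of CAT (per-site rate) categories; it has no
    effect on the number of Gamma categories, which is fixed at 4.
    """
    if categories < 1:
        raise ValueError("categories must be a positive integer")
    return [
        "raxmlHPC",              # assumed binary name; yours may differ
        "-s", alignment,         # input alignment
        "-n", run_name,          # suffix for the output files
        "-m", "GTRCAT",          # CAT approximation of rate heterogeneity
        "-c", str(categories),   # number of CAT categories
        "-p", str(seed),         # random number seed
    ]


cmd = raxml_cat_command("family.phy", "cat_run", categories=25)
print(shlex.join(cmd))
```

The list can be passed to subprocess.run as-is; printing it first is an easy way to check the call before running it.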
Cheers,
Fernando
This is not possible and we do not plan on implementing it.
The number of Gamma rate categories is hard-coded to 4 in RAxML for greater
computational efficiency of the phylogenetic likelihood function implementation.
Besides, especially on large datasets, our CAT approximation of rate heterogeneity works
equally well, if not better, while requiring less memory and fewer CPU cycles, i.e., the CO2 footprint
of your analyses will be smaller ;-)
see, e.g.:
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0009490
and
http://www.biomedcentral.com/1471-2105/12/470
Alexis
--
Dr. Alexandros Stamatakis
www.exelixis-lab.org
> Thank you for clarifying this doubt I had. Indeed, I heard a lot of
> good things about the CAT approximation-based models when working on
> large datasets, and I understand that when something good is found in
> science, one shall move to the next-generation technology and not
> stay stuck in old-fashioned practices.
Well, I wasn't saying CAT is that good. Please keep in mind that the RAxML CAT
model is fundamentally different from the PhyloBayes CAT model. The naming was just very unfortunate, because I was not
aware of the PhyloBayes CAT model back then.
> However, I'm not working on large datasets, at least not in the sense
> of whole-genome phylogenies, as I intend to compute phylogenies of
> potentially (very) large gene families, those having usually around
> 1,000 DNA sites. My sentiment is CAT approximation would not work well
> on so few sites. Do you agree on this?
No, the CAT model (or per-site rate category model in RAxML) should work quite well on this.
In fact, it mostly fails to work well only when you have few taxa (say, fewer than 100).
You may want to have a look at the original paper:
http://sco.h-its.org/exelixis/pubs/HICOMB2006.pdf
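
The rule of thumb above can be sketched as a small check on the first line of a PHYLIP alignment, which lists the number of taxa and the number of sites. The function name suggest_rate_model is hypothetical, the threshold of 100 is only the rough figure mentioned above and not a hard limit, and falling back to GTRGAMMA for small taxon sets is an assumption made here for illustration.

```python
def suggest_rate_model(phylip_header, min_taxa_for_cat=100):
    """Suggest a rate heterogeneity model from a PHYLIP header line.

    The header is expected to look like "<taxa> <sites>". The choice is
    driven by the number of taxa only; the number of sites is parsed and
    returned but does not influence the suggestion.
    """
    fields = phylip_header.split()
    if len(fields) < 2:
        raise ValueError("expected '<taxa> <sites>' on the first line")
    n_taxa, n_sites = int(fields[0]), int(fields[1])
    # Assumed fallback: use GTRGAMMA when there are too few taxa for CAT.
    model = "GTRCAT" if n_taxa >= min_taxa_for_cat else "GTRGAMMA"
    return n_taxa, n_sites, model


# A large gene family with about 1,000 DNA sites
print(suggest_rate_model("250 1000"))
# A small dataset with the same number of sites
print(suggest_rate_model("40 1000"))
```

Note that the alignment length does not enter the decision, which mirrors the point made above: around 1,000 sites is not a problem for CAT, a small number of taxa is.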
Alexis