GTR model in mcmctree

Tiago Simoes

May 25, 2023, 10:32:08 AM
to PAML discussion group
Hi all,

I have used the GTR model to obtain an ML tree in RAxML for a genomic dataset, and I would like to use MCMCTREE to infer divergence times for it. However, I do not see an option to implement GTR in MCMCTREE, although it is available in BASEML. I also see that someone asked this question back in 2015, and the answer was that GTR is not available in MCMCTREE. So my questions are: 1) why does MCMCTREE not implement GTR? And 2) is it possible to specify different models for different partitions?

Best regards,

Tiago Simoes

Sishuo Wang

May 29, 2023, 3:41:38 AM
to PAML discussion group
Hi Tiago,

1) As you noted, you can use baseml for that purpose (model=7) instead. As I understand it, with the approximate likelihood approach you would first use "baseml" to calculate the maximum-likelihood estimates of the branch lengths, together with the gradient and Hessian of the log-likelihood evaluated at the ML estimates, and then rename the output "out.BV" to "in.BV". Then set usedata=2 in mcmctree.ctl and run "mcmctree". A version of this strategy is mentioned in the chapter, and more details are given in the mcmctree manual. See also my script https://github.com/evolbeginner/dating/blob/master/do_mcmctree.rb.

2) Of course. You can specify the partitions in your phylip-formatted alignment and change ndata = ? in mcmctree.ctl accordingly. If you use the above strategy, you can run baseml separately on each partition and then merge the outputs into a single in.BV. This also lets you run several baseml jobs at the same time, which is not possible with the original "mcmctree". Again, ndata = ? must be specified correctly; otherwise, if I remember correctly, only the first partition is considered.
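The merge step in 2) could be sketched in Python as below. This is only an illustration, not part of PAML or of the linked script: the helper name `merge_bv` and the file paths are hypothetical, and it assumes that plain concatenation of the per-partition out.BV files, in the same order as the partitions in the alignment, is what mcmctree expects in in.BV.

```python
from pathlib import Path


def merge_bv(partition_files, dest="in.BV", ndata=None):
    """Concatenate per-partition baseml outputs into one file, in the given order.

    Assumption: mcmctree reads one block per partition, in partition order.
    """
    if ndata is not None and ndata != len(partition_files):
        raise ValueError(
            f"ndata = {ndata} but {len(partition_files)} partition files were given"
        )
    blocks = []
    for path in partition_files:
        text = Path(path).read_text()
        if not text.strip():
            raise ValueError(f"{path} is empty; did baseml finish for this partition?")
        # Make sure each block ends with a newline so blocks do not run together.
        blocks.append(text if text.endswith("\n") else text + "\n")
    Path(dest).write_text("".join(blocks))
    return len(blocks)
```

For example, `merge_bv(["part1/out.BV", "part2/out.BV"], "in.BV", ndata=2)` would write both blocks to in.BV and fail early if the number of files does not match the ndata value you intend to put in mcmctree.ctl.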

Cheers,
Sishuo Wang

Sishuo Wang

May 31, 2023, 5:23:13 AM
to PAML discussion group
Hi Tiago,

I forgot one thing: you might also want to set getSE = 2 in baseml.ctl so that the gradient and Hessian are written to rst2.
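If you want to script these control-file changes, a minimal Python sketch could look like the following. The option names (model, getSE, usedata) are the ones mentioned in this thread; the helper `set_ctl_option` is hypothetical, and it assumes the control file uses `key = value` lines with optional `*` comments.

```python
import re


def set_ctl_option(ctl_text, key, value):
    """Return ctl_text with `key = value` set, keeping any trailing `*` comment."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)([^*\n]*?)([ \t]*(?:\*.*)?)$",
        re.MULTILINE,
    )
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(3)}", ctl_text
    )
    if count == 0:
        # Option not present in the file: append it as a new line.
        new_text = ctl_text.rstrip("\n") + f"\n{key} = {value}\n"
    return new_text


# Example on an in-memory control file (contents are illustrative only).
ctl = "      model = 4  * 0:JC69, 4:HKY85, 7:REV\n      getSE = 0\n"
ctl = set_ctl_option(ctl, "model", 7)   # GTR (REV) in baseml
ctl = set_ctl_option(ctl, "getSE", 2)   # write gradient and Hessian to rst2
print(ctl)
```

The same helper could be applied to mcmctree.ctl to set usedata = 2 once in.BV is in place.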

SW

Tiago Simoes

May 31, 2023, 4:54:45 PM
to PAML discussion group
Dear Sishuo and all,

Thank you for your feedback! 

I understand that it is possible to use GTR in "baseml" to calculate the maximum-likelihood estimates of the branch lengths, as well as the gradient and Hessian of the log-likelihood evaluated at the ML estimates, prior to using mcmctree. However, I wonder whether the mismatch between the GTR model used in RAxML and baseml and whatever other model is chosen in mcmctree can bias the inferred time trees. Or is the model chosen for the mcmctree runs irrelevant when using the ML estimates of the branch lengths from baseml (that is, the branch lengths are not re-estimated but simply taken from baseml)?

Cheers,

Tiago

Sishuo Wang

Jun 5, 2023, 7:06:31 AM
to PAML discussion group
Hi Tiago,

Thanks for your questions! If I understand correctly, you were asking:

i) whether baseml and RAxML estimate different branch lengths
I'm unsure, but my gut feeling is that they will differ little if the user specifies the same model in both programs and the tree topology is fixed, as mcmctree requires. Any difference is then likely due to optimization precision. Actually, even my very simple script gives a highly similar result when these criteria are met.

ii) whether different branch length estimates lead to different time estimates in mcmctree
You might want to have a look at this blog article by Mario dos Reis, and at Groussin, Pawlowski, Yang 2011 for nucleotides.
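As a rough sketch of why the branch length estimates matter for ii): the approximate likelihood approach discussed earlier in the thread replaces the exact log-likelihood with a second-order Taylor expansion around the ML branch lengths. Assuming the plain, untransformed form (mcmctree may apply a transformation to the branch lengths), and with symbols chosen here only for illustration, where \hat{\mathbf{b}} is the vector of ML branch lengths, \mathbf{g} the gradient and \mathbf{H} the Hessian at \hat{\mathbf{b}}:

```latex
\ell(\mathbf{b}) \approx \ell(\hat{\mathbf{b}})
  + \mathbf{g}^{\top}(\mathbf{b}-\hat{\mathbf{b}})
  + \tfrac{1}{2}\,(\mathbf{b}-\hat{\mathbf{b}})^{\top}\mathbf{H}\,(\mathbf{b}-\hat{\mathbf{b}})
```

Under this sketch, the alignment and substitution model enter only through \hat{\mathbf{b}}, \mathbf{g} and \mathbf{H}, so different branch length estimates give a different approximate likelihood surface for the dating step.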

Sishuo