MCMCTree error: Only bounds for the root age are implemented..


Dilrini Vanrooyen

Apr 16, 2024, 1:22:32 PM
to PAML discussion group
Hi,
For the MCMCTree analysis, I used a tree produced by RAxML. To build the RAxML tree, I used concatenated single-copy orthologs from 13 species.
Here is the tree file I used:

13

((KF,(VV,((GM,(FS,(MD,PP))),(PG,((TC,AT),(PT,(CF,ptg))))))),CS);

The sequence file used for the MCMCTree analysis contains amino acid sequences.

After I ran the tool, I got this error:

Error: Only bounds for the root age are implemented..

An output file was produced, but it does not contain a tree.

Here is my mcmctree.ctl file:

seed = -1

seqfile = /$PWD/beast.fasta

treefile = /$PWD/tree

outfile = mcmctree_output

ndata = 1  * indicates there is only one dataset

seqtype = 2 * amino acids

clock = 2 * independent rates

RootAge = '<1.0' * safe constraint on root age, used if no fossil for root.

model = 6 * JTT substitution model 

alpha = 0 * alpha for gamma rates at sites

ncatG = 5 * No. categories in discrete gamma

cleandata = 1

BDparas = 1 1 0.1 * birth, death, sampling

kappa_gamma = 6 2 * gamma prior for kappa

alpha_gamma = 1 1 * gamma prior for alpha

rgene_gamma = 2 20 1 * gammaDir prior for rate for genes

sigma2_gamma = 1 10 1 * gammaDir prior for sigma^2 (for clock=2 or 3)

finetune = 1: .1 .1 .1 .1 .1 .1 * auto (0 or 1) : times, rates, mixing...

print = 1 * 0: no mcmc sample; 1: everything except branch 2: ev...

burnin = 2000

I want to use the default values since I don't know what the root age should be for this analysis.

Could anyone help me fix this error?

Thanks so much!

Sandra AC

Apr 17, 2024, 3:04:48 PM
to PAML discussion group
Hi there,

Thanks for your message! It seems that there may be various problems regarding the format of your input files and the settings in your control file:
  • You first need to convert your input sequence alignment into PHYLIP format instead of using a FASTA format. If you do a quick search on the Internet, you will find many tools to convert your alignment from FASTA to PHYLIP format -- you could even code your own script! You may want to read the PAML Wiki on GitHub to understand the required format for your input files.
  • According to your message, you seem to be incorporating only one node age constraint, via the control file (i.e., RootAge = '<1.0'). If <1.0 is the constraint that you want to apply to the root age, you should incorporate it in your tree file (e.g., based on the tree in your message, ((KF,(VV,((GM,(FS,(MD,PP))),(PG,((TC,AT),(PT,(CF,ptg))))))),CS)'<1.0'; ) -- using option `RootAge` is somewhat discouraged in later versions of PAML (latest: PAML v4.10). To decide on such a constraint, you will need to assess the evidence from the fossil record, biomarkers, geological events, etc. Once you have an idea of, say, min/max ages for the node/s of interest, you should decide which distribution to use to constrain those node ages. Take a look at the PAML documentation on GitHub (pp. 49-52) to learn more about which calibrations you can use and the format you have to follow to calibrate the tree topology.
  • If you used "13" as the header of your input tree file, then you will get an error. You will need to use "13 1", because the header follows the PHYLIP format (number of species, then number of trees). You may want to check the PAML Wiki for more details about how to format your input tree file.
  • I believe there are also some errors in the settings of your control file. You can take a look at the PAML documentation on GitHub to learn more about the available settings and what they mean (pp. 43-48). E.g., you are missing some options such as `usedata`, `sampfreq`, `nsample` (the last two are required when using the exact likelihood calculation, and optional if you are using `usedata = 2` when calling `CODEML` for AA data or `BASEML` for nuc data to estimate the gradient, Hessian, and branch lengths). Please note that you will also need to add option `aaRatefile`, whose value is the path to a file that has the substitution matrix of your interest (e.g., see the various matrices already available in PAML, which you can just download and save in your preferred location). In the settings you have shared, you have model = 6, which, according to the PAML documentation (p. 32), enables model `FromCodon1`. As described on the same page of the documentation, this model is the "mechanistic amino acid substitution model of Yang et al. (1998, table3)". Make sure that this is the model that you want to use; otherwise, choose the AA subst. model that best fits your data! :)
  • Additional notes:
    • Option `finetune` is not required anymore as the program autotunes itself.
    • Option `RootAge`, as aforementioned, is discouraged. You should include all node age constraints in the tree topology, including the root age constraint.
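As a rough illustration of the FASTA-to-PHYLIP point in the list above, here is a minimal Python sketch. The function names (`read_fasta`, `fasta_to_phylip`) are made up for this example, and it assumes a sequential PHYLIP layout in which each taxon name is followed by at least two spaces and then the full sequence on one line. Please check the PAML Wiki for the exact format your PAML version expects before relying on it.

```python
def read_fasta(text):
    """Parse FASTA text into an ordered list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            if not header:
                raise ValueError("FASTA header without a name")
            # Assumption: only the first word of the header is the taxon name.
            name, chunks = header[0], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def fasta_to_phylip(text):
    """Convert an aligned FASTA string into sequential PHYLIP text."""
    records = read_fasta(text)
    if not records:
        raise ValueError("no sequences found")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; input must be an alignment")
    nsites = lengths.pop()
    # Pad names so that at least two spaces separate each name from its sequence.
    width = max(len(name) for name, _ in records) + 2
    lines = [f"{len(records)} {nsites}"]
    for name, seq in records:
        lines.append(f"{name.ljust(width)}{seq}")
    return "\n".join(lines) + "\n"
```

For instance, reading the contents of `beast.fasta`, passing them to `fasta_to_phylip`, and writing the result to a new file (say, `beast.phy`, a hypothetical name) would give you a file to point `seqfile` at.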
Lastly, you may also want to read some protocols that show how to use `MCMCtree` (e.g., dos Reis 2022, where an AA dataset is also used) so that you become more familiar with the input files required and learn how to best specify the settings in the control file.
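To make the tree-file points above more concrete (the "13 1" header and the root constraint placed in the tree itself), here is a small Python sketch. The function name `root_calibrated_tree_file` is hypothetical, and the code assumes the input is a single uncalibrated Newick tree with no quoted labels, like the one in the original message. The '<1.0' bound is only the value from that message, not a recommendation.

```python
def root_calibrated_tree_file(newick, root_calibration):
    """Build tree-file text with a 'species trees' header and a root calibration.

    Assumes `newick` is one uncalibrated tree with no quoted labels.
    """
    tree = newick.strip()
    if "'" in tree or '"' in tree:
        raise ValueError("expected a tree without quoted labels or calibrations")
    if not tree.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    body = tree[:-1].rstrip()
    if not body.endswith(")"):
        raise ValueError("root node already carries a label or branch length")
    # Every comma separates two sibling subtrees, so tips = commas + 1.
    n_species = body.count(",") + 1
    # Header: number of species, then number of trees in the file.
    return f"{n_species} 1\n{body}'{root_calibration}';\n"


tree = "((KF,(VV,((GM,(FS,(MD,PP))),(PG,((TC,AT),(PT,(CF,ptg))))))),CS);"
tree_file_text = root_calibrated_tree_file(tree, "<1.0")
```

Here `tree_file_text` holds a two-line file: the header `13 1`, followed by the same topology with `'<1.0'` attached to the root just before the final semicolon.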

Hope this helps!
S.