How to obtain reconstructed ancestral sequence from codeml output rst file?


Yufeng Wan

May 28, 2014, 4:33:28 AM
to pamlso...@googlegroups.com
Hi, 

I have just stepped into the field of ancestral gene resurrection. I am not familiar with these programs and models, but I am trying to use codeml (aaml) to reconstruct several proteins. I have set RateAncestor = 1, but I don't know how to interpret the results in the rst file: the program seems to give me only the best tree, and I am not sure how to use it. Most of the lines in the rst file look like this:
stage 0: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);
lnL: -13969.391738  1814

stage 1:   190 trees, ntime: 21  np: 21
star tree: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);  lnL:-13969.391738

S=1:  1/190  T=   1  ((1, 2), 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);  1633 X +403.132005
S=1:  2/190  T=   2  ((1, 3), 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);  2063 X +367.281142
S=1:  3/190  T=   3  ((1, 4), 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);  1194 X +2.381085
S=1:  4/190  T=   4  ((1, 5), 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);  1228 X +0.586824
S=1:  5/190  T=   5  ((1, 6), 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);  1129 X +0.169289
S=1:  6/190  T=   6  ((1, 7), 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);  1457 X +0.135935
S=1:  7/190  T=   7  ((1, 8), 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);  1059 X +1.146362
S=1:  8/190  T=   8  ((1, 9), 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);  1279 X +1.668315
S=1:  9/190  T=   9  ((1, 10), 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);  1948 X +5.864339
S=1: 10/190  T=  10  ((1, 11), 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20);  1062 X +5.632540
S=1: 11/190  T=  11  ((1, 12), 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20);  1896 X +0.218827 
....
....
);  lnL:-9321.621428

S=17:  1/3  T=1321  (((((1, (2, 3)), (((((15, 16), 18), 17), 19), 20)), (10, (11, (12, (13, 14))))), ((4, 5), 6)), (7, 8), 9);  6474 X +0.084288
S=17:  2/3  T=1322  (((((1, (2, 3)), (((((15, 16), 18), 17), 19), 20)), (10, (11, (12, (13, 14))))), (7, 8)), ((4, 5), 6), 9);  5872 X -0.000808
S=17:  3/3  T=1323  (((((1, (2, 3)), (((((15, 16), 18), 17), 19), 20)), (10, (11, (12, (13, 14))))), 9), ((4, 5), 6), (7, 8));  4899 X +3.840010

best tree: (((((1, (2, 3)), (((((15, 16), 18), 17), 19), 20)), (10, (11, (12, (13, 14))))), 9), ((4, 5), 6), (7, 8));   lnL:  -9317.781418

Can anyone please tell me how to use codeml to reconstruct a gene? Thanks a lot!

Yufeng Wan

Ziheng

Aug 3, 2014, 6:46:26 PM
to pamlso...@googlegroups.com
You can read the following paper:
Yang et al. 1995, Genetics.
Try the example files stewart.aa and stewart.ctl (I think) to duplicate the results in that paper.
The documentation for the program is pamlDOC.pdf.
Ziheng.

Xingcheng Lin

Jun 4, 2017, 9:16:28 PM
to PAML discussion group
Hi,

May I ask what the current example for learning ancestral sequence reconstruction with PAML is? The stewart.ctl file mentioned in the previous reply does not seem to exist in the latest version of the PAML package.

Thanks,
Xingcheng

Fei Yuan

Jun 9, 2017, 5:31:43 PM
to PAML discussion group
Your rst file does not seem to contain ancestral sequences. Have you checked the control file you used to run codeml? I started using PAML recently and still have some questions about rooted versus unrooted trees, but my rst file does give me the ancestral sequences and many other pieces of information. Once you have the right rst file, extracting the ancestral sequences should not be very hard. You can use any programming language you are familiar with to parse it. The sequence section always starts with "List of extant and reconstructed sequences", and the ancestral sequences (those for internal nodes, not leaf nodes) always start with "node #" followed by a number. The numbers correspond to the third tree ("tree with node labels for Rod Page's TreeView") near the beginning of the file, so you can easily map them onto the phylogenetic tree. By the way, I have a Python script at hand that can parse the rst file; if you need it, let me know (yuanfe...@gmail.com). Or you can write your own script to parse whatever part you want.
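For readers who want to write their own parser, here is a minimal Python sketch of the approach described above. It is not the script mentioned in the post. It assumes only what the post states: the sequence section begins with "List of extant and reconstructed sequences" and each ancestral sequence is on a line starting with "node #" followed by a number. The function name parse_ancestral_sequences is made up for illustration, and the sketch further assumes that any whitespace inside a sequence is just column spacing and can be dropped.

```python
import re

# Marker and line prefix as described in the post above.
HEADER = "List of extant and reconstructed sequences"
NODE_LINE = re.compile(r"^node #(\d+)\s+(.*)$")


def parse_ancestral_sequences(lines):
    """Return {node_number: sequence} from the first sequence section.

    `lines` is any iterable of text lines, e.g. an open rst file.
    Hypothetical helper: assumes one "node #N <sequence>" line per node.
    """
    ancestors = {}
    in_section = False
    for raw in lines:
        line = raw.strip()
        if not in_section:
            # Skip everything until the section header is seen.
            in_section = line.startswith(HEADER)
            continue
        match = NODE_LINE.match(line)
        if match:
            # Assumption: spaces inside the sequence are only column spacing.
            ancestors[int(match.group(1))] = "".join(match.group(2).split())
        elif ancestors and line:
            # First non-blank, non-node line after the node lines ends the section.
            break
    return ancestors
```

It could be called as parse_ancestral_sequences(open("rst")). If the section header never appears, as in the rst file quoted in the first post, the result is an empty dictionary. Only the first sequence section is read, so a run over several trees would need a loop around this.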

Ziheng

Aug 26, 2017, 12:55:21 PM
to PAML discussion group
The codeml.ctl file in the paml folder is set up to read stewart.aa and stewart.trees and to duplicate the results in the following paper:

Yang, Z., et al. (1995). "A new method of inference of ancestral nucleotide and amino acid sequences." Genetics 141: 1641-1650.

This file contains the line
 RateAncestor = 1  * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)
which should be 1 to get results for ancestral reconstruction (in the file rst).
ziheng
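A quick way to confirm which RateAncestor value a control file actually sets is to read it programmatically. The Python sketch below assumes the layout visible in the line quoted above: one "name = value" pair per line, with "*" starting a comment that runs to the end of the line. The function name read_ctl_options is made up for illustration and is not part of PAML.

```python
def read_ctl_options(lines):
    """Return {option_name: value_string} from control-file lines.

    Hypothetical helper: assumes "name = value * comment" lines.
    """
    options = {}
    for raw in lines:
        # Drop everything from the first "*" onward (assumed comment marker).
        line = raw.split("*", 1)[0].strip()
        if "=" in line:
            name, value = line.split("=", 1)
            options[name.strip()] = value.strip()
    return options
```

For example, read_ctl_options(open("codeml.ctl")).get("RateAncestor") should give the string "1" for a file containing the line quoted above. Values are kept as strings, so compare against "1" rather than the integer 1.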

 