Re: How Ancestral state reconstruction results made by Beast


Andrew Rambaut

Mar 19, 2013, 4:28:32 PM
to beast...@googlegroups.com
Dear Chervin,

It is not clear what you need. Do you want the ancestral sequence at the root of the tree or for every node in the tree? 

Andrew

On 18 Mar 2013, at 15:32, chervin...@gmail.com wrote:

Hello,

 I have done an analysis with BEAST 1.7.4 to reconstruct the ancestral states at the different nodes of my topology. I need the reconstructed nucleotide sequences, but I don't know how to get them from my MCC tree file.

 Can someone tell me whether it is possible to easily collect the ancestral sequences from the MCC tree file?

 Thank you for your help

 Chervin


--
You received this message because you are subscribed to the Google Groups "beast-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beast-users...@googlegroups.com.
To post to this group, send email to beast...@googlegroups.com.
Visit this group at http://groups.google.com/group/beast-users?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

___________________________________________________________________
 Andrew Rambaut                
 Institute of Evolutionary Biology       University of Edinburgh
 Ashworth Laboratories                         Edinburgh EH9 3JT
 EMAIL - a.ra...@ed.ac.uk                TEL - +44 131 6508624  

Andrew Rambaut

Mar 21, 2013, 8:09:30 AM
to chervin...@gmail.com, beast...@googlegroups.com
'Reconstruct states at all ancestors' will write an ancestral state at every node in every tree in the tree file. This is intended for single character data like phylogeography location or discrete traits. If you reconstruct sequences you will get very large data sets indeed. You will also need to devise some way of extracting the sequences from the trees (we don't currently have any way of doing this).

To obtain the ancestral sequence at the root, you can choose that option in BEAUti (Reconstruct states at ancestor: Tree Root). If you create a taxon set in the Taxa tab, you will also get the option to reconstruct the ancestral sequence at the MRCA of that taxon set (you might want to enforce monophyly for this to be meaningful). This option writes the ancestral states to the log file, which may be easier to access.

Andrew

On 20 Mar 2013, at 07:45, chervin...@gmail.com wrote:

> Dear Andrew,
>
> Thanks for your answer.
>
> I am interested in the ancestral sequences of nodes within the trees and the root.
>
> I would like to compare the ancestral sequences at key nodes in the phylogenies.
>
> Thank you again for your help.
>
> Chervin
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

Arman Bilge

Mar 21, 2013, 3:37:42 PM
to beast...@googlegroups.com, chervin...@gmail.com
Hi Chervin and Andrew,

I actually ran into this same problem just over a month ago. I had used ancestral reconstruction to reconstruct sequences at every node in the tree, and after giving the TreeAnnotator JVM plenty of extra memory I got an annotated tree. It was still pretty big (a few GB) and I realized that there was no tool to rescue my ancestral sequences with...or even visualize the tree itself. So I threw together a Java program to get the job done quick and dirty.

I was in a rush/feeling lazy so the code is pretty (very) bad, but it may be helpful to you. If you want I can write up a much more robust, easier-to-use version. I probably did more work than necessary, writing up my own parser and everything, but it worked for me.

You may need to make a few changes to get it to work for you. Namely, you'll want to change the "paths" array in the driver class to point to the nodes that you are interested in and have their stats written to a file. Also, make sure to give the JVM a ton of memory depending on the size of your tree and sequences.

Good luck and hope that helps!

Best,
Arman
AncestralSequenceRecovery.zip

Andrew Rambaut

Mar 29, 2013, 6:07:02 AM
to beast...@googlegroups.com
I don't know if this will be helpful for all these situations but you can log the ancestral state at a specific node (defined by the MRCA of a group of taxa) using the following XML:

<log logEvery="1000" fileName="cladeA.states.log">
    <ancestralTrait name="statesA" traitName="states">
        <treeModel idref="treeModel"/>
        <ancestralTreeLikelihood idref="treeLikelihood"/>
        <mrca>
            <taxa idref="cladeA"/>
        </mrca>
    </ancestralTrait>
</log>

You can create one of these for every clade you are interested in (add them after the main log file element). Define the clades in the 'Taxa' tab of BEAUti (you may wish to enforce monophyly, or the node may jump around the tree).

You still need some way of summarising the ancestral states across all the samples (perhaps a consensus sequence?).
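
One way to do that summarising is a per-site consensus over the logged samples. The Python below is only a rough sketch, not a BEAST tool. It assumes the log file is tab-delimited with `#` comment lines, that the column carries the `name` given to the `ancestralTrait` element (`statesA` here), and that each row holds the whole reconstructed sequence as one string. The function names and the 10% burn-in default are made up for illustration.

```python
import csv
from collections import Counter


def read_logged_sequences(lines, column, burnin_fraction=0.1):
    """Return the sequences sampled in `column`, after discarding burn-in.

    `lines` is any iterable of text lines (for example an open log file).
    Assumption: tab-delimited rows, '#' comment lines, one sequence per row.
    """
    rows = [ln for ln in lines if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(rows, delimiter="\t")
    seqs = [row[column].strip().strip('"') for row in reader]
    return seqs[int(len(seqs) * burnin_fraction):]


def consensus(seqs):
    """Return (consensus sequence, per-site frequency of the chosen base)."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be non-empty and equal in length")
    bases, freqs = [], []
    for site in zip(*seqs):  # one tuple of sampled bases per alignment column
        base, count = Counter(site).most_common(1)[0]
        bases.append(base)
        freqs.append(count / len(seqs))
    return "".join(bases), freqs
```

To try it on a real run you would pass an open file, e.g. `read_logged_sequences(open("cladeA.states.log"), "statesA")`, and check the header line of your own log first, since the column name may differ. The per-site frequencies give a crude measure of how well supported each base is across the posterior sample.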

Andrew


On 26 Mar 2013, at 16:40, Brian Muchmore <bmuc...@gmail.com> wrote:

I have been interested in something like this for months, so it is great to see I am not alone, and it seems to me a lot of other people would be interested in this capability too.  I have some computer and evolutionary biology savvy, but truthfully, I am basically just a noob figuring things out one problem at a time.  Any chance you would write a much more robust, easier to use version?  I know I would appreciate it, and I highly doubt I would be the only one.

Brian Muchmore

Mar 29, 2013, 7:18:44 AM
to beast...@googlegroups.com
Great, thanks for the response.  Also, for other users who may not be aware, I prefer to limit the number of programs I utilize, but you can also use a program like the FastML server to reconstruct ancestral sequences.  Check out this paper for some more general info: Robustness of Ancestral Sequence Reconstruction to Phylogenetic Uncertainty.

Also, a brief aside to all the experts who help out others: as a BEAST user for about a year now, I have come to realize that the power and popularity of the program lie as much in the (constant) guidance and support of the developers as in any aspect of the software itself. This program can be a struggle for the uninitiated, but it's nice to know there are people out there willing to help.


--
You received this message because you are subscribed to a topic in the Google Groups "beast-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/beast-users/jSeWxDTvts4/unsubscribe?hl=en.
To unsubscribe from this group and all its topics, send an email to beast-users...@googlegroups.com.

Arman Bilge

Mar 29, 2013, 11:38:19 PM
to beast...@googlegroups.com, chervin...@gmail.com
Hi Brian,

So I have gone ahead and made a new version of that little utility. Currently you specify at least two taxa and the program spits out the node attributes for their most recent common ancestor to the screen (or easily to a file using standard shell redirection) which will include the reconstructed ancestral sequences. You will likely need to give the JVM additional memory with the -Xms and -Xmx arguments.

Example Usage:
java -Xms512m -Xmx2G -jar AncestralSequenceRescue.jar path/to/annotated.tre taxon1 taxon2 taxon3 ...

I hope this is useful and do let me know if you find any bugs or have suggestions for features/improvements.

Best,
Arman


AncestralSequenceRescue.jar

Brian Muchmore

Mar 30, 2013, 1:03:31 AM
to beast...@googlegroups.com
That is awesome, thanks.  I'll try using it sometime next week, and I'll let you know how I fare.



Brian Muchmore

Apr 24, 2013, 3:22:55 AM
to beast...@googlegroups.com
Arman,

I just got around to trying this, and so far, it gives me exactly what I want.  Thank you very much.  If I use this in a publication I will be sure to reference you somehow.

For anybody else, if you have zero command line experience, and need a little hand holding, just drop me an email, and I will make it as straightforward as I can.

seanyu...@gmail.com

Jul 1, 2013, 4:08:20 PM
to beast...@googlegroups.com
Hi Arman,
 
This tool is great! But why does it produce two or more ancestral sequences for a node? That is what I don't understand.
 
Thanks,
 
Sean

Arman Bilge

Jul 3, 2013, 11:35:22 AM
to beast...@googlegroups.com
Hi Sean,

Glad to hear that it is of help! Regarding your question, I think that this is one for the BEAST developers. My understanding has been that the set of reconstructed sequences at the node is accompanied by a trait whose name ends in ".probs", which gives the posterior probability of each reconstruction.
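
If that understanding is right, picking the best-supported reconstruction for a node comes down to pairing the two lists and sorting. The Python below is a hedged sketch of that idea. The attribute names `states.set` and `states.set.prob`, and the `key={a,b,...}` layout, are assumptions about how the annotation is written, so check them against your own tree file and pass different keys if needed. The function names are hypothetical.

```python
import re


def parse_braced_list(annotation, key):
    """Return the items of `key={a,b,...}` found in one node annotation."""
    pattern = r"(?<![\w.])" + re.escape(key) + r"=\{([^}]*)\}"
    match = re.search(pattern, annotation)
    if match is None:
        raise KeyError(key)
    return [item.strip().strip('"') for item in match.group(1).split(",")]


def ranked_reconstructions(annotation,
                           set_key="states.set",        # assumed attribute name
                           prob_key="states.set.prob"):  # assumed attribute name
    """Pair each reconstructed sequence with its probability, best first."""
    seqs = parse_braced_list(annotation, set_key)
    probs = [float(p) for p in parse_braced_list(annotation, prob_key)]
    if len(seqs) != len(probs):
        raise ValueError("sequence and probability lists differ in length")
    return sorted(zip(seqs, probs), key=lambda pair: pair[1], reverse=True)
```

The first element of the returned list is the most probable sequence at that node; the rest show how much posterior support the alternatives have.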

Best,
Arman


Arman Bilge

Jul 7, 2013, 4:52:49 PM
to beast...@googlegroups.com
Hi all,

Just letting you (and any future readers) know that I have moved this utility over to GitHub and that you can always find the latest version at the following link:

I have also added a rough GUI to this version, which I hope some of you will find easier to use than the command line. I am always open to ideas for improvements, as well as bug reports.

Arman

Brittany Rife

Jan 17, 2017, 8:47:10 PM
to beast-users
Arman,

I am having the same problem, but I am primarily interested in the tree and have zero Java coding skills :( Is there any way to modify this so that I can extract both the ASR and the tree without the ASR annotations as separate files?

Andrew Rambaut

Jan 18, 2017, 4:23:16 AM
to beast...@googlegroups.com
If you know in advance which TMRCAs you wish to reconstruct on you can create log files for each individual TMRCA on which you wish to reconstruct the ancestral state:

<log logEvery="1000" fileName="HK.states.log">
    <ancestralTrait name="HA" traitName="states">
        <treeModel idref="treeModel"/>
        <ancestralTreeLikelihood idref="treeLikelihood"/>
        <mrca>
            <taxa idref="HK"/>
        </mrca>
    </ancestralTrait>
</log>

This avoids having very large tree files with reconstructions for every node (and very slow treeannotator reconstructions). It will still need some processing to get modal state values across states (and probabilities if required).

BEAUti generates the reconstruction element above, but only for one TMRCA (or the root); you can add as many of these as you need in the XML.
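
For many clades, writing these blocks by hand gets tedious, so here is a small Python sketch that generates one `<log>` element per taxon set for pasting into the XML. It only reuses the element and attribute names shown above. The `states_<clade>` naming for the `name` attribute and the `<clade>.states.log` file names are made-up conventions, and the `idref` values `treeModel` and `treeLikelihood` are assumed to match the ids in your own file.

```python
import xml.etree.ElementTree as ET


def ancestral_log_element(clade_id, log_every=1000):
    """Build one <log> element that logs the ancestral state at a clade's MRCA."""
    log = ET.Element("log", logEvery=str(log_every),
                     fileName=f"{clade_id}.states.log")  # hypothetical file name
    trait = ET.SubElement(log, "ancestralTrait",
                          name=f"states_{clade_id}",      # hypothetical column name
                          traitName="states")
    ET.SubElement(trait, "treeModel", idref="treeModel")
    ET.SubElement(trait, "ancestralTreeLikelihood", idref="treeLikelihood")
    mrca = ET.SubElement(trait, "mrca")
    ET.SubElement(mrca, "taxa", idref=clade_id)
    return log


def render_log_blocks(clade_ids, log_every=1000):
    """Return the XML text for all clades, one <log> block per taxon set."""
    blocks = []
    for clade_id in clade_ids:
        element = ancestral_log_element(clade_id, log_every)
        if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9+
            ET.indent(element)
        blocks.append(ET.tostring(element, encoding="unicode"))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(render_log_blocks(["cladeA", "HK"]))
```

Each taxon set id passed in must already be defined in the XML (for example via the 'Taxa' tab of BEAUti), otherwise the `idref` will not resolve.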

Andrew

Brittany Rife

Jan 18, 2017, 9:42:28 AM
to beast-users
Thank you, Andrew; this is very helpful for the future! But I have a tree from a very long run, and, rather than re-running BEAST, I would very much like to just extract the tree and ASR info separately so that I can view and re-purpose the tree. Do you know of a way to do this?
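
One possible approach, offered as a sketch and not as a tested BEAST utility, is to strip the `[&...]` annotation comments from the tree file with a few lines of Python and save them separately. It assumes annotations never contain a `]` character and it leaves the `[&R]` / `[&U]` rooting comments in place. The function names and file paths are hypothetical.

```python
import re

# Matches '[&...]' annotation comments but not the '[&R]' / '[&U]' rooting flags.
# Assumption: an annotation never contains a ']' character.
ANNOTATION = re.compile(r"\[&(?![RU]\])[^\]]*\]")


def strip_annotations(text):
    """Return the tree text with all node and branch annotations removed."""
    return ANNOTATION.sub("", text)


def split_tree_file(src_path, tree_path, annotations_path):
    """Write a bare copy of the tree file plus a file of the annotations.

    Annotations are written one per line, in the order they occur in the file.
    """
    with open(src_path) as src, \
         open(tree_path, "w") as bare, \
         open(annotations_path, "w") as notes:
        for line in src:
            for match in ANNOTATION.finditer(line):
                notes.write(match.group(0) + "\n")
            bare.write(strip_annotations(line))
```

The bare file should then open in a tree viewer without the sequence annotations. Note that a tree is usually a single line, so this still loads the whole tree into memory at once, and the annotation file alone does not record which node each entry belongs to.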

man...@umich.edu

Jun 4, 2018, 4:35:17 PM
to beast-users
Can you clarify a couple of points on this program for me? Does it only work if you select 'Reconstruct states at all ancestors' on the States tab of BEAST 1? Does the tree need to be annotated? I am getting an error message from an unannotated tree generated from BEAST 2 without selecting a "Reconstructed ancestor state". Should I stick with BEAST 1 for sequence reconstruction? Thank you in advance.

Jack Dorman

Sep 17, 2024, 2:37:25 PM
to beast-users
Hi all,

I wanted to follow up in this thread to see if there have been any updates on parsing sequence reconstructions from the NEXUS files. I want to extract reconstructions at each node, but the MCC tree file is so huge that all the parsing functions I've tried have struggled with it. I can manage my own extremely rough parsing in Python, but it's not very reproducible for other runs. Further, some nodes have far more sequences (it seems like thousands), while others have only a few or one in the MCC tree. The Java program listed earlier seems useful, but this computer has tight security permissions and is often fussy about running Java. Any help would be greatly appreciated.
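
For files too large to load at once, a chunked scan is one option that needs only the Python standard library. The sketch below yields each `[&...]` block in file order while holding at most one chunk plus one unfinished block in memory. It is an illustration under assumptions, not a full NEXUS parser: it assumes annotations never contain `]`, it also yields the `[&R]` rooting comment, and it does not track which node an annotation belongs to. The function name and chunk size are arbitrary.

```python
import io


def iter_annotations(stream, chunk_size=1 << 20):
    """Yield each '[&...]' block from a text stream, reading fixed-size chunks."""
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of file; any unfinished block is dropped
        buffer += chunk
        pos = 0
        while True:
            start = buffer.find("[&", pos)
            if start == -1:
                # Keep a trailing '[' in case '[&' straddles two chunks.
                buffer = "[" if buffer.endswith("[") else ""
                break
            end = buffer.find("]", start)
            if end == -1:
                buffer = buffer[start:]  # incomplete block; wait for more data
                break
            yield buffer[start:end + 1]
            pos = end + 1


if __name__ == "__main__":
    demo = io.StringIO('tree T = [&R] (A[&states="AC"]:1,B[&states="AG"]:1);')
    for block in iter_annotations(demo, chunk_size=8):
        print(len(block), block[:40])
```

Printing the length of each block, as in the demo, is a quick way to see which nodes carry very large sets of reconstructed sequences before deciding how to process them.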

Best,
Jack Dorman
PhD Candidate
QVEU-LVD-NIAID
Cohort of 2022, NIH-JHU GPP