Fossil MRCA Calibrations in BEAST

Nitish Narula

Jul 18, 2014, 12:01:05 AM
to beast...@googlegroups.com
Hi,

I am working with a dataset containing 748 taxa and 24 partitions (each containing between 173 and 1,900 sites) for a total of ~10k sites. I am running BEAST 2.1.3 (with BEAGLE) on a server with 48 cores and 64 GB RAM.

As a test, I have successfully run BEAST on the 10 MB xml file, which includes all 24 site models, a linked clock with a relaxed log-normal clock model, a Yule prior, and a linked tree with a fixed topology (I fixed the topology based on the information on the BEAST wiki).

We have 48 calibrations (including one for the root node) to use. We generated a starting tree using the 'chronos' utility in the 'ape' package in R, using a topology we consider reliable (generated with a thorough RAxML run). All 48 calibrated interior nodes on the starting tree, including the root, fit within their minimum and maximum ages (I have checked them visually and programmatically). This was the fixed tree used in the first test run mentioned above.

When I add the 48 calibrations to the xml file, each as a uniform prior, and change to the Calibrated Yule model, BEAST2 loads the file successfully but does not start and hangs indefinitely (well, for at least 48 hours). On screen I can see the section where the citations are listed. Usually after this point either the starting likelihood is displayed and generations are logged, or an error is displayed. But I see neither of those.

So to test further, I included only one prior (one for the root node) and I had success: on screen, BEAST2 went past the citation section and calculated the starting likelihood, etc.

Adding the next prior (the next oldest node), BEAST2 gives an error: "Start likelihood: -Infinity after 11 initialisation attempts ..." In the error I can see that the starting likelihood of the second prior was -Infinity. Wondering if the second prior was problematic, I used a different prior as the second prior, but got the same error. I have changed the order of the priors, and randomly chosen a calibration as the first prior, but apparently only the prior for the root node works as the first prior.
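
As an illustration of this error, here is a minimal sketch (not BEAST's actual code) of why a single uniform calibration can make the starting value -Infinity. The function names, clade names and ages are hypothetical. A uniform prior has log density -infinity outside its bounds, so if the node that BEAST associates with a calibration has an age outside those bounds, the summed log prior of the starting state is -Infinity regardless of the other priors.

```python
import math

def uniform_log_density(age, lower, upper):
    """Log density of a uniform prior on [lower, upper]; -inf outside the bounds."""
    if lower <= age <= upper:
        return -math.log(upper - lower)
    return -math.inf

def start_log_prior(node_ages, calibrations):
    """Sum the log densities of all calibrations for one starting tree."""
    return sum(
        uniform_log_density(node_ages[name], lower, upper)
        for name, (lower, upper) in calibrations.items()
    )

# Hypothetical calibrations: clade name -> (minimum age, maximum age).
calibrations = {"root": (90.0, 110.0), "cladeA": (40.0, 60.0)}

# Node ages as intended: every calibrated node lies inside its bounds.
good_tree = {"root": 100.0, "cladeA": 50.0}

# Same tree, but the calibration for cladeA ends up on a different, older node.
mixed_up_tree = {"root": 100.0, "cladeA": 75.0}

print(start_log_prior(good_tree, calibrations))
print(start_log_prior(mixed_up_tree, calibrations))
```

In this sketch the root calibration alone always evaluates to a finite value, while any second calibration that lands on the wrong node gives -inf, which is consistent with the pattern described above.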

I have looked at the BEAST FAQ dealing with this error. Based on that, I ran other tests, which included changing from uniform priors to, say, log-normal, changing the initial substitution and clock rates, changing the branch lengths from MYA to YA instead of changing rates, using different start trees, running BEAST2 with the '-beagle_scaling always' option, changing the precision option, doubling the heap space memory for Java, and even changing the upper and lower bounds on the uniform priors. All lead to the same error: "Start likelihood: -Infinity after 11 initialisation attempts ..." for the second prior.

I would like to hear about other people's experience with calibrations in BEAST and whether anyone has come across similar issues. Why doesn't BEAST report anything when all 48 calibrations are included? What is the issue when I use just two priors and can't initialize? Please let me know if I can elaborate on this problem.

Thank you,
Nitish Narula

Alexei Drummond

Jul 18, 2014, 4:46:29 AM
to beast...@googlegroups.com, Remco Bouckaert, Sasha, Joseph Heled
Dear Nitish,

Does it run with the normal Yule prior instead of the “calibrated Yule prior”? 

The calibrated Yule prior uses an analytical result for one calibration (which is fast), but with more than one calibration it uses a numerical approach whose time complexity is exponential in the number of calibrations, so I guess that it's not going to work for 48 calibrations. If you don’t mind the “multiplicative prior” in BEAST then you can just select the normal Yule prior instead (read Heled and Drummond 2012 to understand the difference).

But if you want to be really bleeding edge you should use the fossilised birth-death tree prior, which has none of the computational problems of the calibrated Yule model, is super cool and includes speciation, extinction and sampling rates:



The first link is a PNAS paper describing the model for fossil calibration. The second, an arXiv paper, describes a new BEAST 2 package that implements the fossilised birth-death tree prior for fossil calibration. The second paper is in revision at PLoS Comp Bio at the moment, but the software is already available.

Cheers
Alexei

--
You received this message because you are subscribed to the Google Groups "beast-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beast-users...@googlegroups.com.
To post to this group, send email to beast...@googlegroups.com.
Visit this group at http://groups.google.com/group/beast-users.
For more options, visit https://groups.google.com/d/optout.

Nitish Narula

Jul 22, 2014, 9:19:59 PM
to beast...@googlegroups.com, re...@cs.auckland.ac.nz, gavryu...@gmail.com, jhe...@gmail.com
Hi Alexei,

Thanks for your suggestions. I am looking into the sampled ancestors package. Is there any tutorial for it out there? Or should I just try to model my file on the fossil.xml example file?

Using the normal Yule prior instead of the calibrated Yule solved the issue of BEAST2 hanging. However, I am back to the problem where the second MRCA prior probability is -Infinity. I am leaning towards the starting tree being the issue, but I don't know what the problem is, since I have checked the node branching times (using 'ape') and they fall within the age minima and maxima. Since I had to set up 48 uniform MRCA calibrations in the xml file, I generated that xml block using some scripts. I am confident that the xml block is fine, since neither BEAST2 nor BEAUti complains when I try to load the file.
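
For readers who want to script such a block, the following is a minimal sketch, not the script used in this thread. The element and attribute names (`beast.math.distributions.MRCAPrior`, `TaxonSet`, `Uniform` with `name="distr"`) follow what BEAUti typically writes for BEAST 2 and should be treated as assumptions to check against your own file. The clade name, taxon names, bounds and the `mrca_prior` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

def mrca_prior(clade_id, taxa, lower, upper, tree_ref="@Tree.t:p1"):
    """Build one uniform MRCA calibration as an XML element.

    Assumption: the layout mirrors BEAUti-style BEAST 2 XML; verify the
    names against a file written by your own BEAUti version.
    """
    dist = ET.Element("distribution", {
        "id": f"{clade_id}.prior",
        "spec": "beast.math.distributions.MRCAPrior",
        "monophyletic": "true",
        "tree": tree_ref,
    })
    taxonset = ET.SubElement(dist, "taxonset", {"id": clade_id, "spec": "TaxonSet"})
    for name in taxa:
        # Each taxon is referenced by the id it already has in the alignment.
        ET.SubElement(taxonset, "taxon", {"idref": name})
    ET.SubElement(dist, "Uniform", {
        "id": f"Uniform.{clade_id}",
        "name": "distr",
        "lower": str(lower),
        "upper": str(upper),
    })
    return dist

# Hypothetical calibration for a two-taxon clade.
block = mrca_prior("cladeA", ["taxon1", "taxon2"], 40.0, 60.0)
xml_text = ET.tostring(block, encoding="unicode")
print(xml_text)
```

Note that a block like this being well-formed, and loading without complaint, only shows that the XML parses. It does not show that each calibration is evaluated on the intended node of the starting tree.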

Are there other things I may be glossing over but really should be thinking about?

Thanks,
Nitish

Alexei Drummond

Jul 22, 2014, 10:53:01 PM
to beast...@googlegroups.com, Remco Bouckaert, gavryu...@gmail.com, Joseph Heled
Can you email me the full error log?

Alexei

Nitish Narula

Jul 22, 2014, 11:05:22 PM
to beast...@googlegroups.com, Remco Bouckaert, gavryu...@gmail.com, Joseph Heled
See attached. Is this what you need? Let me know.

Thanks,
Nitish



beasterrorlog.txt

Remco Bouckaert

Jul 23, 2014, 9:14:37 PM
to beast...@googlegroups.com, re...@cs.auckland.ac.nz, gavryu...@gmail.com, jhe...@gmail.com, nitishn...@gmail.com
Hi Nitish,

In the XML you sent, I noticed that the TreeParser element does not refer to a taxon set.

The file starts if you add taxa='@p1', so that it reads

    <init id="StartingTree.t:p1" initial="@Tree.t:p1" spec="beast.util.TreeParser" IsLabelledNewick="true" taxa='@p1'>

instead of

    <init id="StartingTree.t:p1" initial="@Tree.t:p1" spec="beast.util.TreeParser" IsLabelledNewick="true">

Without this, taxa can be mixed up, with the result that the calibrations will not be valid.
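
A quick way to check a file for this problem is sketched below. It is a hypothetical helper, not part of BEAST: it lists every element whose `spec` ends in `TreeParser` and that has no `taxa` attribute. It assumes the taxon set is given in attribute form as in the line above, so a file that supplies it some other way would need a different check. The second `init` element in the sample is invented for contrast.

```python
import xml.etree.ElementTree as ET

def tree_parsers_missing_taxa(xml_text):
    """Return the ids of TreeParser elements that have no taxa attribute."""
    root = ET.fromstring(xml_text)
    return [
        element.get("id")
        for element in root.iter()
        if element.get("spec", "").endswith("TreeParser")
        and element.get("taxa") is None
    ]

# Minimal stand-in for a BEAST 2 file; only the init elements matter here.
sample = """<beast>
  <init id="StartingTree.t:p1" initial="@Tree.t:p1"
        spec="beast.util.TreeParser" IsLabelledNewick="true"/>
  <init id="StartingTree.t:p2" initial="@Tree.t:p2"
        spec="beast.util.TreeParser" IsLabelledNewick="true" taxa="@p2"/>
</beast>"""

print(tree_parsers_missing_taxa(sample))
```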

Cheers,

Remco

Nitish Narula

Jul 23, 2014, 9:19:12 PM
to Remco Bouckaert, beast...@googlegroups.com, Remco Bouckaert, Alexandra Gavryushkina, Joseph Heled
Remco, 

Thank you so much. After weeks of trying to get it to work, it was this element that I had missed.

Thanks,
Nitish

Remco Bouckaert

Jul 23, 2014, 10:02:55 PM
to beast...@googlegroups.com, higg...@gmail.com, re...@cs.auckland.ac.nz, gavryu...@gmail.com, jhe...@gmail.com, nitishn...@gmail.com
Hi Nitish, sorry for the agonising experiences. Clearly, the wiki on setting up a starting tree should be clarified.

An issue was raised (https://github.com/CompEvol/beast2/issues/98), so it should be fixed in the next release.

Cheers,

Remco

Nitish Narula

Jul 23, 2014, 11:08:01 PM
to Remco Bouckaert, beast...@googlegroups.com, Remco Bouckaert, Alexandra Gavryushkina, Joseph Heled
Hi Remco,

It's not a big deal. 

When I used the instructions on the wiki (http://www.beast2.org/wiki/index.php/Fix_starting_tree) I got an error. 

So instead I used the instructions from Justin Bagley's blog (http://www.justinbagley.org/511/got-starting-tree-add-beast2-xml-file-make-work). There he also describes the same error if you use the instructions from the wiki (near the bottom of the post). However, he did miss the taxa element in his xml block.

Thanks again,
Nitish


Alexei Drummond

Jul 23, 2014, 11:28:48 PM
to beast...@googlegroups.com, Remco Bouckaert, Remco Bouckaert, Alexandra Gavryushkina, Joseph Heled
Hi Nitish,

I have just committed a small change to the TreeParser.java class so your original XML will work in the next release. With the change, specifying an initial attribute together with IsLabelledNewick="true" will be enough.

Cheers
Alexei

Nitish Narula

Jul 24, 2014, 12:08:41 AM
to beast...@googlegroups.com, Remco Bouckaert, Remco Bouckaert, Alexandra Gavryushkina, Joseph Heled
Hi Alexei, and Remco,

Sounds good. Thanks to both of you for your prompt response.

We are very interested in using the fossilized birth-death model for this analysis, so I am thinking of modeling my xml file on the fossil.xml example file in the sampled ancestors package. From what I can gather, I need to:
a) Add the 48 calibrations as taxa to all the alignments (with missing data).
b) Add the same 48 taxa to the starting tree.
c) Define monophyly constraints.
d) Define tip dates for all taxa in the tree: all but the 48 fossil taxa will have a date of 0, while the 48 fossil taxa will each have an age.

Based on (d), I am guessing the starting tree branch lengths of the fossil taxa in (b) should match the tip dates in (d). How about the rest of the branch lengths? Does the tree need to be ultrametric?
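
Steps (a) and (d) above can be sketched as follows. This is an illustration under assumptions rather than the package's documented procedure: it assumes tip dates are supplied as a comma-separated list of `taxon=age` pairs and that a fossil without sequence data is represented by a run of `?` characters, both of which should be checked against the fossil.xml example. The taxon names, ages and helper functions are hypothetical.

```python
def tip_date_trait(extant_taxa, fossil_ages):
    """Build a 'taxon=age' list: extant taxa at age 0, fossils at their own age."""
    entries = [f"{name}=0" for name in extant_taxa]
    entries += [f"{name}={age}" for name, age in fossil_ages.items()]
    return ",".join(entries)

def missing_sequence(length):
    """All-missing sequence for a fossil taxon, matching the partition length."""
    return "?" * length

# Hypothetical example: two extant species and one fossil.
extant = ["species1", "species2"]
fossils = {"fossilA": 12.5}

print(tip_date_trait(extant, fossils))
print(missing_sequence(10))
```

With 24 partitions, each fossil would need one all-missing sequence per alignment, with the length of that partition.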

Also, any suggestions on a package I could use to add the taxa to the tree? I was thinking of using 'ape' library in R.

Thanks,
Nitish

Nitish Narula

Jul 25, 2014, 12:53:52 AM
to Alexandra Gavryushkina, beast...@googlegroups.com, Remco Bouckaert, Remco Bouckaert, Joseph Heled
Thanks, Sasha.

Since we are fixing the topology in our current divergence dating analysis (using Yule model), we were wondering what that means for the fossilized birth death analysis. Is there a way to fix the backbone topology and let the fossil taxa move within their monophyletic clades? I guess not. We are not opposed to re-estimating the topology but if there was a way to fix the topology of the extant species, it would be convenient and perhaps quicker. 

Best,
Nitish


On 25 July 2014 08:47, Alexandra Gavryushkina <gavryu...@gmail.com> wrote:
Hi Nitish

I am back from holiday and can help you with the package now. Everything you listed is correct.

I am also working on the possibility of setting a random starting tree that obeys the constraints, so that you don't need to worry about the branch lengths. When I finish I will send you a new example xml with a random tree.

Cheers,
Sasha. 


Remco Bouckaert

Jul 28, 2014, 4:07:03 PM
to beast...@googlegroups.com, higg...@gmail.com, re...@cs.auckland.ac.nz, gavryu...@gmail.com, jhe...@gmail.com, nitishn...@gmail.com
There is now a more complete description of setting up starting trees here:
http://blog.beast2.org/2014/07/28/all-about-starting-trees/

Remco