Tree Errors, Messages, Assumptions and Fixes? BEAST v1.8.2


Mel Melendrez

Dec 22, 2015, 12:45:30 PM
to beast-users
Greetings,

I attempted to use a user-defined tree. I imported the rooted NEXUS tree into BEAUti with the export format set to Newick, generated the XML, and confirmed that the tree was indeed in Newick format on the newick line. I then ran BEAST and got the error below.

Random number seed: 1450803724679


Parsing XML file: 7.HA.BEASTA_discreteRecomb_strict1.all.xml
  File encoding: UTF8
Looking for plugins in /media/VD_Research/Analysis/ProjectBased_Analysis/melanie/share/Issue_10712/plugins
Read alignment: alignment
  Sequences = 559
      Sites = 1701
   Datatype = nucleotide
Site patterns 'CP1+2.patterns' created by merging 2 pattern lists
  pattern count = 209
Site patterns 'CP3.patterns' created from positions 3-1701 of alignment 'alignment'
  only using every 3 site
  pattern count = 567
Read attribute patterns, 'ReassortGrp.pattern' for attribute, ReassortGrp
Parsing error - poorly formed BEAST file, 7.HA.BEASTA_discreteRecomb_strict1.all.xml:
Error parsing '<newick>' element with id, 'null':
error parsing tree in newick element - Expecting ',' in tree, but got ' '

Upon review, there were no spaces in the newick line.

What I did see, though, was that when BEAUti exported the tree into the XML file, it broke my tree up onto more than one line. I used vim to search for spaces and there were none, but there were stray newlines and tabs splitting my tree up.

Perhaps a more informative error message would be:
error parsing tree in newick element - Expecting 1 line, but got >1

I've attached the XML that BEAUti gives me, and you can see the break-up on the newick line. We fixed this by removing all of the newlines and tabs so that the tree sat on a single line.
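For anyone hitting the same error, the manual vim fix can be scripted. This is a minimal sketch (not part of BEAST or BEAUti) that collapses the whitespace inside each `<newick>` element of an exported XML file; it assumes taxon labels contain no meaningful whitespace, which is the usual case for BEAUti-exported trees:

```python
import re

def collapse_newick(xml_text):
    """Collapse whitespace inside every <newick>...</newick> element
    so the tree string sits on a single line."""
    def _fix(match):
        open_tag, body, close_tag = match.groups()
        # Strip the newlines/tabs inserted when the tree was line-wrapped.
        return open_tag + re.sub(r"\s+", "", body) + close_tag
    return re.sub(r"(<newick[^>]*>)(.*?)(</newick>)", _fix,
                  xml_text, flags=re.DOTALL)
```

Run it over the whole XML file and write the result back out; only the contents of `<newick>` elements are touched.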

Once I remedied this, the file ran but then gave me a zero-probability state error.

 ---------------------------------
Creating ctmcScalePrior model.
    If you publish results using this prior, please reference:
         1. Ferreira and Suchard (2008) for the conditional reference prior on CTMC scale parameter prior;
Constructing a cache around likelihood 'null', signal = ReassortGrp.rates
Likelihood computation is using a pool of 1 threads.
Creating the MCMC chain:
  chainLength=100000000
  autoOptimize=true
  autoOptimize delayed for 1000000 steps
Underflow calculating likelihood. Attempting a rescaling...
Error running file: 7.HA.BEASTA_discreteRecomb_strict1.all.xml
The initial model is invalid because state has a zero probability.

If the log likelihood of the tree is -Inf, this may be because the
initial, random tree is so large that it has an extremely bad
likelihood which is being rounded to zero.

Alternatively, it may be that the product of starting mutation rate
and tree height is extremely small or extremely large.

Finally, it may be that the initial state is incompatible with
one or more 'hard' constraints (on monophyly or bounds on parameter
values). This will result in Priors with zero probability.

The individual components of the posterior are as follows:
The initial posterior is zero:
  CompoundLikelihood(compoundModel)=(
    LogNormal(CP1+2.kappa)=-1.8654,
    LogNormal(CP3.kappa)=-1.8654,
    Uniform(CP1+2.mu)=0.0,
    Uniform(CP3.mu)=0.0,
    Uniform(CP1+2.frequencies)=0.0,
    Uniform(CP3.frequencies)=0.0,
    Exponential(CP1+2.alpha)=-0.3069,
    Exponential(CP3.alpha)=-0.3069,
    Gamma(7.HA.BEASTA.all.clock.rate)=2.9321,
    CTMCScalePrior(ctmcScalePrior)=-76543.4367,
    Gamma(skyride.precision)=-6.9151,
    Poisson(ReassortGrp.nonZeroRates)=-686.8347,
    Uniform(ReassortGrp.frequencies)=0.0,
    CachedDistributionLikelihood(cachedPrior)=-342.0,
    Uniform(ReassortGrp.root.frequencies)=0.0,
    GMRFSkyrideLikelihood(gmrfSkyrideLikelihood[skyride])=NaN,
    SVSComplexSubstitutionModel(generalSubstitutionModel[ReassortGrp.model])=0.0
    Total = NaN
  ),
  CompoundLikelihood(compoundModel)=(
    AncestralStateBeagleTreeLikelihood(treeLikelihood[CP1+2.treeLikelihood])=-6686.4152,
    AncestralStateBeagleTreeLikelihood(treeLikelihood[CP3.treeLikelihood])=-7878.8274,
    AncestralStateBeagleTreeLikelihood(treeLikelihood[ReassortGrp.treeLikelihood])=-1645.9414
    Total = -16211.184012880509
  )
  Total = NaN


I've run this file successfully before; the only differences this time were the addition of my tree and of the traits. I switched to a UPGMA tree and got the same error. When I changed to a random starting tree, it ran just fine. I'm curious why my BEAST consensus tree (rooted per the BEAUti instructions on the Trees tab) from a previous analysis with good ESS values/results would not work to inform this run?

The problem with the random starting tree is that I really wanted to use the user-defined consensus tree I'd generated with BEAST before to inform the analysis, since I am only overlaying traits for a discrete analysis rather than redoing the entire run. Right now it's reporting 2 hrs/million states, so running 100 million states = 200 hrs. I was hoping that informing the analysis with the consensus tree from the previous BEAST run would make it run faster.

Is this an incorrect assumption?

Are these issues fixed in BEAST 2? Right now only v1.8 is approved for use at my facility until I can get v2 approved.

Cheers,
Mel


7.HA.BEASTA_discreteRecomb_strict1.all_1.xml

Andrew Rambaut

Dec 22, 2015, 5:21:15 PM
to beast...@googlegroups.com
Dear Mel,

The pertinent line here (in the second part of your posting) is:

    GMRFSkyrideLikelihood(gmrfSkyrideLikelihood[skyride])=NaN, 

So this is suggesting the skyride model is having numerical issues with your starting tree and starting values. I would try setting the initial population sizes to something smaller (for the skyride, these are given as log pop sizes so try setting them to 0.0).
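As an illustration of that suggestion, the skyride population sizes live in a parameter block of the BEAUti-generated XML roughly like the following. This is a sketch, not the poster's actual file: the ids (`skyride`, `skyride.logPopSize`) follow common BEAUti defaults and may differ, and the dimension of 558 assumes one log population size per coalescent interval (taxa − 1) for the 559-sequence alignment:

```xml
<!-- Hypothetical sketch of a BEAUti-generated skyride block.
     value="0.0" starts every interval's log population size at 0,
     i.e. an initial population size of e^0 = 1. -->
<gmrfSkyrideLikelihood id="skyride" timeAwareSmoothing="true">
    <populationSizes>
        <parameter id="skyride.logPopSize" dimension="558" value="0.0"/>
    </populationSizes>
    <!-- precisionParameter, groupSizes and populationTree elements omitted -->
</gmrfSkyrideLikelihood>
```

Only the `value` attribute needs changing; the rest of the block should be left as BEAUti wrote it.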

To answer your final question - I don’t think picking a tree from a previous run is going to help enormously. It may help shorten the burnin - because the tree is already from an area of reasonable probability. But the burnin will (should) generally be a relatively small proportion of your total run. If you need to remove more than 10% of the run as ‘burnin’ it is unlikely that what remains will have high ESSs (i.e., it is a sufficient sample). Thus if shortening your burnin is an important factor in the run time, then you are not running the chain for long enough.

Andrew

<7.HA.BEASTA_discreteRecomb_strict1.all_1.xml>

Mel Melendrez

Dec 23, 2015, 3:30:38 PM
to beast-users
Great, thanks Andrew!