2 issues with ancestral state reconstruction of discrete traits


RobinvanVelzen

Mar 15, 2012, 10:42:56 AM
to beast-users
Dear all,

I would like to use BEAST for estimating ancestral states of discrete
(binary in my case) traits, which is possible with the latest
version (1.7.0). I imported the trait, created a partition from it and
unlinked the substitution (asymmetric) and clock models (strict). This
works fine. However, I would also like to set a node calibration and
reconstruct the state change counts, and both have issues:

1. When I tick the 'Reconstruct state change counts' toggle there is
no change in the analyses or files (as far as I can tell). An 'Ancestral
state reconstruction' item is inserted in the 'write log to file'
section of the xml file but this is empty (see xml excerpt below this
message).
Does anyone know how to estimate the number of state changes? Is there
code available that I can insert in the xml file?

2. When, after specifying the settings described above, I set a lognormal
prior on the tmrca of a taxon set, BEAUti refuses to generate an xml
file. Pressing the appropriate button simply has no effect. The only
solution is to manually edit the xml file.
Does anyone know if this is a bug or if there is a way I can solve
this?

Thanks in advance for any help or suggestions!

best wishes,

Robin van Velzen
Wageningen University

------------------------------------
<!-- write log to file -->
<log id="fileLog" logEvery="1000" fileName="Cymothoe84_traits.log.txt" overwrite="false">
    <posterior idref="posterior"/>
    <prior idref="prior"/>
    <likelihood idref="likelihood"/>
    <parameter idref="treeModel.rootHeight"/>
    <parameter idref="yule.birthRate"/>
    <parameter idref="kappa"/>
    <parameter idref="frequencies"/>
    <parameter idref="alpha"/>
    <parameter idref="ucld.mean"/>
    <parameter idref="ucld.stdev"/>
    <parameter idref="trait.clock.rate"/>
    <rateStatistic idref="meanRate"/>
    <rateStatistic idref="coefficientOfVariation"/>
    <rateCovarianceStatistic idref="covariance"/>

    <!-- START Ancestral state reconstruction -->

    <!-- END Ancestral state reconstruction -->

    <!-- START Discrete Traits Model -->
    <parameter idref="trait.rates"/>

    <!-- END Discrete Traits Model -->
    <treeLikelihood idref="treeLikelihood"/>
    <treeLikelihood idref="nDNA.treeLikelihood"/>

    <!-- START Discrete Traits Model -->
    <ancestralTreeLikelihood idref="h.treeLikelihood"/>
    <ancestralTreeLikelihood idref="e.treeLikelihood"/>

    <!-- END Discrete Traits Model -->
    <speciationLikelihood idref="speciation"/>
</log>




Andrew Rambaut

Mar 15, 2012, 10:46:24 AM
to beast...@googlegroups.com
Hi Robin,

These are issues that have been fixed in v1.7.1 which will be posted sometime
tomorrow (or today New Zealand time).

Andrew


RobinvanVelzen

Mar 16, 2012, 9:58:36 AM
to beast-users
Hi Andrew,

Many thanks for the reply. That is perfect timing of an update!

Indeed, the ancestral state reconstructions are written to the XML
file when using BEAUti v1.7.1. However, running it in BEAST gives
errors at the MCMC initialisation step (copied below this message). I
believe that this may be related to the BEAGLE libraries, which I do
not have installed (Windows OS). Is it necessary to have the BEAGLE
libraries in order to run ancestral state reconstructions, or can it be
done without?

Also, the issue of BEAUti hanging when setting ancestral state
reconstruction as well as a node calibration remains.

In addition, setting ancestral state reconstruction leads to the
creation of a rates log file which lacks the file name stem
(.rates.log).

Thanks again!

Robin

---errors when running ancestral state reconstruction--------

Creating the MCMC chain:
chainLength=10000000
autoOptimize=true
autoOptimize delayed for 100000 steps
Exception in thread "Thread-6" java.lang.NullPointerException
at dr.evomodelxml.tree.TreeLoggerParser$2.getIntent(Unknown Source)
at dr.evolution.tree.Tree$Utils.writeTreeTraits(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evomodel.tree.TreeLogger.log(Unknown Source)
at dr.inference.mcmc.MCMC$1.currentState(Unknown Source)
at dr.inference.markovchain.MarkovChain.fireCurrentModel(Unknown Source)
at dr.inference.markovchain.MarkovChain.runChain(Unknown Source)
at dr.inference.mcmc.MCMC.chain(Unknown Source)
at dr.inference.mcmc.MCMC.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

Andrew Rambaut

Mar 16, 2012, 10:04:06 AM
to beast...@googlegroups.com
Most of the ancestral state reconstruction stuff (and Markov Jumps in particular) requires the BEAGLE library. The ancestral state reconstruction code is in 'BEAST classic' but it is quite possible that we have let the code degrade because we always use BEAGLE (it is generally faster). So if you try using BEAGLE and it works, I will mark this as an issue with BEAST not demanding the use of BEAGLE.

Andrew

RobinvanVelzen

Apr 17, 2012, 5:19:08 AM
to beast-users
Dear Andrew et al.

I have finally tested my .xml file on BEAST 1.7.1 under BEAGLE and
unfortunately the problem remains. So apparently it is not an issue
with BEAST not demanding BEAGLE but rather something else. I ran some
test xml files and ancestral state reconstruction works OK, but
whenever I select the 'State Change Count Reconstruction' option (i.e.
using Markov Jumps) I get error messages and BEAST will not run (with
or without BEAGLE). In particular, there seems to be a problem with an
'unexpected element in treeLikelihood:mtDNA.treeLikelihood:
dr.inference.model.Parameter$Default'. I have copied the output below
this message.

Any help or suggestions would be very much appreciated, because I am
particularly interested in estimating the number of state changes in
my discrete (binary) trait over the phylogenetic tree.

Many thanks!

Robin

-------beast output------------------------
Parsing XML file: traits.xml
File encoding: UTF8
Looking for plugins in /home/biosys/BEASTv1.7.1/bin/plugins
Read alignment: alignment1
Sequences = 84
Sites = 1475
Datatype = nucleotide
Read alignment: alignment2
Sequences = 84
Sites = 2366
Datatype = nucleotide
Site patterns 'mtDNA.patterns' created from positions 1-1475 of alignment 'alignment1'
pattern count = 598
Site patterns 'nDNA.patterns' created from positions 1-2366 of alignment 'alignment2'
pattern count = 677
Read attribute patterns, 'empirical.pattern' for attribute, e
Using Yule prior on tree
Creating the tree model, 'treeModel'
initial tree topology =
[random starting tree here]
tree height = 193.5373539368146
Using discretized relaxed clock model.
over sampling = 1
parametric model = logNormalDistributionModel
rate categories = 1
Using strict molecular clock model.
Creating state frequencies model 'frequencies': Initial frequencies = {0.25, 0.25, 0.25, 0.25}
Creating site model:
4 category discrete gamma with initial shape = 0.5
initial proportion of invariant sites = 0.5
Unexpected element in treeLikelihood:mtDNA.treeLikelihood: dr.inference.model.Parameter$Default
Using BEAGLE TreeLikelihood
Branch rate model used: discretizedBranchRates
Using BEAGLE resource 1: GeForce GTX 480
Global memory (MB): 1535
Clock speed (Ghz): 1.40
Number of cores: 480
with instance flags: PRECISION_DOUBLE COMPUTATION_SYNCH EIGEN_REAL SCALING_MANUAL SCALERS_RAW VECTOR_NONE THREADING_NONE PROCESSOR_GPU
Ignoring ambiguities in tree likelihood.
With 598 unique site patterns.
Using rescaling scheme : dynamic (rescaling every 10000 evaluations)
Unexpected element in treeLikelihood:nDNA.treeLikelihood: dr.inference.model.Parameter$Default
Using BEAGLE TreeLikelihood
Branch rate model used: discretizedBranchRates
Using BEAGLE resource 1: GeForce GTX 480
Global memory (MB): 1535
Clock speed (Ghz): 1.40
Number of cores: 480
with instance flags: PRECISION_DOUBLE COMPUTATION_SYNCH EIGEN_REAL SCALING_MANUAL SCALERS_RAW VECTOR_NONE THREADING_NONE PROCESSOR_GPU
Ignoring ambiguities in tree likelihood.
With 677 unique site patterns.
Using rescaling scheme : dynamic (rescaling every 10000 evaluations)
Creating state frequencies model 'empirical.frequencies': Initial frequencies = {0.5, 0.5}
General Substitution Model (stateCount=2)
Using BSSVS General Substitution Model
Creating site model.
Using BEAGLE TreeLikelihood
Branch rate model used: strictClockBranchRates
Using BEAGLE resource 1: GeForce GTX 480
Global memory (MB): 1535
Clock speed (Ghz): 1.40
Number of cores: 480
with instance flags: PRECISION_DOUBLE COMPUTATION_SYNCH EIGEN_REAL SCALING_MANUAL SCALERS_RAW VECTOR_NONE THREADING_NONE PROCESSOR_GPU
Ignoring ambiguities in tree likelihood.
With 1 unique site patterns.
Using rescaling scheme : dynamic (rescaling every 10000 evaluations)
Creating swap operator for parameter mtDNA.branchRates.categories (weight=10.0)
Constructing a cache around likelihood 'null', signal = empirical.rates
Creating the MCMC chain:
chainLength=10000
autoOptimize=true
autoOptimize delayed for 100 steps
# BEAST v1.7.1, r4860
# Generated Tue Apr 17 10:57:44 CEST 2012 [seed=1334653059044]
state Posterior Prior Likelihood rootHeight mtDNA.ucld.mean empirical.clock.rate
0 -4355516.2224 -4311162.0787 -44354.1437 193.537 1.00000 1.00000 -
Exception in thread "Thread-2" java.lang.NullPointerException
at dr.evomodelxml.tree.TreeLoggerParser$2.getIntent(Unknown Source)
at dr.evolution.tree.Tree$Utils.writeTreeTraits(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evomodel.tree.TreeLogger.log(Unknown Source)
at dr.inference.mcmc.MCMC$1.currentState(Unknown Source)
at dr.inference.markovchain.MarkovChain.fireCurrentModel(Unknown Source)
at dr.inference.markovchain.MarkovChain.runChain(Unknown Source)
at dr.inference.mcmc.MCMC.chain(Unknown Source)
at dr.inference.mcmc.MCMC.run(Unknown Source)
at java.lang.Thread.run(Thread.java:636)


0.166 seconds

Andrew Rambaut

Apr 17, 2012, 5:22:48 AM
to beast...@googlegroups.com
There are currently some issues with the XML generated by BEAUti 1.7.1 for state change counting. We are working to resolve these and will be releasing a new version shortly.

Andrew

Antonello Di Nardo

Apr 20, 2012, 7:28:55 AM
to beast-users
Dear Andrew,

I am facing the same problem as Robin when performing ancestral state
reconstruction in BEAST 1.7.1 using BEAGLE.
It seems that there are some issues in the XML generated by BEAUti, as
follows:

1. <ancestralTreeLikelihood> should include the SubstitutionModel
element (as in the BEAST 1.7.x XML reference), but BEAUti does not
generate it. This gives the following error from BEAST:

The '<ancestralTreeLikelihood>' element with id, 'treeLikelihood', is incorrectly constructed.
The following was expected:
Exactly one ELEMENT of type SubstitutionModel REQUIRED

After including the SubstitutionModel (for example <TN93Model idref="tn93"/>),
the program runs correctly. The same problem seems to affect
<markovJumpsTreeLikelihood> as well.

2. The <taxa idref="x"/> for the <mrca> in the Ancestral state
reconstruction log wrongly passes the taxa idref as "MRCA(x)" (where x
is the taxa name), which has not been previously declared. This gives
the following error from BEAST:

Object with idref=MRCA(x) has not been previously declared.

The program runs correctly after renaming the <taxa idref="x"> with the
correct name of the taxa.

3. It seems that the error reported by Robin

Exception in thread "Thread-6" java.lang.NullPointerException
at dr.evomodelxml.tree.TreeLoggerParser$2.getIntent(Unknown Source)
at dr.evolution.tree.Tree$Utils.writeTreeTraits(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evolution.tree.Tree$Utils.newick(Unknown Source)
at dr.evomodel.tree.TreeLogger.log(Unknown Source)
at dr.inference.mcmc.MCMC$1.currentState(Unknown Source)
at dr.inference.markovchain.MarkovChain.fireCurrentModel(Unknown Source)
at dr.inference.markovchain.MarkovChain.runChain(Unknown Source)
at dr.inference.mcmc.MCMC.chain(Unknown Source)
at dr.inference.mcmc.MCMC.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

happens only when using the GPU instance via BEAGLE and not with the
CPU (at least in my experience, with an Nvidia Quadro FX 1700).
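
For anyone who prefers not to hand-edit the file, the workaround in point 1 can be scripted. What follows is only a minimal Python sketch, not a tested fix for BEAUti output. It assumes that BEAST accepts the model reference as the last child of the likelihood element, that the model is declared earlier in the file, and that losing XML comments on rewrite is acceptable (the standard-library parser drops them). The function name add_model_reference and the trimmed example document are hypothetical; only the element names and the ids 'treeLikelihood' and 'tn93' come from the messages above.

```python
import xml.etree.ElementTree as ET


def add_model_reference(xml_text, likelihood_tag, model_tag, model_id):
    """Return the XML with <model_tag idref="model_id"/> added where it is missing."""
    root = ET.fromstring(xml_text)
    # Take a snapshot of the matches so the tree is not modified while iterating.
    for likelihood in list(root.iter(likelihood_tag)):
        # Elements carrying idref (e.g. inside <log>) are references, not definitions.
        if "idref" in likelihood.attrib:
            continue
        # Only add the reference if no element of that type is present yet.
        if likelihood.find(model_tag) is None:
            ET.SubElement(likelihood, model_tag, {"idref": model_id})
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed input: a real BEAUti file has many more children.
example = """<beast>
  <TN93Model id="tn93"/>
  <ancestralTreeLikelihood id="treeLikelihood">
    <patterns idref="patterns"/>
  </ancestralTreeLikelihood>
  <log id="fileLog">
    <ancestralTreeLikelihood idref="treeLikelihood"/>
  </log>
</beast>"""

fixed = add_model_reference(example, "ancestralTreeLikelihood", "TN93Model", "tn93")
print(fixed)
```

The same call with "markovJumpsTreeLikelihood" as the first tag argument would cover the second element mentioned in point 1, under the same assumptions. Running the function twice does not add a second reference.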

Hope this helps.


Antonello
Institute for Animal Health

RobinvanVelzen

Apr 20, 2012, 10:28:38 AM
to beast-users
@Antonello: thanks for the additional comments. I am not receiving any
errors regarding undeclared elements, so this is probably specific to
your files. I tried running the XML using BEAGLE on CPU but get the
same 'dr.evolution' errors I posted in my previous message.

@Andrew: thanks for working on the next version. I hope the code
improvements will resolve the issues with state change counting.

Robin


Jackie Brown

May 13, 2012, 12:39:54 PM
to beast...@googlegroups.com
Hi all, 

I am also trying to use the ancestral reconstruction, and have been getting similar errors of "Object with idref=X has not been previously declared", though in this case it is for a rates parameter for the morphological data. (I'll note this occurs whether or not I check the box for ancestral reconstructions.)

Using BEAGLE TreeLikelihood
  Branch rate model used: strictClockBranchRates
Parsing error - poorly formed BEAST file, Megonlyclockdimorphism.xml:
Object with idref=Dimorphism.rates has not been previously declared.

I've seen the note above that these are problems being addressed in the next version. 
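
Until then, errors of this kind can be located before starting a run. The following is a minimal Python sketch under one assumption drawn from the wording of the error message: BEAST resolves each idref against ids declared earlier in the file. The function name undeclared_idrefs and the example fragment are hypothetical; only the idref 'Dimorphism.rates' comes from the error above.

```python
import xml.etree.ElementTree as ET


def undeclared_idrefs(xml_text):
    """List idref values that have no earlier element with a matching id."""
    declared = set()
    missing = []
    # iter() yields elements in document order, so "earlier" means earlier in the file.
    for element in ET.fromstring(xml_text).iter():
        if "id" in element.attrib:
            declared.add(element.attrib["id"])
        ref = element.attrib.get("idref")
        if ref is not None and ref not in declared and ref not in missing:
            missing.append(ref)
    return missing


# Hypothetical fragment: the declared parameter and the log id are invented.
example = """<beast>
  <parameter id="Dimorphism.frequencies" value="0.5 0.5"/>
  <log id="fileLog">
    <parameter idref="Dimorphism.frequencies"/>
    <parameter idref="Dimorphism.rates"/>
  </log>
</beast>"""

print(undeclared_idrefs(example))
```

Each value the function returns is a reference that either needs a matching declaration added above it or needs renaming to an id that already exists.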

However, I'd also appreciate advice on setting up the models themselves for morphological characters. I have both binary and continuous characters and a data set with two genes. I have successfully used BEAST to infer trees with the molecular data alone (though using non-strict clocks gives me problems with underflow no matter how I tweaked the BEAGLE rescaling), but would like to perform analyses very similar to the Finch example in Fig 2 of


Any advice on best practices (or a pointer to a fuller description of what was done on the Finch example file) would be much appreciated. 

Many thanks,

Jackie

Jackie Brown
Grinnell College
Grinnell, IA, USA

Andrew Rambaut

May 15, 2012, 7:07:14 PM
to beast...@googlegroups.com
Hi Jackie,

This issue is indeed fixed in the very soon to be released v1.7.2. There are also a couple of posts describing how to fix your XML file to get this to work now:


Apologies for this bug.

Andrew


___________________________________________________________________
 Andrew Rambaut                
 Institute of Evolutionary Biology       University of Edinburgh
 Ashworth Laboratories                         Edinburgh EH9 3JT
 EMAIL - a.ra...@ed.ac.uk                TEL - +44 131 6508624  
