Designating Starting Trees in BEAST v2.1.3, cannot find gene taxa

wolfra...@gmail.com

Jan 7, 2015, 1:59:22 PM

to beast...@googlegroups.com

Hi All,

I am trying to run a *BEAST analysis in BEAST v2.1.3 by designating starting trees for the species and gene trees. I am fairly new to BEAST and coding, so help would be appreciated!

I basically followed the instructions here: http://blog.beast2.org/2014/07/28/all-about-starting-trees/

Background info:

-I am using trees generated by a previous *BEAST analysis, in newick format

-I merged some of my taxa from my gene trees into a single species for my species tree. This was done in BEAUTi, under the "Taxon Sets" tab. When I look at trees from my individual loci, they have all the original taxa, while those from my species tree have only the merged ones.

-I did some indel coding, and these partitions were set to be part of the appropriate nucleotide gene trees/clocks.

When I made a new <init> element for each gene tree, I used taxa="@taxonsuperset". However, since the taxa are named differently in the species tree and the gene trees (and the gene trees have extra tips), BEAST obviously doesn't like this. In addition, when looking at the old <init> elements for the gene trees, I could not find any gene-specific taxon sets. So my question is: do I need to make a new taxonset that refers to the names used in the gene trees? Or is there some other way to do this? Or should I edit the newick trees to delete the extra tips and rename the taxa?

Another way of putting this is: How does BEAST generate gene trees with all the taxa, not just the merged species taxa, if there is not a taxon block for all the taxa?

Ideally, I would be able to just use the original gene trees with all the taxa (not just the merged taxa from the species tree). I can post text blocks later, if it helps. Thanks in advance!

Wolfgang Rahfeldt

University of Washington

Remco Bouckaert

Jan 7, 2015, 2:34:37 PM

to beast...@googlegroups.com

Hi Wolfgang,

For the gene trees, the taxon set consists of the taxa in the gene tree, not the species tree. Instead of using taxonset='@taxonsuperset', you can create a new taxon set for each gene. But there is a more convenient shortcut: since all gene trees come with alignments, and these alignments (typically) contain all taxa, you can use the "taxa" attribute instead of the "taxonset" attribute and refer to the alignment. So, for a gene tree associated with alignment XYZ, you add taxa='@XYZ' instead of taxonset='@taxonsuperset' to the newick tree.

For those alignments that are partitioned, it is fine to refer to the original alignment.
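As a rough illustration of the shortcut described above, here is a small Python sketch that builds one such <init> element per gene tree. The attribute names (spec, initial, taxa, IsLabelledNewick, newick) follow the pattern in the starting-trees blog post linked earlier in the thread as best I recall it; the ids "NewickTree.t:ETS" and "Tree.t:ETS" and the toy newick string are assumptions and should be matched to whatever BEAUti wrote into the actual XML.

```python
import xml.etree.ElementTree as ET


def make_gene_tree_init(partition, newick):
    """Build an <init> element for one gene tree.

    The element points at the alignment via taxa='@<partition>' rather than
    at the species-level taxon set via taxonset='@taxonsuperset'.
    The id values are hypothetical and must match the ids in your own XML.
    """
    return ET.Element("init", {
        "spec": "beast.util.TreeParser",
        "id": "NewickTree.t:" + partition,    # assumed id pattern
        "initial": "@Tree.t:" + partition,    # assumed id of the gene tree state node
        "taxa": "@" + partition,              # alignment id, as suggested in the reply
        "IsLabelledNewick": "true",
        "newick": newick,
    })


def to_xml(element):
    """Serialise the element to a string that can be pasted into the BEAST XML."""
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    # Toy starting tree; replace with the real gene tree for this partition.
    print(to_xml(make_gene_tree_init("ETS", "((a:1,b:1):1,c:2);")))
```

The same call would be repeated for each partition name (ITS, PPR127, and so on), each time with that gene's own newick tree.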

Hope this helps,

Remco


wolfra...@gmail.com

Jan 7, 2015, 10:48:18 PM

to beast...@googlegroups.com

Great! Thanks for the help, and I think that worked. Unfortunately, I seem to be running into a different error:

Start likelihood: -Infinity after 11 initialisation attempts

P(posterior) = -Infinity (was NaN)

P(speciescoalescent) = -Infinity (was NaN)

P(SpeciesTreePopSize.Species) = -199.2538977075783 (was NaN)

P(treePrior.t:ETS) = -Infinity (was NaN)

P(treePrior.t:ITS) = NaN (was NaN)

P(treePrior.t:PPR127) = NaN (was NaN)

P(treePrior.t:psbJ) = NaN (was NaN)

P(treePrior.t:rpl16) = NaN (was NaN)

P(treePrior.t:trnQ) = NaN (was NaN)

P(prior) = NaN (was NaN)

P(ExtendedBayesianSkyline.t:Species) = -0.16662957391639566 (was NaN)

P(GammaShapePrior.s:ETS) = -1.0 (was NaN)

P(GammaShapePrior.s:ITS) = -1.0 (was NaN)

P(GammaShapePrior.s:PPR127) = -1.0 (was NaN)

P(GammaShapePrior.s:psbJ) = -1.0 (was NaN)

P(GammaShapePrior.s:rpl16) = -1.0 (was NaN)

P(GammaShapePrior.s:trnQ) = -1.0 (was NaN)

P(indicatorsPrior.alltrees) = -0.69314718056 (was NaN)

P(HyperPrior.hyperExponential-mean-MeanRatePrior.c:ITS) = -2.302585092994046 (was NaN)

P(HyperPrior.hyperExponential-mean-MeanRatePrior.c:PPR127) = -2.302585092994046 (was NaN)

P(HyperPrior.hyperExponential-mean-MeanRatePrior.c:psbJ) = -2.302585092994046 (was NaN)

P(HyperPrior.hyperExponential-mean-MeanRatePrior.c:rpl16) = -2.302585092994046 (was NaN)

P(HyperPrior.hyperExponential-mean-MeanRatePrior.c:trnQ) = -2.302585092994046 (was NaN)

P(popMean.prior) = 0.0 (was NaN)

P(popSizePrior.alltrees) = -42.0 (was NaN)

P(populationMeanPrior.alltrees) = 0.0 (was NaN)

P(RateACPrior.s:ETS) = -3.184008455701433 (was NaN)

P(RateACPrior.s:ITS) = -3.184008455701433 (was NaN)

P(RateACPrior.s:PPR127) = -3.184008455701433 (was NaN)

P(RateACPrior.s:psbJ) = -3.184008455701433 (was NaN)

P(RateACPrior.s:rpl16) = -3.184008455701433 (was NaN)

P(RateACPrior.s:trnQ) = -3.184008455701433 (was NaN)

P(RateAGPrior.s:ETS) = -3.1686658147294304 (was NaN)

P(RateAGPrior.s:ITS) = -3.1686658147294304 (was NaN)

P(RateAGPrior.s:PPR127) = -3.1686658147294304 (was NaN)

P(RateAGPrior.s:psbJ) = -3.1686658147294304 (was NaN)

P(RateAGPrior.s:rpl16) = -3.1686658147294304 (was NaN)

P(RateAGPrior.s:trnQ) = -3.1686658147294304 (was NaN)

P(RateATPrior.s:ETS) = -3.184008455701433 (was NaN)

P(RateATPrior.s:ITS) = -3.184008455701433 (was NaN)

P(RateATPrior.s:PPR127) = -3.184008455701433 (was NaN)

P(RateATPrior.s:psbJ) = -3.184008455701433 (was NaN)

P(RateATPrior.s:rpl16) = -3.184008455701433 (was NaN)

P(RateATPrior.s:trnQ) = -3.184008455701433 (was NaN)

P(RateCGPrior.s:ETS) = -3.184008455701433 (was NaN)

P(RateCGPrior.s:ITS) = -3.184008455701433 (was NaN)

P(RateCGPrior.s:PPR127) = -3.184008455701433 (was NaN)

P(RateCGPrior.s:psbJ) = -3.184008455701433 (was NaN)

P(RateCGPrior.s:rpl16) = -3.184008455701433 (was NaN)

P(RateCGPrior.s:trnQ) = -3.184008455701433 (was NaN)

P(RateGTPrior.s:ETS) = -3.184008455701433 (was NaN)

P(RateGTPrior.s:ITS) = -3.184008455701433 (was NaN)

P(RateGTPrior.s:PPR127) = -3.184008455701433 (was NaN)

P(RateGTPrior.s:psbJ) = -3.184008455701433 (was NaN)

P(RateGTPrior.s:rpl16) = -3.184008455701433 (was NaN)

P(RateGTPrior.s:trnQ) = -3.184008455701433 (was NaN)

P(MeanRatePrior.c:ITS) = -2.402585092994046 (was NaN)

P(MeanRatePrior.c:PPR127) = -2.402585092994046 (was NaN)

P(MeanRatePrior.c:psbJ) = -2.402585092994046 (was NaN)

P(MeanRatePrior.c:rpl16) = -2.402585092994046 (was NaN)

P(MeanRatePrior.c:trnQ) = -2.402585092994046 (was NaN)

P(ucldStdevPrior.c:ETS) = -0.40143772133305716 (was NaN)

P(ucldStdevPrior.c:ITS) = -0.40143772133305716 (was NaN)

P(ucldStdevPrior.c:PPR127) = -0.40143772133305716 (was NaN)

P(ucldStdevPrior.c:psbJ) = -0.40143772133305716 (was NaN)

P(ucldStdevPrior.c:rpl16) = -0.40143772133305716 (was NaN)

P(ucldStdevPrior.c:trnQ) = -0.40143772133305716 (was NaN)

P(likelihood) = NaN (was NaN)

P(treeLikelihood.ETS) = NaN (was NaN)

P(treeLikelihood.ITS) = NaN (was NaN)

P(treeLikelihood.ITSgaps) = NaN (was NaN)

P(treeLikelihood.PPR127) = NaN (was NaN)

P(treeLikelihood.psbJ) = NaN (was NaN)

P(treeLikelihood.psbJgaps) = NaN (was NaN)

P(treeLikelihood.rpL16gaps) = NaN (was NaN)

P(treeLikelihood.rpl16) = NaN (was NaN)

P(treeLikelihood.trnQ) = NaN (was NaN)

P(treeLikelihood.trnQgaps) = NaN (was NaN)

java.lang.Exception: Could not find a proper state to initialise. Perhaps try another seed.

at beast.core.MCMC.run(Unknown Source)

at beast.app.BeastMCMC.run(Unknown Source)

at beast.app.beastapp.BeastMain.<init>(Unknown Source)

at beast.app.beastapp.BeastMain.main(Unknown Source)

Any thoughts?

Thanks!

Remco Bouckaert

Jan 7, 2015, 11:08:55 PM

to beast...@googlegroups.com

It looks like the prior treePrior.t:ETS is not compatible with the starting tree. Check whether the taxa in ETS fail to be monophyletic in the starting tree, or whether the clade height of the starting tree is outside the range of the prior.
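One way to do the height part of this check by hand is to compute the root height of the starting tree and compare it with the bounds of the prior. The sketch below is only an illustration under assumptions: it expects a plain Newick string with branch lengths and no comments or quoted labels, and it reports the largest root-to-tip distance, which equals the root height for an ultrametric tree. The function name root_height is made up for this example.

```python
import re

# Tokens: structural characters, ":<branch length>", or a tip/node label.
TOKEN = re.compile(r"[(),;]|:[^(),;:]+|[^(),;:]+")


def root_height(newick):
    """Return the largest root-to-tip distance in a Newick tree.

    Heights are accumulated bottom-up: a tip has height 0, and an internal
    node has the height of its tallest child plus that child's branch length.
    Any branch length attached to the root itself is ignored.
    """
    stack = []      # one list per open "(", holding child height + branch length
    current = 0.0   # height of the node currently being read
    length = 0.0    # branch length above the node currently being read
    for tok in TOKEN.findall(newick.strip()):
        if tok == "(":
            stack.append([])
        elif tok == ";":
            break
        elif tok in ",)":
            stack[-1].append(current + length)   # finish the node just read
            length = 0.0
            # ")" closes the clade: its height is that of the tallest child.
            current = max(stack.pop()) if tok == ")" else 0.0
        elif tok.startswith(":"):
            length = float(tok[1:])
        # Any other token is a label and does not affect heights.
    return current


if __name__ == "__main__":
    print(root_height("((a:1,b:1):1,c:2);"))
```

Running this on each starting gene tree and on the species tree gives numbers that can be compared against the prior's range and against each other.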

Remco

Remco Bouckaert

Jan 11, 2015, 2:31:49 PM

to beast...@googlegroups.com

Hi Wolfgang,

Judging from the XML, it looks like the gene tree does not fit inside the species tree, so this is not the usual issue with starting trees. If you scale the species tree to, say, 1/1000 of the current tree (as below), the run starts, since all gene trees will then be sticking out above the species tree. This is probably not a very good starting state, but it will get the chain started.

Cheers,

Remco

(((((((ophiobia:8.396852945224964E-8,cronquistii:8.396852945224964E-8):2.3307052151722019E-7,((deflexa_WA:3.4750606573652476E-8,(patens_Oregon:1.318057911703363E-8,jessicae_Nevada:1.318057911703363E-8):2.1570027456618845E-8):8.014415288926102E-8,(((brevicola:2.3695473970519743E-8,jessicae_Oregon:2.3695474197893418E-8):3.071940645327231E-8,floribunda_Cali:5.4414880423792056E-8):1.8277254412168986E-8,micrantha:7.269213483596104E-8):4.2202624626952456E-8):2.0214429241605103E-7):1.1026715583284386E-7,virginiana:4.273062068023137E-7):2.769815364445094E-7,(((revoluta:9.155157954410242E-8,mexicana:9.155158022622345E-8):3.611682068367372E-7,((nervosa:3.256577292631846E-8,veluntina:3.256577292631846E-8):3.3974788630075636E-7,(mundula:2.044106186076533E-7,((bella:5.71590026083868E-8,amethystina:5.715900215363945E-8):7.680939233978279E-8,sharsmithii:1.339683940386749E-7):7.044222365948372E-8):1.6790304061942152E-7):8.040612965487526E-8):1.540948478577775E-7,(californica:1.318367685598787E-7,floribunda_CNM:1.318367685598787E-7):4.749778681798489E-7):9.747310741659021E-8):1.66156685736496E-7,((((ursina:6.073539134376915E-8,gracilenta:6.073539134376915E-8):1.2082692239800963E-7,(hirsuta:2.8731404654536163E-8,besseyi:2.8731404086101975E-8):1.5283091067885834E-7):9.905067315685301E-8,davisii:2.806129868986318E-7):1.1810952128143981E-7,((((cusickii:3.263561848143581E-8,setosa:3.263561757194111E-8):6.0858838878630195E-8,cinerea:9.349445736006601E-8):1.6408862666139612E-7,((ciliata:5.138468304721755E-8,patens_Utah:5.138468304721755E-8):1.3069892655437343E-7,((venusta:3.758539196496713E-8,diffusa:3.758539287446183E-8):5.44840427210147E-8,hispida:9.206943559547653E-8):9.001417491560915E-8):7.549947351037645E-8):7.187284973042551E-8,deflexa:3.2945593375188764E-7):6.926657624717336E-8):4.717219198937528E-7):0.000005118740184116177,(squarrosa:6.215907269506715E-7,(heterosperma:1.2588186928041978E-7,(occidentalis:5.089673595648492E-8,(redowskii:2.1013976038375404E-8,cucullata:2.1013976038375404E-8):2.9882759918109514E-8):7.498513241444016E-8):4.957088576702517E-7):0.000005367593897972256):0.0000034831101656891406,((flavoculata:0.000005923625118157361,lanceolata:0.000005923625132709276):0.0000012814013680326752,longiflorum:0.000007205026486190036):0.0000022672683116979897)
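For anyone wanting to reproduce this kind of rescaling on their own species tree, the following Python sketch multiplies every branch length in a Newick string by a constant factor. It is a minimal illustration, not the method used to produce the tree above: it assumes a plain Newick string in which every ":" is followed by a branch length (no comments, no quoted labels containing colons), and the function names are made up for this example.

```python
import re

# Matches ":<number>", where the number may use scientific notation (e.g. :8.39E-8).
BRANCH_LENGTH = re.compile(r":(\d*\.?\d+(?:[eE][-+]?\d+)?)")


def branch_lengths(newick):
    """Return all branch lengths in the order they appear in the string."""
    return [float(x) for x in BRANCH_LENGTH.findall(newick)]


def scale_newick(newick, factor):
    """Return a copy of the tree with every branch length multiplied by factor.

    Topology and labels are left untouched; only the numbers after ':' change.
    """
    def rescale(match):
        return ":" + repr(float(match.group(1)) * factor)

    return BRANCH_LENGTH.sub(rescale, newick)


if __name__ == "__main__":
    # Toy tree; 0.001 corresponds to the "1/1000" suggestion above.
    print(scale_newick("((a:1.0,b:1.0):2.5E-1,c_x:1.25);", 0.001))
```

The rescaled string would then replace the newick text of the species tree's starting tree in the XML.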

On 8/01/2015, at 6:02 pm, wolfra...@gmail.com wrote:

When I set one of the starting gene trees to random, it produces the same error for the other gene trees under the treePrior.t:XYZ. How do I go about changing the range of the prior for clade height? Sorry if this is straightforward! Also, attached is my .xml file, if it helps.

<Hackelia_BEAST_concattreesclocks_1bilgen_20000sample_inputtrees.xml>
