Bootstrapping artifact? High support in bootstrap trees for a clade that is absent in the best tree

41....@gmail.com

Sep 30, 2015, 10:03:54 AM
to raxml
I have an interesting problem that may be due to an error in RAxML, or perhaps to a peculiarity of the datasets I am putting together.

For three quite different large matrices (150,000-400,000) with a moderate number of taxa and moderate missing data, I have found that using GTRGAMMA I get a tree mostly with high support, but with one or two unconventional relationships (as compared to previous studies) with exactly 0 support. All other relationships are in line with my experience with these taxa and smaller datasets. This unusual topology often involves a small clade moving to the base of the tree -- and when this clade would ordinarily be derived this operation traverses several nodes, so of course it can drive several nodes down to zero instead of just one. When I perform a majority rule consensus of the bootstraps with RAxML, I get the conventional topology for these relationships, and the support is actually very good (100 or nearly so). That means there is strong support in the bootstrap trees for a group that is absent from the best tree. 
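
To make the pattern described above concrete, here is a minimal toy sketch in Python. It is not RAxML output and not how RAxML computes support internally. The taxa (Out, A to F), the Newick strings, and the replicate counts are all invented for illustration, and clades are read from rooted trees for simplicity. It shows how moving a small clade (E,F) to the base of the best tree can leave several best-tree nodes with zero support while the bootstrap replicates strongly support clades that the best tree lacks.

```python
from collections import Counter

def clades(newick):
    """Non-trivial clades (frozensets of tip names) of a rooted Newick
    string that has no branch lengths or internal labels."""
    found, stack, name = set(), [], ""
    for ch in newick:
        if ch == "(":
            stack.append(set())          # open a new internal node
        elif ch in ",);":
            if name:                     # finish the tip name being read
                stack[-1].add(name)
                name = ""
            if ch == ")":                # close the node, pass tips to parent
                done = stack.pop()
                found.add(frozenset(done))
                if stack:
                    stack[-1] |= done
        elif not ch.isspace():
            name += ch
    return found - {max(found, key=len)}  # drop the root (all tips)

# Hypothetical trees: the "conventional" topology, a minor variant,
# and a best tree in which the small clade (E,F) has moved to the base.
conventional = "(Out,(A,(B,(C,(D,(E,F))))));"
alternative = "(Out,(A,(B,((C,D),(E,F)))));"
best_tree = "(Out,((E,F),(A,(B,(C,D)))));"
bootstraps = [conventional] * 9 + [alternative]

counts = Counter(c for tree in bootstraps for c in clades(tree))
n = len(bootstraps)

# Support drawn onto the best tree: percent of replicates with each clade.
best_support = {c: 100.0 * counts[c] / n for c in clades(best_tree)}

# Clades found in replicates but absent from the best tree.
absent = {c: 100.0 * k / n for c, k in counts.items()
          if c not in clades(best_tree)}

print("Support on best tree:")
for clade, pct in sorted(best_support.items(), key=lambda kv: sorted(kv[0])):
    print(f"  {pct:5.1f}  {','.join(sorted(clade))}")
print("Bootstrap clades missing from best tree:")
for clade, pct in sorted(absent.items(), key=lambda kv: sorted(kv[0])):
    print(f"  {pct:5.1f}  {','.join(sorted(clade))}")
```

In this toy case a single rearrangement leaves two best-tree nodes at 0 and a third at 10, while three well-supported bootstrap clades never appear on the best tree. That is the same qualitative picture as a best tree with zero-support nodes next to a well-supported majority rule consensus.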

When I use GTRGAMMA+I, this problem goes away entirely for all three large datasets, and I get the topology that agrees with the majority rule of the GTRGAMMA analysis for these nodes.

Of course the majority rule and best trees could differ, but seeing such extremely different branch support statistics suggests an artifact to me. I have seen a similar but not identical problem on here (https://groups.google.com/forum/#!searchin/raxml/0$20bootstrap$20majority$20rule/raxml/r7TKJDuvrzM/wAjkUK3_pQAJ, "Possibly odd behaviour in bootstrapping?"), which in that case was attributed to a bug in RAxML. 

Any suggestions for fixing this issue? I am attempting a run with a different build of RAxML (currently using pthreads-SSE3; I may try a different version as well).

I attached an example of results from one of the matrices.

Ryan
GTRGAMMAandI.pdf
GTRGAMMA.pdf
GTRGAMMAmajorityrule.pdf

41....@gmail.com

Sep 30, 2015, 11:31:35 AM
to raxml
Just as an update, I reran this with the same version (currently the newest) but without SSE3 and got the same results. I also reran it with version 7.7.6, again with the same results.

Alexandros Stamatakis

Sep 30, 2015, 3:43:27 PM
to ra...@googlegroups.com
hi ryan,

are you using rapid bootstraps or normal, slow bootstraps?

also, how many ML searches on the original MSA have you done?

it might be helpful if you could paste the command lines you used.

Also note that something may be wrong with the ML tree, since the BS
consensus gives you reasonable results.

alexis

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University
of Arizona at Tucson

www.exelixis-lab.org

41....@gmail.com

Sep 30, 2015, 5:00:36 PM
to raxml
Sure thing! 

I did rapid bootstrapping with -f a, like so:

raxmlHPC-PTHREADS-SSE3 -T 12 -f a -s ChloroplastMAFFTalignmentwith1kp.phy.reduced -x 12345 -p 12345 -# 100 -m GTRGAMMA -n chloroplast1kp -o Peltoboykinia_watanabei_cp

The invariant-including analyses used -m GTRGAMMAI but otherwise were identical.

If I remember correctly, every fifth bootstrap is used as a starting tree for the best tree search... I have tried various bootstrap numbers from 50 to 500.

This dataset is not hard to run, so should I attempt a different search algorithm?

Ryan

Alexandros Stamatakis

Oct 1, 2015, 4:34:04 AM
to ra...@googlegroups.com
Hi Ryan,

Okay, so generally the ML search algorithm implemented in RAxML is more
thorough than the BS search algorithm. For that reason, one might suspect
that the ML search finds a better tree. It is therefore hard to judge
whether it is the ML tree or the BS trees that are suboptimal.

I would thus run a couple of standard bootstraps with the bootstopping
option enabled (see manual) and then re-compute a majority rule consensus.

Cheers,

Alexis
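
A rough sketch of what the suggestion above might look like in practice, written as a small Python helper that only assembles and prints the command lines. The binary, alignment name, thread count, model, and seed are copied from the command posted earlier in the thread. The flags `-b` (standard bootstrap seed), `-# autoMRE` (bootstopping), `-J MR` (majority rule consensus), and `-z` (file of trees) are used as I understand them from the RAxML manual. The run names are hypothetical, and the `RAxML_bootstrap.<run name>` file name is an assumption about RAxML's output naming, so please check all of this against the manual for your version before running anything.

```python
import shlex

# Values taken from the command line posted earlier in the thread.
RAXML = "raxmlHPC-PTHREADS-SSE3"
ALIGNMENT = "ChloroplastMAFFTalignmentwith1kp.phy.reduced"
COMMON = ["-T", "12", "-m", "GTRGAMMA"]

def standard_bootstrap(run_name, seed=12345):
    """Standard (slow) bootstraps: -b instead of -x, with autoMRE bootstopping."""
    return [RAXML, *COMMON, "-s", ALIGNMENT,
            "-b", str(seed), "-p", str(seed),
            "-#", "autoMRE", "-n", run_name]

def majority_rule(run_name, tree_file):
    """Majority rule consensus of the trees stored in tree_file."""
    return [RAXML, *COMMON, "-J", "MR", "-z", tree_file, "-n", run_name]

# Hypothetical run names; the bootstrap file name is an assumed convention.
boot = standard_bootstrap("chloroplast1kp_stdBS")
cons = majority_rule("chloroplast1kp_stdBS_MR",
                     "RAxML_bootstrap.chloroplast1kp_stdBS")

for cmd in (boot, cons):
    print(shlex.join(cmd))   # shell-quoted command line, ready to paste
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere. Once the printed lines look right they can be pasted into a shell or passed to `subprocess.run`.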


Grimm

Oct 1, 2015, 12:42:49 PM
to raxml
Hi,
a simple way to check where the support has gone is to read the bootstrap samples into SplitsTree and compute the consensus network. Then you can see directly how the BS replicates diverge from your ML tree.
It may also be a simple signal issue.

Bw Guido
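
As a text-only stand-in for that check, the sketch below tallies, for each split of an ML tree, how often it occurs among the bootstrap replicates and which replicate splits conflict with it. This is not SplitsTree's algorithm, only the same bookkeeping that a consensus network displays graphically. The taxa and trees are invented toy data, and the parser assumes Newick strings without branch lengths or support labels.

```python
from collections import Counter

def parse(newick):
    """Return (tips, splits) for a Newick string without branch lengths.
    Each non-trivial split is stored as the side that lacks the
    alphabetically first tip, so the same split always looks the same."""
    groups, stack, name = [], [], ""
    for ch in newick:
        if ch == "(":
            stack.append(set())
        elif ch in ",);":
            if name:
                stack[-1].add(name)
                name = ""
            if ch == ")":
                done = stack.pop()
                groups.append(frozenset(done))
                if stack:
                    stack[-1] |= done
        elif not ch.isspace():
            name += ch
    tips = max(groups, key=len)
    anchor = min(tips)
    sides = {tips - g if anchor in g else g for g in groups}
    return tips, {s for s in sides if 1 < len(s) < len(tips) - 1}

def compatible(a, b, tips):
    """Two splits fit on one tree iff one of the four side intersections is empty."""
    return not (a & b and a - b and b - a and tips - (a | b))

# Hypothetical data: an ML tree with (E,F) at the base, and ten replicates.
ml_tree = "(Out,((E,F),(A,(B,(C,D)))));"
replicates = (["(Out,(A,(B,(C,(D,(E,F))))));"] * 9
              + ["(Out,(A,(B,((C,D),(E,F)))));"])

tips, ml_splits = parse(ml_tree)
counts = Counter(s for tree in replicates for s in parse(tree)[1])
n = len(replicates)

# For every ML split: its own frequency and the replicate splits it clashes with.
report = {}
for split in ml_splits:
    rivals = {r: 100.0 * k / n for r, k in counts.items()
              if not compatible(split, r, tips)}
    report[split] = (100.0 * counts[split] / n, rivals)

for split, (pct, rivals) in sorted(report.items(), key=lambda kv: kv[1][0]):
    print(f"ML split ({','.join(sorted(split))}): {pct:.0f}% of replicates")
    for rival, rpct in sorted(rivals.items(), key=lambda kv: -kv[1]):
        print(f"    conflicts with ({','.join(sorted(rival))}) at {rpct:.0f}%")
```

An ML split with 0% frequency whose conflicting splits sit near 100% points to the ML tree and the replicates disagreeing strongly, whereas low frequencies on both sides would be more in line with a weak signal.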
