compare differing results with multiple parsimony seeds

Evan Biederstedt

Sep 20, 2016, 2:39:33 PM
to ra...@googlegroups.com
Sorry if this is documented somewhere, but I was hoping for a clarification:

We are working on applying RAxML with alignments of binary characters, similar to this example: http://sco.h-its.org/exelixis/web/software/raxml/hands_on.html

"raxmlHPC -m BINGAMMA -p 12345 -s binary.phy -n T1"

My understanding is that for a given dataset, different parsimony seeds.

Given these results, what is the recommendation for RAxML users: 

Users could then compare likelihood values and choose the best tree. The problem appears to be that users would have to run the above command with every possible random seed (from 1 to 99999) and compare. Even then, one could argue this is somewhat ad hoc.

Is there a fundamental problem with the dataset and one should give up? How should one use RAxML when different parsimony seeds give inconsistent results? 






Evan Biederstedt

Sep 20, 2016, 5:03:20 PM
to raxml
Correction:

My understanding is that for a given dataset, different parsimony seeds may end in different ML trees (as different random seeds will generate different starting trees). 


See this thread: 

https://groups.google.com/forum/#!topic/raxml/v5k3usO_p38

Alexey Kozlov

Sep 21, 2016, 8:14:16 AM
to ra...@googlegroups.com
Dear Evan,

Yes, your understanding is correct. The RAxML search algorithm is based on a greedy heuristic, and thus may end up in a
local optimum.

The usual way to deal with this problem is to perform multiple searches with different starting trees (i.e., different
random seeds) and then pick the tree with the highest likelihood. Of course, it's not feasible to try "all" seeds; in
practice, 20-50 starting trees are usually enough to find the global optimum (or at least to reach the point where the
likelihood cannot be further improved by adding more starting trees).

With RAxML, you can use the -# (or -N) command-line switch to perform multiple tree searches in one run, and
automatically select the best-scoring ML tree.

Hope this helps,
Alexey
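
As a hedged illustration of the reply above, the multi-search run could look like the sketch below. It reuses the model, seed, and alignment from the original command; the count of 20 searches and the run name T2 are assumptions chosen for the example, and the helper function name is hypothetical. The script only prints the command so it can be inspected before it is run.

```shell
#!/bin/sh
# Sketch: print the RAxML command for several independent ML searches from one seed.
# The run name "T2" and the count of 20 searches are illustrative assumptions.
raxml_multi_search_cmd() {
    seed="$1"        # -p: random seed for the parsimony starting trees
    n_searches="$2"  # -#: number of tree searches in one run (same as -N)
    alignment="$3"   # -s: alignment file
    run_name="$4"    # -n: name appended to the output files
    printf 'raxmlHPC -m BINGAMMA -p %s -# %s -s %s -n %s\n' \
        "$seed" "$n_searches" "$alignment" "$run_name"
}

# Print the command; paste it into a shell where raxmlHPC is installed to run it.
raxml_multi_search_cmd 12345 20 binary.phy T2
```

With a single -p seed, this performs all 20 searches in one run. Per the reply above, RAxML then selects the best-scoring ML tree automatically.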

EB

Sep 21, 2016, 10:02:51 AM
to raxml
Hi Alexey, 

Thanks for the help!

So after running 20-50 starting trees, users would recognize the global optimum because the majority of trees share the highest likelihood.

If one cannot see this, would the procedure be to try more starting trees? Given our datasets at least, I would expect there to be a global max/min.

Thank you for the help, Evan

Alexey Kozlov

Sep 21, 2016, 10:12:37 AM
to ra...@googlegroups.com
Hi Evan,

> So after running 20-50 starting trees, users would recognize the global optimum because the majority of trees share
> the highest likelihood.

Yes.

> If one cannot see this, would the procedure be to try more starting trees? Given our datasets at least, I would
> expect there to be a global max/min.

Exactly.

Alexey
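
The check discussed above (does a majority of searches reach the same best likelihood?) can be sketched as follows. This is a minimal sketch, assuming the final log-likelihood of each search has already been collected, one value per line; how to extract those values from RAxML's output files is not covered here. The function name, the 0.01 tolerance, and the sample numbers are all hypothetical.

```shell
#!/bin/sh
# Sketch: read one final log-likelihood per line and report the best value
# and how many searches reached it, within a small tolerance.
summarize_likelihoods() {
    tolerance="${1:-0.01}"   # hypothetical threshold, in log-likelihood units
    awk -v tol="$tolerance" '
        NF == 0 { next }
        { lnl[++n] = $1 + 0; if (n == 1 || lnl[n] > best) best = lnl[n] }
        END {
            hits = 0
            for (i = 1; i <= n; i++) if (best - lnl[i] <= tol) hits++
            printf "best=%.4f reached_by=%d of=%d\n", best, hits, n
        }'
}

# Made-up values for illustration only, not results from any real dataset.
printf '%s\n' -5238.3774 -5238.3774 -5241.9021 -5238.3775 -5250.1130 |
    summarize_likelihoods 0.01
```

If only one or two searches reach the best value, that suggests running more starting trees, as agreed above.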




EB

Sep 21, 2016, 11:13:18 PM
to raxml
Hi Alexey, 

Thanks, this has been helpful. 

One last question: 


"With RAxML, you can use -# (or -N) command line switch to perform multiple tree searches in one run, and automatically 
select the best-scoring ML tree. "

If one uses the -N flag along with the -p flag, I imagine the first initial tree will be determined by the -p flag, e.g. -p 12345. Will the subsequent trees (i.e. the number given by -N) use a different initial tree than the one from -p 12345? Just to be clear.

Alexandros Stamatakis

Sep 22, 2016, 4:01:56 AM
to ra...@googlegroups.com
Hi Evan,

if -N is used with -p, -p specifies the random number seed, and thereby
the sequence of pseudo random numbers that will be used for all N tree
inferences.

Alexis
--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University
of Arizona at Tucson

www.exelixis-lab.org

EB

Sep 22, 2016, 9:45:42 AM
to raxml
Hi Alexis, 

Thanks for the response. I'm still slightly confused:

My understanding from the above is that if the -p flag is used with the -# flag, all # tree inferences will use that same random seed.

I would like each of the # trees to use a different initial tree, and therefore I should not use both flags.

However, if I do not use the -p flag, I get the following error, and RAxML doesn't run:

Error: you need to specify a random number seed with "-p" for the randomized stepwise addition parsimony algorithm or random tree building algorithm such that runs can be reproduced and debugged ... exiting

How does one execute # tree inferences, each with a different random seed? 

Thanks, Evan

Alexandros Stamatakis

Sep 23, 2016, 3:05:38 AM
to ra...@googlegroups.com
Using -p in combination with -# does exactly what you want: the seed is
not reset to the initial value for each individual tree search.

alexis
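
The seeding behavior described above can be illustrated by analogy. The sketch below uses awk's random number generator as a stand-in (it is not RAxML's generator): the seed is set once, and each search then draws the next number from the same stream, so the draws differ between searches while the whole sequence stays reproducible from the seed. The function name is hypothetical.

```shell
#!/bin/sh
# Analogy only: awk's generator stands in for the one used inside RAxML.
# Seed once, then take one draw per "search" from the same stream.
draw_per_search() {
    seed="$1"
    n_searches="$2"
    awk -v seed="$seed" -v n="$n_searches" 'BEGIN {
        srand(seed)                      # the seed is set once, as with -p
        for (i = 0; i < n; i++)          # one draw per search, as with -# n
            printf "search %d: %.6f\n", i, rand()
    }'
}

draw_per_search 12345 3
```

Running the script twice with the same awk gives the same three lines, but the three values within a run differ from each other, which mirrors why one -p seed still yields a different starting tree for each of the -# searches.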

EB

Sep 23, 2016, 5:15:42 PM
to raxml
Thanks for the help. I think I misunderstood the previous comment.

I believe all issues have been resolved in this thread. Thank you.