Total number of spr rounds?

Kristjan Oopkaup

Dec 4, 2018, 3:02:58 PM

to raxml

My tree search has been running for several days and has reached SLOW spr round 37 (radius: 5). How can I tell how much longer it will run?

Running parameters:

raxml-ng --msa T2.raxml.rba --threads 8 --seed 5 --model GTR+G

Analysis options:

run mode: ML tree search

start tree(s): random (1)

random seed: 5

tip-inner: OFF

pattern compression: ON

per-rate scalers: OFF

site repeats: ON

fast spr radius: AUTO

spr subtree cutoff: 1.000000

branch lengths: linked (ML estimate, algorithm: NR-FAST)

SIMD kernels: AVX

parallelization: PTHREADS (8 threads), thread pinning: OFF

Partition 0: noname

Model: GTR+FO+G4m

Alignment sites / patterns: 6500 / 4638

Gaps: 77.57 %

Invariant sites: 14.72 %

Thank you,

Kristjan

Alexandros Stamatakis

Dec 5, 2018, 3:49:28 AM

to ra...@googlegroups.com

Unfortunately, you can't: it will run as long as it keeps finding better
trees. That's why the run times of raxml-ng are so hard to predict.

How many taxa does your dataset have?

Alexis
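
Although the total number of rounds cannot be known in advance, the log does show how much each round still gains. Below is a minimal Python sketch that extracts the SPR round lines from a raxml-ng log and reports the log-likelihood change between consecutive rounds. It assumes log lines of the form "[HH:MM:SS logLH] SLOW spr round N (radius: R)"; the helper names (spr_rounds, improvements) are hypothetical and the sample values are invented purely for illustration, not taken from this run.

```python
import re

# Assumed log line shape (not confirmed by the thread), e.g.
#   [12:34:56 -123456.789012] SLOW spr round 37 (radius: 5)
ROUND_RE = re.compile(
    r"^\[(\d+):(\d{2}):(\d{2})\s+(-?\d+\.\d+)\]\s+"
    r"(\w+) spr round (\d+) \(radius: (\d+)\)"
)

# Invented sample data, for illustration only.
SAMPLE_LOG = """
[00:05:00 -260000.000000] Model parameter optimization (eps = 10.000000)
[00:10:00 -250000.500000] SLOW spr round 1 (radius: 5)
[01:10:00 -249900.250000] SLOW spr round 2 (radius: 5)
[02:40:00 -249899.750000] SLOW spr round 3 (radius: 5)
"""


def spr_rounds(log_text):
    """Return (elapsed_seconds, loglh, phase, round_no, radius) per SPR round line."""
    rounds = []
    for line in log_text.splitlines():
        m = ROUND_RE.match(line.strip())
        if m is None:
            continue  # skip every line that is not an SPR round header
        h, mi, s, lh, phase, rnd, radius = m.groups()
        elapsed = int(h) * 3600 + int(mi) * 60 + int(s)
        rounds.append((elapsed, float(lh), phase, int(rnd), int(radius)))
    return rounds


def improvements(rounds):
    """Log-likelihood difference between consecutive reported rounds."""
    return [round(cur[1] - prev[1], 6) for prev, cur in zip(rounds, rounds[1:])]


if __name__ == "__main__":
    parsed = spr_rounds(SAMPLE_LOG)
    for (t0, _, _, _, _), (t1, _, phase, rnd, _), gain in zip(
        parsed, parsed[1:], improvements(parsed)
    ):
        print(f"{phase} round {rnd}: +{gain} logLH units in {(t1 - t0) / 3600:.2f} h")
```

Shrinking gains from round to round suggest the search is approaching convergence, but they are not a guarantee: a later round can still find a much better tree.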

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology

www.exelixis-lab.org

Kristjan Oopkaup

Dec 5, 2018, 3:53:42 AM

to raxml

I have 13866 taxa in my alignment.

Alexandros Stamatakis

Dec 5, 2018, 7:14:29 AM

to ra...@googlegroups.com

That explains it: likelihood surfaces of such difficult datasets (few
sites, many taxa) are usually very difficult to explore, hence the slow
convergence.

Alexis

Alexey Kozlov

Dec 5, 2018, 7:55:38 AM

to ra...@googlegroups.com

Hi Kristjan,

Unfortunately, this type of alignment (single-gene, thousands of taxa,
and lots of gaps: 77.57 %) is in general quite problematic, since such
alignments have too little signal. That's why it takes so long for
raxml-ng to converge.

You can try a couple of tricks to accelerate the search:

1) increase the number of CPU cores (up to 16 should still yield a
speedup), or use a newer/faster CPU if possible

2) use a parsimony starting tree: --tree pars{1}

3) increase the log-likelihood convergence threshold: e.g. --lh-epsilon 1.0

4) use a more aggressive heuristic for skipping SPRs: e.g. --spr-cutoff
0.5 or --spr-cutoff 0.2

5) use a constraint tree

Options 2) and 5) require re-running the tree search from scratch,
whereas with 1), 3), 4) you can continue your run from checkpoint.

Options 3), 4) and 5) would yield a tree with lower likelihood.
However, as can be seen in simulations, the true tree will most probably
*not* have the highest likelihood on an alignment with such dimensions
(>10,000 taxa x <10,000 sites). Short inner branches are especially
difficult to get right. So the best we can hope for is to get a
"reasonable" tree, i.e. one that is better than, e.g., a tree inferred
with parsimony.

Hope this helps,
Alexey
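
As a rough illustration of how options 2) to 4) change the original command line, here is a small Python sketch that assembles the raxml-ng arguments. The flags (--tree pars{1}, --lh-epsilon, --spr-cutoff) and the base command are the ones quoted in this thread; the helper name build_search_cmd is hypothetical. The sketch only builds and prints the commands, it does not run raxml-ng.

```python
import shlex


def build_search_cmd(msa, model, threads, seed,
                     parsimony_start=False, lh_epsilon=None, spr_cutoff=None):
    """Assemble a raxml-ng tree-search command as an argument list."""
    cmd = ["raxml-ng", "--msa", msa, "--threads", str(threads),
           "--seed", str(seed), "--model", model]
    if parsimony_start:
        # Option 2): one parsimony starting tree (requires a fresh run)
        cmd += ["--tree", "pars{1}"]
    if lh_epsilon is not None:
        # Option 3): looser log-likelihood convergence threshold
        cmd += ["--lh-epsilon", str(lh_epsilon)]
    if spr_cutoff is not None:
        # Option 4): more aggressive skipping of SPR moves
        cmd += ["--spr-cutoff", str(spr_cutoff)]
    return cmd


if __name__ == "__main__":
    # Continue the existing run with options 3) and 4)
    resume = build_search_cmd("T2.raxml.rba", "GTR+G", 8, 5,
                              lh_epsilon=1.0, spr_cutoff=0.5)
    # Start over from a parsimony tree, option 2)
    restart = build_search_cmd("T2.raxml.rba", "GTR+G", 8, 5,
                               parsimony_start=True)
    print(shlex.join(resume))
    print(shlex.join(restart))
```

The first command is meant to be rerun in the same directory so that the existing checkpoint is picked up. The second starts a new search, so it presumably needs a different output prefix or removal of the old checkpoint; that detail is an assumption here and worth checking against the raxml-ng help output.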

Győző Kaján

May 28, 2019, 9:22:51 AM

to raxml

Dear Alexey,

I have the same problem, but in a bootstrap analysis (the tree is already done), and I fear I cannot use 2) here. I tried to use the inferred (best) tree as a starting tree, but RAxML-NG 0.7 still uses a newly generated random tree. I guess a better starting tree would help a lot, as tree inference took at most 3.5 hours for me (based on parsimony starting trees), whereas in bootstrapping 5 hours were not enough for a single replicate tree inference.

Now I have continued the analysis from the checkpoint using suggestions 3) and 4), adding --lh-epsilon 1.0 --spr-cutoff 0.2 and also --bs-metric fbp,tbe. Is such a change in analysis settings possible "during" an inference?

Best regards,

Gyozo

Alexey Kozlov

May 28, 2019, 10:29:17 AM

to ra...@googlegroups.com

Dear Gyozo,

> Have the same problem but in a bootstrap analysis (the tree is already done), and I fear, I cannot
> use 2) here. I have tried to use the inferred (best) tree as a starting tree, but RAxML-NG 0.7 still
> uses a newly generated random tree.

Yes, bootstrapping will use random starting trees by default. This cannot be changed directly, but
there is a workaround, see below.

> Though I guess this would help a lot, as tree inference took
> max. 3.5 hours for me (based on parsimony starting trees), but for bootstrapping 5 hours were not
> enough (single tree inference times).

With RAxML-NG 0.8 and later, you can do the following:

1) use the "--bsmsa" command to generate bootstrap replicate alignments

2) run regular tree search on each replicate alignment, starting from a parsimony tree (better) or
best ML tree

3) collect best trees from 2) and use "--support" command to compute bootstrap support values on the
best ML tree
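
The three steps above can be sketched as command lines. The following Python helpers only assemble the arguments; they do not run raxml-ng. The commands --bsmsa and --support and the options --bs-trees, --bs-metric and --tree pars{1} are the ones named in this thread or in the raxml-ng help. All file names, the --prefix values and the helper names are hypothetical, and the naming of the replicate alignment files written by --bsmsa should be checked in the run's log rather than taken from this sketch.

```python
import shlex


def bsmsa_cmd(msa, model, n_reps, prefix):
    """Step 1: generate bootstrap replicate alignments."""
    return ["raxml-ng", "--bsmsa", "--msa", msa, "--model", model,
            "--bs-trees", str(n_reps), "--prefix", prefix]


def replicate_search_cmd(rep_msa, model, threads, prefix):
    """Step 2: regular tree search on one replicate, from a parsimony tree."""
    return ["raxml-ng", "--search", "--msa", rep_msa, "--model", model,
            "--tree", "pars{1}", "--threads", str(threads), "--prefix", prefix]


def merge_newick(trees):
    """Step 3a: put the best tree of every replicate into one text, one per line."""
    return "".join(t.strip() + "\n" for t in trees if t.strip())


def support_cmd(best_tree, bs_trees_file, prefix, metric=None):
    """Step 3b: map bootstrap support values onto the best ML tree."""
    cmd = ["raxml-ng", "--support", "--tree", best_tree,
           "--bs-trees", bs_trees_file, "--prefix", prefix]
    if metric is not None:
        cmd += ["--bs-metric", metric]  # e.g. "fbp,tbe" as used in this thread
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(shlex.join(bsmsa_cmd("aln.phy", "GTR+G", 100, "bs")))
    print(shlex.join(replicate_search_cmd("replicate_1.phy", "GTR+G", 8, "rep1")))
    print(merge_newick(["((A,B),C,D);", "((A,C),B,D);"]), end="")
    print(shlex.join(support_cmd("best.tre", "all_reps.tre", "sup", "fbp,tbe")))
```

Because each replicate is an independent tree search, the step 2 commands can be distributed across machines or cluster jobs, which is the main practical gain of this workaround over a single bootstrap run.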

> Now I have continued the analysis from the checkpoint using suggestion 3) and 4) by adding
> --lh-epsilon 1.0 --spr-cutoff 0.2 and also --bs-metric fbp,tbe. I am wondering if such a change in
> analysis settings is possible "during" an inference?

This should work, although those options should really be just a last-resort solution. I'd try
parsimony starting trees first - or just wait a bit longer :)

Hope this helps,
Alexey
