Total number of spr rounds?


Kristjan Oopkaup

Dec 4, 2018, 3:02:58 PM
to raxml
My tree search has run for several days and has reached SLOW spr round 37 (radius: 5). How can I tell how much longer it will run?

Running parameters:
raxml-ng --msa T2.raxml.rba --threads 8 --seed 5 --model GTR+G

Analysis options:
  run mode: ML tree search
  start tree(s): random (1)
  random seed: 5
  tip-inner: OFF
  pattern compression: ON
  per-rate scalers: OFF
  site repeats: ON
  fast spr radius: AUTO
  spr subtree cutoff: 1.000000
  branch lengths: linked (ML estimate, algorithm: NR-FAST)
  SIMD kernels: AVX
  parallelization: PTHREADS (8 threads), thread pinning: OFF

Partition 0: noname
Model: GTR+FO+G4m
Alignment sites / patterns: 6500 / 4638
Gaps: 77.57 %
Invariant sites: 14.72 %


Thank you,
Kristjan

Alexandros Stamatakis

Dec 5, 2018, 3:49:28 AM
to ra...@googlegroups.com
Unfortunately, you can't: it will run as long as it keeps finding better
trees, which is why the run times of raxml-ng are so hard to predict.

How many taxa does your dataset have?

Alexis

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology

www.exelixis-lab.org

Kristjan Oopkaup

Dec 5, 2018, 3:53:42 AM
to raxml
I have 13866 taxa in my alignment.

Alexandros Stamatakis

Dec 5, 2018, 7:14:29 AM
to ra...@googlegroups.com
That explains it: likelihood surfaces of such difficult datasets (few
sites, many taxa) are usually very hard to explore, hence the slow
convergence.

Alexis

Alexey Kozlov

Dec 5, 2018, 7:55:38 AM
to ra...@googlegroups.com
Hi Kristjan,

unfortunately, alignments of this type - single-gene, thousands of taxa
and lots of gaps (77.57 %) - are in general quite problematic, since
they contain too little signal. That's why it takes so long for raxml-ng
to converge.
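
To put rough numbers on "too little signal", here is a back-of-the-envelope sketch using only the figures posted in this thread. It is not a formal measure of phylogenetic signal; it only compares how many branch lengths the search has to estimate with how much data is available. The variable names are made up for illustration.

```shell
TAXA=13866        # number of taxa, from the thread
SITES=6500        # alignment sites, from the posted log
PATTERNS=4638     # unique site patterns, from the posted log
GAP_BP=7757       # 77.57 % gaps, written in basis points to stay in integer arithmetic

# An unrooted binary tree with n tips has 2n - 3 branches.
BRANCHES=$((2 * TAXA - 3))

# Average number of non-gap characters per sequence (integer division rounds down).
NONGAP_PER_TAXON=$((SITES * (10000 - GAP_BP) / 10000))

echo "branch lengths to estimate: $BRANCHES"
echo "site patterns available:    $PATTERNS"
echo "non-gap sites per taxon:    $NONGAP_PER_TAXON"
```

So there are 27729 branch lengths to optimize against 4638 site patterns, and each sequence contributes about 1457 non-gap characters on average.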

You can try a couple of tricks to accelerate the search:

1) increase the number of CPU cores (up to 16 should still yield a
speedup), or use a newer/faster CPU if possible

2) use a parsimony starting tree: --tree pars{1}

3) increase the log-likelihood convergence threshold: e.g. --lh-epsilon 1.0

4) use a more aggressive heuristic for skipping SPRs: e.g. --spr-cutoff
0.5 or --spr-cutoff 0.2

5) use a constraint tree

Options 2) and 5) require re-running the tree search from scratch,
whereas with 1), 3), 4) you can continue your run from checkpoint.
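
As a sketch only, the two routes could look like the commands below, built from the command line in the original post. The snippet prints the commands rather than running them. It assumes that re-running raxml-ng with the same output prefix picks up the existing checkpoint; the `--prefix T2_pars` value is a hypothetical name chosen so the fresh run does not collide with the old output.

```shell
# Dry-run sketch: the commands are only printed, not executed.
MSA="T2.raxml.rba"   # from the original post
SEED=5               # from the original post
MODEL="GTR+G"        # from the original post

# Options 1), 3), 4): original command plus more threads and the extra flags.
# Assumption: same prefix as before, so the run continues from its checkpoint.
resume_cmd="raxml-ng --msa $MSA --threads 16 --seed $SEED --model $MODEL --lh-epsilon 1.0 --spr-cutoff 0.5"

# Option 2): new search from one parsimony starting tree.
# T2_pars is a hypothetical prefix, not taken from the thread.
restart_cmd="raxml-ng --msa $MSA --threads 16 --seed $SEED --model $MODEL --tree pars{1} --prefix T2_pars"

echo "$resume_cmd"
echo "$restart_cmd"
```

The values 1.0 and 0.5 are just the examples given in the list above; a smaller --spr-cutoff such as 0.2 is more aggressive.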

Options 3), 4) and 5) would yield a tree with lower likelihood.
However, as can be seen in simulations, the true tree will most probably
*not* have the highest likelihood on an alignment of such dimensions
(>10,000 taxa x <10,000 sites). Short inner branches are especially
difficult to get right. So the best we can hope for is a
"reasonable" tree, i.e. one that is better than, e.g., a tree inferred with
parsimony.

Hope this helps,
Alexey


Győző Kaján

May 28, 2019, 9:22:51 AM
to raxml
Dear Alexey,

I have the same problem, but in a bootstrap analysis (the tree is already done), and I fear I cannot use 2) here. I tried to use the inferred (best) tree as a starting tree, but RAxML-NG 0.7 still uses a newly generated random tree. I guess a starting tree would help a lot, as tree inference took at most 3.5 hours for me (based on parsimony starting trees), whereas for bootstrapping 5 hours were not enough (both figures are per single tree inference).
I have now continued the analysis from the checkpoint using suggestions 3) and 4), adding --lh-epsilon 1.0 --spr-cutoff 0.2 and also --bs-metric fbp,tbe. Is it possible to change the analysis settings like this "during" an inference?

Best regards,
Gyozo

Alexey Kozlov

May 28, 2019, 10:29:17 AM
to ra...@googlegroups.com
Dear Gyozo,

> Have the same problem but in a bootstrap analysis (the tree is already done), and I fear, I cannot
> use 2) here. I have tried to use the inferred (best) tree as a starting tree, but RAxML-NG 0.7 still
> uses a newly generated random tree.

Yes, bootstrapping will use random starting trees by default. This cannot be changed directly, but
there is a workaround, see below.

> Though I guess this would help a lot, as tree inference took
> max. 3.5 hours for me (based on parsimony starting trees), but for bootstrapping 5 hours were not
> enough (single tree inference times).

With RAxML-NG 0.8 and later, you can do the following:

1) use the "--bsmsa" command to generate bootstrap replicate alignments

2) run a regular tree search on each replicate alignment, starting from a parsimony tree (better) or
the best ML tree

3) collect the best trees from 2) and use the "--support" command to compute bootstrap support values on the
best ML tree
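
A rough sketch of these three steps is given below. It prints the commands instead of running them. The file names `ali.phy` and `best.nwk`, the `bs` and `rep` prefixes, and the replicate count are hypothetical, and the naming of the replicate alignment files is an assumption that should be checked against the actual "--bsmsa" output.

```shell
# Dry-run sketch: prints the raxml-ng commands for the three steps.
MSA="ali.phy"     # hypothetical alignment file name
MODEL="GTR+G"
NREP=3            # small replicate count, for illustration only

bootstrap_plan() {
  # 1) generate the bootstrap replicate alignments
  echo "raxml-ng --bsmsa --msa $MSA --model $MODEL --bs-trees $NREP --prefix bs"

  # 2) one regular tree search per replicate, from a parsimony starting tree
  #    (assumption: replicates are written as bs.raxml.bootstrapMSA.<i>.phy)
  i=1
  while [ "$i" -le "$NREP" ]; do
    echo "raxml-ng --search --msa bs.raxml.bootstrapMSA.$i.phy --model $MODEL --tree pars{1} --prefix rep$i"
    i=$((i + 1))
  done

  # 3) collect the replicate trees and map support values onto the best ML tree
  echo "cat rep*.raxml.bestTree > replicates.nwk"
  echo "raxml-ng --support --tree best.nwk --bs-trees replicates.nwk --bs-metric fbp,tbe"
}

bootstrap_plan
```

Since the searches in step 2 are independent of each other, they can be spread over several machines or cluster jobs.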

> Now I have continued the analysis from the checkpoint using suggestion 3) and 4) by adding
> --lh-epsilon 1.0 --spr-cutoff 0.2 and also --bs-metric fbp,tbe. I am wondering if such a change in
> analysis settings is possible "during" an inference?

This should work, although those options should really be just a last-resort solution. I'd try
parsimony starting trees first - or just wait a bit longer :)

Hope this helps,
Alexey