Hi Kristjan,
unfortunately, this type of alignment - single-gene, thousands of taxa
and lots of gaps (77.57 %) - is in general quite problematic, since
it contains too little signal. That's why it takes so long for raxml-ng
to converge.
You can try a couple of tricks to accelerate the search:
1) increase the number of CPU cores (up to 16 should still yield a
speedup), or use a newer/faster CPU if possible
2) use a parsimony starting tree: --tree pars{1}
3) increase the log-likelihood convergence threshold: e.g. --lh-epsilon 1.0
4) use a more aggressive heuristic for skipping SPRs: e.g. --spr-cutoff
0.5 or --spr-cutoff 0.2
5) use a constraint tree
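
As a rough sketch, options 1) to 4) could be combined into a single
fresh run along the following lines. The file names, the prefix and the
GTR+G model are placeholders I am assuming for illustration (they are
not taken from your run), and the script only prints the command so you
can check it before running it yourself.

```shell
#!/bin/sh
# Dry run: build the raxml-ng command line and print it, without executing it.
# MSA, MODEL and PREFIX are hypothetical placeholders; adjust them to your data.
MSA="alignment.fasta"
MODEL="GTR+G"
PREFIX="fast_search"
THREADS=16                      # option 1: more cores (up to ~16)

CMD="raxml-ng --search --msa $MSA --model $MODEL --prefix $PREFIX"
CMD="$CMD --threads $THREADS"
CMD="$CMD --tree pars{1}"       # option 2: one parsimony starting tree
CMD="$CMD --lh-epsilon 1.0"     # option 3: looser convergence threshold
CMD="$CMD --spr-cutoff 0.5"     # option 4: more aggressive SPR skipping
# Option 5 (assumed file name): uncomment to add a constraint tree.
# CMD="$CMD --tree-constraint constraint.nwk"

echo "$CMD"
```

Copy the printed command into your terminal (or job script) once the
values look right.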
Options 2) and 5) require re-running the tree search from scratch,
whereas with 1), 3), 4) you can continue your run from checkpoint.
Options 3), 4) and 5) would yield a tree with a lower likelihood.
However, as can be seen in simulations, the true tree will most probably
*not* have the highest likelihood on an alignment with such dimensions
(>10,000 taxa x <10,000 sites). Short inner branches are especially
difficult to get right. So the best we can hope for is a "reasonable"
tree, i.e. one that is better than, e.g., a tree inferred with
parsimony.
Hope this helps,
Alexey
On 05.12.18 09:53, Kristjan Oopkaup wrote: