Hi Martha,
thanks for your answers/clarifications!
I guess that after fixing technical issues and with "GTR+I" model, you should see ~similar runtimes
between raxml and raxml-ng. Although raxml can still be faster due to using a parsimony starting
tree or other factors. In general, depending on the dataset, raxml-ng can be slower but find a
better tree (there is a reason why we use 20 starting tree by default). However, this does not
really matter, because:
The underlying problem - which we discussed here a few times, most recently w.r.t. SARS-Cov-2 data -
is the lack of signal in the alignment. In your case, there are >36000 taxa and just under 5000
sites, of which ~50% are invariant and ~80% are gaps. We know from experience with both empirical
and simulated data, that it is virtually impossible to recover a stable, let alone the "true"
topology from such data. Sure, there will be probably some "good" branches with high support. But in
most cases, the branching order will be either completely random (duplicate seqs) or almost random
(seqs differ in gap pattern or in 1-2 positions -> could be a sequencing error).
So, if you really need to get a bifurcating tree with all 36000 taxa, and you accept that most of
its branches will be random - you can use FastTree and save lots of time/energy :) The additional
work that raxml/raxml-ng are investing into finding a better-scoring tree simply wouldn't pay off:
as can be seen in simulations, for such "short" alignments, the true tree often has *lower*
likelihood score than a "wrong" topology found by raxml...
Alternatively, you can try to cluster your sequences into fewer OTUs with reasonable variation, and
then run raxml-ng on this reduced dataset.
Hope this helps,
Alexey