I am using RAxML-NG to generate a tree with 159 taxa. I generated the alignment file from GTDB-Tk and ran the parse command before setting up the main command:

raxml-ng --all --msa T1.raxml.rba --model LG+G8+F --prefix RAXML_full_tree --threads 14 --seed 2

RAxML-NG has been running for more than 19 days and has only made it through the first two bootstrap trees. This is much slower than any other tree I've made with RAxML-NG. Indeed, I used the same command with an alignment of 140 taxa and it finished in a couple of days. Both runs used the same command on the same server cluster; the only change was the alignment file.

I'm attaching the output (so far) from the slow-running sample. Any advice would be greatly appreciated. Thanks in advance.
I used GTDB-Tk to identify and align 120 genes, and I used that alignment in RAxML-NG. The samples are at the organism level. The 140 taxa in the earlier tree are all from the same genus and that run finished fine, which is why I'm confused that it takes so long with the additional samples from outside the genus.
It could indeed be that the added, more distant tips are too
distant for a good number of the genes, because the aligned
positions do not provide sorted phylogenetic patterns. If your
focal genus is an isolated one, genetically highly coherent and
distant from the added tips, the bootstraps in particular may have
problems effectively inserting the new tips into the phylogeny
that provides the framework for the smaller, more focussed taxon set.
The problem may be exacerbated by crucial missing genes. E.g., an
outgroup covered only for genes where it has no consistent
splitting pattern with any distinct part of the ingroup will lead
to extreme topological ambiguity and may inflate computation time.
Have you looked at the new alignment? Does it look as clean as
the one without the added tips? A quick assessment is also to
infer a simple pairwise distance matrix for the total data and
visualise it using a neighbour-net splits graph or a heat map.
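
A minimal sketch of the quick check suggested above, assuming the alignment is an aligned protein FASTA file such as the one GTDB-Tk writes. It computes the gap fraction of each taxon (to spot tips with many missing genes) and uncorrected pairwise p-distances (the proportion of differing sites among sites where both sequences have a residue). The function names, the set of characters treated as missing, and the tab-separated output are illustrative choices, not part of RAxML-NG or GTDB-Tk.

```python
import sys
from itertools import combinations

# Characters treated as missing data (an assumption; adjust to your alignment).
GAP_CHARS = set("-.?X*")


def read_fasta(path):
    """Read an aligned FASTA file into a dict of {taxon name: sequence}."""
    parts = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:].split()[0]
                parts[name] = []
            elif name is not None:
                parts[name].append(line.upper())
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


def gap_fraction(seq):
    """Fraction of alignment columns that are missing for this sequence."""
    if not seq:
        return 1.0
    return sum(char in GAP_CHARS for char in seq) / len(seq)


def p_distance(seq_a, seq_b):
    """Return (p-distance, number of compared sites) for two aligned sequences.

    Columns where either sequence is missing are skipped. If no column can
    be compared, the distance is NaN.
    """
    compared = 0
    diffs = 0
    for a, b in zip(seq_a, seq_b):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        return (float("nan"), 0)
    return (diffs / compared, compared)


def distance_table(seqs):
    """Pairwise p-distances for all taxon pairs, keyed by (taxon_a, taxon_b)."""
    return {
        (a, b): p_distance(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)
    }


if __name__ == "__main__":
    alignment = read_fasta(sys.argv[1])
    print("taxon\tgap_fraction")
    for taxon, seq in alignment.items():
        print(f"{taxon}\t{gap_fraction(seq):.3f}")
    print("taxon_a\ttaxon_b\tp_distance\tcompared_sites")
    for (a, b), (dist, sites) in distance_table(alignment).items():
        print(f"{a}\t{b}\t{dist:.4f}\t{sites}")
```

Run it with the alignment path as the only argument. Pairs with very few compared sites, or tips with a high gap fraction, are the ones to inspect first; the pairwise table can then be loaded into any plotting tool for a heat map, or converted to a distance matrix for a neighbour-net program.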
On Thursday, December 7, 2023 at 10:26:31 AM UTC-6 Guido wrote:
Hi Kevin,
On 07.12.2023 at 17:11, 'Kevin Myers' via raxml wrote:
> Thanks. Would the addition of 19 new samples really impact the signal
> that much?
Normally not, if from the same group...
> The 19 are from more distantly related taxa as well, if that helps?
...it could be that they are too distantly related and interfering with
the original tip sets/ingroup topology. What kind of matrix is it?
Taxonomic level, organismal group, and gene sample can make huge
differences even when only a few samples are added to an otherwise
unproblematic matrix (in some cases).
/G
raxml-ng-mpi --check --msa gtdbtk.bac120.msa.fasta --model GTR+G --data-type aa --prefix T1