raxml-ng tips and tricks?


Andy

Nov 8, 2020, 9:27:20 AM
to raxml

Hello team,

I have an msa with just over 2,000 taxa (amino acids). I executed raxml-ng as follows:


raxml-ng-mpi --all --msa alignment.rba --model JTT --prefix T3 --threads 7 --seed 69 --bs-metric tbe

I chose 7 threads because that's what the --parse argument that I ran prior had suggested. I was told our computing cluster has 30 available virtual cores and I know it has 32 Intel cores (1 thread per core).

It has been 51 hours and the process is only on its third starting tree (of 20 being generated). My questions are: 1) did I do the right thing to end up with a best ML tree and bootstrap values within a reasonable (~1-2 week) window of time? And 2) is there a way I could hope for a tree of reasonable quality in less than a week?

Obviously I probably wouldn't want 1000 bootstrap replicates for this size alignment, so I am hoping I did the right thing for raxml to decide for itself how many replicates to do.

Below is the terminal printout of what I believe to be the pertinent information. I hope it is not too much to read (bonus if someone knows about posting syntax in this group for code:))

Best, and thanks a ton RAXers!

###############################################################

System: Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz, 32 cores, 503 GB RAM

RAxML-NG was called at 06-Nov-2020 11:03:00 as follows:

raxml-ng-mpi --all --msa all_seq_sprot_alignment.fas.reduced.raxml.rba --model JTT --prefix T3 --threads 7 --seed 69 --bs-metric tbe

Analysis options:
  run mode: ML tree search + bootstrapping (Transfer Bootstrap)
  start tree(s): random (10) + parsimony (10)
  bootstrap replicates: max: 1000 + bootstopping (autoMRE, cutoff: 0.030000)
  random seed: 69
  tip-inner: OFF
  pattern compression: ON
  per-rate scalers: OFF
  site repeats: ON
  branch lengths: proportional (ML estimate, algorithm: NR-FAST)
  SIMD kernels: AVX2
  parallelization: coarse-grained (auto), PTHREADS (7 threads), thread pinning: ON

WARNING: The model you specified on the command line (JTT) will be ignored
         since the binary MSA file already contains a model definition.
         If you want to change the model, please re-run RAxML-NG
         with the original PHYLIP/FASTA alignment and --redo option.

[00:00:00] Loading binary alignment from file: all_seq_sprot_alignment.fas.reduced.raxml.rba
[00:00:00] Alignment comprises 2178 taxa, 1 partitions and 2825 patterns

Partition 0: noname
Model: JTT
Alignment sites / patterns: 3075 / 2825
Gaps: 88.88 %
Invariant sites: 16.26 %


Parallelization scheme autoconfig: 1 worker(s) x 7 thread(s)




Alexey Kozlov

Nov 8, 2020, 4:30:39 PM
to ra...@googlegroups.com
Hello Andy,

the latest version of raxml-ng (which you appear to be using) supports automatic coarse-grained
parallelization:

https://github.com/amkozlov/raxml-ng/wiki/Parallelization#adaptive-parallelization

Therefore, you can use all 32 cores (--threads 32) on each node and, if available, multiple nodes
for your analysis. This should speed things up by at least 4x (using 1 node).

Regarding posting syntax, please attach the full *.raxml.log file; this is usually more informative than
individual lines from the output.

Best,
Alexey
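
[Editor's sketch] To put rough numbers on the suggestion above: the snippet below extrapolates from the figures in the first post (51 hours, third of 20 starting trees) and prints the same command with 32 threads. It assumes, optimistically, that the third tree is nearly finished, that every starting tree takes about the same time, and that the "at least 4x" speedup holds. It covers only the tree search, not the bootstrap replicates. The prefix T4 is an arbitrary choice so that the existing T3 output is left alone.

```shell
# Back-of-the-envelope estimate for the ML tree search only (no bootstrapping).
hours_elapsed=51
trees_done=3      # assumption: the third starting tree is nearly finished
trees_total=20

hours_per_tree=$(( hours_elapsed / trees_done ))
search_hours_7t=$(( hours_per_tree * trees_total ))
search_hours_32t=$(( search_hours_7t / 4 ))   # assumption: ~4x speedup on one node

echo "7 threads:  about ${search_hours_7t} h for ${trees_total} starting trees"
echo "32 threads: about ${search_hours_32t} h, if the 4x speedup holds"

# Same analysis as in the first post, but using all 32 cores of one node.
# Printed as a dry run; run $CMD (unquoted) in place of echo to launch.
CMD="raxml-ng-mpi --all --msa alignment.rba --model JTT --prefix T4 --threads 32 --seed 69 --bs-metric tbe"
echo "$CMD"
```

Under these assumptions the search alone drops from roughly two weeks to a few days, which is why the thread count matters more here than any other setting.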

Andy

Nov 10, 2020, 3:49:08 AM
to raxml
@Alexey

Thanks much for your feedback and I'll look into this. However, I tried running raxml-ng-mpi with 32 threads and received the following error:

ERROR: CPU core oversubscription detected! RAxML-NG will terminate now to avoid wasting resources.
NOTE:  Details: https://github.com/amkozlov/raxml-ng/wiki/Parallelization#core-oversubscription
NOTE:  You can use '--force perf_threads' to disable this check, but ONLY if you are 200% sure this is a false alarm!

However, I do not know if this was a false alarm. I know from the manual that there are risks to using too many threads. So it appears that 7 threads is too few and 32 is too many; would you suggest picking a number somewhere in the middle (e.g. 24 threads)? I am attaching the log file from my last run.

Best,

Andy
T3.raxml.log

Alexey Kozlov

Nov 10, 2020, 6:22:34 AM
to ra...@googlegroups.com
Hi Andy,

according to the raxml-ng output, your machine has 32 cores, so it could be that it is not
exclusively allocated to your job. Are you using a workload manager like SLURM? If so, could you
please show your job submission script?

Best,
Alexey
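
[Editor's sketch] For readers who have not used a workload manager, a SLURM submission script for this kind of run might look roughly like the following. This is not the script used in this thread. The job name, the time limit and the --exclusive request are assumptions, and partition or account settings depend on the cluster. The idea is to reserve all 32 cores of one node so that no other job competes for them, and to pass the granted core count to --threads.

```shell
#!/bin/bash
#SBATCH --job-name=raxml-T4      # hypothetical job name
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32       # matches the 32 cores reported by raxml-ng
#SBATCH --exclusive              # assumption: ask for a node not shared with other jobs
#SBATCH --time=7-00:00:00        # assumption: one-week limit

# Use the core count SLURM granted; fall back to 32 when run outside SLURM.
THREADS="${SLURM_CPUS_PER_TASK:-32}"

CMD="raxml-ng-mpi --all --msa seqs.rba --model JTT --prefix T4 --threads $THREADS --seed 69 --bs-metric tbe"

# Printed as a dry run; run $CMD (unquoted) in place of echo to launch.
echo "$CMD"
```

Such a script would be submitted with sbatch. Tying --threads to SLURM_CPUS_PER_TASK keeps the thread count in step with the allocation, which is the situation the oversubscription check is meant to enforce.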


Andy

Nov 13, 2020, 5:06:38 AM
to raxml
Hi Alexey,

I can confirm that using a job manager fixed the issue. It seems that the cores were "oversubscribed" to other jobs not related to me but now I have specified to find a node with available resources.

To my surprise, my run now says it's starting bootstrapping with 1000 replicates. I assumed raxml would choose a lower number as needed. Is the command I used calling for 1000 bootstrap replicates, or is raxml choosing that on its own?

raxml-ng-mpi --all --msa seqs.rba --model JTT --prefix T4 --threads 32 --seed 69 --bs-metric tbe

Alexey Kozlov

Nov 13, 2020, 10:04:34 AM
to ra...@googlegroups.com
Hi Andy,

great to hear it worked!

With that command line, raxml-ng will use the bootstopping test to stop as soon as convergence is
detected, so 1000 is just the upper bound. Still, you can manually limit the number of replicates
(e.g. "--bs-trees 100") to get the result sooner. You can always add more replicates later on, as
described in the tutorial here:

https://github.com/amkozlov/raxml-ng/wiki/Tutorial#bootstrapping

Best,
Alexey
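
[Editor's sketch] The workflow described above (cap the replicates now, add more later) could look roughly like this, following the pattern in the linked tutorial. The prefixes T5 to T7, the second seed, the combined file name allbootstraps and the thread counts for the last two steps are illustrative assumptions. The file names T4.raxml.bootstraps and T4.raxml.bestTree assume the usual raxml-ng output naming for prefix T4. The commands are only printed, so nothing is launched by accident.

```shell
# Step 1: full analysis, but with the bootstrap replicates capped at 100.
STEP1="raxml-ng-mpi --all --msa seqs.rba --model JTT --prefix T4 --threads 32 --seed 69 --bs-metric tbe --bs-trees 100"

# Step 2 (later, if needed): 100 more replicates with a different seed.
STEP2="raxml-ng-mpi --bootstrap --msa seqs.rba --model JTT --prefix T5 --threads 32 --seed 70 --bs-trees 100"

# Step 3: pool the replicate trees from both runs into one file.
STEP3="cat T4.raxml.bootstraps T5.raxml.bootstraps > allbootstraps"

# Step 4: check convergence of the pooled replicates (same 0.03 cutoff as in the log).
STEP4="raxml-ng-mpi --bsconverge --bs-trees allbootstraps --prefix T6 --seed 69 --threads 1 --bs-cutoff 0.03"

# Step 5: map the pooled support values onto the best ML tree from step 1.
STEP5="raxml-ng-mpi --support --tree T4.raxml.bestTree --bs-trees allbootstraps --prefix T7 --threads 1 --bs-metric tbe"

# Dry run: print the commands in order.
for step in "$STEP1" "$STEP2" "$STEP3" "$STEP4" "$STEP5"; do
  echo "$step"
done
```

Using a different seed in step 2 matters: with the same seed the second run would reproduce the same replicates rather than add new ones.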