RAxML EPA expectation management


rutge...@naturalis.nl

Mar 16, 2025, 12:27:01 AM
to raxml
Hi folks,

I have a fully resolved tree with 9002 taxa, on which I am placing 57887 sequences (including those 9002 in the tree) from an alignment with 700 columns under a GTRCAT model. 

I'm running raxmlHPC-PTHREADS-AVX2 version 8.2.12 on an HPC compute node with 56 AMD64 (x86-64) cores (Intel(R) Xeon(R) CPU E5-2680 v4, 2.4 GHz) and 385 GiB RAM, i.e. with `-T 56`.

Three questions:
  1. Is this the optimal version of raxmlHPC to use? E.g. among AVX2/SSE3 and PTHREADS, am I missing out on architecture-specific features that could help? (`lscpu | grep -i avx2` shows we have AVX2 support)
  2. Any sense of what running time to expect? I mean, between 'a few days' and 'not before the heat death of the sun', what's closer?
  3. What dials can I still turn to speed things up? My sense is that running time will scale approximately linearly with both the number of tips in the reference tree and the number of sequences to place, so pruning tips and filtering query sequences should both help?
Thanks very much!

Best wishes,

Rutger Vos
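
A rough way to reason about question 3 is sketched below. It is a back-of-the-envelope model, not a description of what RAxML actually does internally: it assumes every query sequence is scored once on every branch of the reference tree, that the cost per score is proportional to the number of alignment columns, and that the 9002 sequences already in the tree are not placed again. The function names are made up for illustration, and any placement heuristics that skip branches would make the real behaviour deviate from this.

```python
def branch_count(n_taxa: int) -> int:
    """Number of branches in an unrooted, fully resolved tree with n_taxa tips."""
    if n_taxa < 3:
        raise ValueError("an unrooted binary tree needs at least 3 tips")
    return 2 * n_taxa - 3


def relative_placement_work(n_ref_taxa: int, n_queries: int, n_sites: int) -> int:
    """Crude work proxy: each query is scored on each branch, cost ~ number of sites."""
    return branch_count(n_ref_taxa) * n_queries * n_sites


# Numbers taken from the post above.
ref_taxa = 9002
total_seqs = 57887
sites = 700

# Assumption: sequences already in the reference tree are not placed again.
queries = total_seqs - ref_taxa

baseline = relative_placement_work(ref_taxa, queries, sites)
half_queries = relative_placement_work(ref_taxa, queries // 2, sites)
half_tree = relative_placement_work(ref_taxa // 2, queries, sites)

print("branches in reference tree:", branch_count(ref_taxa))
print("query sequences:", queries)
print("work after halving the queries:", round(half_queries / baseline, 3))
print("work after halving the reference tree:", round(half_tree / baseline, 3))
```

Under these assumptions both dials are close to linear: halving either the query set or the reference tree roughly halves the work proxy. Whether measured wall-clock time follows the same pattern would have to be checked on a subsample.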

Alexandros Stamatakis

Mar 16, 2025, 12:54:28 AM
to ra...@googlegroups.com
Dear Rutger,

How are you doing?

We now have a new implementation of the evolutionary placement
algorithm, called EPA-NG, that is substantially faster:

https://academic.oup.com/sysbio/article/68/2/365/5079844

The code is here:

https://github.com/pierrebarbera/epa-ng

Placement of the sequences on the reference tree you describe should
take less than one day.

Hope this helps,

Alexis



--
Alexandros (Alexis) Stamatakis

ERA Chair, Institute of Computer Science, Foundation for Research and
Technology - Hellas
Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology

www.biocomp.gr (Crete lab)
www.exelixis-lab.org (Heidelberg lab)
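
For readers who want to try the EPA-NG route suggested above, here is a minimal sketch of the preparation step. The alignment described in the original post contains both the reference and the query sequences, whereas EPA-NG takes them as separate files, so the sketch first splits a combined alignment by the tip names of the reference tree and then assembles an `epa-ng` command line. The helper names and all file names are hypothetical, the alignment is represented as a plain name-to-sequence dictionary, and the long options used (`--tree`, `--ref-msa`, `--query`, `--model`, `--outdir`, `--threads`) should be checked against `epa-ng --help` for the installed version.

```python
from typing import Dict, Iterable, List, Tuple


def split_alignment(
    alignment: Dict[str, str], ref_names: Iterable[str]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a combined alignment into reference and query parts by tip name."""
    names = set(ref_names)
    missing = names.difference(alignment)
    if missing:
        raise ValueError(f"reference tips missing from alignment: {sorted(missing)}")
    ref = {n: s for n, s in alignment.items() if n in names}
    query = {n: s for n, s in alignment.items() if n not in names}
    return ref, query


def build_epa_ng_command(
    tree: str, ref_msa: str, query_msa: str, model: str, outdir: str, threads: int
) -> List[str]:
    """Assemble an epa-ng argument list (file names are placeholders)."""
    return [
        "epa-ng",
        "--tree", tree,
        "--ref-msa", ref_msa,
        "--query", query_msa,
        "--model", model,
        "--outdir", outdir,
        "--threads", str(threads),
    ]


# Toy data standing in for the real 57887-sequence alignment.
toy = {"taxonA": "ACGT", "taxonB": "ACGA", "query1": "AC-T"}
ref, query = split_alignment(toy, ["taxonA", "taxonB"])
print(len(ref), "reference sequences;", len(query), "query sequences")

cmd = build_epa_ng_command(
    "reference.tre", "reference.fasta", "query.fasta", "GTR+G", "epa_out", 56
)
print(" ".join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would launch the run, assuming `epa-ng` is on the `PATH` and the placeholder files exist.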

Rutger Vos

Mar 16, 2025, 8:59:34 AM
to ra...@googlegroups.com
Dear Alexis,

how nice to hear from you! I am very well, thank you - hope you are too.

Indeed, I discovered EPA-NG after posting my question, and I'm now exploring whether and how to incorporate it into our workflow. I gather that it needs the GTR model parameters, and since we are using IQ-TREE upstream anyway, I will look into using that to estimate the model and pass it to the placement run.
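
A minimal sketch of that hand-off, under stated assumptions: the six GTR exchange rates and four base frequencies have already been read from the upstream model estimate, and the placement tool accepts a RAxML-NG-style model string of the form `GTR{...}+FU{...}` with an optional `+G4{alpha}`. The function name is hypothetical, the rate order AC/AG/AT/CG/CT/GT is an assumption to verify against both tools, and the numbers in the example are placeholders rather than estimates from any real data set.

```python
from typing import Optional, Sequence


def gtr_model_string(
    rates: Sequence[float], freqs: Sequence[float], alpha: Optional[float] = None
) -> str:
    """Format GTR rates and base frequencies as a RAxML-NG-style model string.

    rates: six exchange rates, assumed order AC, AG, AT, CG, CT, GT
    freqs: four base frequencies, assumed order A, C, G, T
    alpha: optional shape parameter for a 4-category gamma model
    """
    if len(rates) != 6:
        raise ValueError("expected 6 GTR exchange rates")
    if len(freqs) != 4:
        raise ValueError("expected 4 base frequencies")
    if abs(sum(freqs) - 1.0) > 1e-3:
        raise ValueError("base frequencies should sum to 1")
    # ':g' keeps up to 6 significant digits and drops trailing zeros.
    model = "GTR{" + "/".join(f"{r:g}" for r in rates) + "}"
    model += "+FU{" + "/".join(f"{f:g}" for f in freqs) + "}"
    if alpha is not None:
        model += f"+G4{{{alpha:g}}}"
    return model


# Placeholder values, not taken from any real analysis.
example = gtr_model_string(
    rates=(0.7, 1.8, 1.2, 0.6, 3.0, 1.0),
    freqs=(0.25, 0.23, 0.30, 0.22),
    alpha=0.47,
)
print(example)  # GTR{0.7/1.8/1.2/0.6/3/1}+FU{0.25/0.23/0.3/0.22}+G4{0.47}
```

The resulting string would then be passed as the model argument of the placement run.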

However, I also noticed that the RAxML EPA run completed within a few hours! That is faster than I anticipated. I guess the run was tuned adequately :)

Sincerely // Met vriendelijke groet,

Dr. Rutger A. Vos
Bioinformatician





Phone: +31627085806
Email: rutge...@naturalis.nl

Darwinweg 2, 2333 CR Leiden
Postbus 9517, 2300 RA Leiden

 

