restarting MPI version with RAxML


Andrea Pineda

Mar 8, 2016, 9:24:23 AM
to raxml
Dear RAxML users

I need to do a best-tree search without bootstrapping using RAxML on a dataset with ~800 alignment patterns and ~8300 sequences. Since I have walltime limitations on the cluster, I used the -j option. However, given the PBS script I am using, I have several files called RAxML_result.exampleRun##. Please see the PBS script below.

How can I restart the runs if I have multiple files like this? What other options do I have for a faster calculation of the best tree with RAxML? I know that the ExaML program may be useful, but I need to build this tree specifically with RAxML for comparison purposes.

Thank you for your help, 

Andrea

PBS file:
________



#!/bin/bash -l
#PBS -l nodes=5:ppn=20:ivybridge
#PBS -l pmem=3gb 
#PBS -l walltime=200:00:00

module load OpenMPI/1.6.5-GCC-4.8.2

mpirun raxmlHPC-MPI -j -p 12345 -# 20 -m GTRGAMMA -s inputfile.phy -n Tree_Best_file





Alexandros Stamatakis

Mar 8, 2016, 9:31:07 AM
to ra...@googlegroups.com
dear andrea,

the easiest way is to run 20 separate tree searches on one processor
each with the sequential version of RAxML. It should then also be easier
to track where you need to restart from if you give each run a different
name. Also, for greater efficiency you should use the SSE3 and preferably
the AVX version of the code; AVX seems to be supported by the CPUs in
your cluster.
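A minimal sketch of what such a batch of independent sequential runs could look like (binary, input file, and run names are illustrative placeholders; the commands are echoed here rather than executed, so they can be redirected into per-run job scripts):

```shell
# Sketch: 20 independent sequential searches, each with its own seed and run
# name, so it is easy to see which runs finished and which need a restart.
for i in $(seq 1 20); do
  echo "raxmlHPC-AVX -p $((12345 + i)) -m GTRGAMMA -s inputfile.phy -n run_${i}"
done
```

Distinct run names (`-n run_1` ... `-n run_20`) mean each search writes its own RAxML_info/result files, so an interrupted run can be identified and relaunched without disturbing the others.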

Alexis
--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University
of Arizona at Tucson

www.exelixis-lab.org

Andrea Pineda

Mar 8, 2016, 4:21:38 PM
to raxml
Dear Alexis

Thank you for the quick reply. 
I'll try that option.
One last question: how can I calculate the RAM usage for a specific dataset? This dataset already gave me some out-of-memory errors, and the cluster administrator advised using the 3gb or 6gb option. Although it is very nice to use the 6gb option, the job is going to spend more time in the queue, so it would be nice to have an approximation.
Thank you in advance, 

Andrea

Alexey Kozlov

Mar 8, 2016, 7:25:08 PM
to ra...@googlegroups.com
Hi Andrea,

> One last question: how can I calculate the RAM usage for a specific dataset?

You can roughly estimate the amount of RAM required for your dataset here:

http://exelixis-lab.org/web/software/raxml/index.html#memcalc

3gb should be more than enough.
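As a rough sanity check on that figure (the calculator linked above is the authoritative source): for DNA data under GAMMA, memory is typically dominated by the conditional likelihood vectors, roughly (taxa - 2) inner nodes × patterns × 4 states × 4 rate categories × 8 bytes. Plugging in Andrea's dataset dimensions:

```shell
# Back-of-envelope CLV memory estimate for ~8300 taxa, ~800 patterns,
# DNA, GTRGAMMA. An approximation only - not the calculator's exact figure.
taxa=8300
patterns=800
bytes=$(( (taxa - 2) * patterns * 4 * 4 * 8 ))
echo "$(( bytes / 1024 / 1024 )) MB"   # -> 810 MB, well under the 3gb limit
```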

Also I noticed that you have the following line in your script:

> #PBS -l nodes=5:ppn=20:ivybridge

which means that you're using 5 nodes running 20 MPI processes *each* - that is, 100 processes in total.
However, according to your command line, you have just 20 starting trees/tree searches. Since raxmlHPC-MPI can only
parallelize across tree searches, 80% of the CPUs allocated for your job will lie idle - not good :)

So if you can afford it in terms of per-node memory, you could just use 1 node (nodes=1:ppn=20) - the RAxML runtime will be
roughly the same, but your job might be scheduled faster. Alternatively, you can also decrease the "ppn" value to start fewer
processes per node - this would allow you to use more memory per process (pmem) or to run your job on nodes with less RAM
installed (again, lower queuing time).
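Putting that suggestion into a concrete job script (a sketch only: the scheduler directives, module, and binary name are taken from Andrea's original file, with just the node count changed so every MPI rank has its own tree search):

```shell
#!/bin/bash -l
# One node, 20 processes: 20 MPI ranks for 20 tree searches, no idle cores.
#PBS -l nodes=1:ppn=20:ivybridge
#PBS -l pmem=3gb
#PBS -l walltime=200:00:00

module load OpenMPI/1.6.5-GCC-4.8.2

mpirun raxmlHPC-MPI -j -p 12345 -# 20 -m GTRGAMMA -s inputfile.phy -n Tree_Best_file
```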

Hope this helps,
Alexey

Kaján Győző

Mar 9, 2016, 4:32:18 AM
to ra...@googlegroups.com
Dear Alexey,

And what setting would you suggest to parallelize bootstrap processes?
I am using the PTHREADS version, but (at least on our cluster) this works only on 1 node (16 cores, -T 16). When I booked two nodes and tried "-T 32", it became even slower, if I remember well.
Now that I have seen this thread, I started to hope I could use the MPI version, but if I understand correctly, it is not helpful to me.
(I tried to interpret the different flavors of RAxML in the manual, but I must admit this part exceeded my abilities.)

Best,
Viktor


Alexandros Stamatakis

unread,
Mar 9, 2016, 5:01:30 AM3/9/16
to ra...@googlegroups.com
dear viktor,

> And what setting would you suggest to parallelize bootstrap processes?

It really depends on your input dataset, see here for advice:

http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bi0614s51/abstract;jsessionid=5EF5A21C9CD83CF10CBD5D2B77B770E7.f04t02?userIsAuthenticated=false&deniedAccessCustomisedMessage=

> I am using the PTHREADS version, but (at least on our cluster) this
> works only with 1 node (16 cores, -T 16). When I booked two nodes, and
> tried "-T 32" it became even slower, if I remember well.

Evidently, since two separate nodes don't share the same memory, the
PThreads version only works on processors with shared memory. In your
case I assume that all 32 threads were running on 16 cores only, thus
competing for them and slowing things down.

> Now I have seen this letter, I started to hope to use the MPI version,
> but if I understand well, this is not helpful to me.

You can use the MPI version for bootstraps if you want to use more than
one node.

Alexis

> (I tried to interpret the different flavors of RAxML in the manual, but
> I must admit, this part exceeded my abilities.)
>
> Best,
> Viktor
>


Andrea Pineda

Mar 9, 2016, 5:45:16 AM
to raxml
Dear Alexey

Thank you for your clear answer; this is going to help us improve the PBS settings in our group.
All the best, 

Andrea

Alexey Kozlov

Mar 9, 2016, 9:09:42 AM
to ra...@googlegroups.com
Hi Viktor,

as Alexis suggested above, you can use RAxML-MPI to parallelize over bootstraps as well (sorry, I didn't mention it in
my answer to Andrea, since she was running a plain tree search without bootstrapping - that was probably confusing for
you). You should use 1 MPI process per core, so in your case it would be "mpirun -n 32" for 2 nodes.

Also, if your alignment is large enough, you can experiment with RAxML-HYBRID, which uses PTHREADS-based
parallelization over alignment sites as well as MPI-based parallelization over bootstraps/tree searches. The command
line would then look like "mpirun -n 2 raxmlHPC-HYBRID-AVX -T 16 ...", and of course you can tune the number of threads and
MPI processes according to your dataset dimensions and number of bootstraps.

In general, however, I would expect the pure MPI version to perform better. There are two important exceptions, though:
- you have fewer bootstraps/starting trees than you have cores
- you cannot run 1 MPI process (= tree search/BS) per core because of the per-node memory limit (see my answer to Andrea)
In these cases, you might want to try RAxML-HYBRID.
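For Viktor's setup of 2 nodes × 16 cores, the two invocations would look roughly like this (a sketch: the -AVX binary names, seeds, and file names are illustrative, and the commands are echoed rather than run):

```shell
# Pure MPI: one rank per core, each rank runs an independent bootstrap/search.
echo "mpirun -n 32 raxmlHPC-MPI-AVX -p 12345 -b 54321 -# 100 -m GTRGAMMA -s aln.phy -n mpi_run"
# Hybrid: one rank per node, 16 threads per rank parallelizing over sites.
echo "mpirun -n 2 raxmlHPC-HYBRID-AVX -T 16 -p 12345 -b 54321 -# 100 -m GTRGAMMA -s aln.phy -n hyb_run"
```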

Also, please read the chapter "How many threads shall I use?" in the RAxML manual:

http://sco.h-its.org/exelixis/resource/download/NewManual.pdf

Cheers,
Alexey

On 09.03.2016 10:31, Kaján Győző wrote:
> Dear Alexey,
>
> And what setting would you suggest to parallelize bootstrap processes?
> I am using the PTHREADS version, but (at least on our cluster) this works only with 1 node (16 cores, -T 16). When I
> booked two nodes, and tried "-T 32" it became even slower, if I remember well.
> Now I have seen this letter, I started to hope to use the MPI version, but if I understand well, this is not helpful to me.
> (I tried to interpret the different flavors of RAxML in the manual, but I must admit, this part exceeded my abilities.)
>
> Best,
> Viktor
>

Kaján Győző

Mar 10, 2016, 9:32:25 AM
to ra...@googlegroups.com
Dear Alexis, Alexey and Andrea,

You all helped me a lot, thank you very much! RAxML is a very powerful program, I learn it every day again and again.
Using multiple nodes, I have achieved ultra short computing times!
Just one question: using 2 nodes, 32 cores and the MPI version, I asked for 100 inferences - and I got 128! I guess the number of inferences should be n times the number of cores? (n = 1, 2, 3...)
(Alexey, I made some tests, and the PTHREADS version was actually slower on my short little alignment in total core time. I mean, it was of course faster in wall-clock time, but multiplied by the number of cores it consumed more and more CPU time as I added cores. So in the end I chose MPI on 32 cores and not the HYBRID.)

Thank you again!
Viktor


Alexandros Stamatakis

Mar 10, 2016, 9:49:10 AM
to ra...@googlegroups.com
Hi Viktor,

> You all helped me a lot, thank you very much! RAxML is a very powerful
> program, I learn it every day again and again.

:-)

> Using multiple nodes, I have achieved ultra short computing times!
> Just one question: using 2 nodes, 32 cores and the MPI version, I was
> asking for 100 inferences. And I got 128! I guess the number of
> inferences should be n-times the number of cores? (n=1,2,3...)

Yes, it's rounded up.
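That rounding matches what Viktor observed: 100 inferences on 32 ranks becomes the next multiple of the core count. A quick arithmetic sketch of that behaviour:

```shell
# Round the requested inference count up to a multiple of the core count.
cores=32
requested=100
per_core=$(( (requested + cores - 1) / cores ))   # ceil(100/32) = 4
echo $(( per_core * cores ))                      # -> 128
```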

> (Alexey, I made some tests, and the PTHREADS version was actually slower
> on my short, little alignment in sum core time.

Yes, that's expected, parallel efficiency will increase with alignment
length ...

> I mean it was of course
> faster, but multiplied by the number of cores, it was more and more
> using more cores. So at the end I have chosen MPI on 32 cores and not
> the HYBRID.)

good choice :-)

alexis

>
> Thank you again!
> Viktor
>