How to determine the optimal number of threads for RAxML jobs

Alex

Feb 13, 2013, 11:17:06 AM

to ra...@googlegroups.com

Hi,

I have a question regarding how to determine the optimal number of threads to use for a RAxML job:

I would like to run several RAxML jobs on a handful of alignments on a machine with 8 CPUs, 10 cores each, with hyperthreading turned on (i.e. up to 160 threads).

The protein alignment has ~2500 taxa and is ~8000 characters wide.

What would be the best strategy to determine the optimal number of threads to use? I do not want to run full RAxML jobs with different numbers of threads; instead, I was wondering whether it would be sufficient to run e.g. 5 bootstraps only and check the running time for these. Can I use the running time of the bootstraps as an estimate for the running time of the complete job, including ML searches? Or is the behaviour of the ML search (in terms of threading performance) completely different from that of the bootstraps?

Thanks for your help,

Alex

Daniel Lahr

Feb 14, 2013, 6:37:04 AM

to ra...@googlegroups.com

Hi Alex,

In my experience that depends on several aspects of the strategy you are using.

In my typical search, I will run real bootstraps (not rapid ones) separately from the ML search. I generally do an ML search starting from several different parsimony trees and then use the best one to plot the bootstrap values. In this case, assuming all other things are equal, each ML search iteration takes about the same time as a bootstrap.

Depending on whether you are doing the search using CAT or GAMMA, the times to completion will vary enormously. If you are using GAMMA for the ML search and CAT for bootstrapping, the ML search will take much longer.

If you are doing the "fast" approach, i.e., running a bunch of rapid BS under CAT and then doing a final ML optimization under GAMMA, that final step shouldn't add much to the overall time...

Hope this helps,

Dan

Alexandros Stamatakis

Feb 25, 2013, 12:11:30 PM

to ra...@googlegroups.com

Hi Alex,

You can generate a parsimony starting tree and then test
execution times with different numbers of threads using, for instance,
the -f e option that just optimizes model params on the tree to predict
the parallel speedups for your searches.
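
The test described above could be scripted roughly as follows. This is only a sketch under assumptions that are not from the thread: the binary name raxmlHPC-PTHREADS, the model string passed to -m, the seed, and all helper function names are placeholders. It builds one parsimony starting tree with -y, then times one -f e run per thread count and reports speedup and parallel efficiency relative to the smallest thread count tested. The Pthreads build is assumed to need at least 2 threads, so the baseline here is 2 rather than 1.

```python
import subprocess
import time

RAXML = "raxmlHPC-PTHREADS"  # assumed binary name; depends on how RAxML was built


def parsimony_tree_cmd(alignment, model, name, seed=12345, threads=2):
    """Command that only builds a parsimony starting tree (-y) and exits."""
    return [RAXML, "-T", str(threads), "-y", "-s", alignment,
            "-m", model, "-p", str(seed), "-n", name]


def evaluate_cmd(alignment, model, tree, threads, name):
    """Command that optimizes model parameters on a fixed tree (-f e)."""
    return [RAXML, "-T", str(threads), "-f", "e", "-s", alignment,
            "-m", model, "-t", tree, "-n", name]


def speedups(timings):
    """Map threads -> (speedup, efficiency) relative to the smallest thread count."""
    base = min(timings)
    result = {}
    for threads, seconds in sorted(timings.items()):
        speedup = timings[base] / seconds
        result[threads] = (speedup, speedup * base / threads)
    return result


def benchmark(alignment, model, thread_counts, run_id="bench"):
    """Time one -f e run per thread count; returns threads -> wall-clock seconds."""
    subprocess.run(parsimony_tree_cmd(alignment, model, run_id), check=True)
    tree = f"RAxML_parsimonyTree.{run_id}"  # file RAxML writes for -y
    timings = {}
    for threads in thread_counts:
        # Each run needs its own -n name; RAxML will not overwrite old output.
        cmd = evaluate_cmd(alignment, model, tree, threads, f"{run_id}_T{threads}")
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        timings[threads] = time.perf_counter() - start
    return timings
```

A call such as speedups(benchmark("alignment.phy", "PROTGAMMAWAG", [2, 4, 8, 16, 32])) would then show where efficiency starts to drop, which is a reasonable point to stop adding threads. The measured wall-clock time includes reading the alignment, so short runs will understate the achievable speedup somewhat.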

Alexis

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University
of Arizona at Tucson

www.exelixis-lab.org
