Is RAxML Hanging...

Dallas Thomas

Oct 17, 2016, 4:23:13 PM
to raxml
So I am working with a large data set, which I understand is going to take RAxML a long time to process.  To keep some idea of progress, I have been using tail -f on the log to gauge status.  The output log appears to have stopped mid-word; however, top still shows that RAxML is running.  How do you know, then, whether RAxML has actually hung or not?  I cannot let this run for days or weeks and then, after it has sat there for so long, just decide to restart the run.


Any thoughts?  Thanks.
Dallas

Alexey Kozlov

Oct 17, 2016, 6:41:11 PM
to ra...@googlegroups.com
Hi Dallas,

I guess the log file is the only way to know; just check it again in a couple of hours with cat / tail (tail -f
sometimes behaves weirdly).
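
For an automated pipeline, one way to script that periodic check is to look at the log file's modification time instead of watching tail -f. The sketch below is only an illustration, not part of RAxML: the function names and the two-hour threshold are assumptions. A long likelihood step can legitimately produce no output for a while, and a file written through a buffered stream can end mid-word without the process having hung, so a stale log is a hint rather than proof.

```python
import os
import time


def seconds_since_update(path, now=None):
    """Return how many seconds ago `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def looks_stalled(path, max_idle_seconds=2 * 60 * 60, now=None):
    """Return True if `path` has not been written to within the threshold.

    The default of two hours is an arbitrary, hypothetical choice; pick a
    value that matches how often your own runs normally update the log.
    """
    return seconds_since_update(path, now) > max_idle_seconds
```

A pipeline could call looks_stalled() on the log every so often and only alert a human, rather than killing the run automatically.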

Also, please make sure you're not falling into one of the common traps with the pthreads version, i.e. that you're not using
more threads than you have CPU cores, and not more than 1 thread per 500-1000 alignment patterns.
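
That rule of thumb can be written down as a small helper. This is a minimal sketch under stated assumptions: suggest_threads is a hypothetical name, the default of 500 patterns per thread is the lower end of the range above, and the numbers in the example are made up for illustration.

```python
def suggest_threads(n_patterns, n_cores, patterns_per_thread=500):
    """Suggest a thread count from the two limits in the rule of thumb.

    The result never exceeds the number of CPU cores and never gives a
    thread fewer than `patterns_per_thread` alignment patterns to work on.
    """
    if n_patterns < 1 or n_cores < 1 or patterns_per_thread < 1:
        raise ValueError("all arguments must be positive integers")
    by_patterns = max(1, n_patterns // patterns_per_thread)
    return min(n_cores, by_patterns)


# Illustrative numbers only: 3000 patterns on a 16-core machine.
print(suggest_threads(3000, 16))                            # 6
print(suggest_threads(3000, 16, patterns_per_thread=1000))  # 3
```

Passing patterns_per_thread=1000 gives the more conservative end of the range.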

Hope this helps,
Alexey

Dallas Thomas

Oct 18, 2016, 11:12:23 AM
to raxml
Hello Alexey,

Nothing changed in the log file so I killed the process and started over.

As for the number of pthreads, I have never exceeded the number of cores.  I am running RAxML on a 48-core (8x hexacore) server, and I set the number of threads based on the total number of sequences in the alignment divided by 500.  To be more accurate, I could start RAxML, wait until it produces the info file, kill that run, get the number of patterns from the alignment file, and set the threads accordingly.

Other than that, I am not sure what else I can do.

Thanks.
Dallas

Alexey Kozlov

Oct 18, 2016, 11:20:41 AM
to ra...@googlegroups.com
Hi Dallas,

> As for the number of pthreads, I have never exceeded the number of cores. I am running RAxML on a 48-core (8x hexacore)
> server, and I set the number of threads based on the total number of sequences in the alignment divided by 500. To be more

You mean "number of alignment sites", not sequences, don't you?

> accurate, I could start RAxML, wait until it produces the info file, kill that run, get the number of patterns
> from the alignment file, and set the threads accordingly.

Yes, that's always better. So what are the dimensions of your dataset (# seqs and # sites)?

Also, can I see the RAxML_log file which was generated before you killed the process for the first time (i.e. after long
"hanging")?

Best,
Alexey

Dallas Thomas

Oct 19, 2016, 1:13:12 PM
to raxml
At first I used the number of sequences instead of the "number of alignment sites", as this is part of an automated pipeline.  Starting and then killing a run can be done, but it is more of a hassle to automate, unless you have a script that produces the info file, which could then be parsed to get the alignment site information before running RAxML.

As for the log file portion - here is a snippet:

...
...
IMPORTANT WARNING: Sequences AFC33405.1 and AFH65718.1 are exactly identical

IMPORTANT WARNING
Found 1434 sequences that are exactly identical to other sequences in the alignment.
Normally they should be excluded from the analysis.

Just in case you might need it, an alignment file with 
sequence duplicates removed is printed to file /home/thomasd/Projects/Biochem/GH/GH43/all/muscle/GH43.muscle_aln.phyi.reduced

This is the RAxML Master Pthread

This is RAxML Worker Pthread Number: 4

This is RAxML Worker Pthread Number: 2

This is RAxML Worker Pthread Number: 3

This is RAxML Worker Pthread Number: 5

This is RAxML Worker Pthread Number: 1

This is RAxML Worker Pthread Number: 6

This is RAxML Worker Pthread Number: 7

This is RAxML Worker Pthread Number: 10

This is RAxML Worker Pthread Number: 8

This is RAxML Worker Pthread Number: 9

This is RAxML Worker Pthread Number: 11

This is RAxML Worker Pthread Number: 12

This is RAxML Worker Pthread Number: 13

This is RAxML Worker Pthread Number: 15

This is RAxML Worker Pthread Number: 14

This is RAxML Worker Pthread Number: 17

This is RAxML Worker Pthread Number: 16

This is RAxML Worker Pthread Number: 18

This is RAxML Worker Pthread Number: 19

This is RAxML Worker Pthread Number: 20

This is RAxML Worker Pthread Number: 21

This is RAxML Worker Pthread Number: 22

This is RAxML Worker Pthread Number: 23

This is RAxML Worker Pthread Number: 24

This is RAxML Worker Pthread Number: 2  <- this is where it hung

---------

The repeats are expected.  I initially called raxml with -T 26; this was probably a bit much, and I could easily have halved it.  I have over 5900 sequences and nearly 5500 alignment sites.

Dallas

Alexey Kozlov

Oct 19, 2016, 4:50:44 PM
to ra...@googlegroups.com
Hi Dallas,

> At first I used the number of sequences instead of the "number of alignment sites", as this is part of an automated
> pipeline. Starting and then killing a run can be done, but it is more of a hassle to automate, unless you have a script
> that produces the info file, which could then be parsed to get the alignment site information before running RAxML.

You can (mis-)use any fast-to-compute operation for this purpose, e.g. random starting tree generation:

./raxmlHPC-AVX -s alignment.phy -n test -m GTRGAMMA -p 12345 -y -d

Alternatively, if you have thousands of sequences, alignment width (# sites) should give an approximation of # unique
patterns which is close enough for choosing number of threads to use.
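
For the second option, the alignment width can be read without starting RAxML at all, because the first line of a PHYLIP file gives the number of sequences followed by the number of sites. The following is a rough sketch for a pipeline, not something shipped with RAxML: the function names and the 500-sites-per-thread default are assumptions. The site count is only an upper bound on the number of unique patterns, so the result is on the generous side.

```python
def phylip_dimensions(path):
    """Return (n_sequences, n_sites) read from the PHYLIP header line."""
    with open(path) as handle:
        fields = handle.readline().split()
    if len(fields) < 2:
        raise ValueError("expected a PHYLIP header: '<n_sequences> <n_sites>'")
    return int(fields[0]), int(fields[1])


def threads_from_alignment(path, n_cores, sites_per_thread=500):
    """Estimate a thread count from the alignment width alone.

    Uses the number of sites as a stand-in for the number of unique
    patterns, then caps the result at the number of available cores.
    """
    _, n_sites = phylip_dimensions(path)
    return min(n_cores, max(1, n_sites // sites_per_thread))
```

The returned value could then be passed to -T when the pipeline builds the actual RAxML command line.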

> As for the log file portion - here is a snippet:

This looks rather weird. Did you get a "RAxML_log.<RUN_NAME>" file as well? RAxML_log is more convenient for tracking
progress, as it gets updated more frequently than RAxML_info.

> The repeats are expected. I initially called raxml with -T 26; this was probably a bit much, and I could easily have
> halved it. I have over 5900 sequences and nearly 5500 alignment sites.

Exactly. I'd say 8 threads would be more than enough in this case. It's much more efficient to run multiple tree
searches in parallel instead (different starting trees or alignments), given that you have enough memory, of course. You
can use the hybrid MPI/PTHREADS version of RAxML to parallelize across starting trees. If interested, please search the
Google Group history, as this topic has been discussed a couple of times before...
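
To illustrate the idea of several smaller searches side by side, here is a sketch that only builds command lines for independent runs with different random seeds. It is not the hybrid MPI/PTHREADS build; it is a simpler stand-in using separate processes. The binary name raxmlHPC-PTHREADS-AVX, the run names, and the helper name plan_searches are assumptions for illustration; the flags used (-T, -s, -m, -p, -n) are the usual RAxML options for threads, alignment, model, parsimony seed, and run name.

```python
import shlex


def plan_searches(alignment, model, n_cores, threads_per_search,
                  base_seed=12345, binary="raxmlHPC-PTHREADS-AVX"):
    """Build one command line per independent tree search.

    The number of concurrent searches is the number of cores divided by
    the threads given to each search, so the machine is not oversubscribed.
    Each search gets its own seed and its own run name.
    """
    n_jobs = max(1, n_cores // threads_per_search)
    commands = []
    for i in range(n_jobs):
        args = [binary,
                "-T", str(threads_per_search),
                "-s", alignment,
                "-m", model,
                "-p", str(base_seed + i),   # different starting tree per run
                "-n", "search{}".format(i)]
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


# 48 cores with 8 threads per search gives 6 concurrent searches.
for command in plan_searches("alignment.phy", "GTRGAMMA", 48, 8):
    print(command)
```

Each search holds its own copy of the data in memory, so the memory caveat above applies to the total across all concurrent runs.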

Best,
Alexey

Dallas Thomas

Oct 20, 2016, 12:01:10 PM
to raxml
Hello Alexey,



> You can (mis-)use any fast-to-compute operation for this purpose, e.g. random starting tree generation:
>
> ./raxmlHPC-AVX -s alignment.phy -n test -m GTRGAMMA -p 12345 -y -d
>
> Alternatively, if you have thousands of sequences, alignment width (# sites) should give an approximation of # unique
> patterns which is close enough for choosing number of threads to use.

I will give these a try and go from there.  Thanks.

> > As for the log file portion - here is a snippet:
>
> This looks rather weird. Did you get a "RAxML_log.<RUN_NAME>" file as well? RAxML_log is more convenient for tracking
> progress, as it gets updated more frequently than RAxML_info.

No, I did not get that log file; the only log I generally get is RAxML_info.  The snippet I gave you was what I redirected from stdout.

> > The repeats are expected. I initially called raxml with -T 26; this was probably a bit much, and I could easily have
> > halved it. I have over 5900 sequences and nearly 5500 alignment sites.
>
> Exactly. I'd say 8 threads would be more than enough in this case. It's much more efficient to run multiple tree
> searches in parallel instead (different starting trees or alignments), given that you have enough memory, of course. You
> can use the hybrid MPI/PTHREADS version of RAxML to parallelize across starting trees. If interested, please search the
> Google Group history, as this topic has been discussed a couple of times before...

I will look into the MPI/PTHREADS version.

Thanks
Dallas