Trouble running RAxML-MPI on cluster.


Mike

Oct 19, 2012, 10:01:10 AM
to ra...@googlegroups.com
Greetings raxml group!

I have successfully compiled raxmlHPC-MPI-SSE3 on our head node without issue. However, when I try to submit a raxml job to the queue I get some errors I do not understand. Below is the batch script I used:

#### BEGIN SUBMISSION ####
# Walltime Limit: hh:mm:ss
#PBS -l walltime=48:00:00 
# Node Specification:
#PBS -l ncpus=4 -l nodes=1
# Queue 
#PBS -q long 
# suppress email notification
#PBS -m n
# Job Name:
#PBS -N RAxML_test 
# Keep output
#PBS -k oe
echo ------------------------------------------------------
echo PBS: qsub is running on $PBS_O_HOST
echo PBS: originating queue is $PBS_O_QUEUE
echo PBS: executing queue is $PBS_QUEUE
echo PBS: working directory is $PBS_O_WORKDIR
echo PBS: execution mode is $PBS_ENVIRONMENT
echo PBS: job identifier is $PBS_JOBID
echo PBS: job name is $PBS_JOBNAME
echo PBS: node file is $PBS_NODEFILE
echo PBS: current home directory is $PBS_O_HOME
echo PBS: PATH = $PBS_O_PATH
echo ------------------------------------------------------
cd $PBS_O_WORKDIR
mpiexec -n 4 raxmlHPC-MPI-SSE3 -f d -m PROTGAMMALG -N 100 -o Escherichia_coli_str__K_12_substr__MG1655_chromosome_NC_000913 -s /home/mike/raxml_test/Concatenated_AA_alignment.phy -n /home/mike/raxml_test/Concatenated_AA_alignment -q /home/mike/raxml_test/AA_raxml_parts.txt

#### END SUBMISSION ####

The error file that the cluster outputs reads:

#### BEGIN ERROR FILE ####
raxmlHPC-MPI-SSE3: axml.c:4188: analyzeRunId: Assertion `0' failed.
[node041:18522] *** Process received signal ***
[node041:18522] Signal: Aborted (6)
[node041:18522] Signal code:  (-6)
raxmlHPC-MPI-SSE3: axml.c:4188: analyzeRunId: Assertion `0' failed.
[node041:18521] *** Process received signal ***
[node041:18521] Signal: Aborted (6)
[node041:18521] Signal code:  (-6)
[node041:18521] [ 0] /lib64/libpthread.so.0 [0x35ad40eb10]
[node041:18521] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x35acc30265]
[node041:18521] [ 2] /lib64/libc.so.6(abort+0x110) [0x35acc31d10]
[node041:18521] [ 3] /lib64/libc.so.6(__assert_fail+0xf6) [0x35acc296e6]
[node041:18521] [ 4] raxmlHPC-MPI-SSE3(main+0x1852) [0x415dc2]
[node041:18521] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4) [0x35acc1d994]
[node041:18521] [ 6] raxmlHPC-MPI-SSE3 [0x404339]
[node041:18521] *** End of error message ***
[node041:18522] [ 0] /lib64/libpthread.so.0 [0x35ad40eb10]
[node041:18522] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x35acc30265]
[node041:18522] [ 2] /lib64/libc.so.6(abort+0x110) [0x35acc31d10]
[node041:18522] [ 3] /lib64/libc.so.6(__assert_fail+0xf6) [0x35acc296e6]
[node041:18522] [ 4] raxmlHPC-MPI-SSE3(main+0x1852) [0x415dc2]
[node041:18522] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4) [0x35acc1d994]
[node041:18522] [ 6] raxmlHPC-MPI-SSE3 [0x404339]
[node041:18522] *** End of error message ***
raxmlHPC-MPI-SSE3: axml.c:4188: analyzeRunId: Assertion `0' failed.
[node041:18524] *** Process received signal ***
[node041:18524] Signal: Aborted (6)
[node041:18524] Signal code:  (-6)
[node041:18524] [ 0] /lib64/libpthread.so.0 [0x35ad40eb10]
[node041:18524] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x35acc30265]
[node041:18524] [ 2] /lib64/libc.so.6(abort+0x110) [0x35acc31d10]
[node041:18524] [ 3] /lib64/libc.so.6(__assert_fail+0xf6) [0x35acc296e6]
[node041:18524] [ 4] raxmlHPC-MPI-SSE3(main+0x1852) [0x415dc2]
[node041:18524] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4) [0x35acc1d994]
[node041:18524] [ 6] raxmlHPC-MPI-SSE3 [0x404339]
[node041:18524] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 18522 on node node041 exited on signal 6 (Aborted).
--------------------------------------------------------------------------
raxmlHPC-MPI-SSE3: axml.c:4188: analyzeRunId: Assertion `0' failed.
[node041:18523] *** Process received signal ***
[node041:18523] Signal: Aborted (6)
[node041:18523] Signal code:  (-6)
[node041:18523] [ 0] /lib64/libpthread.so.0 [0x35ad40eb10]
[node041:18523] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x35acc30265]
[node041:18523] [ 2] /lib64/libc.so.6(abort+0x110) [0x35acc31d10]
[node041:18523] [ 3] /lib64/libc.so.6(__assert_fail+0xf6) [0x35acc296e6]
[node041:18523] [ 4] raxmlHPC-MPI-SSE3(main+0x1852) [0x415dc2]
[node041:18523] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4) [0x35acc1d994]
[node041:18523] [ 6] raxmlHPC-MPI-SSE3 [0x404339]
[node041:18523] *** End of error message ***
#### END ERROR FILE ####

I am currently discussing this with our cluster admin, but so far neither of us can determine what is causing the issue. I get the same error whether I use RAxML v7.2.8 or the GitHub version. Any ideas on what the issue may be?

-Thanks for any help in advance.
-Mike

Alexandros Stamatakis

Oct 19, 2012, 11:13:07 AM
to ra...@googlegroups.com
Hi Mike,

The run name passed via -n needs to be just a name, not a directory
path, something like "-n myFirstMPIJob".

The function analyzeRunId(), which checks the format of that string, exits
with an error because the string you passed contains "/".
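
For readers landing here with the same assertion, the fix might look roughly like the sketch below. It is only an illustration: is_valid_run_name is a hypothetical helper (not part of RAxML), the paths are the ones from the original post, and sending the output to the data directory with -w (the option mentioned later in this thread) is an assumption about where Mike wanted the results. The command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper, not part of RAxML: accept a run name only if it is
# non-empty and contains no "/" (the character reported in this thread).
is_valid_run_name() {
    case "$1" in
        ""|*/*) return 1 ;;
        *)      return 0 ;;
    esac
}

DATA_DIR="/home/mike/raxml_test"        # directory from the original post
RUN_NAME="Concatenated_AA_alignment"    # plain name, no directories

if is_valid_run_name "$RUN_NAME"; then
    # Printed rather than executed, so the sketch is safe to try anywhere.
    echo "mpiexec -n 4 raxmlHPC-MPI-SSE3 -f d -m PROTGAMMALG -N 100" \
         "-o Escherichia_coli_str__K_12_substr__MG1655_chromosome_NC_000913" \
         "-s $DATA_DIR/Concatenated_AA_alignment.phy" \
         "-q $DATA_DIR/AA_raxml_parts.txt" \
         "-n $RUN_NAME -w $DATA_DIR"
else
    echo "run name must not contain '/': $RUN_NAME" >&2
fi
```

The only change from the failing command is that the directory part moves out of -n and into -w.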

Alexis
--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University
of Arizona at Tucson

www.exelixis-lab.org

Mike

Oct 22, 2012, 11:16:02 AM
to ra...@googlegroups.com
Hi Alexis,

That was it! It's all running fine now. :-)

-Thanks!
-Mike

Érica Souza

May 17, 2018, 10:06:40 AM
to raxml
Hi

I had the same problem, and when I replaced the -n value with a plain name, I got this message:

This is RAxML MPI Process Number: 0
The file /dados/software/standard-RAxML-8.2.11/RAxML_info.Mfirst RAxML wants to open for writing or appending can not be opened [mode: ab], exiting ...

I searched for an answer, but all I could find was something about permissions on the folder.

Thanks

Alexey Kozlov

May 17, 2018, 10:14:01 AM
to ra...@googlegroups.com
Hi Erica,

that's right, you probably don't have write permission for the folder where RAxML is installed
(/dados/software/standard-RAxML-8.2.11).

So just tell RAxML to save its output files in a folder you have access to (e.g. under your home folder). You can do this
with the "-w" option, e.g.

-w /home/erica/raxml_output
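
Before resubmitting, it may be worth checking that the chosen folder really is writable. The sketch below is one way to do that; can_write_dir is a hypothetical helper, and the default of $HOME/raxml_output is only an assumed location. As far as I know, RAxML expects a full (absolute) path for -w.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the directory exists (or can be
# created) and the current user is allowed to write to it.
can_write_dir() {
    mkdir -p "$1" 2>/dev/null && [ -w "$1" ]
}

OUT_DIR="${OUT_DIR:-$HOME/raxml_output}"   # assumed location; adjust as needed

if can_write_dir "$OUT_DIR"; then
    echo "ok: pass  -w $OUT_DIR  to RAxML"
else
    echo "cannot write to $OUT_DIR; choose another folder" >&2
fi
```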

Hope this helps,
Alexey

Érica Souza

May 17, 2018, 11:25:51 AM
to ra...@googlegroups.com
Dear Alexey

It works \o/

Thank you very much

All the best


--
Érica Souza
B.Sc. in Biological Sciences - Biodiversity and Conservation (Universidade Federal do Amazonas - UFAM)
M.Sc. in Genetics, Conservation and Evolutionary Biology (Instituto Nacional de Pesquisas da Amazônia - INPA)
Ph.D. candidate in the Genetics, Evolution, Microbiology and Immunology program (Universidade Estadual de Campinas - UNICAMP)

Alexey Kozlov

May 17, 2018, 11:27:59 AM
to ra...@googlegroups.com
great, you're welcome :)


Carolina Piña Páez

Aug 10, 2021, 1:12:50 AM
to raxml
Hello everyone!

I have a similar error running RAxML-MPI on the University cluster:

mpiexec has exited due to process rank 9 with PID 57204 on
node symbiosis.cgrb.oregonstate.local exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).

Here is the full command: mpiexec -n 10 raxmlHPC-MPI -N 50 -n myMLJob -s gt.phy -m MULTICAT -f a -x 12345 -p 12345

Not sure if this is an infrastructure problem or a user error. Any help will be greatly appreciated.

Best,
Caro

Alexey Kozlov

Aug 11, 2021, 8:07:14 AM
to ra...@googlegroups.com
Hi Caro,

it would help if you could post the RAxML_info file and the console output for this run.

Best,
Alexey

Carolina Piña Páez

Aug 19, 2021, 3:12:46 PM
to raxml

Hi Alexey, 

Thanks for the response. This run did not generate a RAxML_info file. Here is the output information:

snp_tree_6.o9884263 

This is RAxML MPI Process Number: 9

This is RAxML MPI Process Number: 5

This is RAxML MPI Process Number: 7

This is RAxML MPI Process Number: 8

This is RAxML MPI Process Number: 2

This is RAxML MPI Process Number: 6
Multi State Error, characters must be used in the order they are available, i.e.
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V
You are using the following characters:
0 1 2 N
  Finished at:            Mon Aug 9 14:39:07 PDT 2021

Thanks for your help.
Best,
Caro

Alexey Kozlov

Aug 25, 2021, 8:14:19 AM
to ra...@googlegroups.com
Hi Carolina,

you should replace "N" with "-" (gap) in your alignment. See error message below:

Multi State Error, characters must be used in the order they are available, i.e.
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V
You are using the following characters:
0 1 2 N
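
One possible way to do the replacement without touching taxon names is sketched below. It assumes a relaxed sequential PHYLIP file in which every line after the header is "<name> <sequence>" on a single line; mask_n_as_gap and the output file name are hypothetical, and interleaved files would need a different approach.

```shell
#!/bin/sh
# Replace "N" with "-" in the sequence column only, leaving the header line
# and the taxon names untouched.
# Assumption: each line after the header is "<name> <sequence>" on one line.
mask_n_as_gap() {
    awk 'NR == 1 { print; next }                            # header line
         NF >= 2 { gsub(/N/, "-", $2); print $1, $2; next } # sequence lines
         { print }' "$1"                                    # anything else
}

# Usage (hypothetical output name):
#   mask_n_as_gap gt.phy > gt_gaps.phy
```

A plain search-and-replace over the whole file would also change any "N" inside a taxon name, which is why the sketch edits the second column only.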

Best,
Alexey