Error MPI job abort - meme-chip

Teshome Mulugeta

Apr 27, 2016, 4:55:02 AM
to MEME Suite Q&A
Hi,

MEME VERSION 4.11.1

I am running meme-chip, and it fails at the MEME step with the message below. The message suggests specifying the "self" BTL, but meme-chip has no option for doing that.

meme-chip -bfile $BACKGROUND -order 3 -dna -meme-p 6 -oc $RESULTDIR -meme-nmotifs 50 -meme-maxw 20 ${INFILE}

-----------------------------------------------------------------------------------------------------------------------------------------------------------
[cn-7:44386] [db_pmi.c:457:commit] PMI_KVS_Commit: Operation failed
[cn-7][[30107,1],0][btl_tcp_proc.c:132:mca_btl_tcp_proc_create] ompi_modex_recv: failed with return value=-48
[cn-7][[30107,1],0][btl_tcp_proc.c:132:mca_btl_tcp_proc_create] ompi_modex_recv: failed with return value=-48
[cn-7][[30107,1],0][btl_tcp_proc.c:132:mca_btl_tcp_proc_create] ompi_modex_recv: failed with return value=-48
[cn-7][[30107,1],0][btl_tcp_proc.c:132:mca_btl_tcp_proc_create] ompi_modex_recv: failed with return value=-48
[cn-7][[30107,1],0][btl_tcp_proc.c:132:mca_btl_tcp_proc_create] ompi_modex_recv: failed with return value=-48
[cn-7][[30107,1],0][btl_tcp_proc.c:132:mca_btl_tcp_proc_create] ompi_modex_recv: failed with return value=-48
[cn-7][[30107,1],0][btl_tcp_proc.c:132:mca_btl_tcp_proc_create] ompi_modex_recv: failed with return value=-48
[cn-7][[30107,1],0][btl_tcp_proc.c:132:mca_btl_tcp_proc_create] ompi_modex_recv: failed with return value=-48
[cn-7][[30107,1],0][btl_tcp_proc.c:132:mca_btl_tcp_proc_create] ompi_modex_recv: failed with return value=-48
[cn-7][[30107,1],0][btl_tcp_proc.c:132:mca_btl_tcp_proc_create] ompi_modex_recv: failed with return value=-48
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[30107,1],0]) is on host: cn-7
  Process 2 ([[30107,1],1]) is on host: unknown!
  BTLs attempted: tcp self

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
MPI_INIT has failed because at least one MPI process is unreachable
from another.  This *usually* means that an underlying communication
plugin -- such as a BTL or an MTL -- has either not loaded or not
allowed itself to be used.  Your MPI job will now abort.

You may wish to try to narrow down the problem;

 * Check the output of ompi_info to see which BTL/MTL plugins are
   available.
 * Run your application with MPI_THREAD_SINGLE.
 * Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
   if using MTL-based communications) to see exactly which
   communication plugins were considered and/or discarded.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[cn-7:44386] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
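
The narrowing-down steps listed in the Open MPI message above could be tried from the shell roughly as follows. This is only a sketch, assuming Open MPI's ompi_info is on the PATH and that the MPI launcher invoked by meme-chip inherits the shell environment; OMPI_MCA_btl_base_verbose is the environment-variable form of the btl_base_verbose MCA parameter named in the message.

```shell
# List the BTL components this Open MPI build provides
# (skipped if ompi_info is not installed on this machine).
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | grep -i "btl"
fi

# Ask Open MPI to report which BTLs it considers or discards.
# MCA parameters can be set through environment variables named OMPI_MCA_<parameter>.
export OMPI_MCA_btl_base_verbose=100

# Then rerun the failing meme-chip command in the same shell and inspect the extra output.
```

The extra output may be useful to pass on to whoever administers the cluster.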





CharlesEGrant

Apr 27, 2016, 1:22:51 PM
to MEME Suite Q&A
Hi Teshome,

These error messages indicate a problem with the configuration of your MPI library or your cluster, which is beyond what we can help with. You'll need to work with your local system administrators to resolve it. In the meantime, you can omit the
 -meme-p 6
option and run MEME in serial mode.
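
For example, the command from the original post with -meme-p 6 dropped would look something like the sketch below. The file and directory names are placeholders assumed for illustration; the remaining options are copied unchanged from the post.

```shell
# Placeholder paths (assumptions for illustration); substitute your own files.
BACKGROUND="background.model"
RESULTDIR="memechip_out"
INFILE="sequences.fasta"

# Original command minus "-meme-p 6", so MEME runs as a single process without MPI.
SERIAL_CMD="meme-chip -bfile $BACKGROUND -order 3 -dna -oc $RESULTDIR -meme-nmotifs 50 -meme-maxw 20 $INFILE"

# Print the command for review; run it with: eval "$SERIAL_CMD"
echo "$SERIAL_CMD"
```

Expect the MEME step to take longer than a parallel run would, since it uses a single process.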

Charles