Fatal error in MPI_Init_thread, mvapich2 on ranger


Chris Kees

Sep 16, 2009, 9:06:22 PM
to mpi4py
Hi,

I'm getting this problem testing helloworld.py on the Ranger system
(Linux) on 128 processors. I'm using mpi4py-1.1.0, downloaded today, and
Python 2.6.1 that I built myself using the Intel and GNU compilers and
mvapich2/1.2. Ever seen this before? Thanks, Chris

Fatal error in MPI_Init_thread:
Other MPI error, error stack:
MPIR_Init_thread(310)..: Initialization failed
MPID_Init(113).........: channel initialization failed
MPIDI_CH3_Init(161)....:
MPIDI_CH3I_CM_Init(828): Error initializing MVAPICH2 malloc library

Lisandro Dalcin

Sep 16, 2009, 9:24:10 PM
to mpi...@googlegroups.com
Not sure what's going on... Could you try to launch a simple C program
calling MPI_Init_thread() asking for MPI_THREAD_MULTIPLE?

There are three possible root causes for this issue.

1) Your MPI installation is broken. Have you tried to run a trivial,
pure-C program?

2) Your MPI does not fully support threads. You could try to:

2.a) Compile a simple C program calling MPI_Init_thread() asking for
MPI_THREAD_MULTIPLE, and see what happens (see the sketch below, after item 3).

2.b) Add these two lines at the VERY beginning of demo/helloworld.py,
BEFORE the line "from mpi4py import MPI":

import mpi4py.rc
mpi4py.rc.threaded = False

3) Perhaps there is a problem with shared libraries... Before "from
mpi4py import MPI", could you use ctypes to load the MPICH2 shared
library using RTLD_GLOBAL? You should add a line like

import ctypes
ctypes.CDLL("/full/path/to/libmpich.so", ctypes.RTLD_GLOBAL)
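
For 2.a, a minimal sketch of such a test program could look like this
(the file name and build line are just an example):

/* init_thread_test.c -- check the thread support level provided.
 * Build with: mpicc init_thread_test.c -o init_thread_test */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int provided;
    /* Ask for the highest thread level; the library may provide less. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    printf("requested MPI_THREAD_MULTIPLE (%d), provided %d\n",
           MPI_THREAD_MULTIPLE, provided);
    MPI_Finalize();
    return 0;
}
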
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

Chris Kees

Sep 16, 2009, 11:22:49 PM
to mpi...@googlegroups.com

On Sep 16, 2009, at 8:24 PM, Lisandro Dalcin wrote:

>
> Not sure what's going on... Could you try to launch a simple C program
> calling MPI_Init_thread() asking for MPI_THREAD_MULTIPLE?
>
> There are three possible root causes for this issue.
>
> 1) Your MPI installation is broken. Have you tried to run a trivial,
> pure-C program?
>

pure-C is OK.

> 2) Your MPI does not fully support threads. You could try to:
>
> 2.a) Compile a simple C program calling MPI_Init_thread() asking for
> MPI_THREAD_MULTIPLE, and see what happens.
>
> 2.b) Add these two lines at the VERY beginning of demo/helloworld.py,
> BEFORE the line "from mpi4py import MPI":
>
> import mpi4py.rc
> mpi4py.rc.threaded = False
>

Same error

> 3) Perhaps there is a problem with shared libraries... Before "from
> mpi4py import MPI", could you use ctypes to load the MPICH2 shared
> library using RTLD_GLOBAL? You should add a line like
>
> import ctypes
> ctypes.CDLL("/full/path/to/libmpich.so", ctypes.RTLD_GLOBAL)
>

Same error.

I then recompiled python using openmpi and re-installed mpi4py. Now
helloworld.py works fine, at least on 16 and 128 processors.

I also noticed that I could not compile python with the full mvapich2
mpicc. I had to switch to gcc, build python, and then install mpi4py
with --mpicc=mpicc. Maybe it is a linker/shared lib issue with
something besides libmpich.so. For now I think I'll just stick with
openmpi and see if the support staff will experiment with
mvapich2+mpi4py. Thank you for your help.
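
In case it helps, the mpi4py build step I used was something like (paths
and flag values will vary on other systems):

$ python setup.py build --mpicc=mpicc
$ python setup.py install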

Chris

Lisandro Dalcin

Sep 16, 2009, 11:42:26 PM
to mpi...@googlegroups.com
Could you try a little experiment? Here I'm assuming you have
mvapich2's mpicc on your $PATH.

mpi4py has some support to build an MPI-enabled Python interpreter;
details here: http://mpi4py.scipy.org/docs/usrman/appendix.html#mpi-enabled-python-interpreter

Basically, you have to go to the mpi4py top-level dir and do

$ python setup.py install_exe

This will install a pythonX.X-mpi binary in <prefix>/bin, alongside the
"standard" Python interpreter...

Could you then try to run helloworld.py using the pythonX.X-mpi binary? If
this does work, then my bet is that your original problem is related to
shared libs and the way Python loads extension modules, i.e., using
dlopen() with RTLD_LOCAL. In that case, unfortunately, I'll not be
able to fix this issue without temporary access to a machine with a
working mvapich2, hope you understand...
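
For example, something like this should exercise it (the exact launcher
and process count depend on your system; on your build the interpreter
should be named python2.6-mpi):

$ mpiexec -np 128 python2.6-mpi demo/helloworld.py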

Chris Kees

Sep 17, 2009, 2:29:34 PM
to mpi...@googlegroups.com
On Sep 16, 2009, at 10:42 PM, Lisandro Dalcin wrote:

>
> Could you try a little experiment? Here I'm assuming you have
> mvapich2's mpicc on your $PATH.
>
> mpi4py has some support to build an MPI-enabled Python interpreter;
> details here: http://mpi4py.scipy.org/docs/usrman/appendix.html#mpi-enabled-python-interpreter
>
> Basically, you have to go to the mpi4py top-level dir and do
>
> $ python setup.py install_exe
>
> This will install a pythonX.X-mpi binary in <prefix>/bin, alongside the
> "standard" Python interpreter...
>

Success!!! I ran the 128-core test with python2.6-mpi this morning. I
will continue to test it and post again if I run into any more
problems.

According to the support staff, openmpi is fine on that machine until
you get over 1000 cores, so it's good to at least have the option
of running both mvapich2 and openmpi. Thanks again.

Chris

Lisandro Dalcin

Sep 17, 2009, 3:33:25 PM
to mpi...@googlegroups.com
On Thu, Sep 17, 2009 at 3:29 PM, Chris Kees <cek...@gmail.com> wrote:
>
> On Sep 16, 2009, at 10:42 PM, Lisandro Dalcin wrote:
>
>>
>> mpi4py has some support to build an MPI-enabled Python interpreter;
>> details here: http://mpi4py.scipy.org/docs/usrman/appendix.html#mpi-enabled-python-interpreter
>>
>> Basically, you have to go to the mpi4py top-level dir and do
>>
>> $ python setup.py install_exe
>>
>> This will install a pythonX.X-mpi binary in <prefix>/bin, alongside the
>> "standard" Python interpreter...
>>
>
> Success!!! I ran the 128-core test with python2.6-mpi this morning. I
> will continue to test it and post again if I run into any more
> problems.
>

Then the issue is likely related to one of the situations below:

1) Your MPI implementation DOES REQUIRE the actual command line arguments
at MPI_Init(), i.e., you CANNOT initialize MPI with just MPI_Init(0,0)
(as mpi4py does, for obvious reasons). Could you try to run a tiny C
program initializing MPI with MPI_Init(0,0) or MPI_Init(NULL,NULL)?
(A minimal sketch follows below, after 2.)

2) There is some dynamic linking and shared lib issue related to the
way Python dlopen()s extension modules. Investigating this issue is
not so easy; it would involve playing with ctypes, opening all the
libs associated with mvapich2 ("mpicc -show" will tell you what libs
are required for linking with mvapich) using RTLD_GLOBAL.
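
A minimal sketch of the test program for 1) (the file name and build
line are just an example):

/* init_null_test.c -- initialize MPI without the real command line
 * arguments, mimicking what mpi4py does internally.
 * Build with: mpicc init_null_test.c -o init_null_test */
#include <stdio.h>
#include <mpi.h>

int main(void)
{
    int rank, size;
    MPI_Init(NULL, NULL);               /* same effect as MPI_Init(0, 0) */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}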

Anyway, good to know that using the python2.6-mpi binary works for you...

hamideh.j...@gmail.com

Sep 7, 2013, 12:50:21 PM
to mpi...@googlegroups.com
Hi Sir,
I saw that you solved the following problem:

 Fatal error in MPI_Init_thread, mvapich2 on ranger.

Please help me solve my problem.

I installed openSUSE 12.3 Linux, then I installed mpich2 (configure and make install mpich2), then I ran the following example (the cpi example):

abbas@127001:~/Documents/mpich2-hginit/build/examples> mpiexec -np 1 ./cpi

and I get the following error:

[0] sctplike open
IPC control connect: Connection refused
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(294)........: Initialization failed
MPID_Init(94)................: channel initialization failed
MPIDI_CH3_Init(83)...........:
MPIDI_CH3I_Progress_init(303):
MPIDU_Sctp_init(1485)........: [unset]: aborting job:
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(294)........: Initialization failed
MPID_Init(94)................: channel initialization failed
MPIDI_CH3_Init(83)...........:
MPIDI_CH3I_Progress_init(303):
MPIDU_Sctp_init(1485)........:
--------------------------------------------------------------------------
mpiexec noticed that the job aborted, but has no info as to the process
that caused that situation.

Please help me with my project.

Lisandro Dalcin

Sep 8, 2013, 5:17:21 AM
to mpi4py
Is your code using mpi4py? Where does "cpi" come from?


--
Lisandro Dalcin
---------------
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169