problem with mpi4py

mg

Jul 13, 2009, 5:22:58 AM
to mpi4py
Hello,
I have tried to build and install mpi4py (version 1.1.0) on a Rocks
4.3 cluster frontend, and when trying to run the demo it does not work
well (see below), although I could not say it does not work at all. I
have tried with different versions of mpi.cfg, as Open MPI and MPICH
are both installed, with no better result. The build looks fine. I have
no idea where to go next... I'll take any suggestion!
Marc


$ mpirun -mca btl self,tcp -np 5 --nolocal --hostfile hosts.txt python2.4 demo/helloworld.py
Hello, World! I am process 1 of 5 on compute-0-1.local.
Hello, World! I am process 0 of 5 on compute-0-0.local.
Hello, World! I am process 3 of 5 on compute-0-4.local.
[compute-0-1.local:04708] *** An error occurred in MPI_Errhandler_free
[compute-0-0.local:08909] *** An error occurred in MPI_Errhandler_free
[compute-0-1.local:04708] *** on communicator MPI_COMM_WORLD
[compute-0-1.local:04708] *** MPI_ERR_ARG: invalid argument of some
other kind
[compute-0-1.local:04708] *** MPI_ERRORS_ARE_FATAL (goodbye)
[compute-0-4.local:31197] *** An error occurred in MPI_Errhandler_free
[compute-0-4.local:31197] *** on communicator MPI_COMM_WORLD
[compute-0-4.local:31197] *** MPI_ERR_ARG: invalid argument of some
other kind
[compute-0-4.local:31197] *** MPI_ERRORS_ARE_FATAL (goodbye)
[compute-0-0.local:08909] *** on communicator MPI_COMM_WORLD
[compute-0-0.local:08909] *** MPI_ERR_ARG: invalid argument of some
other kind
[compute-0-0.local:08909] *** MPI_ERRORS_ARE_FATAL (goodbye)
[merlan.im2np.fr:32504] [0,0,0]-[0,1,2] mca_oob_tcp_msg_recv: readv
failed with errno=104
[merlan.im2np.fr:32504] [0,0,0]-[0,1,4] mca_oob_tcp_msg_recv: readv
failed with errno=104
1 additional process aborted (not shown)


Lisandro Dalcin

Jul 13, 2009, 10:51:18 AM
to mpi...@googlegroups.com
First, a comment: you have probably noticed that mpi4py manages MPI
initialization/finalization for you.

An additional thing mpi4py does at initialization is to replace the
default error handler (MPI_ERRORS_ARE_FATAL) on MPI_COMM_SELF and
MPI_COMM_WORLD; this is done in order to map MPI errors to Python
exceptions.

At finalization, mpi4py restores the original error handlers. I guess
this cleanup step is the one failing, though I'm not 100% sure.
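
To make this concrete, here is a minimal sketch (assuming a working
mpi4py build; the out-of-range rank below is just a made-up way to
trigger an error) of how an MPI error surfaces as a Python exception
instead of aborting the run:

    # Python 2 syntax, matching the python2.4 on your cluster
    from mpi4py import MPI          # importing mpi4py initializes MPI

    comm = MPI.COMM_WORLD
    try:
        # deliberately use an invalid (out-of-range) destination rank
        comm.send(None, dest=comm.Get_size())
    except MPI.Exception, err:
        print "caught MPI error:", err.Get_error_string()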

BTW, if you are using MPICH2 or Open MPI, you do not need to touch
mpi.cfg at all. Just make sure the appropriate 'mpicc' compiler
wrapper is on your $PATH (try "which mpicc" to be sure), and mpi4py
should build just fine.
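
For reference, the usual sequence is just something like the following
(the install step and paths are only an example; adapt to your setup):

    $ which mpicc        # should point at the Open MPI (or MPICH2) wrapper
    $ python2.4 setup.py build
    $ python2.4 setup.py install   # may need root, or use --prefix/--home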

Could you tell me the exact Python version and the Open MPI version
you are using? If your Open MPI is too old, perhaps it has a bug. I
could provide a workaround, but no promises :-)
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

mg

Jul 14, 2009, 4:56:38 AM
to mpi4py
Hello Lisandro,
thank you for your answer and your explanations (and, by the way, for
mpi4py!).
The version of Open MPI installed on the cluster is 1.1.1, and I use
Python 2.4.
If you feel the problem I have may come from a bug in this version of
Open MPI, I may try to install a more recent version.
However, if you have a workaround (though I understand you can't
promise anything)...
Marc

Lisandro Dalcin

Jul 14, 2009, 11:14:37 AM
to mpi...@googlegroups.com
On Tue, Jul 14, 2009 at 5:56 AM, mg <marc.ga...@gmail.com> wrote:
>
> Hello Lisandro,
> thank you for your answer and your explanations (and, by the way, for
> mpi4py!).
> The version of Open MPI installed on the cluster is 1.1.1, and I use
> Python 2.4.

That Open MPI version is REALLY ancient!!

>
> If you feel the problem I have may come from a bug in this version of
> Open MPI, I may try to install a more recent version.
>

I really recommend that you upgrade if you have time. Anyway...

>
> However, if you have a workaround (though I understand you can't
> promise anything)...
>

Basically, the MPI_Errhandler_free() routine is broken in Open MPI <
1.1.3. It generates an error when you call it with the predefined
error handlers (MPI_ERRORS_RETURN and MPI_ERRORS_ARE_FATAL), and this
should not be the case according to the MPI standard (this was
specified in the MPI-2.0 errata a long time ago, and should now be
explicitly stated in the MPI-2.1 document).
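
In mpi4py terms, the cleanup is roughly equivalent to the following
sketch (simplified; the real code lives in C inside mpi4py), and on
Open MPI < 1.1.3 the final Free() is the call that blows up:

    from mpi4py import MPI

    # what mpi4py does at initialization, roughly:
    saved = MPI.COMM_WORLD.Get_errhandler()           # save current handler
    MPI.COMM_WORLD.Set_errhandler(MPI.ERRORS_RETURN)  # errors -> exceptions
    # ... the user program runs ...
    # what mpi4py does at finalization, roughly:
    MPI.COMM_WORLD.Set_errhandler(saved)              # restore the original
    saved.Free()   # legal per MPI-2.1 even for a predefined handler,
                   # but Open MPI < 1.1.3 reports MPI_ERR_ARG here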

Indeed, I have a workaround for the specific problem you had. Just do
the following:

1) Get this SINGLE file from mpi4py's SVN repo (you do not even need
an svn client; just use wget or your browser):

$ wget http://mpi4py.googlecode.com/svn/trunk/src/compat/openmpi.h

2) Go to mpi4py's top-level source dir and replace
src/compat/openmpi.h with the file you downloaded.

3) Rebuild and install mpi4py.
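
In command form, assuming the mpi4py source tree lives in
~/mpi4py-1.1.0 (adjust the path to yours), that is roughly:

    $ cd ~/mpi4py-1.1.0
    $ wget -O src/compat/openmpi.h \
          http://mpi4py.googlecode.com/svn/trunk/src/compat/openmpi.h
    $ python2.4 setup.py build --force
    $ python2.4 setup.py install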

I warn you that Open MPI 1.1.1 still has other issues, though minor. If
you run mpi4py's test suite after installing ($ python test/runalltest.py),
many tests are expected to fail.


Please, let me know if this fix solved your issues.

Regards,

mg

Jul 14, 2009, 12:59:23 PM
to mpi4py
Thank you for your answer.
With this new version of openmpi.h, the demo script helloworld.py works
perfectly.
However, I get a lot of errors when I run test/runalltests.py, and it
gets stuck when running test_cco_buf.
The errors occur in the following tests:
test_pack
test_grequest
test_group
test_exceptions
and then test_cco_buf.
Do you think this is still related to my version of Open MPI?
Would the details be of any interest to you?
Best regards.
Marc

Lisandro Dalcin

Jul 14, 2009, 1:49:56 PM
to mpi...@googlegroups.com
On Tue, Jul 14, 2009 at 1:59 PM, mg <marc.ga...@gmail.com> wrote:
>
> Thank you for your answer.
> With this new version of openmpi.h, the demo script helloworld.py works
> perfectly.
> However, I get a lot of errors when I run test/runalltests.py, and it
> gets stuck when running test_cco_buf.
> The errors occur in the following tests:
> test_pack
> test_grequest
> test_group
> test_exceptions
> and then test_cco_buf.
> Do you think this is still related to my version of Open MPI?

Indeed. I told you some tests would fail... That's the reason I asked
you to upgrade!!

>
> Would the details be of any interest to you?
>

No, never mind... I have a more or less clear idea of where the
problems are; they are bugs, and in general I cannot work around them.

If for whatever reason you do not upgrade (despite my STRONG
recommendation to do so), just try to use mpi4py and get back to me if
you have new issues (like collective calls hanging). Again, no promises!

Regards,

mg

Jul 14, 2009, 3:03:34 PM
to mpi4py
OK, I will do as you suggest. For now I will use mpi4py with the old
Open MPI.
If I have problems, I will upgrade Open MPI, or this may give me the
momentum to do a full Rocks upgrade...
Thank you.
Best regards,
Marc