from mpi4py import MPI triggers Segmentation Fault


timoth...@gmail.com

Nov 22, 2017, 4:20:28 AM
to mpi4py
Hello.

I'm trying to build mpi4py 3.0.0 under a local account on a supercomputer (I only have privileges to work in my home directory). Attempting to run the test file in the examples directory using
mpirun -n 5 python helloworld.py
results in a segmentation fault. It seems to be the line `from mpi4py import MPI` that triggers it. The problem does not occur if I compile the C version with `mpicc helloworld.c` and run the output instead.
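(For reference, the C check was simply something like:

mpicc helloworld.c -o helloworld
mpirun -n 5 ./helloworld

and that runs cleanly.)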

I have had some issues setting up Open MPI on the server due to the way it handles processes, but I am not sure whether that is what is causing the current issue.

Following https://mpi4py.readthedocs.io/en/stable/install.html, here are the steps I took to install from source (an alternative build command is noted after the list):
1. Download mpi4py-3.0.0.tar.gz
2. tar xvf mpi4py-3.0.0.tar.gz
3. mv mpi4py-3.0.0 mpi4py-3.0.0_src
4. cd mpi4py-3.0.0_src

5. Set in mpi.cfg:
mpi_dir              = /home/FIa/FIa164/programs/openmpi/openmpi-3.0.0
mpicc                = %(mpi_dir)s/bin/mpicc
mpicxx               = %(mpi_dir)s/bin/mpicxx
library_dirs         = %(mpi_dir)s/lib
runtime_library_dirs = %(library_dirs)s

6. source activate su2 (python 2.7 environment)
7. python setup.py build
8. python setup.py install
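
(For what it's worth, the install docs also allow passing the compiler wrapper directly at build time instead of editing mpi.cfg; if I've read them correctly, the equivalent would be:

python setup.py build --mpicc=/home/FIa/FIa164/programs/openmpi/openmpi-3.0.0/bin/mpicc
python setup.py install

I went with the mpi.cfg route above.)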

Do you have any ideas what might be triggering this? 
Many thanks.

Lisandro Dalcin

Nov 22, 2017, 3:49:27 PM
to mpi4py
On 22 November 2017 at 11:49, <timoth...@gmail.com> wrote:
>
> Do you have any ideas what might be triggering this?
> Many thanks.
>

What Open MPI version are you using?

Any chance you can run under valgrind?

mpiexec -n 2 valgrind python helloworld.py
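
Note that CPython's small-object allocator tends to flood valgrind with harmless reports. If the log is too noisy, you can try the suppressions file shipped in the CPython sources (Misc/valgrind-python.supp):

mpiexec -n 2 valgrind --suppressions=valgrind-python.supp python helloworld.py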


--
Lisandro Dalcin
============
Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa

Office Phone: +966 12 808-0459

timoth...@gmail.com

Nov 23, 2017, 7:55:51 PM
to mpi4py
I'm running Open MPI 3.0.0. I will try to build Valgrind and report back.

timoth...@gmail.com

Nov 27, 2017, 12:34:31 AM
to mpi4py
Please see attached for the results of the Valgrind run. There was quite a lot of output regarding some kind of memory issue. Does this suggest a different problem?
val_out.txt

Lisandro Dalcin

Nov 27, 2017, 2:50:02 AM
to mpi4py
Did you notice these lines in the output?

Traceback (most recent call last):
  File "helloworld.py", line 6, in <module>
    from mpi4py import MPI
ModuleNotFoundError: No module named 'mpi4py'
Traceback (most recent call last):
  File "helloworld.py", line 6, in <module>
    from mpi4py import MPI
ModuleNotFoundError: No module named 'mpi4py'

timoth...@gmail.com

Nov 27, 2017, 7:32:33 AM
to mpi4py
Apologies, in my haste to try valgrind, I'd forgotten to activate my Python environment (I had installed into a Python 2.7 environment, not the system 3.6, hence the missing mpi4py in that last log). The segmentation fault now occurs, as before. Please see the attached log. Many thanks.
val_out2.txt

Lisandro Dalcin

Nov 27, 2017, 9:58:19 AM
to mpi4py
Are you sure you are using Open MPI? The line with MPI_SGI_misc_init
is suspicious. Can you run "ldd /path/to/site-packages/mpi4py/MPI.so"
to double check the linked libraries?


Maybe the MPI thread support is broken. Try the following:

At the VERY beginning of the script you execute (helloworld.py), add
the following lines:

import mpi4py
mpi4py.rc.threads = False

and then try again.
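
In context, a minimal sketch of the script would look like this (the MPI calls below are just the usual helloworld ones; adjust to match your copy):

import mpi4py
mpi4py.rc.threads = False  # initialize MPI without thread support

from mpi4py import MPI

comm = MPI.COMM_WORLD
print("Hello, World! I am process %d of %d on %s."
      % (comm.Get_rank(), comm.Get_size(), MPI.Get_processor_name()))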

PS: I think you have a mess in your build environment. Uninstall
mpi4py, clean the pip cache ("rm -r ~/.cache/pip") and start fresh.

timoth...@gmail.com

Nov 28, 2017, 1:36:46 AM
to mpi4py
(su2) FIa164@afispb07:~/anaconda3/envs/su2/lib/python2.7/site-packages/mpi4py$ ls
bench.py   dl.so    include       __init__.py   libmpi.pxd  __main__.py   mpi.cfg  MPI.so  run.pyc
bench.pyc  futures  __init__.pxd  __init__.pyc  lib-pmpi    __main__.pyc  MPI.pxd  run.py
(su2) FIa164@afispb07:~/anaconda3/envs/su2/lib/python2.7/site-packages/mpi4py$ pwd
/home/FIa/FIa164/anaconda3/envs/su2/lib/python2.7/site-packages/mpi4py
(su2) FIa164@afispb07:~/anaconda3/envs/su2/lib/python2.7/site-packages/mpi4py$ ldd MPI.so
linux-vdso.so.1 =>  (0x00007ffff7ffe000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007ffff7a3c000)
libpython2.7.so.1.0 => /home/FIa/FIa164/anaconda3/envs/su2/lib/libpython2.7.so.1.0 (0x00007ffff765f000)
libmpi.so.40 => /home/FIa/FIa164/programs/openmpi/openmpi-3.0.0/lib/libmpi.so.40 (0x00007ffff7344000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffff7127000)
libc.so.6 => /lib64/libc.so.6 (0x00007ffff6dab000)
/lib64/ld-linux-x86-64.so.2 (0x0000555555554000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007ffff6ba7000)
libm.so.6 => /lib64/libm.so.6 (0x00007ffff692e000)
libopen-rte.so.40 => /home/FIa/FIa164/programs/openmpi/openmpi-3.0.0/lib/libopen-rte.so.40 (0x00007ffff667b000)
libopen-pal.so.40 => /home/FIa/FIa164/programs/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40 (0x00007ffff6371000)
libpciaccess.so.0 => /usr/lib64/libpciaccess.so.0 (0x00007ffff6168000)
librt.so.1 => /lib64/librt.so.1 (0x00007ffff5f5f000)
libz.so.1 => /home/FIa/FIa164/programs/zlib/zlib-1.2.11/lib/libz.so.1 (0x00007ffff5d40000)



That was the output from ldd. Yes, it is possible that the wrong MPI is getting pulled in; unfortunately, the supercomputer has many old, outdated MPI installations that I would like to ignore if possible. Are there any environment variables I should watch out for and clear before calling mpi4py?

I"ll give the threadless run a try and get back to you.

timoth...@gmail.com

Nov 28, 2017, 8:43:48 AM
to mpi4py
After adding the lines you suggested to the top of the .py file, the output still gives a segmentation fault:
(su2) FIa164@afispb07:~/programs/mpi4py/mpi4py-3.0.0_src/demo$ mpirun -n 5 python helloworld.py
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node afispb07 exited on signal 11 (Segmentation fault).

On running valgrind, I get the same error, with SGI popping up again. I will try clearing the cache and rebuilding mpi4py, but to be honest, my account was only created recently; I haven't built any MPI implementations myself other than the ones that were already on the system.

In mpi.cfg, are there any other parameters I can set so that it only picks up the implementation I wish to use? Are there any system variables I should unset? Many thanks.

Lisandro Dalcin

Nov 28, 2017, 10:11:46 AM
to mpi4py
Maybe at execution time there is an LD_LIBRARY_PATH variable set that
changes the dynamic MPI library mpi4py is linked with? At this point,
I think you have to contact the support staff for help.
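
Something like the following may help narrow it down (assuming your cluster uses Environment Modules or Lmod; adjust the paths to your own Open MPI build):

echo $LD_LIBRARY_PATH   # look for directories of other MPI installs
module purge            # drop any modules inherited from the login environment
export LD_LIBRARY_PATH=$HOME/programs/openmpi/openmpi-3.0.0/lib
ldd ~/anaconda3/envs/su2/lib/python2.7/site-packages/mpi4py/MPI.so | grep mpi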

Tim Jim

Nov 28, 2017, 10:18:21 AM
to mpi...@googlegroups.com
Ok, understood. Thanks for the help so far! I'll post up the solution if we find it. Kind regards.



Tim Jim

Nov 29, 2017, 12:47:45 AM
to mpi...@googlegroups.com
The problem is fixed. As you suspected, the system SGI MPI implementation had gotten linked in somehow. Removing all loaded modules and clearing as many environment variables as possible before doing a clean install of mpi4py did the trick. Many thanks for your thoughts!
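
For anyone who hits this later, the rough sequence that worked for me (module names and exact paths will vary from system to system):

module purge                  # drop all preloaded system modules, including the SGI MPI
unset LD_LIBRARY_PATH         # then re-export only what is needed
export PATH=$HOME/programs/openmpi/openmpi-3.0.0/bin:$PATH
export LD_LIBRARY_PATH=$HOME/programs/openmpi/openmpi-3.0.0/lib
pip uninstall mpi4py          # or remove site-packages/mpi4py by hand
rm -r ~/.cache/pip
cd ~/programs/mpi4py/mpi4py-3.0.0_src
python setup.py build
python setup.py install
mpirun -n 5 python demo/helloworld.py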
--

Timothy Jim
PhD Researcher in Aerospace

Creative Flow Research Division,
Institute of Fluid Science, Tohoku University

www.linkedin.com/in/timjim/
