Re: MPI_ERR_OTHER: known error not in list

Imran Ali

Apr 29, 2014, 1:16:00 PM
to mpi...@googlegroups.com
On 2014-04-29 18:58, Imran Ali wrote:
> I have recently posted about this error message at both h5py forums
> (https://github.com/h5py/h5py/issues/434) and hdf forums
> (http://mail.lists.hdfgroup.org/pipermail/hdf-forum_lists.hdfgroup.org/2014-April/007751.html).
> I have come to the conclusion that there is an issue with my mpi4py
> install. I have run the following program
> (http://chrisjbutler.wordpress.com/2010/09/23/mpi4py-parallel-io-example/)
> with the code below in the same file
>
> if __name__=="__main__":
>     comm = MPI.COMM_WORLD
>     Particle_parallel("file.mpi",comm)
>
> where I created file.mpi (not that the format matters as the code
> crashes at opening the file) :
>
> 3
> 5
> 0.0 1.1 2.0 2.3 2.9
> 0.0 0.1 2.2 3.4 3.8
> 0.0 0.5 3.4 4.5 4.6
>
> Running for one core only, I get the following error message :
>
> mpirun -np 1 python mpi4py_read.py
> Traceback (most recent call last):
>   File "mpi4py_read.py", line 89, in <module>
>     Particle_parallel("file.mpi",comm)
>   File "mpi4py_read.py", line 23, in __init__
>     self.mpi_file = MPI.File.Open(self.comm, file_name)
>   File "File.pyx", line 67, in mpi4py.MPI.File.Open (src/mpi4py.MPI.c:89639)
> mpi4py.MPI.Exception: MPI_ERR_OTHER: known error not in list
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
>
> Hence there appears to be a problem with my mpi4py install. What is
> the problem, and how can I go about fixing it?
>
> With kind regards,
> Imran

I forgot to mention that I installed mpi4py 1.3.1 through dorsal
(https://bitbucket.org/fenics-project/dorsal) on Red Hat Enterprise 4.4
without root access. The install was built with python2.7.

Imran

Imran Ali

Apr 29, 2014, 12:58:44 PM
to mpi...@googlegroups.com

Lisandro Dalcin

Apr 30, 2014, 4:36:44 AM
to mpi4py
Are you using Open MPI as a backend? Have you tried to write a small C
program that just opens the file? I don't think this is an mpi4py
issue, but rather a problem with your backend MPI.



--
Lisandro Dalcin
---------------
CIMEC (UNL/CONICET)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1016)
Tel/Fax: +54-342-4511169

imr...@math.uio.no

Apr 30, 2014, 5:46:05 AM
to mpi...@googlegroups.com
I checked the dorsal_configure file and I think OpenMPI is set as backend.

I also ran the following program:

http://beige.ucs.indiana.edu/I590/node88.html

and got no errors.

Here is how I ran the program :
mpicc -DDEBUG -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -o mkrandpfile mkrandpfile.c
mpiexec -n 8 mkrandpfile -f test -l 20

Imran

Lisandro Dalcin

Apr 30, 2014, 6:05:43 AM
to mpi4py
>
> I checked the dorsal_configure file and I think OpenMPI is set as backend.
>
> I also ran the following program :
>
> http://beige.ucs.indiana.edu/I590/node88.html
>
> and got no errors.
>
> Here is how I ran the program :
>
> mpicc -DDEBUG -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -o mkrandpfile
> mkrandpfile.c
>
> mpiexec -n 8 mkrandpfile -f test -l 20
>

Well, this code creates and writes a new file, rather than opening an
existing one for reading.

Can you try to pass the full path of "file.mpi" to File.Open() ?

What Open MPI version are you using? You can call MPI.get_vendor() to
find out.

Lisandro Dalcin

Apr 30, 2014, 6:07:07 AM
to mpi4py
On 30 April 2014 13:05, Lisandro Dalcin <dal...@gmail.com> wrote:
>>
>> I checked the dorsal_configure file and I think OpenMPI is set as backend.
>>
>> I also ran the following program :
>>
>> http://beige.ucs.indiana.edu/I590/node88.html
>>
>> and got no errors.
>>
>> Here is how I ran the program :
>>
>> mpicc -DDEBUG -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -o mkrandpfile
>> mkrandpfile.c
>>
>> mpiexec -n 8 mkrandpfile -f test -l 20
>>
>
> Well, this code creates and writes a new file, rather than opening an
> existing one for reading.
>
> Can you try to pass the full path of "file.mpi" to File.Open() ?
>
> What Open MPI version are you using? You can call MPI.get_vendor() to
> find out.
>

Also, does a plain Python open("file.mpi").close() work?
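
The checks suggested so far (full path, backend version, plain Python open) can be bundled into one script. This is only a minimal sketch: `diagnose` and the keys of the returned dict are hypothetical names, and it assumes the mpi4py calls already used in this thread (`MPI.get_vendor()`, `MPI.File.Open`) plus `MPI.Exception.Get_error_class()` and `Get_error_string()` for a bit more detail than the bare message.

```python
import os


def diagnose(path):
    """Run the checks from this thread against `path`; return a report dict.

    `diagnose` and the report keys are illustrative names, not mpi4py API.
    """
    report = {"abs_path": os.path.abspath(path)}

    # Check 1: can plain Python open the file at all?
    try:
        open(report["abs_path"]).close()
        report["plain_open"] = True
    except IOError as exc:  # same class as OSError on Python 3
        report["plain_open"] = False
        report["plain_error"] = str(exc)

    # The remaining checks need mpi4py; record its absence instead of crashing.
    try:
        from mpi4py import MPI
    except ImportError:
        report["vendor"] = None
        return report

    # Check 2: which MPI implementation and version is mpi4py linked against?
    report["vendor"] = MPI.get_vendor()

    # Check 3: open the file through MPI-IO, read-only, using the full path.
    try:
        fh = MPI.File.Open(MPI.COMM_WORLD, report["abs_path"], MPI.MODE_RDONLY)
        fh.Close()
        report["mpi_open"] = True
    except MPI.Exception as exc:
        report["mpi_open"] = False
        report["mpi_error"] = (exc.Get_error_class(), exc.Get_error_string())
    return report
```

Calling `diagnose("file.mpi")` under `mpirun -np 1 python ...` and posting the returned dict would answer all three questions in one go.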

imr...@math.uio.no

Apr 30, 2014, 7:03:32 AM
to mpi...@googlegroups.com


On Wednesday, 30 April 2014 at 12:07:07 UTC+2, Lisandro Dalcin wrote:
On 30 April 2014 13:05, Lisandro Dalcin <dal...@gmail.com> wrote:
>>
>> I checked the dorsal_configure file and I think OpenMPI is set as backend.
>>
>>  I also ran the following program :
>>
>> http://beige.ucs.indiana.edu/I590/node88.html
>>
>> and got no errors.
>>
>> Here is how I ran the program :
>>
>> mpicc -DDEBUG -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -o mkrandpfile
>> mkrandpfile.c
>>
>> mpiexec -n 8 mkrandpfile -f test -l 20
>>
>
> Well, this code creates and writes a new file, rather than opening an
> existing one for reading.

I tried to open file.mpi with the read-only flag:

MPI_File_open(MPI_COMM_WORLD, "file.mpi", MPI_MODE_RDONLY, MPI_INFO_NULL, &thefile);

and got no errors. However, when I then tried to write over the file, nothing happened either; the file remained unchanged.

>
> Can you try to pass the full path of "file.mpi" to File.Open() ?

I got the same error message after giving the full path to file.mpi :

MPI_ERR_OTHER: known error not in list
 
>
> What Open MPI version are you using? You can call MPI.get_vendor() to
> figure out.
>

It reports Open MPI 1.5.4.

> Also, does a plain Python open("file.mpi").close() work?

Yes, it works.

Lisandro Dalcin

Apr 30, 2014, 7:38:14 AM
to mpi4py
I'm running out of ideas. Please try this: at the very beginning of
your main script, add the following lines:

from mpi4py import rc
rc.threaded = False
from mpi4py import MPI


Please note you have to set the flag BEFORE you "from mpi4py import
MPI" the first time.
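
To confirm that the flag took effect, one can ask MPI which thread level it was actually initialized with. A small sketch, assuming the standard mpi4py names `MPI.Query_thread()` and `MPI.THREAD_SINGLE` through `MPI.THREAD_MULTIPLE`; `describe_thread_support` is a hypothetical helper, and it takes the module as an argument so the import order stays under the caller's control.

```python
def describe_thread_support(mpi):
    """Return a readable name for the thread level `mpi` was initialized with.

    `mpi` is the already-imported mpi4py.MPI module (hypothetical helper).
    """
    levels = {
        mpi.THREAD_SINGLE: "single",
        mpi.THREAD_FUNNELED: "funneled",
        mpi.THREAD_SERIALIZED: "serialized",
        mpi.THREAD_MULTIPLE: "multiple",
    }
    # Query_thread() reports the level provided at initialization time.
    return levels.get(mpi.Query_thread(), "unknown")
```

Used right after the three lines above, `print(describe_thread_support(MPI))` shows the level in effect; comparing the output with and without `rc.threaded = False` tells you whether the setting was picked up before the first import.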

PS: Any chance you could upgrade your OpenMPI to something more
recent, like the last 1.6.x release? I guess something is broken in
your install, but I cannot figure out what's the issue. Do you by
chance have any other Open MPI installed in your system, e.g. from
your Linux distribution?

Imran Ali

Apr 30, 2014, 8:04:57 AM
to mpi...@googlegroups.com
Unfortunately, I still got the same error message.


> PS: Any chance you could upgrade your OpenMPI to something more
> recent, like the last 1.6.x release? I guess something is broken in
> your install, but I cannot figure out what's the issue. Do you by
> chance have any other Open MPI installed in your system, e.g. from
> your Linux distribution?

I typed mpiexec --version (and mpirun --version), and it appears that I have OpenRTE (Open MPI) 1.6.2.
Does this mean that my mpi4py install was built against an older Open MPI?
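
A mismatch like this (the launcher reporting 1.6.2 while `MPI.get_vendor()` reports 1.5.4) can be checked mechanically. The sketch below is pure string handling; `parse_launcher_version` and `launcher_matches_library` are hypothetical helpers, and the only assumption about the launcher output is that the first dotted number in it is the version.

```python
import re


def parse_launcher_version(text):
    """Extract the first dotted version number from `mpiexec --version` output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def launcher_matches_library(vendor_info, launcher_text):
    """Compare the MPI.get_vendor() result with the launcher's reported version.

    vendor_info has the shape returned by MPI.get_vendor(),
    for example ('Open MPI', (1, 5, 4)).
    """
    _name, linked_version = vendor_info
    launcher_version = parse_launcher_version(launcher_text)
    return launcher_version is not None and tuple(linked_version) == launcher_version
```

With the values from this thread, `launcher_matches_library(("Open MPI", (1, 5, 4)), "mpiexec (OpenRTE) 1.6.2")` is False, which suggests that mpi4py was linked against a different Open MPI than the one providing `mpiexec`.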



-- 
You received this message because you are subscribed to a topic in the Google Groups "mpi4py" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/mpi4py/RJwl9gSgbYs/unsubscribe.
To unsubscribe from this group and all its topics, send an email to mpi4py+un...@googlegroups.com.
To post to this group, send email to mpi...@googlegroups.com.
Visit this group at http://groups.google.com/group/mpi4py.
To view this discussion on the web visit https://groups.google.com/d/msgid/mpi4py/CAEcYPwDaQz%2BQ7%3DUSwhV0r-fvtAhPJsmbqA1GjkBnQhxDmPqs2g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Imran Ali

Apr 30, 2014, 8:50:00 AM
to mpi...@googlegroups.com, dal...@gmail.com
I reinstalled mpi4py manually, this time specifying the correct mpicc. As verification, MPI.get_vendor() now reports the newest Open MPI version.

However, the error still occurs when opening any file.






Imran Ali

Apr 30, 2014, 8:55:53 AM
to mpi...@googlegroups.com, dal...@gmail.com
Since I do not have root access, I installed mpi4py locally, using the --prefix argument to point at the folder that holds my python2.7 site-packages. The install tutorial (http://mpi4py.scipy.org/docs/usrman/install.html#installing) says that users without root access are supposed to use --user. Does this matter? I couldn't pass both --prefix and --user, so I used --prefix.

Lisandro Dalcin

Apr 30, 2014, 9:04:37 AM
to Imran Ali, mpi4py
No, it does not matter. Using --user is just the conventional and
straightforward way.

Now, please check the following.

Run

$ mpicc -show

and note the "-L" flag pointing to openmpi's library dir

Next run

$ ldd /path/to/site-packages/mpi4py/MPI.so

and double check that the openmpi libraries are found and point to the
same path as mpicc told you.

Finally, you may need to

$ export LD_LIBRARY_PATH=/path/to/openmpi-1.6.5/lib  # i.e. the path from mpicc's -L flag
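
The comparison between `mpicc -show` and `ldd` can also be scripted once both outputs have been captured as text. A rough sketch: the three function names are hypothetical, the parsing assumes the usual `name => /path (address)` layout of `ldd` lines, and only libraries whose name starts with `libmpi` are considered.

```python
import posixpath
import shlex


def lib_dirs_from_mpicc(show_output):
    """Collect the -L directories from the output of `mpicc -show`."""
    tokens = shlex.split(show_output)
    dirs = []
    for i, tok in enumerate(tokens):
        if tok == "-L" and i + 1 < len(tokens):
            dirs.append(tokens[i + 1])      # form: -L /some/dir
        elif tok.startswith("-L") and len(tok) > 2:
            dirs.append(tok[2:])            # form: -L/some/dir
    return [posixpath.normpath(d) for d in dirs]


def mpi_libs_from_ldd(ldd_output):
    """Map each libmpi* entry in `ldd` output to its resolved path (None if not found)."""
    libs = {}
    for line in ldd_output.splitlines():
        name, sep, rest = line.strip().partition(" => ")
        if not sep or not name.startswith("libmpi"):
            continue
        target = rest.split(" (")[0].strip()
        libs[name] = None if target == "not found" else target
    return libs


def mismatched_libs(show_output, ldd_output):
    """Return the MPI libraries that are missing or resolve outside mpicc's -L dirs."""
    wanted = set(lib_dirs_from_mpicc(show_output))
    return {
        name: path
        for name, path in mpi_libs_from_ldd(ldd_output).items()
        if path is None or posixpath.normpath(posixpath.dirname(path)) not in wanted
    }
```

An empty dict from `mismatched_libs` means MPI.so picks up the same Open MPI that mpicc links against; any entry in it points at a second installation or a missing LD_LIBRARY_PATH.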

Imran Ali

Apr 30, 2014, 9:53:00 AM
to Lisandro Dalcin, mpi4py
libmpi.so.1 resolves to the same directory as the one given by mpicc's -L flag.

> Finally, you may need to
>
> $ export LD_LIBRARY_PATH=/path/to/openmpi-1.6.5/lib  # i.e. the path from mpicc's -L flag


I added this in my .bashrc file.

Unfortunately, I still get the same error message when I try to open any file:

import mpi4py.MPI as MPI
comm = MPI.COMM_WORLD
MPI.File.Open(comm, filename)  # with and without the full path

Exception: MPI_ERR_OTHER: known error not in list

Imran Ali

Apr 30, 2014, 9:56:39 AM
to mpi...@googlegroups.com, Lisandro Dalcin
Any recommendations on what I should do next? I have no issues opening files in C, and only experience this error with Python applications.

Lisandro Dalcin

Apr 30, 2014, 11:39:13 AM
to Imran Ali, mpi4py
On 30 April 2014 16:56, Imran Ali <imr...@student.matnat.uio.no> wrote:
> Any recommendations what I should do next ? I have no issues with opening
> files in C, and only experience this error with python applications.

Well, I've run out of ideas. I think your only chance is to use a
debugger to try to figure out where the error is triggered. Also, you
can try to run under valgrind control, that may help to spot problems
inside the MPI_File_open() calls.

Sorry I cannot help you more, but I simply cannot reproduce the issue
using Open MPI 1.6.5, which is a similar (though newer) version:


$ echo hello > tmp.txt
$ ipython --no-banner

In [1]: from mpi4py import MPI

In [2]: from array import array

In [3]: fh = MPI.File.Open(MPI.COMM_WORLD, "tmp.txt")

In [4]: buf = array("c",'\0'*10)

In [5]: fh.Read(buf)

In [6]: print buf
array('c', 'hello\n\x00\x00\x00\x00')

In [7]: MPI.get_vendor()
Out[7]: ('Open MPI', (1, 6, 5))

Imran Ali

Apr 30, 2014, 12:50:07 PM
to mpi4py
On 2014-04-30 17:39, Lisandro Dalcin wrote:
> On 30 April 2014 16:56, Imran Ali <imr...@student.matnat.uio.no>
> wrote:
>> Any recommendations what I should do next ? I have no issues with
>> opening
>> files in C, and only experience this error with python applications.
>
> Well, I've run out of ideas. I think your only chance is to use a
> debugger to try to figure out where the error is triggered. Also, you
> can try to run under valgrind control, that may help to spot problems
> inside the MPI_File_open() calls.
>
> Sorry I cannot help you more, but I simply cannot reproduce the issue
> using Open MPI 1.6.5, which is a similar (though newer) version:
>
>
> $ echo hello > tmp.txt
> $ ipython --no-banner
>
> In [1]: from mpi4py import MPI
>
> In [2]: from array import array
>
> In [3]: fh = MPI.File.Open(MPI.COMM_WORLD, "tmp.txt")
>
> In [4]: buf = array("c",'\0'*10)
>
> In [5]: fh.Read(buf)
>
> In [6]: print buf
> array('c', 'hello\n\x00\x00\x00\x00')
>
> In [7]: MPI.get_vendor()
> Out[7]: ('Open MPI', (1, 6, 5))

I ran a python script with valgrind. However, I do not understand the
output or where the problem lies. I have attached the output here.
output.out

Imran Ali

May 10, 2014, 10:09:32 AM
to mpi...@googlegroups.com
I managed to resolve this issue by installing Open MPI 1.8.1. Apparently, Open MPI 1.6 is not thread safe.

The advice was given on the Open MPI users forum.

This issue is hence resolved.
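
The thread ends with an upgrade that made the error go away, not with a confirmed root cause, so the most a script can do is flag Open MPI versions older than the one that worked here. A small sketch under that assumption: `older_than_known_good` and its `known_good` default are hypothetical, drawn only from this thread (1.8.1 worked for the original poster, while 1.6.5 also worked for Lisandro), so a True result is a hint rather than a diagnosis.

```python
def older_than_known_good(vendor_info, known_good=(1, 8, 1)):
    """Flag an Open MPI older than the version that fixed the issue in this thread.

    vendor_info has the shape returned by MPI.get_vendor(),
    for example ('Open MPI', (1, 6, 2)). Other MPI implementations are never flagged.
    """
    name, version = vendor_info
    return name == "Open MPI" and tuple(version) < tuple(known_good)
```

For example, `older_than_known_good(MPI.get_vendor())` could be printed at startup as a reminder to try a newer Open MPI before digging further.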
