On 19 June 2017 at 13:07, 趙睿 <renyu...@gmail.com> wrote:
> Hi Lisandro, thanks for your reply.
>
> On Monday, 2017-6-19 at 8:59:43 UTC+1, Lisandro Dalcin wrote:
>>
>> On 19 June 2017 at 02:57, <renyu...@gmail.com> wrote:
>> > I run the program locally (x86_64 linux), with mpi4py 2.0.0, python
>> > 3.6.1,
>> > openmpi 1.10.6.
>>
>> Does your Open MPI build support MPI_THREAD_MULTIPLE? What's the output
>> of MPI.Query_thread()? If you are calling MPI routines in threads
>> without the proper thread support, bad things could happen. Have you
>> tried your code with an alternative MPI implementation, say MPICH?
>
>
> The output of MPI.Query_thread() is `2`.
>
Well, this means your MPI build does not support multiple threads: with
Open MPI, a return value of 2 corresponds to MPI_THREAD_SERIALIZED, not
MPI_THREAD_MULTIPLE.
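As a quick illustration (a minimal sketch, relying only on the standard
ordering of the thread-level constants), you can compare the provided
level against MPI.THREAD_MULTIPLE directly:

    # Minimal sketch: check the thread level the MPI library actually provides.
    from mpi4py import MPI

    provided = MPI.Query_thread()
    if provided < MPI.THREAD_MULTIPLE:
        # Anything below MPI_THREAD_MULTIPLE means calling MPI routines
        # concurrently from several Python threads is not safe.
        print("provided thread level is %d, not MPI.THREAD_MULTIPLE (%d)"
              % (provided, MPI.THREAD_MULTIPLE))

By default mpi4py requests MPI_THREAD_MULTIPLE at import time, but the MPI
library is free to grant a lower level if it was not built with full thread
support.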
> Not really. Actually I have tried: when using MPICH (with the same set of
> arguments as with Open MPI), the size of COMM_WORLD is always `1` for each
> process, so the program won't run properly.
> I'm not sure where it went wrong, but apparently this is related to the
> core functionality of MPICH... I just did a fresh install (without any
> modification) through the AUR (I'm using Arch Linux on my laptop).
>
Then your system is misconfigured. This sometimes happens on Ubuntu,
where the alternatives system gets mixed up. The problem is that you
are building mpi4py with MPICH, but your "mpiexec" command likely
corresponds to Open MPI. Do you have an "mpiexec.mpich" command? You
should use that one to launch Python and run MPICH+mpi4py scripts.
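As a diagnostic (just a suggestion; the script name "check_mpi.py" is made
up), a small test run once under "mpiexec.mpich" and once under plain
"mpiexec" should make any mismatch visible:

    # check_mpi.py -- hypothetical diagnostic script, run e.g. as:
    #   mpiexec.mpich -n 2 python check_mpi.py
    # and compare with plain "mpiexec -n 2 python check_mpi.py".
    import mpi4py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    if comm.rank == 0:
        # Reports whether the module is linked against MPICH or Open MPI
        # (MPI.Get_library_version() requires an MPI-3 library).
        print(MPI.Get_library_version())
        # Build-time configuration, e.g. which mpicc was used to build mpi4py.
        print(mpi4py.get_config())
    print("rank %d of %d" % (comm.rank, comm.size))

If the launcher and the library do not match, every process reports
"rank 0 of 1", which is exactly the symptom you describe.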
> (Maybe not related to this, but I think more information would help:)
> Actually, I have encountered a similar problem before, but in a different
> situation. At that time, it failed in the rank 0 process (which acts as a
> coordinator), followed by a syntax error (observed manually, not raised by
> the interpreter/runtime).
>
> I read most of the segmentation fault posts on this forum, but they don't
> seem similar to my issue. Through my web search, the only similar
> question is
>
> https://stackoverflow.com/questions/30275341/c-signal-code-address-not-mapped-1-mpirecv
> though if this is the actual cause, it should be a lower-level issue
> (maybe related to mpi4py or Open MPI itself).
>
Could you paste the exact line with the "comm.recv()" call that
produces the segfault?
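For what it's worth, a common workaround when only MPI_THREAD_SERIALIZED is
available is to make sure at most one thread is inside an MPI call at any
time, e.g. by polling a nonblocking receive under a lock. This is only a
sketch; "mpi_lock" and "guarded_recv" are made-up names, not part of your
code:

    # Sketch: serialize MPI calls across threads when the library provides
    # only MPI_THREAD_SERIALIZED (MPI.Query_thread() == MPI.THREAD_SERIALIZED).
    import threading
    import time
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    mpi_lock = threading.Lock()   # one lock guarding *all* MPI calls

    def guarded_recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG):
        # Post a nonblocking receive, then poll it while holding the lock
        # only briefly, so other threads can still issue MPI calls.
        with mpi_lock:
            request = comm.irecv(source=source, tag=tag)
        while True:
            with mpi_lock:
                done, msg = request.test()
            if done:
                return msg
            time.sleep(0.001)     # yield to other threads between polls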