Hi Lisandro,
Thanks for your reply. Please bear with me as I am a biologist by training and am scrambling to fill the gaping holes in my knowledge of Linux.
> On Dec 8, 2009, at 2:39 PM, Lisandro Dalcin wrote:
> What MPI implementation? MPICH(1) or MPICH2?
I wish I knew which implementation is active. Scyld Clusterware claims to install MPICH and OpenMPI libraries, among others. I found some OpenMPI binaries under /usr/openmpi and am trying to rebuild mpi4py with them.
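One rough way to see which MPI tools are actually on the PATH is sketched below. The list of tool names is an assumption about what is worth looking for (mpich2version ships with MPICH2, ompi_info with Open MPI), and the helper name locate_tools is made up for illustration.

```python
import shutil

# Tool names to probe; this list is an assumption, adjust as needed.
CANDIDATES = ["mpicc", "mpirun", "mpich2version", "ompi_info"]


def locate_tools(names):
    """Map each tool name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


for name, path in locate_tools(CANDIDATES).items():
    print(f"{name:15s} {path or 'not found on PATH'}")
```

If mpicc and mpirun resolve to different install trees, that mismatch alone could be worth chasing before anything else.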
> What is the hostname of the front-end node? In the first run, it seems
> to be "cluster", but in the second it seems to be
> "strong_badia.cfenet.ubc.ca" ... What's going on there?
That was my poorly-executed attempt at securing a little privacy :-(
> So you never ever saw this error before with other MPI applications?
Not with the open-source C++ application that I co-develop, nor "Hello World"-grade C code snippets, nor any other example C code that I've tried.
> The helloworld example is so simple that no communication at all is involved.
I'm guessing that tells me the problem is systemic, i.e., either the MPI implementation or the hardware is at fault.
> Could you try to add a MPI.COMM_WORLD.Barrier() at the end of the script?
>
> Could you explicitly call MPI.Finalize() at the end of the script?
OK, I tried both suggestions, but there was no change in outcome.
> OK. So this seems to be a Python-related issue. Try to make the
> modifications I commented before and come back. If your MPI is MPICH2,
> send me the output of "mpich2version" and "mpicc -show". Also try to
> run other demos to see if the error always happens at the end of the
> run.
mpicc -show:
gcc -L/usr/lib64/MPICH/p4/gnu -I/usr/include -lmpi -lbproc
No mpich2version in /usr/bin, so I guess MPICH2 is not installed. :-/
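The mpicc -show line above can be read mechanically. The sketch below is only a substring heuristic, and the function name guess_mpi_flavor is hypothetical; it assumes the library path reflects the implementation, which an unusual install layout could defeat.

```python
def guess_mpi_flavor(show_line):
    """Guess the MPI flavor from one line of 'mpicc -show' output.

    Heuristic only: it looks for telltale substrings in the library
    and include paths, so an unusual install layout can fool it.
    """
    text = show_line.lower()
    if "openmpi" in text or "open-mpi" in text:
        return "Open MPI"
    if "mpich2" in text:
        return "MPICH2"
    if "mpich" in text:
        # The ch_p4 device belongs to the original MPICH (MPICH1).
        if "/p4" in text:
            return "MPICH1 (ch_p4 device)"
        return "MPICH (version unclear)"
    return "unknown"


line = "gcc -L/usr/lib64/MPICH/p4/gnu -I/usr/include -lmpi -lbproc"
print(guess_mpi_flavor(line))  # prints: MPICH1 (ch_p4 device)
```

That would fit with the p4_error messages in the output, which also come from the ch_p4 device.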
cpi-cco.py crashes with more than 3 processors, immediately after prompting the user to enter the number of intervals, with similar error messages:
> mpirun -np 4 python cpi-cco.py
Enter the number of intervals: (0 quits) p1_10566: p4_error: net_recv read: probable EOF on socket: 1
p2_10567: p4_error: net_recv read: probable EOF on socket: 1
rm_l_3_10571: (0.769531) net_send: could not write to fd=4, errno = 32
[art@strong_badia compute-pi]$ rm_l_1_10568: (1.871094) net_send: could not write to fd=4, errno = 32
rm_l_2_10570: (1.320312) net_send: could not write to fd=4, errno = 32
p3_10569: (6.773438) net_send: could not write to fd=4, errno = 32
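For reference, the errno value in the net_send lines can be decoded with Python's standard library. This is a minimal sketch that assumes the number follows the usual Linux errno table, where 32 is EPIPE (broken pipe), i.e. a write to a socket whose other end has already gone away.

```python
import errno
import os

code = 32  # value copied from the net_send lines in the output above

# Look up the symbolic name and the system's description of the code.
name = errno.errorcode.get(code, "unknown")
print(name, "-", os.strerror(code))
```

If so, the net_send failures are probably a symptom of some process exiting early rather than the root cause.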
Thanks again,
- Art.
>
>
> --
> Lisandro Dalcín
> ---------------
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594