[mpi4py] Allgather in inter-communicator bug,


Battalgazi YILDIRIM

May 19, 2010, 6:00:21 PM
to us...@open-mpi.org, mpi4py
Hi,


I am trying to use an inter-communicator ::Allgather between two child processes. I have Fortran and Python code,
and I am using mpi4py for the Python side. It seems that ::Allgather is not working properly on my desktop.

I first contacted the mpi4py developer (Lisandro Dalcin), who simplified my problem and provided two example files (python.py and fortran.f90, please see below).

We tried different MPI vendors, and the following example worked correctly with them
(that is, the final printout should be array('i', [1, 2, 3, 4, 5, 6, 7, 8])).

However, it does not give the correct answer on my two desktops (Red Hat and Ubuntu), both
using Open MPI.

Could you look at this problem, please?

If you want to follow our earlier discussion, you can go to the following link:
http://groups.google.com/group/mpi4py/browse_thread/thread/c17c660ae56ff97e

yildirim@memosa:~/python_intercomm$ more python.py
from mpi4py import MPI
from array import array
import os

progr = os.path.abspath('a.out')
child = MPI.COMM_WORLD.Spawn(progr,[], 8)
n = child.remote_size
a = array('i', [0]) * n
child.Allgather([None,MPI.INT],[a,MPI.INT])
child.Disconnect()
print a

yildirim@memosa:~/python_intercomm$ more fortran.f90
program main
 use mpi
 implicit none
 integer :: parent, rank, val, dummy, ierr
 call MPI_Init(ierr)
 call MPI_Comm_get_parent(parent, ierr)
 call MPI_Comm_rank(parent, rank, ierr)
 val = rank + 1
 call MPI_Allgather(val,   1, MPI_INTEGER, &
                    dummy, 0, MPI_INTEGER, &
                    parent, ierr)
 call MPI_Comm_disconnect(parent, ierr)
 call MPI_Finalize(ierr)
end program main

yildirim@memosa:~/python_intercomm$ mpif90 fortran.f90

yildirim@memosa:~/python_intercomm$ python python.py
array('i', [0, 0, 0, 0, 0, 0, 0, 0])
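
For reference, here is a minimal pure-Python sketch (no MPI needed) of the data movement the example expects. It assumes the usual inter-communicator rule that every rank in one group receives the concatenated contributions of the other group, ordered by rank. The function name intercomm_allgather is hypothetical and only models the result; it is not part of mpi4py.

```python
from array import array

def intercomm_allgather(local_contribs, remote_contribs):
    """Model an inter-communicator allgather (hypothetical helper).

    Each argument holds one list of values per rank in that group.
    Returns (data received by every local rank, data received by every remote rank).
    """
    # Each group receives the other group's data, concatenated in rank order.
    local_recv = [v for contrib in remote_contribs for v in contrib]
    remote_recv = [v for contrib in local_contribs for v in contrib]
    return local_recv, remote_recv

# Parent group: a single rank that sends nothing (send count 0).
parent = [[]]
# Child group: 8 ranks, each sending val = rank + 1, as in fortran.f90.
children = [[rank + 1] for rank in range(8)]

parent_recv, child_recv = intercomm_allgather(parent, children)

# The receive buffer starts zero-filled, as in python.py, and should be overwritten.
a = array('i', [0]) * len(children)
a[:] = array('i', parent_recv)

print(a)           # array('i', [1, 2, 3, 4, 5, 6, 7, 8])
print(child_recv)  # []
```

The zeros in the buffer are only placeholders, so an output of all zeros means the parent never received anything from the children.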



--
B. Gazi YILDIRIM


Battalgazi YILDIRIM

May 20, 2010, 1:51:11 PM
to mpi4py
Hi,

I just want to post the reply from the Open MPI forum. They asked for an example involving only C or C++ and Fortran to
make sure, so I converted Lisandro's Python example to C++ and gave it to them. They found the
bug in their code and intend to fix it soon.

I am going to use MPICH2 for a while.


thanks,

---------- Forwarded message ----------
From: Edgar Gabriel <gab...@cs.uh.edu>
Date: Thu, May 20, 2010 at 1:33 PM
Subject: Re: [OMPI users] Allgather in inter-communicator bug,
To: Open MPI Users <us...@open-mpi.org>


Thanks for pointing the problem out. I checked the code; the problem
is in the MPI layer itself. The following check prevents us from doing
anything:

----
e.g. ompi/mpi/c/allgather.c

   if ((MPI_IN_PLACE != sendbuf && 0 == sendcount) ||
       (0 == recvcount)) {
       return MPI_SUCCESS;
   }
----


So the problem is not in the modules/algorithms but in the API layer,
which did not account for intercommunicators correctly. I'll try to
fix it.
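
As a rough illustration of why that check misfires on an intercommunicator, here is a small Python model of the condition. The real code is C inside Open MPI; the function names and the is_inter flag below are hypothetical, and the second function is only one possible adjustment, not the actual patch.

```python
def returns_early_original(in_place, sendcount, recvcount):
    """Model of the quoted check: True means the call returns without communicating."""
    return (not in_place and sendcount == 0) or recvcount == 0

def returns_early_inter_aware(is_inter, in_place, sendcount, recvcount):
    """One possible adjustment (an assumption, not the actual Open MPI fix)."""
    if is_inter:
        # On an intercommunicator one group may send nothing yet still receive
        # (or the reverse), so skip only when both directions are empty.
        return sendcount == 0 and recvcount == 0
    return returns_early_original(in_place, sendcount, recvcount)

# Parent side of the example: sends 0 elements, receives 1 from each child.
parent_args = dict(in_place=False, sendcount=0, recvcount=1)
# Child side of the example: sends 1 element, receives 0.
child_args = dict(in_place=False, sendcount=1, recvcount=0)

print(returns_early_original(**parent_args))           # True
print(returns_early_original(**child_args))            # True
print(returns_early_inter_aware(True, **parent_args))  # False
print(returns_early_inter_aware(True, **child_args))   # False
```

With the original condition both sides of the example return immediately, which matches the all-zero receive buffer reported above.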

Thanks
edgar

On 05/20/2010 10:48 AM, Battalgazi YILDIRIM wrote:
> Hi,
>
> You are right, I should have provided a C++ and Fortran example, so I am
> doing so now.
>
>
> Here is "cplusplus.cpp"
>
> #include <mpi.h>
> #include <iostream>
> using namespace std;
> int main()
> {
>     MPI::Init();
>     char command[] = "./a.out";
>     MPI::Info info;
>     MPI::Intercomm child = MPI::COMM_WORLD.Spawn(command, NULL, 8,info, 0);
>     int a[8]={0,0,0,0,0,0,0,0};
>     int dummy;
>     child.Allgather(&dummy, 0, MPI::INT, a, 1, MPI::INT);
>     child.Disconnect();
>     cout << "a[";
>     for ( int i = 0; i < 7; i++ )
>         cout << a[i] << ",";
>     cout << a[7] << "]" << endl;
>
>     MPI::Finalize();
> }
>
>
> Here is again "fortran.f90"

>
> program main
>  use mpi
>  implicit none
>  integer :: parent, rank, val, dummy, ierr
>  call MPI_Init(ierr)
>  call MPI_Comm_get_parent(parent, ierr)
>  call MPI_Comm_rank(parent, rank, ierr)
>  val = rank + 1
>  call MPI_Allgather(val,   1, MPI_INTEGER, &
>                     dummy, 0, MPI_INTEGER, &
>                     parent, ierr)
>  call MPI_Comm_disconnect(parent, ierr)
>  call MPI_Finalize(ierr)
> end program main
>
> here is how you build and run
>
> -bash-3.2$ mpif90 fortran.f90
> -bash-3.2$ mpiCC -o parent cplusplus.cpp
> -bash-3.2$ ./parent
> a[0,0,0,0,0,0,0,0]
>
>
>
> If I use mpich2,
> -bash-3.2$ mpif90 fortran.f90
> -bash-3.2$ mpiCC -o parent cplusplus.cpp
> -bash-3.2$ ./parent
> a[1,2,3,4,5,6,7,8]
>
> I hope that you can reproduce this problem with Open MPI.
>
> Thanks,
>
>
> On Thu, May 20, 2010 at 10:09 AM, Jeff Squyres <jsqu...@cisco.com> wrote:
>
>     Can you send us an all-C or all-Fortran example that shows the problem?
>
>     We don't have easy access to test through the python bindings.
>      ...ok, I admit it, it's laziness on my part.  :-)  But having a
>     pure Open MPI test app would also remove some possible variables and
>     possible sources of error.
>
>
>     On May 20, 2010, at 9:43 AM, Battalgazi YILDIRIM wrote:
>
>     > Hi Jody,
>     >
>     > I think that it is correct; you can test this example on your desktop.
>     >
>     > thanks,
>     >
>     > On Thu, May 20, 2010 at 3:18 AM, jody <jody...@gmail.com> wrote:
>     > Hi
>     > I am really no python expert, but it looks to me as if you were
>     > gathering arrays filled with zeroes:

>     >  a = array('i', [0]) * n
>     >
>     > Shouldn't this line be
>     >  a = array('i', [r])*n
>     > where r is the rank of the process?
>     >
>     > Jody
>     >
>     >
>     > On Thu, May 20, 2010 at 12:00 AM, Battalgazi YILDIRIM
>     > > child.Allgather([None,MPI.INT],[a,MPI.INT])
>     > > child.Disconnect()
>     > > print a
>     > >
>     > > yildirim@memosa:~/python_intercomm$ more fortran.f90
>     > > program main
>     > >  use mpi
>     > >  implicit none
>     > >  integer :: parent, rank, val, dummy, ierr
>     > >  call MPI_Init(ierr)
>     > >  call MPI_Comm_get_parent(parent, ierr)
>     > >  call MPI_Comm_rank(parent, rank, ierr)
>     > >  val = rank + 1
>     > >  call MPI_Allgather(val,   1, MPI_INTEGER, &
>     > >                     dummy, 0, MPI_INTEGER, &
>     > >                     parent, ierr)
>     > >  call MPI_Comm_disconnect(parent, ierr)
>     > >  call MPI_Finalize(ierr)
>     > > end program main
>     > >
>     > > yildirim@memosa:~/python_intercomm$ mpif90 fortran.f90
>     > >
>     > > yildirim@memosa:~/python_intercomm$ python python.py
>     > > array('i', [0, 0, 0, 0, 0, 0, 0, 0])
>     > >
>     > >
>     > > --
>     > > B. Gazi YILDIRIM
>     > >
>     > > _______________________________________________
>     > > users mailing list
>     > > us...@open-mpi.org
>     > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>     > >
>     >
>     >
>     >
>     >
>     > --
>     > B. Gazi YILDIRIM
>
>
>     --
>     Jeff Squyres
>     jsqu...@cisco.com
>     For corporate legal information go to:
>     http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
>
>
>
>
> --
> B. Gazi YILDIRIM
>
>
>




Battalgazi YILDIRIM

May 20, 2010, 2:05:01 PM
to mpi4py
Hi,

I think the Open MPI forum has registered this problem as a bug for 1.4.3.
They asked me to provide examples in C, C++, and Fortran, so I
converted Lisandro's Python example to C++ code.

thanks,


You can see the problem in the examples quoted earlier in this thread.