scatter numpy/scipy objects


brunetto

Nov 30, 2010, 4:52:44 AM
to mpi4py
Hi all!!
My name is Brunetto and I am trying to parallelize a kd-tree based code
in Python for my master's thesis using mpi4py. I have only been coding in
Python for about a month, so I'm not very expert!:)
I hope my question is not too stupid!:P

I can't figure out how to scatter numpy/scipy objects. I've found this:

http://www.sagemath.org/doc/numerical_sage/mpi4py.html

in particular this example (I have fixed some errors...):


from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size

sendbuf=[]
root=0
if rank==0:
    m=numpy.array(range(size*size),dtype=float)
    m.shape=(size,size)
    print(m)
    sendbuf=m

v=comm.Scatter(sendbuf,root)
print("I got this array:")
print(v)
v=v*v
recvbuf=comm.Gather(v,root)
if rank==0:
    print numpy.array(recvbuf)


but when I launch it with "mpiexec -n 4 python prova_uffi.py" I get
these errors:

Traceback (most recent call last):
  File "prova_uffi.py", line 16, in <module>
    v=comm.Scatter(sendbuf,root)
  File "Comm.pyx", line 440, in mpi4py.MPI.Comm.Scatter (src/mpi4py.MPI.c:50658)
  File "message.pxi", line 418, in mpi4py.MPI._p_msg_cco.for_scatter (src/mpi4py.MPI.c:16194)
  File "message.pxi", line 326, in mpi4py.MPI._p_msg_cco.for_cco_recv (src/mpi4py.MPI.c:15416)
  File "message.pxi", line 88, in mpi4py.MPI.message_simple (src/mpi4py.MPI.c:13292)
TypeError: message: expecting buffer or list/tuple
(the same traceback is printed by two more of the four processes)
Traceback (most recent call last):
  File "prova_uffi.py", line 11, in <module>
    m=numpy.array(range(size*size),dtype=float)
TypeError: unsupported operand type(s) for *: 'builtin_function_or_method' and 'builtin_function_or_method'


So how can I scatter these objects? In particular I need to scatter
numpy arrays and kd-trees. I have not found any other documentation!

Thank you in advance for your reply!:)

brunetto

Lisandro Dalcin

Nov 30, 2010, 8:22:26 AM
to mpi...@googlegroups.com
On 30 November 2010 06:52, brunetto <brunett...@gmail.com> wrote:
> Hi all!!
> [...]
> I can't figure out how to scatter numpy/scipy objects. I've found this:
>
> http://www.sagemath.org/doc/numerical_sage/mpi4py.html
>
> in particular this example (I have fixed some errors...):
>

That page is really outdated (for mpi4py>=1.0)...

Take a look here:

http://mpi4py.scipy.org/docs/usrman/tutorial.html
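
Roughly, the distinction made there (my quick paraphrase; see the link for
the details): the all-lowercase methods like comm.scatter() communicate
general Python objects via pickle, while the uppercase methods like
comm.Scatter() expect buffer-like objects such as numpy arrays and run at
near C speed. An illustrative sketch of both flavors, not copied from that
page:

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# lowercase scatter: general Python objects, pickled under the hood
obj = comm.scatter([{"rank": i} for i in range(size)] if rank == 0 else None,
                   root=0)

# uppercase Scatter: buffer-like objects (numpy arrays), near C speed
sendbuf = np.arange(size * size, dtype='d') if rank == 0 else None
recvbuf = np.empty(size, dtype='d')
comm.Scatter(sendbuf, recvbuf, root=0)

print(rank, obj, recvbuf)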


--
Lisandro Dalcin
---------------
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169

Aron Ahmadia

Nov 30, 2010, 8:21:44 AM
to mpi...@googlegroups.com
You need to create a Python list of size arrays (one per process) to
scatter, not a single size*size array. For example:

if rank==0:
    m = [numpy.array(range(p*size), dtype=float) for p in range(size)]
    print(m)
    sendbuf = m
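
A complete (untested) sketch along those lines, using the lowercase,
pickle-based scatter so that a plain Python list is accepted:

from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # one small array per process
    m = [numpy.array(range(size), dtype=float) + p * size for p in range(size)]
else:
    m = None

# each process receives one element of the list
v = comm.scatter(m, root=0)
print("rank %d got %r" % (rank, v))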

A


Konstantin Kudryavtsev

Nov 30, 2010, 8:25:46 AM
to mpi4py
Hi Brunetto!


I haven't been using Python and mpi4py for very long either, and your
question is not stupid; believe me, I struggled with the same issue
myself when I first tried to use the Scatter / Gather routines. The
thing is, you're trying to scatter a numpy array, which is a single
object, and that's why the interpreter complains "TypeError: message:
expecting buffer or list/tuple". Try passing a list of length size
instead, e.g.

sendbuf = []
for i in range(size):
    sendbuf.append(i * size)
v = comm.scatter(sendbuf, root=root)

or something like that



Here is a piece of MPI code where I was using scatter to distribute an
array; hope that helps:

def get_dest_activity_opt(self):
    if self.rank == 0:
        activity = self.dest_ref.activity
        self.items_per_node = int(round(activity.size/self.size))
        data = []
        start_i = 0
        for i in range(self.size):
            if i+1 < self.size:
                data.append(activity.flat[start_i : start_i+self.items_per_node])
            else:
                data.append(activity.flat[start_i : ])
            start_i += self.items_per_node
    else:
        data = None

    activity = self.comm.scatter(data, root=0)
    return activity
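
For what it's worth, numpy.array_split can do the same chunking in one
call; a rough stand-alone sketch (the activity array below is just a
placeholder, not my real data):

from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    activity = numpy.arange(10.0)              # placeholder for the real data
    data = numpy.array_split(activity, size)   # size nearly-equal pieces
else:
    data = None

# lowercase scatter: one piece per process, pickled under the hood
chunk = comm.scatter(data, root=0)
print("rank %d got %r" % (rank, chunk))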

Konstantin

Lisandro Dalcin

Nov 30, 2010, 8:48:39 AM
to mpi...@googlegroups.com
(Sorry for the top-post, Konstantin broke the rule before me)

@Aron/@Konstantin: your suggestion is perfectly valid; however, take into
account that such an approach uses pickle under the hood, so it is not
the most efficient way to communicate array data. The code below is far
better; it should get near-C speed.

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # process 0 is the root, it has data to scatter
    sendbuf = np.arange(size*size, dtype=float)
    sendbuf.shape = (size, size)  # actually not required
else:
    # processes other than root do not send
    sendbuf = None
# all processes receive data
recvbuf = np.arange(size, dtype=float)

print "[%d] sendbuf=%r" % (rank, sendbuf)
comm.Scatter(sendbuf, recvbuf, root=0)
print "[%d] recvbuf=%r" % (rank, recvbuf)



brunetto

Dec 1, 2010, 5:18:12 AM
to mpi4py
> @Aron/@Konstantin: your suggestion is perfectly valid; however, take into
> account that such an approach uses pickle under the hood, so it is not
> the most efficient way to communicate array data. The code below is far
> better; it should get near-C speed.
> [...]

Ok, thank you for your replies!:)

I have more or less understood this, but the general mechanism is still
not so clear to me... I also have to scatter objects like trees, or
pieces of trees
(http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.html).
I have another question about this too, but I'm going to open a separate
thread for it, so maybe the situation will become clearer!:)
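
Maybe one way (just a sketch I am thinking about, not tested) would be to
scatter only the raw point coordinates with the fast buffer interface and
then build a local KDTree on each process, since scattering the tree
object itself would have to go through pickle anyway:

from mpi4py import MPI
import numpy as np
from scipy.spatial import KDTree

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

npts = 8                                     # points per process, just for the example
if rank == 0:
    points = np.random.rand(size * npts, 3)  # all the points live on the root
else:
    points = None

local = np.empty((npts, 3), dtype=float)     # receive buffer on every process
comm.Scatter(points, local, root=0)          # fast, buffer-based scatter

tree = KDTree(local)                         # each process builds its own tree
print(rank, tree.query(np.zeros(3)))         # nearest local point to the origin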

Thank you very much!

brunetto

Hannad Hussein

Mar 18, 2021, 9:52:46 AM
to mpi4py
Hi all, 

I am not sure if I will be lucky enough to get a response, since this thread has been inactive for 11 years now...

I have a numpy array containing various data types (strings, integers, etc.).

I am trying to scatter the numpy array across 20 nodes:

data = numpy.array(sample_data)
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
name = MPI.Get_processor_name()
N = data.size

if rank == 0:
    print("Application Will be Scattering: \n\n", data)
    print("---------------------------------------------------------------------------\n")
    sendbuf = numpy.array(data)

    ave, res = divmod(sendbuf.size, size)
    count = [ave + 1 if p < res else ave for p in range(size)]
    count = numpy.array(count)

    displ = [sum(count[:p]) for p in range(size)]
    displ = numpy.array(displ)

else:
    sendbuf = None
    count = numpy.zeros(size, dtype=numpy.int)
    displ = None

comm.Bcast(count, root=0)
recvbuf = numpy.zeros(count[rank])

comm.Scatterv([sendbuf, count, displ, MPI.DOUBLE], recvbuf, root=0)
print("Process %d At Node %s Recieved: " % (rank, name), recvbuf)



---- why is the output always just numbers like this, instead of my original data?


Process 17 At Node KPie01 Recieved:  [0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 1.01855798e-312 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000]
Process 18 At Node KPie01 Recieved:  [2.37151510e-322 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 1.01855798e-312 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000]
Process 19 At Node KPie01 Recieved:  [0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 1.18831764e-312 1.01855798e-312
 6.79038653e-313 6.79038653e-313 2.48273508e-312 2.05833592e-312
 6.79038654e-313 2.14321575e-312 2.35541533e-312 2.41907520e-312
 2.14321575e-312 6.79038654e-313 2.41907520e-312 2.50395503e-312
 2.44029516e-312 2.35541533e-312 6.79038654e-313 2.33419537e-312
 6.79038654e-313 2.05833592e-312 2.05833592e-312 2.14321575e-312
 2.14321575e-312 2.46151512e-312 2.35541533e-312]
Process 0 At Node KPie00 Recieved:  [6.79038653e-313 1.54905693e-312 1.54905693e-312 1.46417710e-312
 3.35964639e-322 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 1.03977794e-312 1.16709769e-312
 1.18831764e-312 1.12465777e-312 2.61854792e-322 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000 0.00000000e+000 0.00000000e+000]



any thoughts?
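
My only guess so far: since the array mixes strings and numbers, numpy has
to pick a single common dtype for it (a fixed-width string or object type,
not float64), so declaring it as MPI.DOUBLE in Scatterv presumably just
reinterprets the raw bytes, which would explain the garbage values above.
A pickle-based sketch I might try instead (the sample data here is made up):

from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    sample_data = ["alpha", 1, "beta", 2.5] * (5 * size)   # made-up mixed data
    # split into size nearly-equal chunks of plain Python objects
    chunks = numpy.array_split(numpy.array(sample_data, dtype=object), size)
else:
    chunks = None

# lowercase scatter pickles each chunk, so strings and numbers survive intact
recvbuf = comm.scatter(chunks, root=0)
print("Process %d got: %r" % (rank, recvbuf))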