Scattering an image into multiple nodes and rejoining them again.

Arun Das

Feb 9, 2016, 3:41:59 AM
to mpi4py
Hello !

I am taking a course on MPI programming and am going to do an image processing project very soon. I installed OpenCV and did basic image manipulations like opening, viewing, scaling, and splitting images. The next step for me was to slice the image, send the slices to different nodes using Scatter(), and gather them back using Gather(). I have been working on it for a couple of days now, and I am missing the basic understanding of how the image can be scattered properly. Could you guys please guide me?

I am programming in python.

This is what I did so far :

IN MASTER NODE:
1. Opened up an image, saved the number of rows and columns.
2. Found local_r and local_c (rows and columns for each individual node) by dividing the number of rows and columns by comm.size
if rank == 0:
        img = cv2.imread('Lenna.png',0)
        imgarray = img.flatten()
        no_rows = img.shape[0]
        no_cols = img.shape[1]
        local_r = no_rows/size
        local_c = no_cols/size


3. Created a local_image array by :  
local_x = np.empty((local_r, local_c),dtype = 'uint8')
4. Created a newimg array to store my image after it is being gathered back by:
newimg = np.empty((no_rows,no_cols),dtype = 'uint8')

IN EVERY NODE:

1. Broadcast local_r, local_c
local_c = comm.bcast(local_c,root=0)
local_r = comm.bcast(local_r,root=0)
local_x = comm.bcast(local_x, root = 0)

2. Scatter the image by :
comm.Scatter([img,local_r,MPI.INT],[local_x,local_r,MPI.INT])
I think I am going wrong here in deciding the size of the image to scatter. I read many forums but still do not understand how it should be done. I found little detail on matrix scattering of this type. If you guys can help me, it would be great!

3.


Lisandro Dalcin

Feb 9, 2016, 4:14:46 AM
to mpi4py
On 9 February 2016 at 03:52, Arun Das <arund...@gmail.com> wrote:
> Hello !
>
> I am taking a course on MPI programming and am going to do an image
> processing project very soon. I installed OpenCV and did basic image
> manipulations like opening, viewing, scaling, splitting images. The next
> step for me was to slice the image and send them to different nodes using
> Scatter() and gather them back using Gather(). I have been working on it for
> a couple of days now and I am missing the basic understanding on how the
> image can be scattered properly. Could you guys please guide me ?
>

What kind of image processing algorithms are you planning to use?
Depending on whether you need "ghost cells" (i.e., entries from
neighboring processors) to perform the local processing, things go
from trivial to complicated.
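
To make the "ghost cells" idea concrete, here is a minimal NumPy-only sketch (no MPI involved). It assumes a 1D row distribution and a filter that needs one neighboring row on each side, as a 3x3 kernel would. The function name block_with_ghosts and the halo parameter are made up for this illustration and do not come from mpi4py or OpenCV.

```python
import numpy as np

def block_with_ghosts(img, start, stop, halo=1):
    """Return rows [start, stop) plus up to `halo` ghost rows on each side.

    Also returns the index of the first owned row inside the padded block.
    """
    lo = max(start - halo, 0)             # clip at the top edge of the image
    hi = min(stop + halo, img.shape[0])   # clip at the bottom edge
    return img[lo:hi], start - lo

# Small synthetic "image": 8 rows, 4 columns.
img = np.arange(32, dtype=np.uint8).reshape(8, 4)

# A process owning rows 3, 4, 5 also needs rows 2 and 6 as ghost rows.
block, first = block_with_ghosts(img, start=3, stop=6)
print(block.shape, first)  # (5, 4) 1
```

If the processing is purely per-pixel (thresholding, inverting, and so on), no ghost rows are needed and a plain scatter of the rows is enough.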

The other thing you have to decide is how you want to distribute
your data: you can use either an easy one-dimensional distribution
(i.e., distribute the rows) or a more complicated two-dimensional
distribution (i.e., distribute both rows and columns). If you are a
beginner, I would start experimenting with 1D distributions. Also be
aware that your image size might not be a multiple of the number of
processes in the run, so you'll likely have to use
Scatterv()/Gatherv() (i.e., in MPI parlance, the "vector" variants of
scatter/gather).

Take a look at this example:
https://bitbucket.org/dalcinl/pasi-2011-mpi4py/src/default/examples/mandelbrot-mpi-block.py

That's a start; you have to do basically the same to gather the
results back to process zero. What is left is the initial Scatterv()
to distribute your image from process zero.

Line 23 is one of the usual ways to compute the local number of
rows "N" given the global number of rows "h".
Line 25 is a quick way to compute the (global) start row index
corresponding to each local piece, in case you need it in your
computations.
Line 39 is a quick way to get a list of local sizes; you have to pass
it to either Scatterv() or Gatherv().
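
The three quantities described above (local row count, global start row, list of local sizes) can be sketched in plain Python without MPI. This is not a copy of the linked file, only one common way to do a block distribution; the helper names local_rows and start_row are hypothetical.

```python
def local_rows(h, size, rank):
    """Number of rows owned by `rank` when h rows are split over `size` ranks.

    The first (h % size) ranks receive one extra row.
    """
    return h // size + (1 if rank < h % size else 0)

def start_row(h, size, rank):
    """Global index of the first row owned by `rank`."""
    return sum(local_rows(h, size, r) for r in range(rank))

h, size = 10, 4  # 10 rows over 4 processes: not evenly divisible
counts = [local_rows(h, size, r) for r in range(size)]
starts = [start_row(h, size, r) for r in range(size)]
print(counts)  # [3, 3, 2, 2]
print(starts)  # [0, 3, 6, 8]
```

In an MPI program each process would normally compute only its own entry and the root would collect the list of sizes, but the arithmetic is the same.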

Try to understand what's going on in this example first, add print
statements here and there, change the global sizes, run it with
1,2,3,... processes, and so on. If you still cannot Scatter/Gather your
image, then write us back, I'll write the code for you, but remember
that you learn by doing and making mistakes :-)



--
Lisandro Dalcin
============
Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Numerical Porous Media Center (NumPor)
King Abdullah University of Science and Technology (KAUST)
http://numpor.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 4332
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa

Office Phone: +966 12 808-0459

Arun Das

Feb 9, 2016, 11:00:57 AM
to mpi4py
Thank You ! I will definitely check it out.

I initially planned to scatter 1D arrays. I am having trouble understanding the send and receive buffer sizes when using scatter. I will take a look at the example and will update. 
Thanks again !

Arun Das

Feb 9, 2016, 2:45:57 PM
to mpi4py
Guys !
Thank you !

I was able to figure that out. Thank you Lisandro for reminding me about the choice between distributing rows alone and distributing rows and columns together. I changed my problem to tackle the easier 'Scatter rows alone' part of it.

Here's the code if it helps anyone out.

AIM: To open an image file, Scatter it to multiple nodes and Gather it back to the master node, display the image again.

Thanks again !
Arun.



import numpy as np
import cv2
from mpi4py import MPI
import time
comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()
if rank == 0:
        img = cv2.imread('Lenna.png',0)
        no_rows = img.shape[0]
        no_cols = img.shape[1]
        local_r = no_rows/size
        local_c = no_cols/size
        local_x = np.zeros((local_r, no_cols),dtype='uint8')
        newimg = np.zeros((no_rows, no_cols),dtype = 'uint8')
        internaldata = np.array([ no_rows, no_cols, local_r, local_c ])
else:
        internaldata = None
        img = None
        newimg = None
        local_x = None
internaldata = comm.bcast(internaldata, root = 0) #Not required to scatter and gather. Done this for some other purpose.
comm.Barrier()
local_r = internaldata[2]
local_c = internaldata[3] #Not required in this program 
local_x = comm.bcast(local_x, root = 0)
comm.Scatterv(img,local_x,root = 0)
print "local_x in Process ", rank, "is", local_x, "type is ",type(local_x)
comm.Gatherv(local_x, newimg, root = 0)

if rank == 0:
        print "the new image is ",newimg, "and type is ", type(newimg)
        cv2.imwrite('newimgLenna.png',newimg)
        cv2.imshow('newLenna',newimg)
        cv2.waitKey(0)



Lisandro Dalcin

Feb 10, 2016, 3:19:22 AM
to mpi4py
There are issues with this code. Why are you bcast()ing 'local_x',
which is a block of zeroed rows in process zero? You have to do it a
different way:

dims = (local_r, no_cols) if rank == 0 else None  # no_cols is only defined on process zero
(local_r, no_cols) = comm.bcast(dims, root = 0)
local_x = np.zeros((local_r, no_cols),dtype='uint8')
comm.Scatterv(img,local_x,root = 0)
print "local_x in Process ", rank, "is", local_x, "type is ",type(local_x)
comm.Gatherv(local_x, newimg, root = 0)


PS: Please be aware that mpi4py is not a magician: if you ever use an
image whose number of rows cannot be split evenly across processes,
your code above will fail hard. In general, you have to specify
:counts: to Scatterv()/Gatherv(). If such code is working for you, it is
just because mpi4py tries hard to figure out good defaults, but if it
cannot, you will get an error.