Support for basic task queues

rpg

Nov 9, 2009, 12:50:43 PM
to mpi4py
Hi all,

What is the simplest way to set up a task queue in mpi4py? My
application is embarrassingly parallel: all I need is a large queue
of work in the master process, from which the slave processes pull
tasks, process them, and send the results back to the master node.
The processing times involved are so large that the communication is
effectively free, so I don't care how fast or slow the communication
is. Out-of-order arrival of results is no problem. I thought I'd ask
in case somebody has done this before, to avoid reinventing the
wheel. Any other suggestions are welcome too.

The cluster has Open MPI installed. Can I assume that it will work nicely
with mpi4py?

Cheers,

Lisandro Dalcin

Nov 9, 2009, 1:47:51 PM
to mpi...@googlegroups.com
I cannot give a definitive suggestion without knowing a bit more
about your application. Are your tasks going to take more or less the
same time, or do you expect a large imbalance? Can you afford to
dedicate the master process to bookkeeping, without it doing
any actual computation? Depending on these two points, the way to go
could be as trivial as using scatter() at the beginning and gather()
at the end, or slightly more complex, like the master using Comm.Probe
(or Comm.Iprobe) with a wildcard source in a busy loop until the queue
is empty.

> The cluster has Open MPI installed. Can I assume that it will work nicely
> with mpi4py?
>

It should work.

Disclaimer: Open MPI has managed to make my life miserable (note: I'm
exaggerating) over the last couple of years. So if you run into any
issues, do not hesitate to ask here for assistance.


--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

Rohit Garg

Nov 9, 2009, 10:01:58 PM
to mpi...@googlegroups.com
On Tue, Nov 10, 2009 at 12:17 AM, Lisandro Dalcin <dal...@gmail.com> wrote:
> I cannot give a definitive suggestion without knowing a bit more
> about your application. Are your tasks going to take more or less the
> same time, or do you expect a large imbalance? Can you afford to
> dedicate the master process to bookkeeping, without it doing
> any actual computation? Depending on these two points, the way to go
> could be as trivial as using scatter() at the beginning and gather()
> at the end, or slightly more complex, like the master using Comm.Probe
> (or Comm.Iprobe) with a wildcard source in a busy loop until the queue
> is empty.

The work distribution will be pretty uniform, with occasional
imbalances. I can certainly afford to have the master process do only
the bookkeeping. The real point here is that the number of tasks I have
is larger than the number of processors.

--
Rohit Garg

http://rpg-314.blogspot.com/

Senior Undergraduate
Department of Physics
Indian Institute of Technology
Bombay

Lisandro Dalcin

Nov 10, 2009, 9:13:23 AM
to mpi...@googlegroups.com
Then I would first try to scatter() all the tasks at once at the
beginning, and gather() the results at the end. For scatter(), you
have to build on process 0 a list of lists of tasks: an 'outer' list
containing comm.size 'inner' sublists (of similar length).
This way, even the master can do useful computation.
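
A minimal sketch of the scatter()/gather() approach described above,
not a definitive implementation. The names split_tasks, run and process
are hypothetical, and dealing the tasks out round-robin is just one
assumed way to get sublists of similar length. It uses the lowercase
(pickle-based) comm.scatter and comm.gather, so tasks and results can
be arbitrary picklable Python objects.

```python
def split_tasks(tasks, n_chunks):
    """Deal tasks round-robin into n_chunks sublists of similar length."""
    tasks = list(tasks)
    return [tasks[i::n_chunks] for i in range(n_chunks)]


def run(tasks, process):
    """Scatter tasks from rank 0, process them everywhere, gather on rank 0.

    `tasks` only needs to be meaningful on rank 0; `process` is a
    hypothetical per-task function supplied by the caller.
    """
    # Imported here so split_tasks() can be used without MPI available.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Only the root builds the 'outer' list with comm.size 'inner' sublists.
    chunks = split_tasks(tasks, comm.Get_size()) if rank == 0 else None

    my_tasks = comm.scatter(chunks, root=0)       # each rank gets one sublist
    my_results = [process(t) for t in my_tasks]   # rank 0 computes too
    gathered = comm.gather(my_results, root=0)    # list of lists on rank 0

    if rank == 0:
        # Flatten; results come back grouped by rank, not in task order.
        return [r for sublist in gathered for r in sublist]
    return None
```

Every rank has to call run(), for example from a script started with
something like "mpiexec -n 4 python yourscript.py" (the script name is a
placeholder). Only rank 0 gets the combined list back; the other ranks
get None.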

The above strategy takes just a few lines. If you notice that the
imbalance is too high (some processes finish much earlier than
others), you can move to a more elaborate scheme using point-to-point
communication (Send/Recv/Probe).
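
One possible shape for such a point-to-point scheme, as a rough sketch
under several assumptions: the master (rank 0) only does bookkeeping,
None is never a real task so it can serve as a stop signal, and the
names master, worker, main, TAG_TASK and TAG_RESULT are all made up for
illustration. It uses the lowercase (pickle-based) send/recv with a
wildcard source and a Status object, instead of an explicit Probe loop.

```python
TAG_TASK, TAG_RESULT = 1, 2  # hypothetical tag values


def worker(comm, process):
    """Pull tasks from rank 0 until a None sentinel arrives."""
    while True:
        task = comm.recv(source=0, tag=TAG_TASK)
        if task is None:  # assumption: None is never a real task
            break
        comm.send((task, process(task)), dest=0, tag=TAG_RESULT)


def master(comm, tasks):
    """Hand out one task at a time; return results in arrival order."""
    from mpi4py import MPI  # imported here so worker() needs no MPI import

    n_workers = comm.Get_size() - 1
    if n_workers < 1:
        raise RuntimeError("need at least one worker rank besides the master")

    pending = list(tasks)
    results = []
    status = MPI.Status()
    active = 0

    # Prime every worker with one task, or stop it at once if none are left.
    for rank in range(1, n_workers + 1):
        if pending:
            comm.send(pending.pop(), dest=rank, tag=TAG_TASK)
            active += 1
        else:
            comm.send(None, dest=rank, tag=TAG_TASK)

    # Whoever reports a result first gets the next task.
    while active:
        result = comm.recv(source=MPI.ANY_SOURCE, tag=TAG_RESULT, status=status)
        results.append(result)
        idle_rank = status.Get_source()
        if pending:
            comm.send(pending.pop(), dest=idle_rank, tag=TAG_TASK)
        else:
            comm.send(None, dest=idle_rank, tag=TAG_TASK)
            active -= 1
    return results


def main():
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    if comm.Get_rank() == 0:
        for task, value in master(comm, range(20)):
            print(task, value)
    else:
        worker(comm, lambda x: x * x)  # stand-in for the real computation
```

To try it, call main() at the end of a script and launch it on at least
two processes. Because each worker asks for more work only when it has
finished its current task, slow tasks do not hold up the other workers,
which is the point of moving away from a single scatter().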