
parallel mechanism problem


Joey Hsiao

Oct 19, 2004, 3:49:57 AM
Hi all! I have one problem described below:

Requirements:

The parallel mechanism needs three kinds of processes:
1. Master process (one only)
2. Control process (one for each slave host, including some communication
with master process)
3. Agent process (at least one for each slave host, also some communication
with master)

The problem is how can I make each slave host execute one control process?

Assume a simple version would look like this:

1. Master process {
    spawn control processes;  /* one per slave host; however,
                                 MPI_Comm_spawn cannot assign
                                 processes to specific hosts */
    spawn agent processes;
    Send(mess, to control processes);
    Send(mess, to agents);
}

2. Control process {
    Recv(mess, from master);
}

3. Agent process {
    Recv(mess, from master);
}

thanks in advance.

Randy

Oct 19, 2004, 10:40:06 AM

If you're going to start up your program using PBS, you'll have more work to do,
because you'll need to edit its MPI machinefile on the fly in your PBS batch
script, but what *I* would do is the following:

- Write a single MPI program that all the MPI processes will execute. (You'll
put conditionals in the code to direct each process to do something different,
based on its MPI process ID, 0 to N-1).

- Each time you start up your MPI processes and run the program with a different
number of processes, you'll need to decide how you want to map that set of MPI
processes onto the available hosts. If you put some processes on the same hosts
as other processes (like one control + one or more agents on one host), you'll
need to inform mpirun (or lamboot) on what host it should start up each MPI
process. Then during execution, using its MPI process ID, each process will
behave in the role that you wanted (master, control, or agent), and each
process will know which other MPI processes it should communicate with, based
on its role, and perhaps which host it's running on.

When you invoke mpirun to spawn the MPI job, you'll need to pass it a file that
contains the host names (i.e. the 'machinefile'), which will decide where each
MPI process is spawned. The sequence of host names in the file will tell mpirun
where to place MPI processes 0 to N-1. You'll also want to somehow inform your
MPI executable of the role of each process. (Is it master, control, or agent?)

Let's say that you want to run 1 master process, 4 control processes and 8 agent
processes on a total of 5 different hosts. You specify the machinefile with the
hosts in the order: master, then all control hosts, then all agent hosts. For
example, using five hosts as I described, you could specify the master +
controls + agents using:

node1 (as master), node2, node3, node4, node5, (as controls) and node2, node2,
node3, node3, node4, node4, node5, node5 (as agents).

This would place one control and two agents on node2, another control and two
agents on node3, and the same for node4 and node5.

Then you invoke your program with a command line argument that identifies the
role of each host, like:

$ mpirun -np 13 \
m=node1 \
c=node2,node3,node4,node5 \
a=node2,node2,node3,node3,node4,node4,node5,node5

(BTW, I got the number 13 from summing up 1 master + 4 controls + 8 agents)

Of course you could specify this in different ways, especially if you wanted
each agent and control process that's on the same host to be aware of the other
processes that are on that host. That could be done with a variety of
alternative syntactic notations, perhaps describing each node's residents rather
than the number of processes of each role:

$ mpirun -np 13 \
n1=m \
n2=c,a,a \
n3=c,a,a \
n4=c,a,a \
n5=c,a,a

Then, when you spawn the MPI job, each MPI process can parse the same command
line arguments, identify itself (perhaps using MPI_Get_processor_name or by
reading in $PBS_HOSTNAME and jumping to an offset identified by its MPI process
ID), and thus it can learn its role in the grand plan.
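To make that concrete, here's a minimal C sketch of parsing such a role list. The "nX=m" / "nX=c,a,a" argument syntax is just my made-up notation from above (not any standard MPI or mpirun facility), and the helper name is illustrative:

```c
#include <assert.h>
#include <string.h>

/* Flatten role arguments of the (hypothetical) form "n1=m", "n2=c,a,a", ...
 * into one role character per MPI rank.  Returns the number of roles found,
 * so roles[myrank] tells each process whether it is 'm', 'c', or 'a'. */
static int flatten_roles(int argc, char **argv, char *roles, int max)
{
    int n = 0;
    for (int i = 0; i < argc; i++) {
        const char *p = strchr(argv[i], '=');   /* skip the "nX=" host label */
        p = p ? p + 1 : argv[i];
        while (*p && n < max) {
            if (*p == 'm' || *p == 'c' || *p == 'a')
                roles[n++] = *p;
            p++;                                /* steps over ',' separators */
        }
    }
    return n;
}
```

With the 13-process example above, the flattened list has 13 entries and the entry at index 8 (the 9th role) is an 'a' on n4.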

An approach like this is necessary if you want each process to be able to do its
own MPI communication. Since MPI is not pthread safe, I probably would not let
the control process spawn pthreads to create its agent processes, especially if
you wanted each agent process to do its own MPI communication. If it were OK
for one control process to do all MPI communication for all the agent processes
that reside on the same host, then I'd consider having the control process spawn
as many agent pthread processes as it needs, do all the MPI communication for
all the processes on that host, and then move message data to/from the agents
using something like named pipes or perhaps shared memory segments. (Of course,
you'll still need to add code to signal each agent that a message has arrived
for it, and signal the control that one of its agents has sent it a message.)
However, in my humble opinion, using pthreads to create agents is a lot more
trouble than just spawning as many MPI processes as are needed to cover master +
controls + agents.

If you do use PBS to start up your MPI job, you'll need to manipulate the host
names that are in its machinefile (usually present in the PBS batch script
environment variable $PBS_NODEFILE). To run multiple MPI processes on each
host, that host's name will have to be present more than once in the
machinefile. In the 5 host and 13 process example that I gave above, because
you would have requested 5 hosts when you submitted your PBS job, $PBS_NODEFILE
(machinefile) would initially contain something like:

node1
node2
node3
node4
node5

To create two agent processes along with each control process, you'll want to
add some code to your PBS batch script (before you invoke mpirun), that parses
and changes the contents of the machinefile to replicate node2 through node5 so
that additional MPI processes will be created, as in changing the above five
entries into:

node1 # master
node2 # control1
node3 # control2
node4 # control3
node5 # control4
node2 # agent1
node2 # agent2
node3 # agent3
node3 # agent4
node4 # agent5
node4 # agent6
node5 # agent7
node5 # agent8
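In practice you'd probably do that rewriting with a few lines of shell in the PBS batch script, but the expansion itself is simple enough to sketch in C (the helper name and the fixed layout of one master host followed by one host per control are my assumptions for illustration):

```c
#include <assert.h>
#include <string.h>

/* Expand an allocated host list (master's host first, then one host per
 * control) into machinefile order: master, all controls, then a fixed
 * number of agent entries per control host.  Each out[] entry corresponds
 * to one line of the machinefile.  Returns the total MPI process count. */
static int expand_machinefile(char **hosts, int nhosts,
                              char out[][64], int agents_per_control)
{
    int n = 0;
    for (int i = 0; i < nhosts; i++)            /* master + controls */
        strcpy(out[n++], hosts[i]);
    for (int i = 1; i < nhosts; i++)            /* agents: skip master host */
        for (int j = 0; j < agents_per_control; j++)
            strcpy(out[n++], hosts[i]);
    return n;
}
```

Feeding it node1 through node5 with two agents per control reproduces the 13-line machinefile listed above.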

Then the MPI processes 0 to 12 will map onto these hosts in this order, with
process 0 on the first host and process 12 on the last host (with some hosts
running more than one MPI process). Then your executable will start running,
and each copy of the executable (each MPI process) will parse the same command
line arguments, and each process will learn its role when it finds its host name
labeled in the list as m, c, or a. Or however you choose to convey that info to
each process.

I hope I haven't confused the hell out of you... :-}

Randy

--
Randy Crawford http://www.ruf.rice.edu/~rand rand AT rice DOT edu

Joey Hsiao

Oct 20, 2004, 5:33:06 AM
There is something I still don't get.
If I choose one of the approaches you wrote,

*****************************************************************


$ mpirun -np 13 \
n1=m \
n2=c,a,a \
n3=c,a,a \
n4=c,a,a \
n5=c,a,a

Then, when you spawn the MPI job, each MPI process can parse the same command
line arguments, identify itself (perhaps using MPI_Get_processor_name or by
reading in $PBS_HOSTNAME and jumping to an offset identified by its MPI process
ID), and thus it can learn its role in the grand plan.

*****************************************************************

How can I deal with the command line arguments inside the program so that I
could dispatch each process to its proper host?
Would you explain it in more detail?

Randy

Oct 20, 2004, 11:35:33 AM

Because MPI processes must be created *before* your program runs, you will be
unable to change the number of processes from within your program. That's why
you need to do a little preprocessing of each job in your PBS batch script to
decide how many MPI processes you will need, and on which host they will run.
Once you have figured that out, when you finally launch your MPI executable, you
must do two things:

1) Inform mpirun (or lamboot) how many processes to launch, and also tell it on
which host each process should be run.

2) Inform each of your MPI processes what role it is supposed to play in _your_
design (master, control, or agent). Depending on how your program is designed,
perhaps this will also help each process to know which other processes it should
then communicate with. The command line argument method that I'm suggesting is
just one way to do this. In essence, I'm suggesting that each MPI process will
figure this out for itself by looking at its MPI process ID (0 to N-1) and also
looking at the command line arguments that you specified.

For instance, given the example above, a process with MPI ID 8 would count
through the command line arguments (the list of roles) to the 9th role in the
list (because MPI process 8 is the 9th process), which would be the first 'a'
in the line:

n4=c,a,a \

So now MPI process 8 knows it's an agent process, and it knows that its control
process is 7 and its sister agent is 9, which are both running on the same
node. (Using this syntax, the 'n4' identifier is not necessary and can be
ignored. I just added it for clarity of exposition.)
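For the m | c,a,a | c,a,a | ... layout above (rank 0 is the master, then each group of three ranks is one node's control followed by its two agents), that bookkeeping is just modular arithmetic. A minimal sketch in C, with function names that are mine, not part of MPI:

```c
#include <assert.h>

/* Role bookkeeping for the layout: rank 0 = master, then groups of three
 * ranks per node, each group being one control followed by two agents. */
static char role_of(int rank)
{
    if (rank == 0) return 'm';
    return ((rank - 1) % 3 == 0) ? 'c' : 'a';   /* first of each triple */
}

static int control_of(int agent_rank)           /* control on the same node */
{
    return agent_rank - (agent_rank - 1) % 3;
}

static int sibling_of(int agent_rank)           /* the node's other agent */
{
    return ((agent_rank - 1) % 3 == 1) ? agent_rank + 1 : agent_rank - 1;
}
```

For MPI process 8 this yields role 'a', control 7, and sister agent 9, matching the walkthrough above.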

Joey Hsiao

Oct 22, 2004, 12:56:13 AM
Thanks so much, Randy!

Just one more question.

It seems that if I want to achieve the facility of dispatching processes to
specific processors, I must use PBS.
Is that right?


"Randy" <j...@burgershack.com> wrote in message news:cl60k5$a83$1...@joe.rice.edu...

Randy

Oct 22, 2004, 1:09:33 PM
Joey Hsiao wrote:
> Thanks so much, Randy!
>
> Just one more question.
>
> It seems that if I want to achieve the facility of dispatching processes to
> specific processors, I must use PBS.
> Is that right?

No. If you're using MPICH for MPI, you only need to feed a machinefile into
mpirun that contains an ordered list of hosts on which you want to run. Then
mpirun will map its MPI processes onto that sequence of hosts: process 0 on the
first host; process 1 on the second host, etc. You don't need PBS to do that.
PBS tells mpirun to read its host names from a file that is stored in the env
variable $PBS_NODEFILE. If you don't use PBS, you could put the ordered list of
hosts into a machinefile (a text file) so that mpirun will launch each MPI
process on the desired host.

If you're not using PBS, you can edit a machinefile by hand. Or you can edit
MPICH's default machinefile in $MPICH/share/machines.LINUX (where LINUX is the
name of the system architecture for which you built this installation of MPICH).
For simplicity's sake, I would just edit my own machinefile and then invoke
mpirun manually using:

$ mpirun -np 13 -machinefile machines.13 ./a.out

Then you put the host names in the file 'machines.13':

host-for-process-0
host-for-process-1
...
host-for-process-12

I suspect you could do something similar if you were using LAM MPI instead of
MPICH. You'd just have to learn how lamboot maps its MPI processes onto its
list of hosts. It's probably an equally straightforward approach. A little
trial and error will reveal how it was done (each MPI process could call
MPI_Get_processor_name to report its MPI process ID and host name).

Randy
