
Run a particular number of concurrent jobs


Kenneth Brun Nielsen

Oct 4, 2010, 12:16:10 AM
I have created a script, that runs a number of simulations - one after
another.

In pseudocode, something like:
for sim in simulations
do
runSimulation $sim
done

Now I want to improve it, so it is able to run the simulations in
parallel. Once the script is started, the user might want to change
the number (0-4) of concurrent simulations by editing the number in a
file.

In pseudo code it will look something like this:

# script.sh

for sim in simulations
do
    currentSimulations = getNumberOfRunningSimulations
    # how should getNumberOfRunningSimulations be implemented?
    maxAllowedConcurrentSimulations = getMaxSimulations
    # getMaxSimulations could be implemented as a lookup in a file.
    if (currentSimulations < maxAllowedConcurrentSimulations)
    then
        runSimulation $sim &
    else
        wait some amount of time
    end if
done

Any suggestions to how I can do this elegantly?

Best regards,
Kenneth

Bit Twister

Oct 4, 2010, 12:33:05 AM
On Sun, 3 Oct 2010 21:16:10 -0700 (PDT), Kenneth Brun Nielsen wrote:
>
> Now I want to improve it, so it is able to run the simulations in
> parallel. Once the script is started, the user might want to change
> the number (0-4) of concurrent simulations by editing the number in a
> file.
> Any suggestions to how I can do this elegantly?

Put your variable=value lines in a file, and have your code source that
file on each pass through the loop. For example:

$ cat /var/filename/here
currentSimulations=4
maxAllowedConcurrentSimulations=8

for sim in simulations ; do
    . /var/filename/here
    # ... whatever you want done with the variables set in /var/filename/here
done
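
A minimal sketch of the sourcing idea, assuming a hypothetical settings file path and reusing the variable name from the question (neither is a real installation detail). The file is re-read on every call, so an edit made while the script is running is picked up the next time the limit is checked.

```shell
# Hypothetical settings file; any path the script can read would do.
limitfile=${TMPDIR:-/tmp}/sim-limits.$$
echo 'maxAllowedConcurrentSimulations=2' > "$limitfile"

# Re-source the settings file and print the current limit.
# Note: sourcing runs whatever shell code the file contains,
# so only trusted users should be able to edit it.
getMaxSimulations() {
    . "$limitfile"
    echo "$maxAllowedConcurrentSimulations"
}

getMaxSimulations    # prints whatever value the file holds right now
```

Calling getMaxSimulations inside the loop, rather than once before it, is what makes the limit adjustable while the script runs.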


Ben Bacarisse

Oct 4, 2010, 6:00:06 PM

Sorry, no, but I do have a question. Why have you decided that it must
be done this way? I can't see any great benefit to this outline. If N
simulations need to run, why not just run all N of them concurrently?
The outline above just seems to add to the total work that needs to be
done.

Since that is all negative let me suggest two things about counting the
running simulations. If the program has an easy to distinguish name,
then you could probably use ps and grep -c to count the instances.

A slightly more sophisticated method would be for the runSimulation
script to create a file whose name is the pid of the simulation instance
in some known directory (touch /tmp/simulation-pids/$!). The file gets
removed when the simulation ends (by whatever means).
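
Both counting ideas can be sketched in portable shell. This is only an illustration: the function names, the directory name, and the wrapper subshell are assumptions, not something from the thread. The first function counts processes by command name; the second approach keeps one file per running job, named after its PID, and removes it when the job ends.

```shell
# Approach 1: count processes whose command name is exactly "$1".
# Caveat: if the simulation is a script started as "sh script",
# ps may report the interpreter name instead.
count_by_name() {
    ps -e -o comm= | grep -c -x "$1"
}

# Approach 2: one marker file per running job, named by PID.
piddir=${TMPDIR:-/tmp}/simulation-pids.$$   # hypothetical directory
mkdir -p "$piddir"

# Start "$@" in the background and track it with a marker file.
run_tracked() {
    (
        "$@" &
        pid=$!
        touch "$piddir/$pid"
        wait "$pid"
        rm -f "$piddir/$pid"    # job finished: remove its marker
    ) &
}

# Print the number of marker files, i.e. jobs believed to be running.
count_running() {
    set -- "$piddir"/*
    [ -e "$1" ] && echo $# || echo 0
}
```

If the wrapper subshell itself is killed, its marker file is left behind, so a longer-lived script might also check each PID with kill -0 before trusting the count.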

--
Ben.

Seebs

Oct 4, 2010, 6:58:42 PM
On 2010-10-04, Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
> Sorry, no, but I do have a question. Why have you decided that it must
> be done this way? I can't see any great benefit to this outline. If N
> simulations need to run, why not just run all N of them concurrently?

I have often had things (simulations, builds) such that running all N
concurrently was an order of magnitude slower than running them in batches.

Time to run 16 tasks, in batches of:
1: 16 * t
2: (16 * t) / 2
4: (16 * t) / 4 + 5
8: (16 * t) / 8 + 1000

... Something like that. Say the machine has about 1GB free core, and a
simulation takes about 256MB free memory. If you run more than four, you're
going to be paging, and paging will CRIPPLE your performance.

In my case, the applicable thing was "system builds", and the contested
resources were disk cache and/or disk I/O.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.

Seebs

Oct 4, 2010, 6:55:34 PM
On 2010-10-04, Kenneth Brun Nielsen <kenneth.br...@googlemail.com> wrote:
> Now I want to improve it, so it is able to run the simulations in
> parallel. Once the script is started, the user might want to change
> the number (0-4) of concurrent simulations by editing the number in a
> file.

This gets tricky!

It gets trickier still, because you don't necessarily know which ones will
end first.

A basic pattern is:

1. Define a number of slots to use.
2. Set up a way to track the process of each slot.
3. Fill slots, wait for a slot to free up, populate that slot, repeat.

A few notes:

* You can have a signal handler which does the reaping, then have child
processes send a kill <that signal> to the parent when they're done.
* You can create files in corresponding directories to use to find out
what the child processes think happened.
* You can get the process ID of a background task as $!.

With all of this, I am sure it's possible, since I've seen it done, but
the code in question is, I believe, considered proprietary, and is extremely
closely tied to the specific problem domain.
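
The three-step slot pattern can be sketched in portable shell roughly as follows. This is an illustration under stated assumptions, not the proprietary code mentioned above: the names max_slots, reap, slots_used and run_in_slot are made up here, and it polls with kill -0 once a second instead of using a signal handler.

```shell
max_slots=2     # step 1: number of slots (hypothetical value)
pids=""         # step 2: PIDs of the jobs currently occupying slots

# Forget PIDs whose processes have exited.
reap() {
    alive=""
    for p in $pids; do
        if kill -0 "$p" 2>/dev/null; then
            alive="$alive $p"
        else
            wait "$p" 2>/dev/null   # collect the exit status
        fi
    done
    pids=$alive
}

# Print the number of occupied slots.
slots_used() {
    set -- $pids
    echo $#
}

# Step 3: block until a slot is free, then start "$@" in it.
run_in_slot() {
    reap
    while [ "$(slots_used)" -ge "$max_slots" ]; do
        sleep 1
        reap
    done
    "$@" &
    pids="$pids $!"
}
```

A driver loop would call run_in_slot runSimulation "$sim" for each simulation and finish with a plain wait. Because max_slots is an ordinary variable checked on every pass, it could be refreshed from a file inside the while loop, and a value of 0 would then simply hold back new launches.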

des...@verizon.net

Oct 4, 2010, 8:28:11 PM
Kenneth Brun Nielsen <kenneth.br...@googlemail.com> writes:

Use a Makefile and GNU Make.

Look at the -j and -l options.

-j sets the maximum number of jobs to run in parallel; -l keeps make from
starting new jobs while the machine's load average is at or above the
given value.

bsh

Oct 4, 2010, 9:20:37 PM
On Oct 3, 9:16 pm, Kenneth Brun Nielsen <kenneth.brun.niel...@googlemail.com> wrote:
> I have created a script, that runs a number of simulations - one after
> another.
> ...

> Now I want to improve it, so it is able to run the simulations in
> parallel. Once the script is started, the user might want to change
> the number (0-4) of concurrent simulations by editing the number in a
> file.
> Any suggestions to how I can do this elegantly?

Really, guys....

Running them all concurrently... Running them all sequentially...
Using GNUMake....

For such a non-trivial application, I would definitely recommend
a batch job manager, of which there are many free, quality ones
to be chosen from, such as:

"GNQS"
http://sourceforge.net/projects/gnqs/

I myself have written a function suite which is a frontend
to job control, which would have satisfied your request,
except for the fact that it is beta software, and "brittle"
in an arbitrary environment other than its development
platform.

For semaphore-based job control, the most workable example
of a function suite is:

semaphore.ksh: "programmable semaphores in shell code"
http://www.unixreview.com/documents/s=9303/sam0408f/0408f.htm
http://www.samag.com/code/
ftp://ftp.mfi.com/pub/sysadmin/2004/aug2004.zip

In lieu of a batch manager, you could try the currently
popular GNU parallel(1) which has had some discussion
at C.U.S. of late:

parallel.c; ssh_parallel.cpp: "parallelize multiple programs on local
or distributed hosts"
http://www.gnu.org/software/parallel/
http://mi.eng.cam.ac.uk/~er258/code/
^ Readme: http://www.youtube.com/watch?v=OpaiGYxkSuQ
^ Readme: http://tinyogg.com/watch/TORaR/
^ Readme: http://tinyogg.com/watch/hfxKj/

=Brian

des...@verizon.net

Oct 4, 2010, 9:56:47 PM
bsh <brian...@rocketmail.com> writes:

> On Oct 3, 9:16 pm, Kenneth Brun Nielsen
> <kenneth.brun.niel...@googlemail.com> wrote:
>> I have created a script, that runs a number of simulations - one after
>> another.
>> ...
>> Now I want to improve it, so it is able to run the simulations in
>> parallel. Once the script is started, the user might want to change
>> the number (0-4) of concurrent simulations by editing the number in a
>> file.
>> Any suggestions to how I can do this elegantly?
>
> Really, guys....
>
> Running them all concurrently... Running them all sequentially...
> Using GNUMake....
>
> For such a non-trivial application, I would definitely recommend
> a batch job manager, of which there are many free, quality ones
> to be chosen from, such as:
>
> "GNQS"
> http://sourceforge.net/projects/gnqs/

Never heard of it.
It does load balancing, but so does GNUMake.

I read this:

Documentation is not one of GNQS's strong points.

Turn around and run away!

> In lieu of a batch manager, you could try the currently
> popular GNU parallel(1) which has had some discussion
> at C.U.S. of late:
>
> parallel.c; ssh_parallel.cpp: "parallelize multiple programs on local
> or distributed hosts"
> http://www.gnu.org/software/parallel/
> http://mi.eng.cam.ac.uk/~er258/code/
> ^ Readme: http://www.youtube.com/watch?v=OpaiGYxkSuQ
> ^ Readme: http://tinyogg.com/watch/TORaR/
> ^ Readme: http://tinyogg.com/watch/hfxKj/

Never heard of it,
sounds like a possibility:

DIFFERENCES BETWEEN make -j AND GNU Parallel

make -j can run jobs in parallel, but requires a crafted Makefile to do
this. That results in extra quoting to get filename containing newline
to work correctly.

make -j has no support for grouping the output, therefore output may run
together, e.g. the first half of a line is from one process and the last
half of the line is from another process. The example Parallel grep
cannot be done reliably with make -j because of this.

Neither "advantage" sounds like an advantage to me.
Never had an issue with quotes, and I don't consider a "crafted"
makefile to be an issue. Makefiles are simple.

Grouping output isn't an issue for me either, when I need to keep
stdout/stderr I redirect it to a logical place.

Icarus Sparry

Oct 4, 2010, 10:20:59 PM
On Mon, 04 Oct 2010 18:20:37 -0700, bsh wrote:

> On Oct 3, 9:16 pm, Kenneth Brun Nielsen
> <kenneth.brun.niel...@googlemail.com> wrote:
>> I have created a script, that runs a number of simulations - one after
>> another.
>> ...
>> Now I want to improve it, so it is able to run the simulations in
>> parallel. Once the script is started, the user might want to change the
>> number (0-4) of concurrent simulations by editing the number in a file.
>> Any suggestions to how I can do this elegantly?
>
> Really, guys....
>
> Running them all concurrently... Running them all sequentially... Using
> GNUMake....
>
> For such a non-trivial application, I would definitely recommend a batch
> job manager, of which there are many free, quality ones to be chosen
> from, such as:
>
> "GNQS"
> http://sourceforge.net/projects/gnqs/

GNUmake is a perfectly reasonable thing to use for such a
straightforward thing.

Even simpler, get a version of the real ksh from either 2009 or 2010 and
set the "JOBMAX" variable to 4 and you are done.

tange

Oct 5, 2010, 11:34:21 AM
On Oct 4, 6:16 am, Kenneth Brun Nielsen <kenneth.brun.niel...@googlemail.com> wrote:
> I have created a script, that runs a number of simulations - one after
> another.
>
> In pseudo code som:
> for sim in simulations
> do
> runSimulation $sim
> done

Using GNU Parallel:

seq 1 100 | parallel runSimulation number-{}

One per core:

seq 1 100 | parallel -j+0 runSimulation number-{}

If your pseudo code is more advanced so it really looks like:

for sim in simulations
do

[... loads of stuff ...]
runSimulation $sim
done

Then use the --semaphore of GNU Parallel (which has an alias called
sem):

for sim in simulations
do

[... loads of stuff ...]
sem runSimulation $sim
done

Or one job per core:

for sim in simulations
do

[... loads of stuff ...]
sem -j+0 runSimulation $sim
done

If you want a simple jobqueue/batch manager that runs one job per core
try this:

touch jobqueue
tail -f jobqueue | parallel -j+0

To submit your jobs to the queue:

echo runSimulation $sim >> jobqueue

Watch the intro video to learn more: http://www.youtube.com/watch?v=OpaiGYxkSuQ

Regards,

Ole Tange
Author of GNU Parallel
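
For the requirement of changing the limit while jobs are running, the GNU Parallel manual describes a form of -j that takes a file name instead of a number: the file's content is used as the job count and is read again when a job completes. A minimal sketch, assuming a reasonably current GNU Parallel is installed; the file name is hypothetical, and echo stands in for the thread's runSimulation.

```shell
# Hypothetical limit file; an absolute path avoids confusion with a number.
procfile=$PWD/limitfile
echo 2 > "$procfile"

# Only run if the installed "parallel" is really GNU Parallel
# (moreutils ships a different program with the same name).
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # "echo simulating" is a placeholder for runSimulation.
    seq 1 4 | parallel -j "$procfile" echo simulating number-{}
fi
```

Writing a new number into the file, for example with echo 4 > limitfile, should change how many jobs are started from then on; jobs that are already running are left to finish.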

bsh

Oct 5, 2010, 5:12:48 PM
Icarus Sparry <i.sparry...@gmail.com> wrote:
> bsh wrote:
> > Kenneth Brun Nielsen wrote:
> > > ...

> GNUmake is a perfectly reasonable thing to use for such a straight
> forward thing.

Well, I _kinda_ modify my initial roll-of-the-eyes to
something a bit more tolerant....

> Even simpler, get a version of the real ksh from either 2009 or 2010 and
> set the "JOBMAX" variable to 4 and you are done.

Well, what do you know! I thought I was aware of every
single feature of even the new kornshells! Now if only
the -p option on all the builtins worked, so the output
of the jobs builtin could be effectively parsed....

=Brian

Icarus Sparry

Oct 5, 2010, 5:43:27 PM

JOBMAX is even documented!
    JOBMAX  This variable defines the maximum number of background jobs
            that can run at a time. When this limit is reached, the
            shell will wait for a job to complete before starting a
            new job.

"jobs -p" gives process group leaders, one per line, as integers. That
doesn't seem too hard to parse. Perhaps you want the output of "jobs"
alone to have a flag that makes it simple to parse?

Kenneth Brun Nielsen

Oct 5, 2010, 7:46:55 PM
On 5 Oct., 00:00, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

>  Why have you decided that it must
> be done this way?  I can't see any great benefit to this outline.  If N
> simulations need to run, why not just run all N of them concurrently?
> The outline above just seems to add to the total work that needs to be
> done.

The amount of server CPU power and the number of software licenses that
the script uses have to be adjustable (the software driven by
"runSimulation" checks out licenses from a license server). Each of the
hundreds of simulation runs lasts from 10 minutes to 10 hours, and since
we do not have an infinite number of licenses or machines, my colleagues
and I have to share them.

In practice, this means that during normal working hours (8-17) the
script should run only three simulations concurrently, leaving the
remaining three available for engineers who want to run similar small
simulations as part of their analysis/synthesis work. Outside normal
working hours, when nobody needs the licenses, the script should use as
many licenses (and as much computing power) as are available in order to
finish the simulations ASAP.
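
The working-hours policy could be expressed as a small shell function. This is a sketch under assumptions: the function name is made up, the total of six licenses is inferred from "three plus the remaining three", hour 17 is treated as outside working hours, and weekends are ignored.

```shell
# Print the allowed number of concurrent simulations for a given hour.
# $1 is the hour of day as printed by "date +%H" (00..23).
limit_for_hour() {
    hour=${1#0}         # strip one leading zero: "08" becomes "8"
    hour=${hour:-0}     # a bare "0" would otherwise become empty
    if [ "$hour" -ge 8 ] && [ "$hour" -lt 17 ]; then
        echo 3          # working hours: leave three licenses free
    else
        echo 6          # otherwise: use all six (assumed total)
    fi
}

limit_for_hour "$(date +%H)"    # limit that applies right now
```

Run periodically (from cron, say) with its output redirected into whatever file the main script reads its limit from, this would switch the limit automatically while still allowing manual edits in between.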

> Since that is all negative let me suggest two things about counting the
> running simulations.  If the program has an easy to distinguish name,
> then you could use probably ps and grep -c to count the instances.
>
> A slightly more sophisticated method would be for the runSimulation
> script to create a file whose name is the pid of the simulation instance
> in some known directory (touch /tmp/simulation-pids/$!).  The file gets
> removed when the simulation ends (by whatever means).

Thanks for the inspiration. I like the touch file idea. If I end up
putting this together myself, then I might use that trick :)

Best regards,
Kenneth

Kenneth Brun Nielsen

Oct 5, 2010, 7:49:37 PM
On 5 Oct., 00:55, Seebs <usenet-nos...@seebs.net> wrote:

> On 2010-10-04, Kenneth Brun Nielsen <kenneth.brun.niel...@googlemail.com> wrote:
>
> > Now I want to improve it, so it is able to run the simulations in
> > parallel. Once the script is started, the user might want to change
> > the number (0-4) of concurrent simulations by editing the number in a
> > file.
>
> This gets tricky!
>
> It gets trickier still, because you don't necessarily know which ones will
> end first.
>
> A basic pattern is:
>
> 1.  Define a number of slots to use.
> 2.  Set up a way to track the process of each slot.
> 3.  Fill slots, wait for a slot to free up, populate that slot, repeat.
>
> A few notes:
>
> * You can have a signal handler which does the reaping, then have child
>   processes send a kill <that signal> to the parent when they're done.
> * You can create files in corresponding directories to use to find out
>   what the child processes think happened.
> * You can get the process ID of a background task as $!.

Thanks for your ideas, Peter.

Best regards,
Kenneth

Kenneth Brun Nielsen

Oct 5, 2010, 8:07:41 PM
On 5 Oct., 02:28, des...@verizon.net wrote:

I might take a look at how easily I can convert the runSimulation script
into a Makefile, but machine load is only a secondary limitation in this
context. Licenses are the real limitation, and the number of licenses
used by concurrent jobs started from the script should be adjustable
on the fly, since people might want to use the licenses for other
things.

Best regards,
Kenneth

Kenneth Brun Nielsen

Oct 5, 2010, 8:19:40 PM
On 5 Oct., 03:20, bsh <brian_hi...@rocketmail.com> wrote:
> On Oct 3, 9:16 pm, Kenneth Brun Nielsen <kenneth.brun.niel...@googlemail.com> wrote:
> > I have created a script, that runs a number of simulations - one after
> > another.
> > ...
> > Now I want to improve it, so it is able to run the simulations in
> > parallel. Once the script is started, the user might want to change
> > the number (0-4) of concurrent simulations by editing the number in a
> > file.

> For such a non-trivial application, I would definitely recommend
> a batch job manager, of which there are many free, quality ones
> to be chosen from, such as:
>
> "GNQS"
> http://sourceforge.net/projects/gnqs/

The documentation seems to be non-existent. That's not a good start. I
might have a look at it, and try to install it. Feel free to give
further details :)

> For semaphore-based job control, the most workable example
> of a function suite is:
>
> semaphore.ksh: "programmable semaphores in shell code"
> http://www.unixreview.com/documents/s=9303/sam0408f/0408f.htm
> http://www.samag.com/code/

These links aren't working at the moment.

> In lieu of a batch manager, you could try the currently
> popular GNU parallel(1) which has had some discussion
> at C.U.S. of late:

C.U.S? http://en.wikipedia.org/wiki/CUS

Anyway, I'll take a look at GNU parallel.

Thanks for your input.

Best regards,
Kenneth

Kenneth Brun Nielsen

Oct 5, 2010, 8:44:37 PM

Thanks for your very inspiring post and video. GNU Parallel looks like
a nice and usable tool. However, it looks as if I will need something
extra in order to change the number of concurrent jobs on the fly.

BTW, I really like the simple jobqueue :)

Best regards,
Kenneth

Janis Papanagnou

Oct 6, 2010, 7:20:10 AM
On 06.10.2010 02:19, Kenneth Brun Nielsen wrote:
> On 5 Okt., 03:20, bsh<brian_hi...@rocketmail.com> wrote:
> [...]

>> In lieu of a batch manager, you could try the currently
>> popular GNU parallel(1) which has had some discussion
>> at C.U.S. of late:
>
> C.U.S? http://en.wikipedia.org/wiki/CUS

comp.unix.shell - this newsgroup.

Janis

> [...]

Icarus Sparry

Oct 8, 2010, 9:53:58 AM
On Sun, 03 Oct 2010 21:16:10 -0700, Kenneth Brun Nielsen wrote:

> I have created a script, that runs a number of simulations - one after
> another.
>
> In pseudo code som:
> for sim in simulations
> do
> runSimulation $sim
> done
>
> Now I want to improve it, so it is able to run the simulations in
> parallel. Once the script is started, the user might want to change the
> number (0-4) of concurrent simulations by editing the number in a file.

As I said elsewhere in this thread, get a modern version of ksh93 and
change your pseudocode as follows:

#!/bin/ksh93
echo 2 > limitfile
for sim in simulations
do
runSimulation $sim &
JOBMAX=$(< limitfile)
done

Line 1: Make sure it is run by ksh93.
Line 2: Put the number of simulations you want to run into "limitfile"
        (choose some other name if you want).
Line 3: Unchanged.
Line 4: Unchanged.
Line 5: Run the simulation in the background. It is conventional to
        indent loops, but not needed.
Line 6: Read a value for JOBMAX (the name does matter) from limitfile.
Line 7: Unchanged.

You can change the limit at any time just by putting a new value into
limitfile; the new limit takes effect just after the next simulation
is launched. This takes care of your 1 to 4 concurrent simulations.

You cannot take the number down to zero, but you can always use
"kill -STOP" to suspend the processes.

ksh93 is open source; you can get it from kornshell.com, which takes you
to http://www.research.att.com/sw/download
You want a version from 2009 or later. "echo ${.sh.version}" should give
you the version if it is recent enough, and JOBMAX should already be a
defined variable.

Kenneth Brun Nielsen

Oct 8, 2010, 6:46:08 PM
On Oct 8, 3:53 pm, Icarus Sparry <i.sparry...@gmail.com> wrote:

> You can change the number of things by just putting a new value into
> limitfile, and the new limit will take effect as just after the next one
> is launched. This takes care of your 1 to 4 concurrent simulations

Arh. From your earlier post, I had the (wrong) impression that JOBMAX
could be set only once. If it can be changed on the fly, then it
sounds like a proper solution. I will dig further into ksh93.

Thanks :)

/Kenneth

Aleksey Cheusov

Oct 10, 2010, 5:54:21 AM
> C.U.S? http://en.wikipedia.org/wiki/CUS

> Anyway, I'll take a look at GNU parallel.

Have a look at http://sf.net/projects/paexec/.
It provides more features than GNU parallel.

Right now I'm working on a new version of it.

--
Best regards, Aleksey Cheusov.
