[slurm-users] Array jobs vs. many jobs

Ryan Novosielski

Nov 22, 2019, 3:19:22 PM
to slurm...@lists.schedmd.com
Hi there,

Quick question that I'm not sure how to find the answer to otherwise: do array jobs have less impact on the scheduler in any way than a whole long list of jobs run the more traditional way? Less startup overhead, anything like that?

Thanks!

(we run 17.11 on CentOS 7, but I'm not sure it makes any difference here)

--
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novo...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'

Ree, Jan-Albert van

Nov 22, 2019, 4:40:37 PM
to slurm...@lists.schedmd.com


Jan-Albert van Ree | Linux System Administrator | Digital Services
MARIN | T +31 317 49 35 48 | mailto:J.A....@marin.nl | http://www.marin.nl

It helps a lot indeed; we run arrays of 100k elements and more. If you submit 100k separate jobs, the scheduler will definitely grind to a halt.
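
As a rough illustration of the difference in submission load, the Python sketch below only builds the command lines and does not run them. The script name work.sh and both helper functions are hypothetical; `--array` is sbatch's job array option, and a site's MaxArraySize setting in slurm.conf has to allow an array this large.

```python
def array_submission(script, n_tasks):
    """Build the single sbatch call that covers task IDs 0..n_tasks-1."""
    return [["sbatch", f"--array=0-{n_tasks - 1}", script]]


def separate_submissions(script, n_tasks):
    """Build one sbatch call per task, passing the index as a script argument."""
    return [["sbatch", script, str(i)] for i in range(n_tasks)]


# "work.sh" is a hypothetical batch script name used only for illustration.
array_cmds = array_submission("work.sh", 100_000)
separate_cmds = separate_submissions("work.sh", 100_000)

# One submission request for the array versus 100,000 for the plain loop.
print(len(array_cmds), len(separate_cmds))
```

Inside an array task, the script would read its index from the SLURM_ARRAY_TASK_ID environment variable instead of from a positional argument.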

Regards,
--
Jan-Albert



________________________________________
From: slurm-users <slurm-use...@lists.schedmd.com> on behalf of Ryan Novosielski <novo...@rutgers.edu>
Sent: Friday, November 22, 2019 21:18
To: slurm...@lists.schedmd.com
Subject: [slurm-users] Array jobs vs. many jobs

Christopher Samuel

Nov 22, 2019, 7:23:36 PM
to slurm...@lists.schedmd.com
Hi Ryan,

On 11/22/19 12:18 PM, Ryan Novosielski wrote:

> Quick question that I'm not sure how to find the answer to otherwise: do array jobs have less impact on the scheduler in any way than a whole long list of jobs run the more traditional way? Less startup overhead, anything like that?

Slurm will represent the whole job array as a single entity until it
needs to create elements for scheduling purposes (ageing if you limit
the number of jobs that can accrue time, or just starting them up).

So if you have a 10,000 element job array it uses the same amount of
memory as 1 job until things start to happen to it.

It's a big win if you've got a workload that can take advantage of it.

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA

Mark Hahn

Nov 24, 2019, 11:28:50 AM
to Slurm User Community List
> Quick question that I'm not sure how to find the answer to otherwise: do
> array jobs have less impact on the scheduler in any way than a whole long
> list of jobs run the more traditional way? Less startup overhead, anything
> like that?

kinda sorta.
I think of array jobs as a bit like Python generators: when you ask
it for more items, they get generated dynamically.

when you submit an array job, it is a single entity in the pending queue.
when the scheduler evaluates it, it says "why yes, here's something to run!" -
effectively generating a job. the trick is: the array job remains
pending until it has generated (forked/budded?) all the array elements.

each array element is a perfectly normal job - has its own jobid,
consumes exactly as many resources and overheads as any other job.
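
the generator analogy can be made literal with a toy Python sketch. this is only a model of the behaviour described above, not Slurm's implementation: the function name, the record fields, and the job ID numbering are all made up for illustration.

```python
from itertools import count, islice


def array_elements(array_job_id, task_ids, job_id_source):
    """Toy model: yield one ordinary-looking job record per array element.

    Nothing is created until the caller (standing in for the scheduler)
    asks for the next element.
    """
    for task_id in task_ids:
        yield {
            "job_id": next(job_id_source),  # hypothetical ID allocation
            "array_job_id": array_job_id,
            "array_task_id": task_id,
        }


job_ids = count(1001)  # made-up job ID counter
pending = array_elements(1000, range(10_000), job_ids)

# The "scheduler" starts three elements; the other 9,997 do not exist yet.
started = list(islice(pending, 3))
for job in started:
    print(job["job_id"], job["array_task_id"])
```

the single `pending` object plays the role of the one entry in the pending queue, and each record it yields stands for an element that has become a normal job.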

an array job prevents congestion of the pending queue with lots of jobs
that stand no chance of starting because resources won't be
available for a long time.

so it depends very much on how bursty your workload is: whether you have
a huge backlog sometimes. (of course, a huge backlog can't occur unless
the jobs are short. that is, a huge backlog implies short jobs - and high
throughput, or else your system would get stuck with ridiculous wait times.)

I do not think array jobs are a good excuse to run short jobs,
simply because all short jobs have high (relative) running overhead.
array jobs do nothing more than minimize the scheduler's effort
in sorting the pending queue. to some extent, they let you view
a set of jobs as a unit, but you can also organize sets of jobs
via jobname.
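
a minimal sketch of the jobname approach, again only building command lines rather than running them. the helper names, the script name work.sh, and the job name sweep42 are hypothetical; `--job-name` (sbatch) and `--name` (squeue) are the relevant options.

```python
def submit_named(script, job_name, index):
    """Build an sbatch call that tags a separately submitted job with a shared name."""
    return ["sbatch", f"--job-name={job_name}", script, str(index)]


def list_by_name(job_name):
    """Build the squeue call that lists every job carrying that name."""
    return ["squeue", f"--name={job_name}"]


# "work.sh" and "sweep42" are placeholder names for illustration.
submit_cmds = [submit_named("work.sh", "sweep42", i) for i in range(3)]
query_cmd = list_by_name("sweep42")

print(submit_cmds[0])
print(query_cmd)
```

this groups the jobs for viewing, but each one is still a separate entry in the pending queue.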

regards, mark hahn
--
operator may differ from spokesperson. ha...@mcmaster.ca
