[slurm-users] Is it possible to set a default QOS per partition?


Stack Korora

Mar 1, 2021, 4:25:04 PM
to slurm...@lists.schedmd.com
Greetings,

We have different node classes set up in different partitions. For
example, our standard compute nodes are in a compute partition; our GPU
nodes are in a gpu partition; and jobs that need to run for months go
into a long partition with a different set of machines.

For each partition, we have a QOS to prevent any single user from
dominating the resources (set at a maximum of 60% of the partition's
resources; not my call - it's politics - I'm not going down that rabbit
hole...).

Thus, I've got something like this in my slurm.conf (abbreviating to
save space; sorry if I trim too much).

PartitionName=compute [snip] AllowQOS=compute Default=YES
PartitionName=gpu [snip] AllowQOS=gpu Default=NO
PartitionName=long [snip] AllowQOS=long Default=NO

Then I have my QOSes configured, and `sacctmgr dump cluster | grep
DefaultQOS` shows "DefaultQOS=compute".
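
For context, each of those QOSes just caps a single user's share with
MaxTRESPerUser, roughly along these lines (the numbers here are
illustrative, not our real limits):

$ sacctmgr modify qos compute set MaxTRESPerUser=cpu=600   # numbers illustrative
$ sacctmgr modify qos gpu set MaxTRESPerUser=cpu=60,gres/gpu=12
$ sacctmgr modify qos long set MaxTRESPerUser=cpu=120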

All of that works exactly as expected.

This makes it easy/nice for my users to just do something like:
$ sbatch -n1 -N1 -p compute script.sh

They don't have to specify the QOS for compute and they like this.

However, for the other partitions they have to do something like this:
$ sbatch -n1 -N1 -p long --qos=long script.sh

The users don't like this. (though with scripts, I don't see the big
deal in just adding a new line...but you know... users...)

The request from the users is to set a default QOS for each partition
so that they don't need to specify the QOS for the other partitions.

Because the default is set in the cluster configuration, I'm not sure
how to do this. And I'm not seeing anything in the documentation for a
scenario like this.

Question A:
Does anyone know how I can set a default QOS per partition?

Question B:
Chesterton's fence and all... Is there a better way to accomplish what
we are attempting to do? I don't want a single QOS limiting across all
partitions. I need a per-partition limit that restricts each user to
60% of the resources in that partition.

Thank you!
~Stack~

Prentice Bisbal

Mar 1, 2021, 5:27:46 PM
to slurm...@lists.schedmd.com
Two things:

1. So your users are okay with specifying a partition, but specifying a
QOS is a bridge too far?

2. Have your job_submit.lua script filter the jobs into the correct QOS.
You can check the partition and set the QOS accordingly.

First, you need to have this set in your slurm.conf:

JobSubmitPlugins=job_submit/lua

But I'm pretty sure that's the default setting.

Since it looks like your partitions and corresponding QOSes have the
same names, you can just add this line to the slurm_job_submit function
body in your job_submit.lua script:

  job_desc.qos = job_desc.partition

And voila! Problem solved.
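
A minimal sketch of the whole file, assuming the QOS names really do
match the partition names and that you only want to fill in a QOS when
the user hasn't asked for one:

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- default the QOS to the QOS named after the requested partition
    if job_desc.qos == nil and job_desc.partition ~= nil then
        job_desc.qos = job_desc.partition
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end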

After editing job_submit.lua, you'll need to restart slurmctld for the
changes to take effect. Also, it's a good idea to 'tail -f'
slurmctld.log while restarting - any syntax errors will be printed
there, and if there are any errors in that file, slurmctld won't start.
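
For example (assuming systemd; adjust the log path to wherever your
SlurmctldLogFile points):

tail -f /var/log/slurmctld.log &   # site-specific log path
systemctl restart slurmctld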

--
Prentice
--
Prentice Bisbal
Lead Software Engineer
Research Computing
Princeton Plasma Physics Laboratory
http://www.pppl.gov


Stack Korora

Mar 1, 2021, 7:55:40 PM
to slurm...@lists.schedmd.com
On 3/1/21 4:26 PM, Prentice Bisbal wrote:
> Two things:
>
> 1. So your users are okay with specifying a partition, but specifying
> a QOS is a bridge too far?

*sigh* Yeah. It's been requested several times. I can't defend it, but
if it makes them happy...then they will find something else to complain
about. (kidding... mostly :-D )

>
> 2. Have your job_submit.lua script filter the jobs into the correct
> QOS. You can check the partition and set the QOS accordingly.
>
> First, you need to have this set in your slurm.conf:
>
> JobSubmitPlugins=job_submit/lua
>
> But I'm pretty sure that's the default setting.
>
> Since it looks like your partitions and corresponding QOSes have the
> same names, you can just add this line to the slurm_job_submit
> function body in your job_submit.lua script:
>
>   job_desc.qos = job_desc.partition
>
> And voila! Problem solved.
>
> After editing job_submit.lua, you'll need to restart slurmctld for the
> changes to take effect. Also, it's a good idea to 'tail -f'
> slurmctld.log while restarting - any syntax errors will be printed
> there, and if there are any errors in that file, slurmctld won't start.

Thank you! A great idea. I will give this a try.

~Stack~


Loris Bennett

Mar 2, 2021, 2:12:03 AM
to Slurm User Community List
I have often come across the idea of assigning a QOS automatically on
the basis of other job parameters, but have never really understood the
use case. We use QOS to allow priority to be increased for a limited
amount of resources:

$ sqos
      Name   Priority     MaxWall MaxJobs MaxSubmit     MaxTRESPU
---------- ---------- ----------- ------- --------- -------------
    hiprio     100000    03:00:00      50       100       cpu=160
      prio       1000  3-00:00:00     500      1000       cpu=480
  standard          0 14-00:00:00    3000     10000       cpu=960

and use job_submit.lua to force the users to select a QOS.
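
The forcing part is only a few lines in slurm_job_submit, something
along these lines (simplified sketch):

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- reject jobs that don't request a QOS explicitly
    if job_desc.qos == nil then
        slurm.log_user("Please select a QOS with --qos")
        return slurm.ERROR
    end
    return slurm.SUCCESS
end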

Can anyone explain the idea behind the automatic QOS assignment
approach to me?

Cheers,

Loris

--
Dr. Loris Bennett (Hr./Mr.)
ZEDAT, Freie Universität Berlin Email loris....@fu-berlin.de

Sean Crosby

Mar 2, 2021, 2:55:12 AM
to Slurm User Community List
I would have thought a partition QoS is the way to do this. We add a
partition QoS to our partition definitions and implement quotas on
usage as well.

PartitionName=physical Nodes=... Default=YES MaxTime=30-0 DefaultTime=0:10:0 State=DOWN QoS=physical TRESBillingWeights=CPU=1.0,Mem=4.0G

We then define the QoS "physical":

# sacctmgr show qos physical -p
Name|Priority|GraceTime|Preempt|PreemptExemptTime|PreemptMode|Flags|UsageThres|UsageFactor|GrpTRES|GrpTRESMins|GrpTRESRunMins|GrpJobs|GrpSubmit|GrpWall|MaxTRES|MaxTRESPerNode|MaxTRESMins|MaxWall|MaxTRESPU|MaxJobsPU|MaxSubmitPU|MaxTRESPA|MaxJobsPA|MaxSubmitPA|MinTRES|
physical|0|00:00:00|||cluster|||1.000000|||||||||||cpu=750,mem=9585888M|||cpu=750,mem=9585888M||||

We implement the quotas using MaxTRESPerUser and MaxTRESPerAccount.
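
For example, the per-user and per-account limits in the dump above were
set with something like:

# sacctmgr modify qos physical set MaxTRESPerUser=cpu=750,mem=9585888M
# sacctmgr modify qos physical set MaxTRESPerAccount=cpu=750,mem=9585888M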

It works really well for us. If you need to override it for a particular group, you can create another QoS, set the OverPartQOS flag, and get the users to specify that QoS.
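
Something like this (the QoS and account names here are made up):

# sacctmgr add qos physical-override                     # hypothetical QoS name
# sacctmgr modify qos physical-override set Flags=OverPartQOS MaxTRESPerUser=cpu=1500
# sacctmgr modify account somegroup set qos+=physical-override   # hypothetical account

and those users then submit with --qos=physical-override.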

Sean

--
Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia


