[slurm-users] Determine usage for a QOS?


Christopher Samuel

Aug 19, 2018, 9:27:31 PM
to slurm...@lists.schedmd.com
Hi folks,

After an extended hiatus (I forgot to resubscribe after going away for a
few weeks) I'm back.. ;-)

We are using QOS's for projects which have been granted a fixed amount
of time for higher priority work. This works nicely, but we have just
been asked the obvious question "how much time do we have left?".

The QOS's are set up with:

sacctmgr create qos astac_oz045 priority=10000 flags=NoDecay GrpTRESMins=cpu=15000000

so once they hit that GrpTRESMins limit they should stop being able
to run with the priority boost (as NoDecay stops the usage accrued
being decayed as it is for fair-share).

It seems that neither sreport nor "sacctmgr list qos" has a way of
reporting the overall usage against a QOS (with sacctmgr you can set
the rawusage to 0, but you can't actually see what it is).

Anyone found a way to do this?

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

Paul Edmon

Aug 19, 2018, 9:37:01 PM
to slurm...@lists.schedmd.com
I don't really have enough experience with QoS's to give a slicker
method but you could use squeue --qos to poll the QoS and then write a
wrapper to do the summarization.  It's hacky but it should work.

-Paul Edmon-

Christopher Samuel

Aug 19, 2018, 9:39:39 PM
to slurm...@lists.schedmd.com
Hi Paul,

On 20/08/18 11:36, Paul Edmon wrote:

> I don't really have enough experience with QoS's to give a slicker
> method but you could use squeue --qos to poll the QoS and then write a
> wrapper to do the summarization.  It's hacky but it should work.

I was thinking sacct -q ${QOS} to pull info out of the DB, but as
Slurm will be keeping this info locally to determine whether new
jobs can use the QOS I wondered if there was a less heavy-handed
way to get it.
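
A minimal sketch of that sacct approach, assuming the QOS limit is CPU-only and that the chosen start time covers the whole life of the QOS (by default sacct only looks back to midnight today). The function name sum_cpu_minutes is made up for illustration; it just totals CPUTimeRAW values (CPU-seconds, one per line) and prints CPU-minutes:

```shell
# Hypothetical helper: sum CPUTimeRAW values read from stdin (one per line,
# in CPU-seconds) and print the total in whole CPU-minutes.
# Intended input is something like:
#   sacct -a -X -n -P -q "$QOS" -S "$START" -o CPUTimeRAW
# where -X keeps one record per job allocation so steps are not double counted.
sum_cpu_minutes() {
    awk '/^[0-9]+$/ { total += $1 }
         END { printf "%.0f\n", total / 60 }'
}
```

Piping the sacct command from the comment into sum_cpu_minutes gives a figure to compare against the GrpTRESMins value. It will not reflect any manual reset of the QOS raw usage, so treat it as an approximation.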

I might dig into the code before opening a support request.

cheers!

Kilian Cavalotti

Aug 20, 2018, 12:29:49 PM
to Slurm User Community List
Hi Chris,

On Sun, Aug 19, 2018 at 6:26 PM, Christopher Samuel <ch...@csamuel.org> wrote:
> We are using QOS's for projects which have been granted a fixed amount
> of time for higher priority work. This works nicely, but we have just
> been asked the obvious question "how much time do we have left?".

I _think_ that "scontrol show assoc_mgr" could get you close. We're
not using TRESMins with our QOSes, so it's just a hunch, but I would
look there; it's the closest thing I can think of to a representation
of the various counters and limits the controller keeps in memory.

Cheers,
--
Kilian

Chris Samuel

Aug 20, 2018, 6:13:04 PM
to slurm...@lists.schedmd.com
On Tuesday, 21 August 2018 2:28:27 AM AEST Kilian Cavalotti wrote:

> I _think_ that "scontrol show assoc_mgr" could get you close. We're
> not using TRESMins with our QOSes, so it's just a hunch, but I would
> look there; it's the closest thing I can think of to a representation
> of the various counters and limits the controller keeps in memory.

Awesome, thanks Kilian!

$ scontrol show assoc_mgr QOS=astac_oz045 | fgrep UsageRaw=
UsageRaw=18641632.000000

Looking promising...
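
To turn that number into "time left", something along these lines might do. This is only a sketch: it ASSUMES UsageRaw is reported in CPU-seconds (worth verifying on your Slurm version, and billing weights may change its meaning), and the function name qos_minutes_left is invented here. The limit is passed in by hand from the QOS definition:

```shell
# Hypothetical helper: print the CPU-minutes remaining under a CPU limit.
#   $1    : the GrpTRESMins cpu limit, in CPU-minutes
#   stdin : text containing a "UsageRaw=<number>" field, e.g. the output
#           of the scontrol show assoc_mgr command above
# ASSUMPTION: UsageRaw is in CPU-seconds, so it is divided by 60.
# Returns non-zero if no UsageRaw field is found.
qos_minutes_left() {
    awk -v limit="$1" '
        match($0, /UsageRaw=[0-9.]+/) {
            # Skip the 9 characters of "UsageRaw=" to get the number.
            usage = substr($0, RSTART + 9, RLENGTH - 9)
            printf "%.0f\n", limit - usage / 60
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}
```

For example: scontrol show assoc_mgr QOS=astac_oz045 | qos_minutes_left 15000000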

Skouson, Gary

Aug 22, 2018, 9:17:43 AM
to Slurm User Community List
A while ago, I thought a patch was made to sshare to show raw tres usage.

Something like

sshare -o account,user,GrpTRESRaw

At the time I used this, I was only concerned with account usage, so I didn't look to see if sshare would work on the QOS level.

I'm not sure that "feature" was in the man page last time I looked.

-----
Gary Skouson
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

Bjørn-Helge Mevik

Aug 23, 2018, 4:01:12 AM
to slurm...@schedmd.com
"Skouson, Gary" <gb...@psu.edu> writes:

> sshare -o account,user,GrpTRESRaw

[...]

> I'm not sure that "feature" was in the man page last time I looked.

It isn't; neither in 17.02.7 nor in 17.11.9-2. Makes me wonder what other
features are undocumented... :)

--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo