[slurm-users] GrpTRESMins and GrpTRESRaw usage


gerar...@cines.fr

Jun 22, 2022, 12:17:15 PM
to Slurm-users, Gérard Gil
Hello,

I am using Slurm 19.05 and I am trying to figure out how the cpu GrpTRESRaw value is calculated for a job.
I would like to use GrpTRESMins to limit a project to an allotted number of hours.

Assuming the limitation process works as defined in the documentation, my tests show some strange results for the cpu TRESRaw value calculated for a job.

I thought a job's cpu TRESRaw = number of reserved cores × walltime (minutes)

Does anyone use this "feature" who can help me determine whether I am facing a "bug" or whether the problem comes from my local Slurm configuration?


Thanks

Gerard


gerar...@cines.fr

Jun 22, 2022, 12:35:45 PM
to Slurm-users, Gérard Gil
Hi,

I have found some new strange behaviour.

I'm using the sshare command to get the current values of GrpTRESRaw and GrpTRESMins.

toto@login1:~/TEST$ sshare -A myproject -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins
             Account       User                                                                       GrpTRESRaw                    GrpTRESMins
-------------------- ----------                            ----------------------------------------------------- ------------------------------
myproject                        cpu=15805,mem=29646462,energy=0,node=493,billing=15805,fs/disk=0,vmem=0,pages=0                      cpu=17150


Half an hour later I ran the same command and got the following output:


toto@login1:~/TEST$ sshare -A myproject -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins

             Account       User                                                                       GrpTRESRaw                    GrpTRESMins
-------------------- ----------                            ----------------------------------------------------- ------------------------------
myproject                        cpu=15729,mem=29504131,energy=0,node=491,billing=15729,fs/disk=0,vmem=0,pages=0                      cpu=17150

TRESRaw's cpu is lower than before, although I am alone on the system and no other job has been submitted.
What is the explanation for this?

Thanks


Gérard



Bjørn-Helge Mevik

Jun 23, 2022, 3:13:11 AM
to slurm...@schedmd.com
<gerar...@cines.fr> writes:

> I thought job's cpu TRESRaw = nb of reserved core X walltime (mn)

It is the "TRES billing cost" x walltime. What the TRES billing cost of
a job is depends on how you've set up the TRESBillingWeights on the
partitions, and whether you've defined PriorityFlags=MAX_TRES or not.
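As a rough sketch of the arithmetic (with hypothetical weights and a hypothetical job, not actual Slurm source; the authoritative combination rule, including the finer points of MAX_TRES, is in the slurm.conf documentation):

```python
# Hedged sketch of how TRESBillingWeights might combine into a billing
# value. Without MAX_TRES the weighted TRES values are summed; with
# PriorityFlags=MAX_TRES (roughly) the largest single weighted value
# is used instead of the sum.

def billing_cost(tres, weights, max_tres=False):
    """Weighted billing cost of a job's TRES counts (illustrative only)."""
    weighted = [tres.get(name, 0) * w for name, w in weights.items()]
    return max(weighted, default=0.0) if max_tres else sum(weighted)

# Hypothetical job: 8 CPUs and 16 GB of memory.
job = {"cpu": 8, "mem_gb": 16}
weights = {"cpu": 1.0, "mem_gb": 0.25}  # e.g. TRESBillingWeights="CPU=1.0,Mem=0.25G"

print(billing_cost(job, weights))                 # 12.0 (sum: 8*1.0 + 16*0.25)
print(billing_cost(job, weights, max_tres=True))  # 8.0 (largest single term)
# A job's cpu TRESRaw then accumulates roughly as billing cost * walltime minutes.
```

With a TRESBillingWeights setting that counts only CPUs, the billing cost collapses to the number of reserved cores, which is what the original formula assumed.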

--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo


Bjørn-Helge Mevik

Jun 23, 2022, 3:19:13 AM
to slurm...@schedmd.com
<gerar...@cines.fr> writes:

> TRESRaw cpu is lower than before as I'm alone on the system an no other job was submitted.
> Any explanation of this ?

I'd guess you have turned on FairShare priorities. Unfortunately, in
Slurm the same internal variables are used for fairshare calculations as
for GrpTRESMins (and similar), so when fair share priorities are in use,
slurm will reduce accumulated GrpTRESMins over time. This means that it
is impossible(*) to use GrpTRESMins limits and fairshare
priorities at the same time.

(*) It is possible to tell Slurm *not* to reduce the accumulated
TRESMins of a QoS, so you can technically use GrpTRESMins limits on a
QoS, and fair share priorities on the accounts and/or users.

--
B/H

Ole Holm Nielsen

Jun 23, 2022, 3:42:10 AM
to slurm...@lists.schedmd.com
Hi Bjørn-Helge,

On 6/23/22 09:18, Bjørn-Helge Mevik wrote:
> <gerar...@cines.fr> writes:
>
>> TRESRaw cpu is lower than before as I'm alone on the system an no other job was submitted.
>> Any explanation of this ?
>
> I'd guess you have turned on FairShare priorities. Unfortunately, in
> Slurm the same internal variables are used for fairshare calculations as
> for GrpTRESMins (and similar), so when fair share priorities are in use,
> slurm will reduce accumulated GrpTRESMins over time. This means that it
> is impossible(*) to use GrpTRESMins limits and fairshare
> priorities at the same time.

This is a surprising observation! We use a 14-day HalfLife in slurm.conf:
PriorityDecayHalfLife=14-0

Since our longest running jobs can run only 7 days, maybe our limits never
get reduced as you describe?

The slurm.conf man-page says that PriorityDecayHalfLife affects hard time
limits per association:

> PriorityDecayHalfLife
> This controls how long prior resource use is considered in
> determining how over- or under-serviced an association is (user,
> bank account and cluster) in determining job priority. The
> record of usage will be decayed over time, with half of the
> original value cleared at age PriorityDecayHalfLife. If set to
> 0 no decay will be applied. This is helpful if you want to
> enforce hard time limits per association. If set to 0,
> PriorityUsageResetPeriod must be set to some interval. Applicable
> only if PriorityType=priority/multifactor. The unit is a time
> string (i.e. min, hr:min:00, days-hr:min:00, or days-hr). The
> default value is 7-0 (7 days).

Is this what explains your statement?

BTW, I've written a handy script for displaying user limits in a readable
format:
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/showuserlimits

/Ole

Bjørn-Helge Mevik

Jun 23, 2022, 6:40:07 AM
to slurm...@schedmd.com
Ole Holm Nielsen <Ole.H....@fysik.dtu.dk> writes:

> Hi Bjørn-Helge,

Hello, Ole! :)

> On 6/23/22 09:18, Bjørn-Helge Mevik wrote:
>
>> Slurm the same internal variables are used for fairshare calculations as
>> for GrpTRESMins (and similar), so when fair share priorities are in use,
>> slurm will reduce accumulated GrpTRESMins over time. This means that it
>> is impossible(*) to use GrpTRESMins limits and fairshare
>> priorities at the same time.
>
> This is a surprising observation!

I discovered it quite a few years ago, when we wanted to use Slurm to
enforce cpu hour quota limits (instead of using Maui+Gold). Can't
remember anymore if I was surprised or just sad. :D

> We use a 14 days HalfLife in slurm.conf:
> PriorityDecayHalfLife=14-0
>
> Since our longest running jobs can run only 7 days, maybe our limits
> never get reduced as you describe?

The accumulated usage is reduced every 5 minutes (by default; see
PriorityCalcPeriod). The reduction is done by multiplying the
accumulated usage by a number slightly less than 1. The number is
chosen so that the accumulated usage is reduced to 50 % after
PriorityDecayHalfLife (given that you don't run anything more in
between, of course). With a halflife of 14 days and the default calc
period, that number is very close to 1 (0.9998281 if my calculations are
correct :).
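The arithmetic is easy to check. A minimal sketch, assuming the multiplier r is chosen so that r^(half_life / calc_period) = 0.5:

```python
# Reproduce the per-PriorityCalcPeriod decay multiplier described above:
# every calc period the accumulated usage is multiplied by r, chosen so
# that after one PriorityDecayHalfLife it has dropped to 50%.
half_life_min = 14 * 24 * 60   # PriorityDecayHalfLife=14-0, in minutes
calc_period_min = 5            # default PriorityCalcPeriod, in minutes

r = 0.5 ** (calc_period_min / half_life_min)
print(round(r, 7))  # 0.9998281, matching the figure quoted above
```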

Note: I read all about these details on the schedmd web pages some years
ago. I cannot find them again (the parts about the multiplication with
a number smaller than 1 to get the half life), so I might be wrong on
some of the details.

> BTW, I've written a handy script for displaying user limits in a
> readable format:
> https://github.com/OleHolmNielsen/Slurm_tools/tree/master/showuserlimits

Nice!

--
B/H

gerar...@cines.fr

Jun 23, 2022, 9:58:24 AM
to Slurm-users, slurm...@schedmd.com
Hi Ole and B/H,

Thanks for your answers.


You're right, B/H, and as I tuned the TRESBillingWeights option to count only CPUs, in my case: number of reserved cores = "TRES billing cost".

You're right again: I had forgotten the PriorityDecayHalfLife parameter, which is also used by the fairshare multifactor priority.
We use multifactor priority to manage the priority of jobs in the queue, and we set the values of PriorityDecayHalfLife and PriorityUsageResetPeriod according to those needs.
So PriorityDecayHalfLife will decay GrpTRESRaw, and GrpTRESMins can't be used as we want.

Setting the NoDecay flag on a QOS could be an option, but I suppose it also impacts the fairshare multifactor priority of all jobs using that QOS.

This means I have no way to limit a project as we want, unless SchedMD changes this behavior or adds a new feature.

Thanks a lot.

Regards,
Gérard



Miguel Oliveira

Jun 23, 2022, 12:42:58 PM
to Slurm User Community List, slurm...@schedmd.com
Hi Gérard,

It is not exactly true that you have no solution for limiting projects. If you implement each project as an account, then you can create an account QoS with the NoDecay flag.
This will not affect associations, so priority and fair share are not impacted.

The way we do it is to create a qos:

sacctmgr -i --quiet create qos "{{ item.account }}" set flags=DenyOnLimit,NoDecay GrpTRESMin=cpu=600

And then use this qos when the account (project) is created:

sacctmgr -i --quiet add account "{{ item.account }}" Parent="{{ item.parent }}" QOS="{{ item.account }}" Fairshare=1 Description="{{ item.description }}”

We even have a slurm bank implementation to go along with this technique, and it has not failed us too much yet! :)

Hope that helps,

Miguel Afonso Oliveira

Christopher Benjamin Coffey

Jun 23, 2022, 12:58:53 PM
to slurm-users
Hi Miguel,

This is intriguing, as I didn't know about this possibility of dealing with fairshare and a limited-priority-minutes QOS at the same time. How can you verify how many minutes of a QOS set up with GrpTRESMins have been used? Is that possible? Thanks.

Best,
Chris

--
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167




Miguel Oliveira

Jun 23, 2022, 1:20:01 PM
to Slurm User Community List
Hi Chris,

We use a Python wrapper to do this, but the basic command to retrieve account minutes is:

'scontrol -o show assoc_mgr | grep "^QOS='+account+'"'

You then have to parse the output for "GrpTRESMins=". The output will contain two numbers: the first is the limit, or N for no limit, while the second, in parentheses, is the amount consumed.
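A minimal sketch of that parsing, run against a hypothetical, hand-made line shaped like the assoc_mgr output (the field names are real; the values are invented):

```python
import re

# Hypothetical one-line excerpt shaped like `scontrol -o show assoc_mgr`
# output for a QOS: each TRES reads limit(consumed), with N = "no limit".
line = "QOS=myproject(5) UsageRaw=36000.000000 GrpTRESMins=cpu=600(123),mem=N(456) GrpWall=N(0)"

def grp_tres_mins(line):
    """Return {tres: (limit_or_None, consumed)} parsed from GrpTRESMins=..."""
    field = re.search(r"GrpTRESMins=(\S+)", line).group(1)
    result = {}
    for item in field.split(","):
        tres, rest = item.split("=", 1)
        limit, used = re.fullmatch(r"([^(]+)\((\d+)\)", rest).groups()
        result[tres] = (None if limit == "N" else int(limit), int(used))
    return result

print(grp_tres_mins(line))  # {'cpu': (600, 123), 'mem': (None, 456)}
```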

You can also report by user with:

'sreport -t minutes -T cpu,gres/gpu -nP cluster AccountUtilizationByUser start='+date_start+' end='+date_end+' account='+account+' format=login,used'

If you are willing to accept some rounding errors!

With slight variations, and some oddities, this can also be used to limit GPU utilisation, as it is in our case, as you can deduce from the previous command.

Best,

Miguel Afonso Oliveira

Christopher Benjamin Coffey

Jun 23, 2022, 2:27:50 PM
to slurm-users
Awesome, thanks, I didn't know about that "scontrol -o show assoc_mgr" command! Thanks guys!

Best,
Chris

--
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167



Ole Holm Nielsen

Jun 23, 2022, 3:48:17 PM
to slurm...@lists.schedmd.com
On 23-06-2022 19:19, Miguel Oliveira wrote:
> We use a python wrapper to do this but the basic command to retrieved
> account minutes is:
>
> 'scontrol -o show assoc_mgr | grep "^QOS='+account+’"'
>
> You then have to parse the output for "GrpTRESMins=“. The output will be
> two numbers. The first is the limit, or N for no limit, while the next
> one in parenthesis is the consumed.

You may perhaps find it easier to use my showuserlimits script from
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/showuserlimits

/Ole

Miguel Oliveira

Jun 23, 2022, 7:41:38 PM
to Ole Holm Nielsen, Slurm User Community List
Hi Ole,

Your script is a nice piece of work, but I think it misses the point, if I read it correctly!
You read limits with:

scontrol -o show assoc_mgr users=$username $selectedaccount flags=Assoc

In our case the limits are declared in a QoS and not in the association and hence will not be picked up and shown in the output.
The purpose of having them in a QoS, which you can read with:

scontrol -o show assoc_mgr users=$username $selectedaccount flags=QoS

is to apply the NoDecay flag and hence not influence fair share and priorities.

We have an sbank tool that outputs balances and statements to users:

[root@slurmdb ~]# sbank balance 

-------------------------------------------------------------------------------------------
|             |                   CPU (hours)       |                   GPU (hours)       |
-------------------------------------------------------------------------------------------
|     Account |       Limit       Usage   Available |       Limit       Usage   Available |
-------------------------------------------------------------------------------------------
|       staff |          ND      729881          ND |          ND          18          ND |
-------------------------------------------------------------------------------------------
[root@slurmdb ~]# sbank statement

-----------------------------------------------------------------------------------------------------------------------
|                             |                       CPU (hours)         |                       GPU (hours)         |
-----------------------------------------------------------------------------------------------------------------------
|      Username       Account |         Limit         Usage     Available |         Limit         Usage     Available |
-----------------------------------------------------------------------------------------------------------------------
|      bandrade         staff |            ND           182            ND |            ND            --            ND |
|     easybuild         staff |            ND         13210            ND |            ND            --            ND |
|     moliveira         staff |            ND          1875            ND |            ND            --            ND |
|      palberto         staff |            ND        720538            ND |            ND            --            ND |
|          root         staff |            ND       7949386            ND |            ND            --            ND |
|    ----------         staff |            ND        729881            ND |            ND            18            ND |
-----------------------------------------------------------------------------------------------------------------------

Best Regards,

MAO

gerar...@cines.fr

Jun 24, 2022, 2:57:45 AM
to Slurm-users
Hi Miguel,

It sounds good!

But does it mean you have to request this "NoDecay" QOS to benefit from the fairshare priority?

Does this also mean that, if all the QOSes we use are created with NoDecay, we can take advantage of the fairshare priority and of NoDecay for all jobs, in order to use the GrpTRESMins limit?

Thanks

Regards,
Gérard



Bjørn-Helge Mevik

Jun 24, 2022, 6:53:56 AM
to slurm...@schedmd.com
Miguel Oliveira <miguel....@uc.pt> writes:

> It is not exactly true that you have no solution to limit projects. If
> you implement each project as an account then you can create an
> account qos with the NoDecay flags.
> This will not affect associations so priority and fair share are not impacted.

Yes, that will work. But it has the drawback that you cannot use QoS'es
for *anything else*, like a QoS for development jobs or similar. So
either way it is a trade-off.

Miguel Oliveira

Jun 24, 2022, 7:10:14 AM
to Slurm User Community List, slurm...@schedmd.com
Hi Bjørn-Helge,

Long time!

Why not? You can have multiple QoSs and you have other techniques to change priorities according to your policies.

Best,

MAO

gerar...@cines.fr

Jun 24, 2022, 7:56:47 AM
to Slurm-users, slurm-users
Hi Miguel,

> Why not? You can have multiple QoSs and you have other techniques to change
> priorities according to your policies.

Does this answer my question?

"If all configured QOSes use NoDecay, can we take advantage of the fairshare priority (with decay) while all jobs' GrpTRESRaw is kept without decay?"

Thanks

Best,
Gérard


----- Original message -----
> From: "Miguel Oliveira" <miguel....@uc.pt>
> To: "Slurm-users" <slurm...@lists.schedmd.com>
> Cc: "slurm-users" <slurm...@schedmd.com>
> Sent: Friday, 24 June 2022, 13:09:47
> Subject: Re: [slurm-users] GrpTRESMins and GrpTRESRaw usage

Bjørn-Helge Mevik

Jun 24, 2022, 7:58:25 AM
to slurm...@schedmd.com
Miguel Oliveira <miguel....@uc.pt> writes:

> Hi Bjørn-Helge,
>
> Long time!

Hi Miguel! Yes, definitely a long time! :D

> Why not? You can have multiple QoSs and you have other techniques to change priorities according to your policies.

A job can only run in a single QoS, so if you submit a job with "sbatch
--qos=devel ..." it will no longer be running in the account QoS and
thus its usage will not be recorded in that QoS. If that is ok, then no
problem, but if you want all jobs of an account to be limited by the
TRESMins limit, then you cannot use other QoS'es than the account QoSes
(except for partition QoSes).

--
Bjørn-Helge

Miguel Oliveira

Jun 24, 2022, 8:07:39 AM
to Slurm User Community List, slurm-users
Hi Gérard,

I believe so. Each of our accounts corresponds to one project, and all have an associated QoS with NoDecay and DenyOnLimit. This is enough to restrict usage on each individual project.
You only need these flags on the QoS. The association will carry on as usual and fairshare will not be impacted.

Hope that helps,

Miguel Oliveira

Miguel Oliveira

Jun 24, 2022, 8:46:58 AM
to Slurm User Community List, slurm...@schedmd.com
Hi Bjørn-Helge,

> On 24 Jun 2022, at 12:58, Bjørn-Helge Mevik <b.h....@usit.uio.no> wrote:
>
> Miguel Oliveira <miguel....@uc.pt> writes:
>
>> Hi Bjørn-Helge,
>>
>> Long time!
>
> Hi Miguel! Yes, definitely a long time! :D

Indeed!

>
>> Why not? You can have multiple QoSs and you have other techniques to change priorities according to your policies.
>
> A job can only run in a single QoS, so if you submit a job with "sbatch
> --qos=devel ..." it will no longer be running in the account QoS and
> thus its usage will not be recorded in that QoS. If that is ok, then no
> problem, but if you want all jobs of an account to be limited by the
> TRESMins limit, then you cannot use other QoS'es than the account QoSes
> (except for partition QoSes).

Unfortunately my cluster is down (storage issues...) so I cannot test yet! The limit would certainly be imposed, as the documentation says:
"If limits are defined at multiple points in this hierarchy, the point in this list where the limit is first defined will be used" (as long as the job QoS does not define a different limit, as that one takes precedence).
You are very likely right that Slurm would not record the usage on the original account QoS in your scenario.
In that case you can give every project a separate development allocation to use! Your purpose was to limit project allocations anyway!

Even if that is not your thing, as I said originally, you have other techniques to change priorities or limits. In the case of development work, you could in principle define a development partition and enforce priorities and limits at that level.

All these are not really a replacement for proper allocation management, like Gold was, but they do the trick!

Best,

MAO



>
> --
> Bjørn-Helge

gerar...@cines.fr

Jun 24, 2022, 8:52:30 AM
to Slurm-users, slurm-users
Hi Miguel,

Good!

I'll try these options on all existing QOSes and see if everything works as expected.
I'll let you know the results.


Thanks a lot

Best,
Gérard


----- Original message -----
> From: "Miguel Oliveira" <miguel....@uc.pt>
> To: "Slurm-users" <slurm...@lists.schedmd.com>
> Cc: "slurm-users" <slurm...@schedmd.com>
> Sent: Friday, 24 June 2022, 14:07:16
> Subject: Re: [slurm-users] GrpTRESMins and GrpTRESRaw usage

gerar...@cines.fr

Jun 28, 2022, 3:59:41 AM
to Slurm-users
Hi Miguel,


I modified my test configuration to evaluate the effect of NoDecay.

I modified all the QOSes, adding the NoDecay flag.


toto@login1:~/TEST$ sacctmgr show QOS
      Name   Priority  GraceTime    Preempt   PreemptExemptTime PreemptMode                                    Flags UsageThres UsageFactor       GrpTRES   GrpTRESMins GrpTRESRunMin GrpJobs GrpSubmit     GrpWall       MaxTRES MaxTRESPerNode   MaxTRESMins     MaxWall     MaxTRESPU MaxJobsPU MaxSubmitPU     MaxTRESPA MaxJobsPA MaxSubmitPA       MinTRES
---------- ---------- ---------- ---------- ------------------- ----------- ---------------------------------------- ---------- ----------- ------------- ------------- ------------- ------- --------- ----------- ------------- -------------- ------------- ----------- ------------- --------- ----------- ------------- --------- ----------- -------------
    normal          0   00:00:00                                    cluster                                  NoDecay               1.000000                                                                                                                                                                                                                      
interactif         10   00:00:00                                    cluster                                  NoDecay               1.000000       node=50                                                                 node=22                               1-00:00:00       node=50                                                                        
     petit          4   00:00:00                                    cluster                                  NoDecay               1.000000     node=1500                                                                 node=22                               1-00:00:00      node=300                                                                        
      gros          6   00:00:00                                    cluster                                  NoDecay               1.000000     node=2106                                                                node=700                               1-00:00:00      node=700                                                                        
     court          8   00:00:00                                    cluster                                  NoDecay               1.000000     node=1100                                                                node=100                                 02:00:00      node=300                                                                        
      long          4   00:00:00                                    cluster                                  NoDecay               1.000000      node=500                                                                node=200                               5-00:00:00      node=200                                                                        
   special         10   00:00:00                                    cluster                                  NoDecay               1.000000     node=2106                                                               node=2106                               5-00:00:00     node=2106                                                                        
   support         10   00:00:00                                    cluster                                  NoDecay               1.000000     node=2106                                                                node=700                               1-00:00:00     node=2106                                                                        
      visu         10   00:00:00                                    cluster                                  NoDecay               1.000000        node=4                                                                node=700                                 06:00:00        node=4                      



I submitted a bunch of jobs to check the effect of NoDecay, and I noticed that RawUsage as well as GrpTRESRaw cpu are still decreasing.


toto@login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,RawUsage
             Account       User                                                                       GrpTRESRaw                    GrpTRESMins    RawUsage
-------------------- ----------                            ----------------------------------------------------- ------------------------------ -----------
dci                                cpu=6932,mem=12998963,energy=0,node=216,billing=6932,fs/disk=0,vmem=0,pages=0                      cpu=17150      415966
toto@login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,RawUsage
             Account       User                                                                       GrpTRESRaw                    GrpTRESMins    RawUsage
-------------------- ----------                            ----------------------------------------------------- ------------------------------ -----------
dci                                cpu=6931,mem=12995835,energy=0,node=216,billing=6931,fs/disk=0,vmem=0,pages=0                      cpu=17150      415866
toto@login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,RawUsage
             Account       User                                                                       GrpTRESRaw                    GrpTRESMins    RawUsage
-------------------- ----------                            ----------------------------------------------------- ------------------------------ -----------
dci                                cpu=6929,mem=12992708,energy=0,node=216,billing=6929,fs/disk=0,vmem=0,pages=0                      cpu=17150      415766


Is there something I forgot to do?


Best,
Gérard

Best regards,
Gérard Gil

Département Calcul Intensif

Centre Informatique National de l'Enseignement Superieur
950, rue de Saint Priest
34097 Montpellier CEDEX 5
FRANCE

tel :  (334) 67 14 14 14
fax : (334) 67 52 37 63
web : http://www.cines.fr


Miguel Oliveira

Jun 28, 2022, 11:24:07 AM
to Slurm User Community List
Hi Gérard,

The way you are checking is against the association, and as such it ought to be decreasing, so that it can be used appropriately by fair share.
The counter that does not decrease is on the QoS, not on the association. You can check that with:

'scontrol -o show assoc_mgr | grep "^QOS='+account+'"'

That ought to give you two numbers. The first is the limit, or N for no limit, and the second, in parentheses, is the usage.

Hope that helps.

Best,

Miguel Afonso Oliveira

gerar...@cines.fr

Jun 28, 2022, 1:31:28 PM
to Slurm-users
Hi Miguel,

OK, I didn't know this command.

I'm not sure I understand how it works with regard to my goal.
I used the following command, inspired by the one you gave me, and I obtain a UsageRaw for each QOS.

scontrol -o show assoc_mgr -accounts=myaccount Users=" "


Do I have to sum up all the QOS RawUsage values to obtain the RawUsage of myaccount with NoDecay?
If I set GrpTRESMins for an account and not for a QOS, does Slurm sum up these QOS RawUsage values to check whether the GrpTRESMins account limit is reached?

Thanks again for your valuable help.

Gérard



Miguel Oliveira

Jun 28, 2022, 7:29:42 PM
to Slurm User Community List
Hi Gérard,

If I understood you correctly, your goal is to limit the number of minutes each project can run. By associating each project with a Slurm account that has a NoDecay QoS, you will achieve your goal.
Try a project with a very small limit and you will see that it won't run.

You don't have to add anything. Each QoS will accumulate its respective usage, i.e. the usage of all users on that account. Users can even be on different accounts (projects) and charge the respective project with the --account parameter of sbatch.
GrpTRESMins is always changed on the QoS with a command like:

sacctmgr update qos where qos=... set GrpTRESMin=cpu=….

Hope that makes sense!

Best,

MAO

gerar...@cines.fr

Jun 29, 2022, 1:14:55 PM
to Slurm-users
Hi Miguel,

>If I understood you correctly your goal was to limit the number of minutes each project can run. By associating each project to a slurm account with a nodecay QoS then you will have achieved your goal.

Here is what I want to do:

"All jobs submitted to an account, regardless of the QOS they use, have to be constrained to a number of minutes set by the limit associated with that account (and not with a QOS)."


>Try a project with a very small limit and you will see that it won’t run

I already tested the GrpTRESMins limit and confirm it works as expected.
Then I saw the decay effect on GrpTRESRaw (which I first thought was the right metric to look at) and tried to find a way around it.

It's really very important for me to be able to trust it, so I need a deterministic test to prove it.

I'm testing this GrpTRESMins limit with NoDecay set on the QOS, resetting all RawUsage (account and QOS) to be sure it works as I expect.
I print the account GrpTRESRaw (in minutes) at the end of my test jobs to set new limits with GrpTRESMins and see how it behaves.

I'll report back on the results. I hope it works.


> You don’t have to add anything.
>Each QoS will accumulate its respective usage, i.e, the usage of all users on that account. Users can even be on different accounts (projects) and charge the respective project with the parameter --account on sbatch.

If Slurm does this to manage the limit, I would also like to obtain the current RawUsage for an account.
Do you know how to get it?



>The GrpTRESMins is always changed on the QoS with a command like:
>
>sacctmgr update qos where qos=... set GrpTRESMin=cpu=….

That's right if you want to set a limit on a QOS.
But I don't know whether the same limit value also applies to all the other QOSes; and if I apply the same limit to every QOS, is my account limit the sum of all the QOS limits?


Actually I'm setting the limit on the account, using the command:

sacctmgr modify account myaccount set grptresmins=cpu=60000 qos=...

With this setting I saw that the limit is set on the account and not on the QOS:
the "sacctmgr show qos" command shows an empty GrpTRESMins field for all QOSes.
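For reference, one way to check where a GrpTRESMins limit actually landed (association vs. QOS). The commands are standard sacctmgr/scontrol, but the exact format field names are worth verifying against your Slurm version:

```shell
# Limit set on the association (account level): shows in the assoc table.
sacctmgr show assoc where account=myaccount format=Account,User,QOS,GrpTRESMins

# Limit set on the QOS itself: shows in the QOS table.
sacctmgr show qos format=Name,Flags,GrpTRESMins

# Live view of limits and current usage together; entries print as
# limit(usage), with N meaning no limit is set.
scontrol -o show assoc_mgr
```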


Thanks again for your help.
I hope I'm close to getting an answer to my issue.

Best,
Gérard

gerar...@cines.fr

Jun 30, 2022, 2:13:00 PM6/30/22
to Slurm-users
Hi Miguel,

I finally found the time to test the QOS NoDecay configuration against the GrpTRESMins account limit.

Here is my benchmark:



1) Initialize the benchmark configuration
   - reset all RawUsage (on QOS and account)
   - set a GrpTRESMins limit on the account
   - run several jobs with a controlled elapsed CPU time on one QOS
   - reset the account RawUsage
   - set the account GrpTRESMins limit below the QOS RawUsage
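The steps above can be sketched roughly as follows. Account name (dci), QOS name (support), and the limit value are taken from this thread; the sacctmgr syntax, in particular the QOS RawUsage reset (supported on recent Slurm versions), may need adjusting for v19.05:

```shell
# 1) Reset raw usage on the QOS and on the account. NoDecay must already
#    be set on the QOS so the accumulated usage stays put afterwards.
sacctmgr modify qos support set RawUsage=0
sacctmgr modify account dci set RawUsage=0

# 2) Run jobs with a controlled elapsed time on the QOS.
sbatch --qos=support --time=5 TRESMIN.slurm

# 3) Reset the account usage only, then set the account limit below the
#    usage accumulated on the QOS. If the limit were enforced against the
#    QOS usage, no further job should start.
sacctmgr modify account dci set RawUsage=0
sacctmgr modify account dci set GrpTRESMins=cpu=4100
```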

Here is the initial state before running the benchmark:

toto@login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,rawusage

             Account       User                                                                       GrpTRESRaw                    GrpTRESMins    RawUsage
-------------------- ----------                            ----------------------------------------------------- ------------------------------ -----------
dci                                               cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0                       cpu=4100           0



Account RawUsage    = 0
Account GrpTRESMins = cpu=4100



toto@login1:~/TEST$ scontrol -o show assoc_mgr | grep "^QOS" | grep support
QOS=support(8) UsageRaw=253632.000000 GrpJobs=N(0) GrpJobsAccrue=N(0) GrpSubmitJobs=N(0) GrpWall=N(132.10) GrpTRES=cpu=N(0),mem=N(0),energy=N(0),node=2106(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0) GrpTRESMins=cpu=N(4227),mem=N(7926000),energy=N(0),node=N(132),billing=N(4227),fs/disk=N(0),vmem=N(0),pages=N(0) GrpTRESRunMins=cpu=N(0),mem=N(0),energy=N(0),node=N(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0) MaxWallPJ=1440 MaxTRESPJ=node=700 MaxTRESPN= MaxTRESMinsPJ= MinPrioThresh=  MinTRESPJ= PreemptMode=OFF Priority=10 Account Limits= dci={MaxJobsPA=N(0) MaxJobsAccruePA=N(0) MaxSubmitJobsPA=N(0) MaxTRESPA=cpu=N(0),mem=N(0),energy=N(0),node=N(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0)} User Limits= 1145={MaxJobsPU=N(0) MaxJobsAccruePU=N(0) MaxSubmitJobsPU=N(0) MaxTRESPU=cpu=N(0),mem=N(0),energy=N(0),node=2106(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0)}

QOS support RawUsage = 253632 s or 4227 mn


QOS support RawUsage > GrpTRESMins, so Slurm should refuse to start any job for this account if the limit works as expected.
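As a quick sanity check on the comparison being tested, using the values from the output above (raw usage is reported in seconds, GrpTRESMins in minutes):

```python
# QOS 'support' raw usage from scontrol, in seconds; account limit in minutes.
qos_rawusage_seconds = 253632
account_grptresmins_cpu = 4100  # cpu-minutes, set on the account

usage_minutes = qos_rawusage_seconds // 60
print(usage_minutes)                            # 4227
print(usage_minutes > account_grptresmins_cpu)  # True: usage already exceeds the limit
```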



2) Run the benchmark to check that the GrpTRESMins limit is enforced against the QOS RawUsage


toto@login1:~/TEST$ sbatch TRESMIN.slurm
Submitted batch job 3687


toto@login1:~/TEST$ squeue
     JOBID ADMIN_COMM MIN_MEMOR         SUBMIT_TIME  PRIORITY PARTITION     QOS  USER   STATE TIME_LIMIT  TIME NODES REASON          START_TIME
      3687      BDW28    60000M 2022-06-30T19:36:42   1100000     bdw28 support  toto RUNNING       5:00  0:02     1   None 2022-06-30T19:36:42


The job is running even though the QOS support RawUsage exceeds the GrpTRESMins limit.



Is there anything wrong with my control process that invalidates the result?


Thanks

Gérard




Miguel Oliveira

Jun 30, 2022, 3:34:40 PM6/30/22
to gerar...@cines.fr, Slurm-users
Hi Gérard,

Let's see if I understood this right. You have a user on the account dci and you have put a GrpTRESMins limit on it (cpu=4100).
From the output it looks like that is associated with the QoS toto.
However, the limit is put on the association and not on the QoS:

GrpTRESMins=cpu=N(4227)

You need to remove the limit from the association and put it on the QoS.

Hope that helps,

MAO

gerar...@cines.fr

Jul 1, 2022, 5:04:29 AM7/1/22
to Miguel Oliveira, Slurm-users
Hi Miguel,

As far as I understood, GrpTRESMins=cpu=N(4227) is not, despite its name, the limit of the QOS, but the RawUsage of the QOS expressed in minutes instead of the seconds accounted in RawUsage.
When I set the QOS RawUsage to 0, GrpTRESMins=cpu is also set to 0.
Each time a job completes on this QOS, RawUsage and GrpTRESMins=cpu are increased by the usage of that job (in seconds and minutes respectively).
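That reading is consistent with the scontrol assoc_mgr output format, where each TRES entry prints as limit(usage) and N means no limit is configured. A small illustrative parser (my own helper, not a Slurm tool) for those fields:

```python
import re

def parse_tres(field: str) -> dict:
    """Parse a string like 'cpu=N(4227),mem=N(7926000)' from
    `scontrol show assoc_mgr` into {tres: (limit, usage)},
    with limit=None when it is printed as 'N' (no limit set)."""
    out = {}
    for name, limit, usage in re.findall(r"([\w/]+)=([\dN]+)\((\d+)\)", field):
        out[name] = (None if limit == "N" else int(limit), int(usage))
    return out

tres = parse_tres("cpu=N(4227),mem=N(7926000),node=N(132)")
print(tres["cpu"])  # (None, 4227): no limit on the QOS, usage is 4227 minutes
```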

So I first need to try setting the limit on one QOS as you told me, then set the same limit on all QOSes and see how Slurm handles all those limits within one account.

I think this should be the last test needed to understand the complete behavior of GrpTRESMins.

I'll let you know the result, though not for a while because of holidays.

Thanks a lot for all your help.



gerar...@cines.fr

Aug 8, 2022, 8:22:16 AM8/8/22
to Miguel Oliveira, Slurm-users
Hello Miguel,

Setting the limit on only one QOS does indeed work, but it prevents users from using several QOSes and rules out all the multi-QOS possibilities.

I'm thinking about how to deal with this and whether a workaround can be set up in our environment.

Thanks for all your help.