[slurm-users] preempt/qos is not working as expected

Marchand Aurélia via slurm-users

Feb 16, 2026, 10:12:13 AM
to slurm...@lists.schedmd.com
Hello,

I want to use QOS preemption.

My configuration is:

PreemptMode             = GANG,SUSPEND
PreemptParameters       = (null)
PreemptType             = preempt/qos
PriorityType            = priority/multifactor
SelectType              = select/cons_tres
SelectTypeParameters    = CR_CPU
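
These settings can be verified on a running cluster, for example with:

scontrol show config | grep -iE 'preempt|priority|select'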

I define 3 partitions:

PartitionName=DEFAULT Nodes=ALL Default=NO MaxTime=15-0 State=UP OverSubscribe=FORCE:1 PreemptMode=suspend
PartitionName=veryhi PriorityTier=30 AllowQos=veryhi_short,gaia_veryhi_short
PartitionName=hi     PriorityTier=20 AllowQos=hi_short,gaia_hi_short
PartitionName=def    PriorityTier=10 OverSubscribe=No Default=YES AllowQos=def_short

I define 5 QOS:

sacctmgr show qos format=name%20,Priority,Preempt%40,PreemptMode
                Name   Priority                 Preempt PreemptMode
-------------------- ---------- ----------------------- -----------
              normal          0                             cluster
        veryhi_short      20000               def_short     suspend
            hi_short      10000               def_short     suspend
           def_short          0                  normal     suspend
   gaia_veryhi_short      20000 def_short,gaia_hi_short     suspend
       gaia_hi_short      10000               def_short     suspend
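
For reference, a Preempt list like this can be set or changed with sacctmgr, e.g. (a sketch reusing the QOS names above):

sacctmgr modify qos gaia_veryhi_short set Preempt=def_short,gaia_hi_short
sacctmgr modify qos gaia_veryhi_short set PreemptMode=suspend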

I define the accounts:

sacctmgr show assoc format=account%20,user,qos%60
             Account       User                  QOS
-------------------- ---------- --------------------
             gaia_hi      user1        gaia_hi_short
         gaia_veryhi      user1    gaia_veryhi_short
              mis_hi      user1             hi_short
                 def      user1            def_short
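
Associations like these can be created with sacctmgr as well; a sketch with the account/user/QOS names above (exact options may vary with your Slurm version):

sacctmgr add account gaia_veryhi
sacctmgr add user user1 account=gaia_veryhi qos=gaia_veryhi_short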

I submit 2 jobs:

JOBID PARTITION               QOS     ACCOUNT NAME  USER     STATE PRIORITY
  159        hi          hi_short      mis_hi test user1 SUSPENDED      637
  160    veryhi gaia_veryhi_short gaia_veryhi test user1   RUNNING     1150
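
Output in this shape can be obtained with a custom squeue format, for example:

squeue -o "%i %P %q %a %j %u %T %Q"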

I don't understand why job 160 suspends job 159. Its partition priority
is higher, but I want the gaia_veryhi_short QOS to preempt only the
def_short and gaia_hi_short QOS.

The QOS of job 159 is hi_short.

Where is my configuration error?

Thanks for your time,

Aurélia


Davide DelVento via slurm-users

Feb 16, 2026, 8:07:07 PM
to Marchand Aurélia, slurm...@lists.schedmd.com
I am not sure I fully understand your question, but if I do, I believe the problem is that you forgot to add

PreemptMode=off

to the partitions which you do NOT want to be preempted; otherwise everything gets the cluster default. For example, I have configured mine as follows:

PreemptType = preempt/partition_prio
PreemptMode=REQUEUE
PartitionName=highpriority DefMemPerCPU=4580 Nodes=node[01-36] PreemptMode=off PriorityTier=500 State=UP
PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36] PreemptMode=cancel PriorityTier=100 State=UP

With this, only jobs in the lowpriority partition get cancelled; other partitions (not shown) get requeued, and highpriority ones are left alone.
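
Applied to your partitions, that might look something like this (just an untested sketch keeping your names; adjust to taste):

PartitionName=veryhi PriorityTier=30 PreemptMode=off     AllowQos=veryhi_short,gaia_veryhi_short
PartitionName=hi     PriorityTier=20 PreemptMode=suspend AllowQos=hi_short,gaia_hi_short
PartitionName=def    PriorityTier=10 PreemptMode=suspend Default=YES AllowQos=def_short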

HTH

Marchand Aurélia via slurm-users

Feb 17, 2026, 5:20:56 AM
to Davide DelVento, slurm...@lists.schedmd.com
Hello,
Thank you for your answer, but it doesn't solve my problem. I want to use preemption based on QOS, not on the partition.
I want three levels of preemption, 'def', 'hi' and 'veryhi', for each project (I have 10 projects).
Partition 'hi' can preempt partition 'def'.
But I want the 'veryhi' partitions to be able to preempt only the 'hi' jobs of their own project, or 'def' jobs.
-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Aurélia Marchand
Division Informatique de l'Observatoire
11, avenue Marcelin Berthelot
92195 Meudon
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Marchand Aurélia via slurm-users

Feb 19, 2026, 4:09:23 AM
to slurm...@lists.schedmd.com
Hello,
I found my error: the jobs have to be in the same partition.
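
For example, submitting both jobs to the same partition (whose AllowQos must then permit both QOS) makes the QOS preempt rules apply; a sketch with a hypothetical job.sh:

sbatch --partition=hi --qos=def_short job.sh
sbatch --partition=hi --qos=gaia_veryhi_short job.sh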
Aurélia

Davide DelVento via slurm-users

Feb 19, 2026, 8:54:25 AM
to Marchand Aurélia, slurm...@lists.schedmd.com
Thanks for letting us know; I was not aware of this requirement.