Hello.
We have a small set of compute nodes owned by a group. The group has agreed that the rest of the HPC community can use these nodes providing that they (the owners) can always have priority access to the nodes. The four nodes are well provisioned (1 TByte memory each plus 2 GRID K2 graphics cards) and so there is no need to worry about preemption. In fact I'm happy for the nodes to be used as well as possible by all users. It's just that jobs from the owners must take priority if resources are scarce.
What is the best way to achieve the above in Slurm? I'm planning to place the nodes in their own partition. The node owners will have priority access to the nodes in that partition, but will have no advantage when submitting jobs to the public resources. Does anyone have any ideas on how to deal with this?
Best regards,
David
-- Marcus Wagner, Dipl.-Inf. IT Center Abteilung: Systeme und Betrieb RWTH Aachen University Seffenter Weg 23 52074 Aachen Tel: +49 241 80-24383 Fax: +49 241 80-624383 wag...@itc.rwth-aachen.de www.itc.rwth-aachen.de
Yup, PriorityTier is what we use to do exactly that here. That said, unless you turn on preemption, jobs may still pend if there is no space. We run with REQUEUE on, which has worked well.
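For David's case, a minimal sketch of how this might look in slurm.conf, assuming two overlapping partitions over the same four nodes. The partition names, node names, and group name below are hypothetical placeholders, not anyone's actual configuration:

```
# Hypothetical names: "owners", "public", node[01-04], and the Unix group "owners".
# Both partitions cover the same nodes; jobs in the higher PriorityTier
# partition are considered for scheduling first.
PartitionName=owners Nodes=node[01-04] AllowGroups=owners PriorityTier=10 State=UP
PartitionName=public Nodes=node[01-04] PriorityTier=1 State=UP
```

With no preemption configured, owner jobs only jump ahead of pending public jobs; they still wait for running jobs to finish.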
-Paul Edmon-
Hi Marcus,
Sure, using PriorityTier is fine. And my point wasn't so much
about preemption but exactly about using just one partition and
no preemption instead of two partitions, which is what David was
asking for, isn't it? But actually, I forgot that you can do it in
one partition too by using preempt/qos. Though we haven't used
that.
Best,
Andreas
-- Dr. Andreas Henkel Operativer Leiter HPC Zentrum für Datenverarbeitung Johannes Gutenberg Universität Anselm-Franz-von-Bentzelweg 12 55099 Mainz Telefon: +49 6131 39 26434 OpenPGP Fingerprint: FEC6 287B EFF3 7998 A141 03BA E2A9 089F 2D8E F37E
I just set this up a couple of weeks ago myself. Creating two
partitions is definitely the way to go. I created one partition,
"general" for normal, general-access jobs, and another,
"interruptible" for general-access jobs that can be interrupted,
and then set PriorityTier accordingly in my slurm.conf file (Node
names omitted for clarity/brevity).
PartitionName=general Nodes=... MaxTime=48:00:00 State=Up PriorityTier=10 QOS=general
PartitionName=interruptible Nodes=... MaxTime=48:00:00 State=Up PriorityTier=1 QOS=interruptible
I then set PreemptMode=Requeue, because I'd rather have jobs requeued than suspended. And it's been working great. There are a few other settings I had to change. The best documentation for all the settings you need to change is https://slurm.schedmd.com/preempt.html
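As a rough sketch, the cluster-wide part of such a setup might look like the following in slurm.conf. Only PreemptMode comes from the description above; PreemptType=preempt/partition_prio and JobRequeue=1 are assumptions about what the other settings could be, so check them against the preempt documentation before using them:

```
# Assumption: preemption is decided by partition PriorityTier,
# so jobs in a higher-tier partition can preempt jobs in a lower-tier one.
PreemptType=preempt/partition_prio
# From the description above: preempted jobs are requeued rather than suspended.
PreemptMode=REQUEUE
# Assumption: allow batch jobs to be requeued by default.
JobRequeue=1
```

Changes to these parameters need to be picked up by slurmctld before they take effect.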
Everything has been working exactly as desired and advertised. My
users who needed the ability to run low-priority, long-running
jobs are very happy.
The one caveat is that jobs that will be killed and requeued need
to support checkpoint/restart. So when this becomes a production
thing, users are going to have to acknowledge that they will only
use this partition for jobs that have some sort of
checkpoint/restart capability.
Prentice