[slurm-users] Configuring slurm.conf and using subpartitions


Kratz, Zach

Oct 4, 2023, 12:03:42 AM
to slurm...@lists.schedmd.com
I am a systems administrator for a computing cluster.

We have around 24 nodes available and recently added a whole new cluster of upgraded nodes.

We use an interactive node that will randomly select from our list of computing nodes to complete the job. We would like to find a way to select from our list of old nodes first, before using the newer ones. We tried using weight and assigned each of the old nodes a lower weight than the new nodes, but in testing the new nodes were still assigned, even if the old nodes were available.

Is there any way to configure this in the line that configures the interactive node in slurm.conf, for example: 

PartitionName=interactive-cpu   Nodes=node[1-17] weight =10 node[18-24] weight=50

Or is there a way to create subpartitions where we could put the older nodes into a partition within this one?

Thank you for any feedback.


Rémi Palancher

Oct 4, 2023, 4:45:46 AM
to Slurm User Community List
On Wednesday, October 4, 2023 at 06:03, Kratz, Zach <ZKr...@clarku.edu> wrote:

> We use an interactive node that will randomly select from our list of computing nodes to complete the job. We would like to find a way to select from our list of old nodes first, before using the newer ones. We tried using weight and assigned each of the old nodes a lower weight than the new nodes, but in testing the new nodes were still assigned, even if the old nodes were available.

Unless they are confidential, can you show the Node and Partition configuration lines you tested unsuccessfully?

> Is there any way to configure this in the line that configures the interactive node in slurm.conf, for example: 
>
> PartitionName=interactive-cpu   Nodes=node[1-17] weight =10 node[18-24] weight=50

Note that Weight is a *Node* parameter: it must be defined on the Node configuration lines [1], not on the Partition line.
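
As a rough sketch of what that could look like, assuming the node names from the example quoted above and leaving out the hardware attributes (CPUs, RealMemory, etc.) that real NodeName lines would normally carry:

```
# slurm.conf (sketch only; hardware attributes omitted)
# All else being equal, Slurm allocates the nodes with the lowest Weight first.
NodeName=node[1-17]  Weight=10   # older nodes, preferred
NodeName=node[18-24] Weight=50   # newer nodes, used when the older ones cannot satisfy the job
PartitionName=interactive-cpu Nodes=node[1-24]
```

The Weight that the controller has actually loaded for a node should appear in the output of `scontrol show node node1`, which is a quick way to check that the setting was picked up.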

Another, less optimal, option is to define a default partition with the old nodes and a second, overlapping partition that also includes the new nodes. Users would then need to specify that second partition explicitly at job submission to access the new nodes.
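
A minimal sketch of that overlapping layout, where the partition name interactive-cpu-new is a hypothetical name chosen only for illustration:

```
# slurm.conf (sketch only; interactive-cpu-new is a made-up partition name)
PartitionName=interactive-cpu     Nodes=node[1-17] Default=YES   # old nodes, used when no partition is given
PartitionName=interactive-cpu-new Nodes=node[1-24]               # old and new nodes, must be requested explicitly
```

With something like this in place, jobs submitted without a partition would land on the old nodes, and users would pass `-p interactive-cpu-new` to srun, salloc or sbatch to reach the new ones.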

[1] https://slurm.schedmd.com/slurm.conf.html#SECTION_NODE-CONFIGURATION
--
Rémi Palancher
Rackslab: Open Source Solutions for HPC Operations
https://rackslab.io
