[slurm-users] Removing safely a node

Ratnasamy, Fritz via slurm-users

May 16, 2024, 11:19:15 PM
to Slurm User Community List
Hi, 

What is the "official" process for safely removing nodes? I drained the nodes so that running jobs could complete, then put them in the down state once they were fully drained.
I then edited the slurm.conf file to remove the nodes. After some time, sinfo showed that the nodes had been removed from the partition.

However, I was told I might need to restart the slurmctld service. Do you know if that is necessary? Should I also run scontrol reconfig?
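
As a rough sketch, the sequence described above might look like the commands below. The node list `node[01-02]`, the reason string, and the `run` / `remove_nodes_steps` helpers are hypothetical placeholders, not taken from any real cluster. The script only prints the commands unless DRY_RUN=0 is set, and it leaves open whether `scontrol reconfigure` is enough or a slurmctld restart is required, since that is exactly the question here.

```shell
#!/bin/sh
# Sketch only: NODES and the Reason text are made-up placeholders.
NODES="${NODES:-node[01-02]}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of running them

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

remove_nodes_steps() {
    # 1. Drain the nodes so running jobs finish and no new ones start.
    run scontrol update NodeName="$NODES" State=DRAIN Reason="decommission"

    # 2. Once sinfo shows them fully drained, mark them down.
    run scontrol update NodeName="$NODES" State=DOWN Reason="decommission"

    # 3. Manual step: remove the nodes from slurm.conf (NodeName and
    #    PartitionName lines) and distribute the file to all hosts.

    # 4. Ask the controller to re-read its configuration. Depending on the
    #    Slurm version, a restart of slurmctld may be needed instead, e.g.
    #    "systemctl restart slurmctld"; check the slurm.conf man page.
    run scontrol reconfigure

    # 5. Confirm the nodes no longer appear.
    run sinfo -N
}

remove_nodes_steps
```

Running it as-is prints the five commands prefixed with "+", which makes it easy to review the order before executing anything on a live controller.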
Best, 

Fritz Ratnasamy

Data Scientist

Information Technology


Ryan Novosielski via slurm-users

May 16, 2024, 11:46:44 PM
to Ratnasamy, Fritz, Slurm User Community List
If I’m not mistaken, the manual for slurm.conf (or one of the other man pages) either lists, for each option, what action is needed for a change to take effect, or has a combined list of which changes require which action. I can never remember and would have to look it up anyway.

--
#BlackLivesMatter
____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - novo...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB A555B, Newark
     `'

Ole Holm Nielsen via slurm-users

May 17, 2024, 2:53:05 AM
to slurm...@lists.schedmd.com
On 5/17/24 05:16, Ratnasamy, Fritz via slurm-users wrote:
>  What is the "official" process to remove nodes safely? I have drained
> the nodes so jobs are completed and put them in down state after they are
> completely drained.
> I edited the slurm.conf file to remove the nodes. After some time, I can
> see that the nodes were removed from the partition with the command sinfo
>
> However, I was told I might need to restart the service slurmctld, do you
> know if it is necessary? Should I also run scontrol reconfig?

The SchedMD presentations in https://slurm.schedmd.com/publications.html
describe node add/remove.

I've collected my notes on this in the Wiki page
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_operations/#add-and-remove-nodes

/Ole