We use a maintenance reservation to process Slurm updates on our nodes, from v20.02.3 or v20.02.6 to v20.11.2.
The last step within the job sets the node state to drain with the reason “slurm-updated”. Once the job is done, the node reboots and the reservation terminates.
After the node reboots, we check that things are OK and then resume the node.
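For reference, the drain and resume steps could be scripted roughly as below. This is only a sketch and not our actual job script: the node name "node001" and the helper names are made up for illustration, and it assumes scontrol is on the PATH and is run by a user allowed to update node state. It defaults to a dry run that only prints the command.

```python
import subprocess


def drain_command(node, reason="slurm-updated"):
    """Build the scontrol call that drains a node with a reason."""
    return ["scontrol", "update", f"NodeName={node}",
            "State=DRAIN", f"Reason={reason}"]


def resume_command(node):
    """Build the scontrol call that returns a node to service."""
    return ["scontrol", "update", f"NodeName={node}", "State=RESUME"]


def run(cmd, dry_run=True):
    """Print the command on a dry run; otherwise execute it."""
    if dry_run:
        print(" ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)


# "node001" is a hypothetical node name.
run(drain_command("node001"))   # last step of the update job
run(resume_command("node001"))  # after the post-reboot checks
```

Passing dry_run=False would actually invoke scontrol.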
In the past the node state would be “idle” at that point; now the node state is “maint”. Restarting neither slurmd nor slurmctld changes that.
The node will accept and run jobs, but the node state remains “maint”.
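To watch for this, one could pull the state out of the text printed by "scontrol show node". The snippet below is a rough sketch under assumptions: the sample text is hand-written for illustration, not captured output, and it assumes the state appears as a "State=" field with flags joined by "+". The function name is hypothetical.

```python
import re


def node_state_flags(show_node_text):
    """Return the node state split into its parts, e.g. ['IDLE', 'MAINT'].

    Assumes the text contains a 'State=' field whose value joins the
    base state and any flags with '+'.
    """
    match = re.search(r"\bState=(\S+)", show_node_text)
    if match is None:
        return []
    return match.group(1).split("+")


# Hand-written sample for illustration only, not real scontrol output.
sample = "NodeName=node001 Arch=x86_64\n   State=IDLE+MAINT ThreadsPerCore=1\n"

flags = node_state_flags(sample)
still_in_maint = "MAINT" in flags
print(flags, still_in_maint)
```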
Jenny Williams
UNC Chapel Hill
Looked closer – figured this one out. Just taking longer on that phase.