[slurm-users] Can't clear the REASON


Berg, Stephen P CIV USN NRL DET SSC MS (USA) via slurm-users

Apr 15, 2026, 9:19:42 AM
to slurm...@schedmd.com
I just upgraded my grid to 25.11 a couple of days ago. I have two nodes that were down for a bit, and when I set them to resume with reason="" they still show the reason I had set while they were down. Nothing I do seems to unset that reason for these two nodes.

I've set the nodes to down with a reason and then back to idle with a blank reason, but the REASON column never goes back to "none" like on the other nodes. Whatever reason was set last persists.

The behavior seems to have changed since I updated to 25.11. I was previously running the 20.11 packages from the epel-8 repos.

Christopher Samuel via slurm-users

Apr 15, 2026, 10:06:04 AM
to slurm...@lists.schedmd.com
On 4/15/26 5:48 am, Berg, Stephen P CIV USN NRL DET SSC MS (USA) via
slurm-users wrote:

> Nothing I do seems to unset that reason flag for these two nodes.

What is the reason that is being set?

There are some reasons (for instance, those related to invalid registrations
caused by config issues or broken hardware) that cannot be cleared.

--
Chris Samuel : http://www.csamuel.org/ : Philadelphia, PA, USA

--
slurm-users mailing list -- slurm...@lists.schedmd.com
To unsubscribe send an email to slurm-us...@lists.schedmd.com

Berg, Stephen P CIV USN NRL DET SSC MS (USA) via slurm-users

Apr 15, 2026, 12:15:54 PM
to slurm...@lists.schedmd.com, Christopher Samuel
While I've been fiddling with it I've used "test", "testing", "cause" and earlier this morning I set all the nodes to "-" just to see if it would take that. Just noticed that after a couple hours 91 of the 92 nodes still have "-" in the REASON column, but one of them now shows up as "none".

If the flag is truly not set, or set to NULL, does it show up as blank in the "sinfo -Nl" output, or would it show as "none" like I'm used to seeing?
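
One way to check what is actually stored, rather than what the "sinfo -Nl" column renders, is to read the Reason= field from "scontrol show node". This is a minimal sketch, not a confirmed answer: it assumes scontrol prints a Reason= line only when a reason is stored, which is worth verifying on 25.11. The helper name node_reason is hypothetical.

```shell
# node_reason NODE: print the stored Reason for NODE, or "(unset)" if none is reported.
# Assumption: "scontrol show node" emits a "Reason=" line only when a reason is stored.
node_reason() {
    scontrol show node "$1" |
        sed -n 's/^[[:space:]]*Reason=//p' |
        grep . || echo "(unset)"
}
```

Running it against one of the two stuck nodes and one of the nodes showing "none" should show whether the string is really still stored or only displayed differently.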


Christopher Samuel via slurm-users

Apr 15, 2026, 2:19:47 PM
to slurm...@lists.schedmd.com
On 4/15/26 8:05 am, Berg, Stephen P CIV USN NRL DET SSC MS (USA) via
slurm-users wrote:

> While I've been fiddling with it I've used "test", "testing", "cause"
> and earlier this morning I set all the nodes to "-" just to see if it
> would take that. Just noticed that after a couple hours 91 of the 92
> nodes still have "-" in the REASON column, but one of them now shows up
> as "none".
>
> If the flag is truly not set, or set to NULL does it show up as blank in
> the "sinfo -Nl" output or would it show as none like I'm used to seeing?

[oops - accidentally replied privately - this time to the list!]

Ah - to resume a node you just do:

scontrol update node=$NODE state=resume

Don't try to set the reason field when resuming it.
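
Wrapped as a small helper for several nodes, that advice might look like the sketch below. The name resume_node is hypothetical; the only Slurm call is the scontrol update form shown above, with no reason= argument.

```shell
# resume_node NODE...: return each node to service, leaving the Reason field alone.
resume_node() {
    for node in "$@"; do
        # Stop at the first failure so a bad node name is not silently skipped.
        scontrol update node="$node" state=resume || return 1
    done
}
```

Usage would be something like "resume_node node01 node02", where the node names are placeholders.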

--
Chris Samuel : http://www.csamuel.org/ : Philadelphia, PA, USA

Berg, Stephen P CIV USN NRL DET SSC MS (USA) via slurm-users

Apr 17, 2026, 7:12:54 AM
to slurm...@lists.schedmd.com, Christopher Samuel
I have tried that and it does work, but the reason persists after the nodes reach an idle state. It's a bit confusing for a node to be idle after a reboot while the REASON column still says "rebooting" or "down for maintenance" or whatever was set last.
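
To see how many nodes are affected, something like the following sketch lists nodes that are idle but still carry a reason string. It uses sinfo's %N (node name), %t (compact state) and %E (reason) format fields, and assumes an unset reason renders as "none", as on the unaffected nodes described earlier in the thread. The function name stale_reasons is hypothetical.

```shell
# stale_reasons: list idle nodes whose reason is still set.
# Assumption: sinfo renders an unset reason as "none" (or empty).
stale_reasons() {
    sinfo -h -N -o '%N|%t|%E' |
        awk -F'|' '$2 == "idle" && $3 != "none" && $3 != "" { print $1 ": " $3 }' |
        sort -u   # -N repeats a node once per partition, so drop duplicates
}
```

Nodes whose state carries a suffix flag (for example "idle*") are not matched by this exact comparison.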
