Hello,
We've got a few nodes defined in our slurm.conf in 'FUTURE' state, as
it's a new hardware type we're working on bringing into service.
The nodes are currently all allocated to a dedicated partition. The
partition is configured as 'state=UP'. As we've built the new nodes and
started slurmd+munge, they've appeared in an idle state in the new
partition as expected. All good so far.
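For reference, the relevant slurm.conf entries look roughly like this
(node/partition names and sizes here are placeholders, not our real
config):

    NodeName=newhw[01-04] CPUs=64 RealMemory=256000 State=FUTURE
    PartitionName=newhw Nodes=newhw[01-04] State=UP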
However, if slurmctld is restarted the nodes go back to being in
'FUTURE' state, and do not transition back to idle, accept jobs, etc.
The slurmd on the new nodes can clearly still talk to the slurmctld,
and s* commands run on the new nodes work as expected, but the nodes
remain in FUTURE state until slurmd on each node is restarted.
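To illustrate, this is roughly what we see and do (placeholder names
again):

    # On the controller, after restarting slurmctld:
    sinfo -N -p newhw -o "%N %t"   # node state shows FUTURE again

    # On each new node, this brings it back to idle:
    systemctl restart slurmd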
I could have misunderstood something about the FUTURE state, but I was
expecting them to go back to idle. I understand that slurmctld doesn't
communicate out to nodes in FUTURE state, but I at least expected them
to be picked up when they communicate _in_ to the slurmctld.
Is this expected behaviour, or perhaps a bug? The reason I've defined
the new nodes this way is so I don't have to update slurm.conf and
restart slurmctld as each node is built, but can instead do that as a
single step once everything is finished. However, it seems less useful
if the nodes can 'disappear' from the cluster as far as users are
concerned.
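For what it's worth, the plan for that final step was just something
like the following (a sketch, using the placeholder names above):

    # In slurm.conf, drop State=FUTURE from the node definitions, e.g.:
    #   NodeName=newhw[01-04] CPUs=64 RealMemory=256000
    # then restart the controller once:
    systemctl restart slurmctld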
Cheers,
Steve