[slurm-users] Questions about adding new nodes to Slurm

David Henkemeyer

Apr 27, 2021, 2:35:56 PM
to slurm...@lists.schedmd.com
Hello,

I'm new to Slurm (coming from PBS), and so I will likely have a few questions over the next several weeks, as I work to transition my infrastructure from PBS to Slurm.

My first question has to do with adding nodes to Slurm.  According to the FAQ (and other articles I've read), you need to basically shut down slurm, update the slurm.conf file on all nodes in the cluster, then restart slurm.

- Why do all nodes need to know about all other nodes?  From what I have read, Slurm does a checksum comparison of the slurm.conf file across all nodes.  Is this the only reason all nodes need to know about all other nodes?
- Can I create a symlink that points <sysconfdir>/slurm.conf to a slurm.conf file on an NFS mount point, which is mounted on all the nodes?  This way, I would only need to update a single file, then restart Slurm across the entire cluster.
- Any additional help/resources for adding/removing nodes to Slurm would be much appreciated.  Perhaps there is a "toolkit" out there to automate some of these operations (which is what I already have for PBS, and will create for Slurm, if something doesn't already exist).

Thank you all,

David

Paul Edmon

Apr 27, 2021, 2:51:46 PM
to slurm...@lists.schedmd.com

1. Part of Slurm's communication is hierarchical.  Thus nodes need to know about the other nodes so they can talk to each other and forward messages to the slurmctld.

2. Yes, this is what we do.  We have our slurm.conf shared via NFS from our Slurm master, and then we just update that single conf.  After that update we use Salt to issue a global restart of all the slurmd's and the slurmctld to pick up the new config (see the sketch after this list).  scontrol reconfigure is not enough when adding new nodes; you have to issue a global restart.

3. It's pretty straightforward, all told.  You just need to update the slurm.conf and do a restart.  You do need to be careful that the names you enter into the slurm.conf are resolvable by DNS, else slurmctld may barf on restart.  Sadly no built-in sanity checker exists that I am aware of, aside from actually running slurmctld.  We got around this by putting together a GitLab runner which screens our slurm.conf's by running a synthetic slurmctld as a sanity check.
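
For what it's worth, the restart and the sanity check look roughly like this on our end (a sketch; the Salt targets, unit names, and paths are placeholders, adjust for your site):

    # restart the controller first so it knows about the new nodes,
    # then restart slurmd everywhere
    systemctl restart slurmctld                  # on the Slurm master
    salt '*' cmd.run 'systemctl restart slurmd'  # all compute nodes

    # poor man's config check: run slurmctld in the foreground against
    # the candidate file and see whether it parses cleanly
    slurmctld -D -f /path/to/candidate/slurm.conf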

-Paul Edmon-

Max Voit

Apr 27, 2021, 3:51:51 PM
to slurm...@lists.schedmd.com
On Tue, 27 Apr 2021 11:35:18 -0700
David Henkemeyer <david.he...@gmail.com> wrote:

> - Can I create a symlink that points <sysconfdir>/slurm.conf to a
> slurm.conf file on an NFS mount point, which is mounted on all the
> nodes? This way, I would only need to update a single file, then
> restart Slurm across the entire cluster.

You can also run Slurm in "configless" mode, limiting the hosts that
need to have the slurm.conf file to the ones running slurmctld:
https://slurm.schedmd.com/configless_slurm.html
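
The gist of it, as far as I understand (a minimal sketch; the hostname is a placeholder):

    # on the controller, in slurm.conf:
    SlurmctldParameters=enable_configless

    # on each compute node, point slurmd at the controller
    # instead of shipping it a slurm.conf:
    slurmd --conf-server ctld-host:6817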

On the other hand, when operating a cluster, a configuration management
system might come in handy anyway.

Best,
Max

Sid Young

Apr 27, 2021, 8:49:32 PM
to Slurm User Community List
Hi David,

I use SaltStack to push out the slurm.conf file to all nodes and then do an "scontrol reconfigure"; this makes management much easier across the cluster. You can also do service restarts from one point, etc. Avoid NFS mounts for the config: if the mount locks up, you're screwed.
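
Roughly like this (a sketch; the Salt file source and paths are made up, adjust for your setup):

    # push the new conf to every minion, then reconfigure once;
    # slurmctld tells the slurmd's to re-read their config
    salt '*' cp.get_file salt://slurm/slurm.conf /etc/slurm/slurm.conf
    scontrol reconfigure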




Sid

Ole Holm Nielsen

Apr 28, 2021, 3:38:34 AM
to slurm...@lists.schedmd.com
On 4/28/21 2:48 AM, Sid Young wrote:
> I use SaltStack to push out the slurm.conf file to all nodes and then do
> an "scontrol reconfigure"; this makes management much easier across the
> cluster. You can also do service restarts from one point, etc. Avoid NFS
> mounts for the config: if the mount locks up, you're screwed.

Pushing the slurm.conf (and other config files) to all cluster nodes may
be done in several other ways:

1. Configless Slurm is ideal and simple to use, see
https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#configless-slurm
https://slurm.schedmd.com/configless_slurm.html

2. You can push the /etc/slurm directory with ClusterShell (clush --copy), see
https://wiki.fysik.dtu.dk/niflheim/SLURM#clustershell
https://clustershell.readthedocs.io/en/latest/intro.html
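
For example (a sketch; the node set is a placeholder):

    clush -w node[001-099] --copy /etc/slurm/slurm.conf --dest /etc/slurm/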

/Ole

Tina Friedrich

May 4, 2021, 8:26:39 AM
to slurm...@lists.schedmd.com
Hello,

a lot of people have already given very good answers on how to tackle this.

Still, I thought it worth pointing this out - you said 'you need to
basically shut down slurm, update the slurm.conf file, then restart'.
That makes it sound like a major operation with lots of prep required.

It's not like that at all. Updating slurm.conf is not a major operation.

There's absolutely no reason to shut things down first & then change the
file. You can edit the file / ship out a new version (however you like)
and then restart the daemons.

The daemons do not all have to be restarted simultaneously. It is of no
consequence if they're running with out-of-sync config files for a bit,
really. (There's a flag you can set if you want to suppress the warning
- the 'NO_CONF_HASH' debug flag, I think.)
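
In slurm.conf that would be something like (if I'm remembering the flag right):

    DebugFlags=NO_CONF_HASH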

Restarting the daemons (slurmctld, slurmd, ...) is safe. It does not
require cluster downtime or anything.

I control slurm.conf using configuration management; the config
management process restarts the appropriate daemon (slurmctld, slurmd,
slurmdbd) if the file changed. This certainly never happens at the same
time; there's splay in that. It doesn't even necessarily happen on the
controller first, or anything like that.

What I'm trying to get across is that this 'updating the cluster-wide
config file' and 'the file must be the same on all nodes' business is a
lot less of a procedure (and a lot less strict) than you currently
imagine it to be :)

Tina

On 27/04/2021 19:35, David Henkemeyer wrote:
> Hello,
>
> I'm new to Slurm (coming from PBS), and so I will likely have a few
> questions over the next several weeks, as I work to transition my
> infrastructure from PBS to Slurm.
>
> My first question has to do with *adding nodes to Slurm*.  According
> to the FAQ (and other articles I've read), you need to basically shut
> down slurm, update the slurm.conf file *on all nodes in the cluster*,
> then restart slurm.
>
> - Why do all nodes need to know about all other nodes?  From what I have
> read, Slurm does a checksum comparison of the slurm.conf file across
> all nodes.  Is this the only reason all nodes need to know about all
> other nodes?
> - Can I create a symlink that points <sysconfdir>/slurm.conf to a
> slurm.conf file on an NFS mount point, which is mounted on all the
> nodes?  This way, I would only need to update a single file, then
> restart Slurm across the entire cluster.
> - Any additional help/resources for adding/removing nodes to Slurm would
> be much appreciated.  Perhaps there is a "toolkit" out there to automate
> some of these operations (which is what I already have for PBS, and will
> create for Slurm, if something doesn't already exist).
>
> Thank you all,
>
> David

--
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk http://www.it.ox.ac.uk

Sid Young

May 4, 2021, 8:32:48 AM
to Slurm User Community List
You can push a new conf file and issue an "scontrol reconfigure" on the fly as needed. I do it on our cluster: do the nodes first, then the login nodes, then the Slurm controller. You are making a huge issue of a very basic task...
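
Something like this (a sketch; the Salt nodegroups are made up):

    # push the conf in that order, then reconfigure once
    salt -N compute cp.get_file salt://slurm/slurm.conf /etc/slurm/slurm.conf
    salt -N login   cp.get_file salt://slurm/slurm.conf /etc/slurm/slurm.conf
    salt -N ctld    cp.get_file salt://slurm/slurm.conf /etc/slurm/slurm.conf
    scontrol reconfigure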

Sid

Tina Friedrich

May 4, 2021, 8:48:25 AM
to slurm...@lists.schedmd.com
Not sure if that's changed, but aren't there cases where 'scontrol
reconfigure' isn't sufficient? (Like adding nodes?)

But yes, that's my point exactly; it is a pretty basic day-to-day task
to update slurm.conf, not some daunting operation that requires a
downtime or anything like it. (I remember this requirement to update the
config file everywhere & restart everything sounding like a major task
that requires announcements & downtimes when I started with SLURM
- coming from Grid Engine - and it took me a while to figure out, and
trust, that an update to slurm.conf is a very minor task, and not a
risky one really :) )

Tina

Ole Holm Nielsen

May 4, 2021, 9:52:19 AM
to slurm...@lists.schedmd.com
The task of adding or removing nodes in Slurm is well documented and
discussed in SchedMD presentations; please see my Wiki page:
https://wiki.fysik.dtu.dk/niflheim/SLURM#add-and-remove-nodes

/Ole

Prentice Bisbal

May 4, 2021, 3:14:55 PM
to slurm...@lists.schedmd.com

I agree that people are making updating slurm.conf a bigger issue than it really is. However, there are certain config changes that do require restarting the daemons rather than just doing 'scontrol reconfigure'. These options are documented in the slurm.conf documentation (just search for "restart").

I believe it's often only the slurmctld that needs to be restarted, which is one daemon on one system, rather than restarting slurmd on all the compute nodes, but there are a few changes that require all Slurm daemons to be restarted. Adding nodes to a cluster is one of them:

Changes in node configuration (e.g. adding nodes, changing their processor count, etc.) require restarting both the slurmctld daemon and the slurmd daemons. All slurmd daemons must know each node in the system to forward messages in support of hierarchical communications.

But to avoid this, you can use the FUTURE state to define "future" nodes:

FUTURE
Indicates the node is defined for future use and need not exist when the Slurm daemons are started. These nodes can be made available for use simply by updating the node state using the scontrol command rather than restarting the slurmctld daemon. After these nodes are made available, change their State in the slurm.conf file. Until these nodes are made available, they will not be seen by any Slurm commands, nor will any attempt be made to contact them.
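
In practice that looks something like this (a sketch; the node names and hardware values are placeholders):

    # in slurm.conf: define spare nodes ahead of time
    NodeName=node[101-120] CPUs=32 RealMemory=128000 State=FUTURE

    # later, when node101 actually exists, bring it in without a restart
    scontrol update NodeName=node101 State=RESUME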

--
Prentice
