[slurm-users] Cluster nodes on multiple cluster networks


Sajesh Singh

Jan 22, 2021, 1:37:42 PM
to Slurm User Community List

We are looking at rolling out cloud bursting to our on-prem Slurm cluster and I am wondering how to deal with the slurm.conf variable SlurmctldHost. It is currently configured with the private cluster network address that the on-prem nodes use to contact it. The nodes in the cloud would contact the head node via its public IP address. How can I configure Slurm so that both IPs are recognized as the head node?

-Sajesh-

Brian Andrus

Jan 22, 2021, 1:45:28 PM
to slurm...@lists.schedmd.com

You would need a direct connect/VPN link so that the cloud nodes can connect to your head node.

Brian Andrus

Sajesh Singh

Jan 22, 2021, 2:24:43 PM
to Slurm User Community List

How would I deal with the address of the head node in slurm.conf? I currently have it defined as:

SlurmctldHost=private-hostname(private.ip.addr)

The private.ip.addr address is not reachable from the cloud nodes.

-Sajesh-

 


Michael Gutteridge

Jan 22, 2021, 3:19:11 PM
to Slurm User Community List
I don't believe the IP address is required. If you can configure a DNS/hosts entry differently for the cloud nodes, you can set:

   SlurmctldHost=controllername

Then have "controllername" resolve to the controller's private IP for the on-prem nodes and to its public IP for the nodes in the cloud. That's the theory, anyway: I haven't run a config like that, and I'm not sure how the controller will react to it (i.e. getting Slurm traffic on both interfaces).

If the on-prem nodes can reach the public IP address of the controller, it may be simpler to use only the public IP for the controller, but I don't know how your routing is set up.

HTH

 - Michael


Sajesh Singh

Jan 22, 2021, 4:17:56 PM
to Slurm User Community List

Thank you for the recommendation. Will try that out. Unfortunately, the on-prem nodes cannot reach the head node via the public IP.

-Sajesh-

William Brown

Jan 22, 2021, 5:05:36 PM
to Slurm User Community List

I don't see why a Slurm node would care about traffic arriving on multiple interfaces, as long as your configuration lets it listen on all of them, e.g. there are no firewalld rules in the way restricting traffic to the private network.

William
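
To check the "listening on them" part on the controller itself, one option is to look at which local addresses hold the slurmctld port. The sketch below is a hypothetical helper, not part of Slurm. It assumes the default SlurmctldPort of 6817 and the usual `ss -H -ltn` (iproute2) column layout, where the fourth field is the local address and port. It says nothing about firewalld rules, which would need checking separately.

```python
import subprocess

SLURMCTLD_PORT = 6817  # Slurm's default SlurmctldPort; adjust if slurm.conf overrides it
WILDCARDS = {"0.0.0.0", "*", "[::]", "::"}  # addresses meaning "all interfaces"


def listen_addresses(ss_output, port=SLURMCTLD_PORT):
    """Return the local addresses bound to `port`, given `ss -H -ltn` output."""
    addresses = []
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a socket line
        # Fourth column looks like "10.10.10.10:6817" or "[::]:6817".
        address, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            addresses.append(address)
    return addresses


def listens_on_all_interfaces(addresses):
    """True if any of the bound addresses is a wildcard address."""
    return any(address in WILDCARDS for address in addresses)


def current_listeners():
    """Run `ss -H -ltn` and return its output (needs iproute2 installed)."""
    result = subprocess.run(
        ["ss", "-H", "-ltn"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the controller, `listen_addresses(current_listeners())` would list the addresses bound to the port. If only the private address shows up, cloud nodes arriving on the public interface would presumably not get through, whatever the firewall allows.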

Florian Zillner

Jan 23, 2021, 6:10:21 AM
to Slurm User Community List
Chiming in on Michael's suggestion.

You can specify the same hostname in slurm.conf everywhere. For the on-premise nodes, point either DNS or the /etc/hosts entry at the local (private) IP address.
For the cloud nodes, point DNS or the hosts entry at the publicly reachable IP.

example /etc/hosts on-prem node:
10.10.10.10   myslurmcontroller 

example /etc/hosts cloud node:
50.50.50.50   myslurmcontroller 

example slurm.conf for both locations:
SlurmctldHost=myslurmcontroller
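
One way to sanity-check this setup from either kind of node is to confirm what the controller name resolves to locally and whether the slurmctld port answers at that address. The sketch below is a hypothetical helper, not part of Slurm. It assumes the example name myslurmcontroller from the hosts entries above and the default SlurmctldPort of 6817, so adjust both if your slurm.conf differs.

```python
import socket

CONTROLLER = "myslurmcontroller"  # example name from the hosts entries above
SLURMCTLD_PORT = 6817             # Slurm's default SlurmctldPort; an assumption here


def resolve(hostname):
    """Return the IPv4 address this node resolves `hostname` to, or None."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # covers socket.gaierror for unknown names
        return None


def can_connect(address, port, timeout=3.0):
    """Return True if a TCP connection to address:port succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def check_controller(hostname=CONTROLLER, port=SLURMCTLD_PORT):
    """Describe how this node sees the controller."""
    address = resolve(hostname)
    if address is None:
        return f"{hostname}: does not resolve on this node"
    state = "reachable" if can_connect(address, port) else "NOT reachable"
    return f"{hostname} -> {address}, port {port} {state}"
```

Calling `print(check_controller())` on an on-prem node should report the private address, and on a cloud node the public one. A "NOT reachable" result points at routing or firewall rules rather than at name resolution.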
