How to update a partition?

Kartikey Sarode

Nov 2, 2021, 00:50:34
to google-cloud-slurm-discuss
Hi,

We have deployed a multi-partition Slurm cluster using https://github.com/SchedMD/slurm-gcp.

Now, we are looking to increase the maximum number of nodes in a partition from 1000 to 5000.
I'm unable to find anything specific in the documentation that will help me with the commands to do so.

One doc recommended the approach below:
1. Update the partition information using scontrol.
2. Restart slurmctld and slurmdbd in the controller.
3. Run "scontrol reconfigure".

Is the above approach correct, or do I need to do something else?

Thanks,
Kartikey

Alex Chekholko

Nov 2, 2021, 15:08:52
to Kartikey Sarode, google-cloud-slurm-discuss
Hi Kartikey,

Here is how I remember doing it with this software configuration:
1) drain the cluster so all the compute nodes get turned off
2) update the number of nodes and partition definition in slurm.conf
3) restart slurmctld on the controller

You're done! The login node doesn't run any Slurm daemons, so there is nothing to change there. The compute nodes mount the central slurm.conf, so you can't easily change it for nodes that are already running (hence the drain in step 1), and slurmd on every new node loads the new config.
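
For step 2, here is a minimal sketch in Python of widening the node range. It assumes the nodes and the partition are declared in slurm.conf with a bracketed range such as `demo-compute-0-[0-999]`. The node prefix, the partition name `batch`, the sample lines, and the helper `widen_node_range` are all hypothetical. Check the slurm.conf that your deployment actually generated, and whether it is regenerated from a deployment config, before editing anything.

```python
import re


def widen_node_range(conf_text: str, prefix: str, new_count: int) -> str:
    """Rewrite every '<prefix>-[0-N]' node range so it covers new_count nodes.

    Hypothetical helper: it assumes node ranges start at 0 and use the
    bracketed hostlist form shown in the sample below.
    """
    pattern = re.compile(re.escape(prefix) + r"-\[0-\d+\]")
    replacement = f"{prefix}-[0-{new_count - 1}]"
    # A function replacement avoids any escape processing of the new text.
    return pattern.sub(lambda _match: replacement, conf_text)


# Hypothetical excerpt; real node names and options will differ.
sample = """\
NodeName=demo-compute-0-[0-999] State=CLOUD
PartitionName=batch Nodes=demo-compute-0-[0-999] MaxTime=INFINITE State=UP
"""

updated = widen_node_range(sample, "demo-compute-0", 5000)
print(updated)
```

Both the NodeName line and the PartitionName line need the same new range, which is why the sketch rewrites every occurrence rather than only the first. After writing the result back to slurm.conf, restart slurmctld on the controller as in step 3 (on a systemd-based controller this is typically `sudo systemctl restart slurmctld`).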

Regards,
Alex



Kartikey Sarode

Nov 3, 2021, 10:03:48
to google-cloud-slurm-discuss
Thanks Alex,

I will try this and report back on how it goes!

Regards,
Kartikey
