[slurm-users] Novice Slurm Upgrade Questions

Jason Simms

Dec 4, 2020, 2:36:36 PM
to Slurm User Community List
Hello all,

Thank you for being such a helpful resource for All Things Slurm; I sincerely appreciate the helpful feedback. Right now, we are running 20.02 and considering upgrading to 20.11 during our next maintenance window in January. This will be the first time we have upgraded Slurm, so understandably we are somewhat nervous and have some questions.

I am able to download the source and build RPMs successfully. What is unclear to me is whether I have to adjust anything in the slurm.spec file or use a .rpmmacros file to control certain aspects of the installation. Since this would be an upgrade, rather than a new install, do I have to adjust, e.g., the --prefix value, and all other settings (X11 support, etc.)? Or, will a yum update "correctly" put the files where they are on my system, using settings from the existing 20.02 version?

We purchased the system from a vendor, and of course they use custom scripts to build and install Slurm, and those are tailored for an initial installation, not an upgrade. Their advice to us was, don't upgrade if you don't need to, which seems reasonable, except that many of you respond to initial requests for help by recommending an upgrade. And in any case, Slurm doesn't upgrade nicely from more than two major versions back, so I'm hesitant to go too long without patching.

I'm terribly sorry for my ignorance of all this. But I really lament how terrible most resources are about all this. They assume that you have built the RPMs already, without offering any real guidance as to how to adjust relevant options, or even whether that is a requirement for an upgrade vs. a fresh installation.

Any guidance would be most welcome.

Warmest regards,
Jason

--
Jason L. Simms, Ph.D., M.P.H.
Manager of Research and High-Performance Computing
XSEDE Campus Champion
Lafayette College
Information Technology Services
710 Sullivan Rd | Easton, PA 18042
Office: 112 Skillman Library
p: (610) 330-5632

Paul Edmon

Dec 4, 2020, 2:47:00 PM
to slurm...@lists.schedmd.com

Usually the slurm.spec file provided doesn't change that much between versions.  What we do here is that we maintain a git repository of our slurm.spec that we use with our modifications.  Then each time Slurm is released we compare ours against what is provided, and simply modify the provided one with our changes.
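As a rough illustration of that comparison step, here is a minimal Python sketch that diffs a site-maintained slurm.spec against the one shipped in a new release. The function name and the file labels are hypothetical, and a plain `diff -u` or `git diff` would do the same job; this only shows the idea.

```python
import difflib


def spec_diff(site_spec_text, upstream_spec_text,
              site_name="site/slurm.spec",
              upstream_name="upstream/slurm.spec"):
    """Return a unified diff showing how the site spec departs from upstream.

    Lines prefixed with '-' exist only in the upstream spec and lines
    prefixed with '+' exist only in the site copy, so the '+' lines are
    the local changes to carry over into the new release's spec.
    """
    diff = difflib.unified_diff(
        upstream_spec_text.splitlines(keepends=True),
        site_spec_text.splitlines(keepends=True),
        fromfile=upstream_name,
        tofile=site_name,
    )
    return "".join(diff)


if __name__ == "__main__":
    # Hypothetical usage: both paths are placeholders for your own files.
    import sys
    from pathlib import Path

    if len(sys.argv) == 3:
        site_path, upstream_path = sys.argv[1], sys.argv[2]
        print(spec_diff(Path(site_path).read_text(),
                        Path(upstream_path).read_text(),
                        site_name=site_path,
                        upstream_name=upstream_path))
```

An empty result means the two files are identical, i.e. there are no local modifications to reapply.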

Unless you make specific tweaks to the slurm.spec, you should be able to use it out of the box with no problem.  As always, read the changelog to see whether there are any major changes between the versions, in case a feature you were using was deprecated.  This can happen during major version upgrades.

At least in my experience, if you follow the directions in the Slurm documentation regarding upgrades, you should be fine.  The only real hitch is that by default the RPMs restart the slurmdbd and slurmctld services, which you don't want when upgrading.  You should either neuter this or have both services stopped during the upgrade.  After the upgrade you should run slurmdbd and slurmctld in commandline mode for the initial run.  Once that is done and they are running normally, you can kill them and restart the relevant services.
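To make the ordering concrete, here is a small Python sketch that only prints the steps; it does not run anything. It assumes slurmdbd and slurmctld live on the same host, that both are managed by systemd, and that the rebuilt RPMs sit in a repository yum can see, none of which is stated above. The function name is hypothetical. `-D` keeps each daemon in the foreground and `-vvv` raises verbosity.

```python
import shlex


def upgrade_steps(package_pattern="slurm*"):
    """Return the controller-side upgrade steps as argv lists, in order."""
    return [
        # Stop both daemons so the RPM scriptlets cannot restart them
        # in the middle of the upgrade.
        ["systemctl", "stop", "slurmctld"],
        ["systemctl", "stop", "slurmdbd"],
        # Assumption: the new RPMs are published in a local yum repository.
        ["yum", "update", package_pattern],
        # Initial run in the foreground. Each of these blocks until you
        # interrupt it, so wait until it has settled, then stop it by hand.
        ["slurmdbd", "-D", "-vvv"],
        ["slurmctld", "-D", "-vvv"],
        # Finally hand both daemons back to systemd.
        ["systemctl", "start", "slurmdbd"],
        ["systemctl", "start", "slurmctld"],
    ]


if __name__ == "__main__":
    for number, argv in enumerate(upgrade_steps(), start=1):
        print(f"{number}. {shlex.join(argv)}")
```

The database daemon comes before the controller in the foreground runs, which matches the order the Slurm upgrade documentation gives (slurmdbd, then slurmctld, then the slurmd nodes).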

-Paul Edmon-

Ole Holm Nielsen

Dec 4, 2020, 3:11:13 PM
to slurm...@lists.schedmd.com
Hi Jason,

Slurm upgrading should be pretty simple, IMHO. I've been through this
multiple times, and my Slurm Wiki has detailed upgrade documentation:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm

Building RPMs is described in this page as well:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#build-slurm-rpms

I hope this helps.

/Ole

Jason Simms

Dec 4, 2020, 3:40:28 PM
to Ole.H....@fysik.dtu.dk, Slurm User Community List
Dear Ole,

Thanks. I've read through your docs many times. The relevant upgrade section begins with the assumption that you have properly configured RPMs, so all I'm trying to do is ensure I get to that point. As I noted, a vendor installed Slurm initially through a proprietary script, though they did base it off of created RPMs. I've reached out to them to see whether they used a modified slurm.spec file, which I suspect they did, given that Slurm is installed in /opt/slurm (which seems like a modified prefix, if nothing else).

The fundamental question is, if I am performing a yum update, and I don't adjust any settings in the default slurm.spec file, will it upgrade everything properly where they currently "live," or will it install new files in standard locations? It's a question of whether "yum update" is "smart enough" to figure out what was done before and go with that, or whether I must specify all relevant information in the slurm.spec file each time? Based on Paul's reply, it seems we do need an updated slurm.spec file that reflects our environment, each time we upgrade.

Jason

Paul Edmon

Dec 4, 2020, 3:58:08 PM
to slurm...@lists.schedmd.com

No, it won't figure it out automatically.  If your vendor didn't put Slurm in the default location (/opt isn't the default), you will need to ensure that the spec installs to the same location the vendor used.
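One way to do that without editing slurm.spec at all is to override the relevant RPM macros on the rpmbuild command line. The Python sketch below just assembles such a command as a list. The helper name is hypothetical, /opt/slurm is taken from Jason's message, and the configuration directory in the example is a guess that needs to be checked against the existing install. `_prefix` is the standard RPM prefix macro and `_slurm_sysconfdir` is the macro slurm.spec consults for the configuration directory.

```python
import shlex


def rpmbuild_command(tarball, prefix=None, sysconfdir=None):
    """Build an 'rpmbuild -ta' argv list with optional location overrides."""
    cmd = ["rpmbuild", "-ta", tarball]
    if prefix is not None:
        # Overrides where binaries and libraries are installed.
        cmd += ["--define", f"_prefix {prefix}"]
    if sysconfdir is not None:
        # Overrides where Slurm expects slurm.conf and friends.
        cmd += ["--define", f"_slurm_sysconfdir {sysconfdir}"]
    return cmd


if __name__ == "__main__":
    # The sysconfdir value here is an assumption, not something the
    # vendor install is known to use.
    cmd = rpmbuild_command("slurm-20.11.0.tar.bz2",
                           prefix="/opt/slurm",
                           sysconfdir="/opt/slurm/etc")
    print(shlex.join(cmd))
```

The same macros can live in ~/.rpmmacros instead, which answers the earlier question about that file: either route works, as long as the values match what is already on the system.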

-Paul Edmon-

Ole Holm Nielsen

Dec 5, 2020, 7:45:29 AM
to slurm...@lists.schedmd.com
Hi Jason,

Paul is right, so I guess you need to make a decision: 1) Keep your
vendor's customized Slurm setup going forward with future upgrades, or
2) alternatively remove the vendor's Slurm installation and install
standard RPMs built from SchedMD's tarballs.

If you choose 2) you need to analyze how deeply you have been locked
into the vendor's custom setup, also with cluster tools beyond Slurm.

You could save the vendor's slurm.conf and other vital .conf files
normally found in /etc/slurm/, plus make a dump of the Slurm database.
Then copy those files to your new setup.
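A minimal Python sketch of that backup step follows, assuming the accounting data is in MySQL/MariaDB under the default database name slurm_acct_db. Both function names and all paths are hypothetical. The dump command is only assembled here, not executed, and on a vendor install the .conf files may well live somewhere other than /etc/slurm/.

```python
import shutil
from pathlib import Path


def backup_conf_files(conf_dir, backup_dir):
    """Copy every *.conf file from conf_dir into backup_dir.

    Returns the copied file names in sorted order.
    """
    src = Path(conf_dir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for conf in sorted(src.glob("*.conf")):
        # copy2 also preserves timestamps and permission bits.
        shutil.copy2(conf, dst / conf.name)
        copied.append(conf.name)
    return copied


def database_dump_command(outfile, database="slurm_acct_db"):
    """Return a mysqldump argv list that writes the dump to outfile."""
    # Assumption: credentials come from the caller's MySQL client config.
    return ["mysqldump", f"--result-file={outfile}", database]


if __name__ == "__main__":
    # Placeholder paths; adjust to where the vendor put the config files.
    print(backup_conf_files("/etc/slurm", "/root/slurm-backup"))
    print(database_dump_command("/root/slurm-backup/slurm_acct_db.sql"))
```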

I strongly recommend that you try this migration on a test system (a
few old PCs or servers) to test if such a cluster works well. Migrating
the Slurm database needs to be done carefully, see my Wiki page
https://wiki.fysik.dtu.dk/niflheim/Slurm_database#migrate-the-slurmdbd-service-to-another-server

Best regards,
Ole