Hi,
I am a little new to this, so please pardon my ignorance.
I have configured Slurm on my cluster and it works fine with local users, but I am not able to get it working with LDAP/SSSD authentication.
User logins over ssh work fine. An LDAP user can log in to the login, slurmctld and compute nodes, but when they try to submit jobs, slurmctld logs an error about an invalid account or partition for the user.
Someone said we need to add the user manually to the database using the sacctmgr command, but I am not sure we need to do this for each and every LDAP user. Yes, it does work if we add the LDAP user manually using sacctmgr, but I am not convinced this manual way is the way to do it.
The documentation is not very clear about using LDAP accounts.
I saw somewhere on the list a suggestion to set UsePAM=1 and copy or create a softlink for the Slurm PAM module under /etc/pam.d, but it didn't work for me.
I saw somewhere else that we need to specify LaunchParameters=enable_nss_slurm in the slurm.conf file and put the slurm keyword in the passwd/group entries in the /etc/nsswitch.conf file. I did these, but they didn't help either.
I am bereft of ideas at present. If anyone has real-world experience and can advise, I will be grateful.
Thank you,
Richard
We use Active Directory and NFSv4, and I think that we have some instructions for setting it up on CentOS 7. It was quite involved and does require that the directory service returns UID and GID information, so we have populated the RFC2307 fields in AD. This is required for munge to work.
We also use AUKS (https://github.com/cea-hpc/auks and https://slurm.schedmd.com/slurm_ug_2012/auks-tutorial.pdf) so that the Kerberos keys are refreshed on the compute nodes, otherwise jobs must complete within the Kerberos key lifetime (for us 24 hours).
This may be overcomplicated for what you need, but it sounds as if you do not have consistent UIDs across all nodes, which would create problems for munge.
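One rough way to test the UID point is to collect what each node reports for the same user and compare. The sketch below is only an illustration: the function name uids_consistent is made up for this example, and it expects "node uid" pairs on stdin rather than contacting any node itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm or munge).
# Reads "node uid" pairs on stdin and succeeds only if every node
# reports the same UID; prints a line for each problem it finds.
uids_consistent() {
    awk '
        NF == 0 { next }                                   # skip blank lines
        NF < 2  { print "no UID reported by " $1; bad = 1; next }
        first == "" { first = $2 }                         # first UID seen is the reference
        $2 != first { print "mismatch on " $1 ": " $2 " (expected " first ")"; bad = 1 }
        END { exit bad + 0 }
    '
}
```

Assuming passwordless ssh and placeholder names (node01, node02, alice), the input could be produced with something like: for n in node01 node02; do printf '%s %s\n' "$n" "$(ssh "$n" id -u alice)"; done | uids_consistent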
I’ll let others chip in but I can probably find the documents used to set it up.
William
“An LDAP user can login to the login, slurmctld and compute nodes, but when they try to submit jobs, slurmctld logs an error about invalid account or partition for user.”
Since I don’t think it was mentioned below, does a non-LDAP user get the same error, or does it work by default?
We don’t use LDAP explicitly, but we’ve used sssd with Slurm and Active Directory for 6.5 years without issue. We’ve always added users to sacctmgr so that we could track usage by research group or class, so we never used a default account for all users.
From: Richard Chang via slurm-users <slurm...@lists.schedmd.com>
Date: Saturday, February 3, 2024 at 11:41 PM
To: slurm...@schedmd.com <slurm...@schedmd.com>
Subject: [slurm-users] SLURM configuration for LDAP users
Job submission works for local users. I was not aware we needed to manually add the LDAP users to the SlurmDB. Does it mean we need to add each and every user in LDAP to the Slurm database?
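If each user does need an association (which would be the case when slurm.conf enforces associations through AccountingStorageEnforce, assuming that is what is set here), the adding does not have to be done by hand one user at a time. A minimal sketch, assuming the LDAP users are members of a directory group that getent can resolve: the function name print_add_user_cmds, the group hpcusers and the account research are all made up for illustration, and the function only prints sacctmgr commands, it does not run them.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm).
# Reads `getent group`-style lines ("name:passwd:gid:member1,member2") on stdin
# and prints one sacctmgr command per listed member. Nothing is executed here.
# $1 is the Slurm account to attach the users to; it must already exist.
print_add_user_cmds() {
    account=$1
    awk -F: -v acct="$account" '
        {
            n = split($4, members, ",")        # 4th field is the member list
            for (i = 1; i <= n; i++)
                if (members[i] != "")
                    printf "sacctmgr -i add user name=%s account=%s\n", members[i], acct
        }
    '
}
```

For example, getent group hpcusers | print_add_user_cmds research prints the commands for review, and piping that output to sh would run them. Note that getent group only lists supplementary members, so users who have the group as their primary group would be missed, and the account itself has to be created first (sacctmgr add account research).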