[slurm-users] Secondary Unix group id of users not being issued in interactive srun command

Amjad Syed

Sep 21, 2021, 3:12:15 AM
to slurm...@schedmd.com
Hello all

We have users who have a secondary Unix group defined on our login nodes.

vas20xhu@login01 ~]$ groups


But when we start an interactive session and land on a compute node, the user does not have the secondary group BIO_AFMAKAY_LAB_USERS.

vas20xhu@c0077 ~]$ groups


This is our interactive script

alias interactive='srun -n1 -p interactive -J interactive --time=12:00:00 --mem-per-cpu=4G --pty bash --login'


When we ssh directly into the node without using the interactive script, there are no issues with groups.

Anything we are missing in that interactive script?


Ole Holm Nielsen

Sep 21, 2021, 3:32:23 AM
to slurm...@lists.schedmd.com
On 9/21/21 9:11 AM, Amjad Syed wrote:
> We have users who have have defined unix secondary id on our login nodes.
> vas20xhu@login01 ~]$ groups
> But when we run interactive  and go to compute node , the user does not
> have secondary  group of BIO_AFMAKAY_LAB_USERS
> vas20xhu@c0077 ~]$ groups
> BIO_pg

I believe that Slurm creates users in the database using the primary UNIX
group name. Slurm would not know about any secondary UNIX groups.

There must be a uniform user and group name space (including UIDs and
GIDs) across the cluster. It is your own responsibility to configure
users and groups in the passwd and group databases consistently, see
https://slurm.schedmd.com/quickstart_admin.html (search for GIDs).

FWIW, I have some information about creation of users and groups in this
Wiki page:


Loris Bennett

Sep 21, 2021, 4:13:20 AM
to Slurm User Community List
I would have thought that this maybe does not have anything to do with

Assuming you are using SSSD, it looks to me more like the settings in
sssd.conf on the nodes might be incorrect. In our sssd.conf I found the
following note to myself:

# LB: rfc2307bis should not be used if memberUid is used for group membership
# otherwise secondary groups fail
# ldap_schema = rfc2307bis

What schema you need depends on how your group information is stored.
rfc2307 assumes the groups just have a memberUID, whereas with rfc2307bis
the users also have a memberOf attribute.

I don't understand LDAP well enough to understand why rfc2307bis causes
the secondary group resolution to fail, even though the groups still
have the information via memberUID, but my experience was that it does
indeed fail.



Dr. Loris Bennett (Hr./Mr.)
ZEDAT, Freie Universität Berlin Email loris....@fu-berlin.de
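
One possible explanation for the failure described above, offered as a toy model rather than a description of SSSD internals: a client configured for rfc2307 matches the bare user name against a group's memberUid values, while a client configured for rfc2307bis matches the user's full DN against a group's member values. If the directory only stores memberUid, the second kind of lookup finds nothing. The DN and the dictionary layout below are hypothetical and exist only for illustration.

```python
def groups_rfc2307(username, group_entries):
    """rfc2307-style lookup: match the bare user name against memberUid."""
    return [g["cn"] for g in group_entries if username in g.get("memberUid", [])]


def groups_rfc2307bis(user_dn, group_entries):
    """rfc2307bis-style lookup: match the user's full DN against member."""
    return [g["cn"] for g in group_entries if user_dn in g.get("member", [])]


# Hypothetical directory content: the group stores membership as memberUid only.
directory = [
    {"cn": "BIO_AFMAKAY_LAB_USERS", "memberUid": ["vas20xhu"]},
]

# Hypothetical DN for the user; the real one depends on the site's LDAP tree.
user_dn = "uid=vas20xhu,ou=people,dc=example,dc=org"

secondary_with_rfc2307 = groups_rfc2307("vas20xhu", directory)
secondary_with_rfc2307bis = groups_rfc2307bis(user_dn, directory)

print("rfc2307:   ", secondary_with_rfc2307)
print("rfc2307bis:", secondary_with_rfc2307bis)
```

In this model the primary group is unaffected because it comes from the user's own gidNumber rather than from a membership search, which would match the symptom of seeing only the primary group.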

Sternberger, Sven

Sep 21, 2021, 4:59:32 AM
to Slurm User Community List

I tried it (I normally would use sacct and ssh) but even with "srun bash --login"
the "id" command gives me all groups. And we are using rfc2307bis.

And I don't think that slurm stores any group information. So it seems
to be the way the session starts on your node. But at this point
no idea where to look.

best regards

----- Original Message -----
> From: "Amjad Syed" <amja...@gmail.com>
> To: slurm...@schedmd.com
> Sent: Tuesday, 21 September 2021 09:11:33
> Subject: [slurm-users] Secondary Unix group id of users not being issued in interactive srun command

Bjørn-Helge Mevik

Sep 21, 2021, 5:02:56 AM
to slurm...@schedmd.com
Amjad Syed <amja...@gmail.com> writes:

> We have users who have have defined unix secondary id on our login nodes.
> vas20xhu@login01 ~]$ groups
> But when we run interactive and go to compute node , the user does not
> have secondary group of BIO_AFMAKAY_LAB_USERS
> vas20xhu@c0077 ~]$ groups
> BIO_pg

> When we ssh directly into node without using interactive script there are
> no issues with groups.

Have you set up your Slurm to be NSS provider for user and group info?
I believe that will only send primary group to the job step processes.
See the enable_nss_slurm LaunchParameters in man slurm.conf, and the URL
in that description.

Bjørn-Helge Mevik
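
A quick way to check the suggestion above is to look for enable_nss_slurm among the LaunchParameters options. The sketch below parses slurm.conf-style text (or the output of `scontrol show config`, which uses the same `Key = value` shape). The helper names are hypothetical, and the sketch assumes that the last LaunchParameters line wins if the option appears more than once.

```python
def launch_parameters(conf_text):
    """Return the LaunchParameters options found in slurm.conf-style text."""
    options = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "launchparameters":
            # Assumption: a later definition replaces an earlier one.
            options = [opt.strip() for opt in value.split(",") if opt.strip()]
    return options


def nss_slurm_enabled(conf_text):
    """True if enable_nss_slurm is listed in LaunchParameters."""
    return "enable_nss_slurm" in launch_parameters(conf_text)


# Made-up sample configuration text, for illustration only.
sample = """
# example fragment
ClusterName=demo
LaunchParameters=enable_nss_slurm,use_interactive_step
"""

print(launch_parameters(sample))
print(nss_slurm_enabled(sample))
```

On a real cluster the input could be the contents of slurm.conf or the text printed by `scontrol show config`.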


Timo Rothenpieler

Sep 21, 2021, 5:08:19 AM
to slurm...@lists.schedmd.com
Are you using LDAP for your users?
This sounds exactly like what I was seeing on our cluster when
nsswitch.conf was not properly set up.

In my case, I was missing a line like

> initgroups: files [SUCCESS=continue] ldap

Just adding ldap to group: was not enough, and only got the primary
group to work, exactly like in your case.
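
To tell an NSS lookup problem of the kind described above apart from a session that was simply started without its supplementary groups, it may help to compare two things on the compute node: the groups the current process actually holds, and the groups an initgroups-style lookup returns for the user at that moment. The following is a minimal sketch using only the Python standard library; the function names are hypothetical.

```python
import grp
import os
import pwd


def group_names(gids):
    """Map numeric GIDs to names, falling back to the number if unresolvable."""
    names = set()
    for gid in gids:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            names.add(str(gid))
    return names


def missing_groups(held, resolved):
    """GIDs that NSS resolves for the user but the process does not hold."""
    return sorted(set(resolved) - set(held))


def compare_groups():
    """Compare groups held by this process with what NSS reports for the user."""
    held = set(os.getgroups()) | {os.getgid()}
    try:
        user = pwd.getpwuid(os.getuid())
        resolved = set(os.getgrouplist(user.pw_name, user.pw_gid))
    except KeyError:
        # No passwd entry visible for this UID, so nothing can be resolved.
        resolved = set()
    return {
        "held": group_names(held),
        "resolved": group_names(resolved),
        "missing_from_process": group_names(missing_groups(held, resolved)),
    }


if __name__ == "__main__":
    for key, names in compare_groups().items():
        print(f"{key}: {' '.join(sorted(names)) or '(none)'}")
```

Running the script once through srun and once over ssh on the same node gives two reports to compare. If "resolved" already lacks the secondary group, the lookup configuration on the node (nsswitch.conf, SSSD) is the likelier suspect; if "resolved" is complete but "held" is not, the groups were lost when the session was launched.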
