Hi all,
I am running a Slurm cluster built from the basic example in the terraform section of the slurm-gcp repo. The setup uses OS Login, and the user has admin access to the compute node.
Now, if the admin user SSHes into the login node using IAP, they can run `sudo su` and become root. But if the same user starts an interactive session with `srun --exclusive=user --pty --partition=P1 /bin/bash`, they cannot run `sudo su` on the compute node. If the user SSHes directly to the compute node using IAP, `sudo su` works.
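To make the difference between the three sessions above easier to compare, here is a rough sketch of a check that could be run in each of them (login node over IAP, the `srun` shell, and the compute node over IAP). The function name `session_report` is hypothetical, and I am assuming `id`, `getent`, and `sudo` are on the PATH. It only reports what the session sees and does not change anything.

```shell
# Hypothetical helper: print the identity details that sudo depends on,
# so the output can be diffed between the srun shell and an SSH session.
session_report() {
  echo "user=$(id -un)"
  echo "uid=$(id -u)"
  echo "groups=$(id -Gn)"

  # Does NSS resolve the current user in this session?
  if getent passwd "$(id -un)" >/dev/null 2>&1; then
    echo "nss_passwd=found"
  else
    echo "nss_passwd=missing"
  fi

  # sudo -n never prompts; it fails immediately if sudo is not permitted.
  if sudo -n true 2>/dev/null; then
    echo "sudo=ok"
  else
    echo "sudo=denied"
  fi
}

session_report
```

Saving the output from each session to a file and running `diff` on the files should show whether the user and group lookups differ, or whether only the `sudo` result changes.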
Things I have tried:
1. Adding /var/google-sudoers.d/$user with all permissions allowed at startup (after the Google agent and before sshd). This results in the error `sudo: account validation failure, is your account locked`.
The setup seems to involve multiple systems, such as OS Login and munge, and I don't understand PAM, NSS, etc. well enough to experiment with it. Is there a way to make sure the OS Login permissions are reflected on the compute node without having to SSH into the node separately?