[slurm-users] pam_slurm_adopt not working for all users

Loris Bennett

unread,
May 21, 2021, 8:54:11 AM5/21/21
to Slurm Users Mailing List
Hi,

We have set up pam_slurm_adopt using the official Slurm documentation
and Ole's information on the subject. It works for a user who has SSH
keys set up, albeit with a prompt for the key's passphrase:

$ salloc --partition=gpu --gres=gpu:1 --qos=hiprio --ntasks=1 --time=00:30:00 --mem=100
salloc: Granted job allocation 7202461
salloc: Waiting for resource configuration
salloc: Nodes g003 are ready for job

$ ssh g003
Warning: Permanently added 'g003' (ECDSA) to the list of known hosts.
Enter passphrase for key '/home/loris/.ssh/id_rsa':
Last login: Wed May 5 08:50:00 2021 from login.curta.zedat.fu-berlin.de

$ ssh g004
Warning: Permanently added 'g004' (ECDSA) to the list of known hosts.
Enter passphrase for key '/home/loris/.ssh/id_rsa':
Access denied: user loris (uid=182317) has no active jobs on this node.
Access denied by pam_slurm_adopt: you have no active jobs on this node
Authentication failed.

If SSH keys are not set up, then the user is asked for a password:

$ squeue --me
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
7201647 main test_job nokeylee R 3:45:24 1 c005
7201646 main test_job nokeylee R 3:46:09 1 c005
$ ssh c005
Warning: Permanently added 'c005' (ECDSA) to the list of known hosts.
nokeylee@c005's password:

My assumption was that a user should be able to log into a node on which
that person has a running job without any further ado, i.e. without the
necessity to set up anything else or to enter any credentials.

Is this assumption correct?

If so, how can I best debug what I have done wrong?

Cheers,

Loris

--
Dr. Loris Bennett (Hr./Mr.)
ZEDAT, Freie Universität Berlin Email loris....@fu-berlin.de

Juergen Salk

unread,
May 21, 2021, 11:35:32 AM5/21/21
to Slurm User Community List
Hi Loris,

this depends largely on whether host-based authentication is
configured (which does not seem to be the case for you) and also on
what exactly the PAM stack for sshd looks like in /etc/pam.d/sshd.

As the rules are worked through in the order they appear in
/etc/pam.d/sshd, pam_slurm_adopt cannot bypass the rules that are
placed further up the PAM stack and that are responsible for
regular authentication such as password or public key
authentication.
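
For illustration, the pam_slurm_adopt bits usually end up in the
account part of /etc/pam.d/sshd, roughly like this (module order and
the pam_access line are assumptions based on the pam_slurm_adopt
documentation; your distribution's stack will look different):

account    sufficient   pam_access.so        # optional: let admins in regardless of jobs
account    required     pam_slurm_adopt.so   # deny users with no job on this node

The auth lines further up (pam_unix etc.) still perform the normal
authentication, which is why pam_slurm_adopt on its own can never let a
user in without credentials.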

Best regards
Jürgen

--
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471

* Loris Bennett <loris....@fu-berlin.de> [210521 14:53]:
--
GPG A997BA7A | 87FC DA31 5F00 C885 0DC3 E28F BD0D 4B33 A997 BA7A

Tina Friedrich

unread,
May 21, 2021, 11:35:53 AM5/21/21
to slurm...@lists.schedmd.com
Hi Loris,

I'm not a PAM expert, but pam_slurm_adopt doesn't do authentication;
it only verifies that access for the authenticated user is allowed (by
checking that there's a job). It's 'account', not 'auth', in the PAM
config. In other words, it's got nothing to do with how the user logs
in to the server / is authenticated by the server.

So yes, I'd expect this. For SSH logins to work, users need to, well, be
able to log in via SSH. Key-based, password auth, host-based SSH,
Kerberos, ... - whatever auth mechanism your PAM config is set up to
use (or whatever you've configured in sshd_config).

If this is simply about quickly accessing nodes that they have jobs on
to check on them - we tell our users to 'srun' into a job allocation
(srun --jobid=XXXXXX).
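
For instance (substitute the actual job ID):

$ srun --jobid=<jobid> --pty /bin/bash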

Tina
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk http://www.it.ox.ac.uk

Ole Holm Nielsen

unread,
May 21, 2021, 11:40:16 AM5/21/21
to slurm...@lists.schedmd.com
Hi Loris,

I don't know if this would solve your problem, but I think that node SSH
keys should be gathered and distributed. See my notes in
https://wiki.fysik.dtu.dk/niflheim/SLURM#ssh-keys-for-password-less-access-to-cluster-nodes

/Ole

Marcus Wagner

unread,
May 21, 2021, 11:45:07 AM5/21/21
to slurm...@lists.schedmd.com
Hi Loris,

pam_slurm_adopt just allows or disallows a user to log in to a node,
depending on whether that user has a job running there or not.
You still have to do something so that the user can log in without a
password, e.g. through host-based authentication.

Best
Marcus

Juergen Salk

unread,
May 21, 2021, 12:31:24 PM5/21/21
to Slurm User Community List
* Tina Friedrich <tina.fr...@it.ox.ac.uk> [210521 16:35]:

> If this is simply about quickly accessing nodes that they have jobs on to
> check on them - we tell our users to 'srun' into a job allocation (srun
> --jobid=XXXXXX).

Hi Tina,

sadly, this does not always work in version 20.11.x any more because of the
new non-overlapping default behaviour for job step allocations.

$ sbatch -n 1 --wrap="srun sleep 600"
Submitted batch job 2550804
$ squeue --me
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2550804 standard wrap user01 R 0:06 1 n0326

$ srun --jobid=2550804 --pty /bin/bash
srun: Job 2550804 step creation temporarily disabled, retrying (Requested nodes are busy)

(and hangs forever until Ctrl-C'ed ...)

^Csrun: Cancelled pending job step with signal 2
srun: error: Unable to create step for job 2550804: Job/step already completing or completed
$

This now needs the --overlap option for both the job allocation itself and the
srun command that attaches the shell, in order to always work as before.
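
In practice I read that as something like the following (a sketch only;
adapt to your own job script):

$ sbatch -n 1 --wrap="srun --overlap sleep 600"
$ srun --jobid=<jobid> --overlap --pty /bin/bash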

Best regards
Jürgen



Brian Andrus

unread,
May 21, 2021, 3:34:12 PM5/21/21
to slurm...@lists.schedmd.com
Umm... your keys are passphrase-protected. If they were not, you would be
getting what you expect:

Enter passphrase for key '/home/loris/.ssh/id_rsa':

Brian Andrus

Brian Andrus

unread,
May 21, 2021, 3:36:07 PM5/21/21
to slurm...@lists.schedmd.com
Oh, you could also use the ssh-agent to manage the keys, then use
'ssh-add ~/.ssh/id_rsa' to type the passphrase once for your whole
session (from that system).
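
For example, on the login node:

$ eval "$(ssh-agent -s)"
$ ssh-add ~/.ssh/id_rsa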

Brian Andrus


On 5/21/2021 5:53 AM, Loris Bennett wrote:

Loris Bennett

unread,
May 25, 2021, 8:10:09 AM5/25/21
to Slurm User Community List
Hi everyone,

Thanks for all the replies.

I think my main problem is that I expect logging in to a node with a job
to work with pam_slurm_adopt but without any SSH keys. My assumption
was that MUNGE takes care of the authentication, since users' jobs start
on nodes without the need for keys.

Can someone confirm that this expectation is wrong and, if possible, why
the analogy with jobs is incorrect?

I have a vague memory that this used to work on our old cluster with an
older version of Slurm, but I could be thinking of a time before we set
up pam_slurm_adopt.

Cheers,

Loris

Ole Holm Nielsen

unread,
May 25, 2021, 8:39:14 AM5/25/21
to slurm...@lists.schedmd.com
Hi Loris,

I think you need, as pointed out by others, either of:

* SSH keys, see
https://wiki.fysik.dtu.dk/niflheim/SLURM#ssh-keys-for-password-less-access-to-cluster-nodes

* SSH host-based authentication, see
https://wiki.fysik.dtu.dk/niflheim/SLURM#host-based-authentication

/Ole

Loris Bennett

unread,
May 25, 2021, 10:57:11 AM5/25/21
to Slurm User Community List
Hi Ole,

Thanks for the links.

I have discovered that the users whose /home directories were migrated
from our previous cluster all seem to have a pair of keys which were
created along with files like '~/.bash_profile'. Users who have been
set up on the new cluster don't have these files.

Is there some /etc/skel-like mechanism which will create passwordless
SSH keys when a user logs into the system for the first time? It looks
increasingly to me that such a mechanism must have existed on our old
cluster.

Cheers,

Loris

Lloyd Brown

unread,
May 25, 2021, 11:16:20 AM5/25/21
to slurm...@lists.schedmd.com
We had something similar happen when we migrated away from a
Rocks-based cluster.  We used a script like the one attached, in
/etc/profile.d, which was modeled heavily on something similar in Rocks.

You might need to adapt it a bit for your situation, but otherwise it's
pretty straightforward.
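
The attachment doesn't come through inline in the archive, but the idea
is roughly the following (a sketch only, not the actual script; the
filename and key type are assumptions):

# /etc/profile.d/ssh-key.sh -- sketch of a first-login key generator
# Create a passphraseless key pair on first login and authorize it for
# intra-cluster SSH.
if [ ! -f "$HOME/.ssh/id_rsa" ]; then
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    ssh-keygen -q -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
    chmod 600 "$HOME/.ssh/authorized_keys"
fi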

Lloyd

--
Lloyd Brown
HPC Systems Administrator
Office of Research Computing
Brigham Young University
http://marylou.byu.edu
ssh-key.sh

Loris Bennett

unread,
May 25, 2021, 12:07:28 PM5/25/21
to Slurm User Community List
Hi Lloyd,

Lloyd Brown <lloyd...@byu.edu> writes:

> We had something similar happen when we migrated away from a Rocks-based
> cluster.  We used a script like the one attached, in /etc/profile.d, which was
> modeled heavily on something similar in Rocks.
>
> You might need to adapt it a bit for your situation, but otherwise it's pretty
> straightforward.
>
> Lloyd

I was just getting round to the idea that /etc/profile.d might be the
way to go, so your script looks like exactly the sort of thing I need.

Thanks!

Loris

PS Am I wrong to be surprised that this is something one needs to roll
oneself? It seems to me that most clusters would want to implement
something similar. Is that incorrect? If not, are people doing
something else? Or did some vendor setting things up with a home-spun
script in the dim and distant past happen to everyone else too?

Brian Andrus

unread,
May 25, 2021, 12:24:13 PM5/25/21
to slurm...@lists.schedmd.com
Your mistake is that munge has nothing to do with sshd, which is the
daemon you are connecting to. sshd can use PAM (hence the ability to use
pam_slurm_adopt), but munge has no PAM integration that I am aware of.

As far as your /etc/skel bits go, that is something that is done when a
user's home is first created at initial login (if so configured). So,
depending on how/where they did that, such items should be created
automatically.
SSH keys, however, are not created automatically. As others have
mentioned, you can create a script in /etc/profile.d/ where some of your
initial items can be executed. We have HPC_Setup.sh in there, where we
create SSH keys, set up their .forward file and perform other setup tasks.

Brian Andrus

Tina Friedrich

unread,
May 25, 2021, 12:32:04 PM5/25/21
to slurm...@lists.schedmd.com
...I really didn't want to wade in on this, but why not set up host-based
SSH? It's not exactly as if passphraseless keys give better security?

Tina

Patrick Goetz

unread,
May 25, 2021, 1:04:22 PM5/25/21
to slurm...@lists.schedmd.com
On 5/25/21 11:07 AM, Loris Bennett wrote:
> PS Am I wrong to be surprised that this is something one needs to roll
> oneself? It seems to me that most clusters would want to implement
> something similar. Is that incorrect? If not, are people doing
> something else? Or did some vendor setting things up with a home-spun
> script in the dim and distant past happen to everyone else too?
>

I'm guessing a lot of clusters don't allow users to connect to nodes
directly using ssh, as this allows them to circumvent slurm.





Max Voit

unread,
May 25, 2021, 1:10:55 PM5/25/21
to slurm...@lists.schedmd.com
On Tue, 25 May 2021 14:09:54 +0200
"Loris Bennett" <loris....@fu-berlin.de> wrote:

> to work with pam_slurm_adopt but without any SSH keys. My assumption
> was that MUNGE takes care of the authentication, since users' jobs
> start on nodes with the need for keys.
>
> Can someone confirm that this expectation is wrong and, if possible,
> why the analogy with jobs is incorrect?

sshd uses PAM for authentication purposes only for the methods
"password" and "challenge-response". The remaining involvement of PAM
is limited to the "account" and "session" facilities (the latter of
which pam_slurm_adopt is associated with). Thus, if not using
"password" or "challenge-response" authentication in sshd, some other
authentication method has to be used (by sshd) which cannot possibly
rely on PAM.
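
In sshd_config terms that boils down to something like this (an
illustrative sketch, not a recommendation for your site):

UsePAM yes                   # "account" and "session" modules always run,
                             # but "auth" modules only for password /
                             # keyboard-interactive logins
PubkeyAuthentication yes     # handled entirely by sshd, not by PAM
PasswordAuthentication no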

On Tue, 25 May 2021 17:31:42 +0100
Tina Friedrich <tina.fr...@it.ox.ac.uk> wrote:

> ...I really didn't want to wade in on this, but why not set up host
> based ssh? It's not exactly as if passphraseless keys give better
> security?

IMHO it's worse. With host-based authentication you limit from where
the nodes of the cluster can be accessed (and this will usually be
restricted to "inside the cluster"). With passwordless key pairs, in
contrast, passwordless access to the cluster from outside is possible
as soon as a generated private key is taken outside.

Best regards,
Max

Michael Jennings

unread,
May 25, 2021, 2:09:33 PM5/25/21
to Slurm User Community List
On Tuesday, 25 May 2021, at 14:09:54 (+0200),
Loris Bennett wrote:

> I think my main problem is that I expect logging in to a node with a job
> to work with pam_slurm_adopt but without any SSH keys. My assumption
> was that MUNGE takes care of the authentication, since users' jobs start
> on nodes with the need for keys.
>
> Can someone confirm that this expectation is wrong and, if possible, why
> the analogy with jobs is incorrect?

Yes, that expectation is incorrect. When Slurm launches jobs, even
interactive ones, it is Slurm itself that handles connecting all the
right sockets to all the right places, and MUNGE handles the
authentication for that action.

SSHing into a cluster node isn't done through Slurm; thus, sshd handles
the authentication piece by calling out to your PAM stack (by
default). And you should think of pam_slurm_adopt as adding a
"required but not sufficient" step in your auth process for SSH; that
is, if it fails, the user can't get in, but if it succeeds, PAM just
moves on to the next module in the stack.

(Technically speaking, it's PAM, so the above is only the default
configuration. It's theoretically possible to set up PAM in a
different way...but that's very much a not-good idea.)

> I have a vague memory that this used to work on our old cluster with an
> older version of Slurm, but I could be thinking of a time before we set
> up pam_slurm_adopt.

Some cluster tools, such as Warewulf and PERCEUS, come with built-in
scripts to create SSH key pairs (with unencrypted private keys) that
had special names for any (non-system) user who didn't already have a
pair. Maybe the prior cluster was doing something like that? Or
could it have been using Host-based Auth?

> I have discovered that the users whose /home directories were migrated
> from our previous cluster all seem to have a pair of keys which were
> created along with files like '~/.bash_profile'. Users who have been
> set up on the new cluster don't have these files.
>
> Is there some /etc/skel-like mechanism which will create passwordless
> SSH keys when a user logs into the system for the first time? It looks
> increasingly to me that such a mechanism must have existed on our old
> cluster.

That tends to point toward the "something was doing it for you before
that is no longer present" theory.

You do NOT want to use /etc/skel for this, though. That would cause
all your users to have the same unencrypted private key providing
access to their user account, which means they'd be able to SSH around
as each other. That's...problematic. ;-)

> I was just getting round to the idea that /etc/profile.d might be
> the way to go, so your script looks like exactly the sort of thing I
> need.

You can definitely do it that way, and a lot of sites do. But
honestly, you're better served by setting up Host-based Auth for SSH.
It uses the same public/private keypair exchange to authenticate hosts
to each other that is normally used for users, so as long as your hosts
are secure, you can rely on the security of HostbasedAuthentication.

With unencrypted private keys (that's what "passphraseless" really
means), you definitely can be opening the door to abuse. If you want
to go that route, you'd likely want to set up something that users
couldn't abuse, e.g. via AuthorizedKeysCommand, rather than the
traditional in-homedir key pairs.

We use host-based for all of our clusters here at LANL, and it
simplifies a *lot* for us. If you want to give it a try, there's a
good cookbook here:
https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Host-based_Authentication
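
In outline it comes down to something like this (a sketch based on the
OpenSSH man pages; the cookbook above has the details and gotchas):

# on the compute nodes, in /etc/ssh/sshd_config:
HostbasedAuthentication yes
# ...and list the trusted cluster hosts in /etc/shosts.equiv

# on the login nodes, in /etc/ssh/ssh_config:
HostbasedAuthentication yes
EnableSSHKeysign yes

# plus the nodes' host keys distributed in /etc/ssh/ssh_known_hosts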

HTH,
Michael

--
Michael E. Jennings <m...@lanl.gov> - [PGPH: he/him/his/Mr] -- hpc.lanl.gov
HPC Systems Engineer -- Platforms Team -- HPC Systems Group (HPC-SYS)
Strategic Computing Complex, Bldg. 03-2327, Rm. 2341 W: +1 (505) 606-0605
Los Alamos National Laboratory, P.O. Box 1663, Los Alamos, NM 87545-0001

Ole Holm Nielsen

unread,
May 25, 2021, 4:12:20 PM5/25/21
to slurm...@lists.schedmd.com
This is the reason why pam_slurm_adopt was introduced years ago! It
solves that problem nicely. Further info is in
https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#pam-module-restrictions

/Ole

Ole Holm Nielsen

unread,
May 25, 2021, 4:14:50 PM5/25/21
to slurm...@lists.schedmd.com
On 25-05-2021 18:07, Loris Bennett wrote:
> PS Am I wrong to be surprised that this is something one needs to roll
> oneself? It seems to me that most clusters would want to implement
> something similar. Is that incorrect? If not, are people doing
> something else? Or did some vendor setting things up with a home-spun
> script in the dim and distant past happen to everyone else too?

Yes, you need to configure SSH yourself on a Slurm cluster. It's well
understood and documented using, for example, either of the methods
linked earlier in this thread (user SSH keys or host-based authentication).

Loris Bennett

unread,
May 27, 2021, 2:19:44 AM5/27/21
to Slurm User Community List
Hi Michael,
Thanks for the detailed explanations. I was obviously completely
confused about what MUNGE does. Would it be possible to say, in very
hand-waving terms, that MUNGE performs a similar role for the access of
processes to nodes as SSH does for the access of users to nodes?

Regarding keys vs. host-based SSH, I see that host-based would be more
elegant, but would involve more configuration. What exactly are the
simplification gains you see? I just have a single cluster and naively I
would think dropping a script into /etc/profile.d on the login node
would be less work than re-configuring SSH for the login node and
multiple compute node images.

Regarding AuthorizedKeysCommand, I don't think we can use that, because
users don't necessarily have existing SSH keys. What abuse scenarios
were you thinking of in connection with in-homedir key pairs?

Cheers,

Loris

Ole Holm Nielsen

unread,
May 27, 2021, 2:50:31 AM5/27/21
to slurm...@lists.schedmd.com
Hi Loris,

On 5/27/21 8:19 AM, Loris Bennett wrote:
> Regarding keys vs. host-based SSH, I see that host-based would be more
> elegant, but would involve more configuration. What exactly are the
> simplification gains you see? I just have a single cluster and naively I
> would think dropping a script into /etc/profile.d on the login node
> would be less work than re-configuring SSH for the login node and
> multiple compute node images.

IMHO, it's really simple to set up host-based SSH authentication:
https://wiki.fysik.dtu.dk/niflheim/SLURM#host-based-authentication

This is more secure on Linux clusters, and you don't need to configure
users' SSH keys, so it requires less configuration for the sysadmin in the
long run.

/Ole

Ward Poelmans

unread,
May 27, 2021, 3:05:23 AM5/27/21
to slurm...@lists.schedmd.com
On 27/05/2021 08:19, Loris Bennett wrote:
> Thanks for the detailed explanations. I was obviously completely
> confused about what MUNGE does. Would it be possible to say, in very
> hand-waving terms, that MUNGE performs a similar role for the access of
> processes to nodes as SSH does for the access of users to nodes?

A tiny bit, yes. Munge allows you to authenticate users between servers
(like a unix socket does within a single machine):
https://github.com/dun/munge/wiki/Man-7-munge


Ward

Loris Bennett

unread,
May 27, 2021, 4:12:02 AM5/27/21
to Slurm User Community List
OK, thanks for the information. I had already read the man page for
MUNGE, but to me it doesn't make it explicitly clear that MUNGE doesn't,
out of the box, include the possibility to do something like SSH.

Would it be correct to say that, if one were daft enough, one could
build some sort of terminal server on top of MUNGE without using SSH,
but which could then replicate basic SSH behaviour?

Loris Bennett

unread,
May 27, 2021, 4:25:00 AM5/27/21
to Slurm User Community List
Hi Ole,

Ole Holm Nielsen <Ole.H....@fysik.dtu.dk> writes:

> Hi Loris,
>
> On 5/27/21 8:19 AM, Loris Bennett wrote:
>> Regarding keys vs. host-based SSH, I see that host-based would be more
>> elegant, but would involve more configuration. What exactly are the
>> simplification gains you see? I just have a single cluster and naively I
>> would think dropping a script into /etc/profile.d on the login node
>> would be less work than re-configuring SSH for the login node and
>> multiple compute node images.
>
> IMHO, it's really simple to set up host-based SSH authentication:
> https://wiki.fysik.dtu.dk/niflheim/SLURM#host-based-authentication

Your explanation is very clear, but it still seems like quite a few
steps with various gotchas, like the fact that, as I understand it,
shosts.equiv has to contain all the possible ways a host might be
addressed (short name, long name, IP).

> This is more secure on Linux clusters, and you don't need to configure users'
> SSH keys, so it requires less configuration for the sysadmin in the long run.

It is not clear to me what the security advantage is, and setting up the
keys is just one script in /etc/profile.d. Regarding the long term, the
keys which were set up on our old cluster were just migrated to the new
cluster and still work, so it is also a one-time thing.

I assume I must be missing something.

Michael Jennings

unread,
May 27, 2021, 11:28:39 AM5/27/21
to Loris Bennett, Slurm User Community List
On Thursday, 27 May 2021, at 08:19:14 (+0200),
Loris Bennett wrote:

> Thanks for the detailed explanations. I was obviously completely
> confused about what MUNGE does. Would it be possible to say, in very
> hand-waving terms, that MUNGE performs a similar role for the access of
> processes to nodes as SSH does for the access of users to nodes?

If you replace the word "processes" with the word "jobs," you've got
it. :-)

MUNGE is really just intended to be a simple, lightweight solution to
allow for creating a single, global "credential domain" among all the
hosts in an HPC cluster using a single shared secret. Without going
into too much detail with the crypto stuff, it basically allows a
trusted local entity to cryptographically prove to another that
they're both part of the same trust/cred domain; having established
this, they know they can trust each other to provide and/or validate
credentials between hosts.

But I want to emphasize the "single shared secret" part. That means
there's a single trust domain. Think "root of trust" with nothing but
the root of trust. So you can authenticate a single group of hosts to
all the rest of the group such that all are equals, but that's it.
There's no additional facility for authenticating different roles or
anything like that. Either you have the same shared secret or you
don't; nothing else is possible.
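
You can see that shared-secret model for yourself with the usual MUNGE
self-test (the second command only succeeds if the remote node shares
the same munge key; the hostname is just borrowed from earlier in the
thread):

$ munge -n | unmunge            # encode and decode a credential locally
$ munge -n | ssh g003 unmunge   # decode it on another node in the trust domain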

> Regarding keys vs. host-based SSH, I see that host-based would be more
> elegant, but would involve more configuration. What exactly are the
> simplification gains you see? I just have a single cluster and naively I
> would think dropping a script into /etc/profile.d on the login node
> would be less work than re-configuring SSH for the login node and
> multiple compute node images.

I like to think of it as "one and done." At least in our case at
LANL, and at LBNL previously, all nodes of the same type/group boot
the same VNFS image. As long as I don't need to cryptographically
differentiate among, say, compute nodes, I only have to set up a
single set of credentials for all the hosts, and I'm done.

It also saves overall support time in my experience. By taking the
responsibility for inter-machine trust myself at the system level, I
don't have to worry about (1) modifying a user's SSH config without
their knowledge, (2) running the risk of them messing with their
config and breaking it, or (3) any user support/services calls about
"why can't I do any of the things on the stuff?!" :-)

It is totally a personal/team choice, but I'll be honest: Once I
"discovered" host-based authentication and all the headaches it saved
our sysadmin and consulting teams, I was kicking myself for having
done it the other way for so long! :-D

> Regarding AuthorizedKeysCommand, I don't think we can use that, because
> users don't necessarily have existing SSH keys. What abuse scenarios
> were you thinking of in connection with in-homedir key pairs?

Users don't have to have existing keys for it to work; the command you
specify can easily create a key pair, drop the private key, and output
the public key. Or even simpler, you can specify a value for
"AuthorizedKeysFile" that points to a directory users can't write to,
and store a key pair for each user in that location. Lots of ways to
do it.
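
For example, either of these in sshd_config would do the trick (the
helper script name is purely hypothetical):

AuthorizedKeysFile /etc/ssh/authorized_keys/%u
# or, generated on demand:
# AuthorizedKeysCommand /usr/local/sbin/gen-user-key %u
# AuthorizedKeysCommandUser nobody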

But if I'm being frank about it, if I had my druthers, we'd be using
certificates for authentication, not files. The advantages are, in my
very humble opinion, well worth a little extra setup time!

As far as abuse of keys goes: What's stopping your user from taking
that private key you created for them (which is, as you recall,
*unencrypted*) outside of your cluster to another host somewhere else
on campus? Maybe something that has tons of untrusted folks with
root. Then any of those folks can SSH to your cluster as that user.

Credential theft is a *huge* problem in HPC across the world, so I
always recommend that sysadmins think of it as Public Enemy #1! The
more direct and permanent control you have over user credentials, the
better. :-)

> Would it be correct to say that, if one were daft enough, one could
> build some sort of terminal server on top of MUNGE without using SSH,
> but which could then replicate basic SSH behaviour?

No; that would only provide a method to authenticate servers at best.
You can't authenticate users for the reasons I noted above. Single
shared key, single trust domain.

> Your explanation is very clear, but it still seems like quite a few
> steps with various gotchas, like the fact that, as I understand it,
> shosts.equiv has to contain all the possible ways a host might be
> addressed (short name, long name, IP).

You are correct, though that's easy to automate with a teensy weensy
shell script. But yes, there's more up-front configuration. Again,
though, I truly believe it saves admin time in the long run (not to
mention user support staff time and user pain). But again, that's a
personal or team choice.
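
Something like this is all it takes (a sketch; the FQDN suffix and the
exact output format are assumptions you'd adjust to your naming scheme):

# build /etc/shosts.equiv with short name, FQDN and IP for every node
for h in $(sinfo -h -N -o '%n' | sort -u); do
    ip=$(getent hosts "$h" | awk '{print $1; exit}')
    printf '%s\n%s\n%s\n' "$h" "$h.cluster.example.org" "$ip"
done > /etc/shosts.equiv
# (add your login nodes to the list as well)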

I'm not sure if I'm clearing things up or just muddying the waters.
But hopefully at least *some* of that helped! :-D

Prentice Bisbal

unread,
May 27, 2021, 11:55:36 AM5/27/21
to slurm...@lists.schedmd.com
Loris,

Your analogy is incorrect, because Slurm doesn't use SSH to launch jobs;
it uses its own communication protocol, which uses munge for
authentication. Some schedulers used to use SSH to launch jobs, but most
have moved to using their own communications protocol outside of SSH.
It's possible that Slurm used SSH in the early days, too. I wouldn't
know. I've only been using Slurm for the past 5 years.

In those cases, you usually needed host-based SSH so that the scheduler
daemon could launch jobs on the compute nodes. In that situation, you
would be able to ssh from one node to another without per-user ssh keys,
since they'd already be set up on a per-host basis. Perhaps that's what
you are remembering.

Prentice

Prentice Bisbal

unread,
May 27, 2021, 12:02:35 PM5/27/21
to slurm...@lists.schedmd.com
What makes this more secure?

--
Prentice


Lloyd Brown

unread,
May 27, 2021, 12:08:22 PM5/27/21
to slurm...@lists.schedmd.com
While that's absolutely a significant issue, here's how we solved it,
despite still using user keys. This basically assures that while people
can SSH around with keys within our cluster, they get into the login
nodes using SSH keys.  Combine that with the required enrollment in 2FA,
and I think we're doing decently well.

Network routing rules and switch ACLs prevent users from getting into
the non-login nodes from outside the cluster.


(excerpt from sshd_config on login nodes only - It's much simpler on
non-login nodes):

>
> # default behavior - disallow PubKeyAuthentication
> PubKeyAuthentication no
>
> # default behavior - force people to the "you must enroll in 2FA" message, and then exit
> ForceCommand /usr/local/bin/2fa_notice.sh
>
> #All users enrolled in 2FA, are part of the twofactusers group
> Match group twofactusers
>         ForceCommand none
>
> #Allow PubKeyAuthentication for subnets that are internal to the cluster
> Match Address ListOfClusterInternalSubnets
>         PubKeyAuthentication yes

Lloyd


On 5/27/21 9:27 AM, Michael Jennings wrote:
>
> As far as abuse of keys goes:  What's stopping your user from taking
> that private key you created for them (which is, as you recall,
> *unencrypted*) outside of your cluster to another host somewhere else
> on campus?  Maybe something that has tons of untrusted folks with
> root.  Then any of those folks can SSH to your cluster as that user.

Lloyd Brown

unread,
May 27, 2021, 12:09:06 PM5/27/21
to slurm...@lists.schedmd.com
I mistyped that.  "they CAN'T get into the login nodes using SSH keys"

On 5/27/21 10:08 AM, Lloyd Brown wrote:
> they get into the login nodes using SSH keys
