Passwordless SSH w/dynamic cluster

tsn...@gmail.com

Sep 9, 2021, 1:04:49 PM
to google-cloud-slurm-discuss
Hi everybody,

Just wondering if there's a straightforward way to enable passwordless SSH (at least between the controller node and compute nodes, but preferably between all nodes) with a slurm-gcp cluster. I can imagine how to set up passwordless SSH manually with a static cluster, but I'm having difficulty seeing how to achieve it with dynamic cluster scaling.

Any help or advice would be more than appreciated.

Much thanks,
Trevor

Alex Chekholko

unread,
Sep 9, 2021, 1:12:01 PM9/9/21
to tsn...@gmail.com, google-cloud-slurm-discuss
Hi Trevor,

Google has handled it for you in a medium-complicated and non-obvious way, but when it works, it's auto-magic.  When it doesn't work, it's a bear to troubleshoot because like 8 different systems are involved.

Each Google-provided image runs a daemon that polls the cloud metadata. If you enable "Compute OS Login", then at login time that daemon provisions local accounts on the instance based on the IAM settings of your cloud project. It also somehow handles auth based on your Google account credentials, using short-lived keys.

If you don't use "Compute OS Login", it can do something similar with SSH keys instead: the keys propagate through that same cloud metadata service, where they are stored as key-value pairs.
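To make the metadata route a bit more concrete, here is a minimal Python sketch that parses the value of an `ssh-keys` metadata entry. It assumes the usual one-entry-per-line `USERNAME:KEY_TYPE KEY_DATA [COMMENT]` layout; the names `SshKeyEntry` and `parse_ssh_keys_metadata` are hypothetical helpers for illustration, not part of any Google tooling.

```python
from typing import List, NamedTuple


class SshKeyEntry(NamedTuple):
    username: str
    key_type: str
    key_data: str
    comment: str


def parse_ssh_keys_metadata(value: str) -> List[SshKeyEntry]:
    """Parse an `ssh-keys` metadata value (assumed format, one entry per line)."""
    entries = []
    for line in value.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # The username comes before the first colon; the rest is the public key.
        username, sep, key = line.partition(":")
        parts = key.split(None, 2)
        if not sep or len(parts) < 2:
            continue  # skip lines that do not match the assumed layout
        comment = parts[2] if len(parts) > 2 else ""
        entries.append(SshKeyEntry(username, parts[0], parts[1], comment))
    return entries
```

Calling `parse_ssh_keys_metadata("alice:ssh-ed25519 AAAAC3Nza alice@laptop")` would give a single entry for the user `alice`. Malformed lines are skipped rather than raising, which seemed the safer choice for a quick inspection script.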

But I'm only talking about SSHing in from the outside; you're talking about SSHing from one compute node to another? I imagine they have already configured that if MPI workloads are supported. You can start up a cluster and examine the sshd/slurm config.

Regards,
Alex

Trevor Tanner

Sep 9, 2021, 2:55:18 PM
to Alex Chekholko, google-cloud-slurm-discuss
Hi Alex,

Thanks so much for the reply! I appreciate the quick rundown of Compute OS Login, and I should have been more specific in my initial message. I did indeed mean SSHing from one compute node to another. I experimented by logging into the controller node and trying to SSH by host name to compute hosts on a slurm-gcp cluster, but passwordless access did not work. I suspect I could get it to work in a static cluster by manually adding SSH keys from the compute nodes to the controller node, but I don't really know how to approach the problem for a dynamic cluster. Perhaps I'm missing something obvious though?
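For reference, the manual approach could be sketched in Python roughly as below. This is only an illustration under a big assumption: that home directories are shared across nodes (for example over NFS), so a key authorized once is visible to nodes created later. Whether a given cluster is set up that way is not confirmed in this thread, and `ensure_authorized_key` is a hypothetical helper name.

```python
import os
from pathlib import Path


def ensure_authorized_key(public_key: str, home: Path) -> bool:
    """Add public_key to <home>/.ssh/authorized_keys unless it is already there.

    Returns True if the file was modified, False if the key was already present.
    """
    key = public_key.strip()
    ssh_dir = home / ".ssh"
    ssh_dir.mkdir(exist_ok=True)
    os.chmod(ssh_dir, 0o700)  # sshd typically rejects group/world-writable dirs

    auth_file = ssh_dir / "authorized_keys"
    existing = auth_file.read_text() if auth_file.exists() else ""
    if key in (line.strip() for line in existing.splitlines()):
        return False  # idempotent: do not add the same key twice

    with auth_file.open("a") as fh:
        if existing and not existing.endswith("\n"):
            fh.write("\n")  # keep the new key on its own line
        fh.write(key + "\n")
    os.chmod(auth_file, 0o600)
    return True
```

Something like `ensure_authorized_key(Path.home().joinpath(".ssh/id_ed25519.pub").read_text(), Path.home())` would authorize a user's own key (the key file name is an assumption). With a shared home directory, that single step would cover every node that mounts it, including nodes that scale up later.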

Thanks again,
Trevor

Trevor Tanner

Sep 9, 2021, 3:02:30 PM
to Alex Chekholko, google-cloud-slurm-discuss
Hi again,

I feel like a fool: I just came across this article, which I had not seen previously, that suggests using the authorized_keys tool to achieve this. I'll report back if I have any problems, but it would be super cool if the tool could be used automatically in slurm-gcp.

Cheers,
Trevor