[slurm-users] REST-based CLI tools out there somewhere?

Chip Seraphine

Nov 9, 2023, 6:15:31 PM
to slurm...@lists.schedmd.com
Hello,

Our users submit their jobs from shared submit hosts, and have expressed an understandable preference for being able to submit directly from their own workstations. The obvious solutions (installing the Slurm client on their workstations, or providing a container that does something similar) are not available to us because of security concerns. This leaves REST as the best option. We’re hoping to provide a REST-based toolset that users familiar with the command line tools can make immediate use of (that is, one providing basic, stripped-down functionality of srun, squeue, sacct, and sinfo). Basically, we want to create a subset of the s* commands that can be run from some arbitrary machine if the user has the appropriate token.

It’d be surprising if we were the first people to go down this path, but searching has turned up nothing. Is there a project anyone knows about out there for providing command-line SLURM commands that use REST to talk to the daemons? Or am I missing some obvious solution here?

--

Chip Seraphine
Grid Operations
For support please use help-grid in email or slack.
This e-mail and any attachments may contain information that is confidential and proprietary and otherwise protected from disclosure. If you are not the intended recipient of this e-mail, do not read, duplicate or redistribute it by any means. Please immediately delete it and any attachments and notify the sender that you have received it by mistake. Unintended recipients are prohibited from taking action on the basis of information in this e-mail or any attachments. The DRW Companies make no representations that this e-mail or any attachments are free of computer viruses or other defects.

Davide DelVento

Nov 9, 2023, 7:45:00 PM
to Slurm User Community List
Not a direct answer to your question, but have you looked at Open OnDemand? Or maybe JupyterHub?
I think most places today prefer one of those, since they provide something like the functionality you asked for, and much more.

Chip Seraphine

Nov 9, 2023, 8:23:40 PM
to Slurm User Community List
I’m passingly familiar with JupyterHub, but didn’t realize it had Slurm ties. I’ll take a look at Open OnDemand as well. I don’t think either will meet the requirement of being a replacement for srun from the shell, but a new GUI method would certainly be welcome.


Hagdorn, Magnus Karl Moritz

Nov 10, 2023, 2:31:13 AM
to slurm...@lists.schedmd.com
Hi Chip,
what are the security concerns? Being able to control jobs is one
thing. Your users still need to set up the jobs, which presumably
involves moving data. We have provided some launchers written in Python
for specific use cases (that don't involve moving data). We also have a
jupyterhub that launches notebooks on the cluster. This works quite
well, but you do end up with interactive jobs which are most likely
less efficient than batch jobs. Again, this really depends on your use
cases.
Regards
magnus
--
Magnus Hagdorn
Charité – Universitätsmedizin Berlin
Geschäftsbereich IT | Scientific Computing
 
Campus Charité Mitte
BALTIC - Invalidenstraße 120/121
10115 Berlin
 
magnus....@charite.de
https://www.charite.de
HPC Helpdesk: sc-hpc-...@charite.de

Loris Bennett

Nov 10, 2023, 3:25:00 AM
to Slurm User Community List
Chip Seraphine <csera...@DRWHoldings.com> writes:

> Hello,
>
> Our users submit their jobs from shared submit hosts, and have
> expressed an understandable preference for being able to submit
> directly from their own workstations. The obvious solution
> (installing the slurm client on their workstations, or providing a
> container that does something similar) are not available to us because
> of security concerns. This leaves REST as the best option. We’re
> hoping to provide a REST-based toolset that users familiar with the
> command line tools can make immediate use of (so, provides basic,
> stripped-down functionality of srun, squeue, sacct, and sinfo).
> Basically, we want to create a subset of the s* commands that can be
> run from some arbitrary machine if the user has the appropriate token.

I don't understand the use-case here. If the users are comfortable on
the command-line, why would running 'sbatch' et al. in a local shell be
preferable to first connecting to the cluster and then running 'sbatch'?

> It’d be surprising if we were the first people to go down this path,
> but searching has turned up nothing. Is there a project anyone knows
> about out there for providing command-line SLURM commands that use
> REST to talk to the daemons? Or am I missing some obvious solution
> here?

I'm surprised that you're surprised :-) but there may well be some part
of the story that I have failed to grasp.

Cheers,

Loris


--
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin

Chip Seraphine

Nov 10, 2023, 1:36:00 PM
to Slurm User Community List
> what are the security concerns?

The cluster is shared between some business units that do not want to share data, so if we install the munge key on a machine that users have administrative or physical access to, the key could become compromised. This could allow them to run jobs on the cluster as another user and retrieve data from shared filesystems.

JupyterHub is heavily used, but suffers from the same problem: it needs to be running on a submit node. And, as you mentioned, an interactive solution has drawbacks with many types of workloads.

Chip Seraphine

Nov 10, 2023, 1:47:21 PM
to Slurm User Community List
On 11/10/23, 2:25 AM, Loris Bennett <loris....@fu-berlin.de> wrote:

>> Basically, we want to create a subset of the s* commands that can be
>> run from some arbitrary machine if the user has the appropriate token.
>
> I don't understand the use-case here. If the users are comfortable on
> the command-line, why would running 'sbatch' et al. in a local shell be
> preferable to first connecting to the cluster and then running 'sbatch'?

Having a large number of researchers able to run arbitrary code on the same submit host has a marked tendency to result in an overloaded host. There are various ways to regulate that ranging from "constant scolding" to "aggressive quotas/cgroups/etc", but all involve some degree of inconvenience for all concerned. So the desire is to do the same things they are currently doing, but on a node they do not have to share.

For example, user X has a framework that consumes data from various sources, crunches it in Slurm by executing s* commands, and spits out reports to a NAS share. The framework itself is long-running and interactive, so they prefer to keep it out of Slurm; however it is also quite heavy, and thus a poor fit for a shared system. This can be addressed in many ways, but the lowest-effort route (from user X's point of view) would be to simply run the existing framework somewhere else so they do not need to share.

Rather than require all users in this situation to rewrite all their code to use REST calls, I'd like to offer a drop-in tool set that has similar inputs and outputs to commands like "sacct" or "srun", but under the covers uses REST, thus removing the requirement of having a local munge setup. This would give them an interim solution while the conversion to native REST is worked on at leisure.
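
As a rough illustration of the idea (not an existing tool), here is a minimal sketch of an squeue-style listing built on slurmrestd. It assumes slurmrestd is reachable over HTTP(S) with JWT authentication enabled, and that the token is passed in the X-SLURM-USER-NAME / X-SLURM-USER-TOKEN headers. The base URL, the API version string, the environment variable names SLURMRESTD_URL and SLURMRESTD_API, and the JSON field names read below are assumptions that vary between Slurm releases and sites.

```python
import json
import os
import urllib.request

# Assumptions: adjust the URL and API version to match the local slurmrestd.
BASE_URL = os.environ.get("SLURMRESTD_URL", "http://localhost:6820")
API_VERSION = os.environ.get("SLURMRESTD_API", "v0.0.39")


def build_request(path, user, token):
    """Build an authenticated GET request for a slurmrestd path."""
    url = f"{BASE_URL}/slurm/{API_VERSION}/{path}"
    headers = {"X-SLURM-USER-NAME": user, "X-SLURM-USER-TOKEN": token}
    return urllib.request.Request(url, headers=headers)


def fetch_jobs(user, token):
    """Query the jobs endpoint and return the decoded JSON payload."""
    with urllib.request.urlopen(build_request("jobs", user, token)) as resp:
        return json.load(resp)


def _text(job, key):
    """Return a job field as text; some API versions use lists of flags."""
    value = job.get(key, "")
    if isinstance(value, list):
        return ",".join(str(item) for item in value)
    return "" if value is None else str(value)


def format_jobs(payload):
    """Render the 'jobs' list of a response as squeue-like text lines."""
    lines = [f"{'JOBID':>10} {'PARTITION':<12} {'NAME':<20} {'USER':<10} STATE"]
    for job in payload.get("jobs", []):
        lines.append(
            f"{_text(job, 'job_id'):>10} {_text(job, 'partition'):<12} "
            f"{_text(job, 'name'):<20} {_text(job, 'user_name'):<10} "
            f"{_text(job, 'job_state')}"
        )
    return lines
```

A wrapper script would then print "\n".join(format_jobs(fetch_jobs(user, token))), with the token taken from wherever the site hands it out (for example, the SLURM_JWT value printed by scontrol token). Option parsing to mimic the real squeue flags is left out of this sketch.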

Hope that clarifies!

Jared Baker

Nov 10, 2023, 2:03:42 PM
to Slurm User Community List
At the risk of going a bit off the rails, here is an alternative to the REST method (maybe), though not too far off, as we've been thinking about similar approaches for other things (not Slurm). SSH certificates with a forced-command entry and a wrapper for Slurm commands on the submit hosts, along with small wrappers on the workstation end, could be viable. I'm sure there are many caveats here, but it is very doable too.
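
To make the idea concrete, here is a rough sketch of what the submit-host side of such a wrapper might look like. When a forced command is configured, sshd puts the command the client asked for into the SSH_ORIGINAL_COMMAND environment variable; the wrapper runs it only if the program name is on an allowlist. The allowlist contents and the function names are assumptions for illustration, not a recommendation for any particular site.

```python
import os
import shlex
import subprocess
import sys

# Assumption: the Slurm commands a site is willing to expose over SSH.
ALLOWED = {"squeue", "sinfo", "sacct", "sbatch", "scancel"}


def parse_request(original_command):
    """Return argv for an allowed Slurm command, or raise ValueError."""
    # shlex.split itself raises ValueError on unbalanced quotes.
    argv = shlex.split(original_command or "")
    if not argv:
        raise ValueError("no command given")
    if argv[0] not in ALLOWED:
        raise ValueError(f"command not permitted: {argv[0]}")
    return argv


def run_forced_command(environ=os.environ):
    """Validate SSH_ORIGINAL_COMMAND and run it without a shell."""
    try:
        argv = parse_request(environ.get("SSH_ORIGINAL_COMMAND"))
    except ValueError as err:
        print(f"slurm-wrapper: {err}", file=sys.stderr)
        return 1
    # No shell is involved, so metacharacters in arguments are not interpreted.
    return subprocess.call(argv)
```

A script that calls sys.exit(run_forced_command()) would be installed as the forced command (via the force-command option of the SSH certificate, or a command="..." entry in authorized_keys), and the workstation-side wrapper for, say, squeue would simply run ssh against the submit host with the same arguments. Passing batch scripts to sbatch on standard input, and per-argument filtering, are among the caveats this sketch does not address.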

Davide DelVento

Nov 10, 2023, 3:36:48 PM
to Slurm User Community List
> Having a large number of researchers able to run arbitrary code on the same submit host has a marked tendency to result in an overloaded host. There are various ways to regulate that ranging from "constant scolding" to "aggressive quotas/cgroups/etc", but all involve some degree of inconvenience for all concerned. So the desire is to do the same things they are currently doing, but on a node they do not have to share.

If you have enough resources this could be a node managed by slurm, and you can use allocations to make sure people play nice.
 
> For example, user X has a framework that consumes data from various sources, crunches it in Slurm by executing s* commands, and spits out reports to a NAS share. The framework itself is long-running and interactive, so they prefer to keep it out of Slurm; however it is also quite heavy, and thus a poor fit for a shared system. This can be addressed in many ways, but the lowest-effort route (from user X's point of view) would be to simply run the existing framework somewhere else so they do not need to share.

Why not a dedicated node on your cluster? 

Open OnDemand works really great for this use case! Give it a try, there is a nice demo install which you can use for testing it (no install required on your side): https://openondemand.org/run-open-ondemand

If that does not work for you, Jared's (*) suggestion of wrapping Slurm commands in SSH scripts or the like sounds like your best bet.

(*): Hi Jared, long time... hope you're doing well.

Chip Seraphine

Nov 10, 2023, 4:12:59 PM
to Slurm User Community List
We actually have a bunch of people doing that. It’s fine for “I want to just run squeue and see how busy the cluster is without having to head over there and look”, but it starts to break down with “my GUI framework runs squeue and sacct to check job status every few seconds, all day long, per user”. ☹


From: slurm-users <slurm-use...@lists.schedmd.com> on behalf of Jared Baker <jba...@ucar.edu>
Reply-To: Slurm User Community List <slurm...@lists.schedmd.com>
Date: Friday, November 10, 2023 at 1:03 PM
To: Slurm User Community List <slurm...@lists.schedmd.com>
Subject: [ext] Re: [slurm-users] REST-based CLI tools out there somewhere?

At the risk of going a bit off the rails and alternative to the REST method (maybe), but not too far as we've been thinking of alternative ways for similar things (not slurm). Anyway, SSH certificates with a forced command entry and wrapper for slurm commands on submit hosts along with small wrappers on their workstation end could be viable... I'm sure there are many caveats here, but very doable too.


Oren Shani

Nov 12, 2023, 12:38:38 AM
to Slurm User Community List
Hi Chip

PySlurm is a Python library that enables interfacing with Slurm from within a Python script. I played around with an older version that implemented only srun/sbatch, but it looks like the new version also covers the other s* commands (take a look at the API here).

So I guess you can use Pyslurm as a basis for a web service and in this way achieve what you want. I wonder how you plan to do the users' authentication part, though... Would appreciate it if you can share some ideas on that.

BR

Oren