[slurm-users] salloc not starting shell despite LaunchParameters=use_interactive_step


Loris Bennett via slurm-users

Sep 5, 2024, 8:19:00 AM
to Slurm Users Mailing List
Hi,

With

$ salloc --version
slurm 23.11.10

and

$ grep LaunchParameters /etc/slurm/slurm.conf
LaunchParameters=use_interactive_step

the following

$ salloc --partition=interactive --ntasks=1 --time=00:03:00 --mem=1000 --qos=standard
salloc: Granted job allocation 18928869
salloc: Nodes c001 are ready for job

creates a job

$ squeue --me
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
18928869 interacti interact loris R 1:05 1 c001

but causes the terminal to block.

From a second terminal I can log into the compute node:

$ ssh c001
[13:39:36] loris@c001 (1000) ~

Is that the expected behaviour or should salloc return a shell directly
on the compute node (like srun --pty /bin/bash -l used to do)?

Cheers,

Loris

--
Dr. Loris Bennett (Herr/Mr)
FUB-IT, Freie Universität Berlin


Jason Simms via slurm-users

Sep 5, 2024, 9:46:07 AM
to Loris Bennett, Slurm Users Mailing List
I know this doesn't particularly help you, but for me on 23.11.6 it works as expected and immediately drops me onto the allocated node. In answer to your question: yes, as I understand it, the default/expected behavior is to return a shell directly on the compute node.

Jason
--
Jason L. Simms, Ph.D., M.P.H.
Manager of Research Computing
Swarthmore College
Information Technology Services
Schedule a meeting: https://calendly.com/jlsimms

Carsten Beyer via slurm-users

Sep 5, 2024, 9:53:47 AM
to slurm...@lists.schedmd.com
Hi Loris,

We use SLURM 23.02.7 (production) and 23.11.1 (test system). Our config
contains a second parameter, InteractiveStepOptions, in slurm.conf:

InteractiveStepOptions="--interactive --preserve-env --pty $SHELL -l"
LaunchParameters=enable_nss_slurm,use_interactive_step

That works fine for us:

[k202068@levantetest ~]$ salloc -N1 -A k20200 -p compute
salloc: Pending job allocation 857
salloc: job 857 queued and waiting for resources
salloc: job 857 has been allocated resources
salloc: Granted job allocation 857
salloc: Waiting for resource configuration
salloc: Nodes lt10000 are ready for job
[k202068@lt10000 ~]$

Best Regards,
Carsten


--
Carsten Beyer
Systems Department (Abteilung Systeme)

Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45a * D-20146 Hamburg * Germany

Phone: +49 40 460094-221
Fax: +49 40 460094-270
Email: be...@dkrz.de
URL: http://www.dkrz.de

Managing Director: Prof. Dr. Thomas Ludwig
Registered office: Hamburg
Commercial register: Amtsgericht Hamburg, HRB 39784

Jason Simms via slurm-users

Sep 5, 2024, 9:57:15 AM
to slurm...@lists.schedmd.com
Ours works fine, however, without the InteractiveStepOptions parameter.

JLS

Loris Bennett via slurm-users

Sep 5, 2024, 10:24:17 AM
to Slurm Users Mailing List
Jason Simms via slurm-users <slurm...@lists.schedmd.com> writes:

> Ours works fine, however, without the InteractiveStepOptions parameter.

My assumption is also that the default value should be OK.
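If I am reading the slurm.conf man page correctly, leaving the parameter
unset should be equivalent to setting

InteractiveStepOptions="--interactive --preserve-env --pty $SHELL"

i.e. essentially Carsten's value without the '-l'.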

It would be nice if someone could confirm that 23.11.10 is working for
them. However, we'll probably be upgrading to 24.05 fairly soon, and so
we shall see whether the issue persists.

Cheers,

Loris
--
Dr. Loris Bennett (Herr/Mr)
FUB-IT, Freie Universität Berlin


Carsten Beyer via slurm-users

Sep 5, 2024, 10:25:19 AM
to Jason Simms, Slurm User Community List

Thanks, Jason, for the hint. It looks like the parameter was kept in our slurm.conf from previous SLURM versions. It also works without setting InteractiveStepOptions in slurm.conf.

Best Regards,
Carsten



Paul Edmon via slurm-users

Sep 5, 2024, 10:26:49 AM
to slurm...@lists.schedmd.com
It's definitely working for 23.11.8, which is what we are using.

-Paul Edmon-

Loris Bennett via slurm-users

Sep 6, 2024, 7:10:19 AM
to Slurm Users Mailing List
Paul Edmon via slurm-users <slurm...@lists.schedmd.com> writes:

> It's definitely working for 23.11.8, which is what we are using.

It turns out we had unintentionally started firewalld on the login node.
Now that this has been turned off, 'salloc' drops into a shell on a
compute node, as desired.
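In case anyone else runs into this: as far as I understand it, salloc on
the login node has to accept connections back from the allocated node for
the interactive step, and the default firewalld zone blocks them. Roughly
what we checked (the port range below is only an illustration and would
have to match SrunPortRange, if you set one):

$ systemctl status firewalld                # was unexpectedly active
$ sudo systemctl disable --now firewalld

# Alternative, if a firewall is required on the login node: pin the
# ports in slurm.conf, e.g. SrunPortRange=60001-63000, and open only
# that range:
$ sudo firewall-cmd --permanent --add-port=60001-63000/tcp
$ sudo firewall-cmd --reload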

Thanks for all the data points.

Cheers,

Loris
--
Dr. Loris Bennett (Herr/Mr)
FUB-IT, Freie Universität Berlin

Brian Andrus via slurm-users

Sep 6, 2024, 10:17:24 AM
to slurm...@lists.schedmd.com
Folks have addressed the obvious config settings, but also check your
prolog/epilog scripts and settings, as well as .bashrc/.bash_profile and
anything in /etc/profile.d/. Those may be hanging it up.
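For example, something along these lines (assuming bash and the standard
config path) can help narrow it down:

$ grep -Ei '^(task)?(prolog|epilog)' /etc/slurm/slurm.conf
$ bash -xlc exit 2>&1 | tail -n 40   # trace login-shell startup (/etc/profile, profile.d, ~/.bash_profile)
$ bash -xic exit 2>&1 | tail -n 40   # trace interactive startup (~/.bashrc)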

Brian Andrus