[slurm-users] Persistent Interactive Jobs


Willy Markuske

Jun 9, 2022, 8:20:17 PM
to slurm...@lists.schedmd.com

Hello All,

I have a request from users for persistent interactive jobs. Currently some users run srun to allocate an interactive job and run their scripts, but sshd closes connections after 2 hours to prevent hanging ssh connections. They want to spawn an R or Python shell to work in directly for testing.

I've attempted to use salloc and ssh to attach to the job, but allocations are relinquished when the user leaves the submit node. Is there an easy way for a user to create a job allocation that spawns a terminal they can attach to and detach from? I'm looking at different ways to create a job that will spawn tmux and stay open, but it would be great if there were a way to directly attach to and detach from the output of a spawned terminal they could just ssh to.
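
One way the tmux idea could look, as a minimal sketch rather than a tested recipe: a batch job starts a detached tmux session and then waits for as long as that session exists, so the allocation does not depend on any login session. The helper name `make_tmux_job`, the session name `work`, the job name, and the time limit are all illustrative assumptions.

```shell
# Hypothetical helper: prints a batch script that holds a tmux session open.
make_tmux_job() {
  cat <<'EOF'
#!/bin/bash
#SBATCH --job-name=persist-tmux   # assumed name
#SBATCH --time=08:00:00           # assumed limit; adjust to site policy

# Start a detached tmux session named "work" (assumed name).
tmux new-session -d -s work

# Keep the batch script, and so the job, alive while the session exists.
while tmux has-session -t work 2>/dev/null; do
  sleep 60
done
EOF
}
```

After `make_tmux_job > persist-tmux.sh` and `sbatch persist-tmux.sh`, a user could probably reattach from the submit node with something like `srun --jobid=<jobid> --overlap --pty tmux attach -t work` (assuming a Slurm release that supports `--overlap`), or by ssh to the compute node where the site permits it. The job ends when the tmux session is closed or the time limit is reached.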

Regards,

--

Willy Markuske

HPC Systems Engineer

Research Data Services

P: (619) 519-4435

Brian Andrus

Jun 10, 2022, 12:17:09 AM
to slurm...@lists.schedmd.com

A couple suggestions:

1) You could use a GUI (run X and vncserver so they can connect and have a desktop).

2) You could run something like Jupyter and have them connect to that.

Either method would allow them to connect via ssh as long as the 'job' is running. I use both for different users/groups.
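
As a rough sketch of the Jupyter option, assuming JupyterLab is installed on the compute nodes and users can open an ssh tunnel through the login node: the batch job runs the server, and the user forwards a local port to it. The helper name `make_jupyter_job`, the port, the job name, and the time limit are illustrative assumptions, and `<user>@<login-node>` is a placeholder.

```shell
# Hypothetical helper: prints a batch script that runs JupyterLab in a job.
make_jupyter_job() {
  cat <<'EOF'
#!/bin/bash
#SBATCH --job-name=jupyter   # assumed name
#SBATCH --time=08:00:00      # assumed limit; adjust to site policy

PORT=8888                    # assumed port; pick a free one per user

# Print the tunnel command into the job's output file for the user to copy.
echo "Tunnel with: ssh -L ${PORT}:$(hostname):${PORT} <user>@<login-node>"

# Listen on the node's interfaces so the tunnel from the login node can reach it.
jupyter lab --no-browser --ip=0.0.0.0 --port="${PORT}"
EOF
}
```

The user would then browse to `localhost:8888` and paste the access token that Jupyter prints in the job's output. Because the server runs inside the batch job, closing the ssh tunnel does not end the session.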

Brian Andrus

Burian, John

Jun 10, 2022, 7:59:44 AM
to Slurm User Community List

Perhaps a remote desktop solution like TurboVNC? Users can disconnect and reconnect to the desktop for the duration of the allocation.

 

John

 

 


Diego Zuccato

Jun 10, 2022, 8:59:06 AM
to Slurm User Community List, Burian, John
Why not launch their job inside a screen session? Then they can
reconnect by "screen -r" after logging in on the worker node.
Way lighter than VNC, but does not support X.
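
A minimal sketch of what this could look like as a batch job, assuming GNU screen is installed on the worker nodes and users are allowed to ssh to nodes where they have a running job. The helper name `make_screen_job`, the session name `work`, the job name, and the time limit are illustrative assumptions.

```shell
# Hypothetical helper: prints a batch script that holds a screen session open.
make_screen_job() {
  cat <<'EOF'
#!/bin/bash
#SBATCH --job-name=persist-screen   # assumed name
#SBATCH --time=08:00:00             # assumed limit; adjust to site policy

# -D -m starts the session detached without forking, so this command blocks
# until the session terminates, which keeps the job running.
screen -D -m -S work
EOF
}
```

After `make_screen_job > persist-screen.sh` and `sbatch persist-screen.sh`, the user would look up the node with `squeue`, ssh to it, and run `screen -r work`. Detaching with Ctrl-a d leaves the session, and therefore the job, running.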

On 10/06/2022 13:58, Burian, John wrote:
> Perhaps a remote desktop solution like TurboVNC? Users can disconnect
> and reconnect to the desktop for the duration of the allocation.

--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786

Hadrian Djohari

Jun 10, 2022, 9:25:36 AM
to Slurm User Community List
OnDemand features from OSC have various "desktop" options that connect directly to the compute nodes.

--
Hadrian Djohari
Manager of Research Computing Services, [U]Tech
Case Western Reserve University
(W): 216-368-0395
(M): 216-798-7490

Willy Markuske

Jun 21, 2022, 10:05:28 AM
to slurm...@lists.schedmd.com

Thanks for all the suggestions, everyone. I was finally able to convince them to use Jupyter, but I'm also putting OnDemand onto my test cluster. I wasn't aware there was an open-source implementation of exactly what I was looking for.

Willy Markuske

HPC Systems Engineer

Research Data Services

P: (619) 519-4435
