[slurm-users] slurm job_container/tmpfs


Arsene Marian Alain

Nov 18, 2023, 10:35:54 AM
to Slurm User Community List

Dear Slurm community,

I run Slurm 21.08.1 under Rocky Linux 8.5 on my small HPC cluster and am trying to configure job_container/tmpfs to manage the temporary directories.

I have a shared NFS drive "/home" and a local "/scratch" (with permissions 1777) on each node.

For each submitted job I manually create a directory named "JOB_ID.$USER" in the local "/scratch", which is where all the temporary files for the job are generated. Now I would like to do this automatically (especially removing the directory when the job finishes or is canceled).

I added the following parameters to my /etc/slurm.conf:

JobContainerType=job_container/tmpfs
PrologFlags=contain

Then I created "job_container.conf" in the directory "/etc/slurm" with the following configuration:

AutoBasePath=false
BasePath=/scratch

Then I replicated the changes to all nodes and restarted the Slurm daemons.

Finally, when I launch a job, a directory named "JOB_ID" is created in the local "/scratch" of the compute node. The only problem is that the owner of the directory is "root", and the user who submitted the job doesn't have read and write permissions on that directory (nor do other users).

I would like the following:

1) The name of the automatically created directory should be "JOB_ID.$USER".
2) The owner of the directory should be the user who submitted the job, not "root".

Could someone please help me?

Thanks a lot.

Best regards,
Alain

Brian Andrus

Nov 20, 2023, 5:28:58 PM
to slurm...@lists.schedmd.com

How do you 'manually create a directory'? That is probably the step where the root ownership comes from. After creating it, you can chown/chmod it as well.

Brian Andrus

Arsene Marian Alain

Nov 21, 2023, 4:59:17 AM
to Slurm User Community List

Hello Brian,

Thanks for your answer. With the job_container/tmpfs plugin I don't really create the directory manually.

I just give my Basepath=/scratch (a local directory for each node that is already mounted with 1777 permissions) in job_container.conf. The plugin automatically generates for each job a directory with the "JOB_ID", for example: /scratch/1805

The only problem is that directory 1805 is generated with root owner and permissions 700. So the user who submitted the job cannot write/read inside directory 1805.

Is there a way for the owner of directory 1805 to be the user who submitted the job and not root?

Sean Mc Grath

Nov 21, 2023, 6:57:22 AM
to Slurm User Community List
Would a prolog script, https://slurm.schedmd.com/prolog_epilog.html, do what you need? Sorry if you have already considered that and I missed it.
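
A minimal sketch of what such a prolog could look like, assuming it is configured as the node-level Prolog in slurm.conf and therefore runs as root on the compute node. SLURM_JOB_ID and SLURM_JOB_USER are variables Slurm provides to Prolog and Epilog scripts; the SCRATCH_BASE variable and the make_job_dir function name are hypothetical and only used for illustration.

```shell
#!/bin/bash
# Hypothetical prolog sketch: create /scratch/JOB_ID.USER owned by the job's user.
# SCRATCH_BASE is an illustrative override, not a Slurm setting.
SCRATCH_BASE="${SCRATCH_BASE:-/scratch}"

make_job_dir() {
    local job_id="$1"
    local user="$2"
    local dir="${SCRATCH_BASE}/${job_id}.${user}"

    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1

    # chown to another user only works as root, which is how slurmd
    # normally runs the Prolog; skip it when run unprivileged.
    if [ "$(id -u)" -eq 0 ]; then
        chown "$user" "$dir" || return 1
    fi

    printf '%s\n' "$dir"
}

# Only act when Slurm has supplied the job variables.
if [ -n "${SLURM_JOB_ID:-}" ] && [ -n "${SLURM_JOB_USER:-}" ]; then
    # A failing Prolog is treated by Slurm as a node problem, so report errors.
    make_job_dir "$SLURM_JOB_ID" "$SLURM_JOB_USER" >/dev/null || exit 1
fi
```

A matching Epilog would remove the same path once the job ends; the job itself still has to be pointed at the directory, for example by setting TMPDIR in the batch script.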

---
Sean McGrath
Senior Systems Administrator, IT Services



Ward Poelmans

Nov 21, 2023, 7:12:56 AM
to slurm...@lists.schedmd.com
Hi Arsene,

On 21/11/2023 10:58, Arsene Marian Alain wrote:

> I just give my Basepath=/scratch (a local directory for each node that is already mounted with 1777 permissions) in job_container.conf. The plugin automatically generates for each job a directory with the "JOB_ID", for example: /scratch/1805
>
> The only problem is that directory 1805 is generated with root owner and permissions 700. So the user who submitted the job cannot write/read inside directory 1805.

If I look at our system, there should be a hidden directory under that directory which is owned by the correct user.

Our job_container.conf has:
Basepath=/local
Dirs=/var/tmp,/tmp,/dev/shm

which gives directories like:
/local/6000522/.6000522/_tmp/

The hidden one is owned by the user of the job.


Ward

Arsene Marian Alain

Nov 21, 2023, 7:53:14 AM
to Slurm User Community List
Hi Ward,

You're right.

[root@node01 scratch]# pwd
/scratch
[root@node01 scratch]# ll
total 0
drwx------ 3 root root 30 nov 21 13:41 1809
[root@node01 scratch]# ls -la 1809/
total 0
drwx------ 3 root root 30 nov 21 13:41 .
drwxrwxrwt. 3 root root 18 nov 21 13:41 ..
drwx------ 2 thais root 6 nov 21 13:41 .1809
-r--r--r-- 1 root root 0 nov 21 13:41 .ns

But how can the user write to or access the hidden directory .1809 if he doesn't have read/write permission on the main directory 1809?

Thanks.


Arsene Marian Alain

Nov 21, 2023, 8:07:38 AM
to Slurm User Community List

Thanks Sean. I’ve tried using slurm prolog/epilog scripts but without any success. That's why I decided to look for other solutions and job_container/tmpfs plugin seemed like a good alternative.

René Sitt

Nov 21, 2023, 8:35:59 AM
to slurm...@lists.schedmd.com

Hello Alain,

as an alternative to job_container/tmpfs, you may also try your luck with the 'auto_tmpdir' SPANK plugin: https://github.com/University-of-Delaware-IT-RCI/auto_tmpdir

We've been using that on our small HPC cluster (Slurm 22.05) and it does what it's supposed to. One thing to remember is that it requires a recompile after every Slurm update.

Kind regards,
René Sitt

-- 
Dipl.-Chem. René Sitt
Hessisches Kompetenzzentrum für Hochleistungsrechnen
Philipps-Universität Marburg
Hans-Meerwein-Straße
35032 Marburg

Tel. +49 6421 28 23523
si...@hrz.uni-marburg.de
www.hkhlr.de

Lorenzo Bosio

Nov 21, 2023, 8:38:36 AM
to Slurm User Community List, Arsene Marian Alain

Hello Alain,

maybe I'm missing the point, but as I understand it, the job_container/tmpfs plugin uses the directory under BasePath to store its own data, which it uses to create the bind mounts for the users. The folder itself is not meant to be used directly by others.
The folders in the hidden directory with user privileges under your /scratch are the bind mounts. Those folders are specified in the Dirs parameter of job_container.conf. Perhaps you will have more luck using this parameter for your needs? There is also an "InitScript" parameter, which may be used to create folders dynamically.
One last thing: those configuration options were added in one of the latest releases of Slurm, so they may not work with your version.

Best regards,
Lorenzo Bosio

--
Dott. Mag. Lorenzo Bosio
Tecnico di Ricerca
Dipartimento di Informatica


Università degli Studi di Torino
Corso Svizzera, 185 - 10149 Torino

Roberto Monti

Nov 21, 2023, 9:03:43 AM
to Slurm User Community List
Hi,
From the perspective of the job, those directories are mapped to /tmp (and others, depending on your job_container.conf). There's no need for the user to be aware of the basepath that is specified in the conf file.

You can easily verify it is working by writing files to /tmp from a new slurm job, and running `find /scratch/<JOBID>` as root on the same node (while the job is still running).
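
The check described above could be scripted roughly as follows. This is only a sketch: the write_marker function, the marker file name and the 120-second sleep are illustrative choices, and it assumes /tmp is one of the directories the plugin maps for the job.

```shell
#!/bin/bash
#SBATCH --job-name=tmpfs-check
#SBATCH --time=00:05:00

# Hypothetical helper: write a small marker file into the given directory
# (default /tmp) and print its path.
write_marker() {
    local dir="${1:-/tmp}"
    local marker="${dir}/marker.${SLURM_JOB_ID:-nojob}"
    echo "written by job ${SLURM_JOB_ID:-nojob} on $(uname -n)" > "$marker" || return 1
    printf '%s\n' "$marker"
}

# Inside a Slurm job: write to the job's private /tmp, then stay alive
# long enough for root to inspect the node.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    write_marker /tmp
    sleep 120
fi
```

Submit it with sbatch, then run `find /scratch/<JOBID>` as root on the compute node while the job is sleeping; the marker file should show up somewhere under the hidden per-job directory.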

Best,

--
Roberto P. Monti
DevOps Engineer I
robert...@jax.org

The Jackson Laboratory
United States | China | Japan
www.jax.org


Ward Poelmans

Nov 21, 2023, 9:18:08 AM
to slurm...@lists.schedmd.com
Hi,

On 21/11/2023 13:52, Arsene Marian Alain wrote:

>
> But how can the user write to or access the hidden directory .1809 if he doesn't have read/write permission on the main directory 1809?

Because it works as a namespace. On my side:

$ ls -alh /local/6000523/
total 0
drwx------ 3 root root 33 Nov 21 15:00 .
drwxrwxrwt 3 root root 21 Nov 21 15:00 ..
drwx------ 4 myuser root 34 Nov 21 15:00 .6000523
-r--r--r-- 1 root root 0 Nov 21 15:00 .ns

While inside the job you just see this subdirectory.

Ward