Hi,
We have recently moved to Slurm 22.05.8 and have configured job_container/tmpfs to provide private per-job /tmp directories.
job_container.conf contains:
AutoBasePath=true
BasePath=/slurm
And in slurm.conf we have set
JobContainerType=job_container/tmpfs
I can see the folders being created and in use, but when a job completes, the root folder is not cleaned up.
Example for a running job:
[root@papr-res-compute204 ~]# ls -al /slurm/14292874
total 32
drwx------ 3 root root 34 Mar 1 13:16 .
drwxr-xr-x 518 root root 16384 Mar 1 13:16 ..
drwx------ 2 mzethoven root 6 Mar 1 13:16 .14292874
-r--r--r-- 1 root root 0 Mar 1 13:16 .ns
Example once the job completes, where /slurm/&lt;jobid&gt; remains:
[root@papr-res-compute204 ~]# ls -al /slurm/14292794
total 32
drwx------ 2 root root 6 Mar 1 09:33 .
drwxr-xr-x 518 root root 16384 Mar 1 13:16 ..
Is this to be expected, or should the folder /slurm/&lt;jobid&gt; also be removed?
Do I need to create an epilog script to remove the directory that is left behind?
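If an epilog is the way to go, something like the following is what I had in mind (a rough sketch only; the BasePath value and the guard logic are assumptions for our site, not a tested recipe):

```shell
#!/bin/bash
# Sketch of an Epilog that removes a leftover /slurm/<jobid> directory.
# The BasePath ("/slurm") and the numeric-jobid guard are assumptions.

cleanup_jobdir() {
    local basepath="$1" jobid="$2"
    # Refuse to act without a base path or a job id.
    [ -n "$basepath" ] && [ -n "$jobid" ] || return 0
    case "$jobid" in
        *[!0-9]*) return 0 ;;   # job ids are numeric; ignore anything else
    esac
    local jobdir="$basepath/$jobid"
    # Remove the directory only if it still exists.
    [ -d "$jobdir" ] && rm -rf -- "$jobdir"
    return 0
}

# Slurm exports SLURM_JOB_ID into the Epilog environment.
cleanup_jobdir /slurm "${SLURM_JOB_ID:-}"
```

The numeric check is there so a malformed or empty job id can never make the script remove anything outside the base path.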
Many thanks for the assistance,
Jason
Jason Ellul
Head - Research Computing Facility
Office of Cancer Research
Peter MacCallum Cancer Centre
Thanks so much, Ole, for the info and the link.
Your documentation is extremely useful.
Prior to moving to 22.05 we had been using slurm-spank-private-tmpdir with an epilog to clean up the folders on job completion, but we were hoping to move to the built-in functionality to ensure future compatibility and reduce complexity.
Will try 23.02, and if that does not resolve our issue, we will consider moving back to slurm-spank-private-tmpdir or auto_tmpdir.
Thanks again,
Jason
Jason Ellul
Head - Research Computing Facility
Office of Cancer Research
Peter MacCallum Cancer Centre
From: slurm-users <slurm-use...@lists.schedmd.com> on behalf of Ole Holm Nielsen <Ole.H....@fysik.dtu.dk>
Date: Wednesday, 1 March 2023 at 8:29 pm
To: slurm...@lists.schedmd.com <slurm...@lists.schedmd.com>
Subject: Re: [slurm-users] Cleanup of job_container/tmpfs
Hi Michael,
Thanks so much for the info; will try 23.02.
Cheers,
Jason
Jason Ellul
Head - Research Computing Facility
Office of Cancer Research
Peter MacCallum Cancer Centre
From: slurm-users <slurm-use...@lists.schedmd.com> on behalf of Michael Jennings <m...@lanl.gov>
Date: Thursday, 2 March 2023 at 9:17 am
To: slurm...@lists.schedmd.com <slurm...@lists.schedmd.com>
Subject: Re: [slurm-users] Cleanup of job_container/tmpfs
That looks like the user's home directory doesn't exist on the node.
If you are not using shared home directories across the nodes, your onboarding process should be reviewed to ensure home directories are created reliably on every node.
If you are using a shared home, you should do the above and also have each node verify that the shared filesystems are mounted before allowing jobs.
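As a sketch, such a mount check could look like the following (the mount point and the drain action are assumptions; adapt it to your site's health-check tooling):

```shell
#!/bin/bash
# Sketch of a pre-job shared-filesystem check. The mount point
# ("/home") and what to do on failure are assumptions for your site.

is_mounted() {
    # True if the given path appears as a mount point in /proc/mounts
    # (second whitespace-separated field).
    awk -v m="$1" '$2 == m { found = 1 } END { exit !found }' /proc/mounts
}

SHARED_FS="/home"   # assumed shared filesystem mount point
if ! is_mounted "$SHARED_FS"; then
    # In production you would drain the node here (e.g. via
    # "scontrol update NodeName=... State=DRAIN") so Slurm stops
    # scheduling jobs onto it; this sketch only warns.
    echo "WARNING: $SHARED_FS is not mounted" >&2
fi
```

Tools such as LBNL's Node Health Check (NHC) wrap exactly this kind of test and integrate with Slurm's HealthCheckProgram.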
-Brian Andrus