[slurm-users] job_container/tmpfs and srun.

Phill Harvey-Smith

Jan 9, 2024, 6:31:33 AM
to slurm...@lists.schedmd.com
Hi all,

On our setup we are using job_container/tmpfs to give each job its own
temp space. Since our compute nodes have reasonably sized local disks,
for tasks that do a lot of disk I/O on users' data we ask users to copy
their data to the local disk at the beginning of the task and (if
needed) copy it back at the end. This saves a lot of NFS thrashing that
would slow down both the task and the NFS servers.
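
For reference, the relevant bits of our configuration look roughly like
this (a sketch only; the BasePath below is an example, not our real path):

# slurm.conf
JobContainerType=job_container/tmpfs
PrologFlags=Contain

# job_container.conf
AutoBasePath=true
BasePath=/local/slurm-tmpfs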

However, some of our users are having problems with this: their initial
sbatch script creates a temp directory in their private /tmp, copies
their data into it, and then tries to srun a program. The srun falls
over because it doesn't seem to have access to the copied data. I
suspect this is because the srun task is getting its own private /tmp.

So my question is, is there a way to have the srun task inherit the /tmp
of the initial sbatch?

I'll include a sample of the script our user is using below.

If any further information is required please feel free to ask.

Cheers.

Phill.


#!/usr/bin/bash
#SBATCH --nodes 1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:00:10
#SBATCH --mem-per-cpu=3999
#SBATCH --output=script_out.log
#SBATCH --error=script_error.log

# The above options put the STDOUT and STDERR of sbatch in
# log files prefixed with 'script_'.

# Create a randomly-named directory under /tmp
jobtmpdir=$(mktemp -d)

# Register a function to try to clean up in case of job failure
cleanup_handler()
{
    echo "Cleaning up ${jobtmpdir}"
    rm -rf "${jobtmpdir}"
}
trap 'cleanup_handler' SIGTERM EXIT

# Change working directory to this directory
cd ${jobtmpdir}

# Copy the executable and input files from
# where the job was submitted to the temporary directory.
cp ${SLURM_SUBMIT_DIR}/a.out .
cp ${SLURM_SUBMIT_DIR}/input.txt .

# Run the executable, handling the collection of stdout
# and stderr ourselves by redirecting to file
srun ./a.out 2> task_error.log > task_out.log

# Copy output data back to the submit directory.
cp output.txt ${SLURM_SUBMIT_DIR}
cp task_out.log ${SLURM_SUBMIT_DIR}
cp task_error.log ${SLURM_SUBMIT_DIR}

# Cleanup
cd ${SLURM_SUBMIT_DIR}
cleanup_handler
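
One workaround we have considered but not yet tried (just a sketch, and it
assumes each srun step really does get its own private /tmp) would be to
drop the separate staging above and do everything inside a single srun
step, so the copy and the executable share one view of /tmp:

srun bash -c '
    # Stage, run and copy back inside the step itself
    jobtmpdir=$(mktemp -d)
    cd "${jobtmpdir}"
    cp "${SLURM_SUBMIT_DIR}/a.out" "${SLURM_SUBMIT_DIR}/input.txt" .
    ./a.out > task_out.log 2> task_error.log
    cp output.txt task_out.log task_error.log "${SLURM_SUBMIT_DIR}/"
    rm -rf "${jobtmpdir}"
'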

Lambers, Martin via slurm-users

Jan 17, 2025, 6:26:56 AM
to slurm...@lists.schedmd.com
Hi all,

This comes a bit late, but we are having the same problem:

The sbatch script sees the job-specific /tmp created by
job_container/tmpfs and the job itself does too, but srun and mpirun do
not; they still see the system /tmp.

This is especially a problem if the user sets the working directory to
something inside the job-specific /tmp:

=====================
#!/bin/bash
#SBATCH --nodes=1
#SBATCH ...

mkdir /tmp/something
cd /tmp/something
srun hostname
=====================

This gives the message

slurmstepd: error: couldn't chdir to `/tmp/something': No such file or
directory: going to /tmp instead

In many cases, it seems that message can be ignored since the program
itself sees the job-specific /tmp, e.g. the following works as expected:

=====================
mkdir /tmp/something
cd /tmp/something
echo "42" > a
srun cat a
=====================
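
A possible workaround (a sketch only; we have not tested it beyond
trivial cases) is to leave the working directory outside the
job-specific /tmp and do the chdir inside the step itself, so
slurmstepd does not have to resolve the directory:

=====================
srun bash -c 'cd /tmp/something && hostname'
=====================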

However, MPICH jobs fail with messages like these:

[proxy:1@gpu016] launch_procs (proxy/pmip_cb.c:869): unable to change
wdir to /tmp/something (No such file or directory)
[...] (more error messages; job aborts).

The new job_container/tmpfs parameter EntireStepInNS in Slurm 24.11
removes the slurmstepd error message, but MPICH still fails, so it seems
the problem is not entirely solved.
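
For reference, this is roughly how we set it, assuming the option
belongs in job_container.conf next to BasePath (the path below is just
a placeholder):

=====================
# job_container.conf
AutoBasePath=true
BasePath=/var/spool/slurm/tmpfs
EntireStepInNS=true
=====================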

Does anybody have a solution for this?

Best,
Martin
--
Dr. habil. Martin Lambers
Forschung und wissenschaftliche Informationsversorgung
IT.SERVICES
Ruhr-Universität Bochum | 44780 Bochum | Germany
Phone: +49 234 32 29941
https://www.it-services.rub.de/
