[slurm-users] prolog not passing env var to job


Herc Silverstein

Feb 12, 2021, 4:13:05 PM
to slurm...@schedmd.com

Hi,

I have a prolog script that is being run via the slurm.conf Prolog= setting.  I've verified that it's being executed on the compute node.  My problem is that I cannot get environment variables that I set in this prolog to be set/seen in the job.  For example, the prolog:

#!/bin/bash

...

export CUDA_MPS_PIPE_DIRECTORY=blah

...

is not part of the job's env vars.  Someone had suggested that a "print" statement needs to be used. So I tried:

print export CUDA_MPS_PIPE_DIRECTORY=blah

but this doesn't work either.  Is this really just for printing, as its name implies?  Or is that how one actually has to do it, and somehow the syntax I'm using isn't quite right?

The documentation says:

The task prolog is executed with the same environment as the user tasks to be initiated. The standard output of that program is read and processed as follows:

"export name=value" sets an environment variable for the user task
"unset name" clears an environment variable from the user task
"print ..." writes to the task's standard output
The above functionality is limited to the task prolog script.

Basically, I'd like to know how to get an arbitrary environment variable passed on to the job via the prolog.

We are using slurm 20.02.5 on CentOS 7.

Thanks,

Herc

Sarlo, Jeffrey S

Feb 12, 2021, 4:18:02 PM
to slurm...@schedmd.com
In our taskprolog file we have something like

#!/bin/sh
echo export SCRATCHDIR=/scratch/${SLURM_JOBID}
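
A quick way to check that the value actually reaches the task (the session below is illustrative, not copied from a real cluster):

    $ srun -n 1 --pty /bin/bash
    $ echo $SCRATCHDIR
    /scratch/80123

The job ID shown is made up; the point is just that whatever the task prolog echoes as "export NAME=value" shows up in the task's environment.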




mercan

Feb 12, 2021, 4:27:54 PM
to Slurm User Community List, Herc Silverstein, slurm...@schedmd.com
Hi;

Prolog and TaskProlog are different parameters and scripts. You should
use the TaskProlog script to set env. variables.
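
For example, a minimal sketch (paths and the pipe directory are illustrative, adjust for your site). In slurm.conf:

Prolog=/etc/slurm/prolog.sh
TaskProlog=/etc/slurm/taskprolog.sh

and in /etc/slurm/taskprolog.sh:

#!/bin/bash
# slurmd parses this script's stdout: lines of the form
# "export NAME=value" are added to the task's environment.
echo "export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps"

The Prolog= script still runs (typically as root) before the job and is the place for privileged setup; per the documentation, only the task prolog can inject variables into the task's environment this way.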

Regards;

Ahmet M.



Brian Andrus

Feb 12, 2021, 4:34:24 PM
to slurm...@lists.schedmd.com
Your prolog script is run by/as the same user as slurmd, so any
environment variables you set there will not be available to the job
being run.

See: https://slurm.schedmd.com/prolog_epilog.html for info.

Brian Andrus

Herc Silverstein

Feb 12, 2021, 6:03:15 PM
to Slurm User Community List, slurm...@schedmd.com
Thanks to everyone who replied!  It's working now.

I had to make a number of changes:

1. Set the env vars in the TaskProlog so that they are exported to the
job/task.  (I had just assumed that even though the Prolog is run under
root, and not the user ID of the job, it would still end up passing the
env vars down to the job/task.)

2. Use "echo export ..." and not just "export ...".

3. Some setup is still needed in this case; do it in the Prolog script,
since (among other things) it needs to start one process as root.
(A sketch of the resulting split follows.)
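
Roughly, with paths and the exact MPS invocation only illustrative (not our actual scripts):

/etc/slurm/prolog.sh  (Prolog=, runs as root on the compute node):

#!/bin/bash
# Privileged setup: create the MPS directories and start the MPS
# control daemon as root.  (Directory locations are illustrative.)
export CUDA_MPS_PIPE_DIRECTORY=/var/run/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/var/log/nvidia-mps
mkdir -p "$CUDA_MPS_PIPE_DIRECTORY" "$CUDA_MPS_LOG_DIRECTORY"
nvidia-cuda-mps-control -d

/etc/slurm/taskprolog.sh  (TaskProlog=, runs as the job user; its stdout is parsed):

#!/bin/bash
# Only "echo export ..." lines end up in the task's environment.
echo "export CUDA_MPS_PIPE_DIRECTORY=/var/run/nvidia-mps"
echo "export CUDA_MPS_LOG_DIRECTORY=/var/log/nvidia-mps"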

Herc


Chin,David

Mar 4, 2021, 12:28:40 AM
to Slurm User Community List
> Prolog and TaskProlog are different parameters and scripts. You should
> use the TaskProlog script to set env. variables.

Can you tell me how to do this for srun? E.g. users request an interactive shell:

    srun -n 1 -t 600 --pty /bin/bash

but the shell on the compute node does not have the env variables set.

I use the same prolog script as the TaskProlog, which sets them properly for jobs
submitted with sbatch.

Thanks in advance,
    Dave Chin

--
David Chin, PhD (he/him)   Sr. SysAdmin, URCF, Drexel
dw...@drexel.edu                     215.571.4335 (o)
For URCF support: urcf-s...@drexel.edu
github:prehensilecode




Brian Andrus

Mar 4, 2021, 10:13:06 AM
to slurm...@lists.schedmd.com


It seems to me that if you are using srun directly to get an interactive shell, you can just run the script once you get your shell.


You can set the variables and then run srun; it automatically exports the environment.

If you want to change a particular one (or more), use something like --export=ALL,MYVAR=othervalue.

Do 'man srun' and look at the --export option.
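
For example (the variable name and value are just placeholders):

    $ export MYVAR=othervalue
    $ srun -n 1 -t 600 --pty /bin/bash      # MYVAR is carried into the job automatically

or, to override a single variable just for this job:

    $ srun -n 1 -t 600 --export=ALL,MYVAR=othervalue --pty /bin/bash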


Brian Andrus

Chin,David

Mar 4, 2021, 12:04:06 PM
to Slurm User Community List
Hi, Brian:

So, this is my SrunProlog script -- I want a job-specific tmp dir, which makes for easy cleanup at the end of the job:

#!/bin/bash
if [[ -z ${SLURM_ARRAY_JOB_ID+x} ]]
then
    export TMP="/local/scratch/${SLURM_JOB_ID}"
    export TMPDIR="${TMP}"
    export LOCAL_TMPDIR="${TMP}"
    export BEEGFS_TMPDIR="/beegfs/scratch/${SLURM_JOB_ID}"
else
    export TMP="/local/scratch/${SLURM_ARRAY_JOB_ID}.${SLURM_ARRAY_TASK_ID}"
    export TMPDIR="${TMP}"
    export LOCAL_TMPDIR="${TMP}"
    export BEEGFS_TMPDIR="/beegfs/scratch/${SLURM_ARRAY_JOB_ID}.${SLURM_ARRAY_TASK_ID}"
fi

echo DEBUG srun_set_tmp.sh
echo I am `whoami`

/usr/bin/mkdir -p ${TMP}
chmod 700 ${TMP}
/usr/bin/mkdir -p ${BEEGFS_TMPDIR}
chmod 700 ${BEEGFS_TMPDIR}

And this is my srun session:

picotte001::~$ whoami
dwc62
picotte001::~$ srun -p def --mem 1000 -n 4 -t 600 --pty /bin/bash
DEBUG srun_set_tmp.sh
I am dwc62
node001::~$ echo $TMP
/local/scratch/80472
node001::~$ ll !$
ll $TMP
/bin/ls: cannot access '/local/scratch/80472': No such file or directory
node001::~$ mkdir $TMP
node001::~$ ll -d !$
ll -d $TMP
drwxrwxr-x 2 dwc62 dwc62 6 Mar  4 11:52 /local/scratch/80472/
node001::~$ exit

So, the "echo" and "whoami" statements are executed by the prolog script, as expected, but the mkdir commands are not?

Thanks,
    Dave

--
David Chin, PhD (he/him)   Sr. SysAdmin, URCF, Drexel
dw...@drexel.edu                     215.571.4335 (o)
For URCF support: urcf-s...@drexel.edu
github:prehensilecode



Brian Andrus

Mar 4, 2021, 1:48:44 PM
to slurm...@lists.schedmd.com

I think it isn't running the way you think it is, or there is something not provided in the description.


You have:

    export TMP="/local/scratch/${SLURM_ARRAY_JOB_ID}.${SLURM_ARRAY_TASK_ID}"

Notice that period in there.
Then you have:
node001::~$ echo $TMP
/local/scratch/80472
There is no period.
In fact, SLURM_ARRAY_JOB_ID should be blank too if you are not running as an array session.

However, to your desire for a job-specific tmp directory:
Check out the mktemp command. It should do just what you want. I use it for interactive desktop sessions, to create the temp directory that is used for users' X sessions.
You just need to make sure the user has write access to the directory you are creating the new directory in (chmod 1777 on the parent directory is good).
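
Roughly (the parent directory and template are just examples):

    # one-time, on the node: make the parent writable by everyone, sticky bit set
    chmod 1777 /local/scratch
    # then, at the start of the job:
    TMP=$(mktemp -d /local/scratch/job_${SLURM_JOB_ID}_XXXXXX)
    export TMP TMPDIR="$TMP"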

Brian Andrus

Chin,David

Mar 4, 2021, 3:05:00 PM
to Slurm User Community List
Hi Brian:

This works just as I expect for sbatch.

The example srun execution I showed was a non-array job, so the first branch of the "if" statement holds. It is the second branch, which deals with job arrays, that has the period.

The value of TMP is correct, i.e. "/local/scratch/80472" 

And the command in the prolog script is correct, i.e. "/usr/bin/mkdir -p ${TMP}"

If I type that command during the interactive job, it does what I expect, i.e. creates the directory $TMP = /local/scratch/80472

Regards,
    Dave


Chin,David

Mar 4, 2021, 8:19:28 PM
to Slurm User Community List
My mistake - from slurm.conf(5):

SrunProlog runs on the node where the "srun" is executing.

i.e. the login node, which explains why the directory is not being created on the compute node, while the echoes work.

--
David Chin, PhD (he/him)   Sr. SysAdmin, URCF, Drexel
dw...@drexel.edu                     215.571.4335 (o)
For URCF support: urcf-s...@drexel.edu
github:prehensilecode

