[slurm-users] Job Step Output Delay


Maria Semple

Feb 9, 2021, 6:47:50 PM
to Slurm User Community List
Hello all,

I've noticed an odd behaviour with job steps in some Slurm environments. When a script is launched directly as a job, its output is written to file immediately. When the same script is launched as a step within a job, output is written in ~30 second chunks. This doesn't happen in all Slurm environments, but if it happens in one, it seems to always happen. For example, on my local development cluster, which is a single node running Ubuntu 18, I don't experience this. On a large CentOS 7-based cluster, I do.

Below is a simple reproducible example:

loop.sh:
#!/bin/bash
for i in {1..100}
do
   echo $i
   sleep 1
done

withsteps.sh:
#!/bin/bash
srun ./loop.sh

Then, from the command line, running sbatch loop.sh followed by tail -f slurm-<job #>.out prints the job output in small chunks, which appears to be related to file system buffering or the time it takes the tail process to notice that the file has updated. Running cat on the file every second shows that the output is in the file immediately after the script emits it.

If you run sbatch withsteps.sh instead, tail-ing or repeatedly cat-ing the output file shows that the job output is written in chunks of 30-35 lines.

I'm hoping this is something that is possible to work around, potentially related to an OS setting, the way Slurm was compiled, or a Slurm setting.

--
Thanks,
Maria

Sean Maxwell

Feb 10, 2021, 7:29:24 AM
to Slurm User Community List
Hi Maria,

Have you tried adding the -u flag (specifies unbuffered) to your srun command?


Your description sounds like buffering, so this might help.
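Untested sketch of what I mean, reusing the withsteps.sh from your first message:

```shell
# Same withsteps.sh, but with srun's -u flag: the step then runs under a
# pseudo terminal, so the task's stdout is not buffered through a pipe.
cat > withsteps.sh <<'EOF'
#!/bin/bash
srun -u ./loop.sh
EOF
chmod +x withsteps.sh
```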

Thanks,

-Sean

Tilman Schneider

Feb 10, 2021, 10:19:19 AM
to slurm...@lists.schedmd.com
Hi Maria,

This seems related to srun's behaviour around -u; from the official docs:
-u, --unbuffered
By default the connection between slurmstepd and the user launched application is over a pipe. The stdio output written by the application is buffered by the glibc until it is flushed or the output is set as unbuffered. See setbuf(3). If this option is specified the tasks are executed with a pseudo terminal so that the application output is unbuffered. This option applies to step allocations.
 
Hth
Tilman


Maria Semple

Feb 10, 2021, 4:11:54 PM
to Slurm User Community List
Hi Sean,

Thanks for your suggestion!

Adding the -u flag does not seem to have an impact on whether data is buffered. I also tried adding stdbuf -o0 before the call to srun, to no avail. 
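Concretely, the two variants I tried looked roughly like this:

```shell
# Reconstruction of the two variants; neither changed the buffering.
cat > withsteps-unbuffered.sh <<'EOF'
#!/bin/bash
# Variant 1: ask srun for a pseudo terminal (unbuffered task stdout)
srun -u ./loop.sh
EOF

cat > withsteps-stdbuf.sh <<'EOF'
#!/bin/bash
# Variant 2: disable stdio output buffering on srun itself
stdbuf -o0 srun ./loop.sh
EOF
```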

Best,
Maria

Aaron Jackson

Feb 10, 2021, 7:09:03 PM
to Slurm User Community List
Is it being written to NFS? You say your local dev cluster is a
single node. Is it also the login node as well as the compute node? In
that case I guess there is no NFS. The larger cluster will be using
some sort of shared storage, so whichever shared file system you are
using likely has caching.

If you are able to connect directly to the node which is running the
job, you can try tailing from there. It'll likely update immediately if
what I said above is the case.
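Something along these lines (just a sketch; it assumes the default slurm-<job #>.out name in the submission directory and a single-node job):

```shell
# Hypothetical helper: tail a job's output on the node actually running it,
# bypassing any shared-filesystem caching on the login node.
cat > tail-on-node.sh <<'EOF'
#!/bin/bash
jobid="$1"
# First/only node allocated to the job. Compressed node lists like
# n[01-04] would need expanding for multi-node jobs.
node=$(squeue -j "$jobid" -h -o '%N')
ssh "$node" tail -f "slurm-${jobid}.out"
EOF
chmod +x tail-on-node.sh
```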

Cheers,
Aaron
Research Fellow
School of Computer Science
University of Nottingham








Maria Semple

Feb 10, 2021, 7:14:46 PM
to Slurm User Community List
The larger cluster is using NFS. I can see how that could be related to the difference in behaviour between the clusters.

The buffering behaviour is the same if I tail the file from the node running the job, though. The only thing that seems to change the behaviour is whether I use srun to create a job step or not.
--
Thanks,
Maria

Bernstein, Noam CIV USN NRL (6393) Washington DC (USA)

Feb 11, 2021, 8:35:03 AM
to Slurm User Community List
Could be this quote from the srun man page:
-u, --unbuffered
By default the connection between slurmstepd and the user launched application is over a pipe. The stdio output written by the application is buffered by the glibc until it is flushed or the output is set as unbuffered. See setbuf(3). If this option is specified the tasks are executed with a pseudo terminal so that the application output is unbuffered. This option applies to step allocations.


Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory

