[slurm-users] slurmstepd: error: Exceeded job memory limit at some point.

Geert Kapteijns

Feb 14, 2018, 6:22:29 AM
to slurm...@lists.schedmd.com
Hi everyone,

I’m running into out-of-memory errors when I submit an array job. Needless to say, 100M should be more than enough, and increasing the allocated memory to 1G doesn't solve the problem. I call my script as follows: sbatch --array=100-199 run_batch_job. run_batch_job contains:

#!/usr/bin/env bash
#SBATCH --partition=lln
#SBATCH --output=/home/user/outs/%x.out.%a
#SBATCH --error=/home/user/outs/%x.err.%a
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=100M
#SBATCH --time=2-00:00:00

srun my_program.out "$SLURM_ARRAY_TASK_ID"

Instead of using --mem-per-cpu and --cpus-per-task, I’ve also tried the following:

#SBATCH --mem=100M
#SBATCH --ntasks=1  # Number of tasks (not cores)
#SBATCH --nodes=1   # Run all tasks on one node

But in both cases for some of the runs, I get the error:

slurmstepd: error: Exceeded job memory limit at some point.
srun: error: obelix-cn002: task 0: Out Of Memory
slurmstepd: error: Exceeded job memory limit at some point.

I’ve also posted the question on Stack Overflow. Does anyone know what is happening here?

Kind regards,
Geert Kapteijns



Loris Bennett

Feb 14, 2018, 7:06:20 AM
to Geert Kapteijns, slurm...@lists.schedmd.com
Maybe once in a while a simulation really does just use more memory than you
were expecting. Have a look at the output of

sacct -j 123456 -o jobid,maxrss,state --units=M

with the appropriate job ID.
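
For an array job, sacct reports each task separately (e.g. 123456_100), so a single call covers the whole array. As a minimal sketch for spotting the hungriest tasks (123456 is a placeholder for your array job's ID):

# Sort all array tasks by peak resident memory so the largest values
# stand out; --noheader keeps the pipe clean for sort.
sacct -j 123456 -o jobid,maxrss,state --units=M --noheader | sort -k 2 -h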

Regards

Loris

--
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin
Email: loris....@fu-berlin.de

Chris Bridson (NBI)

Feb 14, 2018, 8:10:08 AM
to Slurm User Community List, Geert Kapteijns
Also consider any cached information, e.g. NFS. You won't necessarily see this, but it might be getting accounted for in the cgroup, depending on your setup/settings.
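
One way to check this is to read the job's cgroup memory statistics on the compute node; the "cache" line there shows how much page cache is being charged to the job. A minimal sketch, assuming cgroup v1, Slurm's default memory cgroup layout, and a placeholder job ID:

# Run on the compute node while the job is active; comparing "cache"
# with "rss" shows how much of the accounted memory is page cache.
grep -E '^(cache|rss) ' /sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_123456/memory.stat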

John DeSantis

Feb 14, 2018, 9:52:26 AM
to slurm...@lists.schedmd.com
Geert,

Considering the following response from Loris:

> Maybe once in a while a simulation really does just use more memory
> than you were expecting. Have a look at the output of
>
> sacct -j 123456 -o jobid,maxrss,state --units=M
>
> with the appropriate job ID.

This can certainly happen!

I'd suggest profiling the job(s) in question, perhaps with a loop of `ps`
using the appropriate output modifiers, e.g. 'rss' (and 'vsz' if you're
tracking virtual memory usage); see the sketch below.
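
Something like the following could serve as that loop (a rough sketch; my_program.out and the one-second interval are just examples):

# Sample RSS and VSZ (in KiB) of the running program once per second
# until it exits, appending each sample to a log.
while pgrep -x my_program.out > /dev/null; do
    ps -C my_program.out -o pid,rss,vsz,comm --no-headers
    sleep 1
done >> mem_profile.log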

We've seen jobs that terminate after several hours of run time because
their memory usage spiked during a JobAcctGatherFrequency sampling
interval (every 30 seconds by default, adjustable within slurm.conf).
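
For reference, the relevant slurm.conf settings look something like this (illustrative values, not a recommendation):

# Poll task memory every 30 seconds (the default); a shorter interval
# catches brief spikes sooner at the cost of some overhead.
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=task=30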

John DeSantis

Williams, Jenny Avis

Feb 15, 2018, 8:19:41 PM
to Slurm User Community List
We see this here as well. There is a difference in behavior depending on whether the program runs out of the "standard" NFS or the GPFS filesystem.

If the I/O is from NFS, there can be conditions where we see this with some frequency for a given problem. It will not happen every time, but it can be reproduced.

The same routine run over GPFS would likely not present this error.
Our GPFS, however, is configured with a large local LROC, memory pinned for mmfsd, etc. You have to push the I/O much harder on GPFS than on NFS to get a D wait on that filesystem.

It appears to correlate with how efficiently the file caching is handled.
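
If you want to see whether processes are actually hitting that D wait, something like this can help (a sketch; run on the compute node while the job is under I/O load):

# List processes in uninterruptible (D) sleep, typically blocked on I/O;
# wchan shows the kernel function each one is waiting in.
ps -eo state,pid,wchan:32,comm | awk '$1 == "D"'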

Jenny
UNC Chapel Hill