I'm running into out-of-memory errors when I submit an array job. Needless to say, 100M should be more than enough, and increasing the allocated memory to 1G doesn't solve the problem. I submit the job as follows: sbatch --array=100-199 run_batch_job, where run_batch_job contains:
#!/usr/bin/env bash
#SBATCH --partition=lln
#SBATCH --output=/home/user/outs/%x.out.%a
#SBATCH --error=/home/user/outs/%x.err.%a
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=100M
#SBATCH --time=2-00:00:00
srun my_program.out $SLURM_ARRAY_TASK_ID
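For what it's worth, my understanding is that the program's own peak memory could be checked by wrapping the call in GNU time (assuming /usr/bin/time on the compute nodes is the GNU version), e.g.:

# GNU time's -v output includes "Maximum resident set size (kbytes)"
srun /usr/bin/time -v my_program.out $SLURM_ARRAY_TASK_ID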
Instead of using --mem-per-cpu and --cpus-per-task, I’ve also tried the following:
#SBATCH --mem=100M
#SBATCH --ntasks=1   # run a single task
#SBATCH --nodes=1    # on a single node
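Put together (with the other directives unchanged), that second variant of run_batch_job looks like this:

#!/usr/bin/env bash
#SBATCH --partition=lln
#SBATCH --output=/home/user/outs/%x.out.%a
#SBATCH --error=/home/user/outs/%x.err.%a
#SBATCH --ntasks=1   # run a single task
#SBATCH --nodes=1    # on a single node
#SBATCH --mem=100M
#SBATCH --time=2-00:00:00
srun my_program.out $SLURM_ARRAY_TASK_ID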
But in both cases, for some of the runs I get the following errors:
slurmstepd: error: Exceeded job memory limit at some point.
srun: error: obelix-cn002: task 0: Out Of Memory
slurmstepd: error: Exceeded job memory limit at some point.
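In case the accounting numbers are relevant, this is roughly how the recorded memory of a finished task can be inspected (the job ID below is just a placeholder); MaxRSS shows up on the step lines, e.g. the .0 step created by srun:

sacct -j 12345_117 --format=JobID,State,ExitCode,ReqMem,MaxRSS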
I've also posted this question on Stack Overflow. Does anyone know what is happening here?
Kind regards,
Geert Kapteijns