[slurm-users] How to enforce memory constraints?

Rodrigo Santibáñez

Oct 4, 2021, 6:48:19 PM
to Slurm User Community List
Hello Slurm Users,

I'm having a hard time configuring Slurm to kill jobs when they use more memory than requested. Also, I can't restrict jobs to RAM only, and some of them start to use swap.

I don't know what I'm missing.

Thanks for your help

slurmd -V
slurm 20.02.6

slurm.conf
TaskPlugin=task/affinity,task/cgroup
ProctrackType=proctrack/cgroup

cgroup.conf
AllowedRAMSpace=100.0
AllowedSwapSpace=0.0
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
MemorySwappiness=0
CgroupAutomount=yes
ConstrainCores=yes

Hermann Schwärzler

Oct 5, 2021, 4:15:45 AM
to slurm...@lists.schedmd.com
Hi Rodrigo,

a possible solution is using

VSizeFactor=100

in slurm.conf.

With this setting, programs that try to allocate more memory than
requested in the job's settings will fail.

Be aware that this puts a limit on *virtual* memory, not on RSS. This
might or might not be what you want as a lot of programs tend to
allocate (a lot) more virtual memory than they really use (RSS).
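
To make the arithmetic concrete, here is a minimal sketch of how the virtual memory limit follows from VSizeFactor, assuming the behaviour described in the slurm.conf man page (the limit is a percentage of the job's real memory limit, and 0 disables it). The function name is made up for illustration and is not part of Slurm.

```python
from typing import Optional


def vsize_limit_mb(real_limit_mb: int, vsize_factor: int) -> Optional[float]:
    """Virtual memory limit (MB) implied by VSizeFactor.

    Assumption: VSizeFactor is a percentage of the job's real memory
    limit, and a value of 0 means no virtual memory limit is enforced.
    """
    if vsize_factor == 0:
        return None  # enforcement disabled
    return real_limit_mb * vsize_factor / 100


# A job submitted with --mem=4096 under VSizeFactor=100
print(vsize_limit_mb(4096, 100))
```

With VSizeFactor=100 the virtual limit equals the requested memory, which is why programs that reserve large address ranges without touching them can fail early.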

Regards,
Hermann

Tina Friedrich

Oct 5, 2021, 4:31:51 AM
to slurm...@lists.schedmd.com
Hi Rodrigo,

we do pretty much what you do - constrain via cgroups - and it works
fine. So I know it's possible. (I don't think I've ever twiddled the
VSizeFactor.)

I think you also need

SelectType=select/cons_res (or cons_tres)
SelectTypeParameters=CR_Core_Memory

in your slurm.conf; have you got that?
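
A quick way to check this is sketched below. It is a simplified parser, not a Slurm tool: it only handles plain Key=Value lines, and the helper names are hypothetical. It assumes memory counts as a consumable resource when SelectTypeParameters contains one of the *_Memory options (CR_Core_Memory, CR_CPU_Memory, and so on).

```python
def parse_slurm_conf(text: str) -> dict:
    """Parse simple 'Key=Value' lines; keys are lower-cased for lookup."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def memory_is_consumable(settings: dict) -> bool:
    """True if a cons_res/cons_tres plugin is selected with a *_Memory option."""
    select_type = settings.get("selecttype", "").lower()
    params = settings.get("selecttypeparameters", "").lower().split(",")
    uses_cons = select_type in ("select/cons_res", "select/cons_tres")
    tracks_memory = any(p.strip().endswith("_memory") for p in params)
    return uses_cons and tracks_memory


sample = """
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
"""
print(memory_is_consumable(parse_slurm_conf(sample)))
```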

My cgroup.conf is this:

CgroupMountpoint="/sys/fs/cgroup"
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=yes
TaskAffinity=no
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
ConstrainDevices=yes
AllowedDevicesFile="/etc/slurm/cgroup_allowed_devices_file.conf"
AllowedRamSpace=100
AllowedSwapSpace=0
MaxRAMPercent=100
MaxSwapPercent=0
MinRAMSpace=30
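
For reference, a rough sketch of how these settings appear to combine into the RAM limit placed on a job's cgroup, assuming the semantics given in the cgroup.conf man page: AllowedRAMSpace is a percentage of the allocated memory, MaxRAMPercent is an upper bound as a percentage of the node's total RAM, and MinRAMSpace is a lower bound in MB. The function and its parameter names are illustrative only.

```python
def cgroup_ram_limit_mb(
    alloc_mb: float,
    total_ram_mb: float,
    allowed_ram_space: float = 100.0,  # AllowedRAMSpace, percent of allocation
    max_ram_percent: float = 100.0,    # MaxRAMPercent, percent of node RAM
    min_ram_space_mb: float = 30.0,    # MinRAMSpace, MB
) -> float:
    """Approximate RAM limit (MB) applied to a job's memory cgroup."""
    limit = alloc_mb * allowed_ram_space / 100.0
    upper = total_ram_mb * max_ram_percent / 100.0
    if limit < min_ram_space_mb:
        return min_ram_space_mb  # never constrain below the floor
    if limit > upper:
        return upper  # never allow more than the node-wide cap
    return limit


# A 4 GB job on a 64 GB node with the settings above
print(cgroup_ram_limit_mb(4096, 65536))
```

With AllowedRAMSpace=100 and AllowedSwapSpace=0, the job should be limited to exactly what it requested, with no swap on top.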

Tina
--
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk http://www.it.ox.ac.uk

Rodrigo Santibáñez

Oct 5, 2021, 2:15:16 PM
to Slurm User Community List
Hello Tina and Hermann,

Thanks for your suggestions.

@Tina, I have SelectType=select/cons_res and SelectTypeParameters=CR_CPU_Memory in slurm.conf

I added TaskAffinity=no, MaxRAMPercent=100 and MaxSwapPercent=0 to cgroup.conf. I will see what happens.
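
One way to test the enforcement is a small script that deliberately uses more resident memory than the job requests. The sketch below is only an illustration (the file name memhog.py and the function name are made up). It writes one byte per 4096-byte page, which assumes that page size, so that the memory is actually resident rather than just reserved.

```python
import sys

PAGE = 4096  # assumed page size in bytes


def allocate(total_mb: int, chunk_mb: int = 10) -> list:
    """Allocate total_mb of memory in chunks and touch every page."""
    chunks = []
    allocated = 0
    while allocated < total_mb:
        size_mb = min(chunk_mb, total_mb - allocated)
        buf = bytearray(size_mb * 1024 * 1024)
        # Write one byte per page so the kernel must back it with real memory
        buf[::PAGE] = b"\x01" * len(range(0, len(buf), PAGE))
        chunks.append(buf)
        allocated += size_mb
    return chunks


# Run as: python3 memhog.py <megabytes>
if __name__ == "__main__" and len(sys.argv) > 1 and sys.argv[1].isdigit():
    held = allocate(int(sys.argv[1]))
    print(f"allocated {sum(len(c) for c in held) // (1024 * 1024)} MB")
```

Submitted with something like sbatch --mem=100M --wrap "python3 memhog.py 500", the job should be killed before it prints anything if the cgroup limit is enforced. If it finishes and prints 500 MB, the constraint is not being applied.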

Best!