[slurm-users] Is SWAP memory mandatory for SLURM

John Joseph via slurm-users

Mar 4, 2024, 2:06:26 AM

to slurm...@lists.schedmd.com

Dear All,

Good morning

I have a 4-node SLURM instance up and running.

I would like to know whether disabling swap will affect SLURM performance.

Is swap a mandatory requirement? Each node has plenty of RAM; if the physical RAM is large enough, is there any need for swap?

thanks

Joseph John

Cutts, Tim via slurm-users

Mar 4, 2024, 5:01:40 AM

to John Joseph, slurm...@lists.schedmd.com

It depends on a number of factors.

How do your workloads behave? Do they do a lot of fork()? I’ve had cases in the past where users submitted scripts which initially used quite a lot of memory and then used fork() or system() to execute subprocesses. This of course means that temporarily (between the fork() and the exec() system calls) the job uses twice as much virtual memory, although this does not become real because the pages are copy-on-write. Something similar happens if the code performs mmap() on large files.

Whether this means you need swap space depends on your sysctl settings for vm.overcommit_memory and vm.overcommit_ratio.
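To see which policy a node is currently running with, both values can be read straight from /proc. This is a minimal Python sketch, assuming a Linux node; the helper name read_vm_setting and its root parameter are made up for illustration and are not part of Slurm or any library:

```python
from pathlib import Path
from typing import Optional


def read_vm_setting(name: str, root: str = "/proc/sys/vm") -> Optional[int]:
    """Return the integer value of a vm.* sysctl, or None if it is unavailable."""
    path = Path(root) / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Not a Linux host, file missing, or unexpected content.
        return None


if __name__ == "__main__":
    for key in ("overcommit_memory", "overcommit_ratio"):
        print(f"vm.{key} = {read_vm_setting(key)}")
```

Running it on each compute node (for example through srun) is a quick way to confirm that all nodes agree before deciding anything about swap.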

If you set vm.overcommit_memory to 2, then the OOM killer will never hit you (because malloc() will fail rather than allocate virtual memory that isn’t available), but cases like the above will tend to fail memory allocations unnecessarily, especially if you don’t have any swap allocated.
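To put rough numbers on this: in mode 2 the kernel refuses allocations once the committed address space would exceed CommitLimit, which is swap plus vm.overcommit_ratio percent of RAM (huge pages excluded), and the default ratio is 50. The Python sketch below is only illustrative arithmetic. The node and job sizes are invented, and commit_limit_gib and fork_fits are hypothetical helpers, not anything provided by the kernel or Slurm:

```python
def commit_limit_gib(ram_gib: float, swap_gib: float,
                     overcommit_ratio: int = 50,
                     hugetlb_gib: float = 0.0) -> float:
    """Approximate CommitLimit under vm.overcommit_memory=2, in GiB."""
    return (ram_gib - hugetlb_gib) * overcommit_ratio / 100 + swap_gib


def fork_fits(parent_gib: float, limit_gib: float,
              other_committed_gib: float = 0.0) -> bool:
    """Rough check: between fork() and exec() the child is charged a second
    copy of the parent's private writable memory, so about 2x is needed."""
    return 2 * parent_gib + other_committed_gib <= limit_gib


if __name__ == "__main__":
    job_gib = 80  # assumed size of the process that calls fork()
    scenarios = [
        ("no swap, ratio 50", commit_limit_gib(256, 0)),
        ("64 GiB swap, ratio 50", commit_limit_gib(256, 64)),
        ("no swap, ratio 100", commit_limit_gib(256, 0, overcommit_ratio=100)),
    ]
    for label, limit in scenarios:
        print(f"{label}: limit={limit} GiB, fork of {job_gib} GiB job fits: "
              f"{fork_fits(job_gib, limit)}")
```

On the assumed 256 GiB node with no swap and the default ratio, only 128 GiB can be committed, so an 80 GiB process cannot fork even though most of the RAM is free. Adding swap or raising vm.overcommit_ratio both lift the limit.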

If you set vm.overcommit_memory to 0 or 1, then you need less swap allocated (possibly even zero) but you run the risk of running out of memory and the OOM killer blowing things up left, right and centre.

If you provide swap, it only causes a performance impact if the node actually runs out of physical memory and actively starts swapping.
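One way to tell whether a node is actively swapping, rather than merely having swap configured, is to watch the cumulative pswpin and pswpout counters in /proc/vmstat. This is a rough Python sketch, assuming Linux; the function names and the one-second sampling interval are arbitrary choices made for the example:

```python
import os
import time


def parse_swap_counters(vmstat_text: str) -> dict:
    """Extract the cumulative pages-swapped-in/out counters from /proc/vmstat text."""
    counters = {"pswpin": 0, "pswpout": 0}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in counters:
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before: dict, after: dict) -> dict:
    """Pages swapped in and out between two samples."""
    return {key: after[key] - before[key] for key in before}


if __name__ == "__main__" and os.path.exists("/proc/vmstat"):
    with open("/proc/vmstat") as f:
        first = parse_swap_counters(f.read())
    time.sleep(1)
    with open("/proc/vmstat") as f:
        second = parse_swap_counters(f.read())
    # Non-zero deltas mean pages moved to or from swap during the interval.
    print(swap_delta(first, second))
```

Deltas that stay at zero under load suggest the configured swap is not costing anything; sustained non-zero values are the case where performance suffers.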

So bottom line is I think it depends on what you want the failure mode to be.

If you want everything to always run in a very deterministic way at full speed, with failures at the precise moment the memory is exhausted, but with a risk that jobs fail if they’re relying on overcommit (e.g. through fork()/exec()), then vm.overcommit_memory=2 and no swap.
If you want high-throughput single-threaded stuff to run more smoothly (think: horrible genomics perl and python scripts, etc.), then vm.overcommit_memory=0 and add some swap. You’ll probably get higher throughput, but things may blow up slightly unpredictably from time to time when nodes run out of memory.

I now call on someone who understands cgroups properly to explain how this changes when cgroups are in play, because I’m not sure I understand that!

Tim

--

Tim Cutts

Scientific Computing Platform Lead

AstraZeneca


Brian Andrus via slurm-users

Mar 4, 2024, 1:36:13 PM

to slurm...@lists.schedmd.com

Joseph,

You will likely get many perspectives on this. I disable swap completely on our compute nodes. I can be draconian that way. For the workflow supported, this works and is a good thing.
Other workflows may benefit from swap.

Brian Andrus

Christopher Samuel via slurm-users

Mar 4, 2024, 6:57:42 PM

to slurm...@lists.schedmd.com

On 3/3/24 23:04, John Joseph via slurm-users wrote:

> Is SWAP a mandatory requirement

All our compute nodes are diskless, so no swap on them.

--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA

--
slurm-users mailing list -- slurm...@lists.schedmd.com
To unsubscribe send an email to slurm-us...@lists.schedmd.com
