[slurm-users] Configuration for nodes with different TmpFs locations and TmpDisk sizes

Jake Longo via slurm-users

Sep 4, 2024, 11:03:46 AM
to slurm...@schedmd.com
Hi,

We have a number of machines in our compute cluster that have larger disks available for local data. I would like to add them to the same partition as the rest of the nodes but assign them a larger TmpDisk value which would allow users to request a larger tmp and land on those machines.

The main hurdle is that (for reasons beyond my control) the larger local disks are on a special mount point, /largertmp, whereas the rest of the compute cluster uses the vanilla /tmp. I can't see an obvious way to make this work: TmpFs appears to be a global-only setting, and setting TmpDisk on those nodes to a value larger than the filesystem actually mounted at TmpFs puts the machine into an invalid state.
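For illustration, a minimal slurm.conf sketch of the situation (node names and sizes are hypothetical): TmpFs is a single global path, while TmpDisk can be set per node, which is exactly the mismatch:

```ini
# slurm.conf (sketch; node names and sizes are hypothetical)
TmpFs=/tmp                               # global: the same path is assumed on every node

NodeName=node[01-20]    TmpDisk=100000   # MB, matches the real size of /tmp
NodeName=bignode[01-04] TmpDisk=2000000  # invalid: slurmd measures the filesystem at
                                         # TmpFs (/tmp), which is small on these nodes,
                                         # so they register with less TmpDisk than configured
```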

I couldn't see any similar support tickets or anything in the mailing list archive, but I wouldn't have thought a setup like this was that unusual.

Thanks in advance!
Jake

Jake Longo via slurm-users

Sep 5, 2024, 6:13:28 AM
to slurm...@schedmd.com
Hi all,

Cutts, Tim via slurm-users

Sep 5, 2024, 7:15:18 AM
to Jake Longo, slurm...@schedmd.com

I’ve always had local storage mounted in the same place, in /tmp. In LSF clusters, I just let LSF’s lim get on with autodetecting how big /tmp was and setting the tmp resource automatically. I presume SLURM can do the same thing, but I’ve never checked.

Tim

-- 

Tim Cutts

Scientific Computing Platform Lead

AstraZeneca

simpsond4--- via slurm-users

Sep 6, 2024, 10:30:13 AM
to Jake Longo, slurm...@schedmd.com

Hi,

This may help.

job_container.conf
--------
# All nodes have /localscratch, but on some_nodes2 it is mounted on NVMe.
AutoBasePath=true
BasePath=/localscratch
Shared=true

# Some nodes have /localscratch1 configured, as /localscratch is already taken
# by a valid local device setup.
NodeName=some_nodes[9995-9999] AutoBasePath=true BasePath=/localscratch1 Shared=true

# some_nodes2, where we want to use the local NVMe mounted at /localscratch.
# If this is NVIDIA kit we may not want /dev/shm, so explicitly list /tmp only.
NodeName=some_nodes2[7770-7777] Dirs="/tmp" AutoBasePath=true BasePath=/localscratch Shared=true
--------
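One caveat worth adding (my understanding from the docs, please double-check for your Slurm version): job_container.conf is only read when the tmpfs job-container plugin is enabled in slurm.conf, roughly:

```ini
# slurm.conf (excerpt; required for job_container.conf to take effect)
JobContainerType=job_container/tmpfs
PrologFlags=Contain    # the plugin requires jobs to be contained at allocation time
```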





David

----------

David Simpson - Senior Systems Engineer

ARCCA, Redwood Building,

King Edward VII Avenue,

Cardiff, CF10 3NB

From: Jake Longo via slurm-users <slurm...@lists.schedmd.com>
Date: Wednesday, 4 September 2024 at 16:19
To: slurm...@schedmd.com <slurm...@schedmd.com>
Subject: [slurm-users] Configuration for nodes with different TmpFs locations and TmpDisk sizes

