[slurm-users] First setup of slurm with a GPU node


Patrick Begou via slurm-users

Nov 13, 2024, 6:03:13 AM
to Slurm User Community List

Hi,

I'm using Slurm on a small 8-node cluster. I've recently added one GPU node with two Nvidia A100 GPUs, one with 40 GB of memory and one with 80 GB.

As usage of this GPU resource increases, I would like to manage it with GRES to avoid usage conflicts. But at this time my setup does not work, as I can reach a GPU without reserving it:

srun -n 1 -p tenibre-gpu ./a.out

can use a GPU even though the job does not request this resource (checked by running nvidia-smi on the node). "tenibre-gpu" is a Slurm partition containing only this GPU node.

Following the documentation, I've created a gres.conf file; it has been propagated to all the nodes (9 compute nodes, 1 login node and the management node) and slurmd has been restarted.

gres.conf is:

## GPU setup on tenibre-gpu-0
NodeName=tenibre-gpu-0 Name=gpu Type=A100-40 File=/dev/nvidia0 Flags=nvidia_gpu_env
NodeName=tenibre-gpu-0 Name=gpu Type=A100-80 File=/dev/nvidia1 Flags=nvidia_gpu_env

In slurm.conf I have checked these settings:

## Basic scheduling
SelectTypeParameters=CR_Core_Memory
SchedulerType=sched/backfill
SelectType=select/cons_tres

## Generic resources
GresTypes=gpu

## Nodes list
....
Nodename=tenibre-gpu-0 RealMemory=257270 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN
....

#partitions
PartitionName=tenibre-gpu MaxTime=48:00:00 DefaultTime=12:00:00 DefMemPerCPU=4096 MaxMemPerCPU=8192 Shared=YES  State=UP Nodes=tenibre-gpu-0
...



Maybe I've missed something? I'm running Slurm 20.11.7-1.

Thanks for your advice.

Patrick

Roberto Polverelli Monti via slurm-users

Nov 13, 2024, 9:48:11 AM
to slurm...@lists.schedmd.com
Hello Patrick,

On 11/13/24 12:01 PM, Patrick Begou via slurm-users wrote:
> As usage of this GPU resource increases, I would like to manage it with
> GRES to avoid usage conflicts. But at this time my setup does not work,
> as I can reach a GPU without reserving it:
>
> srun -n 1 -p tenibre-gpu ./a.out
>
> can use a GPU even though the job does not request this resource
> (checked by running nvidia-smi on the node). "tenibre-gpu" is a Slurm
> partition containing only this GPU node.

I think what you're looking for is the ConstrainDevices parameter in
cgroup.conf.

See here:
- https://slurm.schedmd.com/archive/slurm-20.11.7/cgroup.conf.html
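
As a rough sketch (illustrative values only, and assuming TaskPlugin=task/cgroup is already set in slurm.conf), a cgroup.conf that confines devices could look like:

## Illustrative cgroup.conf -- adapt to your site
CgroupAutomount=yes
ConstrainCores=yes
ConstrainRAMSpace=yes
## Hide GPU devices that a job did not request via --gres
ConstrainDevices=yes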

Best,

--
Roberto Polverelli Monti
HPC System Engineer
Do IT Now | https://doitnowgroup.com


Patrick Begou via slurm-users

Nov 13, 2024, 11:03:10 AM
to slurm...@lists.schedmd.com
On 13/11/2024 at 15:45, Roberto Polverelli Monti via slurm-users wrote:
> Hello Patrick,
>
> On 11/13/24 12:01 PM, Patrick Begou via slurm-users wrote:
>> As usage of this GPU resource increases, I would like to manage it with GRES to avoid usage conflicts. But at this time my setup does not work, as I can reach a GPU without reserving it:
>>
>>     srun -n 1 -p tenibre-gpu ./a.out
>>
>> can use a GPU even though the job does not request this resource (checked by running nvidia-smi on the node). "tenibre-gpu" is a Slurm partition containing only this GPU node.
>
> I think what you're looking for is the ConstrainDevices parameter in cgroup.conf.
>
> See here:
> - https://slurm.schedmd.com/archive/slurm-20.11.7/cgroup.conf.html
>
> Best,

Hi Roberto,

Thanks for pointing me to this parameter. I set it, updated all the nodes and restarted slurmd everywhere, but it does not change the behavior.
However, looking at the slurmd log on the GPU node, I noticed this information:


[2024-11-13T16:41:08.434] debug:  CPUs:32 Boards:1 Sockets:8 CoresPerSocket:4 ThreadsPerCore:1
[2024-11-13T16:41:08.434] debug:  gres/gpu: init: loaded
[2024-11-13T16:41:08.434] WARNING: A line in gres.conf for GRES gpu:A100-40 has 1 more configured than expected in slurm.conf. Ignoring extra GRES.
[2024-11-13T16:41:08.434] WARNING: A line in gres.conf for GRES gpu:A100-80 has 1 more configured than expected in slurm.conf. Ignoring extra GRES.
[2024-11-13T16:41:08.434] debug:  gpu/generic: init: init: GPU Generic plugin loaded
[2024-11-13T16:41:08.434] topology/none: init: topology NONE plugin loaded
[2024-11-13T16:41:08.434] route/default: init: route default plugin loaded
[2024-11-13T16:41:08.434] CPU frequency setting not configured for this node
[2024-11-13T16:41:08.434] debug:  Resource spec: No specialized cores configured by default on this node
[2024-11-13T16:41:08.434] debug:  Resource spec: Reserved system memory limit not configured for this node
[2024-11-13T16:41:08.434] debug:  Reading cgroup.conf file /etc/slurm/cgroup.conf
[2024-11-13T16:41:08.434] error: MaxSwapPercent value (0.0%) is not a valid number
[2024-11-13T16:41:08.436] debug:  task/cgroup: init: core enforcement enabled
[2024-11-13T16:41:08.437] debug:  task/cgroup: task_cgroup_memory_init: task/cgroup/memory: total:257281M allowed:100%(enforced), swap:0%(enforced), max:100%(257281M) max+swap:100%(514562M) min:30M kmem:100%(257281M permissive) min:30M swappiness:0(unset)
[2024-11-13T16:41:08.437] debug:  task/cgroup: init: memory enforcement enabled
[2024-11-13T16:41:08.438] debug:  task/cgroup: task_cgroup_devices_init: unable to open /etc/slurm/cgroup_allowed_devices_file.conf: No such file or directory
[2024-11-13T16:41:08.438] debug:  task/cgroup: init: device enforcement enabled
[2024-11-13T16:41:08.438] debug:  task/cgroup: init: task/cgroup: loaded
[2024-11-13T16:41:08.438] debug:  auth/munge: init: Munge authentication plugin loaded

So I think something is wrong in my gres.conf file, maybe because I try to configure 2 different device types on the node?

## GPU setup on tenibre-gpu-0
NodeName=tenibre-gpu-0 Name=gpu Type=A100-40 File=/dev/nvidia0 Flags=nvidia_gpu_env
NodeName=tenibre-gpu-0 Name=gpu Type=A100-80 File=/dev/nvidia1 Flags=nvidia_gpu_env
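
A quick way to see what the controller currently derives from slurm.conf for this node (a sketch using the node and partition names above) would be:

## GRES the controller associates with the node (from slurm.conf)
scontrol show node tenibre-gpu-0 | grep -i gres
## GRES column per node in the partition
sinfo -p tenibre-gpu -N -o "%N %G"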

Patrick

Benjamin Smith via slurm-users

Nov 13, 2024, 11:34:08 AM
to Slurm User Community List

Hi Patrick,

You're missing a Gres= on your node in your slurm.conf:

Nodename=tenibre-gpu-0 RealMemory=257270 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN Gres=gpu:A100-40:1,gpu:A100-80:1
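
After restarting slurmctld and the node's slurmd, a quick check that only the requested GPU is visible inside a job could be (a sketch, reusing the names above):

## Request one GPU by type and list the devices the job actually sees
srun -p tenibre-gpu --gres=gpu:A100-40:1 nvidia-smi -L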

Ben


-- 
Benjamin Smith <bsm...@ed.ac.uk>
Computing Officer, AT-7.12a
Research and Teaching Unit
School of Informatics, University of Edinburgh

Henk Meij via slurm-users

Nov 13, 2024, 12:29:58 PM
to Slurm User Community List, Benjamin Smith
Yes, I noticed this changed behavior too since v22 (testing v24 now).

The GRES definitions in gres.conf are ignored unless they are also declared in slurm.conf.

My gres.conf file now only has

NodeName=n[79-90] AutoDetect=nvml
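
(A matching node line in slurm.conf might then declare just the count and let nvml autodetection supply the device details; the sketch below uses a hypothetical GPU count, not Henk's actual configuration.)

## Hypothetical slurm.conf counterpart (other node attributes omitted)
NodeName=n[79-90] Gres=gpu:4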

-Henk


Patrick Begou via slurm-users

Nov 13, 2024, 4:10:58 PM
to slurm...@lists.schedmd.com
Hi Benjamin,

Yes, I saw this in an archived discussion too, and I've added these parameters. It was a little bit tricky to do as my setup is deployed via Ansible. But with this setup I'm not able to request a GPU at all. All these tests fail and Slurm does not accept the job:

srun -n 1 -p tenibre-gpu --gres=gpu:A100-40 ./a.out
srun -n 1 -p tenibre-gpu --gres=gpu:A100-40:1 ./a.out
srun -n 1 -p tenibre-gpu --gpus-per-node=A100-40:1 ./a.out
srun -n 1 -p tenibre-gpu --gpus-per-node=1 ./a.out
srun -n 1 -p tenibre-gpu --gres=gpu:1 ./a.out

Maybe there are some restrictions on the GPU type field with the "minus" sign? No idea. But launching a GPU code without reserving a GPU now fails at execution time on the node, so a first step is done!

Maybe I should upgrade my Slurm version from 20.11 to the latest. But I had to put the cluster back into production without the GPU setup this evening.

Patrick

Jason Simms via slurm-users

Nov 13, 2024, 4:18:51 PM
to Patrick Begou, slurm...@lists.schedmd.com
Hello Patrick,

Yeah, I'd recommend upgrading, and I imagine most others will, too. I have found that with Slurm upgrades are nearly mandatory, at least annually or so, mostly because upgrading from much older versions is more challenging and requires bootstrapping through intermediate releases. Not sure about the minus sign; that's an interesting hypothesis. For what it's worth, we don't use minus signs in our names. You may want to avoid characters like that, or perhaps use an underscore instead.
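
As a sketch of what that renaming could look like, building on the files Patrick posted earlier (the underscore names below are just an example):

## gres.conf on the GPU node
NodeName=tenibre-gpu-0 Name=gpu Type=a100_40 File=/dev/nvidia0
NodeName=tenibre-gpu-0 Name=gpu Type=a100_80 File=/dev/nvidia1

## matching Gres= on the tenibre-gpu-0 Nodename entry in slurm.conf
Gres=gpu:a100_40:1,gpu:a100_80:1

## and the corresponding request
srun -n 1 -p tenibre-gpu --gres=gpu:a100_40:1 ./a.out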

Jason



--
Jason L. Simms, Ph.D., M.P.H.
Research Computing Manager
Swarthmore College
Information Technology Services