[slurm-users] Enforcing GPU-CPU ratios

Durai Arasan

Jun 23, 2020, 10:02:12 AM
to Slurm User Community List
Hi,

We would like to enforce a fixed ratio of CPUs to GPUs allocated. To explain further: when a job is submitted requesting a certain number of GPUs (using --gres=gpu:n), we would like the number of CPUs allocated to the job to be fixed by the number of GPUs. We would like this to be managed automatically by configuration rather than specified by the job submitter, for example with the "--cpus-per-gpu" option.

Which part of the Slurm configuration can be used to enforce this ratio?

Thank you,
Durai Arasan
Zentrum für Datenverarbeitung
Tübingen

Bas van der Vlies

Jun 23, 2020, 10:34:47 AM
to Slurm User Community List, Durai Arasan
Which version of Slurm do you use? As of Slurm 19.05 there is:
* DefCpuPerGPU

{{{
PartitionName=gpu_shared_education DefCpuPerGPU=3 DefMemPerCPU=20900
Default=No DefaultTime=5 DisableRootJobs=YES ExclusiveUser=NO MaxNodes=1
MaxTime=2-0 Nodes=r30n[4] OverSubscribe=FORCE Priority=1000
QOS=p_gpu_shared_education State=UP
TRESBillingWeights=CPU=4.0,Mem=196.61T,GRES/gpu=12.0,GRES/gpu:gtx1080ti=12.0
}}}
--
Bas van der Vlies
| Operations, Support & Development | SURFsara | Science Park 140 | 1098 XG
Amsterdam
| T +31 (0) 20 800 1300 | bas.van...@surfsara.nl | www.surfsara.nl |

Kilian Cavalotti

Mar 14, 2023, 8:50:28 PM
to Slurm User Community List
On Tue, Jun 23, 2020 at 7:37 AM Bas van der Vlies
<bas.van...@surfsara.nl> wrote:
>
> Which version of slurm do you use? as slurm 19.05:
> * DefCpuPerGPU

Sorry for necroposting and digging up this old thread, but the
DefCpuPerGPU configuration option is just a default, which
will happily get overridden by job submission options. It's even
reported as "JobDefaults" in `scontrol show partition`:

```
$ scontrol show partition foo | grep DefCpuPerGPU
JobDefaults=DefCpuPerGPU=1
```

It works as a default:
```
$ salloc -p foo -G 3

$ echo $SLURM_GPUS_ON_NODE
3
$ echo $SLURM_CPUS_ON_NODE
3

```

but doesn't enforce the ratio:
```
$ salloc -p foo -G 2 -c 4

$ echo $SLURM_GPUS_ON_NODE
2
$ echo $SLURM_CPUS_ON_NODE
4
```
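
Since the ratio isn't enforced, a job can at least detect the mismatch itself. The following is only a minimal sketch of such a check, not a Slurm feature: `ratio_ok` is a hypothetical helper, and `EXPECTED_CPUS_PER_GPU` is an assumed site-chosen variable (Slurm does not set it). It only uses the `SLURM_CPUS_ON_NODE` and `SLURM_GPUS_ON_NODE` variables shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the node allocation has exactly
# `ratio` CPUs for each GPU.
# Arguments: CPUs on node, GPUs on node, expected CPUs per GPU.
ratio_ok() {
    cpus=$1
    gpus=$2
    ratio=$3
    # A missing or zero GPU count cannot be checked; return 2 in that case.
    [ "${gpus:-0}" -gt 0 ] || return 2
    [ "${cpus:-0}" -eq $((gpus * ratio)) ]
}

# Inside a job, Slurm sets the SLURM_*_ON_NODE variables.
# EXPECTED_CPUS_PER_GPU is an assumption of this sketch, defaulting to 1.
if [ -n "${SLURM_GPUS_ON_NODE:-}" ]; then
    if ratio_ok "$SLURM_CPUS_ON_NODE" "$SLURM_GPUS_ON_NODE" "${EXPECTED_CPUS_PER_GPU:-1}"; then
        echo "CPU/GPU ratio OK"
    else
        echo "CPU/GPU ratio differs from the expected value" >&2
    fi
fi
```

With the two allocations above and an expected ratio of 1, the first one (3 CPUs, 3 GPUs) would pass and the second (4 CPUs, 2 GPUs) would be flagged. This only reports the mismatch after the allocation has been granted; it does not prevent it.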

There is currently (as of 23.02) no mechanism to enforce a fixed
CPU-per-GPU ratio.

We've recently submitted a bug to request this feature (as a
MaxCpuPerGPU option, for instance), but we unfortunately won't be able
to sponsor its development.
If anyone's interested, it's up for grabs at
https://bugs.schedmd.com/show_bug.cgi?id=16189

Cheers,
--
Kilian
