[slurm-users] Possible to require jobs to use NVLink-ed pairs of GPUs?
Marcus Lauer via slurm-users
Oct 27, 2025, 11:12:06 AM
to slurm...@lists.schedmd.com
One of our researchers asked whether it was possible to require a job to use NVLink-ed pairs of GPUs.
I see that there is a support ticket on the SchedMD site which covers this (https://support.schedmd.com/show_bug.cgi?id=15995), but that ticket is a few years old. Does anyone know whether support for this has been added in newer releases of Slurm?
The cluster in question uses "AutoDetect=nvml" in its gres.conf, and the output of "slurmd -G" shows that Slurm is aware of the NVLink pairs. I assume the scheduler tries to use that information. What I want to know is whether an end user can add something to a job (a constraint, for example) so that it only runs on an NVLink-ed pair of GPUs.
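To make "NVLink pairs" concrete, here is a rough sketch of how the pairs could be read off the Links= values that Slurm records for each GPU (in the gres.conf format, -1 marks the device itself and a positive number is the count of links to another device). The sample lines are hypothetical, not output from our nodes, and the helper name nvlink_pairs is made up for illustration. It assumes the lines are listed in device-index order.

```python
import re

def nvlink_pairs(gres_lines):
    """Return sorted (i, j) GPU index pairs whose Links= entry is positive.

    Assumes one line per GPU, listed in device-index order.
    """
    links = []
    for line in gres_lines:
        m = re.search(r"Links=(-?\d+(?:,-?\d+)*)", line)
        if m:
            links.append([int(x) for x in m.group(1).split(",")])
    pairs = set()
    for i, row in enumerate(links):
        for j, count in enumerate(row):
            # -1 is the device itself, 0 means no direct link
            if i < j and count > 0:
                pairs.add((i, j))
    return sorted(pairs)

# Hypothetical four-GPU node with two NVLink-ed pairs
sample = [
    "Name=gpu File=/dev/nvidia0 Links=-1,4,0,0",
    "Name=gpu File=/dev/nvidia1 Links=4,-1,0,0",
    "Name=gpu File=/dev/nvidia2 Links=0,0,-1,4",
    "Name=gpu File=/dev/nvidia3 Links=0,0,4,-1",
]
print(nvlink_pairs(sample))  # prints [(0, 1), (2, 3)]
```

This only describes the topology; it does not make the scheduler do anything, which is the part I am asking about.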
I do know that there are other ways to implement this, such as requiring jobs to use even numbers of GPUs, perhaps only on some nodes so that single-GPU jobs can still run on the remaining nodes. I'm specifically asking about a flag or setting that users could apply to their own jobs. If anyone here knows of such a thing, I'd love to hear about it. Thanks!