[slurm-users] Slurm 25.05: Retrieving jobs GPU Indices on Heterogeneous Cluster


David Gauchard via slurm-users

Aug 26, 2025, 11:18:04 AM
to slurm...@lists.schedmd.com
Hello,

I'm running Slurm 25.05 on a heterogeneous cluster (several kinds of GPU
in the same node) with AutoDetect=nvml and shared mode. When submitting
a job with `#SBATCH --gres=gpu:1`, CUDA_VISIBLE_DEVICES is correctly set
to a single valid, free GPU index, but `scontrol show jobid` does not
report any details about the GPU allocation.

How can I retrieve the GPU indices assigned to running jobs (reflecting
CUDA_VISIBLE_DEVICES) in shared mode? Is there a Slurm command or
configuration to enable tracking of these indices on a heterogeneous
cluster?

The goal is to help automatically choose a free, appropriate GPU
according to the job's needs at submission time, so that the bigger
GPUs are saved for bigger jobs.

Thanks



--
slurm-users mailing list -- slurm...@lists.schedmd.com
To unsubscribe send an email to slurm-us...@lists.schedmd.com

Laura Hild via slurm-users

Aug 26, 2025, 4:38:14 PM
to slurm...@lists.schedmd.com
Have you tried `scontrol show job --details`? Ours gives an "IDX:" in parentheses after the GPU gres, though you'll have to check for yourself whether that actually corresponds to what you're after.
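
In case it helps with scripting, here is a minimal Python sketch of how that "IDX:" field could be pulled out of the detailed output. It assumes the per-node detail line looks roughly like `Nodes=node01 CPU_IDs=0-3 Mem=4096 GRES=gpu:a100:1(IDX:2)`; that sample line, the node name, and the function names are illustrative assumptions, so adjust the patterns to what your Slurm version actually prints. Whether the indices match CUDA_VISIBLE_DEVICES still has to be checked on your cluster.

```python
import re
import subprocess

# Assumed detail-line layout (verify against your own output):
#   Nodes=node01 CPU_IDs=0-3 Mem=4096 GRES=gpu:a100:1(IDX:2)
_LINE_RE = re.compile(r"\bNodes=(?P<nodes>\S+).*?\bGRES=(?P<gres>\S+)")
_GRES_RE = re.compile(
    r"gpu(?::(?P<type>[^:,()]+))?:(?P<count>\d+)\(IDX:(?P<idx>[\d,-]+)\)"
)


def expand_indices(spec):
    """Expand an index spec such as '0-1,3' into [0, 1, 3]."""
    indices = []
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        indices.extend(range(int(lo), int(hi or lo) + 1))
    return indices


def parse_gpu_indices(text):
    """Map each Nodes= value to the GPU indices listed in its GRES field."""
    result = {}
    for line in text.splitlines():
        match = _LINE_RE.search(line)
        if not match:
            continue
        for gres in _GRES_RE.finditer(match.group("gres")):
            result.setdefault(match.group("nodes"), []).extend(
                expand_indices(gres.group("idx"))
            )
    return result


def job_gpu_indices(job_id):
    """Run scontrol with the details flag for one job and parse its output."""
    out = subprocess.run(
        ["scontrol", "-d", "show", "job", str(job_id)],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_indices(out)
```

Calling `job_gpu_indices(<jobid>)` on a node with `scontrol` available would return a dict keyed by the `Nodes=` value; jobs without a GPU allocation simply yield an empty dict.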


________________________________________
From: David Gauchard via slurm-users <slurm...@lists.schedmd.com>
Sent: Tuesday, 26 August 2025 11:15
To: slurm...@lists.schedmd.com
Subject: [slurm-users] Slurm 25.05: Retrieving jobs GPU Indices on Heterogeneous Cluster

David Gauchard via slurm-users

Aug 27, 2025, 4:22:45 AM
to slurm...@lists.schedmd.com

Many thanks, this command indeed gives me what I need!