[slurm-users] GPU configuration

Giuseppe G. A. Celano
Dec 11, 2021, 2:21:37 AM
to Slurm User Community List
Hi,

My cluster has 2 nodes: the first has 2 GPUs and the second has 1 GPU. Both nodes are in the "drained" state with the reason "gres/gpu count reported lower than configured". Any idea why this happens? Thanks.

My .conf files are:

slurm.conf

AccountingStorageTRES=gres/gpu
GresTypes=gpu
NodeName=technician Gres=gpu:2 CPUs=28 RealMemory=128503 Boards=1 SocketsPerBoard=1 CoresPerSocket=14 ThreadsPerCore=2 State=UNKNOWN
NodeName=worker0 Gres=gpu:1 CPUs=12 RealMemory=15922 Boards=1 SocketsPerBoard=1 CoresPerSocket=6 ThreadsPerCore=2 State=UNKNOWN
PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP

gres.conf

NodeName=technician Name=gpu File=/dev/nvidia[0-1]
NodeName=worker0 Name=gpu File=/dev/nvidia0
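
For reference, the drain reason means the node reported fewer GPUs than the Gres= count in slurm.conf. A minimal sketch of a cross-check that could be run on each node is below. It only tests whether the device files named in gres.conf exist, so it is a rough sanity check rather than what Slurm itself does. The helper names (expand_file_spec, count_present) are hypothetical, and only the simple [lo-hi] range form used above is handled.

```python
import os
import re


def expand_file_spec(spec):
    """Expand a gres.conf File= value like /dev/nvidia[0-1] into a list of paths.

    Assumption: only a single simple numeric range [lo-hi] is supported here.
    """
    m = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", spec)
    if not m:
        # No bracket range: the spec names a single device file.
        return [spec]
    prefix, lo, hi, suffix = m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)
    return [f"{prefix}{i}{suffix}" for i in range(lo, hi + 1)]


def count_present(spec, exists=os.path.exists):
    """Count how many of the device files named by spec exist on this machine."""
    return sum(1 for path in expand_file_spec(spec) if exists(path))


if __name__ == "__main__":
    # Values copied from the slurm.conf and gres.conf shown above:
    # node name -> (File= value, Gres=gpu:N count)
    configured = {
        "technician": ("/dev/nvidia[0-1]", 2),
        "worker0": ("/dev/nvidia0", 1),
    }
    for node, (spec, expected) in configured.items():
        found = count_present(spec)
        print(f"{node}: {found} of {expected} configured device files present locally")
```

Each node only sees its own /dev entries, so the line for the other node is not meaningful when the script is run locally.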

Best,
Giuseppe