Dear all,

A small update.
On 24.11.21 18:13, Sven Duscha wrote:
> So, maybe this wouldn't be a big disadvantage, if that allows us to
> use 32 slots on the "16 Cores with 2 SMT" Xeons in the PowerEdge R720
> machines with Ubuntu 20.04
>
>
> Has anyone else encountered this problem? Is there a better/proper way of
> using all SMT/HT cores?
It took about half an hour, with nothing running apart from a few test
jobs, for the node to fall back into the "drained" state:
sinfo -lNe
Wed Nov 24 18:23:05 2021
NODELIST  NODES  PARTITION  STATE    CPUS  S:C:T   MEMORY  TMP_DISK  WEIGHT  AVAIL_FE  REASON
ekgen1    1      cluster*   idle     16    2:8:1   480000  0         1       (null)    none
ekgen2    1      cluster*   mixed    32    2:8:2   250000  0         1       (null)    none
ekgen3    1      debian     idle     32    2:8:2   250000  0         1       (null)    none
ekgen4    1      cluster*   mixed    32    2:8:2   250000  0         1       (null)    none
ekgen5    1      cluster*   idle     32    2:8:2   250000  0         1       (null)    none
ekgen6    1      debian     idle     32    2:8:2   250000  0         1       (null)    none
ekgen7    1      cluster*   idle     32    2:8:2   250000  0         1       (null)    none
ekgen8    1      debian     drained  32    2:16:1  250000  0         1       (null)    Low socket*core*thre
ekgen9    1      cluster*   idle     32    2:8:2   192000  0         1       (null)    none
Thus,
NodeName=ekgen[8] RealMemory=250000 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN
isn't a working node declaration either.
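
To make the arithmetic explicit, here is a minimal Python sketch. It is not part of Slurm, and the helper names parse_node_decl, parse_sct and logical_cpus are made up for illustration. It only multiplies out the socket, core and thread counts for the declaration above and for the 2:8:2 layout that sinfo reports for the other 32-CPU nodes; it does not reproduce whatever check Slurm itself performs before draining a node.

```python
import re


def parse_node_decl(decl: str) -> dict:
    """Pull the topology fields out of a slurm.conf-style NodeName line.

    Hypothetical helper: fields that are missing default to 1 here,
    which is an assumption of this sketch.
    """
    fields = dict(re.findall(r"(\w+)=(\S+)", decl))
    return {
        "sockets": int(fields.get("Sockets", 1)),
        "cores": int(fields.get("CoresPerSocket", 1)),
        "threads": int(fields.get("ThreadsPerCore", 1)),
    }


def parse_sct(sct: str) -> dict:
    """Parse the S:C:T column of `sinfo -lNe` (sockets:cores:threads)."""
    sockets, cores, threads = (int(x) for x in sct.split(":"))
    return {"sockets": sockets, "cores": cores, "threads": threads}


def logical_cpus(topo: dict) -> int:
    """Total logical CPUs implied by a sockets/cores/threads layout."""
    return topo["sockets"] * topo["cores"] * topo["threads"]


# The declaration tried for ekgen8.
declared = parse_node_decl(
    "NodeName=ekgen[8] RealMemory=250000 Sockets=2 "
    "CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN"
)
# S:C:T shown by sinfo for the other 32-CPU nodes.
other_nodes = parse_sct("2:8:2")

print("declared:   ", declared, "->", logical_cpus(declared), "CPUs")
print("other nodes:", other_nodes, "->", logical_cpus(other_nodes), "CPUs")
print("same layout:", declared == other_nodes)
```

Both layouts come out at 32 logical CPUs, so the CPUS total alone doesn't tell them apart; they differ only in how the 32 are split between cores and threads.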
The question remains why a declaration matching the output of slurmd -C
doesn't work with Ubuntu 20.04.
P.S.: Fixed version typo in the subject.
--
Sven Duscha
Deutsches Herzzentrum München
Technische Universität München
Lazarettstraße 36
80636 München
+49 89 1218 2602