[slurm-users] Questions about dynamic nodes


Groner, Rob

Sep 27, 2022, 11:26:57 AM
to slurm...@lists.schedmd.com
I have 2 nodes that offer a "gc" feature.  Node t-gc-1202 is "normal", and node t-gc-1201 is dynamic.  I can successfully remove t-gc-1201 and bring it back dynamically.  Once I bring it back, that node appears JUST LIKE the "normal" node in the sinfo output, as seen here:

[rug262@testsch (RC) slurm] sinfo -o "%20N  %10c  %10m  %25f  %10G "
NODELIST              CPUS        MEMORY      AVAIL_FEATURES             GRES      
t-sc-[1101-1104]      48          358400      nogpu,sc                   (null)    
t-gc-1201             48          385420      gpu,gc,a100                gpu:2(S:0-
t-gc-1202             48          358400      gpu,gc,a100                gpu:2      
t-ic-1051             36          500000      ic,a40                     (null)

When I execute a job requiring 24 CPUs and the gc feature, then it runs on t-gc-1202 only.  If I sbatch 3 of the same jobs at once, then 2 run on t-gc-1202 and the 3rd is pending for resources.

[rug262@testsch (RC) slurm] squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               405 open-requ gpu_test   rug262 PD       0:00      1 (Resources)
               404 open-requ gpu_test   rug262  R       0:06      1 t-gc-1202
               403 open-requ gpu_test   rug262  R       0:07      1 t-gc-1202

Both nodes show up in the partitions and show idle before starting the jobs:

[rug262@testsch (RC) slurm] sinfo
PARTITION     AVAIL  TIMELIMIT  NODES  STATE NODELIST
open*            up 2-00:00:00      4   idle t-sc-[1101-1104]
open-requeue     up 2-00:00:00      6   idle t-gc-[1201-1202],t-sc-[1101-1104]
intr             up 2-00:00:00      1   idle t-ic-1051
sla-prio         up   infinite      6   idle t-gc-[1201-1202],t-sc-[1101-1104]
burst            up   infinite      4   idle t-sc-[1101-1104]
burst-requeue    up   infinite      6   idle t-gc-[1201-1202],t-sc-[1101-1104]
debug            up   infinite      7   idle t-gc-[1201-1202],t-ic-1051,t-sc-[1101-1104]


So my 2 questions:
  1. How do I get my dynamic node to be utilized like the non-dynamic nodes?
  2. I want to have a DIFFERENT feature on my dynamic node, that is not present in the "normal" nodes.  When a job is submitted that requires the feature of the dynamic node, I need the job to suspend until the dynamic node becomes available.  How do I go about setting that up?  
Thanks.




Kevin Buckley

Sep 28, 2022, 12:47:42 AM
to slurm...@lists.schedmd.com
On 2022/09/27 23:26, Groner, Rob wrote:
> I have 2 nodes that offer a "gc" feature. Node t-gc-1202 is "normal", and node t-gc-1201 is dynamic.
> I can successfully remove t-gc-1201 and bring it back dynamically. Once I bring it back, that node
> appears JUST LIKE the "normal" node in the sinfo output, as seen here:
>
> [rug262@testsch (RC) slurm] sinfo -o "%20N %10c %10m %25f %10G "
> NODELIST CPUS MEMORY AVAIL_FEATURES GRES
> t-sc-[1101-1104] 48 358400 nogpu,sc (null)
> t-gc-1201 48 385420 gpu,gc,a100 gpu:2(S:0-
> t-gc-1202 48 358400 gpu,gc,a100 gpu:2
> t-ic-1051 36 500000 ic,a40 (null)
>
> When I execute a job requiring 24 CPUs and the gc feature, then it runs on t-gc-1202 only.
> If I sbatch 3 of the same jobs at once, then 2 run on t-gc-1202 and the 3rd is pending for
> resources.

Always assumed Features were a "boolean indicator", as in a node
either has it, or it doesn't have it, for scheduling purposes, but
the behaviour you are seeing suggests that Slurm MAY be "counting"
TWO nodes as having the feature, and then giving up after it has
scheduled TWO jobs, which, modulo some other countable resource
being exhausted by the two running jobs, seems wrong.

So, what happens if you sbatch 3 of the same jobs, each asking
for 16 CPUs and the gc feature ?

If the three all start on t-gc-1202, then there'd seem to be
something screwed and tied into the "bringing t-gc-1201 back
dynamically", but if only two start, and start on t-gc-1202,
then it points towards the total number of gc features, rather,
total number of nodes with the gc feature, being counted, or
some other countable resource being exhausted.



Groner, Rob

Sep 28, 2022, 9:23:27 AM
to Kevin Buckley, slurm...@lists.schedmd.com

I tried a simpler test, removing the features altogether so it was just another node offering 48 CPUs.  I then started jobs asking for 24 CPUs a bunch of times.  The jobs started on every node EXCEPT t-gc-1201, and jobs went pending for resources until the "normal" nodes could return.

So at this point, I cannot come up with a working method to bring up a node dynamically and have it be useful in any way.  I'm sure I'm missing something if this actually does work for other people.  The dynamic node guide is incredibly spartan, so I believe I've covered everything there.

Rob



Groner, Rob

Sep 28, 2022, 3:45:52 PM
to Kevin Buckley, slurm...@lists.schedmd.com
I ended up getting some help, and in the process, I noticed (for the first time) that the topology plugin was listed in the slurm.conf file.  I remembered that the dynamic nodes docs mentioned that dynamic nodes were not compatible with the topology plugin.  I had previously removed the nodes from the topology.conf file, and had thought that would be enough... apparently not.  I removed the plugin line from slurm.conf, restarted slurmctld, and now jobs use resources from the dynamic nodes.  sigh
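
For anyone hitting the same thing, here is a minimal sketch of a check for an active topology plugin line in slurm.conf. The function name topology_plugin_line is hypothetical, and the config path is passed in as an argument because the thread does not say where slurm.conf lives on this cluster. It only reads the file; it does not query the running controller.

```shell
# Print any active (uncommented) TopologyPlugin line in the given slurm.conf.
# Exits non-zero when no such line is found.
# slurm.conf keywords are case-insensitive, hence grep -i.
topology_plugin_line() {
    grep -Ei '^[[:space:]]*TopologyPlugin[[:space:]]*=' "$1"
}
```

Call it as `topology_plugin_line /path/to/slurm.conf`. If it prints a line, that is the one to remove or comment out before restarting slurmctld, as described above.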
