[slurm-users] do oversubscription with algorithm other than least-loaded?


Herc Silverstein

Feb 24, 2022, 4:36:00 PM
to slurm...@schedmd.com

Hi,

We would like to oversubscribe a cluster that runs in the cloud.  The cluster dynamically spins CPU nodes up and down as needed.  What we see is that the least-loaded algorithm spins up the maximum number of nodes specified in the partition and loads each one with N jobs for its N CPUs before it "doubles back" and starts oversubscribing.

What we actually want is for the minimum number of nodes to be used, with each node fully loaded (to the limit of the oversubscription setting) before another one is started.  That is, we really want a "most-loaded" algorithm.  This would reduce the number of nodes we need to run, and with it our costs.
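
To make the difference concrete, here is a minimal Python sketch of the two placement orders described above. It is a toy model, not Slurm's scheduler: the function names, the per-core bookkeeping, and the tie-breaking by lowest node index are all assumptions made for illustration.

```python
def place_least_loaded(num_jobs, max_nodes, cores_per_node, factor):
    """Toy model: each job goes to the core with the fewest jobs so far.

    Ties are broken by lowest node index, then lowest core index
    (an assumption; it reproduces the behaviour described in the post).
    """
    load = [[0] * cores_per_node for _ in range(max_nodes)]
    for _ in range(num_jobs):
        candidates = [
            (load[n][c], n, c)
            for n in range(max_nodes)
            for c in range(cores_per_node)
            if load[n][c] < factor  # respect the oversubscription limit
        ]
        if not candidates:
            break  # every core is already at the limit
        _, n, c = min(candidates)
        load[n][c] += 1
    return [sum(node) for node in load]


def place_most_loaded(num_jobs, max_nodes, cores_per_node, factor):
    """Toy model: fill one node to cores_per_node * factor before the next."""
    capacity = cores_per_node * factor
    per_node = [0] * max_nodes
    for _ in range(num_jobs):
        for n in range(max_nodes):
            if per_node[n] < capacity:
                per_node[n] += 1
                break
    return per_node


def nodes_used(per_node):
    """Count nodes that received at least one job."""
    return sum(1 for jobs in per_node if jobs > 0)


if __name__ == "__main__":
    for name, place in (("least-loaded", place_least_loaded),
                        ("most-loaded", place_most_loaded)):
        result = place(16, 4, 4, 2)
        print(name, result, "nodes used:", nodes_used(result))
```

With 16 single-core jobs, a partition limit of 4 nodes, 4 cores per node, and an oversubscription factor of 2, the least-loaded model ends up with 4 jobs on each of 4 nodes, while the most-loaded model puts 8 jobs on each of 2 nodes and leaves the other 2 unstarted.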

Is there a way to get this behavior somehow?

Herc



Daniel Letai

Mar 3, 2022, 1:37:00 PM
to slurm...@lists.schedmd.com

I could be missing something here, but if you are referring to SelectTypeParameters=cr_lln, you could just try cr_pack_nodes instead.

https://slurm.schedmd.com/slurm.conf.html#OPT_CR_Pack_Nodes


If you want it configured per partition, I'm not sure that's possible; you might need to set a distribution (-m) in your job submit script or wrapper (e.g., -m block:*:*,pack).

https://slurm.schedmd.com/sbatch.html#OPT_distribution


If you're referring to something else entirely, could you elaborate on the least-loaded configuration in your setup?

-- 
Regards,

Daniel Letai
+972 (0)505 870 456

Herc Silverstein

Mar 7, 2022, 11:29:07 PM
to slurm...@lists.schedmd.com, slurm-use...@lists.schedmd.com
Hi,

We'd like to have just one of the partitions oversubscribe the nodes in
it.  The nodes are not shared with any other partitions.

The SLURM documentation (https://slurm.schedmd.com/cons_res_share.html)
seems to indicate that the least-loaded algorithm is always used when
oversubscribe=force.  I believe oversubscribe=force is what we want (but
with each node packed fully first).

Thanks for pointing out the -m option.  Our jobs are each submitted
with a separate sbatch, so unfortunately I don't see how we can use it
in this case.

What we want to be able to do is run 8 (or 12) jobs on, say, a 4-core
node, but only for the nodes in this one partition.  The other
partitions should continue to run N jobs on an N-core node.
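
As a rough illustration of what that would save, the arithmetic can be sketched in Python. This assumes single-core jobs and perfect packing, and the helper name nodes_needed is hypothetical, not anything Slurm provides.

```python
import math


def nodes_needed(num_jobs, cores_per_node, factor):
    """Minimum nodes if every node is packed to cores_per_node * factor jobs.

    Assumes single-core jobs and perfect packing (an idealisation).
    """
    jobs_per_node = cores_per_node * factor
    return math.ceil(num_jobs / jobs_per_node)


if __name__ == "__main__":
    # 24 jobs on 4-core nodes with 1, 2, or 3 jobs per core
    for factor in (1, 2, 3):
        print(factor, nodes_needed(24, 4, factor))
```

For 24 jobs on 4-core nodes this gives 6 nodes with no oversubscription, 3 nodes at 8 jobs per node, and 2 nodes at 12 jobs per node.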

Herc

