[slurm-users] Power saving and node weight


Gizo Nanava

Feb 28, 2023, 10:44:45 AM
to slurm...@lists.schedmd.com
Hello,

it seems that if Slurm power saving is enabled, then the "Weight"
parameter is ignored for nodes that are in a powered-down state.

Is there any way to make this option work on a cluster running Slurm
in power saving mode?

I am aware of the note about the Weight option in the slurm.conf man
page. However, nodes that are in a powered-down state usually have an
"IDLE" marker in their state string in addition to "POWERED_DOWN".

We run slurm 22.05.3.
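
For context, a minimal slurm.conf sketch combining the two features in
question; node names, scripts, and timings below are illustrative, not
the poster's actual configuration:

```
# Power saving: idle nodes are powered down and resumed on demand
SuspendTime=600                            # power down after 10 min idle
SuspendProgram=/usr/local/sbin/suspend.sh  # site-provided script (example)
ResumeProgram=/usr/local/sbin/resume.sh    # site-provided script (example)
ResumeTimeout=300

# Node weights: the scheduler should prefer lower-weight nodes ...
NodeName=cheap[01-10] CPUs=16  Weight=1
NodeName=big[01-04]   CPUs=128 Weight=100
# ... but per this report, Weight is not honored for nodes that are
# currently POWERED_DOWN.
```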

Thank you,
Gizo

Brian Andrus

Feb 28, 2023, 12:02:13 PM
to slurm...@lists.schedmd.com

Gizo,

I had that issue and opened a ticket. It is not considered a bug but a feature request. They have no plans to address it at this time.

9734 – Jobs sent to higher weight idle node instead of starting lower weight node (schedmd.com)

You may be able to use the alternate approach that worked for me as well.

Brian Andrus

Jake Jellinek

Feb 28, 2023, 12:49:53 PM
to Slurm User Community List
Hi all

I come from a SGE/UGE background and am used to the convention that I can qrsh to a node and, from there, start a new qrsh to a different node with different parameters.
I've tried this with Slurm and found that this doesn’t work the same.

For example, if I issue an 'srun' command, I get a new node.
However if I then try to start a new srun session to a different node type (different resource requirements), it just puts me back on the same box.

I did find a post from 12 years ago that suggested that this was by design but am hoping that this has now changed or that there is a config option which turns off this feature.

Thank you
Jake

Brian Andrus

Feb 28, 2023, 1:47:56 PM
to slurm...@lists.schedmd.com
Jake,

It may help more to understand what you are trying to accomplish
rather than find out how to do it the way you expect.

I am guessing you are using srun to get an interactive session on a
node. That approach is being deprecated; with salloc you get a shell
by default.

If you are trying to start new jobs on other nodes, you would want to
use salloc/sbatch to launch them.
If you want multiple nodes in a single job, IIRC, you would request
them with the initial salloc and then use srun options to launch the
steps appropriately.
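
A concrete sketch of that workflow; resource sizes are illustrative,
and these commands of course need a running Slurm cluster:

```
# Get an interactive allocation. With recent Slurm (and
# LaunchParameters=use_interactive_step configured), salloc drops you
# into a shell on the first allocated node.
salloc --nodes=1 --cpus-per-task=4 --mem=8G

# Inside that allocation, srun launches job steps on the *allocated*
# resources -- which is why a nested srun lands on the same node.

# A session with different resource requirements should be a new job,
# e.g. submitted from the login node:
salloc --nodes=1 --cpus-per-task=16 --mem=64G
```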

What specifically do you want to get (resource-wise) and how do you want
to use them?

Brian Andrus

Jake Jellinek

Feb 28, 2023, 5:26:18 PM
to Slurm User Community List
Hi Brian

Thanks for your response

> I am guessing you are using srun to get an interactive session on a node. That approach is being deprecated and you get a shell by default with salloc
This is exactly what I'm trying to do .... I didn’t know about the salloc thing

Let me do some more testing and I'll see if you've just resolved my issue.

Will be in touch very soon
Jake

Doug Meyer

Feb 28, 2023, 9:17:24 PM
to Slurm User Community List
Hi,

I read the problem differently.  Might also want to look at heterogeneous jobs.
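
A minimal sketch of the heterogeneous-job syntax, assuming the
Slurm 22.05 CLI; resource sizes are illustrative:

```
# One submission with two differently-shaped components, separated by ":"
salloc --nodes=1 --cpus-per-task=4 --mem=16G : \
       --nodes=1 --cpus-per-task=32 --mem=256G

# Steps can then target a specific component:
srun --het-group=0 hostname   # step in the small component
srun --het-group=1 hostname   # step in the large component
```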

Doug

Gizo Nanava

Mar 1, 2023, 3:30:16 AM
to Slurm User Community List
Hello Brian,

thanks a lot for the info.

>
> You may be able to use the alternate approach that I was able to do as well.
>
I would be interested in any alternatives. Could you point me to some documentation?

Best wishes
Gizo

Brian Andrus

Mar 1, 2023, 9:14:39 AM
to slurm...@lists.schedmd.com
Gizo,

There is no documentation, only the bug report mentioned.

We may be able to help from the list if you explain what you want to do.

Brian