[slurm-users] Job step does not take the whole allocation


Danny Marc Rotscher

unread,
Jun 30, 2023, 2:15:29 AM6/30/23
to slurm...@schedmd.com
Dear all,

we are currently seeing a change in the default behavior of job steps.
On our old cluster (Slurm 20.11.9) a job step takes all the resources of the allocation:
rotscher@tauruslogin5:~> salloc --partition=interactive --nodes=1 --ntasks=1 --cpus-per-task=24 --hint=nomultithread
salloc: Pending job allocation 37851810
salloc: job 37851810 queued and waiting for resources
salloc: job 37851810 has been allocated resources
salloc: Granted job allocation 37851810
salloc: Waiting for resource configuration
salloc: Nodes taurusi6605 are ready for job
bash-4.2$ srun numactl -show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpubind: 0 1
nodebind: 0 1
membind: 0 1

If I run the same command on our new cluster, the job step takes only 1 core instead of all of them, without any further parameter:
[rotscher@login1 ~]$ salloc --nodes=1 --ntasks=1 --cpus-per-task=24 --hint=nomultithread
salloc: Pending job allocation 9197
salloc: job 9197 queued and waiting for resources
salloc: job 9197 has been allocated resources
salloc: Granted job allocation 9197
salloc: Waiting for resource configuration
salloc: Nodes n1601 are ready for job
[rotscher@login1 ~]$ srun numactl -show
policy: default
preferred node: current
physcpubind: 0
cpubind: 0
nodebind: 0
membind: 0 1 2 3 4 5 6 7

If I add the parameter "-c 24" to the job step, it also takes the whole allocation, but the step should take it by default:
[rotscher@login1 ~]$ srun -c 24 numactl -show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpubind: 0 1
nodebind: 0 1
membind: 0 1 2 3 4 5 6 7

I searched the slurm.conf documentation, the mailing list, and also the changelog, but found no reference to a matching parameter.
Does anyone of you know this behavior and how to change it?

Best wishes,
Danny

Tommi Tervo

unread,
Jun 30, 2023, 2:41:30 AM6/30/23
to Slurm User Community List
> I searched the slurm.conf documentation, the mailing list, and also the
> changelog, but found no reference to a matching parameter.
> Does anyone of you know this behavior and how to change it?

Hi,

This was an annoying change:

22.05.x RELEASE_NOTES:
-- srun will no longer read in SLURM_CPUS_PER_TASK. This means you will
implicitly have to specify --cpus-per-task on your srun calls, or set the
new SRUN_CPUS_PER_TASK env var to accomplish the same thing.
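For completeness, the env-var route from the release note can be applied inside a job script without any site-wide filter. A minimal sketch (the default value of 24 is only for illustration; in a real job, Slurm sets SLURM_CPUS_PER_TASK from --cpus-per-task):

```shell
# In a batch job, SLURM_CPUS_PER_TASK reflects --cpus-per-task.
# Since 22.05, srun no longer reads it, but it does read
# SRUN_CPUS_PER_TASK, so re-export it before any srun calls:
: "${SLURM_CPUS_PER_TASK:=24}"   # set by Slurm in a real job
export SRUN_CPUS_PER_TASK="${SLURM_CPUS_PER_TASK}"
echo "SRUN_CPUS_PER_TASK=${SRUN_CPUS_PER_TASK}"
```

Any subsequent `srun` in the same script then picks up the allocation's CPU count again.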

Here one can find relevant discussion:

https://bugs.schedmd.com/show_bug.cgi?id=15632

I'll attach our cli-filter pre_submit function which works for us.

BR,
Tommi Tervo
CSC


cli_filter.lua

Danny Marc Rotscher

unread,
Jun 30, 2023, 4:00:43 AM6/30/23
to Slurm User Community List
Hi,

thank you very much for your help!

Best wishes,
Danny

> Am 30.06.2023 um 08:41 schrieb Tommi Tervo <tommi...@csc.fi>:
>
> <cli_filter.lua>

Ole Holm Nielsen

unread,
Jun 30, 2023, 5:53:46 AM6/30/23
to slurm...@lists.schedmd.com
On 6/30/23 08:41, Tommi Tervo wrote:
> This was an annoying change:
>
> 22.05.x RELEASE_NOTES:
> -- srun will no longer read in SLURM_CPUS_PER_TASK. This means you will
> implicitly have to specify --cpus-per-task on your srun calls, or set the
> new SRUN_CPUS_PER_TASK env var to accomplish the same thing.
>
> Here one can find relevant discussion:
>
> https://bugs.schedmd.com/show_bug.cgi?id=15632
>
> I'll attach our cli-filter pre_submit function which works for us.

The discussion in bug 15632 concludes that this bug will only be fixed in
23.11. Your workaround looks nice; however, I have not been able to find
any documentation of slurmctld calling any Lua functions named
slurm_cli_pre_submit or slurm_cli_post_submit.

Some very similar functions are documented in
https://slurm.schedmd.com/cli_filter_plugins.html for functions
cli_filter_p_setup_defaults, cli_filter_p_pre_submit, and
cli_filter_p_post_submit.

Can anyone shed light on the relationship between Tommi's
slurm_cli_pre_submit function and the ones defined in the
cli_filter_plugins page?

Thanks,
Ole

Bjørn-Helge Mevik

unread,
Jun 30, 2023, 9:34:38 AM6/30/23
to slurm...@schedmd.com
Hei, Ole! :)

Ole Holm Nielsen <Ole.H....@fysik.dtu.dk> writes:

> Can anyone shed light on the relationship between Tommi's
> slurm_cli_pre_submit function and the ones defined in the
> cli_filter_plugins page?

I think the *_p_* functions are the functions you need to implement if you
write a cli_filter plugin in C. When you write a cli_filter script in Lua,
you write Lua functions called slurm_cli_setup_defaults, slurm_cli_pre_submit,
etc. in the Lua code, and the C code of the Lua plugin itself
implements the *_p_* functions (I believe).

That said, I too found it hard to find any documentation of the Lua
plugin. Eventually, I found an example script in the Slurm source code
(etc/cli_filter.lua.example), which I've taken as a starting point for
my cli filter plugin scripts.
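
To make the shape of such a script concrete, here is a hypothetical sketch modeled on etc/cli_filter.lua.example. The option-table keys ("type", "cpus-per-task") are assumptions on my part; check the example script shipped with your Slurm version for the exact spellings:

```lua
-- Hypothetical cli_filter.lua sketch (not Tommi's attached script).
-- Key names in the options table are assumptions; verify against
-- etc/cli_filter.lua.example in your Slurm source tree.

function slurm_cli_setup_defaults(options, early_pass)
    return slurm.SUCCESS
end

function slurm_cli_pre_submit(options, pack_offset)
    -- Restore pre-22.05 behavior: if srun was launched inside an
    -- allocation without an explicit -c/--cpus-per-task, fall back
    -- to the allocation's SLURM_CPUS_PER_TASK value.
    local env_cpt = os.getenv("SLURM_CPUS_PER_TASK")
    if options["type"] == "srun"
        and (options["cpus-per-task"] == nil
             or options["cpus-per-task"] == "")
        and env_cpt ~= nil then
        options["cpus-per-task"] = env_cpt
    end
    return slurm.SUCCESS
end

function slurm_cli_post_submit(offset, job_id, step_id)
    return slurm.SUCCESS
end
```

Since the script runs client-side in the Slurm commands' embedded Lua interpreter (the slurm.SUCCESS constant is provided by that environment), it cannot be executed standalone.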

--
B/H
