Hi everybody,
I am (along with others) a bit puzzled by a statement in the
documentation about heterogeneous (het) job steps inside het jobs.
The docs
(https://slurm.schedmd.com/archive/slurm-24.11.5/heterogeneous_jobs.html#het_steps)
state:
> You also cannot request heterogeneous steps from within a heterogeneous job. (A)
On a very small Slurm test installation with just two nodes, the
following het job, which requests het steps (it does, doesn't it?),
runs fine:
$ cat hetjob-steps.sh
#!/bin/bash
#SBATCH --mem-per-cpu=2g --nodes=1 --cpus-per-task=8
#SBATCH hetjob
#SBATCH --mem-per-cpu=1g --nodes=1 --cpus-per-task=4
srun -l --cpus-per-task=4 nproc : -l --cpus-per-task=2 nproc
$ cat slurm-125.out
1: 4
2: 2
3: 2
0: 4
The output looks reasonable, so the quote above does not seem to
apply: one can apparently request het steps in a het job. Or am I
wrong?
The intro of that section also gives the impression that het job
steps are a convenience feature that does not require het jobs, but
it does not explicitly rule out using het steps in het jobs:
> Slurm version 20.11 introduces the ability to request heterogeneous job steps from within a non-heterogeneous job allocation. This allows you the flexibility to have different layouts for job steps without requiring the use of heterogeneous jobs, where having separate jobs for the components may be undesirable.
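For comparison, here is how I read that paragraph: a het step inside a
regular job would have no "#SBATCH hetjob" separator, only a
colon-separated srun line. The following is just a sketch that I have
not run; the file name and all resource values are made up for
illustration, and the script is only submitted if sbatch is found.

```shell
# Write a regular (non-het) batch script: note there is no "#SBATCH hetjob" line.
# File name and resource values are illustrative assumptions.
cat > hetstep-in-regular-job.sh <<'EOF'
#!/bin/bash
#SBATCH --nodes=2 --ntasks-per-node=2 --cpus-per-task=4 --mem-per-cpu=1g
# One job step with two components of different layout, separated by " : "
srun -l --nodes=1 --ntasks=1 --cpus-per-task=4 nproc : -l --nodes=1 --ntasks=2 --cpus-per-task=2 nproc
EOF

# Submit only where Slurm is actually available
if command -v sbatch >/dev/null 2>&1; then
    sbatch hetstep-in-regular-job.sh
else
    echo "sbatch not found; wrote hetstep-in-regular-job.sh only"
fi
```

If that reading is right, the only difference from my script above is
the missing "#SBATCH hetjob" separator, so the step components share
one allocation instead of mapping onto separate het job components.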
So what does the initial statement (A) actually mean, then? Did I just
get lucky with an example that is not actually supported?
A short clarification would be helpful.
Thanks in advance
Steffen