On 05/10/2021 09:22, Ole Holm Nielsen wrote:
> What is a "frontend"? Do you mean the slurmctld server?
Yes, sorry. "Frontend" is what we call the node(s) that users submit
jobs from, where slurmctld and slurmdbd also run. We'll probably move
slurmdbd and slurmctld to a dedicated VM in a future upgrade (mainly, I
first have to be sure they don't need IB or access to the gluster fs
that's only available over IB).
Does sbatch give slurmctld just a path to the job script or the whole
script?
>> worked with IDLE (RESUME gives "Invalid node state specified").
> So "scontrol update node=... state=idle" gives the node a correct idle
> state, whereas "state=resume" doesn't? Did you restart the slurmd on
> the compute nodes?
Yes. Complete node reboots, actually. Multiple times. When desperate,
try rebooting.
>> SLURM 20.11.4.
> You wrote that you use Slurm 21.08 from Debian 11. How did 20.11 get
> into the picture?
Good question. I copy-pasted 21.08 from a node after the upgrade, but
now all nodes say 20.11.4. Really confused :-? Just to add to the
confusion, packages.debian.org gives 20.11.7+really20.11.4-2 as the
slurmctld version for bullseye. No mention of 21.08 anywhere, not even
in sid (20.11.8). ARGH! Did I dream it? And if so, how could I c&p it????
Yup. That's why I upgraded the whole cluster at once.
Thanks for the help.