[slurm-users] Slurm version 20.11.8 is now available

Tim Wickberg

Jul 1, 2021, 7:00:53 PM
to slurm-a...@schedmd.com, slurm...@schedmd.com
We are pleased to announce the availability of Slurm version 20.11.8.

This includes a number of minor-to-moderate severity bug fixes.

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support

> * Changes in Slurm 20.11.8
> ==========================
> -- slurmctld - fix erroneous "StepId=CORRUPT" messages in error logs.
> -- Correct the error given when auth plugin fails to pack a credential.
> -- Fix unused-variable compiler warning on FreeBSD in fd_resolve_path().
> -- acct_gather_filesystem/lustre - only emit collection error once per step.
> -- srun - leave SLURM_DIST_UNKNOWN as default for --interactive.
> -- Add GRES environment variables (e.g., CUDA_VISIBLE_DEVICES) into the
> interactive step, the same as is done for the batch step.
> -- Fix various potential deadlocks when altering objects in the database
> dealing with every cluster in the database.
> -- slurmrestd - handle slurmdbd connection failures without segfaulting.
> -- slurmrestd - fix segfault for searches in slurmdb/v0.0.36/jobs.
> -- slurmrestd - remove (non-functioning) users query parameter for
> slurmdb/v0.0.36/jobs from openapi.json.
> -- slurmrestd - fix segfault in slurmrestd db/jobs with numeric queries.
> -- slurmrestd - add argv handling for job/submit endpoint.
> -- srun - fix broken node step allocation in a heterogeneous allocation.
> -- Fail step creation if -n is not multiple of --ntasks-per-gpu.
> -- job_container/tmpfs - Fix slowdown on teardown.
> -- Fix problem with SlurmctldProlog where requeued jobs would never launch.
> -- job_container/tmpfs - Fix issue when restarting slurmd where the namespace
> mount points could disappear.
> -- sacct - avoid truncating JobId at 34 characters.
> -- scancel - fix segfault when --wckey filtering option is used.
> -- select/cons_tres - Fix memory leak.
> -- Prevent file descriptor leak in job_container/tmpfs on slurmd restart.
> -- slurmrestd/dbv0.0.36 - Fix values dumped in job state/current and
> job step state.
> -- slurmrestd/dbv0.0.36 - Correct description for previous state property.
> -- perlapi/libslurmdb - expose tres_req_str to job hash.
> -- scrontab - close and reopen temporary crontab file to deal with editors
> that do not change the original file, but instead write out then rename
> a new file.
> -- sstat - fix linking so that it will work when --without-shared-libslurm
> was used to build Slurm.
> -- Clear allocated cpus for running steps in a job before handling requested
> nodes on new step.
> -- Don't reject a step if not enough nodes are available. Instead, defer the
> step until enough nodes are available to satisfy the request.
> -- Don't reject a step if it requests at least one specific node that is
> already allocated to another step. Instead, defer the step until the
> requested node(s) become available.
> -- slurmrestd - add description for slurmdb/job endpoint.
> -- Better handling of --mem=0.
> -- Ignore DefCpuPerGpu when --cpus-per-task given.
> -- sacct - fix segfault when printing StepId (or when using --long).


Tina Friedrich

Jul 15, 2021, 7:03:37 AM
to slurm...@lists.schedmd.com
Hello,

(unfortunately this came out like a day after I upgraded to 20.11.7)

Is there any more information on this:

-- select/cons_tres - Fix memory leak.

anywhere? I tried to search for it but my search-fu is failing. Is that
a memory leak in slurmctld, slurmd, ...? How is it triggered / how
severe is it? (As in, is this a case of 'you need to urgently update to
20.11.8 if you're running 20.11.7'?)

Tina

On 02/07/2021 00:00, Tim Wickberg wrote:
> We are pleased to announce the availability of Slurm version 20.11.8.
>
> This includes a number of minor-to-moderate severity bug fixes.
>
> Slurm can be downloaded from https://www.schedmd.com/downloads.php .
>
> - Tim
>

--
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk http://www.it.ox.ac.uk
