We are pleased to announce the availability of Slurm release candidate
25.11.0rc1.
To highlight some new features coming in 25.11:
* Added a new "Expedited Requeue" mode for batch jobs. Batch jobs with
--requeue=expedite will automatically requeue on node failure, or if the
batch script returns a non-zero exit code and one or more Epilog scripts
fail. Expedited requeue jobs are eligible to restart immediately, are
treated as the highest priority job in the system, and their previously
allocated set of nodes will be prevented from launching other work.
* Added a new "Mode 3" of operation to Hierarchical Resources. This mode
complements the existing Mode 1 and Mode 2 by summing usage from lower
levels automatically. This can be used, e.g., to implement a
power-capping mode modeling power distribution between the datacenter,
local distribution, and individual racks.
* Added direct support for exporting OpenMetrics (Prometheus) telemetry
from slurmctld. This is accessible on SlurmctldPort on SlurmctldHost by
default, or can be disabled if desired.
* Added an experimental asynchronous-reply mode to slurmctld. If enabled
with "SlurmctldParameters=enable_async_reply", RPC responses are
offloaded to the kernel for further processing, freeing individual
worker threads for new traffic.
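
As a rough illustration of the "summing usage from lower levels" idea behind
Mode 3, here is a minimal Python sketch of the power-capping example. It is a
conceptual model only, not Slurm's implementation or configuration syntax: the
Level class, can_allocate(), allocate(), build_example(), and all names and
wattages are hypothetical, and checking a cap at every level is an assumption
made for the sake of the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Level:
    """One level of a hypothetical power hierarchy (datacenter, PDU, rack)."""
    name: str
    cap_watts: int
    parent: Optional[Level] = field(default=None, repr=False)
    own_usage: int = 0                      # usage charged directly here
    children: list = field(default_factory=list, repr=False)

    def __post_init__(self) -> None:
        # Register with the parent so usage can be summed from below.
        if self.parent is not None:
            self.parent.children.append(self)

    def usage(self) -> int:
        # Summing model: own usage plus the usage of every lower level.
        return self.own_usage + sum(child.usage() for child in self.children)


def can_allocate(level: Level, watts: int) -> bool:
    """True if adding `watts` at `level` keeps it and every ancestor in cap."""
    node: Optional[Level] = level
    while node is not None:
        if node.usage() + watts > node.cap_watts:
            return False
        node = node.parent
    return True


def allocate(level: Level, watts: int) -> bool:
    """Charge `watts` to `level` if no cap in its ancestry is exceeded."""
    if not can_allocate(level, watts):
        return False
    level.own_usage += watts
    return True


def build_example() -> dict:
    """Build a small, made-up hierarchy: one datacenter, one PDU, two racks."""
    datacenter = Level("datacenter", cap_watts=100)
    pdu = Level("pdu-a", cap_watts=60, parent=datacenter)
    rack1 = Level("rack-1", cap_watts=40, parent=pdu)
    rack2 = Level("rack-2", cap_watts=40, parent=pdu)
    return {"datacenter": datacenter, "pdu": pdu, "rack1": rack1, "rack2": rack2}
```

In this model, usage is only ever recorded at the rack level, and the totals
for the local distribution and datacenter levels are derived by summation, so
a request that fits within a rack's own cap can still be refused because the
level above it is full.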
This is the first release candidate of the upcoming 25.11 release
series. It marks the end of development for this release and the
finalization of the RPC and state file formats.
If any issues are identified with this release candidate, please report
them through https://bugs.schedmd.com against the 25.11.x version and we
will address them before the first production 25.11.0 release is made.
Please note that the release candidates are not intended for production use.
A preview of the updated documentation can be found at
https://slurm.schedmd.com/archive/slurm-master/ .
Slurm can be downloaded from https://www.schedmd.com/download-slurm/ .
The changelog for 25.11 can be found here:
https://github.com/SchedMD/slurm/blob/master/CHANGELOG/slurm-25.11.md
--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support