Hi,
Enhancement name: Workload-aware preemption
Enhancement status: Alpha
SIG: SIG Scheduling
k/enhancements repo issue #: 5710
PR #s:
Additional time needed (in calendar days, due end of day AoE): 5
The first PR is already aligned with all approvers (sanposhiho@, macsko@, dom4ha@) and just requires addressing a few remaining details.
The second PR - the API is already approved; only the last integration comments need deeper review, and the PR is also directionally aligned with approvers.
Reason this enhancement is critical for this milestone:
Workload-Aware Scheduling is critical for running AI workloads on Kubernetes and Workload-Aware Preemption is its critical building block.
This feature is important to unblock the graduation of the Workload API in 1.37 and to allow experimentation with the multiple in-progress integrations already happening in the ecosystem.
Risks from adding code late: Low. The feature builds on the Workload API and Gang Scheduling (KEP-4671), which introduced a new "workload-aware" path in the scheduler. Workload-Aware Preemption is not exercised outside that path, which itself is protected by feature gates.
Risks from cutting enhancement: Existing pod-by-pod preemption is fundamentally incompatible with Workload-Aware Scheduling. Consequently, a clear preemption solution is a blocker for gang-scheduling (a heavily requested and awaited feature) and the Workload API itself.
Multiple ecosystem projects (JobSet, TrainJob, KubeRay, LeaderWorkerSet) are already designing & working on integrations with the Workload API to deliver value from gang-scheduling and subsequent Workload-Aware Scheduling features to their users.
Lack of this feature as Alpha in 1.36 may further delay the graduation of the Gang Scheduling feature to Beta in 1.37 and impede overall Kubernetes adoption for AI workloads.
thanks
wojtek tyczynski