Hello,
I’m requesting a code freeze exception for KEP-5729: DRA: ResourceClaim Support for Workloads.
Enhancement name: DRA: ResourceClaim Support for Workloads
Enhancement status: alpha
SIG: Scheduling
k/enhancements repo issue #: #5729
Additional time needed: 3 days (21 March)
Reason this enhancement is critical for this milestone:
Managing multi-host workloads is necessary for AI/ML. This feature builds on workload-aware scheduling to enable lifecycle management of the related resources along with the PodGroup. It is important to understand how these resources will be managed, and getting this to alpha helps us iron out any gaps and additional requirements.
Risks from adding code late:
Low risk. The relevant tests on the PR have been stable. All of the functionality is gated on the DRAWorkloadResourceClaims feature gate.
Risks from cutting enhancement:
This is a key feature for the future of workload-aware scheduling. It is needed to unblock experiments for critical advancements in large-scale AI/ML workload orchestration in Kubernetes.
Thanks,
Jon Huhn