Enhancement name: RestartAllContainers
Enhancement status (alpha/beta/stable): alpha
SIG: sig-node
k/enhancements repo issue #: https://github.com/kubernetes/enhancements/issues/5532
PR #’s: https://github.com/kubernetes/kubernetes/pull/134345
Additional time needed (in calendar days, due end of day AoE): 5 days (until Tuesday)
Reason this enhancement is critical for this milestone: The feature is critical for another proposed enhancement for LLM training jobs: https://docs.google.com/document/d/16zexVooHKPc80F4dVtUjDYK9DOpkVPRNfSv0zRtfFpk/edit?tab=t.0#heading=h.y6xl7juq7465 ; the implementation PR is ready and already has LGTM labels from sig-node, and only API review is still needed.
Risks from adding code late: (to k8s stability, testing, etc.) Low. All changes are behind an alpha-level feature gate that is disabled by default, and the feature has good e2e test coverage.
Risks from cutting enhancement: (partial implementation, critical customer usecase, etc.) The feature is critical for another proposed enhancement for LLM training jobs: https://docs.google.com/document/d/16zexVooHKPc80F4dVtUjDYK9DOpkVPRNfSv0zRtfFpk/edit?tab=t.0#heading=h.y6xl7juq7465

Thanks all!
Yuan Wang