There are two remaining PRs: 112360 extends the feature, and 113360 promotes it to Beta. The review of 112360 is in progress; most of the remarks have been applied and no major concerns have been raised. The 113360 PR is LGTMed but depends on the 112360 PR, so it will need to be rebased; no essential changes are expected.
Enhancement name: Retriable and non-retriable Pod failures for Jobs
Enhancement status (alpha/beta/stable): Beta
SIG: sig-apps (sig-node participating)
k/enhancements repo issue #: 3329
PR #’s: 112360, 113360
Additional time needed (in days): 5
Reason this enhancement is critical for this milestone: To keep the functionality on track for GA
Risks from adding code late: (to k8s stability, testing, etc.) Most of the new code is behind the “PodDisruptionConditions” feature gate, which limits the risk. The 112360 PR includes unit, integration, and node e2e tests providing nearly 100% coverage for the new code. The 113360 PR graduates the feature to Beta and includes e2e tests for the core functionality to lower the risk.
Risks from cutting enhancement: (partial implementation, critical customer usecase, etc.): Cutting it would delay adoption of the feature in OSS, which supports use cases around avoiding unnecessary costs when running batch workloads. Criticality is unknown.