Please review the exception request below. The code review is in progress, and most of the review remarks have already been addressed.
Enhancement name: Retriable and non-retriable Pod failures for Jobs
Enhancement status (alpha/beta/stable): alpha
SIG: sig-apps
k/enhancements repo issue #: 3329
PR #’s: 110959, 111475, 111113
Additional time needed (in days): 5
Reason this enhancement is critical for this milestone: To keep the functionality on track for GA
Risks from adding code late: (to k8s stability, testing, etc.) Most of the new code is behind feature gates, which limits the risk. The cleanup (#111475) is not protected by a feature gate, but it is limited in scope and well tested. The new unit and integration tests provide nearly 100% coverage of the new code. The new tests were also run in a loop more than 100 times to minimize the risk of introducing a flaky test.
Risks from cutting enhancement: (partial implementation, critical customer usecase, etc.) Cutting the enhancement would delay OSS adoption of the feature, which supports use cases around avoiding unnecessary costs when running batch workloads. Criticality is unknown.