Test freeze extension request for KEP-4603 Tune CrashLoopBackOff e2es

Laura Lorenz

Nov 6, 2024, 12:33:22 PM
to releas...@kubernetes.io, kubernetes-sig-node, kubernetes-...@googlegroups.com
Enhancement name: Tune CrashLoopBackOff

Enhancement status (alpha/beta/stable): alpha

SIG: SIG-Node

k/enhancements repo issue #: #4603

PR #’s: #128559 (already in review: #128374, #128356; merged: #128369)

Additional time needed (in days): 5 (to get 3 business days)

Reason this enhancement is critical for this milestone: This is the first alpha to address a long-standing (6 years) and highly upvoted issue in k/k, Kubernetes#57291. This enhancement was already delayed one release after missing the enhancements freeze for 1.31, and with this alpha we get a discrete step forward designed to give us good signal on how to proceed with limited risk.

Risks from adding code late: (to k8s stability, testing, etc.) Low, since a) this is for alpha features that are all behind alpha gates, and b) the target PR for the extension consists of e2e tests.

Risks from cutting enhancement: (partial implementation, critical customer use case, etc.) This e2e test is based on additional tests that went in through another, unrelated PR during this code freeze, but it covers the feature better than the existing e2e framework or the existing unit tests, and is worth including with the other 1.32 changes for this enhancement.

Thanks!
Laura

Antonio Ojea

Nov 6, 2024, 2:12:30 PM
to Laura Lorenz, releas...@kubernetes.io, kubernetes-sig-node, kubernetes-...@googlegroups.com
Please try not to split the PRs into features and e2e; we need to try
to merge both together, for several reasons:

1. We cannot be confident the merged code is OK if the e2e tests are not
running. What happens if the e2e detects a bug after it merges?
2. There is a risk that the e2e tests slip past the code freeze window, which
puts us in a situation we would rather avoid:
2.1 We decide to merge the e2e and we find a bug in the
implementation. Do we fix forward or revert?
2.2 There is no time to finish the e2e. Do we keep the feature
without testing or do we revert?

Tim Allclair

Nov 6, 2024, 4:16:48 PM
to Antonio Ojea, Laura Lorenz, releas...@kubernetes.io, kubernetes-sig-node, kubernetes-...@googlegroups.com
Let me add some context to this request.

The crashloop backoff changes Laura is working on are changes on the order of seconds to the rate of container retries. Our E2E infrastructure is flaky enough on things measured in minutes, so it will be very difficult to get any particularly meaningful test coverage of this in the E2E environment. Instead, unit tests & manual testing will need to be the primary way we verify these changes, so I consider the e2e tests to be nice-to-have (at least for Alpha).

In the context of #128559 specifically, that is just missing coverage for existing features. The related non-test change is just a refactor with no behavioral changes.

As an approver for this feature, I'm OK with separating the E2E tests for this specific case, and also OK with the extension, but this should not set a precedent for other alpha features.


Antonio Ojea

Nov 6, 2024, 4:23:46 PM
to Tim Allclair, Laura Lorenz, releas...@kubernetes.io, kubernetes-sig-node, kubernetes-...@googlegroups.com
Hi Tim,

Thanks for clarifying; this makes sense to me too. Apologies for the
intrusion, I read it differently.

Frederico Muñoz

Nov 11, 2024, 9:10:53 AM
to kubernetes-sig-release
Hello,

Following the comments here and the discussion in Slack, the v1.32 Release Team is APPROVING this Test Freeze exception request. The updated deadline is 19:00 PDT Tuesday, 12th November 2024.

If you need any clarification, please reach out to us in the #sig-release Slack channel.

Thank you,
-- 
Frederico Muñoz
1.32 Release Team Lead