I don't think reducing the timeout led to more flakes. One thought: the flake happens when the Sync routine runs after the rc is deleted and the Run routine also misses the rc delete event. We could add an explicit sleep after setup starts the two routines and see if the flakes are reduced.
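For concreteness, a minimal Go sketch of that suggestion, with `syncLoop`/`runLoop` as stand-ins for the GC's Sync and Run routines (they are placeholders, not the actual test helpers):

```go
package main

import (
	"fmt"
	"time"
)

// syncLoop and runLoop are stand-ins for the GC's Sync and Run routines
// mentioned above; the real test starts the garbage collector's own methods.
func syncLoop(stop <-chan struct{}) { <-stop }
func runLoop(stop <-chan struct{})  { <-stop }

func main() {
	stop := make(chan struct{})
	go syncLoop(stop)
	go runLoop(stop)

	// The suggestion: sleep after setup has started both routines so their
	// watches are established before the test starts deleting RCs.
	time.Sleep(2 * time.Second)

	fmt.Println("proceed with creating and deleting RCs")
	close(stop)
}
```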
I think @caesarxuchao has been chasing this one for a long time; I want to take a look with fresh eyes and see if I can find a root cause. I'll let @caesarxuchao decide which prior issue to re-open so we can cut down on duplicate issues.
Good news is I was able to reproduce the issue with stress (took ~300 iterations to trigger).
@ironcladlou Previously I found two root causes and fixed them, so I closed the previous issues. I'm not sure what's causing this recent flake.
There’s a reflector backing the shared informer for replicationcontrollers, and sometimes (when this bug manifests), the reflector’s watch loop (ListWatch) is returning due to:
watch error: watch event(ok=%!s(bool=true), expectedType: *v1.ReplicationController): {ERROR &Status{ListMeta:ListMeta{SelfLink:,ResourceVersion:,Continue:,},Status:Failure,Message:too old resource version: 123872 (124014),Reason:Gone,Details:nil,Code:410,}}
and
k8s.io/kubernetes/vendor/k8s.io/client-go/informers/factory.go:86: watch of *v1.ReplicationController ended with: too old resource version: 129521 (129614)
When this happens, the reflector doesn’t restart ListWatch again until the full resync period hits. The net effect is the reflector isn't dispatching the rc events to the informer (and thus the GC), although given enough time (the resync period) I suspect things would self-repair.
This is perhaps somehow related to the watch cache. I first theorized that the replicationcontroller watch cache capacity was being reached sometimes, breaking the watch, but my latest debugging doesn't seem to prove it. The default capacity for replicationcontrollers in the test server setup is 100.
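For reference, here is a minimal sketch of what that failure looks like from a watch consumer's point of view; `watchRCs` is a hypothetical helper and the call signatures assume a recent client-go (where `Watch` takes a context), not the vintage used by this test. In the logs above, the failure arrives as an ERROR event carrying a `metav1.Status` with Code 410, and the only recovery is a fresh LIST followed by a new watch from the returned resourceVersion, which is what the reflector does for us:

```go
package watchdemo

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
)

// watchRCs shows how the failure in the logs surfaces to a watch consumer:
// an ERROR event whose object is a metav1.Status with Code 410 ("too old
// resource version"). Recovery requires re-listing and re-watching from the
// fresh resourceVersion.
func watchRCs(ctx context.Context, client kubernetes.Interface, ns, rv string) error {
	w, err := client.CoreV1().ReplicationControllers(ns).Watch(ctx, metav1.ListOptions{ResourceVersion: rv})
	if err != nil {
		return err
	}
	defer w.Stop()

	for event := range w.ResultChan() {
		if event.Type == watch.Error {
			if status, ok := event.Object.(*metav1.Status); ok && status.Code == 410 {
				// Same condition as the "too old resource version ... Code:410"
				// status in the logs above: this watch is dead, re-list.
				return fmt.Errorf("watch expired at rv=%s: %s", rv, status.Message)
			}
			return fmt.Errorf("watch error: %v", event.Object)
		}
		// Added/Modified/Deleted events would be handled here.
	}
	return nil
}
```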
> When this happens, the reflector doesn’t restart ListWatch again until the full resync period hits.
It seems it restarts the ListWatch after r.period, which is always set to time.Second (it's not the resync period).
> It seems it restarts the ListWatch after r.period, which is always set to time.Second (it's not the resync period).
Good catch! That leads me down another path...
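For readers following along, the retry cadence being discussed is roughly the pattern below. This is a paraphrase, not the exact client-go source, and `runReflectorLike` is a made-up name: the list+watch loop is re-run by `wait.Until` every period, and the period in question is one second, so a dropped watch is retried about a second later rather than at the resync interval.

```go
package main

import (
	"fmt"
	"log"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// runReflectorLike re-runs the list+watch function every period until stopCh
// closes, mirroring the ~1s restart cadence described above.
func runReflectorLike(listAndWatch func(stopCh <-chan struct{}) error, stopCh <-chan struct{}) {
	period := time.Second // mirrors r.period in the reflector
	wait.Until(func() {
		if err := listAndWatch(stopCh); err != nil {
			log.Printf("watch ended: %v", err)
		}
	}, period, stopCh)
}

func main() {
	stop := make(chan struct{})
	failing := func(stopCh <-chan struct{}) error {
		return fmt.Errorf("too old resource version: 129521 (129614)")
	}
	go runReflectorLike(failing, stop)
	time.Sleep(3500 * time.Millisecond) // long enough to see a few ~1s retries
	close(stop)
}
```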
This is the flakiest test in the repo:
this pattern seems common:
W1219 16:44:47.799199 23324 reflector.go:341] k8s.io/kubernetes/vendor/k8s.io/client-go/informers/factory.go:86: watch of *v1.ReplicationController ended with: too old resource version: 4291 (4860)

There is no graph_builder.go:588 "GraphBuilder process object" log message for the test.rc.1 replication controller. The only three events we have for test.rc.1:

I1219 16:44:48.199297 23324 graph_builder.go:396] add virtual node.identity: [v1/ReplicationController, namespace: gc-cascading-deletion, name: test.rc.1, uid: ecaed94f-e4db-11e7-88ab-0242ac110002]
I1219 16:44:48.199391 23324 garbagecollector.go:374] processing item [v1/ReplicationController, namespace: gc-cascading-deletion, name: test.rc.1, uid: ecaed94f-e4db-11e7-88ab-0242ac110002]
I1219 16:44:48.209573 23324 garbagecollector.go:411] object [v1/ReplicationController, namespace: gc-cascading-deletion, name: test.rc.1, uid: ecaed94f-e4db-11e7-88ab-0242ac110002]'s doesn't have an owner, continue on next item

Interestingly, the virtual node addition of test.rc.1 is usually within 1 second of the watch dropping, which would be within the window where the watch was down.

Opened #57419 with debugging logs.
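To make those "add virtual node" and "processing item" lines easier to follow, here is a heavily simplified, hypothetical model of that behavior; the real logic lives in graph_builder.go and garbagecollector.go, and these types are illustrative only. When the graph builder processes a pod whose ownerReference points at an RC it has never observed, it records a virtual node for the owner and queues it to attemptToDelete, which then looks the owner up directly instead of waiting for an informer event.

```go
package main

import "fmt"

// ownerRef and node are toy stand-ins for the GC graph's types.
type ownerRef struct{ uid, kind, name string }

type node struct {
	identity ownerRef
	virtual  bool // true until an informer event confirms the object
}

type graph struct {
	uidToNode       map[string]*node
	attemptToDelete []*node // stand-in for the real work queue
}

// processAddEvent models handling an informer "add" for obj: any owner not
// already in the graph gets a virtual node and is queued to attemptToDelete.
func (g *graph) processAddEvent(obj ownerRef, owners []ownerRef) {
	g.uidToNode[obj.uid] = &node{identity: obj}
	for _, owner := range owners {
		if _, ok := g.uidToNode[owner.uid]; !ok {
			// "add virtual node" in the logs: the owner was never seen via watch.
			vn := &node{identity: owner, virtual: true}
			g.uidToNode[owner.uid] = vn
			g.attemptToDelete = append(g.attemptToDelete, vn)
		}
	}
}

func main() {
	g := &graph{uidToNode: map[string]*node{}}
	rc := ownerRef{uid: "ecaed94f", kind: "ReplicationController", name: "test.rc.1"}
	pod := ownerRef{uid: "ecf26de2", kind: "Pod", name: "test.pod.1"}
	g.processAddEvent(pod, []ownerRef{rc})
	fmt.Printf("virtual owner queued for attemptToDelete: %+v\n", g.attemptToDelete[0].identity)
}
```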
I1219 16:44:48.199391 23324 garbagecollector.go:374] processing item [v1/ReplicationController, namespace: gc-cascading-deletion, name: test.rc.1, uid: ecaed94f-e4db-11e7-88ab-0242ac110002]
I1219 16:44:48.209573 23324 garbagecollector.go:411] object [v1/ReplicationController, namespace: gc-cascading-deletion, name: test.rc.1, uid: ecaed94f-e4db-11e7-88ab-0242ac110002]'s doesn't have an owner, continue on next item
I1219 16:44:48.238847 23324 graph_builder.go:588] GraphBuilder process object: v1/Pod, namespace gc-cascading-deletion, name test.pod.3, uid ecf26de2-e4db-11e7-88ab-0242ac110002, event type add
I1219 16:44:48.238913 23324 graph_builder.go:396] add virtual node.identity: [v1/ReplicationController, namespace: gc-cascading-deletion, name: test.rc.2, uid: ecaf8bc0-e4db-11e7-88ab-0242ac110002]
I1219 16:44:48.238970 23324 garbagecollector.go:374] processing item [v1/ReplicationController, namespace: gc-cascading-deletion, name: test.rc.2, uid: ecaf8bc0-e4db-11e7-88ab-0242ac110002]
I1219 16:44:48.241206 23324 garbagecollector.go:411] object [v1/ReplicationController, namespace: gc-cascading-deletion, name: test.rc.2, uid: ecaf8bc0-e4db-11e7-88ab-0242ac110002]'s doesn't have an owner, continue on next item
I1219 16:44:48.279013 23324 graph_builder.go:588] GraphBuilder process object: v1/Pod, namespace gc-cascading-deletion, name test.pod.2, uid ecf876df-e4db-11e7-88ab-0242ac110002, event type add
I1219 16:44:48.657159 23324 garbagecollector.go:183] syncing garbage collector with updated resources from discovery: map[{ v1 events}:{} {rbac.authorization.k8s.io v1 roles}:{} {rbac.authorization.k8s.io v1 clusterrolebindings}:{} {storage.k8s.io v1alpha1 volumeattachments}:{} { v1 replicationcontrollers}:{} { v1 serviceaccounts}:{} { v1 limitranges}:{} {extensions v1beta1 ingresses}:{} {extensions v1beta1 daemonsets}:{} {apps v1 controllerrevisions}:{} {policy v1beta1 poddisruptionbudgets}:{} {settings.k8s.io v1alpha1 podpresets}:{} { v1 resourcequotas}:{} {apps v1 daemonsets}:{} {apps v1 replicasets}:{} {certificates.k8s.io v1beta1 certificatesigningrequests}:{} {admissionregistration.k8s.io v1beta1 validatingwebhookconfigurations}:{} { v1 configmaps}:{} {apps v1 statefulsets}:{} {admissionregistration.k8s.io v1beta1 mutatingwebhookconfigurations}:{} {scheduling.k8s.io v1alpha1 priorityclasses}:{} { v1 pods}:{} {extensions v1beta1 networkpolicies}:{} {batch v1beta1 cronjobs}:{} {networking.k8s.io v1 networkpolicies}:{} { v1 persistentvolumes}:{} { v1 secrets}:{} { v1 namespaces}:{} {extensions v1beta1 replicasets}:{} {events.k8s.io v1beta1 events}:{} {batch v1 jobs}:{} {rbac.authorization.k8s.io v1 rolebindings}:{} {admissionregistration.k8s.io v1alpha1 initializerconfigurations}:{} {extensions v1beta1 deployments}:{} {apps v1 deployments}:{} {storage.k8s.io v1 storageclasses}:{} { v1 podtemplates}:{} { v1 nodes}:{} { v1 persistentvolumeclaims}:{} { v1 services}:{} {apiregistration.k8s.io v1beta1 apiservices}:{} {extensions v1beta1 podsecuritypolicies}:{} {autoscaling v1 horizontalpodautoscalers}:{} {rbac.authorization.k8s.io v1 clusterroles}:{} {apiextensions.k8s.io v1beta1 customresourcedefinitions}:{} { v1 endpoints}:{}]
I1219 16:44:48.837813 23324 graph_builder.go:588] GraphBuilder process object: v1/ReplicationController, namespace gc-cascading-deletion, name test.rc.2, uid ecaf8bc0-e4db-11e7-88ab-0242ac110002, event type add
I1219 16:44:49.316790 23324 graph_builder.go:263] synced monitors; added 0, kept 43, removed 0
I1219 16:44:49.316840 23324 graph_builder.go:295] started 0 new monitors, 43 currently running
I1219 16:44:49.316860 23324 controller_utils.go:1019] Waiting for caches to sync for garbage collector controller
I1219 16:44:49.417177 23324 controller_utils.go:1026] Caches are synced for garbage collector controller
I1219 16:44:49.417222 23324 garbagecollector.go:220] synced garbage collector
I1219 16:44:54.447225 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:44:59.473854 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:45:04.500785 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:45:09.530108 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:45:14.562200 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:45:19.590412 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:45:24.611082 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:45:29.640088 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:45:34.663912 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:45:39.691322 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:45:44.717282 23324 garbagecollector.go:178] no resource updates from discovery, skipping garbage collector sync
I1219 16:45:48.373161 23324 garbagecollector.go:154] Shutting down garbage collector controller
I1219 16:45:48.377971 23324 graph_builder.go:348] stopped 43 of 43 monitors
I1219 16:45:48.378008 23324 graph_builder.go:349] GraphBuilder stopping
--- FAIL: TestCascadingDeletion (68.31s)
testserver.go:100: Starting kube-apiserver on port 45589...
testserver.go:112: Waiting for /healthz to be ok...
garbage_collector_test.go:368: timed out waiting for the condition
</details>
Finally tracked it down. I can consistently recreate this by artificially triggering the "too old resource version" error on first replication controller watch establishment. That results in the following flow:
| time | pod informer | rc informer | test driver | graph builder |
| ---- | ------------ | ----------- | ----------- | ------------- |
| 0 | list pods, get [] at rv=10 | list rcs, get [] at rv=10 | | |
| 1 | mark synced | mark synced | | |
| 2 | | | create rc.test.1 at rv=11 | |
| 3 | | | create pod.test at rv=12 | |
| 4 | watch at rv=10 | watch at rv=10, get "too old" error | | |
| 5 | observe pod.test create at rv=12 | | | |
| 6 | | | | process pod.test create at rv=12 |
| 7 | | | | add virtual node for rc.test.1 |
| 8 | | | | add node for pod.test |
| 9 | | | | attemptToDelete for rc.test.1, lookup finds the object |
| 10 | | | delete rc at rv=13 | |
| 11 | | after 1 second, list rcs, get [] at rv=13 | | |
| 12 | | watch at rv=13 | | |
That leaves us with a node in the GC graph for rc.test.1 that was confirmed by direct lookup in attemptToDelete, but never observed via informer events. Because rc.test.1 was created AND deleted during the one second watch outage, the informer will never deliver a deletion event for that object.
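A self-contained toy model of why no delete event can ever arrive in that situation (this is not client-go's actual DeltaFIFO code, just the relist semantics): on relist, deletions are only synthesized for objects the informer's store already contains, and rc.test.1 was created and deleted entirely inside the watch outage, so it was never in the store.

```go
package main

import "fmt"

// store is a toy local cache for an informer.
type store map[string]struct{}

// replace mimics a relist: emit synthetic deletes only for keys the store
// already knows about that are absent from the new list, then adopt the new
// list as the store contents.
func (s store) replace(newList []string, onDelete func(key string)) {
	listed := map[string]struct{}{}
	for _, k := range newList {
		listed[k] = struct{}{}
	}
	for k := range s {
		if _, ok := listed[k]; !ok {
			onDelete(k)
		}
		delete(s, k)
	}
	for k := range listed {
		s[k] = struct{}{}
	}
}

func main() {
	s := store{} // rc informer's store: it listed [] at rv=10
	// The relist at rv=13 also returns []: rc.test.1 came and went unseen,
	// so onDelete never fires for it.
	s.replace(nil, func(key string) { fmt.Println("delete event for", key) })
	fmt.Println("no delete event was ever delivered for rc.test.1")
}
```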
@liggitt If we explicitly sleep more than 1 second, the probability of this can be reduced.
> @liggitt If we explicitly sleep more than 1 second, the probability of this can be reduced.
Where are you thinking we would add the sleep? I am more interested in tracking the virtual nodes that have not been observed via informer more rigorously, so they don't get orphaned in the graph, rather than trying to time our way out of this bug.
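One possible shape of that idea, purely as an illustration (the actual change is whatever landed in #57503, and these names are hypothetical): keep virtual nodes flagged until an informer event confirms them, and let a re-verification pass turn a failed live lookup into a synthetic delete so the node cannot linger in the graph.

```go
package main

import "fmt"

// virtualNode is a hypothetical marker for a graph node that was added via an
// ownerReference rather than observed through an informer event.
type virtualNode struct {
	name     string
	observed bool // flipped when a real informer event for this object arrives
}

// reverify is a stand-in for the GC re-checking an unobserved virtual node:
// liveLookup would be a direct API GET in a real controller.
func reverify(n *virtualNode, liveLookup func(name string) (found bool), markDeleted func(name string)) {
	if n.observed {
		return // the informer caught up; nothing to do
	}
	if !liveLookup(n.name) {
		// The object is gone and the informer will never tell us: synthesize
		// the delete ourselves instead of waiting for an event that cannot come.
		markDeleted(n.name)
	}
}

func main() {
	n := &virtualNode{name: "test.rc.1"}
	lookup := func(string) bool { return false } // rc.test.1 was already deleted
	reverify(n, lookup, func(name string) { fmt.Println("removing orphaned virtual node", name) })
}
```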
Opened #57503 with fix