Should we pay attention to the performance impact due to the extra checks in the GC controller ?
Looking under pkg/controller/garbagecollector , I don't see any benchmark test.
—
You are receiving this because you are on a team that was mentioned.
Reply to this email directly, view it on GitHub, or unsubscribe.
In GarbageCollector#attemptToDeleteItem :
for _, dep := range deps {
if dep.isDeletingDependents() {
...
if _, err := gc.patch(item, patch, gc.unblockOwnerReferencesJSONMergePatch); err != nil {
return err
}
break
I think we should continue with other dependents performing the unblocking (instead of breaking out of the loop).
/cc ironpan
Bug triage for release 1.17 here. Since this issue has been tagged for the milestone, we would like to check it's status. Code freeze is on Nov. 19. Will this issue be resolved before then? If not, we should move it out of the milestone.
@liggitt Could you confirm the status of this issue with regard to v1.17 and the recent PR referencing it? Should we have that PR in the milestone?
no, that PR is not planned to merge as is. the changes required to fix this issue are too risky for this point in the release, will defer to 1.18
/milestone v1.18
Hello @liggitt !
Bug Triage team here for the 1.18 release. This is a friendly reminder that code freeze is scheduled for 5 March. Is this issue still intended for milestone 1.18, or should it extend to the next one? Thanks in advance!
My team was just bitten by an otherwise-valid ownerRef from a namespace-scoped object to an object in a different namespace. This caused some things to get deleted that definitely should not have been deleted leading to all kinds of headaches. It's bad on us for creating this invalid ownerReference in one of our controllers, but it's worse on Kubernetes for letting us set it and then nuking the owner during GC just because the owned thing has a cross-namespace ownerRef.
This lets any user elevate their view permissions on an resource to delete permissions just by creating any other resource with careful ownerReferences and waiting on GC.
nuking the owner during GC
To be clear, an ownerRef does not cause deletion of the owner
I agree this is important to fix. I am looking into potential mitigations to make behavior in the presence of these ownerRefs more deterministic
Can the apiserver block creation of these invalid cross-namespace ownerReferences?
Can the apiserver block creation of these invalid cross-namespace ownerReferences?
No, apiserver doesn't enforce references. For better or worse, but I don't see us starting now.
Hello @liggitt !
Bug Triage team here for the 1.18 release. This is a friendly reminder that code freeze is scheduled for 5 March. Is this issue still intended for milestone 1.18, or should it extend to the next one? Thanks in advance!
—
As we are now in code freeze for 1.18, I am moving this to the next milestone.
/milestone v1.19
We also hit this one. A user was copying some resources from a namespace to another, keeping the owner reference. When the controller manager restarted the garbage collector also deleted some PVCs in the original namespace 😕
No, apiserver doesn't enforce references. For better or worse, but I don't see us starting now.
Any particular concern about rejecting creation/update of an object when the ownerReference does not exist in the namespace or at the cluster level ?
I agree this is important to fix. I am looking into potential mitigations to make behavior in the presence of these ownerRefs more deterministic
Could you share your thoughts ? At first I would say that a workaround would be to not only use the UID in the absentOwnerCache
but also the namespace (for namespace scoped resources ofc) . It doesn't feel like a "clean" solution but it would prevent this issue ?
Hello, @jtslear here from Bug Triage! This issue has not been updated for a long time, so I'd like to check on the status. The code freeze is starting June 25th and while there is still plenty of time, we want to ensure that each PR has a chance to be merged on time.
As the issue is tagged for 1.19, is it still planned for this release?
Any particular concern about rejecting creation/update of an object when the ownerReference does not exist in the namespace or at the cluster level ?
Yes, it's much more complicated than it sounds like, e.g. the owning object might not exist in the apiserver, it might be in e.g. a different aggregated apiserver, which the apiserver handling the request doesn't know about.
@kubernetes/sig-api-machinery-bugs
I haven't seen a followup on this issue for awhile. Is this still planned for the v1.19 release?
Could you share your thoughts ? At first I would say that a workaround would be to not only use the UID in the absentOwnerCache but also the namespace (for namespace scoped resources ofc) . It doesn't feel like a "clean" solution but it would prevent this issue ?
Something like that, probably the namespace+group+kind+uid coordinates. I had a quick proof of concept around this but hadn't satisfied myself that it behaved well in all conditions.
Based on #65200 (comment), I see that we are not trying to validate the ownerReference during create/update, but trying to fix cache of gc to make its behavior deterministic, letting gc immediately delete the object if its ownerRef is pointing at wrong namespace/group/kind.
@liggitt can you share the corner cases when you are back?
Due to reaching the code freeze deadline, I'm going to bump this to the next release.
/milestone v1.20
I'm working on a fix for this for the 1.20 timeframe
👋🏽 Hey @liggitt ! I'm from the Bug Triage team. This issue has not been updated for a while, so I'd like to check on the status. The code freeze is starting November 12th (about 3 weeks from now) and while there is still plenty of time, we want to ensure that each PR has a chance to be merged on time.
As the issue is tagged for 1.20, is it still planned for this release?
Yes, work on this is in progress for 1.20
Fixed in 1.20 by #92743