One question on adding label selector based indexedTrigger at watch cache


Jiang, Yunhong

Dec 16, 2021, 2:55:46 PM
to kubernetes-sig...@googlegroups.com, woj...@google.com

Hi, Wojciech,

            We have a use case where Pods in a DaemonSet watch a resource based on a label selector. We want to utilize the indexedTrigger

( https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher.go#L265 ) to reduce the number of watchers that have to process each event.

            However, after checking the code, it seems this feature currently only supports field selectors, since https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher.go#L471 only checks
pred.Field. Is that understanding correct? If yes, is there any particular reason label selectors aren't supported yet? We are considering adding label selector support, but want to make sure it's OK to do it that way.

            BTW, for the code at https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher.go#L469, why do we need to loop through pred.IndexFields? Per my understanding, pred.IndexFields decides how the store is indexed, but has nothing to do with the watch trigger, right? For example, a client could LIST with a field selector on field A while it WATCHes with a field selector on field B; then we could set IndexFields to field A to reduce the list cost, and set the indexedTrigger to field B to reduce the watch cost. Am I right? Of course, a list-watch will normally use the same selector for both list and watch, but that isn't mandatory, right?
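For reference, a rough sketch of the check in question, assuming the current shape of storage.SelectionPredicate (this is a paraphrase of the cacher logic, not the upstream code verbatim): the trigger value for a new watcher is taken only from the field selector, via fields.Selector.RequiresExactMatch, and the label selector is never consulted.

    package sketch

    import (
        "k8s.io/apimachinery/pkg/fields"
        "k8s.io/apimachinery/pkg/labels"
    )

    // selectionPredicate mirrors the relevant fields of
    // k8s.io/apiserver/pkg/storage.SelectionPredicate.
    type selectionPredicate struct {
        Label       labels.Selector
        Field       fields.Selector
        IndexFields []string
    }

    // watcherTriggerValue paraphrases how the cacher computes the trigger value
    // for a new watcher: it walks pred.IndexFields looking for the configured
    // trigger field, and only asks the *field* selector whether it pins that
    // field to an exact value. pred.Label is never consulted, which is why a
    // label-selector-only watch cannot benefit from the indexedTrigger today.
    func watcherTriggerValue(triggerFieldName string, pred selectionPredicate) (string, bool) {
        for _, field := range pred.IndexFields {
            if field != triggerFieldName {
                continue
            }
            if v, ok := pred.Field.RequiresExactMatch(field); ok {
                return v, true
            }
        }
        return "", false
    }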

 

Thanks

--jyh

 

 

Wojciech Tyczyński

Jan 5, 2022, 9:32:52 AM
to Jiang, Yunhong, kubernetes-sig...@googlegroups.com
On Thu, Dec 16, 2021 at 8:55 PM Jiang, Yunhong <yunh...@ebay.com> wrote:

Hi, Wojciech,

            We have a use case where Pods in a DaemonSet watch a resource based on a label selector. We want to utilize the indexedTrigger

( https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher.go#L265 ) to reduce the number of watchers that have to process each event.

            However, after checking the code, it seems this feature currently only supports field selectors, since https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher.go#L471 only checks
pred.Field. Is that understanding correct? If yes, is there any particular reason label selectors aren't supported yet? We are considering adding label selector support, but want to make sure it's OK to do it that way.


The lack of a good use case is the reason.
We consciously didn't expose it to users and kept it hardcoded inside kube-apiserver, because each new index grows kube-apiserver memory.
Users shouldn't be able to do that easily.
So it's not supposed to be a generic indexing mechanism. It was an optimization targeted purely at specific use cases (that core k8s components have).
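For context, this hardcoding looks roughly like the following in a resource's REST storage (paraphrased from the pod registry's spec.nodeName trigger; the real code uses the internal API types and sets many more Store fields):

    package podstorage

    import (
        v1 "k8s.io/api/core/v1"
        "k8s.io/apimachinery/pkg/runtime"
        genericregistry "k8s.io/apiserver/pkg/registry/generic/registry"
        "k8s.io/apiserver/pkg/storage"
    )

    // nodeNameTriggerFunc extracts the trigger value (the bound node name)
    // from a pod object.
    func nodeNameTriggerFunc(obj runtime.Object) string {
        return obj.(*v1.Pod).Spec.NodeName
    }

    // newPodStore shows, schematically, where the trigger is declared: in the
    // resource's REST storage inside kube-apiserver, with no user-facing knob.
    func newPodStore() *genericregistry.Store {
        return &genericregistry.Store{
            // ...strategy, NewFunc, and the other required fields elided...
            TriggerFunc: map[string]storage.IndexerFunc{
                "spec.nodeName": nodeNameTriggerFunc,
            },
        }
    }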


            BTW, for the code at https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher.go#L469, why do we need to loop through pred.IndexFields? Per my understanding, pred.IndexFields decides how the store is indexed, but has nothing to do with the watch trigger, right? For example, a client could LIST with a field selector on field A while it WATCHes with a field selector on field B; then we could set IndexFields to field A to reduce the list cost, and set the indexedTrigger to field B to reduce the watch cost. Am I right? Of course, a list-watch will normally use the same selector for both list and watch, but that isn't mandatory, right?


Technically speaking - you're correct.
But as above - this wasn't supposed to be a generic indexing mechanism. We're optimizing just a predefined set of specific use cases of core components,
which actually do list+watch on the same selectors. And IIRC the implementation assumes they match in some place(s).
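As an illustration of that pattern, a minimal client-go sketch where LIST and WATCH share the same field selector (the node name and the use of an in-cluster config are just placeholders for this example):

    package main

    import (
        v1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/fields"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/rest"
        "k8s.io/client-go/tools/cache"
    )

    func main() {
        cfg, err := rest.InClusterConfig()
        if err != nil {
            panic(err)
        }
        client := kubernetes.NewForConfigOrDie(cfg)

        // One ListWatch: both the LIST and the WATCH requests carry the same
        // field selector, which is the pattern the trigger/index code assumes.
        lw := cache.NewListWatchFromClient(
            client.CoreV1().RESTClient(),
            "pods",
            metav1.NamespaceAll,
            fields.OneTermEqualSelector("spec.nodeName", "node-1"), // hypothetical node name
        )

        informer := cache.NewSharedInformer(lw, &v1.Pod{}, 0)
        stop := make(chan struct{})
        defer close(stop)
        informer.Run(stop)
    }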
 

 

Thanks

--jyh

 

 

Jiang, Yunhong

Jan 5, 2022, 1:54:26 PM
to Wojciech Tyczyński, kubernetes-sig...@googlegroups.com

Thanks for the reply.

“The lack of a good use case” is a fair reason. Otherwise, if we added the label selector index trigger and there were no use case in the core k8s components, it would not be easy to verify it.

Do you think we should change our data model to use a field index (changing the data model is always a challenge)? Or do you think it would be OK to have a label index trigger in upstream k8s?

Thanks

-jyh

 

From: Wojciech Tyczyński <woj...@google.com>
Date: Wednesday, January 5, 2022 at 6:32 AM
To: Jiang, Yunhong <yunh...@ebay.com>
Cc: kubernetes-sig...@googlegroups.com <kubernetes-sig...@googlegroups.com>
Subject: Re: One question on adding label selector based indexedTrigger at watch cache


Wojciech Tyczyński

Jan 10, 2022, 5:10:50 AM
to Jiang, Yunhong, kubernetes-sig...@googlegroups.com
On Wed, Jan 5, 2022 at 7:54 PM Jiang, Yunhong <yunh...@ebay.com> wrote:

Thanks for the reply.

“The lack of a good use case” is a fair reason. Otherwise, if we added the label selector index trigger and there were no use case in the core k8s components, it would not be easy to verify it.

Do you think we should change our data model to use a field index (changing the data model is always a challenge)? Or do you think it would be OK to have a label index trigger in upstream k8s?


I'm somewhat reluctant to add a label index upstream.
You mentioned one reason above - without a real usage, it would not be hard to break it without us learning about that.
The second reason is that once it's added, it will be tempting to add some real usage for it. And we don't want to start accumulating more and more indexes.


Given that adding support for a label selector seems to be enough for you, I guess you're running a fork of k8s anyway? [Otherwise you wouldn't be able to configure it for your use case anyway.]
If you're running your own fork, I think the first thing to try would be to implement it and maintain it in your fork - the patch should be relatively small, I think.
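For reference, such a fork patch would plausibly amount to letting an exact-match label selector feed the trigger value, alongside the existing pred.Field check. A minimal sketch, with illustrative (non-upstream) names:

    package sketch

    import "k8s.io/apimachinery/pkg/labels"

    // labelTriggerValue is a hypothetical helper a fork might add next to the
    // existing field-selector check: if the watcher's label selector pins the
    // trigger label to a single value, that value can drive the indexedTrigger.
    func labelTriggerValue(triggerLabelName string, indexLabels []string, sel labels.Selector) (string, bool) {
        for _, l := range indexLabels {
            if l != triggerLabelName {
                continue
            }
            // labels.Selector exposes the same RequiresExactMatch helper that
            // the field-selector path uses today.
            if value, ok := sel.RequiresExactMatch(l); ok {
                return value, true
            }
        }
        return "", false
    }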

Thanks,
wojtek