Logging and naming of components

128 views
Skip to first unread message

Filip Krepinsky

unread,
Jan 25, 2023, 7:47:23 PM1/25/23
to d...@kubernetes.io

Hi,

While reviewing the kube-controller-manager we noticed that there are many inconsistencies in the naming of the controllers. This is becoming more important with the current switch to contextual logging as each component can specify a logging prefix.

There are 4 places for each controller that are important from a user's perspective

- controller initializer name [1]: can be used when enabling/disabling controllers with the --controllers flag
- client (service account) name [2], [3]: useful when inspecting audit logs.
- event source name: useful for identifying the relevant controller that emitted the event
- logs: messages are currently only identified by a file where they originated, which does not always correspond to the controller.

There are inconsistencies between these places for some controllers. And inconsistencies between the controllers.

I have made a spreadsheet to summarize these for a bigger picture [4].

I think it is helpful for users to have a consistent identifier for each controller across all the possible mentions. It can help the user to search for the same string in audit logs, events and logs.

We have been discussing this with Patrick and Maciej in #113464 PR [5] and would like to get additional opinions on this.

Currently the best candidate for a logging component prefix that would make a lot of sense, is to use the same name as the client (SA) name. This would at least help to correlate the logs with the audit logs.

However, there are a few problems with these:

- "-" not applied consistently between words
- "-controller" suffix not applied consistently
- some names ("node-controller", "expand-controller") are too generic
- some controllers use the same SA:
    - csrsigning, csrapproving and csrcleaner  with "certificate-controller"
    - nodeipam, nodelifecycle and cloud-node-lifecycle with "node-controller"
 
I have the following questions:

1. Is it feasible to change the client (SA) names to achieve better naming consistency? I suppose we would have to put RBAC migration into place, so it might not be worth it? Nevertheless, this group of names is the most consistent and would not require that many changes.

2. The controller initializers are the most inconsistent. Is there a way to rename them? Or is the risk of breaking people here too big?

3. Can we change the event source to be the same as the logger name? This would mostly fix the inconsistent use of "_" instead of "-". And would help in a few other places, such as disruption-controller emitting events with controllermanager as a source.

To reiterate, with the introduction of logging prefixes, it would be great to standardize this somehow, because the change in a logging format is a big area and will affect a lot of people.

Any feedback would be appreciated!

Filip


1. https://github.com/kubernetes/kubernetes/blob/1e3946ce9dcb745771b03845d4afd97caa7c7737/cmd/kube-controller-manager/app/controllermanager.go#L434
2. https://github.com/kubernetes/kubernetes/blob/c865a641d3d7ed85723ef2f37f3756d52d8f5da3/cmd/kube-controller-manager/app/core.go#L161
3. https://github.com/kubernetes/kubernetes/blob/master/plugin/pkg/auth/authorizer/rbac/bootstrappolicy/controller_policy.go#L66
4. https://docs.google.com/spreadsheets/d/1s0Y3ruLDUWR8k0KyMLlY-5gm09TDiOjONFsMqs0-Fjw
5. https://github.com/kubernetes/kubernetes/pull/113464#discussion_r1065155841

Jordan Liggitt

unread,
Jan 25, 2023, 8:10:48 PM1/25/23
to fkre...@redhat.com, d...@kubernetes.io
Both the service account, initializer name, and event source used for controller loops are admin or end-user visible… I'd be pretty hesitant to change them for the purpose of contextual logging.

If we were going to change something, of those, the event source seems like the lowest-impact one to change, since it would not impact command-line invocations or audit/auth configurations, and events are already ephemeral, but I don't have a good sense for whether anyone has monitoring built around events emitted by specific controllers.

If we want each controller loop to have a log prefix, the initializer seems like it might be the right thing to use, since every loop is guaranteed to have a unique initializer name. 



--
You received this message because you are subscribed to the Google Groups "dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev+uns...@kubernetes.io.
To view this discussion on the web visit https://groups.google.com/a/kubernetes.io/d/msgid/dev/CAEp5ocj0nKsD_AANST2ReuAyrsxBBEBn3Ee08R_x10oOL1r5Kg%40mail.gmail.com.

Tim Hockin

unread,
Jan 25, 2023, 8:16:29 PM1/25/23
to lig...@google.com, fkre...@redhat.com, d...@kubernetes.io
Can we add `fka` fields to events and aliases to initializers so
`--controllers` can take either old names or new, but only lists new
ones in docs and doesn't double-dip on startup?

I agree that these names are ungood - if we invest in a mechanism to
rename, it gives us freedom over time.
> To view this discussion on the web visit https://groups.google.com/a/kubernetes.io/d/msgid/dev/CAMBP-pJfvLgwMHurQcpeZagr2xGZmN__w%2BE%2B0_Rf%2BCssDrfniA%40mail.gmail.com.

Jordan Liggitt

unread,
Jan 25, 2023, 8:21:49 PM1/25/23
to Tim Hockin, d...@kubernetes.io, fkre...@redhat.com
Aliasing initializers to preserve existing invocations would actually be pretty easy.

Not sure an "fka" field in Event objects would be worth it… anyone aware of that and willing to adapt to inspect it could just as easily adapt to use the new controller event source name. 

Maybe a bigger question is what the stability guarantees are for prefixes and keys in contextual/structured logging. Is this a new API surface that people are going to integrate with and depend on us to not change?

Tim Hockin

unread,
Jan 25, 2023, 9:19:17 PM1/25/23
to Jordan Liggitt, dev, fkre...@redhat.com
I think we continue to assert that logs are not an API with strong guarantees. 

Jordan Liggitt

unread,
Jan 25, 2023, 9:42:26 PM1/25/23
to Tim Hockin, dev, fkre...@redhat.com
That would be my inclination as well. That posture would be good to capture explicitly in compatibility and API guidelines / docs.

Davanum Srinivas

unread,
Jan 25, 2023, 9:47:04 PM1/25/23
to tho...@google.com, Jordan Liggitt, dev, fkre...@redhat.com
"I think we continue to assert that logs are not an API with strong guarantees." 

+1000

Patrick Ohly

unread,
Jan 26, 2023, 3:29:47 AM1/26/23
to tho...@google.com, Jordan Liggitt, dev, fkre...@redhat.com
"'Tim Hockin' via dev" <d...@kubernetes.io> writes:
> I think we continue to assert that logs are not an API with strong
> guarantees.

I agree. That means we can choose some log prefix and change it at any
time.

If some of the other names get changed, it shouldn't be because of
contextual logging but rather to reduce the overall
inconsistencies. Trying to choose a log prefix for each controller just
made it more obvious that these inconsistencies exist.

--
Best Regards

Patrick Ohly
Cloud Software Architect

Filip Krepinsky

unread,
Jan 26, 2023, 9:08:42 AM1/26/23
to patric...@intel.com, tho...@google.com, Jordan Liggitt, dev
--
You received this message because you are subscribed to the Google Groups "dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev+uns...@kubernetes.io.
On Thu, Jan 26, 2023 at 3:47 AM Davanum Srinivas <dav...@gmail.com> wrote:
"I think we continue to assert that logs are not an API with strong guarantees." 

+1000

On Wed, Jan 25, 2023 at 9:19 PM 'Tim Hockin' via dev <d...@kubernetes.io> wrote:
I think we continue to assert that logs are not an API with strong guarantees. 


Aliasing sounds like a nice idea. We would have 2 consistent places that cleanly identify the controllers to start with. And we can think about the other places later.

I have updated the spreadsheet how these would look and highlighted the differences. I have two issues now:

1. I have not put a hyphen between kubernetes resources as people are used to seeing them together in the CamelCase. For example, daemonset, cronjob, resourcequota, persistentvolume, serviceaccount. It also helps to minimize the difference between the event source names.

2. I am not sure if we should use shortcuts for the long names that are very commonly used. For example, persistentvolumeclaim (pvc), persistentvolume (pv), certificatesigningrequest (csr).  Full names would help the consistency and be more beginner friendly. Shortcuts would help to bring down the verbosity a bit and be more advanced friendly.

Does anyone want to give the aliasing change a go? Or can I look into it?


On Wed, Jan 25, 2023, 5:21 PM Jordan Liggitt <lig...@google.com> wrote:
Aliasing initializers to preserve existing invocations would actually be pretty easy.

Not sure an "fka" field in Event objects would be worth it… anyone aware of that and willing to adapt to inspect it could just as easily adapt to use the new controller event source name. 

Maybe a bigger question is what the stability guarantees are for prefixes and keys in contextual/structured logging. Is this a new API surface that people are going to integrate with and depend on us to not change?


On Wed, Jan 25, 2023 at 8:16 PM Tim Hockin <tho...@google.com> wrote:
Can we add `fka` fields to events and aliases to initializers so
`--controllers` can take either old names or new, but only lists new
ones in docs and doesn't double-dip on startup?

I agree that these names are ungood - if we invest in a mechanism to
rename, it gives us freedom over time.

On Wed, Jan 25, 2023 at 5:10 PM 'Jordan Liggitt' via dev
<d...@kubernetes.io> wrote:
>
> Both the service account, initializer name, and event source used for controller loops are admin or end-user visible… I'd be pretty hesitant to change them for the purpose of contextual logging.
>
> If we were going to change something, of those, the event source seems like the lowest-impact one to change, since it would not impact command-line invocations or audit/auth configurations, and events are already ephemeral, but I don't have a good sense for whether anyone has monitoring built around events emitted by specific controllers.

Not sure about the monitoring, but code search did not show much usage of this: https://sourcegraph.com/search?q=context:global+Source.Component+%3D%3D&patternType=standard&sm=1&groupBy=repo

For the relevant controllers

Btw, I saw that for example cluster-autoscaler uses the events as an API and some people are using it: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-events-are-emitted-by-ca

Since we cannot be sure how much people rely on these for monitoring, can we make an announcement about the source name change or put the change behind a feature gate to notify people that this is happening? Would that be enough or is there another option?

FYI, some of the controllers do not use events and some of them already have correct names The change would affect only these event sources:

controllermanager (disruption)
node-controller (node-lifecycle)
route_controller
persistentvolume-controller
volume_expand
root_ca_cert_publisher
ephemeral_volume
resource_claim


Tim Hockin

unread,
Jan 26, 2023, 4:22:03 PM1/26/23
to Filip Krepinsky, patric...@intel.com, Jordan Liggitt, dev
On Thu, Jan 26, 2023 at 6:08 AM Filip Krepinsky <fkre...@redhat.com> wrote:
>
>
>
> On Thu, Jan 26, 2023 at 9:29 AM Patrick Ohly <patric...@intel.com> wrote:
>>
>> "'Tim Hockin' via dev" <d...@kubernetes.io> writes:
>> > I think we continue to assert that logs are not an API with strong
>> > guarantees.
>>
>> I agree. That means we can choose some log prefix and change it at any
>> time.

Just for clarity: We should acknowledge that changing log messages
and keys is not zero-cost, and seek to avoid doing it. But if that's
the change that makes sense, it is not forbidden.

> 1. I have not put a hyphen between kubernetes resources as people are used to seeing them together in the CamelCase. For example, daemonset, cronjob, resourcequota, persistentvolume, serviceaccount. It also helps to minimize the difference between the event source names.

I wish we had more discipline here, historically, but I agree with you now.

> 2. I am not sure if we should use shortcuts for the long names that are very commonly used. For example, persistentvolumeclaim (pvc), persistentvolume (pv), certificatesigningrequest (csr). Full names would help the consistency and be more beginner friendly. Shortcuts would help to bring down the verbosity a bit and be more advanced friendly.

No shortcuts, I think.

Filip Krepinsky

unread,
Feb 15, 2023, 6:12:09 PM2/15/23
to Tim Hockin, patric...@intel.com, Jordan Liggitt, dev
Hi all,

Here is the PR which introduces the discussed aliases: https://github.com/kubernetes/kubernetes/pull/115813

I put a couple of points for a discussion there (proposing a few new changes since we reviewed the google sheets doc).

I would really appreciate it if you could at least look at the names of the controllers (since it impacts almost everybody), if you don't have time for a full review. I am sorry that the PR is so big, but due to changes in the shared controller-mananger module, the refactoring needed to be done also for cloud-provider / cloud-controller-mananger so it makes the PR more bigger.

The CLI bits are in a lot of places, but I tried to structure the commits somewhat logically so it should help to review the changes per commit.

Filip Krepinsky

unread,
Mar 6, 2023, 7:32:05 AM3/6/23
to Tim Hockin, patric...@intel.com, Jordan Liggitt, dev
Bump, to gather additional opinions on the renames.

I would like to suggest additional renames that are not present in the document/PR

ttl-controller -> time-to-live-controller
ttl-after-finished-controller -> time-to-live-after-finished-controller
node-ipam-controller -> node-ip-address-management-controller
root-ca-cert-publisher-controller -> root-ca-certificate-publisher-controller or root-certificate-authority-certificate-publisher-controller (this one seems too verbose/long)
service-controller -> cloud-service-controller
route-controller -> cloud-route-controller

Any thoughts?

Jordan Liggitt

unread,
Mar 6, 2023, 8:52:01 AM3/6/23
to Filip Krepinsky, Tim Hockin, patric...@intel.com, dev
On Mon, Mar 6, 2023 at 7:31 AM Filip Krepinsky <fkre...@redhat.com> wrote:
Bump, to gather additional opinions on the renames.

I would like to suggest additional renames that are not present in the document/PR

ttl-controller -> time-to-live-controller
ttl-after-finished-controller -> time-to-live-after-finished-controller

I'd keep ttl since that's the API field name.
 
node-ipam-controller -> node-ip-address-management-controller

For ipam, is that a standard abbreviation like CIDR or URL? I see us using IPAM in other places like IPAMFromCloud and IPAMFromCluster values for the --cidr-allocator-type flag
 
root-ca-cert-publisher-controller -> root-ca-certificate-publisher-controller or root-certificate-authority-certificate-publisher-controller (this one seems too verbose/long)

This controller publishes a configmap named "kube-root-ca.crt" ... matching terms in that name seems reasonable to me.

Filip Krepinsky

unread,
Mar 7, 2023, 2:55:56 PM3/7/23
to Jordan Liggitt, Tim Hockin, patric...@intel.com, dev


On Mon, Mar 6, 2023 at 2:51 PM Jordan Liggitt <lig...@google.com> wrote:
On Mon, Mar 6, 2023 at 7:31 AM Filip Krepinsky <fkre...@redhat.com> wrote:
Bump, to gather additional opinions on the renames.

I would like to suggest additional renames that are not present in the document/PR

ttl-controller -> time-to-live-controller
ttl-after-finished-controller -> time-to-live-after-finished-controller

I'd keep ttl since that's the API field name.
 
node-ipam-controller -> node-ip-address-management-controller

For ipam, is that a standard abbreviation like CIDR or URL? I see us using IPAM in other places like IPAMFromCloud and IPAMFromCluster values for the --cidr-allocator-type flag
 
root-ca-cert-publisher-controller -> root-ca-certificate-publisher-controller or root-certificate-authority-certificate-publisher-controller (this one seems too verbose/long)

This controller publishes a configmap named "kube-root-ca.crt" ... matching terms in that name seems reasonable to me.

Thanks. I suppose it makes sense to keep using well known shortcuts (the API field name can fall under this category as well). It will help the names not be overly verbose. I updated the naming conventions to:

// NAMING CONVENTIONS
// 1. naming should be consistent across the controllers
// 2. use of shortcuts should be avoided, unless they are well-known non-Kubernetes shortcuts
// 3. Kubernetes' resources should be written together without a hyphen ("-")

and changed the controller name to root-ca-certificate-publisher-controller

 
service-controller -> cloud-service-controller
route-controller -> cloud-route-controller

jury is still out on the cloud prefixes

Tim Hockin

unread,
Mar 8, 2023, 7:05:53 PM3/8/23
to Jordan Liggitt, Filip Krepinsky, patric...@intel.com, dev
On Mon, Mar 6, 2023 at 5:51 AM Jordan Liggitt <lig...@google.com> wrote:
>
> On Mon, Mar 6, 2023 at 7:31 AM Filip Krepinsky <fkre...@redhat.com> wrote:
>>
>> Bump, to gather additional opinions on the renames.
>>
>> I would like to suggest additional renames that are not present in the document/PR
>>
>> ttl-controller -> time-to-live-controller
>> ttl-after-finished-controller -> time-to-live-after-finished-controller
>
>
> I'd keep ttl since that's the API field name.

agree, and TTL is not an obscure thing

>>
>> node-ipam-controller -> node-ip-address-management-controller
>
>
> For ipam, is that a standard abbreviation like CIDR or URL? I see us using IPAM in other places like IPAMFromCloud and IPAMFromCluster values for the --cidr-allocator-type flag

Yes - it as well understood acronym (IP Address Management)

>> root-ca-cert-publisher-controller -> root-ca-certificate-publisher-controller or root-certificate-authority-certificate-publisher-controller (this one seems too verbose/long)
>
>
> This controller publishes a configmap named "kube-root-ca.crt" ... matching terms in that name seems reasonable to me.
>
>>
>> service-controller -> cloud-service-controller
>> route-controller -> cloud-route-controller

service controller is entirely about service LB, so maybe
"service-lb-controller" or, if we want to prefix things with "cloud",
I could live with "cloud-service-lb-controller"

route controller could be "cloud-route-controller" or
"cloud-pod-route-controller" or "cloud-pod-range-route-controller".
If there's a PR I would tag some sig-net folks for opinions :)

Tim Hockin

unread,
Mar 8, 2023, 7:16:07 PM3/8/23
to Jordan Liggitt, Filip Krepinsky, patric...@intel.com, dev
On Wed, Mar 8, 2023 at 4:05 PM Tim Hockin <tho...@google.com> wrote:
>
> On Mon, Mar 6, 2023 at 5:51 AM Jordan Liggitt <lig...@google.com> wrote:
> >
> > On Mon, Mar 6, 2023 at 7:31 AM Filip Krepinsky <fkre...@redhat.com> wrote:
> >>
> >> Bump, to gather additional opinions on the renames.
> >>
> >> I would like to suggest additional renames that are not present in the document/PR
> >>
> >> ttl-controller -> time-to-live-controller
> >> ttl-after-finished-controller -> time-to-live-after-finished-controller
> >
> >
> > I'd keep ttl since that's the API field name.
>
> agree, and TTL is not an obscure thing
>
> >>
> >> node-ipam-controller -> node-ip-address-management-controller
> >
> >
> > For ipam, is that a standard abbreviation like CIDR or URL? I see us using IPAM in other places like IPAMFromCloud and IPAMFromCluster values for the --cidr-allocator-type flag
>
> Yes - it as well understood acronym (IP Address Management)
>
> >> root-ca-cert-publisher-controller -> root-ca-certificate-publisher-controller or root-certificate-authority-certificate-publisher-controller (this one seems too verbose/long)
> >
> >
> > This controller publishes a configmap named "kube-root-ca.crt" ... matching terms in that name seems reasonable to me.
> >
> >>
> >> service-controller -> cloud-service-controller
> >> route-controller -> cloud-route-controller
>
> service controller is entirely about service LB, so maybe
> "service-lb-controller" or, if we want to prefix things with "cloud",
> I could live with "cloud-service-lb-controller"

As an aside - this was called "service controller" because most
resources have/had one controller which implements all of the control
loops that resource needs. But is that the ideal pattern? Should we
have one controller per resource or should we have lots of smaller,
async controllers which do one thing? There's some overhead, but the
caches and SharedInformer makes that mostly not matter, right?

Jay Pipes

unread,
Mar 9, 2023, 8:40:20 AM3/9/23
to tho...@google.com, Jordan Liggitt, Filip Krepinsky, patric...@intel.com, dev
On Wed, Mar 8, 2023 at 7:16 PM 'Tim Hockin' via dev <d...@kubernetes.io> wrote:
> service controller is entirely about service LB, so maybe
> "service-lb-controller" or, if we want to prefix things with "cloud",
> I could live with "cloud-service-lb-controller"

As an aside - this was called "service controller" because most
resources have/had one controller which implements all of the control
loops that resource needs.  But is that the ideal pattern?  Should we
have one controller per resource or should we have lots of smaller,
async controllers which do one thing?  There's some overhead, but the
caches and SharedInformer makes that mostly not matter, right?

My opinion is that the controller should be problem-domain-specific as much as possible, not resource-specific. If there are a number of resources related to a single problem domain I think it's useful to have a single controller managing those related resources. The issue that sometimes comes up, though, is when certain resources cross multiple problem domains. When that happens, it can sometimes be tough to decide which controller should manage those kinds of resources. When that happens, I find it makes sense to have a separate controller handle those cross-problem-domain resource kinds.

At least, in my experience that's what works best... :)

Best,
-jay
Reply all
Reply to author
Forward
0 new messages