Re: Union Discriminator in Kubernetes APIs?

222 views
Skip to first unread message

Tim Hockin

unread,
Apr 8, 2022, 5:42:51 PM4/8/22
to kubernetes-a...@googlegroups.com, Antoine Pelisse, Rob Scott
To demonstrate in code, consider a Fobnicator, which has a mode.
Today we know it can be "Horizontal" or "Vertical", both of which take
a "pattern" param.

What I see is something like:

```
type FrobMode string

const (
FrobHorizonal FrobMode = "Horizontal"
FrobVertical FrobMode = "Vertical"
)

type FrobSpec {
// FrobMode indicates how this Frobber works.
Mode FrobMode

// Pattern indicates how the input pattern to frob.
Length string
}
```

We suspect that, in the future, we might add "Periodic" or other modes.

```diff
const (
FrobHorizonal FrobMode = "Horizontal"
FrobVertical FrobMode = "Vertical"
+FrobPeriodic FrobMode = "Periodic"
)

type FrobSpec {
// FrobMode indicates how this Frobber works.
Mode FrobMode

// Pattern indicates how the input pattern to frob.
+ // Only applies when Mode is Horizontal or Vertical.
Pattern string
+
+ // Period specifies the rate to frob.
+ // Only applies when Mode is Periodic
+ Period int32
}
```

What I have seen (CF Service) is this spins out of control, and you
end up with N fields, some different subset of which apply per mode.
For other APIs, I encourages a different approach:

```
type FrobSpec {
// FrobMode indicates how this Frobber works.
Mode FrobMode

// Horizontal specifies how horizontal frobbers operate.
Horizontal *HorizFrob

// Vertical specifies how vertical frobbers operate.
Vertical *VertFrob

// Periodic specifies how periodic frobbers operate.
Periodic *PeriodicFrob
}

type HorizFrob struct {
// Pattern indicates how the input pattern to frob.
Pattern string
}

type VertFrob struct {
// Pattern indicates how the input pattern to frob.
Pattern string
}

type PeriodicFrob struct {
Period specifies the rate to frob.
Period int32
}
```

It's more boilerplate but totally unambiguous and easier to document
and reason about, by far.

So question #1 - anyone want to argue this?

Question #2 - how far should this apply? Consider an "address" which
could be an IP, a CIDR, a name, or who knows what else in the future.
Would you say:

```
Type AddressType
Address string
```

or

```
Type AddressType
IP string
CIDR string
Hostname string
```

Lastly, we've never really locked down guidance (AFAIK) about the need
for discriminators. Originally we said it was so we could
unambiguously know the difference between "a one-of value was
specified that I don't comprehend" from "no value was specified". But
we're not at all consistent in the application of it, and for most
consumers the reaction to those two cases will be the same.

So question #3: discriminators or no?

Tim

On Fri, Apr 8, 2022 at 11:50 AM Rob Scott <robert...@google.com> wrote:
>
> Hey Everyone,
>
> I was just chatting with +Tim Hockin about Gateway API and our plans to graduate some resources to beta this month. We have several instances in our API that use a union discriminator field, such as HTTPRouteFilter. As far as I can tell, a union discriminator field is not strictly necessary but can be helpful. I can't find any clear guidance in our API conventions as far as if they should be used, and usage within existing Kubernetes APIs seems inconsistent.
>
> For Gateway API specifically, validation with a union discriminator has been a bit more difficult since it requires additional logic in our validating webhook (oneOf is helpful, but does not prevent someone from setting a field that is different than what the union discriminator indicates should be populated).
>
> I'm actually fine with either approach here. Since we already have union discriminators in some places in Gateway API, it would likely be most straightforward to continue with that pattern for the rest of the API, but want to get some recommendations from API reviewers for what we should do here.
>
> Thanks!
>
> Rob
>

Daniel Smith

unread,
Apr 11, 2022, 12:27:09 PM4/11/22
to Tim Hockin, kubernetes-a...@googlegroups.com, Antoine Pelisse, Rob Scott
I strongly support this second pattern.
 

It's more boilerplate but totally unambiguous and easier to document
and reason about, by far.

So question #1 - anyone want to argue this?

Question #2 - how far should this apply?  Consider an "address" which
could be an IP, a CIDR, a name, or who knows what else in the future.
Would you say:

```
Type AddressType
Address string
```

or

```
Type AddressType
IP string
CIDR string
Hostname string
```

The second one is strongly preferred -- otherwise, what will you do when you need something that doesn't have a standard one-string encoding? It's also sooooo much easier to document.
 

Lastly, we've never really locked down guidance (AFAIK) about the need
for discriminators.  Originally we said it was so we could
unambiguously know the difference between "a one-of value was
specified that I don't comprehend" from "no value was specified".  But
we're not at all consistent in the application of it, and for most
consumers the reaction to those two cases will be the same.

So question #3: discriminators or no?

The last state I knew about, we still wanted discriminators because they will enable much simpler auto-clearing when the user changes type, even if it's an implicit change. That all legs of the union attempt to set a single field means changes naturally collide and SSA can give good conflict messages. However Antoine may have more recent state.

Also I believe actual api-level union support is either a live project or next in the queue -- Antoine can say more about that, too.
 

Tim

On Fri, Apr 8, 2022 at 11:50 AM Rob Scott <robert...@google.com> wrote:
>
> Hey Everyone,
>
> I was just chatting with +Tim Hockin about Gateway API and our plans to graduate some resources to beta this month. We have several instances in our API that use a union discriminator field, such as HTTPRouteFilter. As far as I can tell, a union discriminator field is not strictly necessary but can be helpful. I can't find any clear guidance in our API conventions as far as if they should be used, and usage within existing Kubernetes APIs seems inconsistent.
>
> For Gateway API specifically, validation with a union discriminator has been a bit more difficult since it requires additional logic in our validating webhook (oneOf is helpful, but does not prevent someone from setting a field that is different than what the union discriminator indicates should be populated).
>
> I'm actually fine with either approach here. Since we already have union discriminators in some places in Gateway API, it would likely be most straightforward to continue with that pattern for the rest of the API, but want to get some recommendations from API reviewers for what we should do here.
>
> Thanks!
>
> Rob
>

--
You received this message because you are subscribed to the Google Groups "kubernetes-api-reviewers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-api-rev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-api-reviewers/CAO_Rewb76ARHXTwH6dsC0gqYPWzuKXu7RvWwOG48%3DunmdaiiRg%40mail.gmail.com.

Tim Hockin

unread,
Apr 22, 2022, 5:21:45 PM4/22/22
to Daniel Smith, kubernetes-a...@googlegroups.com, Antoine Pelisse, Rob Scott
ACK. I will stick to guidance of "use one field per option" (which is
mostly what I was saying anyway, phew!)

Antoine - want to weigh in on discriminators?

Daniel Smith

unread,
Apr 22, 2022, 5:24:39 PM4/22/22
to Tim Hockin, kubernetes-a...@googlegroups.com, Antoine Pelisse, Rob Scott
BTW I hope we will be working soon on a design for union types that are enforced in the API, so that we can stop having this discussion for good!

Rob Scott

unread,
Apr 22, 2022, 6:04:23 PM4/22/22
to kubernetes-a...@googlegroups.com, Tim Hockin

Daniel Smith

unread,
Apr 22, 2022, 7:14:04 PM4/22/22
to Antoine Pelisse, Tim Hockin, kubernetes-a...@googlegroups.com, Rob Scott
I believe the primary argument for discriminators was that they were necessary in order for apiserver to be able to interpret all possible pairs of {old, new} objects and be able to clear the right thing. 

I'd like to be pretty sure that's not true before recommending against them!

(also if you're confused why the first message in this thread just showed up today it's because I found the moderation queue notification about Rob's original message in my spam folder a little bit ago -- sorry Rob!)

On Fri, Apr 22, 2022 at 4:08 PM Antoine Pelisse <apel...@google.com> wrote:
I think in general, when designing APIs, dependencies between fields
(e.g. field A makes sense only if field B has a specific value, or in
your example "Only applies when Mode is Horizontal or Vertical.") gives
another opportunity for users to write invalid objects and should be
avoided.

The pattern that you're proposing above (with HorizFrob, VertFrob, ...)
prevents wrong configuration from being authored. All the possible
options for an Horizontal Frob are constrained within that struct, so
one can't even make a mistake when they write their YAML (and/or don't
depend on the documentation as much). That's good.

If you apply the same logic to discriminator though, that's definitely
against my initial comment of dependencies between fields. What does it
mean to have Vertical Mode with an HorizontalFrob specified?

So all-in-all, I think the best would probably be to:
1. Avoid discriminators in general, since they give an opportunity to
author invalid configurations

2. Possibly, isolate members of a union within a struct, so they can
be easily cleared by clients (even if they don't know about new branches
of the union), e.g.
```
// This marker below means we only have union members in that struct:
// +union
type FrobMode struct {

    // Horizontal specifies how horizontal frobbers operate.
    Horizontal *HorizFrob

    // Vertical specifies how vertical frobbers operate.
    Vertical *VertFrob

    // Periodic specifies how periodic frobbers operate.
    Periodic *PeriodicFrob

    // WARNING: Don't add unrelated fields here!
}
```
Unfortunately, it's sometimes impossible to know if a single field will
eventually become a member of a union.

3. Server-side apply should conflict properly and clear the other branch
of unions as needed,

4. The API server could clear unions for clients (especially for those
unions that couldn't be isolated) based on old/new object differences. This
means we could have multiple unions in the same struct which would be a
little ugly, but we can deal with it.

Antoine Pelisse

unread,
Apr 22, 2022, 7:27:51 PM4/22/22
to Daniel Smith, Tim Hockin, kubernetes-a...@googlegroups.com, Rob Scott
It has the advantage, if set properly, to be unambiguous. Detecting old/new requires
being a little "smarter", and depends on the existing object. The problem with the
discriminator is that it can be unambiguously wrong if one forgets to update it.

Antoine Pelisse

unread,
Apr 22, 2022, 7:27:56 PM4/22/22
to Daniel Smith, Tim Hockin, kubernetes-a...@googlegroups.com, Rob Scott
Re-sending after joining the group ...

On Fri, Apr 22, 2022 at 4:08 PM Antoine Pelisse <apel...@google.com> wrote:
I think in general, when designing APIs, dependencies between fields
(e.g. field A makes sense only if field B has a specific value, or in
your example "Only applies when Mode is Horizontal or Vertical.") gives
another opportunity for users to write invalid objects and should be
avoided.

The pattern that you're proposing above (with HorizFrob, VertFrob, ...)
prevents wrong configuration from being authored. All the possible
options for an Horizontal Frob are constrained within that struct, so
one can't even make a mistake when they write their YAML (and/or don't
depend on the documentation as much). That's good.

If you apply the same logic to discriminator though, that's definitely
against my initial comment of dependencies between fields. What does it
mean to have Vertical Mode with an HorizontalFrob specified?

So all-in-all, I think the best would probably be to:
1. Avoid discriminators in general, since they give an opportunity to
author invalid configurations

2. Possibly, isolate members of a union within a struct, so they can
be easily cleared by clients (even if they don't know about new branches
of the union), e.g.
```
// This marker below means we only have union members in that struct:
// +union
type FrobMode struct {
    // Horizontal specifies how horizontal frobbers operate.
    Horizontal *HorizFrob

    // Vertical specifies how vertical frobbers operate.
    Vertical *VertFrob

    // Periodic specifies how periodic frobbers operate.
    Periodic *PeriodicFrob

    // WARNING: Don't add unrelated fields here!
}
```
Unfortunately, it's sometimes impossible to know if a single field will
eventually become a member of a union.

3. Server-side apply should conflict properly and clear the other branch
of unions as needed,

4. The API server could clear unions for clients (especially for those
unions that couldn't be isolated) based on old/new object differences. This
means we could have multiple unions in the same struct which would be a
little ugly, but we can deal with it.
On Fri, Apr 22, 2022 at 2:24 PM Daniel Smith <dbs...@google.com> wrote:

Daniel Smith

unread,
Apr 22, 2022, 7:28:56 PM4/22/22
to Antoine Pelisse, Tim Hockin, kubernetes-a...@googlegroups.com, Rob Scott
Unambiguously wrong is fine, we can detect that-- I'm more worried about ambiguously wrong... :)

Antoine Pelisse

unread,
Apr 22, 2022, 7:31:44 PM4/22/22
to Daniel Smith, Tim Hockin, kubernetes-a...@googlegroups.com, Rob Scott
Yeah, that's poorly phrased. Unambiguously wrong could be good, but is this good:
Current Object:
```
Discriminator: "A"
A: MyAConfig{ ... }
```
actually, I want B, so I send:
```
Discriminator: "A"
B: MyBConfig{ ... }
```

Oops, I forgot that I have to update 2 fields as opposed to one ...
```

Daniel Smith

unread,
Apr 22, 2022, 7:35:29 PM4/22/22
to Antoine Pelisse, Tim Hockin, kubernetes-a...@googlegroups.com, Rob Scott
You can also send this:

```
A: MyAConfig{ ... }
B: MyBConfig{ ... }
```

I don't think we can prevent people from making this error of setting fields relevant to both.

Antoine Pelisse

unread,
Apr 22, 2022, 7:36:43 PM4/22/22
to Daniel Smith, Tim Hockin, kubernetes-a...@googlegroups.com, Rob Scott
Yeah, I didn't mention this one because it's unavoidable.

Daniel Smith

unread,
Apr 25, 2022, 1:14:34 PM4/25/22
to Rob Scott, Antoine Pelisse, Tim Hockin, kubernetes-a...@googlegroups.com
Some of this debate is coming back to me -- one of the reasons for the discriminator may have been to help out old clients which aren't aware of all of the options any more. If there's a discriminator, such clients can at least tell that there's an option selected that they don't understand, rather than just seeing nothing selected at all.

On Mon, Apr 25, 2022 at 10:06 AM Rob Scott <robert...@google.com> wrote:
Thanks for the great input here! It seems like there's widespread agreement to use type specific fields in a struct. Whether or not to use union discriminators seems somewhat more controversial. With that in mind, I made a Gateway API PR, that would use type specific fields + a union discriminator. My rationale for using a union discriminator is that:

1. We already use a union discriminator elsewhere in the API. 
2. We can choose to make a union discriminator optional in the future, we can't retroactively add one.

If we're 100% sure we want to avoid union discriminators in Kubernetes APIs, I'm fine with removing it from this, it just seems like it's still not completely clear.

Rob Scott

unread,
Apr 25, 2022, 1:14:38 PM4/25/22
to Antoine Pelisse, Daniel Smith, Tim Hockin, kubernetes-a...@googlegroups.com
Thanks for the great input here! It seems like there's widespread agreement to use type specific fields in a struct. Whether or not to use union discriminators seems somewhat more controversial. With that in mind, I made a Gateway API PR, that would use type specific fields + a union discriminator. My rationale for using a union discriminator is that:

1. We already use a union discriminator elsewhere in the API. 
2. We can choose to make a union discriminator optional in the future, we can't retroactively add one.

If we're 100% sure we want to avoid union discriminators in Kubernetes APIs, I'm fine with removing it from this, it just seems like it's still not completely clear.

Rob Scott

unread,
Apr 25, 2022, 1:53:59 PM4/25/22
to Daniel Smith, Antoine Pelisse, Tim Hockin, kubernetes-a...@googlegroups.com
If there's a discriminator, such clients can at least tell that there's an option selected that they don't understand, rather than just seeing nothing selected at all.

I think this is a very important detail for us. With Gateway API we already have 14 implementations and integrations, with more on the way. It seems very likely that there will be different versions installed in a cluster, all supporting slightly different versions of the API. Assuming we add a new "Foo" type to this union in the future, it would be very useful for implementations to be able to log a message saying they don't support "Foo" instead of that they see an empty struct and don't know what to do with it.

Tim Hockin

unread,
Apr 26, 2022, 3:01:50 AM4/26/22
to Rob Scott, Daniel Smith, Antoine Pelisse, kubernetes-a...@googlegroups.com
The ability to log good errors (e.g. "Unsupported frob type:
'Periodic'" vs "Unknown frob type") is nice but does it justify the
use of discriminators everywhere? This is always the argument - can
we avoid one extra line of YAML?

The update problem seems resolvable as long as you have the before and
after, right? I know this is super subtle, so I am trying to think
thru all the skew possibilities. It seems OK? Am I missing
something? It feels like we should document one-ofs as "if you see no
value in a one-of, assume it was something you don't understand".

Daniel Smith

unread,
Apr 26, 2022, 11:32:30 AM4/26/22
to Tim Hockin, Rob Scott, Antoine Pelisse, kubernetes-a...@googlegroups.com
On Tue, Apr 26, 2022 at 12:01 AM Tim Hockin <tho...@google.com> wrote:
The ability to log good errors (e.g. "Unsupported frob type:
'Periodic'" vs "Unknown frob type") is nice but does it justify the
use of discriminators everywhere?

It's slightly better than that -- you can tell the difference between an invalid object and a valid one you just don't understand. (for some but not all invalid objects)
 
  This is always the argument - can
we avoid one extra line of YAML?

The update problem seems resolvable as long as you have the before and
after, right?  I know this is super subtle, so I am trying to think
thru all the skew possibilities. 

I think the place to hash this out is the union/one-of KEP / design, we should have a table with super old/old/new clients (super old meaning, doesn't understand unions), super old/old/new apiserver, and add/modify/change operations (both via apply and via regular update / patch), and all cases of conflicts. That's at least 54 cases, as many as 108 depending on how many kinds of conflicts there can be, and I might have missed a dimension. (Antoine, is there a draft I can put this comment in?)
 
It seems OK?  Am I missing
something?  It feels like we should document one-ofs as "if you see no
value in a one-of, assume it was something you don't understand".

Yes, that's the other alternative. But I think we need to list out the cases before settling on an answer, I doubt any of us have considered every single case.

Rahul Joshi

unread,
Apr 27, 2022, 7:20:26 PM4/27/22
to kubernetes-api-reviewers
Some trouble we had was with the verbosity imposed by discriminators. Since there are already type specific fields, it is rather odd for the user to also have to specify which type they're setting -- that ought to be implied by setting the type-specific fields:

```
// Feels redundant
{ Type: A, A: MyConfigA{} }

// This ought to be sufficient
{ A: MyConfigA{} }
```
 
Also for CRDs (which as we understand all new API should start as CRDs before eventually being merged in) there's no easy way to ensure that the union discriminator and the type specific fields are in sync. We can't do kube-apisever validation, and admission/validating webhooks are not considered a viable requirement to impose -- we can't ensure all users use the webhook.

Antoine Pelisse

unread,
Apr 27, 2022, 7:20:26 PM4/27/22
to Daniel Smith, Tim Hockin, Rob Scott, kubernetes-a...@googlegroups.com
Duly noted, let's try to see if we can finally get it done in 1.25 (don't hold your breath).

Antoine

Daniel Smith

unread,
Apr 27, 2022, 7:29:52 PM4/27/22
to Rahul Joshi, kubernetes-api-reviewers
On Wed, Apr 27, 2022 at 4:20 PM 'Rahul Joshi' via kubernetes-api-reviewers <kubernetes-a...@googlegroups.com> wrote:
Some trouble we had was with the verbosity imposed by discriminators. Since there are already type specific fields, it is rather odd for the user to also have to specify which type they're setting -- that ought to be implied by setting the type-specific fields:

```
// Feels redundant
{ Type: A, A: MyConfigA{} }

// This ought to be sufficient
{ A: MyConfigA{} }
```
 
Also for CRDs (which as we understand all new API should start as CRDs before eventually being merged in) there's no easy way to ensure that the union discriminator and the type specific fields are in sync.

The good news is, when we get a design, to get apiserver to validate a union in your CRD, all you'll have to do is add the right marker to the field.
 
We can't do kube-apisever validation, and admission/validating webhooks are not considered a viable requirement to impose -- we can't ensure all users use the webhook.

You probably know, but CEL validation is in flight and should land soon, that should help this.
 
Reply all
Reply to author
Forward
0 new messages