API review consult (some about machinery)

66 views
Skip to first unread message

Tim Hockin

unread,
Aug 15, 2018, 1:29:40 PM8/15/18
to kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
Hi all,

As part of the API review for volume snapshots
(https://github.com/kubernetes-sigs/architecture-tracking/issues/44 ,
kubernetes/community#2335 , kubernetes-csi/external-snapshotter#6)
there a re a few topics I was not sure on and wanted to ask for a
consult from another reviewer or reviewers.

1) Proposal adds SnapshotClass independent of StorageClass. The
argument is that there could be more than one class of snapshot for a
given storage class (e.g. retention time, etc). I'm OK with that. It
has a side effect of breaking a "nice" property I observed before
around API groups.

The existing type (StorageClass) is an admin-centric type. The new
types (except SnapshotClass) are user-centric. Having them in
different API groups seems OK. Addign SnapshotClass to the
user-centric group is wrong, but I feel disinclined to make a third
group, in part because this user/admin split isn't codified anywhere
or followed anywhere else and because there is no naming convention
around it.

Thoughts? Is that a pattern we want to endorse or am I over-training
on a side-effect?

2) The act of creating a snapshot can fail asynchronously and the
design wants to NOT retry (or not indefinitely). I'm on the fence
about that aspect, and am inclined to trust the dev team. The
question is in reporting the error state. An event is sub-par because
it is a terminal condition. It won't be retried, so event-carried
data will eventually age-out. They have added an "error" field which
indicates the existence of an error and details about it.

I don't know if we have other precedents of this pattern or some other
way to handle this. It could be a Condition, but it sort of goes
against the recent edits around Condition conventions being an
extension point.

Thoughts?

Other than these two, I have given some minor feedback that they will
incorporate, and then I am satisfied.

Tim

Clayton Coleman

unread,
Aug 15, 2018, 1:33:37 PM8/15/18
to Tim Hockin, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
> On Aug 15, 2018, at 1:29 PM, 'Tim Hockin' via kubernetes-api-reviewers <kubernetes-a...@googlegroups.com> wrote:
>
> Hi all,
>
> As part of the API review for volume snapshots
> (https://github.com/kubernetes-sigs/architecture-tracking/issues/44 ,
> kubernetes/community#2335 , kubernetes-csi/external-snapshotter#6)
> there a re a few topics I was not sure on and wanted to ask for a
> consult from another reviewer or reviewers.
>
> 1) Proposal adds SnapshotClass independent of StorageClass. The
> argument is that there could be more than one class of snapshot for a
> given storage class (e.g. retention time, etc). I'm OK with that. It
> has a side effect of breaking a "nice" property I observed before
> around API groups.
>
> The existing type (StorageClass) is an admin-centric type. The new
> types (except SnapshotClass) are user-centric. Having them in
> different API groups seems OK. Addign SnapshotClass to the
> user-centric group is wrong, but I feel disinclined to make a third
> group, in part because this user/admin split isn't codified anywhere
> or followed anywhere else and because there is no naming convention
> around it.
>
> Thoughts? Is that a pattern we want to endorse or am I over-training
> on a side-effect?


By analogy, pod and node are in the same API group but have different
audiences. The same for PV and PVC. So apigroups typically aren’t
expected to define user audience, but instead are about a class of
related functionality (in the large).


>
> 2) The act of creating a snapshot can fail asynchronously and the
> design wants to NOT retry (or not indefinitely). I'm on the fence
> about that aspect, and am inclined to trust the dev team. The
> question is in reporting the error state. An event is sub-par because
> it is a terminal condition. It won't be retried, so event-carried
> data will eventually age-out. They have added an "error" field which
> indicates the existence of an error and details about it.
>
> I don't know if we have other precedents of this pattern or some other
> way to handle this. It could be a Condition, but it sort of goes
> against the recent edits around Condition conventions being an
> extension point.


Pods and jobs that are restartnever have a terminal phase (arguably
could be a Boolean). Is the analogy between Jobs and Snapshots
sufficient?


>
> Thoughts?
>
> Other than these two, I have given some minor feedback that they will
> incorporate, and then I am satisfied.
>
> Tim
>
> --
> You received this message because you are subscribed to the Google Groups "kubernetes-api-reviewers" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-api-rev...@googlegroups.com.
> To post to this group, send email to kubernetes-a...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-api-reviewers/CAO_RewapCRkfzjcjHRy6G2-fBJp%3DeakM-ZEf4KX51k_YG5hoVQ%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.

Tim Hockin

unread,
Aug 15, 2018, 1:37:41 PM8/15/18
to Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
All of which pre-date apigroups :)

The question is whether we think there's a meaningful "edge" that we
want to strengthen.

> > 2) The act of creating a snapshot can fail asynchronously and the
> > design wants to NOT retry (or not indefinitely). I'm on the fence
> > about that aspect, and am inclined to trust the dev team. The
> > question is in reporting the error state. An event is sub-par because
> > it is a terminal condition. It won't be retried, so event-carried
> > data will eventually age-out. They have added an "error" field which
> > indicates the existence of an error and details about it.
> >
> > I don't know if we have other precedents of this pattern or some other
> > way to handle this. It could be a Condition, but it sort of goes
> > against the recent edits around Condition conventions being an
> > extension point.
>
>
> Pods and jobs that are restartnever have a terminal phase (arguably
> could be a Boolean). Is the analogy between Jobs and Snapshots
> sufficient?

Well, we tell people not to use "Phase" in APIs. Jobs seems apropos.
I see it has status fields to convey the final situation, so I guess
error struct here seems OK.

> >
> > Thoughts?
> >
> > Other than these two, I have given some minor feedback that they will
> > incorporate, and then I am satisfied.
> >
> > Tim
> >
> > --
> > You received this message because you are subscribed to the Google Groups "kubernetes-api-reviewers" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-api-rev...@googlegroups.com.
> > To post to this group, send email to kubernetes-a...@googlegroups.com.
> > To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-api-reviewers/CAO_RewapCRkfzjcjHRy6G2-fBJp%3DeakM-ZEf4KX51k_YG5hoVQ%40mail.gmail.com.
> > For more options, visit https://groups.google.com/d/optout.
>
> --
> You received this message because you are subscribed to the Google Groups "kubernetes-api-reviewers" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-api-rev...@googlegroups.com.
> To post to this group, send email to kubernetes-a...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-api-reviewers/CAH16ShLRm_Q9d0J_GC%2B3V16qWOi6nq%3DaZK3%3Dv8W8jzuQ6xAiTA%40mail.gmail.com.

Clayton Coleman

unread,
Aug 15, 2018, 1:44:38 PM8/15/18
to Tim Hockin, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
Service cat has admin and user focused together. So does istio and
knative (well, to a degree). I don’t know that I’ve felt those
suffered without the edge. But I don’t dislike the edge other than
it’s one more thing in a list somewhere that is more complicated than
one thing.

Daniel Smith

unread,
Aug 15, 2018, 1:47:59 PM8/15/18
to Tim Hockin, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
I think during the transition to doing more stuff in CRDs we will have to tolerate either a lot of groups or mixed stuff in groups. Since right now you can't mix api implementation mechanisms in a groupversion.

Eventually we will have a "great group reshuffle" and refactor the groups.
 

> > 2) The act of creating a snapshot can fail asynchronously and the
> > design wants to NOT retry (or not indefinitely).  I'm on the fence
> > about that aspect, and am inclined to trust the dev team.  The
> > question is in reporting the error state.  An event is sub-par because
> > it is a terminal condition.  It won't be retried, so event-carried
> > data will eventually age-out.  They have added an "error" field which
> > indicates the existence of an error and details about it.
> >
> > I don't know if we have other precedents of this pattern or some other
> > way to handle this.  It could be a Condition, but it sort of goes
> > against the recent edits around Condition conventions being an
> > extension point.
>
>
> Pods and jobs that are restartnever have a terminal phase (arguably
> could be a Boolean).  Is the analogy between Jobs and Snapshots
> sufficient?

Well, we tell people not to use "Phase" in APIs.  Jobs seems apropos.
I see it has status fields to convey the final situation, so I guess
error struct here seems OK.

What is wrong with a "permanently failed" condition?
 

> >
> > Thoughts?
> >
> > Other than these two, I have given some minor feedback that they will
> > incorporate, and then I am satisfied.
> >
> > Tim
> >
> > --
> > You received this message because you are subscribed to the Google Groups "kubernetes-api-reviewers" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-api-rev...@googlegroups.com.
> > To post to this group, send email to kubernetes-a...@googlegroups.com.
> > To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-api-reviewers/CAO_RewapCRkfzjcjHRy6G2-fBJp%3DeakM-ZEf4KX51k_YG5hoVQ%40mail.gmail.com.
> > For more options, visit https://groups.google.com/d/optout.
>
> --
> You received this message because you are subscribed to the Google Groups "kubernetes-api-reviewers" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-api-rev...@googlegroups.com.
> To post to this group, send email to kubernetes-a...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-api-reviewers/CAH16ShLRm_Q9d0J_GC%2B3V16qWOi6nq%3DaZK3%3Dv8W8jzuQ6xAiTA%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "kubernetes-api-reviewers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-api-rev...@googlegroups.com.
To post to this group, send email to kubernetes-a...@googlegroups.com.

Tim Hockin

unread,
Aug 15, 2018, 1:48:40 PM8/15/18
to Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
Agree on all fronts. I like it, but it may just not be worth the
effort. We could START pushing in that direction but I am not sure I
care enough to champion it.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-api-reviewers/CAH16ShLN9x16y%3D6dJUcgL2iDgv34--9n2qNKBF%3D08qUKEY8Hxw%40mail.gmail.com.

Tim Hockin

unread,
Aug 15, 2018, 1:51:00 PM8/15/18
to Daniel Smith, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
On Wed, Aug 15, 2018 at 10:48 AM 'Daniel Smith' via
For "core" groups, yes. Do you have a strong enough sense of the
user/admin split being useful to advocate for it now-ish?

>> > > 2) The act of creating a snapshot can fail asynchronously and the
>> > > design wants to NOT retry (or not indefinitely). I'm on the fence
>> > > about that aspect, and am inclined to trust the dev team. The
>> > > question is in reporting the error state. An event is sub-par because
>> > > it is a terminal condition. It won't be retried, so event-carried
>> > > data will eventually age-out. They have added an "error" field which
>> > > indicates the existence of an error and details about it.
>> > >
>> > > I don't know if we have other precedents of this pattern or some other
>> > > way to handle this. It could be a Condition, but it sort of goes
>> > > against the recent edits around Condition conventions being an
>> > > extension point.
>> >
>> >
>> > Pods and jobs that are restartnever have a terminal phase (arguably
>> > could be a Boolean). Is the analogy between Jobs and Snapshots
>> > sufficient?
>>
>> Well, we tell people not to use "Phase" in APIs. Jobs seems apropos.
>> I see it has status fields to convey the final situation, so I guess
>> error struct here seems OK.
>
>
> What is wrong with a "permanently failed" condition?

It's on the edge of what our guidelines say - we specifically talk
about the idea that Conditions are for things you can't enumerate a
priori, and maybe don't apply to every instance. Every snapshot is
either failed or not failed. We know that a priori and we are setting
it from standard controllers.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-api-reviewers/CAB_J3babuMTwnjrBWNmUWjmcUEUVqj9VZpxMQJ8tqP87sZWeGQ%40mail.gmail.com.

Clayton Coleman

unread,
Aug 15, 2018, 2:03:23 PM8/15/18
to Hockin, Tim, Daniel Smith, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
In the "conditions are for extension" discussion we had the clear separation between:

1. fields that are fundamental to the status of the object should be real fields (pod status ready, pod container status, etc)
2. data that is informational, are accessed in a fairly generic scenario, or might be set by a different controller should be condition

Note that "kubectl wait" now supports arbitrary conditions, which means that with a bit of forethought you can make scripting for generic conditions much, much less painful than before.  Anything that is a specific field won't be generically extensible for all kubectl clients. :)

Tim Hockin

unread,
Aug 15, 2018, 8:29:39 PM8/15/18
to Clayton Coleman, Daniel Smith, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
I am fine advising a Condition here - it seems like a decent fit, just
sort of borderline as per guidance.

Do we have any thinking around ACLs to allow a controller to set
conditions but not write other fields?

Clayton

unread,
Aug 15, 2018, 8:35:49 PM8/15/18
to Tim Hockin, Clayton Coleman, Daniel Smith, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali


> On Aug 15, 2018, at 8:29 PM, 'Tim Hockin' via kubernetes-sig-architecture <kubernetes-si...@googlegroups.com> wrote:
>
> I am fine advising a Condition here - it seems like a decent fit, just
> sort of borderline as per guidance.

Note adding both is probably fine as well (explicit field and a condition for generic clients).

>
> Do we have any thinking around ACLs to allow a controller to set
> conditions but not write other fields?

All we have is admission webhooks, but those have successfully been used for this use case.
> You received this message because you are subscribed to the Google Groups "kubernetes-sig-architecture" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-sig-arch...@googlegroups.com.
> To post to this group, send email to kubernetes-si...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-sig-architecture/CAO_RewY0aQ4wYmgWKznk_vKHb3Vb%3Da3m5dh-v7fmu6cwYfi-uA%40mail.gmail.com.

Tim Hockin

unread,
Aug 16, 2018, 2:19:31 AM8/16/18
to Clayton Coleman, Clayton Coleman, Daniel Smith, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
Jing,

Do you have a feeling about field vs Condition for this?
> To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-api-reviewers/6749C530-B23E-4EFE-B0B4-A7C996F3ECE1%40gmail.com.

Daniel Smith

unread,
Aug 16, 2018, 1:08:19 PM8/16/18
to Tim Hockin, Clayton Coleman, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
Given the explanation earlier in this thread, I could go either way on condition vs field. If a reason is going to be attached to the failure, then a condition has a nice (? standard, at least) way to represent that. If there's no reason that's going to be attached, then a field seems pretty simple. I'm not sure where it ended up, but before I went OOO there was talk of making a GC that notices and removes "finished" resources. A single field would work well with the last design I saw. A minor consideration, though.

Tim Hockin

unread,
Aug 16, 2018, 1:12:36 PM8/16/18
to Daniel Smith, Clayton Coleman, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
Error is a flag and a message - Condition actually fits nicely in this case.

Brian Grant

unread,
Aug 16, 2018, 3:52:40 PM8/16/18
to Tim Hockin, Daniel Smith, Clayton Coleman, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
Example of reflecting a reason in a single field:

With respect to user vs admin: This came up for DaemonSet, which is primarily an admin (or privileged user) feature, and is in the apps group. The goal for groups is to evolve a related set of functionality consistently. That consideration outweighed the user vs admin distinction. My experience with versioning every resource type individually is that it can be confusing to use, say, v1 of one type with v3 of another. I think the group approach worked pretty well for apps, where we made changes such as in label selectors and revision history consistently across all of the workload APIs, including both Deployment and DaemonSet. If PV and PVC in an api group (and eventually they will), I'd similarly want them in the same group.

Tim Hockin

unread,
Aug 16, 2018, 3:59:22 PM8/16/18
to Brian Grant, Daniel Smith, Clayton Coleman, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
Cherrypicking one comment

On Thu, Aug 16, 2018 at 12:52 PM Brian Grant <brian...@google.com> wrote:

> The goal for groups is to evolve a related set of functionality consistently. That consideration outweighed the user vs admin distinction. My experience with versioning every resource type individually is that it can be confusing to use,

I mentioned this once, but I don't think the loop was ever closed.
The way API machinery is evolving is effectively a per-Kind version.
Each version of a group can be sparse with respect to all others, so
we WILL end up with what you just described (unless I misunderstand,
but I tried really hard to clarify this).

Saad Ali

unread,
Aug 16, 2018, 4:51:19 PM8/16/18
to Tim Hockin, Eric Tune, Brian Grant, Daniel Smith, smarter...@gmail.com, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu
Regarding field vs condition for error: for the VolumeAttachment object we ended up going with a field and created a VolumeError struct (currently beta). This was our conclusion after lots of discussion with +Eric Tune. It would be nice to be able to reuse this struct in for error in the Snapshot object for consistency.

Tim Hockin

unread,
Aug 16, 2018, 5:57:09 PM8/16/18
to Saad Ali, Eric Tune, Brian Grant, Daniel Smith, Clayton Coleman, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu
OK. Since this is on the edge, I'll take precedent. Let's run with
that. I had forgotten that.

Thanks

Tim

Eric Tune

unread,
Aug 17, 2018, 12:13:20 PM8/17/18
to Tim Hockin, Saad Ali, Brian Grant, Daniel Smith, Clayton Coleman, Clayton Coleman, kubernetes-si...@googlegroups.com, kubernetes-a...@googlegroups.com, Jing Xu, Jeffrey Regan
Jeff Regan is working on a proposal for generic completion and readiness.  The main viable options are (1) complex heuristics based on historical precedent, (2) openapi annotation of a single boolean field as having a specific meaning (e.g. x-kubernetes-readiness-flag), (3) duck typing using (a) everyone has to have the same .status.condition, or (b) everyone has to have the same .status.readinessFlag.


Tim Hockin

unread,
Aug 17, 2018, 12:14:30 PM8/17/18
to Eric Tune, Saad Ali, Brian Grant, Daniel Smith, Clayton Coleman, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Jeffrey Regan
Looking forward to a proposal.

Joe Beda

unread,
Aug 17, 2018, 12:26:55 PM8/17/18
to Tim Hockin, Brian Grant, Daniel Smith, Clayton Coleman, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, ji...@google.com, Saad Ali
One slightly related thing here wrt API groups that I noticed when playing with Istio is a set of CRD types that are domain specific for various adapters.  These are all part of the main Istio API group.  To me this smells bad.  I would expect that each of these "pluggable" things would be versioned independently.

So the axis wrt API groups are user vs. admin but also "core" vs. "plugin".  This is going to be relevant as we start breaking out different provider types and parameters with domain specific parameters.

(Am I making sense?  This stuff is hard to describe concisely)

Joe



--
You received this message because you are subscribed to the Google Groups "kubernetes-sig-architecture" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-sig-arch...@googlegroups.com.
To post to this group, send email to kubernetes-si...@googlegroups.com.

Daniel Smith

unread,
Aug 17, 2018, 3:44:08 PM8/17/18
to Joe Beda, Tim Hockin, Brian Grant, Clayton Coleman, Clayton Coleman, kubernetes-sig-architecture, kubernetes-a...@googlegroups.com, Jing Xu, Saad Ali
On Fri, Aug 17, 2018 at 9:26 AM Joe Beda <j...@heptio.com> wrote:
One slightly related thing here wrt API groups that I noticed when playing with Istio is a set of CRD types that are domain specific for various adapters.  These are all part of the main Istio API group.  To me this smells bad.  I would expect that each of these "pluggable" things would be versioned independently.

So the axis wrt API groups are user vs. admin but also "core" vs. "plugin".  This is going to be relevant as we start breaking out different provider types and parameters with domain specific parameters.

(Am I making sense?  This stuff is hard to describe concisely)

Yes, I've been needing a name for this concept for like a year now. The concept is roughly "a set of api groups that are developed together and will often be consumed together" and unfortunately we already used the term group for something a bit more granular.

I agree it's not good to put "interface" and "implementation" types in the same group. We shouldn't combine things with different expected evolutionary trajectories.
 

Joe



On Thu, Aug 16, 2018 at 12:59 PM 'Tim Hockin' via kubernetes-sig-architecture <kubernetes-si...@googlegroups.com> wrote:
Cherrypicking one comment

On Thu, Aug 16, 2018 at 12:52 PM Brian Grant <brian...@google.com> wrote:

> The goal for groups is to evolve a related set of functionality consistently. That consideration outweighed the user vs admin distinction. My experience with versioning every resource type individually is that it can be confusing to use,

I mentioned this once, but I don't think the loop was ever closed.
The way API machinery is evolving is effectively a per-Kind version.
Each version of a group can be sparse with respect to all others, so
we WILL end up with what you just described (unless I misunderstand,
but I tried really hard to clarify this).

--
You received this message because you are subscribed to the Google Groups "kubernetes-sig-architecture" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-sig-arch...@googlegroups.com.
To post to this group, send email to kubernetes-si...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-sig-architecture/CAO_RewbVsKh78%2BmtcwEExtJra_Fny84u8sVMdK9wJe-G92pn_Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "kubernetes-api-reviewers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-api-rev...@googlegroups.com.
To post to this group, send email to kubernetes-a...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-api-reviewers/CAEAFPY%2B2LNnALuSFxN%2BRPiKbwZcwht2M3MrP8JOwjmwCRVFaZA%40mail.gmail.com.
Reply all
Reply to author
Forward
0 new messages