Re: Issue triage again - labels

Tim Hockin

Mar 8, 2019, 1:07:20 PM
to kubernetes-sig-contribex, kubernetes-sig-leads
I got bounced from contribex. Resending.

I wrote a little about this a few weeks ago and got some good pointers
to how people are operating today. Thanks.

The problem I am facing in sig-net is a little different from what
most people described. Specifically I have 250+ issues, some years
old and some days old, which need to be looked at by a human. I want
to scrub EVERY SINGLE ONE.

This raises some tactical issues. I raised it on slack, but this is a
better forum.

Preface: This is not about assigning sig/* labels. This is about
following up on the issues that are assigned sig/network. Making sure
they are real, have enough information, and are actionable.

As we march through these issues, we'll get some looked at every week.
It will take months. During this time, new issues will arrive. Some
issues will be easy to handle - obviously a bug, obviously a feature
request, obviously support (close it). Some will need more
information. Some will need to be reproduced to confirm.

How do I, iteratively, week-on-week, keep track of which issues are
acknowledged and which issues are in progress and which issues are yet
to be done? Assignee is not right - I am not asking people to FIX the
issues, just to investigate. And many people are not org members
anyway, so can not be assigned.

I am proposing that we do something along the lines of adding a new
label, e.g. "needs-sig-ack", which is automatically applied to ALL new
issues.

Now, I can search for sig/network && needs-sig-ack, and chip away at
that list. I can get volunteers to follow up on issues and either
close them, ask for more info, or ACK them (removing that label). We
can also use that moment to look at kind/* and area/* labels.
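
The search described above maps onto GitHub's issue search syntax. A minimal sketch, assuming the proposed label ends up being named `needs-sig-ack` (it does not exist yet) and that the issues live in kubernetes/kubernetes; the helper names `triage_query` and `search_url` are made up for illustration:

```python
from urllib.parse import urlencode


def triage_query(sig: str, pending_label: str = "needs-sig-ack") -> str:
    """Build a GitHub issue-search string for a SIG's open, un-ACKed issues.

    `pending_label` is the label proposed in this thread; treating it as
    an existing label is an assumption.
    """
    terms = [
        "repo:kubernetes/kubernetes",
        "is:issue",
        "is:open",
        f"label:sig/{sig}",
        f"label:{pending_label}",
    ]
    return " ".join(terms)


def search_url(query: str) -> str:
    """Turn a search string into a github.com search URL."""
    return "https://github.com/search?" + urlencode({"q": query, "type": "issues"})


print(triage_query("network"))
print(search_url(triage_query("network")))
```

Swapping `pending_label` makes it easy to compare against whatever name is finally chosen.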

We have a handful of triage/* labels, but they mostly don't make sense
to me, at least for non-closed issues. As historical justification
for closing they make sense, but seem like the wrong mechanism (they
are retroactively mutable). Not sure what to do with those. The
meaning of "triage" there doesn't match what I think of as bug
triage.

Eventually we will have scrubbed them all and the needs-sig-ack label
will represent new issues only.

Aaron correctly said "OMG another label, ugh". That's why I am here
to discuss. Some issues are assigned multiple SIGs and some are
legitimately cross-SIG. I think that's fine - if one SIG acknowledges
an issue, that should be good enough.

Discuss...

Tim St. Clair

Mar 8, 2019, 1:32:08 PM
to Tim Hockin, kubernetes-sig-contribex, kubernetes-sig-leads
inline 

On Fri, Mar 8, 2019 at 12:03 PM 'Tim Hockin' via kubernetes-sig-leads <kubernetes...@googlegroups.com> wrote:
I wrote a little about this a few weeks ago and got some good pointers
to how people are operating today.  Thanks.

The problem I am facing in sig-net is a little different from what
most people described.  Specifically I have 250+ issues, some years
old and some days old, which need to be looked at by a human.  I want
to scrub EVERY SINGLE ONE.

This raises some tactical issues.  I raised it on slack, but this is a
better forum.

Preface: This is not about assigning sig/* labels.  This is about
following up on the issues that are assigned sig/network.  Making sure
they are real, have enough information, and are actionable.

As we march through these issues, we'll get some looked at every week.
It will take months.  During this time, new issues will arrive.  Some
issues will be easy to handle - obviously a bug, obviously a feature
request, obviously support (close it).  Some will need more
information.  Some will need to be reproduced to confirm.

How do I, iteratively, week-on-week, keep track of which issues are
acknowledged and which issues are in progress and which issues are yet
to be done?  Assignee is not right - I am not asking people to FIX the
issues, just to investigate.  And many people are not org members
anyway, so can not be assigned.

I am proposing that we do something along the lines of adding a new
label, e.g. "needs-sig-ack", which is automatically applied to ALL new
issues.


We treat unlabeled issues as needing triage.  Given that SCL repos are federated, that is much easier than in k/k.  But we could apply a need-triage (or needs-sig-ack) label by default to indicate that no one from the SIG has triaged the issue.

 
Now, I can search for sig/network && needs-sig-ack, and chip away at
that list.  I can get volunteers to followup on issues and either
close them, ask for more info, or ACK them (removing that label).  We
can also use that moment to look at kind/* and area/* labels.

We have a handful of triage/* labels, but they mostly don't make sense
to me, at least for non-closed issues.  As historical justification
for closing they make sense, but seem like the wrong mechanism (they
are retroactively mutable).  Not sure what to do with those.  The
meaning of "triage" there doesn't match what I think of as bug
triage.

Eventually we will have scrubbed them all and the needs-sig-ack label
will represent new issues only.
 

Aaron correctly said "OMG another label, ugh".  That's why I am here
to discuss.  Some issues are assigned multiple SIGs and some are
legitimately cross-SIG.  I think that's fine - if one SIG acknowledges
an issue, that should be good enough.

I think creating a simple flowchart of your process model and nuking unused labels makes a ton of sense.
 

Discuss...

--
You received this message because you are subscribed to the Google Groups "kubernetes-sig-leads" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-sig-l...@googlegroups.com.
To post to this group, send email to kubernetes...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-sig-leads/CAO_RewZnWAftdKGz99UjQfTYvv7GMgzc8EZFx-pQPMdyMe68eg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


--
Cheers,
Timothy St. Clair

“Do all the good you can. By all the means you can. In all the ways you can. In all the places you can. At all the times you can. To all the people you can. As long as ever you can.” 

Tim Pepper

Mar 8, 2019, 1:37:33 PM
to Tim St. Clair, Tim Hockin, kubernetes-sig-contribex, kubernetes-sig-leads

I’m not sure one sig ack’ing an issue is enough.  As I think about the state diagram it feels like doing this well is going to mean removing a triaged (past-tense) label upon removal of a sig label and upon addition of a sig label, or…it drives at per-sig triaged labels.

 

Often a first sig label is applied (eg: sig cluster lifecycle…”upgrade failed”), initial triage is done, and it’s on to next sig(s) and back in need of their triage.

 

-- 

Tim Pepper

Orchestration & Containers Lead

VMware Open Source Technology Center

Tim Hockin

Mar 8, 2019, 1:50:59 PM
to Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads
Aside: how fun to have a thread between Tim, Tim, and Tim. :)

On Fri, Mar 8, 2019 at 10:37 AM Tim Pepper <tpe...@vmware.com> wrote:
>
> I’m not sure one sig ack’ing an issue is enough. As I think about the state diagram it feels like doing this well is going to mean removing a triaged (past-tense) label upon removal of a sig label and upon addition of a sig label, or…it drives at per-sig triaged labels.

I disagree. An ACK means a human has looked at it and confirmed it's
a real thing. If I have an issue that crosses SIGs, I will consult
with those SIGs before ACKing. But even if I unilaterally ACK, that
should be enough -- I am asserting with confidence that it is real.
It may need more input from another SIG, but it's a real thing.

That said, I don't care if we had per-sig ACKs, but that seems like
process for its own sake. I don't think it's needed.

> Often a first sig label is applied (eg: sig cluster lifecycle…”upgrade failed”), initial triage is done, and it’s on to next sig(s) and back in need of their triage.

Yeah, I would not ACK it if it is not legitimately my SIG's issue.

Tim

Tim Hockin

Mar 8, 2019, 2:14:08 PM
to Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads
Four of a kind! We win the game!

On Fri, Mar 8, 2019 at 10:56 AM Tim Allclair <tall...@google.com> wrote:
>
> I just wanted to weigh in because ... we've recently been discussing how to better handle issue triage in SIG-Auth. We're not looking at anything too advanced right now, but plan on adding a triage section to our bi-weekly meetings, and ensure that all issues created since the previous meeting have been triaged.
>
> -- Tim

Sahdev Zala

Mar 8, 2019, 2:57:19 PM
to Tim Hockin, Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads

LOL, very cool. Tim is all over in the news too :-) 


I like the idea as mentioned on the slack but trying to understand more:


So currently we have a `needs-sig` label that is applied automatically to new issues. Once a `sig/*` label is applied, `needs-sig` goes away. Can we use that same label, or rename it to `needs-sig-ack`, and, instead of removing it automatically, keep it until at least one SIG removes it manually? (That way we don't create a new label, and we keep it to one step instead of two.)


Secondly, once the proposed `needs-sig-ack` is removed manually by a SIG, an issue will look like it does today, with one or more sig labels, right? Shouldn't the removal of the `needs-sig-ack` label reflect the acknowledged ownership of a particular SIG in the issue, so that a SIG can easily search those issues later on to work on their true backlog? (I assume ACK doesn't mean the SIG will work on the issue right away.)
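
To make the label bookkeeping concrete, here is a rough sketch. It assumes the behavior described above (`needs-sig` is present only while no `sig/*` label is) plus the proposed `needs-sig-ack`, which automation adds once and only a human removes. The function name and the exact rules are illustrative, not the real bot's logic:

```python
def reconcile(labels: set[str], is_new: bool = False) -> set[str]:
    """Return the labels an issue should carry after automation runs.

    Assumed rules (illustrative, not the real bot's behavior):
      * needs-sig is present exactly when there is no sig/* label.
      * needs-sig-ack is added once, when the issue is opened, and is
        never removed here; only a human ACK removes it.
    """
    result = set(labels)
    has_sig = any(label.startswith("sig/") for label in result)
    if has_sig:
        result.discard("needs-sig")
    else:
        result.add("needs-sig")
    if is_new:
        result.add("needs-sig-ack")
    return result


new_issue = reconcile(set(), is_new=True)
routed = reconcile(new_issue | {"sig/network"})
acked = reconcile(routed - {"needs-sig-ack"})  # a human removed the label
print(sorted(new_issue), sorted(routed), sorted(acked))
```

Keeping the two labels separate means routing (`needs-sig`) and acknowledgement (`needs-sig-ack`) can change independently, which is the two-step flow the question is weighing against a single renamed label.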


Thanks! 


Regards,

Sahdev Zala


Tim Hockin

Mar 8, 2019, 3:04:35 PM
to Sahdev Zala, Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads
On Fri, Mar 8, 2019 at 11:57 AM Sahdev Zala <sahdev...@gmail.com> wrote:
>
> LOL, very cool. Tim is all over in the news too :-)
>
>
> I like the idea as mentioned on the slack but trying to understand more:
>
>
> So currently we have a `needs-sig` label that applies automatically to new issues. Once a `sig/ *` label is applied, the `needs-sig` goes away. Can we use the same label as such or by renaming it with `needs-sig-ack`, and, instead of removing it automatically, keep it until at least one sig remove it manually? (so that we don’t create a new label, and keep it to one step instead of two steps).

In theory, yes. I think it's a little less clear, but I would roll
with it if that is what contribex thought best. Personally I like the
needs-* concept as single-purpose instructions. Needs a kind, needs a
sig, needs an ACK...

> Secondly, once the proposed `needs-sig-ack` removed manually by a SIG, an issue will look like what it looks like today with one or more sig labels right? Shouldn’t the removal of `needs-sig-ack` label reflect the acknowledged ownership of a particular SIG in the issue? So that a SIG can easily search those issues later on to work on their true backlog? (I assume ack doesn’t mean the SIG will work on the issue right away).

In my mind ACK means "we agree this is a valid bug report or RFE or
whatever". That includes "we did a repro and confirmed the bug" and
"this idea is not fundamentally impossible". It does not imply any
priority or timeline -- we have other labels for those things.

Stephen Augustus

Mar 8, 2019, 3:34:10 PM
to Tim Hockin, Sahdev Zala, Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads
I'm trying to look at it from two lenses: the SIG and the submitter/a passerby.
There's a difference between "Has someone even seen this?" and "Is this a valid issue/RFE?".

From the perspective of the user, I care about if something is even getting attention.
A bot could drop by and say, "Hey. Thanks for submitting. Your foo is pending triage/attention from a Kubernetes SIG member" or some such --> apply the label

Again from the perspective of the user, I'd want some sort of acknowledgement that something is happening behind the scenes.
"Hey! We got your foo and we're going to take a look" --> soft prio the issue, close if it's clear it's support, ensure SIG, milestone, priority labels are attached, assign if it's clear who it belongs to, else add help-wanted, and optionally good-first-issue, and remove the needs-{triage,attention,ack} label
(This could be made more accessible if there was a triage dashboard a la gubernator.)

Now the user knows there's some sort of movement.

From there, now that the issue/RFE is labeled appropriately, the SIG can jump in and give a deeper inspection, reprio as necessary, move to a KEP, close outright, or apply some process like the SIG Cluster Lifecycle grooming guidance: https://github.com/kubernetes/community/blob/master/sig-cluster-lifecycle/grooming.md

That first soft prio could potentially be done by any contributor with just enough context to validate and route to the appropriate SIG/people, which would alleviate the burden on some of the top-level reviewers/approvers.

As far as grooming goes, I'm planning to make that the first part of the SIG Azure, Release, and PM meetings moving forward.

-- Stephen


Sahdev Zala

Mar 8, 2019, 3:34:16 PM
to Tim Hockin, Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads

On Fri, Mar 8, 2019 at 3:04 PM 'Tim Hockin' via kubernetes-sig-contribex <kubernetes-s...@googlegroups.com> wrote:

On Fri, Mar 8, 2019 at 11:57 AM Sahdev Zala <sahdev...@gmail.com> wrote:
>
> LOL, very cool. Tim is all over in the news too :-)
>
>
> I like the idea as mentioned on the slack but trying to understand more:
>
>
> So currently we have a `needs-sig` label that applies automatically to new issues. Once a `sig/ *` label is applied, the `needs-sig` goes away. Can we use the same label as such or by renaming it with `needs-sig-ack`, and, instead of removing it automatically, keep it until at least one sig remove it manually?  (so that we don’t create a new label, and keep it to one step instead of two steps).

In theory, yes.  I think it's a little less clear, but I would roll
with it if that is what contribex thought best.  Personally I like the
needs-* concept as single-purpose instructions.  Needs a kind, needs a
sig, needs an ACK...

Yes, I did think that it is less clear, but thought it was worth thinking it through to see if we can avoid creating a new label.  I do not have a strong opinion and I am good with whatever we end up deciding.

> Secondly, once the proposed `needs-sig-ack` removed manually by a SIG, an issue will look like what it looks like today with one or more sig labels right? Shouldn’t the removal of `needs-sig-ack` label reflect the acknowledged ownership of a particular SIG in the issue? So that a SIG can easily search those issues later on to work on their true backlog? (I assume ack doesn’t mean the SIG will work on the issue right away).

In my mind ACK means "we agree this is a valid bug report or RFE or
whatever".  That includes "we did a repro and confirmed the bug" and
"this idea is not fundamentally impossible".  It does not imply any
priority or timeline -- we have other labels for those things.
 OK, as long as ACK clearly gives ownership of an issue to a particular SIG which is easily searchable, that's good. 
 

Tim Hockin

Mar 8, 2019, 3:53:42 PM
to Stephen Augustus, Sahdev Zala, Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads
On Fri, Mar 8, 2019 at 12:34 PM Stephen Augustus <Ste...@agst.us> wrote:
>
> I'm trying to look at it from two lenses: the SIG and the submitter/a passerby.
> There's a difference between "Has someone even seen this?" and "Is this a valid issue/RFE?".
>
> From the perspective of the user, I care about if something is even getting attention.
> A bot could drop by and say, "Hey. Thanks for submitting. Your foo is pending triage/attention from a Kubernetes SIG member" or some such --> apply the label
>
> Again from the perspective of the user, I'd want some sort of acknowledgement that something is happening behind the scenes.
> "Hey! We got your foo and we're going to take a look" --> soft prio the issue, close if it's clear it's support, ensure SIG, milestone, priority labels are attached, assign if it's clear who it belongs to, else add help-wanted, and optionally good-first-issue, and remove the needs-{triage,attention,ack} label
> (This could be made more accessible if there was a triage dashboard a la gubernator.)

I don't think someone outside the sig has the ability to assess prio
properly in many cases, nor the good-first-issue-ness. This is not
"triage" this is routing. I very much value the work of routing and
getting issues to (possibly) relevant SIGs, but issue filers should
not interpret that as anything other than "it's in the hopper".

Users can kind of see that today with the change from needs-sig to
sig/foo right?

> Now the user knows there's some sort of movement.
>
> From there, now that the issue/RFE is labeled appropriately, the SIG can jump in and give a deeper inspection, reprio as necessary, move to a KEP, close outright apply some process like the SIG Cluster Lifecycle grooming guidance: https://github.com/kubernetes/community/blob/master/sig-cluster-lifecycle/grooming.md
>
> That first soft prio could potentially be done by any contributor with just enough context to validate and route to the appropriate SIG/people, which alleviate the burden on some of the top-level reviewer/approvers.

Yep - I just need some way to track whether that has or has not been
done yet without reading every PR. Github labels are approximately
the only mechanism we have, right?

You're fleshing out my hypothesis for me. A state machine for this
EXISTS. We even follow it most of the time. We just have never
actually written it down. We should actually draw a state diagram for
it and indicate at which states various categories of labels are
applied and what the meaning is.

Tim Hockin

Mar 8, 2019, 3:56:12 PM
to Sahdev Zala, Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads
On Fri, Mar 8, 2019 at 12:34 PM Sahdev Zala <sahdev...@gmail.com> wrote:
>
> On Fri, Mar 8, 2019 at 3:04 PM 'Tim Hockin' via kubernetes-sig-contribex <kubernetes-s...@googlegroups.com> wrote:
>>
>> On Fri, Mar 8, 2019 at 11:57 AM Sahdev Zala <sahdev...@gmail.com> wrote:
>> >
>> > LOL, very cool. Tim is all over in the news too :-)
>> >
>> >
>> > I like the idea as mentioned on the slack but trying to understand more:
>> >
>> >
>> > So currently we have a `needs-sig` label that applies automatically to new issues. Once a `sig/ *` label is applied, the `needs-sig` goes away. Can we use the same label as such or by renaming it with `needs-sig-ack`, and, instead of removing it automatically, keep it until at least one sig remove it manually? (so that we don’t create a new label, and keep it to one step instead of two steps).
>>
>> In theory, yes. I think it's a little less clear, but I would roll
>> with it if that is what contribex thought best. Personally I like the
>> needs-* concept as single-purpose instructions. Needs a kind, needs a
>> sig, needs an ACK...
>
>
> Yes, I did think that it is less clear but thought it’s worth to think around it and see if we can avoid creating a new label. I do not have a strong opinion and I am good with what we end up deciding.
>>
>>
>> > Secondly, once the proposed `needs-sig-ack` removed manually by a SIG, an issue will look like what it looks like today with one or more sig labels right? Shouldn’t the removal of `needs-sig-ack` label reflect the acknowledged ownership of a particular SIG in the issue? So that a SIG can easily search those issues later on to work on their true backlog? (I assume ack doesn’t mean the SIG will work on the issue right away).
>>
>> In my mind ACK means "we agree this is a valid bug report or RFE or
>> whatever". That includes "we did a repro and confirmed the bug" and
>> "this idea is not fundamentally impossible". It does not imply any
>> priority or timeline -- we have other labels for those things.
>
> OK, as long as ACK clearly gives ownership of an issue to a particular SIG which is easily searchable, that's good.

I think the cases where 2 SIGs are tagged and one ACKs and the other
disagrees are going to be vanishingly few. So I am not super worried
about it. I predict 99.9% of cases will be one SIG label and no
problems :)

Tim Hockin

Mar 15, 2019, 12:18:06 PM
to Sahdev Zala, Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads
Ping. A week later - any objections?

Anyone who knows the existing system have time to collab on a more
thorough design?

Tim St. Clair

Mar 15, 2019, 12:21:37 PM
to Tim Hockin, Sahdev Zala, Tim Allclair, Tim Pepper, kubernetes-sig-contribex, kubernetes-sig-leads
No objections, but maybe a requirement.  

If we add some new labels, can we get rid of some old ones? 

Cheers,
Tim 

Daniel Smith

Mar 15, 2019, 12:22:23 PM
to Tim Hockin, Sahdev Zala, Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads
I haven't completely read this thread, but I'd love a standardized triage system, one that works for PRs as well as issues.

We triage new issues and PRs by looking at things that are tagged SIG API Machinery and have a number greater than the last one we stopped at. This is better than nothing, but we miss PRs/issues where the sig label is added after we have already passed that number.
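
The gap in number-based triage can be shown with a toy model. This is only a sketch: the issue numbers are invented, `needs-sig-ack` is the label proposed earlier in the thread rather than an existing one, and the helper names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    number: int
    labels: set = field(default_factory=set)


def by_high_water_mark(issues, sig_label, last_seen):
    """Number-based triage: only items numbered above the last one handled."""
    return [i.number for i in issues if sig_label in i.labels and i.number > last_seen]


def by_pending_label(issues, sig_label, pending="needs-sig-ack"):
    """Label-based triage: anything for the SIG still carrying the pending label."""
    return [i.number for i in issues if sig_label in i.labels and pending in i.labels]


SIG = "sig/api-machinery"
issues = [
    Issue(100, {"needs-sig-ack"}),       # not routed to the SIG yet
    Issue(101, {SIG, "needs-sig-ack"}),
]
# The triage session handles #101 and records it as the high-water mark.
issues[1].labels.discard("needs-sig-ack")
last_seen = 101
# Later, someone adds the SIG label to the older issue #100.
issues[0].labels.add(SIG)

missed = by_high_water_mark(issues, SIG, last_seen)
found = by_pending_label(issues, SIG)
print(missed, found)
```

The number-based query returns nothing because #100 sits below the mark, while the label-based query still surfaces it.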

If we had a normalized process, we could actually e.g. have the SIG allocate a triage role or roles and do a much better job here.

Niko Penteridis

Mar 15, 2019, 2:18:08 PM
to kubernetes-sig-contribex
I'm the lead of Issue Triage for the 1.13 and 1.14 release teams, and have been researching triage workflows and experimental spreadsheet automation for the last 6 months.

Overall, the workflow that is currently in place is only being kept alive by the active and manual work of many individuals passing through the Release Team subteams every 3 months, with a lot of help from long-term core contributors such as liggitt, dims, tpepper, sttts etc. This workflow is not normalized/standardized and is generally guideline-based, loosely dependent on recurring lazy consensus and existing test-infra automation (k8s-ci-bot, fejta-bot, etc).

It's impossible to sift through the ~2100 issues on k/k, as there are so many mis-labeled ones, support requests that shouldn't be there, etc. This can be eventually fixed, in a collaborative way throughout all SIGs.

My conclusion is that much of the work done by the Release Team could be automated as-is into a workflow that is universally followed and agreed to by all SIGs. That is, a mechanism that works for everyone, with clear paths from point A to point B and vice versa.

I have come up with a series of improvements and a proposed prototype workflow for issue triaging, its starting point being the label 'needs-sig-triage' as proposed by thockin. On trying various external tooling/spreadsheet mechanisms, the verdict is that it's too cumbersome to create and maintain a dependence on external tooling. The 'spreadsheet' mechanism already exists, and it's called Github labels + queries + project boards, which is the main theme of the following improvements:

1. All issues hitting k/k are auto-labeled as 'needs-sig-review' or something similar.

 

2. SIGs are tasked by definition to regularly search all issues and appropriately label/categorize them. There seems to be good consensus for this here.

 

3. Each SIG has a dedicated project/Kanban board, where current, upcoming, and milestoned work is visible at a glance, with columns like Backlog, In Progress, Release-Blocking, etc.

Case in point: https://github.com/orgs/kubernetes/projects/8 , the SIG-Windows board has worked great, both for them and release issue triage.

 

4. After the SIG reviews the new ticket (issue), it gets an appropriate category, either via direct labels or via Project Board automated labels. thockin suggested the use of `triage` labels which are a bit legacy and should be reworked in tandem with project boards to have the desired workflow.
An example on a project board being: issues moved from 'backlog' to 'in progress' automatically get a 'triage/inprogress' label (or something similar). Label automation + project boards + search queries should all integrate seamlessly and complement each other in the final iteration of the new workflow.

 

5. Release team specific: Based on all above, incoming 'milestoned' work is work that belongs to SIGs and it should be a SIG's responsibility to control and estimate what can be done for each release cycle, with the release team stepping in only when needed (as release approaches). Standard calendar checkpoints in release-readiness will further help - this is what the 'Enhancements Deadline' stands for, but doesn't cover stuff outside of new features and that work is usually left for the release team to ponder upon their fate.

6. Therefore, the prototype flowchart is: New Ticket -> SIG -> Labeling or Deletion <-> Project Boards <-> Re-labeling based on current status <-> Release Team is able to view status at any time via project boards

 
7. For all above, mass rework of labels is needed.
'priority' labels are a subject of discussion in every release cycle as it's a fuzzy concept in itself, it should be reworked with ideas such as 'impact' and 'importance' in mind,
'triage' labels are a bit old and currently mostly unused but can be very helpful if properly reworked and integrated into a standard system,
'kind' labels can be further reworked as there are many issues that do not belong in any current 'kind',
deletion of unwanted labels or re-work into other ones,
addition of new labels like 'needs-sig-review', 'release-blocking', etc.
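
The board-to-label idea in item 4 could look roughly like the sketch below. The column names come from item 3; the labels are the ones floated in this thread and may not exist, and the mapping and function name are purely hypothetical.

```python
# Hypothetical mapping from a board column to the label it implies.
# None means the column does not imply any label.
COLUMN_LABELS = {
    "Backlog": None,
    "In Progress": "triage/inprogress",
    "Release-Blocking": "kind/release-blocking",
}


def labels_after_move(labels, old_column, new_column):
    """Return the issue's labels after its card moves between columns.

    Drops the label implied by the old column and adds the one implied by
    the new column, leaving every other label untouched.
    """
    updated = set(labels)
    old_label = COLUMN_LABELS.get(old_column)
    new_label = COLUMN_LABELS.get(new_column)
    if old_label is not None:
        updated.discard(old_label)
    if new_label is not None:
        updated.add(new_label)
    return updated


before = {"sig/windows", "kind/bug"}
during = labels_after_move(before, "Backlog", "In Progress")
back = labels_after_move(during, "In Progress", "Backlog")
print(sorted(during), sorted(back))
```

Keeping the mapping in one table means the board, the labels, and the search queries all derive from a single definition.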



Other generic improvements include:

 

- Mechanism that auto-applies milestone in PRs that are merged out of code freeze, so the full list of PRs included in 1.14 is easily grepped (issue is here: https://github.com/kubernetes/test-infra/issues/11611)

 

- Label that signifies a PR that is changing something outside of core k/k, whether it's testing/CI/automation/external bundles like fluentd-gcp et cetera. Currently there's only a `kind/cleanup` which is rather vague. Label variety should be encouraged - with proper standardization, good ruling and automation around them they can be easily understood and utilized.

 

- Labels that indicate whether a ticket is release-blocking or good-to-have, e.g. (kind/release-blocking | kind/good-to-have)

 

- Label + mechanism that automatically shifts a ticket to the next milestone a few days after Freeze hits - this automates punting of 'good-to-have' stuff to the next milestone

 

- Any ticket in the release-blocking column of a board automatically gets a kind/release-blocking label - this way, anyone can search github issues and PRs via `label:kind/release-blocking+milestone:v1.14` query
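
The milestone-punting bullet above can be sketched as a small rule. Everything here is an assumption for illustration: the `kind/good-to-have` label is only a suggestion from this list, the grace period is arbitrary, and the freeze date is a placeholder rather than the real 1.14 date.

```python
from datetime import date, timedelta


def next_milestone(milestone: str) -> str:
    """'v1.14' -> 'v1.15'. Assumes the vMAJOR.MINOR naming used in this thread."""
    major, minor = milestone.lstrip("v").split(".")
    return f"v{major}.{int(minor) + 1}"


def punt(milestone, labels, today, freeze, grace_days=3):
    """Return the milestone an item should carry.

    Items labeled kind/good-to-have move to the next milestone once
    `grace_days` have passed since code freeze; everything else stays put.
    """
    past_grace = today >= freeze + timedelta(days=grace_days)
    if past_grace and "kind/good-to-have" in labels:
        return next_milestone(milestone)
    return milestone


freeze = date(2019, 3, 7)  # placeholder date for illustration only
print(punt("v1.14", {"kind/good-to-have"}, date(2019, 3, 11), freeze))
print(punt("v1.14", {"kind/release-blocking"}, date(2019, 3, 11), freeze))
print(punt("v1.14", {"kind/good-to-have"}, date(2019, 3, 8), freeze))
```

Only the first call moves the item; release-blocking work and anything still inside the grace window keep their milestone.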



I'm free to collab with @thockin on a new system and anyone else interested in it. I believe this falls into SIG-pm or RelEng @justaugustus ?

Bowei Du

Mar 15, 2019, 2:20:12 PM
to Daniel Smith, Tim Hockin, Sahdev Zala, Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads
We do have a project board for SIG-Network: https://github.com/orgs/kubernetes/projects/10.

It is at least useful for managing by hand during a triage session.

Bowei

Niko Penteridis

Mar 15, 2019, 2:56:51 PM
to Bowei Du, Daniel Smith, Tim Hockin, Sahdev Zala, Tim Allclair, Tim Pepper, Tim St. Clair, kubernetes-sig-contribex, kubernetes-sig-leads
(sending this again to reply-all)

Tim Hockin

Mar 18, 2019, 12:22:55 PM
to Niko Penteridis, kubernetes-sig-contribex
On Fri, Mar 15, 2019 at 11:18 AM Niko Penteridis
<nickpen...@gmail.com> wrote:
>
> I'm the lead of Issue Triage for the 1.13 and 1.14 release teams, and have been researching triage workflows and experimental spreadsheet automation for the last 6 months.
>
> Overall, the workflow that is currently in place is only being held alive by the active and manual work of many individuals passing through the Release Team subteams every 3 months, with a lot of help by long-term core contributors such as liggitt, dims, tpepper, sttts etc. This workflow is not normalized/standardized and is generally guideline based, loosely dependent on re-occurring lazy consensus and existing test-infra automation (k8s-ci-bot, fejta-bot, etc).
>
> It's impossible to sift through the ~2100 issues on k/k, as there are so many mis-labeled ones, support requests that shouldn't be there, etc. This can be eventually fixed, in a collaborative way throughout all SIGs.
>
> My conclusion is that much of the work done by the Release Team could be automated as-is into a workflow that is universally followed and agreed by all SIGs. That is, a mechanism that works for everyone and there are clear paths from point A to point B and vice versa.
>
> I have come up with a series of improvements and a proposed prototype workflow for issue triaging, its starting point being the label 'needs-sig-triage' as proposed by thockin. On trying various external tooling/spreadsheet mechanisms, the verdict is that it's too cumbersome to create and maintain a dependence on external tooling. The 'spreadsheet' mechanism already exists, and it's called Github labels + queries + project boards, which is the main theme of the following improvements:
>
> 1.
>
> All issues hitting K/K are auto-labeled as 'needs-sig-review' or something similar.

I need this label fairly urgently - any objections to expediting the
addition of this label definition, for manual use while the process
gets sorted?

> 2.
>
> SIGs are tasked by definition to regularly search all issues and appropriately label them / categorize them - There seems to be good consensus for this here
>
>
>
> 3. Each SIG has a dedicated project/Kanban board each, where visibility of current and upcoming work and milestoned work is very, very visible with a quick glance - columns like Backlog, In Progress, Release-Blocking, etc..
>
> Case in point: https://github.com/orgs/kubernetes/projects/8 , the SIG-Windows board has worked great, both for them and release issue triage.
>
>
>
> 4. After the SIG reviews the new ticket (issue), it gets an appropriate category - either via direct labels or via Project Board automated labels. thockin suggested the use of `triage` labels, which are a bit legacy and should be reworked in tandem with project boards to get the desired workflow.

To be clear: I find the existing triage labels to be pretty
meaningless. My understanding of the word triage is that it is purely
about intake. Is this real? Is it actively on fire, smoldering but
no flames yet, or just an annoyance? I think we have this
information, more or less with the kind/* and priority/* labels. If
we need to make those more precise, I am in favor, but I'd just as
soon nuke the triage/* labels.

> An example on a project board being: issues moved from 'backlog' to 'in progress' automatically get a 'triage/inprogress' label (or something similar). Label Automation + Project Boards + Search Queries should all have seamless integration and complement each other in the final iteration of the new workflow.

I don't think this is "triage". Once an issue enters the backlog,
triage should be complete. There may be need for a triage-in-progress
label for that time period between being looked at by SIG and being
decided by SIG, but I am not 100% on that need.

We should draw the state diagram. This doesn't consider milestones,
release-blockers, security, etc, but it's a start.

```
                    |
               open bug/PR
                    |
                    V
  WAITING-ROOM: needs-sig, needs-triage
          |           ^
    (assign SIG)      |
          |           |
          V           |
--> TRIAGE: needs-triage   <----
|        /       \             |
|  (close with  (verify)       |
|   reason)         |          |
|      |            V          |
-- CLOSED       BACKLOG: kind/*, priority/*
                    |
            (assign or claim)
                    |
                    V
          IN-PROGRESS: assignee
```
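
The diagram above could be written down as a small transition table, as a rough sketch. The state names and labels come from the diagram. The action names are hypothetical, and the three unlabeled back-arrows are read here as "unassign-sig", "reopen" and "retriage", which is an assumption about what they mean.

```python
from enum import Enum


class State(Enum):
    WAITING_ROOM = "waiting-room"
    TRIAGE = "triage"
    CLOSED = "closed"
    BACKLOG = "backlog"
    IN_PROGRESS = "in-progress"


# (current state, action) -> next state.
# Action names are hypothetical; the back-arrow names are an
# assumed reading of the unlabeled arrows in the diagram.
TRANSITIONS = {
    (State.WAITING_ROOM, "assign-sig"): State.TRIAGE,
    (State.TRIAGE, "unassign-sig"): State.WAITING_ROOM,
    (State.TRIAGE, "close"): State.CLOSED,
    (State.TRIAGE, "verify"): State.BACKLOG,
    (State.CLOSED, "reopen"): State.TRIAGE,
    (State.BACKLOG, "retriage"): State.TRIAGE,
    (State.BACKLOG, "assign"): State.IN_PROGRESS,
}

# Marker labels the diagram attaches to each state. BACKLOG carries
# human-chosen kind/* and priority/* labels, so none are fixed here.
STATE_LABELS = {
    State.WAITING_ROOM: {"needs-sig", "needs-triage"},
    State.TRIAGE: {"needs-triage"},
    State.CLOSED: set(),
    State.BACKLOG: set(),
    State.IN_PROGRESS: set(),
}


def step(state, action):
    """Return the next state, or raise ValueError if the move is not in the table."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"{action!r} is not allowed from {state.name}") from None
```

For example, `step(State.WAITING_ROOM, "assign-sig")` returns `State.TRIAGE`, while `step(State.WAITING_ROOM, "assign")` raises, since nothing goes straight from the waiting room to in-progress.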

> 5. Release team specific: Based on all above, incoming 'milestoned' work is work that belongs to SIGs and it should be a SIG's responsibility to control and estimate what can be done for each release cycle, with the release team stepping in only when needed (as release approaches). Standard calendar checkpoints in release-readiness will further help - this is what the 'Enhancements Deadline' stands for, but it doesn't cover work outside of new features, and the fate of that work is usually left for the release team to ponder.
>
> 6. Therefore, the prototype flowchart is: New Ticket -> SIG -> Labeling or Deletion <-> Project Boards <-> Re-labeling based on current status <-> Release Team is able to view status at any time via project boards
>
>
> 7. For all above, mass rework of labels is needed.
> 'priority' labels are a subject of discussion in every release cycle as it's a fuzzy concept in itself, it should be reworked with ideas such as 'impact' and 'importance' in mind,
> 'triage' labels are a bit old and currently mostly unused but can be very helpful if properly reworked and integrated into a standard system,
> 'kind' labels can be further reworked as there are many issues that do not belong in any current 'kind',
> deletion of unwanted labels or re-work into other ones,
> addition of new labels like 'needs-sig-review', 'release-blocking', etc.

I agree with all of this, but I don't think any of it should block
needs-triage (or whatever we call it) from being defined and applied
to new issues.
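
Once such a label exists, a SIG's untriaged queue is just a label search. A minimal sketch, assuming the label ends up named 'needs-triage' (the name is still under discussion) and using standard GitHub issue-search qualifiers; the function name is hypothetical:

```python
def triage_queue_query(sig, repo="kubernetes/kubernetes", label="needs-triage"):
    """Build a GitHub issue-search string for one SIG's untriaged open issues.

    `label` is an assumption: the final label name is not settled in this thread.
    """
    terms = [
        f"repo:{repo}",
        "is:issue",
        "is:open",
        f"label:sig/{sig}",
        f"label:{label}",
    ]
    return " ".join(terms)
```

Calling `triage_queue_query("network")` gives a string that can be pasted into the issue search box.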

> I'm free to collab with @thockin on a new system and anyone else interested in it. I believe this falls into SIG-pm or RelEng @justaugustus ?

I am happy to consult or even implement if needed, within bounds :) I
want this label ASAP. :)

Tim

Niko Penteridis

Mar 18, 2019, 2:15:34 PM3/18/19
to Tim Hockin, kubernetes-sig-contribex, Aaron Crickenberger, ste...@agst.us, co...@google.com
Opened https://github.com/kubernetes/test-infra/pull/11818 that adds a 'needs-review' label for this (I think this will add the automation needed).

Name open to change/discussion. Chose neither 'needs-triage' due to potential confusion with existing triage labels (which need nuking or rework) nor 'needs-sig-review' due to how the 'needs-sig' regex works.

On 'triage' rework, replacing it with the existing `lifecycle` would be good - there's already 'lifecycle/active', which complements existing automation (lifecycle/stale etc). It can be auto-labeled when someone is assigned to an issue/PR.
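
A rough sketch of that auto-labeling rule, as a pure function over an issue's labels and assignees. The function name is hypothetical, and dropping 'lifecycle/active' again when the last assignee is removed is an assumption, not something decided here:

```python
def labels_after_assignment(labels, assignees):
    """Return the label set after syncing 'lifecycle/active' with assignment."""
    updated = set(labels)
    if assignees:
        # Someone is assigned: mark the issue/PR as actively worked on.
        updated.add("lifecycle/active")
    else:
        # Assumption: nobody assigned means the issue is no longer active.
        updated.discard("lifecycle/active")
    return updated
```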

Will gradually open tickets for other proposals and/or discuss them in sig-release and/or the 1.14 retro.

Tim Hockin

Mar 18, 2019, 2:18:43 PM3/18/19
to Niko Penteridis, kubernetes-sig-contribex, Aaron Crickenberger, Stephen Augustus, Cole Wagner
Awesome. Comments on PR. Thanks.

On Mon, Mar 18, 2019 at 11:11 AM Niko Penteridis