Time for a recap now this discussion has died down a little.
TL;DR: I would like to experiment with this for the 46 cycle to see how
this works in practice. If we find it has negative effects, we can stop
doing it (immediately; we don't need to wait for the end of the
cycle...). If we find it has broadly positive effects (ie doesn't lead
to more breakage than usual), we can look at broadening the set of
things we uplift without going through the formal approval process we
currently use.
On 20/11/2015 19:22, Gijs Kruitbosch wrote:
> I would like to propose that we eliminate the explicit aurora approval
> requirement for changesets that meet ALL of the following criteria:
> - only change CSS and/or image files
> - are landing in the first 4 weeks of the aurora cycle;
> - have landed on m-c and stuck
On 20/11/2015 19:33, Robert Kaiser wrote:
> How many of those do we have every cycle (i.e. is it enough that it
> gives us a significant benefit and even warrants discussion)?
I don't know. I tried to find out, and it's hard to get that
information. I'm fairly sure it's a comparatively small (<10%) subset of
the patches that gets uplift, though.
I think it warrants discussion because I think there is broad agreement
from relmgmt, QE and engineers that the current status quo is not ideal,
but less agreement on what we should be aiming for instead. I'd like us
to experiment and evolve our strategy here. This is not me personally
going "ugh, I wish the patches I wrote didn't have to do this approval
dance". This is me going "I wonder if this is a good set of things to
use to evolve how we deal with approvals" (though full disclosure,
because of the type of patches I often write, this probably does benefit
me personally more than it will some other people). I'll get back to
this point replying to some of the other posts further downthread.
> What's the definition of "have stuck"? For how long?
"landed on m-c without being backed out". I don't really want to use
extra criteria like "landed and stuck for X days" because I think the
criteria are already pretty narrow, and adding verification burden here
is not going to help materially.
On 20/11/2015 19:34, Chris Peterson wrote:
> Why not allow any code change to be uplifted to Aurora based on the
> developer's discretion? Or allow a patch's reviewer to a+ for Aurora
> uplift?
I think "developer's discretion" alone is not likely to be conducive to
stability. I think patch reviewer and/or third person review by another
person (not necessarily relman) might be interesting to explore, but
still more high-risk than what I am suggesting. As an experiment, I'd
like to start with what I originally suggested. We can evaluate after 1
cycle (or before if it blows up in our face).
On 20/11/2015 19:55, Liz Henry (:lizzard) wrote:
> I could be ok with us outlining this kind of criteria and letting
> developers know they can land their own changes in some cases.
> Verifying the fix on m-c (by someone other than the developer) might
> help too.
>
> But, what those cases are, I'm not entirely sure. I think there have
> been css only uplifts that cause significant regressions. Now, me
> reviewing them isn't going to catch those regressions. But my being
> aware of a cluster of uplift requests can mean I know to ask QE for
> extra testing in particular areas. I factor the uplift action into a
> more holistic view of what's happening across the board for a release
> and the amount of change and risk we're looking at. That isn't always
> visible to the developers requesting uplift.
>
> Gijs you mention devs not being "required" to do this. I wonder if
> this process might normalize uplifting things that could just as well
> ride the trains.
>
> The value of asking people to ask for uplift is sometimes in that they
> take their changes more seriously. Often, people talk to me first on
> irc before even asking, and during that conversation they find more
> work to do, they think about testing or verifying fixes, or I uncover
> more uplifts that need to happen for their work to land on aurora.
I think all of these paragraphs point to a single (valid!) concern:
uplifting things increases instability. We can contain the risk of
instability by not uplifting things, or at least by asking for automated
tests and/or manual QE, or restricting what things we uplift (interface
changes, patch size, l10n concerns, etc. (l10n being a bit special
because there are other reasons besides stability)).
But as dbaron and past have pointed out (and as I've discussed with Liz
privately), realistically part of the issue is that we often don't get
QE verification or crash-rate-combatting until we get to beta/release.
That means nightly and, to a slightly lesser degree, aurora, are both
not particularly stable, and that that stability is not completely
dependent on how strict we are with uplifts.
It's very hard to assess the success of the current system at increasing
stability because there is no way to know "what would have been". We
don't know if developers or relman are overly cautious or not
cautious/strict enough, unless we experiment. Because nobody seems
particularly happy with the current state of things, I would like us to
experiment. :-)
So perhaps my original phrasing was clumsy. I would like us to
experiment with being "more permissive" about uplifts, and I'm
purposefully starting out with a subset of patches that I suspect are
lower risk, and where there is a clear secondary benefit (ie devedition
theming).
I believe it's probably equally useful to take a hard look at improving
quality on nightly and aurora through other means. I'll open a separate
thread about that.
~ Gijs