Proposal: approval-by-default for aurora for CSS/image-only changes in the first 4 weeks of each aurora cycle after successful m-c landing

61 views
Skip to first unread message

Gijs Kruitbosch

unread,
Nov 20, 2015, 2:22:37 PM11/20/15
to Firefox Dev, release-mgmt
Hello,

I would like to propose that we eliminate the explicit aurora approval
requirement for changesets that meet ALL of the following criteria:
- only change CSS and/or image files
- are landing in the first 4 weeks of the aurora cycle;
- have landed on m-c and stuck

Why would this not cause horror and chaos:
- I don't recall ever seeing this type of changes be denied aurora uplift;
- I don't recall ever seeing this type of changes be backed out of
aurora (but stay on m-c) after uplift because they broke the tree;
- There would be no *obligation* to land changes on aurora (just the
possibility to do so) - so if we refactor half the theme, engineers
should be sensible enough not to uplift based just on this rule and
either let things ride the train or request explicit approval;
- We already effectively have test-only auto-approval for intermittent
test fixes that worked on Nightly, so there is precedent that has been
pretty problem-free to the best of my knowledge;

Why do this?
- Reduce process overhead. I don't think that the time I spend writing
the approval requests and the time relman spends reviewing them is useful;
- It would help streamline maintenance of the devedition theme, where we
often don't find issues until right after merge day;
- Reduce the time theme fixes take to reach general release audiences;
- I have repeatedly seen mention that folks handling the approvals are
overloaded, and this would lighten that burden a little, which would
help with response time for other requests.


How do people feel about this? Are there reasons not to do this that
I've missed? Would people want to argue that there should be even fewer
restrictions and/or are there other sets of changes that we could do
this with?

Gijs

(please follow-up on firefox-dev)
_______________________________________________
firefox-dev mailing list
firef...@mozilla.org
https://mail.mozilla.org/listinfo/firefox-dev

Robert Kaiser

unread,
Nov 20, 2015, 2:33:38 PM11/20/15
to Firefox Dev, release-mgmt
Gijs Kruitbosch schrieb:
> I would like to propose that we eliminate the explicit aurora approval
> requirement for changesets that meet ALL of the following criteria:
> - only change CSS and/or image files
> - are landing in the first 4 weeks of the aurora cycle;
> - have landed on m-c and stuck

How many of those do we have every cycle (i.e. is it enough that it
gives us a significant benefit and even warrants discussion)?

What's the definition of "have stuck"? For how long?

With Autoland being worked on, I hope we will be able to actually
automate uplifts with firm rules like that in the future, so having
rules that are easy to detect by an automated tool for any such
proposals would be great!

KaiRo

Chris Peterson

unread,
Nov 20, 2015, 2:34:27 PM11/20/15
to firef...@mozilla.org
How often are Aurora uplifts denied for other types of code changes (C++
or JS)?

Why not allow any code change to be uplifted to Aurora based on the
developer's discretion? Or allow a patch's reviewer to a+ for Aurora uplift?

Sebastian Hengst

unread,
Nov 20, 2015, 2:47:57 PM11/20/15
to firef...@mozilla.org
Please no automatic approval for patches which change localization files
(e.g. .dtd, .properties). This is a common reason why an uplift to
aurora requires changes and this sometimes gets missed by the assignee.
If such patches land with the changes to those files, they break the
string freeze and can be missed by localizers.

Sebastian

-------- Original-Nachricht --------
Betreff: Re: Proposal: approval-by-default for aurora for CSS/image-only
changes in the first 4 weeks of each aurora cycle after successful m-c
landing
Von: Chris Peterson <cpet...@mozilla.com>
An: firef...@mozilla.org
Datum: 2015-11-20 20:34

Richard Newman

unread,
Nov 20, 2015, 2:49:28 PM11/20/15
to Chris Peterson, firefox-dev
I think I'm with cpeterson on this. Three cents:
  1. My experience has been that, if I want to, I can write an approval request that is persuasive enough to land just about anything in Aurora.
  2. I'd rather have developers take an extra week to get something right, and uplift it to early Aurora with module peer approval, than rush to get changes in just before a merge to avoid having to go through the uplift approval process. Higher quality, less stress.
  3. I would tentatively assert that the primary reason we don't get more bustage from uplifts is because of developers not asking to uplift risky changes, rather than approvals being denied.
In short: I'm in favor of removing process where it has more cost than benefit.

(Obviously the localization process still plays in here, so it's not a total free-for-all.)

Eric Rescorla

unread,
Nov 20, 2015, 2:55:22 PM11/20/15
to Chris Peterson, Firefox Dev
+1

On Fri, Nov 20, 2015 at 11:34 AM, Chris Peterson <cpet...@mozilla.com> wrote:

Liz Henry (:lizzard)

unread,
Nov 20, 2015, 3:46:13 PM11/20/15
to Firefox Dev, release-mgmt
We were discussing similar ideas at our recent work week.  I think once we have Autoland we will be a lot closer to making this happen in other areas as well.

I could be ok with us outlining this kind of criteria and letting developers know they can land their own changes in some cases.   Verifying the fix on m-c (by someone other than the developer) might help too.

But, what those cases are, I'm not entirely sure.  I think there have been css only uplifts that cause significant regressions.   Now, me reviewing them isn't going to catch those regressions. But my being aware of a cluster of uplift requests can mean I know to ask QE for extra testing in particular areas. I factor the uplift action into a more holistic view of what's happening across the board for a release and the amount of change and risk we're looking at. That isn't always visible to the developers requesting uplift.

Gijs you mention devs not being "required" to do this. I wonder if this process might normalize uplifting things that could just as well ride the trains.

The value of asking people to ask for uplift is sometimes in that they take their changes more seriously. Often, people talk to me first on irc before even asking, and during that conversation they find more work to do, they think about testing or verifying fixes, or I uncover more uplifts that need to happen for their work to land on aurora.

Must add that I rarely say no to rnewman's requests because he is incredibly organized and thoughtful and seems to triple check everything before even considering bringing work onto a new branch. 

- Liz


--
----
Liz Henry (:lizzard)
Firefox Release Manager
lhe...@mozilla.com

Ritu Kothari

unread,
Nov 20, 2015, 4:06:44 PM11/20/15
to Liz Henry (:lizzard), release-mgmt, Firefox Dev
+1 to everything Liz said. When I am reviewing the patch(es) nominated for uplift to Aurora, it serves several goals towards improving overall ship quality like:
  • Has the patch landed in Nightly? And is the issue severe enough to be uplifted to Beta as well?
  • Have the code review comments been addressed?
  • Has the patch baked enough on Nightly?
    • If the uplift request mentions a risk of startup crashes or stability, I try to let it bake for longer while observing Nightly crash-stats.
  • If there are IDL changes, do we need to engage with AMO team?
  • Can we get the bug filer to verify the fix and possibly catch any unexpected fall outs early on in the cycle?
  • Do we need to request SV/QE team to do focused testing in this area as there are several uplifts happenings too close to each other?
  • Do we need to rel-note this?

So while it may seem that most patches just pass through the RelMan team, we are using these patches to grab meta data that we use through the Aurora cycle and carry forward to Beta cycle to minimize disruptions to overall shipped quality.

Can we get smarter about grabbing this metadata from other places and not necessarily patch uplift requests? Yes. But we are not ready for that now. And DevEd44 stability is extremely poor atm and if we eliminate any checks at this point, we run the risk of destabilizing quality and stability even further.

Last but not the least, let me leave you with an example of a patch uplift that I reviewed in Aurora44 cycle where we were going to uplift an Image with the wrong resolution that I was able to catch when reviewing the patch uplift request. Please see comment 9 and comment 10 https://bugzilla.mozilla.org/show_bug.cgi?id=1221330#c9. I consider this as a value add that occurs from time to time when patches pass through the RelMan pipeline.

Thanks,
Ritu

Matthew N.

unread,
Nov 20, 2015, 5:43:21 PM11/20/15
to Ritu Kothari, release-mgmt, Firefox Dev, Liz Henry (:lizzard)
On Fri, Nov 20, 2015 at 1:04 PM, Ritu Kothari <rkot...@mozilla.com> wrote:

Please see comment 9 and comment 10 https://bugzilla.mozilla.org/show_bug.cgi?id=1221330#c9. I consider this as a value add that occurs from time to time when patches pass through the RelMan pipeline.

​Note that this actually ​wouldn't have been a problem if you didn't mention it since sheriffs and developers I know doing uplifts know to always uplift the changesets that actually landed, not what is on the bug (for this exact reason). Also, auto-land will make this problem go away.

Matthew

Dao Gottwald

unread,
Nov 21, 2015, 7:16:22 AM11/21/15
to firef...@mozilla.org, releas...@mozilla.com, rkot...@mozilla.com, ehe...@mozilla.com
Matthew N. wrote on 20.11.2015 23:42:
>
> On Fri, Nov 20, 2015 at 1:04 PM, Ritu Kothari <rkot...@mozilla.com <mailto:rkot...@mozilla.com> > wrote:
>>
>> Please see comment 9 and comment 10 https://bugzilla.mozilla.org/show_bug.cgi?id=1221330#c9 <https://bugzilla.mozilla.org/show_bug.cgi?id=1221330#c9> . I consider this as a value add that occurs from time to time when patches pass through the RelMan pipeline.

>
>
> ​Note that this actually ​wouldn't have been a problem if you didn't mention it since sheriffs and developers I know doing uplifts know to always uplift the changesets that actually landed, not what is on the bug (for this exact reason). Also, auto-land will make this problem go away.

Even if Ritu had caught a mistake here (which I'm sure this has happened elsewhere), I don't think this alone would say much. Extra eyes can always help, so I could easily make the straw-man proposal that every patch should be reviewed by two or three or ten peers. Clearly we're not going to do that, since it would add too much overhead. The interesting question is, what's the cost/benefit ratio for release managers doing code reviews before uplift?

dao

Panos Astithas

unread,
Nov 22, 2015, 10:02:35 AM11/22/15
to Ritu Kothari, release-mgmt, Firefox Dev, Liz Henry (:lizzard)
On Fri, Nov 20, 2015 at 11:04 PM, Ritu Kothari <rkot...@mozilla.com> wrote:
And DevEd44 stability is extremely poor atm and if we eliminate any checks at this point, we run the risk of destabilizing quality and stability even further.

I think Gijs request is reasonable, but this is the most salient point for me:  Developer Edition is not just a train stop towards release, it's an entire product for a particular demographic that we even spend marketing resources to promote. We should be very picky about what lands on it.

Panos

L. David Baron

unread,
Nov 22, 2015, 10:41:55 AM11/22/15
to Panos Astithas, release-mgmt, Ritu Kothari, Liz Henry (:lizzard), Firefox Dev
Is a significant part of the instability on aurora related to things
that land on aurora, rather than things that are unstable on nightly
and then are uplifted to aurora every 6 weeks?

I think if we're worried about crashes on aurora, we should be
putting more resources into quickly filing and quickly fixing (or
backing out) crash regressions on both nightly and aurora. My
understanding is that we spend most of our topcrash-chasing
resources on stages later in the release train. Yet it's quite
straightforward to file new topcrash bugs three or four hours after
a new nightly hits the wire, assuming it's somebody's job to do so
at the right time of day.

Similar for any other forms of quality and stability that you're
worried about.

Letting low quality hang around on nightly and aurora for extended
periods of time just improves the chances that some of it will get
through to release. If we want to address issues of quality on
those channels, I'd rather use a project management process that
looks more like sheriffing than one that looks like approvals.


As to the original question: I agree with cpeterson's proposal.
But I also don't think we get much value out of aurora as a separate
six-week stage in the release train, and I'd still like to see us
shorten the path from nightly to release.

-David

--
𝄞 L. David Baron http://dbaron.org/ 𝄂
𝄢 Mozilla https://www.mozilla.org/ 𝄂
Before I built a wall I'd ask to know
What I was walling in or walling out,
And to whom I was like to give offense.
- Robert Frost, Mending Wall (1914)
signature.asc

Axel Hecht

unread,
Nov 23, 2015, 12:52:57 AM11/23/15
to firef...@mozilla.org
Here's a few observations of mine:

I only see a very limited set of patches trying to get uplifted. Which means that most folks probably do a pretty good judgement on whether they should uplift or not.

I see very few patches that request uplift that shouldn't. Mostly because people don't understand the tree rules in detail.

Thus, if we make exceptions, we should only make those that we ensure by an hg hook that validates that patches as landed stick to the rules. An existing example is the l10n-approval hook, which only exists because developers said "No string changes in this patch" wrongly too often.

That said, something like the devtools theme sounds like a good candidate for an exception.

I've also pondered whether all approvals need to be done by relman, or if there's a set of engineers that can help with those requests.
OTH, relman also wants to get more involved in Nightly, I heard. Those two probably build one context to look at.

Axel

Eric Rescorla

unread,
Nov 23, 2015, 7:26:49 AM11/23/15
to Axel Hecht, Firefox Dev
On Sun, Nov 22, 2015 at 9:52 PM, Axel Hecht <l1...@mozilla.com> wrote:
Here's a few observations of mine:

I only see a very limited set of patches trying to get uplifted. Which means that most folks probably do a pretty good judgement on whether they should uplift or not.

I don't believe that this follows at all. Good judgement includes not only of saying
no to things that shouldn't be uplifted, but also saying yes to things that shouldl.

Say that out of 1000 patches, 50 should probably get uplifted, but people apply
really high standards and so only the best 5 get uplifted. That's bad judgement
not good.
 
-Ekr

Robert Kaiser

unread,
Nov 23, 2015, 9:33:21 AM11/23/15
to L. David Baron, Panos Astithas, release-mgmt, Ritu Kothari, Liz Henry (:lizzard), Firefox Dev
L. David Baron schrieb:
> I think if we're worried about crashes on aurora, we should be
> putting more resources into quickly filing and quickly fixing (or
> backing out) crash regressions on both nightly and aurora. My
> understanding is that we spend most of our topcrash-chasing
> resources on stages later in the release train. Yet it's quite
> straightforward to file new topcrash bugs three or four hours after
> a new nightly hits the wire, assuming it's somebody's job to do so
> at the right time of day.

I fully agree, but we don't have the resources. My job is release
quality (including stability), so I concentrate first on what is on or
near release, and only as time allows take a look at Dev Edition and
Nightly - which due to frequent fires on beta is very rare.
In addition, we recently lost the only developer that was focused on
looking at crashes (dmajor) with no replacement in sight. He looked at
and found a number of issues on Nightly pretty fast, esp. as he knew
what things mean when looking at code. I'm convinced we'll need someone
like that again - plus additional people to look at data and identify
issues. Even more so with the ton of stability issues that e10s is
causing on those channels (and may I say I shiver when thinking about
getting some of that on beta soon due to e10s being turned on there for
some parts of the population, with a quality level that I personally
think is not even really good enough for aurora - just that the feature
is important enough to push on it).

But yes, I agree, those things we catch in Nightly and Dev Edition will
not haunt us on Beta or Release.

KaiRo

Gijs Kruitbosch

unread,
Nov 26, 2015, 7:28:24 AM11/26/15
to Firefox Dev, release-mgmt
Time for a recap now this discussion has died down a little.

TL;DR: I would like to experiment with this for the 46 cycle to see how
this works in practice. If we find it has negative effects, we can stop
doing it (immediately; we don't need to wait for the end of the
cycle...). If we find it has broadly positive effects (ie doesn't lead
to more breakage than usual), we can look at broadening the set of
things we uplift without going through the formal approval process we
currently use.

On 20/11/2015 19:22, Gijs Kruitbosch wrote:
> I would like to propose that we eliminate the explicit aurora approval
> requirement for changesets that meet ALL of the following criteria:
> - only change CSS and/or image files
> - are landing in the first 4 weeks of the aurora cycle;
> - have landed on m-c and stuck

On 20/11/2015 19:33, Robert Kaiser wrote:
> How many of those do we have every cycle (i.e. is it enough that it
> gives us a significant benefit and even warrants discussion)?
I don't know. I tried to find out, and it's hard to get that
information. I'm fairly sure it's a comparatively small (<10%) subset of
the patches that gets uplift, though.

I think it warrants discussion because I think there is broad agreement
from relmgmt, QE and engineers that the current status quo is not ideal,
but less agreement on what we should be aiming for instead. I'd like us
to experiment and evolve our strategy here. This is not me personally
going "ugh, I wish the patches I wrote didn't have to do this approval
dance". This is me going "I wonder if this is a good set of things to
use to evolve how we deal with approvals" (though full disclosure,
because of the type of patches I often write, this probably does benefit
me personally more than it will some other people). I'll get back to
this point replying to some of the other posts further downthread.
> What's the definition of "have stuck"? For how long?
"landed on m-c without being backed out". I don't really want to use
extra criteria like "landed and stuck for X days" because I think the
criteria are already pretty narrow, and adding verification burden here
is not going to help materially.

On 20/11/2015 19:34, Chris Peterson wrote:
> Why not allow any code change to be uplifted to Aurora based on the
> developer's discretion? Or allow a patch's reviewer to a+ for Aurora
> uplift?
I think "developer's discretion" alone is not likely to be conducive to
stability. I think patch reviewer and/or third person review by another
person (not necessarily relman) might be interesting to explore, but
still more high-risk than what I am suggesting. As an experiment, I'd
like to start with what I originally suggested. We can evaluate after 1
cycle (or before if it blows up in our face).

On 20/11/2015 19:55, Liz Henry (:lizzard) wrote:
> I could be ok with us outlining this kind of criteria and letting
> developers know they can land their own changes in some cases.
> Verifying the fix on m-c (by someone other than the developer) might
> help too.
>
> But, what those cases are, I'm not entirely sure. I think there have
> been css only uplifts that cause significant regressions. Now, me
> reviewing them isn't going to catch those regressions. But my being
> aware of a cluster of uplift requests can mean I know to ask QE for
> extra testing in particular areas. I factor the uplift action into a
> more holistic view of what's happening across the board for a release
> and the amount of change and risk we're looking at. That isn't always
> visible to the developers requesting uplift.
>
> Gijs you mention devs not being "required" to do this. I wonder if
> this process might normalize uplifting things that could just as well
> ride the trains.
>
> The value of asking people to ask for uplift is sometimes in that they
> take their changes more seriously. Often, people talk to me first on
> irc before even asking, and during that conversation they find more
> work to do, they think about testing or verifying fixes, or I uncover
> more uplifts that need to happen for their work to land on aurora.

I think all of these paragraphs point to a single (valid!) concern:
uplifting things increases instability. We can contain the risk of
instability by not uplifting things, or at least by asking for automated
tests and/or manual QE, or restricting what things we uplift (interface
changes, patch size, l10n concerns, etc. (l10n being a bit special
because there are other reasons besides stability)).

But as dbaron and past have pointed out (and as I've discussed with Liz
privately), realistically part of the issue is that we often don't get
QE verification or crash-rate-combatting until we get to beta/release.
That means nightly and, to a slightly lesser degree, aurora, are both
not particularly stable, and that that stability is not completely
dependent on how strict we are with uplifts.

It's very hard to assess the success of the current system at increasing
stability because there is no way to know "what would have been". We
don't know if developers or relman are overly cautious or not
cautious/strict enough, unless we experiment. Because nobody seems
particularly happy with the current state of things, I would like us to
experiment. :-)


So perhaps my original phrasing was clumsy. I would like us to
experiment with being "more permissive" about uplifts, and I'm
purposefully starting out with a subset of patches that I suspect are
lower risk, and where there is a clear secondary benefit (ie devedition
theming).

I believe it's probably equally useful to take a hard look at improving
quality on nightly and aurora through other means. I'll open a separate
thread about that.

~ Gijs

Lawrence Mandel

unread,
Nov 27, 2015, 11:22:22 AM11/27/15
to Firefox Dev, Sylvestre Ledru, release-mgmt
Good discussion. A few thoughts on top and in summary of what's already been said:

- I think experimenting is good. 46 seems like a good time in that we have some time to work out all of the details.
- The relman team agrees that while we do get value from Aurora approvals there is a lot of overhead here - and I would say ideally there are better activities on which to spend our time.
- Much of the value that relman gets out of approvals should be obtainable from other sources. We should identify the real value (I think we can do this quickly) and figure out how to get this information from the pushlog and other sources.
- There is an issue of trust. While many (in my experience the majority) of Mozilla devs have demonstrated that they make good decisions wrt uplifts, some don't. Having an open m-a branch is scary. We really need to keep this branch stable. This is the one piece of value, control over landings, that will be hard to replace. I agree with experimenting as we don't know how much actual value we're getting with our current model.

Sylvestre - Can we please put this on our list to discuss in Orlando and come back with either a commitment to do what Gijs has proposed or an alternative proposal?

Lawrence

Sylvestre Ledru

unread,
Dec 10, 2015, 2:54:16 PM12/10/15
to Lawrence Mandel, Firefox Dev, release-mgmt
Hello,

First, I understand your pain and frustration. We have plan to simplify the aurora/devedition uplift processes for next year.

I think we can give a try to your proposition in 46. However, we think we will need two things:
* usage of a a=css-image-only approval
* a hook to verify the list of files being touched in the commit.

I reported this bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=1231768
to get this hook.

Thanks for this proposition,
Sylvestre

Emma Humphries

unread,
Dec 10, 2015, 3:15:04 PM12/10/15
to Sylvestre Ledru, Matthew Noorenberghe, release-mgmt, Lawrence Mandel, Firefox Dev
MattN was demoing me a tool he was building as a side project that in
conjunction with a tool he used for participation during the Australis
work (showing screenshots to users and asking if things were broken)
might be something that could be used to leverage participation in
finding regressions in these changes.

Matt's plate is more than full, so this is something which would need
people assigned to it if it would be of benefit.

Matthew N.

unread,
Dec 10, 2015, 3:28:51 PM12/10/15
to Emma Humphries, Sylvestre Ledru, release-mgmt, Lawrence Mandel, Firefox Dev
I'm hoping to have this tool running in automation (on Try to start) in the next few work days then we can get it running more often (e.g. Nightlies). Of course it only covers things that screenshots are taken of which we can easily tweak over time.

Follow along in bug 1169179 and bug 1231408.

Matthew N.

Sylvestre Ledru

unread,
Dec 16, 2015, 12:58:40 PM12/16/15
to Lawrence Mandel, Firefox Dev, release-mgmt
Coming back on this thread. I am sure that Gijs and others will think about it but even with the approval-by-default, we still need the status flags (affected => fixed/verified) to be updated once the patch reaches aurora.
Release management relies on the values of the status flags.

Reply all
Reply to author
Forward
0 new messages