Rushing to meet deadlines

Matt Brubeck

May 24, 2011, 12:01:07 PM
There was a big traffic jam in mozilla-central last night as people rushed to get features and fixes landed before the m-c to aurora migration. I strongly suggest that we should not be doing this. As a guideline, I propose that if you would not request to land a patch on Aurora the day after the migration, you should probably not land it on mozilla-central the day before the migration. Risky changes should not be going to Aurora with zero Nightly coverage.

I speak from experience: The mobile team rushed in a number of features right before the Firefox 5 migration to Aurora. We got lucky with many of them, but a few had problems and needed multiple fixes and backouts. They did not end up shipping with Firefox 5, and rushing them in did not benefit users, testers, or developers. It just wasted time spent preparing patches for diverging branches, watching two different trees as we double-landed changes, verifying fixes in multiple nightly builds, and revising plans and release notes. We would have been better off just letting all the questionable last-minute changes wait for the next train.

Holding onto a patch to land it early in the next release cycle might seem like a wasted opportunity, but it will actually give you a lot more freedom to iterate and tweak and fix things on trunk without the restrictions of the Aurora branch and the extra work and process involved. (On Aurora, you will be forced to back out or take extremely low-risk fixes, rather than the fix that you might think is ideal.)

Aakash Desai

May 24, 2011, 12:09:50 PM
to mozilla dev planning, dev-pl...@lists.mozilla.org
+1.

-- Aakash

_______________________________________________
dev-planning mailing list
dev-pl...@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-planning

Curtis Koenig

May 24, 2011, 12:33:24 PM
to dev-pl...@lists.mozilla.org
+
I think there should be a 24 hour moratorium (at the minimum) on
landings before cut over.

Kyle Huey

May 24, 2011, 12:35:40 PM
to cur...@mozilla.com, dev-pl...@lists.mozilla.org
All that does is move the problem 24 hours earlier ...

Also, the point of the new release train schedule is that m-c never closes.

- Kyle

Matt Brubeck

May 24, 2011, 12:41:11 PM
to cur...@mozilla.com, dev-pl...@lists.mozilla.org
On Tuesday, May 24, 2011 9:35:40 AM UTC-7, Kyle Huey wrote:
> All that does is move the problem 24 hours earlier ...

No, it leaves m-c open for regression fixes and low-risk changes, while ensuring that higher-risk changes land sooner so that any fixes and backouts can be done on trunk where it requires less effort and churn for everyone.

> Also, the point of the new release train schedule is that m-c never closes.

I'm not proposing that m-c should be closed; I'm proposing that while *some* changes can land at any time, *other* changes are better landing a few days earlier or waiting for the next train.

There's a sliding scale of riskiness, and I think we all have some sense of it. Really complicated changes (anything likely to result in multiple followup bugs) should probably be timed to land in the first week or two of a new development cycle, even if it means sitting on a finished patch for a while. Slightly less risky changes could land later in the cycle, but maybe not in the last week, or the last day, or whatever the engineers and QA agree is a reasonable time for testing and stabilization.

Matt Brubeck

May 24, 2011, 12:44:16 PM
On Tuesday, May 24, 2011 9:41:11 AM UTC-7, Matt Brubeck wrote:
> On Tuesday, May 24, 2011 9:35:40 AM UTC-7, Kyle Huey wrote:
> > All that does is move the problem 24 hours earlier ...
>
> No, it leaves m-c open for regression fixes and low-risk changes, while ensuring that higher-risk changes land sooner so that any fixes and backouts can be done on trunk where it requires less effort and churn for everyone.

Oops, I didn't notice when I wrote this that Kyle was responding to Curtis's message rather than to my original post. Yes, I agree with Kyle (and disagree with Curtis) that actually closing m-c would not solve the problem.

[Also, sorry for constantly double-posting - Google Groups posts to both the newsgroup and mailing list by default, and I keep forgetting to fix it.]

Johnathan Nightingale

May 24, 2011, 12:46:23 PM
to Kyle Huey, dev-pl...@lists.mozilla.org, cur...@mozilla.com
On 2011-05-24, at 12:35 PM, Kyle Huey wrote:

> All that does is move the problem 24 hours earlier ...
>

> Also, the point of the new release train schedule is that m-c never closes.

This. I've heard a couple people talking about this lately, and I think it requires some basic judgement:

- Don't last-minute-land something that you know you'll immediately have to back out or that you affirmatively want nightly bake time for (why are you last-minute landing that anyhow?)
- Otherwise treat pre-merge as any other day, and only land code you believe to be product-quality, but don't accord it particularly special status.

As Kyle says, a key aspect of the rapid release model is that aurora and beta allow us to stabilize, and central is allowed to keep being fast. Your features should have clear kill switches, or be easily backed out, but we should not let an impending aurora merge chill central. The exception being, as I say, some basic judgement around the likelihood you'll have to back your change out immediately or that it's not aurora-ready without some nightly bake time.

J

---
Johnathan Nightingale
Director of Firefox Engineering
joh...@mozilla.com

Matt Brubeck

May 24, 2011, 12:57:21 PM
to dev-pl...@lists.mozilla.org
On 05/24/2011 09:46 AM, L. David Baron wrote:
> I think the idea of the new cycle is that we sort this stuff out on
> Aurora, not on mozilla-central, so mozilla-central is always open.
> That said, those who land right before the pull will be responsible
> for landing fixes in two places, but that's their problem.

Yes, absolutely. Please take my suggestions as "advice for developers
who want to save themselves time in the long run" rather than "new rules
that we need to enforce on everyone."

John O'Duinn

May 24, 2011, 2:01:13 PM
to Aakash Desai, Matthew Brubeck, dev-pl...@lists.mozilla.org
+1.

Thanks for posting this Matt. Excellent summary of why not to rush in
last minute - it's exactly what this rapid release cadence is to help
avoid...

tc
John.
=====

Mike Shaver

May 24, 2011, 2:41:40 PM
to cur...@mozilla.com, dev-pl...@lists.mozilla.org
On Tue, May 24, 2011 at 12:33 PM, Curtis Koenig <cur...@mozilla.com> wrote:
> +
> I think there should be a 24 hour moratorium (at the minimum) on landings
> before cut over.

That just kicks the can down (up?) the road. It doesn't matter when a
hard cut-off is, if there's a hard cut-off. We can fix (remove)
things on Aurora before we push updates, and AIUI that's the plan of
record. One such option is just to roll back to an earlier day of m-c
than the cut-off, if the commons got too tragic.

I think that, given the smaller number of things that can be ready on
each release clock tick, we'll see this be much better than the
stampedes of old; we are likely still working through a
non-representative backlog from the long march to FF4.

We need to see what we actually do, with more than a single data
point, before we start revising our policies if we do that at all. We
had a good (great) plan for this model, but the map is not the
territory.

I think we've learned, usually painfully, that policy works best when
it reinforces behaviour rather than imposes a change. Social
pressure/alignment and paving cowpaths are the way we have seen change
happen most effectively here.

All that said, kudos for making a specific, actionable suggestion --
even if it's one I disagree with.

Mike

Mike Shaver

May 24, 2011, 2:44:28 PM
to jod...@mozilla.com, dev-pl...@lists.mozilla.org, Matthew Brubeck, Aakash Desai
On Tue, May 24, 2011 at 2:01 PM, John O'Duinn <jod...@mozilla.com> wrote:
> Thanks for posting this Matt. Excellent summary of why not to rush in
> last minute - it's exactly what this rapid release cadence is to help
> avoid...

I think the opposite, actually.

- you can rush to m-c any day you want
- aurora cut-over should be just another day in the life of m-c, a
mere release mechanic
- we deal with messy rushes that day like any other: piecemeal or
wholesale backout, etc.

Corollary: people should stop thinking about the calendar in the very
general case. It'll take time to break that (previously adaptive)
habit, but we'll get there.

Mike

Justin Lebar

May 24, 2011, 3:25:34 PM
>> All that does is move the problem 24 hours earlier ...
> No, it leaves m-c open for regression fixes and low-risk changes

I thought the point of Aurora was to take regression fixes and low-risk changes?

IOW, if this isn't the same as "closing the tree", maybe it's the same as "making m-c into Aurora-lite" for X hours.

Matt Brubeck

May 24, 2011, 4:17:02 PM
On Tuesday, May 24, 2011 11:44:28 AM UTC-7, Mike Shaver wrote:
> I think the opposite, actually.
>
> - you can rush to m-c any day you want

On any other day, people pushing to m-c are not "rushing."

> - aurora cut-over should be just another day in the life of m-c, a
> mere release mechanic

Treating it as "just another day" would be a great step forward. Right now it's the opposite - people are clearly treating the Aurora migration date as a time to make more pushes, with more haste and more urgency.

In the 24 hours before the migration, we had 37 pushes to mozilla-central, including 10 backouts or bustage fixes. In the same period one week earlier, we had 22 pushes and only 1 backout.
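
To put those counts on a common scale, the share of pushes that were backouts or bustage fixes can be computed directly. This is only a minimal sketch of the arithmetic; the function name is illustrative and does not come from any Mozilla tool.

```python
def backout_share(total_pushes: int, backouts: int) -> float:
    """Fraction of pushes that were backouts or bustage fixes."""
    if total_pushes <= 0:
        raise ValueError("total_pushes must be positive")
    return backouts / total_pushes


# Counts quoted in the post: 24 hours before the migration vs. one week earlier.
pre_migration = backout_share(37, 10)
week_earlier = backout_share(22, 1)

print(f"pre-migration: {pre_migration:.1%}")
print(f"week earlier:  {week_earlier:.1%}")
```

With these figures, about 27% of pushes in the final 24 hours were backouts or bustage fixes, compared with about 4.5% a week earlier.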

If we were actually treating this time like any other, I wouldn't have seen any need to start this thread. But what I see are people trying to land things as late in the cycle as possible, which I know from experience leads to extra work and wasted time in the long run. This "mere release mechanic" affects how we spend our development time, and what (not just when) we deliver to our users.

Mike Shaver

May 24, 2011, 5:04:43 PM
to dev-pl...@lists.mozilla.org
On Tue, May 24, 2011 at 4:17 PM, Matt Brubeck <mbru...@mozilla.com> wrote:
> On Tuesday, May 24, 2011 11:44:28 AM UTC-7, Mike Shaver wrote:
>> I think the opposite, actually.
>>
>> - you can rush to m-c any day you want
>
> On any other day, people pushing to m-c are not "rushing."

After prolonged tree closures, often after all-hands or summits,
sometimes after major holidays -- we see spikes sometimes for reasons
other than (perceived or real) deadlines. Or, we certainly have in
the history of the project.

> If we were actually treating this time like any other, I wouldn't have seen any need to start this thread.  But what I see are people trying to land things as late in the cycle as possible, which I know from experience leads to extra work and wasted time in the long run.  This "mere release mechanic" affects how we spend our development time, and what (not just when) we deliver to our users.

Yes, as I said, we're not there yet. But I think it's premature to
try and solve this with policy (esp since it's not clear what the
actual problematic result was, so far, other than a frustrating day on
trunk).

Some human-to-human direct contact about "why did you feel you needed
to land that day? did you feel like you had to take shortcuts?" is
probably more productive than this thread, but I'm not sufficiently
informed to know who to have those conversations with. These are
people making decisions with good intentions, not abstract predators
attacking the sanctity of our tree.

Mike

Marco Bonardo

May 24, 2011, 5:15:24 PM
On 24/05/2011 22:17, Matt Brubeck wrote:
> But what I see are people trying to land things as late in the cycle as possible

While I share part of your thoughts, about the fact it should not work
like this, I don't think anybody is trying or willing to land things as
late as possible.

What happens (among other things) imo is that:
- We are not yet used to the new process. Especially to the 6 weeks
timeframe.
- we are passionate about releasing the best browser we can. We should
do better at thinking it's just a matter of 6 weeks before users will get
the next code.
- some review paths can take so long that you attach a patch
at the beginning of the 6 weeks and you get a review the day before the
merge. While this should not be a problem (you can just take the next
train) it's still frustrating when weeks before you planned that as part
of a certain release.

None of these is a compelling reason or excuse to rush, but I think all
of them are things we are not seeing the right way, yet.

And, as someone who slept 4 hours to try helping keep the tree sane, and
sheriffed most of the rush, I can assure you nobody gave me the
impression of wanting to destroy the tree, or to land as late as
possible just to do it.

Cheers,
Marco

Robert O'Callahan

May 24, 2011, 5:49:59 PM
to Marco Bonardo, dev-pl...@lists.mozilla.org
If the sheriff metered checkins all day, and we required people to be
merging from green trees or from green tryserver pushes, we wouldn't have
problems, right? Which of those did we not do this time?

Rob
--
"Now the Bereans were of more noble character than the Thessalonians, for
they received the message with great eagerness and examined the Scriptures
every day to see if what Paul said was true." [Acts 17:11]

Marco Bonardo

May 24, 2011, 6:08:31 PM
On 24/05/2011 23:49, Robert O'Callahan wrote:
> If the sheriff metered checkins all day, and we required people to be
> merging from green trees or from green tryserver pushes, we wouldn't have
> problems, right? Which of those did we not do this time?

It was mostly fine, but not all patches had full green try runs; some had
partial runs that missed exactly what later failed. For example, the change
that caused the Windows Talos regressions had a green try run, but one that
did not include Windows Talos.

A partial issue regarding Try is that in the last 2 days getting results
from it was taking hours, someone got a build after 645 minutes. So
putting more hardware on Try may help.

Another issue was the xpcshell.ini late change in the cycle, it caused
at least 2-3 burnings (and I was among those with a merge from a green
tree, just not enough up-to-date), and lots of patches had to be
modified last-second.

Apart from these, I think it didn't work that badly. It was just a very long
queue (which is what we are discussing), with some not-completely-tested
patches in the middle (which we may improve on). But it ended up with a tree
in a decent state (the last 6 pushes before the merge were green); it just
took some time to get all Talos results and verify the perf regression was fixed.

Cheers,
Marco

Axel Hecht

May 24, 2011, 6:40:40 PM
Hi,

ignoring the data about our tree status in the past few days, I'd also like
to chime in with an observation I made myself over the past few days:

There are actually a good deal of people that are not rushing their
fixes in. They were close, some of them actually had r+ with comments,
but they didn't land yet because those comments weren't in yet.

I'm saying that because I've had a bunch of bugs in my inbox that I
cared about but couldn't dedicate the brain cycles to that I wanted. And
not one of them got rushed in into the tree over the weekend. Reviewers
were helpful, and I generally didn't observe a "must not miss milestone"
panic.

To me, this feels much better than other code freezes. Independent of
the tree status.

Axel

Justin Wood (Callek)

May 24, 2011, 6:41:04 PM
On 5/24/2011 6:08 PM, Marco Bonardo wrote:
> Another issue was the xpcshell.ini late change in the cycle,

This, imo, is the type of change that shouldn't land in the last week or two
of a cycle, since it affects so many people with no direct product
benefit (it has a nice benefit, but it also has the potential to stall or
harm everyone else getting things done).

Also, this wasn't announced publicly as a "we'll be doing <x> this day".
The first I heard of this (other than SeaMonkey trees burning) was
about 2 days after it landed.

If it was announced, it wasn't public enough, and this from someone who
watches many places, but does have to skim information sometimes.

We need to do better on large-scale changes like this.

--
~Justin Wood (Callek)

Mike Shaver

May 24, 2011, 7:06:41 PM
to Justin Wood (Callek), dev-pl...@lists.mozilla.org
On Tue, May 24, 2011 at 3:41 PM, Justin Wood (Callek) <Cal...@gmail.com> wrote:
> On 5/24/2011 6:08 PM, Marco Bonardo wrote:
>>
>> Another issue was the xpcshell.ini late change in the cycle,
>
> This, imo, is the type of change that shouldn't land in last week or two of
> a cycle,

Disagree. There are no cycles, they are an illusion perpetrated by
the release team. If things land a few days later because of a
disruptive change, they might miss a cut-over, or a weekend of
nightly-bake or whatever.

Maybe we should slightly randomize the dates of cut-over.

Mike

Asa Dotzler

May 24, 2011, 7:15:19 PM
On 5/24/2011 3:40 PM, Axel Hecht wrote:
> Hi,
>
> ignoring the data about our tree status in the past few days, I'd also like
> to chime in with an observation I made myself over the past few days:
>
> There are actually a good deal of people that are not rushing their
> fixes in. They were close, some of them actually had r+ with comments,
> but they didn't land yet because those comments weren't in yet.
>
> I'm saying that because I've had a bunch of bugs in my inbox that I
> cared about but couldn't dedicate the brain cycles to that I wanted. And
> not one of them got rushed in into the tree over the weekend. Reviewers
> were helpful, and I generally didn't observe a "must not miss milestone"
> panic.
>
> To me, this feels much better than other code freezes. Independent of
> the tree status.
>
> Axel

This is something I tweeted earlier today that I wanted to second here.
A lot more people were not rushing in at the last minute than were.
Maybe it was a bit of a mess the last 18 hours of the cycle but it was
only a mess for a handful of people that for whatever reason were up
against the deadline.

That seems like a decent feedback loop to have. If you're really late,
expect to crash up against the handful of other people that are really
late. If you get in early or you decide to wait a day or two and land
for the next cycle, you avoid that pain.

Or put another way, I don't think we should be optimizing, or even
tuning very much, (and certainly not setting policy) for this
pathological case. Most people did the right thing and got in early
enough or held off and the system worked for those people.

- A

Asa Dotzler

May 24, 2011, 7:20:54 PM

I proposed this, jokingly, on IRC -- that we just pick a nice green
changeset from some arbitrary time in the last three or four days and
migrate that to Aurora.

- A (illusion perpetrator)

Shawn Wilsher

May 24, 2011, 7:24:04 PM
to dev-pl...@lists.mozilla.org
On 5/24/2011 4:06 PM, Mike Shaver wrote:
> Maybe we should slightly randomize the dates of cut-over.
Just because our migration date is a Tuesday doesn't mean the release
team has to take everything right up until then. They could very easily
pick a known-good state before then, and take up to that, right?

It does make it harder for people to properly set the Target Milestone
in bugzilla, but then the release team could go fix bugs they didn't
take too.

Cheers,

Shawn

Asa Dotzler

May 24, 2011, 7:29:21 PM

We discussed this at this morning's migration meeting and agreed that we
should abide by the published plan this time. I think this is worth
considering for next time, but not as a counter to the day before chaos
so much as a way around the "we migrate even if the tree is red" implied
by that published plan.

- A

Axel Hecht

May 25, 2011, 2:20:32 AM

I disagree. The main benefit of having a strict calendar is that people
can plan for it. Apparently not everybody's plan worked out this time,
but the learning curve isn't that bad, I think.

The ability to plan is not only good for developers on mozilla-central,
it's also a great feature for localizers. It's only a few hours after
the migration, and we already have 8 good localizations on aurora.

Randomizing the cut-over date can be done in a ton of ways, but I didn't
come up with a way that wouldn't cut into the benefits of the new model
for l10n.

Axel

Wes Garland

May 25, 2011, 7:56:01 AM
to Mike Shaver, dev-pl...@lists.mozilla.org, Justin Wood (Callek)
On 24 May 2011 19:06, Mike Shaver <mike....@gmail.com> wrote:

> There are no cycles, they are an illusion perpetrated by
> the release team.


Cycles may be an illusion, but I have found that nothing sharpens the mind
like a deadline.

A slight twist on Asa's idea might be interesting - pick the last green
changeset before the cut-over time; otherwise treat Aurora day as you would
any other (busy?) day.

This offers the benefits of the illusory cycle, while putting social
pressure on those who might otherwise rush-push patches that risk green.
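
The "last green changeset before the cut-over" rule can be sketched in a few lines. This is a hedged illustration only: the `Push` record, its fields, and the sample data are hypothetical and do not reflect any actual Mozilla tooling or tree-status API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class Push:
    changeset: str        # hypothetical changeset ID
    pushed_at: datetime   # when the push landed on mozilla-central
    green: bool           # True if every test suite passed on this push


def last_green_before(pushes: Sequence[Push], cutoff: datetime) -> Optional[Push]:
    """Return the most recent green push at or before the cut-over, or None."""
    candidates = [p for p in pushes if p.green and p.pushed_at <= cutoff]
    return max(candidates, key=lambda p: p.pushed_at, default=None)


# Made-up data for illustration only.
cutoff = datetime(2011, 5, 24, 10, 0)
pushes = [
    Push("aaa111", datetime(2011, 5, 24, 7, 30), green=True),
    Push("bbb222", datetime(2011, 5, 24, 9, 15), green=False),
    Push("ccc333", datetime(2011, 5, 24, 9, 50), green=False),
    Push("ddd444", datetime(2011, 5, 24, 10, 20), green=True),
]

picked = last_green_before(pushes, cutoff)
print(picked.changeset if picked else "no green changeset before cut-over")
```

In this made-up example the two red pushes landed shortly before the cut-over are skipped and the migration point falls back to `aaa111`, which is the social pressure described above: a rushed push that breaks the tree does not make the train.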

Wes

--
Wesley W. Garland
Director, Product Development
PageMail, Inc.
+1 613 542 2787 x 102

Shawn Wilsher

May 25, 2011, 8:19:52 AM
to dev-pl...@lists.mozilla.org
On 5/24/2011 11:20 PM, Axel Hecht wrote:
> The ability to plan is not only good for developers on mozilla-central,
> it's also a great feature for localizers. It's only a few hours after
> the migration, and we already have 8 good localizations on aurora.
>
> Randomizing the cut-over date can be done in a ton of ways, but I didn't
> come up with a way that wouldn't cut into the benefits of the new model
> for l10n.
I'm failing to see how this would impact l10n in any meaningful way.
Nobody is suggesting that the migration happen on a random day, but just
that the time we choose to take changesets up to be somewhat random
instead of "must be in m-c by Monday 11:59 PM PDT". Migration would
still happen at its regularly scheduled time.

Cheers,

Shawn

Robert Kaiser

May 25, 2011, 8:36:04 AM
L. David Baron wrote:

> I think the idea of the new cycle is that we sort this stuff out on
> Aurora, not on mozilla-central, so mozilla-central is always open.

My understanding as well.

> That said, those who land right before the pull will be responsible
> for landing fixes in two places, but that's their problem.

Well, on Aurora we should mostly disable them or back them out eagerly,
at least if the fix doesn't look to be damn easy and low risk by itself.
And I would be more eager with those backouts or disabling on stuff that
landed a short time before the cutover, as that didn't have time to bake
and could contain even more problems than the immediate one.

If you have trains leaving regularly, you will always have a few people
race in the last minute to catch one. That's the case in the physical
world (observed it often enough) and the virtual world as well. And in
the end, in our world, it's not too much of a problem: we'll just quickly
throw out of Aurora the ones that cause problems (to be fixed on trunk
for the next train) and leave the good ones in. I think we just need to
be relaxed about that and deal with it in terms of the train model
instead of trying to fight it happening. Having a bit of a last minute
race is human and just shows that people are eager to see their work go
out to users. As long as it doesn't get too bad, let's just sit back and
enjoy the show and clean up in the first day of Aurora, and we'll be fine.

That said, the race this time was very small compared to what I've seen
in freezes for previous stable versions, so I think the new model works
out fine and a lot of people are aware that they'll just make the next
train. After all, things that didn't make this cutover are still well on
schedule to be shipped this year - which is a major shift compared to
previous freeze deadlines. ;-)

Robert Kaiser


--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community should think about. And most of the
time, I even appreciate irony and fun! :)

Axel Hecht

May 25, 2011, 8:42:59 AM

Well, you're restricting shaver's comment quite significantly.

So in that scenario, (merge war room happens on schedule, and picks a
changeset from the past):

If there are l10n-impact changes between the picked changeset and the
current state of the tree, localizers could end up landing content on
l10n-central that's not good for aurora. That did happen when we cut
aurora the first time in a way. Pretty naughty it was.

We can further restrict the randomness, of course:

- limit the time window that we're willing to go back during the war room.

Either by date, or by saying you're not crossing an l10n-impact landing.

The first will basically put an l10n embargo for the maximum time window
on central, I don't think that's a win.

Limiting the randomness to not cross an l10n-impact landing is going to work
technically, but it might just make the window in which you can choose a
changeset for aurora rather small.
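
How the two restrictions shrink the set of eligible changesets can be sketched as follows. This is a hedged illustration: the `Push` record, the `l10n_impact` flag, and the sample data are hypothetical, and the sketch applies both restrictions together even though either could be used on its own.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Sequence


@dataclass(frozen=True)
class Push:
    changeset: str        # hypothetical changeset ID
    pushed_at: datetime
    green: bool           # True if tests passed on this push
    l10n_impact: bool     # True if the push changed localizable strings


def eligible_migration_points(
    pushes: Sequence[Push], now: datetime, max_lookback: timedelta
) -> List[Push]:
    """Green pushes inside the lookback window that do not precede
    the most recent l10n-impact landing."""
    earliest = now - max_lookback
    past = [p for p in pushes if p.pushed_at <= now]
    l10n_times = [p.pushed_at for p in past if p.l10n_impact]
    if l10n_times:
        # Picking anything older than this would cross an l10n-impact landing.
        earliest = max(earliest, max(l10n_times))
    return sorted(
        (p for p in past if p.green and p.pushed_at >= earliest),
        key=lambda p: p.pushed_at,
    )


# Made-up data for illustration only.
now = datetime(2011, 5, 24, 10, 0)
pushes = [
    Push("aaa111", datetime(2011, 5, 22, 8, 0), green=True, l10n_impact=False),
    Push("bbb222", datetime(2011, 5, 23, 12, 0), green=True, l10n_impact=True),
    Push("ccc333", datetime(2011, 5, 23, 18, 0), green=True, l10n_impact=False),
    Push("ddd444", datetime(2011, 5, 24, 9, 0), green=False, l10n_impact=False),
]

eligible = eligible_migration_points(pushes, now, timedelta(days=3))
print([p.changeset for p in eligible])
```

In this made-up example the three-day window contains three green pushes, but only two remain eligible once the l10n-impact landing is taken into account, which illustrates how small the window can become.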

Axel

Mark Finkle

May 25, 2011, 12:18:17 PM
On May 24, 2:44 pm, Mike Shaver <mike.sha...@gmail.com> wrote:

> On Tue, May 24, 2011 at 2:01 PM, John O'Duinn <jodu...@mozilla.com> wrote:
> > Thanks for posting this Matt. Excellent summary of why not to rush in
> > last minute - it's exactly what this rapid release cadence is to help
> > avoid...
>
> I think the opposite, actually.
>
> - you can rush to m-c any day you want
> - aurora cut-over should be just another day in the life of m-c, a
> mere release mechanic
> - we deal with messy rushes that day like any other: piecemeal or
> wholesale backout, etc.

I agree with this and would add the following: Benjamin Smedberg made
a great decision to just pick a changeset that was green, _not_ the
tip of mozilla-central, to use as the migration point. I think this
approach will also help us break our bad habit of rushing to the
deadline. The "deadline" will just move back to a green point anyway.

> Corollary: people should stop thinking about the calendar in the very
> general case.  It'll take time to break that (previously adaptive)
> habit, but we'll get there.

It is very hard to break. I wish we had a few simple ways to try to
break the habit. Going cold turkey isn't easy

Mike Shaver

May 25, 2011, 12:21:07 PM
to Mark Finkle, dev-pl...@lists.mozilla.org
On Wed, May 25, 2011 at 9:18 AM, Mark Finkle <mark....@gmail.com> wrote:
> It is very hard to break. I wish we had a few simple ways to try to
> break the habit. Going cold turkey isn't easy

We're not going cold turkey, really, and I think the first drop in our
dose went pretty well, actually.

Mike

Marco Bonardo

May 25, 2011, 12:46:57 PM
On 25/05/2011 18:31, Dave Townsend wrote:
> We actually did pick the tip of m-c even though tests hadn't completed on it
> yet, because that is what we had announced we were going to do.

And I think, based on the announcement, this was the best choice, it was
clearly said "if you want your stuff in Firefox 6 land it before 10AM".

I suggest for next announcement to generically say "We will choose a
green changeset in this day", without specifying any time.

Cheers,
Marco

Mike Shaver

May 25, 2011, 12:58:33 PM
to rob...@ocallahan.org, dev-pl...@lists.mozilla.org, Justin Wood (Callek)
On Tue, May 24, 2011 at 6:21 PM, Robert O'Callahan <rob...@ocallahan.org> wrote:
> On Wed, May 25, 2011 at 11:06 AM, Mike Shaver <mike....@gmail.com> wrote:
>>
>> Disagree.  There are no cycles, they are an illusion perpetrated by
>> the release team.
>
> I don't agree with that. Six weeks is still a significant amount of time,
> and I think it's a good thing for people to try to get something into
> release N instead of N + 1 --- when that can be done without breaking stuff.

I think we should be trying to optimize globally, not locally
per-feature, and I believe that we get more great stuff to more people
in a shorter time if we don't spend 5 weekdays of every 30 pushing for
faster review, crowding try, not re-testing after other big landings,
stepping on each other's backouts, etc. Slow is smooth and smooth is
fast.

>> Maybe we should slightly randomize the dates of cut-over.
>

> I like this plan!

Well, liking that plan means that things can still miss for 6 weeks
because they landed a day after the revealed cut-over, so I don't know
how to reconcile that with your first point, tbh.

Mike

Benjamin Smedberg

May 25, 2011, 2:24:02 PM
to Christian Legnitto, Marco Bonardo, dev-pl...@lists.mozilla.org
On 5/25/2011 2:19 PM, Christian Legnitto wrote:
> I don't like the uncertainty that gives devs about their fix making it in. We have been pretty consistent saying we would migrate on red and deal with the fallout on mozilla-aurora (the benefit of having multiple repos, yay!). Pulling a green changeset is a big shift from the previously announced plan and I really don't feel the benefits outweigh the costs.
I strongly disagree. It didn't hurt too badly this time, but I think
that we should be importing "known good" code as measured by our tests.
It's easy to do, and means that the -central tree can be managed
normally without people trying to get in under the wire and dealing with
backouts across multiple locations.

Or we should just pick the changeset which was chosen for the Tuesday
nightly.

--BDS

Mike Connor

May 25, 2011, 2:34:11 PM
to Christian Legnitto, Marco Bonardo, dev-pl...@lists.mozilla.org

On 2011-05-25, at 2:19 PM, Christian Legnitto wrote:

>
> On May 25, 2011, at 9:46 AM, Marco Bonardo wrote:
>

> I don't like the uncertainty that gives devs about their fix making it in. We have been pretty consistent saying we would migrate on red and deal with the fallout on mozilla-aurora (the benefit of having multiple repos, yay!). Pulling a green changeset is a big shift from the previously announced plan and I really don't feel the benefits outweigh the costs.

I don't think the uncertainty is substantially different. If I'm trying to make a cutoff, I'm already not certain that the tree will be in a good state at the last minute. Unless I'm going to just land on red to make the cutoff, I don't think this has as much cost as you're implying here.

I think that if a developer really wants to make a cutoff they should be landing well in advance of the merge, not in the last 12 hours.

-- Mike


Chris Cooper

May 25, 2011, 3:22:13 PM
to dev-pl...@lists.mozilla.org
On 2011-05-25 2:24 PM, Benjamin Smedberg wrote:
> Or we should just pick the changeset which was chosen for the Tuesday
> nightly.

This seems like a good solution to me. The automation has already found
a known-good changeset for you.

cheers,
--
coop


Dao

unread,
May 25, 2011, 3:27:41 PM5/25/11
to
On 25.05.2011 20:24, Benjamin Smedberg wrote:
> Or we should just pick the changeset which was chosen for the Tuesday
> nightly.

That changeset was chosen because it was successfully built across
platforms, not because tests were passing.

Christian Legnitto

unread,
May 25, 2011, 3:27:46 PM5/25/11
to Chris Cooper, dev-pl...@lists.mozilla.org
On May 25, 2011, at 12:22 PM, Chris Cooper wrote:

> On 2011-05-25 2:24 PM, Benjamin Smedberg wrote:
>> Or we should just pick the changeset which was chosen for the Tuesday
>> nightly.
>

> This seems like a good solution to me. The automation has already found
> a known-good changeset for you.

Will this prevent the last minute rush, which is the whole point of this thread? I bet it won't.

Christian

Benjamin Smedberg

unread,
May 25, 2011, 3:34:58 PM5/25/11
to Christian Legnitto, dev-pl...@lists.mozilla.org, Chris Cooper
It will be an incentive to land a day or two earlier, certainly. And the
only people it would hurt are the people who actually landed in the
middle of a non-good tree, which isn't really supposed to happen anyway.
So I think it's a win-win.

--BDS

Mike Connor

unread,
May 25, 2011, 3:40:26 PM5/25/11
to Benjamin Smedberg, dev-pl...@lists.mozilla.org, Christian Legnitto, Chris Cooper

If there is a last minute, there will be a last minute rush. Going back to the last green changeset means that people who tried to beat the buzzer will frequently miss out. That should be enough to change behaviours, but if not it at least means that the Aurora merge doesn't have to incur that overhead.

-- Mike

Armen Zambrano Gasparnian

unread,
May 25, 2011, 4:13:58 PM5/25/11
to Mike Connor, Benjamin Smedberg, dev-pl...@lists.mozilla.org, Christian Legnitto, Chris Cooper
If we use the changeset of the last nightly we would know that the
aurora build will have the same issues that would be reported during the
day of the merge by m-c users. If we take any changeset after the
previous nightly cutoff we will be testing something slightly different.

What do you think?

cheers,
Armen

Axel Hecht

unread,
May 25, 2011, 6:28:00 PM5/25/11
to

What's the exact algorithm the nightlies use?

Axel

Marco Bonardo

unread,
May 25, 2011, 7:21:43 PM5/25/11
to
Il 25/05/2011 20:24, Benjamin Smedberg ha scritto:
> Or we should just pick the changeset which was chosen for the Tuesday
> nightly.

While it would be good to have nightly testing on a known changeset (taking
another changeset means nightly testers are using something with more or
fewer fixes), this won't prevent taking a bad changeset.
In this specific case, the nightly contained a fix that was backed out 3
pushes later because it caused random crashes in CC. So this is not much
different from the current cherry-picking.

Taking the nightly, while better for testing, would also not solve the
rush "problem": saying "push before 3:30AM to be in the release" is no
different from "push before 10AM to be in the release".

Personally I think that, rather than finding a way to randomize the
time of the merge, it would be better to concentrate on finding the reasons
why patches and features become ready on the last day, in the last
hours.
I don't think that I'd stay awake till the morning to push a patch if I
could have it reviewed and green on try a week, or even just a day,
before. I want to push when things are ready, so why do things get ready
exactly in those last hours?
Faster Try, earlier feedback on approaches/UI/patches, and more reviewers
for strategic components may make the difference here.

When I get last-minute feedback, last-minute reviews, and 10 hours to
get results from Try, I find myself ready just a few hours before the merge.
I could just skip it and take the next train, but that looks like a
failure to me and I feel sorry for my project manager, my reviewer and
also users. Most likely this psychological factor matters too: we still
fail to think of it as just a matter of 6 weeks. I'm not yet used to
this, honestly, but I think it will get better.

Cheers,
Marco

Robert O'Callahan

unread,
May 25, 2011, 9:06:52 PM5/25/11
to Mike Shaver, dev-pl...@lists.mozilla.org, Justin Wood (Callek)
On Thu, May 26, 2011 at 4:58 AM, Mike Shaver <mike....@gmail.com> wrote:

> I think we should be trying to optimize globally, not locally
> per-feature, and I believe that we get more great stuff to more people
> in a shorter time if we don't spend 5 weekdays of every 30 pushing for
> faster review, crowding try, not re-testing after other big landings,
> stepping on each other's backouts, etc. Slow is smooth and smooth is
> fast.
>

Some of those are obvious bad things that we should never do.

But saying "hey, if you review my feature patch today and my big refactoring
patch tomorrow instead of the other way around, users may get this feature
six weeks earlier" seems perfectly rational to me. Always ignoring the
release cycle would not be rational.

Less hypothetically, I interrupted some of my bigger projects to do a batch
of small regression fixes that I wanted to be in FF6 instead of FF7. I don't
think that was wrong. The fixes landed via cedar a couple of days before the
cutoff, and no process violations occurred.

>> Maybe we should slightly randomize the dates of cut-over.
> >
> > I like this plan!
>
> Well, liking that plan means that things can still miss for 6 weeks
> because they landed a day after the revealed cut-over, so I don't know
> how to reconcile that with your first point, tbh.
>

"Optimizing for the timing of the release cycle" can mean more than
"everything gets checked in at the last minute", and "sometimes unexpected
events will make you wait for the next train" will always be true
regardless.

FWIW, on reflection I like "most recent green-ish changeset" more than
randomized choice.

Rob
--
"Now the Bereans were of more noble character than the Thessalonians, for
they received the message with great eagerness and examined the Scriptures
every day to see if what Paul said was true." [Acts 17:11]

Daniel Cater

unread,
May 25, 2011, 9:15:40 PM5/25/11
to
On Wednesday, 25 May 2011 19:24:02 UTC+1, Benjamin Smedberg wrote:
> Or we should just pick the changeset which was chosen for the Tuesday
> nightly.
>
> --BDS

I think this is a good idea. It means that you get all of the nightly testers running the same code that will be in the first Aurora build. It also slightly helps the "in before the deadline" problem as the changeset used for nightlies isn't entirely predictable. All builds have to go green and the build time of the longest build is long and varies quite a lot. If a changeset hasn't finished building across all platforms by the nightly start time then a previous changeset will be used.

In future, I expect the code that chooses the nightly changeset to take the test runs into account as well.
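
The selection rule described above can be sketched roughly as follows. This is only an illustration of the rule as stated in this thread (newest changeset whose builds all finished green before the nightly start time), not the actual automation code. The `Changeset` type, the `pick_nightly_changeset` function, and the platform names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class Changeset:
    """Hypothetical record of one push and its per-platform build results."""
    rev: str
    # Platform name -> time its build finished successfully,
    # or None if the build failed or is still running.
    build_finished: Dict[str, Optional[datetime]]


def pick_nightly_changeset(
    pushes: List[Changeset],
    platforms: List[str],
    nightly_start: datetime,
) -> Optional[Changeset]:
    """Return the newest changeset that built on every platform in time.

    `pushes` is assumed to be ordered oldest to newest. Only build
    results are considered here; test results are ignored, as the
    thread says is currently the case.
    """
    for cs in reversed(pushes):
        finished = [cs.build_finished.get(p) for p in platforms]
        if all(t is not None and t <= nightly_start for t in finished):
            return cs
    # No changeset has a complete set of green builds yet.
    return None
```

With this rule, a push whose slowest build is still running at the nightly start time is skipped in favour of an earlier one, which is why the cutoff is not entirely predictable for someone landing late.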

Ehsan Akhgari

unread,
May 25, 2011, 9:27:13 PM5/25/11
to Chris Cooper, dev-pl...@lists.mozilla.org
On 11-05-25 3:22 PM, Chris Cooper wrote:

> On 2011-05-25 2:24 PM, Benjamin Smedberg wrote:
>> Or we should just pick the changeset which was chosen for the Tuesday
>> nightly.
>
> This seems like a good solution to me. The automation has already found
> a known-good changeset for you.

FWIW, the second sentence is actually false! It has picked the last
changeset known to build successfully. There is a huge difference
between the two.

Cheers,
Ehsan

Chris Cooper

unread,
May 25, 2011, 10:49:09 PM5/25/11
to dev-pl...@lists.mozilla.org
On 2011-05-25 9:27 PM, Ehsan Akhgari wrote:
> FWIW, the second sentence is actually false! It has picked the last
> changeset known to build successfully. There is a huge difference
> between the two.

Replace "known-good" with "known-to-build" and you have my intent.

It just means one less hurdle to be overcome in the first days of a
new Aurora cycle, and drivers don't have to play favorites with
late-landing developer check-ins.

cheers,
--
coop


Mike Connor

unread,
May 25, 2011, 11:07:51 PM5/25/11
to Armen Zambrano Gasparnian, dev-pl...@lists.mozilla.org, Benjamin Smedberg, Christian Legnitto, Chris Cooper

I think I'd be okay with this if nightly builds actually built off all-green changesets. Since they don't, I think the value is questionable. I see very little argument presented for taking untested csets onto Aurora if we can simply go back to an actually-known-good changeset.

That we also provide a bigger disincentive for last-minute and/or risky landings (i.e. your change is not guaranteed to make Aurora unless you get a green run before the cutoff) is a strong bonus and fulfils the overall desire to minimize/remove this pain.

-- Mike

sayrer

unread,
May 26, 2011, 1:03:08 AM5/26/11
to Mike Shaver, dev-pl...@lists.mozilla.org, Justin Wood (Callek), rob...@ocallahan.org
On Wednesday, May 25, 2011 6:06:52 PM UTC-7, Robert O'Callahan wrote:
>
> But saying "hey, if you review my feature patch today and my big refactoring
> patch tomorrow instead of the other way around, users may get this feature
> six weeks earlier" seems perfectly rational to me. Always ignoring the
> release cycle would not be rational.

I agree. But I want to suggest that we give this process some time to sink in. I wonder if we will see frantic check-in behavior for Firefox 10. I think we should see whether developers react differently once they realize that these events happen often.

Also, we have a relatively green tree on mozilla-central today, so we didn't lose tons of time. Recall that our previous release tactics have resulted in multi-day tree closures.

- Rob

sayrer

unread,
May 26, 2011, 1:17:46 AM5/26/11
to
On Tuesday, May 24, 2011 9:01:07 AM UTC-7, Matt Brubeck wrote:
>
> I speak from experience

I will be taking the advice Matt gave here. It's understandable for a development effort to miss a target release. It's only when code misses two or three or four release opportunities (depending on the complexity) that we'll have to look closer at what's happening.

2 release cycles: 12 weeks, 2.76 months
3 release cycles: 18 weeks, 4.14 months
4 release cycles: 24 weeks, 5.52 months
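
For anyone who wants to check the conversion, here is a minimal sketch. It assumes a 6-week cycle (as discussed elsewhere in this thread) and an average month of 365.25 / 12 days; the month length is my assumption, and the names `WEEKS_PER_CYCLE` and `cycles_to_months` are made up for illustration.

```python
DAYS_PER_WEEK = 7
DAYS_PER_MONTH = 365.25 / 12  # assumed average month length in days
WEEKS_PER_CYCLE = 6           # one release cycle, per this thread


def cycles_to_months(cycles: int) -> float:
    """Convert a number of release cycles to months, rounded to 2 places."""
    weeks = cycles * WEEKS_PER_CYCLE
    return round(weeks * DAYS_PER_WEEK / DAYS_PER_MONTH, 2)


for cycles in (2, 3, 4):
    weeks = cycles * WEEKS_PER_CYCLE
    print(f"{cycles} release cycles: {weeks} weeks, {cycles_to_months(cycles)} months")
```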

Some efforts will take 6 months or more than a year. Hopefully, we'll be able to spot those by the amount of planning and resources we're putting in. Or, we'll be up front about taking a risk that may not pan out.

- Rob

Asa Dotzler

unread,
May 26, 2011, 4:37:25 AM5/26/11
to
On 5/25/2011 4:21 PM, Marco Bonardo wrote:
> Il 25/05/2011 20:24, Benjamin Smedberg ha scritto:
>> Or we should just pick the changeset which was chosen for the Tuesday
>> nightly.
>
> While it would be good to have nightly testing on a known changeset (taking
> another changeset means nightly testers are using something with more or
> fewer fixes), this won't prevent taking a bad changeset.

It's also practically not interesting. We're going to spend a few days
evaluating and turning off or backing out things that aren't ready on
Aurora before we deliver our first build to the Aurora channel so we're
probably never going to match the nightly anyway.

- A

Robert Kaiser

unread,
May 26, 2011, 12:11:05 PM5/26/11
to
Robert O'Callahan schrieb:

> "Optimizing for the timing of the release cycle" can mean more than
> "everything gets checked in at the last minute"

Actually, for me, it means the exact opposite - good optimization for
the cycle means that the big things will land in the first two weeks and
smaller fixes as well as improvements on those big things will land in
the latter four weeks of it. Somehow that reminds me of the "merge
windows" used for the Linux kernel at the beginning of each cycle...

Robert Kaiser

--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community should think about. And most of the
time, I even appreciate irony and fun! :)

Mike Hommey

unread,
May 26, 2011, 12:46:51 PM5/26/11
to Robert Kaiser, dev-pl...@lists.mozilla.org
On Thu, May 26, 2011 at 06:11:05PM +0200, Robert Kaiser wrote:
> Robert O'Callahan schrieb:
> >"Optimizing for the timing of the release cycle" can mean more than
> >"everything gets checked in at the last minute"
>
> Actually, for me, it means the exact opposite - good optimization
> for the cycle means that the big things will land in the first two
> weeks and smaller fixes as well as improvements on those big things
> will land in the latter four weeks of it. Somehow that reminds me of
> the "merge windows" used for the Linux kernel at the beginning of
> each cycle...

But then, the Linux kernel development cycle doesn't work with a fixed
date; it simply waits for things to be good enough after the fixup
period following the merge window.

Mike

Robert Kaiser

unread,
May 26, 2011, 3:31:44 PM5/26/11
to
Mike Hommey schrieb:

Sure, and they don't ship binaries to end users, etc. - What I meant was
just the roughly two weeks of merging the large changes at the beginning
of the cycle.
