Going back to m-c and gaia master for v1.1

Jonas Sicking

Apr 3, 2013, 3:16:41 PM

to dev...@lists.mozilla.org, mi...@mozilla.com

Hi All,

It's come up a number of times that there are a fairly large number of
issues with the way that we're using "v1.1 branches", i.e. b2g18 for
Gecko and v1-train for gaia.

I won't go into the various issues here since I think it's generally
agreed that it would be good if we could avoid them.

The current release plan for v1.1 means that we have code freeze on
August 12th. This is the date when, as I understand it, we absolutely
have to be done with the last line of code and it's pencils down for
everyone.

This actually means that we would have the ability to go back to
mozilla-central and gaia master right now and ride the normal release
trains for the Gecko 23 release, while still reaching the release
milestone for gecko before v1.1 is done.

This has a lot of attractive features:
* No backporting work for now!
* Much simpler backporting work once we branch for aurora on May 13th.
* We spend more time writing code and less time porting it.
* Less risk involved when backporting patches.
* We pick up a whole host of bug fixes and new features that have
landed since Gecko 18 branched, including performance work.

However there are quite a few things that we need to check before we
can make such a move:
* Will it affect our partners' ability to upgrade 1.0 users to the 1.1 release?
* Are there any big and scary landings planned for the Gecko
23 release that we'd rather not take? Gfx layers-refactoring comes to
mind.
* Will we be able to take security fixes that are landing in Gecko 23
all the way up until August 6th?
* Are mozilla-central and gaia master working well enough right now for
Firefox OS that this is an option?
* Will all relevant partner testing still be possible even though
we'll be on mozilla-central for some of it?
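
The May 13th and August 6th dates above are consistent with a six-week
train cadence. A minimal sketch of that arithmetic follows; it assumes each
stage (aurora, beta) lasts exactly 42 days, which this message does not
state explicitly. The variable names are illustrative only.

```python
from datetime import date, timedelta

# Assumption: each train stage lasts exactly six weeks.
CYCLE = timedelta(weeks=6)

# From the message: Gecko 23 branches to aurora on May 13th, 2013.
aurora_start = date(2013, 5, 13)

beta_start = aurora_start + CYCLE    # aurora -> beta merge
release_merge = beta_start + CYCLE   # beta -> release merge

weeks_on_branches = (release_merge - aurora_start).days // 7

print("aurora:", aurora_start.isoformat())
print("beta:", beta_start.isoformat())
print("release merge:", release_merge.isoformat())
print("weeks between aurora branch and release merge:", weeks_on_branches)
```

Under these assumptions the release merge falls on August 5th, one day
before the August 6th date mentioned above, and the time spent on aurora
plus beta adds up to 12 weeks.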

We also have to develop a plan for what to do if we need to fix some
Gecko issue that requires large enough changes that we can't land it
in the Firefox release trains. At that point we have to branch away
from the normal Firefox release branch. This means that all fixes that
go into the Firefox branch will have to also be ported to the B2G
branch.

All in all I think the advantages outweigh the disadvantages here.
Things from Firefox development landing in aurora and beta tend to be
very safe. Definitely safer than the sum of the large amounts of work
we'll still be doing for B2G. So if we can work through the issue list
above, I think we should do it.

Would love input from both developers and product on this.
Unfortunately I'll be on vacation Thursday and Friday, so it would be
great to get help driving the work on the checklist above during that
time.

/ Jonas

Andreas Gal

Apr 3, 2013, 3:35:39 PM

to Jonas Sicking, dev...@lists.mozilla.org, mi...@mozilla.com

Hi Jonas,

I am a bit confused by your dates here. The code freeze for v1.1 is 4/26 (or some time shortly thereafter), not some time in August. After 4/26 we will ideally only do small fixes requested by partners. The vendor already started QA on 4/1, and the chipset vendor's QA timeline relies on there being a small delta between v1.0.1 and v1.1.

A freeze in the August/September timeline will be for 1.2. Maybe you mean that? And for that it absolutely makes sense to start with trunk again.

Thanks,

Andreas

Justin Lebar

Apr 3, 2013, 3:41:48 PM

to Jonas Sicking, dev...@lists.mozilla.org, mi...@mozilla.com

This is similar to what we did with 1.0. We rode the trains until
beta, then we branched.

The problem was, we branched too early. We thought the branch would
stabilize for a few weeks before we shipped, but in fact b2g18 has
been stabilizing for months and still isn't ready to ship as v1.0.1.

To avoid making the same mistake again, I think we'd probably want to
say that we're not going to branch. That essentially means we need to
be done when Aurora branches to Beta, because changes on Beta are
(rightly) highly restricted.

If we're not ready by the time we hit Beta, we slip by 6 weeks. We
don't put our beta users at risk for breakage to make our b2g
deadlines.

Note that we'd still be double-landing most patches on aurora and m-c,
because we will inevitably be under a huge amount of schedule
pressure. But that's a heck of a lot better than double-landing on
b2g18 and m-c.

Jonas Sicking

Apr 3, 2013, 3:47:42 PM

to Andreas Gal, dev...@lists.mozilla.org, mi...@mozilla.com

Unfortunately it's not the case that we'll do only small fixes post
4/26. First of all the current leo+ list is bringing us past 4/26, and
we'll have plenty of leo+ bugs filed between now and then.
Additionally, we're expecting a very large number of bugs to be filed in
response to testing that's starting after 4/26.

So the amount of code churn that we'll get from aurora and beta
landings I would expect to be dramatically smaller than the amount of
code churn that we'll be doing specifically for B2G no matter which
branch we are on.

/ Jonas

On Wed, Apr 3, 2013 at 12:35 PM, Andreas Gal <andre...@gmail.com> wrote:
> Hi Jonas,
>
> I am a bit confused by your dates here. The code freeze for v1.1 is 4/26 (or some time shortly thereafter), not some time in August. After 4/26 we will ideally only do small fixes requested by partners. Starting 4/1 the vendor already has started with QA, and the chipset vendor's QA timeline relies on there being a small delta between v1.0.1 and v1.1.
>
> A freeze in the August/September timeline will be for 1.2. Maybe you mean that? And for that it absolutely makes sense to start with trunk again.
>
> Thanks,
>
> Andreas
>

Andreas Gal

Apr 3, 2013, 3:55:09 PM

to Jonas Sicking, dev...@lists.mozilla.org, mi...@mozilla.com

Whatever we have to land on the branch, landing more than what we have to is not an option. Starting 4/26 the chipset vendor and the OEMs drive and decide what patches they take since they will be in an active QA cycle. Every time we touch the code we risk breaking something, so they will want to minimize changes (this is not a guess, I know this is the case with them). They will not take any patches they don't request. If we dump our trunk on them through August, it will be completely impossible for them to stabilize and they will have to fork internally. The result will be a lot of invisible breakage we don't understand because they will start reporting bugs against a branch we can't see or reproduce. We worked very hard to have these publicly visible "vendor" branches. They make us much more open than any other open source project involving hardware. Losing this benefit and openness would be extremely bad, in my opinion.

I agree with what Justin said though. We should avoid branching too early. We should hold off 1.2 work until 1.1 is in good shape and not changing much any more. That will avoid difficult uplifts. Already today most of the uplifts are not done by engineers because they are trivial. It would be good if we can keep it that way.

Andreas

Andreas Gal

Apr 3, 2013, 3:59:32 PM

to Justin Lebar, mi...@mozilla.com, Jonas Sicking, dev...@lists.mozilla.org

The vendor predicts that they will file around 300 regressions during the QA cycle (between now and August). I don't think we want to fix those on Aurora or Beta, and "let's just slip 6 weeks" is unfortunately not how the hardware business works. We are currently closely tied to hardware schedules because we are supporting individual hardware launches with software. The goal is over time to switch to a reference software model where we make a golden reference for a specific platform (a real hardware platform, or even just a reference platform), and OEMs make products out of that (similar to the Android model). It's clearly not a model we have arrived at yet. For a while we will have to continue to be closely involved in the actual product engineering. In a year or two ideally we will be closer to the reference platform model, and in that world riding trains makes sense, and slipping will be an option.

Andreas

On Apr 3, 2013, at 9:41 PM, Justin Lebar <justin...@gmail.com> wrote:

> This is similar to what we did with 1.0. We rode the trains until
> beta, then we branched.
>
> The problem was, we branched too early. We thought the branch would
> stabilize for a few weeks before we shipped, but in fact b2g18 has
> been stabilizing for months and still isn't ready to ship as v1.0.1.
>
> To avoid making the same mistake again, I think we'd probably want to
> say that we're not going to branch. That essentially means we need to
> be done when Aurora branches to Beta, because changes on Beta are
> (rightly) highly restricted.
>
> If we're not ready by the time we hit Beta, we slip by 6 weeks. We
> don't put our beta users at risk for breakage to make our b2g
> deadlines.
>
> Note that we'd still be double-landing most patches on aurora and m-c,
> because we will inevitably be under a huge amount of schedule
> pressure. But that's a heck of a lot better than double-landing on
> b2g18 and m-c.
>

Milan Sreckovic

Apr 3, 2013, 4:01:38 PM

to Andreas Gal, mi...@mozilla.com, Jonas Sicking, dev...@lists.mozilla.org

Agreeing with everybody, even though there is a bit of a disagreement :-) There is the "right thing" and the "thing we have to do". I know it's hard for me to talk about "things we have to do" because, well, that leads to an imperfect world.

The reason the trains work for us is because we have everybody looking at nightly, so by the time we start slowing down for Aurora and Beta, we know we're in decent shape, and we know we can change our behaviour, lock things down, and hit the dates for shipping.

With the current setup, we have a large group of people, with a large set of tests, get on the train late and introduce the work that we're not used to doing at that stage of the game. To beat the analogy to death, it seems like we should have another station, or another set of tracks for them. In the perfect world though, they would be on the train from day one, and not surprise us and themselves with all the work way late in the game.

I'm getting to the point - if our partners can get on the aurora train right now (22), then maybe there is enough time to ship based on that version. I doubt our partners will go for it, but it would be something to start planting now. As for riding the 23 train, I like to think I'm brave, but that one scares the crap out of me, with a few working days between us producing a build and pencils down.

Milan

On 2013-04-03, at 3:55 PM, Andreas Gal <andre...@gmail.com> wrote:

>
> Whatever we have to land on the branch, landing more than what we have to is not an option. Starting 4/26 the chipset vendor and the OEMs drive and decide what patches they take since they will be in an active QA cycle. Every time we touch the code we risk breaking something, so they will want to minimize changes (this is not a guess, I know this is the case with them). They will not take any patches they don't request. If we dump our trunk on them through August, it will be completely impossible for them to stabilize and they will have to fork internally. The result will be a lot of invisible breakage we don't understand because they will start reporting bugs against a branch we can't see or reproduce. We worked very hard to have these publicly visible "vendor" branches. They make us much more open than any other open source project involving hardware. Losing this benefit and openness would be extremely bad, in my opinion.
>
> I agree with what Justin said though. We should avoid branching too early. We should hold off 1.2 work until 1.1 is in good shape and not changing much any more. That will avoid difficult uplifts. Already today most of the uplifts are not done by engineers because they are trivial. It would be good if we can keep it that way.
>

> Andreas
>

> On Apr 3, 2013, at 9:47 PM, Jonas Sicking <jo...@sicking.cc> wrote:
>
>> Unfortunately it's not the case that we'll do only small fixes post
>> 4/26. First of all the current leo+ list is bringing us past 4/26, and
>> we'll have plenty of leo+ bugs filed between now and then.
>> Additionally, we're expecting very large number bugs to be filed in
>> response to testing that's starting after 4/26.
>>
>> So the amount of code churn that we'll get from aurora and beta
>> landings I would expect to be dramatically smaller than the amount of
>> code churn that we'll be doing specifically for B2G no matter which
>> branch we are on.
>>
>> / Jonas
>>
>> On Wed, Apr 3, 2013 at 12:35 PM, Andreas Gal <andre...@gmail.com> wrote:
>>> Hi Jonas,
>>>
>>> I am a bit confused by your dates here. The code freeze for v1.1 is 4/26 (or some time shortly thereafter), not some time in August. After 4/26 we will ideally only do small fixes requested by partners. Starting 4/1 the vendor already has started with QA, and the chipset vendor's QA timeline relies on there being a small delta between v1.0.1 and v1.1.
>>>
>>> A freeze in the August/September timeline will be for 1.2. Maybe you mean that? And for that it absolutely makes sense to start with trunk again.
>>>
>>> Thanks,
>>>

Jonas Sicking

Apr 3, 2013, 4:03:43 PM

to Andreas Gal, Justin Lebar, dev...@lists.mozilla.org, mi...@mozilla.com

Yeah, I don't believe we have the luxury of being "done" before
branching to aurora. That's certainly where we should eventually get,
but we still have so many table-stakes features to implement that we
can't put a 12-week gap at the end.

I agree that we did branch too early for the v1.0 release and my
proposal does put branching at a later stage compared to when we did
it for v1.0.

/ Jonas

On Wed, Apr 3, 2013 at 12:59 PM, Andreas Gal <andre...@gmail.com> wrote:
>
> The vendor predicts that they will file around 300 regressions during the QA cycle (between now and August). I don't think we want to fix that on Aurora or Beta, and "lets just slip 6 weeks" is unfortunately not how the hardware business works. We are currently closely tied to hardware schedules because we are supporting individual hardware launches with software. The goal is over time to switch to a reference software model where we make a golden reference for a specific platform (a real hardware platform, or even just a reference platform), and OEMs make products out of that (similar to the Android model). Its clearly not a model where we have arrived at yet. For a while we will have to continue to be closely involved in the actual product engineering. In a year or two ideally we will be closer to the reference platform model, and in that world riding trains makes sense, and slipping will be an option.
>

> Andreas
>

> On Apr 3, 2013, at 9:41 PM, Justin Lebar <justin...@gmail.com> wrote:
>
>> This is similar to what we did with 1.0. We rode the trains until
>> beta, then we branched.
>>
>> The problem was, we branched too early. We thought the branch would
>> stabilize for a few weeks before we shipped, but in fact b2g18 has
>> been stabilizing for months and still isn't ready to ship as v1.0.1.
>>
>> To avoid making the same mistake again, I think we'd probably want to
>> say that we're not going to branch. That essentially means we need to
>> be done when Aurora branches to Beta, because changes on Beta are
>> (rightly) highly restricted.
>>
>> If we're not ready by the time we hit Beta, we slip by 6 weeks. We
>> don't put our beta users at risk for breakage to make our b2g
>> deadlines.
>>
>> Note that we'd still be double-landing most patches on aurora and m-c,
>> because we will inevitably be under a huge amount of schedule
>> pressure. But that's a heck of a lot better than double-landing on
>> b2g18 and m-c.
>>

Justin Lebar

Apr 3, 2013, 4:07:40 PM

to Andreas Gal, mi...@mozilla.com, Jonas Sicking, dev...@lists.mozilla.org

> "lets just slip 6 weeks" is unfortunately not how the hardware business works.

If the past months have taught me anything, it's that "let's just
branch and double-land everything" is unfortunately not a path to
shipping quality software on time with happy, productive engineers.
So this is not a one-sided trade-off.

I get that there are partner constraints here, and perhaps riding the
trains is incompatible with them, but I think there's a certain amount
of creativity called for here, given that what we've been doing hasn't
been working. I'd be curious to know if you have any ideas in this
respect beyond "wait a few years".

Jonas Sicking

Apr 3, 2013, 4:07:40 PM

to Milan Sreckovic, Andreas Gal, dev...@lists.mozilla.org, mi...@mozilla.com

On Wed, Apr 3, 2013 at 1:01 PM, Milan Sreckovic <msrec...@mozilla.com> wrote:
> Agreeing with everybody, even though there is a bit of a disagreement :-) There is the "right thing" and the "thing we have to do". I know it's hard for me to talk about "things we have to do" because, well, that leads to imperfect world.
>
> The reason the trains work for us is because we have everybody looking at nightly, so by the time we start slowing down for Aurora and Beta, we know we're in decent shape, and we know we can change our behaviour, lock things down, and hit the dates for shipping.
>
> With the current setup, we have a large group of people, with a large set of tests, get on the train late and introduce the work that we're not used to doing at that stage of the game. To beat the analogy to death, it seems like we should have another station, or another set of tracks for them. In the perfect world though, they would be on the train from day one, and not surprise us and themselves with all the work way late in the game.
>
> I'm getting to the point - if our partners can get on the aurora train right now (22), then maybe there is enough time to ship based on that version. I doubt our partners will go for it, but it would be something to start planting now. As for riding the 23 train, I like to think I'm brave, but that one scares the crap out of me, with a few working days between us producing a build and pencils down.

As far as I know, we have shipped 16 releases of Firefox using the
train model without ever being more than a few days late. So I'm not
worried about having the non-b2g parts of gecko ready for the target
date here. There's a lot more risk in the B2G work, but that risk is
unaffected by which train we are on.

In fact, optimizing for reducing that risk means that we should do
whatever makes b2g development easiest. And the shorter the distance
between m-c and the train we are shipping, the easier writing
b2g code is.

/ Jonas

Justin Lebar

Apr 3, 2013, 4:12:32 PM

to Jonas Sicking, Andreas Gal, dev...@lists.mozilla.org, mi...@mozilla.com

> I agree that we did branch too early for the v1.0 release and my
> proposal does put branching at a later stage compared to when we did
> it for v1.0.

Ultimately I think branching based on a date has been a mistake every
time we've done it in this project. We branch when we hit a date but
ship when we meet quality standards. It's easy to see how this leads
to branching way too early.

On Wed, Apr 3, 2013 at 4:03 PM, Jonas Sicking <jo...@sicking.cc> wrote:
> Yeah, I don't believe we have the luxury of being "done" before
> branching to aurora. That's certainly where we should eventually get,
> but we have too many table-stakes features still to implement that we
> can't put a 12 week gap at the end.
>
> I agree that we did branch too early for the v1.0 release and my
> proposal does put branching at a later stage compared to when we did
> it for v1.0.
>
> / Jonas
>

Andreas Gal

Apr 3, 2013, 4:13:03 PM

to Justin Lebar, mi...@mozilla.com, Jonas Sicking, dev...@lists.mozilla.org

On Apr 3, 2013, at 10:07 PM, Justin Lebar <justin...@gmail.com> wrote:

>> "lets just slip 6 weeks" is unfortunately not how the hardware business works.
>

> If the past months have taught me anything, it's that "let's just
> branch and double-land everything" is unfortunately not a path to
> shipping quality software on time with happy, productive engineers.
> So this is not a one-sided trade-off.
>
> I get that there are partner constraints here, and perhaps riding the
> trains is incompatible with them, but I think there's a certain amount
> of creativity called for here, given that what we've been doing hasn't
> been working. I'd be curious to know if you have any ideas in this
> respect beyond "wait a few years".

I don't think there are any silver bullets I can offer. What exactly is the problem we are trying to solve here, though? How often do people still have to double-land themselves? The feedback I was getting is that the vast majority of landings are done by our awesome uplifting mini-team. In rare instances people have to help directly (that's what we should try to reduce by keeping trunk and v1.1 close as long as v1.1 is still changing). Is this accurate or do you feel that uplifting is still a pain?

Andreas

Andreas Gal

Apr 3, 2013, 4:16:09 PM

to Jonas Sicking, Milan Sreckovic, dev...@lists.mozilla.org, mi...@mozilla.com

You are asking for different things here. You are proposing two things.

1. Switch from 18 to 22 for v1.1, which means taking hundreds or thousands of patches that were never tested or certified as part of v1.0, dramatically changing our risk profile. This is a complete no-go for our partners. We can keep arguing about this, but it will absolutely not happen. You have the vendor contacts yourself. Feel free to reach out and ask.

2. Ride the X train. If X=18, we can do that, but it means landing hundreds of b2g18 patches that are not on firefox18 on firefox18. If you can convince the Firefox release drivers to take those on the ESR, I am ok with this approach, but I don't see the benefits.

The general complication here is indeed not the influx of aurora/beta fixes into b2g; it's really the other way around that I see as problematic.

Andreas

>
> / Jonas

Ben Francis

Apr 3, 2013, 4:20:02 PM

to Jonas Sicking, dev...@lists.mozilla.org, mi...@mozilla.com

On Wed, Apr 3, 2013 at 8:16 PM, Jonas Sicking <jo...@sicking.cc> wrote:

> This actually means that we would have the ability to go back to
> mozilla-central and gaia master right now and ride the normal release
> trains for the Gecko 23 release, while still reaching the release
> milestone for gecko before v1.1 is done.
>

I won't try to speak for the gecko part, because I understand there are
challenges in syncing Firefox release schedules with Firefox OS release
schedules.

But from Gaia's point of view I absolutely support going back to using
master for 1.1. I'd like to see us branch gaia releases as late as possible
(if at all), and only allow bugfixes on that branch (not new features or
enhancements).

Right now a very large proportion of patches on master are getting uplifted
to v1-train anyway, and many of those that aren't uplifted end up being
uplifted later because there turn out to be unexpected regressions due to
dependencies and bad merges.

Once we switch to two week iterations of development post-1.1 presumably
those iterations will happen in master too?

Ben

Andreas Gal

Apr 3, 2013, 4:20:24 PM

to Justin Lebar, mi...@mozilla.com, Jonas Sicking, dev...@lists.mozilla.org

On Apr 3, 2013, at 10:12 PM, Justin Lebar <justin...@gmail.com> wrote:

>> I agree that we did branch too early for the v1.0 release and my
>> proposal does put branching at a later stage compared to when we did
>> it for v1.0.
>
> Ultimately I think branching based on a date has been a mistake every
> time we've done it in this project. We branch when we hit a date but
> ship when we meet quality standards. It's easy to see how this leads
> to branching way too early.

I think this is as close as it gets to a silver bullet. I think in practice we will have to branch because there will be patches that people want to land that aren't meant for v1.1, or where we aren't sure it can go into 1.1 and we want to land them somewhere (trunk!). But we should try to keep trunk and v1.1 close to reduce the pain of landings to close to 0 (engineers don't have to do it, no conflicts). This is how we "branch when we are ready". When we think v1.1 is stable enough, we can let trunk float and absorb v1.2 work which will make it diverge and make uplifts hard, but at that point we will be ready for that.

Andreas

>
> On Wed, Apr 3, 2013 at 4:03 PM, Jonas Sicking <jo...@sicking.cc> wrote:
>> Yeah, I don't believe we have the luxury of being "done" before
>> branching to aurora. That's certainly where we should eventually get,
>> but we have too many table-stakes features still to implement that we
>> can't put a 12 week gap at the end.
>>
>> I agree that we did branch too early for the v1.0 release and my
>> proposal does put branching at a later stage compared to when we did
>> it for v1.0.
>>
>> / Jonas
>>

>>>>> This actually means that we would have the ability to go back to
>>>>> mozilla-central and gaia master right now and ride the normal release
>>>>> trains for the Gecko 23 release, while still reaching the release
>>>>> milestone for gecko before v1.1 is done.
>>>>>

Jonas Sicking

Apr 3, 2013, 4:20:02 PM

to Andreas Gal, Milan Sreckovic, dev...@lists.mozilla.org, mi...@mozilla.com

On Wed, Apr 3, 2013 at 1:16 PM, Andreas Gal <andre...@gmail.com> wrote:
>
> On Apr 3, 2013, at 10:07 PM, Jonas Sicking <jo...@sicking.cc> wrote:
>
>> On Wed, Apr 3, 2013 at 1:01 PM, Milan Sreckovic <msrec...@mozilla.com> wrote:
>>> Agreeing with everybody, even though there is a bit of a disagreement :-) There is the "right thing" and the "thing we have to do". I know it's hard for me to talk about "things we have to do" because, well, that leads to imperfect world.
>>>
>>> The reason the trains work for us is because we have everybody looking at nightly, so by the time we start slowing down for Aurora and Beta, we know we're in decent shape, and we know we can change our behaviour, lock things down, and hit the dates for shipping.
>>>
>>> With the current setup, we have a large group of people, with a large set of tests, get on the train late and introduce the work that we're not used to doing at that stage of the game. To beat the analogy to death, it seems like we should have another station, or another set of tracks for them. In the perfect world though, they would be on the train from day one, and not surprise us and themselves with all the work way late in the game.
>>>
>>> I'm getting to the point - if our partners can get on the aurora train right now (22), then maybe there is enough time to ship based on that version. I doubt our partners will go for it, but it would be something to start planting now. As for riding the 23 train, I like to think I'm brave, but that one scares the crap out of me, with a few working days between us producing a build and pencils down.
>>
>> As far as I know, we have shipped 16 releases of Firefox using the
>> train model without ever being more than a few days late. So I'm not
>> worried about having the non-b2g parts of gecko ready for the target
>> date here. There's a lot more risk in the B2G work, but that risk is
>> unaffected by which train we are on.
>>
>> In fact, optimizing for reducing that risk means that we should do
>> whatever makes b2g development easiest. And the shorted distance that
>> we have between m-c and the train we are shipping, the easier writing
>> b2g code is.
>
> You are asking for different things here. You are proposing two things.
>
> 1. Switch from 18 to 22 for v1.1, which means taking hundreds or thousands of patches that were never tested or certified as part of v1.0, dramatically changing our risk profile. This is a complete no-go for our partners. We can keep arguing about this, but it will absolutely not happen. You have the vendor contacts yourself. Feel free to reach out and ask.

Yup. This was one of the points that I listed in the original email.
I don't want to speak for Alex, but I *think* that he has offered to
do this.

FWIW, I think the large number of patches that we've taken for B2G
since certification poses a much larger risk than the gecko patches
that we are talking about here. Simply because the B2G patches are
much closer to the relevant code.

/ Jonas

Justin Lebar

Apr 3, 2013, 4:27:42 PM

to Andreas Gal, mi...@mozilla.com, Jonas Sicking, dev...@lists.mozilla.org

>> If the past months have taught me anything, it's that "let's just
>> branch and double-land everything" is unfortunately not a path to
>> shipping quality software on time with happy, productive engineers.
>> So this is not a one-sided trade-off.
>>
>> I get that there are partner constraints here, and perhaps riding the
>> trains is incompatible with them, but I think there's a certain amount
>> of creativity called for here, given that what we've been doing hasn't
>> been working. I'd be curious to know if you have any ideas in this
>> respect beyond "wait a few years".
>
> I don't think there are any silver bullets I can offer. What exactly is the problem we are trying to solve here though. How often do people still have to double-land themselves? The feedback I was getting is that the vast majority of landings is done by our awesome uplifting mini-team.

Even with our wonderful sheriff team, the many branches we have are a
huge pain. There are a lot of reasons. Here are some.

* Constant fighting about "risk". Essentially all important patches
have to go through two layers of approval (review, blocking+).
Non-blocking patches have to go through three layers of approval
(review, tracking+, approval+) these days.

In most cases, the risk assessment is completely pro-forma. Engineers
get what they want, in the end. But the process takes up time and
energy. I contend we would /never/ put up with this sort of process
BS in normal Firefox development.

* Divergence between the branches makes landing hard. It's not at all
uncommon for my patches to work on trunk and cause test failures on
b2g18. And this problem will only get worse the longer we keep these
branches around.

* Divergence between the branches causes bugs not caught by tests.
Again, this has happened to me personally, and has a high cost.

* Divergence between the branches means that all branches aren't
tested. Essentially nobody is testing b2g nightly, so it often
doesn't work. So why am I even landing patches on m-c?

> In rare instances people have to help directly (thats what we should try to reduce by
> keeping trunk and v1.1 close as long v1.1 is still changing). Is this accurate or do you
> feel that uplifting is still a pain?

I don't think keeping v1.1 and trunk close is a goal of release
drivers. Their goal is to stabilize 1.1 by denying approval for risky
patches. That is directly at odds with keeping trunk and 1.1 close.

If we want to keep 1.1 and trunk close, there's a simple solution:
Make them the same branch. If that doesn't work, perhaps we need a
more creative solution. But certainly what we're doing now isn't
working towards this goal at all.

Andreas Gal

Apr 3, 2013, 4:30:32 PM

to Jonas Sicking, Milan Sreckovic, dev...@lists.mozilla.org, mi...@mozilla.com

>
> FWIW, I think the large number of patches that we've taken for B2G
> since certification poses a much larger risk than the gecko patches
> that we are talking about here. Simply because the B2G patches are
> much closer to the relevant code.

I think this is where the vendor will disagree with you. Anything they report they want a fix for. Anything they don't report they don't care about; it's just unnecessary risk/cost to them. Feel free to reach out though. Definitely can't hurt. In general, I would prefer to ship the most recent gecko we can, obviously.

Andreas

>
> / Jonas

Andreas Gal

Apr 3, 2013, 4:32:59 PM4/3/13

to Justin Lebar, mi...@mozilla.com, Jonas Sicking, dev...@lists.mozilla.org

>
> I don't think keeping v1.1 and trunk close is a goal of release
> drivers. Their goal is to stabilize 1.1 by denying approval for risky
> patches. That is directly at odds with keeping trunk and 1.1 close.

What about the other option? Avoid v1.2 work until v1.1 is more stable?

Andreas

DANIEL JESUS COLOMA BAIGES

Apr 3, 2013, 4:39:58 PM4/3/13

to Andreas Gal, Justin Lebar, dev...@lists.mozilla.org, Jonas Sicking, mi...@mozilla.com

On 4/3/13 10:13 PM, "Andreas Gal" <andre...@gmail.com> wrote:

>
>On Apr 3, 2013, at 10:07 PM, Justin Lebar <justin...@gmail.com> wrote:
>
>>> "let's just slip 6 weeks" is unfortunately not how the hardware
>>> business works.
>>
>> If the past months have taught me anything, it's that "let's just
>> branch and double-land everything" is unfortunately not a path to
>> shipping quality software on time with happy, productive engineers.
>> So this is not a one-sided trade-off.
>>
>> I get that there are partner constraints here, and perhaps riding the
>> trains is incompatible with them, but I think there's a certain amount
>> of creativity called for here, given that what we've been doing hasn't
>> been working. I'd be curious to know if you have any ideas in this
>> respect beyond "wait a few years".
>
>I don't think there are any silver bullets I can offer. What exactly is
>the problem we are trying to solve here, though? How often do people still
>have to double-land themselves? The feedback I was getting is that the
>vast majority of landings is done by our awesome uplifting mini-team. In
>rare instances people have to help directly (that's what we should try to
>reduce by keeping trunk and v1.1 close as long as v1.1 is still changing).
>Is this accurate or do you feel that uplifting is still a pain?
>
>Andreas

Until a couple of weeks ago that was not a big problem (although the
number of times devs needed to help the uplifting sheriffs has not been
low). Now that new features are being uplifted to v1-train, the number of
times people need to create branch-specific patches is increasing. This
is a heavy burden, especially now that we are under pressure to deliver
1.0.1 and the certification process is likely to raise many bugs.

Apart from that, I think the level of control we are enforcing to land
things in v1.1 is high, which means that v1.1 and master are not that
close either.

I am not saying I have a solution for this, but these points are
definitely reducing team productivity *a lot*.


Alex Keybl

Apr 3, 2013, 4:42:20 PM4/3/13

to Justin Lebar, dev...@lists.mozilla.org, Andreas Gal, Jonas Sicking, mi...@mozilla.com

Justin, let's take a step back and not make invalid assertions.

> * Constant fighting about "risk". Essentially all important patches
> have to go through two layers of approval (review, blocking+).
> Non-blocking patches have to go through three layers of approval
> (review, tracking+, approval+) these days.

This is false. Any bug can request approval without being tracking+ (same as desktop). Tracking nominations are a good way to understand whether the user impact is enough to warrant uplift, or to get automatic uplift in the next release.

> In most cases, the risk assessment is completely pro-forma. Engineers
> get what they want, in the end. But the process takes up time and
> energy.

If it's not worth the 2 minutes for an engineer to request approval, it's likely not worth uplifting the code change. And if approved, the bug is uplifted automatically by sheriffs.

You're right though - you all could lie to us about risk or reward. We're trusting you aren't.

> I contend we would /never/ put up with this sort of process
> BS in normal Firefox development.

This is so apples and oranges it's astonishing. How can you compare the landing requirements of desktop releases (which involve no partners, have massive pre-release testing populations, are a mature product, and have no hard feature scope) to B2G releases?

-Alex

Justin Lebar

Apr 3, 2013, 4:43:53 PM4/3/13

to Andreas Gal, mi...@mozilla.com, Jonas Sicking, dev...@lists.mozilla.org

On Wed, Apr 3, 2013 at 4:32 PM, Andreas Gal <andre...@gmail.com> wrote:
>>
>> I don't think keeping v1.1 and trunk close is a goal of release
>> drivers. Their goal is to stabilize 1.1 by denying approval for risky
>> patches. That is directly at odds with keeping trunk and 1.1 close.
>
> What about the other option? Avoid v1.2 work until v1.1 is more stable?

I'm not sure I completely understand this proposal.

But if you're suggesting that we do everything the same except we
avoid checking in patches for 1.2 into trunk, I don't think that
improves the situation much. In my experience, little 1.2 work is
ongoing compared to the amount of regular Gecko work that gets checked
in every cycle. As a result, it's not the B2G 1.2 work that's causing
the painful divergences between m-c and the 1.1 branch for me.

It's good to give people an outlet for 1.2 work that they're doing,
otherwise they'll argue harder to get it into 1.1, which is unhelpful.

Alex Keybl

Apr 3, 2013, 4:44:47 PM4/3/13

to Justin Lebar, dev...@lists.mozilla.org, Andreas Gal, Jonas Sicking, mi...@mozilla.com

>> I contend we would /never/ put up with this sort of process
>> BS in normal Firefox development.
>
> This is so apples and oranges it's astonishing. How can you compare the landing requirements of desktop releases (which involve no partners, have massive pre-release testing populations, are a mature product, and have no hard feature scope) to B2G releases?

Forgot to mention a lack of reliable, automated testing.

-Alex

Alex Keybl

Apr 3, 2013, 4:48:38 PM4/3/13

to Jonas Sicking, Milan Sreckovic, Andreas Gal, dev...@lists.mozilla.org, mi...@mozilla.com

> Yup. This was one of the points that I listed in the original emails.
> I don't want to speak for Alex, but I *think* that he has offered to
> do this.

I share engineering's desire to move to tip of our branches as soon as possible, but in our conversation I noted that it was a blocker from previous partner qualification/update conversations (we can revisit with partners) and that it would also be very difficult to evaluate how many regressions we'd be introducing by taking m-c/master as v1.1. It may put us in another 1.0 situation all over again.

-Alex

Ben Francis

Apr 3, 2013, 4:55:38 PM4/3/13

to Andreas Gal, dev...@lists.mozilla.org, Justin Lebar, Jonas Sicking, mi...@mozilla.com

On Wed, Apr 3, 2013 at 9:13 PM, Andreas Gal <andre...@gmail.com> wrote:

> What exactly is the problem we are trying to solve here, though? How often
> do people still have to double-land themselves? The feedback I was getting
> is that the vast majority of landings is done by our awesome uplifting
> mini-team. In rare instances people have to help directly (that's what we
> should try to reduce by keeping trunk and v1.1 close as long as v1.1 is still
> changing). Is this accurate or do you feel that uplifting is still a pain?
>

I suspect you're really asking about gecko here, but I'd like to share an
experience in gaia where the magic of uplifting didn't work so well.

Bug 836199 was blocked by bug 836647 so bug 836647 was uplifted. But this
patch caused a regression which was later fixed in bug 849280 so that was
uplifted too. But bug 849280 turned out to be subtly dependent on
bug 830644 which wasn't uplifted, thereby causing bug 855021. Bug 855021
could be fixed by uplifting bug 830644, but that bug has 25+ patches
attached to it, so that wouldn't be very smart. I ended up writing a patch
just for 1.0.1 and 1.1 to stop the dependency madness.
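The chain above can be read as a small dependency graph. What follows is only an illustrative sketch, not a tool anyone on this list uses: the `NEEDS` table and the `uplift_closure` helper are hypothetical names, and the edges are simply the relationships described in the paragraph above (blocked by, regression fixed in, subtly dependent on). It shows how uplifting one bug can transitively pull in others.

```python
from collections import deque

# Hypothetical table: "uplifting the key bug also requires these bugs".
# Edges are taken from the chain described above.
NEEDS = {
    836199: [836647],  # 836199 was blocked by 836647
    836647: [849280],  # 836647 caused a regression fixed in 849280
    849280: [830644],  # 849280 subtly depends on 830644
}


def uplift_closure(bug, needs):
    """Return every bug that must also be uplifted, in discovery order."""
    required = []
    visited = {bug}
    queue = deque([bug])
    while queue:
        current = queue.popleft()
        for dep in needs.get(current, []):
            if dep not in visited:
                visited.add(dep)
                required.append(dep)
                queue.append(dep)
    return required


if __name__ == "__main__":
    # Everything that uplifting bug 836199 drags along with it.
    print(uplift_closure(836199, NEEDS))
```

The last bug in that closure is the one with 25+ patches attached, which is why a branch-specific patch was the cheaper way to cut the chain.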

I was asked to uplift most of these myself because there were serious merge
conflicts caused by the divergence between v1-train and master over a long
period of time.

I guess what I'm trying to say is that the longer we keep those branches
open the more crazy things get, the more weird regressions there
are (pushing out release dates) and the more it hits our productivity
having to spend time resolving these things.

It's perhaps worth mentioning that gaia and gecko don't necessarily have to
follow the same release/branching schedule as they should be relatively
independent from each other, in theory.

Ben

--
Ben Francis
http://tola.me.uk

Alex Keybl

Apr 3, 2013, 4:56:35 PM4/3/13

to DANIEL JESUS COLOMA BAIGES, Justin Lebar, mi...@mozilla.com, Andreas Gal, dev...@lists.mozilla.org, Jonas Sicking

Hi Daniel,

Can you speak to this point Jonas made from your perspective?

>>>>>> * Will it affect our partners' ability to upgrade 1.0 users to the
>>>>>> 1.1 release?

Update size would increase, and we can't yet say how many regressions we've taken between master/m-c and v1-train/m-b2g18.

-Alex

Alex Keybl

Apr 3, 2013, 5:00:52 PM4/3/13

to Ben Francis, mi...@mozilla.com, Jonas Sicking, dev...@lists.mozilla.org

> But from Gaia's point of view I absolutely support going back to using
> master for 1.1. I'd like to see us branch gaia releases as late as possible
> (if at all), and only allow bugfixes on that branch (not new features or
> enhancements).

I think this is a good point: we can treat taking the tip of Gecko and taking the tip of Gaia for an upcoming Firefox OS release as separate decisions. That being said, I'm not certain we can pause v1.1 development long enough to seriously evaluate the quality of master and determine whether it's a viable switch in the next month.

Perhaps we should change the topic to "Going back to m-c and/or gaia master for v1.2".

-Alex

On Apr 3, 2013, at 1:20 PM, Ben Francis <b...@krellian.com> wrote:

> On Wed, Apr 3, 2013 at 8:16 PM, Jonas Sicking <jo...@sicking.cc> wrote:
>
>> This actually means that we would have the ability to go back to
>> mozilla-central and gaia master right now and ride the normal release
>> trains for the Gecko 23 release, while still reaching the release
>> milestone for gecko before v1.1 is done.
>>
>
> I won't try to speak for the gecko part, because I understand there are
> challenges in syncing Firefox release schedules with Firefox OS release
> schedules.
>
> But from Gaia's point of view I absolutely support going back to using
> master for 1.1. I'd like to see us branch gaia releases as late as possible
> (if at all), and only allow bugfixes on that branch (not new features or
> enhancements).
>
> Right now a very large proportion of patches on master are getting uplifted
> to v1-train anyway, and many of those that aren't uplifted end up being
> uplifted later because there turn out to be unexpected regressions due to
> dependencies and bad merges.
>
> Once we switch to two week iterations of development post-1.1 presumably
> those iterations will happen in master too?
>
> Ben

Justin Lebar

Apr 3, 2013, 5:02:37 PM4/3/13

to Alex Keybl, dev...@lists.mozilla.org, Andreas Gal, Jonas Sicking, mi...@mozilla.com

I'm sorry, Alex. My intent wasn't to pick a fight.

>> * Constant fighting about "risk". Essentially all important patches
>> have to go through two layers of approval (review, blocking+).
>> Non-blocking patches have to go through three layers of approval
>> (review, tracking+, approval+) these days.
>
> This is false. Any bug can request approval without being tracking+ (same as desktop). Tracking nominations are a good way to understand whether the user impact is enough to warrant uplift, or to get automatic uplift in the next release.

This rule apparently changed four days ago, or perhaps the wiki was
wrong for a spell. I didn't get the memo that this changed; sorry I
was a bit out of date here.

https://wiki.mozilla.org/Release_Management/B2G_Landing?title=Release_Management%2FB2G_Landing&action=historysubmit&diff=642991&oldid=642501

>> In most cases, the risk assessment is completely pro-forma. Engineers
>> get what they want, in the end. But the process takes up time and
>> energy.
>
> If it's not worth the 2 minutes for an engineer to request approval, it's likely not worth uplifting the code change. And if approved, the bug is uplifted automatically by sheriffs.

It's a bit frustrating to be told that the work I claim is distracting
is in fact not a drag on my productivity. I think I'm probably a
better judge of what is and isn't a drag than anyone else.

An approval request is not two minutes of work. It involves a context
switch back to the bug after it's landed. It usually involves some
back-and-forth in the bug. It involves looking up bugs to fill in the
"regression caused by" field. It involves keeping track of bugs to
make sure that the ones you care about get plus'ed. These small
context switches have an outsized cost for engineers; see [1].

But I also don't think you're responding to my main point, which is
that not only are uplift requests distracting, they contribute little
value, because engineers almost always get the outcome they want.
That's a question of fact, and if you think it's incorrect, we could
certainly look at the data.

> You're right though - you all could lie to us about risk or reward. We're trusting you aren't.

People respond to incentives. It's not a question of lying so much as
taking a position in a debate. But again, my point is that since
release drivers have to trust engineers' assessments, release drivers
contribute relatively little value to the discussion.

I claim these approvals are a point of frustration and a drag on
productivity. I'm not suggesting that we get rid of them and leave
everything else the same; I was responding to Andreas's query as to
why branches are suboptimal even though we don't have to land on them
ourselves. I don't think we actually have a serious disagreement
here.

[1] http://www.paulgraham.com/makersschedule.html

Alex Keybl

Apr 3, 2013, 5:22:33 PM4/3/13

to Justin Lebar, dev...@lists.mozilla.org, Andreas Gal, Jonas Sicking, mi...@mozilla.com

> This rule apparently changed four days ago, or perhaps the wiki was
> wrong for a spell. I didn't get the memo that this changed; sorry I
> was a bit out of date here.

Seems to have been wrong for a spell, but at least that's cleared up now.

> An approval request is not two minutes of work. It involves a context
> switch back to the bug after it's landed. It usually involves some
> back-and-forth in the bug. It involves looking up bugs to fill in the
> "regression caused by" field. It involves keeping track of bugs to
> make sure that the ones you care about get plus'ed. These small
> context switches have an outsized cost for engineers; see [1].

I see this as being proactive (making sure a landing makes sense) rather than reactive (finding unnecessary regressions weeks later). We're taking a small cost up front, and helping to walk engineers through the risk/reward process (which many are unfamiliar with).

> But I also don't think you're responding to my main point, which is
> that not only are uplift requests distracting, they contribute little
> value, because engineers almost always get the outcome they want.

You seem to be suggesting that the approval process doesn't catch/prevent mistakes. That's just not true. We still get frivolous bugs being nominated for uplift, which points to the fact that these changes would have otherwise been landed without a conversation, and possibly caused blocker regressions. We still find uplift nominations asking for unnecessary string changes late in the cycle. We still get approvals that haven't gone through a UX review. The list of things that we have an eye for goes on and on.

Then there's the whole set of bugs which aren't nominated for uplift because there wasn't good enough reason to land, but which may have landed unnecessarily otherwise.

> That's a question of fact, and if you think it's incorrect, we could
> certainly look at the data.

I always welcome data. It's a lot easier to have a conversation about data than about generalizations made from the experience of a single developer. Those of us in triage are in a unique position to see the forest for the trees, and that's all too often forgotten.

-Alex

L. David Baron

Apr 3, 2013, 5:49:47 PM4/3/13

to Andreas Gal, dev...@lists.mozilla.org, Justin Lebar, Jonas Sicking, mi...@mozilla.com

On Wednesday 2013-04-03 22:20 +0200, Andreas Gal wrote:
> On Apr 3, 2013, at 10:12 PM, Justin Lebar <justin...@gmail.com> wrote:
> >> I agree that we did branch too early for the v1.0 release and my
> >> proposal does put branching at a later stage compared to when we did
> >> it for v1.0.
> >
> > Ultimately I think branching based on a date has been a mistake every
> > time we've done it in this project. We branch when we hit a date but
> > ship when we meet quality standards. It's easy to see how this leads
> > to branching way too early.
>
> I think this is as close as it gets to a silver bullet. I think in
> practice we will have to branch because there will be patches that
> people want to land that aren't meant for v1.1, or where we aren't
> sure it can go into 1.1 and we want to land them somewhere
> (trunk!). But we should try to keep trunk and v1.1 close to reduce
> the pain of landings to close to 0 (engineers don't have to do it,
> no conflicts). This is how we "branch when we are ready". When we
> think v1.1 is stable enough, we can let trunk float and absorb
> v1.2 work which will make it diverge and make uplifts hard, but at
> that point we will be ready for that.

What I learned about branchpoints from release cycles before we were
doing train-based releases was that we don't want to branch until
there's a substantial amount of pressure from people who want to
land stuff that's not for that release.

If the reason to branch is worry that developers might need to land
features for the next release, it's too early to branch. If the
reason is pressure from a handful of developers who want to land
features for the next release, it's still too early to branch. The
time to branch is when somewhere around half the developers on the
project are clamoring to have the trunk open because there's no
longer useful work they can do for the release that needs to branch.

Branching earlier leads to either or both of:
(a) extra work from maintaining branches
(b) loss of emphasis on the release as people move to trunk work
even when the release isn't done
and the upside of branching (letting development continue on the
trunk) isn't worthwhile until you have a significant portion of
developers who can no longer make useful contributions to the
current release.
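As a rough illustration of the rule of thumb above (not a policy anyone has agreed to), the branch decision can be written as a simple threshold check. The function name `should_branch` and the default `threshold=0.5` are assumptions made for this sketch; "somewhere around half" is the only figure given above.

```python
def should_branch(blocked_developers, total_developers, threshold=0.5):
    """Return True once enough developers have no useful release work left.

    blocked_developers: developers asking for the trunk to open because they
        can no longer contribute usefully to the release (hypothetical count).
    threshold: assumed stand-in for "somewhere around half".
    """
    if total_developers <= 0:
        raise ValueError("total_developers must be positive")
    return blocked_developers / total_developers >= threshold


if __name__ == "__main__":
    # A handful of developers wanting trunk open: still too early.
    print(should_branch(3, 40))
    # About half the project clamoring for trunk: time to branch.
    print(should_branch(20, 40))
```

The point of writing it this way is that the input is observed pressure from developers, not a calendar date.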

That said, this experience is based on a model in which the trunk
can and should be restricted (eliminating the upside of branching of
reduced risk due to the restrictions). But I think that's a model
worth considering here, with weak restrictions (things needed for
release X, not an approval process, and only in the B2G-specific
areas).

-David

--
𝄞 L. David Baron http://dbaron.org/ 𝄂
𝄢 Mozilla http://www.mozilla.org/ 𝄂

L. David Baron

Apr 3, 2013, 5:59:17 PM4/3/13

to Justin Lebar, mi...@mozilla.com, Alex Keybl, Andreas Gal, dev...@lists.mozilla.org, Jonas Sicking

On Wednesday 2013-04-03 17:02 -0400, Justin Lebar wrote:
> > If it's not worth the 2 minutes for an engineer to request approval, it's likely not worth uplifting the code change. And if approved, the bug is uplifted automatically by sheriffs.
>
> It's a bit frustrating to be told that the work I claim is distracting
> is in fact not a drag on my productivity. I think I'm probably a
> better judge of what is and isn't a drag than anyone else.
>
> An approval request is not two minutes of work. It involves a context
> switch back to the bug after it's landed. It usually involves some
> back-and-forth in the bug. It involves looking up bugs to fill in the
> "regression caused by" field. It involves keeping track of bugs to
> make sure that the ones you care about get plus'ed. These small
> context switches have an outsized cost for engineers; see [1].

There's also the cost of the things that don't get uplifted, because
people forget about them, or because people who are less core to the
project (like me) don't know what the current standards are and
worry about making too many requests for the current standards.
That's a cost in bugs that people spend time filing and debugging
that have already been fixed. There's the cost in testing that the
bug exists on a particular branch in order to figure out whether
approval requests for that branch are needed. (I think I've seen
people not bother; if a bug was reported on v1-train, then it's
probably not a problem on v1.0.1, so why bother asking for uplift
there? Another cost of branches.)

The cost of the context switching is also really substantial. The
rule that approval shouldn't be requested until after review means
that the developer has to load everything back into memory in order
to write the approval request that could have been written quickly
right after the patch was written.

Ben Francis

Apr 3, 2013, 7:38:36 PM4/3/13

to Andreas Gal, dev...@lists.mozilla.org, Justin Lebar, Jonas Sicking, mi...@mozilla.com

On Wed, Apr 3, 2013 at 9:32 PM, Andreas Gal <andre...@gmail.com> wrote:

>
> What about the other option? Avoid v1.2 work until v1.1 is more stable?
>

If we start using Scrum and two-week iterations as Jonas has described,
then there should be no need for any engineer to do 1.n+1 work until
1.n is complete, or even to work on iteration n+1 until iteration n is
complete.

If we have Scrum teams dedicated to product areas with a product owner and
scrum master for each team then we should have a clear ordered product
backlog and sprint backlogs to work from. If iterations are time-boxed then
you cannot, by definition, be working on the next iteration.

Sorry to bring project management into a branching discussion again, but I
think the two are inextricably linked. A continuous delivery approach might
call for a different branching strategy for example.

Julien Wajsberg

Apr 4, 2013, 5:28:13 AM4/4/13

to Andreas Gal, dev...@lists.mozilla.org, Justin Lebar, Jonas Sicking, mi...@mozilla.com

Le 03/04/2013 22:32, Andreas Gal a écrit :
>> I don't think keeping v1.1 and trunk close is a goal of release
>> drivers. Their goal is to stabilize 1.1 by denying approval for risky
>> patches. That is directly at odds with keeping trunk and 1.1 close.

> What about the other option? Avoid v1.2 work until v1.1 is more stable?
>

I see no particular problem in landing 1.2 work in a 1.1 branch (which
can be master/m-c). We can disable it if it's not stable enough when in
the stabilizing phase (as we do on Firefox AFAIK).

Julien Wajsberg

Apr 4, 2013, 5:34:26 AM4/4/13

to Alex Keybl, Andreas Gal, mi...@mozilla.com, Justin Lebar, dev...@lists.mozilla.org, Jonas Sicking

Le 03/04/2013 23:22, Alex Keybl a écrit :
>
>> But I also don't think you're responding to my main point, which is
>> that not only are uplift requests distracting, they contribute little
>> value, because engineers almost always get the outcome they want.
> You seem to be suggesting that the approval process doesn't catch/prevent mistakes. That's just not true. We still get frivolous bugs being nominated for uplift, which points to the fact that these changes would have otherwise been landed without a conversation, and possibly caused blocker regressions. We still find uplift nominations asking for unnecessary string changes late in the cycle. We still get approvals that haven't gone through a UX review. The list of things that we have an eye for goes on and on.
>
> Then there's the whole set of bugs which aren't nominated for uplift because there wasn't good enough reason to land, but which may have landed unnecessarily otherwise.

Actually, I've seen some bugs for which approval should have been requested
but wasn't. For the bugs that I knew needed to be uplifted (or that Vivien
asked me to watch), I personally had to ask for approval so that they
eventually got uplifted.

(just to add some weight to what David is saying).

--
Julien

Andreas Gal

Apr 4, 2013, 5:37:56 AM4/4/13

to Julien Wajsberg, dev...@lists.mozilla.org, Justin Lebar, Jonas Sicking, mi...@mozilla.com

That works for some features and code changes but not all. v1.1 stabilization has started _NOW_. Going forward, every time we touch the code we risk breaking something, even when landing code that's disabled by default. So until yesterday it would have been OK to land v1.2 work on v1.1 as long as it's disabled. From now until we ship v1.1 it's not really a viable option due to stability concerns.

Andreas

Julien Wajsberg

Apr 4, 2013, 8:55:54 AM4/4/13

to Andreas Gal, dev...@lists.mozilla.org, Justin Lebar, Jonas Sicking, mi...@mozilla.com

Le 04/04/2013 11:37, Andreas Gal a écrit :
> That works for some features and code changes but not all. v1.1 stabilization has started _NOW_. Going forward, every time we touch the code we risk breaking something, even when landing code that's disabled by default. So until yesterday it would have been OK to land v1.2 work on v1.1 as long as it's disabled. From now until we ship v1.1 it's not really a viable option due to stability concerns.

Yep I agree that now is probably too late.

By the same logic, we shouldn't land any features that were due for 1.1
but for some reason are not ready now, either.
(I understand that's probably too extreme, though...)

Julien Wajsberg

Apr 4, 2013, 9:01:33 AM4/4/13

to Ben Francis, Justin Lebar, mi...@mozilla.com, Andreas Gal, dev...@lists.mozilla.org, Jonas Sicking

Le 03/04/2013 22:55, Ben Francis a écrit :

> On Wed, Apr 3, 2013 at 9:13 PM, Andreas Gal <andre...@gmail.com> wrote:
>
>> What exactly is the problem we are trying to solve here, though? How often
>> do people still have to double-land themselves? The feedback I was getting
>> is that the vast majority of landings are done by our awesome uplifting
>> mini-team. In rare instances people have to help directly (that's what we
>> should try to reduce by keeping trunk and v1.1 close as long as v1.1 is still
>> changing). Is this accurate or do you feel that uplifting is still a pain?
>>

> I suspect you're really asking about gecko here, but I'd like to share an
> experience in gaia where the magic of uplifting didn't work so well.
>
> Bug 836199 was blocked by bug 836647 so bug 836647 was uplifted. But this
> patch caused a regression which was later fixed in bug 849280 so that was
> uplifted too. But bug 849280 turned out to be subtly dependent on
> bug 830644 which wasn't uplifted, thereby causing bug 855021. Bug 855021
> could be fixed by uplifting bug 830644, but that bug has 25+ patches
> attached to it, so that wouldn't be very smart. I ended up writing a patch
> just for 1.0.1 and 1.1 to stop the dependency madness.

Each time we have a conflict, we need to ask: do I need to uplift
another bug, or do I need to resolve the conflict?

And that's quite a difficult question to answer indeed:
* To uplift another bug, we first need to find it. That means blaming,
git show, diffing, and manually comparing, which is very time-consuming.
* Resolving the conflict is probably easier for the dev, but in my past
experience most conflicts are actually resolved by uplifting other bugs
that were not hard dependencies but rather made the diff different.
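
The chain Ben describes above can be sketched as a small graph walk. This is only an illustration, not a tool anyone in the thread used: the `NEEDS` mapping and the `uplift_closure` function are hypothetical names, and treating "regression fixed in" and "subtly dependent on" as the same kind of edge is a simplifying assumption. The bug numbers are the ones from Ben's message.

```python
from collections import deque

# Hypothetical model of the chain from Ben's message: each bug maps to the
# bugs that must also be on the branch once that bug is uplifted.
NEEDS = {
    836199: [836647],  # 836199 was blocked by 836647
    836647: [849280],  # 836647 caused a regression fixed in 849280
    849280: [830644],  # 849280 subtly depends on 830644
}


def uplift_closure(bug, needs):
    """Return every bug that uplifting `bug` drags along, in discovery order."""
    visited = {bug}
    dragged_in = []
    queue = deque([bug])
    while queue:
        current = queue.popleft()
        for dep in needs.get(current, []):
            if dep not in visited:
                visited.add(dep)
                dragged_in.append(dep)
                queue.append(dep)
    return dragged_in


if __name__ == "__main__":
    # One requested uplift turns into three more.
    print(uplift_closure(836199, NEEDS))
```

In practice the hard part is the one described above: the edges are not recorded anywhere, so they have to be discovered by hand before a walk like this is even possible.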

--
Julien

Julien Wajsberg

Apr 4, 2013, 9:13:41 AM4/4/13

to Alex Keybl, dev...@lists.mozilla.org, Ben Francis, Jonas Sicking, mi...@mozilla.com

Le 03/04/2013 23:00, Alex Keybl a écrit :
>> But from Gaia's point of view I absolutely support going back to using
>> master for 1.1. I'd like to see us branch gaia releases as late as possible
>> (if at all), and only allow bugfixes on that branch (not new features or
>> enhancements).
> I think this is a good point - separating out taking tip of Gecko/Gaia for an upcoming Firefox OS release. That being said, I'm not certain we can pause from v1.1 development long enough to seriously evaluate the quality of master and determine whether it's a viable switch in the next month.
>
> Perhaps we should change the topic to being "Going back to m-c and/or gaia master for v1.2"
>

Agreed, this is too late for v1.1. But as long as we have a clear path
to doing that in a future version I'm happy.

To minimize the number of gecko branches, I think we ultimately need to
use ESR versions of gecko in B2G, because we'll need a long-standing
branch for the gecko used in B2G anyway. I don't know which version is
planned to be the next ESR but it's worth having this version as our
goal for v1.2 (or v2).

Another (maybe braindead) thought: as I understand, it's difficult for
us to use the gecko stable branch using the normal train because we'd
need to land too many patches. Then, maybe we can use the beta branch as
B2G stable branch. I told you it was gonna be dumb.

--
Julien

Milan Sreckovic

Apr 4, 2013, 9:23:51 AM4/4/13

to Ben Francis, dev...@lists.mozilla.org, Justin Lebar, Andreas Gal, Jonas Sicking, mi...@mozilla.com

Milan

On 2013-04-03, at 7:38 PM, Ben Francis <b...@krellian.com> wrote:

> On Wed, Apr 3, 2013 at 9:32 PM, Andreas Gal <andre...@gmail.com> wrote:
>
> What about the other option? Avoid v1.2 work until v1.1 is more stable?
>

> If we start using Scrum and two week iterations as Jonas has described, then there should be no need for any engineer to work on 1.n+1 work until 1.n is complete, or even work on iteration n+1 until iteration n is complete.

I'd be careful here. There is nothing magical about scrum that would have stopped us from getting exactly to where we are today. I don't see it as either a necessary or a sufficient condition for getting to a better state, whatever we may choose that state to be.

>
> If we have Scrum teams dedicated to product areas with a product owner and scrum master for each team then we should have a clear ordered product backlog and sprint backlogs to work from. If iterations are time-boxed then you can not, by definition, be working on the next iteration.

Sure, but that assumes you have a particular approach in mind, and everybody follows it. Scrum doesn't give you that approach, it needs to come separately. It'll help with execution and focus and such, but won't tell you what the right thing to do is.

Don't get me wrong, I don't mind scrum in general, but it doesn't magically wave the problems away.

Ryan VanderMeulen

Apr 4, 2013, 9:48:48 AM4/4/13

to mozilla...@lists.mozilla.org

On 4/4/2013 9:13 AM, Julien Wajsberg wrote:
> To minimize the number of gecko branches, I think we ultimately need to
> use ESR versions of gecko in B2G, because we'll need a long-standing
> branch for the gecko used in B2G anyway. I don't know which version is
> planned to be the next ESR but it's worth having this version as our
> goal for v1.2 (or v2).

The next ESR will be Gecko 24. It goes every 7 releases.
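
As a rough illustration of the "every 7 releases" cadence, the sketch below computes the first ESR at or after a given Gecko version. The function name `next_esr_at_or_after` is made up for this example, and anchoring the cycle at Gecko 17 (the ESR before 24) is an assumption, as is the cadence staying fixed.

```python
def next_esr_at_or_after(version, base_esr=17, cadence=7):
    """Return the first ESR version >= `version`.

    Assumes ESRs fall on base_esr, base_esr + cadence, base_esr + 2*cadence, ...
    (base_esr=17 and cadence=7 are assumptions for illustration).
    """
    offset = (version - base_esr) % cadence
    if offset == 0:
        return version
    return version + (cadence - offset)


if __name__ == "__main__":
    # b2g18 is based on Gecko 18; the next ESR after that is Gecko 24.
    print(next_esr_at_or_after(18))  # 24
```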

Andrew Halberstadt

Apr 4, 2013, 12:57:43 PM4/4/13

to mozilla...@lists.mozilla.org

On 04/03/2013 04:44 PM, Alex Keybl wrote:
> Forgot to mention a lack of reliable, automated testing.

This is a bit of a catch-22. The automation is unreliable in part
*because* of the branching model:

1. (Almost) no one is interested in fixing issues that affect tests on
trunk. E.g. reftests have been hidden for weeks because the b2g process
keeps crashing and no one seems interested in fixing it. This is mostly
because everyone is so busy focusing on 1.1.

2. Automation is horribly out of date on b2g18. Many hacks are in place
just to have anything running at all.

Andrew

Ben Francis

Apr 4, 2013, 2:28:05 PM4/4/13

to Milan Sreckovic, dev...@lists.mozilla.org, Justin Lebar, Andreas Gal, Jonas Sicking, mi...@mozilla.com

On Thu, Apr 4, 2013 at 2:23 PM, Milan Sreckovic <msrec...@mozilla.com> wrote:

> There is nothing magical about scrum that would have stopped us from
> getting exactly to where we are today.
>

It's true that Scrum (or any methodology) isn't a silver bullet, and agile
requires a lot of discipline and buy-in, even culture change. It's also
true that Scrum doesn't prescribe as much process as other methodologies
like Extreme Programming which I've also used. But I disagree that we'd be
exactly where we are today if we were already using it.

> Sure, but that assumes you have a particular approach in mind, and
> everybody follows it. Scrum doesn't give you that approach, it needs to
> come separately. It'll help with execution and focus and such, but won't
> tell you what the right thing to do is.
>

The basic concepts of a product backlog, sprint backlog, product owner,
scrum master and development team that I mention are all core parts of
Scrum. Of course there's a lot more that's needed, but just having an
ordered product backlog and a focus on having a potentially releasable
product at the end of each timeboxed iteration instead of extremely long
release cycles with no checkpoints could lead us to choose a different
branching strategy.

Anyway, it seems like things are already moving in this direction for 1.2
and I agree with others that it may not now be a good idea to make these
changes for 1.1 as we're already a long way along with that and
expectations have been set.

Onwards and upwards!

Ben

Jonas Sicking

Apr 4, 2013, 3:22:21 PM4/4/13

to mi...@mozilla.com, dev...@lists.mozilla.org

To bring us back to the main topic of this thread.

All developers in this thread have said that there is significant cost to
staying on the branches the way we currently do. Part of that cost is the
approval process, part of it is the uplifting of patches.

This matches what I've heard talking to engineers.

I'm really inclined to believe this to be true. I would encourage others to
as well.

This doesn't mean that we've failed to help engineers with making the
process simpler. It surely would be a lot more painful if we hadn't had
people to help with uplifts or people to go through approvals as often as
we do.

But we are still paying a high cost by staying on the b2g18 and v1-train
branches.

I agree that there's an unknown amount of risk in going back to m-c and
gaia master. But there is also an unknown amount of risk in all the merging
that we are doing.

And the regressions associated with switching branches are something that
would happen now and that we would have between now and August to fix.
Regressions due to merges happen all through development, up until the day
we code freeze.

Current estimates of number of remaining bugs we have ahead of us is in the
several hundreds. This is a significant amount of work. The risk of not
being able to finish that on the current schedule is also something we need
to take into account.

This is, as of right now, my biggest concern, together with the general
unhappiness from engineers with the current branch handling.

Note that under the plan I'm proposing we would leave m-c and gaia master
in early May. That is when gecko goes into a much more aggressive
stabilization mode than we have ever had for b2g code.

So from that date our risk management through approvals would
essentially be the same as what we have now, i.e. the main source of risk
would be the required b2g patches landed to fix blockers.

Also note that by "code freeze" on Aug 12th, I'm referring to a phase that
we haven't yet reached for the v1.0.1 release, i.e. no more checkins
allowed *at all*.

/ Jonas

Andrew Sutherland

Apr 4, 2013, 5:49:18 PM4/4/13

to dev...@lists.mozilla.org

On 04/03/2013 05:22 PM, Alex Keybl wrote:
> You seem to be suggesting that the approval process doesn't catch/prevent mistakes. That's just not true. We still get frivolous bugs being nominated for uplift, which points to the fact that these changes would have otherwise been landed without a conversation, and possibly caused blocker regressions. We still find uplift nominations asking for unnecessary string changes late in the cycle. We still get approvals that haven't gone through a UX review. The list of things that we have an eye for goes on and on.

Is everyone making these mistakes? It seems like there may be people
with significant mozilla development experience like jlebar, roc, etc.
who can be granted 'a=' authority and trusted to know when their own
judgement is sufficient and when they need to ask for approval.

And I think assuming that everyone asking for approval would have landed
the bugs without asking is wrong. I know I've asked for approval for
patches with l10n changes because of situations similar to how *partners
are even now still requesting features on v1.0.1 that require new
strings*. (At those points in time the flag situation was different to
how it is now.) Obviously, unnecessary string changes are different
from completely new strings, but I know when it became clear that our
string freeze was a beautiful dream detached from reality, I stopped
obsessing about minor string changes slipping through.

Andrew

Axel Hecht

Apr 7, 2013, 11:34:25 AM4/7/13

to mozilla...@lists.mozilla.org

.... and there goes your product. Seriously, I can't picture how you can
expect to ship a product that doesn't remotely matter in en-US with that
attitude.

Axel