rethinking mozilla-central, or how I learned to stop worrying and love project branches

Mike Connor

May 9, 2011, 6:12:29 PM

to dev-planning@lists.mozilla.org planning

tl;dr - I want to propose a change to the way we land code on central. I think this will ease integration and take a lot of the tree watching burden off of individual developers. I think that individual changes should land on a cedar-like branch which is regularly merged with central, just like project branches are today.

Background

For quite some time now, more or less since we started running tests on tinderbox, we've struggled with oranges creating tree closures. Everyone has felt this pain at one time or another, pulling busted changesets, or being unable to land. Although random oranges are less of a problem now than before, we still have a very real problem with real test failures on landings. Our current build cycle time (up to four hours per cycle) means that a broken changeset blocks the tree for at least four hours and often, especially if we attempt to fix instead of back out, much longer.

These oranges don’t just block a landing, they block periodic merges from mozilla-central to project branches. To retain sanity, project branch merges should pull m-c (when green!) to the branch, ensure that cycles green, then merge back to a green m-c. The Services team intends to merge services-central on a weekly train model (which includes QA signoffs!), but we’ve been blocked at one end or the other nearly every time we’ve wanted to merge recently. This situation makes frequent merges less appealing, which leads to increasingly-large merges in both directions, compounding the problem.

The opportunity cost of our primary development tree being broken at some point _every_ day is hard to measure, but significant. The problem is actually twofold: one broken commit can block _everyone_ for 4-8 hours, and because of that, we require that every committer watch the tree for hours on every checkin to ensure that they quickly respond to a broken commit. This is a huge amount of focus/context switching. We _must_ do better.

Caveats

No single change or technological solution will solve this problem completely, but perfect cannot be the enemy of good here. We still need to fight and win the War on Orange, but it does not actually solve this problem, since much of the recent orange has been real regressions caught by our test suites. We still should look at ways to automate merging and stop relying on/requiring human intervention, but not only does that require winning the War on Orange, it also requires a lot of code that doesn’t exist yet. In the meantime, we can make something better.

Proposal

First, we should create a special project branch, modelled after how Ehsan was running cedar. <bikeshed>Let’s call this mozilla-random</bikeshed> for lack of a better name. If you do not have a project branch, or your patch doesn’t fit on a current project branch, this is where you will push patches. Breaking this tree still sucks, but since it doesn’t actually block _everyone_ if you do this, *tree watching will not be required* on mozilla-random.

Once a day (or more) the sheriff will verify that mozilla-random is green and there are no perf regressions. If backouts need to happen (up to and including reverting all the way to the prior merge), the sheriff will do that, and then merge what’s left to mozilla-central. (As we know the sheriff list needs some work, we have a small group of volunteers including Ehsan, philikon, and rnewman who will take point on this in the short term.)

Once we have this workflow up and running, mozilla-central will be an integration branch for merging from other branches. Direct landings will require explicit sheriff approval, and should be reserved for extreme cases (and no, “it’s Aurora merge day” doesn’t count as extreme). This means that everything that hits m-c will have already gone through at least one, and typically much more than one, build run on our infrastructure without causing problems, so we expect this tree to stay _very_ green.

This proposal means less work for developers currently landing on mozilla-central (no tree watching!) and less work for maintainers of project branches who are trying to merge (tree is far less often closed/unstable). It also means the impact of any landing that causes bustage is contained to a much smaller group of committers, allowing for resolution on a less urgent schedule. The cost is that someone has to merge mozilla-random to mozilla-central each day, but it is still dramatically less work and overhead than the current system.

Robert O'Callahan

May 9, 2011, 6:54:13 PM

to Mike Connor, dev-planning@lists.mozilla.org planning

FWIW this sounds good to me.

Rob
--
"Now the Bereans were of more noble character than the Thessalonians, for
they received the message with great eagerness and examined the Scriptures
every day to see if what Paul said was true." [Acts 17:11]

Rob Campbell

May 9, 2011, 7:14:18 PM

to Mike Connor, dev-planning@lists.mozilla.org planning

This is how we've been using the Devtools project branch for the last couple of months and it's worked very well. Today's merge didn't require a whole lot of baby-sitting on my part as I already had a high level of confidence in our changesets.

One caveat: I've been doing frequent merges from m-c back to Devtools. This resulted in a large number of merge commits. Going forward, I think we'll be more selective about these merges to keep our history cleaner.

Cheers,
Rob

--
Rob Campbell


Marco Bonardo

May 9, 2011, 7:27:27 PM

On 10/05/2011 00:12, Mike Connor wrote:
> First, we should create a special project branch, modelled after how Ehsan was running cedar.

One of the problems with Cedar merges is that sometimes they can make
regression windows hard to find.
This happens especially with large merges; recently bz (iirc) tried to
merge Cedar to central with no more than 10 changesets in it.
Doing the merge once a day means removing any value from hourly builds in
central; at that point any regression window work should be done in this
new branch.

Another thing is that iirc Ehsan was suggesting to change our landing
habits using try, so that when a patch is ready it is pushed to try, and if
try results are fine (less than 2 oranges) it is automatically
transplanted to central. I think it's worth hearing from him how this
new proposal merges with his. Clearly Ehsan's proposal also means
checking talos: if a talos regression is suspected, automatically respin
talos tests to get a range and evaluate whether it's a real regression or noise.
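
The "less than 2 oranges" gate mentioned above could be sketched roughly as follows. This is only an illustration: `TryResult`, `can_auto_transplant`, and the job names are hypothetical, and nothing here reflects an existing try server or releng API. Talos checking is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class TryResult:
    job: str      # hypothetical job name, e.g. "linux-mochitest-1"
    orange: bool  # True if the job reported a test failure


def can_auto_transplant(results, max_oranges=2):
    """Return True when a try push is clean enough to transplant.

    Assumption: "fine" means at least one job ran and strictly fewer
    than max_oranges jobs came back orange.
    """
    oranges = sum(1 for result in results if result.orange)
    return bool(results) and oranges < max_oranges
```

A push with a single orange would pass this check and one with two oranges would not; a push with no results at all is treated as not ready.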

Finally, but this is most likely the minor problem, it means everybody
will have to maintain locally another repository to be able to push to
it (and eventually keep an updated build of it, if something could have
bitrotted in the meanwhile).

-m

Philipp von Weitershausen

May 9, 2011, 7:29:10 PM

to Rob Campbell, dev-planning@lists.mozilla.org planning, Mike Connor

On Mon, May 9, 2011 at 6:14 PM, Rob Campbell <rcam...@mozilla.com> wrote:
> One caveat: I've been doing frequent merges from m-c back to Devtools. This resulted in a large number of merge commits. Going forward, I think we'll be more selective about these merges to keep our history cleaner.

I'm not sure I agree with the "fewer merge commits == cleaner history"
assertion. In any case, I don't think it matters that much. Also, more
frequent merges allow us to catch breakage due to integration sooner
which I think is the most important point here.

(Anecdotal evidence: it was lamented that Tracemonkey merges
occasionally break m-c, which may have been caught sooner if m-c were
merged to Tracemonkey more frequently and the drama was sorted out
*there* first.)

Chris Pearce

May 9, 2011, 7:35:04 PM

to dev-pl...@lists.mozilla.org

On 10/05/2011 10:12 a.m., Mike Connor wrote:
> Proposal
>
> First, we should create a special project branch, modelled after how Ehsan was running cedar. <bikeshed>Let’s call this mozilla-random</bikeshed> for lack of a better name. If you do not have a project branch, or your patch doesn’t fit on a current project branch, this is where you will push patches. Breaking this tree still sucks, but since it doesn’t actually block _everyone_ if you do this, *tree watching will not be required* on mozilla-random.

I am not opposed to the idea of project branches and periodic merges to
mozilla-central, but I do still think we need to require developers to
watch mozilla-random after pushing there. If a push to mozilla-random
breaks the build or is perma-orange, and more bad changesets land on top
of that, it could take a long time for the tree to stabilize before the
sheriff could merge mozilla-random with mozilla-central. On a bad day
the sheriff would need to close mozilla-random for hours in order for
that tree to stabilize before merge. Unless you're suggesting that only
the sheriff be *required* to watch mozilla-random, and backout as things
go bad? What about when the sheriff's asleep? Should landers be required
to watch the tree only when the sheriff is asleep or otherwise off duty?

It sounds like you're suggesting we use mozilla-random sort of like a
TryServer, where developers push there, and if the changeset is good, it
merges with mozilla-central.

An alternative solution would be to require at least one greenish "try:
-a" push for each landing before allowing landing on mozilla-central?
I've taken that approach with my own landings, and that's worked well
for me. The only time I've had problems with landing recently is when I
*didn't* do this, or when I took ride alongs which didn't do this. It
takes more time to be this careful, but it greatly reduces the chance of
a bad landing, which saves everyone else's time in the long run.

Regards,
Chris Pearce.

Philipp von Weitershausen

May 9, 2011, 7:40:02 PM

to Marco Bonardo, dev-pl...@lists.mozilla.org

On Mon, May 9, 2011 at 6:27 PM, Marco Bonardo <ma...@supereva.it> wrote:
> On 10/05/2011 00:12, Mike Connor wrote:
>>
>> First, we should create a special project branch, modelled after how Ehsan
>> was running cedar.
>
> One of the problems with Cedar merges is that sometimes it can make
> regressionwindows hard to find.
> This happens especially with large merges, recently bz (iirc) tried to merge
> Cedar to central with no more than 10 changesets in it.
> Doing the merge once a day means removing any value to hourly builds in
> central, at that point any regressionwindow work should be done in this new
> branch.

Not sure about that. Certainly regression ranges might end up leading
back to mozilla-random. I, for one, hope that mozilla-random will
eventually be just for *random* patches, and that most teams will
consider project branches (they can be temporary ones, too, like
alder, birch, cedar, etc.)

> Another thing is that iirc Ehsan was suggesting to change our landing habits
> using try, so that when a patch is ready it is pushed to try, and if try
> results are fine (less than 2 oranges) it is automatically transplanted to
> central.

What mconnor is proposing is very much like this, except:
a) using a project branch instead of "try"
b) doing merges instead of transplant
c) doing manual checks (by the sheriff)

We could at some point automate (c), as you and ehsan suggest, but
that shouldn't stop us from implementing mconnor's proposal *NOW*. Let's
not block what clearly is an improvement for everybody on implementing
a whole new releng infrastructure.

> I think it's worth hearing from him how this new proposal merges
> with his. Clearly Ehsan's proposal means also checking talos, if a talos
> regression is suspected automatically respin talos tests to get a range and
> evaluate if it's a real regression or noise.

Yup, we'll need to do talos runs for every single mozilla-random push
to be able to identify regressions.

> Finally, but this is most likely the minor problem, it means everybody will
> have to maintain locally another repository to be able to push to it (and
> eventually keep an updated build of it, if something could have bitrotted in
> the meanwhile).

Really, I encourage every team to think about getting a project
branch. I for one don't intend to keep a checkout of mozilla-random.

Ehsan Akhgari

May 9, 2011, 7:42:21 PM

to Marco Bonardo, dev-pl...@lists.mozilla.org

On 11-05-09 7:27 PM, Marco Bonardo wrote:
> Another thing is that iirc Ehsan was suggesting to change our landing
> habits using try, so that when a patch is ready is pushed to try, and if
> try results are fine (less than 2 oranges) it is automatically
> transplanted to central. I think it's worth hearing from him how this
> new proposal merges with his. Clearly Ehsan's proposal means also
> checking talos, if a talos regression is suspected automatically respin
> talos tests to get a range and evaluate if it's a real regression or noise.

These are two orthogonal problems. My plans address the longer term
needs, and mconnor's plans address the shorter term needs.

Cheers,
Ehsan

Ehsan Akhgari

May 9, 2011, 7:45:13 PM

to Chris Pearce, dev-pl...@lists.mozilla.org

On 11-05-09 7:35 PM, Chris Pearce wrote:
> On 10/05/2011 10:12 a.m., Mike Connor wrote:
>> Proposal
>>
>> First, we should create a special project branch, modelled after how
>> Ehsan was running cedar. <bikeshed>Let’s call this
>> mozilla-random</bikeshed> for lack of a better name. If you do not
>> have a project branch, or your patch doesn’t fit on a current project
>> branch, this is where you will push patches. Breaking this tree still
>> sucks, but since it doesn’t actually block _everyone_ if you do this,
>> *tree watching will not be required* on mozilla-random.
>
> I am not opposed to the idea of project branches and periodic merges to
> mozilla-central, but I do still think we need to require developers to
> watch mozilla-random after pushing there. If a push to mozilla-random
> breaks the build or is perma-orange, and more bad changesets land on top
> of that, it could take a long time for the tree to stabilize before the
> sheriff could merge mozilla-random with mozilla-central. On a bad day
> the sheriff would need to close mozilla-random for hours in order for
> that tree to stabilize before merge. Unless you're suggesting that only
> the sheriff be *required* to watch mozilla-random, and backout as things
> go bad? What about when the sheriff's asleep? Should landers be required
> to watch the tree only when the sheriff is asleep or otherwise off duty?

When the sheriff wants to merge from mozilla-random, they take a look at
it. If something has broken builds/tests, that changeset and
_everything_ landed on top of it will get backed out, and then a merge
will happen. This way, there is no need for anybody to watch
mozilla-random.
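
The backout rule described above (the first broken changeset and everything landed on top of it gets backed out) can be sketched as a small helper. The name `plan_backout` and the `(changeset_id, is_green)` tuples are hypothetical, chosen only for illustration; this does not call Mercurial or tinderbox.

```python
def plan_backout(pushes):
    """Split pushes (oldest first) into (keep, back_out).

    Each push is a (changeset_id, is_green) tuple. Everything from the
    first non-green push onward is backed out; what precedes it is what
    the sheriff would merge to mozilla-central.
    """
    for index, (_, is_green) in enumerate(pushes):
        if not is_green:
            return pushes[:index], pushes[index:]
    # Nothing broke, so the whole range can be merged.
    return pushes, []
```

Note that a green push landed on top of a broken one still ends up in the back-out list, which is why a relanding may be needed.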

Cheers,
Ehsan

Mike Connor

May 9, 2011, 7:52:27 PM

to Ehsan Akhgari, dev-pl...@lists.mozilla.org, Chris Pearce

This is precisely correct. Stabilizing mozilla-random == back out to
last known good changeset. If your patch breaks the tree, you should
get backed out, period. If people aren't burning cycles to watch the
tree for hours after a checkin, this makes the cost of backouts like
this dramatically smaller, so there shouldn't be as strong of an
aversion to that as there is now.

-- Mike

Mark Finkle

May 9, 2011, 7:57:37 PM

Won't mozilla-random inherit the same drama that mozilla-central does
today? I suppose even if it does, there will be a benefit for those
groups currently using a project branch. They get to merge to mozilla-
central without needing to worry about mozilla-random.

I think we'd need to be sure that project branches, including mozilla-
random, pull from mozilla-central _before_ merging back in and waiting
a cycle or two to make sure the merge will truly be green.
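
One way to picture the check suggested above, as a rough sketch: the function name `ready_to_merge`, its parameters, and the default of two green cycles are all assumptions made for illustration (the text only says "a cycle or two").

```python
def ready_to_merge(branch_has_central_tip, cycles_since_pull,
                   required_green_cycles=2):
    """Decide whether a branch may be merged back into mozilla-central.

    branch_has_central_tip: True if the branch has pulled the current
        mozilla-central tip.
    cycles_since_pull: build/test cycle outcomes since that pull,
        oldest first, True meaning green.
    required_green_cycles: assumed positive; the most recent cycles
        that must all be green.
    """
    if not branch_has_central_tip:
        return False
    if len(cycles_since_pull) < required_green_cycles:
        return False
    return all(cycles_since_pull[-required_green_cycles:])
```

If something new lands on mozilla-central in the meantime, `branch_has_central_tip` becomes false again and the branch has to pull and wait once more.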

Philipp von Weitershausen

May 9, 2011, 8:01:59 PM

to Chris Pearce, dev-pl...@lists.mozilla.org

On Mon, May 9, 2011 at 6:35 PM, Chris Pearce <ch...@pearce.org.nz> wrote:
> I am not opposed to the idea of project branches and periodic merges to
> mozilla-central, but I do still think we need to require developers to watch
> mozilla-random after pushing there. If a push to mozilla-random breaks the
> build or is perma-orange, and more bad changesets land on top of that, it
> could take a long time for the tree to stabilize before the sheriff could
> merge mozilla-random with mozilla-central.

Yup. We're willing to accept that. It's the developer's choice to
choose mozilla-random rather than a project branch. The way I see it,
this will encourage teams to get project branches and sort out drama
there.

Also I think mconnor adequately addresses this point: orange
changesets get backed out on mozilla-random mercilessly. Worst case,
if mozilla-random goes hopelessly orange, we back out *everything*
until the last successful green merge.

> On a bad day the sheriff would
> need to close mozilla-random for hours in order for that tree to stabilize
> before merge. Unless you're suggesting that only the sheriff be *required*
> to watch mozilla-random, and backout as things go bad? What about when the
> sheriff's asleep? Should landers be required to watch the tree only when the
> sheriff is asleep or otherwise off duty?

If the Sheriff is off duty, people can land. As far as I understand
it, the next Sheriff will begin his/her shift by checking the tree and
doing merges or backouts, depending whether the tree is green or not.
The point is, this isn't m-c where it blocks *everyone*. It will just
mean that people who landed stuff to mozilla-random might have to land
again because it got backed out because they landed (unknowingly) on
orange.

I think that's an acceptable price to pay for not breaking m-c. Also,
you know, people can get their own project branches that they can
watch or not watch at their own pleasure.

> An alternative solution would be to require at least one greenish "try: -a"
> push for each landing before allowing landing on mozilla-central? I've taken
> that approach with my own landings, and that's worked well for me. The only
> time I've had problems with landing recently is when I *didn't* do this, or
> when I took ride alongs which didn't do this. It takes more time to be this
> careful, but it greatly reduces the chance of a bad landing, which saves
> everyone else's time in the long run.

I salute you, Chris, for being so disciplined. I, too, have always
provided try pushes for patches that are not targeting
services-central, our team's project branch. Alas, it seems not
everybody is as disciplined. Sometimes it even seems people aren't
checking whether a local build will succeed.

I think the time has come to accept that a) our current process is
broken (because it requires way too much waiting and context
switching) and b) that even a collection of incredibly smart people
will not be disciplined enough to follow such a process.

Marco Bonardo

May 9, 2011, 8:02:50 PM

On 10/05/2011 01:57, Mark Finkle wrote:
> I think we'd need to be sure that project branches, including mozilla-
> random, pull from mozilla-central _before_ merging back in and waiting
> a cycle or two to make sure the merge will truly be green.

At that point something else could land in central that still breaks
your project merge, like a new merge from mozilla-random. But sure,
requiring a recent-enough merge sounds feasible.
-m

Philipp von Weitershausen

May 9, 2011, 8:05:11 PM

to Mark Finkle, dev-pl...@lists.mozilla.org

On Mon, May 9, 2011 at 6:57 PM, Mark Finkle <mark....@gmail.com> wrote:
> Won't mozilla-random inherit the same drama that mozilla-central does
> today? I suppose even if it does, there will be a benefit for those
> groups currently using a project branch. They get to merge to mozilla-
> central without needing to worry about mozilla-random.

Precisely!

> I think we'd need to be sure that project branches, including mozilla-
> random, pull from mozilla-central _before_ merging back in and waiting
> a cycle or two to make sure the merge will truly be green.

Yes. For services-central, for instance, we have a pretty clear
process for this:
https://wiki.mozilla.org/Services/Process/MergingBetweenBranches.

Philipp von Weitershausen

May 9, 2011, 8:08:11 PM

to Marco Bonardo, dev-pl...@lists.mozilla.org

On Mon, May 9, 2011 at 7:02 PM, Marco Bonardo <ma...@supereva.it> wrote:
> At that point something else could land in central that still breaks your
> project merge, like a new merge from mozilla-random. But sure, requiring a
> recent-enough merge sounds feasible.

To quote mconnor:

Boris Zbarsky

May 9, 2011, 8:41:02 PM

On 5/9/11 7:27 PM, Marco Bonardo wrote:
> One of the problems with Cedar merges is that sometimes it can make
> regressionwindows hard to find.

We're already SOL on this, with Tracemonkey, mobile, etc, etc.

The mozregression script seems to deal with TM; we should teach it to
deal with other branches.

I think we do want nightlies on _all_ project branches and hourlies on
both m-c and mozilla-random to make that work. The current Cedar
nightly-less situation is pretty sucky (which is why I've been trying to
merge it often).

> Doing the merge once a day means removing any value to hourly builds in
> central, at that point any regressionwindow work should be done in this
> new branch.

Regression work will just need to get smarter....

> Finally, but this is most likely the minor problem, it means everybody
> will have to maintain locally another repository

This seems like a pretty small burden, from my point of view. Disk
space is cheap; there's no need to build the branch in question if you
test your stuff on try, etc. (Then again, I might be biased; I already
have 9 different branches here, not counting the various clones of m-c
I'm using.)

-Boris

Boris Zbarsky

May 9, 2011, 8:44:33 PM

On 5/9/11 7:40 PM, Philipp von Weitershausen wrote:
> Not sure about that. Certainly regression ranges might end up leading
> back to mozilla-random.

Will, not might.

> I, for one, hope that mozilla-random will
> eventually be just for *random* patches, and that most teams will
> consider project branches

I think your viewpoint here is colored by your interactions with the
tree. We have a number of developers doing work across notional "teams"
in intersecting areas of code; I suspect having separate "project
branches" for DOM, layout, and graphics work wouldn't work
that well. But I'm willing to be proved wrong!

> (they can be temporary ones, too, like
> alder, birch, cedar, etc.)

That would _definitely_ not solve the regression-finding problem. To
solve that problem we need nightly or hourly builds coming at regular
intervals (the interval can vary based on the volume of patches) on all
branches that feed into m-c.

> Really, I encourage every team to think about getting a project
> branch. I for one don't intend to keep a checkout of mozilla-random.

That's a perfectly valid approach for some people. Not for others.

-Boris

Boris Zbarsky

May 9, 2011, 8:45:34 PM

On 5/9/11 7:45 PM, Ehsan Akhgari wrote:
> When the sheriff wants to merge from mozilla-random, they take a look at
> it. If something has broken builds/tests, that changeset and
> _everything_ landed on top of it will get backed out, and then a merge
> will happen.

I think sheriffs should be more proactive about watching mozilla-random
and backing out changes from there. Backing out everything has a very
real cost that we should avoid if we can.

-Boris

Boris Zbarsky

May 9, 2011, 8:49:40 PM

On 5/9/11 7:52 PM, Mike Connor wrote:
> If your patch breaks the tree, you should get
> backed out, period.

Yes, agreed.

> If people aren't burning cycles to watch the tree
> for hours after a checkin, this makes the cost of backouts like this
> dramatically smaller

I disagree. It still takes time to push patches, it takes time to mark
bugs, to reopen bugs, to push the patches again, etc. The obvious
failure mode here is that nothing would ever get into m-c because people
would keep breaking mozilla-random.

Not to mention that extra push traffic that does nothing useful
increases infrastructure load and makes things worse for everyone else.

Now I do think that anyone pushing to mozilla-random should feel free to
back out any previous changesets that are making the tree orange. That
way the tree will be more or less self-policing and should have a good
shot at being green.

> so there shouldn't be as strong of an aversion to
> that as there is now.

I think there should be aversion to backing out _everything_ after a
given point. I think there should be no aversion whatsoever to targeted
backouts.

Again, I'm coming from the perspective of someone who actually expects
to use mozilla-random for a nontrivial portion of my work. I realize
that you don't plan to ever use it, but that doesn't mean that'll be
everyone's workflow.

-Boris

Philipp von Weitershausen

May 9, 2011, 8:53:55 PM

to Boris Zbarsky, dev-pl...@lists.mozilla.org

On Mon, May 9, 2011 at 7:44 PM, Boris Zbarsky <bzba...@mit.edu> wrote:
> On 5/9/11 7:40 PM, Philipp von Weitershausen wrote:
>>
>> Not sure about that. Certainly regression ranges might end up leading
>> back to mozilla-random.
>
> Will, not might.

Well, in the sense that everything will eventually end up in m-r, yes.
Not everything will originate there.

>> I, for one, hope that mozilla-random will
>> eventually be just for *random* patches, and that most teams will
>> consider project branches
>
> I think your viewpoint here is colored by your interactions with the tree.
> We have a number of developers doing work across notional "teams" in
> intersecting areas of code; I suspect having separate "project branches"
> for DOM, layout, and graphics work wouldn't work that well.
> But I'm willing to be proved wrong!

I too am willing to be proved wrong. There's only one way to find out! :)

>> (they can be temporary ones, too, like
>> alder, birch, cedar, etc.)
>
> That would _definitely_ not solve the regression-finding problem. To solve
> that problem we need nightly or hourly builds coming at regular intervals
> (the interval can vary based on the volume of patches) on all branches that
> feed into m-c.

alder, birch, cedar etc. get tinderbox builds, just like m-c. Why
isn't that enough? If not, we should be making those behave *exactly*
like m-c does now.

The goal is to sort out as much drama as we can before stuff gets to
m-c, so I'm all for making project branches (temporary or not) be
clones of m-c releng-wise.

>> Really, I encourage every team to think about getting a project
>> branch. I for one don't intend to keep a checkout of mozilla-random.
>
> That's a perfectly valid approach for some people. Not for others.

I accept that. That's why I'm not only not opposed to m-r but have
also volunteered to assist the merging along with rnewman and ehsan.

Boris Zbarsky

May 9, 2011, 8:58:07 PM

On 5/9/11 8:53 PM, Philipp von Weitershausen wrote:
> On Mon, May 9, 2011 at 7:44 PM, Boris Zbarsky <bzba...@mit.edu> wrote:
>> On 5/9/11 7:40 PM, Philipp von Weitershausen wrote:
>>>
>>> Not sure about that. Certainly regression ranges might end up leading
>>> back to mozilla-random.
>>
>> Will, not might.
>
> Well, in the sense that everything will eventually end up in m-r, yes.
> Not everything will originate there.

I meant in the sense that there will 100% guaranteed be regressions
originating in mozilla-random. Not all of them, but a number of them.
So some of them might lead there, but at least some will definitely do so.

>> That would _definitely_ not solve the regression-finding problem. To solve
>> that problem we need nightly or hourly builds coming at regular intervals
>> (the interval can vary based on the volume of patches) on all branches that
>> feed into m-c.
>
> alder, birch, cedar etc. get tinderbox builds, just like m-c. Why
> isn't that enough?

Because we only keep tinderbox builds for a few weeks. We need to keep
builds used for regression-finding for much longer than that ("in
perpetuity" in my ideal world).

> If not, we should be making those behave *exactly*
> like m-c does now.

Yes, exactly.

> The goal is to sort out as much drama as we can before stuff gets to
> m-c, so I'm all for making project branches (temporary or not) be
> clones of m-c releng-wise.

Yes.

-Boris

Philipp von Weitershausen

May 9, 2011, 9:05:47 PM

to Boris Zbarsky, dev-pl...@lists.mozilla.org

On Mon, May 9, 2011 at 7:49 PM, Boris Zbarsky <bzba...@mit.edu> wrote:
>> If people aren't burning cycles to watch the tree
>> for hours after a checkin, this makes the cost of backouts like this
>> dramatically smaller
>
> I disagree. It still takes time to push patches, it takes time to mark
> bugs, to reopen bugs, to push the patches again, etc.

Bugs wouldn't have to be reopened because stuff wasn't merged to m-c
yet, it was only landed on m-r. But they would have to be marked, for
sure. The sheriff would do that, though, not the developer. One of the
goals is, as mconnor mentions, freeing the developers of the context
switching that landing on m-c incurs right now.

Potentially having to reland is a small price to pay compared to
that, I think. Because just like the initial landing, it's push + walk
away.

> The obvious failure
> mode here is that nothing would ever get into m-c because people would keep
> breaking mozilla-random.

To be clear, I understood mconnor's proposal to mean that m-r will
have the same rules as m-c does now. (It's definitely not "push to see
if it compiles" like try server is.)

> Not to mention that extra push traffic that does nothing useful increases
> infrastructure load and makes things worse for everyone else.
>
> Now I do think that anyone pushing to mozilla-random should feel free to
> back out any previous changesets that are making the tree orange. That way
> the tree will be more or less self-policing and should have a good shot at
> being green.
>
>> so there shouldn't be as strong of an aversion to
>> that as there is now.
>
> I think there should be aversion to backing out _everything_ after a given
> point. I think there should be no aversion whatsoever to targeted backouts.

Agreed. The "back everything out until the last merge" should be the
last resort to restore sanity, not something that happens on every
bustage.

> Again, I'm coming from the perspective of someone who actually expects to
> use mozilla-random for a nontrivial portion of my work. I realize that you
> don't plan to ever use it, but that doesn't mean that'll be everyone's
> workflow.

Point taken. I think the best way to see whether mozilla-random will
work exactly as proposed is to try it out.

In any case, the "back out everything" option, if used as a last
resort measure, shouldn't be a case that's common enough to be worthy
of bikeshedding. If this is the only point we might want to refine in
the proposed process, we're in pretty good shape.

Nicholas Nethercote

May 9, 2011, 9:24:19 PM

to Chris Pearce, dev-pl...@lists.mozilla.org

On Tue, May 10, 2011 at 9:35 AM, Chris Pearce <ch...@pearce.org.nz> wrote:
>> *tree watching will not be required* on
>> mozilla-random.
>
> I am not opposed to the idea of project branches and periodic merges to
> mozilla-central, but I do still think we need to require developers to watch
> mozilla-random after pushing there.

I agree. Everything described pretty much matches what already
happens on tracemonkey, and JS developers still watch the tree when
they land patches; it would be horribly rude to do otherwise. The
benefit is that you are inconveniencing fewer people if you break a
project repo.

Nick

Boris Zbarsky

May 9, 2011, 9:49:00 PM5/9/11

On 5/9/11 9:05 PM, Philipp von Weitershausen wrote:
> To be clear, I understood mconnor's proposal to mean that m-r will
> have the same rules as m-c does now. (It's definitely not "push to see
> if it compiles" like try server is.)

Sure. I would still expect breakage pretty often (after all; that's the
issue with m-c).

>> I think there should be aversion to backing out _everything_ after a given
>> point. I think there should be no aversion whatsoever to targeted backouts.
>
> Agreed. The "back everything out until the last merge" should be the
> last resort to restore sanity, not something that happens on every
> bustage.

Sounds like we agree here. Good. ;)

> In any case, the "back out everything" option, if used as a last
> resort measure, shouldn't be a case that's common enough to be worthy
> of bikeshedding

Yep.

-Boris

Boris Zbarsky

May 9, 2011, 9:50:57 PM5/9/11

On 5/9/11 9:24 PM, Nicholas Nethercote wrote:
> I agree. Everything described pretty much matches what already
> happens on tracemonkey, and JS developers still watch the tree when
> they land patches

But there is no dedicated sheriff on TM. That's a very important part
of the mozilla-random proposal that makes it possible to consider the
no-watching policy: if you land and go orange, the sheriff backs you out
and life goes on.

For project repos without sheriff coverage, projects should set whatever
rules they want, of course.

-Boris

Justin Lebar

May 9, 2011, 10:04:41 PM5/9/11

to Chris Pearce, dev-pl...@lists.mozilla.org

>> If a push to mozilla-random breaks the
>> build or is perma-orange, and more bad changesets land on top of that, it
>> could take a long time for the tree to stabilize before the sheriff could
>> merge mozilla-random with mozilla-central.
>
> Yup. We're willing to accept that. It's the developer's choice to
> choose mozilla-random rather than a project branch. The way I see it,
> this will encourage teams to get project branches and sort out drama
> there.

Let's be clear -- *you're* willing to accept that. What I think Chris and Boris are saying is that they are less willing -- that's how I feel, in any case.

I think the issue of watching versus not watching the tree is orthogonal to the question of where we land. Let's focus on the big question -- should we have mozilla-random at all? -- before we worry about exactly how we should police it.

Boris Zbarsky

May 9, 2011, 10:06:57 PM5/9/11

On 5/9/11 10:04 PM, Justin Lebar wrote:
> I think the issue of watching versus not watching the tree is orthogonal to the question of where we land. Let's focus on the big question -- should we have mozilla-random at all? -- before we worry about exactly how we should police it.

To be clear, I think having mozilla-random would be a good idea. We've
been doing something similar with Cedar already, and it's been working
pretty darned well.

-Boris

Nicholas Nethercote

May 9, 2011, 10:17:42 PM5/9/11

to Boris Zbarsky, dev-pl...@lists.mozilla.org

On Mon, May 9, 2011 at 6:50 PM, Boris Zbarsky <bzba...@mit.edu> wrote:
> On 5/9/11 9:24 PM, Nicholas Nethercote wrote:
>>
>> I agree. Everything described pretty much matches what already
>> happens on tracemonkey, and JS developers still watch the tree when
>> they land patches
>
> But there is no dedicated sheriff on TM. That's a very important part of
> the mozilla-random proposal that makes it possible to consider the
> no-watching policy: if you land and go orange, the sheriff backs you out
> and life goes on.

Oh. So mozilla-random is special? If the goal is to get people to
use project-specific repos when possible, won't the fact that
mozilla-random is special make it harder to achieve that goal? "I
would use a project-specific repo but then I'd have to watch my
landings so instead I use mozilla-random."

Watching the tree after landing a patch doesn't seem like a big deal
to me. Maybe I'm missing something.

Nick

Boris Zbarsky

May 9, 2011, 10:21:41 PM5/9/11

On 5/9/11 10:17 PM, Nicholas Nethercote wrote:
> Oh. So mozilla-random is special?

Yes, because we need sheriffs anyway to push checkin-needed patches and
the like....

> If the goal is to get people to
> use project-specific repos when possible, won't the fact that
> mozilla-random is special make it harder to achieve that goal? "I
> would use a project-specific repo but then I'd have to watch my
> landings so instead I use mozilla-random."

I agree that there's a bit of a perverse incentive here, but I think we
need to trust people to do the right thing on stuff like this.

> Watching the tree after landing a patch doesn't seem like a big deal
> to me. Maybe I'm missing something.

It's pretty distracting; you have to look at it every 10-15 minutes to
make sure things are still OK, so it's hard to focus on anything else in
the meantime. Or you're checking it more rarely than that, and then the
watching is not that useful....

-Boris

Joe Drew

May 9, 2011, 10:29:37 PM5/9/11

to dev-planning@lists.mozilla.org planning

One concern I have that just occurred to me is that this will increase
the bug-marking burden on sheriffs/merge vikings. Unless and until we
have some way of automatically changing bug status by checkin, bugs will
have to be marked "fixed-in-m-r" and then marked fixed by the merge
viking. Similarly, backouts (which will be more common, I guess?) will
require a lot of metadata updating.

It's not a huge concern, but it's extra work that doesn't exist now.

Joe

Boris Zbarsky

May 9, 2011, 10:56:48 PM5/9/11

It already exists for everything coming through cedar. It's not really
that much work, compared to the tree-watching involved, from my
experience. Ehsan or Mounir, if you disagree please speak up!

-Boris

Shawn Wilsher

May 9, 2011, 11:05:52 PM5/9/11

to dev-pl...@lists.mozilla.org

On 5/9/2011 7:29 PM, Joe Drew wrote:
> One concern I have that just occurred to me is that this will increase
> the bug-marking burden on sheriffs/merge vikings. Unless and until we
> have some way of automatically changing bug status by checkin, bugs will
> have to be marked "fixed-in-m-r" and then marked fixed by the merge
> viking. Similarly, backouts (which will be more common, I guess?) will
> require a lot of metadata updating.

This should be very easy to do with Pulse. Heck, I'll even volunteer to
do it if it would make people feel better about this proposal.

Cheers,

Shawn

Shawn Wilsher

May 9, 2011, 11:12:32 PM5/9/11

to dev-pl...@lists.mozilla.org

On 5/9/2011 4:57 PM, Mark Finkle wrote:
> Won't mozilla-random inherit the same drama that mozilla-central does
> today? I suppose even if it does, there will be a benefit for those
> groups currently using a project branch. They get to merge to mozilla-
> central without needing to worry about mozilla-random.

This is exactly the carrot we want to encourage the use of project
branches. While it's true that you'll have to watch the tree yourself
on a project branch, it's going to be far more stable than mozilla-random.

Cheers,

Shawn

Shawn Wilsher

May 9, 2011, 11:18:39 PM5/9/11

to dev-pl...@lists.mozilla.org

On 5/9/2011 5:44 PM, Boris Zbarsky wrote:
> I think your viewpoint here is colored by your interactions with the
> tree. We have a number of developers doing work across notional "teams"
> in intersecting areas of code; I suspect having separate "project
> branches" for DOM, layout, and graphics work wouldn't work
> that well. But I'm willing to be proved wrong!

The fact that you put teams in quotes is a good point, although I don't
think you were trying to make it. Teams are whatever we want to define
them as. For instance, Marco and I don't land Places-only changes to the
Places branch. We tend to land whatever we are working on at the time
(often Places stuff, but frequently Storage and random browser stuff
too) and then merge that back to mozilla-central.

I don't think it would be bad for a group of people who tend to work
together to share a branch that has a general focus but gets other stuff
to land on it. The buckets (branches) are self-defined, and we should
not let a name prevent us from using one sensibly.

Cheers,

Shawn

Nicholas Nethercote

May 9, 2011, 11:26:13 PM5/9/11

to Boris Zbarsky, dev-pl...@lists.mozilla.org

On Mon, May 9, 2011 at 7:21 PM, Boris Zbarsky <bzba...@mit.edu> wrote:
>
>> Watching the tree after landing a patch doesn't seem like a big deal
>> to me. Maybe I'm missing something.
>
> It's pretty distracting; you have to look at it every 10-15 minutes to make
> sure things are still OK, so it's hard to focus on anything else in the
> meantime. Or you're checking it more rarely than that, and then the
> watching is not that useful....

Or someone pings you on IRC to say you broke the tree :)

Nick

Boris Zbarsky

May 9, 2011, 11:27:14 PM5/9/11

On 5/9/11 11:18 PM, Shawn Wilsher wrote:
> I don't think it would be bad for a group of people who tend to work
> together to share a branch

My point was that by that metric all of
layout/content/dom/xpconnect/editor would share a branch, because
there's so much cross linking there of the "x tends to work together
with y" relationship.

-Boris

L. David Baron

May 9, 2011, 11:43:00 PM5/9/11

to Boris Zbarsky, dev-pl...@lists.mozilla.org

You forgot gfx+widget. There's a lot of layout/gfx interaction, and
I think also a good bit of gfx/widget.

-David

--
L. David Baron http://dbaron.org/
Mozilla Corporation http://www.mozilla.com/

Steve Fink

May 10, 2011, 12:25:53 AM5/10/11

to dev-pl...@lists.mozilla.org

On 05/09/2011 07:21 PM, Boris Zbarsky wrote:
>
> It's pretty distracting; you have to look at it every 10-15 minutes to
> make sure things are still OK, so it's hard to focus on anything else
> in the meantime. Or you're checking it more rarely than that, and
> then the watching is not that useful....

In a previous life, I set tinderbox up to IM me with build results. We
had an order of magnitude fewer configurations to worry about, but I'd
be happy if firebot sent me just the failures.

Sorry, that wasn't relevant to this thread. I'll file a bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=655935

Bas Schouten

May 10, 2011, 12:55:34 AM5/10/11

to Boris Zbarsky, dev-pl...@lists.mozilla.org

I'm not so sure this would work either. With all the cross dependencies, I think you'd end up continuously pulling the stuff from different branches into each other. I think it only works well if there's a well-defined component boundary between teams, which often doesn't exist within our tree.

Furthermore, I think the project branch approach would cause complications if lots of branches are active simultaneously and they start diverging more strongly. That is, if such an approach were taken, it would be important that projects with more inter-dependencies be kept close together in code. Otherwise it would lead to hard-to-track-down bugs in later stages, where a change on branch A and a change on branch B were each innocent on their own branch but caused a subtle issue when both were merged back into m-c.

I also share the concern some other people have expressed that there is a higher overhead of tracking where issues are fixed (fixed on m-r, does that mean resolved fixed or not?). If it does mean resolved fixed, then backing out certainly doesn't have zero cost, I think. That raises the question of how concerned we need to be about orange showing up on m-r, and how hard it is to back out a large stack of patches. If we need to start watching m-r closely, we're simply moving a problem and creating more work.

I personally do not find watching the tree very hard. The random oranges can of course be a bit painful, but they have become a lot easier to manage with the TBPL bug number suggestions. But if others do find it hard, I can understand why that's seen as a problem.

Bas

Shawn Wilsher

May 10, 2011, 1:20:40 AM5/10/11

to dev-pl...@lists.mozilla.org

(I managed to reply to only bz earlier, so trying this again...)

On 5/9/2011 8:27 PM, Boris Zbarsky wrote:
> My point was that by that metric all of
> layout/content/dom/xpconnect/editor would share a branch, because
> there's so much cross linking there of the "x tends to work together
> with y" relationship.

OK, so that group gets N branches to work with, and between them figure
out which work needs to be on the same branch because it is dependent,
and the rest can go to whichever other branch based on load.

The idea here is that these decisions and landing chaos get pushed out
more to the edges.

You could even go so far as to have one project branch for this area,
and then sub project branches that get merged into this one. Not sure
if there's value in that, however.

Cheers,

Shawn

sayrer

May 10, 2011, 1:47:29 AM5/10/11

to dev-planning@lists.mozilla.org planning

> These oranges don’t just block a landing, they block periodic merges from
> mozilla-central to project branches.

Brief test failures on mozilla-central do not block merges from mozilla-central to project branches. If you're doing such a merge, you probably don't want to pull the tip of mozilla-central, because it will likely still be under test. That means you'll probably be pulling a revision that's a tiny bit older, in which case you might as well go back to the last known good one.
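As a rough illustration of "go back to the last known good one", here is a minimal Python sketch. The push-log shape (a list of revision/status pairs, oldest first) and the status names are assumptions made for the example, not the format of any real tinderbox or TBPL interface.

```python
def last_known_good(pushes):
    """Return the newest revision whose test run finished green, or None.

    `pushes` is ordered oldest to newest. Each item is a hypothetical
    (revision, status) pair, where status is "green", "orange", "red",
    or "pending" (still under test).
    """
    for revision, status in reversed(pushes):
        if status == "green":
            return revision
    return None


# Example: the tip is still under test and the push before it went orange,
# so the merge source is the older green revision.
pushes = [("a1", "green"), ("b2", "green"), ("c3", "orange"), ("d4", "pending")]
merge_source = last_known_good(pushes)
```

This only covers the m-c to project-branch direction; it says nothing about whether m-c is in a state to accept the merge back.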

So far, we've managed to scale mozilla-central by letting groups decide on their own to use a separate tree. Since we're seeing more and more of that, I don't see why we should hand down a policy.

I also don't think merges from branches make us "SOL" when regression hunting. If you hit a merge in a regression hunt, you can bisect among the merged commits. It requires a tiny mental switch, but that's it.
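To make "bisect among the merged commits" concrete, here is a small Python sketch of the idea. It assumes the merged changesets can be treated as a linear list, oldest first, that the last one is known bad, and that every changeset after the first bad one is also bad; `is_bad` is a hypothetical stand-in for building and testing a changeset. Mercurial's own `hg bisect` command deals with real, non-linear history; this only shows the search.

```python
def first_bad(changesets, is_bad):
    """Binary-search for the first bad changeset in a linear list.

    Assumes changesets[-1] is bad and that badness is monotonic
    (once a changeset is bad, all later ones are bad too).
    """
    lo, hi = 0, len(changesets) - 1  # invariant: changesets[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(changesets[mid]):
            hi = mid          # first bad changeset is at mid or earlier
        else:
            lo = mid + 1      # first bad changeset is after mid
    return changesets[lo]


# Example: eight merged changesets, regression introduced by the sixth.
merged = ["tm1", "tm2", "tm3", "tm4", "tm5", "tm6", "tm7", "tm8"]
tested = []

def is_bad(changeset):
    tested.append(changeset)  # record how many builds were needed
    return merged.index(changeset) >= 5

culprit = first_bad(merged, is_bad)
```

With eight changesets the search needs three builds rather than eight, which is why the size of the merge matters less than the cost of each build.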

- Rob

Philip Chee

May 10, 2011, 2:02:03 AM5/10/11

On Mon, 09 May 2011 20:44:33 -0400, Boris Zbarsky wrote:
> On 5/9/11 7:40 PM, Philipp von Weitershausen wrote:

>> Really, I encourage every team to think about getting a project
>> branch. I for one don't intend to keep a checkout of mozilla-random.
>
> That's a perfectly valid approach for some people. Not for others.

And there are a lot of occasional contributors from the greater Mozilla
ecology who are not part of any team. I suppose there could be a "Team
Rocket" as a catch all for those of us out there in the left field.

Phil

--
Philip Chee <phi...@aleytys.pc.my>, <phili...@gmail.com>
http://flashblock.mozdev.org/ http://xsidebar.mozdev.org
Guard us from the she-wolf and the wolf, and guard us from the thief,
oh Night, and so be good for us to pass.

Axel Hecht

May 10, 2011, 2:31:16 AM5/10/11

How would performance monitoring work in this scheme?

Axel

On 10.05.11 00:12, Mike Connor wrote:
> tl;dr - I want to propose a change to the way we land code on central. I think this will ease integration and take a lot of the tree watching burden off of individual developers. I think that individual changes should land on a cedar-like branch which is regularly merged with central, just like project branches are today.
>
> Background
>
> For quite some time now, more or less since we started running tests on tinderbox, we've struggled with oranges creating tree closures. Everyone has felt this pain at one time or another, pulling busted changesets, or being unable to land. Random oranges are less of a problem now than before, we still have a very real problem with real test failures on landings. Our current build cycle time (up to four hours per cycle) means that a broken changeset blocks the tree for at least four hours and often, especially if we attempt to fix instead of back out, much longer.
>
> These oranges don’t just block a landing, they block periodic merges from mozilla-central to project branches. To retain sanity, project branch merges should pull m-c (when green!) to the branch, ensure that cycles green, then merge back to a green m-c. The Services team intends to merge services-central on a weekly train model (which includes QA signoffs!), but we’ve been blocked at one end or the other nearly every time we’ve wanted to merge recently. This situation makes frequent merges less appealing, which leads to increasingly-large merges in both directions, compounding the problem.
>
> The opportunity cost of our primary development tree being broken at some point _every_ day is hard to measure, but significant. The problem is actually twofold: one broken commit can block _everyone_ for 4-8 hours, and because of that, we require that every committer watch the tree for hours on every checkin to ensure that they quickly respond to a broken commit. This is a huge amount of focus/context switching. We _must_ do better.
>
> Caveats
>
> No single change or technological solution will solve this problem completely, but perfect cannot be the enemy of good here. We still need to fight and win the War on Orange, but it does not actually solve this problem, since much of the recent orange has been real regressions caught by our test suites. We still should look at ways to automate merging and stop relying on/requiring human intervention, but not only does that require winning the War on Orange, it also requires a lot of code that doesn’t exist yet. In the meantime, we can make something better.
>
> Proposal
>
> First, we should create a special project branch, modelled after how Ehsan was running cedar.<bikeshed>Let’s call this mozilla-random</bikeshed> for lack of a better name. If you do not have a project branch, or your patch doesn’t fit on a current project branch, this is where you will push patches. Breaking this tree still sucks, but since it doesn’t actually block _everyone_ if you do this, *tree watching will not be required* on mozilla-random.
>
> Once a day (or more) the sheriff will verify that mozilla-random is green and there are no perf regressions. If backouts need to happen (up to and including reverting all the way to the prior merge), the sheriff will do that, and then merge what’s left to mozilla-central. (As we know the sheriff list needs some work, we have a small group of volunteers including Ehsan, philikon, and rnewman who will take point on this in the short term.)
>
> Once we have this workflow up and running, mozilla-central will be an integration branch for merging from other branches. Direct landings will require explicit sheriff approval, and should be reserved for extreme cases (and no, “it’s Aurora merge day” doesn’t count as extreme). This means that everything that hits m-c will have already gone through at least one, and typically much more than one, build run on our infrastructure without causing problems, so we expect this tree to stay _very_ green.
>
> This proposal means less work for developers currently landing on mozilla-central (no tree watching!) and less work for maintainers of project branches who are trying to merge (tree is far less often closed/unstable). It also means the impact of any landing that causes bustage is contained to a much smaller group of committers, allowing for resolution on a less urgent schedule. The cost is that someone has to merge mozilla-random to mozilla-central each day, but it is still dramatically less work and overhead than the current system.
>

Philipp von Weitershausen

May 10, 2011, 3:33:24 AM5/10/11

to mozilla.de...@googlegroups.com, dev-planning@lists.mozilla.org planning

On Tue, May 10, 2011 at 12:47 AM, sayrer <say...@gmail.com> wrote:
>> These oranges don’t just block a landing, they block periodic merges from
>> mozilla-central to project branches.
>
> Brief test failures on mozilla-central do not block merges from mozilla-central to project branches.

Are you saying it's ok to land on orange then? Because every single
Tuesday -- which is when s-c is *scheduled* to merge to m-c -- the
tree has been orange for the past few weeks.

> If you're doing such a merge, you probably don't want to pull the tip of mozilla-central, because it will likely still be under test. That means you'll probably be pulling a revision that's a tiny bit older, in which case you might as well go back to the last known good one.

That's what we do when merging m-c to s-c. It doesn't solve our
problem when merging s-c back.

> I also don't think merges from branches make us "SOL" when regression hunting. If you hit a merge in a regression hunt, you can bisect among the merged commits. It requires a tiny mental switch, but that's it.

Agreed.

Philipp von Weitershausen

May 10, 2011, 3:39:52 AM5/10/11

to Bas Schouten, Boris Zbarsky, dev-pl...@lists.mozilla.org

On Mon, May 9, 2011 at 11:55 PM, Bas Schouten <bsch...@mozilla.com> wrote:
> I'm not so sure this would work either. With all the cross dependencies I think you'd end up continuously pulling the stuff from different branches into each other. I think it only works well if there's a well-defined component boundary, which often doesn't exist within our tree, between teams.

That's why mconnor proposed mozilla-random, instead of just proposing
that everybody worked off a project branch.

> Furthermore, I think the project branch approach would cause complications if there's lots of them going on simultaneously and they start diverging more strongly.

I don't think that will happen. At least not if people maintain the
project branches well. It's important that project branches are kept
up to date frequently. Like I said in an earlier reply to this thread,
some of the problems we've had with Tracemonkey merges seem to stem
from this. As far as mconnor's proposal is
concerned, m-r would merge in updates from m-c every day. So the
divergence would be very short lived.

> I also share the concern that some other people have expressed that there is higher overhead of tracking where issues are fixed (fixed on m-r, does that mean resolved fixed or not?) If it does mean resolved fixed, then there's certainly not a zero cost of backing out I think.

I'm pretty sure RESOLVED FIXED will retain its current meaning: landed on m-c.

> Which raises the question of how concerned we need to be about orange showing up on m-r, and how hard it is to backout a large stack of patches. If we need to start watching m-r closely, we're simply moving a problem, and creating more work.
>
> I personally do not find watching the tree very hard,

It's important to understand that not having to watch the m-r tree is
one of the (optional) benefits of mconnor's proposal. The main goal is
to prevent any kind of drama whatsoever on m-c.

Paul Biggar

May 10, 2011, 6:57:41 AM5/10/11

to Joe Drew, dev-planning@lists.mozilla.org planning

On Mon, May 9, 2011 at 19:29, Joe Drew <j...@mozilla.com> wrote:
> One concern I have that just occurred to me is that this will increase the
> bug-marking burden on sheriffs/merge vikings. Unless and until we have some
> way of automatically changing bug status by checkin, bugs will have to be
> marked "fixed-in-m-r" and then marked fixed by the merge viking. Similarly,
> backouts (which will be more common, I guess?) will require a lot of
> metadata updating.
>

> It's not a huge concern, but it's extra work that doesn't exist now.

cdleary, the TM merge viking, has a script that magics that. Here's a
sample bug comment:

cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/334ada87e329
http://hg.mozilla.org/mozilla-central/rev/41b74f0b32fe (backout)
Note: not marking as fixed because last changeset is a backout.
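For readers who want to see what such a script has to decide, here is a minimal Python sketch that reproduces the sample comment above. It is not cdleary's actual script: the function name, the input shape (a list of revision/backout pairs, oldest first), and the rule "mark fixed only if the last changeset is not a backout" are assumptions inferred from the sample text. Detecting which changesets are backouts, and talking to Bugzilla, are left out.

```python
REV_URL = "http://hg.mozilla.org/mozilla-central/rev/{}"


def merge_comment(changesets):
    """Build a bug comment for merged changesets and decide on FIXED.

    `changesets` is a hypothetical list of (revision, is_backout) pairs
    for a single bug, oldest first. Returns (comment_text, mark_fixed).
    """
    lines = ["cdleary-bot mozilla-central merge info:"]
    for revision, is_backout in changesets:
        suffix = " (backout)" if is_backout else ""
        lines.append(REV_URL.format(revision) + suffix)

    # Only mark the bug fixed if the most recent changeset is a real landing.
    mark_fixed = bool(changesets) and not changesets[-1][1]
    if changesets and not mark_fixed:
        lines.append(
            "Note: not marking as fixed because last changeset is a backout."
        )
    return "\n".join(lines), mark_fixed


comment, fixed = merge_comment(
    [("334ada87e329", False), ("41b74f0b32fe", True)]
)
```

The same shape would work for a "fixed-in-m-r" marker: the merge step is the only place that needs to know which changesets arrived and whether the last word on each bug was a landing or a backout.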

--
Paul Biggar
Compiler Geek
pbi...@mozilla.com

Robert Kaiser

May 10, 2011, 9:35:57 AM5/10/11

Mike Connor wrote:

> tl;dr - I want to propose a change to the way we land code on central. I think this will ease integration and take a lot of the tree watching burden off of individual developers. I think that individual changes should land on a cedar-like branch which is regularly merged with central, just like project branches are today.

From all I hear of people investigating regression windows for when a
crash started, it's much more painful to pinpoint regressions that
happened on a merged project branch than ones that were checked into m-c
directly. Based on that, I'm not 100% sold on doing everything on
project-branch-like trees and only having merges on m-c.

Robert Kaiser

--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community needs answers to. And most of the time,
I even appreciate irony and fun! :)

Robert Kaiser

May 10, 2011, 9:39:19 AM5/10/11

Philipp von Weitershausen wrote:
> Well, in the sense that everything will eventually end up in m-r, yes.

Note that "m-r" is already taken by "mozilla-release" ;-)

Mike Connor

May 10, 2011, 9:50:18 AM5/10/11

to Robert Kaiser, dev-pl...@lists.mozilla.org

On 2011-05-10, at 9:35 AM, Robert Kaiser wrote:

> Mike Connor wrote:
>> tl;dr - I want to propose a change to the way we land code on central. I think this will ease integration and take a lot of the tree watching burden off of individual developers. I think that individual changes should land on a cedar-like branch which is regularly merged with central, just like project branches are today.
>
> From all I hear of people investigating regression windows for when a crash started, it's much more painful to pinpoint regressions that happened on a merged project branch than ones that were checked into m-c directly. Based on that, I'm not 100% sold on doing everything on project-branch-like trees and only having merges on m-c.

That's a tooling problem, AIUI. As Boris pointed out elsewhere on this thread, we should fix the tools in that case, since we already have that pain point for the rest of the project branches which will continue to exist with or without this proposal.

-- Mike

L. David Baron

May 10, 2011, 9:55:58 AM5/10/11

to Mike Connor, dev-pl...@lists.mozilla.org, Robert Kaiser

On Tuesday 2011-05-10 09:50 -0400, Mike Connor wrote:
> On 2011-05-10, at 9:35 AM, Robert Kaiser wrote:

> > Mike Connor wrote:
> >> tl;dr - I want to propose a change to the way we land code on central. I think this will ease integration and take a lot of the tree watching burden off of individual developers. I think that individual changes should land on a cedar-like branch which is regularly merged with central, just like project branches are today.
> >
> > From all I hear of people investigating regression windows for when a crash started, it's much more painful to pinpoint regressions that happened on a merged project branch than ones that were checked into m-c directly. Based on that, I'm not 100% sold on doing everything on project-branch-like trees and only having merges on m-c.
>

> That's a tooling problem, AIUI. As Boris pointed out elsewhere on this thread, we should fix the tools in that case, since we already have that pain point for the rest of the project branches which will continue to exist with or without this proposal.

It's not just a tools problem for crashes that we detect only
through crash-stats and don't have steps to reproduce for. If we
see a new crash in layout code starting on a particular day, the
obvious place to start is looking through the layout changes
on that day. That becomes a lot less useful when all the layout
changes for two weeks land on one day. (This is the "oh, there's a
new topcrash in JS that started with *this* tracemonkey merge"
effect.)

The solution here might be distributing our nightly users among the
project branches, which we've discussed in the past, but I haven't
heard it discussed lately.

Mike Connor

May 10, 2011, 10:04:15 AM5/10/11

to Axel Hecht, dev-pl...@lists.mozilla.org

On 2011-05-10, at 2:31 AM, Axel Hecht wrote:

> How would performance monitoring work in this scheme?

The same way it does with every other project branch. Most/all branches run full Talos, and we have tools that let you compare results. One of the obvious conditions of merging to m-c would be "no regressions" on top of "no oranges".
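The "no regressions" on top of "no oranges" condition can be sketched as a simple merge gate. Everything concrete here is an assumption for illustration: the input dictionaries, the suite and test names, and especially the 2% threshold, which the thread does not specify.

```python
def merge_blockers(test_status, perf_change_pct, max_regression_pct=2.0):
    """List the reasons a branch should not be merged to m-c yet.

    test_status:      hypothetical map of suite name -> "green"/"orange"/"red"
    perf_change_pct:  hypothetical map of Talos test name -> percent change
                      against the m-c baseline (positive means slower)
    max_regression_pct: illustrative threshold, not a real project policy
    """
    blockers = []
    for suite, status in sorted(test_status.items()):
        if status != "green":
            blockers.append(f"{suite} is {status}")
    for test, change in sorted(perf_change_pct.items()):
        if change > max_regression_pct:
            blockers.append(f"{test} regressed {change:.1f}%")
    return blockers


blockers = merge_blockers(
    {"mochitest": "green", "reftest": "orange"},
    {"ts": 0.5, "tp4": 3.2},
)
ready_to_merge = not blockers
```

Returning the list of reasons, rather than a bare yes/no, gives whoever is doing the merge something to act on: each entry points at a suite to investigate or a changeset to back out.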

-- Mike

Mike Connor

May 10, 2011, 10:13:50 AM5/10/11

to Boris Zbarsky, dev-pl...@lists.mozilla.org

On 2011-05-09, at 8:45 PM, Boris Zbarsky wrote:

> On 5/9/11 7:45 PM, Ehsan Akhgari wrote:
>> When the sheriff wants to merge from mozilla-random, they take a look at
>> it. If something has broken builds/tests, that changeset and
>> _everything_ landed on top of it will get backed out, and then a merge
>> will happen.
>
> I think sheriffs should be more proactive about watching mozilla-random and backing out changes from there. Backing out everything has a very real cost that we should avoid if we can.

Most of this has already been addressed, but to recap:

* More proactive backouts are good.
* Backing out everything is a last resort, and an unlikely case.
* If we have to back out a whole day of commits it's because people are pushing on top of orange instead of seeing orange and backing out. Pushing on orange/red is still a crappy move.
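The difference between the two backout policies in this recap can be shown with a short Python sketch. It assumes the bad pushes have already been identified, which is exactly the hard part once people have pushed on top of orange; the input shape (revision/is_bad pairs, oldest first) and the function names are hypothetical.

```python
def targeted_backout(pushes):
    """Preferred policy: back out only the pushes identified as bad."""
    return [revision for revision, is_bad in pushes if is_bad]


def backout_from_first_bad(pushes):
    """Last resort: back out the first bad push and everything on top."""
    for index, (_, is_bad) in enumerate(pushes):
        if is_bad:
            return [revision for revision, _ in pushes[index:]]
    return []


# Example: one bad push followed by two good ones landed on top of it.
pushes = [("a", False), ("b", True), ("c", False), ("d", False)]
targeted = targeted_backout(pushes)
everything = backout_from_first_bad(pushes)
```

In this example the targeted policy costs one changeset, while the last-resort policy also throws away two good pushes whose authors then have to reland. That is the "very real cost" Boris refers to.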

-- Mike

Ehsan Akhgari

May 10, 2011, 10:19:12 AM5/10/11

to Boris Zbarsky, dev-pl...@lists.mozilla.org

I agree. We can always get better tools, for sure, but as somebody who
has done this before, I think this is not costly enough to be worth
considering here. :-)

Cheers,
Ehsan

Ehsan Akhgari

May 10, 2011, 10:25:42 AM5/10/11

to mozilla.de...@googlegroups.com, Chris Pearce, dev-pl...@lists.mozilla.org, Justin Lebar

On 11-05-09 10:04 PM, Justin Lebar wrote:
>>> If a push to mozilla-random breaks the
>>> build or is perma-orange, and more bad changesets land on top of that, it
>>> could take a long time for the tree to stabilize before the sheriff could
>>> merge mozilla-random with mozilla-central.
>>
>> Yup. We're willing to accept that. It's the developer's choice to
>> choose mozilla-random rather than a project branch. The way I see it,
>> this will encourage teams to get project branches and sort out drama
>> there.
>
> Let's be clear -- *you're* willing to accept that. What I think Chris and Boris are saying is that they are less willing -- that's how I feel, in any case.
>
> I think the issue of watching versus not watching the tree is orthogonal to the question of where we land. Let's focus on the big question -- should we have mozilla-random at all? -- before we worry about exactly how we should police it.

Yes, we should.

mozilla-central is the integration point for the Mozilla codebase, and
all of the applications built on top of it (Firefox, Firefox Mobile,
Thunderbird, SeaMonkey).

Breaking mozilla-central means potentially stopping *everybody* working
on those projects (maybe the first two more than the others). This is
what the proposal is trying to avoid.

mozilla-random, however, is not an integration point. Sure, we should
not push on top of orange or red there, but if we do, and m-r is left in
a completely broken state, the damage is limited to the people who have
landed on top of the first broken revision. The people who are updating
their local repos are not affected. Project branches trying to merge
into m-c are not affected. And hours of developer time do not get
wasted trying to fix the tree state.

This is, I believe, the real point behind mconnor's proposal.

Cheers,
Ehsan

Ehsan Akhgari

May 10, 2011, 10:30:50 AM

to Marco Bonardo, dev-pl...@lists.mozilla.org

On 11-05-09 8:02 PM, Marco Bonardo wrote:
> On 10/05/2011 01:57, Mark Finkle wrote:
>> I think we'd need to be sure that project branches, including mozilla-
>> random, pull from mozilla-central _before_ merging back in and waiting
>> a cycle or two to make sure the merge will truly be green.
>
> At that point something else could land in central that still breaks
> your project merge, like a new merge from mozilla-random. But sure,
> requiring a recent-enough merge sounds feasible.

Please note that under this proposal, mozilla-central is only a place
for other branches to merge in, which makes landings there infrequent
enough that this won't happen too often in practice, I hope.

Cheers,
Ehsan

Justin Lebar

May 10, 2011, 10:35:53 AM

to dev-planning@lists.mozilla.org planning

> I also don't think merges from branches make us "SOL" when regression hunting.
> If you hit a merge in a regression hunt, you can bisect among the merged
> commits. It requires a tiny mental switch, but that's it.

It may be an edge case, but I have a counterexample: I was bisecting a crash which appeared only in Windows PGO builds, and it regressed in a large TM merge. Bisecting by pushing to try would have taken a week. But thankfully TM had nightlies I could bisect, and I quickly figured out what was going wrong.

Certainly if we have nightly builds from the project branches we're in as good shape as we currently are.

Ehsan Akhgari

May 10, 2011, 10:39:31 AM

to L. David Baron, dev-pl...@lists.mozilla.org, Robert Kaiser, Mike Connor

On 11-05-10 9:55 AM, L. David Baron wrote:
> On Tuesday 2011-05-10 09:50 -0400, Mike Connor wrote:
>> On 2011-05-10, at 9:35 AM, Robert Kaiser wrote:

>>> Mike Connor wrote:
>>>> tl;dr - I want to propose a change to the way we land code on central. I think this will ease integration and take a lot of the tree watching burden off of individual developers. I think that individual changes should land on a cedar-like branch which is regularly merged with central, just like project branches are today.
>>>
>>> From all I hear of people investigating regression windows for when a crash started, it's much more painful to pinpoint regressions that happened on a merged project branch than ones that were checked into m-c directly. Based on that, I'm not 100% sold on doing everything on project-branch-like trees and only having merges on m-c.
>>

>> That's a tooling problem, AIUI. As Boris pointed out elsewhere on this thread, we should fix the tools in that case, since we already have that pain point for the rest of the project branches which will continue to exist with or without this proposal.
>
> It's not just a tools problem for crashes that we detect only
> through crash-stats, and don't have steps to reproduce for. If we
> see a new crash in layout code starting on a particular day, the
> obvious thing to start with is by looking through the layout changes
> on that day. That becomes a lot less useful when all the layout
> changes for two weeks land on one day. (This is the "oh, there's a
> new topcrash in JS that started with *this* tracemonkey merge"
> effect.)

In this proposal, we're talking about daily merges, so this wouldn't be
much of a concern. If some people on the layout team move to another
project branch, then I agree that this is something that they need to
focus on.

Cheers,
Ehsan

Axel Hecht

May 10, 2011, 10:54:29 AM

The regression bot doesn't run on other project branches, I assume? I
wonder how much folks rely on those.

Axel

(Also, PS: I'm not a fan of "don't watch the tree", don't think we're
ready for it)

Shawn Wilsher

May 10, 2011, 11:06:33 AM

to dev-pl...@lists.mozilla.org

On 5/9/2011 11:02 PM, Philip Chee wrote:
> And there are a lot of occasional contributors from the greater Mozilla
> ecology who are not part of any team. I suppose there could be a "Team
> Rocket" as a catch all for those of us out there in the left field.

Or they use checkin-needed and someone comes around and lands on
mozilla-random for them. Regardless, it's a solvable problem that
doesn't block this proposal.

Cheers,

Shawn

Shawn Wilsher

May 10, 2011, 11:09:30 AM

to dev-pl...@lists.mozilla.org

On 5/10/2011 7:54 AM, Axel Hecht wrote:
> The regression bot doesn't run on other project branches, I assume? I
> wonder how much folks rely on those.

It can run on branches. Regardless, if a project branch merges to m-c
and then it causes a regression, we just back it out. No regressions
unless it's announced ahead of time it's expected, we want it, and it
has driver sign off.

Cheers,

Shawn

Boris Zbarsky

May 10, 2011, 11:40:22 AM

On 5/10/11 1:47 AM, sayrer wrote:
> I also don't think merges from branches make us "SOL" when regression hunting. If you hit a merge in a regression hunt, you can bisect among the merged commits. It requires a tiny mental switch, but that's it.

Bisecting on commits requires building. Bisecting on nightlies just
requires an internet connection, so our volunteer QA can do it.

-Boris

Boris Zbarsky

May 10, 2011, 11:44:02 AM

To be clear, the talos bot would need to report to _someone_ (and
ideally to someplace public) for every project branch. Right now this
is very much not the case, as I understand.

-Boris

Boris Zbarsky

May 10, 2011, 11:54:03 AM

Oh, and also important: bisecting on nightlies means worst-case
downloading the nightly, which for me, say, takes 20-30 seconds.
Bisecting on commits means building, which takes a good bit longer.

-Boris
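
A minimal sketch of what bisecting on nightlies amounts to: a binary search over build dates, where each probe is a download rather than a build. The function name, the `is_broken` callback (standing in for "download that day's nightly and check for the bug") and the dates are hypothetical, not part of any Mozilla tool, and the sketch assumes one nightly exists for every day in the range.

```python
from datetime import date, timedelta
from typing import Callable, Tuple


def bisect_nightlies(good: date, bad: date,
                     is_broken: Callable[[date], bool]) -> Tuple[date, date]:
    """Narrow a regression down to two consecutive nightlies.

    Assumes the nightly from `good` works, the nightly from `bad` is broken,
    and the bug stays present once it has been introduced.
    """
    if good >= bad:
        raise ValueError("good must be earlier than bad")
    while (bad - good).days > 1:
        # Probe the nightly halfway between the known-good and known-bad dates.
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_broken(mid):
            bad = mid
        else:
            good = mid
    return good, bad


if __name__ == "__main__":
    first_broken = date(2011, 4, 20)  # made-up date, for illustration only
    window = bisect_nightlies(date(2011, 3, 1), date(2011, 5, 10),
                              lambda d: d >= first_broken)
    print(window)
```

The result is a one-day window. Everything that landed between those two nightlies is still a suspect, which is why a large merge inside the window makes the remaining search so much more expensive.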

Paul Biggar

May 10, 2011, 12:12:58 PM

to Boris Zbarsky, dev-pl...@lists.mozilla.org

On Tue, May 10, 2011 at 08:54, Boris Zbarsky <bzba...@mit.edu> wrote:
> Oh, and also important: bisecting on nightlies means worst-case downloading
> the nightly, which for me, say, takes 20-30 seconds. Bisecting on commits
> means building, which takes a good bit longer.

OK, so I didn't even know we could do this, so forgive the question:
why do we use nightlies for this? It would seem using tinderbox-builds
(per push) would be more accurate.

Either way, I'd guess we need to solve this whether or not
mozilla-random gets created, if other teams are all moving to project
branches.

Kyle Huey

May 10, 2011, 12:16:38 PM

to Paul Biggar, Boris Zbarsky, dev-pl...@lists.mozilla.org

We don't use Tinderbox builds (in general) because we don't store them for
very long. If the tinderbox builds for that range happen to be available,
that of course lets you narrow it down even further.

- Kyle

Boris Zbarsky

May 10, 2011, 12:34:45 PM

On 5/10/11 12:12 PM, Paul Biggar wrote:
> On Tue, May 10, 2011 at 08:54, Boris Zbarsky<bzba...@mit.edu> wrote:
>> Oh, and also important: bisecting on nightlies means worst-case downloading
>> the nightly, which for me, say, takes 20-30 seconds. Bisecting on commits
>> means building, which takes a good bit longer.
>
> OK, so I didn't even know we could do this, so forgive the question:
> why do we use nightlies for this? It would seem using tinderbox-builds
> (per push) would be more accurate.

Because we keep tinderbox-builds for something like 3 weeks (up from 1
week!) but we keep nightlies for years.

By the time a bug is reported, chances are the tinderbox builds for the
day it was introduced have long since been deleted.

-Boris

Paul Biggar

May 10, 2011, 12:55:06 PM

to Boris Zbarsky, dev-pl...@lists.mozilla.org

On Tue, May 10, 2011 at 09:34, Boris Zbarsky <bzba...@mit.edu> wrote:
> On 5/10/11 12:12 PM, Paul Biggar wrote:
>>
>> On Tue, May 10, 2011 at 08:54, Boris Zbarsky<bzba...@mit.edu> wrote:
>>>
>>> Oh, and also important: bisecting on nightlies means worst-case
>>> downloading
>>> the nightly, which for me, say, takes 20-30 seconds. Bisecting on commits
>>> means building, which takes a good bit longer.
>>
>> OK, so I didn't even know we could do this, so forgive the question:
>> why do we use nightlies for this? It would seem using tinderbox-builds
>> (per push) would be more accurate.
>
> Because we keep tinderbox-builds for something like 3 weeks (up from 1
> week!) but we keep nightlies for years.

Why not keep tinderbox builds for years? It sounds like we can solve
the whole problem* with a bit of disk space.

* the problem was that merging tons of commits into M-C from a branch
screws our regression bisecting tools.

Boris Zbarsky

May 10, 2011, 1:02:14 PM

On 5/10/11 12:55 PM, Paul Biggar wrote:
> Why not keep tinderbox builds for years? It sounds like we can solve
> the whole problem* with a bit of disk space.

Apparently the problem is the lack of disk space, yes. See previous
threads on deleting old nightlies and the like....

-Boris

Matt Brubeck

May 10, 2011, 1:06:43 PM

On Monday, May 9, 2011 4:27:27 PM UTC-7, Marco Bonardo wrote:
> Finally, but this is most likely the minor problem, it means everybody
> will have to maintain locally another repository to be able to push to
> it (and eventually keep an updated build of it, if something could have
> bitrotted in the meanwhile).

I use a single local clone to push and pull both mozilla-central and mozilla-aurora. This isn't as simple as it should be (it's one of the main reasons I wish we used git), but it works reasonably well if there's one repository that you push and pull most often, and others that you interact with less frequently. Keeping multiple up-to-date builds is somewhat annoying; you can use multiple mozconfigs with different objdirs if you want to keep two builds up to date using the same sourcedir.

Benjamin Stover

May 10, 2011, 1:09:37 PM

to dev-pl...@lists.mozilla.org

So in this brave new world of mozilla-random, we would stop allowing people
to push to m-c unless it is a merge? Would that be some sort of trigger or
just a sheriff yelling at you?

Would it be possible to push to m-c if you've had a successful try run? This
is similar to Ehsan's long-term strategy, just not automated. :) I
personally find it a lot easier to drop off a try run URL in my bug, and
come back later to push if it was green.

Ben

(apologies to Paul who got this message directly first :))

On Tue, May 10, 2011 at 9:55 AM, Paul Biggar <pbi...@mozilla.com> wrote:

> On Tue, May 10, 2011 at 09:34, Boris Zbarsky <bzba...@mit.edu> wrote:

> > On 5/10/11 12:12 PM, Paul Biggar wrote:
> >>
> >> On Tue, May 10, 2011 at 08:54, Boris Zbarsky<bzba...@mit.edu> wrote:
> >>>
> >>> Oh, and also important: bisecting on nightlies means worst-case
> >>> downloading
> >>> the nightly, which for me, say, takes 20-30 seconds. Bisecting on
> commits
> >>> means building, which takes a good bit longer.
> >>
> >> OK, so I didn't even know we could do this, so forgive the question:
> >> why do we use nightlies for this? It would seem using tinderbox-builds
> >> (per push) would be more accurate.
> >
> > Because we keep tinderbox-builds for something like 3 weeks (up from 1
> > week!) but we keep nightlies for years.
>

> Why not keep tinderbox builds for years? It sounds like we can solve
> the whole problem* with a bit of disk space.
>

> * the problem was that merging tons of commits into M-C from a branch
> screws our regression bisecting tools.

Boris Zbarsky

May 10, 2011, 1:14:15 PM

On 5/10/11 1:09 PM, Benjamin Stover wrote:
> So in this brave new world of mozilla-random, we would stop allowing people
> to push to m-c unless it is a merge?

Or something with explicit approval for some reason, yes.

> Would that be some sort of trigger or just a sheriff yelling at you?

I'd assume the latter for a start.

> Would it be possible to push to m-c if you've had a successful try run?

No. You would push to mozilla-random in cases when you currently push
to m-c.

-Boris

Zack Weinberg

May 10, 2011, 3:51:03 PM

On 2011-05-09 7:21 PM, Boris Zbarsky wrote:
> On 5/9/11 10:17 PM, Nicholas Nethercote wrote:
>> Watching the tree after landing a patch doesn't seem like a big deal
>> to me. Maybe I'm missing something.
>
> It's pretty distracting; you have to look at it every 10-15 minutes to
> make sure things are still OK, so it's hard to focus on anything else in
> the meantime. Or you're checking it more rarely than that, and then the
> watching is not that useful....

This is where I bring up again the notion of pushing a *new head* which
gets auto-tested, then auto-merged if and only if it's green. There is
no "backing out" under normal conditions, only getting a notification
that you need to go look at some orange (which you can fix either by
tacking additional patches on your microbranch, or starting over).

In the context of -random, I think this wouldn't have the problems that
people brought up when I proposed it for -central a while back, and it
has the very definite advantage that NOBODY has to watch the tree. I'm
very firmly of the opinion that tree-watching is a job for a computer,
not a human.

On the other tentacle, last time I pushed something, which was sometime
last week, we were still getting 1-3 false oranges per test cycle. The
expected number of oranges (in the mathematical sense) needs to be zero
for any aggressive-backout scheme to be viable, I think -- with or
without test-then-merge.

zw
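
A rough sketch of the test-then-merge idea described above, under heavy simplification: changesets are plain strings, a "head" is a list of them, and `run_tests` stands in for a full build and test cycle. All names here are hypothetical and not an existing Mozilla or Mercurial API.

```python
from typing import Callable, List, Tuple


def land_if_green(mainline: List[str],
                  pushed_heads: List[List[str]],
                  run_tests: Callable[[List[str]], bool]
                  ) -> Tuple[List[str], List[List[str]]]:
    """Test each pushed head on top of the current mainline; merge only green ones.

    Returns the new mainline and the heads that were rejected, whose authors
    would be notified instead of being backed out.
    """
    rejected: List[List[str]] = []
    for head in pushed_heads:
        candidate = mainline + head
        if run_tests(candidate):
            mainline = candidate   # green: merged automatically
        else:
            rejected.append(head)  # orange: mainline is left untouched
    return mainline, rejected


if __name__ == "__main__":
    def is_green(changesets: List[str]) -> bool:
        return "bad" not in changesets

    print(land_if_green(["a"], [["b"], ["bad", "c"], ["d"]], is_green))
```

Nothing is ever backed out of the mainline in this model, because a broken head never reaches it. The catch is the one raised in the last paragraph: if `run_tests` reports false oranges, good heads are rejected too.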

Mike Connor

May 10, 2011, 4:30:28 PM

to Zack Weinberg, dev-pl...@lists.mozilla.org

a) automatic solutions > stuff done by humans. That's a long play, but we (read: Ehsan) are starting down that path.
b) We can/should expand out the notification-on-failure stuff that Try has to be available for all branches. We still break -central often enough through human error in pushes that mitigation of those failures is worthwhile, but there's no especially good reason to make people watch the tree manually.

-- Mike

Boris Zbarsky

May 10, 2011, 4:40:53 PM

On 5/10/11 4:30 PM, Mike Connor wrote:
> b) We can/should expand out the notification-on-failure stuff that Try has to be available for all branches.

Can we make it usable in the process? Specifically:

1) Send out failure notifications as we do now.
2) Don't send out notifications for success.
3) Send out a notification once all the jobs for the push are done.

-Boris
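
The three rules above could be sketched roughly as follows. This is an illustration only: the function name, the event format and the message wording are assumptions, not how the Try notification code actually works.

```python
from typing import Iterable, List, Tuple


def notifications_for_push(events: Iterable[Tuple[str, str]],
                           total_jobs: int) -> List[str]:
    """Return the messages to send for one push.

    `events` yields (job_name, status) pairs in completion order, with
    status "success" or "failure"; `total_jobs` is the number of jobs
    scheduled for the push.
    """
    sent: List[str] = []
    finished = 0
    failed = 0
    for job, status in events:
        finished += 1
        if status != "success":
            failed += 1
            sent.append(f"{job} failed")  # rule 1: report each failure
        # rule 2: a successful job sends nothing
        if finished == total_jobs:
            # rule 3: one summary message once every job is done
            sent.append(f"push complete: {failed} of {total_jobs} jobs failed")
    return sent


if __name__ == "__main__":
    demo = [("linux-build", "success"),
            ("win-mochitest", "failure"),
            ("mac-reftest", "success")]
    for message in notifications_for_push(demo, total_jobs=3):
        print(message)
```

With these rules a fully green push produces exactly one message, the final summary.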

Mike Connor

May 10, 2011, 4:48:13 PM

to Boris Zbarsky, dev-pl...@lists.mozilla.org

That seems eminently reasonable. I'll file that bug now.

-- Mike

Ehsan Akhgari

May 10, 2011, 5:58:07 PM

to Mike Connor, Boris Zbarsky, dev-pl...@lists.mozilla.org

On 11-05-10 4:48 PM, Mike Connor wrote:
>
> On 2011-05-10, at 4:40 PM, Boris Zbarsky wrote:
>

> That seems eminently reasonable. I'll file that bug now.

I'd appreciate it if you replied with the bug number.

Thanks!
Ehsan

Philipp von Weitershausen

May 11, 2011, 12:38:13 AM

to b...@stechz.com, dev-pl...@lists.mozilla.org

On Tue, May 10, 2011 at 12:09 PM, Benjamin Stover <b...@stechz.com> wrote:
> So in this brave new world of mozilla-random, we would stop allowing people
> to push to m-c unless it is a merge? Would that be some sort of trigger or
> just a sheriff yelling at you?

Yes. A merge or explicit approval (shouldn't happen, ideally)

> Would it be possible to push to m-c if you've had a successful try run? This
> is similar to Ehsan's long-term strategy, just not automated. :) I
> personally find it a lot easier to drop off a try run URL in my bug, and
> come back later to push if it was green.

You can still push to try while you wait for review and then come back
to it and land it later. You'll just have to land it in m-r, not m-c.
As mconnor said,

"This means that everything that hits m-c will have already gone
through *at least one, and typically much more than one,* build run
on our infrastructure without causing problems."

(my emphasis)

sayrer

May 11, 2011, 1:23:48 AM

to dev-planning@lists.mozilla.org planning

On Tuesday, May 10, 2011 12:33:24 AM UTC-7, Philipp von Weitershausen wrote:
> On Tue, May 10, 2011 at 12:47 AM, sayrer <say...@gmail.com> wrote:
> >> These oranges don’t just block a landing, they block periodic merges from
> >> mozilla-central to project branches.
> >
> > Brief test failures on mozilla-central do not block merges from mozilla-central to project branches.
>
> Are you saying it's ok to land on orange then? Because every single
> Tuesday -- which is when s-c is *scheduled* to merge to m-c -- the
> tree has been orange for the past few weeks.

mconnor's message said "they block periodic merges from mozilla-central to project branches." That statement concerns merges in one direction. Since it was part of the rationale for this epic bikeshed thread, it seemed worth pointing out that it was wrong.

Unpacking more:
>
> Are you saying it's ok to land on orange then?

Yes, sort of. The tree almost always has one or two tests failing out of hundreds of thousands, something all other projects of our size contend with, and yet people manage to land code. We also track test failures, which can be problems with the tests themselves or in the product. I trust that anyone with access to mozilla-central can quickly suss out whether they're looking at a flaky test or a widespread problem related to a recent checkin.

> Tuesday -- which is when s-c is *scheduled* to merge to m-c

That sounds like a busy time. How about Friday at 6pm Pacific?

- Rob

Philipp von Weitershausen

May 11, 2011, 1:46:44 AM

to mozilla.de...@googlegroups.com, dev-pl...@lists.mozilla.org

On Wed, May 11, 2011 at 12:23 AM, sayrer <say...@gmail.com> wrote:
> On Tuesday, May 10, 2011 12:33:24 AM UTC-7, Philipp von Weitershausen wrote:
>> On Tue, May 10, 2011 at 12:47 AM, sayrer <say...@gmail.com> wrote:
>> >> These oranges don’t just block a landing, they block periodic merges from
>> >> mozilla-central to project branches.
>> >
>> > Brief test failures on mozilla-central do not block merges from mozilla-central to project branches.
>>
>> Are you saying it's ok to land on orange then? Because every single
>> Tuesday -- which is when s-c is *scheduled* to merge to m-c -- the
>> tree has been orange for the past few weeks.
>
> mconnor's message said "they block periodic merges from mozilla-central to project branches." That statement concerns merges in one direction. Since it was part of the rationale for this epic bikeshed thread, it seemed worth pointing out that it was wrong.

Ignore me, I misread your statement to mean the exact opposite. Sorry
for the noise.

(FWIW, I agree with afrosdwilsh re: bikeshedding
https://twitter.com/#!/afrosdwilsh/status/67828862930780160)

>> Tuesday -- which is when s-c is *scheduled* to merge to m-c
>
> That sounds like a busy time. How about Friday at 6pm Pacific?

That's certainly a less busy time... for a reason ;)

The current schedule is QA driven since we have them sign off every
merge. We could renegotiate the schedule, of course, or just hold off
on merging for 4 days. Neither feels like a good solution. I'd
personally rather get to a place where m-c is perma green.

Shawn Wilsher

May 11, 2011, 1:47:17 AM

to dev-pl...@lists.mozilla.org

On 5/10/2011 10:23 PM, sayrer wrote:
> mconnor's message said "they block periodic merges from mozilla-central to project branches." That statement concerns merges in one direction. Since it was part of the rationale for this epic bikeshed thread, it seemed worth pointing out that it was wrong.

Except that we've had a string of oranges that lasted for several hours
while folks tried to figure out which changeset(s) was the problem.
It's not random orange that they were referring to, which is pretty easy
to identify.

> Yes, sort of. The tree almost always has one or two tests failing out of hundreds of thousands, something all other projects of our size contend with, and yet people manage to land code. We also track test failures, which can be problems with the tests themselves or in the product. I trust that anyone with access to mozilla-central can quickly suss out whether they're looking at a flaky test or a widespread problem related to a recent checkin.

I'm sorry, but I've seen it happen on a number of occasions where
someone thinks it is either a new random orange or misstars it.

I'll agree that your statements should be true, but they aren't, which
is why we need it.

Cheers,

Shawn

sayrer

May 11, 2011, 3:09:48 AM

to dev-pl...@lists.mozilla.org

On Tuesday, May 10, 2011 10:47:17 PM UTC-7, Shawn Wilsher wrote:
> On 5/10/2011 10:23 PM, sayrer wrote:
> > mconnor's message said "they block periodic merges from mozilla-central to project branches." That statement concerns merges in one direction. Since it was part of the rationale for this epic bikeshed thread, it seemed worth pointing out that it was wrong.
> Except that we've had a string of oranges that lasted for several hours
> while folks tried to figure out which changeset(s) was the problem.
> It's not random orange that they were referring to, which is pretty easy
> to identify.
>

You seem to be refuting an assertion I did not make.

> > Yes, sort of. The tree almost always has one or two tests failing out of hundreds of thousands, something all other projects of our size contend with, and yet people manage to land code. We also track test failures, which can be problems with the tests themselves or in the product. I trust that anyone with access to mozilla-central can quickly suss out whether they're looking at a flaky test or a widespread problem related to a recent checkin.
> I'm sorry, but I've seen it happen on a number of occasions where
> someone thinks it is either a new random orange or misstars it.
>

Yes, people make mistakes. They will probably continue to do so under any system. I don't think your point contradicts the bulk of what I wrote, but you have pointed out an obvious exception to a generalization I made in the last sentence.

> I'll agree that your statements should be true, but they aren't, which
> is why we need it.

Well, what is "it"? I think we've had really good luck with groups of coders making their own project repositories, which spreads out integration pain. That's great. Which group of coders need their own repository now? I bet some groups do, but surely we can do better than "here's a bunch of process for other people, so my existing process is easier". Wouldn't it be better to convince people landing too much stuff on mozilla-central that they'd be happier on their own repository? (they would be)

- Rob

Henri Sivonen

May 11, 2011, 8:25:30 AM

to dev-pl...@lists.mozilla.org

On Mon, 2011-05-09 at 20:45 -0400, Boris Zbarsky wrote:
> On 5/9/11 7:45 PM, Ehsan Akhgari wrote:
> > When the sheriff wants to merge from mozilla-random, they take a look at
> > it. If something has broken builds/tests, that changeset and
> > _everything_ landed on top of it will get backed out, and then a merge
> > will happen.
>
> I think sheriffs should be more proactive about watching mozilla-random
> and backing out changes from there. Backing out everything has a very
> real cost that we should avoid if we can.

Since we currently don't have sheriffs during European working hours,
won't any sheriff-reliant process fail for folks in Europe (unless we
add Europe-based sheriffs who sheriff at European hours)?

I'm a bit skeptical about mozilla-random requiring people to watch the
tree less. I ended up watching cedar when I landed there. Also, I think
I've spent more time starring the previous oranges when I've sought to
land on cedar or aurora than when landing to m-c.

Another timezone worry I have is that deadlines are closer than they
appear with more merge steps. I'm worried that it's harder for a
North America-based reviewer to remember what the effective deadline for
a review is so that a Europe-based developer can address review comments
and push in time for a given release train when both the cross-timezone
feedback cycle *and* the longer merge pipeline have to be factored in.

--
Henri Sivonen
hsiv...@iki.fi
http://hsivonen.iki.fi/

beltzner

May 11, 2011, 11:25:27 AM

to Henri Sivonen, dev-pl...@lists.mozilla.org

Why don't we have European sheriffs? Do we not have enough qualified
contributors in those timezones?

Also, I thought #developers was the de facto on duty sheriff when no
declared sheriff existed. Is that system not sufficient?

cheers,
mike

Kyle Huey

May 11, 2011, 12:04:08 PM

to mbel...@gmail.com, Henri Sivonen, dev-pl...@lists.mozilla.org

The sheriff schedule needs work in general. There are people on there who
never show up to sheriff ...

Also I think your reply-to header is wrong.

- Kyle

Marco Bonardo

May 11, 2011, 2:18:00 PM

On 11/05/2011 14:25, Henri Sivonen wrote:
> Since we currently don't have sheriffs during European working hours,
> won't any sheriff-reliant process fail for folks in Europe (unless we
> add Europe-based sheriffs who sheriff at European hours)?

I'm available to sheriff in the EU timezone; I often do so unofficially,
and I guess many others could be available to do the same.

Marco

Mike Connor

May 11, 2011, 2:50:38 PM

to Henri Sivonen, dev-pl...@lists.mozilla.org

On 2011-05-11, at 8:25 AM, Henri Sivonen wrote:

> On Mon, 2011-05-09 at 20:45 -0400, Boris Zbarsky wrote:
>> On 5/9/11 7:45 PM, Ehsan Akhgari wrote:
>>> When the sheriff wants to merge from mozilla-random, they take a look at
>>> it. If something has broken builds/tests, that changeset and
>>> _everything_ landed on top of it will get backed out, and then a merge
>>> will happen.
>>
>> I think sheriffs should be more proactive about watching mozilla-random
>> and backing out changes from there. Backing out everything has a very
>> real cost that we should avoid if we can.
>

> Since we currently don't have sheriffs during European working hours,
> won't any sheriff-reliant process fail for folks in Europe (unless we
> add Europe-based sheriffs who sheriff at European hours)?

The process is only reliant on there being a sheriff at some point during the day, but I do think we should take more advantage of timezones. I expect more details on that plan to surface soonish.

> I'm a bit skeptical about mozilla-random requiring people to watch the
> tree less. I ended up watching cedar when I landed there. Also, I think
> I've spent more time starring the previous oranges when I've sought to
> land on cedar or aurora than when landing to m-c.
>
> Another timezone worry I have is that deadlines are closer than they
> appear with more merge steps. I'm worried that it's harder for a
> North America-based reviewer to remember what the effective deadline for
> a review is so that a Europe-based developer can address review comments
> and push in time for a given release train when both the cross-timezone
> feedback cycle *and* the longer merge pipeline have to be factored in.

Yes, this shifts the effective deadline earlier, but I don't think that's especially harmful over the long term.

-- Mike

Dao

May 11, 2011, 3:53:35 PM

On 11.05.2011 17:25, beltzner wrote:
> Why don't we have European sheriffs? Do we not have enough qualified
> contributors in those timezones?

We probably have /fewer/ qualified contributors. They'd have to sheriff
more often.

We also have less traffic during European work hours.

> Also, I thought #developers was the de facto on duty sheriff when no
> declared sheriff existed. Is that system not sufficient?

Seems sufficient as things stand.

Robert Kaiser

May 11, 2011, 6:12:00 PM

Shawn Wilsher wrote:
> On 5/10/2011 7:54 AM, Axel Hecht wrote:
>> The regression bot doesn't run on other project branches, I assume? I
>> wonder how much folks rely on those.
> It can run on branches. Regardless, if a project branch merges to m-c
> and then it causes a regression, we just back it out. No regressions
> unless it's announced ahead of time it's expected, we want it, and it
> has driver sign off.

The "regressions" I mostly care about are crashes that we track with our
reports, which we usually generate at the earliest a day after the
nightly ships. It then takes some more hours for people to look at
the data, file bugs, and find a regression range and the checkin/merge
that is responsible - usually way too late for a backout. The problem
gets even larger with a huge merge, especially in the recently quite
crash-heavy JS area (i.e. tracemonkey merges).
At times, it even takes a few days of data until we know a crash is
something to look at and can start narrowing down when it started (or
started rising), in which case things become even harder.
From what I hear, at least tracemonkey nightlies make it a bit easier
to narrow things down once you have a reproducible test case
- but in many cases we don't, and crash data from Nightly users is all
we have. (That's the best case; in some cases the crashes only start
rising once we're on larger user bases, and then we can mostly forget
about finding regression ranges.)

Of course, that problem is probably not easy to solve in any case if
we want to scale well to landing even more stuff, but the fact is that
we only start seeing useful crash stats once the builds with the patches
are in the hands of enough testers - and on any kind of project branch
that's usually not the case.

Robert Kaiser

--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community need answers to. And most of the time,
I even appreciate irony and fun! :)

Justin Dolske

May 11, 2011, 7:28:15 PM

to

On 5/10/11 5:55 AM, david bolter wrote:
> Is anyone strongly opposed to creating mozilla-random and just trying this
> out?

Sounds like a fantastic idea to me. It can remain optional until we have
some experience doing it, and if it flops we know we need to do
something different. :)

Bonus if it's "so successful people _want_ to use it".

Justin

Shawn Wilsher

May 11, 2011, 9:29:22 PM

to dev-pl...@lists.mozilla.org

On 5/11/2011 5:25 AM, Henri Sivonen wrote:
> Another timezone worry I have is that deadlines are closer than they
> appear with more merge steps. I'm worried that it gets harder for a
> North America-based reviewer to remember what the effective deadline for
> a review is so that a Europe-based developer can address review comments
> and push in time for a given release train when both cross-timezone
> feedback cycle *and* the longer merge pipeline have to be factored in.

I'm not sure why this is a big deal. If it misses this release train,
it gets the next one in six weeks. Rushing to get something in puts it
at a higher risk of being turned off in aurora anyway because people
miss things when they are in a hurry.

Cheers,

Shawn

Shawn Wilsher

May 11, 2011, 9:34:19 PM

to dev-pl...@lists.mozilla.org

On 5/11/2011 12:09 AM, sayrer wrote:
> Well, what is "it"? I think we've had really good luck with groups of coders making their own project repositories, which spreads out integration pain. That's great. Which group of coders need their own repository now? I bet some groups do, but surely we can do better than "here's a bunch of process for other people, so my existing process is easier". Wouldn't it be better to convince people landing too much stuff on mozilla-central that they'd be happier on their own repository? (they would be)

I think the mobile team could probably stand to have their own project
branch. I also wouldn't argue with anyone who might suggest bz get his
own, but apart from that, I'm not keeping a list. I do agree that folks
would be happier in their own repository too, but I'm not sure what else
we could say or do to convince them of that.

Cheers,

Shawn

Blair McBride

May 11, 2011, 10:22:19 PM

to dev-pl...@lists.mozilla.org

On 12/05/2011 4:04 a.m., Kyle Huey wrote:
> The sheriff schedule needs work in general.

Or we could just get a few full-time sheriffs in various timezones,
instead of taking away developer time (which is what prompted this
thread anyway).

- Blair

sayrer

May 12, 2011, 12:20:26 AM

to

On Wednesday, May 11, 2011 4:28:15 PM UTC-7, Justin Dolske wrote:
> On 5/10/11 5:55 AM, david bolter wrote:
> > Is anyone strongly opposed to creating mozilla-random and just trying this
> > out?
>
> Sounds like a fantastic idea to me. It can remain optional until we have
> some experience doing it

I can't imagine anyone being opposed to trying something optional.

I am very opposed to making this proposal the required way of working, but totally into trying it out and making it available.

- Rob

Asa Dotzler

May 12, 2011, 12:33:11 AM

to

This sounds like the right approach to me. Any pressure that can be
pulled out of m-c is a win, right?

- A

Mark Banner

May 12, 2011, 5:12:49 AM

to

I'd really like this NOT to be via email. We should be using pulse, or
requiring everyone to use an updated tinderstatus, or something else
that the pusher can configure.

I typically work on a different account and can be there for hours not
watching email, so this just wouldn't work and I'd just end up deleting
lots of emails (just like I do with the try server ones, because looking
at the tree is far easier).

Standard8

Mike Connor

May 12, 2011, 9:47:14 AM

to Mark Banner, dev-pl...@lists.mozilla.org

I'm _sure_ you can create a mail filter to dump those mails and never see them. However, the core rationale is to not require people to be online through a whole cycle, which is where any "just make people watch X" falls down.

No one solution will be perfect for everyone, but switching to push-driven notification over human-based polling feels like a pretty substantial step. I'm sure we could bolt this onto IRC/IM/SMS as well, but email is an attribute we already have for everyone.

-- Mike

Johnathan Nightingale

May 12, 2011, 10:45:47 AM

to mozilla.dev.planning group

On 2011-05-12, at 12:33 AM, Asa Dotzler wrote:

> On 5/11/2011 9:20 PM, sayrer wrote:
>> On Wednesday, May 11, 2011 4:28:15 PM UTC-7, Justin Dolske wrote:
>>> On 5/10/11 5:55 AM, david bolter wrote:
>>>> Is anyone strongly opposed to creating mozilla-random and just trying this
>>>> out?
>>>
>>> Sounds like a fantastic idea to me. It can remain optional until we have
>>> some experience doing it
>>
>> I can't imagine anyone being opposed to trying something optional.
>>
>> I am very opposed to making this proposal the required way of working, but totally into trying it out and making it available.
>

> This sounds like the right approach to me. Any pressure that can be pulled out of m-c is a win, right?

I sense violent agreement on this point. Who's got the ball to make it go? Connor?

J

---
Johnathan Nightingale
Director of Firefox Engineering
joh...@mozilla.com