Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Proper Try use, and living with low test hardware capacity
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  Messages 76 - 100 of 105 - Collapse all  -  Translate all to Translated (View all originals) < Older  Newer >
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Ehsan Akhgari  
View profile  
 More options Aug 31 2012, 10:50 am
Newsgroups: mozilla.dev.planning
From: Ehsan Akhgari <ehsan.akhg...@gmail.com>
Date: Fri, 31 Aug 2012 10:50:06 -0400
Local: Fri, Aug 31 2012 10:50 am
Subject: Re: Proper Try use, and living with low test hardware capacity
On 12-08-30 7:11 PM, L. David Baron wrote:

> On Thursday 2012-08-30 18:45 -0400, Zack Weinberg wrote:
>> My process is: only ask for review *after* the patch is green on
>> try. Until then, for all I know I'm going to need major
>> architectural changes just to make the testsuite happy, and there's
>> no point wasting reviewers' time, which is a *far* scarcer resource
>> in this organization than CPU hours.

> But there's also the opposite problem, which is that in some cases
> reviewers might require changes that will invalidate (or make
> unnecessary) the work that's been done to get the patch green.

> I don't think there's a single correct solution here.  Testing and
> peer review are both tools we use to improve the quality of our
> code; they don't necessarily belong in a particular order.

Usually when I'm working on a bug, my goal is to get the patch landed as
soon as possible and move on to other work.  I have a hard time when I
have a lot of pending patches (more than 5 really) since they incur a
cognitive load for me as I always have to keep thinking about them.  On
average, the single biggest thing which gets in the way of me landing
the patch is waiting for the review.  In many cases, all of the other
steps in fixing a bug (understanding the bug, thinking of a solution,
coding it, testing it, pushing to the try server and waiting for
results) takes less time than it takes for the reviewer to start looking
at my patch.  Because of this reason, I optimize for attaching the patch
to the bug and asking for review *as soon as possible*.

Ehsan


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ehsan Akhgari  
View profile  
 More options Aug 31 2012, 11:05 am
Newsgroups: mozilla.dev.planning
From: Ehsan Akhgari <ehsan.akhg...@gmail.com>
Date: Fri, 31 Aug 2012 11:04:53 -0400
Local: Fri, Aug 31 2012 11:04 am
Subject: Re: Proper Try use, and living with low test hardware capacity
On 12-08-30 8:48 PM, Steve Fink wrote:

To give you a concrete example of why this kind of stuff does not work,
when I was working on bug 157681 (which was a layout optimization), I
came across a single browser-chrome test failure happening only on Mac,
which seemed pretty unrelated to my changes at first, but it turned out
that it actually uncovered a subtle bug in my patch which none of the
other layout tests that we have managed to catch.

This kind of stuff is rare, true, but it happens frequently enough that
it really matters.  I don't think we can seriously consider bucketing
tests based on which files have changed in a patch without losing this
important aspect of catching bugs in patches -- except perhaps for
extremely localized components of the code.

Ehsan


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ehsan Akhgari  
View profile  
 More options Aug 31 2012, 11:14 am
Newsgroups: mozilla.dev.planning
From: Ehsan Akhgari <ehsan.akhg...@gmail.com>
Date: Fri, 31 Aug 2012 11:14:01 -0400
Local: Fri, Aug 31 2012 11:14 am
Subject: Re: Proper Try use, and living with low test hardware capacity
On 12-08-31 8:16 AM, Ben Hearsum wrote:

> On 08/31/12 12:39 AM, Robert O'Callahan wrote:
>> If we could run Linux functional tests on AWS, then maybe we could keep the
>> Linux build/functional-test backlog at zero and encourage people to try the
>> Linux non-functional tests before every non-trivial commit to inbound. It
>> seems to me that would greatly reduce bustage. (I suppose we have enough
>> data for someone to compute the fraction of bustage-inducing pushes that
>> did not break Linux functional tests.)

> I know this is in the works (sorry, I don't which bug is happening in),
> but we can't quite run all of our unit tests on AWS. Anything that
> depends on a GPU (reftest, some crashtests, and even some mochitests
> I've heard) can't run there. We definitely want to move everything we
> can to the cloud, though.

I think that only mochitests which test canvas fall into that category,
which means mochitest-1 (and we could bucket up those tests into a
separate suite if needed).  The rest should be possible to be pushed to
the cloud.

Ehsan


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ehsan Akhgari  
View profile  
 More options Aug 31 2012, 11:15 am
Newsgroups: mozilla.dev.planning
From: Ehsan Akhgari <ehsan.akhg...@gmail.com>
Date: Fri, 31 Aug 2012 11:15:02 -0400
Local: Fri, Aug 31 2012 11:15 am
Subject: Re: Proper Try use, and living with low test hardware capacity
On 12-08-30 11:48 PM, Nicholas Nethercote wrote:

> philor (who knows as much about this stuff as anyone) just mentioned
> the following on IRC:

> "did anyone point out that we take 60 minutes to run Win xpcshell,
> when locally it takes 7 minutes, or that we build and test desktop on
> pushes that only touch mobile/ or b2g/?"

> Sounds like two pieces of large, low-hanging fruit.

Except that as I understand things, we don't have a reliable way to
handle them, since our infrastructure is only capable of looking at the
tip of a push, not every changeset in it.

Ehsan


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Chris AtLee  
View profile  
 More options Aug 31 2012, 11:33 am
Newsgroups: mozilla.dev.planning
From: Chris AtLee <cat...@mozilla.com>
Date: Fri, 31 Aug 2012 11:33:03 -0400
Local: Fri, Aug 31 2012 11:33 am
Subject: Re: Proper Try use, and living with low test hardware capacity
On 31/08/12 11:15 AM, Ehsan Akhgari wrote:

> On 12-08-30 11:48 PM, Nicholas Nethercote wrote:
>> philor (who knows as much about this stuff as anyone) just mentioned
>> the following on IRC:

>> "did anyone point out that we take 60 minutes to run Win xpcshell,
>> when locally it takes 7 minutes, or that we build and test desktop on
>> pushes that only touch mobile/ or b2g/?"

>> Sounds like two pieces of large, low-hanging fruit.

> Except that as I understand things, we don't have a reliable way to
> handle them, since our infrastructure is only capable of looking at the
> tip of a push, not every changeset in it.

That's just how it's currently implemented; it's certainly changeable
with enough effort.

Are the win xpcshell test times something to be concerned about? Are
there other tests that are taking unreasonably long?


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ehsan Akhgari  
View profile  
 More options Aug 31 2012, 11:40 am
Newsgroups: mozilla.dev.planning
From: Ehsan Akhgari <ehsan.akhg...@gmail.com>
Date: Fri, 31 Aug 2012 11:40:31 -0400
Local: Fri, Aug 31 2012 11:40 am
Subject: Re: Proper Try use, and living with low test hardware capacity
On 12-08-31 11:33 AM, Chris AtLee wrote:

Good point!  -> bug 787449

> Are the win xpcshell test times something to be concerned about? Are
> there other tests that are taking unreasonably long?

Absolutely!  Filed bug 787448 for the investigation on why this happens.
  I don't know if the same problem happens with other tests as well.

Ehsan


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "Future of Linux64 support" by Robert Kaiser
Robert Kaiser  
View profile  
 More options Aug 31 2012, 12:15 pm
Newsgroups: mozilla.dev.planning
From: Robert Kaiser <ka...@kairo.at>
Date: Fri, 31 Aug 2012 18:15:18 +0200
Local: Fri, Aug 31 2012 12:15 pm
Subject: Re: Future of Linux64 support
Mike Connor schrieb:

> Looking at the data, we're still at around 2/3 Linux32 on Fx13/14

Are you looking at the "release" channel there or at the "default"
channel as well? Distro builds are usually on the "default" channel so
that we don't provide updates (as the distro does that).

Robert Kaiser


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "Proper Try use, and living with low test hardware capacity" by Steve Fink
Steve Fink  
View profile  
 More options Aug 31 2012, 12:55 pm
Newsgroups: mozilla.dev.planning
From: Steve Fink <sf...@mozilla.com>
Date: Fri, 31 Aug 2012 09:55:38 -0700
Local: Fri, Aug 31 2012 12:55 pm
Subject: Re: Proper Try use, and living with low test hardware capacity
On 08/31/2012 08:04 AM, Ehsan Akhgari wrote:

I can't argue convincingly without concrete data, but this sounds wrong
to me.

You give an example of where the test restrictions would fail due to the
bucketing, but you also say "This kind of stuff is rare...". So when
something like this happens, you wouldn't get a test build and wouldn't
see the failure until several pushes later when the test *did* get run.
So we get bad coalescing in rare cases.

In return, we lower the infrastructure load across the board, resulting
in less coalescing in the common case.

I think the tradeoff is likely to be worth it, but it totally depends on
the numbers. And predicting how much coalescing will be reduced, but
only during busy times when it matters, based on a certain reduction in
test load, is Hard.

Your example does point out that we'd also want test suppression to be
relative to current load -- no need to suppress any tests during off
hours, and in fact you'd probably want to set the threshold based on
current activity/backlog. Perhaps that makes it more palatable: "we're
overloaded and can't run everything, so what jobs would be least harmful
if we suppressed them?"


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ehsan Akhgari  
View profile  
 More options Aug 31 2012, 3:02 pm
Newsgroups: mozilla.dev.planning
From: Ehsan Akhgari <ehsan.akhg...@gmail.com>
Date: Fri, 31 Aug 2012 15:02:32 -0400
Local: Fri, Aug 31 2012 3:02 pm
Subject: Re: Proper Try use, and living with low test hardware capacity
On 12-08-31 12:55 PM, Steve Fink wrote:

OK, thinking more about this, I see your point now.  And I definitely
agree that this is the sort of thing which is hard to evaluate without
the data.

> Your example does point out that we'd also want test suppression to be
> relative to current load -- no need to suppress any tests during off
> hours, and in fact you'd probably want to set the threshold based on
> current activity/backlog. Perhaps that makes it more palatable: "we're
> overloaded and can't run everything, so what jobs would be least harmful
> if we suppressed them?"

Makes sense.

Cheers,
Ehsan


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Chris Pearce  
View profile  
 More options Aug 31 2012, 9:08 pm
Newsgroups: mozilla.dev.planning
From: Chris Pearce <cpea...@mozilla.com>
Date: Sat, 01 Sep 2012 13:08:25 +1200
Local: Fri, Aug 31 2012 9:08 pm
Subject: Re: Proper Try use, and living with low test hardware capacity
On 31/08/12 07:15, Ehsan Akhgari wrote:

> On 12-08-30 2:33 PM, Steve Fink wrote:

>> 5. Or go the other way, and make more tests runnable in parallel. More
>> efficient than #4 because it avoids the VM overhead, much harder to
>> implement, would also improve testing locally. (Though making it easy to
>> set up test VMs could help local testing too.) Needing window focus will
>> again bite us here.

> I'm not convinced that this is feasible in the short to middle term
> for any of our graphical test suites.

The <audio> and <video> mochitests have a pretty simple test manager [1]
written in JS which can run multiple sub-tests in parallel. The level of
parallelsim can be cranked up or down by changing a simple parameter
[2]. Not all mochitests could be written like this (fullscreen
mochitests couldn't for example), but some of the slower running tests
may be able to be refactored to use techniques like this.

Chris P.

[1]
http://mxr.mozilla.org/mozilla-central/source/content/media/test/mani...
[2]
http://mxr.mozilla.org/mozilla-central/source/content/media/test/mani...


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Johnathan Nightingale  
View profile  
 More options Sep 3 2012, 12:59 pm
Newsgroups: mozilla.dev.planning
From: Johnathan Nightingale <john...@mozilla.com>
Date: Mon, 3 Sep 2012 12:59:46 -0400
Local: Mon, Sep 3 2012 12:59 pm
Subject: Re: Proper Try use, and living with low test hardware capacity
On Aug 31, 2012, at 12:55 PM, Steve Fink wrote:

> On 08/31/2012 08:04 AM, Ehsan Akhgari wrote:
>> To give you a concrete example of why this kind of stuff does not work, when I was working on bug 157681 (which was a layout optimization), I came across a single browser-chrome test failure happening only on Mac, which seemed pretty unrelated to my changes at first, but it turned out that it actually uncovered a subtle bug in my patch which none of the other layout tests that we have managed to catch.

>> This kind of stuff is rare, true, but it happens frequently enough that it really matters.  I don't think we can seriously consider bucketing tests based on which files have changed in a patch without losing this important aspect of catching bugs in patches -- except perhaps for extremely localized components of the code.

> I can't argue convincingly without concrete data, but this sounds wrong to me.

> You give an example of where the test restrictions would fail due to the bucketing, but you also say "This kind of stuff is rare...". So when something like this happens, you wouldn't get a test build and wouldn't see the failure until several pushes later when the test *did* get run. So we get bad coalescing in rare cases.

> In return, we lower the infrastructure load across the board, resulting in less coalescing in the common case.

I'd also be perfectly okay with saying that changes someone like Ehsan makes to something like layout are gonna run the full suite every time. Layout pushes in general are likely to touch surprising things. But even granting that, Steve's suggestions could help firefox, toolkit, mobile, js, nss, webgl, &c pushes get out of the way by running subsets.

We could label whole directories as "touching this ends the world, test everything" and be pretty liberal about where we apply that label because at the moment, we effectively apply it to everything.

So who's gonna volunteer to do the strawman test-bucket vs code location matrix? :)

J

---
Johnathan Nightingale
Sr. Director of Firefox Engineering
@johnath


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "Future of Linux32 support (was Future of Linux64 support)" by Chris Cooper
Chris Cooper  
View profile  
 More options Sep 4 2012, 1:16 pm
Newsgroups: mozilla.dev.planning
From: Chris Cooper <ccoo...@deadsquid.com>
Date: Tue, 04 Sep 2012 13:16:35 -0400
Local: Tues, Sep 4 2012 1:16 pm
Subject: Future of Linux32 support (was Future of Linux64 support)
This thread has wandered far, far away from the original purpose
(surprise) which was to assess whether we still needed/wanted Linux64 as
both a build and test platform.

Aside from the expected "OMGCHANGE" reactions, there were valid
arguments for keeping Linux64. We should invest the effort to get bug
527907 fixed.

However...

I'm not feeling a lot of love for 32-bit linux. Many people suggested
turning off linux32 instead if we needed to make a choice.

Would we consider stopping builds and tests on linux32 instead of
linux64, or at least putting some sort of horizon on how long we would
plan to support 32-bit linux as a tier 1 platform?

Again, no one is (necessarily) talking in absolutes here. We can
continue to run both linux platforms, we could demote linux32 to tier 2,
etc.

While it would obviously help unburden release engineering to reduce the
number of build/test environments we support, our primary goal here is
to make sure we're expending effort on relevant platforms and architecture.

cheers,
--
coop


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "Proper Try use, and living with low test hardware capacity" by Ed Morley
Ed Morley  
View profile  
 More options Sep 4 2012, 1:30 pm
Newsgroups: mozilla.dev.planning
From: Ed Morley <bmo.takethis...@edmorley.co.uk>
Date: Tue, 4 Sep 2012 10:30:28 -0700 (PDT)
Local: Tues, Sep 4 2012 1:30 pm
Subject: Re: Proper Try use, and living with low test hardware capacity

This was never the purpose of mozilla-inbound.

The idea was to:
 a) Have a tree where people did not have to watch their pushes for 4-6+ hours, since someone would keep an eye on it. (The primary dev incentive).
 b) Mean that other branches could confidently pull from mozilla-central, knowing that it would be green.
 c) Reduce the number of push races when merging other repos into mozilla-central (which can be more of a pain to rebase than normal sized pushes), since the traffic is lower.
 d) Give us a way to not tie up mozilla-central if we end up with extreme bustage on mozilla-inbound. We also gained the ability to reset (mozilla-inbound) to a previous revision without reverting merges from other repos.

However, it was not ever meant to be a replacement for Try, and the inbound tree rules explicitly state this:
https://wiki.mozilla.org/Tree_Rules/Inbound#What_are_the_tree_rules_f...

If people feel that it would be preferable (either from infra load or workflow) to change this policy, please can they start a dev.{platform,planning} discussion proposing a change - but in the meantime I would prefer it if they don't ignore the tree rules - since it results in very sadfaces sheriffs :-(

Best wishes,

Ed


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "Future of Linux32 support (was Future of Linux64 support)" by Benoit Jacob
Benoit Jacob  
View profile  
 More options Sep 4 2012, 2:31 pm
Newsgroups: mozilla.dev.planning
From: Benoit Jacob <jacob.benoi...@gmail.com>
Date: Tue, 4 Sep 2012 14:31:51 -0400
Local: Tues, Sep 4 2012 2:31 pm
Subject: Re: Future of Linux32 support (was Future of Linux64 support)
2012/9/4 Chris Cooper <ccoo...@deadsquid.com>:

I tried to make that point in the previous thread:

The problem with dropping a platform is not just that that platform
may be worth keeping, it is also that the fact that you feel the need
to drop a platform is probably a consequence of a deeper problem which
is the right thing to fix: our testing is too expensive. How do we
make it less expensive? People discussed possible ideas in the other
thread, including running tests less often and/or skipping part of the
tests depending on what didn't change.

Benoit


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Chris Pearce  
View profile  
 More options Sep 4 2012, 3:47 pm
Newsgroups: mozilla.dev.planning
From: Chris Pearce <cpea...@mozilla.com>
Date: Wed, 05 Sep 2012 07:47:26 +1200
Local: Tues, Sep 4 2012 3:47 pm
Subject: Re: Future of Linux32 support (was Future of Linux64 support)
On 05/09/12 06:31, Benoit Jacob wrote:

> The problem with dropping a platform is not just that that platform
> may be worth keeping, it is also that the fact that you feel the need
> to drop a platform is probably a consequence of a deeper problem which
> is the right thing to fix: our testing is too expensive.

I agree.

But if we *were* to consider dropping a platform to tier 2, we should
make that decision with data to back it up, which for Linux should also
include data regarding the Firefox x86/x64 split in the major distros,
since most roll their own Firefox packages which we don't track.

Chris P.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "Proper Try use, and living with low test hardware capacity" by Ehsan Akhgari
Ehsan Akhgari  
View profile  
 More options Sep 4 2012, 5:21 pm
Newsgroups: mozilla.dev.planning
From: Ehsan Akhgari <ehsan.akhg...@gmail.com>
Date: Tue, 04 Sep 2012 17:20:37 -0400
Local: Tues, Sep 4 2012 5:20 pm
Subject: Re: Proper Try use, and living with low test hardware capacity
On 12-09-03 12:59 PM, Johnathan Nightingale wrote:

This makes sense.  Do you wanna file a bug in Core::Build Config and
assign it to Steve?  ;-)

Ehsan


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Justin Dolske  
View profile  
 More options Sep 5 2012, 4:32 pm
Newsgroups: mozilla.dev.planning
From: Justin Dolske <dol...@mozilla.com>
Date: Wed, 05 Sep 2012 13:32:11 -0700
Local: Wed, Sep 5 2012 4:32 pm
Subject: Re: Proper Try use, and living with low test hardware capacity
On 9/4/12 10:30 AM, Ed Morley wrote:

>> Breaking mozilla-inbound should 100% acceptable, and trivial for a
>> sheriff or others to fix.
...

> However, it was not ever meant to be a replacement for Try, and the inbound tree rules explicitly state this:
> https://wiki.mozilla.org/Tree_Rules/Inbound#What_are_the_tree_rules_f...

I _suspect_ you both may be saying the basically the same thing,
although differing on exactly where the line is...

I think it's true that m-i isn't a playground; developers should be
surprised if a m-i push fails, and not just expect it to flush out
problems. At a _minimum_, developers should have at least built and run
relevant tests locally. No push'n'pray.

I also think it's true that using Try is a best-practice. It's easy and
helps to spot the unexpected without causing work for other people. It's
even essentially _required_ if you're doing things that have a history
of being touchy -- C++ magic that various compilers might dislike,
invasive build system changes, platform-specific changes that you can't
check yourself, etc.

But in-between we're trusting developers to use their best judgement.
Trivial, well-understood changes might not need Try at all. When Try is
used, we ask them to use TryChooser to limit resource usage by doing
what's needed. Not everything needs a full Talos run + debug + opt + all
tests + all platforms (+ multiple runs to ensure no new random orange is
added or you got lucky with a random green).

Assuming this is all true, it seems what we might really want here are
some better guidelines for helping developers tune/improve their "best
judgement". Some of the quote-unquote-obvious things raised this thread
would be a good start.

Justin


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jonathan Kew  
View profile  
 More options Sep 5 2012, 6:33 pm
Newsgroups: mozilla.dev.planning
From: Jonathan Kew <jfkth...@googlemail.com>
Date: Wed, 05 Sep 2012 23:33:37 +0100
Local: Wed, Sep 5 2012 6:33 pm
Subject: Re: Proper Try use, and living with low test hardware capacity
On 5/9/12 21:32, Justin Dolske wrote:

Indeed. My understanding has always been that the expected "patch
quality" for m-i is the same as for m-c. I wouldn't push a patch to
inbound unless I believe that patch is ready for mozilla-central. If I
have any significant level of doubt about this, I'd push to tryserver
first to verify whatever tests/platforms/etc I'm concerned about.

Of course, I may misjudge this sometimes, in which case our faithful
sheriffs will rescue the tree by backing me out. But the primary reason
for me to land on inbound rather than m-c is simply that it frees me
from tree-watching responsibilities -- not that it lets me push stuff
that I feel is too risky for m-c.

JK


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Steve Fink  
View profile  
 More options Sep 6 2012, 12:16 am
Newsgroups: mozilla.dev.planning
From: Steve Fink <sf...@mozilla.com>
Date: Wed, 05 Sep 2012 21:16:24 -0700
Local: Thurs, Sep 6 2012 12:16 am
Subject: Re: Proper Try use, and living with low test hardware capacity
On Tue 04 Sep 2012 02:20:37 PM PDT, Ehsan Akhgari wrote:

I'd be fine with that, though I also wouldn't get to it for a while
unless I make it through a couple of other projects faster than I have
been so far.

Then again... ok, here's v1, in bash:

echo "run everything"

or in Python

print("run everything")

Now, who can hook this into buildbot? I'll patch it from there. :-)

Except I'm not kidding. I can go to town on some crazy algorithm, but
I've no clue about the code or the process for getting anything
actually hooked in and deployed.

Btw, upon further reflection, my previously sketched-out algorithm is
all wrong. You don't just want to have a per-push trigger that says
"what tests should we kick off for this push?" You really want a job
completion trigger that says "what could I do with this now-available
machine that would give me the most information, given what I currently
know?" Or maybe that's unit of information per machine-minute, I'm not
sure. But that formulation gives way more possibilities -- it might
choose to bisect a past coalesced failure rather than just kicking off
an almost-certain-to-be-useless test for the latest push. And as long
as you don't hint to it that intermittent *greens* are possible, it'll
naturally decay to running everything for every push if resources are
available. (If you don't limit it, it'll also use idle resources to
rerun every failure forever to make sure it's not intermittent, too.)
Welcome to our new overlord, the Robosheriff!


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Steve Fink  
View profile  
 More options Sep 7 2012, 2:36 pm
Newsgroups: mozilla.dev.planning
From: Steve Fink <sf...@mozilla.com>
Date: Fri, 07 Sep 2012 11:35:57 -0700
Local: Fri, Sep 7 2012 2:35 pm
Subject: Re: Proper Try use, and living with low test hardware capacity
On Thu 06 Sep 2012 07:16:32 PM PDT, Ehsan Akhgari wrote:

How does coalescing happen? Does the build machine always request the
full set of tests, and then buildbot ignores the request if it's
overloaded? Or does the build machine actually know something about the
overload state? If the former, then plainly the build machine can
continue doing exactly what it's doing, and whatever is currently aware
of the overload would just need to be given information on the changes
made so that it could selectively suppress jobs. But I somehow doubt
it's that simple.

The pie-in-the-sky optimal interface would integrate more deeply, and
might require a bit of rearchitecting. It really wants to be a daemon
monitoring these notifications:

- job completion, with status
- new slave available (probably because it completed a job, but also
when adding to the pool or rebooting or whatever)
- changes pushed, with a way of knowing what's in that change
- star comment added

The "new slave available" notification might actually be a synchronous
call, since it would be the only thing kicking off new jobs. Optionally,
this daemon could cancel known-to-be-bad jobs, trigger clobbers, and
auto-star in limited cases.

Oh, and it wants to be able to distinguish regular pushes from merges
and backouts, because failure probabilities are totally different across
those. But a regex match is good enough for that.

In other words, it kind of wants to be the global scheduler. It would
maintain state. Version 1 would watch incoming pushes and queue up all
the build jobs. When a build job completed, it would queue up the test
jobs, only it wouldn't be a linear queue because when another build came
in it would need to reimplement the current coalescing strategy. When a
slave became available, it would throw a job at it. Ignoring the
(enormous) buildbot architectural questions, this should be pretty quick
and straightforward to implement.

Later versions would be maintaining state to quickly and correctly
answer the question, when a new slave is available, "what is the most
useful job to run on this machine?" Usually that would be grabbing one
of the test jobs from the most recent build, but could be bisecting
coalesced failures or retriggering possibly intermittent failures.

To correctly answer the "most useful job" question, it would need to
maintain estimates of the probability of any given job failing, as well
as an estimate of the current state of every type of job in the tree (eg
M1 is (85% probability) failing from one of the last 3 pushes, or (15%
probability) is a not-yet-starred intermittent failure; M2 is totally
happy with respect to the latest push.) That means it could eventually
provide a sheriff's dashboard, enumerating the possible causes of the
current horrific breakage and its plan for figuring out what's going on
(which of course can be overridden at any time via manual retriggers or
whatever.) It could even give its logic for why it picked each upcoming
job. It should be written to be reactive, though, so it doesn't depend
on anything following its advice.

In fact, an alternative implementation route would be implement the
dashboard with all the crazy estimation stuff first, but not give it any
ability to start/stop/star jobs. Then it could be validated on actual
data before giving it the reins.

This would not want live on the builders, though. It needs global
visibility.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Chris AtLee  
View profile  
 More options Sep 7 2012, 3:56 pm
Newsgroups: mozilla.dev.planning
From: Chris AtLee <cat...@mozilla.com>
Date: Fri, 07 Sep 2012 15:56:45 -0400
Local: Fri, Sep 7 2012 3:56 pm
Subject: Re: Proper Try use, and living with low test hardware capacity

> How does coalescing happen? Does the build machine always request the
> full set of tests, and then buildbot ignores the request if it's
> overloaded? Or does the build machine actually know something about the
> overload state? If the former, then plainly the build machine can
> continue doing exactly what it's doing, and whatever is currently aware
> of the overload would just need to be given information on the changes
> made so that it could selectively suppress jobs. But I somehow doubt
> it's that simple.

tl;dr - builds and tests are greedy - a machine will grab all pending
work of the same type when it starts a build/test

Coalescing happens on the buildbot master at the time when a build
starts. Once a machine is available to start a job the default behaviour
is to grab all other pending jobs of the same type. The primary
exceptions to this are try jobs where coalescing is disabled completely.

For builds, this turns into something like this when we're running at
full capacity:
* push A -> pending build requests for win32, linux64, etc.
* push B -> pending build requests for win32, linux64, etc.
* push C -> pending build requests for win32, linux64, etc.
* win32 build slave becomes available. build master coalesces pending
requests for win32 A,B,C into a single job, and tells slave to
checkout/build the latest code (C).
* push D -> pending build requests for win32, linux64, etc.
* push E -> pending build requests for win32, linux64, etc.
* win32 build slave becomes available. build master coalesces pending
requests for win32 D,E into a single job, and tells slave to
checkout/build the latest code (E).

At this point, the build master has a lot of information about the
changes going into A,B,C,D,E, including which files have changed. This
data isn't currently communicated to the build slave, nor does it
influence decisions about what should be built in most cases.

For each build platform, when the builds of C,E finish, they trigger
tests by notifying the build master of a few pieces of data: branch,
revision, platform as well as  urls to the builds, tests, and symbols.
This results in the pending queue for tests looking like:

* win32 mozilla-central C mochitests-1 http://....
* win32 mozilla-central C mochitests-2 http://....
...
* win32 mozilla-central E mochitests-1 http://....

These are subject to the same coalescing behaviours as the builds. So
all the "mochitest-1" jobs for win32 mozilla-central will be coalesced
the next time a slave is free. Note that the pending requests for test
jobs only include the revision, not the list of files that were changed
for the build. Also note that the test requests give no indication of
how many pushes were coalesced into one build. pushes A,B,D never
existed as far as tests are concerned.

This isn't to say that we *can't* change which tests are run in response
to which files are changed, rather that it's a significant change from
the current implementation.

I hope this helps!
Chris


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Chris AtLee  
View profile  
 More options Sep 7 2012, 4:09 pm
Newsgroups: mozilla.dev.planning
From: Chris AtLee <cat...@mozilla.com>
Date: Fri, 07 Sep 2012 16:09:33 -0400
Local: Fri, Sep 7 2012 4:09 pm
Subject: Re: Proper Try use, and living with low test hardware capacity
On 07/09/12 02:35 PM, Steve Fink wrote:

Indeed the architectural issues there are enormous...intelligent
scheduling is really tricky to get right, and then trickier to implement
in buildbot. We've been working a few approaches that may help: one is
to basically dump out all of the relevant state and events from the
buildbot master to make it consumable from external processes. These
processes can then inject new work into the system at their own pace.

This should make it easier to implement schedulers that require more
state, or that may require some "expensive" operations to figure out
what to do next (e.g. looking up past results in a DB, checking starred
status, etc.)

One big difference from what you describe is that buildbot doesn't
generate work in response to slave availability; it instead keeps a list
of pending work, and which slaves are eligible to do it, and assigns the
work out when slaves are free.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Henri Sivonen  
View profile  
 More options Sep 18 2012, 7:58 am
Newsgroups: mozilla.dev.planning
From: Henri Sivonen <hsivo...@iki.fi>
Date: Tue, 18 Sep 2012 14:57:57 +0300
Local: Tues, Sep 18 2012 7:57 am
Subject: Re: Proper Try use, and living with low test hardware capacity

On Fri, Aug 31, 2012 at 3:16 PM, Ben Hearsum <bhear...@mozilla.com> wrote:
> I know this is in the works (sorry, I don't which bug is happening in),
> but we can't quite run all of our unit tests on AWS. Anything that
> depends on a GPU (reftest, some crashtests, and even some mochitests
> I've heard) can't run there. We definitely want to move everything we
> can to the cloud, though.

Would it be possible to use llvmpipe to make a CPU-only config on AWS
that to Firefox looks like a config with a GPU?

--
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ben Hearsum  
View profile  
 More options Sep 18 2012, 8:11 am
Newsgroups: mozilla.dev.planning
From: Ben Hearsum <bhear...@mozilla.com>
Date: Tue, 18 Sep 2012 08:11:29 -0400
Local: Tues, Sep 18 2012 8:11 am
Subject: Re: Proper Try use, and living with low test hardware capacity
On 09/18/12 07:57 AM, Henri Sivonen wrote:

> On Fri, Aug 31, 2012 at 3:16 PM, Ben Hearsum <bhear...@mozilla.com> wrote:
>> I know this is in the works (sorry, I don't which bug is happening in),
>> but we can't quite run all of our unit tests on AWS. Anything that
>> depends on a GPU (reftest, some crashtests, and even some mochitests
>> I've heard) can't run there. We definitely want to move everything we
>> can to the cloud, though.

> Would it be possible to use llvmpipe to make a CPU-only config on AWS
> that to Firefox looks like a config with a GPU?

Would testing like that constitute a valid test? We've been pretty
insistent that we test on real-world things in the past.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Henri Sivonen  
View profile  
 More options Sep 18 2012, 8:28 am
Newsgroups: mozilla.dev.planning
From: Henri Sivonen <hsivo...@iki.fi>
Date: Tue, 18 Sep 2012 15:28:53 +0300
Local: Tues, Sep 18 2012 8:28 am
Subject: Re: Proper Try use, and living with low test hardware capacity

On Tue, Sep 18, 2012 at 3:11 PM, Ben Hearsum <bhear...@mozilla.com> wrote:
> On 09/18/12 07:57 AM, Henri Sivonen wrote:
>> On Fri, Aug 31, 2012 at 3:16 PM, Ben Hearsum <bhear...@mozilla.com> wrote:
>>> I know this is in the works (sorry, I don't which bug is happening in),
>>> but we can't quite run all of our unit tests on AWS. Anything that
>>> depends on a GPU (reftest, some crashtests, and even some mochitests
>>> I've heard) can't run there. We definitely want to move everything we
>>> can to the cloud, though.

>> Would it be possible to use llvmpipe to make a CPU-only config on AWS
>> that to Firefox looks like a config with a GPU?

> Would testing like that constitute a valid test? We've been pretty
> insistent that we test on real-world things in the past.

To the extent there are already Linux distros that run Gnome Shell on
llvmpipe when suitable GPU OpenGL drivers are missing and Ubuntu is
moving to running the 3D version of Unity on llvmpipe when suitable
GPU OpenGL drivers are missing, I'd expect running on top of llvmpipe
to correspond to one kind of real-world situation, though I don't
actually know how Firefox sees the OpenGL stack when running with
Gnome Shell/llvmpipe or Unity/llvmpipe.

--
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Messages 76 - 100 of 105 < Older  Newer >
« Back to Discussions « Newer topic     Older topic »