Notes on ways to improve the build/automation turnaround time


Clint Talbert

May 24, 2011, 3:27:43 PM
After the platform meeting, we talked a bit about how to reduce the
amount of turnaround time it takes to do a full build/test cycle in our
automation.

It's all on the meeting wiki, but for ease in reading, here's the set of
actions we came up with.

Short term
* (joduinn/releng) Try by default will not do anything
* (joduinn/releng) Stop running always failing tests automatically.
* investigate bug 659222 - joduinn

Medium term
* (releng) Test suites in progress should be available on try and
selfserve even when those test suites are not being run automatically.
(This is an addition to the "stop running always failing tests" above)
* (releng) see if we can do anything to add machines before we get a new
colo.

Long term
* Experiment with moving tests into virtualization (bmoss and rsayre to
get team to figure this out). ctalbert volunteers to help
** rsayre will help with getting engineering help for long term
solution/fixing tests that prove intermittent in virtualization
* Figure out a way to not run tests that always pass on every test run.
(Ateam/Releng)

Thanks,
Clint

Justin Dolske

May 24, 2011, 4:03:33 PM
On 5/24/11 12:27 PM, Clint Talbert wrote:

> Short term
...


> * (joduinn/releng) Stop running always failing tests automatically.

What tests would those be?

Also, I see a thread in m.d.tree-management about combining some tests:

"(e.g. a11y and scroll tests are run as separate jobs, and only take a
few minutes of test time, which is pretty inefficient due to time
required to reboot, download and unpack new build and symbols, etc.)"

Sounds like this could also be a short-term and easy fix?


> Long term


> * Figure out a way to not run tests that always pass on every test run.

I don't understand this, conceptually.

Except for intermittent failures (which I don't think are relevant to
this?), every test should be green on every run. Until someone breaks
something, which is the point of having the tests. :)

Justin

Joshua Cranmer

May 24, 2011, 9:30:02 PM
On 05/24/2011 04:03 PM, Justin Dolske wrote:
>> Long term
>> * Figure out a way to not run tests that always pass on every test run.
>
> I don't understand this, conceptually.
>
> Except for intermittent failures (which I don't think are relevant to
> this?), every test should be green on every run. Until someone breaks
> something, which is the point of having the tests. :)
Another alternative is to try to figure out which tests "depend" on
which changes and not run them if those don't change.
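
As a rough illustration of that idea (purely a sketch: the directory-to-suite
mapping below is invented, not derived from the real build system or any
actual data), selection could look something like this:

    # Sketch: choose test suites based on which directories a push touches.
    # The DEPENDENCY_MAP is made up for illustration; a real mapping would
    # have to come from the build system or from historical failure data.
    import subprocess

    DEPENDENCY_MAP = {
        "layout/": {"reftest", "mochitest-4"},
        "content/": {"mochitest-1", "mochitest-2"},
        "netwerk/": {"xpcshell"},
        "toolkit/components/passwordmgr/": {"mochitest-browser-chrome"},
    }
    ALL_SUITES = {"reftest", "xpcshell", "mochitest-browser-chrome"} | {
        "mochitest-%d" % i for i in range(1, 6)}

    def suites_for_push(rev="."):
        """Return the suites worth running for the files touched by a revision."""
        files = subprocess.check_output(
            ["hg", "status", "--no-status", "--change", rev],
            text=True).splitlines()
        suites = set()
        for path in files:
            matched = [deps for prefix, deps in DEPENDENCY_MAP.items()
                       if path.startswith(prefix)]
            if not matched:
                # Unknown area: be conservative and run everything.
                return ALL_SUITES
            for deps in matched:
                suites |= deps
        return suites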

Justin Lebar

May 25, 2011, 8:30:44 AM
There are two separate issues: How long do builds take, and how long do tests take? To address only the first one:

We have an existence proof that there's a lot of headway we could make in terms of build speed. My mac takes 12m to do a debug build from scratch. I have a Linux box which is similarly fast. So there's no question that we could speed up mac builds and debug (i.e. non-PGO) Linux builds by getting faster machines, right?

IIRC, we don't use pymake on Windows builds. If we did, that would be a huge speedup for non-PGO builds, because we could use -j4 or greater.

Ted proposed not running PGO unless we ask for it; that would make release builds appear much faster on Linux and especially Windows.

Kyle Huey

May 25, 2011, 9:11:34 AM
to dev-pl...@lists.mozilla.org
On Wed, May 25, 2011 at 5:30 AM, Justin Lebar <justin...@gmail.com> wrote:

> There are two separate issues: How long do builds take, and how long do
> tests take? To address only the first one:
>

Which is the less interesting one; we're not backing up on the builders
here.

>
> We have an existence proof that there's a lot of headway we could make in
> terms of build speed. My mac takes 12m to do a debug build from scratch. I
> have a Linux box which is similarly fast. So there's no question that we
> could speed up mac builds and debug (i.e. non-PGO) Linux builds by getting
> faster machines, right?
>

It might help a bit. Remember that the builders reboot between every build,
so the build is generally a "cold" build (unless the hg clone pulled most of
the repo into memory). Without having done any measurements myself, I
imagine that IO performance is pretty important here compared to raw cpu
power.

>
> IIRC, we don't use pymake on Windows builds. If we did, that would be a
> huge speedup for non-PGO builds, because we could use -j4 or greater.
>

Timing on build slaves indicates that pymake is not a big win there.

> Ted proposed not running PGO unless we ask for it; that would make release
> builds appear much faster on Linux and especially Windows.
>

Yeah, we really should do that.


- Kyle

Mike Hommey

May 25, 2011, 9:32:05 AM
to Kyle Huey, dev-pl...@lists.mozilla.org
On Wed, May 25, 2011 at 06:11:34AM -0700, Kyle Huey wrote:
> It might help a bit. Remember that the builders reboot between every build,
> so the build is generally a "cold" build (unless the hg clone pulled most of
> the repo into memory). Without having done any measurements myself, I
> imagine that IO performance is pretty important here compared to raw cpu
> power.

Why do they reboot between every build? Can't we skip that?

> > IIRC, we don't use pymake on Windows builds. If we did, that would be a
> > huge speedup for non-PGO builds, because we could use -j4 or greater.
> >
>

> Timing on build slaves indicates that pymake is not a big win there.

AFAIK these timings were without -j4.

Mike

Axel Hecht

May 25, 2011, 9:33:51 AM
On 25.05.11 15:11, Kyle Huey wrote:
On Wed, May 25, 2011 at 5:30 AM, Justin Lebar <justin...@gmail.com> wrote:
>
>> There are two separate issues: How long do builds take, and how long do
>> tests take? To address only the first one:
>>
>
> Which is the less interesting one; we're not backing up on the builders
> here.
>
>>
>> We have an existence proof that there's a lot of headway we could make in
>> terms of build speed. My mac takes 12m to do a debug build from scratch. I
>> have a Linux box which is similarly fast. So there's no question that we
>> could speed up mac builds and debug (i.e. non-PGO) Linux builds by getting
>> faster machines, right?
>>
>
> It might help a bit. Remember that the builders reboot between every build,
> so the build is generally a "cold" build (unless the hg clone pulled most of
> the repo into memory). Without having done any measurements myself, I
> imagine that IO performance is pretty important here compared to raw cpu
> power.
>

Do the builders reboot or just the testers?

Also, how much time are we spending on reboot? Is that included in the
"few minutes" setup time that John mentioned?

Asking because a reboot sounds like an expensive and drastic way to work
around problems for which we may have more efficient solutions.

Axel

Armen Zambrano Gasparnian

May 25, 2011, 10:04:01 AM
to Mike Hommey, Kyle Huey, dev-pl...@lists.mozilla.org
We reboot both builders and testers after every job (with a few exceptions,
like L10n repacks).

Reboots for builders originally had two main purposes:
* clean state for unit tests (when we used to run them there)
* clean state for builds (nothing from previous builds is chewing memory)

There are also configuration management purposes:
* synchronize with puppet/opsi to have the right packages installed
* synchronize with the slave allocator to determine which master the
slave should be talking to

Are we trying to determine how much it would buy us to build warm?
I believe catlee and others did some experiments on this, but he is away
for a few days.

The reboot time is not included in the setup time that joduinn might
have mentioned. Setup time includes checking out repositories and
clobbering.

Even if a slave is rebooting, there is generally another idle slave
available to take a job. The "build" and "try" wait times show that we
generally pick up jobs soon enough.

cheers,
Armen


Robert Kaiser

May 25, 2011, 1:42:03 PM
Justin Lebar schrieb:

> My mac takes 12m to do a debug build from scratch. I have a Linux box which is similarly fast.

Do those run "make buildsymbols" as well in that time? A significant
portion of the time our builders take is AFAIK for that step (which we
need if we want to be able to gather any sort of meaningful crash stats).

Robert Kaiser


--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community should think about. And most of the
time, I even appreciate irony and fun! :)

Clint Talbert

May 25, 2011, 2:32:26 PM
On 5/24/2011 1:03 PM, Justin Dolske wrote:
> On 5/24/11 12:27 PM, Clint Talbert wrote:
>
>> Short term
> ...
>> * (joduinn/releng) Stop running always failing tests automatically.
>
> What tests would those be?
At the moment, the only one I know of is Jetpack on Windows. John
seemed to think there were others as well.

>
> Also, I see a thread in m.d.tree-management about combining some tests:
>
> "(e.g. a11y and scroll tests are run as separate jobs, and only take a
> few minutes of test time, which is pretty inefficient due to time
> required to reboot, download and unpack new build and symbols, etc.)"
>
> Sounds like this could also be a short-term and easy fix?
>

Yep, it's being tracked by bug 659328. It's probably a relatively minor
fix, but nonetheless something good to do.


>
>> Long term
>> * Figure out a way to not run tests that always pass on every test run.
>
> I don't understand this, conceptually.
>
> Except for intermittent failures (which I don't think are relevant to
> this?), every test should be green on every run. Until someone breaks
> something, which is the point of having the tests. :)

There are a ton of ways to implement this kind of thing. I prefer some
sort of cycling through the tests, so as an example:
Over some period of time, you run the full test suite whenever the tree
is free, and you keep track of every test that is green on every run.

Then you take x% of those perma-green tests out of the "on change" runs
for the next week, and you only run full tests on nightlies. That way if
you do break one of these perma green tests, you'll find out about it
from the nightly build.

Each week, you reactivate/deactivate a different percentage of
permagreen tests.

This is a pretty complicated mechanism, but it has a benefit that you're
cycling through all the tests over time, and you have full runs each day.
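
A minimal sketch of that weekly rotation (the 25% figure is only an example,
not a proposal, and the helper names are invented):

    # Sketch of the rotation described above: each week, rest a different
    # slice of the tests that have been green on every recent run.  Full
    # (nightly) runs still execute everything.  The 25% figure is an example.
    import hashlib

    REST_FRACTION = 0.25

    def rested_tests(perma_green_tests, week_number):
        """Pick a stable-but-rotating subset of perma-green tests to skip per push."""
        rested = set()
        for test in perma_green_tests:
            # Hash the test name together with the week so the rested subset
            # changes weekly but stays fixed within a week.
            digest = hashlib.sha1(("%d:%s" % (week_number, test)).encode()).hexdigest()
            if int(digest, 16) % 100 < REST_FRACTION * 100:
                rested.add(test)
        return rested

    def per_push_tests(all_tests, perma_green_tests, week_number):
        skip = rested_tests(perma_green_tests, week_number)
        return [t for t in all_tests if t not in skip]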

Another way to do it is to wire it to code areas, and activate tests
based on the checkin's area of effect, but that's hard to measure well.

However we implement it, the purpose is to reduce cycle time by reducing
the set of tests we run on each change, while still running a full run
of tests periodically - either each day or at certain points through the
day - perhaps every 6 hours or some such.

Note that this would not apply to Talos. We'd always run all of Talos;
this would only affect correctness tests.

Does that hand-wavy outline explain the thinking behind the approach?

Clint

Daniel Holbert

May 25, 2011, 3:14:40 PM
to Clint Talbert, dev-pl...@lists.mozilla.org
On 05/25/2011 11:32 AM, Clint Talbert wrote:
> This is a pretty complicated mechanism, but it has a benefit that you're
> cycling through all the tests over time, and you have full runs each day.

So what happens when we get orange in one of these tests? How do we get
to a known-good tree state?

Right now, we have the option of backing out *just* the push that went
orange[1].

With your proposal, we'd have to back out everything *back to the last
time that test was run* (which sounds like it could be up to a day's
worth of pushes) in order to be in a known-good state. That sounds painful.

~Daniel

[1] (and possibly everything since that push, if the orange was an
'aborts test suite' type issue that prevented test coverage for
subsequent pushes)

Justin Lebar

May 25, 2011, 3:41:50 PM
> we're not backing up on the builders.

I admit to missing the meeting, but Clint said in the original post:

> After the platform meeting, we talked a bit about how to reduce the
> amount of turnaround time it takes to do a full build/test cycle in our
> automation.

It still takes a long time to get your results even when there's no backlog. Surely faster builds would help with that.

Additionally, although we don't back up on builders, we do occasionally kick builds off on VMs, which appear to be considerably slower than bare hardware. (The Linux-32 build VM I'm currently SSH'ed into has only one CPU!)

> It might help a bit. Remember that the builders reboot between every build,
> so the build is generally a "cold" build (unless the hg clone pulled most of
> the repo into memory). Without having done any measurements myself, I
> imagine that IO performance is pretty important here compared to raw cpu
> power.

To offer an alternative interpretation also not backed up by data: We should be able to hide disk latency by running more jobs than there are CPUs.

But the bigger point is that neither of us has a clue. If it's I/Os that matter, then knowing that would help guide us towards a solution (SSDs, running a different clone script so the source files stay in buffer cache).

My understanding, btw, is that we no longer do a full hg clone, but instead only pull the necessary files for the tip rev. Presumably they all stay in buffer cache. But maybe we only do this on try, or maybe I'm misremembering.

> Timing on build slaves indicates that pymake is not a big win there.

Do you know why that is?

Armen Zambrano Gasparnian

May 25, 2011, 4:08:56 PM
to mozilla.de...@googlegroups.com, Justin Lebar
On 11-05-25 3:41 PM, Justin Lebar wrote:
> But the bigger point is that neither of us has a clue. If it's I/Os that matter, then knowing that would help guide us towards a solution (SSDs, running a different clone script so the source files stay in buffer cache).
We are getting enterprise drives on the IX machines to speed I/O.

>
> My understanding, btw, is that we no longer do a full hg clone, but instead only pull the necessary files for the tip rev. Presumably they all stay in buffer cache. But maybe we only do this on try, or maybe I'm misremembering.

That is right. For try we do full clones; for everything else we just
add missing changesets (unless a clobber is requested).

cheers,
Armen

Ben Hearsum

May 25, 2011, 4:15:25 PM
to Armen Zambrano Gasparnian, mozilla.de...@googlegroups.com, Justin Lebar

Previously, we did limited clones -- pulling in only what was needed for
the revision we cared about.

However, we don't do full clones on try anymore, we keep a read-only
clone of the repository on the machines, and use "hg share" to update
the working copy. Pulling in the incremental changes to the read-only
clone and updating the working copy still takes 5-7 minutes, but it's
much faster than before.

sayrer

May 26, 2011, 12:54:07 AM
On Tuesday, May 24, 2011 1:03:33 PM UTC-7, Justin Dolske wrote:
>
> > Long term
> > * Figure out a way to not run tests that always pass on every test run.
>
> I don't understand this, conceptually.
>
> Except for intermittent failures (which I don't think are relevant to
> this?), every test should be green on every run.

We're looking to speed up test cycle times. One way to do that is to evaluate the odds that a test will fail. I'm sure there are, say, W3C DOM Level 2 Core tests that have never failed since they were checked in. Running them on every check-in is a waste of time, cycles, and greenhouse gasses.

What if these tests that nearly always pass were only run once a day? Then you would still catch them in a reasonable amount of time, and it would probably be obvious which check-in did it. Tests that do fail could also re-enter the suite that's always run.

Below, Joshua suggests looking at data to determine which tests to execute. This is another, more sophisticated way to determine which tests might be worth running.

Getting to this point will take some time. Does that rationale make sense?

Mike Hommey

May 26, 2011, 1:58:46 AM
to dev-pl...@lists.mozilla.org

It does make sense, provided we have an easy way to trigger these tests
on intermediate csets when they start failing, so that we can narrow the
regression down to one particular cset.

Mike

Ehsan Akhgari

May 26, 2011, 12:57:17 PM
to Mike Hommey, dev-pl...@lists.mozilla.org
On 11-05-26 1:58 AM, Mike Hommey wrote:
> On Wed, May 25, 2011 at 09:54:07PM -0700, sayrer wrote:
> It does make sense, provided we have an easy way to trigger these tests
> on intermediate csets when they start failing, to allow to narrow down
> to one particular cset doing the regression.

Does this proposal also cover the try server? These tests might have
never failed on mozilla-central, but I'm pretty sure that they've
allowed people to catch regressions before hitting m-c.

Ehsan

jmaher

May 27, 2011, 1:22:46 PM
What about selectively running the long-running tests?

I took a look at mochitest (1-5) and found 191 test_* files which have
a runtime of >10 seconds. Actually 4 tests have >2 minutes. All in
all, on a debug build we could save about an hour of test time by
ignoring these, and on an opt build the savings would be between 15-20
minutes. Keep in mind these times would need to be divided by 5 (the
number of chunks we run).

It would require a manifest or some other mechanism to add the runtime
metadata, but all of that is possible. We could run the mSlow tests
on nightly builds or a few times/day.
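
As a sketch of what that could look like (the manifest format here is
invented, not the real mochitest manifest syntax; the first three runtimes
echo figures quoted later in the thread and the last path is made up):

    # Sketch: annotate tests with measured runtimes and keep the slow ones
    # out of per-push runs.  The manifest format and the last path are
    # invented; the first three runtimes echo figures quoted below.
    SLOW_THRESHOLD = 10.0  # seconds, the ">10 seconds" cut-off above

    MANIFEST = """\
    content/base/test/test_websocket.html runtime=128.4
    content/base/test/test_ws_basic_tests.html runtime=116.9
    layout/style/test/test_value_cloning.html runtime=234.7
    dom/tests/mochitest/general/test_quick_example.html runtime=0.3
    """

    def split_by_runtime(manifest_text, threshold=SLOW_THRESHOLD):
        fast, slow = [], []
        for line in manifest_text.splitlines():
            line = line.strip()
            if not line:
                continue
            path, _, meta = line.partition(" runtime=")
            (slow if float(meta) > threshold else fast).append(path)
        return fast, slow

    fast_tests, slow_tests = split_by_runtime(MANIFEST)
    # fast_tests run on every push; slow_tests only on nightlies or a few
    # times a day.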

Boris Zbarsky

May 27, 2011, 2:22:41 PM
On 5/27/11 1:22 PM, jmaher wrote:
> What about selectively running the long-running tests?
>
> I took a look at mochitest (1-5) and found 191 test_* files which have
> a runtime of >10 seconds. Actually 4 tests have >2 minutes.

Which ones, if I might ask?

(Note that I suspect that the difference here is that the granularity of
test_* files in mochitest is different; some might be "test this one DOM
feature" while others are "test that every CSS property we implement
round-trips correctly".)

-Boris

jmaher

May 27, 2011, 3:21:43 PM

The 4 longest tests are:
m1:
35404 INFO TEST-END | /tests/content/base/test/test_websocket.html |
finished in 128442ms
36529 INFO TEST-END | /tests/content/base/test/
test_ws_basic_tests.html | finished in 116905ms
40603 INFO TEST-END | /tests/content/canvas/test/webgl/
test_webgl_conformance_test_suite.html | finished in 133501ms
m4:
72059 INFO TEST-END | /tests/layout/style/test/test_value_cloning.html
| finished in 234695ms

Other lengthy tests are the layout/base (reftests?) inside of
mochitest (from m4 - a sample of about 123):
487 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1b.html |
finished in 11519ms
490 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1c.html |
finished in 22459ms
493 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1d.html |
finished in 22449ms
496 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1e.html |
finished in 24156ms
499 INFO TEST-END | /tests/layout/base/tests/test_bug441782-2a.html |
finished in 19105ms
502 INFO TEST-END | /tests/layout/base/tests/test_bug441782-2b.html |
finished in 19182ms
505 INFO TEST-END | /tests/layout/base/tests/test_bug441782-2c.html |
finished in 23415ms

Boris Zbarsky

May 27, 2011, 3:33:11 PM
On 5/27/11 3:21 PM, jmaher wrote:
> 35404 INFO TEST-END | /tests/content/base/test/test_websocket.html |
> finished in 128442ms
> 36529 INFO TEST-END | /tests/content/base/test/
> test_ws_basic_tests.html | finished in 116905ms

As I recall those are just buggy: they use large setTimeouts all over
and stuff. Can we just fix them?

> 40603 INFO TEST-END | /tests/content/canvas/test/webgl/
> test_webgl_conformance_test_suite.html | finished in 133501ms

No idea what the deal is here.

> 72059 INFO TEST-END | /tests/layout/style/test/test_value_cloning.html
> | finished in 234695ms

This test is testing all sorts of stuff about lots of possible values
for each CSS property... but it does this by doing them one after the
other, with a new document being loaded for each test. I think we can
probably improve this. Let me take a look.

> Other lengthy tests are the layout/base (reftests?)

No, reftests is something different.

> 487 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1b.html |
> finished in 11519ms
> 490 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1c.html |
> finished in 22459ms
> 493 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1d.html |
> finished in 22449ms
> 496 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1e.html |
> finished in 24156ms
> 499 INFO TEST-END | /tests/layout/base/tests/test_bug441782-2a.html |
> finished in 19105ms
> 502 INFO TEST-END | /tests/layout/base/tests/test_bug441782-2b.html |
> finished in 19182ms
> 505 INFO TEST-END | /tests/layout/base/tests/test_bug441782-2c.html |
> finished in 23415ms

These tests are exposing a bug in the mochitest harness, effectively:
they change preferences that affect the layout of the harness document
itself, and that document is _huge_... I was sure we had a bug about
this, but can't find it right now.

We should be able to make these tests _much_ faster if we actually want to.

-Boris

Joe Drew

May 27, 2011, 3:43:03 PM
to dev-pl...@lists.mozilla.org

On 2011-05-27 3:33 PM, Boris Zbarsky wrote:
>> 40603 INFO TEST-END | /tests/content/canvas/test/webgl/
>> test_webgl_conformance_test_suite.html | finished in 133501ms
>
> No idea what the deal is here.

"Run the entire WebGL conformance test suite." There are a lot of tests
in there.

Joe

Ehsan Akhgari

May 27, 2011, 4:04:09 PM
to Boris Zbarsky, dev-pl...@lists.mozilla.org
On 11-05-27 3:33 PM, Boris Zbarsky wrote:
> On 5/27/11 3:21 PM, jmaher wrote:
>> 35404 INFO TEST-END | /tests/content/base/test/test_websocket.html |
>> finished in 128442ms
>> 36529 INFO TEST-END | /tests/content/base/test/
>> test_ws_basic_tests.html | finished in 116905ms
>
> As I recall those are just buggy: they use large setTimeouts all over
> and stuff. Can we just fix them?

I think we should. There is no need here for any timeouts, we control
the whole stack and we should be able to figure out what exactly to wait
for. Who's our Web Socket person?

>> 40603 INFO TEST-END | /tests/content/canvas/test/webgl/
>> test_webgl_conformance_test_suite.html | finished in 133501ms
>
> No idea what the deal is here.

Apparently there are a bunch of things which we can do here. Filed bug
660322.

>> 72059 INFO TEST-END | /tests/layout/style/test/test_value_cloning.html
>> | finished in 234695ms
>
> This test is testing all sorts of stuff about lots of possible values
> for each CSS property... but it does this by doing them one after the
> other, with a new document being loaded for each test. I think we can
> probably improve this. Let me take a look.

Is there a bug on file for this?

>> Other lengthy tests are the layout/base (reftests?)
>
> No, reftests is something different.
>
>> 487 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1b.html |
>> finished in 11519ms
>> 490 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1c.html |
>> finished in 22459ms
>> 493 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1d.html |
>> finished in 22449ms
>> 496 INFO TEST-END | /tests/layout/base/tests/test_bug441782-1e.html |
>> finished in 24156ms
>> 499 INFO TEST-END | /tests/layout/base/tests/test_bug441782-2a.html |
>> finished in 19105ms
>> 502 INFO TEST-END | /tests/layout/base/tests/test_bug441782-2b.html |
>> finished in 19182ms
>> 505 INFO TEST-END | /tests/layout/base/tests/test_bug441782-2c.html |
>> finished in 23415ms
>
> These tests are exposing a bug in the mochitest harness, effectively:
> they change preferences that affect the layout of the harness document
> itself, and that document is _huge_... I was sure we had a bug about
> this, but can't find it right now.
>
> We should be able to make these tests _much_ faster if we actually want to.

This is bug 479352. If we did have a way to run a subset of reftests in
privileged mode so that they can change prefs, we would have been able
to avoid the mochitest harness cost for these altogether. ;-)

Ehsan

Armen Zambrano Gasparnian

May 27, 2011, 4:40:55 PM
to Boris Zbarsky
Could we change our harnesses to create a performance summary for each
individual test?
Perhaps a summary of the tests that take the longest, which we could
triage every once in a while to see whether they are running as fast as
they should. Asking for a per-test runtime regression tool might be
asking too much, but you can tell me.
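
For mochitest at least, the TEST-END lines already carry per-test runtimes
(some are quoted earlier in this thread), so a rough summary could be scraped
from an existing log; a sketch, with a placeholder log filename:

    # Sketch: rank the slowest tests from an existing mochitest log.
    # Mochitest prints "TEST-END | <path> | finished in <N>ms" lines; the
    # log filename is a placeholder.
    import re

    TEST_END = re.compile(r"TEST-END \| (\S+) \| finished in (\d+)ms")

    def slowest_tests(log_path, top_n=10):
        durations = {}
        with open(log_path) as log:
            for line in log:
                match = TEST_END.search(line)
                if match:
                    durations[match.group(1)] = int(match.group(2))
        return sorted(durations.items(), key=lambda item: item[1],
                      reverse=True)[:top_n]

    for path, ms in slowest_tests("mochitest-debug.log"):
        print("%8.1fs  %s" % (ms / 1000.0, path))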

cheers,
Armen

smaug

May 27, 2011, 5:35:25 PM
to Ehsan Akhgari, Boris Zbarsky, dev-pl...@lists.mozilla.org
On 05/27/2011 11:04 PM, Ehsan Akhgari wrote:
> On 11-05-27 3:33 PM, Boris Zbarsky wrote:
>> On 5/27/11 3:21 PM, jmaher wrote:
>>> 35404 INFO TEST-END | /tests/content/base/test/test_websocket.html |
>>> finished in 128442ms
>>> 36529 INFO TEST-END | /tests/content/base/test/
>>> test_ws_basic_tests.html | finished in 116905ms
>>
>> As I recall those are just buggy: they use large setTimeouts all over
>> and stuff. Can we just fix them?
>
> I think we should. There is no need here for any timeouts, we control
> the whole stack and we should be able to figure out what exactly to wait
> for. Who's our Web Socket person?

In general there are plenty of reasons for timeouts, especially when we
test networking. For example when you want to test that something does
not happen, or that something happens X times, but not X+1 times.
(Timeouts don't really guarantee either one, but they give quite a good
estimate.)

Boris Zbarsky

May 27, 2011, 9:41:35 PM
On 5/27/11 3:33 PM, Boris Zbarsky wrote:

>> 72059 INFO TEST-END | /tests/layout/style/test/test_value_cloning.html
>> | finished in 234695ms

> This test is testing all sorts of stuff about lots of possible values
> for each CSS property... but it does this by doing them one after the
> other, with a new document being loaded for each test. I think we can
> probably improve this. Let me take a look.

https://bugzilla.mozilla.org/show_bug.cgi?id=660398 has a fix. Looks
like my hardware (which the numbers in the bug come from) is a bit
faster than the test machine here, but if things scale linearly this
test should be down to 10s or so in a debug build on the test machine.
Not great, but a lot better.

-Boris

Clint Talbert

May 28, 2011, 2:12:49 AM
Mochitest already outputs such a summary. Reftest doesn't: filed bug
660419. We can certainly add this.

Clint


Clint Talbert

May 28, 2011, 2:16:37 AM
On 5/26/2011 9:57 AM, Ehsan Akhgari wrote:

>
> Does this proposal also cover the try server? These tests might have
> never failed on mozilla-central, but I'm pretty sure that they've
> allowed people to catch regressions before hitting m-c.
>

I hadn't completely thought through all the details of the proposal yet.
But I think I would prefer try to always mirror mozilla-central *by
default*. Back when try didn't mirror mozilla-central, it was a chronic
issue.

That said, while try should mirror mozilla-central by default, we should
encourage and enhance try's ability to be configurable. So there should
be some try chooser syntax to "ignore rules about the current passing
test set" if you want that for your patch.

Clint

Clint Talbert

May 28, 2011, 2:46:11 AM
On 5/25/2011 12:14 PM, Daniel Holbert wrote:
> With your proposal, we'd have to back out everything *back to the last
> time that test was run* (which sounds like it could be up to a day's
> worth of pushes) in order to be in a known-good state. That sounds painful.
>
> ~Daniel
>
> [1] (and possibly everything since that push, if the orange was an
> 'aborts test suite' type issue that prevented test coverage for
> subsequent pushes)

Good question. Yes, you run the risk of having to back out a lot to find
an orange. There are a couple of ways to mitigate that risk. Since this
is at a really early stage, I don't have data to make a concrete
proposal. But here are some thoughts:

* You optimize by running a full build of everything every X changesets.
* You provide hands-off regression hunting tools to find the changeset
that hit the orange and (possibly) auto-back it out. (the a-team has
already begun work on that regression hunting tool)
* Or, as Mike Hommey suggested, you provide a means (through an extension
to the self-serve build API, for example) to run the orange test on the
intervening changesets that have landed between the last known-good run
and the current orange one (a rough sketch of this follows below). This
requires a sheriff though, because the tree-watching time in this case
could be very long, depending on how often the full runs are performed.
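
A rough sketch of that bisection step; trigger_and_wait() is a placeholder
for whatever actually schedules the job and reports green/orange, not a real
API:

    # Rough sketch of the bisection in the last bullet.  trigger_and_wait()
    # stands in for whatever actually schedules the job (a self-serve API
    # call, say) and returns True if the suite came back green.
    def bisect_orange(changesets, trigger_and_wait, suite):
        """changesets[0] is known good, changesets[-1] is the first seen orange."""
        good, bad = 0, len(changesets) - 1
        while bad - good > 1:
            mid = (good + bad) // 2
            if trigger_and_wait(changesets[mid], suite):
                good = mid  # still green: the regression landed later
            else:
                bad = mid   # orange: the regression is here or earlier
        return changesets[bad]  # first changeset whose run goes orange

    # Intermittent oranges would confuse this; retrying the suite a couple
    # of times per changeset is one crude mitigation.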

Of course, new intermittent issues that are introduced will throw a
wrench in these plans. Regardless of what we do, I think that
intermittent oranges will be the Achilles heel of any automation
solution to increase build turnaround time. Our best hope is to
continue the war on orange and drive the orangefactor number down to <=1
and keep it there. But outside of the intermittent issues, we do have
some options to make this idea into a viable approach.

I'm glad you're all bringing up these issues and also finding concrete
tests to fix now. That's awesome. We've also started pulling a team
together to crunch the data and develop something more than just vague
bullet points.

Clint

Justin Dolske

May 29, 2011, 1:35:54 AM
On 5/25/11 9:54 PM, sayrer wrote:

>>> Long term
>>> * Figure out a way to not run tests that always pass on every test run.
>>
>> I don't understand this, conceptually.
>

> We're looking to speed up test cycle times. One way to do that is to evaluate the odds that a test will fail. I'm sure there are, say, W3C DOM Level 2 Core tests that have never failed since they were checked in. Running them on every check-in is a waste of time, cycles, and greenhouse gasses.

> [...]


> Below, Joshua suggests looking at data to determine which tests to execute. This is another, more sophisticated way to determine which tests might be worth running.

Ah, ok. Hmm.

So, I see potential value in a "dependency" system. Some front-end
changes (like, say, password manager) have well-defined tests that are
useful to run, and the rest are basically useless. I'm sure there are
other areas, though I'm a bit dubious about how well we can identify them
and how significant the wins will be for daily m-c activity.
[Conversely, there are other areas (like, say, xpconnect) where we'd
want to run everything, because there are different uses of it all over.]

I'm highly wary about disabling tests based just on probability, though.
I suspect there's a lot of code that, while any particular area is
changed infrequently, has a significant probability of breaking a test
even though, on average, that test always passes.

As a hypothetical example: consider a project with 100 independent code
modules, and an incompetent programmer who makes a broken patch to 1
module each day. [Or a competent programmer working with incompetent
code, which might be closer to our situation ;-)] On average, the test
for any particular module will only have a 1% failure rate per day. But
in reality, the tree is always 100% broken.
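
A quick simulation of that hypothetical (same numbers as above) makes the
gap concrete:

    # Simulate the hypothetical: 100 modules, one broken patch per day.
    # Each module's test fails on only ~1% of days, yet (by construction)
    # some test is red every single day.
    import random

    random.seed(0)
    MODULES, DAYS = 100, 1000
    failures = [0] * MODULES

    for _day in range(DAYS):
        failures[random.randrange(MODULES)] += 1  # the day's one broken module

    per_module_rate = sum(failures) / float(MODULES * DAYS)
    print("average per-module failure rate: %.3f" % per_module_rate)  # ~0.010
    print("days with at least one failure: %d of %d" % (DAYS, DAYS))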

OTOH, I suppose it's possible there are enough tests that just fail _so_
rarely that the cycle time savings are worth having larger, more complex
regression ranges. EG, it seems like a win if we got 50% faster cycle
times but once every couple of months had to close the tree to figure
out which non-obvious commit broke an infrequently run test.

There's a bright side here, though. We have historical logs -- data! If
we identify a set of tests we want to run less often, it would be
possible to use that data to determine how frequently there would be a
"delayed orange" in that set. As well as gauge how complex identifying
the cause and backing it out would be, based on actual checkins around
that time.
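
A sketch of that replay, assuming the historical results can be
reconstructed into per-push pass/fail records (the data format and the
pushes-per-day figure below are assumptions):

    # Sketch of the replay: `history` is assumed to be a list of
    # (push_id, {test_name: passed}) tuples in push order, reconstructed
    # from archived logs; the pushes-per-day figure is a placeholder.
    def delayed_orange_stats(history, daily_only_tests, pushes_per_day=40):
        delays = []
        for index, (push_id, results) in enumerate(history):
            failed = [t for t in daily_only_tests if results.get(t) is False]
            if failed:
                # With once-a-day runs the failure surfaces at the next day
                # boundary, so every push in between lands in the blame range.
                next_daily_run = ((index // pushes_per_day) + 1) * pushes_per_day
                delays.append(next_daily_run - index)
        return {
            "delayed_oranges": len(delays),
            "average_blame_range": sum(delays) / float(len(delays)) if delays else 0.0,
        }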

Justin

Robert Kaiser

May 29, 2011, 3:38:15 PM
Justin Dolske schrieb:

> So, I see potential value in a "dependency" system. Some front-end
> changes (like, say, password manager) have well-defined tests that are
> useful to run, and the rest are basically useless.

*In theory*, yes. We've at times seen a change in some part of the front
end suddenly break a seemingly unrelated area that depended on something
from it, without people knowing about the dependency. That may not be the
usual case, but it sometimes happens.

Robert Kaiser


--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community should think about. And most of the
time, I even appreciate irony and fun! :)

Armen Zambrano Gasparnian

May 30, 2011, 9:50:06 AM
to Clint Talbert
Worth noting that we have had the problem of try running fewer things
than mozilla-central, but never the other way around.

Justin Lebar

May 30, 2011, 10:54:39 AM
> Worth taking note that we have had the problem of try running less
> things than mozilla-central but never the other way around.

Isn't this almost always the case [1]? That is, try runs the tests, but they don't show up in TBPL.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=590526

Mike Connor

May 30, 2011, 11:03:07 AM
to Clint Talbert, dev-pl...@lists.mozilla.org
On 2011-05-28, at 2:46 AM, Clint Talbert wrote:

> Of course, new intermittent issues that are introduced will throw a wrench in these plans. Regardless of what we do, I think that intermittent oranges will be the Achilles heel of any automation solution to increase build turnaround time. Our best hope is to continue the war on orange and drive the orangefactor number down to <=1 and keep it there. But outside of the intermittent issues, we do have some options to make this idea into a viable approach.

So, before we invest in this approach, it would be good to get some idea of what we feel the benefit will be. I'm finding it hard to assess cost/benefit here, mostly because the benefit is sort of loosely defined. "Wasting cycles" is an easy goal to get behind, but we should argue based on data, as has been pointed out in many recent threads.

What I'd like to see addressed:

* How long are current cycles?
* What is our target cycle time?
* How close can we get simply by improving the test suites to run faster, without sacrificing coverage?
* Is cycle time on unit tests a significant proportion of our overall cycle times?

-- Mike

Chris AtLee

May 30, 2011, 11:34:25 AM
Coming in a bit late to this party...but here's my $0.02

On 25/05/11 08:30 AM, Justin Lebar wrote:
> There are two separate issues: How long do builds take, and how long
> do tests take? To address only the first one:
>
> We have an existence proof that there's a lot of headway we could
> make in terms of build speed. My mac takes 12m to do a debug build
> from scratch. I have a Linux box which is similarly fast. So
> there's no question that we could speed up mac builds and debug (i.e.
> non-PGO) Linux builds by getting faster machines, right?

I think our Linux builds are fine as they are. We get about 20 minute
builds for non-PGO. Mac and Windows builds are the real bottleneck here.

> IIRC, we don't use pymake on Windows builds. If we did, that would
> be a huge speedup for non-PGO builds, because we could use -j4 or
> greater.

We do not, correct. I tested a debug build (non-PGO) with gnu make -j1
vs pymake -j4, and the times were within a minute of each other. It'll
be worth testing again once we get the hard drives in the windows
builders replaced.

> Ted proposed not running PGO unless we ask for it; that would make
> release builds appear much faster on Linux and especially Windows.

Totally agree that we should do this. Is there consensus here?

Mike Connor

May 30, 2011, 11:39:33 AM
to Chris AtLee, dev-pl...@lists.mozilla.org

On 2011-05-30, at 11:34 AM, Chris AtLee wrote:

>> Ted proposed not running PGO unless we ask for it; that would make
>> release builds appear much faster on Linux and especially Windows.
>
> Totally agree that we should do this. Is there consensus here?

I believe the sum total is "yes, there is consensus, as long as we still get good coverage from the PGO nightly builds."

-- Mike

Chris AtLee

May 30, 2011, 11:57:53 AM
On 30/05/11 11:03 AM, Mike Connor wrote:
> On 2011-05-28, at 2:46 AM, Clint Talbert wrote:
>
>> Of course, new intermittent issues that are introduced will throw a wrench in these plans. Regardless of what we do, I think that intermittent oranges will be the Achilles heel of any automation solution to increase build turnaround time. Our best hope is to continue the war on orange and drive the orangefactor number down to <=1 and keep it there. But outside of the intermittent issues, we do have some options to make this idea into a viable approach.
>
> So, before we invest in this approach, it would be good to get some idea of what we feel the benefit will be. I'm finding it hard to assess cost/benefit here, mostly because the benefit is sort of loosely defined. "Wasting cycles" is an easy goal to get behind, but we should argue based on data, as has been pointed out in many recent threads.
>
> What I'd like to see addressed:
>
> * How long are current cycles?

For builds, the biggest offenders here are:
Windows opt builds (average 3h 4m 21s)
Mac opt builds (average 2h 35m 13s)

NB that Linux opt builds are now up to 1h 23m 37s on average since
enabling PGO.

For tests, debug tests take a long time. e.g.
XP debug mochitest-other (average 1h 24m 22s)
Fedora debug mochitests-4/5 (average 1h 21m 33s)
Win7 debug mochitest-other (average 1h 14m 54s)

Slowest build is windows at 3h4m, and the slowest opt windows test is
win7 xpcshell at 41m, so that brings our cycle time up to 3h45m assuming
we can start all builds/tests promptly.

> * What is our target cycle time?

As fast as possible?

> * How close can we get simply by improving the test suites to run faster, without sacrificing coverage?

For debug builds, tests are 50% of the cycle time. I'd SWAG that for
non-PGO linux and windows opt builds the same would hold true.

> * Is cycle time on unit tests a significant proportion of our overall cycle times?

Yes. Specifically debug unit tests. Of our opt tests, the slowest are:
Win7 talos dromaeo (0h 43m 1s)
WinXP talos dromaeo (0h 41m 51s)
Win7 opt xpcshell (0h 41m 48s) (which is about 2x the time it takes on
other platforms)

Robert Kaiser

May 30, 2011, 12:09:16 PM
Chris AtLee schrieb:

> For builds, the biggest offenders here are:
> Windows opt builds (average 3h 4m 21s)

That's probably the PGO cost, as mentioned a number of times in this thread.

> Mac opt builds (average 2h 35m 13s)

Those are universal builds, so actually we are doing two build runs
there. I guess the only things we can do there is either beef up the
hardware or make our build process faster in general (which is probably
quite hard).

> NB that Linux opt builds are now up to 1h 23m 37s on average since
> enabling PGO.

PGO has a cost everywhere as it's doing a second run of things - still
hugely better than Windows here, though.

>> * What is our target cycle time?
>
> As fast as possible?

Sure, but it helps to set a target for the maximum time we should have to
wait for results.

Chris AtLee

May 30, 2011, 12:15:09 PM
On 30/05/11 12:09 PM, Robert Kaiser wrote:
> Chris AtLee schrieb:
>> For builds, the biggest offenders here are:
>> Windows opt builds (average 3h 4m 21s)
>
> That's probably the PGO cost, as mentioned a number of times in this
> thread.
>
>> Mac opt builds (average 2h 35m 13s)
>
> Those are universal builds, so actually we are doing two build runs
> there. I guess the only things we can do there is either beef up the
> hardware or make our build process faster in general (which is probably
> quite hard).

There are a few approaches here:
* Beefier build machines. We're already looking at options here.
* Single-pass build instead of two-pass build. This requires a lot of
make/configure/etc. work, but would be awesome to do.
* Build each architecture in parallel on different machines, and unify
them after. Theoretically possible, practically very difficult.

Chris AtLee

May 30, 2011, 12:17:45 PM
On 30/05/11 12:15 PM, Chris AtLee wrote:
> On 30/05/11 12:09 PM, Robert Kaiser wrote:
>> Chris AtLee schrieb:
>>> For builds, the biggest offenders here are:
>>> Windows opt builds (average 3h 4m 21s)
>>
>> That's probably the PGO cost, as mentioned a number of times in this
>> thread.
>>
>>> Mac opt builds (average 2h 35m 13s)
>>
>> Those are universal builds, so actually we are doing two build runs
>> there. I guess the only things we can do there is either beef up the
>> hardware or make our build process faster in general (which is probably
>> quite hard).
>
> There are a few approaches here:
> * Beefier build machines. We're already looking at options here.
> * Single-pass build instead of two-pass build. This requires a lot of
> make/configure/etc. work, but would be awesome to do.
> * Build each architecture in parallel on different machines, and unify
> them after. Theoretically possible, practically very difficult.

Actually, just had a thought here. Could we do 64-bit only opt builds
during the day, and have our nightlies and release builds be universal
32/64-bit?

Armen Zambrano Gasparnian

May 30, 2011, 12:22:35 PM
to mozilla.de...@googlegroups.com, Justin Lebar
I was saying that:
* mozilla-central's coverage >= try's coverage
but never
* mozilla-central's coverage < try's coverage

I will make a comment on the bug for a workaround.

cheers,
Armen

Mike Connor

May 30, 2011, 12:23:43 PM
to Chris AtLee, dev-pl...@lists.mozilla.org

Are we testing on 32 and 64 bit machines, or just 64 bit?

-- Mike

Armen Zambrano Gasparnian

May 30, 2011, 12:26:41 PM
to Mike Connor, Chris AtLee, dev-pl...@lists.mozilla.org
We would lose the ability to do optimized tests on 10.5 testers.
We would still have the 10.5 debug tests.

Mike Connor

May 30, 2011, 12:27:54 PM
to Armen Zambrano Gasparnian, Chris AtLee, dev-pl...@lists.mozilla.org

Do we have any data on failures that happen on 10.5 but not 10.6?

-- Mike

Zandr Milewski

May 30, 2011, 12:33:02 PM
to dev-pl...@lists.mozilla.org
[oops, meant to send this to the list]

On 5/30/11 9:23 AM, Mike Connor wrote:

>> Actually, just had a thought here. Could we do 64-bit only opt
>> builds during the day, and have our nightlies and release builds be
>> universal 32/64-bit?
>

> Are we testing on 32 and 64 bit machines, or just 64 bit?

All of the test machines are capable of running 64-bit binaries. I
don't know which binaries we actually test on which machines.

Having said that, none of the test machines run a 64-bit kernel. Apple
has only been enabling 64-bit kernels by default very recently.

http://support.apple.com/kb/HT3770

Zack Weinberg

May 30, 2011, 12:42:05 PM
On 2011-05-30 9:09 AM, Robert Kaiser wrote:
>>> * What is our target cycle time?
>>
>> As fast as possible?
>
> Sure, but it helps to set a target we really want to be the max of what
> we need to wait to have results.

If a complete cycle, push to all results available, took less than half
an hour, then IMO it would be reasonable to forbid pushes while results
from a previous cycle were pending. And that would render the "what do
we back out when we discover an orange" argument moot. (We would have
to have a landing queue all the time, but I think that's ok.)

So that's my suggestion for target cycle time.

zw

Armen Zambrano Gasparnian

May 30, 2011, 12:44:12 PM
to Mike Connor, Chris AtLee, dev-pl...@lists.mozilla.org
Hold on, I probably replied to this incorrectly (from zandr's email).

I was under the assumption that we were testing the 32-bit side of the
Mac bundle, but if we are capable of running the 64-bit side of it on the
10.5 machines, we should be fine.

Can anyone confirm this?

-- armenzg

Kyle Huey

May 30, 2011, 12:56:05 PM
to Chris AtLee, dev-pl...@lists.mozilla.org
Replies inline

On Mon, May 30, 2011 at 8:57 AM, Chris AtLee <cat...@mozilla.com> wrote:

> On 30/05/11 11:03 AM, Mike Connor wrote:
>
>> On 2011-05-28, at 2:46 AM, Clint Talbert wrote:
>>
>> Of course, new intermittent issues that are introduced will throw a
>>> wrench in these plans. Regardless of what we do, I think that intermittent
>>> oranges will be the Achilles heel of any automation solution to increase
>>> build turnaround time. Our best hope is to continue the war on orange and
>>> drive the orangefactor number down to <=1 and keep it there. But outside of
>>> the intermittent issues, we do have some options to make this idea into a
>>> viable approach.
>>>
>>
>> So, before we invest in this approach, it would be good to get some idea
>> of what we feel the benefit will be. I'm finding it hard to assess
>> cost/benefit here, mostly because the benefit is sort of loosely defined.
>> "Wasting cycles" is an easy goal to get behind, but we should argue based
>> on data, as has been pointed out in many recent threads.
>>
>> What I'd like to see addressed:
>>
>> * How long are current cycles?
>>
>
> For builds, the biggest offenders here are:
> Windows opt builds (average 3h 4m 21s)
>

There's not much we can do here.


> Mac opt builds (average 2h 35m 13s)
>

Bug 417044 would help *a lot* here.

> NB that Linux opt builds are now up to 1h 23m 37s on average since enabling
> PGO.
>
> For tests, debug tests take a long time. e.g.
> XP debug mochitest-other (average 1h 24m 22s)
>

Can you break this out by suite? I wouldn't be surprised if mochitest-a11y
is a large chunk of this for stupid reasons.


> Fedora debug mochitests-4/5 (average 1h 21m 33s)
>

What are the N (5, 10, whatever) tests here that take the longest?


> Win7 debug mochitest-other (average 1h 14m 54s)
>

Same as for XP.


> Slowest build is windows at 3h4m, and the slowest opt windows test is win7
> xpcshell at 41m, so that brings our cycle time up to 3h45m assuming we can
> start all builds/tests promptly.


AIUI xpcshell starts a new process for each test. I wonder if we could run
multiple tests in parallel? I believe the js shell tests do this.
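
A sketch of that kind of parallelism (the xpcshell command line here is a
placeholder, not the real harness invocation):

    # Sketch: since each xpcshell test is its own process anyway, a small
    # pool can keep several running at once.  The command line below is a
    # placeholder, not the real harness invocation.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    def run_one(test_path):
        proc = subprocess.run(["xpcshell", test_path],
                              capture_output=True, text=True)
        return test_path, proc.returncode == 0

    def run_all(test_paths, workers=4):
        # Threads suffice here; the real work happens in the child processes.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return dict(pool.map(run_one, test_paths))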

>> * What is our target cycle time?
>
> As fast as possible?

Right.

>> * How close can we get simply by improving the test suites to run faster,
>> without sacrificing coverage?
>
> For debug builds, tests are 50% of the cycle time. I'd SWAG that for
> non-PGO linux and windows opt builds the same would hold true.
>
>> * Is cycle time on unit tests a significant proportion of our overall
>> cycle times?
>
> Yes. Specifically debug unit tests. Of our opt tests, the slowest are:
> Win7 talos dromaeo (0h 43m 1s)
> WinXP talos dromaeo (0h 41m 51s)
> Win7 opt xpcshell (0h 41m 48s) (which is about 2x the time it takes on
> other platforms)
>

> _______________________________________________
> dev-planning mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-planning
>

- Kyle

Armen Zambrano Gasparnian

May 30, 2011, 12:59:46 PM
to Mike Connor, Chris AtLee, dev-pl...@lists.mozilla.org
I checked with zandr: when the Mac build is run on 10.5, it is forced to
the 32-bit side (we looked at Activity Monitor and checked the "Kind"
column, which said "Intel").

If we could force the 64-bit side of it on the 10.5 machines, would we
care about double building?

-- armenzg

Mike Connor

May 30, 2011, 1:17:47 PM
to Chris AtLee, dev-pl...@lists.mozilla.org

On 2011-05-30, at 11:57 AM, Chris AtLee wrote:

> On 30/05/11 11:03 AM, Mike Connor wrote:
>> On 2011-05-28, at 2:46 AM, Clint Talbert wrote:
>>
>> * What is our target cycle time?
>
> As fast as possible?

I prefer achievable and incremental milestones for any project, since it makes it easier to prioritize well. Taken to the extreme, we could have a single machine for each test, and get cycle time to < 2 minutes.

How about we start by targeting 30 minutes as the timeframe from build-done to test-done? That likely means we need to cut run time to < 25 minutes for every test job. This would be a significant improvement on the current situation, and something we can target for Q3, just by optimizing specific jobs.

-- Mike

Chris AtLee

May 30, 2011, 2:18:16 PM
>> For tests, debug tests take a long time. e.g.
>> XP debug mochitest-other (average 1h 24m 22s)
>>
>
> Can you break this out by suite? I wouldn't be surprised if mochitest-a11y
> is a large chunk of this for stupid reasons.

mochitest-chrome is ~15 minutes
mochitest-browser-chrome is ~48 minutes
mochitest-a11y is ~5 minutes (although seems to range a bunch)
mochitest-ipcplugins is 1 minute

>> Fedora debug mochitests-4/5 (average 1h 21m 33s)
>>
>
> What are the N (5, 10, whatever) tests here that take the longest?

I'm not sure how to tell. You can probably find out by looking at some
test logs.

Robert Kaiser

May 30, 2011, 2:22:06 PM
Chris AtLee schrieb:

> * Single-pass build instead of two-pass build. This requires a lot of
> make/configure/etc. work, but would be awesome to do.

Right, I forgot about that. I wonder what roadblocks there still are; I
know we had some tries at this some time ago and there were problems,
but we have changed a number of things since then. And nobody would miss
unify, for sure. ;-)

Robert Kaiser

--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community should think about. And most of the
time, I even appreciate irony and fun! :)

Mike Connor

May 30, 2011, 2:25:47 PM
to Chris AtLee, dev-pl...@lists.mozilla.org

On 2011-05-30, at 2:18 PM, Chris AtLee wrote:

>>> For tests, debug tests take a long time. e.g.
>>> XP debug mochitest-other (average 1h 24m 22s)
>>>
>>
>> Can you break this out by suite? I wouldn't be surprised if mochitest-a11y
>> is a large chunk of this for stupid reasons.
>
> mochitest-chrome is ~15 minutes
> mochitest-browser-chrome is ~48 minutes
> mochitest-a11y is ~5 minutes (although seems to range a bunch)
> mochitest-ipcplugins is 1 minute

Can we split -browser-chrome into a separate suite? Then we'd have ~21 and ~48 minutes. It doesn't save us actual machine time, but it should help end-to-end times.

-- Mike

Armen Zambrano Gasparnian

May 30, 2011, 2:40:51 PM
We are actually trying to unify jobs to reduce the cost of reboots (bug
659328).
We can improve end-to-end time, but if we worsen the wait times I don't
think we are winning much.

-- armenzg

Boris Zbarsky

May 30, 2011, 4:20:07 PM
On 5/30/11 2:40 PM, Armen Zambrano Gasparnian wrote:
> We can improve end-to-end time but if we worsen the wait times I don't
> think we are winning much.

Given a fixed end-to-end time, wait times can be reduced by increasing
pool size (which we're doing anyway, right?).

So we should be looking into both.

-Boris

Boris Zbarsky

May 30, 2011, 4:21:03 PM
On 5/30/11 12:59 PM, Armen Zambrano Gasparnian wrote:
> I checked with zandr: when the Mac build is run on 10.5, it is forced to
> the 32-bit side (we looked at Activity Monitor and checked the "Kind"
> column, which said "Intel").
>
> If we could force the 64-bit side of it on the 10.5 machines, would we
> care about double building?

Running the 64-bit build on 10.5 doesn't work right last I checked due
to OS-level bugs. That's why we default to 32-bit on 10.5 and 64-bit on
10.6.

-Boris

Chris AtLee

May 30, 2011, 4:22:25 PM