test tinderboxes, catching ASSERTIONs, and --enable-debug

L. David Baron

unread,

Jan 7, 2009, 8:24:13 PM1/7/09

to dev-pl...@lists.mozilla.org

I'd like to use our tinderbox test infrastructure better to reduce
the number of assertions we hit, and reduce the number of times new
assertions start appearing (regressions).

We've discussed doing this in the past, but we've repeatedly run
into the problem that we can't make assertions fatal because we
would need to fix all of the existing ones first.

So I have a patch in
https://bugzilla.mozilla.org/show_bug.cgi?id=472557 to allow
annotating the manifest for reftests and crashtests to say which
tests are known to assert (and how many times, with ranges allowed,
and also the the usual conditions). I'd like to do the same for
mochitest at some point, although the mechanics there may be a bit
harder.

However, getting the benefits of this on tinderbox (and also getting
the benefits of https://bugzilla.mozilla.org/show_bug.cgi?id=346922 )
requires that we run unit tests on builds built with --enable-debug.
Currently we don't do this.

(It should NOT require any changes to the current unit test
configuration other than doing the build with --enable-debug.)

I can see three ways to accomplish this:
(1) Make the tinderboxes that currently do debug builds and leak
tests (those labeled "leak test build" on
http://tinderbox.mozilla.org/Firefox/ ) also run all the unit
tests.

(2) Add an additional column to tinderbox with machines that are
just like the existing unit test tinderboxes (the ones labeled
"unit test" on http://tinderbox.mozilla.org/Firefox/ ), but
build with --enable-debug.

(3) Switch to --enable-debug on the existing unit test tinderboxes.

I think (1) or (2) are significantly better than (3), since (3)
replaces coverage of something closer to what we ship with coverage
of something further from what we ship.

The difference between (1) vs. (2) for developers is, I think, a
tradeoff between tinderbox cycle time and number of columns on
the tinderbox waterfall. (But if we exposed a reflection of the
buildbot waterfall that showed status for builds in progress, the
cycle time issue would largely disappear.)

I expect the amount of work to set this up is probably smallest for
(3) and largest for (1), although I could be wrong. That's really a
question for the people who would be doing the setting up.

But I wanted to post here to get feedback, particularly on:
* whether people agree we should do this
* whether others agree that (1) vs. (2) doesn't matter much, but
that they're both significantly better than (3)

-David

--
L. David Baron http://dbaron.org/
Mozilla Corporation http://www.mozilla.com/

Mike Shaver

unread,

Jan 7, 2009, 8:39:18 PM1/7/09

to L. David Baron, dev-pl...@lists.mozilla.org

On Wed, Jan 7, 2009 at 8:24 PM, L. David Baron <dba...@dbaron.org> wrote:
> (3) Switch to --enable-debug on the existing unit test tinderboxes.

Perhaps:

(4) Add --enable-assertions which can override the default
debug/non-debug setting of that flag, so that we minimize the
deviation from what we ship.

Mike

L. David Baron

unread,

Jan 7, 2009, 8:48:36 PM1/7/09

to Mike Shaver, dev-pl...@lists.mozilla.org

I think most of the code that's #ifdef DEBUG exists so that we can
make assertions about things; there's certainly a lot of code that's
explicitly |#ifdef DEBUG| only so we avoid (when not DEBUG) the
extra setup needed to make an assertion.

That said, --enable-debug does do two independent things that it
would be nice to configure independently: (a) define DEBUG and
(b) pass -g or similar to the compiler so it generates data that
debugging tools use for symbols, line numbers, etc.

John J. Barton

unread,

Jan 7, 2009, 11:23:43 PM1/7/09

to

L. David Baron wrote:
> On Wednesday 2009-01-07 20:39 -0500, Mike Shaver wrote:

>...

> That said, --enable-debug does do two independent things that it
> would be nice to configure independently: (a) define DEBUG and
> (b) pass -g or similar to the compiler so it generates data that
> debugging tools use for symbols, line numbers, etc.

You know this, but (a) is also two independent things, (a1) assertions
-- you want in this case (a2) tracing, -- you don't. The tracing hits
the start up time, hence testing time.

--enable-debugger-info-modules does (b) only. Maybe there is (a1) only....

jjb

Mike Shaver

unread,

Jan 8, 2009, 7:43:20 AM1/8/09

to John J. Barton, dev-pl...@lists.mozilla.org

On Wed, Jan 7, 2009 at 11:23 PM, John J. Barton
<johnj...@johnjbarton.com> wrote:

> L. David Baron wrote:
>>
>> That said, --enable-debug does do two independent things that it
>> would be nice to configure independently: (a) define DEBUG and
>> (b) pass -g or similar to the compiler so it generates data that
>> debugging tools use for symbols, line numbers, etc.
>
> You know this, but (a) is also two independent things, (a1) assertions --
> you want in this case (a2) tracing, -- you don't.

If you mean the JS engine's tracing JIT, then that's not correct --
the tracer is independent of the state of DEBUG (and DEBUG builds are
used to debug the tracer quite frequently!).

Mike

Clint Talbert

unread,

Jan 8, 2009, 10:08:30 AM1/8/09

to

On 1/7/09 7:24 PM, L. David Baron wrote:
..snip..

> I can see three ways to accomplish this:
> (1) Make the tinderboxes that currently do debug builds and leak
> tests (those labeled "leak test build" on
> http://tinderbox.mozilla.org/Firefox/ ) also run all the unit
> tests.
>
> (2) Add an additional column to tinderbox with machines that are
> just like the existing unit test tinderboxes (the ones labeled
> "unit test" on http://tinderbox.mozilla.org/Firefox/ ), but
> build with --enable-debug.
>
> (3) Switch to --enable-debug on the existing unit test tinderboxes.
>
>

I like choice 2.

Choice 1 multiplexes the set of things that the leak tinderboxes are
measuring. While it's pretty trivial to look at the rollup of numbers
and figure out if the box went orange due to leaks or assertions, I
think it'd be easier for everyone to be able to glance at the boxes and
know instantly why something went orange - if it's a leak box, it's
because we have a leak, if it's an assertion box, it's because we have
assertions. Granted, I also think the increased cycle time with this
option really overshadows this one.

Choice 3: Same reservation as you. Ideally, after we break apart the
dependency on the build tree and the tests for Fennec (bug 462889 and
bug 421611) I want to do the same for these, so that these tinderboxen
will pull the same shipping bits from the build tinderbox and run tests
on that. Therefore, I don't support moving these boxes any further from
testing what we ship. I only want them to move closer to test what we ship.

And besides, we already need a mile long screen to see all the columns
anyway. How bad are three more? ;-)

My vote is for choice 2 unless some better option comes up.

Clint

Ted Mielczarek

unread,

Jan 8, 2009, 11:13:28 AM1/8/09

to

On Jan 7, 8:24 pm, "L. David Baron" <dba...@dbaron.org> wrote:
> (1) Make the tinderboxes that currently do debug builds and leak
> tests (those labeled "leak test build" on

> http://tinderbox.mozilla.org/Firefox/) also run all the unit
> tests.

I would vote for this option, and we've discussed this in the past, in
fact. Despite Clint's objection, I would note that the leak boxes
don't currently test much of anything that's useful. They load
bloatcycle.html and then shut down, so the test just proves that we
don't leak while doing that. They do have a fast build cycle, but
that's mostly because they're not doing much. Having assertion
coverage over all of our test suites is a big enough win that we
should get this done.

On a related note, one of my goals for this quarter is to make it
possible to run the unit tests on a pre-packaged build, such that we
could eventually have the unit test machines not do their own builds,
and instead run tests on the builds from the build machines. The
downside to this is that we can't --enable-logrefcnt on the build
machines, since we're perf testing those builds as well, so we'd lose
the leak testing in Mochitest, which would be bad. Any of your
proposed solutions would allow us to keep leak coverage for Mochitest,
so they're all equal in that regard.

With regards to difficulty of implementation, IANARE (I am not a
release engineer), but I speculate that it's something like:
3 - Easiest, just add --enable-debug to mozconfig
1 - Add the existing unit test steps to the debug machine buildbot
configs
2 - Create new buildbot configurations, make sure we have enough build
slaves to handle the extra load

-Ted

timeless

unread,

Jan 8, 2009, 11:16:17 AM1/8/09

to Mike Shaver, dev-pl...@lists.mozilla.org, John J. Barton

On Thu, Jan 8, 2009 at 2:43 PM, Mike Shaver <mike....@gmail.com> wrote:
> If you mean the JS engine's tracing JIT, then that's not correct --
> the tracer is independent of the state of DEBUG (and DEBUG builds are
> used to debug the tracer quite frequently!).

he meant --enable-debugger-info-modules which is what enables
debugging and profiling of native code.

John J. Barton

unread,

Jan 8, 2009, 11:28:33 AM1/8/09

to

I forgot that 'tracing' has a special meaning to mozilla. I meant:

http://en.wikipedia.org/wiki/Tracing_(software)
In software engineering, tracing is a specialized use of logging to
record information about a program's execution.

aka Lots of stuff on the OS console.

jjb

L. David Baron

unread,

Jan 8, 2009, 11:28:47 AM1/8/09

to Ted Mielczarek, dev-pl...@lists.mozilla.org

On Thursday 2009-01-08 08:13 -0800, Ted Mielczarek wrote:
> With regards to difficulty of implementation, IANARE (I am not a
> release engineer), but I speculate that it's something like:
> 3 - Easiest, just add --enable-debug to mozconfig
> 1 - Add the existing unit test steps to the debug machine buildbot
> configs
> 2 - Create new buildbot configurations, make sure we have enough build
> slaves to handle the extra load

For (1), we might need more build slaves as well, since it will add
significantly to the cycle times for the builders.

For (2), we might actually have some of the configuration sitting
around from the previous attempt at debug unit testers; it would
just need to have the fatal-asserts part removed from
reftest/crashtest/mochitest (although not unit test, I think).

Given the mixed opinions about (1) vs. (2), I think it might be
reasonable to let the choice come down to which is easier to set up.

Ben Hearsum

unread,

Jan 8, 2009, 12:32:38 PM1/8/09

to John O'Duinn

On 1/7/09 8:24 PM, L. David Baron wrote:
> I can see three ways to accomplish this:
> (1) Make the tinderboxes that currently do debug builds and leak
> tests (those labeled "leak test build" on
> http://tinderbox.mozilla.org/Firefox/ ) also run all the unit
> tests.
>
> (2) Add an additional column to tinderbox with machines that are
> just like the existing unit test tinderboxes (the ones labeled
> "unit test" on http://tinderbox.mozilla.org/Firefox/ ), but
> build with --enable-debug.
>

As has been mentioned further down either of these options requires more
slaves. I know we're currently underserved, especially on Windows, so
this would currently be a blocking issue.

> (3) Switch to --enable-debug on the existing unit test tinderboxes.
>
>

None of these are particularly complicated things on the RelEng side,
but but I don't think option 2) makes a lot of sense. If we're going to
run unit tests with --enable-debug I don't see a lot of sense is running
the old leak tests.

I'd favour a combination of (1) and (3) - run unittest with
--enable-debug which replace the existing leak tests. I don't know if
there are pitfalls to doing this, however.

(And just to be clear, I think this would/will be great but I can't
commit to taking on the RelEng side of it.)

Sergey Yanovich

unread,

Jan 9, 2009, 1:38:43 PM1/9/09

to

On 2009-01-08 03:24 +0200, L. David Baron wrote:
> (3) Switch to --enable-debug on the existing unit test tinderboxes.

Not running unit tests on release builds may allow for appearance of
elusive release-only failures, which are exceptionally difficult to locate.

--
Sergey Yanovich

Ben Hearsum

unread,

Jan 9, 2009, 2:03:48 PM1/9/09

to

Er. We have never run unit tests on the release builds, so it's not a
factor here. (Yes, we should be running them on release builds).

Sergey Yanovich

unread,

Jan 9, 2009, 3:30:03 PM1/9/09

to

By 'release builds' I mean builds configured just as release builds are:
optimized, with --disable-debug. This is the current unit test tinderbox
config, right?

--
Sergey Yanovich

Ben Hearsum

unread,

Jan 9, 2009, 4:10:49 PM1/9/09

to Sergey Yanovich

Gotcha. Yeah, that's a fairly good point, then.

L. David Baron

unread,

Jan 9, 2009, 4:55:15 PM1/9/09

to Sergey Yanovich, dev-pl...@lists.mozilla.org

The current unit test tinderboxes are a lot closer to the releases
than an --enable-debug build would be. But there are some
differences. For example, on Windows, we use profile-guided
optimization for releases but not unit tests.

That said, I also made this point in the post in which I started the
thread, where I wrote:
# I think (1) or (2) are significantly better than (3), since (3)
# replaces coverage of something closer to what we ship with coverage
# of something further from what we ship.