http://tinderbox.mozilla.org/showbuilds.cgi?tree=Mozilla2
Currently warns:
The tree is OPEN (but beware of bug 442169).
This is an easy way for us to start ignoring test results altogether.
I believe there are a number of rather straightforward things we
can do right now to a) make our unit tests more deterministic and b) make
the analysis of test failures much easier.  I've documented those
things here:
https://bugzilla.mozilla.org/showdependencytree.cgi?id=443323&hide_resolved=1
Some of these changes, like
https://bugzilla.mozilla.org/show_bug.cgi?id=443090
are straightforward but will require coordinated changes to the unit 
test infrastructure and the tinderbox setup to avoid breaking everything. 
They will also require help from multiple people.  I'd like us to take 
the time right now to try and make progress on this stuff - even if we 
have to close the tree to land the fixes and divert folks temporarily 
from other tasks.
Please add your thoughts on specific solutions to the bugs (and/or file 
new ones) or add them here.
Best,
Schrep
1) Two additional unittest machines are now visible again on tinderbox 
for mozilla-central. qm-win2k3-moz2-01 and qm-centos5-03 now give us 
duplication on win32 and linux, which should help with debugging.
2) I've updated tinderbox to now point instead to bug#438871, where 
we've been tracking a bunch of intermittent unittest problems. 
Bug#442169 was just one of those problems.
tc
John.
My pet bug: https://bugzilla.mozilla.org/show_bug.cgi?id=438954. If
that was set up, then whenever an orange happened on the test box we'd
have a fighting chance of debugging it.
Rob
1) Ted has been making great progress on getting our logging 
rationalized (https://bugzilla.mozilla.org/show_bug.cgi?id=443090)
2) Coop is making great progress on getting better output for oranges in make 
check that currently report no obvious failures 
(https://bugzilla.mozilla.org/show_bug.cgi?id=438324)
3) Lukas is getting the basic log consolidation infrastructure up - once 
443090 completes this should move fast
4) Sayre has filed a number of real code bugs as a result of valgrind 
analysis (see dependency tree of 
https://bugzilla.mozilla.org/show_bug.cgi?id=438871)
5) Gavin has been doing some great debugging work on the mochitest 
failures (https://bugzilla.mozilla.org/show_bug.cgi?id=431745)
6) We are bringing up a few physical boxes (minis) this week to augment 
the VMs for side-by-side comparisons.
From #4 it is clear that at least some of the failures are related to 
real code bugs that just get tickled by the different i/o, cpu, and 
timing behavior of the VMs.  As further proof of this, the physical 
box (qm-moz2mini01) running OSX fails about once a day on mochitest 
with similar strange behavior 
(http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1215429798.1215436152.29149.gz&fulltext=1). 
So there are *definitely* code or test issues causing problems.
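As an illustration of why VM timing variance tickles these bugs (a minimal Python sketch with hypothetical function names, not Mozilla code): a fixed-delay wait that was tuned on fast hardware races the work it is waiting for, while polling the condition itself under a generous deadline tolerates i/o, cpu, and scheduling jitter.

```python
import threading
import time

def start_async_work(result, delay):
    """Simulate work whose completion time varies with machine load."""
    def worker():
        time.sleep(delay)
        result["done"] = True
    threading.Thread(target=worker).start()

def fragile_wait(result):
    # Fixed sleep tuned on fast hardware: fails whenever a loaded VM
    # stretches the work past 50 ms.
    time.sleep(0.05)
    return result.get("done", False)

def robust_wait(result, deadline=2.0):
    # Poll until the condition holds or a generous deadline passes:
    # tolerant of i/o, cpu, and scheduling variation.
    end = time.monotonic() + deadline
    while time.monotonic() < end:
        if result.get("done"):
            return True
        time.sleep(0.01)
    return False

if __name__ == "__main__":
    slow = {}
    start_async_work(slow, delay=0.2)  # a "slow VM" run
    print(fragile_wait(slow))          # the intermittent orange
    slow2 = {}
    start_async_work(slow2, delay=0.2)
    print(robust_wait(slow2))          # waits on the event itself
```

The fragile version goes orange exactly when the machine is slow enough, which matches failures showing up on loaded VMs while only rarely hitting physical boxes.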
So if you get a valgrind bug or other related issue filed I'd appreciate 
it if you could move that to the top of your priority queue so we can 
get these machines cycling more reliably.
Super-big thanks to everyone who has dove into this - esp Ted, Sayre, 
Gavin, Coop, and Lukas.  You folks rock!
Cheers,
Schrep
The bad news is that we have what appears to be a set of permanent  
orange boxes (Linux and Windows PGO dep unit test boxes) on the  
Firefox 3 tree (cvsroot) which are preventing work on 1.9.0.2. Can we  
get some of the same attention?
cheers,
mike
When did it start? Did we release Firefox 3 with them orange?
- Rob
I see two failures recently:
qm-pmac03 (talos) is consistently orange.  The other two members of the 
mac triad are not.
qm-centos-01 is showing intermittent orange that was much worse before 
today.  Some of them look to be due to make check's lack of error reporting:
https://bugzilla.mozilla.org/show_bug.cgi?id=438324
If we could get r+ on this that would help!
Test failures look like:
REFTEST UNEXPECTED FAIL (LOADING): 
file:///builds/slave/trunk_centos5/mozilla/layout/reftests/bugs/413292-1.html
-- and --
*** 12913 ERROR FAIL | Test timed out. |  | 
/tests/docshell/test/navigation/test_opener.html
*** 12916 ERROR FAIL | Unable to restore focus, expect failures and 
timeouts. |  | 
/tests/docshell/test/navigation/test_popup-navigates-children.html
*** 12923 ERROR FAIL | Unable to restore focus, expect failures and 
timeouts. |  | /tests/docshell/test/navigation/test_reserved.html
*** 12934 ERROR FAIL | Unable to restore focus, expect failures and 
timeouts. |  | 
/tests/docshell/test/navigation/test_sibling-matching-parent.html
*** 12941 ERROR FAIL | Unable to restore focus, expect failures and 
timeouts. |  | /tests/docshell/test/navigation/test_sibling-off-domain.html
*** 12948 ERROR FAIL | Unable to restore focus, expect failures and 
timeouts. |  | /tests/docshell/test/test_bug344861.html
*** 12953 ERROR FAIL | Unable to restore focus, expect failures and 
timeouts. |  | /tests/docshell/test/test_bug369814.html
Thanks to the special joy of tinderbox renaming, "when did it start?"
is pretty much unanswerable.
However, the "perma" orange beltzner was talking about was two things:
the Linux part was the bug 431745 test_sleep_wake.js thing, and the
Win/PGO one was the way that build-on-checkin on a controlled and
essentially closed branch, combined with a very slow machine, makes
any orange look permanent. Assuming I'm getting the timing of beltzner's
message right, the box hit bug 427142 just before midnight the night
before, took a couple of hours to fail, then took a five hour rest,
then took 3.5 hours to fail out of a download manager test where it
threw trying to get various directories including CurProcD, followed
by a test_sleep_wake failure that was always nice for a slow timeout,
so that from start of orange until it turned green with just two
failed builds was just shy of 14 hours.
Unfortunately, other than the fix for test_sleep_wake that's since
landed in 1.9.0.x, I'm not sure what's fixable there.
Having a way to look back at the history of just one box, rather than
the PITA scroll-over, scroll-down, scroll-back, load-a-huge-table,
over-down-back routine, would be nice: beltzner could have seen it was
just two builds for the PGO box, and whatever it was before that (back
to the July 4th dawn of time; maybe mostly green, I haven't looked,
because it's a PITA). But actual changes to the way the waterfall
works are... rare. I've certainly never seen one.
Continuous builds are nice when you've got nothing to check in and you
want to see if the tree is going to green up (or just to increase your
chances of random green), but they're a pain when you have to check in
to an old stale branch full of mangy slow machines, miss catching the
start of a build by just a few minutes, and end up on the hook for
nearly two full cycles.
There's probably *something* there in that throwing getting CurProcD
thing, but, how on earth are you going to find it? Something happened
on an unwatched box that nobody has access to, and now it's gone. What
can we change, that will make it possible to debug that?
What can we change about the way we rename trees, to make it possible
to see whether or not it has happened before, when a box on the
Firefox4.0 tree does it a week after the current Firefox tree has been
renamed?
Actually, while I'm wishing for a pony, I'd like a herd, too: being
able to see both "A week of this box" and "A week of this class" (in
this case, all the Windows unit test boxes) would be lovely.
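For what the wished-for view might look like (a sketch only, against a hypothetical record store; tinderbox itself offers no such query and these names are made up), "a week of this box" and "a week of this class" are a few lines once the results are out of the waterfall:

```python
from datetime import datetime, timedelta

def box_history(records, box, days=7, now=None):
    """Return the last `days` of results for one box, newest first.

    `records` is an iterable of (timestamp, box_name, status) tuples,
    e.g. (datetime(...), "qm-win2k3-moz2-01", "orange").  This stands
    in for whatever store the waterfall would export.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=days)
    hits = [r for r in records if r[1] == box and r[0] >= cutoff]
    return sorted(hits, key=lambda r: r[0], reverse=True)

def class_history(records, boxes, days=7, now=None):
    """'A week of this class': merged histories of several boxes,
    e.g. all the Windows unit test boxes."""
    out = []
    for b in boxes:
        out.extend(box_history(records, b, days, now))
    return sorted(out, key=lambda r: r[0], reverse=True)
```

The hard part, of course, is not the query but getting the data out of the waterfall in the first place, and keeping it findable across tree renames.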
> There's probably *something* there in that throwing getting CurProcD
> thing, but, how on earth are you going to find it? Something happened
> on an unwatched box that nobody has access to, and now it's gone. What
> can we change, that will make it possible to debug that?
That throwing is perfectly normal.  Those are all caught, but our error 
reporting still likes to show them for whatever reason.
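A minimal sketch of the pattern being described (Python, hypothetical names, not the actual directory-service code): a fallback lookup where each provider miss raises, the caller catches and moves on, yet anything that reports every raised exception still surfaces the harmless misses as "failures".

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("dirsvc")

def get_dir(key, providers):
    """Ask each provider for `key`; the first one that has it wins.

    A provider that doesn't handle a key raises, loosely mirroring how
    a directory-service lookup can throw for keys a given provider
    doesn't supply.  (These names are illustrative, not Mozilla APIs.)
    """
    for provider in providers:
        try:
            return provider[key]          # may raise KeyError
        except KeyError:
            # Fully handled: we just try the next provider.  But a log
            # scraper that flags every reported exception would still
            # count this miss as a failure.
            log.warning("no %s from this provider (caught, harmless)", key)
    raise KeyError(key)

# Usage: the first provider misses CurProcD, the second supplies it.
providers = [{"ProfD": "/profile"}, {"CurProcD": "/builds/app"}]
```

The lookup succeeds, the warning is noise, and that is exactly what the throwing-on-CurProcD lines in the log amount to.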
Cheers,
Shawn