Project Stockwell - February 2017 update

jma...@mozilla.com

Feb 7, 2017, 12:40:04 PM

This is the second update of Project Stockwell (first update: https://goo.gl/1X31t8).

This month we are asking that intermittent failures that occur >=30 times/week be resolved within 2 weeks of being triaged.

Yesterday we had these stats:
Orange Factor: 10.75 (https://goo.gl/qvFbeB)
count(high_frequency_bugs): 61

Last month we had these stats:
Orange Factor: 13.76 (https://goo.gl/o5XOof)
count(high_frequency_bugs): 42
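
For readers who want the month-over-month change rather than the raw numbers, here is a minimal Python sketch that computes it from the two sets of stats above. The helper name percent_change and the dictionary keys are illustrative only and are not part of any Orange Factor tooling.

```python
def percent_change(old, new):
    """Return the relative change from old to new, as a percentage."""
    return (new - old) / old * 100.0


# Figures copied from the two stat blocks above.
orange_factor = {"last_month": 13.76, "yesterday": 10.75}
high_frequency_bugs = {"last_month": 42, "yesterday": 61}

of_change = percent_change(orange_factor["last_month"], orange_factor["yesterday"])
bug_change = percent_change(
    high_frequency_bugs["last_month"], high_frequency_bugs["yesterday"]
)

# A negative value means the number went down since last month.
print(f"Orange Factor change: {of_change:+.1f}%")
print(f"High-frequency bug count change: {bug_change:+.1f}%")
```

The two numbers move in opposite directions: the Orange Factor fell while the count of high-frequency bugs rose.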

For more details on the bugs and what we are working on, see this recent blog post:
https://elvis314.wordpress.com/2017/02/07/project-stockwell-february-2017/

Thanks for helping out with intermittent test failures when we ping you about them!

Bill McCloskey

Feb 8, 2017, 12:33:40 AM

to Joel Maher, dev-platform

Hi Joel,
I spent about an hour tonight trying to debug a test failure, and I'm
writing this email in frustration at how difficult it is. It seems like the
process has actually gotten a lot worse over the last few years (although
it was never good). Here's the situation I ran into:

A test is failing on try. I want to reproduce it. Assume that running the
test by itself isn't sufficient. I would like to run whatever set of tests
actually ran together on the test machine in a single Firefox invocation. I
don't want any more tests to run than those. I can't figure out any way to
do that.

I can pass a directory to |mach mochitest|. But that has a number of
problems:
- It also runs every subdirectory recursively inside that directory. Often
that includes way more tests. I can't figure out any way to stop it from
doing this. I tried the "--chunk-by-dir" option, but it complains that the
argument is supposed to be an integer. What is this option for?
- |mach mochitest| runs all flavors of tests even though I only want one.
There is the --flavor option to disable that. But I have never figured out
how to use it correctly. No matter what I do, some irrelevant devtools or
a11y or plugin tests seem to run instead of what I want.
- There is a --start-at option that should allow me to start running tests
near the one that I want. But it never seems to work either. I'm not sure
if it's confounded by the two problems above, or if it's just completely
broken.

We could easily fix this by printing in the tinderbox log the mach command
that you need to run in order to run the tests for a particular directory
(and making that discoverable through treeherder).

I want to emphasize that, from a developer's perspective, this is the
second most basic thing that I could ask for. (Simply running a test by
itself is the most basic, and it works fine.) Running tests by directory in
automation has been a huge improvement, but we're not really earning the
dividends from it because it's so hard to get the same behavior locally.

Anyway, sorry about the rant. And sorry to pick on your email. But it's
frustrating to see all these advanced ideas being proposed when we can't
even get basic stuff right.

As an aside, I would also like to complain about the decision to strip a
lot of the useful information out of test logs. I understand this was done
to make the tests faster, and I can "just" check in a patch to add
"SimpleTest.requestCompleteLog()" to the intermittent test. But why didn't
we instead figure out why logging was so slow and fix that? Fundamentally,
it doesn't seem like saving 50MB of log data to S3 should take very long.

-Bill

Mike Conley

Feb 10, 2017, 10:24:56 AM

to dev-pl...@lists.mozilla.org

There's good feedback in here. Are some of these known, jmaher? Are any
intentional choices, or should we just start turning these into bugs to
get fixed?

Andrew Halberstadt

Feb 14, 2017, 12:27:27 PM

to Mike Conley, dev-pl...@lists.mozilla.org

Just noticed no one looped back here. Joel filed bug 1337844
<https://bugzilla.mozilla.org/show_bug.cgi?id=1337844> and bug 1337839
<https://bugzilla.mozilla.org/show_bug.cgi?id=1337839>. There has been some
discussion there. To summarize, running tests locally is currently
optimized towards "Run all tests related to code in <dir>" instead of "Run
all tests in <job in automation>". Optimizing for one by default comes
with a trade-off for the other.

That being said, I think there is some low hanging fruit that could make
the overall situation better, namely:

* Ability to run manifests in the args, e.g. ./mach mochitest
path/to/manifest.ini (this would bypass subdirs)
* An overall summary (bug 1209463
<https://bugzilla.mozilla.org/show_bug.cgi?id=1209463>)
* A mode to prevent multi-Firefox instances from running (we would error
out if e.g. multiple dirs or subsuites would be run)
* Advertising/bootstrapping aliases for common configurations, e.g. add the
following to ~/.mozbuild/machrc:
[alias]
mochitest-media = mochitest -f plain --subsuite media
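
To make the alias idea concrete, here is a rough Python sketch of how an [alias] entry like the one above could be expanded into a full argument list. This is not mach's actual implementation; the function expand_alias and the example path dom/media/test are assumptions chosen purely for illustration.

```python
import configparser
import shlex

# The same alias section shown above, inlined so the sketch is self-contained.
MACHRC_TEXT = """
[alias]
mochitest-media = mochitest -f plain --subsuite media
"""


def expand_alias(argv, machrc_text):
    """Replace argv[0] with its alias expansion, if one is defined.

    Hypothetical helper: it only illustrates the substitution and is not
    how mach itself resolves aliases.
    """
    parser = configparser.ConfigParser()
    parser.read_string(machrc_text)
    if not argv or not parser.has_section("alias"):
        return list(argv)
    aliases = parser["alias"]
    if argv[0] in aliases:
        # Split the alias value like a shell would, then keep the extra args.
        return shlex.split(aliases[argv[0]]) + list(argv[1:])
    return list(argv)


# Illustrative invocation; the test path is an assumed example.
expanded = expand_alias(["mochitest-media", "dom/media/test"], MACHRC_TEXT)
print(expanded)
```

Under these assumptions, the alias simply saves typing the flavor and subsuite flags each time; a command that is not an alias is passed through unchanged.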

I agree that this kind of stuff is important, though I can't make promises on
a timeline.
-Andrew

L. David Baron

Feb 14, 2017, 5:25:30 PM

to bi...@mozilla.com, Joel Maher, dev-platform

On Tuesday 2017-02-07 21:33 -0800, Bill McCloskey wrote:
> I spent about an hour tonight trying to debug a test failure, and I'm
> writing this email in frustration at how difficult it is. It seems like the
> process has actually gotten a lot worse over the last few years (although
> it was never good). Here's the situation I ran into:

Another aspect of debugging test failures that has gotten worse
recently:

* If you have an intermittent that's actually affecting the tree,
it's become harder to see the range of TEST-UNEXPECTED-FAIL
messages that are occurring. These used to be present in the
comments that tbplbot made on bugs, but now it requires following
a link for each log in the orangefactor interface. (Having this
range was useful to me today in fixing 1159532, although clicking
through to 6 logs was sufficient to help understand the problem.)

This also makes it much harder to tell if bugs are being
mis-classified (e.g., two different problems being starred into
one bug).

(I thought the point of structured logging was to make it easier to
get this sort of data.)

-David

--
𝄞 L. David Baron http://dbaron.org/ 𝄂
𝄢 Mozilla https://www.mozilla.org/ 𝄂
Before I built a wall I'd ask to know
What I was walling in or walling out,
And to whom I was like to give offense.
- Robert Frost, Mending Wall (1914)

Joel Maher

Feb 15, 2017, 4:24:32 PM

to L. David Baron, dev-platform, McCloskey, William

I wonder if we could make a single link in orangefactor that would give you
the range of TEST-UNEXPECTED-FAIL messages to help with this. I filed
https://bugzilla.mozilla.org/show_bug.cgi?id=1339937 to track this. Please
do offer suggestions/use cases in that bug.