
Proposed policy change: reusability of tests by other browsers


Aryeh Gregor

Aug 16, 2012, 5:31:01 AM
to dev-pl...@lists.mozilla.org, Ms2ger
Mozilla has a long-standing policy that with certain limited
exceptions, all code changes must be accompanied by a test. Following
this policy has given us an excellent and steadily growing regression
test-suite. Some of these tests are very specific to Mozilla, but a
substantial fraction test our conformance to web standards that other
UAs implement. Unfortunately, new tests are as a rule written in very
Mozilla-specific formats. I think we would do a better job of
fulfilling our mission to advance the web if we made more of an effort
to write our tests in such a way that other UAs could more easily use
them, where relevant. I believe we could do this without
significantly increasing the burden on test-writers.

To that end, I suggest the following course of action:

1) Decide on guidelines for whether a test is internal or reusable.
As a starting point, I suggest that all tests that are regular
webpages that don't use any Mozilla-specific features should be
candidates for reuse. Examples of internal tests would be tests
written in XUL and unit tests. In particular, I think we should write
tests for reuse if they cover anything that other browsers implement
or might implement, even if there's currently no standard for it.
Other browsers should still be able to run these tests, even if they
might decide not to follow them. Also, tests that currently use
prefixed web-exposed properties should still be made reusable, since
the properties should eventually be unprefixed.

2) Write an introduction to testharness.js targeted at people familiar
with mochitest. testharness.js is the de facto standard testing
harness in the web standards world, and we already can run such tests
as mochitests automatically (see dom/imptests/), so JavaScript tests
meant to be usable by other browsers should be written in that format
(a rough sketch of the difference follows this list).

3) Require that all new tests that qualify as reusable must be written
in testharness.js format rather than mochitest format if possible.
(Reftests and crashtests can remain as-is, IMO. Some mochitests might
not be possible to rewrite as testharness.js yet, e.g., those using
SpecialPowers, so I guess they can stay mochitests for now.)

4) Require that all new tests that qualify as reusable must be checked
into a specific new directory created for this purpose, rather than
someplace near the code as they are currently. Reviewers need to
eventually start giving r- for tests written in the wrong format or
put in the wrong place, although it would make sense to phase the
requirement in over time and not be too strict at first. Test writers
do not have to bother actually publishing them -- they just have to
write them in the correct format and put them in the correct directory
in the source tree.

5) Make sure someone is keeping an eye on the reusable-tests
directory, and submitting the tests as appropriate to somewhere where
they can be easily reused. This might involve submitting some of them
to standards bodies for formal approval. Other tests might not
currently follow any standard, but could still be imported by other
browsers to test for crashes or assertions, or to flag possible
regressions. Those tests might not be moved anywhere special, but
should still be easier for other browsers to reuse than they are now.
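
To make the format requirement in points 2 and 3 concrete, here is a rough
sketch of how a simple mochitest-style check might look when rewritten for
testharness.js (the element ID and expected values are invented for
illustration):

  // Mochitest style: assertions run at the top level of the page.
  is(document.getElementById("box").clientWidth, 100, "box is 100px wide");
  ok(document.getElementById("box").hasAttribute("data-ready"), "box is ready");

  // testharness.js style: each logical test is wrapped in test(), and the
  // harness reports one pass/fail per test() rather than per assertion.
  test(function() {
    var box = document.getElementById("box");
    assert_equals(box.clientWidth, 100, "box is 100px wide");
    assert_true(box.hasAttribute("data-ready"), "box is ready");
  }, "Box width and readiness attribute");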


I think that the above won't make anything much harder for our coders,
but will be a big step forward for web testing -- especially if our
example motivates other browsers to do the same. It needs a little
bit of infrastructure work, but nothing much. (1) and (2) are easy,
and I suspect someone like Ms2ger will be happy to handle (5). But
(3) and (4) need broad agreement from module owners/peers. Does
anyone have any objections to them? Once we have a system like this
in place, we can try to persuade other browser vendors to use our
tests, and provide their tests under similar terms.

Ehsan Akhgari

Aug 16, 2012, 11:10:20 AM
to dev-pl...@lists.mozilla.org
I think this is generally a good idea. I have a few questions before
jumping in to agree though.

1. Is the current testharness.js API the documentation at the beginning
of <http://w3c-test.org/resources/testharness.js>? If that is the case,
the API looks a lot heavier weight than the default mochitest API we
use. In that case, would it make sense for us to have a compatibility
layer which translates our mochitest APIs to the equivalent
testharness.js API calls? (I'm not 100% sure if that conversion would
be straightforward.)

2. Is there any support for running reftest-style tests in a framework
that is reusable by other browsers? If not, can we move to propose the
reftest framework to the appropriate standards bodies so that it can be
adopted by other browsers? Our reftest framework has been carefully
designed to be Gecko-agnostic, and is much superior to the equivalent
testing framework that WebKit has (not sure about other browser
engines). Furthermore, the files loaded by this framework are not
loaded in a privileged context with APIs such as SpecialPowers, which
makes a large number of them portable to other browser engines.
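
(For readers who haven't written one: a reftest is just a pair of ordinary
pages plus a manifest entry along the lines of

  == some-test.html some-test-ref.html

where the file names above are invented for illustration. The harness loads
both pages and passes the test if their renderings are pixel-identical, or
different when != is used; nothing in the pages themselves is Gecko-specific
unless the author opts into harness hooks such as the reftest-wait class.)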

I think it makes sense for us if we can start this effort on the reftest
framework, since that has a much lower barrier to entry, and ultimately
this effort would be valuable only if other browser engines start to use
our tests (and hopefully share theirs with us as well).

Cheers,
Ehsan
> _______________________________________________
> dev-platform mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>

Benjamin Smedberg

Aug 16, 2012, 11:25:29 AM
to Aryeh Gregor, Ms2ger, dev-pl...@lists.mozilla.org
On 8/16/2012 5:31 AM, Aryeh Gregor wrote:
>
>
> 4) Require that all new tests that qualify as reusable must be checked
> into a specific new directory created for this purpose, rather than
> someplace near the code as they are currently. Reviewers need to
> eventually start giving r- for tests written in the wrong format or
> put in the wrong place, although it would make sense to phase the
> requirement in over time and not be too strict at first. Test writers
> do not have to bother actually publishing them -- they just have to
> write them in the correct format and put them in the correct directory
> in the source tree.
I agree with the first 3 points, but I object rather strongly to this
one. I think we should try to keep the tests close to the relevant code
whenever possible; this makes it more clear which module owner is
responsible for the test, and makes it easier to find and run the
relevant tests when modifying code. I think our system should try to
keep this style of tests in the code module.

>
> 5) Make sure someone is keeping an eye on the reusable-tests
> directory, and submitting the tests as appropriate to somewhere where
> they can be easily reused. This might involve submitting some of them
> to standards bodies for formal approval. Other tests might not
> currently follow any standard, but could still be imported by other
> browsers to test for crashes or assertions, or to flag possible
> regressions. Those tests might not be moved anywhere special, but
> should still be easier for other browsers to reuse than they are now.
Why do you think it would be better to have (somebody == Ms2ger) do
this, instead of expecting module owners in general to be a part of this
task? It feels to me that module owners should primarily be trying to
accomplish this sort of thing, and if they need help figuring out the
right standards body, asking for help from Ms2ger or other experts is a
great fallback plan.

Given the recent discussion about QA, it feels like this would also be a
great thing to involve QA in.

--BDS

Ms2ger

Aug 16, 2012, 11:34:03 AM
On 08/16/2012 05:10 PM, Ehsan Akhgari wrote:
> I think this is generally a good idea. I have a few questions before
> jumping in to agree though.
>
> 1. Is the current testharness.js API the documentation at the beginning
> of <http://w3c-test.org/resources/testharness.js>? If that is the case,
> the API looks a lot heavier weight than the default mochitest API we
> use. In that case, would it make sense for us to have a compatibility
> layer which translates our mochitest APIs to the equivalent
> testharness.js API calls? (I'm not 100% sure if that conversion would
> be straightforward.)

I don't feel it's terribly heavyweight. In the simple case, you need a

test(function() {
// tests
});

That's two extra lines of boilerplate. Our mochitest boilerplate is…
(*looks it up*) 30 lines. I think we'll deal. The method names for
assertions (assert_*) are a bit longer than we're used to, but in my
experience, you get used to it rather quickly.
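
For comparison, a complete (if minimal) testharness.js file looks roughly
like this; the resources/ paths assume the usual W3C layout:

  <!DOCTYPE html>
  <title>Example test</title>
  <script src="/resources/testharness.js"></script>
  <script src="/resources/testharnessreport.js"></script>
  <div id="log"></div>
  <script>
  test(function() {
    assert_equals(document.title, "Example test", "title is set");
  }, "Document title");

  test(function() {
    assert_true(!!document.body, "document has a body");
  }, "Document body exists");
  </script>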

IMO, it would not be a good idea to try to layer another API on top of
this, because it makes our tests harder to understand and less reusable
by other browser vendors, and it means that experience with our brand of
testharness.js doesn't help much to understand the standard brand.

I also don't think it would be terribly straightforward, and I think
that having real mochitests, testharness.js-that-look-like-mochitest and
standard testharness.js in our tree would make our testing situation
more confusing than the lessened learning curve would be worth. In
particular, having two APIs that superficially look the same but are
actually built on unrelated frameworks seems a recipe for a lot of
annoying differences in edge cases.

> 2. Is there any support for running reftest-style tests in a framework
> that is reusable by other browsers? If not, can we move to propose the
> reftest framework to the appropriate standards bodies so that it can be
> adopted by other browsers? Our reftest framework has been carefully
> designed to be Gecko-agnostic, and is much superior to the equivalent
> testing framework that WebKit has (not sure about other browser
> engines). Furthermore, the files loaded by this framework are not
> loaded in a privileged context with APIs such as SpecialPowers, which
> makes a large number of them portable to other browser engines.

The W3C has already adopted reftests. See
<https://test.csswg.org/harness/test/CSS-STYLE-ATTR_DEV/single/style-attr-braces-001/>
for example.

In fact, we've already got a place in m-c to put reftests to be
submitted to the CSSWG:
<http://mxr.mozilla.org/mozilla-central/source/layout/reftests/w3c-css/submitted/README?force=1>.

The main issue here is that the CSSWG appears to impose rather stringent
test formatting requirements, which makes writing reftests for them much
more of a drag than just writing them for ourselves.

> I think it makes sense for us if we can start this effort on the reftest
> framework, since that has a much lower barrier to entry, and ultimately
> this effort would be valuable only if other browser engines start to use
> our tests (and hopefully share theirs with us as well).

Opera already shares a lot of their tests, and Google, Apple and
Microsoft have been known to submit tests as well, some of which are
already running on tinderbox.

HTH
Ms2ger

Ehsan Akhgari

Aug 16, 2012, 12:07:12 PM
to dev-pl...@lists.mozilla.org
On 12-08-16 11:25 AM, Benjamin Smedberg wrote:
> On 8/16/2012 5:31 AM, Aryeh Gregor wrote:
>>
>>
>> 4) Require that all new tests that qualify as reusable must be checked
>> into a specific new directory created for this purpose, rather than
>> someplace near the code as they are currently. Reviewers need to
>> eventually start giving r- for tests written in the wrong format or
>> put in the wrong place, although it would make sense to phase the
>> requirement in over time and not be too strict at first. Test writers
>> do not have to bother actually publishing them -- they just have to
>> write them in the correct format and put them in the correct directory
>> in the source tree.
> I agree with the first 3 points, but I object rather strongly to this
> one. I think we should try to keep the tests close to the relevant code
> whenever possible; this makes it more clear which module owner is
> responsible for the test, and makes it easier to find and run the
> relevant tests when modifying code. I think our system should try to
> keep this style of tests in the code module.

I agree with Benjamin here. In fact, I think if we take out item 4
completely Aryeh's proposal still makes sense. Where the tests live in
our tree should not really matter.

Ehsan

Ehsan Akhgari

Aug 16, 2012, 12:21:35 PM
to Ms2ger, dev-pl...@lists.mozilla.org
On 12-08-16 11:34 AM, Ms2ger wrote:
> On 08/16/2012 05:10 PM, Ehsan Akhgari wrote:
>> I think this is generally a good idea. I have a few questions before
>> jumping in to agree though.
>>
>> 1. Is the current testharness.js API the documentation at the beginning
>> of <http://w3c-test.org/resources/testharness.js>? If that is the case,
>> the API looks a lot heavier weight than the default mochitest API we
>> use. In that case, would it make sense for us to have a compatibility
>> layer which translates our mochitest APIs to the equivalent
>> testharness.js API calls? (I'm not 100% sure if that conversion would
>> be straightforward.)
>
> I don't feel it's terribly heavyweight. In the simple case, you need a
>
> test(function() {
> // tests
> });
>
> That's two extra lines of boilerplate. Our mochitest boilerplate is…
> (*looks it up*) 30 lines. I think we'll deal. The method names for
> assertions (assert_*) are a bit longer than we're used to, but in my
> experience, you get used to it rather quickly.

I was not necessarily talking about the size of the boilerplate. I was
mostly talking about the size of the API. The mochitest API is
considerably smaller than testharness.js, and is therefore easier to
read and learn, I think.

> IMO, it would not be a good idea to try to layer another API on top of
> this, because it makes our tests harder to understand and less reusable
> by other browser vendors, and it means that experience with our brand of
> testharness.js doesn't help much to understand the standard brand.

I would agree if the mochitest API was also huge. We're basically
talking about 3 functions: is, is_not and ok. I don't agree that using
these functions would make our tests any less usable by other browser
vendors. But I understand that in their eyes, these functions may not
be as intuitive as in mine. :-)

> I also don't think it would be terribly straightforward, and I think
> that having real mochitests, testharness.js-that-look-like-mochitest and
> standard testharness.js in our tree would make our testing situation
> more confusing than the lessened learning curve would be worth. In
> particular, having two APIs that superficially look the same but are
> actually built on unrelated frameworks seems a recipe for a lot of
> annoying differences in edge cases.

Hrm, which edge cases are you talking about? Let's talk about is(a, b,
"msg") for example. Why do you think there would be cases where calling
is() is less understandable or more error prone than assert_equals?

On the testharness.js side, we have things like assert_regexp_match, for
example. I would argue that whether or not assert_regexp_match(a,
/foo/, "msg") is more readable than ok(/foo/.match(a), "msg") is very
subjective and depends on what the author of the test is used to seeing.

>> 2. Is there any support for running reftest-style tests in a framework
>> that is reusable by other browsers? If not, can we move to propose the
>> reftest framework to the appropriate standards bodies so that it can be
>> adopted by other browsers? Our reftest framework has been carefully
>> designed to be Gecko-agnostic, and is much superior to the equivalent
>> testing framework that WebKit has (not sure about other browser
>> engines). Furthermore, the files loaded by this framework are not
>> loaded in a privileged context with APIs such as SpecialPowers, which
>> makes a large number of them portable to other browser engines.
>
> The W3C has already adopted reftests. See
> <https://test.csswg.org/harness/test/CSS-STYLE-ATTR_DEV/single/style-attr-braces-001/>
> for example.
>
> In fact, we've already got a place in m-c to put reftests to be
> submitted to the CSSWG:
> <http://mxr.mozilla.org/mozilla-central/source/layout/reftests/w3c-css/submitted/README?force=1>.

Oh, that's good news! Thanks for the pointer.

> The main issue here is that the CSSWG appears to impose rather stringent
> test formatting requirements, which makes writing reftests for them much
> more of a drag than just writing them for ourselves.

I'm assuming that you're talking about
<http://wiki.csswg.org/test/format>. Yeah, that's quite verbose and
cumbersome. :(

>> I think it makes sense for us if we can start this effort on the reftest
>> framework, since that has a much lower barrier to entry, and ultimately
>> this effort would be valuable only if other browser engines start to use
>> our tests (and hopefully share theirs with us as well).
>
> Opera already shares a lot of their tests, and Google, Apple and
> Microsoft have been known to submit tests as well, some of which are
> already running on tinderbox.

Great!

Ehsan

Boris Zbarsky

Aug 16, 2012, 12:41:23 PM
On 8/16/12 12:07 PM, Ehsan Akhgari wrote:
> I agree with Benjamin here. In fact, I think if we take out item 4
> completely Aryeh's proposal still makes sense. Where the tests live in
> our tree should not really matter.

It matters insofar as it makes it more complicated to export our tests
to the official W3C test suite, right? As long as we solve that problem
in a location-agnostic way, we should be fine.

-Boris

Ms2ger

Aug 16, 2012, 1:01:34 PM
On 08/16/2012 06:21 PM, Ehsan Akhgari wrote:
> On 12-08-16 11:34 AM, Ms2ger wrote:
>> On 08/16/2012 05:10 PM, Ehsan Akhgari wrote:
>>> I think this is generally a good idea. I have a few questions before
>>> jumping in to agree though.
>>>
>>> 1. Is the current testharness.js API the documentation at the beginning
>>> of <http://w3c-test.org/resources/testharness.js>? If that is the case,
>>> the API looks a lot heavier weight than the default mochitest API we
>>> use. In that case, would it make sense for us to have a compatibility
>>> layer which translates our mochitest APIs to the equivalent
>>> testharness.js API calls? (I'm not 100% sure if that conversion would
>>> be straightforward.)
>>
>> I don't feel it's terribly heavyweight. In the simple case, you need a
>>
>> test(function() {
>> // tests
>> });
>>
>> That's two extra lines of boilerplate. Our mochitest boilerplate is…
>> (*looks it up*) 30 lines. I think we'll deal. The method names for
>> assertions (assert_*) are a bit longer than we're used to, but in my
>> experience, you get used to it rather quickly.
>
> I was not necessarily talking about the size of the boilerplate. I was
> mostly talking about the size of the API. The mochitest API is
> considerably smaller than testharness.js, and is therefore easier to
> read and learn, I think.

I don't really agree that the API is much larger; it's just that all the
assert_* functions are documented, unlike the various functions we
expose to mochitests.

It's of course true that this is another API to learn, but I don't think
it's significantly harder than the mochitest API for people who don't
yet know either.

>> IMO, it would not be a good idea to try to layer another API on top of
>> this, because it makes our tests harder to understand and less reusable
>> by other browser vendors, and it means that experience with our brand of
>> testharness.js doesn't help much to understand the standard brand.
>
> I would agree if the mochitest API was also huge. We're basically
> talking about 3 functions: is, is_not and ok. I don't agree that using
> these functions would make our tests any less usable by other browser
> vendors. But I understand that in their eyes, these functions may not
> be as intuitive as in mine. :-)

How about
SimpleTest.ise()/isa()/typeOf()/isDeeply()/waitForExplicitFinish()/finish()/executeSoon()/expectUncaughtException()/…?

I would argue that functions like assert_regexp_match don't get in your
way if you don't use them, just like SimpleTest.isDeeply() never gets in
my way when writing mochitests. In general, assert_{true,false},
assert_{equals,not_equals} and maybe assert_throws are, in my
experience, sufficient for most tests.

>> I also don't think it would be terribly straightforward, and I think
>> that having real mochitests, testharness.js-that-look-like-mochitest and
>> standard testharness.js in our tree would make our testing situation
>> more confusing than the lessened learning curve would be worth. In
>> particular, having two APIs that superficially look the same but are
>> actually built on unrelated frameworks seems a recipe for a lot of
>> annoying differences in edge cases.
>
> Hrm, which edge cases are you talking about? Let's talk about is(a, b,
> "msg") for example. Why do you think there would be cases where calling
> is() is less understandable or more error prone than assert_equals?

Excellent example: is(undefined, null) passes, assert_equals(undefined,
null) fails.
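
(A sketch of why, assuming is() compares with == while assert_equals uses
strict, ===-style equality, which is what the example above suggests:

  is(undefined, null, "passes: undefined == null is true");
  assert_equals(undefined, null, "fails: undefined and null are distinct");

So a shim that silently forwarded is() to assert_equals would change what
some existing tests accept.)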

However, I was thinking more about the fact that mochitest watches
window.onerror, while testharness.js catches exceptions from the
functions you pass to test(). To me, it seems easier to make sure you
use two different-looking APIs correctly, than to make sure you've got
the right API when one tries to disguise as the other.

> On the testharness.js side, we have things like assert_regexp_match, for
> example. I would argue that whether or not assert_regexp_match(a,
> /foo/, "msg") is more readable than ok(/foo/.match(a), "msg") is very
> subjective and depends on what the author of the test is used to seeing.

I wouldn't disagree.

HTH
Ms2ger

L. David Baron

Aug 16, 2012, 1:20:28 PM
to Ms2ger, dev-pl...@lists.mozilla.org
On Thursday 2012-08-16 17:34 +0200, Ms2ger wrote:
> On 08/16/2012 05:10 PM, Ehsan Akhgari wrote:
> >I think this is generally a good idea. I have a few questions before
> >jumping in to agree though.
> >
> >1. Is the current testharness.js API the documentation at the beginning
> >of <http://w3c-test.org/resources/testharness.js>? If that is the case,
> >the API looks a lot heavier weight than the default mochitest API we
> >use. In that case, would it make sense for us to have a compatibility
> >layer which translates our mochitest APIs to the equivalent
> >testharness.js API calls? (I'm not 100% sure if that conversion would
> >be straightforward.)
>
> I don't feel it's terribly heavyweight. In the simple case, you need a
>
> test(function() {
> // tests
> });
>
> That's two extra lines of boilerplate. Our mochitest boilerplate is…
> (*looks it up*) 30 lines. I think we'll deal. The method names for
> assertions (assert_*) are a bit longer than we're used to, but in my
> experience, you get used to it rather quickly.

It's two extra lines of boilerplate if you only have one test in the
file.

But if you have many tests in the file, it's a lot more, since each
test needs to be wrapped in this -- at least in my understanding.
Some browser vendors (e.g., Opera) seem to care quite strongly that
each test file always execute the same number of tests in the same
order -- even if some of those tests fail by throwing an exception.
So my understanding is that the intent here is that *each* test be
wrapped this way, presumably along with anything that might throw an
exception. (That said, I think this "might throw" concept is rather
loose.)

I think it's probably worth writing tests this way because of the
value of sharing them. But I wouldn't minimize that it is more
overhead.



One other characteristic of tests to be submitted to the W3C that's
rather important is that they fail when the feature isn't
implemented. If this isn't true, then people will build tables that
show a feature as being partially implemented, etc. (It's
particularly bad if, say, all but one of a large set of tests that
mostly test error handling actually pass when the feature isn't
implemented.)
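
A hypothetical illustration of the trap (the FancyFeature API is invented):

  // Silently "passes" in a browser with no support at all, which inflates
  // conformance tables:
  test(function() {
    if (!window.FancyFeature) {
      return;
    }
    assert_equals(new FancyFeature().mode, "default", "default mode");
  }, "FancyFeature defaults to 'default' mode");

  // Better: fail loudly when the feature is missing.
  test(function() {
    assert_true("FancyFeature" in window, "FancyFeature must be implemented");
    assert_equals(new FancyFeature().mode, "default", "default mode");
  }, "FancyFeature defaults to 'default' mode");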

-David

--
𝄞 L. David Baron http://dbaron.org/ 𝄂
𝄢 Mozilla http://www.mozilla.org/ 𝄂

L. David Baron

Aug 16, 2012, 1:27:56 PM
to Aryeh Gregor, Ms2ger, dev-pl...@lists.mozilla.org
On Thursday 2012-08-16 12:31 +0300, Aryeh Gregor wrote:
> I think that the above won't make anything much harder for our coders,
> but will be a big step forward for web testing -- especially if our
> example motivates other browsers to do the same. It needs a little

I agree that this is worth doing.

I think the key to making it work is figuring out how to distribute
the knowledge effectively in the Mozilla community. This requires
educating Mozilla module owners and code reviewers about the testing
guidelines for W3C tests. Some of this can be done by written
documentation, but some of it, I think, can only be taught through
review and feedback cycles. In some cases, this means getting
reasonably rapid feedback (from other browser vendors or others
involved in W3C testing efforts) on submitted tests so that people
at Mozilla who are writing tests can learn, through rapid feedback,
what's required of tests submitted to W3C groups.

That said, some test reviews in the W3C space tend to be
unnecessarily nitpicky. I think we need to be careful to filter the
review feedback appropriately for the change requests that are
actually motivated by real testing needs, and to push back on the
others so that the amount of information that we need to distribute
through the Mozilla community is not too large.

Ehsan Akhgari

Aug 16, 2012, 3:38:39 PM
to Boris Zbarsky, dev-pl...@lists.mozilla.org
I would imagine having a manifest somewhere which points to the tests
which can be submitted would solve that problem, right?

Ehsan

Boris Zbarsky

Aug 16, 2012, 4:17:16 PM
On 8/16/12 3:38 PM, Ehsan Akhgari wrote:
> I would imagine having a manifest somewhere which points to the tests
> which can be submitted would solve that problem, right?

Sure. Just need to maintain that manifest as new tests (or just new
test dirs?) are added.

-Boris

Neil

Aug 16, 2012, 4:33:07 PM
Ehsan Akhgari wrote:

> On the testharness.js side, we have things like assert_regexp_match,
> for example. I would argue that whether or not assert_regexp_match(a,
> /foo/, "msg") is more readable than ok(/foo/.match(a), "msg") is very
> subjective and depends on what the author of the test is used to seeing.

I don't know whether we do a lot of regexp matching but if we did I
would want it to use a dedicated method. I implemented ise because
having an entire test reporting via ok(foo === bar, "foo was not exactly
equal to bar"); does not make for good debugging. (Sadly I still wasn't
able to work out why the test was failing and had to unCC myself from
the bug again. (No it wasn't my test.))

--
Warning: May contain traces of nuts.

ja...@hoppipolla.co.uk

Aug 16, 2012, 5:29:12 PM
to Ms2ger, dev-pl...@lists.mozilla.org
On Thursday, 16 August 2012 18:21:35 UTC+2, Ehsan Akhgari wrote:
> On the testharness.js side, we have things like assert_regexp_match, for
> example. I would argue that whether or not assert_regexp_match(a,
> /foo/, "msg") is more readable than ok(/foo/.match(a), "msg") is very
> subjective and depends on what the author of the test is used to seeing.


As the original author of testharness.js, this is one style choice I feel I can defend. The API is indeed relatively large, though no more so than the typical JUnit-inspired framework, but this design has substantial benefits, and it is easy to ignore the parts that you aren't using.

The typical test format we used at Opera before testharness.js was exceedingly lightweight/liberal; it basically has a single function that takes a boolean indicating whether the test passed. Sadly this makes the tests very inconsistent and hard to read because every one is a special snowflake that does its own validation logic. This requires lots of code, often duplicated across tests, that is not really related to the thing under test but is needed to validate the results. It also means that every author makes their own decisions about how to do various kinds of test e.g. checking the right exception was thrown by a DOM method.

The advantage of having a richer API is that it brings rigor and consistency to tests. By far the most common assertion in testharness.js is assert_equals. Even this simple method is a win compared to having people write their own equality test because it will always use strict equality.

For more complex cases where the logic needed to make the test correct is more involved the benefits are correspondingly greater e.g. assert_array_equals removes the need for an explicit loop cluttering up the test itself and assert_throws has some relatively complex logic to handle checking the expected properties of DOM and ECMAScript exceptions.

If test authors were left to reimplement, on an ad-hoc basis, the assertions that testharness.js provides built in, tests would be harder to read, less consistent, and more prone to error. That seems well worth the cost of having such a rich API.
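
As a rough sketch of the difference (the exception name and arguments here are illustrative rather than lifted from the harness documentation):

  // "Single boolean" style: each author rolls their own exception check.
  var threw = false;
  try {
    document.body.appendChild(document);
  } catch (e) {
    threw = (e.name === "HierarchyRequestError" || e.code === 3);
  }
  check(threw);  // hypothetical one-function harness

  // testharness.js: the exception-checking logic lives in one shared place.
  test(function() {
    assert_throws("HierarchyRequestError", function() {
      document.body.appendChild(document);
    });
  }, "appendChild(document) must throw");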

Justin Dolske

Aug 16, 2012, 5:39:07 PM
On 8/16/12 8:10 AM, Ehsan Akhgari wrote:

> I think it makes sense for us if we can start this effort on the reftest
> framework, since that has a much lower barrier to entry, and ultimately
> this effort would be valuable only if other browser engines start to use
> our tests (and hopefully share theirs with us as well).

Is there a concrete plan for getting other browsers to run these shared
tests?

The basic idea here sounds worthy, but one concern is that our own tests
are often unreliable in our own browser -- and I'd expect that to only
get worse as other browsers and their tests enter the picture. I'd
therefore suggest that a successful cross-browser test effort should
prioritize getting stuff running (even with just a handful of tests)...
That way fun problems like reliability have a chance to be found/fixed
over time, instead of having a megatestsuite suddenly appear that's
unappealing to get working.

Justin

Kyle Huey

Aug 16, 2012, 6:01:39 PM
to Justin Dolske, dev-pl...@lists.mozilla.org
That's not really true. Most of our mochitests and reftests are pretty
solid. I don't think reliability will be a huge issue, and reliability
problems due to bad tests can be fixed quite easily.

- Kyle

Aryeh Gregor

Aug 17, 2012, 7:22:58 AM
to Ehsan Akhgari, ja...@hoppipolla.co.uk, Benjamin Smedberg, Justin Dolske, dev-pl...@lists.mozilla.org
On Thu, Aug 16, 2012 at 6:10 PM, Ehsan Akhgari <ehsan....@gmail.com> wrote:
> 1. Is the current testharness.js API the documentation at the beginning of
> <http://w3c-test.org/resources/testharness.js>? If that is the case, the
> API looks a lot heavier weight than the default mochitest API we use.

Not in practice. The assert_*() functions are pretty
self-explanatory. They often make test code appreciably simpler or
more rigorous. For instance, assert_throws() can check that the thing
being thrown is a proper DOMException with all expected properties.
In mochitests, it would be a pain to do every time, so often we'll
just test that something is thrown but not test what it is, etc.
Another advantage of having a lot of functions is that they produce
nicer failure messages.

Basically, as someone who's written lots of testharness.js tests and
lots of mochitests: testharness.js is somewhat more complex, but not
dramatically.


If there's one big problem with shared tests, it's that we have to
change the way we annotate expected failures. Currently we just go in
and change ok() to todo() or whatever in the source code of the test,
but of course that doesn't work for shared tests. testharness.js
expects you to break things up into test()s of one or more
assert_*()s, with the same number of test()s running no matter what,
and each test() either passes or fails as a unit. Then you have to
keep track of expected failures out-of-band (see files in
dom/imptests/failures/). The major disadvantage of this is that if a
test() tests multiple things and one of them is expected to fail, you
lose regression-testing for any subsequent ones, because the test()
aborts at the first assert failure.

So in practice, it's not always clear where to divide up your test()s.
One assert per test would be best for regression-testing, but it adds
a lot of clutter to the source code. I think this is the one big
thing that makes testharness.js more complicated to use than
mochitest, although it's still not rocket science. If we decide
test-per-assert is the way to go, perhaps we could get a series of
functions added to make single-assert tests simpler.

James, could you explain again what exactly the benefit is of this
test/assert distinction? Mochitests effectively have one assert per
test hardcoded in, and they work fine for us.

> 2. Is there any support for running reftest-style tests in a framework that
> is reusable by other browsers? If not, can we move to propose the reftest
> framework to the appropriate standards bodies so that it can be adopted by
> other browsers? Our reftest framework has been carefully designed to be
> Gecko-agnostic, and is much superior to the equivalent testing framework
> that WebKit has (not sure about other browser engines). Furthermore, the
> files loaded by this framework are not loaded in a privileged context with
> APIs such as SpecialPowers, which makes a large number of them portable to
> other browser engines.
>
> I think it makes sense for us if we can start this effort on the reftest
> framework, since that has a much lower barrier to entry, and ultimately this
> effort would be valuable only if other browser engines start to use our
> tests (and hopefully share theirs with us as well).

The CSSWG has such a framework. Unfortunately, they're extremely
demanding about accepting tests, requiring all kinds of documentation
that each test follows the standard, and they have formatting guidelines and
so on and so forth. So it's not compatible with the idea of "do
things a little differently and everyone can use our tests".

I agree that reftests would be easier to share, though. Crashtests
would be even easier! But mochitests are really where most of our
tests are. Also, unlike reftests, they can mostly be run in the
browser with no special privileges. But as far as the actual
sharing-tests thing goes, yes, it would make sense to start any kind
of sharing initiative with crashtests, then reftests, then mochitests.

On Thu, Aug 16, 2012 at 6:25 PM, Benjamin Smedberg
<benj...@smedbergs.us> wrote:
> I agree with the first 3 points, but I object rather strongly to this one. I
> think we should try to keep the tests close to the relevant code whenever
> possible; this makes it more clear which module owner is responsible for the
> test, and makes it easier to find and run the relevant tests when modifying
> code. I think our system should try to keep this style of tests in the code
> module.

There are two basic models you can have of test-sharing:

1) Everyone owns their own tests and just exports them to the world.
Everyone else has to use them as-is or not at all; third parties can't
make changes directly. This is compatible with whatever internal
formatting we like.

2) Tests by all parties are put in a shared repository and maintained
in common. Submitters don't own their tests; they're subject to
review and adjustment by others. In this case, it doesn't make sense
for us to organize the tests by our internal code structure. This is
the model that will be used by standards bodies, for instance.
They'll likely want to break things up by the specification being
tested, not any particular implementation. dom/imptests/ is already
organized by specification, because it's just imported.


So for things that get contributed to standards bodies, I do think we
need to match their directory structure, because we have to in order to mirror
their tests. In this model, tests would be put somewhere as a staging
ground to be submitted to the standards body, and once they're
submitted they'd be reimported along with other vendors' tests in a
place like dom/imptests/, and the original removed from the staging
ground.

For random other tests, I agree that we could probably keep our
directory structure. We just need some clear way to delineate
exported from non-exported tests. Since we don't guarantee that these
tests are meaningful for other vendors anyway, I guess we could just
export everything and let all the Gecko-specific ones be marked as
expected fails by other vendors.

> Why do you think it would be better to have (somebody == Ms2ger) do this,
> instead of expecting module owners in general to be a part of this task? It
> feels to me that module owners should primarily be trying to accomplish this
> sort of thing, and if they need help figuring out the right standards body,
> asking for help from Ms2ger or other experts is a great fallback plan.

If module owners want to be involved in submitting tests to standards
bodies, that would be great. But we shouldn't try to require them to
if they don't want to.

> Given the recent discussion about QA, it feels like this would also be a
> great thing to involve QA in.

It would be great if we had people specifically assigned to this,
yeah. I don't have an opinion on whether they belong in QA or
someplace else.

On Thu, Aug 16, 2012 at 8:20 PM, L. David Baron <dba...@dbaron.org> wrote:
> It's two extra lines of boilerplate if you only have one test in the
> file.
>
> But if you have many tests in the file, it's a lot more, since each
> test needs to be wrapped in this -- at least in my understanding.
> Some browser vendors (e.g., Opera) seem to care quite strongly that
> each test file always execute the same number of tests in the same
> order -- even if some of those tests fail by throwing an exception.
> So my understanding is that the intent here is that *each* test be
> wrapped this way, presumably along with anything that might throw an
> exception. (That said, I think this "might throw" concept is rather
> loose.)
>
> I think it's probably worth writing tests this way because of the
> value of sharing them. But I wouldn't minimize that it is more
> overhead.

I think it's fine if we run a different number of tests each time.
It's not a problem for us. Others who want to use our tests can adapt
their systems to accommodate it. The key thing is we share the tests.

> One other characteristic of tests to be submitted to the W3C that's
> rather important is that they fail when the feature isn't
> implemented. If this isn't true, then people will build tables that
> show a feature as being partially implemented, etc. (It's
> particularly bad if, say, all but one of a large set of tests that
> mostly test error handling actually pass when the feature isn't
> implemented.)

I'm focusing here mostly on sharing tests with other browsers, not
submitting to the W3C. Submitting to the W3C is a further step that
requires a lot more effort, such as: making sure there's a
specification, making sure it mandates what we're testing for, testing
in other browsers to identify possible spec bugs, and responding to
feedback from any random person in the WG who submits it.

This stuff is all great to do, but it's extra work. I think we should
start by identifying ways to share tests *without* extra work by test
authors or module owners, because that will allow us to share all of
our tests, not just a tiny fraction.

On Fri, Aug 17, 2012 at 12:39 AM, Justin Dolske <dol...@mozilla.com> wrote:
> Is there a concrete plan for getting other browsers to run these shared
> tests?

The W3C already has test suites we can submit to in testharness.js
format. We run some of those tests as mochitests; I know Opera does
as well. I believe WebKit doesn't run them automatically yet. James
Graham of Opera has indicated that they'd probably be interested in
running our tests. (Opera gets much less user testing than we do, so
they're very interested in automated testing.)

> The basic idea here sounds worthy, but one concern is that our own tests are
> often unreliable in our own browser -- and I'd expect that to only get worse
> as other browsers and their tests enter the picture. I'd therefore suggest
> that a successful cross-browser test effort should prioritize getting stuff
> running (even with just a handful of tests)... That way fun problems like
> reliability have a chance to be found/fixed over time, instead of having a
> megatestsuite suddenly appear that's unappealing to get working.

Yes, I think it would be a good idea to start small.

James Graham

Aug 17, 2012, 9:40:04 AM
to Aryeh Gregor, Ehsan Akhgari, Benjamin Smedberg, dev-pl...@lists.mozilla.org, Justin Dolske
On 08/17/2012 01:22 PM, Aryeh Gregor wrote:

> If there's one big problem with shared tests, it's that we have to
> change the way we annotate expected failures. Currently we just go in
> and change ok() to todo() or whatever in the source code of the test,
> but of course that doesn't work for shared tests. testharness.js
> expects you to break things up into test()s of one or more
> assert_*()s, with the same number of test()s running no matter what,
> and each test() either passes or fails as a unit. Then you have to
> keep track of expected failures out-of-band (see files in
> dom/imptests/failures/). The major disadvantage of this is that if a
> test() tests multiple things and one of them is expected to fail, you
> lose regression-testing for any subsequent ones, because the test()
> aborts at the first assert failure.
>
> So in practice, it's not always clear where to divide up your test()s.
> One assert per test would be best for regression-testing, but it adds
> a lot of clutter to the source code. I think this is the one big
> thing that makes testharness.js more complicated to use than
> mochitest, although it's still not rocket science. If we decide
> test-per-assert is the way to go, perhaps we could get a series of
> functions added to make single-assert tests simpler.
>
> James, could you explain again what exactly the benefit is of this
> test/assert distinction? Mochitests effectively have one assert per
> test hardcoded in, and they work fine for us.

So the theory is that a test can have preconditions that are needed for
the rest of the test to make sense. For example if you are writing a
history navigation test you might depend on a particular action adding
an entry to the session history to get to the state where you can start
your real test. If that fails you want the test to abort before it
starts trying to manipulate the erroneous state. Hence the design that
allows multiple asserts per logical test.
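
A sketch of that pattern (the history details are simplified for
illustration):

  test(function() {
    var before = history.length;
    history.pushState({step: 1}, "", "#step1");
    // Precondition: pushState must really have added an entry; if this
    // assert fails, the rest of the test() is skipped instead of
    // producing misleading follow-on failures.
    assert_equals(history.length, before + 1, "pushState adds an entry");
    // The "real" test can now act on that entry.
    history.replaceState({step: 2}, "", "#step2");
    assert_equals(history.state.step, 2, "replaceState updates the entry");
  }, "Session history manipulation");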

I think for the way that Mozilla tracks regressions it works OK to just
abort the whole file at the first sign of failure. That probably makes
the Mochitest approach acceptable. However, it doesn't work well for
everyone. In particular it doesn't meet Opera's needs very well.

> I agree that reftests would be easier to share, though. Crashtests
> would be even easier! But mochitests are really where most of our
> tests are. Also, unlike reftests, they can mostly be run in the
> browser with no special privileges. But as far as the actual
> sharing-tests thing goes, yes, it would make sense to start any kind
> of sharing initiative with crashtests, then reftests, then mochitests.

FWIW we already use some of your reftests. I think our import is rather
out of date though, possibly dating from when we first implemented
reftests and needed some to test the system. Our implementation is not
100% compatible with yours; for example we don't use reftest-wait but
have special magic in our harness that achieves something similar in an
entirely different way, but essentially the interoperability is very
good. So if you put reftests somewhere it is easy for us to import them
we will happily make use of them.

> On Thu, Aug 16, 2012 at 8:20 PM, L. David Baron <dba...@dbaron.org> wrote:
>> It's two extra lines of boilerplate if you only have one test in the
>> file.
>>
>> But if you have many tests in the file, it's a lot more, since each
>> test needs to be wrapped in this -- at least in my understanding.
>> Some browser vendors (e.g., Opera) seem to care quite strongly that
>> each test file always execute the same number of tests in the same
>> order -- even if some of those tests fail by throwing an exception.

(We don't actually care about order. But we do care that the same test
file always runs the same tests).

> The W3C already has test suites we can submit to in testharness.js
> format. We run some of those tests as mochitests; I know Opera does
> as well. I believe WebKit doesn't run them automatically yet. James
> Graham of Opera has indicated that they'd probably be interested in
> running our tests. (Opera gets much less user testing than we do, so
> they're very interested in automated testing.)

Yes, we are interested in running your tests and very much hope that you
are interested in running the tests that we write (we are trying, with
some success, to share our testsuites with the W3C, although the process
is not yet as slick as I would like).

The relative amount of user testing various implementations get isn't
really relevant; we like automated testing because it allows us to find
bugs seconds to hours after the code is written rather than days or
weeks. We like sharing tests because it improves the web platform as a
whole by encouraging browsers to converge on interoperable behaviours.
There are currently wild differences between UAs in some areas, and that
doesn't benefit anyone.

>> The basic idea here sounds worthy, but one concern is that our own tests are
>> often unreliable in our own browser -- and I'd expect that to only get worse
>> as other browsers and their tests enter the picture. I'd therefore suggest
>> that a successful cross-browser test effort should prioritize getting stuff
>> running (even with just a handful of tests)... That way fun problems like
>> reliability have a chance to be found/fixed over time, instead of having a
>> megatestsuite suddenly appear that's unappealing to get working.
>
> Yes, I think it would be a good idea to start small.

FWIW we try to automatically detect unreliable tests and disable them
until they are fixed. This makes it possible to deal with large test
dumps without many problems. I don't know quite how that would translate
to your infrastructure.


Justin Dolske

Aug 17, 2012, 5:38:22 PM
On 8/16/12 3:01 PM, Kyle Huey wrote:

>> The basic idea here sounds worthy, but one concern is that our own tests
>> are often unreliable in our own browser -- and I'd expect that to only get
>> worse as other browsers and their tests enter the picture.
>
> That's not really true. Most of our mochitests and reftests are pretty
> solid. I don't think reliability will be a huge issue, and reliability
> problems due to bad tests can be fixed quite easily.

Sure, on an individual basis most tests are solid.

I'm talking about the problem of having a large set of tests with a
small percentage that fail intermittently, which is what we have today
in m-c. Even if they all magically became cross-browser compatible right
now, I think it could still be a tough sell to get other browsers
vendors to run them. The rarity of a successful all-green run means you
need people (like our tree sheriffs) to interpret what is a known
failures and what is a real problem. AIUI Chromium has similar issues
(any know about MS/Apple/Opera?), so if we were importing their tests
we'd have to decide if that was worth it.

Given the long history (shall I say "plague"?) of intermittent-orange in
our tree, I can't agree that this would be a non-issue or is easy to
fix! [Nor am I saying reusable tests are a bad idea -- just that it
would seem wise to ramp up over time.]

Justin

Gavin Sharp

Aug 17, 2012, 6:08:29 PM
to Justin Dolske, dev-pl...@lists.mozilla.org
On Fri, Aug 17, 2012 at 5:38 PM, Justin Dolske <dol...@mozilla.com> wrote:
> I'm talking about the problem of having a large set of tests with a small
> percentage that fail intermittently, which is what we have today in m-c.

What percentage of the intermittently-failing tests are tests for web
features vs. tests for our UI or other Firefox-specific things, I
wonder? It may be that the set of tests that would be shareable with
other browsers is more reliable on average than the set of tests we
have that are not relevant to other browsers.

Gavin

Ms2ger

Aug 18, 2012, 6:23:04 AM
Also, the fact that a test fails intermittently in Gecko doesn't
necessarily imply that it will also fail intermittently in other
browsers; these failures often point to actual Gecko bugs, rather than
test issues.

HTH
Ms2ger

ja...@hoppipolla.co.uk

Aug 18, 2012, 9:42:56 AM
On Friday, 17 August 2012 23:38:22 UTC+2, Justin Dolske wrote:

> I'm talking about the problem of having a large set of tests with a
> small percentage that fail intermittently, which is what we have today
> in m-c. Even if they all magically became cross-browser compatible right
> now, I think it could still be a tough sell to get other browsers
> vendors to run them. The rarity of a successful all-green run means you
> need people (like our tree sheriffs) to interpret what is a known
> failures and what is a real problem. AIUI Chromium has similar issues
> (any know about MS/Apple/Opera?), so if we were importing their tests
> we'd have to decide if that was worth it.

I know about Opera ;)

We have experienced the same kind of problems with randomly failing tests that you have, both in tests we have written ourselves and tests that have been imported from other places. We have put quite a lot of effort into fixing the problem and now have quite extensive systems for identifying unstable tests as soon as they are added to our test repository, and before they have the chance to cause the equivalent of "intermittent orange". We also have ways to flag bogus changes in test status, and so can identify frequently misbehaving tests.

As a consequence of this we have become better at writing stable tests, and by the time we release tests we are generally pretty confident that they are stable at least in Opera on our systems. We are also unafraid of using externally written tests because we can detect many quality issues before they have a chance to cost developers lots of time investigating bad results.

> Given the long history (shall I say "plague"?) of intermittent-orange in
> our tree, I can't agree that this would be a non-issue or is easy to
> fix! [Nor am I saying reusable tests are a bad idea -- just that it
> would seem wise to ramp up over time.]

From our point of view it is no problem to import tests even if they are not 100% stable (and of course if the instability is due to a bug in Opera they could be stable for you and not for us). The worst case scenario is that we will simply disable the problematic tests but continue to get the benefit of all the other tests. That is a much better position to be in than not being able to run the tests at all.

Aryeh Gregor

Aug 19, 2012, 4:41:57 AM
to Justin Dolske, dev-pl...@lists.mozilla.org
On Sat, Aug 18, 2012 at 12:38 AM, Justin Dolske <dol...@mozilla.com> wrote:
> Given the long history (shall I say "plague"?) of intermittent-orange in our
> tree, I can't agree that this would be a non-issue or is easy to fix! [Nor
> am I saying reusable tests are a bad idea -- just that it would seem wise to
> ramp up over time.]

To be fair, the reason intermittent orange is such a headache for us
is because our infrastructure for it is terrible. It all revolves
around running tests in giant indivisible blocks that produce
semi-formatted plaintext output, which is then parsed (using regex?)
by various ad hoc tools, and huge amounts have to be done by hand.

Random orange would be a drastically smaller problem for us if, e.g.,
we ran tests individually instead of in giant chunks, and
automatically reran any failed test a few times to see if the failure
is intermittent, and restarted a run in the middle if something made
it crashed. Based on what James is saying, it sounds like Opera has a
substantially more sophisticated system than we do (embarrassingly?).

Anyway, one major goal of an open web is that users should have as
many choices as possible for web browsers. That means we need to put
special effort into making things as easy as possible for smaller
browsers. So if Opera will definitely use our tests and other
browsers might or might not, I think that's a good reason to go ahead.

Asa Dotzler

Aug 20, 2012, 1:25:37 PM
On 8/19/2012 1:41 AM, Aryeh Gregor wrote:

> Anyway, one major goal of an open web is that users should have as
> many choices as possible for web browsers. That means we need to put
> special effort into making things as easy as possible for smaller
> browsers. So if Opera will definitely use our tests and other
> browsers might or might not, I think that's a good reason to go ahead.

Can you say more about this? Are you saying it's Mozilla's
responsibility to put Mozilla resources into solving problems for Opera?
I'm not sure I understand this assertion.

- A

Gervase Markham

Aug 21, 2012, 9:34:41 AM
On 20/08/12 18:25, Asa Dotzler wrote:
> Can you say more about this? Are you saying it's Mozilla's
> responsibility to put Mozilla resources into solving problems for Opera?
> I'm not sure I understand this assertion.

I think he's arguing that a belief in user choice could well translate
into doing things in a way which helps (or at least does not hinder)
other browsers.

How do you think our belief in user choice should translate into our
attitude to other browsers using our work?

Gerv


Asa Dotzler

Aug 21, 2012, 12:43:36 PM
I don't believe our belief in user choice should cause us to spend extra
resources helping other browsers if those extra resources slow us down
in our attempts to be effective competitors with those other browsers.

- A

L. David Baron

Aug 21, 2012, 1:32:57 PM
to Asa Dotzler, dev-pl...@lists.mozilla.org
In the long term, I think sharing tests with other browsers can be a
speed-up, both for us and for the Web platform as a whole, since we:
(a) have to spend less time changing our behavior when browsers are
found to disagree with each other (something that's often more
work when the problems are discovered later)
(b) get to a more interoperable platform faster, which helps the
Web compete against non-Web platforms

Asa Dotzler

Aug 21, 2012, 2:45:49 PM
David, I can certainly see the value there. That is, IMO, quite
different from the position I was responding to, Aryeh Gregor's suggestion
that our mission compels us "to put special effort into making
things as easy as possible for smaller browsers."

- A

Ehsan Akhgari

Aug 21, 2012, 4:09:18 PM
to ja...@hoppipolla.co.uk, dev-pl...@lists.mozilla.org
On 12-08-18 9:42 AM, ja...@hoppipolla.co.uk wrote:
> On Friday, 17 August 2012 23:38:22 UTC+2, Justin Dolske wrote:
>
>> I'm talking about the problem of having a large set of tests with a
>> small percentage that fail intermittently, which is what we have today
>> in m-c. Even if they all magically became cross-browser compatible right
>> now, I think it could still be a tough sell to get other browsers
>> vendors to run them. The rarity of a successful all-green run means you
>> need people (like our tree sheriffs) to interpret what is a known
>> failures and what is a real problem. AIUI Chromium has similar issues
>> (any know about MS/Apple/Opera?), so if we were importing their tests
>> we'd have to decide if that was worth it.
>
> I know about Opera ;)
>
> We have experienced the same kind of problems with randomly failing tests that you have, both in tests we have written ourselves and tests that have been imported from other places. We have put quite a lot of effort into fixing the problem and now have quite extensive systems for identifying unstable tests as soon as they are added to our test repository, and before they have the chance to cause the equivalent of "intermittent orange". We also have ways to flag bogus changes in test status, and so can identify frequently misbehaving tests.
>
> As a consequence of this we have become better at writing stable tests, and by the time we release tests we are generally pretty confident that they are stable at least in Opera on our systems. We are also unafraid of using externally written tests because we can detect many quality issues before they have a chance to cost developers lots of time investigating bad results.

Any program which relies on an event loop is by definition going to
suffer from intermittent changes in behavior because of event ordering,
etc. This means that some tests will fail intermittently in any web
browser even if the browser itself is bug free, which means that any
browser engine with a substantial amount of tests needs to come up with
ways of dealing with those. Therefore, I agree with James that
intermittently failing tests should not stop us from proceeding on
sharing tests between different engines.

Cheers,
Ehsan

Kyle Huey

unread,
Aug 21, 2012, 4:37:03 PM8/21/12
to Ehsan Akhgari, ja...@hoppipolla.co.uk, dev-pl...@lists.mozilla.org
On Tue, Aug 21, 2012 at 1:09 PM, Ehsan Akhgari <ehsan....@gmail.com> wrote:

> Any program which relies on an event loop is by definition going to suffer
> from intermittent changes in behavior because of event ordering, etc. This
> means that some tests will fail intermittently in any web browser even if
> the browser itself is bug free
>

Not if the tests are written correctly! Of course, lots aren't, but A does
not imply B here.

- Kyle

ja...@hoppipolla.co.uk

unread,
Aug 21, 2012, 4:39:02 PM8/21/12
to
On Tuesday, 21 August 2012 20:45:49 UTC+2, Asa Dotzler wrote:

> David, I can certainly see the value there. That is, IMO, quite
> different from the position I was responding to, Aryeh Gregor's suggestion
> that our mission compels us "to put special effort into making
> things as easy as possible for smaller browsers."

I think this is a side track. At least, when Opera release tests it is nothing to do with making life easy for other people; it's about improving interoperability between implementations, which is good for the platform and everyone who is invested in its success.

Obviously to achieve this we do want as many people as possible to run the tests we write, and it does seem reasonable that you would want the same for tests that you release.

I have already said that Opera will try to run as many tests as we can, and that we try to release tests we write when it makes sense.

I believe you (Mozilla) can already run testharness.js tests (and reftests, obviously) and might be able to reuse some machinery that you have for importing CSSWG tests to import tests from other parts of the W3C.

WebKit are also starting to write W3C-compatible tests in at least some cases, and are upgrading their test infrastructure to run these tests [1].

Microsoft also write testharness.js tests so I guess they must run them too although obviously it is rather harder to tell what's going on there.

[1] http://krijnhoetmer.nl/irc-logs/whatwg/20120821#l-675

Chris Hofmann

unread,
Aug 21, 2012, 5:03:30 PM8/21/12
to dev-pl...@lists.mozilla.org
On 8/21/12 10:32 AM, L. David Baron wrote:
> On Tuesday 2012-08-21 09:43 -0700, Asa Dotzler wrote:
> In the long term, I think sharing tests with other browsers can be a
> speed-up, both for us and for the Web platform as a whole, since we:
> (a) have to spend less time changing our behavior when browsers are
> found to disagree with each other (something that's often more
> work when the problems are discovered later)
> (b) get to a more interoperable platform faster, which helps the
> Web compete against non-Web platforms
>
> -David
>

Yeah, this is not for the other browser vendors or users, but is mostly
a benefit for web developers that want to count on certain behaviors to
work across browsers effectively and reliably every release of every
browser.

-chofmann

Ehsan Akhgari

unread,
Aug 21, 2012, 5:16:40 PM8/21/12
to Kyle Huey, ja...@hoppipolla.co.uk, dev-pl...@lists.mozilla.org
True. If both the software and the tests are correct, they will never
fail by definition! But we know that neither of the two extremes is
possible to attain.

Ehsan

Robert Kaiser

unread,
Aug 22, 2012, 8:59:46 AM8/22/12
to
Chris Hofmann schrieb:
> Yeah, this is not for the other browser vendors or users, but is mostly
> a benefit for web developers that want to count on certain behaviors to
> work across browsers effectively and reliably every release of every
> browser.

And as web developers write websites for users, this is for users. And
as we want to serve the web and users, it's a benefit for us.

Robert Kaiser

Brian Smith

unread,
Aug 24, 2012, 5:08:52 PM8/24/12
to Aryeh Gregor, dev-pl...@lists.mozilla.org
Aryeh Gregor wrote:
> 1) Decide on guidelines for whether a test is internal or reusable.
> As a starting point, I suggest that all tests that are regular
> webpages that don't use any Mozilla-specific features should be
> candidates for reuse. Examples of internal tests would be tests
> written in XUL and unit tests. In particular, I think we should
> write
> tests for reuse if they cover anything that other browsers implement
> or might implement, even if there's currently no standard for it.
> Other browsers should still be able to run these tests, even if they
> might decide not to follow them. Also, tests that currently use
> prefixed web-exposed properties should still be made reusable, since
> the properties should eventually be unprefixed.

Which other browser makers are going to follow these guidelines, so that we benefit from them? Generally, this is a great idea if it makes it faster and easier to improve Firefox. But, like Asa, I also interpreted this proposal along the lines of "Spend resources, and slow down Firefox development, to help other browsers." That seems totally in line with our values, but doesn't seem great as far as competitiveness is concerned.

Also, are you saying "if you are going to write a mochitest, then try to write a reusable test" or "if you are going to write a test, write a reusable test?" The reason I ask is that we're supposed to write xpcshell tests in preference to mochitests when possible, and I'd hate the preference to change to be in favor of mochitests, because xpcshell tests are much more convenient (and faster) to write and run.

Thanks,
Brian

Kyle Huey

unread,
Aug 24, 2012, 5:11:34 PM8/24/12
to Brian Smith, dev-pl...@lists.mozilla.org, Aryeh Gregor
On Fri, Aug 24, 2012 at 2:08 PM, Brian Smith <bsm...@mozilla.com> wrote:

> Also, are you saying "if you are going to write a mochitest, then try to
> write a reusable test" or "if you are going to write a test, write a
> reusable test?" The reason I ask is that we're supposed to write xpcshell
> tests in preference to mochitests when possible, and I'd hate the
> preference to change to be in favor of mochitests, because xpcshell tests
> are much more convenient (and faster) to write and run.
>

Most things that this is relevant to (things visible to web content) can't
be tested from xpcshell anyways, so this shouldn't affect xpcshell vs.
mochitests.

- Kyle

Ms2ger

unread,
Aug 25, 2012, 4:17:23 AM8/25/12
to
On 08/24/2012 11:08 PM, Brian Smith wrote:
> Aryeh Gregor wrote:
>> 1) Decide on guidelines for whether a test is internal or
>> reusable. As a starting point, I suggest that all tests that are
>> regular webpages that don't use any Mozilla-specific features
>> should be candidates for reuse. Examples of internal tests would
>> be tests written in XUL and unit tests. In particular, I think we
>> should write tests for reuse if they cover anything that other
>> browsers implement or might implement, even if there's currently no
>> standard for it. Other browsers should still be able to run these
>> tests, even if they might decide not to follow them. Also, tests
>> that currently use prefixed web-exposed properties should still be
>> made reusable, since the properties should eventually be
>> unprefixed.
>
> Which other browser makers are going to follow these guidelines, so
> that we benefit from them? Generally, this is a great idea if it
> makes it faster and easier to improve Firefox. But, like Asa, I also
> interpreted this proposal along the lines of "Spend resources, and
> slow down Firefox development, to help other browsers." That seems
> totally in line with our values, but doesn't seem great as far as
> competitiveness is concerned.

Looking at <http://w3c-test.org/html/tests/submission/>, there are tests
from Apple, Google, Microsoft, and Opera, as well as from other
organizations who benefit from interoperable implementations (Baidu,
Comcast, Intel, …) and individuals (David Carlisle, Mathias Bynens,
Philip Taylor, Aryeh, and myself). Some of those tests are already
running on tinderbox.

> Also, are you saying "if you are going to write a mochitest, then try
> to write a reusable test" or "if you are going to write a test, write
> a reusable test?" The reason I ask is that we're supposed to write
> xpcshell tests in preference to mochitests when possible, and I'd
> hate the preference to change to be in favor of mochitests, because
> xpcshell tests are much more convenient (and faster) to write and
> run.

Kyle already answered this.

HTH
Ms2ger

Jonas Sicking

unread,
Oct 6, 2012, 12:25:26 AM10/6/12
to Aryeh Gregor, Ms2ger, dev-pl...@lists.mozilla.org
Sorry to bring back an old thread, but the upcoming "Test the web
forward" meeting reminded me of this thread.

In general I really approve of this idea, however I have one major concern.

> 2) Write an introduction to testharness.js targeted at people familiar
> with mochitest. testharness.js is the de facto standard testing
> harness in the web standards world, and we already can run such tests
> as mochitests automatically (see dom/imptests/), so JavaScript tests
> meant to be usable by other browsers should be written in that format.

As others have pointed out, the testharness.js test suite is much less
convenient to use than mochitest.

Simply wrapping

test(function() {
// test here
})

only works for the most simple tests. Most tests that I write use lots
of synchronous and asynchronous callbacks. Each one of those needs to
be wrapped to catch exceptions. There's also a lot more overhead in
the harness due to trying to count how much of a test you pass.
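
(For illustration, a minimal sketch of the kind of wrapping being
described, assuming testharness.js's async_test/step_func API; the URL
is hypothetical:)

var t = async_test("every callback goes through the harness");
var req = new XMLHttpRequest();
req.open("GET", "/data.json"); // hypothetical resource
// Each callback is wrapped with t.step_func() so that a thrown
// exception is reported as a failure of this test rather than lost.
req.onload = t.step_func(function () {
  assert_equals(req.status, 200);
  t.done();
});
req.onerror = t.step_func(function () {
  assert_unreached("network error");
});
req.send();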

In general, testharness.js seems to be more optimized for producing a
result report which measures how close an implementation is to
implementing a feature, than it is optimized for making it easy to
write tests.

I believe many developers right now end up spending as much time
writing tests as they do implementing features. That is a very high
cost, but something that is definitely worth it. However we should be
working towards lowering that cost rather than increasing it.

Rather than trying to convince developers that testharness.js would in
fact not increase the cost of writing tests, I think we should try to
get W3C to adjust testharness.js such that it's easier to write tests
for it. If we make it as easy to write W3C tests as it is to write
mochitests then I would absolutely agree with your proposal. I would
imagine that would also make it easier to get other browser vendors to
do the same, as well as members of the web development community.

Another problem that I think we'd have is that many of our tests use
generators and yield. This *dramatically* cuts down on the complexity
of writing complex tests which have lots of asynchronous callbacks. For
example [1][2] would have been much harder to write without them. I
think our approach here could be to migrate these tests to use ES6-based
generators as soon as we have them implemented in gecko, and then
submit them to W3C as soon as enough browsers implement ES6.
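
(A rough sketch of the generator-driven style being referred to, using
the legacy SpiderMonkey generators available to mochitests at the time
via a version=1.7 script tag; the file name and helpers are made up:)

SimpleTest.waitForExplicitFinish();

function continueTest() {
  try {
    gGen.next(); // resume the generator at the last yield
  } catch (e) {
    // StopIteration: the generator has run to completion
  }
}

function runTests() {
  var req = new XMLHttpRequest();
  req.open("GET", "file_progress.sjs"); // hypothetical server-side script
  req.onload = continueTest;
  req.send();
  yield; // suspend here until the load event resumes us
  is(req.status, 200, "request should succeed");
  SimpleTest.finish();
}

var gGen = runTests();
continueTest(); // run up to the first yield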

I don't think that we should be telling people to not use generators
in the meantime. My experience is that rewriting tests to use
generators both cuts down on the test writing time, and makes it much
less likely that the test ends up with intermittent orange bugs.

[1] http://mxr.mozilla.org/mozilla-central/source/content/base/test/test_xhr_progressevents.html?force=1
[2] http://mxr.mozilla.org/mozilla-central/source/dom/indexedDB/test/unit/test_add_put.js

/ Jonas

Aryeh Gregor

unread,
Oct 9, 2012, 5:43:28 AM10/9/12
to Jonas Sicking, Ms2ger, dev-pl...@lists.mozilla.org, James Graham
On Sat, Oct 6, 2012 at 6:25 AM, Jonas Sicking <jo...@sicking.cc> wrote:
> In general, testharness.js seems to be more optimized for producing a
> result report which measure how close an implementation is to
> implementing a feature, than it is optimized for making it easy to
> write tests.

I think it's actually optimized for Opera's internal testing needs as
much as anything, since James Graham is the one who wrote it. Opera
has a much more elaborate testing infrastructure than we do. I think
they actually have stuff like databases instead of grepping plaintext
output, and the ability to run individual test files, and
automatically running new test files X times to record all
intermittent failures. If we just layer it on top of mochitest, we
can't take advantage of the greater structure, so the added complexity
is wasted. James has indicated that features like "each file runs the
same set of tests no matter how many fail" are essential to working
properly with Opera's test setup.

> I believe many developers right now end up spending as much time
> writing tests as they do implementing features. That is a very high
> cost, but something that is definitely worth it. However we should be
> working towards lowering that cost rather than increasing it.
>
> Rather than trying to convince developers that testharness.js would in
> fact not increase the cost of writing tests, I think we should try to
> get W3C to adjust testharness.js such that it's easier to write tests
> for it. If we make it as easy to write W3C tests as it is to write
> mochitests then I would absolutely agree with your proposal. I would
> imagine that would also make it easier to get other browser vendors to
> do the same, as well as members of the webdevelopment community.

I think testharness.js would definitely increase the costs of writing
tests. I think the increase in interop would be worth it, if other
browsers start using our tests. It's all good and well if we
implement a feature correctly, but if other browsers implement it
differently, it's not very useful to authors. This is why we want to
submit tests to the W3C to start with.

> Another problem that I think we'd have is that many of our tests use
> generators and yield. This *dramatically* cuts down on the complexity
> of writing complex tests which has lots of asynchronous callbacks. For
> example [1][2] would have been much harder to write without them. I
> think our approach here could be migrate these tests to use ES6 based
> generators as soon as we have them implemented in gecko, and then
> submit them to W3C as soon as enough browsers implement ES6.
>
> I don't think that we should be telling people to not use generators
> in the meantime. My experience is that rewriting tests to use
> generators both cuts down on the test writing time, and makes it much
> less likely that the test ends up with intermittent orange bugs.

If it's a pain to write a particular file in testharness.js, it can be
kept as mochitest. In my experience, quite a lot of tests boil down
to like ten lines, which would take about three minutes more to write
using testharness.js than mochitest. Also, a test that's based on
testharness.js but uses some Gecko-only features would be easier to
make portable later than a test that's based on mochitest and also
uses Gecko-only features.

Jonas Sicking

unread,
Oct 9, 2012, 6:46:06 PM10/9/12
to Aryeh Gregor, Ms2ger, dev-pl...@lists.mozilla.org, James Graham
I agree that at a large scale, the additional value from writing tests
that are sharable with other browser vendors and with the W3C
community is technically worth the overhead that testharness.js
brings.

However, for someone working against a looming deadline, the cost of
writing sharable tests and risking missing the deadline can be much
higher than the cost of having those tests not be run by other browser vendors.

Additionally, I think that the more overhead the harness has, the less
thorough the tests will be. I've definitely noticed that the tests
from Opera for CORS have been a lot less thorough than the tests that I
wrote myself. I absolutely think that there's a stronger bias against
writing more comprehensive tests the more work is needed to write
those tests. No amount of "rules" forcing developers to write tests
against testharness.js will remove that bias.

So I'd much rather spend the additional effort to create a test
harness which is optimized for getting comprehensive tests than take
the additional cost to developers and the reduction in test quality
that comes with an overly heavyweight harness.

I do understand that Opera might have various requirements for what
fits in their test harness, but I think if the goal is to create a
harness "for the web" then we should optimize for what the web needs.
Which is lots of tests and comprehensive ones.

>> Another problem that I think we'd have is that many of our tests use
>> generators and yield. This *dramatically* cuts down on the complexity
>> of writing complex tests which has lots of asynchronous callbacks. For
>> example [1][2] would have been much harder to write without them. I
>> think our approach here could be migrate these tests to use ES6 based
>> generators as soon as we have them implemented in gecko, and then
>> submit them to W3C as soon as enough browsers implement ES6.
>>
>> I don't think that we should be telling people to not use generators
>> in the meantime. My experience is that rewriting tests to use
>> generators both cuts down on the test writing time, and makes it much
>> less likely that the test ends up with intermittent orange bugs.
>
> If it's a pain to write a particular file in testharness.js, it can be
> kept as mochitest. In my experience, quite a lot of tests boil down
> to like ten lines, which would take about three minutes more to write
> using testharness.js than mochitest. Also, a test that's based on
> testharness.js but uses some Gecko-only features would be easier to
> make portable later than a test that's based on mochitest and also
> uses Gecko-only features.

This doesn't match my experience at all. My experience is that writing
tests has a high cost and results in fairly complex test files.

/ Jonas

Boris Zbarsky

unread,
Oct 9, 2012, 9:39:10 PM10/9/12
to
You're both right.

Simple tests for really basic DOM stuff are very short.

Tests for things like XHR and events and IndexedDB, which are a lot of
what Jonas has had to write tests for, are complete hell to write, if
nothing else because of the async nature of those objects.

We need a test harness that can handle both use cases without exploding.
It's not quite clear to me that testharness.js is it, fwiw.

-Boris

Ian Bicking

unread,
Oct 10, 2012, 12:23:56 AM10/10/12
to Boris Zbarsky, dev-pl...@lists.mozilla.org
I'm a little confused, because I looked at testharness.js and thought it
looked perfectly fine, not particularly more or less complex than
SimpleTest/MochiTest.

Here's how I think you'd write a simple XHR test in both:

// SimpleTest aka MochiTest
req = new XMLHttpRequest();
req.open("GET", "/example.json");
req.onreadystatechange = function () {
  if (req.readyState != 4) {
    return;
  }
  is(req.status, 200);
  is(req.getResponseHeader("Content-Type"), "json");
  SimpleTest.finish();
};
SimpleTest.waitForExplicitFinish();
req.send();


// testharness
var t = async_test("Test XHR");
req = new XMLHttpRequest();
req.open("GET", "/example.json");
req.onreadystatechange = function () {
  if (req.readyState != 4) {
    return;
  }
  t.step(function () {
    assert_equals(req.status, 200);
    assert_equals(req.getResponseHeader("Content-Type"), "json");
  });
  t.done();
};
req.send();

Am I missing an important difference? Seems like testharness.js just wants
to add the concept of multiple tests on a page that can independently pass
or fail, and needs just a little more complexity as a result. But it
doesn't feel like a big deal.

(An aside from the topic of reusable tests, but I feel I should plug an
entirely different testing framework I've written: http://doctestjs.org
which I think is especially good for spec testing, encouraging really
thorough testing and it makes test writing easy, including async tests.)

Boris Zbarsky

unread,
Oct 10, 2012, 12:41:33 AM10/10/12
to
On 10/10/12 12:23 AM, Ian Bicking wrote:
> Here's how I think you'd write a simple XHR test in both:
>
> // SimpleTest aka MochiTest
> req = new XMLHttpRequest();
> req.open("GET", "/example.json");

How did example.json get there?

What if you need to test CORS?

With mochitest at this point you're doing some SJS work and whatnot.

> // testharness

And here you have to go and do whatever is appropriate to your server
setup. Which is not part of testharness. Which is why the CORS tests
imported from Opera to the W3C ended up all broken, because they did not
configure the server correctly.

-Boris

Ian Bicking

unread,
Oct 10, 2012, 1:13:59 AM10/10/12
to Boris Zbarsky, dev-pl...@lists.mozilla.org
OK – so if I understand the objection to testharness isn't anything in
testharness.js itself, but that it's an incomplete solution as it doesn't
define an environment? That is, MochiTest gives an environment where we
can define resources at a variety of URLs and serve them with arbitrary
headers, and so you can define tests that are more complete and
self-contained. There's also stuff like permission overrides, which are
really just about how MochiTest sets up the environment, and of course that
stuff by its nature is not portable. But of course if you can't override
permission checks it makes testing annoying.

Boris Zbarsky

unread,
Oct 10, 2012, 1:18:44 AM10/10/12
to
On 10/10/12 1:13 AM, Ian Bicking wrote:
> OK – so if I understand the objection to testharness isn't anything in
> testharness.js itself, but that it's an incomplete solution as it doesn't
> define an environment?

That's _my_ primary objection, after looking at it briefly and seeing
how it works in practice.

I don't think this is Jonas' objection, however.

-Boris

Aryeh Gregor

unread,
Oct 10, 2012, 3:46:53 AM10/10/12
to Jonas Sicking, Boris Zbarsky, Ms2ger, dev-pl...@lists.mozilla.org, James Graham
On Wed, Oct 10, 2012 at 12:46 AM, Jonas Sicking <jo...@sicking.cc> wrote:
> However, for someone working against a looming deadline, the cost of
> writing sharable tests and risk missing the deadline can be much
> higher than having those tests not be run by other browser vendors.

Which is why we have rolling releases, right, so things can be pushed
off a bit if necessary to get them right? If individual projects'
managers set deadlines that don't prioritize portable testing, then of
course those projects will have trouble getting portable tests done,
but that's true for any possible goal. Many software projects don't
require good testing or QA at all, and as a result developers write
half-baked features so they can meet deadlines that don't allow enough
time to do things right. That doesn't mean we should avoid doing
things right!

Naturally, we do have to prioritize our resources, and sometimes it
will make more sense to not bother with reusable tests if they're too
much work. That doesn't mean we shouldn't aim to *ever* write
reusable tests.

> Additionally, I think that the more overhead the harness has, the less
> thorough the tests will be. I've definitely noticed that the tests
> from Opera for CORS has been a lot less thorough than the tests that I
> wrote myself. I absolutely think that there's a stronger bias against
> writing more comprehensive tests the more work is needed to write
> those tests. No amount of "rules" forcing developers to write tests
> against testharness.js will remove that bias.

I wrote a lot of tests using testharness.js that I hope you'll agree
are pretty thorough, such as Range and Selection tests:

http://hg.mozilla.org/mozilla-central/file/5cca0408a73f/dom/imptests/webapps/DOMCore/tests/approved
http://hg.mozilla.org/mozilla-central/file/5cca0408a73f/dom/imptests/editing/selecttest

On the other hand, I've seen *manual* tests (= no framework overhead
at all) submitted to the W3C by Microsoft that are so superficial as
to be worse than useless, not to mention some that were wrong. That
you write better tests than other people can't be blamed on the
framework!

In practice: I've written many tests, both complicated and simple, in
both Mochitest and testharness.js. Once you get used to them, they're
about equally easy to write, although testharness.js winds up being
slightly longer. If you're writing any sort of nontrivial test and
you're familiar with the test harness you're using, 90% of your effort
is going to be thinking about what exactly to test and how to test it,
not actually typing stuff.

That said, of course, Mozilla hackers *are* familiar with Mochitest
but not testharness.js, and adopting testharness.js in parallel with
Mochitest would require people to be familiar with both. That is
certainly a minus.

> So I'd much rather spend the additional effort to create a test
> harness which is optimized for getting comprehensive tests than take
> the additional cost to developers and the reduction in test quality
> that comes with a overly heavyweight harness.

I think making Yet Another Test Harness is not the right way to go.
testharness.js has traction in the standards world, it's well
documented, it already works in all browsers, at least Mozilla and
Opera (possibly MS/WebKit too) can already use testharness.js tests
internally, and there are people from every major browser who are at
least somewhat familiar with it. It may not be perfect, but it
doesn't make sense to reinvent the wheel still again unless there are
really compelling specific flaws that can be identified.

> This doesn't match my experience at all. My experience is that writing
> tests has a high cost and results in fairly complex test files.

It depends. If you're writing tests for an entire feature, then yes,
it's very complicated. I've done that -- Range, Selection, and
editing, for instance. But most routine bugs are simple things and
the tests are just a few lines long. Even if we only shared those
tests, it would be a step forward.

On Wed, Oct 10, 2012 at 6:41 AM, Boris Zbarsky <bzba...@mit.edu> wrote:
> On 10/10/12 12:23 AM, Ian Bicking wrote:
>>
>> Here's how I think you'd write a simple XHR test in both:
>>
>> // SimpleTest aka MochiTest
>> req = new XMLHttpRequest();
>> req.open("GET", "/example.json");
>
>
> How did example.json get there?
>
> What if you need to test CORS?
>
> With mochitest at this point you're doing some SJS work and whatnot.
>
>> // testharness
>
>
> And here you have to go and do whatever is appropriate to your server setup.
> Which is not part of testharness. Which is why the CORS tests imported from
> Opera to the W3C ended up all broken, because they did not configure the
> server correctly.

testharness.js absolutely does not support all the same features as
Mochitest. It also doesn't support taking screenshots, for instance,
which makes it useless for certain things. But we can still use it
for the things it does support, like self-contained DOM tests.
Although yes, this means using a mix of two different testing
frameworks, which is annoying, especially if you start in
testharness.js and then realize it needs to be a Mochitest. But there
are plenty of types of tests that can easily be testharness.js.

ja...@hoppipolla.co.uk

unread,
Oct 10, 2012, 4:26:03 AM10/10/12
to
On Wednesday, 10 October 2012 07:18:45 UTC+2, Boris Zbarsky wrote:
> On 10/10/12 1:13 AM, Ian Bicking wrote:
> > OK – so if I understand the objection to testharness isn't anything in
>
> > testharness.js itself, but that it's an incomplete solution as it doesn't
> > define an environment?
>
> That's _my_ primary objection, after looking at it briefly and seeing
> how it works in practice.

Do you have a concrete suggestion for how to improve this, in a way that works cross-browser? AIUI (which is not very well, so please correct me if I am wrong), the Mozilla solution involves running a custom XPCOM-based web server, and thus is not very portable.

The W3C solution is basically "there exists a server with a set of known subdomains and the ability to run PHP". When I have wanted to write tests for things that require multiple origins, I have assumed that the domain from which the tests are being served is unknown, but that there are sure to be some specific subdomains. This is far from ideal, but does allow me to make tests that are portable across our infrastructure and the W3C infrastructure. It also seems like they could, in principle, be ported to other infrastructure.
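
(For example, a test needing a second origin might look roughly like
this, treating the existence of a "www1" subdomain as an assumption
about the serving infrastructure; the resource path is hypothetical:)

var crossOrigin = location.protocol + "//www1." + location.host; // assumed subdomain
var t = async_test("resource is readable from another origin");
var req = new XMLHttpRequest();
req.open("GET", crossOrigin + "/resources/cors-allowed.txt"); // hypothetical CORS-enabled file
req.onload = t.step_func(function () {
  assert_equals(req.status, 200);
  t.done();
});
req.send();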

Henri Sivonen

unread,
Nov 7, 2012, 8:03:57 AM11/7/12
to Aryeh Gregor, Boris Zbarsky, Ms2ger, dev-pl...@lists.mozilla.org, Jonas Sicking, James Graham
On Wed, Oct 10, 2012 at 10:46 AM, Aryeh Gregor <a...@aryeh.name> wrote:
> That said, of course, Mozilla hackers *are* familiar with Mochitest
> but not testharness.js, and adopting testharness.js in parallel with
> Mochitest would require people to be familiar with both. That is
> certainly a minus.

I was told at TPAC that testharness.js has gotten/is getting a mode
where you get to make the entire file "one test" from the point of
view of testharness.js. In that case, the user experience of
testharness.js is (I'm told) isomorphic to using mochitest with
explicit finish.

This change removes my objection to using testharness.js. I'm quite
okay with a non-mochitest harness that merely requires me to spell
"SimpleTest.waitForExplicitFinish", "SimpleTest.finish", "is" and "ok"
in a different way. (Although I still think "is" and "ok" are superior
spellings.)

As for the concern related to tests that need to manage server-side
behavior, I think supporting our foo^headers^ files would go a long
way even if the .sjs/.php case remained unsolved. (For arbitrary
server-side programming, I am guessing that it would be easier to make
all parties accept a Python-based solution than to get everyone to
accept either .sjs or .php.)
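
(For reference, a ^headers^ file next to a static example.json amounts
to a short list of extra response headers; the values here are only
illustrative:)

Content-Type: application/json
Access-Control-Allow-Origin: *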

--
Henri Sivonen
hsiv...@iki.fi
http://hsivonen.iki.fi/

Aryeh Gregor

unread,
Nov 8, 2012, 7:10:39 AM11/8/12
to James Graham, Henri Sivonen, Boris Zbarsky, Ms2ger, dev-pl...@lists.mozilla.org, Jonas Sicking
On Wed, Nov 7, 2012 at 4:13 PM, James Graham <jgr...@opera.com> wrote:
> There is an experimental branch with this mode in; it isn't production
> quality yet. I am still unsure that it's a good idea; in particular I think
> it encourages people to write multiple tests on the same page in such a way
> that if one fails the whole suite stops running. This seems very unpleasant
> to deal with; remember that unlike typical Mozilla tests these will not
> necessarily all pass in their unmodified form, and that modifying imported
> tests inline creates a headache when you want to merge changes.

In Mochitest, a failed assert doesn't stop execution of the file, so
even if fails aren't marked inline, the results are still likely to be
useful. An unexpected result just prints out a TEST-UNEXPECTED-PASS
or TEST-UNEXPECTED-FAIL line to the console instead of TEST-PASS or
TEST-EXPECTED-FAIL, which is then slurped out by various line-based
parsing tools and reported as a failure. (Yes, our infrastructure is
not so sophisticated.)

In testharness.js, a failed assert causes the whole test to fail.
This is not very useful if the whole file is one giant test that tests
lots of different things. If a failed assert wasn't fatal in this
experimental new mode, on the other hand, it would still be pretty
useful.
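
(To make the difference concrete -- a sketch, with obj standing in for
whatever is under test:)

// Mochitest: a failing is() just logs a failure; the next check still runs.
is(obj.foo, 1, "first check");  // may log TEST-UNEXPECTED-FAIL...
is(obj.bar, 2, "second check"); // ...but this still executes and is reported

// testharness.js: the first failing assert throws, so the second assert
// never runs and the whole test() is reported as one failure.
test(function () {
  assert_equals(obj.foo, 1);
  assert_equals(obj.bar, 2);
}, "both checks as one test");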

James Graham

unread,
Nov 7, 2012, 9:13:50 AM11/7/12
to Henri Sivonen, Boris Zbarsky, Ms2ger, dev-pl...@lists.mozilla.org, Aryeh Gregor, Jonas Sicking
On 11/07/2012 02:03 PM, Henri Sivonen wrote:
> On Wed, Oct 10, 2012 at 10:46 AM, Aryeh Gregor <a...@aryeh.name> wrote:
>> That said, of course, Mozilla hackers *are* familiar with Mochitest
>> but not testharness.js, and adopting testharness.js in parallel with
>> Mochitest would require people to be familiar with both. That is
>> certainly a minus.
>
> I was told at TPAC that testharness.js has gotten/is getting a mode
> where you get to make the entire file "one test" from the point of
> view of testharness.js. In that case, the user experience of
> testharness.js is (I'm told) isomorphic to using mochitest with
> explicit finish.

There is an experimental branch with this mode in; it isn't production
quality yet. I am still unsure that it's a good idea; in particular I
think it encourages people to write multiple tests on the same page in
such a way that if one fails the whole suite stops running. This seems
very unpleasant to deal with; remember that unlike typical Mozilla tests
these will not necessarily all pass in their unmodified form, and that
modifying imported tests inline creates a headache when you want to
merge changes.

Nevertheless I will put the current code somewhere public and let people
experiment with the API.

Boris Zbarsky

unread,
Nov 10, 2012, 2:41:07 AM11/10/12
to James Graham, Henri Sivonen, Ms2ger, dev-pl...@lists.mozilla.org, Aryeh Gregor, Jonas Sicking
On 11/9/12 12:52 AM, James Graham wrote:
> I know Mozilla use a system where all the tests in a file should pass,
> but I don't see how that will work well when you don't control the
> tests. If you are manually editing every file when you import it, I
> imagine that updating tests will be so time consuming that it will be
> tempting not to bother. How do you plan to address this?

I believe right now we have a list of "known failures" alongside such
tests, and our own test harness knows to compare what the tests are
reporting to our list of known failures. As in, we're not using the
pass/fail state of the tests directly; we're comparing it to "should all
pass, except this whitelist of things that we know fail".

Constructing these whitelists of known failures is indeed a bit of a
PITA, but they're pretty static until we fix stuff, usually.

-Boris

Aryeh Gregor

unread,
Nov 10, 2012, 1:48:24 PM11/10/12
to James Graham, Boris Zbarsky, Henri Sivonen, Ms2ger, dev-pl...@lists.mozilla.org, Jonas Sicking
On Sat, Nov 10, 2012 at 9:41 AM, Boris Zbarsky <bzba...@mit.edu> wrote:
> I believe right now we have a list of "known failures" alongside such tests,
> and our own test harness knows to compare what the tests are reporting to
> our list of known failures. As in, we're not using the pass/fail state of
> the tests directly; we're comparing it to "should all pass, except this
> whitelist of things that we know fail".
>
> Constructing these whitelists of known failures is indeed a bit of a PITA,
> but they're pretty static until we fix stuff, usually.

Yes, exactly. And they're quite easy to construct -- there's a script
these days (parseFailures.py) that you run on the output of the
mochitests, and it creates all the directories and files for you.

The code for our testharness.js wrapping is here:

http://hg.mozilla.org/mozilla-central/file/ea5c4c1b0edf/dom/imptests

See the README. To import a new test suite, all you have to do is add
a line to a file (or a new file) specifying the location of the test
suite and the directories that are wanted, run the importTestsuite.py
script, run the test suite to get a list of known failures, and use
parseFailures.py on the result to generate appropriate JSON files in
the "failures" directory. It's only a minor hassle.

Currently we only check that no test fails that's not on the per-file
whitelist of expected fails, and in practice that works fine for us.
If we wanted to be pickier, we could list all expected results, both
pass and fail, and verify that the lists exactly match. This is
unpleasant in practice because some of the test files I wrote run tens
of thousands of tests, which leads to JSON files quite a few megabytes
in size that have to be loaded to run the tests. Since in most files
we pass all or almost all tests, storing only failures is a very
effective optimization.
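
(The per-file whitelists are small JSON files; the exact schema is
whatever parseFailures.py emits, but they amount to something like the
following, with made-up test names:)

{
  "Range.insertNode() with a detached node": true,
  "Selection.collapse() on a display:none element": true
}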

It's certainly true that if a file threw an exception at file scope,
it would make the test useless. However, if we can't change the file,
there could be all kinds of things about it that are broken anyway.
For instance, it could wrap many unrelated things in a single test(),
and then you have exactly the same problems. For the time being, any
tests we import are tests we can change -- they're all from the W3C
and most are written by Mozilla contributors. So if it doesn't play
nicely for our system, we can always change it. I expect that to
continue to be the case. If an odd file breaks and we can't change it
for whatever reason, we can live without the test coverage. It's not
a problem we have in practice.

So I still don't see the value in the test/assert model used by
testharness.js, as opposed to having everything be one big test but
asserts be non-fatal. In particular, what value does test() provide
that couldn't be provided about as well by try/catch blocks, aside
from aesthetic preferences about grouping?
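
(I.e., roughly the following, where the asserts are non-fatal and
try/catch gives the same grouping that test() would -- all names here
are illustrative:)

try {
  is(range.startContainer, expected, "startContainer");
  is(range.startOffset, 0, "startOffset");
} catch (e) {
  ok(false, "range boundary checks threw: " + e);
}

try {
  is(selection.rangeCount, 1, "rangeCount");
} catch (e) {
  ok(false, "selection checks threw: " + e);
}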

Neil

unread,
Nov 11, 2012, 9:21:17 AM11/11/12
to
Aryeh Gregor wrote:

>Currently we only check that no test fails that's not on the per-file whitelist of expected fails, and in practice that works fine for us. If we wanted to be pickier, we could list all expected results, both pass and fail, and verify that the lists exactly match. This is unpleasant in practice because some of the test files I wrote run tens of thousands of tests, which leads to JSON files quite a few megabytes in size that have to be loaded to run the tests.
>
Why not simply verify that the list of actual fails equals the list of
expected fails, and report items that are only in one of the two lists?

--
Warning: May contain traces of nuts.

Aryeh Gregor

unread,
Nov 12, 2012, 6:36:33 AM11/12/12
to Neil, dev-pl...@lists.mozilla.org
On Sun, Nov 11, 2012 at 4:21 PM, Neil <ne...@parkwaycc.co.uk> wrote:
> Why not simply verify that the list of actual fails equals the list of
> expected fails, and report items that are only in one of the two lists?

That would be a bit more robust, yes, and it should be doable without
much work. It still wouldn't detect the case where different sets of
passing tests are run, such as because a change accidentally disabled
a large chunk of tests. The system James seems to be describing would
detect that condition too. On the other hand, all our existing
mochitests don't detect it and we seem to do fine.

(There *are* cases where someone accidentally disabled a bunch of
tests and no one noticed. Some months ago I found a number of those
when I changed mochitests to fail if the file ran no tests. But that
only detects if all tests in the file are disabled, not if only a
handful are disabled. Still, it's not a common enough scenario that
it's worth spending much effort on, IMO.)

James Graham

unread,
Nov 9, 2012, 3:52:43 AM11/9/12
to Aryeh Gregor, Henri Sivonen, Boris Zbarsky, Ms2ger, dev-pl...@lists.mozilla.org, Jonas Sicking
On 11/08/2012 01:10 PM, Aryeh Gregor wrote:
> On Wed, Nov 7, 2012 at 4:13 PM, James Graham <jgr...@opera.com> wrote:
>> There is an experimental branch with this mode in; it isn't production
>> quality yet. I am still unsure that it's a good idea; in particular I think
>> it encourages people to write multiple tests on the same page in such a way
>> that if one fails the whole suite stops running. This seems very unpleasant
>> to deal with; remember that unlike typical Mozilla tests these will not
>> necessarily all pass in their unmodified form, and that modifying imported
>> tests inline creates a headache when you want to merge changes.
>
> In Mochitest, a failed assert doesn't stop execution of the file, so
> even if fails aren't marked inline, the results are still likely to be
> useful. An unexpected result just prints out a TEST-UNEXPECTED-PASS
> or TEST-UNEXPECTED-FAIL line to the console instead of TEST-PASS or
> TEST-EXPECTED-FAIL, which is then slurped out by various line-based
> parsing tools and reported as a failure. (Yes, our infrastructure is
> not so sophisticated.)
>
> In testharness.js, a failed assert causes the whole test to fail.
> This is not very useful if the whole file is one giant test that tests
> lots of different things. If a failed assert wasn't fatal in this
> experimental new mode, on the other hand, it would still be pretty
> useful.
>

I think that would be possible to arrange, but you couldn't do that and
retain an invariant on the number of tests run in the face of arbitrary
(i.e. not asserts) lines of code failing. This would make it very hard
to compare results between runs. If you allowed the file as a whole to
have a status you could distinguish between "this file ran without
unexpected exceptions" and "this file had an unexpected exception", but
it still seems like it would be annoying if, when implementing an API,
tests for an unimplemented method caused a single "fail" result rather
than detailed information about the parts that passed.

Jeff Walden

unread,
Nov 12, 2012, 9:53:57 PM11/12/12
to
I read newsgroups too little (or perhaps just enough, or too much, depending), sorry for the kind-of-late response here...

On 10/10/2012 01:26 AM, ja...@hoppipolla.co.uk wrote:
> Do you have a concrete suggestion for how to improve this, in a way that works cross-browser?

Well, the cross-domain part should be solvable using proxy autoconfig. The exact setup of that, and the primal server that would appear to exist for all ports/domains/etc. would have to be something each browser/engine configured when it ran tests. That seems unavoidable. I have suggested using PAC like this to WebKit people in the past; to the best of my knowledge they haven't actually done anything with the idea, just kept using localhost/127.0.0.1 to serve HTTP(S) tests. This is nowhere near enough to do interesting things like testing IDN, or subdomains (I'd be surprised if they don't have anything for testing this now, I haven't tried keeping up with what they do, but they might have something), or others. But it gets you a lot even still.
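
(A minimal sketch of such a PAC file -- the port is an assumption, and
each engine would point it at its own test server:)

function FindProxyForURL(url, host) {
  // Send every request, whatever the hostname, to the local test
  // server, so arbitrary domains and subdomains "exist" without
  // touching DNS.
  return "PROXY 127.0.0.1:8888";
}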

> AIUI (which is not very well, so please correct me if I am wrong), the Mozilla solution involves running a custom XPCOM-based web server, and thus is not very portable.

That's reasonably accurate, for some definitions of "portable". :-)

At the time the web server was introduced I don't believe we had Python as a build requirement, so we couldn't have used some Python-based server (the option most likely to be somehow portable across browsers/engines). That probably could be addressed. It'd require rewriting away from ^headers^ files, which do nothing more than let you set custom headers/status when serving a static file. This probably wouldn't be too hard to do. It'd also require rewriting a bunch of SJS files (effectively CGI scripts). That'd be much harder. I believe some b2g/mobile testing has found the SJS thing to reduce the flexibility of how they run tests, but I don't know the details.

Such a rewrite might nonetheless be worth it, especially if tests were more shareable as a result. I know I've written tests and run them against other browsers by spinning up a server, manually tweaking the PAC URL for some other browser, then running the tests there -- and gotten value out of doing so, as a poor-man's reference implementation.

Jeff

Ted Mielczarek

unread,
Nov 13, 2012, 8:36:26 AM11/13/12
to Jeff Walden, jma...@mozilla.com, dev-pl...@lists.mozilla.org
On 11/12/2012 9:53 PM, Jeff Walden wrote:
> At the time the web server was introduced I don't believe we had
> Python as a build requirement, so we couldn't have used some
> Python-based server (the option most likely to be somehow portable
> across browsers/engines). That probably could be addressed. It'd
> require rewriting away from ^headers^ files, which do nothing more
> than let you set custom headers/status when serving a static file.
> This probably wouldn't be too hard to do. It'd also require rewriting
> a bunch of SJS files (effectively CGI scripts). That'd be much harder.
> I believe some b2g/mobile testing has found the SJS thing to reduce
> the flexibility of how they run tests, but I don't know the details.

FYI there has been some investigation into this. The primary motivation
is that using httpd.js for mobile testing is a pain, as it requires a
host binary xpcshell, and since that's not a product of the build you
have to source it elsewhere. It also has the nice side benefit that your
test webserver is no longer tied up with the product you're testing, so
bugs in xpcshell don't have to affect serving web content for tests.

-Ted
