automated test policy for mail/, mailnews/, editor/, directory/ in comm-central

Dan Mosedale

unread,

Dec 18, 2009, 9:44:45 PM12/18/09

to dev-apps-t...@lists.mozilla.org, dev-pl...@lists.mozilla.org, dev-apps-...@lists.mozilla.org

[Note that replies are redirected to dev-apps-thunderbird]

As mentioned in previous post-Tb3 planning discussion centered around
<https://wiki.mozilla.org/User:Dmose/Tb_Post-3.0_Scratchpad#Testing>, I
believe Thunderbird needs to become significantly more agile by bringing
our testing polices in line with those of the browser/ and toolkit/
directory in mozilla-central. Which is to say that I'd like to start
requiring that all patches to mail/, mailnews/, directory/, and editor/
include automated tests starting Monday, January 4, 2010, except in
cases where it's specifically impractical or not sensible.

I recognize that this will be a different style of development than some
folks are used to and that the change may feel little a bumpy, so I want
to be sure we make this policy change in a humane way. In particular,
I'd like to us to be sensitive to the fact that there are a lot of
patches in progress in Bugzilla today, and that we will suddenly be
asking more of our contributors than we were at the time they started
working on those patches.

For those who haven't already written or run tests using one of our
automated testing frameworks,
<https://developer.mozilla.org/en/Thunderbird/Thunderbird_Automated_Testing>
is a good starting point.

If you have thoughts, issues, or concerns, feel free to reply to this
message either in dev-apps-thunderbird or to me by private email.

Dan

Ben Bucksch

unread,

Dec 19, 2009, 8:54:54 AM12/19/09

to

>
> I recognize that this will be a different style of development than
> some folks are used to and that the change may feel little a bumpy, so
> I want to be sure we make this policy change in a humane way. In
> particular, I'd like to us to be sensitive to the fact that there are
> a lot of patches in progress in Bugzilla today, and that we will
> suddenly be asking more of our contributors than we were at the time
> they started working on those patches.

Can we first improve the test framework significantly, please?

I've followed your wishes of creating tests, spent the last 2 weeks with
the testsuite, but it's been most painful.

I'll try to formulate my pain objectively. Main pain points:

setTimeout
All I'll say is: do_timeout(500, "endTest()");
This is guaranteed undeterministic, and indeed the main problems were my
tests (adapted from existing tests) just running astray or just stopping
and I don't know where, why, or how to fix it. In my experience, almost
every instance of setTimeout() is just a hidden bug that that's just
papered over with setTimeout "because that's the only way it worked",
but nobody cared to find out the real reason and fix properly.

No error output
Sometimes, when I make a typo or error in tests or testsuite, it goes
astray with no output whatsoever (bug 535310). This makes coding a
severe pain, because I don't know what's wrong.

Copypasteritis
Also, there's a *ton* of code cuplication, like
prefSvc.setBoolPref("mail.biff.show_tray_icon", false); (and many others)
(why isn't that in head_?) and
var thread = gThreadManager.currentThread;
while (thread.hasPendingEvents())
thread.processNextEvent(true);
(why isn't that in tail_?).

No documentation
There are currently 2-3 small documents, which don't even cover the API,
much less how to write IMAP tests (without copy&paste).
I could to create tests for POP3 and SMTP login, using copy&paste. But
today, I tried to simply test login (connection) to an IMAP server, but
couldn't make it work, the incomingServer.performExpand(null) continued
further asyncronously, I don't know what to wait for and how to watch
success. I gave up.

All over the place
Many tests have globals which cross files. That makes it hard to see
what's happening. I agree that tests need to be short and comfortable to
write, so it's good to have lots of common code like createIMAPServer()
or createLocalFolders() and friends, but these globals are not
necessary, that's just not sane coding.
Similarly, it's hard to find where stuff comes from, when it's comming
from all of head_*, mailnews/test/fakeserver/, mailnews/test/resources/,
mailnews/base/ and whatnot.

---

Sorry for this form, I wanted to write my experiences and main pain
points in a more coherent form later, with possibly some ideas for
improvement, but I'll be on vacation now, so I don't have much time to
contribute in discussion or code and you're pressing the issue now... :-(

Robert Kaiser

unread,

Dec 19, 2009, 11:32:45 AM12/19/09

to

Ben Bucksch wrote:
> setTimeout

The whole step-by-step execution thing in our test suites leaves a lot
to be desired: stepping (or letting the non-test environment do
something) with setTimeout leads to intermittent orange in most cases,
and IMHO using setTimeout in a test should make it fail right away - but
doing event-triggered stepping needs a lot of overhead, makes it easy to
not clean up correctly and makes us time out and fail the whole suite
when one test fails, which is suboptimal as well.

I'd very much appreciate a simpler model there (and I won't go into
having too many different test suites for the moment, as that affects
Firefox and SeaMonkey more than Thunderbird).

Robert Kaiser

Andrew Sutherland

unread,

Dec 20, 2009, 4:40:09 AM12/20/09

to

On 12/19/2009 05:54 AM, Ben Bucksch wrote:
> Can we first improve the test framework significantly, please?

I have been trying to avoid code duplication in the units tests for code
I develop / work on and have done a lot of framework building that
should hopefully be useful in many cases. My testing framework stack
looks like this:

* logHelper.js - Logging abstraction that allows for 'rich' meta-data to
be logged. Output is flattened for text output but also supports
JSON-serialized output to disk and network. The current expected rich
target is my 'logsploder' XULRunner application. One notable thing it
does is cause script warnings/errors reported by the error console to
become unit test failures. This covers a lot of the 'silent error'
cases you note in your message.

The logsploder repo can be found here:
http://hg.mozilla.org/users/bugmail_asutherland.org/logsploder/

Some logsploder screenshots can be found here:
http://clicky.visophyte.org/examples/logsploder/20091220/

You can run logsploder by doing "xulrunner application.ini". It will
create a file in your temp dir that makes itself known to loghelper.
Then running any unit tests that use logHelper will magically connect to
logsploder. For example, the gloda unit tests. ('cd
thunderbird-objdir/mailnews/db/gloda/test; make xpcshell-tests')

I am not going to claim excellent UI design, but it works surprisingly
well for what it is and how much effort it has received.

* (currently lives in gloda resources space, I have a patch to move it
out) folderEventLogHelper.js - Subscribes as an nsIFolderListener and
nsIMsgFolderListener and logs these events (when logHelper has a rich
consuder) with rich data to logHelper so that the log output includes
useful information about what events are happening without the unit
tests needing to provide this information directly. Incredibly useful
in helping validation assumptions about what should be happening in a
unit test.

* asyncTestUtils.js - An asynchronous test runner framework that
maintains an async call stack and supports use of generators in tests
and testing helpers. mozmill's strategy of spinning a nested event loop
rather than yielding control back up to the containing event loop (which
is how asyncTestUtils works) may be a better strategy for testing
comprehension.

* messageGenerator.js - Synthetic message creation support. It's not a
full-fledged message generator suitable for replacing Thunderbird's
message generation (it grows as testing needs demand), but it can
express multi-part MIME stuff. It has support for building threads of
messages and other scenarios. This was brought into existence because
simply copying in pre-mastered/existing .eml files from disk is
insufficient for gloda testing; messages need to have distinct
message-id's or things start getting unrealistic.

* messageModifier.js - Represents collections of synthetic messages
created by messageGenerator and supports performing attribute
modification on the messages. More significant operations (moving,
copying) are performed by messageInjection.js and tend to operate in
terms of the SyntheticMessageSet collection object.

* messageInjection.js - Abstracts away dealing with local message
injection (via addMessage) and IMAP message injection (using the IMAP
fake server). The POP code path is sortof there but not actually
used/tested, so it likely does not work. POP may also have conceptual
difficulties since both local and IMAP injection support targeting all
folders, not just the Inbox. This file includes operations to
copy/move/trash/delete messages.

These all can be found in:
http://mxr.mozilla.org/comm-central/source/mailnews/test/resources/
rkent's POP3pump abstraction can also notably be found there.

While said files may not provide all you need, they should hopefully at
least help you avoid needing to write code that does anything they do.
I do imagine protocol testing might need to operate a level lower than
the message*.js framework stuff, though.

Andrew

Ben Bucksch

unread,

Dec 21, 2009, 9:57:45 AM12/21/09

to

On 20.12.2009 10:40, Andrew Sutherland wrote:
> While said files may not provide all you need, they should hopefully
> at least help you

A lot of this sounds like great work. Thanks!

Can we integrate that in the testsuite, esp. the log and async stuff?
That was my main point: I think the testsuite has to be improved before
seeing more widespread use. (Not catching error console errors in
current testsuite is just *headbang*.)

Ben

Mark Banner

unread,

Dec 22, 2009, 6:20:59 AM12/22/09

to

Just to clarify: asyncTestUtils.js is already in the test suite.

Standard8

Mark Banner

unread,

Dec 22, 2009, 6:38:56 AM12/22/09

to

On 19/12/2009 13:54, Ben Bucksch wrote:

>>
>> I recognize that this will be a different style of development than
>> some folks are used to and that the change may feel little a bumpy, so
>> I want to be sure we make this policy change in a humane way. In
>> particular, I'd like to us to be sensitive to the fact that there are
>> a lot of patches in progress in Bugzilla today, and that we will
>> suddenly be asking more of our contributors than we were at the time
>> they started working on those patches.
>
> Can we first improve the test framework significantly, please?
>
> I've followed your wishes of creating tests, spent the last 2 weeks with
> the testsuite, but it's been most painful.

Agreed this is still painful, however, I think that unless we require
tests we're not going to find the pain points and we're not going to
have any motivation to fix them.

> I'll try to formulate my pain objectively. Main pain points:

Excellent :-)

>
> setTimeout
> All I'll say is: do_timeout(500, "endTest()");
> This is guaranteed undeterministic, and indeed the main problems were my
> tests (adapted from existing tests) just running astray or just stopping
> and I don't know where, why, or how to fix it. In my experience, almost
> every instance of setTimeout() is just a hidden bug that that's just
> papered over with setTimeout "because that's the only way it worked",
> but nobody cared to find out the real reason and fix properly.

Agreed, we should be avoiding setTimeout wherever possible. I think that
some of these were mainly because we didn't have async capabilities in
the test suite, now we have it, we should find some time to go back and
revisit some of the earlier tests.

I'm actually wondering if it is worth putting together a devmo page
similar to "do's and don'ts" or a coding practice for unit tests (not
just don't use setTimeout, but use tool foo instead.

> No error output
> Sometimes, when I make a typo or error in tests or testsuite, it goes
> astray with no output whatsoever (bug 535310). This makes coding a
> severe pain, because I don't know what's wrong.

That's going to be a bit hard to track down. We've used long timeouts
(!) in the past to try and ensure this doesn't happen, but I think it is
a bit hard given the nature of the environment.

One problem with errors is that I've seen sometimes that mailnews
"swallows" errors because they have been reported in listeners and
aren't actually relevant to the bit of code calling the listeners. I
think for them, we have to hunt them down on a case-by-case basis and be
prepared to add in logging to the console (for example) so that
extension authors (and tests) can track issues better.

> Copypasteritis
> Also, there's a *ton* of code cuplication, like
> prefSvc.setBoolPref("mail.biff.show_tray_icon", false); (and many others)
> (why isn't that in head_?) and

Unfortunately it happens. Yes, we need to revisit some of these and make
these into head_ or utility files.

> var thread = gThreadManager.currentThread;
> while (thread.hasPendingEvents())
> thread.processNextEvent(true);
> (why isn't that in tail_?).

Generally I think that should be a utility function not necessarily in
tail - we need to call it at times that aren't just the end of the test.

> No documentation
> There are currently 2-3 small documents, which don't even cover the API,
> much less how to write IMAP tests (without copy&paste).

Which API are you talking about here?

> I could to create tests for POP3 and SMTP login, using copy&paste. But
> today, I tried to simply test login (connection) to an IMAP server, but
> couldn't make it work, the incomingServer.performExpand(null) continued
> further asyncronously, I don't know what to wait for and how to watch
> success. I gave up.

I think that's the point to ask and/or file a bug ;-) Documentation is
something we need to get better at doing in an all-round sense, and I
think that we should start requiring dev documentation on test suite
changes and new additions.

> All over the place
> Many tests have globals which cross files. That makes it hard to see
> what's happening. I agree that tests need to be short and comfortable to
> write, so it's good to have lots of common code like createIMAPServer()
> or createLocalFolders() and friends, but these globals are not
> necessary, that's just not sane coding.

This sounds like we just need a bug filing.

> Similarly, it's hard to find where stuff comes from, when it's comming
> from all of head_*, mailnews/test/fakeserver/, mailnews/test/resources/,
> mailnews/base/ and whatnot.

Do you mean globals or functions? Generally functions relevant to their
area are kept in head files in mailnews/imap or mailnews/local for
instance, whereas the fakeserver stuff should be fairly obvious, and
everything else should be in mailnews/test/resources.

> Sorry for this form, I wanted to write my experiences and main pain
> points in a more coherent form later, with possibly some ideas for

> improvement...

It is certainly useful to get experiences of someone who hasn't written
any tests before. I think getting some of these into bugs would be the
best option and then we try and find some time in the new year to start
improving them.

Standard8

Ben Bucksch

unread,

Dec 22, 2009, 8:24:52 AM12/22/09

to

On 22.12.2009 12:38, Mark Banner wrote:
> Agreed this is still painful, however, I think that unless we require
> tests we're not going to find the pain points

OK, I tried you use it, and made a list of concrete pain points. I think
these all need to be fixed properly *before* tests can be required for
everybody. Otherwise, people will suffer the same pain (with error
output etc.), and learn the wrong test coding practices (like, thinking
that setTimeout is fine).

> Agreed, we should be avoiding setTimeout wherever possible. I think
> that some of these were mainly because we didn't have async
> capabilities in the test suite, now we have it, we should find some
> time to go back and revisit some of the earlier tests.

Good.

> I'm actually wondering if it is worth putting together a devmo page
> similar to "do's and don'ts" or a coding practice for unit tests (not
> just don't use setTimeout, but use tool foo instead.

Yes, please! Good idea.

>> No error output
>> Sometimes, when I make a typo or error in tests or testsuite, it goes
>> astray with no output whatsoever (bug 535310). This makes coding a
>> severe pain, because I don't know what's wrong.
>
> That's going to be a bit hard to track down. We've used long timeouts
> (!) in the past

Yes, I guess so. This is an absolute show-stopper for the testsuite, though.

> One problem with errors is that I've seen sometimes that mailnews
> "swallows" errors because they have been reported in listeners and
> aren't actually relevant to the bit of code calling the listeners.

That's then either a structural problem in the API (which should be
fixed, because it will hurt other API users, too), or we just need to
have utility functions (e.h. in head_) for the tests which hook up to
the listeners which report errors and catch them.

>> while (thread.hasPendingEvents())

> Generally I think that should be a utility function not necessarily in
> tail - we need to call it at times that aren't just the end of the test.

Yes, both utility function and calling it in tail_ .
Note that there were just the most obvious examples. There's a lot more
which is not as trivial as moving it to head_ or tail_, but should still
be factored out into functions.

>> No documentation
>> There are currently 2-3 small documents, which don't even cover the API,
>> much less how to write IMAP tests (without copy&paste).
>
> Which API are you talking about here?

*snicker* "Which API?" (SCNR)

Any functions that test writers might use, either explicitly by calling
it, or implicitly, because it's invoked by head_/tail_.

>> Similarly, it's hard to find where stuff comes from, when it's comming
>> from all of head_*, mailnews/test/fakeserver/, mailnews/test/resources/,
>> mailnews/base/ and whatnot.
>
> Do you mean globals or functions?

Both

> It is certainly useful to get experiences of someone who hasn't
> written any tests before. I think getting some of these into bugs
> would be the best option and then we try and find some time in the new
> year to start improving them.

OK, thanks. My main point was, though, that this is so painful that this
has to be solved *before* we can start requiring tests.

Ben