Getting Unit Tests Green


Jonathan Pryor

May 6, 2009, 12:36:39 AM5/6/09
to DbLinq
I think we have a fundamental problem with our unit tests: most of them don't have 100% pass rates.  This is a problem because you can introduce errors without realizing it, because the NUnit tree doesn't look significantly different (lots of green and red before, vs. lots of green and red later).

For example, when I wrote the unit test page (http://groups.google.com/group/dblinq/web/unit-tests), SQL Server had 425 tests run with 70 failures.  As I write this, 432 tests are run (yay) with 190 failures (boo!) -- more than double the error count compared to March 23.

I don't know what caused the increased failures.  Currently, I don't care.

What I do currently care about is preventing such regressions in the future, and the way to do that is by getting our unit tests 100% green (so that regressions and errors are actually visible, not hidden in a sea of existing errors).

I can think of two[0] solutions to this, and I welcome any additional suggestions.

1. Use #if's in the test code to remove failing tests.

2. Use [Category] attributes on the tests to declare which tests shouldn't be executed.  The Categories tab within NUnit allows you to specify which categories are executed.

The problem with (1) is a gigantic increase in code complexity, as each vendor will have a different set of failing unit tests, so we'd potentially need checks for every vendor on each method.  This is incredibly bad.

(2) would at least avoid the "line noise" implied by #if, though it could also get very "busy" (with upwards of 7 [Category] attributes on a given method).

Thoughts?  Alternatives?

Thanks,
- Jon

[0] OK, a third solution would be to actually fix all the errors so that everything is green without using (1) or (2) above, but I don't think that this is practical in the short term.

Giacomo Tesio

May 6, 2009, 2:44:36 AM5/6/09
to dbl...@googlegroups.com
[3] Write a "simple" tool that monitors checkins and runs all tests on each one. When it detects a regression, it would mail a notification to all the developers.

BTW, I have to note that:
1) Some tests are badly coded: they do not really test what they should (noticed on some of the 101 modified tests), passing even when they should not (sometimes using workarounds against a missing DbLinq feature... but then why unit test at all?).
2) Some tests don't pass simply because the underlying database contains a different data set (and data structure).


Even if we cannot fix all of DbLinq's errors, we SHOULD fix the unit tests' own errors.
Those errors are the real noise we should fix.

Moreover, we really need a distributable testing infrastructure, urgently.
I simply cannot run all the tests, since I don't have all the databases.


Giacomo

Giacomo Tesio

May 6, 2009, 2:51:12 AM5/6/09
to dbl...@googlegroups.com
Just another thought:
We could also use [Category] to mark "ShouldPass" tests.
Then all regressions in the ShouldPass category would be visible.

BTW, since tests are shared among vendors, some tests "ShouldPass" for one vendor but not yet for another...

How to handle this?


Giacomo

Jonathan Pryor

May 6, 2009, 8:06:53 AM5/6/09
to dbl...@googlegroups.com
First, for most databases more tests pass than fail, so using [Category] to mark tests that should pass would result in more methods being attributed than marking those that fail (which is why I suggested marking failing tests).

As for tests failing on some vendors but not others, we just need to use multiple categories with separate strings:
[Test]
[Category("NotSqlite")]
[Category("NotSqlServer")]
public void TestSomething() {...}
- Jon

Pascal Craponne

May 6, 2009, 8:15:55 AM5/6/09
to dbl...@googlegroups.com
Regarding database distribution, I just had a chat with Stefan Klinger, the guy who is working on a VM with all databases. He should be sending a progress status to the group soon.

Jonathan Pryor

Jun 1, 2009, 9:05:04 PM6/1/09
to dbl...@googlegroups.com
After trying to actually do this, [Category("...")] is a non-starter.

The requirements for what I want are simple:
  1. Have a single .nunit file for all "interesting" tests.  For example, svn-trunk has DbLinq-All.nunit which contains all unit tests, while DbLinq-Sqlite-Sqlserver.nunit contains only SQLite, Microsoft SQL Server, and "no database" (*_ndb*) tests.  The latter exists because I'm currently only interested in SQLite and SQL Server support (as I don't have any other databases available), so testing anything else is pointless.  I want a single .nunit file so that I can hit Run and run all tests in one window instead of needing to manage several separate NUnit runner windows (and possibly forget to run some subset of tests).
  2. All tests should be built, even the failing ones.  This is so that we can easily run previously failing tests to see if any are now working.
  3. Known failing tests should not be run by default.
(Why do I care about a green tree?  See below [0].)
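For reference, an NUnit 2.x project file such as DbLinq-Sqlite-Sqlserver.nunit is just a small XML document listing the assemblies to load; a sketch with illustrative assembly paths:

```xml
<NUnitProject>
  <Settings activeconfig="Default" />
  <Config name="Default" binpathtype="Auto">
    <!-- one <assembly> element per test assembly to include in the run -->
    <assembly path="DbLinq.Sqlite_test.dll" />
    <assembly path="DbLinq.SqlServer_test.dll" />
  </Config>
</NUnitProject>
```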

So, solutions:

1. #if the tests, as originally mentioned.  This fails requirement (2).

2. Use [Category("...")], as originally mentioned.  This fails (1), as all assemblies within a .nunit file share the same set of categories, meaning that if I have one set of tests that fail under SQLite, and a different (non-overlapping) set of tests that fail under SQL Server, neither set will be executed if both categories are specified.  This is obviously not good.

3. What I'm currently thinking is to use [Explicit] in combination with #if, e.g.:
#if SQLSERVER /* || other #defines */
    [Explicit]
#endif
    [Test]
    public void TestName() {...}

This will fulfill my 3 requirements.  Alternatively we could use [Ignore] instead of [Explicit], though [Ignore] will make the test runner turn yellow instead of green in an "all pass" scenario.

- Jon

[0] The problem I've been facing is a simple one: I'll see that there's e.g. 190 failing tests.  I'll fix one, and find that I now have 195 failing tests (i.e. I fixed one test and regressed 6 others).

Now, which tests regressed?  :-)

My current solution, which is a PITA, is a nasty combination of grep+sed+diff on the "before svn changes" and "after svn changes" TestResult.xml files, e.g.:
grep '<test-case' TestResult.previous.xml | sed 's/time=.*//g' > p
grep '<test-case' TestResult.current.xml | sed 's/time=.*//g' > c
diff -u p c
So I'm currently finding it of utmost importance to get a green build, simply so I can more easily see which tests I'm regressing while trying to fix things.
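As an aside, the same before/after comparison could be scripted directly against NUnit 2.x TestResult.xml files (which record a name and success attribute on each test-case element); a minimal sketch, with hypothetical function names:

```python
import xml.etree.ElementTree as ET

def failing_tests(path):
    """Return the set of test-case names recorded with success="False"."""
    tree = ET.parse(path)
    return {tc.get("name")
            for tc in tree.iter("test-case")
            if tc.get("success") == "False"}

def regressions(previous, current):
    """Tests failing now that passed (or did not exist) before."""
    return sorted(failing_tests(current) - failing_tests(previous))

# Usage: print each newly failing test
# for name in regressions("TestResult.previous.xml", "TestResult.current.xml"):
#     print(name)
```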

Giacomo Tesio

Jun 2, 2009, 3:44:04 AM6/2/09
to dbl...@googlegroups.com
I'm not so sure that excluding tests would be a good idea, since we could lose track of the failing ones...

Can you send me the PITA :-D

I think that a script checking for regressions would be a good thing...


Giacomo

Jonathan Pryor

Jun 2, 2009, 1:53:16 PM6/2/09
to dbl...@googlegroups.com
I'm not sure what you mean by "lose control about failing [tests]."  [Explicit] tests can still be run from within the NUnit GUI, they just need to be explicitly selected first.

Though the "explicit selection" means that you can't easily run all tests to see if any of the "known bad" tests are now working, which is annoying.  However, you can instead show the checkboxes (View -> Tree -> Show CheckBoxes), select the Explicit tests, and then Run will run all tests including the explicitly checked ones.

(For the really ideal case, there would be a way to select all checkboxes at once, but this should work for now...)

This is certainly better than using [Ignore], which provides no mechanism to actually run the test without editing source code (that I could find, anyway).

Of course, this still has the downside of "uglying up" the test source code with lots of #if code.  An alternative would be to "centralize" via attributes:
    [AttributeUsage(AttributeTargets.Method, AllowMultiple=true)]
    class SqlServerKnownFailureAttribute :
#if MSSQL
        NUnit.Framework.ExplicitAttribute
#else
        Attribute
#endif
    {
    }

    // and similar for SqliteKnownFailure, etc.

This would allow tests such as:
[SqlServerKnownFailure]
[Test]
public void FirstInternal02() {...}

(The #if within the SqlServerKnownFailureAttribute declaration would prevent it from being an ExplicitAttribute within e.g. DbLinq.Sqlite_test.dll, thus allowing the test to be run with other vendors.)

- Jon

Jonathan Pryor

Jun 2, 2009, 3:13:11 PM6/2/09
to dbl...@googlegroups.com
OK, in an ideal world my SqlServerKnownFailureAttribute would work.  Alas, this isn't a perfect world.  From NUnit's NUnitCore/core/NUnitFramework.cs:
public class NUnitFramework
{
  public const string ExplicitAttribute = "NUnit.Framework.ExplicitAttribute";
  // ...
  public static void ApplyCommonAttribute(Attribute[] attributes, Test test)
  {
    //...
    foreach (Attribute attribute in attributes)
    {
      // ...
      string attributeName = attribute.GetType().FullName;
      switch (attributeName)
      {
        // ...
        case ExplicitAttribute: ...
      }
    }
  }
}
In short, instead of using Type.IsAssignableFrom() to see if a type ISA ExplicitAttribute, it checks the type name directly, which (of course) fails if you subclass the attribute.

Thus my nifty subclass cleanup effort doesn't actually work. :-(
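The pitfall is general: comparing type names as strings only matches the exact class, whereas an ISA check (what Type.IsAssignableFrom would have given) honours inheritance. A small language-neutral sketch (class names are stand-ins for the NUnit types, not real NUnit code):

```python
class ExplicitAttribute:
    """Stand-in for NUnit.Framework.ExplicitAttribute."""
    pass

class SqlServerKnownFailureAttribute(ExplicitAttribute):
    """Stand-in for the proposed subclass."""
    pass

attr = SqlServerKnownFailureAttribute()

# What NUnit does: compare the type name -- misses subclasses.
name_match = type(attr).__name__ == "ExplicitAttribute"

# What an ISA check would do: also matches subclasses.
isa_match = isinstance(attr, ExplicitAttribute)
```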

This of course means that the only way to mark certain tests as [Explicit] is to use #if within the tests themselves.  (Or bundle a forked copy of NUnit with DbLinq...which really isn't a great idea.)

- Jon

Giacomo Tesio

Jun 2, 2009, 6:17:57 PM6/2/09
to dbl...@googlegroups.com
I mean that failing tests could be more useful than passing ones.

Actually you do not want green tests, you want a way to track regressions easily.

Green tests are one way, but it has many issues (those you mentioned, plus assigning and maintaining the attributes, etc.).
We should fix the tests, but since that's difficult in the short term, we are searching for a workaround.

I think the workaround should not be to skip failing tests, but to use nunit-console plus some script to look for regressions.
I really think that such a script could be written in far less time than the #if / [Explicit] setup.


Giacomo

Giacomo Tesio

Jun 4, 2009, 3:24:49 AM6/4/09
to dbl...@googlegroups.com
Please take a look at the attached patch (against r1104).

It enables ALL tests to run by default when compiled in DEBUG, while keeping the behaviour you are introducing (the #if / [Explicit]) for Release builds.

This would be logically correct (since what we actually need is to avoid regressions in Release) AND comfortable during development.

We could keep just one NUnit instance open, checking for regressions when the number of failures increases AND before committing, by simply selecting NUnit-GUI -> Project -> Configuration -> Release.


I find this really useful, since it would help to identify regressions while also allowing faster identification of already-fixed tests, so the #if statements can be corrected.


Giacomo
debug-release-unittests.patch

Giacomo Tesio

Jun 4, 2009, 12:56:33 PM6/4/09
to dbl...@googlegroups.com
Committed.

Now RELEASE unit tests track regressions, while DEBUG builds run them all.

Remember to remove the DEBUG define from the Release configuration of each test project you touch.


Giacomo
PS: Someone should take a look at EntitySetExpression usage: search for "//if (!isEntitySet)" in ExpressionDispatcher.Analyzer.cs, uncomment it, and make it work!
I've started this work, but it's quite long and I have no time now...

Giacomo Tesio

Jun 4, 2009, 12:59:03 PM6/4/09
to dbl...@googlegroups.com
Another thing: try running the tests on SQL Server (Release configuration).

I fear that we still have different data (even in Northwind!!!), since Jon marked as skipped some tests that pass for me!


Giacomo

Jonathan Pryor

Jun 4, 2009, 3:44:19 PM6/4/09
to dbl...@googlegroups.com
You are correct -- our Northwinds were different.

My Northwind lost most of its Customer entries (no idea how): of the original 92 in the Northwind data set, I had only 7, and I had no Customers from London (causing many tests to fail because of it).

I've reloaded my Northwind database.  Hopefully I'll manage to not screw it up this time...

- Jon

Jonathan Pryor

Jun 15, 2009, 1:12:18 AM6/15/09
to dbl...@googlegroups.com
To bring this thread to a (hopeful) close, the SQLite, SQL Server, and Linq to SQL tests are all "green" now.  (Technically they're yellow, but at least they're not red.)

I would appreciate it immensely if we could keep it this way. :-)

Alas, keeping them out of the red effectively requires that all tests against all databases be run...which is currently impossible (or impractical, at least until we get that fabled VMware image with all databases installed...).

So, as an alternative, I would suggest that when adding tests, please make sure that the added test passes against at least two different databases.  This should be trivial, as SQLite runs everywhere, and the wiki has detailed instructions for configuring SQL Server and other databases [0].

If you know that your test will not pass, please mark it as [Explicit], using the following template:
#if !DEBUG /* && (SQLITE || MSSQL || ... other DB #defines if necessary) */
	[Explicit]
#endif
	[Test]
	public void TestName() {...}
This will allow Release builds of the tests to skip your failing test, thus keeping the tests green (or yellow, as the case may be).

Thanks,
- Jon

[0] http://groups.google.com/group/dblinq/web/unit-tests
