Transactional testsuite

Marc Remolt

Aug 5, 2008, 2:53:03 PM

to django-d...@googlegroups.com

Hi,

One of my most frustrating Django experiences has always been running
the testsuite. At work we test quite extensively (test first), so a
medium-sized project can have quite a lot of tests.
I always looked over at my Rails-using colleagues, as their tests
ran much faster (note: Ruby and faster!). My testsuite wasn't even
that large, and it still ran for several minutes. Switching from
MySQL to PostgreSQL cut the test time in half, but even that was over
one minute. So I dug into Rails and found that they use the following
approach for tests:
- test database and the tables are created
- changes are committed; so far no difference from Django
- the fixtures are loaded
- commit

Now the interesting part:
- a transaction is started
- one test runs
- the transaction is rolled back
- next transaction, test and rollback ...

That's exactly where they gain that insane amount of performance.
Instead of loading fixtures - commit, running the test - commit,
truncating all tables - commit, and loading fixtures again - commit,
each test runs inside a single transaction.
After that I hacked around in django.test to see how hard it would be to
implement something like that. As it turned out, two small modifications
to TestCase and one to django.core.management.commands.loaddata (to
deactivate the transaction handling there) were enough.
Now the following sequence happens:
- test database and the tables are created, commit
- transaction is started
- fixtures for current testcase are loaded
- test runs
- a rollback is performed
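
A minimal sketch of that sequence, using only Python's standard library
(sqlite3 and unittest) rather than Django. The table, the fixture rows and
the class names are made up for illustration; this is not the actual patch.

```python
import sqlite3
import unittest

# Hypothetical stand-in for the test database: created once, schema committed.
# isolation_level=None means autocommit, so BEGIN/ROLLBACK are issued by hand.
CONN = sqlite3.connect(":memory:", isolation_level=None)
CONN.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, headline TEXT)")


def count_articles():
    return CONN.execute("SELECT COUNT(*) FROM article").fetchone()[0]


class RollbackTestCase(unittest.TestCase):
    """Hypothetical base class: one transaction per test, rolled back afterwards."""

    fixtures = [("Poker has no place on ESPN",), ("Time to reform copyright",)]

    def setUp(self):
        CONN.execute("BEGIN")  # transaction is started
        # fixtures are loaded inside the transaction and never committed
        CONN.executemany("INSERT INTO article (headline) VALUES (?)", self.fixtures)

    def tearDown(self):
        CONN.execute("ROLLBACK")  # undoes the fixtures and everything the test wrote


class ArticleTests(RollbackTestCase):
    def test_delete(self):
        CONN.execute("DELETE FROM article")
        self.assertEqual(count_articles(), 0)

    def test_fixtures_present(self):
        # Runs after test_delete (alphabetical order); the rows are back
        # because the previous test was rolled back, not because of a reload.
        self.assertEqual(count_articles(), 2)


def run_suite():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ArticleTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() runs both tests; afterwards the table is empty again,
since nothing was ever committed.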

With a database intensive testsuite (loads of fixtures, many inserts,
deletes - around 2000 write operations to db), that small hack changed
the running time of the tests from roughly 200 seconds to 5 seconds
(MySQL). I am really having a good time right now.

My question now is: are the Django developers interested in that fix?
I've looked into it and it can easily be backwards compatible. I imagine
a switch with the test command

./manage.py test --with-transactions

to activate the new behaviour. The modification for loaddata can also be
a switch. If you are interested, I'd make a ticket, put it under "post
1.0" and come up with a patch over the next few days.

If there is no interest to put that in the core, I would turn the hack
into a small app with a dedicated management command

./manage.py test_transactional

and put that up somewhere on Google Code.

If I'm doing something completely stupid here and there is a
conceptual problem with my approach, please tell me.

Marc Remolt

Ben Ford

Aug 5, 2008, 3:53:10 PM

to django-d...@googlegroups.com

Hi Marc,

I'd definitely be interested. Would you be prepared to set up a Google code account or something so that we could have a look at it? I've been poking around looking at using nose to test my (fairly large and woefully undertested) app and transactional testing would be nice. Following a read of Refactoring Databases I'd also like the option of actually loading a dump of some live data to test against too.

What do you reckon?

Cheers,
Ben

2008/8/5 Marc Remolt <m.re...@webmasters.de>

--
Regards,
Ben Ford
ben.f...@gmail.com
+447792598685

Luke Plant

Aug 5, 2008, 3:59:07 PM

to django-d...@googlegroups.com

On Tuesday 05 August 2008 19:53:03 Marc Remolt wrote:

> My question now is: Are the django developers interested in that
> fix. I've looked into it and it can easily be backwards compatible.
> I imagine a switch with the test command
>
> ./manage.py test --with-transactions
>
> to activate the new behaviour. The modification for loaddata can
> also be a switch. If you are interested, I'd make a ticket, put it
> under "post 1.0" and come up with a patch over the next days.

I for one am interested! You say 'backwards compatible' -- are there
any tests which currently fail with your changes, but pass without
them? If not, I see no reason it couldn't be the default.

Also, how does it work for backends that don't have transactions?

Personally I'd be interested in seeing this included ASAP (pre 1.0),
because reducing the time for test runs by a factor of 40 would make
developing and releasing Django so much easier. If you want to make
a patch and assign it to me (lukeplant) that would be great.

Cheers,

Luke

--
"Smoking cures weight problems...eventually..." (Steven Wright)

Luke Plant || http://lukeplant.me.uk/

Marc Remolt

Aug 5, 2008, 4:50:26 PM

to django-d...@googlegroups.com

Luke Plant wrote:

The factor of 40 was not measured on the Django testsuite itself, but on
an application suite with mostly DB write tests. So the real factor might
not be that great. I haven't run the Django test suite yet; so far
all of my own tests run with MySQL, PostgreSQL and SQLite. By backwards
compatible I meant that the new behaviour can be turned on and off by a
command line argument.
Tomorrow (as it's getting a bit late in my timezone) I'll have a
patch with an on/off switch for you that will hopefully pass all Django
tests with both settings.

Thanks for your positive feedback. The first feature suggestion to such
a large project is always something special.

Marc Remolt

Russell Keith-Magee

Aug 5, 2008, 8:04:00 PM

to django-d...@googlegroups.com

On Wed, Aug 6, 2008 at 2:53 AM, Marc Remolt <m.re...@webmasters.de> wrote:
>
> Hi,

>
> Now the interesting part:
> - a transaction is started
> - one test runs
> - the transaction is rolled back
> - next transaction, test and rollback ...

...

>
> My question now is: Are the django developers interested in that fix.
> I've looked into it and it can easily be backwards compatible. I imagine
> a switch with the test command

We are always interested in anything to speed up the test suite. As
you have correctly noted, the current test suite is painfully slow.

However, two comments:

1) If we're going to implement this, we're not going to use flags and
options to implement it. We don't want test output to be dependent on
the flags passed to the test runner, beyond basic 'test using MySQL'
type options.

2) You're not the first person to suggest this idea. However,
there is a slight problem - using this approach, it's impossible to
test transactional behaviour. Postgres supports nested transactions,
but to the best of my knowledge, MySQL and SQLite do not. There are a
few tests (I'm thinking of the serialization regression tests in
particular) where transaction boundaries are required as they are an
essential part of the deserialization process.

That said, there may be a middle ground. There will be a large number
of Django tests where transactions are not required, so the current
approach is overkill. It might be possible to make
``django.test.TestCase`` implement a 'rollback' based approach, but
then define TransactionTestCase to implement the flush/reload. Those
tests that require transactions can be modified to use the latter.
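
A rough sketch of that split, again with plain sqlite3 and unittest instead
of Django. RollbackCase and FlushCase are hypothetical names standing in for
the proposed TestCase and TransactionTestCase; the table and rows are
invented for the example.

```python
import sqlite3
import unittest

CONN = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
CONN.execute("CREATE TABLE entry (name TEXT)")
FIXTURES = [("a",), ("b",)]


def count():
    return CONN.execute("SELECT COUNT(*) FROM entry").fetchone()[0]


class RollbackCase(unittest.TestCase):
    """Fast path: wrap each test in a transaction and roll it back."""

    def setUp(self):
        CONN.execute("BEGIN")
        CONN.executemany("INSERT INTO entry VALUES (?)", FIXTURES)

    def tearDown(self):
        CONN.execute("ROLLBACK")


class FlushCase(unittest.TestCase):
    """Slow path: flush and reload with real commits, so a test may
    open, commit or roll back transactions of its own."""

    def setUp(self):
        CONN.execute("DELETE FROM entry")  # flush (autocommitted)
        CONN.executemany("INSERT INTO entry VALUES (?)", FIXTURES)  # reload

    def tearDown(self):
        CONN.execute("DELETE FROM entry")  # leave nothing behind for other tests


class NeedsTransactions(FlushCase):
    def test_own_rollback(self):
        CONN.execute("BEGIN")
        CONN.execute("INSERT INTO entry VALUES ('c')")
        CONN.execute("ROLLBACK")       # the test controls the boundary itself
        self.assertEqual(count(), 2)   # committed fixtures survive the rollback


class PlainTest(RollbackCase):
    def test_insert(self):
        CONN.execute("INSERT INTO entry VALUES ('c')")
        self.assertEqual(count(), 3)


def run(case):
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The flush-based class pays for a delete and a reload per test, which is why
only the tests that really need transaction boundaries would use it.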

Yours,
Russ Magee %-)

Marc Remolt

Aug 6, 2008, 4:57:10 AM

to django-d...@googlegroups.com

I was thinking about the same problem after my posting. Personally I
never used transactions inside tests but you never know who does. As
soon as a tested model method runs a transaction, the testsuite breaks.

MySQL doesn't support nested transactions, and the named transactions
(savepoints) would be no help here. If we use savepoints for the
testsuite rollback and someone inside a test commits without specifying
a savepoint, all transactions are committed.
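
The savepoint problem can be reproduced in a few lines. This sketch uses
SQLite only because it ships with Python; the message above is about MySQL,
which as far as I know likewise drops all savepoints on a plain COMMIT. The
savepoint name per_test is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transaction control
conn.execute("CREATE TABLE t (x INTEGER)")

conn.execute("BEGIN")
conn.execute("SAVEPOINT per_test")        # what a test runner might set before a test
conn.execute("INSERT INTO t VALUES (1)")  # the test writes something
conn.execute("COMMIT")                    # code under test commits on its own

try:
    conn.execute("ROLLBACK TO per_test")  # runner tries to undo the test
    savepoint_survived = True
except sqlite3.OperationalError:
    savepoint_survived = False            # the COMMIT already released the savepoint

rows_left = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

After the inner COMMIT the runner has nothing left to roll back to, and the
row written by the test stays in the table.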

I like your idea of two TestCase classes. If everyone agrees, I will
create a patch that:
* renames TestCase to TransactionTestCase
* creates a new TestCase with the rollback implementation
* gives the loaddata command a switch --external-transaction
(suggestions for a better label?), so that it doesn't commit at the end.

I'm not sure about the last point though. An alternative approach would
be to modify loaddata to check whether a transaction is already running
and, if that is the case, perform no commits or rollbacks inside.
In the case of an error during loaddata, I propose that an exception is
raised instead of a rollback, as the outside code that manages the
transaction needs to react to the problem.
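
A small sketch of that alternative, with sqlite3 standing in for the Django
connection. load_rows is a hypothetical helper, not loaddata itself; it only
commits or rolls back when it opened the transaction, and otherwise lets the
exception travel to the caller.

```python
import sqlite3


def load_rows(conn, rows):
    """Hypothetical loader: manages the transaction only if none is running."""
    outer = conn.in_transaction        # True if the caller already issued BEGIN
    if not outer:
        conn.execute("BEGIN")
    try:
        conn.executemany("INSERT INTO entry VALUES (?)", rows)
    except sqlite3.Error:
        if not outer:
            conn.execute("ROLLBACK")   # our own transaction, so we clean up
        raise                          # in the nested case the caller must react
    if not outer:
        conn.execute("COMMIT")


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE entry (name TEXT)")

# Called inside a running transaction: nothing is committed.
conn.execute("BEGIN")
load_rows(conn, [("a",), ("b",)])
still_open = conn.in_transaction
conn.execute("ROLLBACK")
after_rollback = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]

# Called on its own: the loader commits by itself.
load_rows(conn, [("c",)])
after_standalone = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
```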

Does that sound reasonable to everyone?

Marc Remolt

Steve Holden

Aug 6, 2008, 7:26:14 AM

to django-d...@googlegroups.com

Marc Remolt wrote (in reply to Russell Keith-Magee):
>
[...]

>> 1) If we're going to implement this, we're not going to use flags and
>> options to implement it. We don't want test output to be dependent on
>> the flags passed to the test runner, beyond basic 'test using MySQL'
>> type options.
>>
>> 2) You're not the first person to suggest this idea. However,
>> there is a slight problem - using this approach, it's impossible to
>> test transactional behaviour. Postgres supports nested transactions,
>> but to the best of my knowledge, MySQL and SQLite do not. There are a
>> few tests (I'm thinking of the serialization regression tests in
>> particular) where transaction boundaries are required as they are an
>> essential part of the deserialization process.
>>
>> That said, there may be a middle ground. There will be a large number
>> of Django tests where transactions are not required, so the current
>> approach is overkill. It might be possible to make
>> ``django.test.TestCase`` implement a 'rollback' based approach, but
>> then define TransactionTestCase to implement the flush/reload. Those
>> tests that require transactions can be modified to use the latter.
>>
>> Yours,
>> Russ Magee %-)
>>
> I was thinking about the same problem after my posting. Personally I
> never used transactions inside tests but you never know who does. As
> soon as a tested model method runs a transaction, the testsuite breaks.
>

Yup.

> MySQL doesn't support nested transactions, and the named transactions
> (savepoints) would be no help here. If we use savepoints for the
> testsuite rollback and someone inside a test commits without specifying
> a savepoint, all transactions are committed.
>
> I like your idea with two TestCase-Classes. If everyone agrees, I will
> create a patch that:
> * renames TestCase to TransactionTestCase
> * creates a new TestCase with the rollback implementation
> * give the loaddata command a switch --external-transaction
> (suggestions for a better label?), so that it doesn't commit at the end.
>

If you want such a switch then "--no-commit" might be more meaningful.
See Russell's remarks about not implementing it with options or
switches, however.

But this still doesn't address the issue of people using TestCase
instead of TransactionTestCase or vice versa. Does this make it
something that must go in before 1.0?

> I'm not sure about the last point though. An alternative approach would
> be to modify loaddata to test, if a transaction is already running and
> if that is the case, perform no commits and rollbacks inside.
> In the case of an error during loaddata I propose, that instead of the
> rollbacks an exception is thrown, as the outside code, that manages the
> transaction, needs to react to the problem.
>
> Does that sound reasonable to everyone?
>

Sounds like it's getting closer to a usable specification.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Marc Remolt

Aug 6, 2008, 9:47:55 AM

to django-d...@googlegroups.com

> If you want such a switch then "--no-commit" might be more meaningful.
> See Russell's remarks about not implementing it with options or
> switches, however.
>
>

I think his "no switch" rule was aimed at the test command. That switch
would have influenced the whole testsuite. With Russell's two-class
suggestion, you can decide per test case whether you want the
transaction behaviour or not.
The switch for loaddata would just be an explicit way to say "don't
commit these fixtures yet", which could be useful in other situations. It
doesn't change the behaviour of the loaddata feature as a whole, as the
switch for the test command would.
Please correct me if I'm wrong, Russell.

> But this still doesn't address the issue of people using TestCase
> instead of TransactionTestCase or vice versa. Does this make it
> something that must go in before 1.0?
>

If someone uses transactions inside a test, it will break backwards
compatibility for them. That was the original idea behind my not so
clever switch suggestion. Can I assume that the test tools will also
get a stable API with 1.0? If that is the case, the feature should go
in before 1.0 or much later, when the next stable release is planned.

Another problem I have yet to look into is the doctests. So far
I have not looked at how exactly the doctests are run and how they
integrate with the normal testsuite. DocTestCase inherits from
unittest.TestCase, so that's probably the point where I will hook my
code in.
As the Django testsuite itself mainly uses doctests, transaction support
for them should be even more interesting to the Django devs. I
personally use TestCases exclusively, so that wasn't a priority for me.

>> I'm not sure about the last point though. An alternative approach would
>> be to modify loaddata to test, if a transaction is already running and
>> if that is the case, perform no commits and rollbacks inside.
>> In the case of an error during loaddata I propose, that instead of the
>> rollbacks an exception is thrown, as the outside code, that manages the
>> transaction, needs to react to the problem.
>>
>> Does that sound reasonable to everyone?
>>
>>
> Sounds to be getting closer to a usable specification.
>
> regards
> Steve
>

Thanks

Marc

Marc Remolt

Aug 6, 2008, 2:44:15 PM

to django-d...@googlegroups.com

The ticket and first patch are online now: Ticket #8138

daonb

Aug 7, 2008, 2:40:14 AM

to Django developers

Seems simple and smart. I'll use the patch in the new bot we'll build
in the 1.0 rc testing sprint - http://code.djangoproject.com/wiki/SprintIsraelAugust2008
I'd be glad if you can help make Django testing the best.

Russell Keith-Magee

Aug 7, 2008, 8:17:00 AM

to django-d...@googlegroups.com

On Thu, Aug 7, 2008 at 2:44 AM, Marc Remolt <m.re...@webmasters.de> wrote:
>
> The ticket and first patch are online now: Ticket #8138

I've had a chance to take a quick look over the patch. For the most
part, it looks good.

Some quick comments:

1) Changes to _doctest.py are out of bounds. _doctest.py is a literal
copy of the Python doctest module, provided as a copy because there
were some important bugfixes between Python 2.3 and 2.4 that Django
relies upon. Unless you're making a change that you intend to push
upstream to Python, modifying doctest isn't really an option. I
haven't gone digging to be certain, but there should be a point in the
Django DocTestRunner that will serve an equivalent purpose.

2) I'm not wild about the --no-commit option to loaddata. I can see
why you've done it, but it's a bit ugly. The option isn't something
that will be useful to anyone outside the test system, so exposing it
as a publicly visible option doesn't really appeal to me. That said, a
better idea isn't jumping to the front of my mind, so I'll have to
think about this a little more. If you have any suggestions, feel free
to make them.

3) The tests pass cleanly under SQLite, but I'm seeing a lot of test
failures (and in some cases, test errors) under Postgres.
Unfortunately, I haven't got the time to dig into the causes right
now. Hopefully, I will get a chance over the next day or so. However,
taking an initial glance at the failures, it looks like the sorts of
things that would go wrong if transactions were messed with. For the
benefit of anyone trying to replicate, I'm on r8223, using Postgres
8.3.1 on OS X Leopard.

Yours,
Russ Magee %-)

Marc Remolt

Aug 7, 2008, 11:21:08 AM

to django-d...@googlegroups.com

Russell Keith-Magee wrote:

> On Thu, Aug 7, 2008 at 2:44 AM, Marc Remolt <m.re...@webmasters.de> wrote:
>
>> The ticket and first patch are online now: Ticket #8138
>>
>
> I've had a chance to take a quick look over the patch. For the most
> part, it looks good.
>
> Some quick comments:
>
> 1) Changes to _doctest.py are out of bounds. _doctest.py is a literal
> copy of the Python doctest module, provided as a copy because there
> were some important bugfixes between Python 2.3 and 2.4 that Django
> relies upon. Unless you're making a change that you intend to push
> upstream to Python, modifying doctest isn't really an option. I
> haven't gone digging to be certain, but there should be a point in the
> Django DocTestRunner that will serve an equivalent purpose.
>

I just uploaded a new patch that doesn't change _doctest.py. I had no
idea that this was a "don't touch" file. Thanks for the correction and
the tip.

> 2) I'm not wild about the --no-commit option to loaddata. I can see
> why you've done it, but it's a bit ugly. The option isn't something
> that will be useful to anyone outside the test system, so exposing it
> as a publicly visible option doesn't really appeal to me. That said, a
> better idea isn't jumping to the front of my mind, so I'll have to
> think about this a little more. If you have any suggestions, feel free
> to make them.
>

One of the next things I'll try is to make loaddata aware of an already
running transaction. If transaction management is already active when
the command runs, no commit is performed. We will see whether that works
and doesn't break something deep within Django. Personally I don't like
the switch either, but for testing it was the simplest way to make sure
that the commit is only skipped for the tests.

> 3) The tests pass cleanly under SQLite, but I'm seeing a lot of test
> failures (and in some cases, test errors) under Postgres.
> Unfortunately, I haven't got the time to dig into the causes right
> now. Hopefully, I will get a chance over the next day or so. However,
> taking an initial glance at the failures, it looks like the sorts of
> things that would go wrong if transactions were messed with. For the
> benefit of anyone trying to replicate, I'm on r8223, using Postgres
> 8.3.1 on OS X Leopard.
>

The results I posted in the ticket were produced on an Ubuntu 8.04
system with Postgres 8.3.3. Strange that on my system SQLite reports one
failure.
My personal bet on where these failures come from is that the test
before the officially failing one doesn't perform the rollback, as most
failures result from assertions that count the number of entries in a
table.
As an example, one test loads a fixture with just one entry, tries to
delete it (which should fail there because of missing permissions) and
then tests whether exactly one entry exists. Actually, during the assert
there are 4 entries in the table.
> Yours,
> Russ Magee %-)
>
> >
Marc Remolt

Marc Remolt

Aug 7, 2008, 11:51:58 AM

to django-d...@googlegroups.com

That was an exceptionally bad idea. I tried what I suggested, and now we
have really, really a lot of errors, not just failures anymore. I'll
just skip that approach.

Russell Keith-Magee

Aug 7, 2008, 7:40:49 PM

to django-d...@googlegroups.com

On Thu, Aug 7, 2008 at 11:21 PM, Marc Remolt <m.re...@webmasters.de> wrote:
>
> Russell Keith-Magee wrote:

>> 2) I'm not wild about the --no-commit option to loaddata. I can see
>> why you've done it, but it's a bit ugly. The option isn't something
>> that will be useful to anyone outside the test system, so exposing it
>> as a publicly visible option doesn't really appeal to me. That said, a
>> better idea isn't jumping to the front of my mind, so I'll have to
>> think about this a little more. If you have any suggestions, feel free
>> to make them.

Luke Plant has just uploaded a patch that possibly works around the
problem another way - it exposes the option as something that can be
passed in if you are manually calling the command (using
call_command()), but doesn't publicly expose the option to the user.
This looks like a pretty good solution to me.
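
To illustrate the idea only (this is a toy, not Django's loaddata or
call_command, and every name in it is hypothetical): a command can accept a
keyword option when called from code while its command-line parser never
mentions it.

```python
import argparse


class LoadDataSketch:
    """Hypothetical command with an option reachable only from Python code."""

    def create_parser(self):
        parser = argparse.ArgumentParser(prog="loaddata")
        parser.add_argument("fixtures", nargs="+")
        return parser  # deliberately no --no-commit flag here

    def handle(self, fixtures, commit=True):
        # A real command would load the fixtures; the sketch just reports
        # which mode it was asked to run in.
        return {"fixtures": list(fixtures), "committed": commit}

    def run_from_argv(self, argv):
        args = self.create_parser().parse_args(argv)
        return self.handle(args.fixtures)  # command-line users always get commit=True


def call_command_sketch(command, *fixtures, **options):
    """Programmatic entry point: keyword options go straight to handle()."""
    return command.handle(fixtures, **options)
```

A test runner could then call call_command_sketch(cmd, "initial.json",
commit=False), while passing --no-commit on the command line is rejected by
the parser.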

All that leaves now is getting the failures sorted out. I should be
sprinting tonight, so I will have a look then.

Yours,
Russ Magee %-)

Luke Plant

Aug 7, 2008, 8:34:35 PM

to django-d...@googlegroups.com

On Thursday 07 August 2008 13:17:00 Russell Keith-Magee wrote:

> 3) The tests pass cleanly under SQLite, but I'm seeing a lot of test
> failures (and in some cases, test errors) under Postgres.
> Unfortunately, I haven't got the time to dig into the causes right
> now. Hopefully, I will get a chance over the next day or so.
> However, taking an initial glance at the failures, it looks like
> the sorts of things that would go wrong if transactions were messed
> with. For the benefit of anyone trying to replicate, I'm on r8223,
> using Postgres 8.3.1 on OS X Leopard.

I've got a patch with all the tests passing under SQLite and Postgres,
by enabling 'TransactionTestCase' for the ones that need it.
(Currently this mainly involves TestCases, but also one doctest, and
I've used a nasty hack to fix that. Any ideas about how to specify
in a doctest that it uses transactions are welcome.)

I'm slightly worried that I don't always know why the tests are
failing without TransactionTestCase. I thought that anything that
uses the test Client() invokes transactions, but after some digging,
I don't know why that is necessarily the case.

In some cases if you isolate the test, it passes, which indicates that
it is a previous test that has not been cleaned up properly. This
seems to happen with test cases that use views.

One of the most puzzling is this, under postgres:

-------
File "/home/luke/httpd/www.cciw.co.uk/django_src/tests/modeltests/fixtures/models.py", line ?, in modeltests.fixtures.models.__test__.API_TESTS
Failed example:
    management.call_command('dumpdata', 'fixtures', format='json')
Expected:
    [{"pk": 3, "model": "fixtures.article", "fields": {"headline": "Time to reform copyright", "pub_date": "2006-06-16 13:00:00"}},
     {"pk": 2, "model": "fixtures.article", "fields": {"headline": "Poker has no place on ESPN", "pub_date": "2006-06-16 12:00:00"}},
     {"pk": 1, "model": "fixtures.article", "fields": {"headline": "Python program becomes self aware", "pub_date": "2006-06-16 11:00:00"}}]
Got:
    [{"pk": 3, "model": "fixtures.article", "fields": {"headline": "Time to reform copyright", "pub_date": "2006-06-16 19:00:00"}},
     {"pk": 2, "model": "fixtures.article", "fields": {"headline": "Poker has no place on ESPN", "pub_date": "2006-06-16 18:00:00"}},
     {"pk": 1, "model": "fixtures.article", "fields": {"headline": "Python program becomes self aware", "pub_date": "2006-06-16 17:00:00"}}]

--------

The failure is that all the dates are 6 hours later than they ought to
be. This has got to be something to do with timezone (my settings
file doesn't change it, so the default 'America/Chicago' is used, my
local time is actually BST), but I really have no clue about how that
stuff works, or why it should only show up when the test is running
in a transaction.
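
The six-hour shift is at least consistent with the two time zones named
above: in June, America/Chicago is on daylight time (UTC-5) and the UK is on
BST (UTC+1). A quick arithmetic check with fixed offsets (it says nothing
about why the conversion only happens inside a transaction):

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for June 2006; real code would use a timezone database.
CDT = timezone(timedelta(hours=-5), "CDT")  # America/Chicago, daylight time
BST = timezone(timedelta(hours=1), "BST")   # UK, summer time

# The expected pub_date of the article with pk 1, read as Chicago time.
expected = datetime(2006, 6, 16, 11, 0, tzinfo=CDT)

# The same instant shown on a BST wall clock.
as_seen_in_bst = expected.astimezone(BST)

# Difference between the two wall-clock readings.
shift = as_seen_in_bst.replace(tzinfo=None) - expected.replace(tzinfo=None)
```

The BST reading is 17:00, matching the "Got" value for that article, so one
side of the comparison is probably being rendered in the wrong zone.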

Unless we can get to the bottom of why some tests
need 'TransactionTestCase', and can give concrete conditions when
developers will need to use it, the patch is slightly worrying, as it
could lead to lots of headscratching by developers who cannot work
out why their tests are failing.

I'll upload a patch for what I've done so far, I'm going to bed now.

Luke

P.S. Here are my timings for the entire testsuite (not including the
gis stuff, which doesn't work on my machine), 415 tests in all cases:

SQLite, without patch:

real 6m28.425s
user 5m41.949s
sys 0m8.837s

SQLite, with patch:

real 4m21.447s
user 3m46.298s
sys 0m7.476s

Postgres, without patch:

real 19m27.912s
user 6m22.660s
sys 0m17.517s

Postgres, with patch:

real 13m28.262s
user 4m15.324s
sys 0m12.517s

--
"The first ten million years were the worst. And the second ten
million, they were the worst too. The third ten million, I didn't
enjoy at all. After that I went into a bit of a decline." (Marvin the
paranoid android)

Luke Plant || http://lukeplant.me.uk/

Marc Remolt

Aug 8, 2008, 3:47:33 AM

to django-d...@googlegroups.com

Luke Plant wrote:

Thanks for the patch - especially for removing that ugly switch.

I was also looking into the failing tests and have made some slight
progress without manually using TransactionTestCase. All failures I
looked at involved calls to test.Client followed by checks of whether
the call changed the database. It seems we have a problem with
transaction isolation. The request (mostly POST) through Client updates
the DB but doesn't commit, so the asserts outside won't see the changes.
Just for testing I put a commit() at the beginning of Client.request(),
and the failures under Postgres were cut in half (25 -> 12).
As I see it we now can:
- Automatically commit in Client and roll back afterwards, or use
TransactionTestCase manually when using Client. That would eat up most
of the performance gain.
- Try to make test.Client use exactly the same DB connection as the
outside test. To be honest I have no idea if this is possible, as my
knowledge about the inner workings of Psycopg2 and the other DB
connectors is limited at the moment. Any ideas?
- Just document that tests with test.Client must use
TransactionTestCase. Not a very good solution in my opinion.
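
The isolation effect described above can be shown with two connections to
the same database. This sketch assumes, as the message only speculates, that
the client and the test really end up on separate connections; it uses a
temporary SQLite file, and the connection names are made up.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.db")

# Connection used by the assertions in the test body.
test_conn = sqlite3.connect(path, isolation_level=None)
test_conn.execute("CREATE TABLE entry (name TEXT)")

# Hypothetical second connection, standing in for the one the view uses.
client_conn = sqlite3.connect(path, isolation_level=None)
client_conn.execute("BEGIN")
client_conn.execute("INSERT INTO entry VALUES ('posted')")  # written, not committed


def visible(conn):
    return conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]


seen_before_commit = visible(test_conn)  # the other connection cannot see the row yet
client_conn.execute("COMMIT")
seen_after_commit = visible(test_conn)   # after the commit the row is visible

client_conn.close()
test_conn.close()
```

If both sides shared one connection, the uncommitted row would be visible to
the asserts and could still be rolled back at the end of the test.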

Marc
