Running django tests with postgres

483 views
Skip to first unread message

Hezi Halpert

unread,
Sep 14, 2018, 11:32:49 AM9/14/18
to Django users
I would like to share with you an issue we encounter while moving from sqlite to postgres with heavily use of Django testing.

We have Django app with ~700 tests. Most of them accessing database. We recently migrated the database from sqlite to postgres. 
Many of our tests were written in a way that compares actual pk’s (hard-coded pks or just file/json comparisons) . Since Django testing on sqlite (testcases.TestCase class) creates in-memory database (which is being reseted every unit test by default), we never had a problem with it.
However, Django TestCase on postgres create completely another test db which preserves the pk sequences between different tests. And since many of our tests were written in a way that compares actual pk’s they all start fail - depends on the exact tests execution order.
Even tests which expect some pk and are were not failed yet, can potentially failed in the future - depends on adding/editing other tests which may change the db sequence

We consider the following solutions:
  1. Move to TransactionTestCase (instead of TestCase) and use “reset_sequences = True” flag. Cons: TransactionTestCase reduces performance dramatically (~4 times longer in some of the tests) 
  2. Refactor all failed tests: remove all hard-coded references to the pk. Cons: Require much Dev effort (we had more then 70 such tests)
  3. Route the database in settings.py such it will use sqlite instead of postgres when running tests. Cons: It will not actually test the real scenarios - not an option
  4. Combine reset_sequences flag with TestCase in our own version to TestCase: OurTestCase class and make everything to inherit from it. This is the option we finally decided of. See below.

from django.test import TestCase, testcases

class OurTestCase(TestCase):
reset_sequences = True

def _fixture_setup(self):
for db_name in self._databases_names(include_mirrors=False):
if self.reset_sequences:
self._reset_sequences(db_name)
if self.fixtures:
call_command('loaddata', *self.fixtures, **{'verbosity': 0, 'database': db_name})
if not testcases.connections_support_transactions():
self.setUpTestData()
return super(TestCase, self)._fixture_setup()
self.atomics = self._enter_atomics()

Another problem of these kind of tests is the default ordering assumption of Django which changes significantly between postgres and sqlite when testing.
Therefore, models included in such tests must have a hint for Django regarding the default ordering retrieval. 
Our solution was to make all models inherit from DexterModelDefaultOrder (below) 


class DexterModelDefaultOrder(models.Model):
class Meta:
abstract = True
ordering = ['id']


I hope it (will) help someone

Andréas Kühne

unread,
Sep 14, 2018, 11:46:31 AM9/14/18
to django...@googlegroups.com
Hi,

Just my 5 cents. I think you are doing the tests wrong. I don't believe that doing testing against hard coded values is at all correct - and it isn't actually that hard to change the tests to a simpler way. The values of the PK's aren't really necessary for your test to be true either - how does that translate to a real use case? You should probably check for A value in the pk field, but not a specific value, because that doesn't result in any added value for your customer?

Also changing the way django runs tests feels like working against the framework rather than with it? I would probably much prefer changing the tests than changing the way the framework runs my tests.... Another issue you may face is if Django changes the underlying code, then you will get strange failures as well... 

I don't think that 70 tests is that much to change either - we work with a project that could fail a considerable amount of tests during a refactor - and then we need to fix them. The same goes here I think - you did a change to the infrastructure that made the tests invalid - rewrite the tests :-)

Best regards,

Andréas


--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/ffcd3cc9-c3be-44ba-9665-a4ded5fed492%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mike Dewhirst

unread,
Sep 14, 2018, 8:30:11 PM9/14/18
to django...@googlegroups.com
+1 Andréas

One of my projects runs (currently) 1,248 tests using SQLite3 in 72
minutes on my local Windows 10 dev laptop. That laptop has both a SSD
and a hard disk. Foolishly I opted to use the SSD for software
installations and the hard disk for development and thus the tests. I
was concerned that overexercising the SSD might wear it out. I've since
been advised that I shouldn't worry so one of these days I'll reorganise
things and maybe see that 72 minutes drop somewhat.

However, those same tests take 285 minutes on a headless PC running
Ubuntu 16.04 LTS server with Postgres 9.6

There are only one or two tests which generate slightly different error
messages between SQLite3 and Postgres and that is trivial to manage.

Most tests are database heavy, requiring queries and calling/testing
model methods.

Consequently, I use - and recommend - SQLite for dev tests and Postgres
for deployment testing.

Cheers

Mike

On 15/09/2018 1:45 AM, Andréas Kühne wrote:
> Hi,
>
> Just my 5 cents. I think you are doing the tests wrong. I don't
> believe that doing testing against hard coded values is at all correct
> - and it isn't actually that hard to change the tests to a simpler
> way. The values of the PK's aren't really necessary for your test to
> be true either - how does that translate to a real use case? You
> should probably check for A value in the pk field, but not a specific
> value, because that doesn't result in any added value for your customer?
>
> Also changing the way django runs tests feels like working against the
> framework rather than with it? I would probably much prefer changing
> the tests than changing the way the framework runs my tests....
> Another issue you may face is if Django changes the underlying code,
> then you will get strange failures as well...
>
> I don't think that 70 tests is that much to change either - we work
> with a project that could fail a considerable amount of tests during a
> refactor - and then we need to fix them. The same goes here I think -
> you did a change to the infrastructure that made the tests invalid -
> rewrite the tests :-)
>
> Best regards,
>
> Andréas
>
>
> Den fre 14 sep. 2018 kl 17:32 skrev Hezi Halpert <che...@gmail.com
> <mailto:che...@gmail.com>>:
>
> I would like to share with you an issue we encounter while moving
> from sqlite to postgres with heavily use of Django testing.
>
> We have Django app with ~700 tests. Most of them accessing
> database. We recently migrated the database from sqlite to postgres.
> Many of our tests were written in a way that compares actual pk’s
> (hard-coded pks or just file/json comparisons) . Since Django
> testing on sqlite (testcases.TestCase class) creates in-memory
> database (which is being reseted every unit test by default), we
> never had a problem with it.
> However, Django TestCase on postgres create completely another
> test db which preserves the pk sequences between different tests.
> And since many of our tests were written in a way that compares
> actual pk’s they all start fail - depends on the exact tests
> execution order.
> Even tests which expect some pk and are were not failed yet, can
> potentially failed in the future - depends on adding/editing other
> tests which may change the db sequence
>
> We consider the following solutions:
>
> 1. Move to TransactionTestCase (instead of TestCase) and use
> “reset_sequences = True” flag. Cons:
> TransactionTestCase reduces performance dramatically (~4 times
> longer in some of the tests)
> 2. Refactor all failed tests: remove all hard-coded references to
> the pk. Cons: Require much Dev effort (we had more then 70
> such tests)
> 3. Route the database in settings.py such it will use sqlite
> instead of postgres when running tests. Cons: It will not
> actually test the real scenarios - not an option
> 4. Combine reset_sequences flag with TestCase in our own version
> to TestCase: OurTestCase class and make everything to inherit
> from it. This is the option we finally decided of. See below.
>
>
> fromdjango.test import TestCase, testcases
>
> class OurTestCase(TestCase):
> reset_sequences =True def _fixture_setup(self):
> for db_namein self._databases_names(include_mirrors=False):
> if self.reset_sequences:
> self._reset_sequences(db_name)
> if self.fixtures:
> call_command('loaddata', *self.fixtures, **{'verbosity':0,'database': db_name})
> if not testcases.connections_support_transactions():
> self.setUpTestData()
> return super(TestCase,self)._fixture_setup()
> self.atomics =self._enter_atomics()
>
> Another problem of these kind of tests is the default ordering
> assumption of Django which changes significantly between postgres
> and sqlite when testing.
> Therefore, models included in such tests must have a hint for
> Django regarding the default ordering retrieval.
> Our solution was to make all models inherit from
> DexterModelDefaultOrder (below)
>
>
> class DexterModelDefaultOrder(models.Model):
> class Meta:
> abstract =True ordering = ['id']
>
> I hope it (will) help someone
>
> --
> You received this message because you are subscribed to the Google
> Groups "Django users" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to django-users...@googlegroups.com
> <mailto:django-users...@googlegroups.com>.
> To post to this group, send email to django...@googlegroups.com
> <mailto:django...@googlegroups.com>.
> <https://groups.google.com/d/msgid/django-users/ffcd3cc9-c3be-44ba-9665-a4ded5fed492%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to django-users...@googlegroups.com
> <mailto:django-users...@googlegroups.com>.
> To post to this group, send email to django...@googlegroups.com
> <mailto:django...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/django-users.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-users/CAK4qSCeeyDaUnTb95uO_LpdkLknS-V6tWtm8q3%2BY58FgEs_AnQ%40mail.gmail.com
> <https://groups.google.com/d/msgid/django-users/CAK4qSCeeyDaUnTb95uO_LpdkLknS-V6tWtm8q3%2BY58FgEs_AnQ%40mail.gmail.com?utm_medium=email&utm_source=footer>.

Andréas Kühne

unread,
Sep 15, 2018, 12:50:45 PM9/15/18
to django...@googlegroups.com
I would just be scared that some minor issues are different between the database implementations - therefore some test that would work in the tests and during development, doesn't work in production. 

I usually try to use the same things in production and development (or as close as possible).

That being said - Tests that take 300 minutes is just too much :-) 

We have around 750 tests are it takes around 5 minutes to run - Something that I find to be too long still....

Regards,

Andréas


To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.

Jason

unread,
Sep 15, 2018, 1:33:12 PM9/15/18
to Django users
Agreed.  Something is definitely off.  At work, the legacy monolith django app has about 7800 tests.  It takes about 18 minutes to run the full test suite in docker with the latest MBP (four cores, 5GB RAM allocated in docker machine), including creating the db and four parallel test streams running.  A smaller service I work on has about 780 tests, and it takes about 4 minutes to run in total.  And like Andreas, I find that too long at times!

Mike Dewhirst

unread,
Sep 16, 2018, 2:18:29 AM9/16/18
to django...@googlegroups.com
My software classifies chemical hazards according to intrinsic physical
properties. This requires accessing large quantities of reference data
imported from jurisdictions around the world. The tests demand correct
classification for known properties.

There are dozens of different classification types for physical, health
and environmental hazards. Here is the current spec ...
https://www.unece.org/trans/danger/publi/ghs/ghs_rev07/07files_e0.html

Then we have transport hazards based on UN reference data for air, sea
and land transport. Here is the current spec ...
https://www.unece.org/trans/danger/publi/unrec/rev20/20files_track_e.html

Then there are regulatory workplace exposure limits and biological
monitoring criteria which has to be fetched from more reference tables
based on jurisdictions specified.

Then there are mixtures. Known properties of ingredients in various
proportions produc different classifications. The software (and tests)
must classify a mixture correctly. Some ingredients react together in
known stoichiometric proportions to form a reaction product with its own
known properties. In such a mixture partly reacted like that the tests
have to prove correct classification.

Some of the properties are common and some depend on whether a substance
is gas, liquid or solid. Nano is another. So there are core
classification methods plus a separate set each for gas, liquid and
solid. Then there are other sets for different acute toxicity routes
(skin, inhalation, ingestion), chronic toxicity, specific organ toxicity
for single and multiple exposures, explosion characteristics, oxidizing
properties, corrosive to metal, skin and eyes, carcinogenicity,
mutagenicity, ototoxicity and the list continues with aquatic and other
environmental classifications.

There is more but I really have difficulty working out how to test
end-points without invoking the calculations which in turn require
construction of test substances and mixtures with model methods doing
self-classification according to assigned test properties and the
reference material.

On the topic of testing via SQLite3 when the production environment uses
Postgres, I have found remarkably identical behaviour. Provided you
*also* test with the same DBMS I see no problem.

Nothing gets into production without being tested with Postgres using
BuildBot to manage it. As it happens the development server always uses
Postgres. I have the option of testing in development with either
SQLite3 or Postgres but cannot remember the last time I tested in
development with Postgres.

Most of my dev testing is restricted to individual test classes which
run relatively (for me!) quickly. I only run the full suite when I head
off to lunch or Pilates or for a bike ride.

However, I would love my tests to run as fast as yours. How much would
you charge me to make that happen?

Cheers

Mike
> --
> You received this message because you are subscribed to the Google
> Groups "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to django-users...@googlegroups.com
> <mailto:django-users...@googlegroups.com>.
> To post to this group, send email to django...@googlegroups.com
> <mailto:django...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/django-users.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-users/07a56866-c2ef-4bf3-8349-a5faa54e7d28%40googlegroups.com
> <https://groups.google.com/d/msgid/django-users/07a56866-c2ef-4bf3-8349-a5faa54e7d28%40googlegroups.com?utm_medium=email&utm_source=footer>.

Bill-Torcaso-Oxfam

unread,
Sep 18, 2018, 9:34:43 AM9/18/18
to Django users

I two comments, and as always Your Milage May Vary.

  • I wonder if you have the right indexes on your Postgres database? The previous people report much faster completion times for test that use Postgres as the database.  Perhaps your domain is just hard (and you description makes it sound that way), but "Explain Plan" might be your best friend here.

  • Sqlite does not enforce length limit constraints on text fields.  You can declare a field to be CharField(max_length=200), and Sqlite will happily put a longer string into the field.  Postgres will not.

    This came up when I was importing data from an older generation of our product.  My custom import code ran successfully against Sqlite and the exact same command failed against Postgres.

  • (And yes, I said only two points.  But this just occurs to me...)

    Can you divide your test suite into distinct independent chunks, and run them in parallel?  Either against one common database or using one database per test-chunk?

Mike Dewhirst

unread,
Sep 18, 2018, 6:49:08 PM9/18/18
to django...@googlegroups.com
Bill

I'm about to disappear for a few days. Thank you very much for your
suggestions - I'll tackle them when I get back.

Cheers

Mike

On 18/09/2018 11:34 PM, Bill-Torcaso-Oxfam wrote:
>
> I two comments, and as always Your Milage May Vary.
>
> * I wonder if you have the right indexes on your Postgres database?
> The previous people report much faster completion times for test
> that use Postgres as the database.  Perhaps your domain is just
> hard (and you description makes it sound that way), but "Explain
> Plan" might be your best friend here.
>
> * Sqlite does not enforce length limit constraints on text fields. 
> You can declare a field to be CharField(max_length=200), and
> Sqlite will happily put a longer string into the field.  Postgres
> will not.
>
> This came up when I was importing data from an older generation of
> our product.  My custom import code ran successfully against
> Sqlite and the exact same command failed against Postgres.
>
> * (And yes, I said only two points.  But this just occurs to me...)
>
> Can you divide your test suite into distinct independent chunks,
> and run them in parallel?  Either against one common database or
> using one database per test-chunk?
>
> On Friday, September 14, 2018 at 11:32:49 AM UTC-4, Hezi Halpert wrote:
>
> I would like to share with you an issue we encounter while moving
> from sqlite to postgres with heavily use of Django testing.
>
> We have Django app with ~700 tests. Most of them accessing
> database. We recently migrated the database from sqlite to postgres.
> Many of our tests were written in a way that compares actual pk’s
> (hard-coded pks or just file/json comparisons) . Since Django
> testing on sqlite (testcases.TestCase class) creates in-memory
> database (which is being reseted every unit test by default), we
> never had a problem with it.
> However, Django TestCase on postgres create completely another
> test db which preserves the pk sequences between different tests.
> And since many of our tests were written in a way that compares
> actual pk’s they all start fail - depends on the exact tests
> execution order.
> Even tests which expect some pk and are were not failed yet, can
> potentially failed in the future - depends on adding/editing other
> tests which may change the db sequence
>
> We consider the following solutions:
>
> 1. Move to TransactionTestCase (instead of TestCase) and use
> “reset_sequences = True” flag. Cons:
> TransactionTestCase reduces performance dramatically (~4 times
> longer in some of the tests)
> 2. Refactor all failed tests: remove all hard-coded references to
> the pk. Cons: Require much Dev effort (we had more then 70
> such tests)
> 3. Route the database in settings.py such it will use sqlite
> instead of postgres when running tests. Cons: It will not
> actually test the real scenarios - not an option
> 4. Combine reset_sequences flag with TestCase in our own version
> to TestCase: OurTestCase class and make everything to inherit
> from it. This is the option we finally decided of. See below.
>
>
> fromdjango.test import TestCase, testcases
>
> class OurTestCase(TestCase):
> reset_sequences =True def _fixture_setup(self):
> for db_namein self._databases_names(include_mirrors=False):
> if self.reset_sequences:
> self._reset_sequences(db_name)
> if self.fixtures:
> call_command('loaddata', *self.fixtures, **{'verbosity':0,'database': db_name})
> if not testcases.connections_support_transactions():
> self.setUpTestData()
> return super(TestCase,self)._fixture_setup()
> self.atomics =self._enter_atomics()
>
> Another problem of these kind of tests is the default ordering
> assumption of Django which changes significantly between postgres
> and sqlite when testing.
> Therefore, models included in such tests must have a hint for
> Django regarding the default ordering retrieval.
> Our solution was to make all models inherit from
> DexterModelDefaultOrder (below)
>
>
> class DexterModelDefaultOrder(models.Model):
> class Meta:
> abstract =True ordering = ['id']
>
> I hope it (will) help someone
>
> --
> You received this message because you are subscribed to the Google
> Groups "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to django-users...@googlegroups.com
> <mailto:django-users...@googlegroups.com>.
> To post to this group, send email to django...@googlegroups.com
> <mailto:django...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/django-users.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-users/1e8ec552-d793-4161-ba10-070263586698%40googlegroups.com
> <https://groups.google.com/d/msgid/django-users/1e8ec552-d793-4161-ba10-070263586698%40googlegroups.com?utm_medium=email&utm_source=footer>.
Reply all
Reply to author
Forward
0 new messages