how to test an application that's using a legacy database


dpalao...@gmail.com

Nov 10, 2014, 2:43:06 PM
to django...@googlegroups.com
Hi,
I'm writing a Django application that uses an existing database. If I understood it well, in such a case one must create non-managed models for the legacy tables to avoid Django creating already existing tables, right?
For instance, this is what one of my models looks like:

class JobInfo(models.Model):
    job_db_inx = models.IntegerField(primary_key=True)
    id_job = models.IntegerField()
    id_user = models.IntegerField()
    id_group = models.IntegerField()
    account = models.TextField(blank=True)
    cpus_req = models.IntegerField()
    cpus_alloc = models.IntegerField()
    nodelist = models.TextField()
    nodes_alloc = models.IntegerField()
    partition = models.TextField()
    time_start = models.IntegerField()
    time_end = models.IntegerField()
    was_updated = models.IntegerField()
    jobmondatacleared = models.IntegerField(db_column='jobMonDataCleared')  # Field name made lowercase.
    endupcount = models.IntegerField(db_column='endUpCount')  # Field name made lowercase.
    approved = models.IntegerField()

    class Meta:
        managed = False
        db_table = 'job_info'

The problem, of course, arises when the unit tests are run. The test database must be created when the tests start, but because of "managed = False" the tables are not created.

First round.
Googling a bit, I found a recipe to bypass this problem: modify the DiscoverRunner class (I found it here, and tailored it trivially to remove the deprecation warning):

class ManagedModelDiscoverRunner(DiscoverRunner):
    def setup_test_environment(self, *args, **kwargs):
        from django.db.models.loading import get_models
        self.unmanaged_models = [m for m in get_models()
                                 if not m._meta.managed]
        for m in self.unmanaged_models:
            m._meta.managed = True
            print("setting %s._meta.managed to True" % (m.__name__,))
        super(ManagedModelDiscoverRunner, self).setup_test_environment(*args, **kwargs)

    def teardown_test_environment(self, *args, **kwargs):
        super(ManagedModelDiscoverRunner, self).teardown_test_environment(*args, **kwargs)
        # reset unmanaged models
        for m in self.unmanaged_models:
            m._meta.managed = False


So I created a directory in my project called "tests" and put the above code in a file called "managed_runner.py" in there. To be more explicit, the tree looks like:

|-- myapp
|   |-- admin.py
|   |-- __init__.py
|   |-- models.py
|   |-- templates
|   |   `-- myapp
|   |       `-- home.html
|   |-- tests.py
|   `-- views.py
|-- myproj
|   |-- __init__.py
|   |-- settings.py
|   |-- urls.py
|   `-- wsgi.py
|-- manage.py
`-- tests
    |-- __init__.py
    `-- managed_runner.py


Then I modified my "settings.py" file adding

TEST_RUNNER="tests.managed_runner.ManagedModelDiscoverRunner"

to it. And I ran the tests. Still, it fails:

django.db.utils.ProgrammingError: Table 'test_db.job_info' doesn't exist

But I see the output of print on the screen, saying that the models' ._meta.managed attributes are set to True.


Second round.
Reading the Django docs, I see that DiscoverRunner has a couple of interesting methods: setup_databases and teardown_databases. I subclassed DiscoverRunner again; this time it looks like:

class ManagedModelDiscoverRunner(DiscoverRunner):
    def setup_databases(self, **kwargs):
        from django.db.models.loading import get_models
        self.unmanaged_models = [m for m in get_models() if not m._meta.managed]
        for m in self.unmanaged_models:
            m._meta.managed = True
            print("setting %s._meta.managed to True" % (m.__name__,))
        return super(ManagedModelDiscoverRunner, self).setup_databases(**kwargs)

    def teardown_databases(self, old_config, **kwargs):
        super(ManagedModelDiscoverRunner, self).teardown_databases(old_config, **kwargs)
        # reset unmanaged models
        for m in self.unmanaged_models:
            m._meta.managed = False



and I get exactly the same error.


What am I missing?

Thanks in advance,

David


PS Django-1.7 with Python-3.3

Larry Martell

Nov 10, 2014, 2:49:40 PM
to django...@googlegroups.com
On Mon, Nov 10, 2014 at 9:43 AM, <dpalao...@gmail.com> wrote:
> Hi,
> I'm writing a Django application that uses an existing database. If I
> understood it well, in such a case one must create non-managed models for
> the legacy tables to avoid Django creating already existing tables, right?

No, when you run syncdb it will not create or modify any existing
tables, only create new ones.

dpalao...@gmail.com

Nov 10, 2014, 2:53:29 PM
to django...@googlegroups.com

Larry Martell

Nov 10, 2014, 2:57:54 PM
to django...@googlegroups.com
On Mon, Nov 10, 2014 at 9:53 AM, <dpalao...@gmail.com> wrote:
> So what is written in the docs is wrong?

Please provide some context. Quote the relevant portions of the post
you're replying to.

dpalao...@gmail.com

Nov 10, 2014, 3:00:12 PM
to django...@googlegroups.com
Here comes the context. Sorry.


On Monday, November 10, 2014 3:49:40 PM UTC+1, Larry....@gmail.com wrote:

No, when you run syncdb it will not create or modify any existing
tables, only create new ones.

Again, are the docs wrong?

David

Larry Martell

Nov 10, 2014, 3:03:57 PM
to django...@googlegroups.com
The link you referenced refers to migrate and flush. I was talking
about syncdb. I've never used migrate or flush. I expect the docs to
be correct.

donarb

Nov 10, 2014, 3:08:15 PM
to django...@googlegroups.com
Did you read the doc part you posted? 

"For tests involving models with managed=False, it’s up to you to ensure the correct tables are created as part of the test setup."

This means that when you are testing you need to make sure that your testing code creates that table, the testing process will not do it for you.

dpalao...@gmail.com

Nov 10, 2014, 3:08:35 PM
to django...@googlegroups.com

The link says:
" If False, no database table creation or deletion operations will be performed for this model. This is useful if the model represents an existing table or a database view that has been created by some other means."

The problem I have is that I need models that are valid in production (modification of the db is not allowed) and in testing (the tables must be created). Either "managed" is set to True or False in the model (actually in "model"._meta), but not both.
Therefore I need to somehow modify the models during the tests, but I don't know how.

Best

dpalao...@gmail.com

Nov 10, 2014, 3:14:49 PM
to django...@googlegroups.com

I guess that explains why the tests fail.
Probably I was not clear in my original post. My problem is precisely how to switch managed to True just before the tests and put it back to False afterwards, to avoid creating the tables by hand during the tests. That is precisely the trick described in the link I included in my first post: subclass the DiscoverRunner class. But I don't know how to achieve it.
Thanks for your comment, though.

Best

Fred Stluka

Nov 10, 2014, 3:18:27 PM
to django...@googlegroups.com
David,

No.  You can use it unchanged.  With Django 1.4 at least, and I
assume also with 1.7.

I have a legacy MS SQL Server DB where I have rights to modify
data but not tables.  I also cannot create new DBs in MS SQL
Server.

I used inspectdb to create models from the existing DB.

I generally have them marked as managed=False, but when
running tests, I have them managed=True, and I override the
DATABASES settings to point to SQLite, instead of MS SQL Server.
Thus, when I run the tests, the test DB is created in SQLite, and
the real DB server is untouched.

Here's how I do it:

settings.py:

import sys

DATABASES = { ... the usual stuff ... }

# Decide whether we're running unit tests
RUNNING_UNIT_TESTS = 'test' in sys.argv

if RUNNING_UNIT_TESTS:
    DATABASES['default'] = { 'ENGINE': 'django.db.backends.sqlite3', }

models.py:

    class Meta:
        managed = True if settings.RUNNING_UNIT_TESTS else False
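
A minimal sketch of the same toggle, assuming the tests are always started as "manage.py test ...". The helper name running_unit_tests is hypothetical and not part of Django. It looks only at the subcommand position, so an unrelated argument that happens to be "test" does not flip the flag:

```python
import sys


def running_unit_tests(argv=None):
    """Return True when the process looks like `manage.py test ...`."""
    argv = sys.argv if argv is None else argv
    # The management subcommand is the first argument after the script name.
    return len(argv) > 1 and argv[1] == "test"


# In settings.py one could then write:
RUNNING_UNIT_TESTS = running_unit_tests()
```

In models.py, "managed = settings.RUNNING_UNIT_TESTS" would then be enough. Note that, as reported further down the thread, this toggle did not solve the original poster's problem on Django 1.7.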

--Fred
Fred Stluka -- mailto:fr...@bristle.com -- http://bristle.com/~fred/
Bristle Software, Inc -- http://bristle.com -- Glad to be of service!
Open Source: Without walls and fences, we need no Windows or Gates.

Carl Meyer

Nov 10, 2014, 5:58:40 PM
to django...@googlegroups.com
Hi David,

On 11/10/2014 08:14 AM, dpalao...@gmail.com wrote:
> Probably I was not clear in my original post. My problem is precisely
> how to switch managed to True just before the tests and put it back to
> False afterwards to avoid creating of tables by hand during the tests.
> That is precisely the trick described in the link I included in my first
> post: subclass the DiscoverRunner class. But I don't know how to achieve it.

Fred posted about how to trick the ORM into creating these tables for
you. For what it's worth, if I were in this situation I would not try to
flip `managed` to `True` during testing for these tables. In production
these tables have a fixed known schema, they are not created by the
Django ORM. I'd prefer to replicate that situation in testing as well,
so that my tests would catch a mis-match between my ORM models for the
legacy tables and the actual schema of the tables.

So the approach I'd take instead would be to bypass the ORM entirely
when creating these tables. Instead create a SQL file with the DDL for
the exact schema of the real legacy tables, and then hook into the test
runner to run that SQL file directly against your test database.
Specifically, I think in a subclass of DiscoverRunner you could override
the `setup_databases` method and run your SQL immediately after that.

(Note that I haven't actually done this, so there may be caveats I
haven't anticipated. If you want a battle-tested solution, you might
prefer Fred's. But if I was in your situation, I would first try to get
a solution working that doesn't have the ORM create these tables for
testing.)

Carl

dpalao...@gmail.com

Nov 11, 2014, 1:45:34 PM
to django...@googlegroups.com, fr...@bristle.com, FredS...@gmail.com
Dear Fred,
Thanks a lot for the answer.
Actually I got very happy when I saw it. But I sadly found out that it does not work in my case.

I think the problem is related to the way Django-1.7 behaves with respect to databases, migrations and so on. Not 100% sure, though.

I don't really know what to do...

Best

dpalao...@gmail.com

Nov 11, 2014, 2:01:37 PM
to django...@googlegroups.com
Dear Carl,
I see your point. You might be right, but it is not clear to me how to do it and if it would work: I have already tried to subclass DiscoverRunner to modify its behaviour with little success.
Another problem that I see: it is not an homogenous approach. I mean, the models are created from the production database. Now I create an independent database for testing. Of course, you will tell me that I have to follow the TDD approach to the end: for the code that creates the testing database too! And I would agree. But it is clearly a bit too complicated.
I'm having serious doubts on how friendly Django-1.7 is with respect to TDD...

Best,

David

Carl Meyer

Nov 11, 2014, 2:13:18 PM
to django...@googlegroups.com
Hi David,

On 11/11/2014 07:01 AM, dpalao...@gmail.com wrote:
> I see your point. You might be right, but it is not clear to me how to
> do it and if it would work: I have already tried to subclass
> DiscoverRunner to modify its behaviour with little success.

If there are specific aspects of "how to do it" that are confusing to
you, I could try to clarify. Or if you try it and have specific problems.

> Another problem that I see: it is not an homogenous approach. I mean,
> the models are created from the production database. Now I create an
> independent database for testing. Of course, you will tell me that I
> have to follow the TDD approach to the end: for the code that creates
> the testing database too! And I would agree. But it is clearly a bit too
> complicated.

I'm afraid I am quite confused by this paragraph.

I don't know what you mean by "not an homogenous approach".

And I don't know what you mean by "the models are created from the
production database."

And I don't know what you mean by "an independent database for testing."
Django already creates a new database for each test run. I'm not
suggesting creating any additional database beyond that, just running
some SQL to create your legacy (un-managed) tables in the testing database.

The code that creates the testing database will be exercised at the
start of every test run, and if it's not working your tests will fail
because the legacy tables are not created. Personally, I'd consider that
adequate; I wouldn't write any additional tests for that code.

> I'm having serious doubts on how friendly Django-1.7 is with respect to
> TDD...

That seems an unfair conclusion, since your problems here mostly arise
from using legacy tables un-managed by Django. That's not the typical
situation for a Django project, and it's naturally one that will require
some manual work on your part for the corresponding testing setup.

Carl


dpalao...@gmail.com

Nov 11, 2014, 3:37:06 PM
to django...@googlegroups.com
Dear Carl,

Thank you for the answer.


On Tuesday, November 11, 2014 3:13:18 PM UTC+1, Carl Meyer wrote:
> Hi David,
>
> On 11/11/2014 07:01 AM, dpalao...@gmail.com wrote:
> > I see your point. You might be right, but it is not clear to me how to
> > do it and if it would work: I have already tried to subclass
> > DiscoverRunner to modify its behaviour with little success.
>
> If there are specific aspects of "how to do it" that are confusing to
> you, I could try to clarify. Or if you try it and have specific problems.


Well, I have never worked with SQL directly. And as I said in my post, when I tried to subclass DiscoverRunner I got the impression that it is not straightforward (at least for my limited experience with Django).

> > Another problem that I see: it is not an homogenous approach. I mean,
> > the models are created from the production database. Now I create an
> > independent database for testing. Of course, you will tell me that I
> > have to follow the TDD approach to the end: for the code that creates
> > the testing database too! And I would agree. But it is clearly a bit too
> > complicated.
>
> I'm afraid I am quite confused by this paragraph.
>
> I don't know what you mean by "not an homogenous approach".
>
> And I don't know what you mean by "the models are created from the
> production database."
>
> And I don't know what you mean by "an independent database for testing."
> Django already creates a new database for each test run. I'm not
> suggesting creating any additional database beyond that, just running
> some SQL to create your legacy (un-managed) tables in the testing database.
>
> The code that creates the testing database will be exercised at the
> start of every test run, and if it's not working your tests will fail
> because the legacy tables are not created. Personally, I'd consider that
> adequate; I wouldn't write any additional tests for that code.


 I created the models by running "inspectdb". I see that as a (logical) link between the models and the tables. Perhaps not the standard link, but they are related. That is what I meant by "the models are created from the production database".

Now your suggestion is to create some tables from scratch. By hand. This is what I mean by "not an homogeneous approach", because I don't see a logical link between the models and the testing database. The testing database is created by me. (Or by Django, then I take control to create the tables, then I return control to Django -- if this is what you suggest).

But actually I think your solution makes a lot of sense. At the beginning I wanted Django to create the testing database along with the tables automatically. And you are probably right that the SQL code to create the tables is absolutely trivial, but for me it is not.
So I also tried something slightly different. As the production database already exists, I think it would make sense for the testing database to exist prior to testing as well.
What I just did was to create the testing database with a command like

mysqldump -u user --password='xxx' -h localhost -d production_db | mysql -h localhost -u user --password='xxx' -Dtest_db

Again, I must subclass DiscoverRunner to have a chance to succeed (I need to bypass db creation), and actually Django complains if I don't do that.
So, I subclassed DiscoverRunner like this:

class SkipDBCreationDiscoverRunner(DiscoverRunner):
    def setup_databases(self, **kwargs):
        return

    def teardown_databases(self, old_config, **kwargs):
        pass

and enabled this new thing to run instead of the default in settings.py:

TEST_RUNNER="tests.nodb_runner.SkipDBCreationDiscoverRunner"

I created a "TEST" entry in the DATABASES dictionary with the proper db name:
DATABASES = {
    'default': {
        'ENGINE': 'mysql.connector.django',
        'NAME': 'production_db',
        'USER': 'user',
        'PASSWORD': 'xxx',
        'HOST': 'localhost',
        'OPTIONS': {
            'autocommit': True,
        },
    },

    'TEST': {
        'ENGINE': 'mysql.connector.django',
        'NAME': 'test_db',
        'USER': 'user',
        'PASSWORD': 'xxx',
        'HOST': 'localhost',
        'OPTIONS': {
            'autocommit': True,
        },
    },
}


BUT, Django is using the production database! I don't know what I'm missing. Probably something stupid...


> > I'm having serious doubts on how friendly Django-1.7 is with respect to
> > TDD...
>
> That seems an unfair conclusion, since your problems here mostly arise
> from using legacy tables un-managed by Django. That's not the typical
> situation for a Django project, and it's naturally one that will require
> some manual work on your part for the corresponding testing setup.
>
> Carl

 
Sorry if that sounded rude or disrespectful. It was not my intention. I thought that using Django to work with existing databases was a common case. For me, up to now it has been a bit painful.

Thanks again.

Best,

David

PS Probably many words are not very accurate in my previous post(s), sorry if that causes confusion. The term "independent database for testing" may be a good example. I will try to be more precise in the future. Also, I don't intend to start a semantic discussion or whatever. I just tried to clarify (probably with little success) what I mean.

Carl Meyer

Nov 11, 2014, 11:26:11 PM
to django...@googlegroups.com
Hi David,

On 11/11/2014 08:37 AM, dpalao...@gmail.com wrote:
> Dear Carl,
>
> Thank you for the answer.
>
> On Tuesday, November 11, 2014 3:13:18 PM UTC+1, Carl Meyer wrote:
>
> Hi David,
>
I see, that makes sense.

> Now your suggestion is to create some tables from scratch. By hand. This
> is what I mean by "not an homogeneous approach", because I don't see a
> logical link between the models and the testing database. The testing
> database is created by me. (Or by Django, then I take control to create
> the tables, then I return control to Django -- if this is what you suggest).

Sorry, I didn't fully explain this part of my suggestion. I would not
create the SQL for those tables "by hand" - I would do it via a dump of
the SQL schema from the actual production tables.

> But actually I think your solution makes a lot of sense. At the
> beginning I wanted Django to create the testing database along with the
> tables automatically. And you probably are right and the SQL code to
> create the tables is absolutely trivial, but for me it is not.
> So I tried also something slightly different. As the database already
> exists, I think it would make sense if I also the testing database
> exists prior to testing.
> What I just did was:
> I created the testing database simply with a command like
>
> mysqldump -u user --password='xxx' -h localhost -d production_db | mysql
> -h localhost -u user --passwor='xxx' -Dtest_db

So this is close to my suggestion. But rather than creating the entire
DB (including managed models too), I would send the dump to a SQL file
instead:

mysqldump -u user --password='xxx' -h localhost -d production_db > schema.sql

And then edit schema.sql to remove any tables for normal managed models,
leaving only the table definitions for the unmanaged (legacy) tables.
(Note that `-d` is the short form of `--no-data`, so this dump contains only the
schema; drop `-d` if you want your legacy tables to also be prepopulated with
data for your tests).

Then you can allow Django to create its test database normally, and run
migrations to create tables for your managed models; you just have to
additionally run `schema.sql` to add in the un-managed tables.
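
The manual edit of the dump could also be scripted. The following is only a sketch under assumptions: the function name keep_unmanaged_tables is hypothetical, the dump is assumed to use the usual mysqldump layout ("CREATE TABLE `name` (...);"), and statements are split naively on ";", so a semicolon inside a comment or a default value would break it.

```python
import re

# Matches the table name in a statement such as: CREATE TABLE `job_info` (
_CREATE_TABLE = re.compile(
    r"^\s*CREATE TABLE\s+(?:IF NOT EXISTS\s+)?`?(\w+)`?", re.IGNORECASE
)


def keep_unmanaged_tables(dump_text, table_names):
    """Return only the CREATE TABLE statements for the given tables.

    Everything else in the dump (DROP TABLE, SET statements, other
    tables) is discarded.
    """
    wanted = set(table_names)
    kept = []
    for statement in dump_text.split(";"):
        match = _CREATE_TABLE.match(statement)
        if match and match.group(1) in wanted:
            kept.append(statement.strip() + ";")
    return "\n\n".join(kept) + ("\n" if kept else "")
```

The result could be written to schema.sql in place of the hand-edited file. It is still worth reviewing by eye, since things like foreign keys between tables or views are not handled here.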

(It's possible that your approach of pre-creating the entire test DB
could just work, with no need to subclass DiscoverRunner at all, by
using the --keepdb option to manage.py test; but that option only exists
in the Django development version, so you probably don't have it.)

> Again, I must subclassing DiscoverRunner to have a chance to succeed
> (need to by-pass db creation), and actually Django complains if I don't
> do that.
> So, I subclassed the DiscoverRunner like this
>
> class SkipDBCreationDiscoverRunner(DiscoverRunner):
>     def setup_databases(self, **kwargs):
>         return
>
>     def teardown_databases(self, old_config, **kwargs):
>         pass

Rather than bypassing database setup and teardown entirely (which will
cause Django to just use the production database, as you noticed), I
would override only setup_databases, and still call the super method,
but just add code after it to run your SQL file and add the table
definitions for the legacy tables. Something more like this (untested):

from django.db import connection
from django.test.runner import DiscoverRunner

class MyDiscoverRunner(DiscoverRunner):
    def setup_databases(self, *a, **kw):
        ret = super(MyDiscoverRunner, self).setup_databases(*a, **kw)
        # Create the legacy (un-managed) tables in the freshly created test DB.
        cur = connection.cursor()
        cur.execute(open('schema.sql').read())
        return ret

Regarding the "TEST" entry you added to DATABASES: this is not the right
format for configuring a test database. What you've configured there is an
ordinary non-test database that happens to be named "TEST" - Django won't
do anything special with that. To configure a test version of a particular
database, you use a 'TEST' sub-key, like this (and you don't have to repeat
any configuration that is the same):

DATABASES = {
    'default': {
        'ENGINE': 'mysql.connector.django',
        'NAME': 'production_db',
        'USER': 'user',
        'PASSWORD': 'xxx',
        'HOST': 'localhost',
        'OPTIONS': {
            'autocommit': True,
        },
        'TEST': {
            'NAME': 'test_db',
        },
    },
}
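
To make the effect of the 'TEST' sub-key concrete, here is a small sketch of how the test database name is chosen for one alias. It only imitates the documented default (an explicit TEST NAME wins, otherwise "test_" is prepended to NAME) and is not Django's actual code; the function name resolve_test_db_name is hypothetical, and SQLite, which defaults to an in-memory database, is not covered.

```python
def resolve_test_db_name(alias_settings, prefix="test_"):
    """Sketch of how the test database name is chosen for one alias."""
    test_settings = alias_settings.get("TEST") or {}
    # An explicit TEST['NAME'] wins; otherwise the prefix is prepended to NAME.
    return test_settings.get("NAME") or prefix + alias_settings["NAME"]


DATABASES = {
    "default": {
        "NAME": "production_db",
        "TEST": {"NAME": "test_db"},
    },
}

print(resolve_test_db_name(DATABASES["default"]))  # test_db
```

A top-level entry called 'TEST' never enters this lookup: it is just another database alias next to 'default'.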

> BUT, Django is using the production database! I don't know what I'm
> missing. Probably something stupid...
>
>
> > I'm having serious doubts on how friendly Django-1.7 is with respect to
> > TDD...
>
> That seems an unfair conclusion, since your problems here mostly arise
> from using legacy tables un-managed by Django. That's not the typical
> situation for a Django project, and it's naturally one that will require
> some manual work on your part for the corresponding testing setup.
>
> Carl
>
>
> Sorry if that sounded rude or disrespectful. It was not my intention. I
> thought that using Django to work with existing databases was a common
> case. For me, up to now it has been a bit painful.

This raises the question of whether you really need to be using
un-managed models at all. The other option is to use normal managed
models, but just fake the initial (table-creating) migration for those
models on your production database. Then testing will just work normally.

Carl


David Palao

Nov 12, 2014, 12:32:00 PM
to django...@googlegroups.com
Dear Carl,
It works!!!
I just made some small changes (see below).
Thanks a lot for your patience and help.

Best,

David

PS The other solution, namely "fake the initial (table-creating)
migration for those models on your production database" is
interesting.
I don't know which solution is better. It feels safer, from my point of
view, playing around with testing databases than with production
databases. But I could consider the other solution if there are strong
enough arguments in favour of it.

---

I summarize the steps I took to solve it, just for future reference:

Setup: Django-1.7, Python-3.3

Problem: unit tests crash due to non-existing tables for unmanaged models

Solution: subclass DiscoverRunner to include "by-hand" creation of
tables for unmanaged models.

Details:
1. Create models from tables with "python manage.py inspectdb > myapp/models.py"
   • Edit the file as needed:
     • remove models that are related to Django (they were present after some tests)
     • un-manage the models that represent tables in the production db

2. Create a dump (to file) of the production database structure with only the un-managed tables in it:
   a. mysqldump -u root --password='xxx' -h localhost -d production_db > tests/production_db_schema.sql
   b. Edit "production_db_schema.sql"
      1. to remove managed tables, if present,
      2. and to join multiline SQL commands into one line each.

3. Subclass DiscoverRunner. In my case:
   a. create a file called "tests/mytest_db_runner.py" inside the project directory
   b. create a subclass of DiscoverRunner in it:

from django.test.runner import DiscoverRunner
from django.conf import settings
from django.db import connection

class NonManagedModelsDiscoverRunner(DiscoverRunner):
    """
    DiscoverRunner subclass that creates un-managed tables from a file containing
    the database schema (given by settings.TEST_DB_UNMANAGED_TABLES_SCHEMA_FILE)
    """
    def setup_databases(self, **kwargs):
        ret = super(NonManagedModelsDiscoverRunner, self).setup_databases(**kwargs)
        cur = connection.cursor()
        schema_fn = settings.TEST_DB_UNMANAGED_TABLES_SCHEMA_FILE
        with open(schema_fn, "r") as f:
            for line in f:
                if line.strip() != "":
                    cur.execute(line)
        return ret

   c. be sure that the project "settings.py" file contains something like:
'''
TEST_RUNNER = "tests.mytest_db_runner.NonManagedModelsDiscoverRunner"
TEST_DB_UNMANAGED_TABLES_SCHEMA_FILE = os.path.join(BASE_DIR, "tests/production_db_schema.sql")
'''

4. Configure the project "settings.py" to set a name for the testing db (if needed, not in my case).
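
The runner in step 3 executes the schema file line by line, which is why step 2 joins multiline SQL commands by hand. A possible alternative, sketched here under assumptions, is to split the file on ";" instead. The names iter_sql_statements and load_schema are hypothetical, the splitting is naive (a semicolon inside a string literal would break it), and the demonstration uses an in-memory SQLite database only so that it runs without a MySQL server.

```python
import sqlite3


def iter_sql_statements(sql_text):
    """Yield the statements of a schema dump, one per ';'.

    Lines starting with '--' are dropped before splitting.
    """
    lines = [ln for ln in sql_text.splitlines()
             if not ln.lstrip().startswith("--")]
    for chunk in "\n".join(lines).split(";"):
        statement = chunk.strip()
        if statement:
            yield statement


def load_schema(cursor, sql_text):
    """Execute each statement separately on any DB-API cursor."""
    for statement in iter_sql_statements(sql_text):
        cursor.execute(statement)


# Demonstration against an in-memory SQLite database.
schema = """
-- Table structure for table job_info
CREATE TABLE job_info (
    job_db_inx INTEGER PRIMARY KEY,
    approved INTEGER NOT NULL
);
"""
conn = sqlite3.connect(":memory:")
load_schema(conn.cursor(), schema)
```

Inside setup_databases, the loop over lines could then be replaced by something like load_schema(connection.cursor(), f.read()), leaving the dump file in its original multiline form.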