Multiple database support (#1142) roadblock


jpel...@gmail.com

Jun 7, 2006, 6:25:21 PM
to Django developers
I've been hacking away today at a patch for #1142 (multiple database
connection support), which is the last serious technical hurdle to my
company's use of django -- a very itchy itch for me to scratch.

Mostly things have gone well. My basic design is to add, as a
supplement to the current db.connection, db.connections: a lazy
connection manager that acts like a dict of named connections. It picks
up its settings for those connections from a new settings var,
DATABASES, like so:

>>> from django.conf import settings
>>> settings.DATABASES = {
...     'a': {'DATABASE_ENGINE': 'sqlite3',
...           'DATABASE_NAME': '/tmp/dba.db'},
...     'b': {'DATABASE_ENGINE': 'sqlite3',
...           'DATABASE_NAME': '/tmp/dbb.db'},
... }
>>> from django.db import connections

# connections[database] holds an object that includes connection,
# DatabaseError, backend, get_introspection_module,
# get_creation_module, and runshell. Connections are established only
# when requested.

>>> connections['a']
Connection: <django.db.backends.sqlite3.base.DatabaseWrapper object at
...> (ENGINE=sqlite3 NAME=/tmp/dba.db)
>>> connections['b']
Connection: <django.db.backends.sqlite3.base.DatabaseWrapper object at
...> (ENGINE=sqlite3 NAME=/tmp/dbb.db)

(That's right out of the doctests for the new code.)
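
For anyone who wants a feel for the "lazy dict of named connections" idea, here is a minimal sketch. It is not the code from the patch: the class name `LazyConnections`, the `is_open` helper, and the direct use of the standard `sqlite3` module with in-memory databases are all assumptions made for illustration.

```python
import sqlite3


class LazyConnections:
    """Dict-like registry that opens a named connection on first access.

    Hypothetical illustration only; not the patch's implementation.
    """

    def __init__(self, databases):
        self._settings = databases   # name -> settings dict
        self._open = {}              # name -> live connection

    def __getitem__(self, name):
        if name not in self._settings:
            raise KeyError(name)
        if name not in self._open:
            # Connect only when the connection is first requested.
            db_name = self._settings[name]["DATABASE_NAME"]
            self._open[name] = sqlite3.connect(db_name)
        return self._open[name]

    def is_open(self, name):
        return name in self._open


# In-memory databases stand in for the /tmp/*.db files used above.
DATABASES = {
    "a": {"DATABASE_ENGINE": "sqlite3", "DATABASE_NAME": ":memory:"},
    "b": {"DATABASE_ENGINE": "sqlite3", "DATABASE_NAME": ":memory:"},
}
connections = LazyConnections(DATABASES)
```

Nothing is connected until `connections['a']` is looked up, and repeated lookups return the same connection object.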

This, plus a little patching of the Meta options, gives each model its
own connection, set by a Meta option and used by the managers attached
to the model. I still have some work to do to integrate with the
transactions module.

However, I've run into a problem that can't be fixed with little
patches, in django.core.management. Specifically, all of the many
get_sql_* functions in there pull together sql from multiple models and
execute it all with the default connection. That's not going to work if
model A and model B want different connections.

I'd argue that the right solution here would be to push the brains
farther out to the edge. Have management functions call class methods
on models to execute table creation, initial data loading, etc, rather
than having them poll the models for information and construct and
execute the sql themselves. Something like:

def install(app):
    # ... validation, start a transaction, etc
    for model in models.get_models(app):
        model.install()

Rather than the current:

def install(app):
    # ...
    sql_list = get_sql_all(app)

    try:
        cursor = connection.cursor()
        for sql in sql_list:
            cursor.execute(sql)

That would be a pretty substantial change, but I think it would open up
a lot of interesting possibilities for models that behave differently
from the default -- and it certainly would make supporting multiple
databases a whole lot easier.
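
To make the proposal concrete, here is a rough, self-contained sketch of what "pushing the brains out to the edge" could look like. Everything in it is assumed for illustration: the `db_connection`, `table`, and `columns` class attributes, the plain dict standing in for db.connections, and the use of in-memory sqlite3 databases. It is not how the patch or Django's management module is written.

```python
import sqlite3

# Hypothetical stand-in for the proposed db.connections registry
# (opened eagerly here to keep the sketch short).
connections = {
    "a": sqlite3.connect(":memory:"),
    "b": sqlite3.connect(":memory:"),
}


class Model:
    db_connection = "a"   # assumed per-model option naming a connection
    table = None
    columns = ()

    @classmethod
    def install(cls):
        """Create this model's table on the model's own connection."""
        cols = ", ".join(cls.columns)
        conn = connections[cls.db_connection]
        conn.execute(f"CREATE TABLE {cls.table} ({cols})")
        conn.commit()


class Author(Model):
    db_connection = "a"
    table = "author"
    columns = ("id INTEGER PRIMARY KEY", "name TEXT")


class LogEntry(Model):
    db_connection = "b"
    table = "log_entry"
    columns = ("id INTEGER PRIMARY KEY", "message TEXT")


def install(models):
    # The management function only iterates; each model does its own work.
    for model in models:
        model.install()


def table_names(conn):
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


install([Author, LogEntry])
```

The point of the sketch is that `install()` never touches a connection itself, so two models with different connections need no special handling there.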

What do you all think?

Thanks,

JP

Jeremy Dunck

Jun 7, 2006, 6:30:25 PM
to django-d...@googlegroups.com
On 6/7/06, jpel...@gmail.com <jpel...@gmail.com> wrote:
> I'd argue that the right solution here would be to push the brains
> farther out to the edge. Have management functions call class methods
> on models to execute table creation, initial data loading, etc, rather
> than having them poll the models for information and construct and
> execute the sql themselves. Something like:

I don't deserve much of a vote, but when I peeked in manage.py, I
thought the same thing. Wha? All that reflection magic isn't in the
model itself?

But, yeah, it was too big a thing for me to start. ;-)

Adrian Holovaty

Jun 7, 2006, 6:46:47 PM
to django-d...@googlegroups.com

The reason for that is that the reflection magic is only needed in
special situations -- namely, table creation -- so there's no need to
load it into memory.

Adrian

--
Adrian Holovaty
holovaty.com | djangoproject.com

Malcolm Tredinnick

Jun 7, 2006, 6:50:16 PM
to django-d...@googlegroups.com
On Wed, 2006-06-07 at 22:25 +0000, jpel...@gmail.com wrote:

[...]

I don't think you can completely remove the controller portion here,
although you can push a lot of the mechanics down into the model
managers. The difficulty is that models do not exist completely
independently of other models.

Think about relations between models. ForeignKey and friends now have to
be implemented differently or can only apply to models using the same
database. And, in any case, they need to know the table name (and quite
possibly database name and connection proxy) for the related tables. So
you are going to have to do a pass through all the models and build up the
graph of dependencies and make that available to each model at
construction time as well, aren't you?

[As an aside: I think we are also going to discover that ForeignKey is
unfortunately named, because one-to-many relations to tables in another
database are not complete nonsense; I've worked on systems in the past
that separated frequently read tables from frequently updated, less
frequently read tables for performance reasons.]

>
> What do you all think?

Nice work. :-)

Best wishes,
Malcolm

Adrian Holovaty

Jun 7, 2006, 7:04:05 PM
to django-d...@googlegroups.com
> I'd argue that the right solution here would be to push the brains
> farther out to the edge. Have management functions call class methods
> on models to execute table creation, initial data loading, etc, rather
> than having them poll the models for information and construct and
> execute the sql themselves.

As I mentioned in another note to this thread, things are the way they
are because I didn't want to load all the rarely-used reflection stuff
into memory each time a model is used. That said, if it helps your
goal (which would be a great Django addition), let's go ahead and make
those model methods. One immediate problem I'd see is that it
increases the number of reserved words (assuming these are model-level
class methods). Perhaps, like "objects" for Managers, we should put
all the reflection stuff within the namespace of a single attribute.
Or maybe they become Manager methods, to avoid that problem entirely?
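
As a rough illustration of the "single attribute" idea, the sketch below hangs all schema helpers off one name and builds them only on first access. The names `schema`, `SchemaTools`, and `LazySchema`, and the `table`/`columns` attributes, are hypothetical and not part of Django; a real version would presumably also defer the import of the helper module, which is where the memory saving would come from.

```python
class SchemaTools:
    """Hypothetical holder for rarely used schema helpers."""

    def __init__(self, model):
        self.model = model

    def create_sql(self):
        cols = ", ".join(self.model.columns)
        return f"CREATE TABLE {self.model.table} ({cols})"


class LazySchema:
    """Descriptor that builds SchemaTools only when first accessed."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        tools = SchemaTools(owner)
        # Cache on the accessing class so later lookups skip the descriptor.
        setattr(owner, self.name, tools)
        return tools


class Model:
    # One reserved name instead of one per reflection method.
    schema = LazySchema()


class Poll(Model):
    table = "poll"
    columns = ("id INTEGER PRIMARY KEY", "question TEXT")
```

With this layout `Poll.schema.create_sql()` works, while a model that never touches `schema` never builds the helper object.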

jpel...@gmail.com

Jun 7, 2006, 10:26:47 PM
to Django developers
> And, in any case, they need to know the table name (and quite
> possibly database name and connection proxy) for the related tables. So
> you are going to have to do a pass through all the models and build up the
> graph of dependencies and make that available to each model at
> construction time as well, aren't you?

Sure, but that has to happen anyway to support initial data for models
that have foreign keys. It won't be too difficult to produce a list of
models for an app ordered by dependency. Actually, hasn't someone
already done the inverse (producing models in dependency order) with
inspectdb? The tough part will be inter-app dependencies. Probably best
to handle those as django does now, with a pending_references set, but
I need to mull that over for a while.
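
A small sketch of the ordering step, under stated assumptions: models are represented only by name, `models` maps each name to the names it references, and anything that points outside the app is collected as a pending reference instead of being resolved. The function name and data shapes are made up for this example and are not taken from django.core.management.

```python
def order_by_dependency(models):
    """Return (ordered_names, pending_references).

    `models` maps a model name to the set of names it references by
    foreign key. References to names outside `models` are reported as
    pending rather than treated as errors.
    """
    ordered, seen, pending = [], set(), set()

    def visit(name, stack):
        if name in seen:
            return
        if name in stack:
            raise ValueError(f"circular reference involving {name}")
        for target in sorted(models[name]):
            if target == name:
                continue  # a self-reference needs no ordering
            if target in models:
                visit(target, stack | {name})
            else:
                pending.add((name, target))
        seen.add(name)
        ordered.append(name)

    for name in sorted(models):
        visit(name, frozenset())
    return ordered, pending


# Hypothetical app: Choice -> Poll, Vote -> Choice, plus two references
# to a model that lives in another app.
app_models = {
    "Choice": {"Poll"},
    "Poll": {"auth.User"},
    "Vote": {"Choice", "auth.User"},
}
ordered, pending = order_by_dependency(app_models)
```

Here `ordered` lists each model after the models it depends on, and `pending` holds the inter-app references that would have to be dealt with once the other app is installed.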

> [As an aside: I think we are also going to discover that ForeignKey is
> unfortunately named, because one-to-many relations to tables in another
> database are not complete nonsense; I've worked on systems in the past
> that separated frequently read tables from frequently updated, less
> frequently read tables for performance reasons.]

The crunchy PHP ORM that I wrote at my day job and that I fervently
hope to retire, along with the rest of the equally crunchy framework
surrounding it, supports "AlienForeignKey" and "AlienManyToMany"
relations -- it's quite easy when you can depend on the alien model to
do most of the work.

> Nice work. :-)

Thanks! Hopefully it will remain nice when it's done. :)

JP

jpel...@gmail.com

Jun 7, 2006, 10:49:50 PM
to Django developers
> As I mentioned in another note to this thread, things are the way they
> are because I didn't want to load all the rarely-used reflection stuff
> into memory each time a model is used. That said, if it helps your
> goal (which would be a great Django addition), let's go ahead and make
> those model methods. One immediate problem I'd see is that it
> increases the number of reserved words (assuming these are model-level
> class methods). Perhaps, like "objects" for Managers, we should put
> all the reflection stuff within the namespace of a single attribute.
> Or maybe they become Manager methods, to avoid that problem entirely?

Excellent points. I like the idea of putting the schema manipulation
functions in the manager -- much better than polluting the model
namespace. (Heck, I think save() and delete() should go in there too,
but that's another issue entirely... ). To avoid loading all of this
logic when it's not needed, perhaps we could put it all into
backend/creation.py. That might imply creating something like an
ansisql backend with just common creation parts, since so much is
similar across different databases. (SQLAlchemy takes a similar
approach -- a great project with many good, stealable ideas. :)
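
Roughly what I mean by an ansisql base, as a sketch only: the class names `AnsiCreation` and `MySQLCreation` and the method names are invented for illustration and do not exist in Django. The shared class holds the common SQL generation, and a backend overrides just the pieces where it differs (here, identifier quoting, since MySQL uses backticks by default).

```python
class AnsiCreation:
    """Hypothetical base class with creation SQL shared by all backends."""

    def quote_name(self, name):
        # Standard SQL quotes identifiers with double quotes.
        return '"%s"' % name

    def create_table_sql(self, table, columns):
        cols = ", ".join(
            f"{self.quote_name(col)} {col_type}" for col, col_type in columns
        )
        return f"CREATE TABLE {self.quote_name(table)} ({cols})"


class MySQLCreation(AnsiCreation):
    """Hypothetical backend that only overrides what differs."""

    def quote_name(self, name):
        return "`%s`" % name


columns = [("id", "integer"), ("question", "varchar(200)")]
ansi_sql = AnsiCreation().create_table_sql("poll", columns)
mysql_sql = MySQLCreation().create_table_sql("poll", columns)
```

Since none of this is needed at request time, a module like this could be imported only by the management commands.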

I'll start working in that direction tomorrow, if that seems like a
good plan. I'm going to be mostly internetless for the next 2 weeks or
so, so it will be a while before I can actually submit a patch that's
fully functional. But I can mail or post up what I have so far tomorrow
morning, if there's any interest. Would it be worthwhile to start a
wiki page at this point?

Thanks for the great feedback, all.

JP

Adrian Holovaty

Jun 7, 2006, 11:50:55 PM
to django-d...@googlegroups.com
> I'll start working in that direction tomorrow, if that seems like a
> good plan. I'm going to be mostly internetless for the next 2 weeks or
> so, so it will be a while before I can actually submit a patch that's
> fully functional. But I can mail or post up what I have so far tomorrow
> morning, if there's any interest. Would it be worthwhile to start a
> wiki page at this point?

Sounds good -- looking forward to it! Yes, go ahead and start a wiki
page if you get a chance.

jpel...@gmail.com

Jun 8, 2006, 3:19:39 PM
to Django developers
Wiki page up:

http://code.djangoproject.com/wiki/MultipleDatabaseSupport

The current (sloppy, totally incomplete) patch is attached, along with
a description of where I see it heading.

JP

David Elias

Jun 9, 2006, 1:24:04 PM
to Django developers

jpel...@gmail.com wrote:
> perhaps we could put it all into backend/creation.py.

That's what I had implemented early on with the firebird backend patch:
I put the sequence and trigger SQL in creation.py. Lately I was trying to
keep management.py intact in terms of functionality, but I think your
approach of putting all the SQL statements in the creation module is
better. At the least, it will give much more flexibility for generating
SQL for all backends.

Nice work.

David
