On Sat, Dec 5, 2009 at 3:34 AM, Tobias McNulty <tob...@caktusgroup.com> wrote:
> The very first "Requirement"/ideal feature on the wiki page reads:
>
> "Different models/applications living on different databases, for
> example a 'blog' application on db1 and a forum application on db2.
> This should include the ability to assign a different database to an
> existing application without modifying it, e.g. telling Django where
> to keep the django.contrib.auth.User table. "
>
> That sounds perfect - having a way to modify in what database a table
> exists on a per-model or per-app basis, without making any changes to
> the app itself (so I can choose to put reusable app A in database X
> and reusable app B in database Y).
>
> I don't see anything in the docs about if and how this type of
> partitioning is supported.
The underlying idea of multidb is that you can direct any query to any
database you choose. If you want to force a particular model to a
particular database by default, then you install a default manager
that directs queries for that model onto that database of choice:
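
    from django.db import models

    class MyManager(models.Manager):
        def get_query_set(self):
            # Route every query made through this manager to the 'other' database.
            return super(MyManager, self).get_query_set().using('other')
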
You then use syncdb to synchronize applications onto the databases
where you want them.
Now - I fully acknowledge that this is problematic for models like
contrib.auth, as you don't have the ability to install a default
manager. As I noted in my explanatory notes for the call for feedback,
this iteration is the 'porcelain but not the polished seat' version of
the code. You can direct a contrib.auth query to any database you want
with this iteration; in 1.3, I expect that Django will gain some form
of registration/callback API to allow you to control database assignment
on a per-query basis.
> From what I understand, it has been
> proposed that this be implemented for the admin specifically through
> ModelAdmin and/or at the admin.Site level, because in the current
> iteration of the code there is no way to use the admin for anything
> other than the default database.
The idea about having Site bound to a database is consistent with the
view of contrib.admin as an Administration interface for your database.
If you have two databases, having two administration interfaces isn't a
huge logical jump.
Of course, it would be nice to be able to switch between databases
inside a single admin interface, and I don't see any reason that this
couldn't be done - it will just take more effort to implement.
> To me, specifying the database in the ModelAdmin or admin.Site seems
> arbitrary and potentially limiting: For any reusable app on the
> market, depending on the value of a particular setting in your urls.py
> or admin.py file is a Bad Idea.
I don't follow that logic. Admin sites are deployed into urls.py by
the end user - who knows exactly what databases are available, and
what models are available on those databases.
The only time this would be a problem for Site is if a reusable app
included an admin deployment as part of its urls.py - but I can't
think of any reusable app that does this.
It is arguably true of ModelAdmin. However, remember if you don't like
the ModelAdmin for contrib.auth, you can unregister that ModelAdmin
and install your own.
> What if the admin was instead fixed
> by providing facilities for the more general case outlined above?
>
> What would this look like? I'm picturing another setting (bleh) that
> maps apps and/or models to specific databases. Name choices aside,
> this might look something like:
>
> DATABASE_USING = {
>     'django.contrib.auth': 'default',
>     'myapp': 'database1',
>     'myapp2': {
>         'Model1': 'database2',
>         'Model2': 'database3',
>     }
> }
>
> The admin could automatically take advantage of this setting to
> determine what database to use for a given model.
Alex, Joseph Kocherhans and I discussed this exact idea at Djangocon.
You are correct that it has a lot of potential uses - not only the
admin, but also loaddata/dumpdata, syncdb, and anywhere else that an
iteration over models is required.
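
As a rough sketch only, this is how code such as the admin or syncdb might consult a setting of the shape proposed above. The helper name database_for and the fallback to 'default' are assumptions made for illustration; nothing like this exists in the multidb code.

```python
# The setting exactly as proposed in the quoted message.
DATABASE_USING = {
    'django.contrib.auth': 'default',
    'myapp': 'database1',
    'myapp2': {
        'Model1': 'database2',
        'Model2': 'database3',
    },
}

def database_for(app, model_name, fallback='default'):
    """Hypothetical helper: resolve the database alias for one model.

    An app entry is either a single alias (applies to every model in the
    app) or a dict mapping model names to aliases.
    """
    entry = DATABASE_USING.get(app, fallback)
    if isinstance(entry, dict):
        # Per-model mapping; unlisted models fall back to the default alias.
        return entry.get(model_name, fallback)
    return entry
```

Every consumer (admin, loaddata/dumpdata, syncdb) would have to go through a lookup like this one, which is why the shape of the setting matters so much.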
However, it's a little bit more complicated than you make out.
This sort of setting is very easy to give as a 10 line example, but in
practice, this isn't what you will require - you will effectively need
to duplicate the contents of INSTALLED_APPS. I have projects in the
wild with dozens of entries in INSTALLED_APPS - all running on a single
database. Writing a DATABASE_USING setting like the one you describe
would be extremely cumbersome and annoying.
So, we could hijack INSTALLED_APPS to represent the idea of database
deployment, using tuples/dictionaries rather than strings to define
how apps are deployed. However, this comes at the cost of backwards
compatibility for anyone iterating over INSTALLED_APPS.
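
To illustrate the compatibility cost, here is a small sketch under assumed conventions: the (app, database) tuple format is hypothetical, and app_labels stands in for any existing code that expects every entry to be a dotted string.

```python
# Hypothetical mixed-format setting: plain strings plus (app, database) tuples.
INSTALLED_APPS = [
    'django.contrib.auth',
    ('myapp', 'database1'),
]

def app_labels(installed_apps):
    """Stand-in for existing code that assumes every entry is a string."""
    return [entry.split('.')[-1] for entry in installed_apps]

def app_labels_compat(installed_apps):
    """The same logic, updated to tolerate the hypothetical tuple entries."""
    labels = []
    for entry in installed_apps:
        name = entry if isinstance(entry, str) else entry[0]
        labels.append(name.split('.')[-1])
    return labels
```

The first function fails with an AttributeError as soon as it reaches the tuple entry; every piece of third-party code that iterates over INSTALLED_APPS would need a change like the second.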
Even if we were able to find an acceptable way to represent this
information, consider some of the other use cases for multi-db like
master/slave or sharding. In these cases, every model needs to be
assigned to every database. Using the syntax you propose would involve
a whole lot of repeated typing of lines like:

    'django.contrib.auth': ['master','slave1','slave2','slave3','slave4']

You could use DATABASES.keys() here, but even then - there's a lot of
extra configuration required to get the simple case of "just put
everything everywhere".
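
A minimal sketch of what the DATABASES.keys() shortcut could look like, assuming the list-valued form of the proposed DATABASE_USING setting. The database aliases and app names are placeholders.

```python
# Placeholder aliases; the connection details are irrelevant to the sketch.
DATABASES = {
    'master': {},
    'slave1': {},
    'slave2': {},
}

INSTALLED_APPS = [
    'django.contrib.auth',
    'django.contrib.admin',
    'myapp',
]

# "Just put everything everywhere": every app is mapped to every alias.
DATABASE_USING = {app: sorted(DATABASES.keys()) for app in INSTALLED_APPS}
```

Even in this compact form, the user has to write and maintain an extra setting just to state the default behaviour.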
In the end, the three of us (Alex, Joseph and I) came to the
conclusion that the best approach was to treat the users as
"consenting adults". Rather than engage in a complex configuration
exercise, we can assume everything is everywhere, and provide the
tools during syncdb to prevent particular tables from being created on
databases where they don't belong. We then allow any query on any
database, and assume that the user knows what they are doing.
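
The "everything everywhere, minus exclusions" idea can be sketched as follows. This is an illustration only: SYNC_EXCLUDE and models_to_sync are hypothetical names, not part of syncdb, and the model labels are made up.

```python
# Hypothetical data: every model is assumed to live on every database...
ALL_MODELS = ['auth.User', 'blog.Entry', 'forum.Post']

# ...except where the user explicitly opts a table out (hypothetical setting).
SYNC_EXCLUDE = {
    'db1': {'forum.Post'},
    'db2': {'blog.Entry'},
}

def models_to_sync(database):
    """Return the models whose tables should be created on `database`."""
    excluded = SYNC_EXCLUDE.get(database, set())
    return [model for model in ALL_MODELS if model not in excluded]
```

With this shape, the common single-database case needs no configuration at all, and only the exceptions have to be written down.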
Looking at this in practical terms, the only code that actually needs
a database-table-lookup facility is code that is engaged in automated
meta-programming, like that used by admin and syncdb. While this is a
very nifty thing to do, it isn't really the common use case.
To be clear - I'm not opposed to having some way to programmatically
determine the models that are available on any given database. Having
this sort of facility would make the admin/syncdb/dumpdata tasks much
easier. It would provide an additional source of error checking in
queries themselves (preventing queries on databases where we know they
can't succeed). However, we need to find a clean way to express this,
and a clean proposal hasn't emerged from the discussions I've had to
date. I'm open to suggestions.
Yours,
Russ Magee %-)