In exploring this, I've realized that Django's transaction management
needs to be refactored to get multiple-database support working. BUT,
if we do it right, we have an opportunity to simplify our transaction
APIs at the same time. I have never liked our transaction APIs --
requiring decorators? WTF? -- so I have long waited for the day when
we can clean them up and make them into something that I actually enjoy
using (and don't have to look up in the documentation each time I want
to use them).
So I have a proposal. Here are the main design concepts:
* Django manages a set of connections, accessible either via a global
dictionary (maybe django.db.connections) or some API
(django.db.get_connection()). This is what we've been talking about in
the "More multi-database plumbing" thread on django-developers.
* Connections are specified in the settings file. Each connection has
a label, like "default", "auth", whatever.
* Transactions are managed via methods on connection objects. NOT via
some strange decorator and magic global django.db.transaction variable
that comes out of thin air.
* QuerySets (and likely model objects, too) have hooks for
*optionally* specifying which connection to use in their queries.
* We retain the concept of a "default" connection, which is the key
for making this backwards-compatible and easy to use.
Here's some example code:
"""
from django import db
from mysite.users.models import User
# This updates User objects on the default connection with
# auto-commit.
User.objects.update(is_registered='t')
# This updates User objects on the "auth" connection with
# auto-commit.
User.objects.with_connection("auth").update(is_registered='t')
# This is equivalent to the previous example, demonstrating
# that with_connection() also can take a connection object.
conn = db.get_connection('auth')
User.objects.with_connection(conn).update(is_registered='t')
# This updates User objects on the default connection within
# a transaction.
conn = db.get_connection() # Equivalent to get_connection('default')
conn.begin()
User.objects.update(is_registered='t')
conn.commit()
# This updates User objects on the "auth" connection within
# a transaction.
conn = db.get_connection('auth')
conn.begin()
User.objects.with_connection(conn).update(is_registered='t')
conn.commit()
"""
For backwards compatibility, we can still keep the legacy decorators
-- transaction.commit_on_success(), etc. -- and they'd just work on
the default connection. But we'd encourage people to use this new API.
My proposal is not necessarily to get this in Django 1.1, but to get
it in trunk at the very least. I'm selfishly motivated by my own
project to get this done ASAP, so I'm very happy to do the
development.
Adrian
> * Transactions are managed via methods on connection objects. NOT via
> some strange decorator and magic global django.db.transaction variable
> that comes out of thin air.
I agree that the global functions are not anything to write home
about. However, I disagree that the decorators need to be written off
so quickly. I like the idea that transaction management can be done on
this new connection object. I still find the decorators useful and
use them all the time. Rather than working only on the default
connection, couldn't they take a connection as an argument?
@transaction.commit_on_success(conn)
When no argument is given, it can simply fall back to the default
connection. The decorators provide, IMHO, a quick way to wrap a
function in some sort of transaction scheme, and that works really well.
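
To make the idea concrete, here is a rough sketch of such a decorator. Nothing here is existing Django code: Connection, connections and get_connection() are stand-ins for the API proposed in this thread, and this commit_on_success is a hypothetical reimplementation that accepts either a label or a connection object and falls back to "default".

```python
import functools


class Connection:
    """Stand-in for the proposed connection object (hypothetical)."""

    def __init__(self, label):
        self.label = label
        self.log = []  # records calls so the behaviour is visible

    def begin(self):
        self.log.append("begin")

    def commit(self):
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")


# Stand-in for the proposed registry of labelled connections.
connections = {"default": Connection("default"), "auth": Connection("auth")}


def get_connection(label="default"):
    return connections[label]


def commit_on_success(target=None):
    """Works as @commit_on_success, @commit_on_success(),
    @commit_on_success("auth") or @commit_on_success(conn)."""

    def decorate(func, conn_or_label):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if isinstance(conn_or_label, Connection):
                conn = conn_or_label
            else:
                conn = get_connection(conn_or_label)
            conn.begin()
            try:
                result = func(*args, **kwargs)
            except Exception:
                conn.rollback()
                raise
            conn.commit()
            return result
        return wrapper

    if callable(target):
        # Bare form: the decorated function itself was passed in.
        return decorate(target, "default")
    label = "default" if target is None else target
    return lambda func: decorate(func, label)


@commit_on_success
def register_all():
    return "done"


@commit_on_success("auth")
def broken_update():
    raise ValueError("simulated failure")
```

Calling register_all() would begin and commit on the default connection, while broken_update() would begin and roll back on "auth" before re-raising.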
Brian Rosner
http://oebfare.com
Since 1.1's only about a month away and we need to focus on finishing
up the features planned for it and squashing bugs before the release,
might it be better to manage this work, for now, in a branch (either
in the main SVN repo, or on an external DVCS mirror like github or
bitbucket)?
--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."
Like James, I'm concerned with getting a 1.1 release that's as
high-quality as possible, and I'm concerned that a big change like
this late in the game could be too destabilizing to hit our (already
delayed) release timeline. On top of that, it rubs me the wrong way to
make our community go through a whole feature proposal process only to
drop a big feature in at the last minute.
We faced a similar decision with aggregation support in the run up to
1.0: it was *mostly* done by feature freeze, but we opted to hold it
to give more time for testing and for the feature set to mature.
Personally, I think it worked out great: 1.0 avoided delays or
destabilization, we got an "easy win" feature for 1.1, and having that
feature nearly ready helped us have a nice short window between 1.0
and 1.1.
I'd say that robust multiple database APIs could be a similar "easy
win" for 1.2 if we start a branch now and get it merged during the 1.2
window. If that branch stays tightly locked to trunk as we stabilize
things for 1.1 it's entirely possible that the branch could be stable
enough for those of us who're used to bleeding edge releases to just
use instead of trunk. I probably will, at least.
As for the specific API itself: I think I need to chew it over a bit.
Seems nice and simple, but I'd like to run through the various
multiple-database use cases I've encountered and think about how
they'd work. In general I'm pretty happy with the direction: I agree
with the annoyance of the global transaction management stuff, and I'd
love to say "good riddance" to it.
Jacob
> What about having an attribute in the Meta class of the model that
> lets the model have a default connection for each of the 4 most
> common operations, something like
>
> class MyModel(models.Model):
>     class Meta:
>         select_conn = "default"
>         insert_conn = "write_conn"
>         update_conn = "write_conn"
>         delete_conn = "write_conn"
Urgh. :-(
That's four attributes, not one. It doesn't seem to have anything to do
with transactions, either. Please go back and read the long thread from
last September on multi-db support before going down the "design a
multi-db API" path. We've already been over a lot of the requirements
and options here. This really isn't the thread to revisit that (I would
hope).
Tying a model to a particular (set of) database connection(s) at
declaration time is unnecessarily tight coupling. If you change the
connection configuration -- an operational/runtime issue -- you need to
edit all your model source code to keep up. We can avoid doing that.
Malcolm
Whoops, I should've been much more sensitive to the 1.1 deadline in how
I presented this. I'm guilty of caring too much about the particular
feature and not enough about how it fits into timelines and particular
Django releases.
Can you blame me? Multiple-database support is dead sexy. :-)
Sounds like the best way for me to work on this without disrupting the
1.1 momentum is to set up a dedicated branch. I'll post a note here
when I've got that up and running.
Adrian
On Saturday 14 March 2009, Brian Rosner wrote:
> On Mar 13, 2009, at 4:51 PM, Adrian Holovaty wrote:
> > * Transactions are managed via methods on connection objects. NOT via
> > some strange decorator and magic global django.db.transaction variable
> > that comes out of thin air.
>
> I agree that the global functions are not anything to write home
> about. However, I disagree that the decorators need to be written off
> so quickly. I like the idea that transaction management can be done on
> this new connection object. I still find the decorators useful and
> use them all the time. Rather than working only on the default
> connection, couldn't they take a connection as an argument?
>
> @transaction.commit_on_success(conn)
>
Nitpick: The decorators are called at import time, when there is no connection
object available. They could take a connection name instead. That, in turn,
may cause unwanted dependencies between settings.py and application code (it
could be acceptable, much like an application requiring a specific setting is
acceptable, but it is a point to consider).
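
A small sketch of what I mean, with every name hypothetical (connections, configure() and this commit_on_success are stand-ins, not Django code): the decorator stores only the label when the module is imported and looks the connection up when the function is actually called.

```python
import functools

# Hypothetical registry, filled in only after settings are loaded.
connections = {}


class Connection:
    """Stand-in for the proposed connection object (hypothetical)."""

    def __init__(self, label):
        self.label = label
        self.calls = []

    def begin(self):
        self.calls.append("begin")

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")


def configure(labels):
    """Stand-in for building connections from settings.py."""
    for label in labels:
        connections[label] = Connection(label)


def commit_on_success(label="default"):
    # Runs at import time: only the label is stored, no lookup happens.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Runs at call time: the registry should be populated by now.
            conn = connections[label]
            conn.begin()
            try:
                result = func(*args, **kwargs)
            except Exception:
                conn.rollback()
                raise
            conn.commit()
            return result
        return wrapper
    return decorator


# Decorating works even though no "auth" connection exists yet.
@commit_on_success("auth")
def register_user(name):
    return "registered " + name


configure(["default", "auth"])  # settings are loaded later
result = register_user("alice")
```

The dependency on settings.py is still there: a label that is missing from the configuration only shows up as a KeyError on the first call, not at import.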
> When no argument is given, it can simply fall back to the default
> connection. The decorators provide, IMHO, a quick way to wrap a
> function in some sort of transaction scheme, and that works really well.
>
First issue:
Besides the decorators, which Brian suggests salvaging, the current global
transaction management also supports the transaction middleware, and it will
be a little harder to resurrect the latter under the proposed scheme.
Core committers: Does your annoyance with current global transaction
management also apply to this middleware?
Second issue:
I understand the spirit that prefers fast, simple code to code that,
by default, behaves correctly in fringe cases. However, I think that
when judging such cases, one should also take into account the cost, to
users, of making the code correct. Ticket #9964[1] provides an example where
this cost is relatively low (at least for now: add "transaction.set_dirty()"
calls in easily identifiable places in your code). Multiple connections
without distributed transactions is a case where this cost is high -- the
database distribution is often not under the developer's control, and if a
distributed transaction is required, this may mean executing obscure,
engine-specific SQL (the classic use case: transferring some asset between
sharded accounts).
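
To illustrate the failure mode, here is a toy simulation with no real database involved. ShardConnection and its begin()/add()/commit() methods are invented for this example and only loosely mirror the proposed connection API. Each connection commits on its own, so a failure between the two commits leaves the data inconsistent.

```python
class FailingCommit(Exception):
    """Raised by the toy connection to simulate a failed commit."""


class ShardConnection:
    """Toy connection: writes are staged until commit() (hypothetical API)."""

    def __init__(self, balances, fail_on_commit=False):
        self.balances = dict(balances)
        self.pending = {}
        self.fail_on_commit = fail_on_commit

    def begin(self):
        self.pending = {}

    def add(self, account, amount):
        self.pending[account] = self.pending.get(account, 0) + amount

    def commit(self):
        if self.fail_on_commit:
            self.pending = {}
            raise FailingCommit("shard unavailable")
        for account, delta in self.pending.items():
            self.balances[account] += delta
        self.pending = {}

    def rollback(self):
        self.pending = {}


def transfer(src_conn, src, dst_conn, dst, amount):
    src_conn.begin()
    dst_conn.begin()
    src_conn.add(src, -amount)
    dst_conn.add(dst, amount)
    src_conn.commit()  # succeeds and can no longer be rolled back
    dst_conn.commit()  # may still fail


shard_a = ShardConnection({"alice": 100})
shard_b = ShardConnection({"bob": 0}, fail_on_commit=True)

try:
    transfer(shard_a, "alice", shard_b, "bob", 30)
except FailingCommit:
    pass

# The debit was committed on one shard, the credit never arrived on the other.
total = shard_a.balances["alice"] + shard_b.balances["bob"]
```

Here the total drops from 100 to 70: the debit is durable and the credit is lost. Avoiding that takes some form of distributed (two-phase) commit, which is exactly the engine-specific part.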
Thanks for your attention,
Shai.
I don't know if this has been covered in some of the previous multi-db
support threads mentioned here, but how is it supposed to work with the admin?