Multiple database support
koenb  
From: koenb <koen.bierm...@werk.belgie.be>
Date: Tue, 20 May 2008 06:17:39 -0700 (PDT)
Local: Tues, May 20 2008 9:17 am
Subject: Multiple database support
For those interested in multiple database support, I have started
working on it again, and posted my work-in-progress to ticket #4747.

I started from trunk and added things from the multidb branch little
by little, since so much has changed in that area since then.
There is still a lot more that needs to be checked and a number of
things to be redone. References to the default connection object are
all over the codebase, so there is still a lot of work left to get all
of them straightened out.

What is working (more or less):
- using existing databases (though there might still be quirks when
using different engines)
- running tests
- some of the management commands (e.g. sqlall and sqlflush; loaddata,
inspectdb and syncdb are not quite there yet)

Important to mention is that relations across databases are not
supported.

Anyway, if anyone is interested in helping, please let me know!

Koen


Nicola Larosa (tekNico)  
From: "Nicola Larosa (tekNico)" <nicola.lar...@gmail.com>
Date: Tue, 20 May 2008 07:56:04 -0700 (PDT)
Local: Tues, May 20 2008 10:56 am
Subject: Re: Multiple database support

koenb wrote:
> For those interested in multiple database support, I have started
> working on it again, and posted my work-in-progress to ticket #4747.
> ...
> Anyway, if anyone is interested in helping, please let me know!

I am going to need this in a month or so. Actions speak louder than
words, so many thanks for your efforts. However, there was news two
months ago, summarized in this thread:

Yet another SoC introduction: Getting multi-db done
http://groups.google.com/group/django-developers/browse_thread/thread...

It would be nice to coordinate each one's efforts, to avoid wasting
time. Ben, Daryl, any news?

--
Nicola Larosa - http://www.teknico.net/


Casper Jensen  
From: "Casper Jensen" <t...@sema.dk>
Date: Tue, 20 May 2008 17:05:50 +0200
Local: Tues, May 20 2008 11:05 am
Subject: Re: Multiple database support
On Tue, May 20, 2008 at 4:56 PM, Nicola Larosa (tekNico)
<nicola.lar...@gmail.com> wrote:
> It would be nice to coordinate each one's efforts, to avoid wasting
> time. Ben, Daryl, any news?

I have not worked on the project since the proposal because of job
and university commitments. I plan to track the development and begin
to help with it when I get more time (over the summer).

- Casper


koenb  
From: koenb <koen.bierm...@werk.belgie.be>
Date: Tue, 20 May 2008 08:05:52 -0700 (PDT)
Local: Tues, May 20 2008 11:05 am
Subject: Re: Multiple database support
Ah, missed that one.
Anyway, I only did the easy parts (that is, getting data in and out of
existing databases).
Thanks for the pointer, I'll try to keep an eye on that.

Koen

On 20 May, 16:56, "Nicola Larosa (tekNico)" <nicola.lar...@gmail.com>
wrote:


Daryl Spitzer  
From: "Daryl Spitzer" <daryl.spit...@gmail.com>
Date: Tue, 20 May 2008 08:20:18 -0700
Local: Tues, May 20 2008 11:20 am
Subject: Re: Multiple database support
I've unfortunately been too busy to make time to work on this since
PyCon.  The last thing I've done (after writing some code on the
flight home) is to send a patch to Ben Ford.  Not long after that Ben
created a Mercurial repository (with my patch) and a Trac project.
You'll want to contact him.

I would still like to get my patch working so others (and myself) can
start testing it.  I won't have time this week, but so far it looks
like I may be able to make some time next week.  If I don't, I'll see if
I can at least make enough time to write up the API I came up with at
PyCon.

--
Daryl


Nicola Larosa (tekNico)  
From: "Nicola Larosa (tekNico)" <nicola.lar...@gmail.com>
Date: Tue, 20 May 2008 11:13:39 -0700 (PDT)
Local: Tues, May 20 2008 2:13 pm
Subject: Re: Multiple database support

Daryl Spitzer wrote:
> If I don't, I see if I can at least make enough time to write up the API
> I came up with at PyCon.

Please do, that would be great.

--
Nicola Larosa - http://www.teknico.net/


Ben Ford  
From: "Ben Ford" <ben.for...@gmail.com>
Date: Wed, 21 May 2008 11:33:57 +0100
Local: Wed, May 21 2008 6:33 am
Subject: Re: Multiple database support

Hi all,

I'll sort out the hg repo (it now needs to point at trunk - not qsrf) and
trac project if I get time this evening and make it public readable for
everyone who's interested.

Is there a ticket in django we could use to track progress on this? We could
use 4747, but if we do decide on a new API that might be a bit confusing...
We could of course just use the mailing list and trac project, thoughts?

It's great to see some interest in multiple db support again :-)

Ben

2008/5/20 Nicola Larosa (tekNico) <nicola.lar...@gmail.com>:

> Daryl Spitzer wrote:
> > If I don't, I see if I can at least make enough time to write up the API
> > I came up with at PyCon.

> Please do, that would be great.

> --
> Nicola Larosa - http://www.teknico.net/

--
Regards,
Ben Ford
ben.for...@gmail.com
+447792598685

Daryl Spitzer  
From: "Daryl Spitzer" <daryl.spit...@gmail.com>
Date: Wed, 21 May 2008 05:31:08 -0700
Local: Wed, May 21 2008 8:31 am
Subject: Re: Multiple database support

> Is there a ticket in django we could use to track progress on this? We could
> use 4747, but if we do decide on a new API that might be a bit confusing...
> We could of course just use the mailing list and trac project, thoughts?

There's also http://code.djangoproject.com/ticket/1142.  With the
mailing list and trac project, do we need a ticket for more than just
a place to attach patches to invite others to test?

> I'll sort out the hg repo (it now needs to point at trunk - not qsrf) and
> trac project if I get time this evening and make it public readable for
> everyone who's interested.

Thanks Ben.

--
Daryl


Jacob Kaplan-Moss  
From: "Jacob Kaplan-Moss" <jacob.kaplanm...@gmail.com>
Date: Wed, 21 May 2008 16:16:47 -0500
Local: Wed, May 21 2008 5:16 pm
Subject: Re: Multiple database support
Hi guys --

Sorry for coming late to the party, but I'm just now catching up on django-dev.

I'm really glad to see you get the ball rolling on multiple db
support, and once I'm dug out from my backlog I'll be happy to start
reviewing code and helping out if I'm needed.

However, before we get to that point, I've got some pretty serious API
concerns with the current approach, so I think I should outline those
before y'all go much further. I don't want you to expend much effort
just to get a -1 smackdown.

The current mechanism of defining "other" databases in the settings
module is just fine, and the underlying mechanism of having
queries/managers "know" their connection is similarly dandy. But the
wheels come off when it comes to the "public" API where users will
choose which connection they use.

As far as I can tell, you've currently provided two hooks to use a
secondary connection: set the model's default connection in the
settings module (which is OK, I suppose, though I might want to
nitpick the syntax a bit), and assigning to ``Model.objects.db``.

This second one is a disaster waiting to happen -- you've had to muddy
things up with threadlocals to work around some problems already. Also
consider the "bookkeeping" you'd need to do to deal with objects
across multiple databases simultaneously (think sharding). You'd have
to keep juggling ``Model.objects.db`` and saving old ones... ugh.

Here's how I think it should work:

* I'd like the default connection for each and every object to be the
default database forever and always. I find putting models for default
connections in settings distasteful and I'd rather just a single API
for changing the connection (see below). However, I imagine I'll be in
the minority here so I'm prepared to cede this point if necessary.

* There needs to be an official API to get a model (or perhaps a
manager) which references a different "context" --
``Model.objects.db`` should be read-only. So you'd call some API
method, and get back a sort of proxy object that uses the other
connection. Here's a strawman API::

    >>> from django import db
    >>> from someapp.models import Article

    >>> Article.objects.all()
    [... all Articles from the default database ...]

    >>> ArticlesOnOtherDatabase = db.get_model_for_other_connection(Article, "private")
    >>> ArticlesOnOtherDatabase.objects.all()
    [... all Articles from the database defined with the "private" key ...]

This should make the threadlocal stuff unnecessary, and (to my eye) is
a lot more sane than assigning to ``Manager.db``. Oh, and please
choose a better name than
``db.get_model_for_other_connection()``; given that you're building
the bikeshed you might as well paint it, too.
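
A minimal sketch of how the strawman above might behave, written with plain Python stand-ins rather than Django's ORM. Only the name ``get_model_for_other_connection`` comes from the strawman; the ``CONNECTIONS`` dict, the toy ``Manager`` and the ``table`` attribute are assumptions made purely for illustration.

```python
# Hypothetical stand-in for configured databases: alias -> {table: rows}.
CONNECTIONS = {
    "default": {"article": ["public post"]},
    "private": {"article": ["internal memo"]},
}


class Manager:
    """Toy manager bound to one connection alias for its whole lifetime."""

    def __init__(self, table, alias):
        self._table = table
        self._alias = alias

    @property
    def db(self):
        # Read-only: there is no setter, so assigning to .db raises
        # AttributeError instead of silently redirecting later queries.
        return self._alias

    def all(self):
        return list(CONNECTIONS[self._alias][self._table])


class Article:
    table = "article"
    objects = Manager(table, "default")


def get_model_for_other_connection(model, alias):
    """Return a subclass of `model` whose manager uses `alias`."""
    if alias not in CONNECTIONS:
        raise KeyError("unknown connection: %r" % alias)
    name = "%sOn%s" % (model.__name__, alias.title())
    return type(name, (model,), {"objects": Manager(model.table, alias)})


ArticlesOnOtherDatabase = get_model_for_other_connection(Article, "private")
```

Because the proxy is a separate class with its own manager, the original ``Article.objects`` is never mutated, which is the property that would make the threadlocal bookkeeping unnecessary.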

Jacob


Ben Ford  
From: "Ben Ford" <ben.for...@gmail.com>
Date: Wed, 21 May 2008 23:10:13 +0100
Local: Wed, May 21 2008 6:10 pm
Subject: Re: Multiple database support

Hi Jacob,

I'd be interested in your thoughts on a declarative approach to defining the
other databases...? I'll have my mercurial repo synced up to trunk tomorrow
(my time) and I'll re-apply the patch I got from Daryl to it as a starting
point. Hopefully people will be able to have a look through it and compare
the declarative approach proposed with the existing multi-db approach.

> As far as I can tell, you've currently provided two hooks to use a
> secondary connection: set the model's default connection in the
> settings module (which is OK, I suppose, though I might want to
> nitpick the syntax a bit), and assigning to ``Model.objects.db``.

> This second one is a disaster waiting to happen -- you've had to muddy
> things up with threadlocals to work around some problems already. Also
> consider the "bookkeeping" you'd need to do to deal with objects
> across multiple database simultaneously (think sharding). You'd have
> to keep juggling ``Model.objects.db`` and saving old ones... ugh.

I built a non-trivial application with multi-db as it is right now and found
the API to be a bit hairy to be honest. I think it would be an advantage,
especially in a "database rich" environment to be able to build a db on the
fly much like a model, rather than be tied to what's in a dict in settings.
I don't really like the objects.db way of doing it, and I found myself doing
a fair bit of hacking to get it to work.

Agreed, the way I got round this was to build the model again from scratch
each time I wanted to access objects in a different database and have the
dynamically created model persist in the app cache. I took most of this from
the dynamic models entry on the wiki; it's here, look in the duplicate_model
function:
    http://www.djangosnippets.org/snippets/442/
This would really need work (especially the field copying code, which is
fairly horrifying! I know that doesn't work for all field types too - yuk)
before it becomes a 'good idea', and I'm not even sure it's the right way to
go; however, I'd be interested in whether people think it's a valid approach.

> * I'd like the default connection for each and every object to be the
> default database forever and always. I find putting models for default
> connections in settings distasteful and I'd rather just a single API
> for changing the connection (see below). However, I imagine I'll be in
> the minority here so I'm prepared to cede this point if necessary.

The API which I think is being proposed is that there should be a central
register for database connections. In my mind this would be the place to go
to get hold of a connection for use in a queryset (and all the other places
it's needed) and I think the correct default behaviour of the class/object
would be to return the connection defined in settings.DATABASE_*. The code
to build the declarative DatabaseWrapper is already there, and there is a
method to build one of these from what's in settings too. This should make
it easy to get hold of a connection in all of the places where we currently do
"from django.db import connection".
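
A rough sketch of what such a central register could look like, assuming sqlite3 as the only engine so that it runs on its own. The ``ConnectionRegistry`` class, its ``register``/``get`` methods and the ``SETTINGS`` dict are hypothetical names, not code from the patch or from Django.

```python
import sqlite3

# Hypothetical settings; the keys follow the DATABASE_* naming idea only.
SETTINGS = {"DATABASE_ENGINE": "sqlite3", "DATABASE_NAME": ":memory:"}


class ConnectionRegistry:
    """Central place to look up connections by alias (sqlite3 only here)."""

    def __init__(self, settings):
        # The default connection is built from what is in settings.
        self._params = {"default": settings["DATABASE_NAME"]}
        self._open = {}

    def register(self, alias, name):
        self._params[alias] = name

    def get(self, alias="default"):
        # Open lazily and hand back the same connection on later lookups.
        if alias not in self._open:
            self._open[alias] = sqlite3.connect(self._params[alias])
        return self._open[alias]


registry = ConnectionRegistry(SETTINGS)
registry.register("legacy", ":memory:")

connection = registry.get()       # stands in for the module-level default
legacy = registry.get("legacy")
```

Code that needs a connection would then ask the registry for one by alias, and asking without an alias returns the connection described in settings.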

It's great to see this revived again :-)

Cheers
Ben

--
Regards,
Ben Ford
ben.for...@gmail.com
+447792598685


oggie rob  
From: oggie rob <oz.robhar...@gmail.com>
Date: Wed, 21 May 2008 18:43:30 -0700 (PDT)
Local: Wed, May 21 2008 9:43 pm
Subject: Re: Multiple database support

On May 21, 2:16 pm, "Jacob Kaplan-Moss" <jacob.kaplanm...@gmail.com>
wrote:

Has anybody considered declaring the connection when getting the
manager? Something like:
    Artist.objects.all()
    Widget.objects(db='a').all()
Obviously with the default database for the case when "db" isn't
passed. Also you could override the Manager to use a different
database by default (e.g. Widget.objects.all() might always use an
OTHER_DATABASE while all other models use the main db, if you create a
custom Manager for Widget)

This still leaves questions about how syncdb would be achieved, at
least. But if it could be done, the API seems simple to understand.
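
To make the idea concrete, here is a small sketch of a callable manager, using in-memory dicts in place of real databases. The ``DATABASES`` dict, the toy ``Manager`` and the ``ArchivedWidget`` model are illustrative assumptions, not an existing Django API.

```python
# Hypothetical in-memory "databases": alias -> {table: rows}.
DATABASES = {
    "default": {"artist": ["Miles"], "widget": []},
    "a": {"artist": [], "widget": ["sprocket"]},
}


class Manager:
    def __init__(self, table, db="default"):
        self.table = table
        self.db = db

    def __call__(self, db=None):
        # Return a new manager so the shared class attribute is never mutated.
        return type(self)(self.table, db or self.db)

    def all(self):
        return list(DATABASES[self.db][self.table])


class Artist:
    objects = Manager("artist")


class Widget:
    objects = Manager("widget")


class ArchivedWidget:
    # A model can default to another database by configuring its manager.
    objects = Manager("widget", db="a")
```

``Widget.objects(db='a')`` returns a fresh manager each time, so choosing a database for one query does not affect any other caller of ``Widget.objects``.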

 -rob


koenb  
From: koenb <koen.bierm...@werk.belgie.be>
Date: Wed, 21 May 2008 20:50:57 -0700 (PDT)
Local: Wed, May 21 2008 11:50 pm
Subject: Re: Multiple database support
I really like this line of thought: having the persistence layer of a
model fixed is a good idea.
Relations are a big issue here: unless we can support relations across
databases, switching connections is always a big opportunity to shoot
yourself in the foot.
I would propose a function that can collect "clusters" of models, that
is, collections of models that are somehow related to each other, and
use that function to a) check during validation that they all use the
same database, and b) if we provide an API to register a model for
an additional connection (that is, a second one), give you copies of the
models for the entire cluster, relations and all. That way we could
even have syncdb create the tables for these 'backup models' too.
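
As a sketch of part a), the clusters can be found as connected components of the relation graph. The model names, the ``RELATIONS`` pairs and the ``DATABASE_FOR`` mapping below are made up for illustration; a real version would read relations from the model metadata.

```python
from collections import defaultdict


def collect_clusters(relations, models):
    """Group models into clusters connected by relations (either direction)."""
    graph = defaultdict(set)
    for source, target in relations:
        graph[source].add(target)
        graph[target].add(source)

    seen, clusters = set(), []
    for model in models:
        if model in seen:
            continue
        cluster, stack = set(), [model]
        while stack:
            current = stack.pop()
            if current in cluster:
                continue
            cluster.add(current)
            stack.extend(graph[current] - cluster)
        seen |= cluster
        clusters.append(cluster)
    return clusters


def validate_clusters(clusters, database_for):
    """Return the clusters whose models do not all share one database."""
    return [c for c in clusters if len({database_for[m] for m in c}) > 1]


# Hypothetical models: each (from_model, to_model) pair is a foreign key.
MODELS = ["User", "Profile", "Post", "LogEntry"]
RELATIONS = [("Profile", "User"), ("Post", "User")]
DATABASE_FOR = {"User": "default", "Profile": "default",
                "Post": "archive", "LogEntry": "archive"}

clusters = collect_clusters(RELATIONS, MODELS)
errors = validate_clusters(clusters, DATABASE_FOR)
```

Here ``Post`` points at ``User`` but is assigned to a different database, so the cluster containing both is reported, while the unrelated ``LogEntry`` is free to live elsewhere.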

Koen

On 21 May, 23:16, "Jacob Kaplan-Moss" <jacob.kaplanm...@gmail.com>
wrote:


Simon Willison  
From: Simon Willison <si...@simonwillison.net>
Date: Thu, 22 May 2008 07:59:38 -0700 (PDT)
Local: Thurs, May 22 2008 10:59 am
Subject: Re: Multiple database support
I have to admit I'm slightly worried about the multi-database
proposal, because at the moment it doesn't seem to solve either of the
multi-db problems I'm concerned about.

The proposal at the moment deals with having different models live in
different databases - for example, the Forum application lives on DB1
while the Blog application lives on DB2.

I can see how this could be useful, but the two database problems that
keep me up at night are the following:

1. Replication - being able to send all of my writes to one master
machine but spread all of my reads over several slave machines.
Thankfully Ivan Sagalaev's confusingly named mysql_cluster covers this
problem neatly without modification to Django core - it's just an
alternative DB backend which demonstrates that doing this isn't
particularly hard: http://softwaremaniacs.org/soft/mysql_cluster/en/

2. Sharding - being able to put User entries 1-1000 on DB1, whereas
User entries 1001-2000 live on DB2 and so on.

I'd love Django to have built-in abilities to solve #1 - it's a really
important first-step on scaling up to multiple databases, and it's
also massively easier than any other part of the multi-db problem.
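
The read/write split in #1 can be sketched in a few lines. This is an illustration of the idea only and is not taken from mysql_cluster; the ``ReplicatedRouter`` class and the alias strings are hypothetical, and a real backend would hold connection objects rather than names.

```python
import itertools


class ReplicatedRouter:
    """Send writes to the master and rotate reads across the slaves."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP")

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slaves are configured.
        self._readers = itertools.cycle(slaves or [master])

    def connection_for(self, sql):
        # Classify the statement by its first keyword.
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.WRITE_VERBS:
            return self.master
        return next(self._readers)


# Hypothetical aliases standing in for real connections.
router = ReplicatedRouter("master", ["slave1", "slave2"])
```

A real implementation would also have to keep reads that happen inside a write transaction on the master, since the slaves may lag behind.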

I wouldn't expect a magic solution to #2 because it's so highly
dependent on the application that is being built, but at the same time
it would be nice to see a multi-db solution at least take this into
account (maybe just by providing an easy tool to direct an ORM request
to a specific server based on some arbitrary logic).
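
For #2, the "arbitrary logic" could be as small as a function from a key to a database alias. The sketch below uses the id ranges from the example; the ``SHARD_BOUNDS`` and ``SHARD_ALIASES`` tables and the ``shard_for_user`` name are assumptions for illustration.

```python
import bisect

# Hypothetical shard map: upper bound of each id range -> database alias.
# Ids 1-1000 live on DB1 and ids 1001-2000 live on DB2.
SHARD_BOUNDS = [1000, 2000]
SHARD_ALIASES = ["DB1", "DB2"]


def shard_for_user(user_id):
    """Return the alias of the database that holds this user id."""
    if user_id < 1:
        raise ValueError("user ids start at 1")
    # bisect_left finds the first range whose upper bound is >= user_id.
    index = bisect.bisect_left(SHARD_BOUNDS, user_id)
    if index == len(SHARD_ALIASES):
        raise LookupError("no shard configured for id %d" % user_id)
    return SHARD_ALIASES[index]
```

The ORM would only need a hook that accepts such a callable; the mapping itself stays application-specific.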

I may have misunderstood the proposal, but I think it's vital that the
above two use cases are considered. Even if they can't be solved
outright, providing tools that custom solutions to these cases can be
built with should be a priority for multi-db support.

Cheers,

Simon


Ben Ford  
From: "Ben Ford" <ben.for...@gmail.com>
Date: Thu, 22 May 2008 17:28:42 +0100
Local: Thurs, May 22 2008 12:28 pm
Subject: Re: Multiple database support

Hi all,

I've now re-applied Daryl's patch (which was against qsrf) to a clone of
django trunk in a mercurial repo. It's available at
http://hg.woe-beti.de and there's a trac set up for it at
http://trac.woe-beti.de. Feel free to make use of both of these. Although
I've disabled the ability to create tickets, perhaps the wiki might be a good
place to discuss the API? Anyone can clone from the hg repo; give me a shout
if you would like push access and I'll sort it out.

Cheers,
Ben

--
Regards,
Ben Ford
ben.for...@gmail.com
+447792598685


Ivan Sagalaev  
From: Ivan Sagalaev <man...@softwaremaniacs.org>
Date: Thu, 22 May 2008 21:53:13 +0400
Local: Thurs, May 22 2008 1:53 pm
Subject: Re: Multiple database support

Simon Willison wrote:
> Thankfully Ivan Sagalaev's confusingly named mysql_cluster

BTW does anyone have a suggestion how to rename it? I've picked
mysql_cluster simply because I didn't know that there exists the thing
named "MySQL Cluster" (no kidding :-) ).

Eratothene  
From: Eratothene <cd...@bk.ru>
Date: Thu, 22 May 2008 14:56:34 -0700 (PDT)
Local: Thurs, May 22 2008 5:56 pm
Subject: Re: Multiple database support
I think there is a third issue.

Use of several RDBMSs in one Django application simultaneously.

For example, we maintain two RDBMSs: monetdb and postgresql. The latest
and most accessed data is stored in monetdb for performance. After one
month the data is moved to postgresql, which holds the full archive.


Mike Scott  
From: "Mike Scott" <mic...@gmail.com>
Date: Fri, 23 May 2008 23:00:09 +1200
Local: Fri, May 23 2008 7:00 am
Subject: Re: Multiple database support

On Fri, May 23, 2008 at 2:59 AM, Simon Willison <si...@simonwillison.net>
wrote:

> 1. Replication - being able to send all of my writes to one master
> machine but spread all of my reads over several slave machines.
> Thankfully Ivan Sagalaev's confusingly named mysql_cluster covers this
> problem neatly without modification to Django core - it's just an
> alternative DB backend which demonstrates that doing this isn't
> particularly hard: http://softwaremaniacs.org/soft/mysql_cluster/en/

Personally I think this is something that is better solved by the database
software itself. Having replication code-side is something that I don't feel
too good about. But maybe that's just me.

> 2. Sharding - being able to put User entries 1-1000 on DB1, whereas
> User entries 1001-2000 live on DB2 and so on.

This is something I would love, an example being archives (as Eratothene
points out).

Maybe by having to state a storage location on a per-row level (i.e. this could
happen by overriding the manager and simply switching DB at selection time,
or by being able to provide the DB info at selection time).


koenb  
From: koenb <koen.bierm...@werk.belgie.be>
Date: Fri, 23 May 2008 06:06:24 -0700 (PDT)
Local: Fri, May 23 2008 9:06 am
Subject: Re: Multiple database support
You need to be aware that this is quite complicated if your model has
foreign keys in it: if you use the ORM to do queries, the ORM would
have to be smart enough to know when to split up your queries instead
of doing joins.
E.g. you have a model A with a foreign key to a User model. For a row of A
that is in the same database as User, the ORM could simply use a join,
but for a row of A that is already in the other database, it can't.

I do not believe this is a trivial change.
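
To show what "splitting up" a query means, here is a sketch using two in-memory sqlite3 databases in place of two servers. The table names, columns and the ``a_rows_with_user_names`` helper are invented for the example; the point is that the join has to be done in Python with one query per database.

```python
import sqlite3

# Two separate in-memory databases stand in for two servers.
users_db = sqlite3.connect(":memory:")
other_db = sqlite3.connect(":memory:")

users_db.execute("CREATE TABLE auth_user (id INTEGER PRIMARY KEY, name TEXT)")
users_db.executemany("INSERT INTO auth_user VALUES (?, ?)",
                     [(1, "ann"), (2, "bob")])

# Table "a" has a user_id column, but no foreign key can be enforced
# because auth_user lives in a different database.
other_db.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, user_id INTEGER)")
other_db.executemany("INSERT INTO a VALUES (?, ?)",
                     [(10, 2), (11, 1), (12, 2)])


def a_rows_with_user_names():
    """Emulate `a JOIN auth_user` with one query per database."""
    a_rows = other_db.execute(
        "SELECT id, user_id FROM a ORDER BY id").fetchall()
    user_ids = sorted({user_id for _, user_id in a_rows})
    if not user_ids:
        return []
    placeholders = ", ".join("?" for _ in user_ids)
    names = dict(users_db.execute(
        "SELECT id, name FROM auth_user WHERE id IN (%s)" % placeholders,
        user_ids))
    # Stitch the two result sets together in application code.
    return [(a_id, names.get(user_id)) for a_id, user_id in a_rows]
```

Filtering or ordering on the related table would need still more logic, which is part of why the change is not trivial.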

My proposal is to keep things simple in a first phase: let's make it
possible to use different databases for different models with the
restriction that relations should not cross databases. Once we get all
that working, we may look at making the query generation deal with
those.

Koen

On 23 May, 13:00, "Mike Scott" <mic...@gmail.com> wrote:


Simon Willison  
From: Simon Willison <si...@simonwillison.net>
Date: Fri, 23 May 2008 06:16:54 -0700 (PDT)
Local: Fri, May 23 2008 9:16 am
Subject: Re: Multiple database support
On May 22, 6:53 pm, Ivan Sagalaev <man...@softwaremaniacs.org> wrote:

> Simon Willison wrote:
> > Thankfully Ivan Sagalaev's confusingly named mysql_cluster

> BTW does anyone have a suggestion how to rename it? I've picked
> mysql_cluster simply because I didn't know that there exists the thing
> named "MySQL Cluster" (no kidding :-) ).

How about mysql_masterslave or mysql_replicated (I prefer the second)?

Manuel Saelices  
From: Manuel Saelices <msaeli...@gmail.com>
Date: Fri, 23 May 2008 07:26:36 -0700 (PDT)
Local: Fri, May 23 2008 10:26 am
Subject: Re: Multiple database support
On 23 May, 13:00, "Mike Scott" <mic...@gmail.com> wrote:

> Maybe having to state a storage location on a per-row level. (IE this could
> happen by overriding the manager, and simply switching DB at selection time.
> or being able to provide the DB info at selection time.)

Maybe a good thing would be to provide a middleware that does the db
selection. Think of user-centered applications: in those, the user
determines which DB to use.

You could provide an API for changing the DB of all subsequent queries, and
use it, if you want, in a middleware (looking at request.user).
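
A minimal sketch of that idea, kept free of Django so it runs on its own. The ``use_db`` and ``current_db`` functions, the ``Request`` class and the ``USER_DATABASES`` mapping are all hypothetical, and the middleware is written as a plain wrapper around a view function rather than with the middleware API that existed at the time.

```python
import contextvars

# The alias every later query in the current request should use.
_current_db = contextvars.ContextVar("current_db", default="default")


def use_db(alias):
    """Hypothetical API: route all following queries to `alias`."""
    return _current_db.set(alias)


def current_db():
    return _current_db.get()


class Request:
    def __init__(self, user):
        self.user = user


# Hypothetical mapping from user to the database holding that user's data.
USER_DATABASES = {"alice": "db_alice", "bob": "db_bob"}


def db_selection_middleware(get_response):
    def middleware(request):
        token = use_db(USER_DATABASES.get(request.user, "default"))
        try:
            return get_response(request)
        finally:
            # Restore the previous alias so requests do not leak into
            # each other.
            _current_db.reset(token)
    return middleware


def view(request):
    return "%s reads from %s" % (request.user, current_db())


handler = db_selection_middleware(view)
```

Note that this relies on per-context global state, which is the kind of implicit switching criticised earlier in the thread, so it trades explicitness for convenience.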


Ivan Sagalaev  
From: Ivan Sagalaev <man...@softwaremaniacs.org>
Date: Sat, 24 May 2008 21:52:24 +0300
Local: Sat, May 24 2008 2:52 pm
Subject: Re: Multiple database support

Simon Willison wrote:
> How about mysql_masterslave or mysql_replicated (I prefer the second)?

Yes, mysql_replicated seems right. Thanks!

koenb  
From: koenb <koen.bierm...@werk.belgie.be>
Date: Wed, 28 May 2008 00:25:39 -0700 (PDT)
Local: Wed, May 28 2008 3:25 am
Subject: Re: Multiple database support
On 22 May, 18:28, "Ben Ford" <ben.for...@gmail.com> wrote:

I have been adding some code to Ben's mercurial repo on
[http://hg.woe-beti.de], see also [http://trac.woe-beti.de].

What has been realised (more or less):
- connection and model registration
- validation that related objects use the same connection
- database_engine specific SQL generation (needs more checking)
- some management commands accept connection parameter, others can
generate output for multiple connections
- syncdb can sync different connections
- transaction management over connections
- some tests (needs a lot more work)

This means point 3 of the discussion at
[http://trac.woe-beti.de/wiki/Discuss] is somewhat working, but definitely
needs a lot more testing.

I do need some help with creating tests for all this though.
I have not figured out yet how to create tests that check that the
right SQL is being generated for the backend used. (Nor how to test
the right database was touched by an action, this is quite obvious
manually, but I do not know how to automate this.)
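
One possible way to automate the "which database was touched" check, sketched for sqlite3 only: ``sqlite3.Connection.set_trace_callback`` reports every statement a connection executes, so each connection can keep its own log. The ``traced_connection`` and ``touched`` helpers and the ``entry`` table are assumptions for the example; other backends would need their own hook.

```python
import sqlite3


def traced_connection(log):
    """Open an in-memory database that appends every statement to `log`."""
    connection = sqlite3.connect(":memory:")
    connection.set_trace_callback(log.append)
    return connection


default_log, legacy_log = [], []
default_db = traced_connection(default_log)
legacy_db = traced_connection(legacy_log)

for db in (default_db, legacy_db):
    db.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, title TEXT)")

# Forget the setup statements so only the action under test is recorded.
del default_log[:]
del legacy_log[:]

# The action under test: it is meant to touch only the legacy database.
legacy_db.execute("INSERT INTO entry (title) VALUES ('imported')")
legacy_db.commit()


def touched(log, verb):
    """True if any recorded statement starts with the given SQL verb."""
    return any(s.lstrip().upper().startswith(verb) for s in log)
```

The same logs contain the SQL text itself, so they could also be compared against the statements expected for a given backend.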

I put some ideas on using multiple databases for copying (backup or
migration) of objects (point 4 of the discussion note) in
[http://trac.woe-beti.de/wiki/APIModelDuplication].

Please comment, add, shoot etc. Any help will be much appreciated.

Koen


Daryl Spitzer  
From: "Daryl Spitzer" <daryl.spit...@gmail.com>
Date: Wed, 4 Jun 2008 07:53:05 -0700
Local: Wed, Jun 4 2008 10:53 am
Subject: Re: Multiple database support
Another couple weeks have slipped by and I continue to be crazy-busy.
(But each week I'm busy for a different reason--so I continue to be
foolishly optimistic that I'll soon get a week with some free time.)

Anyway, I don't have time to read this thread through with the care it
deserves, but I thought I shouldn't let that stop me from finally
writing a description of the API I proposed at the PyCon sprint.

Each app would have a databases.py file that contains classes used to
define database connections (in the same manner as classes are used
to define models).  Here's an example:

----

from django.db import connections

class LegacyDatabase(connections.DatabaseConnection):
    engine = 'sqlite3'
    name = '/foo/bar/legacy_db.sqlite3'

----

(And the other DATABASE_* settings (from settings.py) could certainly
be defined as attributes of a DatabaseConnection class.)
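
As a rough illustration of how such declarative classes could be collected, here is a standalone sketch that uses a metaclass as a registry. It does not use the proposed ``django.db.connections`` module; ``ConnectionMeta``, the ``settings()`` helper and the default attribute values are assumptions, with the attribute names chosen to mirror the DATABASE_* settings.

```python
class ConnectionMeta(type):
    """Collect every DatabaseConnection subclass as it is defined."""

    registry = {}

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        if bases:  # skip the abstract base class itself
            mcs.registry[name] = cls
        return cls


class DatabaseConnection(metaclass=ConnectionMeta):
    # Defaults that mirror the DATABASE_* settings; all are overridable.
    engine = None
    name = None
    user = ""
    password = ""
    host = ""
    port = ""

    @classmethod
    def settings(cls):
        keys = ("engine", "name", "user", "password", "host", "port")
        return {key: getattr(cls, key) for key in keys}


class LegacyDatabase(DatabaseConnection):
    engine = 'sqlite3'
    name = '/foo/bar/legacy_db.sqlite3'
```

With a registry like this, management commands could iterate over every declared connection without any of them being listed in settings.py.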

JUST FOR TESTING, I propose we allow a database connection to be
specified in a model with a Meta attribute, like this:

----

from django.db import models
from legacy.databases import LegacyDatabase

class LegacyStuff(models.Model):
    ...

    class Meta:
        db_connection = LegacyDatabase

----

Jacob expressed his extreme distaste for this at PyCon--for good
reason.  (We don't want to encourage coupling models to databases.)
But just so we can get a working patch and start testing, I propose we
go with this for now.
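
One way the Meta attribute could be picked up, sketched in plain Python
with a toy metaclass. This is not Django's real model machinery:
ToyModelBase, ToyModel, DefaultDatabase and the _db_connection
attribute are hypothetical names used only to show the lookup and the
fallback to a default connection.

```python
class DefaultDatabase:
    """Stand-in for the connection described by settings.py (hypothetical)."""


class LegacyDatabase:
    """Stand-in for a class defined in legacy/databases.py (hypothetical)."""


class ToyModelBase(type):
    """Toy metaclass: record Meta.db_connection on each new model class."""

    def __new__(mcs, name, bases, attrs):
        cls = super().__new__(mcs, name, bases, attrs)
        meta = attrs.get('Meta')
        # No Meta, or no db_connection in Meta, means the default database.
        cls._db_connection = getattr(meta, 'db_connection', DefaultDatabase)
        return cls


class ToyModel(metaclass=ToyModelBase):
    pass


class LegacyStuff(ToyModel):
    class Meta:
        db_connection = LegacyDatabase


class Article(ToyModel):
    pass
```

Here LegacyStuff ends up bound to LegacyDatabase while Article, which
declares nothing, falls back to DefaultDatabase.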

Adrian suggested we allow the specification of database connections
per-app using the new app() function being proposed for settings.py.
I haven't seen a description of this since PyCon, but I think it would
look something like:

app(name='legacy', db_connection='LegacyDatabase')

(I'm sure I'm leaving several important arguments out of this example.)

Perhaps one could implement sharding by defining multiple
DatabaseConnection classes in a databases.py file (we could support
these files at the project level in addition to the app level) and
putting them in a list.  Then one could write a function to return the
appropriate database to use and specify that callable in the argument
to the app function (or perhaps as an argument to the url function in
urls.py).
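
A rough sketch of such a selection function, assuming the range-based
scheme mentioned elsewhere in this thread (User entries 1-1000 on the
first database, 1001-2000 on the second, and so on). The shard names,
ROWS_PER_SHARD and shard_for_user() are invented for illustration; in
the proposal the list would hold DatabaseConnection classes rather than
strings.

```python
# Hypothetical shard list, in the order the id ranges are assigned.
SHARDS = ['users_db1', 'users_db2', 'users_db3']
ROWS_PER_SHARD = 1000


def shard_for_user(user_id):
    """Return the shard that holds the given user id (ids start at 1)."""
    if user_id < 1:
        raise ValueError('user ids start at 1')
    index = (user_id - 1) // ROWS_PER_SHARD
    if index >= len(SHARDS):
        raise LookupError('no shard configured for user id %d' % user_id)
    return SHARDS[index]
```

A callable like this is what would be passed as the argument to the
app function in the suggestion above.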

I haven't given any thought to replication.  Perhaps someone who needs
this could think about whether this proposal could somehow make
supporting replication easier (or if it might get in the way), or if
it's simply orthogonal to this.

--
Daryl


mengel  
From: mengel <Marc.Men...@gmail.com>
Date: Mon, 16 Jun 2008 19:05:32 -0700 (PDT)
Local: Mon, Jun 16 2008 10:05 pm
Subject: Re: Multiple database support

On May 22, 9:59 am, Simon Willison <si...@simonwillison.net> wrote:

> 1. Replication - being able to send all of my writes to one master
> machine but spread all of my reads over several slave machines.
> 2. Sharding - being able to put User entries 1-1000 on DB1, whereas
> User entries 1001-2000 live on DB2 and so on.

It seems to me this isn't beyond doing in the current setup, but I'm
not sure I understand the underlying mechanism well enough. For case 1,
you need an object class that creates two (or more) apparently
identical models.Model classes, one attached to each database, based on
the field types declared as class variables:
  * on searches, it picks one of the model classes to search
  * on saves, it saves the same data to each of those classes in turn

For case 2, it's very similar, except you need to run the query on all
sides (unless you can tell it should only go to one), building a
chained query-set union type to hold the result, and for saves pick the
right model to save to based on the condition.

In each case, the underlying models have to be tied to the right
databases, but this can be done using the mechanism in the proposal so
far.
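
The two cases can be sketched with in-memory stand-ins. Nothing here is
Django API: Store plays the part of one model class tied to one
database, and Replicated, Sharded and pick_index are hypothetical names
chosen for this illustration.

```python
import itertools
import random


class Store:
    """In-memory stand-in for one model class bound to one database."""

    def __init__(self):
        self.rows = {}

    def save(self, pk, row):
        self.rows[pk] = row

    def filter(self, predicate):
        return [row for row in self.rows.values() if predicate(row)]


class Replicated:
    """Case 1: search any one copy, save to every copy in turn."""

    def __init__(self, stores):
        self.stores = list(stores)

    def filter(self, predicate):
        return random.choice(self.stores).filter(predicate)

    def save(self, pk, row):
        for store in self.stores:
            store.save(pk, row)


class Sharded:
    """Case 2: search every shard and chain the results, save to one."""

    def __init__(self, stores, pick_index):
        self.stores = list(stores)
        self.pick_index = pick_index  # callable: pk -> index into stores

    def filter(self, predicate):
        return itertools.chain.from_iterable(
            store.filter(predicate) for store in self.stores)

    def save(self, pk, row):
        self.stores[self.pick_index(pk)].save(pk, row)


replicated = Replicated([Store(), Store()])
replicated.save(1, 'alice')

sharded = Sharded([Store(), Store()], pick_index=lambda pk: (pk - 1) // 1000)
sharded.save(7, 'bob')
sharded.save(1500, 'carol')
all_users = sorted(sharded.filter(lambda row: True))
```

The sharded search always visits every shard here; skipping shards that
cannot match would need the condition to be inspected first.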


David Cramer  
From: David Cramer <dcra...@gmail.com>
Date: Mon, 16 Jun 2008 21:50:11 -0700 (PDT)
Local: Tues, Jun 17 2008 12:50 am
Subject: Re: Multiple database support
I suppose I'll chime in here since we actually wrote master/slave
replication code on Curse.

Our approach:

- read_cursor and write_cursor exist. write_cursor is what cursor
would point to.
- get queries all use the read cursor
- saves all use the write cursor
- we had a list of database connections, which stored the same
settings, just in a tuple format
- reading I believe used something like itertools.cycle but I can't
honestly say without looking at the code

Beyond this, the database itself should handle writing the objects to
the slaves. Django shouldn't even bother.
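
A minimal sketch of the read/write split described above, assuming the
round-robin really was done with itertools.cycle (the original code was
not checked). The ReplicatedCursors class is hypothetical, and plain
strings stand in for real database connections.

```python
import itertools


class ReplicatedCursors:
    """Writes go to the master; reads rotate through the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slaves are configured.
        self._read_pool = itertools.cycle(list(slaves) or [master])

    def write_cursor(self):
        return self.master

    # "cursor" points at the write cursor, as in the list above.
    cursor = write_cursor

    def read_cursor(self):
        return next(self._read_pool)


pool = ReplicatedCursors('master', ['slave1', 'slave2'])
reads = [pool.read_cursor() for _ in range(3)]
```

Copying writes out to the slaves is deliberately absent: that stays the
database's job.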

In regards to multiple databases in general, it is my feeling that
even if it is not good practice, Django _NEEDS_ to support a model
being attached to a database other than the default. So if you have
mydjango_blogs and mydjango_forums databases, my Forum model always
goes to the right db when it queries, and the same for blogs. I myself
like a Meta solution to this as it makes sense.

In MySQL as well, you can optimize things, so that if they use the
same connection, you can just query on that database. It's select X
from mydatabase.mytable. I'm not sure if something similar exists in
other database engines.
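
On other engines: SQLite, at least, lets one connection reach a second
database through ATTACH DATABASE and schema-qualified table names. A
small sketch using Python's sqlite3 module follows; the forums alias
and the table names are invented for illustration, and it says nothing
about how Django itself would use this.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Attach a second (here also in-memory) database under the alias "forums".
conn.execute("ATTACH DATABASE ':memory:' AS forums")

# "main" is SQLite's name for the database the connection was opened on.
conn.execute(
    "CREATE TABLE main.blog_entry (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE forums.post (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO main.blog_entry (title) VALUES ('hello blog')")
conn.execute("INSERT INTO forums.post (title) VALUES ('hello forum')")

# One query on one connection, touching tables in both databases.
titles = sorted(row[0] for row in conn.execute(
    "SELECT title FROM main.blog_entry "
    "UNION ALL SELECT title FROM forums.post"))
conn.close()
```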

On Jun 16, 9:05 pm, mengel <Marc.Men...@gmail.com> wrote:

