django/db/models/sql/where.py: line 215

51 views
Skip to first unread message

Sneaky Wombat

unread,
Feb 2, 2011, 5:40:22 PM2/2/11
to Django developers
I'm not sure what sort of answer this question will lead to, but here
goes. I know foreignkeys on separate databases isn't supposed to work,
according to the django docs, but I got it to work. First off, I'm
using an old release (1, 2, 0, 'alpha', 1), one of the earlier multidb
releases. Because I hacked on it for so long and so much to make it
work for my needs, I've literally ended up with a branch of my own to
maintain. As such, I port features in as I need them from various
patches to the trunk. It sucks, but anyway, I came across a problem
where I wanted to use django-pyodbc to connect to a mssql server on
another machine. This machine was not my default router, in fact, it
was the third database I moved into my dbrouter. Further complicating
my design, (lack thereof) I needed to use a field on a table in this
mssql database as a foreign key to my primary database (mysql).

I don't want to be lectured on my use of foreign keys like this, I
know.

At first it seemed ok, and then I found a nugget on the django doc
pages that said foreign keys to external databases are not allowed/
won't work. (http://docs.djangoproject.com/en/dev/topics/db/multi-db/
#cross-database-relations) Since I ported/added things I needed in the
past, I didn't let this deter me. I pressed on. After a lot of
experimenting, I discovered a very strange behavior. Take this for
example:

class mssqlTable(Model):
# this is one of three non-default databases in my db router
# this would get selected explictly using db_for_read/write
display = models.CharField(max_length=128)
def __unicode__(self):
return "mssqlTable:%s"%self.id

class mysqlTable(Model):
# this is in my default database in my db router
assignee = models.ForeignKey(mssqlTable)
display = models.CharField(max_length=128)
def __unicode__(self):
return "mysqlTable:%s"%self.id

Now the weirdness, and I think Model.objects.defer has it's hands in
this somewhere, but didn't look too hard there, other than watch it
try to defer in pdb.

launch the shell: python manage.py shell
import the models...
x=mysqlTable.objects.all()

As long as I don't try to do:

x.assignee

It works great. As soon as i try to hit that deferred field though,
it pukes with a horrible error that comes up from django-pyodbc. I can
see x.assignee_id though, that is populated with an int value of the
FK from the external DB. However, if I do:

x=mysqlTable.objects.all()[0]

x.assignee

it works great! I never did find out exactly what that slice does,
but I later surmised that the slice is causing the connection.alias to
hold the right handle to the right database. However, when defer
"undefers" or whatever the process is that loads those fields is,
fires later, it doesn't hold the right connection.alias all the way
through. In fact, it does hold the right alias for most of the
process, it just doesn't pass it in one method. The rest of the
failure occurs in django-pyodbc. I found the following in the latest
svn, and for all practical purposes, it's exactly the same as mine.
The call to connection.ops.field_cast_sql doesn't pass connection
through. It only passes db_type, which doesn't hold the connection
instance that is passed in to this method.


def sql_for_columns(self, data, qn, connection):
"""
Returns the SQL fragment used for the left-hand side of a
column
constraint (for example, the "T1.foo" portion in the clause
"WHERE ... T1.foo = 6").
"""
table_alias, name, db_type = data
if table_alias:
lhs = '%s.%s' % (qn(table_alias), qn(name))
else:
lhs = qn(name)
return connection.ops.field_cast_sql(db_type) % lhs

sql_for_columns actually calls this:
PYTHON/site-packages/sql_server/pyodbc/operations.py:63

def field_cast_sql(self, db_type, connection=None):
"""
Given a column type (e.g. 'BLOB', 'VARCHAR'), returns the SQL
necessary
to cast it before using it in a WHERE statement. Note that the
resulting string should contain a '%s' placeholder for the
column being
searched against.
"""
if self.sql_server_ver < 2005 and db_type and db_type.lower()
== 'ntext':
return 'CAST(%s as nvarchar)'
return '%s'

You can see that field_cast_sql can accept connection, but it never
gets it passed. It in turns, calls a getter (self.sql_server_ver) on
self and that looks like this: (line 14)

def _get_sql_server_ver(self, connection=None):
"""
Returns the version of the SQL Server in use:
"""
if self._ss_ver is not None:
return self._ss_ver
else:
if connection:
cur = connection.cursor()
else:
from django.db import connection
cur = connection.cursor()
cur.execute("SELECT CAST(SERVERPROPERTY('ProductVersion')
as varchar)")
ver_code = int(cur.fetchone()[0].split('.')[0])
if ver_code >= 10:
self._ss_ver = 2008
elif ver_code == 9:
self._ss_ver = 2005
else:
self._ss_ver = 2000
return self._ss_ver
sql_server_ver = property(_get_sql_server_ver)

Unfortunately, the only way i know to work around this sort of call,
is to add a setter in to this property. Even though, the actual
method it calls, _get_sql_server_ver can accept connection yet again,
it doesn't get that passed in. However, if you do pass it in, the cur
instance is on the right database. The error i found was that it
would enter the else block and create a new cursor to the db, and
select the default database instead of the external/non-default one i
needed to use.

So, the failure to pass connection starts first in Django's where.py
file and then there are two other errors in the django-pyodbc source.
If I modify all of the above functions to pass connection, the issues
I experienced go away and I can use the mssql database as i wanted to,
as a foreign-key to another database on another machine.

My where.py ends up like this:

def sql_for_columns(self, data, qn, connection):
"""
Returns the SQL fragment used for the left-hand side of a
column
constraint (for example, the "T1.foo" portion in the clause
"WHERE ... T1.foo = 6").
"""
table_alias, name, db_type = data
if table_alias:
lhs = '%s.%s' % (qn(table_alias), qn(name))
else:
lhs = qn(name)
try:
return connection.ops.field_cast_sql(db_type,connection) %
lhs
except TypeError,e:
return connection.ops.field_cast_sql(db_type) % lhs

The try/except is a hack for me. anything that isn't in this mssql
database will throw a type error because it doesn't use the django-
odbc source. I don't think the real cause is in django though, but
this is the only way i found to work around it.

thoughts? If it's anything at all, I found it interesting that I
could get this working without a lot of effort. I've loaded over 1200
records in the new tables i created that depend on it and haven't had
a problem. Admin works fine too, for the most part. some things like
the list_filter are giving me issues that i'll have to fix.
Nevertheless, it's cool and wanted to throw this out there for people
to tear into...or ignore.

Thanks for all the hard work!

regards,

-j

legutierr

unread,
Feb 6, 2011, 10:50:26 AM2/6/11
to Django developers
I'm afraid that I can't help you with the specific problem that you
are describing, but I would like to recommend that you look at the
work that Waldemar Kornewald and Thomas Wanschik are doing with their
django-dbindexer:

http://www.allbuttonspressed.com/blog/django/joins-for-nosql-databases-via-django-dbindexer-first-steps

Their project may be solving the same basic problem you are trying to
solve, but in a more generalized way, and in a way that supports more
recent versions of Django. Although their primary objective is to
simulate relational functionality in non-relational systems, I'm sure
they would be open to seeing how their tool can make multi-db easier
for relational databases.

Russell Keith-Magee

unread,
Feb 6, 2011, 7:32:01 PM2/6/11
to django-d...@googlegroups.com

Then you surmised very, very badly. It does a little more than that.

mysqlTable.objects.all() is a queryset. It's a representation of a
group of objects resulting from a query. By taking the slice, you are
pulling out a single object from that query.

Hence,

x = mysqlTable.objects.all()
x.assignee

doesn't work, because a list of objects doesn't have an assignee.

x = mysqlTable.objects.all()[0]
x.assignee

does work, because the first object in the result set has an assignee.

This has nothing to do with multi-db -- it's a fundamental part of how
querysets work. Without wanting to be rude -- if you haven't got a
grasp of this level of detail of Django, you might want to consider
whether you're up to the task of making cross-database joins work.

However, if you're still confident you can get the job done, there are
two issues that you'll need to resolve:

1) The mechanics in Foreign keys that prohibit cross-database joins.
There isn't currently any way to remove these checks, so you'l need to
do some customization. I'm not fundamentally opposed to making these
checks

2) The process of retrieving remote objects when you use foreign key
accessors. This should really just be a matter of using to_python() to
turn a foreign key value into a remote query.

I'm not sure why "where" comes into this whole discussion, because
cross database queries aren't *ever* going to work (or, at least, they
aren't going to work without some *serious* effort). An implementation
of cross-database foreign keys shouldn't need to do anything with the
WHERE clause AFAICT.

Defer doesn't have anything to do with it either. Foreign Keys don't
use the same mechanics as deferred fields. A deferred field is the
idea of "don't retrieve column X until it is explicitly requested".
Foreign Keys always retrieve the *primary key* of the related object,
but lazily load the related object when it is requested. Similar idea,
but quite different implementation.

However, you're in pretty much unexplored territory here.
Conceptually, it should be possible, but there are reasons why we
dropped cross-database joins from the multi-db implementation that we
delivered :-)

Yours,
Russ Magee %-)

Sneaky Wombat

unread,
Feb 7, 2011, 11:55:07 AM2/7/11
to Django developers
>
> x = mysqlTable.objects.all()
> x.assignee
>
> doesn't work, because a list of objects doesn't have an assignee.

clearly it was a typo on my part, no need to be nasty :) I know x is
a qs and understand its operation the way you describe it. What I
meant was if I took an element from it AFTER saving to x:

x = mysqlTable.objects.all()
y=x[0]
y.assignee

would always fail. Sorry to leave that out. However, if i sliced
immediately, it would work:

> x = mysqlTable.objects.all()[0]
> x.assignee

The cause is that django-pyodbc wasn't receiving the correct
connection instance. I corrected that by passing it in the code
listed in my original post and it works great now.

> I'm not sure why "where" comes into this whole discussion, because
> cross database queries aren't *ever* going to work (or, at least, they
> aren't going to work without some *serious* effort).

It was about three lines of code for me, no problem so far. :)

Łukasz Rekucki

unread,
Feb 7, 2011, 12:58:23 PM2/7/11
to django-d...@googlegroups.com
On 7 February 2011 17:55, Sneaky Wombat <joe....@gmail.com> wrote:
>>
>> x = mysqlTable.objects.all()
>> x.assignee
>>
>> doesn't work, because a list of objects doesn't have an assignee.
>
> clearly it was a typo on my part, no need to be nasty :)  I know x is
> a qs and understand its operation the way you describe it.  What I
> meant was if I took an element from it AFTER saving to x:
>
> x = mysqlTable.objects.all()
> y=x[0]
> y.assignee
>
> would always fail.  Sorry to leave that out.  However, if i sliced
> immediately, it would work:
>
>> x = mysqlTable.objects.all()[0]
>> x.assignee
>

It's rather hard to image how assigning to "y" could change anything
in this code, unless "y" is some kind of global variable. The
difference in behaviour must had come from something other in your
testing environment that you didn't noticed.

--
Łukasz Rekucki

Reply all
Reply to author
Forward
0 new messages