Yes, this *could* be done entirely externally. However, I think it is
a manifestation of a pattern that we actively encourage, so therefore
we should provide nice API support for it.
I've lost count of the number of times I've said "ORM != SQL, so just
use a cursor and write raw SQL". However, "write raw SQL" isn't always
trivial, since you don't get back ORM objects. If you want ORM
objects, you have to:
(1) know about the trick of instantiating an object with the
unrolled list version of a cursor, and
(2) ensure that you've got all your columns specified in the right order.
We can fix (1) with some documentation, but (2) is one of those nasty
implementation details that you don't know you've got right until
everything blows up in your face. There's also an edge case where two
versions of a table might end up with different column orders due to
applying different migration strategies on each version. If this
happens, then the cursor approach fails because there is no consistent
column order.
If we can provide API-level support to make it easier to write custom
SQL, I think we should do so.
Yours,
Russ Magee %-)
On 29 Sep 2009, at 03:25, Russell Keith-Magee wrote:
> (1) know about the trick of instantiating an object with the
> unrolled list version of a cursor, and
Any chance you could expand upon this?
--
David Reynolds
da...@reynoldsfamily.org.uk
Sure. Assume a simple model:
class Author(Model):
name = CharField()
age = IntegerField()
Now, in your code, get a cursor and issue a SQL statement:
from django.db import connection.
cursor = connection.cursor()
cursor.execte('SELECT id, name, age FROM myapp_author')
result = cursor.fetchone()
At this point, result holds a tuple containing (id, name, age). You
can instantiate an instance of Author by unrolling this tuple:
instance = Author(*result)
if you want multiple instances, use cursor.fetchall():
results = cursor.fetchall()
instances = [Author(*result) for result in results]
You need to be a little careful with the fetchall version because you
could end up with a _lot_ of results - but that's really just the
standard cursor usage warning.
The real caveat: the order of the columns you SELECT *must* match the
order in which the fields are specified in the model. For example, if
you made the following SQL query:
cursor.execute('SELECT id, age, name FROM myapp_author')
the query will work fine, but you'll get a TypeError when you create
the object because 'age' can't be coerced into a string. If the two
fields that are out of order are the same data type, you won't get any
errors at all - you'll just get a badly represented instance. Hilarity
ensues.
This trick is exactly what Django does internally when it constructs
object instances. However, in the Django internals, it is a completely
automated process - Django issues the query and parses the results, so
there's no risk of getting the column order wrong.
There are some ways to work around the column ordering problem. For
example, you can interrogate the cursor to get the name of the columns
that have been returned. This is what Sean has done in his patch to
make the raw() call a little more robust. This code is completely
generic; hence the interest in adding this to core.
Yours,
Russ Magee %-)
Code here; note the treatment of .defer'd querysets in a branch that
uses **dict as well.
http://code.djangoproject.com/browser/django/trunk/django/db/models/query.py#L950
Sorry for the noise, but ... this stuff is really good. Great job!
One perhaps silly question: I saw the testAnnotations test, and I was
wondering if the annotation could be any possible SQL supported by the
DB.
For e.g.
"SELECT a.*, db_specific_function(a.id) as my_annotation FROM
raw_query_author a ORDER BY a.id"
Regards
Rajeev J Sebastian
While I can see what you're aiming at here, I'm not really a fan of the idea.
Django's ORM is designed to make the simple cases really simple, but
once you move beyond the basics, you really should be looking to use
raw SQL.
When you try to mash segments of raw SQL into Django's internal query
representation, it's very easy to start breaking things. extra() is
already really fragile when it comes to interacting with other query
features. Increasing the complexity of the SQL segments that you can
inject into a query isn't going to improve the situation.
I'm in favor of raw() specifically because it is only trying to make
the output of raw SQL statements a little more palatable. It's
interesting to me specifically because it doesn't try to integrate
with normal Django queries. It makes it easier to use raw SQL without
increasing the complexity of the query engine. Rather than trying to
work out how to cram a query into Django's ORM, we acknowledge that
Django's ORM has strengths, but for complex cases, raw SQL is still
the language of choice.
I think there are some things that we might be able to do to make
writing the raw SQL a little easier - for example, allowing string
substitutions for common parts of the query:
SELECT %(columns)s FROM %(db_table) ...
but even these I could live without. The important part is to provide
a better answer for users that have a query that exceeds the
capabilities of Django's ORM.
Yours,
Russ Magee %-)
Indeed. I look at raw() really as "I know *exactly* the query I want
to run, get the ORM out of my way and just make it easy to get
objects".
--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."