#3566 - Aggregations: Ready to commit

29 views
Skip to first unread message

Russell Keith-Magee

unread,
Jan 13, 2009, 10:52:42 PM1/13/09
to Django Developers
Hi all,

I have now finished work on ticket #3566 - adding aggregations to Django's ORM.

I intend to commit the code to Django's trunk tomorrow evening, my
time (i.e., in about 30 hours time).

If you have any objections or problem reports, speak now or live with
the consequences :-)

Yours,
Russ Magee %-)

Jacob Kaplan-Moss

unread,
Jan 14, 2009, 9:17:22 AM1/14/09
to django-d...@googlegroups.com
On Tue, Jan 13, 2009 at 9:52 PM, Russell Keith-Magee
<freakb...@gmail.com> wrote:
> I have now finished work on ticket #3566 - adding aggregations to Django's ORM.

Woohoo! Congratulations, and many thanks to you, Nicolas, and all the
others who helped out.

Jacob

Karen Tracey

unread,
Jan 14, 2009, 12:54:36 PM1/14/09
to django-d...@googlegroups.com
On Tue, Jan 13, 2009 at 10:52 PM, Russell Keith-Magee <freakb...@gmail.com> wrote:

If you have any objections or problem reports, speak now or live with
the consequences :-)

I'm getting some errors running the aggregation and aggregation_regress tests on Windows boxes running Python 2.5.2 and 2.5.4 when using sqlite as the database.  I do not see these failures on these boxes using 'sqlite3' backend and Python 2.3.5, 2.4.4 or 2.6/2.6.1, nor on a Linux box with Python 2.5.1.  The errors can be seen here (the failures are the same for 2.5.2 and 2.5.4):

http://dpaste.com/109020/

I'm also seeing failures when using Oracle as the backend, many tracebacks that end with:

File "d:\u\kmt\django\aggregation\django\db\backends\oracle\query.py", line 69, in convert_values
     elif value is not None and field.get_internal_type() == 'DecimalField':
AttributeError: 'NoneType' object has no attribute 'get_internal_type'

(or similar -- it's not always 'DecimalField').  The full Oracle output can be seen here:

http://dpaste.com/109034/

(Now, I'm kind of stumbling my way around git so it's possible I've corrupted the tree I'm using somehow, though 'git show' seems to be telling me the latest commit I've got is the Decimal doctest checker changes, which appears to be the latest out there, and 'git diff' shows no local changes, so near as I can tell I've got the latest git code, but my comfort level with git is not too high yet so if what I've posted looks like the result of out-of-date code feel free to ignore me or tell me to start over with a new tree...)

Karen

Ian Kelly

unread,
Jan 14, 2009, 2:20:28 PM1/14/09
to django-d...@googlegroups.com
On Wed, Jan 14, 2009 at 10:54 AM, Karen Tracey <kmtr...@gmail.com> wrote:
> I'm also seeing failures when using Oracle as the backend, many tracebacks
> that end with:
>
> File "d:\u\kmt\django\aggregation\django\db\backends\oracle\query.py", line
> 69, in convert_values
> elif value is not None and field.get_internal_type() == 'DecimalField':
> AttributeError: 'NoneType' object has no attribute 'get_internal_type'
>
> (or similar -- it's not always 'DecimalField'). The full Oracle output can
> be seen here:
>
> http://dpaste.com/109034/
>
> (Now, I'm kind of stumbling my way around git so it's possible I've
> corrupted the tree I'm using somehow, though 'git show' seems to be telling
> me the latest commit I've got is the Decimal doctest checker changes, which
> appears to be the latest out there, and 'git diff' shows no local changes,
> so near as I can tell I've got the latest git code, but my comfort level
> with git is not too high yet so if what I've posted looks like the result of
> out-of-date code feel free to ignore me or tell me to start over with a new
> tree...)

I'm seeing the same errors using the tarball snapshot. All but the
last one are fixed by the attached patch.

Ian

oracle-agg.diff

Ian Kelly

unread,
Jan 14, 2009, 2:21:31 PM1/14/09
to django-d...@googlegroups.com
On Wed, Jan 14, 2009 at 12:20 PM, Ian Kelly <ian.g...@gmail.com> wrote:
> I'm seeing the same errors using the tarball snapshot. All but the
> last one are fixed by the attached patch.

By "the last one" I actually mean the final failure where a float is
expected but an int is seen.

Ian

Ramiro Morales

unread,
Jan 14, 2009, 4:33:08 PM1/14/09
to django-d...@googlegroups.com
Karen,

On Wed, Jan 14, 2009 at 3:54 PM, Karen Tracey <kmtr...@gmail.com> wrote:
>
> I'm getting some errors running the aggregation and aggregation_regress
> tests on Windows boxes running Python 2.5.2 and 2.5.4 when using sqlite as
> the database.� I do not see these failures on these boxes using 'sqlite3'
> backend and Python 2.3.5, 2.4.4 or 2.6/2.6.1, nor on a Linux box with Python
> 2.5.1.� The errors can be seen here (the failures are the same for 2.5.2 and
> 2.5.4):
>
> http://dpaste.com/109020/

I will be able to test this by myself in a couple of hours, but just in
case you have the Windows system with at hand, what are the versions of
SQLlite and pysqlite2 (aka sqlite3) included in the official win32 2.5.4
binaries?:

>>> from sqlite3 import dbapi2
>>> print dbapi2.version_info
...

>>> print dbapi2.sqlite_version_info
...

Regards,

--
Ramiro Morales

Russell Keith-Magee

unread,
Jan 14, 2009, 5:45:06 PM1/14/09
to django-d...@googlegroups.com
On Thu, Jan 15, 2009 at 4:20 AM, Ian Kelly <ian.g...@gmail.com> wrote:
> On Wed, Jan 14, 2009 at 10:54 AM, Karen Tracey <kmtr...@gmail.com> wrote:
>> I'm also seeing failures when using Oracle as the backend, many tracebacks
>> that end with:
>>
>> File "d:\u\kmt\django\aggregation\django\db\backends\oracle\query.py", line
>> 69, in convert_values
>> elif value is not None and field.get_internal_type() == 'DecimalField':
>> AttributeError: 'NoneType' object has no attribute 'get_internal_type'
>>
>> (or similar -- it's not always 'DecimalField'). The full Oracle output can
>> be seen here:
>>
>> http://dpaste.com/109034/
>>
> I'm seeing the same errors using the tarball snapshot. All but the
> last one are fixed by the attached patch.

Thanks for the patch, Ian. The patch applied cleanly; I've applied it
and pushed it upstream.

Russ %-)

Russell Keith-Magee

unread,
Jan 14, 2009, 5:58:38 PM1/14/09
to django-d...@googlegroups.com
On Thu, Jan 15, 2009 at 2:54 AM, Karen Tracey <kmtr...@gmail.com> wrote:
> On Tue, Jan 13, 2009 at 10:52 PM, Russell Keith-Magee
> <freakb...@gmail.com> wrote:
>>
>> If you have any objections or problem reports, speak now or live with
>> the consequences :-)
>
> I'm getting some errors running the aggregation and aggregation_regress
> tests on Windows boxes running Python 2.5.2 and 2.5.4 when using sqlite as
> the database. I do not see these failures on these boxes using 'sqlite3'
> backend and Python 2.3.5, 2.4.4 or 2.6/2.6.1, nor on a Linux box with Python
> 2.5.1. The errors can be seen here (the failures are the same for 2.5.2 and
> 2.5.4):
>
> http://dpaste.com/109020/

I will now say a word that is not sanctioned by the church for use in public.

....

Now that I have that out of my system... :-)

Ok, this is wierd. The errors you are reporting are exactly what I
would expect to see if the SQLite specific convert_values() function
wasn't getting called. Given your reported problems with git, can you
confirm that on the failing box:
1) django.db.backends.__init__.py lines 386:399 defines a
convert_values(self, values, field) function,
2) django.db.backends.sqlite3.base.py lines 105:123 has a
convert_values(self, values, field) function,
3) the backend specific version of the convert_values function that
is being invoked in your running test case.

If the backend specific version is getting used, can you have a quick
poke around to see any reason why, for example, the first failing test
has decimal and date values, but the field types aren't invoking the
date/decimal specific type conversions.

Russ %-)

Russell Keith-Magee

unread,
Jan 14, 2009, 6:03:31 PM1/14/09
to django-d...@googlegroups.com

Can you (or Karen) confirm if the following patch:

http://dpaste.com/109182/

cleans up this issue?

Yours,
Russ Magee %-)

Karen Tracey

unread,
Jan 14, 2009, 6:16:25 PM1/14/09
to django-d...@googlegroups.com

Sure:

D:\u\kmt\django\aggregation\tests>\bin\Python2.5.2\python.exe
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.

>>> from sqlite3 import dbapi2
>>> print dbapi2.version_info
(2, 3, 2)
>>> print dbapi2.sqlite_version_info
(3, 3, 4)
>>>

The machine with 2.5.4 reports the same.

Karen 

Ian Kelly

unread,
Jan 14, 2009, 6:16:27 PM1/14/09
to django-d...@googlegroups.com

Almost. It chokes on None values. Add a check for that, and it works.

Ian

Karen Tracey

unread,
Jan 14, 2009, 6:21:03 PM1/14/09
to django-d...@googlegroups.com
On Wed, Jan 14, 2009 at 5:58 PM, Russell Keith-Magee <freakb...@gmail.com> wrote:

On Thu, Jan 15, 2009 at 2:54 AM, Karen Tracey <kmtr...@gmail.com> wrote:
> On Tue, Jan 13, 2009 at 10:52 PM, Russell Keith-Magee
> <freakb...@gmail.com> wrote:
>>
>> If you have any objections or problem reports, speak now or live with
>> the consequences :-)
>
> I'm getting some errors running the aggregation and aggregation_regress
> tests on Windows boxes running Python 2.5.2 and 2.5.4 when using sqlite as
> the database.  I do not see these failures on these boxes using 'sqlite3'
> backend and Python 2.3.5, 2.4.4 or 2.6/2.6.1, nor on a Linux box with Python
> 2.5.1.  The errors can be seen here (the failures are the same for 2.5.2 and
> 2.5.4):
>
> http://dpaste.com/109020/

I will now say a word that is not sanctioned by the church for use in public.

....

Now that I have that out of my system... :-)

I did fear I'd be hearing screams from Australia....sorry to be the bearer of bad news!
 

Ok, this is wierd. The errors you are reporting are exactly what I
would expect to see if the SQLite specific convert_values() function
wasn't getting called. Given your reported problems with git, can you
confirm that on the failing box:
 1) django.db.backends.__init__.py lines 386:399 defines a
convert_values(self, values, field) function,

Check, yes.

 2) django.db.backends.sqlite3.base.py lines 105:123 has a
convert_values(self, values, field) function,

Ditto, check.
 

 3) the backend specific version of the convert_values function that
is being invoked in your running test case.

It's not being called for the ones that are failing.  It does get called some of the time (if I simply put a print in there I get a bunch more failures due to the unexpected output.)

That any help?

Karen

Russell Keith-Magee

unread,
Jan 14, 2009, 6:26:23 PM1/14/09
to django-d...@googlegroups.com

Fantastic. Done, and pushed to github.

Thanks Ian,

Russ %-)

Karen Tracey

unread,
Jan 14, 2009, 6:50:39 PM1/14/09
to django-d...@googlegroups.com

Confirmed this fixes all the Oracle failures I was seeing earlier.  Thanks!

(I do seem to be correctly pulling updates via git, so the Python2.5.2 failures on this same machine when using sqlite as a backend are indeed cropping up with the latest code.)

Karen

Russell Keith-Magee

unread,
Jan 14, 2009, 7:22:01 PM1/14/09
to django-d...@googlegroups.com

Help, yes. Illuminating, No. :-(

To fill you in on what I'm looking for - every aggregate value coming
back from the database passes through resolve_aggregates() in
django.db.models.query.py (line 212). This passes to convert_values()
in query.py, which in turn defers to the backend to do conversion.

MySQL and Postgres use the base backend implementation of
convert_values(), in the base connection's __init__.py; SQLite uses an
overridden version. The errors you are seeing are consistent with data
that is passing through the base convert_values(), rather than the
specific version().

Could you please trace one of the failing queries and confirm which
backend convert_values() is being invoked for Decimal and Date fields?
If the base implementation is being invoked, can you identify any
reason why self.connection.ops isn't the SQLite specific version? If
convert_values() isn't being invoked, which branch is which the
resolve_aggregates() returning on?

Apologies for throwing you in at the deep end here - unfortunately, I
have no access to a Windows box I can test with.

Thanks,
Russ %-)

Karen Tracey

unread,
Jan 14, 2009, 8:26:54 PM1/14/09
to django-d...@googlegroups.com
On Wed, Jan 14, 2009 at 7:22 PM, Russell Keith-Magee <freakb...@gmail.com> wrote:
Help, yes. Illuminating, No. :-(

To fill you in on what I'm looking for - every aggregate value coming
back from the database passes through resolve_aggregates() in
django.db.models.query.py (line 212). This passes to convert_values()
in query.py, which in turn defers to the backend to do conversion.

MySQL and Postgres use the base backend implementation of
convert_values(), in the base connection's __init__.py; SQLite uses an
overridden version. The errors you are seeing are consistent with data
that is passing through the base convert_values(), rather than the
specific version().

Could you please trace one of the failing queries and confirm which
backend convert_values() is being invoked for Decimal and Date fields?
If the base implementation is being invoked, can you identify any
reason why self.connection.ops isn't the SQLite specific version? If
convert_values() isn't being invoked, which branch is which the
resolve_aggregates() returning on?

OK, considering the first failing query where the significant differences in output are:

File "D:\u\kmt\django\aggregation\tests\modeltests\aggregation\models.py", line ?, in modeltests.aggregation.models.__test__.API_TESTS
Failed example:
Book.objects.filter(pk=1).annotate(mean_age=Avg('authors__age')).values()
Expected:
[... 'pubdate': datetime.date(2007, 12, 6) ... 'mean_age': 34.5}]
Got:
[... 'pubdate': u'2007-12-06' ... 'mean_age': 34.5}]

django.db.models.sql.query.resolve_aggregates is not being called to convert the pubdate.  (It is being called for mean_age.)  resolve_aggregates is not called for pubdate on any of Python 2.5.2, 2.4.4 or 2.6, but I only see the failure on 2.5.2.  2.4.4 & 2.6, without any call to resolve_aggregates, is getting a datetime.date() back somehow whereas on 2.5.2 it comes back as u'2007-12-06'.  Ring any bells with anyone? 

Karen

Russell Keith-Magee

unread,
Jan 14, 2009, 8:53:25 PM1/14/09
to django-d...@googlegroups.com

... and finally, Russell looks closer at the output and notices the
obvious... the problem values aren't aggregates.

So, either the aggregates processing has broken the normal value
coercion, or normal value coercion has always been busted on that
combination.

Here's the check - what do you get as output for:

Book.objects.filter(pk=1).values()

on a vanilla (non-aggregates) checkout of trunk?

Russ %-)

Karen Tracey

unread,
Jan 14, 2009, 9:15:06 PM1/14/09
to django-d...@googlegroups.com

Yeah, that's the direction I was heading in.  Only I didn't have to go so far as trying against trunk.  Even with the aggregate code,

>>> Book.objects.filter(pk=1).values()
[{'rating': 4.5, 'isbn': u'159059725', 'name': u'The Definitive Guide to Django: Web Development Done Right', 'pubdate': datetime.date(2007, 12, 6), 'price': Decimal("30"), 'id': 1, 'publisher_id': 1, 'pages': 447}]

on Win/Python 2.5.2.  Yet if you toss in the annotate():

>>> Book.objects.filter(pk=1).annotate(mean_age=Avg('authors__age')).values()
[{'rating': 4.5, 'isbn': u'159059725', 'name': u'The Definitive Guide to Django: Web Development Done Right', 'pubdate': u'2007-12-06', 'price': 30, 'id': 1, 'publisher_id': 1, 'pages': 447, 'mean_age': 34.5}]

The date and decimal fields stop getting converted properly.  Ideas where to look next?

Karen

Alex Gaynor

unread,
Jan 14, 2009, 9:26:01 PM1/14/09
to django-d...@googlegroups.com
Looking at results_iter() on sql.Query I'm seeing: row = self.resolve_columns(row, fields) however I'm not actually seeing resolve_columns, therefore that block of code isn't actually being called and therefore the row is being returned raw from the DB.  Is this correct(and it's just a hook subclasses can be use), or is it a typo?

Alex

--
"I disapprove of what you say, but I will defend to the death your right to say it." --Voltaire
"The people's good is the highest law."--Cicero

Russell Keith-Magee

unread,
Jan 14, 2009, 9:46:54 PM1/14/09
to django-d...@googlegroups.com

Next suggestion is to check if the converters that are registered on
lines 38-45 of db.backends.sqlite3.base.py are actually getting
called, and if there is a difference between the pre and post resolved
versions of row in results_iter() (i.e., the value of row at lines
240, 250 and 259 of query.py).

I can't think of any reason that these converters would be disabled by
the call to aggregates, though.

Russ %-)

Russell Keith-Magee

unread,
Jan 14, 2009, 9:49:23 PM1/14/09
to django-d...@googlegroups.com

This is correct. resolve_columns() is an extension point used by
subclasses such as the Oracle and GIS backends. It isn't used for
other databases - hence the 'hasattr' check.

Russ %-)

Karen Tracey

unread,
Jan 14, 2009, 10:01:16 PM1/14/09
to django-d...@googlegroups.com

They are, apparently.  They are called for the non-annotated query, they are NOT called for the annotated query, on Windows Python 2.5.2&4 only so far as I can tell.  Actually I only checked typecast_date -- it's called for both queries on 2.4.4/2.6 but only for the non-annotated one on 2.5.2.  Ugh.

Karen

Russell Keith-Magee

unread,
Jan 14, 2009, 10:20:31 PM1/14/09
to django-d...@googlegroups.com

Just a wild stab in the dark - but this isn't a case sensitivity
thing, is it? I noticed that there are two entries for timestamp - one
is fully capitalized. Is it possible that the backend is returning a
different capitalization once aggregates get involved? I have no idea
why this would be happening, or how you would check this at a lower
level, though - can you get Can you find out the database datatype
from the cursor?

Russ %-)

Karen Tracey

unread,
Jan 14, 2009, 11:33:19 PM1/14/09
to django-d...@googlegroups.com

Changing the case or adding an all-uppercase converter for 'DATE' doesn't help.  Looking around pysqlite's trac I found a ticket about their upperacase on converter type names not handling symbols so I get the impression what we pass is automatically uppercased. Not sure why we have both 'timestamp' and 'TIMESTAMP' converters registered.  

Figuring out what column type the lower levels think this thing is, since they aparently don't think it's a date, is a bit beyond what I know how to figure out at this point.  Plus my brain is starting to shut down as it's getting late here...

In a brief look I couldn't find anything that sounded similar logged in pysqlite's trac.  You'd think since it's only broken for the Windows/Python 2.5.2/4 level it is something that got broken and fixed so there must be a record of it somewhere (I was hoping with possible mention of a workaround) but I can't find it so far.  

Karen

Karen Tracey

unread,
Jan 15, 2009, 12:21:22 AM1/15/09
to django-d...@googlegroups.com
FYI -- since this Windows Python 2.5.x problem doesn't appear to be something we'll be able to figure out before I turn into a pumpkin and have to go to sleep, and I'm not sure what you are thinking about with regard to committing the code -- the rest of the test suite runs normally on the Windows Python 2.5.2 (I didn't try 2.5.4) sqlite combo.  (That's not to say no errors -- there's another known sqlite bug on that level, plus I get a file access error -- but neither of these are introduced by the aggregation code.)  So I don't believe adding aggregation breaks any existing code on this setup. 

Which leads me to think holding up the checkin of this function because of this problem is not necessary.  It's a particular combo that aggregation doesn't work on completely yet, that's all.  It's also quite easy to have multiple Python's installed on Windows, so if someone with Python 2.5.x really wants to immediately start playing with aggregation on Windows with sqlite as the backend, they can install a version of Python where it works, and they can keep 2.5.x around for other stuff they use Python for, if they need to.

Karen

Russell Keith-Magee

unread,
Jan 15, 2009, 1:54:56 AM1/15/09
to django-d...@googlegroups.com
On Thu, Jan 15, 2009 at 2:21 PM, Karen Tracey <kmtr...@gmail.com> wrote:
> FYI -- since this Windows Python 2.5.x problem doesn't appear to be
> something we'll be able to figure out before I turn into a pumpkin and have
> to go to sleep, and I'm not sure what you are thinking about with regard to
> committing the code -- the rest of the test suite runs normally on the
> Windows Python 2.5.2 (I didn't try 2.5.4) sqlite combo. (That's not to say
> no errors -- there's another known sqlite bug on that level, plus I get a
> file access error -- but neither of these are introduced by the aggregation
> code.) So I don't believe adding aggregation breaks any existing code on
> this setup.
>
> Which leads me to think holding up the checkin of this function because of
> this problem is not necessary.

Agreed. I'll commit tonight, as originally planned.

Russ %-)

Reply all
Reply to author
Forward
0 new messages