[Django] #24692: QuerySet.extra(select=...) is silently dropped with aggregations

31 views
Skip to first unread message

Django

unread,
Apr 23, 2015, 10:16:07 AM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------+--------------------
Reporter: jdelic | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 1.8
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------
Here is a minimal example:
{{{
#!python
from django.db import models

class Service(models.Model):
name = models.TextField()

class ServiceAction(models.Model):
performed = models.DateTimeField(auto_now_add=True)
service = models.ForeignKey(Service)
}}}

Then add some models:
{{{
#!python
>>> from dbtest.models import Service, ServiceAction
>>> from django.db.models import Min, Max
>>> s1 = Service.objects.create(name='test1')
>>> s2 = Service.objects.create(name='test2')
>>> ServiceAction.objects.create(service_id=1)
>>> ServiceAction.objects.create(service_id=1)
>>> ServiceAction.objects.create(service_id=1)
>>> ServiceAction.objects.create(service_id=2)
>>> ServiceAction.objects.create(service_id=2)
>>> ServiceAction.objects.create(service_id=2)
}}}

...and query them using aggregations:
{{{
#!python
>>> qs = ServiceAction.objects.extra(select={'period':
'(MAX("dbtest_serviceaction"."performed")-MIN("dbtest_serviceaction"."performed"))'}).values('service_id').annotate(first_action=Min('performed'),
last_action=Max('performed'))
>>> print(qs.query)
SELECT "dbtest_serviceaction"."service_id",
MAX("dbtest_serviceaction"."performed") AS "last_action",
MIN("dbtest_serviceaction"."performed") AS "first_action" FROM
"dbtest_serviceaction" GROUP BY "dbtest_serviceaction"."service_id"
}}}

As you can see the {{{.extra(select=...)}}} has been dropped from the
query. I assume this happens on Django 1.6.x, 1.7.x and 1.8.x, but I have
only tested 1.6.11 and 1.8.

The expected outcome would have been:

{{{
SELECT "dbtest_serviceaction"."service_id",
MAX("dbtest_serviceaction"."performed") AS "last_action",
MIN("dbtest_serviceaction"."performed") AS "first_action",
MAX("dbtest_serviceaction"."performed")-MIN("dbtest_serviceaction"."performed")
as "period" FROM "dbtest_serviceaction" GROUP BY
"dbtest_serviceaction"."service_id"
}}}

In Django 1.8 this exact query can be performed with the ORM itself
through the use of chained aggregations:

{{{
#!python
>>> qs =
ServiceAction.objects.values('service_id').annotate(first_action=Min('performed'),
last_action=Max('performed'), period=Max('performed')-Min('performed'))
>>> print(qs.query)
SELECT "dbtest_serviceaction"."service_id",
(MAX("dbtest_serviceaction"."performed") -
MIN("dbtest_serviceaction"."performed")) AS "period",
MAX("dbtest_serviceaction"."performed") AS "last_action",
MIN("dbtest_serviceaction"."performed") AS "first_action" FROM
"dbtest_serviceaction" GROUP BY "dbtest_serviceaction"."service_id"
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24692>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Apr 23, 2015, 10:20:00 AM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------+--------------------------------------

Reporter: jdelic | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Changes (by jdelic):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Old description:

New description:

Here is a minimal example:
{{{
#!python
from django.db import models

class Service(models.Model):
name = models.TextField()

class ServiceAction(models.Model):
performed = models.DateTimeField(auto_now_add=True)
service = models.ForeignKey(Service)
}}}

Then add some models:
{{{
#!python
>>> from dbtest.models import Service, ServiceAction
>>> from django.db.models import Min, Max
>>> s1 = Service.objects.create(name='test1')
>>> s2 = Service.objects.create(name='test2')

>>> ServiceAction.objects.create(service_id=s1.id)
>>> ServiceAction.objects.create(service_id=s1.id)
>>> ServiceAction.objects.create(service_id=s1.id)
>>> ServiceAction.objects.create(service_id=s2.id)
>>> ServiceAction.objects.create(service_id=s2.id)
>>> ServiceAction.objects.create(service_id=s2.id)
}}}

--

--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:1>

Django

unread,
Apr 23, 2015, 10:21:41 AM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------+--------------------------------------

Reporter: jdelic | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Changes (by mbertheau):

* cc: mbertheau@… (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:2>

Django

unread,
Apr 23, 2015, 10:21:52 AM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------+--------------------------------------

Reporter: jdelic | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Description changed by jdelic:

Old description:

> Here is a minimal example:
> {{{
> #!python
> from django.db import models
>
> class Service(models.Model):
> name = models.TextField()
>
> class ServiceAction(models.Model):
> performed = models.DateTimeField(auto_now_add=True)
> service = models.ForeignKey(Service)
> }}}
>
> Then add some models:
> {{{
> #!python
> >>> from dbtest.models import Service, ServiceAction
> >>> from django.db.models import Min, Max
> >>> s1 = Service.objects.create(name='test1')
> >>> s2 = Service.objects.create(name='test2')

> >>> ServiceAction.objects.create(service_id=s1.id)
> >>> ServiceAction.objects.create(service_id=s1.id)
> >>> ServiceAction.objects.create(service_id=s1.id)
> >>> ServiceAction.objects.create(service_id=s2.id)
> >>> ServiceAction.objects.create(service_id=s2.id)
> >>> ServiceAction.objects.create(service_id=s2.id)
> }}}
>

New description:

Here is a minimal example:
{{{
#!python
from django.db import models

class Service(models.Model):
name = models.TextField()

class ServiceAction(models.Model):
performed = models.DateTimeField(auto_now_add=True)
service = models.ForeignKey(Service)
}}}

Then add some models:
{{{
#!python
>>> from dbtest.models import Service, ServiceAction
>>> from django.db.models import Min, Max
>>> s1 = Service.objects.create(name='test1')
>>> s2 = Service.objects.create(name='test2')

>>> ServiceAction.objects.create(service_id=s1.id)
>>> ServiceAction.objects.create(service_id=s1.id)
>>> ServiceAction.objects.create(service_id=s1.id)
>>> ServiceAction.objects.create(service_id=s2.id)
>>> ServiceAction.objects.create(service_id=s2.id)
>>> ServiceAction.objects.create(service_id=s2.id)
}}}

...and query them using aggregations:


{{{
#!python
>>> qs = ServiceAction.objects.extra(select={'period':
'(MAX("dbtest_serviceaction"."performed")-MIN("dbtest_serviceaction"."performed"))'}).values('service_id').annotate(first_action=Min('performed'),
last_action=Max('performed'))
>>> print(qs.query)
SELECT "dbtest_serviceaction"."service_id",
MAX("dbtest_serviceaction"."performed") AS "last_action",
MIN("dbtest_serviceaction"."performed") AS "first_action" FROM
"dbtest_serviceaction" GROUP BY "dbtest_serviceaction"."service_id"
}}}

As you can see the {{{.extra(select=...)}}} has been dropped from the
query. I assume this happens on Django 1.6.x, 1.7.x and 1.8.x, but I have
only tested 1.6.11 and 1.8.

The expected outcome would have been:

{{{
SELECT "dbtest_serviceaction"."service_id",
MAX("dbtest_serviceaction"."performed") AS "last_action",
MIN("dbtest_serviceaction"."performed") AS "first_action",
MAX("dbtest_serviceaction"."performed")-MIN("dbtest_serviceaction"."performed")
as "period" FROM "dbtest_serviceaction" GROUP BY
"dbtest_serviceaction"."service_id"
}}}

Using Django 1.8 this exact query can be performed with the ORM itself
through the use of the newly added operators for aggregations:

{{{
#!python
>>> qs =
ServiceAction.objects.values('service_id').annotate(first_action=Min('performed'),
last_action=Max('performed'), period=Max('performed')-Min('performed'))
>>> print(qs.query)
SELECT "dbtest_serviceaction"."service_id",
(MAX("dbtest_serviceaction"."performed") -
MIN("dbtest_serviceaction"."performed")) AS "period",
MAX("dbtest_serviceaction"."performed") AS "last_action",
MIN("dbtest_serviceaction"."performed") AS "first_action" FROM
"dbtest_serviceaction" GROUP BY "dbtest_serviceaction"."service_id"
}}}

--

--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:3>

Django

unread,
Apr 23, 2015, 10:26:36 AM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------+--------------------------------------

Reporter: jdelic | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------

Comment (by timgraham):

Not that this fixes the problem, but we are trying to obsolete `.extra()`
by providing better APIs for all its use cases. Do you have a use case in
mind where the new expressions API doesn't suffice? If so, I think there
will be more interest in working on that instead of fixing bugs with
`.extra()` which seems likely to be deprecated at some point.

--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:4>

Django

unread,
Apr 23, 2015, 10:36:52 AM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------+--------------------------------------

Reporter: jdelic | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------

Comment (by jdelic):

Replying to [comment:4 timgraham]:


> Not that this fixes the problem, but we are trying to obsolete
`.extra()` by providing better APIs for all its use cases. Do you have a
use case in mind where the new expressions API doesn't suffice? If so, I
think there will be more interest in working on that instead of fixing
bugs with `.extra()` which seems likely to be deprecated at some point.

Well, there will always be edge cases where any ORM will not work, I
guess. In my case the new expressions API *would suffice*, unfortunately
we run a rather big project on Django 1.6.11, migrating to 1.7... so I
won't be able to use it for a while.

In any case, I ran into this while trying to generate some much more
complex reports from our database. As {{{.extra()}}} hasn't been
deprecated yet and I was able to boil it down to such a simple failure
case, I think fixing it would still be a very good idea. I feel that if
{{{.extra()}}} fails silently on such a simple case, there are bound to be
other "more legitimate" use-cases that will fail as well. That said, I
haven't investigated Django's code itself, so this might also just be a
very narrow failure mode.

--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:5>

Django

unread,
Apr 23, 2015, 10:45:40 AM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------------+-------------------------------------

Reporter: jdelic | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* component: Uncategorized => Database layer (models, ORM)


Comment:

We can accept the ticket and see if someone cares to fix it, but even if
fixed it won't be backported to 1.6 or 1.7.

Looking at it a bit more, though, I noticed that `period` doesn't seem to
be used in the values or in the annotate. Do you have a typo there?

--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:6>

Django

unread,
Apr 23, 2015, 11:06:28 AM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------------+-------------------------------------

Reporter: jdelic | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by jdelic):

> Looking at it a bit more, though, I noticed that period doesn't seem to
be used in the values or in the annotate. Do you have a typo there?

No, not a typo per se. Maybe a mistake, though. I'm basing this off the
{{{is_recent}}} example in
https://docs.djangoproject.com/en/1.8/ref/models/querysets/#extra, which
also doesn't mention the field anywhere else. It might be worth mentioning
that if you add {{{.values('service_id', 'period')}}} then the QuerySet
will generate invalid SQL:

{{{
#!python
>>> qs = ServiceAction.objects.extra(select={'period':

'(MAX("dbtest_serviceaction"."performed")-MIN("dbtest_serviceaction"."performed"))'}).values('service_id',
'period').annotate(first_action=Min('performed'),


last_action=Max('performed'))
>>> print(qs.query)
SELECT

((MAX("dbtest_serviceaction"."performed")-MIN("dbtest_serviceaction"."performed")))
AS "period", "dbtest_serviceaction"."service_id",


MAX("dbtest_serviceaction"."performed") AS "last_action",
MIN("dbtest_serviceaction"."performed") AS "first_action" FROM

"dbtest_serviceaction" GROUP BY "dbtest_serviceaction"."service_id",
((MAX("dbtest_serviceaction"."performed")-MIN("dbtest_serviceaction"."performed")))
>>> qs
...
OperationalError: aggregate functions are not allowed in the GROUP BY
clause
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:7>

Django

unread,
Apr 23, 2015, 11:42:49 AM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------------+-------------------------------------
Reporter: jdelic | Owner: nobody
Type: | Status: new
Cleanup/optimization |

Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* type: Uncategorized => Cleanup/optimization
* stage: Unreviewed => Accepted


Comment:

Right, but `values()` limits the select to only those fields so to me it's
not surprising that this doesn't work. We can accept the ticket in case
someone cares to write a patch (doc or code), but as I mentioned before, I
don't think it's useful to spend more time on `extra()` at this point.

--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:8>

Django

unread,
Apr 23, 2015, 3:27:51 PM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------------+-------------------------------------
Reporter: jdelic | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by shaib):

So, I see two separate issues here.

The first is that `period` was not selected in the original query. That, I
think, is exactly expected behavior -- that's what `values()` does.

The second is that, as demonstrated by adding `"period"` to the `values()`
call, `extra()` does not support aggregates well (the error is caused by
the ORM assuming that `period` is a non-aggregate field, and so must be
pulled into the group-by clause for the aggregate to work). This, AFAICT,
cannot be solved without adding to `extra()` an argument (or a sub-
argument) to specify which of the `select` keys are aggregates; and that,
combined with Tim's comments above, makes me think this should be
wontfix'd.

--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:9>

Django

unread,
Apr 23, 2015, 9:16:19 PM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------------+-------------------------------------
Reporter: jdelic | Owner: nobody
Type: | Status: closed

Cleanup/optimization |
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Normal | Resolution: wontfix

Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* status: new => closed
* resolution: => wontfix


--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:10>

Django

unread,
Apr 23, 2015, 9:36:24 PM4/23/15
to django-...@googlegroups.com
#24692: QuerySet.extra(select=...) is silently dropped with aggregations
-------------------------------------+-------------------------------------
Reporter: jdelic | Owner: nobody

Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by jarshwah):

I agree with your analysis shaib. I've just tried a few things to see if I
can work around the problem, but no I can't. Adding the extra clause at
the end of the .values().annotate() also drops the extra, which is
somewhat concerning, but it wouldn't work anyway.

> Well, there will always be edge cases where any ORM will not work, I
guess.

Yes, and that is why users can now build their own expressions with full
support of the ORM.

At a maximum we could maybe document that extra does not play well with
aggregation, but I can't see us implementing a fix here without turning
"extra" into the new expressions api anyway.

--
Ticket URL: <https://code.djangoproject.com/ticket/24692#comment:11>

Reply all
Reply to author
Forward
0 new messages