[Django] #26602: RawSQL window functions break count()

24 views
Skip to first unread message

Django

unread,
May 10, 2016, 7:19:42 AM5/10/16
to django-...@googlegroups.com
#26602: RawSQL window functions break count()
----------------------------------------------+----------------------------
Reporter: daggaz | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer (models, ORM) | Version: 1.9
Severity: Normal | Keywords: QuerySet.extra
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+----------------------------
I wanted to annotate my objects with a running total:

{{{
class A(models.Model):
amount = models.IntegerField()
created = models.DateTimeField(auto_now_add=True)

qs = A.objects.annotate(total=RawSQL("SUM(amount) OVER (ORDER BY
created)", []))
}}}

That works fine, and I get a running total for each object, but I cannot
call `count()` on that queryset:
{{{
>>> qs.count()
Traceback...
ProgrammingError: window functions not allowed in GROUP BY clause
}}}

Using `extra()`, I can get the same annotation behaviour as well being
able to call `count()`:
{{{
>>> qs = A.objects.extra(select={'total': 'SUM(amount) OVER (ORDER BY
created)'})
>>> qs.count()
8
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/26602>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
May 10, 2016, 12:57:38 PM5/10/16
to django-...@googlegroups.com
#26602: RawSQL window functions break count()
-------------------------------------+-------------------------------------

Reporter: daggaz | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.9
(models, ORM) |
Severity: Normal | Resolution:

Keywords: QuerySet.extra | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Brobin):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Old description:

> I wanted to annotate my objects with a running total:
>
> {{{
> class A(models.Model):
> amount = models.IntegerField()
> created = models.DateTimeField(auto_now_add=True)
>
> qs = A.objects.annotate(total=RawSQL("SUM(amount) OVER (ORDER BY
> created)", []))
> }}}
>
> That works fine, and I get a running total for each object, but I cannot
> call `count()` on that queryset:
> {{{
> >>> qs.count()
> Traceback...
> ProgrammingError: window functions not allowed in GROUP BY clause
> }}}
>
> Using `extra()`, I can get the same annotation behaviour as well being
> able to call `count()`:
> {{{
> >>> qs = A.objects.extra(select={'total': 'SUM(amount) OVER (ORDER BY
> created)'})
> >>> qs.count()
> 8
> }}}

New description:

I wanted to annotate my objects with a running total:

{{{#!python


class A(models.Model):
amount = models.IntegerField()
created = models.DateTimeField(auto_now_add=True)

qs = A.objects.annotate(total=RawSQL("SUM(amount) OVER (ORDER BY
created)", []))
}}}

That works fine, and I get a running total for each object, but I cannot
call `count()` on that queryset:
{{{
>>> qs.count()
Traceback...
ProgrammingError: window functions not allowed in GROUP BY clause
}}}

Using `extra()`, I can get the same annotation behaviour as well being
able to call `count()`:
{{{
>>> qs = A.objects.extra(select={'total': 'SUM(amount) OVER (ORDER BY
created)'})
>>> qs.count()
8
}}}

--

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:1>

Django

unread,
May 10, 2016, 9:47:48 PM5/10/16
to django-...@googlegroups.com
#26602: RawSQL window functions break count()
-------------------------------------+-------------------------------------

Reporter: daggaz | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: master

(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by jarshwah):

* cc: josh.smeaton@… (added)
* version: 1.9 => master
* stage: Unreviewed => Accepted


Comment:

This happens because RawSQL defines:

{{{
def get_group_by_cols(self):
return [self]
}}}

It'd be helpful if there was a way to disable grouping in an easy way. For
the moment, you should be able to do something like:

{{{
window = RawSQL("SUM(amount) OVER (ORDER BY created)", [])
window.get_group_by_cols = lambda: []
qs = A.objects.annotate(total=window)
qs.count()

# Or..

RawAggregate(RawSQL):
contains_aggregate = True # may not be necessary in newer django
versions
def get_group_by_cols(self):
return []

qs = A.objects.annotate(total=RawAggregate("SUM(amount) OVER (ORDER BY
created)", []))
}}}

Can you let me know if this workaround helps you?

In the meantime, I think it's worth figuring out how we want to treat
RawSQL when it should not be doing group bys, so I'm accepting on that
basis.

Two options I see. First, we can define a whole new class as I've done
above - like RawAggregate. Second, we could use a kwarg to RawSQL to tweak
whether or not it contains aggregate code.

{{{
RawSQL('SELECT SUM() ..', [], grouping=False)
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:2>

Django

unread,
May 12, 2016, 5:13:14 AM5/12/16
to django-...@googlegroups.com
#26602: RawSQL window functions break count()
-------------------------------------+-------------------------------------

Reporter: daggaz | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by daggaz):

Yes, you're suggestions did work as solutions, thank you.

Inspired by your `RawAggregate` subclass, I decided I'd have a go at
writing a proper window function `Expression`. I was mostly guessing at
how the API is supposed to work, so any hints/improvements would be
welcome.

{{{#!python
class Window(Expression):
template = '%(expression)s OVER (%(window)s)'

def __init__(self, expression, partition_by=None, order_by=None,
output_field=None):
self.order_by = order_by
if isinstance(order_by, six.string_types):
if order_by.startswith('-'):
self.order_by = OrderBy(F(self.order_by[1:]),
descending=True)
else:
self.order_by = OrderBy(F(self.order_by))

self.partition_by = partition_by
if self.partition_by:
self.partition_by = self._parse_expressions(partition_by)[0]

super(Window, self).__init__(output_field=output_field)
self.source_expression = self._parse_expressions(expression)[0]
if not self.source_expression.contains_aggregate:
raise FieldError("Window function expressions must be
aggregate functions")

def _resolve_output_field(self):
if self._output_field is None:
self._output_field = self.source_expression.output_field

def resolve_expression(self, query=None, allow_joins=True, reuse=None,
summarize=False, for_save=False):
c = self.copy()
c.source_expression =
c.source_expression.resolve_expression(query, allow_joins, reuse,
summarize, for_save)
if c.partition_by:
c.partition_by = c.partition_by.resolve_expression(query,
allow_joins, reuse, summarize, for_save)
if c.order_by:
c.order_by = c.order_by.resolve_expression(query, allow_joins,
reuse, summarize, for_save)
c.is_summary = summarize
c.for_save = for_save
return c

def as_sql(self, compiler, connection, function=None, template=None):
connection.ops.check_expression_support(self)
expr_sql, params = compiler.compile(self.source_expression)

window_sql = []
if self.partition_by:
window_sql.append('PARTITION BY ')
order_sql, order_params = compiler.compile(self.partition_by)
window_sql.append(order_sql)
params.extend(order_params)
if self.order_by:
window_sql.append(' ORDER BY ')
order_sql, order_params = compiler.compile(self.order_by)
window_sql.append(order_sql)
params.extend(order_params)
template = template or self.template
return template % {'expression': expr_sql, 'window':
"".join(window_sql)}, params

def copy(self):
copy = super(Window, self).copy()
copy.source_expression = self.source_expression.copy()
copy.partition_by = self.partition_by.copy() if self.partition_by
else None
copy.order_by = self.order_by.copy() if self.order_by else None
return copy

def get_group_by_cols(self):
return []
}}}

Which means you can write things like this:

{{{#!python
A.objects.annotate(total=Window(Sum('amount'), order_by='created'))
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:3>

Django

unread,
May 12, 2016, 5:22:19 AM5/12/16
to django-...@googlegroups.com
#26602: RawSQL window functions break count()
-------------------------------------+-------------------------------------

Reporter: daggaz | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by akaariai):

I guess we could add Window annotations. The problem is that you can't
actually filter on window functions, you need a subquery for that. That
is:
{{{
select SUM(amount) OVER (ORDER BY created) as sum_amount
from ...
where sum_amount > 0
}}}
is illegal SQL. We'd need to write the query as:

{{{
select * from (select SUM(amount) OVER (ORDER BY created) as sum_amount
from ...) subq
where sum_amount > 0
}}}

I don't think we have machinery to do that easily right now.

I'm OK adding window aggregates to Django if they fail loudly and clearly
when one tries to filter over them. They could be useful even if you can't
actually filter on the result.

Maybe a new ticket is better for this than hijacking this ticket?

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:4>

Django

unread,
May 12, 2016, 5:29:01 AM5/12/16
to django-...@googlegroups.com
#26602: RawSQL window functions break count()
-------------------------------------+-------------------------------------

Reporter: daggaz | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by daggaz):

I agree a new ticket is appropriate, as this has moved beyond my original
problem.

Do you still intend to add the `RawAggregate` or `RawSQL(...,
grouping=False)` solutions for this ticket?

Should I create the new ticket or is that your job?

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:5>

Django

unread,
May 12, 2016, 9:11:10 AM5/12/16
to django-...@googlegroups.com
#26602: RawSQL window functions break count()
-------------------------------------+-------------------------------------

Reporter: daggaz | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by jarshwah):

Window functions were always something I intended to look into, so it's
really cool to see that working!

This ticket will focus on providing a way to manage grouping on RawSQL. If
you'd like to create a ticket to track actual Window expressions, that'd
be awesome. Anyone can create new feature tickets so feel free to.

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:6>

Django

unread,
May 12, 2016, 9:25:34 AM5/12/16
to django-...@googlegroups.com
#26602: RawSQL window functions break count()
-------------------------------------+-------------------------------------

Reporter: daggaz | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by daggaz):

#26608 created

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:7>

Django

unread,
May 14, 2016, 6:00:06 PM5/14/16
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: daggaz | Owner: nobody
Type: New feature | Status: new

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* type: Uncategorized => New feature


--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:8>

Django

unread,
Dec 15, 2020, 4:59:36 PM12/15/20
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: Jamie Cockburn | Owner: Manav
| Agarwal
Type: New feature | Status: assigned

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Manav Agarwal):

* owner: nobody => Manav Agarwal
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:9>

Django

unread,
Dec 15, 2020, 5:02:07 PM12/15/20
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: Jamie Cockburn | Owner: Manav
| Agarwal
Type: New feature | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Manav Agarwal):

Replying to [comment:2 Josh Smeaton]:


> This happens because RawSQL defines:
>
> {{{
> def get_group_by_cols(self):
> return [self]
> }}}
>
> It'd be helpful if there was a way to disable grouping in an easy way.
For the moment, you should be able to do something like:
>
> {{{
> window = RawSQL("SUM(amount) OVER (ORDER BY created)", [])
> window.get_group_by_cols = lambda: []
> qs = A.objects.annotate(total=window)
> qs.count()
>
> # Or..
>
> RawAggregate(RawSQL):
> contains_aggregate = True # may not be necessary in newer django
versions
> def get_group_by_cols(self):
> return []
>

> qs = A.objects.annotate(total=RawAggregate("SUM(amount) OVER (ORDER BY
created)", []))
> }}}
>


> Can you let me know if this workaround helps you?
>
> In the meantime, I think it's worth figuring out how we want to treat
RawSQL when it should not be doing group bys, so I'm accepting on that
basis.
>
> Two options I see. First, we can define a whole new class as I've done
above - like RawAggregate. Second, we could use a kwarg to RawSQL to tweak
whether or not it contains aggregate code.
>
> {{{
> RawSQL('SELECT SUM() ..', [], grouping=False)
> }}}

I think we may use Josh's suggestion of adding a kwarg to RawSQL to tweak
whether or not it contains aggregate code. Please update if this is fine
so that I may submit a patch.
Regards

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:10>

Django

unread,
Dec 26, 2020, 4:45:48 PM12/26/20
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: Jamie Cockburn | Owner: Manav
| Agarwal
Type: New feature | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Manav Agarwal):

I think this bug is solved. I tried to reproduce the same but It didn't
reflect any error and returned the correct output.

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:11>

Django

unread,
Dec 27, 2020, 4:48:45 AM12/27/20
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: Jamie Cockburn | Owner: Manav
| Agarwal
Type: New feature | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Claude Paroz):

Manav, if you think this is resolved, the best you can do is to find the
commit where this has been fixed. Look for bisect examples in the docs if
needed.

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:12>

Django

unread,
Dec 28, 2020, 2:29:40 PM12/28/20
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: Jamie Cockburn | Owner: Manav
| Agarwal
Type: New feature | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Manav Agarwal):

I wrote the following test in the **BasicExpressionsTests** class in
**expressions/test_regression.py**
{{{
def test_regression(self):
companies =
Company.objects.annotate(salary=RawSQL("SUM(num_chairs) OVER (ORDER BY
num_employees)",[]))
self.assertEqual(companies.count(),3)
}}}
And found that the error was not returned after
[https://github.com/django/django/commits/3f32154f40a855afa063095e3d091ce6be21f2c5]
commit but was there in the code before this commit.

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:13>

Django

unread,
Dec 29, 2020, 3:19:23 AM12/29/20
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: Jamie Cockburn | Owner: Manav
| Agarwal
Type: New feature | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Claude Paroz):

Awesome! I'll let a fellow decide if it's worth adding a new regression
test or if this issue can be closed right now.

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:14>

Django

unread,
Jan 26, 2021, 2:17:23 AM1/26/21
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: Jamie Cockburn | Owner: Manav
| Agarwal
Type: New feature | Status: closed

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: wontfix

Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: assigned => closed
* resolution: => wontfix


Comment:

Manav, thanks for checking.

With Simon's changes (3f32154f40a855afa063095e3d091ce6be21f2c5 & co.) we
avoid unnecessary `GROUP BY` clauses. This ticket is about adding a way to
manage grouping with `RawSQL()` not about this specific crash, however I
think we can close it as "wontfix", unless someone can provide a valid use
case.

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:15>

Django

unread,
Jan 26, 2021, 2:18:15 AM1/26/21
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: Jamie Cockburn | Owner: Manav
| Agarwal
Type: New feature | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: wontfix
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

I prepared [https://github.com/django/django/pull/13935 PR] with extra
tests.

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:16>

Django

unread,
Jan 26, 2021, 2:18:58 AM1/26/21
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: Jamie Cockburn | Owner: Manav
| Agarwal
Type: New feature | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: needsinfo

Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* resolution: wontfix => needsinfo


--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:17>

Django

unread,
Jan 26, 2021, 5:00:06 AM1/26/21
to django-...@googlegroups.com
#26602: Provide a way to manage grouping with RawSQL
-------------------------------------+-------------------------------------
Reporter: Jamie Cockburn | Owner: Manav
| Agarwal
Type: New feature | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: needsinfo
Keywords: QuerySet.extra | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by GitHub <noreply@…>):

In [changeset:"b989d213363e0c34ac34ad819a725c7ceb6bbc0b" b989d213]:
{{{
#!CommitTicketReference repository=""
revision="b989d213363e0c34ac34ad819a725c7ceb6bbc0b"
Refs #26602 -- Added tests for aggregating over a RawSQL() annotation.

Fixed in 3f32154f40a855afa063095e3d091ce6be21f2c5.

Thanks Manav Agarwal for initial test.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/26602#comment:18>

Reply all
Reply to author
Forward
0 new messages