Re: [Django] #34016: QuerySet.values_list() crash on simple ArrayAgg.

36 views
Skip to first unread message

Django

unread,
Sep 16, 2022, 5:42:00 AM9/16/22
to django-...@googlegroups.com
#34016: QuerySet.values_list() crash on simple ArrayAgg.
----------------------------------+------------------------------------
Reporter: Alex Kerkum | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.1
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Mariusz Felisiak):

We could skip an empty `OrderByList` 🤔:
{{{#!diff
diff --git a/django/contrib/postgres/aggregates/mixins.py
b/django/contrib/postgres/aggregates/mixins.py
index b2f4097b8f..2bf236474a 100644
--- a/django/contrib/postgres/aggregates/mixins.py
+++ b/django/contrib/postgres/aggregates/mixins.py
@@ -14,10 +14,13 @@ class OrderableAggMixin:
return super().resolve_expression(*args, **kwargs)

def get_source_expressions(self):
+ if not self.order_by.source_expressions:
+ return super().get_source_expressions()
return super().get_source_expressions() + [self.order_by]

def set_source_expressions(self, exprs):
- *exprs, self.order_by = exprs
+ if isinstance(exprs[-1], OrderByList):
+ *exprs, self.order_by = exprs
return super().set_source_expressions(exprs)

def as_sql(self, compiler, connection):

}}}

Alex, would you like to prepare a patch? (regression test is required)

--
Ticket URL: <https://code.djangoproject.com/ticket/34016#comment:4>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Sep 16, 2022, 5:56:50 AM9/16/22
to django-...@googlegroups.com
#34016: QuerySet.values_list() crash on simple ArrayAgg.
----------------------------------+------------------------------------
Reporter: Alex Kerkum | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.1
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Alex Kerkum):

Thanks for the quick reply! Your proposed fix does indeed seem to fix the
issue.

I would love to create a patch, but I have no experience with the django
code.
I'll have a look at it this afternoon to see if I can come up with
something.

--
Ticket URL: <https://code.djangoproject.com/ticket/34016#comment:5>

Django

unread,
Sep 16, 2022, 10:38:30 AM9/16/22
to django-...@googlegroups.com
#34016: QuerySet.values_list() crash on simple ArrayAgg.
----------------------------------+------------------------------------
Reporter: Alex Kerkum | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.1
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Simon Charette):

This approach will work as long no `OrderableAggMixin` subclass use this
`get_source_expression` entry removal trick as well which I believe is the
case for our internal usages.

Somewhat related but the `Window` expression is an example of a composite
that has some components that could be missing (`partition_by`, `frame`,
`order_by`) and cannot use the `len` of the expressions to determine which
ones were provided back (e.g. does 3 source expressions mean
`(source_expression, partition_by, frame)` or `(source_expression, frame,
order_by)`?).

It solves this problem by somewhat violating the `get_source_expressions`
API by returning `None` placeholders which makes use some special casing
in a few locations

-
https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L397-L402
-
https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L384-L389
-
https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L274-L281

But not in all of them!
-
https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L410-L417
-
https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L424-L425
-
https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L236-L246

All that so say that we should probably settle on what the signature of
`get_source_expressions` is; `list[Optional[Expression]]` or
`list[Expression]`?

If we go with `list[Optional[Expression]]` then we don't have to do any
`len` based special casing when dealing with optional sub-expressions but
it means that checks of the form `len(source_expressions)` are not
adequate; they must exclude `None` values. I guess we could provide a
property/method that excludes them if this is a common pattern.

If we go with `list[Expression]` then we should adjust `Window` other
expressions violating this guarantee and drop special casing for `None`
value. I guess `Window.get_source_expressions()` could do something along
the lines of

{{{#!python
def get_source_expressions(self):
expressions = self.source_expression
if self.partition_by:
expressions.append(self.partition_by)
...
return expressions

def set_source_expressions(self, expressions):
self.source_expression = expressions.pop(0)
if self.partition_by:
self.partition_by = expressions.pop(0)
...
}}}

Given we've had these special cases in place for a while I think a better
long term solution, and one to happen to maintain backward compatibility,
would be to fully embrace the `list[Optional[Expression]]` approach by
using `None` placeholders to address this problem. It doesn't have to be
in this patch but at least agreeing on which signature we should support
in `main` and going forward would be beneficial.

From having more experience playing with Python typing in the past years,
this is the kind of issues `mypy` and friends could help us properly
enforce. [https://github.com/django/deps/pull/65 Maybe it's time for us to
re-visit adding typing at least to some parts of Django]?

--
Ticket URL: <https://code.djangoproject.com/ticket/34016#comment:6>

Django

unread,
Sep 16, 2022, 11:11:03 AM9/16/22
to django-...@googlegroups.com
#34016: QuerySet.values_list() crash on simple ArrayAgg.
----------------------------------+------------------------------------
Reporter: Alex Kerkum | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.1
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Alex Kerkum):

I've made a PR with Mariusz' suggestions here:
https://github.com/django/django/pull/16064

Is that sufficient, or does it need some more work?

--
Ticket URL: <https://code.djangoproject.com/ticket/34016#comment:7>

Django

unread,
Sep 16, 2022, 11:15:51 AM9/16/22
to django-...@googlegroups.com
#34016: QuerySet.values_list() crash on simple ArrayAgg.
----------------------------------+------------------------------------
Reporter: Alex Kerkum | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.1
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Simon Charette):

Thanks for taking the time to submit a PR Alex, greatly appreciated. It
should be enough assuming all tests are passing.

The long blob of text above was mostly about this regression being a sign
of a larger problem and possible solutions to address it in follow up
tickets.

--
Ticket URL: <https://code.djangoproject.com/ticket/34016#comment:8>

Django

unread,
Sep 16, 2022, 11:34:44 AM9/16/22
to django-...@googlegroups.com
#34016: QuerySet.values_list() crash on simple ArrayAgg.
----------------------------------+------------------------------------
Reporter: Alex Kerkum | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.1
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Alex Kerkum):

Thanks Simon! As all tests are passing I've marked it as 'Ready for
review'.

Fixing the larger problem in follow up tickets sounds good, although the
implications of it are over my head :)

--
Ticket URL: <https://code.djangoproject.com/ticket/34016#comment:9>

Django

unread,
Sep 16, 2022, 1:42:16 PM9/16/22
to django-...@googlegroups.com
#34016: QuerySet.values_list() crash on simple ArrayAgg.
----------------------------------+------------------------------------
Reporter: Alex Kerkum | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 4.1
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------

Comment (by Mariusz Felisiak):

> If we go with list[Optional[Expression]] then we don't have to do any
len based special casing when dealing with optional sub-expressions but it
means that checks of the form len(source_expressions) are not adequate;
they must exclude None values. I guess we could provide a property/method
that excludes them if this is a common pattern.

Thanks for checking. I'd choose `list[Optional[Expression]]` and `None`
for empty optional expressions. However, Alex's
[https://github.com/django/django/pull/16064 PR] is more suitable for
backporting. We can standardized the behavior for optional sub-expressions
in a separate cleanup targeted only to the `main` branch.

--
Ticket URL: <https://code.djangoproject.com/ticket/34016#comment:10>

Django

unread,
Sep 17, 2022, 1:40:21 PM9/17/22
to django-...@googlegroups.com
#34016: QuerySet.values_list() crash on simple ArrayAgg.
-------------------------------------+-------------------------------------
Reporter: Alex Kerkum | Owner: Alex
| Kerkum
Type: Bug | Status: assigned

Component: contrib.postgres | Version: 4.1
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* owner: (none) => Alex Kerkum
* status: new => assigned
* has_patch: 0 => 1
* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/34016#comment:11>

Django

unread,
Sep 18, 2022, 1:40:23 AM9/18/22
to django-...@googlegroups.com
#34016: QuerySet.values_list() crash on simple ArrayAgg.
-------------------------------------+-------------------------------------
Reporter: Alex Kerkum | Owner: Alex
| Kerkum
Type: Bug | Status: closed
Component: contrib.postgres | Version: 4.1
Severity: Release blocker | Resolution: fixed

Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"f88fc72da4eb76f2d464edb4874ef6046f8a8658" f88fc72d]:
{{{
#!CommitTicketReference repository=""
revision="f88fc72da4eb76f2d464edb4874ef6046f8a8658"
Fixed #34016 -- Fixed QuerySet.values()/values_list() crash on ArrayAgg()
and JSONBAgg().

Regression in e06dc4571ea9fd5723c8029959b95808be9f8812.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34016#comment:12>

Django

unread,
Sep 18, 2022, 1:40:41 AM9/18/22
to django-...@googlegroups.com
#34016: QuerySet.values_list() crash on simple ArrayAgg.
-------------------------------------+-------------------------------------
Reporter: Alex Kerkum | Owner: Alex
| Kerkum
Type: Bug | Status: closed
Component: contrib.postgres | Version: 4.1
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"2d20386b41f30ffbd9bfcadd48877ebd8766c22a" 2d20386b]:
{{{
#!CommitTicketReference repository=""
revision="2d20386b41f30ffbd9bfcadd48877ebd8766c22a"
[4.1.x] Fixed #34016 -- Fixed QuerySet.values()/values_list() crash on
ArrayAgg() and JSONBAgg().

Regression in e06dc4571ea9fd5723c8029959b95808be9f8812.

Backport of f88fc72da4eb76f2d464edb4874ef6046f8a8658 from main
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34016#comment:13>

Reply all
Reply to author
Forward
0 new messages