[Django] #30209: Union with group by don't generate correct Subquery

89 views
Skip to first unread message

Django

unread,
Feb 24, 2019, 10:53:20 AM2/24/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: | Status: new
Uncategorized |
Component: Database | Version: 1.11
layer (models, ORM) |
Severity: Normal | Keywords: union, group by
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
reopen: https://code.djangoproject.com/ticket/30203

Simplified for example:
have simple query


{{{
qs1: QuerySet = User.objects.filter(id__gt=10)

qs1 = qs1.values('is_active').annotate(id=Count('pk'))
}}}

generate:

{{{
SELECT "suser_customuser"."is_active",
COUNT("suser_customuser"."id") AS "id"
FROM "suser_customuser"
WHERE "suser_customuser"."id" > 10
GROUP BY "suser_customuser"."is_active"
}}}

all correct, now i wont use union

{{{
qs1: QuerySet = User.objects.filter(id__gt=10)
qs2: QuerySet = User.objects.filter(id__lt=10)

qs1 = qs1.values('is_active').annotate(id=Count('pk'))

}}}

now get not correct SQL:

{{{
(SELECT "suser_customuser"."is_active",
"suser_customuser"."id"
FROM "suser_customuser"
WHERE "suser_customuser"."id" > 10)
UNION
(SELECT "suser_customuser"."is_active",
"suser_customuser"."id"
FROM "suser_customuser"
WHERE "suser_customuser"."id" < 10)
}}}


expected:

{{{
select *
from
((SELECT "suser_customuser"."is_active",
"suser_customuser"."id"
FROM "suser_customuser"
WHERE "suser_customuser"."id" > 10)
UNION
(SELECT "suser_customuser"."is_active",
"suser_customuser"."id"
FROM "suser_customuser"
WHERE "suser_customuser"."id" < 10)) U0
group by is_active, id
}}}


i think this is bug, if not why not show any warning or exception?
or just say how rewrite query for correct use Subquery

screenshot:
http://dl4.joxi.net/drive/2019/02/24/0002/0003/143363/63/2eab4a5318.jpg

--
Ticket URL: <https://code.djangoproject.com/ticket/30209>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Feb 24, 2019, 10:53:52 AM2/24/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.11
(models, ORM) |
Severity: Normal | Resolution:

Keywords: union, group by | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Nikolas):

* Attachment "2eab4a5318.jpg" added.

screenshot

Django

unread,
Feb 24, 2019, 10:54:30 AM2/24/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Bug | Status: new

Component: Database layer | Version: 1.11
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Nikolas):

* type: Uncategorized => Bug


--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:1>

Django

unread,
Feb 24, 2019, 2:52:06 PM2/24/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master

(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 1

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* needs_docs: 0 => 1
* version: 1.11 => master
* stage: Unreviewed => Accepted


Comment:

I agree that either an exception should be raised and the
[https://docs.djangoproject.com/en/2.1/ref/models/querysets/#union
documentation be adjusted] to explicitly mention aggregation won't work or
adding an aggregate function should perform a subquery pushdown.

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:2>

Django

unread,
Feb 24, 2019, 3:11:41 PM2/24/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Nikolas):

Replying to [comment:2 Simon Charette]:


> I agree that either an exception should be raised and the
[https://docs.djangoproject.com/en/2.1/ref/models/querysets/#union
documentation be adjusted] to explicitly mention aggregation won't work or
adding an aggregate function should perform a subquery pushdown.


Please can help, how can rewrite query for correct work???? My boss shut
me in close time, but i stock there)

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:3>

Django

unread,
Mar 15, 2019, 6:27:39 PM3/15/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by TKdka):

Replying to [comment:2 Simon Charette]:
> I agree that either an exception should be raised and the
[https://docs.djangoproject.com/en/2.1/ref/models/querysets/#union
documentation be adjusted] to explicitly mention aggregation won't work or
adding an aggregate function should perform a subquery pushdown.

I'm planning to make this a sub query pushdown as suggested. FYI, this is
my first contribution to the code base, so suggestions and feedback would
be most appreciated.

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:4>

Django

unread,
Mar 15, 2019, 6:40:52 PM3/15/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

> I'm planning to make this a sub query pushdown as suggested. FYI, this
is my first contribution to the code base, so suggestions and feedback
would be most appreciated.

TKdka, not to discourage you but this is certainly going to be a hard
first contribution as the ORM doesn't have any formalized API to perform
such pushdown yet. If that can provide you any guidance the most similar
code I can think of is in `django.db.models.sql.Query.get_aggregation`.

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:5>

Django

unread,
Mar 16, 2019, 7:30:55 PM3/16/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by TKdka):

Replying to [comment:5 Simon Charette]:


> > I'm planning to make this a sub query pushdown as suggested. FYI, this
is my first contribution to the code base, so suggestions and feedback
would be most appreciated.
>
> TKdka, not to discourage you but this is certainly going to be a hard
first contribution as the ORM doesn't have any formalized API to perform
such pushdown yet. If that can provide you any guidance the most similar
code I can think of is in `django.db.models.sql.Query.get_aggregation`.

Thanks for the guidance and code reference. As nobody else has stepped up
on this in 3 weeks, I'll give it a shot (-:

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:6>

Django

unread,
Mar 18, 2019, 7:25:54 PM3/18/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by TKdka):

Replying to [comment:5 Simon Charette]:
> > I'm planning to make this a sub query pushdown as suggested. FYI, this
is my first contribution to the code base, so suggestions and feedback
would be most appreciated.
>
> TKdka, not to discourage you but this is certainly going to be a hard
first contribution as the ORM doesn't have any formalized API to perform
such pushdown yet. If that can provide you any guidance the most similar
code I can think of is in `django.db.models.sql.Query.get_aggregation`.

After looking into this, you are completely right that it's a big
endeavor. The difficult part is that the query needs to be manipulated
even after the annotate call, which is significantly different from the
aggregate example.

Therefore, I'll submit a small patch that raises an exception when this is
attempted.

I'm not sure what exception class should be used here. TypeError?
Thoughts?

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:7>

Django

unread,
Mar 19, 2019, 12:02:14 AM3/19/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

> I'm not sure what exception class should be used here. TypeError?
Thoughts?

Good question. I know that
[https://docs.djangoproject.com/en/2.1/ref/models/querysets/#union we're
currently preventing some operations on combined querysets] so I'd try to
see what type of exception we're currently raising and stick to it to be
coherent. I think `.filter()` is disallowed for example. It's possible
that we simply crash though.

I think the best place to add this check would be in
`query.Queryset.annotate` just before `set_group_by`
[https://github.com/django/django/blob/418263c457636d3301f2068c47f09a0f42e15c52/django/db/models/query.py#L1052
is called] if an annotation contains aggregates.

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:8>

Django

unread,
Mar 22, 2019, 1:08:50 AM3/22/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by TKdka):

Replying to [comment:8 Simon Charette]:


> > I'm not sure what exception class should be used here. TypeError?
Thoughts?
>
> Good question. I know that
[https://docs.djangoproject.com/en/2.1/ref/models/querysets/#union we're
currently preventing some operations on combined querysets] so I'd try to
see what type of exception we're currently raising and stick to it to be
coherent. I think `.filter()` is disallowed for example. It's possible
that we simply crash though.
>
> I think the best place to add this check would be in
`query.Queryset.annotate` just before `set_group_by`
[https://github.com/django/django/blob/418263c457636d3301f2068c47f09a0f42e15c52/django/db/models/query.py#L1052
is called] if an annotation contains aggregates.

On where the exception should be raised...
I agree with general placement, but I'm thinking the exception should be
raised as soon as an aggregate function as noticed, so just after
[https://github.com/django/django/blob/418263c457636d3301f2068c47f09a0f42e15c52/django/db/models/query.py#L1048
line 1048 is called] The other part of the conditional,
{{{
clone.query.group_by = True
}}}
is problematic as well. This appears to create a group by over all the
fields, which is not supported. Thoughts?


On the type of exception that should be raised...
Using union+filter does not raise an exception (nor does it work). I
think that the correct exception should be something along the lines of a
SQLNotSupported exception. The exception should state what SQL feature
combination is not supported and then suggest using a raw query or
rewriting the query. Is there already something like this?


On related bugs...
I found two similar bugs while exploring this feature (one of which is
mentioned above). I'll create tickets for them as well, but wanted to
mention them here too.

As mentioned above, there is a bug with filter + union as well.
Performing a filter with a union neither raises an exception nor adds the
filter.
{{{
qs1 = Book.objects.filter(pk__lte=10)
qs2 = Book.objects.filter(pk__gt=10)
qs3 = qs1.union(qs2)
qs4 = qs3.filter(publisher_id=1)
print (qs4.query)
}}}
SELECT "aggregation_book"."id", "aggregation_book"."isbn",
"aggregation_book"."name", "aggregation_book"."pages",
"aggregation_book"."rating", "aggregation_book"."price",
"aggregation_book"."contact_id", "aggregation_book"."publisher_id",
"aggregation_book"."pubdate" FROM "aggregation_book" WHERE
"aggregation_book"."id" <= 10 UNION SELECT "aggregation_book"."id",
"aggregation_book"."isbn", "aggregation_book"."name",
"aggregation_book"."pages", "aggregation_book"."rating",
"aggregation_book"."price", "aggregation_book"."contact_id",
"aggregation_book"."publisher_id", "aggregation_book"."pubdate" FROM
"aggregation_book" WHERE "aggregation_book"."id" > 10

There's a similar bug when using filter + non-aggregating annotate
functions. The annotated field is not added, and an exception is not
raised.
{{{
qs1 = Book.objects.filter(pk__lte=10)
qs2 = Book.objects.filter(pk__gt=10)
qs3 = qs1.union(qs2)
qs4 = qs3.annotate(annotateField=F('id')+1)
print (qs4.query)
}}}
SELECT "aggregation_book"."id", "aggregation_book"."isbn",
"aggregation_book"."name", "aggregation_book"."pages",
"aggregation_book"."rating", "aggregation_book"."price",
"aggregation_book"."contact_id", "aggregation_book"."publisher_id",
"aggregation_book"."pubdate" FROM "aggregation_book" WHERE
"aggregation_book"."id" <= 10 UNION SELECT "aggregation_book"."id",
"aggregation_book"."isbn", "aggregation_book"."name",
"aggregation_book"."pages", "aggregation_book"."rating",
"aggregation_book"."price", "aggregation_book"."contact_id",
"aggregation_book"."publisher_id", "aggregation_book"."pubdate" FROM
"aggregation_book" WHERE "aggregation_book"."id" > 10

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:9>

Django

unread,
Apr 3, 2019, 11:28:04 AM4/3/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 1

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Can Sarıgöl):

* cc: Can Sarıgöl (added)
* has_patch: 0 => 1


Comment:

Hi, I've been tackling "union as a subquery" topic for a while. So I tried
to fix, could you review? Thanks.
[https://github.com/django/django/pull/11167 PR]

Mentioned [https://github.com/django/django/pull/11152 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:10>

Django

unread,
Apr 16, 2019, 2:23:21 AM4/16/19
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: Can
| Sarıgöl
Type: Bug | Status: assigned

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Can Sarıgöl):

* status: new => assigned
* owner: nobody => Can Sarıgöl


--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:11>

Django

unread,
Jun 4, 2020, 5:46:28 AM6/4/20
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: Can
| Sarıgöl
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Matthijs Kooijman):

It seems the linked PR was already closed, so it seems implementing this
properly has been abandoned?

As for showing a warning when a union is annotated with an aggregation, it
seems that https://github.com/django/django/pull/11591 forbids *all*
annotations on unions, so this case seems to be convered by that as well.
Not sure what that means for the status of this ticket, though.

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:12>

Django

unread,
Jun 4, 2020, 5:56:05 AM6/4/20
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: Can
| Sarıgöl
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: union, group by | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by felixxm):

Replying to [comment:12 Matthijs Kooijman]:


> It seems the linked PR was already closed, so it seems implementing this
properly has been abandoned?
>
> As for showing a warning when a union is annotated with an aggregation,
it seems that https://github.com/django/django/pull/11591 forbids *all*
annotations on unions, so this case seems to be convered by that as well.
Not sure what that means for the status of this ticket, though.

I opened a [https://github.com/django/django/pull/11692 new PR] but it's
still WIP. There is a tricky parentheses issue on MySQL/MariaDB. I hope I
will find time to finish it.

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:13>

Django

unread,
Jan 25, 2022, 6:56:03 AM1/25/22
to django-...@googlegroups.com
#30209: Union with group by don't generate correct Subquery
-------------------------------------+-------------------------------------
Reporter: Nikolas | Owner: Can
| Sarıgöl
Type: Bug | Status: closed
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: duplicate

Keywords: union, group by | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 1
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: assigned => closed
* resolution: => duplicate


Comment:

A proper exception when calling `annotate()` after `union()`,
`intersection()`, and `difference()` was added in
1853383969a4c53bbeba998757c30410bd3df4bb.

Closing as a duplicate of #28519.

--
Ticket URL: <https://code.djangoproject.com/ticket/30209#comment:14>

Reply all
Reply to author
Forward
0 new messages