[Django] #26019: Incorrect query generated for QuerySet Union with annotate

63 views
Skip to first unread message

Django

unread,
Dec 31, 2015, 12:46:00 AM12/31/15
to django-...@googlegroups.com
#26019: Incorrect query generated for QuerySet Union with annotate
----------------------------------------------+--------------------
Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer (models, ORM) | Version: 1.9
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
The Django ORM generates incorrect queries for a QuerySet union that has
an annotate. Here's output from a Python shell session illustrating this
error.

I am trying to do a union of two QuerySets, one which gets the first_name
field (renamed as name_field), and another one which gets the last_name
field (also renamed as name_field). The unioned QuerySet gets only the
first_name field values.

{{{
>>> from django.contrib.auth.models import User
>>> from django.db.models import F
>>> first_name_qs =
User.objects.order_by().annotate(name_field=F('first_name')).values('name_field').distinct()
>>> last_name_qs =
User.objects.order_by().annotate(name_field=F('last_name')).values('name_field').distinct()
>>> print(first_name_qs.query)
SELECT DISTINCT "auth_user"."first_name" AS "name_field" FROM "auth_user"
>>> print(last_name_qs.query)
SELECT DISTINCT "auth_user"."last_name" AS "name_field" FROM "auth_user"
>>> print((first_name_qs | last_name_qs).query)
SELECT DISTINCT "auth_user"."first_name" AS "name_field" FROM "auth_user"
}}}

The above behavior is with Django 1.9.

--
Ticket URL: <https://code.djangoproject.com/ticket/26019>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Dec 31, 2015, 2:13:04 AM12/31/15
to django-...@googlegroups.com
#26019: Incorrect query generated for QuerySet Union with annotate
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.9
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by charettes):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Comment:

Are you expecting the ORM to issue an `UNION` like the following query?

{{{#!sql
SELECT "auth_user"."first_name" AS "name_field" FROM "auth_user"
UNION
SELECT "auth_user"."last_name" AS "name_field" FROM "auth_user"
}}}

I think the ORM only combines the queries filters when the `|` is used.

--
Ticket URL: <https://code.djangoproject.com/ticket/26019#comment:1>

Django

unread,
Dec 31, 2015, 2:27:39 AM12/31/15
to django-...@googlegroups.com
#26019: Incorrect query generated for QuerySet Union with annotate
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.9
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by mssnlayam):

That (or a similar query) is what I am expecting. I have used the `|`
operator.

--
Ticket URL: <https://code.djangoproject.com/ticket/26019#comment:2>

Django

unread,
Dec 31, 2015, 2:39:34 AM12/31/15
to django-...@googlegroups.com
#26019: Incorrect query generated for QuerySet Union with annotate
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.9
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by charettes):

Replying to [comment:2 mssnlayam]:


> That (or a similar query) is what I am expecting. I have used the `|`
operator.

What I meant here is that the ORM ''only'' considers the queries filters
and ignore the `annotate()` aliases and the fact `distinct()` was used.

--
Ticket URL: <https://code.djangoproject.com/ticket/26019#comment:3>

Django

unread,
Dec 31, 2015, 11:55:06 AM12/31/15
to django-...@googlegroups.com
#26019: Incorrect query generated when combining querysets refering to different
fields under the same alias.
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master

(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by charettes):

* version: 1.9 => master
* stage: Unreviewed => Accepted


Comment:

I changed the title to reflect what I think is the actual issue.

If we could use kwargs to specify aliases (#16735) the issue could be
expressed by the following query combination:

{{{#!python
User.object.values(name='first_name') |
User.objects.values(name='last_name')
}}}

Still considering this a bug because a `TypeError` should be raised in
[https://github.com/django/django/blob/df3d5b1d73699b323aac377dffab039dca26c1e4/django/db/models/query.py#L1093-L1102
Query._merge_sanity_checks] to warn the user about this case.

Once we land this bug fix we should consider reorienting this ticket as a
''feature request'' to teach the ORM it should use `UNION ALL` in such
case.

{{{#!sql
SELECT "auth_user"."first_name" AS "name" FROM "auth_user"
UNION ALL
SELECT "auth_user"."last_name" AS "name FROM "auth_user"
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/26019#comment:4>

Django

unread,
Dec 31, 2015, 4:11:46 PM12/31/15
to django-...@googlegroups.com
#26019: Incorrect query generated when combining querysets refering to different
fields under the same alias.
-------------------------------------+-------------------------------------
Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by shaib):

If and when we do teach it to use `UNION ALL`, and since `distinct()` was
mentioned in the OP's example, we should probably do something to get
distinct queries to use `UNION` instead of `UNION ALL`.

--
Ticket URL: <https://code.djangoproject.com/ticket/26019#comment:5>

Django

unread,
Dec 31, 2015, 4:28:33 PM12/31/15
to django-...@googlegroups.com
#26019: Incorrect query generated when combining querysets refering to different
fields under the same alias.
-------------------------------------+-------------------------------------
Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by charettes):

@shaib I agree but I assumed teaching the ORM to use `UNION` instead of
`UNION ALL` if combined queries use distinct could be done in an later
patch.

Given

{{{#!python
first_names = User.object.values(name='first_name')
last_names = User.object.values(name='last_name')
}}}

I would expect

{{{#!python
first_names.distinct() | last_names.distinct()
}}}

To result in

{{{#!sql
SELECT DISTINCT "auth_user"."first_name" AS "name" FROM "auth_user"
UNION ALL
SELECT DISTINCT "auth_user"."last_name" AS "name" FROM "auth_user"
}}}

And

{{{#!python
(first_names | last_names).distinct()
}}}

To result in

{{{#!sql
SELECT "auth_user"."first_name" AS "name" FROM "auth_user"
UNION

SELECT "auth_user"."last_name" AS "name" FROM "auth_user"
}}}

or the equivalent

{{{#!sql
SELECT DISTINCT "name" FROM (


SELECT "auth_user"."first_name" AS "name" FROM "auth_user"
UNION ALL
SELECT "auth_user"."last_name" AS "name" FROM "auth_user"

)
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/26019#comment:6>

Django

unread,
Dec 31, 2015, 5:27:32 PM12/31/15
to django-...@googlegroups.com
#26019: Incorrect query generated when combining querysets refering to different
fields under the same alias.
-------------------------------------+-------------------------------------
Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by shaib):

Yes, exactly.

--
Ticket URL: <https://code.djangoproject.com/ticket/26019#comment:7>

Django

unread,
Jan 1, 2016, 1:31:53 PM1/1/16
to django-...@googlegroups.com
#26019: Incorrect query generated when combining querysets refering to different
fields under the same alias.
-------------------------------------+-------------------------------------
Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by charettes):

* cc: charettes (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/26019#comment:8>

Django

unread,
Dec 5, 2016, 4:31:54 AM12/5/16
to django-...@googlegroups.com
#26019: Incorrect query generated when combining querysets refering to different
fields under the same alias.
-------------------------------------+-------------------------------------
Reporter: Suriya Subramanian | Owner: nobody
Type: Bug | Status: new

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Matthew Schinckel):

It seems to me that from this, it _could_ be possible to UNION two
ValuesQuerySets from different models, as long as they have the same list
of columns.

Yes, I'm aware that ValuesQuerySet went away, but I'm hoping that this may
still be possible, at some level ;)

--
Ticket URL: <https://code.djangoproject.com/ticket/26019#comment:9>

Django

unread,
Aug 23, 2022, 12:32:32 PM8/23/22
to django-...@googlegroups.com
#26019: Incorrect query generated when combining querysets refering to different
fields under the same alias.
-------------------------------------+-------------------------------------
Reporter: Suriya Subramanian | Owner: nobody
Type: Bug | Status: closed
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* status: new => closed
* resolution: => fixed


Comment:

Now that both `QuerySet.union` (#27718) and
`QuerySet.values(alias:str=exr:Expression)` (#16735) support has landed I
think we can close this ticket as `QuerySet` combination through `|` is
likely going to remain unchanged.

--
Ticket URL: <https://code.djangoproject.com/ticket/26019#comment:10>

Reply all
Reply to author
Forward
0 new messages