[Django] #24254: Passing a query to "pk__in" of another query may result in invalid SQL

49 views
Skip to first unread message

Django

unread,
Jan 30, 2015, 3:56:45 PM1/30/15
to django-...@googlegroups.com
#24254: Passing a query to "pk__in" of another query may result in invalid SQL
----------------------------------------------+--------------------
Reporter: bronger | Owner: nobody
Type: Bug | Status: new
Component: Database layer (models, ORM) | Version: 1.7
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
Consider the following models:

{{{
from django.db import models


class Father(models.Model):
name = models.CharField(max_length=200)


class Son(Father):

class Meta:
ordering = ("name",)
}}}

Then, the following query:

{{{
Son.objects.filter(pk__in=Son.objects.all().distinct()[:10])
}}}

results in an invalid SQL query. SQLite complains with "only a single
result allowed for a SELECT that is part of an expression".

--
Ticket URL: <https://code.djangoproject.com/ticket/24254>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jan 31, 2015, 2:15:13 PM1/31/15
to django-...@googlegroups.com
#24254: Passing a query to "pk__in" of another query may result in invalid SQL
-------------------------------------+-------------------------------------

Reporter: bronger | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by inglesp):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Comment:

Have had a little dig into this.

Consider an `__in` lookup with the following on the RHS:
`M.objects.order_by(sort_field).distinct()[m:n]`.

This generates a query that returns both `M`'s primary key and
`sort_field`. Because it returns two fields, it cannot be used as a
subquery (at least in SQLite) and so we hit the reported error.

I think this behaviour is related to a
[https://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.distinct
warning in the docs]: "Any fields used in an `order_by()` call are
included in the SQL `SELECT` columns" (although from reading the code for
`SQLCompiler.get_extra_select` I think that the fields used in an
`order_by()` call are only included in the `SELECT` columns when the query
has `distinct` set).

It's not obvious to me why any fields used in an `order_by()` call
'''must''' always be included in the `SELECT` columns, so I'd like to
propose a fix whereby `SQLCompiler.get_extra_select` returns an empty list
if it's dealing with a subquery. I've made that change locally, and
nothing fails. However, I don't know enough about the ORM to know whether
this is a good idea! I hope this is useful.

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:1>

Django

unread,
Jan 31, 2015, 2:31:04 PM1/31/15
to django-...@googlegroups.com
#24254: Passing a query to "pk__in" of another query may result in invalid SQL
-------------------------------------+-------------------------------------

Reporter: bronger | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Old description:

> Consider the following models:
>
> {{{
> from django.db import models
>

> class Father(models.Model):
> name = models.CharField(max_length=200)
>

> class Son(Father):
>
> class Meta:
> ordering = ("name",)
> }}}
>
> Then, the following query:
>
> {{{
> Son.objects.filter(pk__in=Son.objects.all().distinct()[:10])
> }}}
>
> results in an invalid SQL query. SQLite complains with "only a single
> result allowed for a SELECT that is part of an expression".

New description:

Consider the following model:

{{{
from django.db import models

class MyModel(models.Model):
name = models.CharField(max_length=200)

class Meta:
ordering = ("name",)
}}}

Then, the following query:

{{{
MyModel.objects.filter(pk__in=MyModel.objects.all().distinct()[:10])
}}}

results in an invalid SQL query. SQLite complains with "only a single
result allowed for a SELECT that is part of an expression".

--

Comment (by bronger):

You're right, my example was not minimal. I changed it so that only one
model is defined.

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:2>

Django

unread,
Feb 7, 2015, 12:03:49 PM2/7/15
to django-...@googlegroups.com
#24254: Passing a query to "pk__in" of another query may result in invalid SQL
-------------------------------------+-------------------------------------

Reporter: bronger | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by inglesp):

* has_patch: 0 => 1


Comment:

I've added a patch with a fix and a test at
https://github.com/django/django/pull/4078.

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:3>

Django

unread,
Feb 7, 2015, 12:04:37 PM2/7/15
to django-...@googlegroups.com
#24254: Passing a query to "pk__in" of another query may result in invalid SQL
-------------------------------------+-------------------------------------

Reporter: bronger | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by inglesp):

* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:4>

Django

unread,
Feb 9, 2015, 12:58:17 PM2/9/15
to django-...@googlegroups.com
#24254: Passing a query to "pk__in" of another query may result in invalid SQL
-------------------------------------+-------------------------------------

Reporter: bronger | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* needs_better_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:5>

Django

unread,
Feb 15, 2015, 12:16:43 PM2/15/15
to django-...@googlegroups.com
#24254: Passing a query to "pk__in" of another query may result in invalid SQL
-------------------------------------+-------------------------------------

Reporter: bronger | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by inglesp):

The proposed patch passes on SQLite, but fails on Postgres, with:

{{{
for SELECT DISTINCT, ORDER BY expressions must appear in select list LINE
1: ...eries_person"."id" FROM "queries_person" ORDER BY "queries_p...
}}}

and on MySQL, with:

{{{
This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME
subquery'
}}}

I'm afraid I don't know enough about either database to know what the
right fix is.

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:6>

Django

unread,
Aug 5, 2016, 2:58:03 AM8/5/16
to django-...@googlegroups.com
#24254: Passing a query to "pk__in" of another query may result in invalid SQL
-------------------------------------+-------------------------------------

Reporter: bronger | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by rtpg):

I've hit this issue as well, reaching an identical situation of `__in` +
ordering + distinct + slicing.

Typing this out, I realised that if you have ordering and slicing, then
you'll need the values in the `SELECT` (see the Postgres error mentioned
before).

So I would propose a "tuple unwrapping" in that case. So the `pk__in=qs`
property would expand to SQL like `SELECT U1."id" FROM (SELECT DISTINCT
... ) U1`


{{{
In [61]: print
MyModel.objects.filter(pk__in=MyModel.objects.all().distinct()[:10]).query
# invalid syntax due to missing tuple unwrapping but includes proper
ordering
SELECT "test_mymodel"."id", "test_mymodel"."name" FROM "test_mymodel"
WHERE "test_mymodel"."id" IN
(SELECT DISTINCT "test_mymodel"."id", "test_mymodel"."name" FROM
"test_mymodel" ORDER BY "test_mymodel"."name" ASC LIMIT 10)
ORDER BY "test_mymodel"."name" ASC

# proposed solution with tuple unwrapping
SELECT "test_mymodel"."id", "test_mymodel"."name" FROM "test_mymodel"
WHERE "test_mymodel"."id" IN
(SELECT U0."id" FROM
(SELECT DISTINCT "test_mymodel"."id", "test_mymodel"."name" FROM
"test_mymodel" ORDER BY "test_mymodel"."name" ASC LIMIT 10) U0
)
ORDER BY "test_mymodel"."name" ASC
}}}


I think we could always do wrapping and unwrapping of `__in` subqueries.
The cost is pretty negligible in Postgres (an extra select node), and
avoids having to figure out the innards of the subquery select.

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:7>

Django

unread,
Aug 5, 2016, 9:03:18 AM8/5/16
to django-...@googlegroups.com
#24254: Passing a query to "pk__in" of another query may result in invalid SQL
-------------------------------------+-------------------------------------

Reporter: bronger | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master

(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by charettes):

* cc: charettes (added)
* version: 1.7 => master


Comment:

We should be able to avoid unconditional tuple wrapping as it only seems
to be required when `DISTINCT` is used with `ORDER BY`.

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:8>

Django

unread,
Apr 27, 2017, 12:53:25 AM4/27/17
to django-...@googlegroups.com
#24254: Queries using distinct() and order_by() cannot be used with the __in lookup
-------------------------------------+-------------------------------------
Reporter: Torsten Bronger | Owner: Simon
| Charette
Type: Bug | Status: assigned

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* owner: nobody => Simon Charette
* needs_better_patch: 1 => 0
* status: new => assigned


Comment:

[https://github.com/django/django/pull/8418 Updated PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:9>

Django

unread,
May 10, 2017, 1:41:44 PM5/10/17
to django-...@googlegroups.com
#24254: Queries using distinct() and order_by() cannot be used with the __in lookup
-------------------------------------+-------------------------------------
Reporter: Torsten Bronger | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin

Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* stage: Accepted => Ready for checkin


Comment:

(pending some cosmetic updates)

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:10>

Django

unread,
May 11, 2017, 9:51:11 PM5/11/17
to django-...@googlegroups.com
#24254: Queries using distinct() and order_by() cannot be used with the __in lookup
-------------------------------------+-------------------------------------
Reporter: Torsten Bronger | Owner: Simon
| Charette
Type: Bug | Status: closed

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette <charette.s@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"4acae21846f6212aa992763e587c7e201828d7b0" 4acae218]:
{{{
#!CommitTicketReference repository=""
revision="4acae21846f6212aa992763e587c7e201828d7b0"
Fixed #24254 -- Fixed queries using the __in lookup with querysets using
distinct() and order_by().

Thanks Tim for the review.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:11>

Django

unread,
Oct 6, 2017, 12:47:54 PM10/6/17
to django-...@googlegroups.com
#24254: Queries using distinct() and order_by() cannot be used with the __in lookup
-------------------------------------+-------------------------------------
Reporter: Torsten Bronger | Owner: Simon
| Charette
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"11ade8eefd32f5bc7ee6379b77824f02ca61c20b" 11ade8e]:
{{{
#!CommitTicketReference repository=""
revision="11ade8eefd32f5bc7ee6379b77824f02ca61c20b"
Refs #24254 -- Removed unnecessary SQL AS clause in SQLCompiler.as_sql().

Incorrect on Oracle.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:12>

Django

unread,
Oct 6, 2017, 1:24:28 PM10/6/17
to django-...@googlegroups.com
#24254: Queries using distinct() and order_by() cannot be used with the __in lookup
-------------------------------------+-------------------------------------
Reporter: Torsten Bronger | Owner: Simon
| Charette
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"71da2b4f11938532df63817f8ea01e30c0875a05" 71da2b4f]:
{{{
#!CommitTicketReference repository=""
revision="71da2b4f11938532df63817f8ea01e30c0875a05"
[2.0.x] Refs #24254 -- Removed unnecessary SQL AS clause in
SQLCompiler.as_sql().

Incorrect on Oracle.

Backport of 11ade8eefd32f5bc7ee6379b77824f02ca61c20b from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24254#comment:13>

Reply all
Reply to author
Forward
0 new messages