[Django] #28560: distinct() on None values

25 views
Skip to first unread message

Django

unread,
Sep 2, 2017, 7:41:00 AM9/2/17
to django-...@googlegroups.com
#28560: distinct() on None values
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database | Version: master
layer (models, ORM) |
Severity: Normal | Keywords: distinct values
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
`distinct()` doesn't work properly on `None` values. If we select columns
with `null` values then `distinct()` returns each of them instead of
single `None`, e.g. (based on `queries/tests.py`):

{{{
>>>
list(Item.objects.filter(modified__isnull=True).values_list('modified',
flat=True).distinct())
[None, None, None]
}}}

instead of `[None]`.

--
Ticket URL: <https://code.djangoproject.com/ticket/28560>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Sep 2, 2017, 7:41:44 AM9/2/17
to django-...@googlegroups.com
#28560: distinct() on None values
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:

Keywords: distinct values | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* Attachment "28560.diff" added.

Django

unread,
Sep 2, 2017, 2:34:29 PM9/2/17
to django-...@googlegroups.com
#28560: distinct() on None values
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by felixxm):

Issue is related with ordering. After `distinct()` columns from `ORDER BY`
clause are added to a `SELECT`. It doesn't take into account restricted
list of columns from `values` or `values_list`. `None` doesn't matter.

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:1>

Django

unread,
Sep 2, 2017, 2:40:37 PM9/2/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result

-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:2>

Django

unread,
Sep 2, 2017, 3:53:27 PM9/2/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by felixxm:

Old description:

> `distinct()` doesn't work properly on `None` values. If we select columns
> with `null` values then `distinct()` returns each of them instead of
> single `None`, e.g. (based on `queries/tests.py`):
>
> {{{
> >>>
> list(Item.objects.filter(modified__isnull=True).values_list('modified',
> flat=True).distinct())
> [None, None, None]
> }}}
>
> instead of `[None]`.

New description:

I think we should raise `NotSupportedError` when `distinct()` is used with
`values()` (or `values_list()`) on ordered queryset and a list of fields
in `values()` doesn't contain all fields from `ORDER BY`. It cannot return
correct result because columns from `ORDER BY` clause must be included in
`SELECT`.

--

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:3>

Django

unread,
Sep 3, 2017, 2:44:15 PM9/3/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

Couldn't we wrap the query in a subquery like we did in #24254 in this
case?

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:4>

Django

unread,
Sep 3, 2017, 2:48:36 PM9/3/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by felixxm):

Good idea! it should be feasible. I will try to prepare patch in this
week.

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:5>

Django

unread,
Sep 3, 2017, 3:01:19 PM9/3/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

By the way Mariusz, do you have an opinion on
https://code.djangoproject.com/ticket/14357#comment:11?

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:6>

Django

unread,
Sep 10, 2017, 8:42:31 AM9/10/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by felixxm:

Old description:

> I think we should raise `NotSupportedError` when `distinct()` is used


> with `values()` (or `values_list()`) on ordered queryset and a list of
> fields in `values()` doesn't contain all fields from `ORDER BY`. It
> cannot return correct result because columns from `ORDER BY` clause must
> be included in `SELECT`.

New description:

When `distinct()` is used with `values()` (or `values_list()`) on ordered


queryset and a list of fields in `values()` doesn't contain all fields

from `ORDER BY`, then it doesn't return correct result because columns
from `ORDER BY` clause must be included in `SELECT`. As Simon suggested we
should wrap a query in a subquery.

--

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:7>

Django

unread,
Sep 10, 2017, 9:43:11 AM9/10/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by felixxm:

Old description:

> When `distinct()` is used with `values()` (or `values_list()`) on ordered


> queryset and a list of fields in `values()` doesn't contain all fields
> from `ORDER BY`, then it doesn't return correct result because columns
> from `ORDER BY` clause must be included in `SELECT`. As Simon suggested
> we should wrap a query in a subquery.

New description:

When `distinct()` is used with `values()` (or `values_list()`) on ordered
queryset and a list of fields in `values()` doesn't contain all fields
from `ORDER BY`, then it doesn't return correct result because columns
from `ORDER BY` clause must be included in `SELECT`. As Simon suggested we

should wrap a query in a subquery (see related tickets #7070, #5321).

--

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:8>

Django

unread,
Sep 10, 2017, 2:17:03 PM9/10/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* has_patch: 0 => 1


Comment:

[https://github.com/django/django/pull/9055 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:9>

Django

unread,
Sep 26, 2017, 8:47:08 PM9/26/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:10>

Django

unread,
Nov 6, 2017, 12:44:25 PM11/6/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: felixxm
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* needs_better_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:11>

Django

unread,
Nov 14, 2017, 3:27:29 PM11/14/17
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: felixxm | Owner: (none)
Type: Bug | Status: new

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* owner: felixxm => (none)
* needs_better_patch: 1 => 0
* has_patch: 1 => 0
* status: assigned => new


--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:12>

Django

unread,
Oct 30, 2020, 10:12:26 AM10/30/20
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: (none)

Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tobias McNulty):

After some discussion on the mailing list (https://groups.google.com/g
/django-developers/c/DNVRFqVBsfk/m/xDUvaq3DAAAJ) it looks like the
consensus is to require an explicit opt-in or raise an error (i.e., never
add a column implicitly if the user specified a list of columns via
values() or values_list()).

I am not too familiar with the ORM. I believe these columns are added in
`get_extra_select()` (`django/db/models/sql/compiler.py`), which could be
adapted initially to raise a deprecation warning and eventually an
exception. Does that sound right?

I may not have time right away so if anyone feels like picking this up, go
for it.

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:14>

Django

unread,
Oct 30, 2020, 10:14:09 AM10/30/20
to django-...@googlegroups.com
#28560: distinct() on ordered queryset with restricted list of columns returns
incorrect result
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: (none)
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: distinct values | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by Tobias McNulty:

Old description:

> When `distinct()` is used with `values()` (or `values_list()`) on ordered
> queryset and a list of fields in `values()` doesn't contain all fields
> from `ORDER BY`, then it doesn't return correct result because columns
> from `ORDER BY` clause must be included in `SELECT`. As Simon suggested

> we should wrap a query in a subquery (see related tickets #7070, #5321).

New description:

When `distinct()` is used with `values()` (or `values_list()`) on ordered
queryset and a list of fields in `values()` doesn't contain all fields
from `ORDER BY`, then it doesn't return correct result because columns
from `ORDER BY` clause must be included in `SELECT`.

After some discussion on the mailing list (https://groups.google.com/g


/django-developers/c/DNVRFqVBsfk/m/xDUvaq3DAAAJ) it looks like the
consensus is to require an explicit opt-in or raise an error (i.e., never
add a column implicitly if the user specified a list of columns via
values() or values_list()).

--

--
Ticket URL: <https://code.djangoproject.com/ticket/28560#comment:15>

Reply all
Reply to author
Forward
0 new messages