[Django] #33138: Tuple comparison for efficient lexicographic ordering on multiple columns

25 views
Skip to first unread message

Django

unread,
Sep 24, 2021, 9:06:06 AM9/24/21
to django-...@googlegroups.com
#33138: Tuple comparison for efficient lexicographic ordering on multiple columns
-----------------------------------------+--------------------------------
Reporter: michalc | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 3.2
Severity: Normal | Keywords: QuerySet.extra
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-----------------------------------------+--------------------------------
The below doesn't seem possible without resorting to `extra`

{{{
#!div style="font-size: 80%"
{{{#!sql
WHERE (col_a, col_b) > ('value_a', 'value_b')
ORDER BY (col_a, col_b)
}}}
}}}

While this sort of this is semantically the same as

{{{
#!div style="font-size: 80%"
{{{#!sql
WHERE col_a > 'value_a' OR (col_a = 'value_a' AND col_b > 'value_b')
ORDER BY (col_a, col_b)
}}}
}}}

which can be expressed using the Django ORM, PostgreSQL at least treats
these differently in terms of applying indexes. Essentially, the tuple
version (from my brief testing) is better in the presence of a multi-
column index on `col_a, col_b`: it seems to avoid quite a lot of scanning.

My ultimate use case for this is cursor-based pagination, where the cursor
is a tuple of 2 columns: an "almost" unique datetime, and a fully unique
ID for tie-breakers.

--
Ticket URL: <https://code.djangoproject.com/ticket/33138>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Sep 24, 2021, 9:20:51 AM9/24/21
to django-...@googlegroups.com
#33138: Tuple comparison for efficient lexicographic ordering on multiple columns
--------------------------------+--------------------------------------

Reporter: michalc | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 3.2
Severity: Normal | Resolution:

Keywords: QuerySet.extra | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+--------------------------------------
Description changed by michalc:

Old description:

> The below doesn't seem possible without resorting to `extra`
>
> {{{
> #!div style="font-size: 80%"
> {{{#!sql
> WHERE (col_a, col_b) > ('value_a', 'value_b')
> ORDER BY (col_a, col_b)
> }}}
> }}}
>
> While this sort of this is semantically the same as
>
> {{{
> #!div style="font-size: 80%"
> {{{#!sql
> WHERE col_a > 'value_a' OR (col_a = 'value_a' AND col_b > 'value_b')
> ORDER BY (col_a, col_b)
> }}}
> }}}
>
> which can be expressed using the Django ORM, PostgreSQL at least treats
> these differently in terms of applying indexes. Essentially, the tuple
> version (from my brief testing) is better in the presence of a multi-
> column index on `col_a, col_b`: it seems to avoid quite a lot of
> scanning.
>
> My ultimate use case for this is cursor-based pagination, where the
> cursor is a tuple of 2 columns: an "almost" unique datetime, and a fully
> unique ID for tie-breakers.

New description:

The below doesn't seem possible without resorting to `extra`

{{{
#!div style="font-size: 80%"
{{{#!sql
WHERE (col_a, col_b) > ('value_a', 'value_b')
ORDER BY (col_a, col_b)
}}}
}}}

While this is semantically the same as

{{{
#!div style="font-size: 80%"
{{{#!sql
WHERE col_a > 'value_a' OR (col_a = 'value_a' AND col_b > 'value_b')
ORDER BY (col_a, col_b)
}}}
}}}

which can be expressed using the Django ORM, PostgreSQL at least treats
these differently in terms of applying indexes. Essentially, the tuple
version (from my brief testing) is better in the presence of a multi-
column index on `col_a, col_b`: it seems to avoid quite a lot of scanning.

My ultimate use case for this is cursor-based pagination, where the cursor
is a tuple of 2 columns: an "almost" unique datetime, and a fully unique
ID for tie-breakers.

--

--
Ticket URL: <https://code.djangoproject.com/ticket/33138#comment:1>

Django

unread,
Sep 24, 2021, 12:50:49 PM9/24/21
to django-...@googlegroups.com
#33138: Tuple comparison for efficient lexicographic ordering on multiple columns
-------------------------------------+-------------------------------------
Reporter: Michal Charemza | Owner: nobody
Type: New feature | Status: closed
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: duplicate

Keywords: QuerySet.extra | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: new => closed
* resolution: => duplicate
* component: Uncategorized => Database layer (models, ORM)
* type: Uncategorized => New feature


Comment:

Duplicate of #29527. Please feel-free to continue the discussion in
#29527. We can reopen the ticket if you/someone will provide PoC.

--
Ticket URL: <https://code.djangoproject.com/ticket/33138#comment:2>

Reply all
Reply to author
Forward
0 new messages