In order to implement keyset pagination, AKA the seek method, across
multiple fields including ordering on non-unique fields, I'm wanting to
write a PostgreSQL that compares multiple fields in an ordered fashion. To
my knowledge this isn't possible with the current ORM syntax. My
particular use-case further complicates this by also wanting a subquery in
order to avoid serializing more than the unique key in the response.
Here's a generic example where clause that mirrors my use-case:
{{{#!sql
(a,b,c,id) > (SELECT a,b,c,id FROM mytable WHERE id = ?)
}}}
What this does special is account for all of the columns when making the
comparison. So it will only compare the id column if all of the a, b, and
c columns are the same, to allow for exact indentification in an arbitrary
ordering.
In the PostgreSQL docs this feature is called Row Constructor Comparison,
and the documentation is here: https://www.postgresql.org/docs/9.6/static
/functions-comparisons.html#ROW-WISE-COMPARISON.
--
Ticket URL: <https://code.djangoproject.com/ticket/29527>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Comment (by Ryan Hiebert):
With the help of a patient colleague (thanks Tim!) I've been able to
accomplish what I needed to natively, without using {{{.extra()}}}. With
one caveat, I needed to add the LHS as a calculated column using annotate,
which is a bit of a kludge, which ends up returning annotated data that I
didn't really need. For my case, that's not too big of a deal, but perhaps
less than ideal.
It might be deserving of a way to remove annotations that don't need to be
selected, or to allow filter expressions that allow a dynamic expression
on both sides. I'm not sure if that deserves its own ticket, if it should
be on this one, or if perhaps there's already a ticket open for that.
Here's what we came up with:
{{{#!python
from django.db.models import Func, F, Subquery
def after(queryset, value):
fields = ['a', 'b', 'c', 'id']
return (
queryset
.annotate(rank=Func(*(F(field) for field in fields),
function='ROW'))
.filter(rank=Subquery(queryset.filter(id=value).values(*fields)))
.order_by(*fields)
)
}}}
One other thing to note is that this opts to use the explicit {{{ROW}}}
function, rather than relying on that being the result of just using
parenthesis, because parenthesis alone with a single value doesn't
necessarily make a {{{ROW}}}.
--
Ticket URL: <https://code.djangoproject.com/ticket/29527#comment:1>
* component: Uncategorized => Database layer (models, ORM)
Comment:
I'm not sure what's to be done to close this ticket. Can you offer a
patch?
--
Ticket URL: <https://code.djangoproject.com/ticket/29527#comment:2>
* status: new => closed
* resolution: => needsinfo
--
Ticket URL: <https://code.djangoproject.com/ticket/29527#comment:3>
* type: Uncategorized => New feature
--
Ticket URL: <https://code.djangoproject.com/ticket/29527#comment:4>
Comment (by Simon Charette):
> With one caveat, I needed to add the LHS as a calculated column using
annotate, which is a bit of a kludge, which ends up returning annotated
data that I didn't really need.
FWIW this can now be avoided in two ways
1. Use `QuerySet.alias` instead of `.annotate`
2. Pass lookups directly to `filter`
{{{#!python
from django.db.models.lookups import GreaterThan
from django.db.models import Func, F, Subquery
class Row(Func):
function = 'ROW'
def after(queryset, value):
return (
queryset
.filter(GreaterThan(
Row('a', 'b', 'c', 'id'),
Subquery(queryset.filter(id=value).values('a', 'b', 'c',
'id'))
))
.order_by('a', 'b', 'c', 'id')
)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29527#comment:5>