I have a view which uses the `Paginator` object to page results for a
table, one of the columns of the tables is a total gathered from summing
rows from 5 different tables, so the query has 5 joins to get these
values.
{{{#!python
run_objects =
Run.objects.filter(type__name=run_type).select_related("runend",
"browserlink__browser", "filecheck__filecheckrecord").\
annotate(total_griderror=Count("griderrorrecord", distinct=True),
total_gridhttperror=Count("gridhttperrorrecord",
distinct=True),
total_gridconnerror=Count("gridconnectionerrorrecord",
distinct=True),
total_storagehttperror=Count("storagehttperrorrecord",
distinct=True),
total_storageconnerror=Count("storageconnectionerrorrecord",
distinct=True)).order_by("-id")
paginator = Paginator(runs_objects, 15)
}}}
The `Paginator` object appears to do a Count query, to calculate the total
number of pages and then a select query later when a page number is
requested to get the data. The problem I'm seeing is that the count query
in this case is including the 5 joins which don't affect the count other
than making it a lot slower.
I was able to speed this up by performing the count for the `Paginator`
myself
{{{#!python
run_objects = Run.objects.filter(type__name=run_type)
total = run_objects.count()
run_objects = run_objects.select_related("runend", "browserlink__browser",
"filecheck__filecheckrecord").\
annotate(total_griderror=Count("griderrorrecord", distinct=True),
total_gridhttperror=Count("gridhttperrorrecord",
distinct=True),
total_gridconnerror=Count("gridconnectionerrorrecord",
distinct=True),
total_storagehttperror=Count("storagehttperrorrecord",
distinct=True),
total_storageconnerror=Count("storageconnectionerrorrecord",
distinct=True)).order_by("-id")
paginator = Paginator(runs_objects, 15)
paginator._count = total
}}}
I was wondering - would it be possible to do this automatically. Is there
a way to drop bits off the query for the count query (ie like annotations,
select_related, prefetches which would have no effect on the number of
rows returned) to speed up things up?
--
Ticket URL: <https://code.djangoproject.com/ticket/23771>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* keywords: => paginator optimization performance
* needs_better_patch: => 0
* needs_docs: => 0
* needs_tests: => 0
* stage: Unreviewed => Accepted
Comment:
There are probably some non-complex gains to be had, but if all else fails
it would be worth documenting the potential performance hit with an
optimization example.
--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:1>
* cc: mail@… (added)
Comment:
This sounds like a nice idea. I don't think we should try to do this
automatically though -- it's probably quite hard and the developer
probably knows better how to optimize a query. Maybe we could add an
additional `object_count` kwarg in the constructor that's specifically
documented for optimization purposes?
--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:2>
Comment (by adamchainz):
For the example queryset, I think just using `prefetch_related` will turn
the joins into extra queries and they won't then appear in the count
query.
--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:3>
Comment (by Ramiro Morales):
#25375 and #29254 were related.
--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:4>
Comment (by Ramiro Morales):
#28477 is related, although it's has a more general scope (it's about the
`Queryset.count()` method instead of just the Paginator) and proposes to
work in reduction of unused annotations.
--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:5>
Comment (by Josh Smeaton):
Adding my +1 to this, as it makes it extremely difficult to optimise
readonly annotations in the admin. As an example:
{{{
class MyAdmin(admin.ModelAdmin):
list_display = ("f1", "f2", "related_count")
def related_count(self, obj):
return obj.related_count
def get_queryset(self, request):
return
super().get_queryset(request).annotate(related_count=Count("related"))
}}}
If QuerySet had a way to remove annotations like select_related(None) or
order_by(), then we could implement a Paginator like so:
{{{
class BetterCountPaginator(Paginator):
@cached_property
def count(self):
return self.object_list.annotate(None).count()
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:6>
Comment (by Simon Charette):
Doing this automatically and correctly is really tricky as #28477 proved
(think of annotations spanning multi-valued relationships, `values`
grouping by annotations before aggregation) so I'd be +1 on adding a way
to pass `count` or `total` to `Paginator` and a way to clear annotations
but we'd have to document that the later is not panacea.
--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:7>