[Django] #23771: Optimisation idea for Paginator object

19 views
Skip to first unread message

Django

unread,
Nov 6, 2014, 5:40:31 AM11/6/14
to django-...@googlegroups.com
#23771: Optimisation idea for Paginator object
----------------------------------------------+--------------------
Reporter: GP89 | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Database layer (models, ORM) | Version: 1.6
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
I had a thought - I don't know if this can be easily achieved but here it
is.

I have a view which uses the `Paginator` object to page results for a
table, one of the columns of the tables is a total gathered from summing
rows from 5 different tables, so the query has 5 joins to get these
values.

{{{#!python
run_objects =
Run.objects.filter(type__name=run_type).select_related("runend",
"browserlink__browser", "filecheck__filecheckrecord").\
annotate(total_griderror=Count("griderrorrecord", distinct=True),
total_gridhttperror=Count("gridhttperrorrecord",
distinct=True),
total_gridconnerror=Count("gridconnectionerrorrecord",
distinct=True),
total_storagehttperror=Count("storagehttperrorrecord",
distinct=True),
total_storageconnerror=Count("storageconnectionerrorrecord",
distinct=True)).order_by("-id")

paginator = Paginator(runs_objects, 15)
}}}

The `Paginator` object appears to do a Count query, to calculate the total
number of pages and then a select query later when a page number is
requested to get the data. The problem I'm seeing is that the count query
in this case is including the 5 joins which don't affect the count other
than making it a lot slower.

I was able to speed this up by performing the count for the `Paginator`
myself

{{{#!python
run_objects = Run.objects.filter(type__name=run_type)
total = run_objects.count()

run_objects = run_objects.select_related("runend", "browserlink__browser",
"filecheck__filecheckrecord").\
annotate(total_griderror=Count("griderrorrecord", distinct=True),
total_gridhttperror=Count("gridhttperrorrecord",
distinct=True),
total_gridconnerror=Count("gridconnectionerrorrecord",
distinct=True),
total_storagehttperror=Count("storagehttperrorrecord",
distinct=True),
total_storageconnerror=Count("storageconnectionerrorrecord",
distinct=True)).order_by("-id")

paginator = Paginator(runs_objects, 15)
paginator._count = total
}}}

I was wondering - would it be possible to do this automatically. Is there
a way to drop bits off the query for the count query (ie like annotations,
select_related, prefetches which would have no effect on the number of
rows returned) to speed up things up?

--
Ticket URL: <https://code.djangoproject.com/ticket/23771>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Nov 6, 2014, 8:43:04 PM11/6/14
to django-...@googlegroups.com
#23771: Optimisation idea for Paginator object
-------------------------------------+-------------------------------------
Reporter: GP89 | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.6
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: paginator | Patch needs improvement: 0
optimization performance | UI/UX: 0
Has patch: 0 |
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by manfre):

* keywords: => paginator optimization performance
* needs_better_patch: => 0
* needs_docs: => 0
* needs_tests: => 0
* stage: Unreviewed => Accepted


Comment:

There are probably some non-complex gains to be had, but if all else fails
it would be worth documenting the potential performance hit with an
optimization example.

--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:1>

Django

unread,
Nov 15, 2014, 6:35:00 AM11/15/14
to django-...@googlegroups.com
#23771: Optimisation idea for Paginator object
-------------------------------------+-------------------------------------
Reporter: GP89 | Owner: nobody

Type: | Status: new
Cleanup/optimization | Version: 1.6
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: paginator | Patch needs improvement: 0
optimization performance | UI/UX: 0
Has patch: 0 |
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by dbrgn):

* cc: mail@… (added)


Comment:

This sounds like a nice idea. I don't think we should try to do this
automatically though -- it's probably quite hard and the developer
probably knows better how to optimize a query. Maybe we could add an
additional `object_count` kwarg in the constructor that's specifically
documented for optimization purposes?

--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:2>

Django

unread,
Mar 27, 2015, 9:30:13 AM3/27/15
to django-...@googlegroups.com
#23771: Optimisation idea for Paginator object
-------------------------------------+-------------------------------------
Reporter: GP89 | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 1.6
(models, ORM) |
Severity: Normal | Resolution:
Keywords: paginator | Triage Stage: Accepted
optimization performance |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by adamchainz):

For the example queryset, I think just using `prefetch_related` will turn
the joins into extra queries and they won't then appear in the count
query.

--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:3>

Django

unread,
Jul 17, 2018, 12:55:43 PM7/17/18
to django-...@googlegroups.com
#23771: Optimisation idea for Paginator object
-------------------------------------+-------------------------------------
Reporter: GP89 | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 1.6
(models, ORM) |
Severity: Normal | Resolution:
Keywords: paginator | Triage Stage: Accepted
optimization performance |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Ramiro Morales):

#25375 and #29254 were related.

--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:4>

Django

unread,
Jul 17, 2018, 4:30:24 PM7/17/18
to django-...@googlegroups.com
#23771: Optimisation idea for Paginator object
-------------------------------------+-------------------------------------
Reporter: GP89 | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 1.6
(models, ORM) |
Severity: Normal | Resolution:
Keywords: paginator | Triage Stage: Accepted
optimization performance |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Ramiro Morales):

#28477 is related, although it's has a more general scope (it's about the
`Queryset.count()` method instead of just the Paginator) and proposes to
work in reduction of unused annotations.

--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:5>

Django

unread,
Sep 1, 2020, 2:05:31 AM9/1/20
to django-...@googlegroups.com
#23771: Optimisation idea for Paginator object
-------------------------------------+-------------------------------------
Reporter: GP89 | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 1.6
(models, ORM) |
Severity: Normal | Resolution:
Keywords: paginator | Triage Stage: Accepted
optimization performance |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Josh Smeaton):

Adding my +1 to this, as it makes it extremely difficult to optimise
readonly annotations in the admin. As an example:

{{{

class MyAdmin(admin.ModelAdmin):
list_display = ("f1", "f2", "related_count")

def related_count(self, obj):
return obj.related_count

def get_queryset(self, request):
return
super().get_queryset(request).annotate(related_count=Count("related"))

}}}

If QuerySet had a way to remove annotations like select_related(None) or
order_by(), then we could implement a Paginator like so:

{{{

class BetterCountPaginator(Paginator):
@cached_property
def count(self):
return self.object_list.annotate(None).count()

}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:6>

Django

unread,
Sep 1, 2020, 11:18:09 AM9/1/20
to django-...@googlegroups.com
#23771: Optimisation idea for Paginator object
-------------------------------------+-------------------------------------
Reporter: GP89 | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 1.6
(models, ORM) |
Severity: Normal | Resolution:
Keywords: paginator | Triage Stage: Accepted
optimization performance |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

Doing this automatically and correctly is really tricky as #28477 proved
(think of annotations spanning multi-valued relationships, `values`
grouping by annotations before aggregation) so I'd be +1 on adding a way
to pass `count` or `total` to `Paginator` and a way to clear annotations
but we'd have to document that the later is not panacea.

--
Ticket URL: <https://code.djangoproject.com/ticket/23771#comment:7>

Reply all
Reply to author
Forward
0 new messages