* status: closed => reopened
* severity: => Normal
* cc: boxm@… (added)
* type: => Uncategorized
* version: 0.96 => SVN
* easy: => 0
* resolution: wontfix =>
Comment:
Replying to [comment:17 lrekucki]:
> Replying to [comment:16 kmike]:
> > I think this can be marked as 'wontfix' because custom paginators can
be used in admin now:
> >
http://docs.djangoproject.com/en/dev/ref/contrib/admin/#django.contrib.admin.ModelAdmin.paginator
>
> Agreed :) Here's a link to a sample [http://code.google.com/p/django-
pagination/source/browse/trunk/pagination/paginator.py?r=47 Paginator that
doesn't use count]
Reopening, as the resolution simply doesn't work. The example paginator
given by kmike simply doesn't work as a ModelAdmin paginator, because the
{% pagination %} tag and the ChangeList class (among others) assume you
can count the number of items being paginated. See my comment on #4065 for
the gory details.
There are scenarios where you don't know how many results are being
paginated, but can still do useful pagination links:
- With InnoDB tables in MySQL, count(*) is frighteningly expensive, so
while in principle you can count, in practice you can't
- With NoSQL databases it can be impossible to do a count, but you can
still paginate by range queries
A proposed design for such a paginator would be to display (on page 12):
1 2 3 ..... 10 11 '''12''' ... Next
which is better than just next/prev.
I'll see if I can create acceptable patches.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:18>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_docs: 0 => 1
* has_patch: 0 => 1
* needs_tests: 0 => 1
* stage: Accepted => Design decision needed
Comment:
OK, here's an attempt at a patch that fixes this.
I've added a new UncountedPaginator to pagination.py, and updated the
admin code to check whether the paginator has the appropriate attributes
before using them, falling back on alternative implementations if not.
Before I create test & doc updates, would like some feedback on whether
this patch is going about things the right way and is something that might
be accepted.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:19>
* ui_ux: => 0
* type: Uncategorized => New feature
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:20>
Comment (by charles@…):
This is not just MySQL. PostgreSQL has a separate query for cheap, inexact
counts (with statistics collected during the most recent ANALYZE) -- but
exact counts are expensive; I have a table where a `SELECT COUNT(*)` is
typically running upward of 50 seconds.
Perhaps on PostgreSQL we could run a query like `select reltuples from
pg_class where relname=?;`, and use the traditional pagination system if
the result is under 20,000 or so, or a newer one otherwise?
See also http://wiki.postgresql.org/wiki/Slow_Counting
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:21>
* cc: kmike84@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:22>
* stage: Design decision needed => Accepted
Comment:
Switching back to 'accepted' - I think the switch to DDN was in error and
simply sent this back into the icebox even though it was previously
accepted by a core dev and has a draft patch.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:23>
Comment (by anonymous):
The issue is not restricted to pagination. Seems like the admin interface
calls count(*) at several other places.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:24>
* cc: gonz (removed)
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:25>
Comment (by nova77):
Apparently count(*) is called on the whole table even if you filter your
view. For instance if you add "?id__in=555,666", you'll have two select
count(*): one with the "where" clause and a plain one which runs on the
whole table.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:26>
Comment (by akaariai):
Can you track down where the unrestricted qs.count() is issued, and if
there is a reason for it?
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:27>
Comment (by aaugustin):
The admin displays the number of matching objects and the total number of
objects.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:28>
* cc: slav0nic0@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:29>
* cc: dan.fairs@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:30>
Comment (by bb@…):
+1 because I ran into this problem today.
Using PostgreSQL and django 1.4 with a table that contains about 15
million rows. executing "select count(*) from mytable" in the psql console
takes about 40 seconds, even with an index on the primary key. This leads
to timeouts and huge lag.
I'm going to try MySQL to see if this is any better.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:31>
Comment (by craig.labenz@…):
Using a table of 130M+ rows and thus obviously running into the same
problem. The COUNT(*) query takes literally 10+ minutes to run in InnoDB-
land.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:33>
Comment (by jonathan_livni):
When django is hosted on Heroku, there's a time limit of 30 seconds for
any web request. So instead of admin pages taking a very long time to
load, you get admin pages which never load.
[[Image(http://s2.postimg.org/onfy59xuh/Capture.jpg)]]
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:34>
* owner: nobody => oinopion
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:35>
Comment (by oinopion):
I'm working on this feature in my branch here:
https://github.com/oinopion/django/compare/nocountpagination
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:36>
* cc: markus.magnuson@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:37>
Comment (by Cerin):
+1 for this feature. I have a few databases with tables containing tens of
millions of records, and browsing them in Django is painfully slow, even
with caching and extensive indexing. These sections of admin takes
*minutes* to load, and DDT is telling me that around 90% of the load time
is just from a single "SELECT COUNT(*) ..." to display the total result
count, which I don't care about.
Yes, both PostgreSQL and MySQL have tricks for approximate counts, but
only for total tables sizes, which doesn't help once you begin filtering.
And can confirm that using InfinitePaginator from django-pagination and
specifying a custom paginator=InfinitePaginator in your modeladmin does
not work because admin still expects to receive a valid count, which
causes InfinitePaginator to throw NotImplementedError exceptions.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:38>
Comment (by Cerin):
See here for some interesting workarounds of Django's inefficient result
count mechanism: http://craiglabenz.me/2013/06/12/how-i-made-django-admin-
scale/
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:39>
* cc: t.chaumeny@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:40>
* needs_docs: 1 => 0
* needs_tests: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:41>
* owner: oinopion => tchaumeny
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:42>
Comment (by aaugustin):
Patch is here: https://github.com/django/django/pull/3260
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:43>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"17557d068c43bd61cdc6c18caf250ffa469414a1"]:
{{{
#!CommitTicketReference repository=""
revision="17557d068c43bd61cdc6c18caf250ffa469414a1"
Fixed #8408 -- Added ModelAdmin.show_full_result_count to avoid COUNT()
query.
Thanks lidaobing for the suggestion.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:44>
Comment (by Josh Smeaton):
I don't think this ticket has been properly or fully fixed. The patch that
landed only affects the display of filtered results. A count(*) is still
executed every time you hit a page, which leads to timeouts when tables
grow large.
I think the idea of using a custom paginator is probably a good one, but
that could probably be controlled by an option in model admin or
something. I'm not reopening this ticket until I've got some feedback
first.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:45>
* status: closed => new
* resolution: fixed =>
Comment:
As per Josh's comment a {{{count(*)}}} is still executed by the paginator
when using the admin, even with {{{show_full_result_count=False}}}. This
takes multiple seconds when querying tens of millions of rows. You can
hack around the count(*) query by implementing a custom Paginator with
either an [https://medium.com/squad-engineering/estimated-counts-for-
faster-django-admin-change-list-963cbf43683e appromixiate] or a fixed
count.
{{{
from django.core.paginator import Paginator
class FixedCountPaginator(Paginator):
@property
def count(self):
return 1000
class MyModelAdmin(admin.ModelAdmin):
show_full_result_count=False
paginator = FixedCountPaginator
}}}
I'd like to fix this properly by either paginating without requesting the
number of rows / pages or disabling pagination.
{{{
class MyModelAdmin(admin.ModelAdmin):
show_full_result_count=False
no_count = False
}}}
I've put up a [https://github.com/django/django/pull/8858/files pull
request] to demonstrate the changes required to achieve this. It changes
the pagination widget in the admin to display [Prev, PAGE, Next]. Changing
this to a `show_pagination` field that disables pagination would also
solve the {{{count(*)}}} issue and may be more consistent / straight
forward.
{{{
class MyModelAdmin(admin.ModelAdmin):
show_full_result_count=False
show_pagination = True
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:46>
* cc: josh.smeaton@… (added)
* needs_better_patch: 0 => 1
* needs_tests: 0 => 1
Comment:
I spoke to Matthew at the pycon sprints and agreed that we should open
this one back up if we had a PR to demonstrate. The PR needs some work
still. I'll take this to the Mailing List.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:47>
* cc: Ionel Cristian Mărieș (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:48>
Comment (by Jerome Leclanche):
I've been using the following for a while:
{{{
#!python
class EstimatedCountPaginator(Paginator):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.object_list.count = self.count
@cached_property
def count(self):
if self.object_list.query.where:
return self.object_list.count()
db_table = self.object_list.model._meta.db_table
cursor = connections[self.object_list.db].cursor()
cursor.execute("SELECT reltuples FROM pg_class WHERE
relname = %s", (db_table, ))
result = cursor.fetchone()
if not result:
return 0
return int(result[0])
}}}
At the very least I believe such a paginator should be available by
default in `django.contrib.postgres`.
But going beyond that, the problem is that users are not always able to
modify the affected admin objects. So there should be some way of setting
the *default* paginator for all admin objects, I think.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:49>
Comment (by michi88):
What's wrong with just always calling `len(self.object_list)`?
Using this saves me from the additional count of the filtered objects.
This seems to work as the queryset is evaluated/fetched and cached on the
`len(...)` call, after that another fetch is not needed to render the
items in the admin list.
{{{
#!python
class LenCountPaginator(Paginator):
@cached_property
def count(self):
return len(self.object_list)
}}}
This is how de default Paginator does it:
{{{
#!python
@cached_property
def count(self):
"""
Returns the total number of objects, across all pages.
"""
try:
return self.object_list.count()
except (AttributeError, TypeError):
# AttributeError if object_list has no count() method.
# TypeError if object_list.count() requires arguments
# (i.e. is of type list).
return len(self.object_list)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:50>
Comment (by Tim Graham):
`.count()` is called on the unfiltered queryset so calling `len()` on it
would fetch all objects from the database, not just those on the current
page.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:51>
Comment (by michi88):
I indeed spoke too early. This did work for me as I combine it with
`django-postgres-fuzzycount` as well which takes care of the unfiltered
total count:
https://github.com/stephenmcd/django-postgres-
fuzzycount/blob/master/fuzzycount.py
{{{
#!python
class FuzzyCountAdmin(ReplicaModelAdmin):
"""
Models must have:
fuzzy_count_manager = FuzzyCountManager()
"""
def get_queryset(self, request):
if hasattr(self.model, 'fuzzy_count_manager') and request.method
== 'GET':
qs = self.model.fuzzy_count_manager.get_queryset()
else:
qs = self.model._default_manager.get_queryset()
# taken from Django source
# TODO: this should be handled by some parameter to the
ChangeList.
ordering = self.get_ordering(request)
if ordering:
qs = qs.order_by(*ordering)
return qs
}}}
This also works for the filtered requests:
{{{
#!python
class LenCountPaginator(Paginator):
@cached_property
def count(self):
if self.object_list.query.where or getattr(self.object_list.query,
'having', None):
return len(self.object_list)
return self.object_list.count()
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:52>
Comment (by michi88):
Again spoke to early, sorry for the spam, it somehow only works for one
particular admin view I have. Trying to figure out why that is the case.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:53>
Comment (by michi88):
Ok, it only works when the number of total rows in the database is >
`list_per_page`. When they are less I see multiple calls to
`len(self.object_list)`. Which I don't mind as for those views performance
is a non-issue.
Nonetheless, this is all highly unreliable as it relies on the queryset
cache also being used for rendering the actual list of objects.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:54>
Comment (by TapanGujjar):
Is it possible to have a separate table that keeps the count of all the
rows in other tables? I am new here and don't know how much work.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:55>
Comment (by Brett Higgins):
In addition to slow COUNT queries, I've just noticed that the listing page
calls `len()` on a potentially slow queryset:
On my listing page, I have a couple nested lookups, so this becomes a
complex query with four joins, and it can take tens of seconds to run for
around 600k records. With too many more records than that, I'm running
afoul of AWS Lambda timeouts (30s, just like the Heroku user mentioned
above).
All this to say - any solution to this slow COUNT issue should also
address (avoid) this slow query as well.
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:56>
* cc: Ondřej Chmelař (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:57>
* cc: elonzh (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:58>
Comment (by msphn):
I couldn't find any good solution so I build myself a custom paginator.
My database is mysql using innodb, I made use of innodb to get a fast and
precise enough count of rows. Please be aware, that this might not be
precise enough for everyone's usecase.
My paginator shouldn't be understood as a patch, but it's an adequate
workaround to speed up your django admin. I had no problem with that, and
I can now easily browse my 5 million+ datasets.
{{{#!python
from django.core.paginator import Paginator
from django.utils.functional import cached_property
from django.db import connection
class MysqlInnoDbPaginator(Paginator):
@cached_property
def count(self):
with connection.cursor() as cursor:
cursor.execute(
"SELECT TABLE_ROWS FROM information_schema.tables WHERE
TABLE_SCHEMA = %s AND TABLE_NAME = %s;",
[connection.settings_dict['NAME'],
self.object_list.model._meta.db_table]
)
row = cursor.fetchone()
if row is not None:
return row[0]
else:
return 0
}}}
Add this to your admin.ModelAdmin implementation. The
show_full_result_count is very important, otherwise another count would
happen.
{{{#!python
class FoobarAdmin(admin.ModelAdmin):
paginator = MysqlInnoDbPaginator
show_full_result_count = False
}}}
Also be aware, if you have multiple database configurations, you should
modify it so it would find the right database name
--
Ticket URL: <https://code.djangoproject.com/ticket/8408#comment:59>