* type: => Uncategorized
* ui_ux: => 0
* component: Core (Other) => Database layer (models, ORM)
* severity: => Normal
* easy: => 0
Comment:
But there is a real performance problem with the standard paginator.
For MVCC databases like postgres, performance is very poor for tables
above a couple million lines when doing a count(*), because postgres
doesn't stores enough information on indexes and will perform a seqscan.
The default Paginator seems to call count(*), this can bring a standard
server install to its knees.
My workaround os overriding _get_count to use something like:
SELECT reltuples FROM pg_class WHERE relname = 'myapp_mymodel';
May be the ORM should implement a "estimated_count" method for when you
don't need very accurate numbers. I would volunteer to send a patch but I
don't know how to perform a fast "estimated count" for all database
engines supported.
--
Ticket URL: <https://code.djangoproject.com/ticket/11287#comment:4>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Comment (by akaariai):
The downside of estimated tuples is that it can return wildly wrong
results (the PostgreSQL one mentioned above at least).
The preferred way for me would be to have a couple of different new
options for the paginator (pony requests if you wish):
max_pages:
- This would be useful in limiting the maximum page to say 20, and if
you have 25 objs per page, you would do "SELECT count(*) FROM (select 1
from real_query limit 500) tmp" instead of "SELECT count(*) FROM
real_query". This should be more than fast enough assuming you have proper
indexes in place.
Links-only:
- This would give you only "next page" and "previous page" links. The
trick is you can use a "SELECT objs FROM somepage WHERE id > last_page_id
ORDER BY id LIMIT 25" to fetch the next page. This is lightning-fast if
you have the proper indexes. This requires the ordering is done on an
unique condition.
Using the estimated_count() method would still leave you open to people
just going to page 10000, hit F5 a couple of times, and watch the DB
server burn.
This ticket is closed and should remain so. The above ideas might be worth
investigation. But not in this ticket.
--
Ticket URL: <https://code.djangoproject.com/ticket/11287#comment:5>
Comment (by trbs):
Just too add a little note. This estimation of total query count by using
the tables statistics (reltuples in postgresql) won't work even if it
would give back an a reasonable count all the time.
Since this will return the estimated amount of rows in a table not in
query. It's very usual for a query backing the pager to have several WHERE
clauses. eg: {{{ Blog.objects.filter(pub_date__lte=now, status="approved",
site=site) }}}
I really like the max pages idea, if page 20 still yields 25 results (or
retrieve 26 so we know for sure there more then 25) we can just add
another page number and go up one by one from there.
--
Ticket URL: <https://code.djangoproject.com/ticket/11287#comment:6>