Follow-up to "contrib.admin is slow with large, complex datasets"

28 views
Skip to first unread message

Mart Somermaa

unread,
Apr 3, 2009, 8:03:08 AM4/3/09
to django-d...@googlegroups.com
(Sorry for top-posting, Google Groups is misbehaving.)

Jacob argued on #django-dev that you can already achieve the only()
behaviour by overriding ModelAdmin.queryset().

So I experimented with that. "The results were disappointing," as prof.
Denzil Dexter from The Fast Show used to say :) .

Adding the following to CAdmin

def queryset(self, request):
qs = super(CAdmin, self).queryset(request)
return qs.only('name', 'is_published',
'a', 'b').select_related()

results in correct SQL, but due to bug #10710 [1] fields are shuffled
and the resulting view is the worst possible: contents of *the large
text field* are displayed in the changelist instead of __unicode__ --
which strikes me as a cruel April fools trick, except today is April 3:
http://img3.imageshack.us/img3/4295/afterlimitqueryset.png

Ideally,

def queryset(self, request):
qs = super(CAdmin, self).queryset(request)
return qs.only('name', 'is_published',
'a', 'b').select_related('a__name', 'b__name')

should work, but due to the same bug [1] that throws
FieldDoesNotExist: A has no field named 'b'.

I expect that [1] is more or less easy to fix and overriding queryset()
in that manner will be eventually possible.

However, the conceptual question of whether we should do that by default
remains open. That is, should we really SELECT(*) if we know in advance
that only the fields in list_display will be displayed? Jacob argues
that current behaviour is OK, admin is "fast enough" and queryset() can
be used for advanced things -- which is a valid position, so I won't
campaign further for the "optimal by default" cause. But the "10 users
opening a default admin changelist page of 100 entries for objects that
contain 100 KB of data (100 MB of memory use), only to display e.g. the
couple of bytes in object title" case makes me still uncomfortable.

As an aside, note that queryset() is undocumented [2].

[1] http://code.djangoproject.com/ticket/10710
[2] http://code.djangoproject.com/ticket/10712

mrts

unread,
Apr 3, 2009, 8:21:47 AM4/3/09
to Django developers
Omission: the models and admin used in experiments have changed a bit
from the original post:

from django.db import models

class Base(models.Model):
name = models.CharField(max_length=10)
lots_of_text = models.TextField()

class Meta:
abstract = True

def __unicode__(self):
return self.name

class A(Base):
a_field = models.CharField(max_length=10)

class B(Base):
b_field = models.CharField(max_length=10)

class C(Base):
a = models.ForeignKey(A)
b = models.ForeignKey(B)
is_published = models.BooleanField()

---

from django.contrib import admin
from improve_admin.models import A, B, C

class CAdmin(admin.ModelAdmin):
list_display = ('name', 'a', 'b', 'is_published')
list_filter = ('a',)

def queryset(self, request):
qs = super(CAdmin, self).queryset(request)
return qs.only('name', 'is_published', 'a', 'b').select_related
().only('a__name')

admin.site.register(A)
admin.site.register(B)
admin.site.register(C, CAdmin)

On Apr 3, 3:03 pm, Mart Somermaa <m...@mrts.pri.ee> wrote:
> and the resulting view is the worst possible: contents of *the large
> text field* are displayed in the changelist instead of __unicode__ --

Correction: __unicode__ still uses C.name, but as #10710 kicks in, the
contents of C.lots_of_text is actually referenced by C.name, so that
is displayed.

mrts

unread,
Apr 5, 2009, 5:52:07 PM4/5/09
to Django developers
First, let me thank Malcolm for promptly fixing #10710. Unfortunately
#10733 is still barring the use of only() and select_related()
properly.

Given the models and admin defined above and the requirements that
* admin changelist view should perform only a single query,
* that should pull in only the fields that are actually used in
list_display,
the following should ideally work after #10733 is fixed:

def queryset(self, request):
qs = super(CAdmin, self).queryset(request)
return qs.only('name', 'is_published', 'a', 'b',
'a__name', 'b__name').select_related()

(actually, qs.only('name', 'is_published', 'a__name',
'b__name').select_related() would suffice.)

---

However, there's another bump on the Path to The One True Admin
Changelist Query.

If any of the related fields is nullable, e.g.

class C(Base):
a = models.ForeignKey(A, blank=True, null=True)
b = models.ForeignKey(B)
is_published = models.BooleanField()

, select_related() does not pull them in by default (as Karen kindly
pointed out in #10722).

Forcing it explicitly as follows

def queryset(self, request):
qs = super(CAdmin, self).queryset(request)
return qs.select_related('a', 'b')

has no effect, as ChangeList.get_query_set() blindly overrides it with
a plain select_related().

To get the desired behviour, three-valued logic is required for
ModelAdmin.list_select_related:
* True -- always call select_related(),
* False (the default) -- call select_related() if any of the fields
in list_display is a foreign key,
* None (added) -- leave the queryset alone.

The patch is backwards-compatible, quite simple and the result is a
single query as expected:

--- django/contrib/admin/views/main.py (revision 10405)
+++ django/contrib/admin/views/main.py (working copy)
@@ -200,6 +200,8 @@
# with a relationship.
if self.list_select_related:
qs = qs.select_related()
+ elif self.list_select_related is None:
+ pass
else:
for field_name in self.list_display:
try:

(validation changes omitted for clarity).

I've submitted the proposal in #10742 and am willing to drive it
further if it is accepted.
---

Finally, after #10733 and #10742 are fixed, the admin and The One True
Admin Changelist Query that satisfies the requirements is as follows:

class CAdmin(admin.ModelAdmin):
list_display = ('name', 'a', 'b', 'is_published')
list_select_related = None

def queryset(self, request):
qs = super(CAdmin, self).queryset(request)
return qs.only('name', 'is_published', # 'a', 'b' ?
'a__name', 'b__name').select_related('a', 'b')

Links:

#10733 -- http://code.djangoproject.com/ticket/10733
#10742 -- http://code.djangoproject.com/ticket/10742

Jacob Kaplan-Moss

unread,
Apr 5, 2009, 7:45:33 PM4/5/09
to django-d...@googlegroups.com
On Sun, Apr 5, 2009 at 4:52 PM, mrts <mr...@mrts.pri.ee> wrote:
> [...]

Can you please stop? We all get that you think these tickets are
important. They're on the milestone for 1.1, so they'll be fixed.
Nagging us here doesn't help get your tickets pushed to the front of
the queue.

Jacob

mrts

unread,
Apr 6, 2009, 4:22:09 AM4/6/09
to Django developers
On Apr 6, 2:45 am, Jacob Kaplan-Moss <jacob.kaplanm...@gmail.com>
wrote:
> Can you please stop? We all get that you think these tickets are
> important. They're on the milestone for 1.1, so they'll be fixed.
> Nagging us here doesn't help get your tickets pushed to the front of
> the queue.

I'm baffled. From which parts of my posts did you the impression that
I'm trying to push these somehow or nag you? Feel free to re-target
all the tickets under discussion to 1.2 or wontfix them with an
explanation if you feel this is appropriate.

I was specifically reporting that #10710 didn't fix things and asking
if #10742 makes sense, i.e. I was sharing information and soliciting
review on a theme that is useful for anyone working with complex
models in the admin. Nowhere do I see declarative or demanding
language in my posts.

Jacob, perhaps you are taking the *D* part of BDFL too seriously just
right now :) ?

Anyhow, no harm done and of course I stop -- there's nothing more to
say on this.

Russell Keith-Magee

unread,
Apr 6, 2009, 7:27:42 AM4/6/09
to django-d...@googlegroups.com
On Mon, Apr 6, 2009 at 4:22 PM, mrts <mr...@mrts.pri.ee> wrote:
>
> On Apr 6, 2:45 am, Jacob Kaplan-Moss <jacob.kaplanm...@gmail.com>
> wrote:
>> Can you please stop? We all get that you think these tickets are
>> important. They're on the milestone for 1.1, so they'll be fixed.
>> Nagging us here doesn't help get your tickets pushed to the front of
>> the queue.
>
> I'm baffled. From which parts of my posts did you the impression that
> I'm trying to push these somehow or nag you? Feel free to re-target
> all the tickets under discussion to 1.2 or wontfix them with an
> explanation if you feel this is appropriate.
>
> I was specifically reporting that #10710 didn't fix things and asking
> if #10742 makes sense, i.e. I was sharing information and soliciting
> review on a theme that is useful for anyone working with complex
> models in the admin. Nowhere do I see declarative or demanding
> language in my posts.

The reason Jacob asked you to stop is that posts of this nature do
nothing to assist in the process of closing tickets.

The Contribution FAQ [1] specifically covers this issue. From the last
point of [1]:

"Don’t post to django-developers just to announce that you have filed
a bug report. All the tickets are mailed to another list
(django-updates), which is tracked by developers and triagers, so we
see them as they are filed."

If you can demonstrate that a changeset hasn't closed a bug, and a
problem still exists, you should either reopen the relevant ticket in
Trac or open a new issue that describes the missing edge case. If you
find a completely new problem, it should be logged in Trac.

If you have a patch that you think will resolve a problem, attach it
to the ticket, and mark the ticket has having a patch. Ideally, this
should be a patch _with_ tests. A patch without tests is essentially
useless from a review perspective. Again, this is also covered in the
FAQ [2].

Posting to django-developers won't cause us to look at your tickets
any faster. We already have a list of things that we need to look at -
it's called the v1.1 todo list [3], and your tickets are already on
that list.

[1] http://docs.djangoproject.com/en/dev/internals/contributing/#id1
[2] http://docs.djangoproject.com/en/dev/internals/contributing/#patch-style
[3] http://code.djangoproject.com/query?status=new&status=assigned&status=reopened&group=component&milestone=1.1&order=priority

Yours,
Russ Magee %-)

Reply all
Reply to author
Forward
0 new messages