Improve Performance in Admin ManyToMany


Timothy W. Cook

May 13, 2015, 12:32:50 PM
to django...@googlegroups.com
I have a model with 13 M2M relations and some of those have a few thousand instances. 
This renders rather slowly in the Admin.

Thinking about improvements, I wonder whether it would help to set up prefetch_related queries inside a formfield_for_manytomany method?


​I haven't tried it yet and am not even sure how to go about it.  But if experienced developers think it will work, I'll give it a shot.

Thoughts? ​


============================================
Timothy Cook
LinkedIn Profile:http://www.linkedin.com/in/timothywaynecook

Tim Graham

May 13, 2015, 12:58:04 PM
to django...@googlegroups.com
Are you sure it's the query that's slow and not the template rendering and/or JavaScript performance?

Timothy W. Cook

May 14, 2015, 7:29:36 AM
to django...@googlegroups.com
It isn't that the individual queries are very slow, but that there are so many of them. Attached is a screenshot of DjDT (Django Debug Toolbar).
I see that I am not using a cache at all. The screenshot was taken after reloading the same standard Django admin/change_form.html.


--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/58f721ad-2ee4-4016-ac6f-b48661c4ce5b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Screenshot from 2015-05-14 08:21:05.png

Simon Charette

May 14, 2015, 11:14:24 AM
to django...@googlegroups.com
It's a bit hard to tell without your model definition and the executed queries, but is it possible that the string representation method (__str__, __unicode__) of one of the models referenced by an M2M accesses another related model during construction?

e.g.

from django.db import models

class Foo(models.Model):
    pass

class Bar(models.Model):
    foo = models.ForeignKey(Foo)

    def __str__(self):
        return "Foo %s" % self.foo

class Baz(models.Model):
    bars = models.ManyToManyField(Bar)

Here the Bar model uses the related Foo model to build its string representation, which results in N + 1 queries (where N is the number of Bar objects) if the Baz.bars field is displayed in the admin. To make sure only one query is executed, you must make sure Baz.bars are selected with their related Foo objects:

from django.contrib import admin

class BazAdmin(admin.ModelAdmin):
    def formfield_for_many_to_many(self, db_field, *args, **kwargs):
        formfield = super(BazAdmin, self).formfield_for_many_to_many(db_field, *args, **kwargs)
        if db_field.name == 'baz':
            formfield.queryset = formfield.queryset.select_related('foo')
        return formfield
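
For readers who want to see the N + 1 pattern in isolation, here is a minimal sketch that reproduces it with the standard-library sqlite3 module instead of Django. The foo and bar tables, the two helper functions, and count_selects are hypothetical stand-ins for the Foo and Bar models above; the JOIN only approximates what select_related is expected to do and is not Django's actual SQL.

```python
import sqlite3

# Hypothetical tables standing in for the Foo and Bar models (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER REFERENCES foo(id));
""")
conn.executemany("INSERT INTO foo (id, name) VALUES (?, ?)",
                 [(i, "foo-%d" % i) for i in range(1, 6)])
conn.executemany("INSERT INTO bar (id, foo_id) VALUES (?, ?)",
                 [(i, i) for i in range(1, 6)])
conn.commit()

# Record every SQL statement executed from this point on.
statements = []
conn.set_trace_callback(statements.append)


def labels_n_plus_one():
    """One query for the bars, then one extra query per bar for its foo."""
    labels = []
    rows = conn.execute("SELECT id, foo_id FROM bar ORDER BY id").fetchall()
    for _bar_id, foo_id in rows:
        (name,) = conn.execute(
            "SELECT name FROM foo WHERE id = ?", (foo_id,)).fetchone()
        labels.append("Foo %s" % name)
    return labels


def labels_joined():
    """A single query that fetches each bar together with its foo."""
    rows = conn.execute(
        "SELECT foo.name FROM bar JOIN foo ON foo.id = bar.foo_id "
        "ORDER BY bar.id").fetchall()
    return ["Foo %s" % name for (name,) in rows]


def count_selects(fn):
    """Run fn and return (result, number of SELECT statements it issued)."""
    del statements[:]
    result = fn()
    selects = [s for s in statements if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)


if __name__ == "__main__":
    print(count_selects(labels_n_plus_one)[1], "queries without the join")
    print(count_selects(labels_joined)[1], "queries with the join")
```

With five bar rows, the first helper issues one query for the list plus one per row, while the joined version issues a single query and yields the same labels.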

Simon

Simon Charette

May 14, 2015, 11:15:49 AM
to django...@googlegroups.com
The last example should read:

if db_field.name == 'baz':

Simon Charette

May 14, 2015, 11:16:38 AM
to django...@googlegroups.com
I meant

if db_field.name == 'bars':

Sorry for noise...

Timothy W. Cook

May 14, 2015, 4:19:43 PM
to django...@googlegroups.com
That is exactly the problem, Simon. Every one of those models references a model called Project. I did this so that when the items are displayed in the selects, the user knows which project each one is from. In the interim I guess I'll remove the call to Project from the __str__.

I wonder if there is another approach that I can use to solve this?  

Thanks,
Tim



Erik Cederstrand

May 15, 2015, 6:26:00 AM
to Django Users

> On 14/05/2015 at 22:19, Timothy W. Cook <t...@mlhim.org> wrote:
>
> That is exactly the problem, Simon. Every one of those models references a model called Project. I did this so that when the items are displayed in the selects, the user knows which project each one is from. In the interim I guess I'll remove the call to Project from the __str__.
>
> I wonder if there is another approach that I can use to solve this?

Does the suggestion to append select_related() / prefetch_related() to the queryset in your admin view not work?

Erik

Timothy W. Cook

May 15, 2015, 2:56:37 PM
to django...@googlegroups.com
On Fri, May 15, 2015 at 7:25 AM, Erik Cederstrand <erik+...@cederstrand.dk> wrote:

> I wonder if there is another approach that I can use to solve this?

> Does the suggestion to append select_related() / prefetch_related() to the queryset in your admin view not work?

If I implemented it correctly, it doesn't have any effect. I get the same number of queries reported by DjDT and effectively the same load time.

The model called Cluster is the most egregious.  
Model:

class Cluster(Item):
    """
    The grouping variant of Item, which may contain further instances of Item, in an ordered list. This
    provides the root Item for potentially very complex structures.
    """
    cluster_subject = models.CharField(_('cluster subject'),max_length=110, help_text="Enter a text name for this subject of this cluster.")
    clusters = models.ManyToManyField('Cluster',help_text="Select zero or more Clusters to include in this Cluster. You cannot put a Cluster inside itself, it will be ignored if you select itself.", blank=True)
    dvboolean = models.ManyToManyField(DvBoolean, related_name='%(class)s_related', help_text="Select zero or more booleans to include in this Cluster.", blank=True)
    dvuri = models.ManyToManyField(DvURI, related_name='%(class)s_related', help_text="Select zero or more uris to include in this Cluster.", blank=True)
    dvstring = models.ManyToManyField(DvString, related_name='%(class)s_related', help_text="Select zero or more strings to include in this Cluster.", blank=True)
    dvcodedstring = models.ManyToManyField(DvCodedString, related_name='%(class)s_related', help_text="Select zero or more coded strings to include in this Cluster.", blank=True)
    dvidentifier = models.ManyToManyField(DvIdentifier, related_name='%(class)s_related', help_text="Select zero or more identifiers to include in this Cluster.", blank=True)
    dvparsable = models.ManyToManyField(DvParsable, related_name='%(class)s_related', help_text="Select zero or more parsables to include in this Cluster.", blank=True)
    dvmedia = models.ManyToManyField(DvMedia, related_name='%(class)s_related', help_text="Select zero or more media items to include in this Cluster.", blank=True)
    dvordinal = models.ManyToManyField(DvOrdinal, related_name='%(class)s_related', help_text="Select zero or more ordinals to include in this Cluster.", blank=True)
    dvcount = models.ManyToManyField(DvCount, related_name='%(class)s_related', help_text="Select zero or more counts to include in this Cluster.", blank=True)
    dvquantity = models.ManyToManyField(DvQuantity, related_name='%(class)s_related', help_text="Select zero or more quantity items to include in this Cluster.", blank=True)
    dvratio = models.ManyToManyField(DvRatio, related_name='%(class)s_related', help_text="Select zero or more ratios to include in this Cluster.", blank=True)
    dvtemporal = models.ManyToManyField(DvTemporal, related_name='%(class)s_related', help_text="Select zero or more temporal items to include in this Cluster.", blank=True)

    def __str__(self):
        return self.prj_name.prj_name + ":" + self.cluster_subject




Model Admin:

class ClusterAdmin(admin.ModelAdmin):
    list_filter = ['prj_name__rm_version__version_id','prj_name',]
    search_fields = ['cluster_subject','ct_id']
    ordering = ['prj_name','cluster_subject']
    actions = [make_published, unpublish, copy_cluster]
    readonly_fields = ['published','schema_code','xqr_code','xqw_code',]
    filter_horizontal = ['dvboolean','dvuri','dvstring','dvcodedstring','dvidentifier','clusters','dvparsable','dvmedia','dvordinal','dvtemporal','dvcount','dvquantity','dvratio',]
    form = ClusterAdminForm

    def formfield_for_many_to_many(self, db_field, *args, **kwargs):
        formfield = super(ClusterAdmin, self).formfield_for_many_to_many(db_field, *args, **kwargs)
        if db_field.name in ['cluster','dvboolean','dvuri','dvstring','dvcodedstring','dvidentifier','dvparsable','dvmedia',
                             'dvordinal','dvcount','dvquantity','dvratio','dvtemporal']:
            formfield.queryset = formfield.queryset.select_related('project')
        return formfield



​Each of the ManyToMany references have this in their model:


    def __str__(self):
        return self.prj_name.prj_name + ":" + self.data_name


​Thoughts?


--Tim​

 

Erik Cederstrand

May 16, 2015, 4:28:49 PM
to Django Users

> On 15/05/2015 at 20:54, Timothy W. Cook <t...@mlhim.org> wrote:
>
>     def formfield_for_many_to_many(self, db_field, *args, **kwargs):
>         formfield = super(ClusterAdmin, self).formfield_for_many_to_many(db_field, *args, **kwargs)
>         if db_field.name in ['cluster','dvboolean','dvuri','dvstring','dvcodedstring','dvidentifier','dvparsable','dvmedia',
>                              'dvordinal','dvcount','dvquantity','dvratio','dvtemporal']:
>             formfield.queryset = formfield.queryset.select_related('project')
>         return formfield
>
> Each of the ManyToMany references has this in its model:
>
>     def __str__(self):
>         return self.prj_name.prj_name + ":" + self.data_name

Are you sure you don't mean

formfield.queryset.select_related('prj_name')

If 'prj_name' is the FK on your m2m models, then that's what should be passed to select_related()

Django 1.8 should catch this for you, if 'project' isn't also a FK on your model.

Erik

Timothy W. Cook

May 16, 2015, 6:00:14 PM
to django...@googlegroups.com
​Project is the model used as the foreign key. ​
But yes, I suppose that prj_name is the correct field name to use there. 
However, I still have 4520 queries executing.  Before that change it was 4521.  
Strange that it would drop by only one query.  I changed it back to 'project' just to be sure I remembered correctly and yes it went back to 4521. 

The queries look like this:

SELECT "ccdgen_cluster"."id" FROM "ccdgen_cluster" INNER JOIN "ccdgen_cluster_clusters" ON ( "ccdgen_cluster"."id" = "ccdgen_cluster_clusters"."to_cluster_id" ) INNER JOIN "ccdgen_project" ON ( "ccdgen_cluster"."prj_name_id" = "ccdgen_project"."prj_name" ) WHERE "ccdgen_cluster_clusters"."from_cluster_id" = 209 ORDER BY "ccdgen_project"."prj_name" ASC, "ccdgen_cluster"."cluster_subject" ASC

...


Most take between .35 and .9 seconds.
One of them, however, took 17 seconds. I have no idea why, except possibly memory swapping or something similar at that point.
It looks just like many of the others:

SELECT ••• FROM "ccdgen_dvtemporal" INNER JOIN "ccdgen_project" ON ( "ccdgen_dvtemporal"."prj_name_id" = "ccdgen_project"."prj_name" ) ORDER BY "ccdgen_project"."prj_name" ASC, "ccdgen_dvtemporal"."data_name" ASC

These are all times measured with the dev server on a laptop. The queries do not take that long on a small AWS instance,
but the page is still too slow to be friendly to most users.

Thanks for any other insight. 

--Tim





 

Erik Cederstrand

May 17, 2015, 3:21:10 PM
to Django Users

> On 15/05/2015 at 20:54, Timothy W. Cook <t...@mlhim.org> wrote:
>
>     def formfield_for_many_to_many(self, db_field, *args, **kwargs):
>         formfield = super(ClusterAdmin, self).formfield_for_many_to_many(db_field, *args, **kwargs)
>         if db_field.name in ['cluster','dvboolean','dvuri','dvstring','dvcodedstring','dvidentifier','dvparsable','dvmedia',
>                              'dvordinal','dvcount','dvquantity','dvratio','dvtemporal']:
>             formfield.queryset = formfield.queryset.select_related('project')
>         return formfield

I just noticed you have a typo:

if db_field.name in ['cluster', ...

should be:

if db_field.name in ['clusters', ...

according to your model definition.

Erik

Timothy W. Cook

May 17, 2015, 7:54:24 PM
to django...@googlegroups.com
On Sun, May 17, 2015 at 4:20 PM, Erik Cederstrand <erik+...@cederstrand.dk> wrote:

I just noticed you have a typo:

    if db_field.name in ['cluster', ...

should be:

    if db_field.name in ['clusters', ...

according to your model definition.


​Thanks Erik. Yes, that was a typo.  Unfortunately fixing it doesn't change anything though. ​  :-(




 

Timothy W. Cook

May 20, 2015, 7:47:24 AM
to django...@googlegroups.com
SOLVED:  

The correct method name is formfield_for_manytomany, not formfield_for_many_to_many. Because the misnamed method did not override anything, the admin never called it.

    def formfield_for_manytomany(self, db_field, *args, **kwargs):
        formfield = super(ClusterAdmin, self).formfield_for_manytomany(db_field, *args, **kwargs)
        if db_field.name in ['dvboolean','dvuri','dvstring','dvcodedstring','dvidentifier','clusters','dvparsable','dvmedia','dvordinal','dvtemporal','dvcount','dvquantity','dvratio',]:
            formfield.queryset = formfield.queryset.select_related('prj_name')
        return formfield

This reduced the number of queries from about 4,520 to 33 and the time to 182 ms.
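
For anyone who hits the same symptom (a hook that "has no effect"), here is a minimal plain-Python sketch of why a misspelled override fails silently. BaseAdmin, build_form, MisnamedAdmin, and CorrectAdmin are hypothetical stand-ins written for illustration; they are not Django's ModelAdmin, and the returned strings merely mark which method ran.

```python
class BaseAdmin:
    """Hypothetical stand-in for a framework base class (not Django's ModelAdmin)."""

    def formfield_for_manytomany(self, field_name):
        # Default hook: describes an unoptimized queryset.
        return "%s: plain queryset" % field_name

    def build_form(self, field_names):
        # The framework looks the hook up by this exact name and no other.
        return [self.formfield_for_manytomany(name) for name in field_names]


class MisnamedAdmin(BaseAdmin):
    # Extra underscores: this adds a brand-new method that nothing ever calls,
    # so Python raises no error and the default hook keeps running.
    def formfield_for_many_to_many(self, field_name):
        return "%s: select_related queryset" % field_name


class CorrectAdmin(BaseAdmin):
    # Same name as the base hook, so this really overrides it.
    def formfield_for_manytomany(self, field_name):
        return "%s: select_related queryset" % field_name


if __name__ == "__main__":
    print(MisnamedAdmin().build_form(["dvboolean"]))
    print(CorrectAdmin().build_form(["dvboolean"]))
```

A quick way to check a real admin class along the same lines is to put a print or breakpoint inside the overriding method and confirm it actually runs when the change form loads.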

Awesome, thanks for the pointers. 

