Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
pre_init/post_init vs. performance
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  8 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Jeremy Dunck  
View profile  
 More options Jul 13 2012, 10:57 pm
From: Jeremy Dunck <jdu...@gmail.com>
Date: Fri, 13 Jul 2012 19:57:44 -0700
Local: Fri, Jul 13 2012 10:57 pm
Subject: pre_init/post_init vs. performance
I was poking around in our (Votizen's) use of signals and thinking
about making some tooling so that signal usage was a bit more
transparent.

In doing so, I noticed that GenericForeignKey hooks the model pre_init
signal.  It does that because GFK needs a chance to munge kwargs from
the GFK field name to the content_id and object_id fields.

Signal dispatch with zero receivers is much much faster than with 1 or
more receivers, and pre_init is fired quite a lot.  Pretty much every
decent sized project I've worked on has used GFK, so I imagine this is
a large sap on performance overall.

Similarly, I just noticed that ImageField does a similar thing,
setting dimension values in post_init.

I think we could get a significant performance win if, instead of
using pre_init/post_init here, we changed Model.__init__ to call a
per-field hook that could munge the Model.__init__ kwargs.

SomeField.model_pre_init(instance, args, kwargs)
  or maybe for symmetry with contribute_to_class:
SomeField.contribute_to_instance(instance, args, kwargs)

We could store which fields have these hooks upon ModelBase.__new__
construction and so skip most fields and overhead in __init__.

I'm happy to make a pull request but checking here to see if there's
any pushback.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Anssi Kääriäinen  
View profile  
 More options Jul 14 2012, 4:27 am
From: Anssi Kääriäinen <anssi.kaariai...@thl.fi>
Date: Sat, 14 Jul 2012 01:27:22 -0700 (PDT)
Local: Sat, Jul 14 2012 4:27 am
Subject: Re: pre_init/post_init vs. performance
On 14 heinä, 05:57, Jeremy Dunck <jdu...@gmail.com> wrote:

A big +1 from me.

Maybe the fields could explicitly add methods to pre/post init hooks
in contribute_to_class instead of method autodiscovery?

The signal receiver caching approach is another alternative (https://
code.djangoproject.com/ticket/16679). The downside of that approach is
that it complicates the already complex signals framework, so for that
reason it seems better to get rid of the post/pre init signals.

 - Anssi


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Russell Keith-Magee  
View profile  
 More options Jul 14 2012, 9:37 pm
From: Russell Keith-Magee <russ...@keith-magee.com>
Date: Sun, 15 Jul 2012 09:37:28 +0800
Local: Sat, Jul 14 2012 9:37 pm
Subject: Re: pre_init/post_init vs. performance

+1 to this idea.

My only concern is:

> We could store which fields have these hooks upon ModelBase.__new__
> construction and so skip most fields and overhead in __init__.

I'm not sure if I'm misreading your intentions here, but just to be
sure -- I'd like to avoid explicitly naming certain field types as
"special" in ModelBase if at all possible. If we explicitly name
certain fields in ModelBase, then it means that third-parties with the
same requirement won't be able to exploit the same API entry point. In
the case of GenericForeignKey specifically, it also means that we
would be introducing a dependency (even if it was name-only) on
contrib into core -- which is something we've been trying to move away
from.

Russ %-)


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Andy McCurdy  
View profile  
 More options Jul 15 2012, 12:51 pm
From: Andy McCurdy <sed...@gmail.com>
Date: Sun, 15 Jul 2012 09:51:27 -0700
Local: Sun, Jul 15 2012 12:51 pm
Subject: Re: pre_init/post_init vs. performance

On Jul 14, 2012, at 6:37 PM, Russell Keith-Magee wrote:

> My only concern is:

>> We could store which fields have these hooks upon ModelBase.__new__
>> construction and so skip most fields and overhead in __init__.

> I'm not sure if I'm misreading your intentions here, but just to be
> sure -- I'd like to avoid explicitly naming certain field types as
> "special" in ModelBase if at all possible. If we explicitly name
> certain fields in ModelBase, then it means that third-parties with the
> same requirement won't be able to exploit the same API entry point. In
> the case of GenericForeignKey specifically, it also means that we
> would be introducing a dependency (even if it was name-only) on
> contrib into core -- which is something we've been trying to move away
> from.

I think Jeremy means doing something like this in ModelBase.__new__:

    meta.fields_needing_init = [field for field in fields if hasattr(field, 'model_pre_init')]

And in ModelBase.__init__:

    for field in self._meta.fields_needing_init:
        field.model_pre_init(*args, **kwargs)

This way, in ModelBase.__init__, only the fields that actually need to be initialized have this method called.

-andy


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jeremy Dunck  
View profile  
 More options Jul 15 2012, 2:07 pm
From: Jeremy Dunck <jdu...@gmail.com>
Date: Sun, 15 Jul 2012 11:07:38 -0700
Local: Sun, Jul 15 2012 2:07 pm
Subject: Re: pre_init/post_init vs. performance

That is indeed what I meant, but I think Anssi's probably right that
this belongs in contribute_to_class instead of a new adhoc thing in
ModelBase.__new__.

I'll take a swing at making this change.

Opened related ticket: https://code.djangoproject.com/ticket/18627


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jeremy Dunck  
View profile  
 More options Jul 18 2012, 6:48 am
From: Jeremy Dunck <jdu...@gmail.com>
Date: Wed, 18 Jul 2012 03:48:15 -0700
Local: Wed, Jul 18 2012 6:48 am
Subject: Re: pre_init/post_init vs. performance

On Sun, Jul 15, 2012 at 11:07 AM, Jeremy Dunck <jdu...@gmail.com> wrote:

> That is indeed what I meant, but I think Anssi's probably right that
> this belongs in contribute_to_class instead of a new adhoc thing in
> ModelBase.__new__.

> I'll take a swing at making this change.

> Opened related ticket: https://code.djangoproject.com/ticket/18627

OK, I have a very rough cut w/ all tests passing here:
https://github.com/jdunck/django/tree/18627_pre_init

I poked in a flag, sys.JDUNCK_NEW, which can be turned on/off for old
and new implementation - this is poked in at django.__init__

To test performance, using tests/test_sqlite modified with:

INSTALLED_APPS = [
    'regressiontests.model_fields'
]

and
DATABASES['default']['name'] = ':memory:'

Testing the pre_init hook:

timeit.Timer("""
for i in names:
    models.Person(name=i)
""", """
from django.core.management import call_command
from regressiontests.model_fields import models
call_command('syncdb')
names = [str(i) for i in range(1000)]
""").timeit(1000)

Under the existing signal usage: 24.609718084335327
Under the new approach avoiding signal usage: 18.074234008789062

I'd expect post_init to be a smaller gain since the destination
function has more overhead, but still a gain.

Note that the pre_init and post_init signals are still fired -- it's
just that django as-shipped won't use any listeners on them, so the
Model.__init__ performance improves.

In some cases, object construction currently is larger than query time
- a simple 'select x from y limit 1000' would have similar overhead to
this.  Also, note that this gain would be seen on all models in any
project using GFK or ImageField but no other pre_init or post_init
hooks - not just those models which have the fields.  With this
switch, only the models with the hooking fields have the related
overhead.

Feedback welcome.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jeremy Dunck  
View profile  
 More options Jul 27 2012, 11:58 am
From: Jeremy Dunck <jdu...@gmail.com>
Date: Fri, 27 Jul 2012 08:58:18 -0700
Local: Fri, Jul 27 2012 11:58 am
Subject: Re: pre_init/post_init vs. performance
Can I get a review of the branch?  25% performance improvement of
Model.__init__ in stock django seems worth landing. :)


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Anssi Kääriäinen  
View profile  
 More options Jul 28 2012, 4:49 pm
From: Anssi Kääriäinen <anssi.kaariai...@thl.fi>
Date: Sat, 28 Jul 2012 13:49:08 -0700 (PDT)
Local: Sat, Jul 28 2012 4:49 pm
Subject: Re: pre_init/post_init vs. performance
On 27 heinä, 18:58, Jeremy Dunck <jdu...@gmail.com> wrote:

> Can I get a review of the branch?  25% performance improvement of
> Model.__init__ in stock django seems worth landing. :)

I did some work to add new benchmarks to djangobench. In my tests the
speedup for the init hooks approach was nearly 60% when both pre/post
init signals were in use in the project. Currently this happens when
there are at least one ImageField and one GenericForeignKey used in
the project. The test is fetching 500 objects from DB by
the .iterator() method.

We should do something to this issue. If the init hooks approach is
the way to go needs to be decided. The caching of signal receivers
gives the same speedup.

Some more info in ticket #18627.

 - Anssi

PS: for another possible nice speedup see ticket https://code.djangoproject.com/ticket/16759
which is about removing the use of __deepcopy__ from queryset cloning.
There is potential for major speedups using that approach. Just one
example: model.save() +50% speedup.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »