signals
Brian Harring  
From: Brian Harring <ferri...@gmail.com>
Date: Sun, 10 Jun 2007 09:07:59 -0700
Local: Sun, Jun 10 2007 12:07 pm
Subject: signals

Curious, how many folks are actually using dispatch at all?

For my personal usage, I'm actually not using any of the hooks- I
suspect most folks aren't either.  That said, I'm paying a fairly
hefty price for them.

With Model.__init__'s send left in for 53.3k record instantiation
(just a walk of the records), time required is 9.2s.  Without the
send, takes 7.0s.  Personally, I'd like to get that quarter of the
time slice back. :)

Via ticket 3439, I've already gone after dispatch to try and speed it
up- I probably can wring a bit more speed out of it, but the
improvements won't come anywhere near reclaiming the 2.2s from above.

What is left, is flat out removing the send invocations if they're
not needed- specifically shifting the send calls out of __init__, and
wrapping __init__ on the fly *when* something tries to connect to it.

Effectively,

from django.dispatch.dispatcher import connect, disconnect
from django.db.models.signals import pre_init
from django.db.models import Model

class m(Model): pass

callback = lambda *a, **kw:None

# nothing is listening yet: __init__ is the plain, send-free version
assert m.__init__ is Model.__init__
connect(callback, sender=m, signal=pre_init)
# a listener exists: __init__ has been wrapped to fire pre_init
assert m.__init__ is not Model.__init__

disconnect(callback, sender=m, signal=pre_init)
# last listener gone: the wrapper is removed again
assert m.__init__ is Model.__init__

The pro of this is that the slowdown is limited to *only* the
instances where something is known to be listening- listening to model
class foo doesn't slow down class bar basically; listening to save
doesn't slow down __init__, etc.
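
A minimal sketch of the wrap-on-connect idea, written as standalone modern Python rather than against Django. The Signal class, the _own_init bookkeeping dict and the toy Model are all assumptions made for illustration; this is not the implementation from ticket 3439.

```python
import functools


class Signal:
    """Toy signal holding receivers per sender class (hypothetical, not Django's API)."""

    def __init__(self):
        self.receivers = {}  # sender class -> list of callables

    def send(self, sender, **named):
        for receiver in self.receivers.get(sender, []):
            receiver(sender=sender, **named)


pre_init = Signal()
_own_init = {}  # sender class -> __init__ it defined itself, or None if inherited


class Model:
    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)


def connect(receiver, sender, signal):
    listeners = signal.receivers.setdefault(sender, [])
    if not listeners:
        # First listener for this class: install a wrapper on the class itself.
        original = sender.__init__
        _own_init[sender] = vars(sender).get("__init__")

        @functools.wraps(original)
        def wrapped_init(self, *args, **kwargs):
            signal.send(sender=sender, args=args, kwargs=kwargs)
            original(self, *args, **kwargs)

        sender.__init__ = wrapped_init
    listeners.append(receiver)


def disconnect(receiver, sender, signal):
    listeners = signal.receivers[sender]
    listeners.remove(receiver)
    if not listeners:
        # Last listener gone: restore what was there before the wrapper.
        del signal.receivers[sender]
        own = _own_init.pop(sender)
        if own is None:
            del sender.__init__  # fall back to the inherited method
        else:
            sender.__init__ = own
```

With this sketch, a subclass of Model that nobody listens to keeps using Model.__init__ directly, so it pays nothing for the signal.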

The cons I'll enumerate:

1) to do this requires a few tricks- specifically, wrapping methods on
a class on the fly when something starts listening, and reversing the
wrapping when nobody is listening anymore.  Personally, I'm
comfortable with this (the misc contribute_to_class crap going on in
_meta already isn't too far off).  Realize however others may not be
comfortable with it- thus speak up please.

2) usage of Any for a sender means we have to track the potential
senders.

3) usage of Any for a signal means we have to track the signals
involved in this trick (registration of the signal instance), and #2.

4) Not strictly required, but if sender is a class (and the only
listeners are listening to *that* class, not Any), any deriving
from that class will still fire the signal- meaning the performance
gain is lost for the derivative, sends are occurring that don't have
any listeners.  This can be reclaimed via some tricks in
ModelBase.__new__ offhand if desired and the use scenario is at least
semi-common (how many people derive from a defined Model?).

5) the wrapping trick introduces an extra func into the callpath when
something is listening.  That's basically a semi-elusive way of
saying "it'll be slightly slower when there is a listener than what's
in place now"; haven't finished the implementation thus I don't have
specifics, but figure a few usecs hit from the wrapper itself (since
the codepaths are executed often enough, it's worth noting for cases
where listeners are expected).
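
Con 4 can be illustrated with a small standalone sketch. The classes and the install_wrapper helper are hypothetical, and the last step shows only one possible way to reclaim the cost; the post itself only mentions "some tricks in ModelBase.__new__".

```python
calls = []  # stands in for signal sends


class Base:
    def __init__(self):
        self.ready = True


class Derived(Base):
    pass


def install_wrapper(cls):
    """Wrap cls.__init__ and return the original so it can be restored."""
    original = cls.__init__

    def wrapped(self, *args, **kwargs):
        calls.append(type(self).__name__)  # placeholder for send(...)
        original(self, *args, **kwargs)

    cls.__init__ = wrapped
    return original


original_init = install_wrapper(Base)  # something listens to Base only
Base()
Derived()  # inherits the wrapper, so a send happens with no listener
print(calls)  # ['Base', 'Derived']

# One possible way to reclaim the cost: pin the unwrapped method on the subclass.
Derived.__init__ = original_init
Derived()
print(calls)  # still ['Base', 'Derived']
```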

Would appreciate any thoughts on above; the cons are basically
implementation specific, that said I can work through them (I want
that 25% back, damn it ;)- question is if folks are game for it or
not, if the idea is palatable or not.

Aside from that, would really help if I had a clue what folks are
actually using dispatch for with django- which signals, common
patterns for implementing their own signals (pre/post I'd assume?),
common signals they're listening to, etc.

Knowing it would help with optimizing dispatch further, and would be
useful if someone ever decides to gut dispatch and refactor the code
into something less fugly.

~harring

James Bennett  
From: "James Bennett" <ubernost...@gmail.com>
Date: Sun, 10 Jun 2007 15:29:31 -0500
Local: Sun, Jun 10 2007 4:29 pm
Subject: Re: signals
On 6/10/07, Brian Harring <ferri...@gmail.com> wrote:

> Aside from that, would really help if I had a clue what folks are
> actually using dispatch for with django- which signals, common
> patterns for implementing their own signals (pre/post I'd assume?),
> common signals they're listening to, etc.

I'm working on something which will be leaning pretty heavily on the
pre_save and post_save signals; the code's not public yet, but will be
soon.

--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."


Jeremy Dunck  
From: "Jeremy Dunck" <jdu...@gmail.com>
Date: Sun, 10 Jun 2007 15:50:48 -0500
Local: Sun, Jun 10 2007 4:50 pm
Subject: Re: signals
On 6/10/07, James Bennett <ubernost...@gmail.com> wrote:

> On 6/10/07, Brian Harring <ferri...@gmail.com> wrote:
> > Aside from that, would really help if I had a clue what folks are
> > actually using dispatch for with django- which signals, common
> > patterns for implementing their own signals (pre/post I'd assume?),
> > common signals they're listening to, etc.

> I'm working on something which will be leaning pretty heavily on the
> pre_save and post_save signals; the code's not public yet, but will be
> soon.

Jellyroll does, too:
http://jellyroll.googlecode.com/svn/trunk/jellyroll/managers.py

I really like that technique, and plan to do similar in future.


marcin.kaszynski@gmail.com
From: "marcin.kaszyn...@gmail.com" <marcin.kaszyn...@gmail.com>
Date: Sun, 10 Jun 2007 15:51:33 -0700
Local: Sun, Jun 10 2007 6:51 pm
Subject: Re: signals

On Jun 10, 10:29 pm, "James Bennett" <ubernost...@gmail.com> wrote:

> I'm working on something which will be leaning pretty heavily on the
> pre_save and post_save signals; the code's not public yet, but will be
> soon.

I use these in django-multilingual to update translations when an
instance of a translatable model is saved.  I also depend on
signals.class_prepared for each translatable model to finish its
definition and create the child model with translation data.

I even tried to sneak yet another signal into Django at some point,
but no one else was interested :)

http://groups.google.com/group/django-developers/browse_thread/thread...

That one would be cheap, though, triggered only when a new model gets
created.  I worked around not having it by wrapping ModelBase.__new__
in a function that did all the extra stuff I needed before calling the
original implementation.
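
The workaround described above might look roughly like the following. This is a standalone sketch: the ModelBase here is a stand-in metaclass (Django's real one does far more), and the created list and patched_new function are names invented for illustration.

```python
class ModelBase(type):
    """Stand-in metaclass; not Django's actual ModelBase."""

    def __new__(mcs, name, bases, attrs):
        return super().__new__(mcs, name, bases, attrs)


created = []  # records every model class definition that passes through

_original_new = ModelBase.__new__


def patched_new(mcs, name, bases, attrs):
    # Extra work happens first, then the original implementation runs.
    created.append(name)
    return _original_new(mcs, name, bases, attrs)


ModelBase.__new__ = staticmethod(patched_new)


class Article(metaclass=ModelBase):
    pass


print(created)  # ['Article']
```

The patch only runs when a class is defined, which is why this kind of hook is cheap compared with a per-instance signal.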

-mk


Jacob Kaplan-Moss  
From: "Jacob Kaplan-Moss" <jacob.kaplanm...@gmail.com>
Date: Sun, 10 Jun 2007 19:40:10 -0500
Local: Sun, Jun 10 2007 8:40 pm
Subject: Re: signals
On 6/10/07, Jeremy Dunck <jdu...@gmail.com> wrote:

> Jellyroll does, too:
> http://jellyroll.googlecode.com/svn/trunk/jellyroll/managers.py

> I really like that technique, and plan to do similar in future.

Indeed; I (ab)use the hell out of signals, and would be sad without
'em. Nearly every trick in my sleeve these days needs signals.

That said, I'd also like that 25% back :)  I'm *very* interested in
your idea of dynamically enabling signals only when they're going to
be caught; it's pointless to spend all that time dispatching if
nothing's gonna answer. If you can figure out a clean way of
accomplishing that -- and it looks like you've already started -- I'd
certainly push for its acceptance.

Jacob


Malcolm Tredinnick  
From: Malcolm Tredinnick <malc...@pointy-stick.com>
Date: Mon, 11 Jun 2007 19:39:08 +1000
Local: Mon, Jun 11 2007 5:39 am
Subject: Re: signals

On Sun, 2007-06-10 at 09:07 -0700, Brian Harring wrote:
> Curious, how many folks are actually using dispatch at all?

> For my personal usage, I'm actually not using any of the hooks- I
> suspect most folks aren't either.  That said, I'm paying a fairly
> hefty price for them.

> With Model.__init__'s send left in for 53.3k record instantiation
> (just a walk of the records), time required is 9.2s.  Without the
> send, takes 7.0s.  Personally, I'd like to get that quarter of the
> time slice back. :)

Since you already have your own version of the Spanish Inquisition set
up for testing, what portion of this overhead is just the function call?
If the dispatch function is replaced with just "return", do we save
much?

In case it's not clear: I'm trying to get a feeling for how much of the
cost is caused by the dispatching itself and how much by processing the
dispatch inside the signal module. Is avoiding the call altogether
necessary or making the handlers much faster? (More for future direction
than anything else).
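
One way to separate the two costs is to time three variants side by side: no call at all, a call to an empty function, and a call that still does receiver lookups. The sketch below is standalone and assumes a toy Record class and a toy lookup scheme; it does not measure Django's dispatcher, and the numbers it produces say nothing about the 9.2s and 7.0s figures quoted earlier.

```python
import timeit

_receivers = {}  # (sender, signal) -> list of callables; left empty on purpose


def send_noop(signal, sender):
    """What is left of send() if its body is replaced with a bare return."""
    return


def send_lookup(signal, sender):
    """Toy send() that still performs lookups even though nobody listens."""
    found = []
    for key in ((sender, signal), (sender, None), (None, signal), (None, None)):
        found.extend(_receivers.get(key, ()))
    for receiver in found:
        receiver(signal=signal, sender=sender)


class Record:
    def __init__(self, send=None):
        if send is not None:
            send("pre_init", Record)
        self.value = 1
        if send is not None:
            send("post_init", Record)


def measure(number=50_000):
    """Return seconds spent building `number` records under each variant."""
    return {
        "no call": timeit.timeit(lambda: Record(), number=number),
        "empty call": timeit.timeit(lambda: Record(send_noop), number=number),
        "call + lookup": timeit.timeit(lambda: Record(send_lookup), number=number),
    }


if __name__ == "__main__":
    for name, seconds in measure().items():
        print(f"{name:>14}: {seconds:.4f}s")
```

The gap between "no call" and "empty call" approximates pure call overhead; the gap between "empty call" and "call + lookup" approximates the work done inside the dispatcher.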

I don't think we can hope to get a really accurate picture here beyond a
statistical sample with a broad range for any real confidence interval.
The problem is that there is a userbase of thousands and a lot of
evidence to suggest that most people don't read mailing list threads
that they didn't start themselves. Yes, almost everybody reading this is
an exception, but that automatically makes you an outlier.

However, to add to the sample, I'm using post_init in some cases and
pre_save and post_save a lot. Looks like request_started and
request_finished are making an appearance in my code, but mostly in
diagnostic stuff that is not intended for production use.

> Knowing it would help with optimizing dispatch further, and would be
> useful if someone ever decides to gut dispatch and refactor the code
> into something less fugly.

Given that upstream pydispatcher isn't really being maintained, I don't
think we should be too hesitant to tweak it for our needs.

Regards,
Malcolm


Sandro Dentella  
From: Sandro Dentella <san...@e-den.it>
Date: Mon, 11 Jun 2007 12:22:41 +0200
Local: Mon, Jun 11 2007 6:22 am
Subject: Re: signals

> The problem is that there is a userbase of thousands and a lot of
> evidence to suggest that most people don't read mailing list threads
> that they didn't start themselves. Yes, almost everybody reading this is

so let's come to the surface... i use signals. Mainly pre-save / post-save.
And just to let you know I miss a post_insert different from post_update.
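
The distinction being asked for could be sketched as follows. This is a toy, standalone illustration: Record, post_insert_receivers and post_update_receivers are hypothetical names, not signals that exist in Django at the time of this thread.

```python
post_insert_receivers = []  # called only the first time a record is saved
post_update_receivers = []  # called on every later save


class Record:
    _next_id = 1

    def __init__(self, title):
        self.id = None  # no primary key until the first save
        self.title = title

    def save(self):
        is_insert = self.id is None
        if is_insert:
            self.id = Record._next_id
            Record._next_id += 1
        receivers = post_insert_receivers if is_insert else post_update_receivers
        for receiver in receivers:
            receiver(sender=Record, instance=self)
```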

sandro
*:-)


David Larlet  
From: "David Larlet" <lar...@gmail.com>
Date: Mon, 11 Jun 2007 14:51:23 +0200
Local: Mon, Jun 11 2007 8:51 am
Subject: Re: signals
2007/6/10, Brian Harring <ferri...@gmail.com>:

> Curious, how many folks are actually using dispatch at all?

I use signals this way:
http://groups.google.com.mt/group/django-users/browse_thread/thread/d...
and it can be useful too when you need to send mass mail at each
modification of an object.

David


Brian Harring  
From: Brian Harring <ferri...@gmail.com>
Date: Tue, 12 Jun 2007 06:16:32 -0700
Local: Tues, Jun 12 2007 9:16 am
Subject: Re: signals

Offhand, replacing the dispatch with just 'return' is actually
semi tricky, since there are a few receivers required for the django
internals (class preparation).  Basically requires delegating the send
to the signal in select cases (for *_delete, and request_*, don't see
much option unless they can be shifted around also).

For __init__ and save however, the wrap trick will fly- meaning don't
even need the empty function call.

Either way, profile dump follows.

Top 30 via lsprof (cProfile for 2.5); with send left in Model.__init__

>>> ps.sort_stats("ti").print_stats(30)

Mon Jun 11 02:55:18 2007    dump.stats

         1747388 function calls (1745991 primitive calls) in 18.627 CPU seconds

   Ordered by: internal time
   List reduced from 916 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    53332    3.314    0.000    9.046    0.000 base.py:97(__init__)
   106720    2.592    0.000    3.162    0.000 dispatcher.py:271(getAllReceivers)
   373318    1.995    0.000    3.056    0.000 base.py:38(utf8)
        2    1.723    0.861    1.723    0.861 base.py:99(execute)
    53332    1.631    0.000    4.687    0.000 base.py:37(utf8rowFactory)
      537    1.540    0.003    6.227    0.012 ~:0(<method 'fetchmany' of 'pysqlite2.dbapi2.Cursor' objects>)
   380348    1.329    0.000    1.329    0.000 ~:0(<setattr>)
   196950    1.061    0.000    1.061    0.000 ~:0(<method 'encode' of 'unicode' objects>)
    53334    0.809    0.000   17.958    0.000 query.py:171(iterator)
   106678    0.808    0.000    3.980    0.000 dispatcher.py:317(send)
   213374    0.570    0.000    0.570    0.000 ~:0(<id>)
116228/115981    0.321    0.000    0.322    0.000 ~:0(<len>)
        3    0.168    0.056   18.126    6.042 query.py:468(_get_data)
    53334    0.162    0.000    0.162    0.000 ~:0(<iter>)
       60    0.060    0.001    0.118    0.002 functional.py:26(__init__)
   245/60    0.042    0.000    0.158    0.003 sre_parse.py:374(_parse)
     6660    0.036    0.000    0.036    0.000 functional.py:36(__promise__)
     3118    0.035    0.000    0.051    0.000 sre_parse.py:182(__next)
   458/56    0.032    0.000    0.126    0.002 sre_compile.py:27(_compile)
        1    0.027    0.027    0.032    0.032 sre_compile.py:296(_optimize_unicode)
      206    0.022    0.000    0.067    0.000 sre_compile.py:202(_optimize_charset)
        7    0.021    0.003    0.455    0.065 __init__.py:1(?)
     6360    0.017    0.000    0.017    0.000 ~:0(<method 'append' of 'list' objects>)
     2573    0.017    0.000    0.059    0.000 sre_parse.py:201(get)
  634/243    0.014    0.000    0.018    0.000 sre_parse.py:140(getwidth)
        1    0.014    0.014   18.627   18.627 full-run.py:2(?)
    19/12    0.012    0.001    0.187    0.016 ~:0(<__import__>)
        1    0.008    0.008    0.011    0.011 socket.py:43(?)
        1    0.007    0.007    0.109    0.109 urllib2.py:71(?)
   182/57    0.007    0.000    0.160    0.003 sre_parse.py:301(_parse_sub)

<pstats.Stats instance at 0xb7ce268c>

without

>>> ps.sort_stats("ti").print_stats(30)

Mon Jun 11 03:02:10 2007    /home/bharring/dump2.stats

         1320732 function calls (1319335 primitive calls) in 13.970 CPU seconds

   Ordered by: internal time
   List reduced from 916 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    53332    2.428    0.000    4.475    0.000 base.py:97(__init__)
   373318    2.004    0.000    3.059    0.000 base.py:38(utf8)
        2    1.715    0.858    1.716    0.858 base.py:99(execute)
   380348    1.623    0.000    1.623    0.000 ~:0(<setattr>)
    53332    1.617    0.000    4.676    0.000 base.py:37(utf8rowFactory)
      537    1.538    0.003    6.215    0.012 ~:0(<method 'fetchmany' of 'pysqlite2.dbapi2.Cursor' objects>)
   196950    1.055    0.000    1.056    0.000 ~:0(<method 'encode' of 'unicode' objects>)
    53334    0.744    0.000   13.305    0.000 query.py:171(iterator)
116228/115981    0.315    0.000    0.317    0.000 ~:0(<len>)
        3    0.166    0.055   13.470    4.490 query.py:468(_get_data)
    53334    0.158    0.000    0.158    0.000 ~:0(<iter>)
       60    0.060    0.001    0.119    0.002 functional.py:26(__init__)
   245/60    0.042    0.000    0.158    0.003 sre_parse.py:374(_parse)
     6660    0.036    0.000    0.036    0.000 functional.py:36(__promise__)
     3118    0.035    0.000    0.051    0.000 sre_parse.py:182(__next)
   458/56    0.032    0.000    0.127    0.002 sre_compile.py:27(_compile)
        1    0.027    0.027    0.032    0.032 sre_compile.py:296(_optimize_unicode)
      206    0.022    0.000    0.067    0.000 sre_compile.py:202(_optimize_charset)
        7    0.021    0.003    0.455    0.065 __init__.py:1(?)
     2573    0.017    0.000    0.059    0.000 sre_parse.py:201(get)
     6360    0.017    0.000    0.017    0.000 ~:0(<method 'append' of 'list' objects>)
  634/243    0.014    0.000    0.018    0.000 sre_parse.py:140(getwidth)
        1    0.014    0.014   13.970   13.970 full-run.py:2(?)
    19/12    0.011    0.001    0.188    0.016 ~:0(<__import__>)
        1    0.008    0.008    0.011    0.011 socket.py:43(?)
   182/57    0.007    0.000    0.160    0.003 sre_parse.py:301(_parse_sub)
        1    0.007    0.007    0.109    0.109 urllib2.py:71(?)
     1456    0.007    0.000    0.015    0.000 sre_parse.py:195(match)
      206    0.007    0.000    0.076    0.000 sre_compile.py:173(_compile_charset)
       20    0.006    0.000    0.007    0.000 sre_compile.py:253(_mk_bitmap)
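
For anyone wanting to produce dumps in the same format, a standalone sketch with the standard library's cProfile and pstats follows. The Record class and walk function are placeholders for the real query walk; full-run.py itself is not shown in this thread.

```python
import cProfile
import io
import pstats


class Record:
    def __init__(self, value):
        self.value = value


def walk(n=50_000):
    """Instantiate n records, standing in for a walk over query results."""
    for i in range(n):
        Record(i)


def profile_walk(n=50_000, top=30):
    """Profile walk() and return the report sorted by internal time."""
    profiler = cProfile.Profile()
    profiler.enable()
    walk(n)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("tottime").print_stats(top)  # "tottime" is internal time
    return out.getvalue()


if __name__ == "__main__":
    print(profile_walk())
```

Reading the two dumps above: the totals differ by 18.627 - 13.970 = 4.657 CPU seconds, about 25% of the run with send left in, and the cumulative time of base.py:97(__init__) drops from 9.046 to 4.475.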

Model.__init__ is still a bit of a kick in the teeth offhand;
addressing that one however requires some semi-nasty work shifting
some of the fields related testing to be cached in _meta; not
expecting a huge gain out of it, plus it'll likely be fairly nasty so
I'd rather hold off on that one till a later date.

Not yet advocating it (mainly since digging it out would be ugly), but
if you take a look at the bits above, having the option to disable
verification on read *would* have a nice kick in the pants for ORM
object instantiation when the admin has decided the data is guaranteed
to be the correct types.

> In case it's not clear: I'm trying to get a feeling for how much of the
> cost is caused by the dispatching itself and how much by processing the
> dispatch inside the signal module. Is avoiding the call altogether
> necessary or making the handlers much faster? (More for future direction
> than anything else).

Cost is from the dispatching; take a look in dispatcher.send.  Django
codebase has already deviated from dispatcher upstream via inlining
large parts of the lookup there (part of the 5x boost in dispatching
going from 0.95 to 0.96)- still has to do the lookups, which
unfortunately are semi complex due to the semantics of Any.
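
To see why Any makes the lookup expensive, consider a simplified standalone sketch. The connections dict, the Any sentinel and the function names below are assumptions for illustration and are not copied from dispatcher.py: every send has to check four (sender, signal) combinations and de-duplicate receivers, even when nothing is connected.

```python
class _Any:
    """Sentinel meaning "any sender" or "any signal"."""

    def __repr__(self):
        return "Any"


Any = _Any()

connections = {}  # (sender, signal) -> list of receivers


def connect(receiver, signal=Any, sender=Any):
    connections.setdefault((sender, signal), []).append(receiver)


def get_all_receivers(sender, signal):
    """Yield each matching receiver once, most specific registration first."""
    seen = set()
    for key in ((sender, signal), (sender, Any), (Any, signal), (Any, Any)):
        for receiver in connections.get(key, ()):
            if id(receiver) not in seen:
                seen.add(id(receiver))
                yield receiver


def send(signal, sender, **named):
    return [
        (receiver, receiver(signal=signal, sender=sender, **named))
        for receiver in get_all_receivers(sender, signal)
    ]
```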

That said... there really isn't any reason to continue making the
calls if you know nothing is listening and the target to wrap emits
just pre/post.

Just looking to get an idea of what folks are actually doing; simple
example, it's easier to fire both pre/post if there is a listener for
one- that said, if the vast number of folks are listening to only
*one* of the signals, it's potentially worth the time to have the code
swap in a pre, pre + post, or post wrapper as needed.
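
Choosing among a pre, pre + post, or post wrapper could be sketched like this. The wrap helper and the toy Record class are hypothetical, and the pre and post callables stand in for the real signal sends.

```python
def wrap(original, pre=None, post=None):
    """Return `original` wrapped with only the hooks that have listeners."""
    if pre is None and post is None:
        return original  # nobody listening: no wrapper, no extra call

    if post is None:
        def wrapper(self, *args, **kwargs):
            pre(self)
            return original(self, *args, **kwargs)
    elif pre is None:
        def wrapper(self, *args, **kwargs):
            result = original(self, *args, **kwargs)
            post(self)
            return result
    else:
        def wrapper(self, *args, **kwargs):
            pre(self)
            result = original(self, *args, **kwargs)
            post(self)
            return result
    return wrapper


class Record:
    def save(self):
        return "saved"
```

The trade-off is three code paths to maintain instead of one, in exchange for never firing a send that has no listener.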

Also is a bit more of a pain in the ass implementing that, but by the
looks of it, it'll be the desired next step.

> > Knowing it would help with optimizing dispatch further, and would be
> > useful if someone ever decides to gut dispatch and refactor the code
> > into something less fugly.

> Given that upstream pydispatcher isn't really being maintained, I don't
> think we should be too hesitant to tweak it for our needs.

Don't spose it could just be thrown out?  The code really *is* ugly :)

Can likely drop a lot of the internal voodoo and shift over to using
weakref.Weak*Dictionary where appropriate internally, but robustapply
is still fairly nasty- tend to think it should stop trying to hold
folks hands, and just pass the send args/kwargs straight through to
the receiver instead of trying to map args out.
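
One reading of that suggestion, as a standalone sketch: keep receivers in a weakref.WeakKeyDictionary keyed by sender, and hand the send arguments to each receiver unchanged. The Dispatcher class is hypothetical, and which tables would actually become weak dictionaries is an assumption here.

```python
import weakref


class Dispatcher:
    def __init__(self):
        # sender -> list of receivers; an entry disappears on its own
        # once the sender object has been garbage collected.
        self._by_sender = weakref.WeakKeyDictionary()

    def connect(self, receiver, sender):
        self._by_sender.setdefault(sender, []).append(receiver)

    def send(self, sender, *args, **kwargs):
        # No argument mapping: receivers get exactly what was sent.
        return [
            receiver(*args, **kwargs)
            for receiver in self._by_sender.get(sender, ())
        ]
```

A receiver whose signature does not match the sent arguments raises TypeError under this scheme, which is the behaviour robustapply currently tries to paper over.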

~harring

Malcolm Tredinnick  
From: Malcolm Tredinnick <malc...@pointy-stick.com>
Date: Wed, 13 Jun 2007 09:37:35 +1000
Local: Tues, Jun 12 2007 7:37 pm
Subject: Re: signals