signals


Brian Harring

Jun 10, 2007, 12:07:59 PM
to django-d...@googlegroups.com
Curious, how many folks are actually using dispatch at all?

For my personal usage, I'm actually not using any of the hooks- I
suspect most folks aren't either. That said, I'm paying a fairly
hefty price for them.

With Model.__init__'s send left in for 53.3k record instantiation
(just a walk of the records), time required is 9.2s. Without the
send, takes 7.0s. Personally, I'd like to get that quarter of the
time slice back. :)
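
As a quick check of the figures above, the arithmetic can be spelled out. This is only a sketch; the variable names are made up for illustration and the inputs are the rounded numbers quoted in the message.

```python
# Figures quoted in the message above (rounded as given there).
records = 53_300        # "53.3k record instantiation"
with_send = 9.2         # seconds with the send left in Model.__init__
without_send = 7.0      # seconds with the send removed

saved = with_send - without_send          # seconds attributable to the send
share = saved / with_send                 # fraction of the whole run
per_record_us = saved / records * 1e6     # microseconds per instantiation

print(f"saved {saved:.1f}s, {share:.0%} of the run, ~{per_record_us:.0f} us per record")
```

That works out to roughly 24% of the run, or about 41 microseconds per instantiated record.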

Via ticket 3439, I've already gone after dispatch to try and speed it
up- I probably can wring a bit more speed out of it, but the
improvements won't come anywhere near reclaiming the 2.2s from above.

What is left, is flat out removing the send invocations if they're
not needed- specifically shifting the send calls out of __init__, and
wrapping __init__ on the fly *when* something tries to connect to it.

Effectively,

from django.dispatch.dispatcher import connect, disconnect
from django.db.models.signals import pre_init
from django.db.models import Model

class m(Model): pass

callback = lambda *a, **kw:None

assert m.__init__ is Model.__init__
connect(callback, sender=m, signal=pre_init)
assert m.__init__ is not Model.__init__

disconnect(callback, sender=m, signal=pre_init)
assert m.__init__ is Model.__init__

The pro of this is that the slowdown is limited to *only* the
instances where something is known to be listening- listening to model
class foo doesn't slow down class bar basically; listening to save
doesn't slow down __init__, etc.

The cons I'll enumerate:

1) to do this requires a few tricks- specifically, wrapping methods on
a class on the fly when something starts listening, and reversing the
wrapping when nobody is listening anymore. Personally, I'm
comfortable with this (the misc contribute_to_class crap going on in
_meta already isn't too far off). Realize however others may not be
comfortable with it- thus speak up please.

2) usage of Any for a sender means we have to track the potential
senders.

3) usage of Any for a signal means we have to track the signals
involved in this trick (registration of the signal instance), and #2.

4) Not strictly required, but if sender is a class (and the only
listeners are listening to *that* class, not Any), any class deriving
from that class will still fire the signal- meaning the performance
gain is lost for the derivative, since sends are occurring that don't have
any listeners. This can be reclaimed via some tricks in
ModelBase.__new__ offhand if desired and the use scenario is at least
semi-common (how many people derive from a defined Model?).

5) the wrapping trick introduces an extra func into the callpath when
something is listening. That's basically a semi-elusive way of
saying "it'll be slightly slower when there is a listener than what's
in place now"; haven't finished the implementation thus I don't have
specifics, but figure a few usecs hit from the wrapper itself (since
the codepaths are executed often enough, it's worth noting for cases
where listeners are expected).
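
Con 4 can be seen in a few lines. This is a sketch with hypothetical classes and a hypothetical `wrap_init` helper, not Django code: wrapping the base class's `__init__` also affects every subclass that inherits it.

```python
calls = []

class Base:
    def __init__(self):
        pass

class Derived(Base):
    pass  # inherits Base.__init__, wrapped or not

def wrap_init(cls, on_init):
    """Hypothetical helper: replace cls.__init__ with a notifying wrapper."""
    original = cls.__init__

    def wrapped(self, *args, **kwargs):
        on_init(type(self))
        original(self, *args, **kwargs)

    cls.__init__ = wrapped

wrap_init(Base, calls.append)
Base()
Derived()   # nobody asked about Derived, but the wrapper still runs
```

After this runs, `calls` holds both `Base` and `Derived`, which is the lost gain described in point 4.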

Would appreciate any thoughts on above; the cons are basically
implementation specific, that said I can work through them (I want
that 25% back, damn it ;)- question is if folks are game for it or
not, if the idea is palatable or not.


Aside from that, would really help if I had a clue what folks are
actually using dispatch for with django- which signals, common
patterns for implementing their own signals (pre/post I'd assume?),
common signals they're listening to, etc.

Knowing it would help with optimizing dispatch further, and would be
useful if someone ever decides to gut dispatch and refactor the code
into something less fugly.

~harring

James Bennett

Jun 10, 2007, 4:29:31 PM
to django-d...@googlegroups.com
On 6/10/07, Brian Harring <ferr...@gmail.com> wrote:
> Aside from that, would really help if I had a clue what folks are
> actually using dispatch for with django- which signals, common
> patterns for implementing their own signals (pre/post I'd assume?),
> common signals they're listening to, etc.

I'm working on something which will be leaning pretty heavily on the
pre_save and post_save signals; the code's not public yet, but will be
soon.

--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

Jeremy Dunck

Jun 10, 2007, 4:50:48 PM
to django-d...@googlegroups.com
On 6/10/07, James Bennett <ubern...@gmail.com> wrote:
>
> On 6/10/07, Brian Harring <ferr...@gmail.com> wrote:
> > Aside from that, would really help if I had a clue what folks are
> > actually using dispatch for with django- which signals, common
> > patterns for implementing their own signals (pre/post I'd assume?),
> > common signals they're listening to, etc.
>
> I'm working on something which will be leaning pretty heavily on the
> pre_save and post_save signals; the code's not public yet, but will be
> soon.

Jellyroll does, too:
http://jellyroll.googlecode.com/svn/trunk/jellyroll/managers.py

I really like that technique, and plan to do similar in future.

marcin.k...@gmail.com

Jun 10, 2007, 6:51:33 PM
to Django developers

On Jun 10, 10:29 pm, "James Bennett" <ubernost...@gmail.com> wrote:
> I'm working on something which will be leaning pretty heavily on the
> pre_save and post_save signals; the code's not public yet, but will be
> soon.

I use these in django-multilingual to update translations when an
instance of a translatable model is saved. I also depend on
signals.class_prepared for each translatable model to finish its
definition and create the child model with translation data.

I even tried to sneak yet another signal into Django at some point,
but no one else was interested :)

http://groups.google.com/group/django-developers/browse_thread/thread/853a1d30aac78eb9/19b2cd804760925c

That one would be cheap, though, triggered only when a new model gets
created. I worked around not having it by wrapping ModelBase.__new__
in a function that did all the extra stuff I needed before calling the
original implementation.

-mk

Jacob Kaplan-Moss

Jun 10, 2007, 8:40:10 PM
to django-d...@googlegroups.com
On 6/10/07, Jeremy Dunck <jdu...@gmail.com> wrote:
> Jellyroll does, too:
> http://jellyroll.googlecode.com/svn/trunk/jellyroll/managers.py
>
> I really like that technique, and plan to do similar in future.

Indeed; I (ab)use the hell out of signals, and would be sad without
'em. Nearly every trick in my sleeve these days needs signals.

That said, I'd also like that 25% back :) I'm *very* interested in
your idea of dynamically enabling signals only when they're going to
be caught; it's pointless to spend all that time dispatching if
nothing's gonna answer. If you can figure out a clean way of
accomplishing that -- and it looks like you've already started -- I'd
certainly push for its acceptance.

Jacob

Malcolm Tredinnick

Jun 11, 2007, 5:39:08 AM
to django-d...@googlegroups.com
On Sun, 2007-06-10 at 09:07 -0700, Brian Harring wrote:
> Curious, how many folks are actually using dispatch at all?
>
> For my personal usage, I'm actually not using any of the hooks- I
> suspect most folks aren't either. That said, I'm paying a fairly
> hefty price for them.
>
> With Model.__init__'s send left in for 53.3k record instantiation
> (just a walk of the records), time required is 9.2s. Without the
> send, takes 7.0s. Personally, I'd like to get that quarter of the
> time slice back. :)

Since you already have your own version of the Spanish Inquisition set
up for testing, what portion of this overhead is just the function call?
If the dispatch function is replaced with just "return", do we save
much?

In case it's not clear: I'm trying to get a feeling for how much of the
cost is caused by the dispatching itself and how much by processing the
dispatch inside the signal module. Is avoiding the call altogether
necessary, or is making the handlers much faster enough? (More for future direction
than anything else).
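
One way to isolate the bare function-call cost asked about here is a micro-benchmark along these lines. It is a sketch only: `noop_send` and both classes are invented for the measurement, and the result will vary by machine and Python version.

```python
import timeit

def noop_send(*args, **kwargs):
    """Stand-in for a dispatch function whose body is just 'return'."""
    return

class WithCall:
    def __init__(self):
        noop_send(signal="pre_init", sender=WithCall)
        self.value = 1

class WithoutCall:
    def __init__(self):
        self.value = 1

n = 100_000
t_with = timeit.timeit(WithCall, number=n)        # instantiate n times
t_without = timeit.timeit(WithoutCall, number=n)
overhead_us = (t_with - t_without) / n * 1e6
print(f"call overhead: ~{overhead_us:.2f} us per instantiation")
```

Whatever the real send costs beyond this figure is time spent inside the dispatcher rather than on the call itself.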

I don't think we can hope to get a really accurate picture here beyond a
statistical sample with a broad range for any real confidence interval.
The problem is that there is a userbase of thousands and a lot of
evidence to suggest that most people don't read mailing list threads
that they didn't start themselves. Yes, almost everybody reading this is
an exception, but that automatically makes you an outlier.

However, to add to the sample, I'm using post_init in some cases and
pre_save and post_save a lot. Looks like request_started and
request_finished are making an appearance in my code, but mostly in
diagnostic stuff that is not intended for production use.

> Knowing it would help with optimizing dispatch further, and would be
> useful if someone ever decides to gut dispatch and refactor the code
> into something less fugly.

Given that upstream pydispatcher isn't really being maintained, I don't
think we should be too hesitant to tweak it for our needs.

Regards,
Malcolm

Sandro Dentella

Jun 11, 2007, 6:22:41 AM
to django-d...@googlegroups.com
> The problem is that there is a userbase of thousands and a lot of
> evidence to suggest that most people don't read mailing list threads
> that they didn't start themselves. Yes, almost everybody reading this is

so let's come to the surface... i use signals. Mainly pre-save / post-save.
And just to let you know I miss a post_insert different from post_update.

sandro
*:-)

David Larlet

Jun 11, 2007, 8:51:23 AM
to django-d...@googlegroups.com
2007/6/10, Brian Harring <ferr...@gmail.com>:

> Curious, how many folks are actually using dispatch at all?

I use signals this way:
http://groups.google.com.mt/group/django-users/browse_thread/thread/dcc8ed4b26fd261c/12a19040ad7af699?lnk=gst&rnum=2#12a19040ad7af699
and it can be useful too when you need to send mass mail at each
modification of an object.

David

Brian Harring

Jun 12, 2007, 9:16:32 AM
to django-d...@googlegroups.com
On Mon, Jun 11, 2007 at 07:39:08PM +1000, Malcolm Tredinnick wrote:
>
> On Sun, 2007-06-10 at 09:07 -0700, Brian Harring wrote:
> > Curious, how many folks are actually using dispatch at all?
> >
> > For my personal usage, I'm actually not using any of the hooks- I
> > suspect most folks aren't either. That said, I'm paying a fairly
> > hefty price for them.
> >
> > With Model.__init__'s send left in for 53.3k record instantiation
> > (just a walk of the records), time required is 9.2s. Without the
> > send, takes 7.0s. Personally, I'd like to get that quarter of the
> > time slice back. :)
>
> Since you already have your own version of the Spanish Inquisition set
> up for testing, what portion of this overhead is just the function call?
> If the dispatch function is replaced with just "return", do we save
> much?

Offhand, replacing the dispatch with just 'return' is actually
semi tricky, since there are a few receivers required for the django
internals (class preparation). Basically requires delegating the send
to the signal in select cases (for *_delete, and request_*, don't see
much option unless they can be shifted around also).

For __init__ and save however, the wrap trick will fly- meaning don't
even need the empty function call.

Either way, profile dump follows.

Top 30 via lsprof (cProfile for 2.5); with send left in Model.__init__

>>> ps.sort_stats("ti").print_stats(30)
Mon Jun 11 02:55:18 2007 dump.stats

1747388 function calls (1745991 primitive calls) in 18.627 CPU seconds

Ordered by: internal time
List reduced from 916 to 30 due to restriction <30>

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        53332    3.314    0.000    9.046    0.000 base.py:97(__init__)
       106720    2.592    0.000    3.162    0.000 dispatcher.py:271(getAllReceivers)
       373318    1.995    0.000    3.056    0.000 base.py:38(utf8)
            2    1.723    0.861    1.723    0.861 base.py:99(execute)
        53332    1.631    0.000    4.687    0.000 base.py:37(utf8rowFactory)
          537    1.540    0.003    6.227    0.012 ~:0(<method 'fetchmany' of 'pysqlite2.dbapi2.Cursor' objects>)
       380348    1.329    0.000    1.329    0.000 ~:0(<setattr>)
       196950    1.061    0.000    1.061    0.000 ~:0(<method 'encode' of 'unicode' objects>)
        53334    0.809    0.000   17.958    0.000 query.py:171(iterator)
       106678    0.808    0.000    3.980    0.000 dispatcher.py:317(send)
       213374    0.570    0.000    0.570    0.000 ~:0(<id>)
116228/115981    0.321    0.000    0.322    0.000 ~:0(<len>)
            3    0.168    0.056   18.126    6.042 query.py:468(_get_data)
        53334    0.162    0.000    0.162    0.000 ~:0(<iter>)
           60    0.060    0.001    0.118    0.002 functional.py:26(__init__)
       245/60    0.042    0.000    0.158    0.003 sre_parse.py:374(_parse)
         6660    0.036    0.000    0.036    0.000 functional.py:36(__promise__)
         3118    0.035    0.000    0.051    0.000 sre_parse.py:182(__next)
       458/56    0.032    0.000    0.126    0.002 sre_compile.py:27(_compile)
            1    0.027    0.027    0.032    0.032 sre_compile.py:296(_optimize_unicode)
          206    0.022    0.000    0.067    0.000 sre_compile.py:202(_optimize_charset)
            7    0.021    0.003    0.455    0.065 __init__.py:1(?)
         6360    0.017    0.000    0.017    0.000 ~:0(<method 'append' of 'list' objects>)
         2573    0.017    0.000    0.059    0.000 sre_parse.py:201(get)
      634/243    0.014    0.000    0.018    0.000 sre_parse.py:140(getwidth)
            1    0.014    0.014   18.627   18.627 full-run.py:2(?)
        19/12    0.012    0.001    0.187    0.016 ~:0(<__import__>)
            1    0.008    0.008    0.011    0.011 socket.py:43(?)
            1    0.007    0.007    0.109    0.109 urllib2.py:71(?)
       182/57    0.007    0.000    0.160    0.003 sre_parse.py:301(_parse_sub)


<pstats.Stats instance at 0xb7ce268c>

without

>>> ps.sort_stats("ti").print_stats(30)
Mon Jun 11 03:02:10 2007 /home/bharring/dump2.stats

1320732 function calls (1319335 primitive calls) in 13.970 CPU seconds

Ordered by: internal time
List reduced from 916 to 30 due to restriction <30>

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        53332    2.428    0.000    4.475    0.000 base.py:97(__init__)
       373318    2.004    0.000    3.059    0.000 base.py:38(utf8)
            2    1.715    0.858    1.716    0.858 base.py:99(execute)
       380348    1.623    0.000    1.623    0.000 ~:0(<setattr>)
        53332    1.617    0.000    4.676    0.000 base.py:37(utf8rowFactory)
          537    1.538    0.003    6.215    0.012 ~:0(<method 'fetchmany' of 'pysqlite2.dbapi2.Cursor' objects>)
       196950    1.055    0.000    1.056    0.000 ~:0(<method 'encode' of 'unicode' objects>)
        53334    0.744    0.000   13.305    0.000 query.py:171(iterator)
116228/115981    0.315    0.000    0.317    0.000 ~:0(<len>)
            3    0.166    0.055   13.470    4.490 query.py:468(_get_data)
        53334    0.158    0.000    0.158    0.000 ~:0(<iter>)
           60    0.060    0.001    0.119    0.002 functional.py:26(__init__)
       245/60    0.042    0.000    0.158    0.003 sre_parse.py:374(_parse)
         6660    0.036    0.000    0.036    0.000 functional.py:36(__promise__)
         3118    0.035    0.000    0.051    0.000 sre_parse.py:182(__next)
       458/56    0.032    0.000    0.127    0.002 sre_compile.py:27(_compile)
            1    0.027    0.027    0.032    0.032 sre_compile.py:296(_optimize_unicode)
          206    0.022    0.000    0.067    0.000 sre_compile.py:202(_optimize_charset)
            7    0.021    0.003    0.455    0.065 __init__.py:1(?)
         2573    0.017    0.000    0.059    0.000 sre_parse.py:201(get)
         6360    0.017    0.000    0.017    0.000 ~:0(<method 'append' of 'list' objects>)
      634/243    0.014    0.000    0.018    0.000 sre_parse.py:140(getwidth)
            1    0.014    0.014   13.970   13.970 full-run.py:2(?)
        19/12    0.011    0.001    0.188    0.016 ~:0(<__import__>)
            1    0.008    0.008    0.011    0.011 socket.py:43(?)
       182/57    0.007    0.000    0.160    0.003 sre_parse.py:301(_parse_sub)
            1    0.007    0.007    0.109    0.109 urllib2.py:71(?)
         1456    0.007    0.000    0.015    0.000 sre_parse.py:195(match)
          206    0.007    0.000    0.076    0.000 sre_compile.py:173(_compile_charset)
           20    0.006    0.000    0.007    0.000 sre_compile.py:253(_mk_bitmap)
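
For anyone wanting to produce a comparable dump today, a sketch with the standard library's cProfile and pstats follows. The `walk_records` workload is a hypothetical placeholder for the queryset walk, and the full key name "tottime" is used in place of the "ti" shorthand seen in the sessions above.

```python
import cProfile
import io
import pstats

def walk_records(n):
    """Hypothetical stand-in for iterating over a queryset."""
    return [dict(id=i) for i in range(n)]

profiler = cProfile.Profile()
profiler.runcall(walk_records, 10_000)

out = io.StringIO()
stats = pstats.Stats(profiler, stream=out)
stats.sort_stats("tottime").print_stats(30)   # top 30 by internal time
report = out.getvalue()
print(report)
```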


Model.__init__ is still a bit of a kick in the teeth offhand;
addressing that one however requires some semi-nasty work shifting
some of the fields related testing to be cached in _meta; not
expecting a huge gain out of it, plus it'll likely be fairly nasty so
I'd rather hold off on that one till a later date.

Not yet advocating it (mainly since digging it out would be ugly), but
if you take a look at the bits above, having the option to disable
verification on read *would* have a nice kick in the pants for ORM
object instantiation when the admin has decided the data is guaranteed
to be the correct types.


> In case it's not clear: I'm trying to get a feeling for how much of the
> cost is caused by the dispatching itself and how much by processing the
> dispatch inside the signal module. Is avoiding the call altogether
> necessary or making the handlers much faster? (More for future direction
> than anything else).

Cost is from the dispatching; take a look in dispatcher.send. Django
codebase has already deviated from dispatcher upstream via inlining
large parts of the lookup there (part of the 5x boost in dispatching
going from 0.95 to 0.96)- still has to do the lookups, which
unfortunately are semi complex due to the semantics of Any.
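
To illustrate why wildcard semantics make the lookup expensive, here is a simplified sketch. It is not the actual dispatcher code: the `Any` marker, the `connections` dict keyed by `(sender, signal)`, and both functions are assumptions made for the example.

```python
class _Any:
    """Hypothetical wildcard marker, standing in for dispatcher.Any."""
    def __repr__(self):
        return "Any"

Any = _Any()

connections = {}   # (sender, signal) -> list of receivers

def connect(receiver, signal=Any, sender=Any):
    connections.setdefault((sender, signal), []).append(receiver)

def get_all_receivers(sender, signal):
    """Collect receivers from every key that could match this send."""
    found = []
    for key in ((sender, signal), (sender, Any), (Any, signal), (Any, Any)):
        for receiver in connections.get(key, ()):
            if receiver not in found:
                found.append(receiver)
    return found

class Foo: pass
class Bar: pass

def only_foo(**kw): pass
def everyone(**kw): pass

connect(only_foo, signal="pre_init", sender=Foo)
connect(everyone, signal="pre_init")            # sender defaults to Any
foo_receivers = get_all_receivers(Foo, "pre_init")
bar_receivers = get_all_receivers(Bar, "pre_init")
```

Even when nothing is connected, every send in this scheme performs four lookups before it can conclude that there is nothing to call.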

That said... there really isn't any reason to continue making the
calls if you know nothing is listening and the target to wrap emits
just pre/post.


> > Aside from that, would really help if I had a clue what folks are
> > actually using dispatch for with django- which signals, common
> > patterns for implementing their own signals (pre/post I'd assume?),
> > common signals they're listening to, etc.
>
> I don't think we can hope to get a really accurate picture here beyond a
> statistical sample with a broad range for any real confidence interval.
> The problem is that there is a userbase of thousands and a lot of
> evidence to suggest that most people don't read mailing list threads
> that they didn't start themselves. Yes, almost everybody reading this is
> an exception, but that automatically makes you an outlier.
>
> However, to add to the sample, I'm using post_init in some cases and
> pre_save and post_save a lot. Looks like request_started and
> request_finished are making an appearance in my code, but mostly in
> diagnostic stuff that is not intended for production use.

Just looking to get an idea of what folks are actually doing; simple
example, it's easier to fire both pre/post if there is a listener for
one- that said, if the vast number of folks are listening to only
*one* of the signals, it's potentially worth the time to have the code
swap in a pre, pre + post, or post wrapper as needed.
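
The idea of swapping in a pre, pre + post, or post wrapper could be sketched as below. `make_wrapper` and its `pre`/`post` hooks are hypothetical names, not anything from the patch.

```python
def make_wrapper(func, pre=None, post=None):
    """Return func wrapped with only the hooks that have listeners."""
    if pre and post:
        def wrapper(*args, **kwargs):
            pre(*args, **kwargs)
            result = func(*args, **kwargs)
            post(*args, **kwargs)
            return result
    elif pre:
        def wrapper(*args, **kwargs):
            pre(*args, **kwargs)
            return func(*args, **kwargs)
    elif post:
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            post(*args, **kwargs)
            return result
    else:
        return func   # nobody listening: no wrapper at all
    return wrapper

log = []

def save(x):
    log.append(("save", x))
    return x * 2

pre_only = make_wrapper(save, pre=lambda x: log.append(("pre", x)))
result = pre_only(3)
```

Each variant adds only the calls that have someone on the other end, at the price of three near-identical wrapper bodies to maintain.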

It's also a bit more of a pain in the ass to implement, but by the looks
of it, it'll be the desired next step.


> > Knowing it would help with optimizing dispatch further, and would be
> > useful if someone ever decides to gut dispatch and refactor the code
> > into something less fugly.
>
> Given that upstream pydispatcher isn't really being maintained, I don't
> think we should be too hesitant to tweak it for our needs.

Don't spose it could just be thrown out? The code really *is* ugly :)

Can likely drop a lot of the internal voodoo and shift over to using
weakref.Weak*Dictionary where appropriate internally, but robustapply
is still fairly nasty- tend to think it should stop trying to hold
folks hands, and just pass the send args/kwargs straight through to
the receiver instead of trying to map args out.
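
The contrast between argument mapping and straight pass-through can be sketched like this. `robust_apply` here is a much-simplified imitation built on `inspect.signature`, not the real robustapply module, and `pass_through` is a hypothetical name for the simpler alternative.

```python
import inspect

def robust_apply(receiver, **named):
    """Call receiver with only the keyword arguments it can accept."""
    params = inspect.signature(receiver).parameters
    takes_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if not takes_var_kw:
        named = {k: v for k, v in named.items() if k in params}
    return receiver(**named)

def pass_through(receiver, **named):
    """The simpler alternative: hand everything to the receiver."""
    return receiver(**named)

def picky(sender):
    return sender

def tolerant(sender, **kwargs):
    return sorted(kwargs)

filtered = robust_apply(picky, sender="m", signal="s", instance=1)
everything = pass_through(tolerant, sender="m", signal="s")
```

With pass-through, `picky` would raise a `TypeError` on the extra keywords, so the burden moves to the receiver's signature, and the per-send introspection goes away.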

~harring

Malcolm Tredinnick

Jun 12, 2007, 7:37:35 PM
to django-d...@googlegroups.com

It's also going to get a little heavier (not noticeable in the common
cases, maybe noticeable in the 10^6 case) with the unicode branch and
when we add Field sub-classing (which will be very soon), since both
require an extra function call per field in the average case. This is
the standard trade-off: functionality costs time and the functionality
in both cases is worth having.

> Not yet advocating it (mainly since digging it out would be ugly), but
> if you take a look at the bits above, having the option to disable
> verification on read *would* have a nice kick in the pants for ORM
> object instantiation when the admin has decided the data is guaranteed
> to be the correct types.

I have a blog post I'm still collecting the data for (making pretty
graphs, mostly), but it seems that the extreme cases you can actually
make instantiation much faster by just writing an __init__ method on the
Model sub-class that does just the right thing -- customised for the
model fields -- since it's only an attribute populator when you get
right down to it. One can even generate this code automatically. Again,
not something to worry about for the 90% case, but for people wanting to
create 10^6 objects, it's worth the ten minutes to code it up by hand.
That'll appear on the community aggregator when I post it.
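
A rough sketch of what generating such an attribute-populating `__init__` might look like. The `make_fast_init` helper and the `Entry` class with its three fields are hypothetical, and a real model would need to account for anything else the default initializer does.

```python
def make_fast_init(field_names):
    """Build an __init__ that assigns positional values straight to attributes."""
    args = ", ".join(field_names)
    body = "\n".join(f"    self.{name} = {name}" for name in field_names)
    source = f"def __init__(self, {args}):\n{body}\n"
    namespace = {}
    exec(source, namespace)   # compile the generated function once, at class creation
    return namespace["__init__"]

class Entry:
    # Hypothetical field list; assumes at least one field name is given.
    __init__ = make_fast_init(["id", "headline", "body"])

e = Entry(1, "hello", "text")
```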

[...]


> > Given that upstream pydispatcher isn't really being maintained, I don't
> > think we should be too hesitant to tweak it for our needs.
>
> Don't spose it could just be thrown out? The code really *is* ugly :)

Agree with the second part (although I'd be less hyperbolic, if I saw it
from a colleague, we'd be having discussions). Keeping the API similar
to what it is (or at least, "routine" to port -- possible to do with a
reg-exp, say) would be worthwhile, since there is a lot of code in the
wild using the signal infrastructure. The main argument for keeping the
implementation as it is now (expressed by Jacob in the past, but it
makes sense), was to ease our maintenance by synchronising with
upstream. Upstream has gone away. It's drought conditions. (As usual,
speaking only for myself, but I don't think this is too highly
controversial.)

Regards,
Malcolm

Malcolm Tredinnick

Jun 12, 2007, 9:54:19 PM
to django-d...@googlegroups.com
On Wed, 2007-06-13 at 09:37 +1000, Malcolm Tredinnick wrote:
> On Tue, 2007-06-12 at 06:16 -0700, Brian Harring wrote:
[...]

> > Model.__init__ is still a bit of a kick in the teeth offhand;
> > addressing that one however requires some semi-nasty work shifting
> > some of the fields related testing to be cached in _meta; not
> > expecting a huge gain out of it, plus it'll likely be fairly nasty so
> > I'd rather hold off on that one till a later date.
>
> It's also going to get a little heavier (not noticeable in the common
> cases, maybe noticeable in the 10^6 case) with the unicode branch and
> when we add Field sub-classing (which will be very soon), since both
> require an extra function call per field in the average case. This is
> the standard trade-off: functionality costs time and the functionality
> in both cases is worth having.

Actually, before you start worrying unnecessarily, Brian, I take some of
that back. Field sub-classing can be done at a cost of no extra calls
for non-sub-classed fields. Apparently I'm schizophrenic, because the
half of me that wrote the above paragraph had not noticed when the other
half of me wrote the code that did this. It's essentially what you are
suggesting with _meta caching: keep a list of "special" fields that we
can determine as part of the metaclass's __new__ and do a run over those
for any type coercion. Normal fields are not touched.
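
The "special fields" idea can be sketched with a small metaclass. Everything here is a stand-in written for illustration: `Field`, `IntListField`, the `needs_coercion` flag, and this `ModelBase`/`Model` pair only borrow Django's names and ignore details such as inherited fields.

```python
class Field:
    """Plain field: values are stored as-is."""
    needs_coercion = False

    def to_python(self, value):
        return value

class IntListField(Field):
    """Hypothetical custom field that must convert raw values."""
    needs_coercion = True

    def to_python(self, value):
        return [int(part) for part in value.split(",")]

class ModelBase(type):
    def __new__(mcls, name, bases, attrs):
        cls = super().__new__(mcls, name, bases, attrs)
        fields = {k: v for k, v in attrs.items() if isinstance(v, Field)}
        cls._field_names = list(fields)
        # Computed once per class, not once per instance.
        cls._special_fields = [(k, f) for k, f in fields.items() if f.needs_coercion]
        return cls

class Model(metaclass=ModelBase):
    def __init__(self, **values):
        for name in self._field_names:
            setattr(self, name, values.get(name))
        for name, field in self._special_fields:   # normal fields are skipped
            setattr(self, name, field.to_python(getattr(self, name)))

class Reading(Model):
    label = Field()
    samples = IntListField()

r = Reading(label="a", samples="1,2,3")
```

A model with no special fields iterates over an empty list in the second loop, so it pays no per-field call for coercion.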

Regards,
Malcolm


Jacob Kaplan-Moss

Jun 13, 2007, 12:01:45 AM
to django-d...@googlegroups.com
On 6/12/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
> Keeping the API similar
> to what it is (or at least, "routine" to port -- possible to do with a
> reg-exp, say) would be worthwhile, since there is a lot of code in the
> wild using the signal infrastructure.

This sums up my feelings perfectly. I'm all for cleaning (and speeding up) the
signal infrastructure as long as the API doesn't change or at least
only changes slightly.

Looking over what Brian's written and the dispatch code, it looks like three
small changes would let us simplify dispatching immensely:

1. Require the ``sender`` argument when connecting to a signal (i.e. don't allow
``Any`` sender any more).

2. Don't do any pattern matching on call signals; assume all listeners conform
to the api ``listener(signal, sender, **kwargs)``.

3. Simplify ``robustApply`` accordingly (i.e. don't consider
positional arguments).

With those changes, the dynamic wrapping Brian's talking about would be much
easier and none of the signal code I've written or seen would break (IIRC).

A slightly more controversial change might be to drop ``robustApply`` entirely
in favor of requiring listeners to *always* accept arbitrary kwargs. Frankly I'd
be fine just telling people to always add ``**kwargs`` to their listeners; it's
a trivial change to existing code.

Thoughts?

Jacob

Tai Lee

Jun 13, 2007, 4:50:37 AM
to Django developers
I use Any sender a fair bit (in a generic version control app), and
I'd not like to see that disappear without considerable performance
penalties if keeping it / considerable performance gains if removing
it.

Brian Harring

Jun 13, 2007, 10:29:12 PM
to django-d...@googlegroups.com
On Wed, Jun 13, 2007 at 09:37:35AM +1000, Malcolm Tredinnick wrote:
> On Tue, 2007-06-12 at 06:16 -0700, Brian Harring wrote:
> > Not yet advocating it (mainly since digging it out would be ugly), but
> > if you take a look at the bits above, having the option to disable
> > verification on read *would* have a nice kick in the pants for ORM
> > object instantiation when the admin has decided the data is guaranteed
> > to be the correct types.
>
> I have a blog post I'm still collecting the data for (making pretty
> graphs, mostly), but it seems that the extreme cases you can actually
> make instantiation much faster by just writing an __init__ method on the
> Model sub-class that does just the right thing -- customised for the
> model fields -- since it's only an attribute populator when you get
> right down to it. One can even generate this code automatically. Again,
> not something to worry about for the 90% case, but for people wanting to
> create 10^6 objects, it's worth the ten minutes to code it up by hand.
> That'll appear on the community aggregator when I post it.

Sounds semi fragile since presumably it will totally bypass the
default Model.__init__ (which later on may grow important instance
initialization bits).

Personally, I'm more interested in speeding up django than bypassing
chunks of it for speed- realize it's not always possible, but that's
where my interests lie (upshot of it is that the common cases get a
bit faster while the extremes become usable). That said, others may
be less stubborn and it may be useful to them :)


> > > Given that upstream pydispatcher isn't really being maintained, I don't
> > > think we should be too hesitant to tweak it for our needs.
> >
> > Don't spose it could just be thrown out? The code really *is* ugly :)
>
> Agree with the second part (although I'd be less hyperbolic, if I saw it
> from a colleague, we'd be having discussions).

Humour, man, humour ;)

'Ugly' isn't hyperbole I'm afraid however. The term's usage isn't meant
to say anything about the author, merely that this code while
working, needs some heavy cleanup, something a bit more
elegant/maintainable.

Internals are pretty obscure, and no one seems to have firm
expectations out of the code- for example, should the order of
connect calls reflect the order of dispatches when a send comes in?
Yet to find an answer to that one, both from django devs/users
expectations, and searching the remains of upstream.

Part of the confusion about the code comes down to the age of the
code- it originally aimed for py2.2 (since then requiring 2.3.3 due to
python bugs). Few examples, sets came around in 2.3, so the order of
dispatch may not be desired to be constant, may just be an
implementation quirk.

Same goes for the misc dicts floating in dispatcher; are they
implemented that way for a reason, instead of using
weakref.Weak*Dictionary? The author of pydispatch as far as I know
also implemented weakref.Weak*Dictionary (they've got it in their vcs
at least)- assuming that's correct, either the author skipped updating
pydispatch for Weak*Dictionary (either due to time/interest, or the
updating being tricky), or there was an undocumented reason it wasn't
used- basically have to assume the latter and try to root about for
an undocumented reason, since weakref collections can be fairly
tricky.
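
For reference, the basic behaviour of a weak-keyed receiver table looks like this. The `Sender` class and `connect` helper are hypothetical; the sketch shows only the happy path. One known wrinkle of the kind alluded to above: a weak reference to a bound method dies almost immediately, because the bound method object is created fresh on each attribute access.

```python
import gc
import weakref

# sender object -> receivers; the entry vanishes when the sender is collected.
receivers_by_sender = weakref.WeakKeyDictionary()

class Sender:
    """Hypothetical sender; any weak-referenceable object would do."""

def connect(receiver, sender):
    receivers_by_sender.setdefault(sender, []).append(receiver)

s = Sender()
connect(print, s)
before = len(receivers_by_sender)
del s
gc.collect()   # CPython frees the object on del; collect() covers other runtimes
after = len(receivers_by_sender)
```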

Either way, I'm still playing locally, but I strongly suspect I'll
wind up chucking the main core of dispatch.dispatcher and folding it
into per signal handling; may not pan out, but at this point it seems
the logical next step for where I'm at patch wise.


> Keeping the API similar
> to what it is (or at least, "routine" to port -- possible to do with a
> reg-exp, say) would be worthwhile, since there is a lot of code in the
> wild using the signal infrastructure.

Don't have any intention of chucking the api at this point; will make
more sense when the patch is posted, but basically I'm shifting the
logic into individual signals instances- making them smarter.

So, for (unimplemented, but soon) simple signals, following example-
instead of

dispatcher.send(signal=signals.request_started)

would be

signals.request_started.send()

Reasoning is two fold- 1) locks down the data that's passed to
receivers, can figure out what's for that signal just via pydoc, 2)
shifts control of actually dispatching into the signal, which can be
made aware at the time of registration of whether or not anything is
actually listening.

Expected byproduct of that shift is that the set of listeners for a
signal be calculated when a listener is added, instead of doing a lookup
each send invocation.

Nice aspect of this is that dispatcher.send api still can exist, just
wind up delegating to the signal if it's new style, if not, fallback
to the existing connections machinery.

Assuming this all goes sanely, it may be possible to flat out drop the
existing internals- basically if an old style signal is used, should
be possible to generate a new style signal that holds the misc
receivers already, and just map the old style to new
(WeakKeyDictionary specifically) on the fly for the dispatcher.* api.
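
A sketch of what such a self-contained signal object might look like, with an old-style entry point delegating to it. The `Signal` class, its method names, and `legacy_send` are guesses at the shape being described, not code from the patch.

```python
class Signal:
    """Hypothetical 'new style' signal that owns its receivers."""

    def __init__(self, name):
        self.name = name
        self._receivers = ()   # rebuilt on connect/disconnect, never on send

    def connect(self, receiver):
        if receiver not in self._receivers:
            self._receivers += (receiver,)

    def disconnect(self, receiver):
        self._receivers = tuple(r for r in self._receivers if r is not receiver)

    def has_listeners(self):
        return bool(self._receivers)

    def send(self, **kwargs):
        return [(r, r(signal=self, **kwargs)) for r in self._receivers]

request_started = Signal("request_started")

def legacy_send(signal, **kwargs):
    """Old-style entry point that delegates to new-style signals."""
    if isinstance(signal, Signal):
        return signal.send(**kwargs)
    raise NotImplementedError("fall back to the existing connection machinery")

hits = []

def on_request(signal, **kwargs):
    hits.append(signal.name)

request_started.connect(on_request)
request_started.send()                 # new spelling
legacy_send(signal=request_started)    # old spelling, same receivers
```

Because the signal knows at connect time whether it has listeners, callers could check `has_listeners()` and skip the send entirely.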

~harring

Jeremy Dunck

Dec 20, 2007, 11:45:24 PM
to django-d...@googlegroups.com
On Jun 13, 2007 8:29 PM, Brian Harring <ferr...@gmail.com> wrote:
...

> Either way, I'm still playing locally, but I strongly suspect I'll
> wind up chucking the main core of dispatch.dispatcher and folding it
> into per signal handling; may not pan out, but at this point it seems
> the logical next step for where I'm at patch wise.
...


Brian,
Did anything come of this? I'm interested in several uses of
signals. At the Dec 1 sprint, Jacob said he'd prefer not to add any
signals until the performance issues are addressed.
I'm willing to work at improving the performance, but David
indicated that you may have something worth supplying back to django
trunk?

-Jeremy

ivan.illarionov

Dec 23, 2007, 7:39:48 AM
to Django developers
As far as I know, the PyDispatcher project has evolved into the Louie project and is
actively maintained by the same author. The code is *not* ugly, is
closer to Django coding style and has a lot of improvements.
http://louie.berlios.de/

Even more, I replaced content of django.dispatch with louie code,
renamed all imports `from louie` to `from django.dispatch` and
everything worked without any errors.

Hope this information will help.

Ivan Illarionov

Simon Willison

Dec 23, 2007, 8:49:02 AM
to django-d...@googlegroups.com

On 23 Dec 2007, at 12:39, ivan.illarionov wrote:

> As I know PyDispatcher project has evolved into Louie project and is
> actively maintained by the same author. The code is *not* ugly, is
> closer to Django coding style and has a lot of improvements.
> http://louie.berlios.de/
>
> Even more, I replaced content of django.dispatch with louie code,
> renamed all imports `from louie` to `from django.dispatch` and
> everything worked without any errors.

Have you benchmarked Louie compared to PyDispatcher at all? From
looking over their documentation they don't seem to have made any
changes that would result in a substantial speed increase.

Cheers,

Simon Willison

ivan.illarionov

Dec 23, 2007, 1:58:17 PM
to Django developers
Just did a simple benchmark and found that Louie is about 150% slower
than the current dispatcher in Django. It looks like the Louie developers
don't care about performance at all...

Brian Harring

Jan 4, 2008, 10:21:02 PM
to django-d...@googlegroups.com, Jeremy Dunck
On Thu, Dec 20, 2007 at 10:45:24PM -0600, Jeremy Dunck wrote:
>
> On Jun 13, 2007 8:29 PM, Brian Harring <ferr...@gmail.com> wrote:
> ...
> > Either way, I'm still playing locally, but I strongly suspect I'll
> > wind up chucking the main core of dispatch.dispatcher and folding it
> > into per signal handling; may not pan out, but at this point it seems
> > the logical next step for where I'm at patch wise.
> ...
>
>
> Brian,

It helps to CC me; I've been crazy busy over the last few months and
am watching this ML less and less, unfortunately.


> Did anything come of this? I'm interested in several uses of
> signals. At the Dec 1 sprint, Jacob said he'd prefer not to add any
> signals until the performance issues are addressed.
> I'm willing to work at improving the performance, but David
> indicated that you may have something worth supplying back to django
> trunk?

I haven't gone any further on my signals work since #4561 is currently
bitrotting. The intention was to shift all of the listener tracking
into individual signal objects; the basic machinery required is
available in the patches posted on that ticket.

Presuming folks liked the approach, the intended next step was to
shift away from using django.dispatch.dispatcher.connect to bind a
callable to a signal, making signal_inst.connect(func) the default,
and to gut the remaining dispatcher internals, shifting them into the
signal objects themselves.
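
For illustration only, here is a minimal sketch of what per-signal
listener tracking could look like. This is not the code from #4561;
the Signal class, its method names, and the sender=None convention for
"any sender" are all assumptions made up for the example.

```python
class Signal:
    """Hypothetical signal object that tracks its own listeners."""

    def __init__(self):
        # Maps sender -> list of receivers. None stands in for "any sender".
        self._receivers = {}

    def connect(self, receiver, sender=None):
        self._receivers.setdefault(sender, []).append(receiver)

    def disconnect(self, receiver, sender=None):
        receivers = self._receivers.get(sender, [])
        if receiver in receivers:
            receivers.remove(receiver)
        if not receivers:
            # Drop empty entries so the fast path in send() stays cheap.
            self._receivers.pop(sender, None)

    def send(self, sender, **named):
        # Fast path: nothing is connected to this signal at all.
        if not self._receivers:
            return []
        targets = list(self._receivers.get(None, []))
        if sender is not None:
            targets += self._receivers.get(sender, [])
        return [(r, r(sender=sender, **named)) for r in targets]


pre_init = Signal()  # a signal instance, connected to directly


def on_init(sender, **named):
    return "handled"


pre_init.connect(on_init, sender=dict)
pre_init.send(sender=dict)   # calls on_init
pre_init.send(sender=list)   # no listener for this sender, nothing is called
```

The point of the design is that a signal with no listeners only pays
for an emptiness check, and listening to one sender does not slow down
sends for another.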

In the process, the robustapply bits would move up from checking every
time the signal fires to checking when the connection is established.
The end result would be faster signaling, and a reduction of a lot of
the nastiness robustapply attempts at the point of signaling.
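
As a rough sketch of that idea (again, not the actual patch), the
argument introspection can be done once in connect() and cached, so
that send() only filters a dict. The accepted_names helper and this
Signal class are hypothetical; inspect.signature is the standard
library call used here in place of robustapply's own introspection.

```python
import inspect


def accepted_names(receiver):
    """Return the keyword names receiver accepts, or None if it takes **kwargs."""
    params = list(inspect.signature(receiver).parameters.values())
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        return None  # accepts everything, no filtering needed
    return frozenset(
        p.name
        for p in params
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    )


class Signal:
    """Hypothetical signal that inspects receivers at connect time."""

    def __init__(self):
        self._receivers = []  # list of (receiver, accepted names or None)

    def connect(self, receiver):
        # Introspect once, here, rather than on every send().
        self._receivers.append((receiver, accepted_names(receiver)))

    def send(self, **named):
        results = []
        for receiver, names in self._receivers:
            if names is None:
                kwargs = named
            else:
                kwargs = {k: v for k, v in named.items() if k in names}
            results.append(receiver(**kwargs))
        return results
```

A receiver declared as `def f(instance)` then only ever sees the
`instance` argument, and the cost of working that out is paid when it
connects, not each time the signal fires.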

Either way, #4561 is the guts of my intentions- resurrecting signals
work starts at resurrecting that patch ;)

If there is interest in getting this committed, I would like to know;
that patch alone was another 25% speedup in model instantiation when
no listener was connected for that model.

One additional optimization point for signals is deciding whether
connect(Any, f1)
connect(Any, f2)

must execute f1, f2 in that explicit order. Internally, the dispatcher
uses some nasty linear lookups that could be gutted if the f1/f2
ordering can vary. I never got any real feedback on that one, so I
have left it alone.
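
To make the trade-off concrete, here is a toy comparison. It does not
reproduce the dispatcher's real data structures; both container
classes are hypothetical. Keeping registration order means a list and
linear scans on connect/disconnect, while dropping the ordering
guarantee allows a set with average constant-time operations.

```python
class OrderedReceivers:
    """Preserves connect order; membership checks and removal scan the list."""

    def __init__(self):
        self._items = []

    def connect(self, receiver):
        if receiver not in self._items:      # linear scan
            self._items.append(receiver)

    def disconnect(self, receiver):
        if receiver in self._items:          # linear scan
            self._items.remove(receiver)     # another linear scan

    def __iter__(self):
        return iter(self._items)


class UnorderedReceivers:
    """No ordering guarantee; add and discard are constant time on average."""

    def __init__(self):
        self._items = set()

    def connect(self, receiver):
        self._items.add(receiver)

    def disconnect(self, receiver):
        self._items.discard(receiver)

    def __iter__(self):
        return iter(self._items)
```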
~brian

Marty Alchin

Jan 4, 2008, 11:00:27 PM
to django-d...@googlegroups.com
On 1/4/08, Brian Harring <ferr...@gmail.com> wrote:
> One additional optimization point for signals is deciding whether
> connect(Any, f1)
> connect(Any, f2)
>
> Must execute f1, f2 in that explicit order- internally, dispatcher
> uses some nasty linear lookups that could be gutted if f1/f2 ordering
> can vary. Never got any real feedback on that one, so have left it
> alone.

My opinion on this doesn't come from any particular expertise in this
area, but I would vote for the order of execution being undefined. The
point (as I understand it) of using signals is to encourage loose
coupling. Coding a set of listeners in such a way that their order of
execution is important seems like an example of tight coupling. Not
tightly coupled with Django itself, exactly, but the listeners would
be coupled to each other, by relying on an internal implementation
detail of something that shouldn't have anything to do with that
coupling.

I say, if someone needs to care about the order in which two things
are processed, they should just register one listener and call the
other when the one fires.
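
Something like the following, as a sketch. The listener names are
invented, and the plain set below is only a stand-in for whatever
registry the dispatcher keeps; it is not Django's API.

```python
calls = []


def validate(sender, **named):
    calls.append("validate")


def audit(sender, **named):
    calls.append("audit")


def validate_then_audit(sender, **named):
    # The ordering lives here, in ordinary code, not in the dispatcher.
    validate(sender, **named)
    audit(sender, **named)


# Stand-in registry with no ordering guarantee at all.
receivers = set()
receivers.add(validate_then_audit)

for receiver in receivers:
    receiver(sender=object)
```

Only validate_then_audit is registered, so the dispatcher is free to
call listeners in any order and validate still runs before audit.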

-Gul

Marty Alchin

Jan 4, 2008, 11:06:14 PM
to django-d...@googlegroups.com
A slight amendment. I like your idea of moving registration and
triggering into methods of the signal object itself, though that does
then require that signals support a particular API, which currently
isn't a requirement. Providing a base "Signal" class would, of course,
help with this, and I like the potential of this, particularly with
regard to the order-of-execution issue.

The base Signal implementation could implement execution in an
undefined order, for speed, and if a particular signal would benefit
from having its listeners triggered in order, its trigger() (or
execute() or whatever) method could be overridden to provide the
necessary functionality. Or, depending on how the signal code works,
it could override connect() instead, so that they're registered in the
proper order. Either way, it's possible to implement on a per-signal
basis, once the signals themselves are smarter.
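
A rough sketch of what I mean, with made-up class and method names
(none of this exists in django.dispatch today): the base class makes
no promise about order, and a subclass overrides connect() to keep
registration order.

```python
class Signal:
    """Hypothetical base class: listeners fire in an undefined order."""

    def __init__(self):
        self._receivers = set()

    def connect(self, receiver):
        self._receivers.add(receiver)

    def send(self, sender, **named):
        return [r(sender=sender, **named) for r in self._receivers]


class OrderedSignal(Signal):
    """Hypothetical subclass: listeners fire in the order they connected."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        # Pay for the linear check only on signals that need ordering.
        if receiver not in self._receivers:
            self._receivers.append(receiver)


def f1(sender, **named):
    return "f1"


def f2(sender, **named):
    return "f2"


ordered = OrderedSignal()
ordered.connect(f1)
ordered.connect(f2)
results = ordered.send(sender=None)  # f1 is called before f2
```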

-Gul
