During the last few weeks I've spent time optimizing code, among other
things. A surprising number of random things I look at in Sage are
too slow by a factor of 10-100,
usually due to programming style. One way to help to remedy that
would be to write a book or article about how to program Cython +
Python *efficiently*, and avoid the numerous traps. Basically, one
false move in Python when implementing mathematics algorithms can
knock you down by a factor of 10-100 in speed. As an example of
this, try multiplying polynomials in ZZ['x'] in Sage right now:
sage: R.<x> = ZZ[]
sage: timeit('x*x',number=10^6)
1000000 loops, best of 3: 1.96 µs per loop
FLINT could do that multiply in a few hundred nanoseconds, so why does
it take 2 microseconds (2000 nanoseconds)? Because the code is
complicated and general, but was written without any thought about
efficiency. By making a few changes (really making the code less
clever), I get:
sage: timeit('x*x', number=10^6)
1000000 loops, best of 3: 687 ns per loop
The categories implementation as it is in Sage must be changed or
removed. There, I got your attention.
A few months ago some code was added to Sage by the combinat folks
called "categories". I recently learned ObjectiveC, and found out
that this idea of "categories" goes back to smalltalk at least, and
was fully implemented in ObjectiveC in the mid 1980s. It has little
to do with category theory in the sense of mathematics, though is
related. Basically, it is a way to attach a bunch of generic methods
to a class without using inheritance. So far so good.
The combinat folks implemented this programming idea in Sage not by
changing the Python languge, but by defining a custom __getattr__
method at sticking it in the base classes for Sage parents and
elements. The problem: this means that every single method lookup on
every single Sage object seem to now have about 1500nanoseconds added
to it in completely unnecessary overhead. Most of what Sage does is
call functions, so this is a huge, huge problem. Try it out for
yourself:
(1) Start up Sage and try this:
flat:~ wstein$ sage-4.4.4
----------------------------------------------------------------------
| Sage Version 4.4.4, Release Date: 2010-06-23 |
| Type notebook() for the GUI, and license() for information. |
----------------------------------------------------------------------
sage: QQ.testit = 'foobar'
sage: timeit('QQ.testit',number=10^6)
1000000 loops, best of 3: 2.43 µs per loop
sage: timeit('QQ.testit',number=10^6)
1000000 loops, best of 3: 2.53 µs per loop
This is on my OS X laptop, so your timings may be different.
(2) Now go into SAGE_ROOT/devel/sage/sage/structure/parent.pyx
and change the line
def __getattr__(self, name):
to
def x__getattr__(self, name):
then restart sage with "sage -br"
----------------------------------------------------------------------
| Sage Version 4.4.4, Release Date: 2010-06-23 |
| Type notebook() for the GUI, and license() for information. |
----------------------------------------------------------------------
sage: QQ.testit = 'foobar'
sage: timeit('QQ.testit',number=10^6)
1000000 loops, best of 3: 1.53 µs per loop
sage: timeit('QQ.testit',number=10^6)
1000000 loops, best of 3: 1.54 µs per loop
----
This is a major problem. The best solution I can think of at present
is to remove these methods from the top level of Sage, then add
objects like Parent_cat and Element_cat, and put them there. Then
change all the combinat objects to derive from those classes. I'm not
sure this would work though.
Anyway, a slowdown of more than 1 microsecond on everything throughout
sage is absolutely unacceptable.
So, big question -- am I just confused? It seems impossible that
nobody noticed this before, or that people made a conscious decision
to let this code in if it slows down Sage this much. (I.e., my guess
is that nobody even bothered to benchmark the code when refereeing
it.)
Robert B: since you were the one who decided to let this into Sage,
and since I know you know Python insanely well, am I just missing
something here? It seems to me like this one patch is a huge drain on
Sage's performance... So much so that I'm personally tempted to
just remove it for my own work. But maybe I'm just missing something
obvious.
--William
--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org
IMHO, we need a benchmark to benchmark a large part of Sage, then
compare that from release to release and see if there are any
significant changes.
Dave
We should do this as part of the tests, collect timing data on each
test block (and perhaps even each line?). There have been vague plans
to do this for a while, and I thought about waiting for it before
getting this in, but no one's produced code.
- Robert
I think Craig may actually have some code for this.
--Mike
+1 to the idea of a timeit decorator, which would be useful for
microbenchmarking. I think timing whole blocks would be useful as well
to make sure there's no macro regressions. Though any one datapoint
isn't as useful, if everyone was consistently getting a slowdown for a
given doctest, that would be an indicator that something is going on.
- Robert
In additional to all the above being good ideas, there is something
else I'm going
to start doing, which is to add a new function to sage called "timeit.test".
sage: timeit.test('some expression', 7, '2+2', ... arguments as to timeit )
True
It's a function that will return True only if the result of timing
evaluation of the expression is at most 7 times the time to evaluate
the third expression (in this case "2+2").
For example, this could be used as follows:
sage: R.<x> = ZZ[]
sage: timeit.test('x*x', 10, '2*2')
True
to ensure that somebody doesn't screw up polynomial arithmetic so that
it suddenly takes much longer than it used to.
I'm not 100% sure that this test suite will be in Sage, but there's no
question I'm going to make it. I can always run it myself outside of
Sage if I so desire.
-- William
Great!
> A surprising number of random things I look at in Sage are too slow
> by a factor of 10-100, usually due to programming style. One way to
> help to remedy that would be to write a book or article about how to
> program Cython + Python *efficiently*, and avoid the numerous traps.
> Basically, one false move in Python when implementing mathematics
> algorithms can knock you down by a factor of 10-100 in speed.
Definitely.
> As an example of this, try multiplying polynomials in ZZ['x'] in
> Sage right now:
>
> sage: R.<x> = ZZ[]
> sage: timeit('x*x',number=10^6)
> 1000000 loops, best of 3: 1.96 �s per loop
>
> FLINT could do that multiply in a few hundred nanoseconds, so why does
> it take 2 microseconds (2000 nanoseconds)? Because the code is
> complicated and general, but was written without any thought about
> efficiency. By making a few changes (really making the code less
> clever), I get:
>
> sage: timeit('x*x', number=10^6)
> 1000000 loops, best of 3: 687 ns per loop
>
> The categories implementation as it is in Sage must be changed or
> removed. There, I got your attention.
Let's see about this.
That's before any category code:
zephyr> /opt/sage-4.2.1/sage
sage: R.<x> = ZZ[]
sage: timeit('x*x',number=10^6)
1000000 loops, best of 3: 1.14 �s per loop
That's after the integration of the main category patches:
zephyr-~monoid>/opt/sage-4.3/sage -br main
sage: R.<x> = ZZ[]
sage: timeit('x*x',number=10^6)
1000000 loops, best of 3: 1.12 �s per loop
That's after #7921: Categories for extension types via __getattr___:
zephyr-~monoid>/opt/sage-4.3.2/sage -br main
sage: R.<x> = ZZ[]
sage: timeit('x*x',number=10^6)
1000000 loops, best of 3: 1.13 �s per loop
That's now:
zephyr-~monoid>/opt/sage-4.4.4/sage -br main
sage: R.<x> = ZZ[]
sage: timeit('x*x',number=10^6)
1000000 loops, best of 3: 1.19 �s per loop
Please don't blame this one on categories. The overhead is in the
coercion code which was written by our favorite Cython expert.
Now, should the coercion code be changed or removed because it is too
generic and horribly slow? First note that we are speaking about a
speed factor of 2 or 3, not 10 or 100. And this is for very low level
arithmetic; the overhead would be negligible on anything more
complicated (if not just larger polynomials). So the right thing to do
is simply to override the generic code by optimized code for those
element classes which have very cheap arithmetic. That's what has
been done for ZZ, ... and as far as I know everybody is happy about
this approach.
> A few months ago some code was added to Sage by the combinat folks
> called "categories". I recently learned ObjectiveC, and found out
> that this idea of "categories" goes back to smalltalk at least, and
> was fully implemented in ObjectiveC in the mid 1980s. It has little
> to do with category theory in the sense of mathematics, though is
> related. Basically, it is a way to attach a bunch of generic
> methods to a class without using inheritance.
That's a very reductive view. And actually quite inaccurate: most of
Sage's parents and elements *do* get their generic methods from the
categories, through class inheritance. There is indeed one exception:
extension types (i.e. Cython classes), which currently use the getattr
hack (because Cython does not support yet any form of multiple
inheritance).
Anyway that's just an implementation detail. Of course, categories in
Sage are strongly rooted in object oriented ideas which date back from
smalltalk. But what the categories really bring to Sage is about
interactions between parents, elements, and morphisms, as well as
generic tests, functorial constructions, etc, all of which have much
to do with mathematics.
> The combinat folks implemented this programming idea in Sage not by
> changing the Python languge, but by defining a custom __getattr__
> method at sticking it in the base classes for Sage parents and
> elements.
Two notes:
- About the getattr hack, any blame should go to me personally and
the reviewer (Robert), not to the "combinat folks".
- I never liked this hack. Actually the original plan for the
implementation of categories before I jumped in was to
systematically use this hack. And I had to argue a lot to advance
instead the dynamic class approach. I would *love* to be able to
get rid of this hack for extension classes as well.
> The problem: this means that every single method lookup on every
> single Sage object seem to now have about 1500nanoseconds added to
> it in completely unnecessary overhead. Most of what Sage does is
> call functions, so this is a huge, huge problem.
Wow, I confirm this:
Without:
sage: sage: sage: sage: QQ.testit = 'foobar'
timeit('QQ.testit',number=10^6)
timeit('QQ._list',number=10^6)
timeit('QQ.gens',number=10^6)
timeit('QQ.an_element',number=10^6)
1000000 loops, best of 3: 1.7 �s per loop
1000000 loops, best of 3: 422 ns per loop
1000000 loops, best of 3: 236 ns per loop
1000000 loops, best of 3: 586 ns per loop
With:
sage: QQ.testit = 'foobar'
sage: timeit('QQ.testit',number=10^6)
sage: timeit('QQ._list',number=10^6)
sage: timeit('QQ.gens',number=10^6)
sage: timeit('QQ.an_element',number=10^6)
1000000 loops, best of 3: 2.68 �s per loop
1000000 loops, best of 3: 1.45 �s per loop
1000000 loops, best of 3: 1.2 �s per loop
1000000 loops, best of 3: 1.56 �s per loop
Which is totally unacceptable indeed.
Ironically, extension types are far less affected, if at all:
Without:
sage: timeit('ZZ._list',number=10^6)
sage: timeit('ZZ.gens',number=10^6)
sage: timeit('ZZ.an_element',number=10^6)
1000000 loops, best of 3: 312 ns per loop
1000000 loops, best of 3: 170 ns per loop
1000000 loops, best of 3: 391 ns per loop
With:
sage: timeit('ZZ._list',number=10^6)
sage: timeit('ZZ.gens',number=10^6)
sage: timeit('ZZ.an_element',number=10^6)
1000000 loops, best of 3: 318 ns per loop
1000000 loops, best of 3: 180 ns per loop
1000000 loops, best of 3: 391 ns per loop
We had done quite some benchmarking with Robert (see e.g. the IRC log
around mid-January). But somehow we let this slip through: Python's
doc for __getattr__ states:
�Note that if the attribute is found through the normal mechanism,
__getattr__() is not called.�
We checked that indeed __getattr__ was only called when the attribute
is not found. And we ran benchmark to make sure that there was no
overhead for extension types. Our mistake was to assume that there
would be no overhead either for Python classes. Ouch! Mea culpa.
> This is a major problem. The best solution I can think of at
> present is to remove these methods from the top level of Sage, then
> add objects like Parent_cat and Element_cat, and put them there.
> Then change all the combinat objects to derive from those classes.
> I'm not sure this would work though.
I see three options:
1 Simply revert #7921. Most of the combinat code is plain Python, so
won't be affected. What will break are access to generic methods
and generic TestSuite's for extension types. This will break some
of the recently added tests, but hopefully not too many
(e.g. TestSuite calls, or Rob's tests for multiplication tables,
things around permutation groups, ...)
2 Since only extension types need the getattr hack, make sure that
__getattr__ is only defined for those. I would love it. Anyone a
good idea on how to achieve this? The problem is that all parents
and elements classes derive from Parent and Element, which are the
natural places to put the getattr hack.
3 Implement partial multiple inheritance in Cython
We discussed Option 3 quite some with Robert during the categories
brainstorm in Nancy (October 2008). As far as I remember from the
discussions, full multiple inheritance (including merging of data
structure and method lookup table) was impossible; on the other hand,
there was no technical obstructions for allowing a Cython class to
inherit from multiple purely abstract classes besides the main base
class branch, if one accepts the price of a virtual lookup for the
methods defined there. Essentially, this would boil down to
implementing a variant of the getattr trick, but within the Cython
compiler, and with full control so as to make sure that standard
attribute and method lookups would not be affected.
Robert: can you confirm? Are you still thinking about it? If yes, do
you have an idea of a possible timeline?
Best,
Nicolas
PS: currently, the "combinat folks" are focusing on features rather
than pure speed. We can screw up here and there, but we are taking
great care in the design to make sure to leave full room for later
optimizations. In particular, a lot of the "generic approach", is
designed to concentrate all the time critical code in very few spots,
which will be easy to locate and optimize when someone will have time
for it (e.g. CombinatorialFreeModule).
--
Nicolas M. Thi�ry "Isil" <nth...@users.sf.net>
http://Nicolas.Thiery.name/
I would love that too. I think it should be in the Cython documentation.
Ondrej
<grumpy o' pa>
There is a strong technical issue that needs to be resolved, and for
which I explicitly take the blame. And I understand your frustration
when fighting with zillions of technical points. However I perceived
your e-mail as unfairly bashing the sage-combinat community, which
is just putting me off. Now, I should just step back and answer in a
couple hours, but on the other hand this discussion needs to get
going on the technical side.
Deep breath.
</grumpy o' pa>
On Mon, Jul 26, 2010 at 11:53:59AM -0700, William Stein wrote:
> I'm talking about the __getattr__ that you added. Simply commenting
> out the __getattr__ function leads to a significant speedup. This
> has been replicated by several other people, and this thread
> contains a deep investigation into why this is happening (though
> things still aren't clear).
Then, do you have an explanation for why this does not show up above?
Can you reproduce my timings?
> >> A few months ago some code was added to Sage by the combinat folks
> >> called "categories". �I recently learned ObjectiveC, and found out
> >> that this idea of "categories" goes back to smalltalk at least, and
> >> was fully implemented in ObjectiveC in the mid 1980s. �It has little
> >> to do with category theory in the sense of mathematics, though is
> >> related. �Basically, it is a way to attach a bunch of generic
> >> methods to a class without using inheritance.
> >
> > That's a very reductive view. And actually quite inaccurate: most of
> > Sage's parents and elements *do* get their generic methods from the
> > categories, through class inheritance. �There is indeed one exception:
> > extension types (i.e. Cython classes), which currently use the getattr
> > hack (because Cython does not support yet any form of multiple
> > inheritance).
>
> Do you know anything about Objective C or smalltalk or "categories" in
> those languages?
I haven't played with them directly. But from looking around this very
much looks like what "extends" do in other languages I played with
(Aldor, ...). And which I would love to have in Python as well (I am
actually planning to make a crude implementation of it, for plain
Python classes, for use with my external experimental code; it should
just be a couple lines of code).
So yes, the getattr hack is some sort of dynamic implementation of
categories (in smalltalk sense), which is a building block for the
implementation of "real" categories (in the Sage sense) for extension
types.
> I don't really care about blame; I just want to find a way to
> resolve this problem. I've slowed down/messed up etc., so many
> things in Sage over the years, it isn't even funny.
<grumpy o' pa>
I am glad to hear that, because that did not show up in your e-mail.
</grumpy o' pa>
> I think the impact depends a lot on how many python def'd methods the
> class has. E.g., I was testing with the huge number field classes,
> and the slowdown is much greater than with testing with a naked class
> that just derives from Element. I did not notice any real difference
> between otherwise identical Cython and Python classes.
I just had a look at the Python sources (in Object/typeobject.c), and
it's all about the difference between slot_tp_getattro and
slot_tp_getattr_hook, the later being a bit more complicated than the
former. I did not see obvious reasons for having it slower for large
classes, except for the lookup of the __getattr__ method. From
glancing at it, it sounds like this could be optimized, and in
particular that the getattr lookup could be delayed until it is really
used (if at all). This does not tell anything about Cython I guess.
Also it sounds like __getattr__ could be set to None if it is not
really needed, and then on the next call getattro_hook would switch
tp_getattro to the fast simple implementation slot_tp_getattroback. At
least this could solve the issue for Python classes. I'll give it a
shot.
> > �2 Since only extension types need the getattr hack, make sure that
> > � __getattr__ is only defined for those. I would love it. Anyone a
> > � good idea on how to achieve this? The problem is that all parents
> > � and elements classes derive from Parent and Element, which are the
> > � natural places to put the getattr hack.
>
> I'm dubious that this is not an issue for Cython classes.
Could someone lookup the sources to know this for sure?
> > �3 Implement partial multiple inheritance in Cython
> >
> > We discussed Option 3 quite some with Robert during the categories
> > brainstorm in Nancy (October 2008). As far as I remember from the
> > discussions, full multiple inheritance (including merging of data
> > structure and method lookup table) was impossible; on the other hand,
> > there was no technical obstructions for allowing a Cython class to
> > inherit from multiple purely abstract classes besides the main base
> > class branch, if one accepts the price of a virtual lookup for the
> > methods defined there. Essentially, this would boil down to
> > implementing a variant of the getattr trick, but within the Cython
> > compiler, and with full control so as to make sure that standard
> > attribute and method lookups would not be affected.
> That sounds like implementing what is called "categories" in
> ObjectiveC. ObjectiveC is an extension to C that was done in the
> late 1980s to add features like one has in Smalltalk. It's built on
> C structs, like Cython is, and does *not* have multiple inheritance.
> Categories make up for this.
Yeah, with the extra thing that this would have to be done
dynamically, and for parents possibly on a parent by parent basis,
since in many cases one does not know statically the category.
Cheers,
Nicolas
<humble servant>
I'm very sorry to have unfairly bashed the sage-combinat community, and I'm
glad you've jumped to their defense to clarify matters. I was exhausted after
a very long, sleepless coding sprint (days and days) trying to speed things up,
and very surprised to discover this source of a speed issue. My email was
certainly very unfairly rewritten. I hope you'll accept my apology.
</humble servant>
That makes a lot of sense.
> From
> glancing at it, it sounds like this could be optimized, and in
> particular that the getattr lookup could be delayed until it is really
> used (if at all). This does not tell anything about Cython I guess.
>
> Also it sounds like __getattr__ could be set to None if it is not
> really needed, and then on the next call getattro_hook would switch
> tp_getattro to the fast simple implementation slot_tp_getattroback. At
> least this could solve the issue for Python classes. I'll give it a
> shot.
Thanks!!
>
>> > 2 Since only extension types need the getattr hack, make sure that
>> > __getattr__ is only defined for those. I would love it. Anyone a
>> > good idea on how to achieve this? The problem is that all parents
>> > and elements classes derive from Parent and Element, which are the
>> > natural places to put the getattr hack.
>>
>> I'm dubious that this is not an issue for Cython classes.
>
> Could someone lookup the sources to know this for sure?
Maybe I'm just wrong.
>
>> > 3 Implement partial multiple inheritance in Cython
>> >
>> > We discussed Option 3 quite some with Robert during the categories
>> > brainstorm in Nancy (October 2008). As far as I remember from the
>> > discussions, full multiple inheritance (including merging of data
>> > structure and method lookup table) was impossible; on the other hand,
>> > there was no technical obstructions for allowing a Cython class to
>> > inherit from multiple purely abstract classes besides the main base
>> > class branch, if one accepts the price of a virtual lookup for the
>> > methods defined there. Essentially, this would boil down to
>> > implementing a variant of the getattr trick, but within the Cython
>> > compiler, and with full control so as to make sure that standard
>> > attribute and method lookups would not be affected.
>
>> That sounds like implementing what is called "categories" in
>> ObjectiveC. ObjectiveC is an extension to C that was done in the
>> late 1980s to add features like one has in Smalltalk. It's built on
>> C structs, like Cython is, and does *not* have multiple inheritance.
>> Categories make up for this.
>
> Yeah, with the extra thing that this would have to be done
> dynamically, and for parents possibly on a parent by parent basis,
> since in many cases one does not know statically the category.
>
> Cheers,
> Nicolas
> --
> Nicolas M. Thiéry "Isil" <nth...@users.sf.net>
> http://Nicolas.Thiery.name/
>
> --
> You received this message because you are subscribed to the Google Groups "sage-combinat-devel" group.
> To post to this group, send email to sage-comb...@googlegroups.com.
> To unsubscribe from this group, send email to sage-combinat-d...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sage-combinat-devel?hl=en.
>
>
--
William Stein
I gladly do :-)
Have a good night of sleep!
Carl, Robert, thanks so much for investigating this!
Wow a general speed improvement over all non cdef classes will be good :-)
I am looking forward timing results and trying it out!
Cheers,
Nicolas
Yes, indeed. This is absolutely fantastic work you guys did!
>
> Wow a general speed improvement over all non cdef classes will be good :-)
> I am looking forward timing results and trying it out!
Yes, indeed. And by the way my Cython benchmark was for a non-cdef'd
but still Cython class. So everything is consistent.
Thanks again Carl, etc.!
William
>
> Cheers,
> Nicolas
> --
> Nicolas M. Thiéry "Isil" <nth...@users.sf.net>
> http://Nicolas.Thiery.name/
>
> --
> You received this message because you are subscribed to the Google Groups "sage-combinat-devel" group.
> To post to this group, send email to sage-comb...@googlegroups.com.
> To unsubscribe from this group, send email to sage-combinat-d...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sage-combinat-devel?hl=en.
>
>
--
William Stein