Is it OK to replace certain cached methods by lazy attributes?

43 views
Skip to first unread message

Simon King

unread,
Oct 25, 2011, 4:30:20 PM10/25/11
to sage-devel
Hi!

I had first asked the following question on sage-combinat-devel, but
there was no reply, and so sorry for bothering you here...

It concerns two cached methods of categories, namely
C.all_super_categories() and C.super_categories(proper=False).

The former does not take any argument (except self); thus, one could
easily replace it by a lazy attribute. The latter takes an argument
which can be true or false; thus, one could easily replace it by two
lazy attributes, say, C.super_categories and
C.super_categories_proper.

The question is: Shall we do such replacements?

I vote "yes!", because a lazy attribute will always be faster than a
cached method. Cached methods will hopefully soon be improved by
#11115, but calling a cached method will still have an overhead
compared with a simple attribute lookup.

However, I have heard that methods are considered more pythonic than
attributes. That's why I ask what you think about that topic. My
counter-argument is: IMHO, currently both C.super_categories() and
C.all_super_categories() are mainly for internal use - and I think
that it is perfectly pythonic to internally work with attributes.

The motivation of my question is #11943, where I accomplish a mission
formulated by Nicolas: C.all_super_categories() should be ordered in
the same way as C.parent_class.mro(). It turns out that #11943 would
yield a mild speed regression when keeping super_categories and
all_super_categories as (cached) methods, but yields a speed-up when
replacing them by lazy attributes.

Best regards,
Simon

Jason Grout

unread,
Oct 25, 2011, 4:48:00 PM10/25/11
to sage-...@googlegroups.com
On 10/25/11 3:30 PM, Simon King wrote:
> However, I have heard that methods are considered more pythonic than
> attributes.

I think attributes are considered more pythonic for many things than
methods. We insist on methods for many things in Sage because you can
document methods (but it's hard and not intrinsically supported by
python to document and introspect attribute documentation).

This is from the PEP 8 style guide: http://www.python.org/dev/peps/pep-0008/

- For simple public data attributes, it is best to expose just the
attribute name, without complicated accessor/mutator methods.
Keep in
mind that Python provides an easy path to future enhancement,
should
you find that a simple data attribute needs to grow functional
behavior. In that case, use properties to hide functional
implementation behind simple data attribute access syntax.

Note 1: Properties only work on new-style classes.

Note 2: Try to keep the functional behavior side-effect free,
although
side-effects such as caching are generally fine.

Note 3: Avoid using properties for computationally expensive
operations; the attribute notation makes the caller believe
that access is (relatively) cheap.


Thanks,

Jason

Simon King

unread,
Oct 26, 2011, 4:52:29 AM10/26/11
to sage-devel
Hi Jason,

On 25 Okt., 22:48, Jason Grout <jason-s...@creativetrax.com> wrote:
> I think attributes are considered more pythonic for many things than
> methods.

That's good to know! So, I assume that it is OK that I suggest to
replace C.all_super_categories() and C.super_categories() by lazy
attributes. By the way, #11943 needs review :)

>  We insist on methods for many things in Sage because you can
> document methods (but it's hard and not intrinsically supported by
> python to document and introspect attribute documentation).

I wonder: Could (and should) that be done using the preparser?

That's to say: Assume that an input line ends with ? or ?? and is
preceded by something like "Foo.Bar.bla". Would it be feasible that
the preparser tests whether "Foo.Bar.__class__" has an attribute
"bla"? If it has, the preparser could replace "Foo.Bar.bla?" by
"Foo.Bar.__class__.bla?".

I am not suggesting that such replacement should occur when function
calls are involved: "Foo.Bar().bla?" should remain unchanged
(currently, "Foo.Bar().bla?" results in an error anyway). This is to
avoid costly function calls.

The effect would be: If bla is a lazy attribute of Foo.Bar, then
Foo.Bar.bla? would be the same as Foo.Bar.__class__.bla?, and would
thus show the documentation of the lazy attribute, and not the
documentation of the result of calling the lazy attribute.

> This is from the PEP 8 style guide:http://www.python.org/dev/peps/pep-0008/
>
> ...
>          Note 3: Avoid using properties for computationally expensive
>          operations; the attribute notation makes the caller believe
>          that access is (relatively) cheap.

That's the case for super_categories and all_super_categories: They
are cheap, the result is cached anyway, so, why not turn them into a
"property" (is that essentially the same as a lazy attribute?)?

Best regards,
Simon

Maarten Derickx

unread,
Oct 28, 2011, 9:32:16 AM10/28/11
to sage-...@googlegroups.com



I think attributes are considered more pythonic for many things than
methods.  We insist on methods for many things in Sage because you can
document methods (but it's hard and not intrinsically supported by
python to document and introspect attribute documentation).

Well, it is supported if you make the attribute a property. Since most of those things are already functions it would be very easy

sage: class test(object):
....:     @property
....:     def documented(self):
....:         """some doc"""
....:         pass
....:     
sage: l=test()
sage: l.documented?




Simon King

unread,
Oct 28, 2011, 10:28:19 AM10/28/11
to sage-devel
Hi Maarten,

On 28 Okt., 15:32, Maarten Derickx <m.derickx.stud...@gmail.com>
wrote:
> Well, it is supported if you make the attribute a property. Since most of
> those things are already functions it would be very easy
>
> ...
>
> http://docs.python.org/library/functions.html#property

Thank you, I didn't know "properties" before. But I think it is not
the right thing:

sage: class Bar(object):
....: @property
....: def property_test(self):
....: print "calling test"
....: return 5
....: @lazy_attribute
....: def lazy_test(self):
....: print "calling test"
....: return 7
....:
sage: b = Bar()
sage: b.property_test
calling test
5
sage: b.property_test
calling test
5
sage: b.lazy_test
calling test
7
sage: b.lazy_test
7

Hence,
* A property would be called over and over again and would do the
same computation over and over again.
* A cached method would be called over and over again, but would do
the computation only once.
* A lazy attribute would be called only once and thus also do the
computation only once.

My suggestion comes from the observation that one can save a
considerable amount of time by changing super_categories and
all_super_categories from cached_method into lazy_attribute. It seems
that a change from cached_method into property would have the opposite
effect.

By the way, I tried to modify the preparser so that
"Foo.some_lazy_attribute?" would show the documentation of
"Foo.__class__.some_lazy_attribute", but that didn't work well. So,
the question remains whether the speed-up makes up for the more
difficult access to the documentation: Is it acceptable that the users
need to do C.__class__.super_categories? in order to learn what
C.super_categories does?

Comments on #11943 (or a review:) are welcome!

Cheers,
Simon

Jason Grout

unread,
Oct 28, 2011, 1:27:14 PM10/28/11
to sage-...@googlegroups.com
On 10/28/11 9:28 AM, Simon King wrote:
> Hi Maarten,
>
> On 28 Okt., 15:32, Maarten Derickx<m.derickx.stud...@gmail.com>
> wrote:
>> Well, it is supported if you make the attribute a property. Since most of
>> those things are already functions it would be very easy
>>
>> ...
>>
>> http://docs.python.org/library/functions.html#property
>
> Thank you, I didn't know "properties" before. But I think it is not
> the right thing:
>
> sage: class Bar(object):
> .....: @property
> .....: def property_test(self):
> .....: print "calling test"
> .....: return 5
> .....: @lazy_attribute
> .....: def lazy_test(self):
> .....: print "calling test"
> .....: return 7
> .....:

> sage: b = Bar()
> sage: b.property_test
> calling test
> 5
> sage: b.property_test
> calling test
> 5
> sage: b.lazy_test
> calling test
> 7
> sage: b.lazy_test
> 7
>
> Hence,
> * A property would be called over and over again and would do the
> same computation over and over again.
> * A cached method would be called over and over again, but would do
> the computation only once.
> * A lazy attribute would be called only once and thus also do the
> computation only once.
>

"Aha!" I thought, "you should be using *cached* properties." But then I
tried it:

sage: class Bar(object):
....: @property

....: @cached_method
....: def property_test(self):
....: print "calling test property"


....: return 5
....: @lazy_attribute
....: def lazy_test(self):

....: print "calling lazy attribute"


....: return 7
....:
sage: b=Bar()
sage: b.property_test

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)

[snip]<ipython console> in <module>()

TypeError: 'CachedMethod' object is not callable


I'm not exactly sure why I got this error. Would it be easy for someone
to enlighten me?

Also, it seems that the brilliant people that implemented lazy_attribute
thought deeply about it compared with properties; at least, there is
some discussion of properties in the docstring for lazy_attribute. It
would be great if someone who has thought more deeply about the relative
advantages of each to chime in. My guess is that there are some
inheritance issues where the two approaches differ.

Thanks,

Jason

Simon King

unread,
Oct 28, 2011, 2:25:11 PM10/28/11
to sage-devel
Hi Jason,

On 28 Okt., 19:27, Jason Grout <jason-s...@creativetrax.com> wrote:
> >   * A property would be called over and over again and would do the
> > same computation over and over again.
> >   * A cached method would be called over and over again, but would do
> > the computation only once.
> >   * A lazy attribute would be called only once and thus also do the
> > computation only once.
>
> "Aha!" I thought, "you should be using *cached* properties."

No. Even if it would work, I am sure that it would be slower than a
lazy attribute.

If I understand correctly, using the @property decorator means that
some function "property" is applied to an unbound method
property_test. Let the result be P=property(property_test).

The object P has a __get__ method. Whenever you do b.property_test in
your example, then P.__get__(b) is called, which then calls the
original method. Even if the original method was a cached method, the
overhead from calling P.__get__(b) is too much, for the application I
have in mind.

Compare with the @lazy_attribute decorator: lazy_attribute(lazy_test)
is some object L. Again, L has a __get__ method. However, L.__get__(b)
overwrites the lazy attribute by a *usual* attribute. So, calling
b.lazy_test a second time, it really is nothing more than an attribute
lookup - no __get__ involved anymore!

> TypeError: 'CachedMethod' object is not callable
>
> I'm not exactly sure why I got this error.  Would it be easy for someone
> to enlighten me?

Yes. The @cached_method decorator applied to an unbound method results
in a CachedMethod object. The CachedMethod object has a __get__
method, but no __call__ method.

That __get__ method of a cached_method does not return a CachedMethod
object, but a CachedMethodCaller object. The CachedMethodCaller has a
__call__ method.

Here is an example:
sage: P.<x,y> = QQ[]
sage: I = P*[x,y]
sage: type(I.__class__.groebner_basis)
<class 'sage.misc.cachefunc.CachedMethodCaller'>
sage: type(I.groebner_basis)
<class 'sage.misc.cachefunc.CachedMethodCaller'>

However, if you prepend the @cached_method decorator by an @property
decorator, then @property you essentially have
property(cached_method(some_unbound_method)). Thus, the function
"property" receives a CachedMethod object, not a CachedMethodCaller
instance. It is not callable, thus the error.

Cheers,
Simon

Simon King

unread,
Oct 28, 2011, 3:33:47 PM10/28/11
to sage-devel
Hi!

Just to show you what speed differences we are talking about, consider
a situation were we see the overhead in its purest form:

sage: class Bar(object):
....: def pure_method(self):
....: return 1
....: @property
....: def test_property(self):
....: return 2
....: @lazy_attribute
....: def test_lazy(self):
....: return 3
....: @cached_method
....: def test_cache(self):
....: return 4
....: def __init__(self):
....: self.test_attrib = 5
....:
sage: b = Bar()
sage: b.pure_method()
1
sage: b.test_property
2
sage: b.test_lazy
3
sage: b.test_cache()
4
sage: b.test_attrib
5
sage: %timeit b.test_property
625 loops, best of 3: 967 ns per loop
sage: %timeit b.pure_method()
625 loops, best of 3: 934 ns per loop
sage: %timeit b.test_cache()
625 loops, best of 3: 291 ns per loop
sage: %timeit b.test_lazy
625 loops, best of 3: 192 ns per loop
sage: %timeit b.test_attrib
625 loops, best of 3: 174 ns per loop

Note that I am using #11115. Without #11115, b.test_cache() would take
1.7 µs.

Since super_categories and all_super_categories are called really
often, I think they should be as fast as possible. But on the other
hand, in some cases they are not called at all. Hence, it would be a
bad idea to create a *non-lazy* attribute during initialisation of a
category. So, for sure my suggestion at #11943 is good for
performance.

The question on "being pythonic" has been answered affirmatively.

The only remaining questions are: Should we renounce the speed-up for
introspection's sake? Renounce introspection for speed's sake? Or is
it possible to modify the preparser (or some other tool) so that we
both have the speed-up and make `C.super_categories?` show the source
of the *method/property/...* (not the source code of the return
value)?

Cheers,
Simon

Maarten Derickx

unread,
Oct 28, 2011, 6:52:16 PM10/28/11
to sage-...@googlegroups.com
Sorry for the confusion Simon. My comment was purely a reaction to jasons claim that it is hard to document attributes in python and not a solution to your problem.

What I think would be the best solution in this case would be is:

make lazy attributes

_super_categories and 
_all_super_categories

Make the functions super_categories and all_super_categories say in the documentation that developers shouldn't use these but that they should use the lazy attributes.
Modify the sage source code to use the newly defined lazy attributes (if you where going to make them attributes instead of functions you would also have the same ammount of modification work).

The advantage of this, is that this change is backwards compatible and you keep the inspectibility.

Jason Grout

unread,
Oct 28, 2011, 9:28:56 PM10/28/11
to sage-...@googlegroups.com
On 10/28/11 5:52 PM, Maarten Derickx wrote:
> My comment was purely a reaction to jasons claim that it is hard to
> document attributes in python and not a solution to your problem.

That's really cool. I didn't realize that ipython would pick up the
documentation to properties. I assume it appears in the Sphinx
documentation too?

Thanks,

Jason


Simon King

unread,
Oct 29, 2011, 1:48:11 AM10/29/11
to sage-devel
Hi Maarten,

On 29 Okt., 00:52, Maarten Derickx <m.derickx.stud...@gmail.com>
wrote:
> What I think would be the best solution in this case would be is:
>
> make lazy attributes
>
> _super_categories and
> _all_super_categories
>
> Make the functions super_categories and all_super_categories say in the
> documentation that developers shouldn't use these but that they should use
> the lazy attributes.

That sounds like a very good solution! Thank you, I'll go for it!

Cheers,
Simon

Simon King

unread,
Oct 29, 2011, 1:52:13 AM10/29/11
to sage-devel
PS:

On 29 Okt., 00:52, Maarten Derickx <m.derickx.stud...@gmail.com>
wrote:
> Make the functions super_categories and all_super_categories say in the
> documentation that developers shouldn't use these but that they should use
> the lazy attributes.

With a deprecation warning, too? Or better no deprecation, since your
solution implies that there will be a all_super_categories *method*
also in future?

Cheers,
Simon

Florent Hivert

unread,
Oct 29, 2011, 5:18:53 AM10/29/11
to sage-...@googlegroups.com
Hi There,

Sorry for reacting this late...

The main reason why those two functions are functions and not attributes is
that at the time of designing the category system, they where a very strong
agreement that no attribute should be exposed to users, only methods. I think
this agreement still holds. We use lazy attribute for programming
technicalities (eg: element_class). For what concerns super_categories
all_super_categories, they definitely carry some important mathematical
information so I don't think we can consider them as programming
technicalities. Therefore we put them as function. Note that I'm pretty sure
that at some point they where lazy_attribute. Here is an old diff.

diff -r 0e6f5c052819 sage/categories/category_types.py
--- a/sage/categories/category_types.py Mon Dec 22 15:25:50 2008 -0800
+++ b/sage/categories/category_types.py Mon Dec 22 17:16:50 2008 -0800
@@ -738,6 +738,10 @@ class Fields(Category_uniq):
"""
return Fields, tuple([])

+ @lazy_attribute
+ def super_categories(self):
+ return[Rings()]
+

Cheers,

Florent

Simon King

unread,
Oct 29, 2011, 6:41:51 AM10/29/11
to sage-devel
Hi Florent,

On 29 Okt., 11:18, Florent Hivert <Florent.Hiv...@lri.fr> wrote:
> For what concerns super_categories
> all_super_categories, they definitely carry some important mathematical
> information so I don't think we can consider them as programming
> technicalities. Therefore we put them as function. Note that I'm pretty sure
> that at some point they where lazy_attribute. Here is an old diff.

Interesting!

But do you agree that it would be a good solution to both have an
underscore lazy attribute _all_super_categories for programming
(yielding speed) and a method all_super_categories that is exposed to
the user (yielding documentation, with information added that for
programming one should use the underscore lazy attribute)?

Cheers,
Simon

Maarten Derickx

unread,
Oct 29, 2011, 6:59:02 AM10/29/11
to sage-...@googlegroups.com
Do you remember why people agree on t

On Saturday, October 29, 2011 11:18:53 AM UTC+2, fhivert wrote:
     Hi There,

On Fri, Oct 28, 2011 at 10:52:13PM -0700, Simon King wrote:

> With a deprecation warning, too? Or better no deprecation, since your
> solution implies that there will be a all_super_categories *method*
> also in future?

I would certainly let the methods also exist in the future since regular users might also be interested in the info.

Sorry for reacting this late...

The main reason why those two functions are functions and not attributes is
that at the time of designing the category system, they where a very strong
agreement that no attribute should be exposed to users, only methods. I think
this agreement still holds. 


Do you remember some of the motivating arguments for this agreement? And would those arguments also apply to properties or descriptors? At least the argument Jason gave, namely you can't document attributes, does not apply to properties. And although Simon showed that calling a property is slightly slower (3.5%) then a method, it does have the advantage that it saves you typing two useless parenthesis. And even more important it allows for tab completion in ipython as shown below. (note that using the tab completion actually calls the method hello as you can see by the printed "hello")

sage: class test(object):
....:     @property
....:     def hello(self):
....:         print 'hello'
....:         return 1
....:     
sage: l=test()
sage: l.hello.p
hello

l.hello.parent           l.hello.powermod         l.hello.prime_divisors   l.hello.prime_to_m_part
l.hello.popcount         l.hello.powermodm_ui     l.hello.prime_factors    
sage: l.hello.p

I would find this tabcompletion very usefull for certain properties since I have often done something like:
tmp = Object.method_wich_could_be_an_attribute()
just to be able to have tab completion on tmp to see if it allowed me to do with it what I want.

Of course properties/descriptors should only be used for side effect free stuff (not counting caching as a real side effect) as the above example shows. Since it is bad if tab completion really changes stuff.



Simon King

unread,
Oct 29, 2011, 8:57:54 AM10/29/11
to sage-devel
Hi Maarten,

On 29 Okt., 12:59, Maarten Derickx <m.derickx.stud...@gmail.com>
wrote:
> I would certainly let the methods also exist in the future since regular
> users might also be interested in the info.

Agreed. My current proposal is to keep the methods and have underscore
lazy attributes.

> And although Simon showed that calling a property is
> slightly slower (3.5%) then a method, it does have the advantage that it
> saves you typing two useless parenthesis.

The point I wanted to make is: It doesn't matter how much slower a
property is than a method, since calling a method is already too slow.

If we care for speed, then neither a usual method nor a usual property
are options. IMHO, we should either
(1) get finally someone to review #11115, so that calling a cached
method is only 120ns slower than requesting an attribute,
or
(2) use underscore lazy attributes internally, but keep the methods
around, for documentation.

(1) means: We have fairly good speed and we have documentation.
(2) means: We have best speed and we have documentation, but the
method carrying the documentation is not used internally.

> And even more important it allows
> for tab completion in ipython as shown below.

That's exactly the same for lazy attributes.

> I would find this tabcompletion very usefull for certain properties since I
> have often done something like:
> tmp = Object.method_wich_could_be_an_attribute()
> just to be able to have tab completion on tmp to see if it allowed me to do
> with it what I want.

When you do tmp = b.some_property, followed by tmp?, then you see the
documentation of the return value of b.some_property.

Is that really what you want to see? Then you are arguing *against*
the use of properties, because b.some_property? will show you the
documentation of the property itself (not of its return value).

If you do b.some_lazy_attribute?, then you will see the documentation
of the return value (which I don't like).

If you do b.some_cached_method? then you will see the documentation of
the function that returns the value, and of course
b.some_cached_method()? won't work.

Best regards,
Simon

leif

unread,
Oct 29, 2011, 4:52:22 PM10/29/11
to sage-devel
On 29 Okt., 14:57, Simon King <simon.k...@uni-jena.de> wrote:
>  (2) use underscore lazy attributes internally, but keep the methods
> around, for documentation.

+1


> (1) means: We have fairly good speed and we have documentation.
> (2) means: We have best speed and we have documentation, but the
> method carrying the documentation is not used internally.


-leif

Maarten Derickx

unread,
Oct 30, 2011, 7:32:36 AM10/30/11
to sage-...@googlegroups.com


On Saturday, October 29, 2011 2:57:54 PM UTC+2, Simon King wrote:
Hi Maarten,

On 29 Okt., 12:59, Maarten Derickx <m.derick...@gmail.com>
wrote:

(2) means: We have best speed and we have documentation, but the
method carrying the documentation is not used internally.


+1
 
> And even more important it allows
> for tab completion in ipython as shown below.

That's exactly the same for lazy attributes.


I know, this is since lazy attributes are descriptors just as properties.

 
> I would find this tabcompletion very usefull for certain properties since I
> have often done something like:
> tmp = Object.method_wich_could_be_an_attribute()
> just to be able to have tab completion on tmp to see if it allowed me to do
> with it what I want.

When you do tmp = b.some_property, followed by tmp?, then you see the
documentation of the return value of b.some_property.

I was talking about tab completion. I now see that I explained my ideas slightly short and cryptic. What I meant to say I is that it is nice to be able to do l.hello.p<tab> like in the example I posted above. This is possible with all __get__ descriptors (so with lazy attributes and properties for example).

Nicolas M. Thiery

unread,
Oct 30, 2011, 5:45:47 PM10/30/11
to sage-...@googlegroups.com
Hi,

On Sun, Oct 30, 2011 at 04:32:36AM -0700, Maarten Derickx wrote:
> I was talking about tab completion. I now see that I explained my ideas
> slightly short and cryptic. What I meant to say I is that it is nice to be
> able to do l.hello.p<tab> like in the example I posted above. This is
> possible with all __get__ descriptors (so with lazy attributes and
> properties for example).

Just to be precise: For lazy attributes, this will only work if the
attribute has not been evaluated before. Otherwise, it's replaced by a
usual attribute, and the documentation is lost.

Cheers,
Nicolas
--
Nicolas M. Thi�ry "Isil" <nth...@users.sf.net>
http://Nicolas.Thiery.name/

Nicolas M. Thiery

unread,
Oct 30, 2011, 6:01:35 PM10/30/11
to sage-...@googlegroups.com
Hi Simon,

Sorry for the late reaction; I have been off e-mail for the last week.

For _super_categories: yes I can be convinced that the speed gain is
worth the added complexity because there is some very low-level
algorithmic using it intensively; maybe _super_categories should be
only used in such low-level algorithmic though; e.g. only in the core
categories code.

For _all_super_categories: in most use cases, someone accessing
all_super_categories will go through the result (or maybe half of it
in average?). Since that result is quite big (a list of typically 5,
10, or even 20 categories), I guess that the speed gain is negligible
in practice (but please prove me wrong if that's not the case!), and
thus does not justify the added complexity.

Sorry again for the late suggestion; I hope it's not too much work
reverting half of the change!

Cheers,
Nicolas

Btw: there could be another point in separating super_categories / and
_super_categories: namely to handle full super categories. But let's
discuss this face to face when you come over.

Simon King

unread,
Oct 30, 2011, 6:57:36 PM10/30/11
to sage-devel
Hi Nicolas,

On 30 Okt., 23:01, "Nicolas M. Thiery" <Nicolas.Thi...@u-psud.fr>
wrote:
> For _all_super_categories: in most use cases, someone accessing
> all_super_categories will go through the result (or maybe half of it
> in average?). Since that result is quite big (a list of typically 5,
> 10, or even 20 categories), I guess that the speed gain is negligible
> in practice (but please prove me wrong if that's not the case!), and
> thus does not justify the added complexity.

Here is how to prove you wrong (sorry...):

In elliptic curve computations, you typically have polynomial rings
over many (~1000) different finite fields. Some low-level methods will
check whether such polynomial ring really belongs to the category of
rings. But (at least with #9138) the category of a polynomial ring is
the category of algebras over its base ring.

Hence, here, we have ~1000 different categories, namely "Algebras"
over the different finite fields.

Testing whether F['x'] belongs to Rings() means: Testing whether
Rings() appears in Algebras(F).all_super_categories().

So, I believe the following is a fair test:

sage: A = Algebras(QQ)
sage: C = Rings()
sage: A.all_super_categories
Cached version of <function all_super_categories at 0x1066e60>
sage: C in A.all_super_categories()
True
sage: %timeit C in A.all_super_categories()
625 loops, best of 3: 3.08 µs per loop

Here, A.all_super_categories is a cached method. In particular, the
list of all super categories will be computed only once. So, what I am
testing here is just "calling a cached method + testing containment in
a fixed list".

However, if one uses a lazy attribute A._all_super_categories instead
(which means "requesting an attribute + testing containment in a fixed
list"), one is a lot faster:

sage: A = Algebras(QQ)
sage: C = Rings()
sage: A.__class__._all_super_categories
<sage.misc.lazy_attribute.lazy_attribute object at 0xe15fd0>
sage: C in A._all_super_categories
True
sage: %timeit C in A._all_super_categories
625 loops, best of 3: 1.41 µs per loop

I can tell you that such differences have a significant impact on
elliptic curve computations. I learnt it the hard way...

Thus, I believe that it is worth the additional complication.

Best regards,
Simon

Nicolas M. Thiery

unread,
Oct 31, 2011, 8:45:27 AM10/31/11
to sage-...@googlegroups.com
On Sun, Oct 30, 2011 at 03:57:36PM -0700, Simon King wrote:
> Here is how to prove you wrong (sorry...):

Thanks for the enlightenment :-)

> In elliptic curve computations, you typically have polynomial rings
> over many (~1000) different finite fields. Some low-level methods will
> check whether such polynomial ring really belongs to the category of
> rings. But (at least with #9138) the category of a polynomial ring is
> the category of algebras over its base ring.
>
> Hence, here, we have ~1000 different categories, namely "Algebras"
> over the different finite fields.
>
> Testing whether F['x'] belongs to Rings() means: Testing whether
> Rings() appears in Algebras(F).all_super_categories().

Ok. Containment in a tiny list is darn cheap in Python :-)

So let's push the logic further. On one hand we are creating a second
gadget to model all the super categories of a category. On the other
hand, the use cases where speed is critical seem to be containment
tests. Doing a containment test using a list has had a bad smell to me
since I wrote that code, but I was too lazy to fix it.

So while we are at it, what about making this gadget into a
(frozen_)set rather than a list? I.e. we would have:

- a method C.all_super_categories(proper=false) returning a list
(order matters; btw we should make that a tuple at some point)
- a lazy attribute C._all_super_categories_as_set
- the doc of C.all_super_categories would mention the existence of
C._all_super_categories_as_set for containment tests.

This would also speed up is_subcategory tests (at least their first
call if there is a cache). And this seems better than caching
is_subcategory, since one does not waste memory for caching the result
of all the negative subcategory tests.

One caveat to test is whether containment tests in tiny Python lists
are not faster than containment tests in tiny Python sets.

Cheers,
Nicolas

Simon King

unread,
Oct 31, 2011, 9:38:22 AM10/31/11
to sage-devel
Hi Nicolas,

On 31 Okt., 13:45, "Nicolas M. Thiery" <Nicolas.Thi...@u-psud.fr>
wrote:
> So while we are at it, what about making this gadget into a
> (frozen_)set rather than a list? I.e. we would have:
>
>  - a method C.all_super_categories(proper=false) returning a list
>    (order matters; btw we should make that a tuple at some point)
>  - a lazy attribute C._all_super_categories_as_set
>  - the doc of C.all_super_categories would mention the existence of
>    C._all_super_categories_as_set for containment tests.

That sounds like a very good idea to me!

> This would also speed up is_subcategory tests (at least their first
> call if there is a cache). And this seems better than caching
> is_subcategory, since one does not waste memory for caching the result
> of all the negative subcategory tests.

+1

Here is an example that shows that your idea is very good, speedwise:

sage: C = HopfAlgebrasWithBasis(QQ)
sage: R = Rings()
sage: RC = CommutativeRings()
sage: L = C.all_super_categories()
sage: S = frozenset(L)
sage: T = tuple(L)
sage: C._all_super_categories_list = L
sage: C._all_super_categories_tuple = T
sage: C._all_super_categories_set = S

Test sanity:
sage: R in C.all_super_categories()
True
sage: RC in C.all_super_categories()
False
sage: (R in C.all_super_categories()) == (R in
C._all_super_categories_list) == (R in C._all_super_categories_set) ==
(R in C._all_super_categories_tuple)
True
sage: (RC in C.all_super_categories()) == (RC in
C._all_super_categories_list) == (RC in C._all_super_categories_set)
== (RC in C._all_super_categories_tuple)
True

Timings:
sage: %timeit R in C.all_super_categories()
625 loops, best of 3: 8.69 µs per loop
sage: %timeit RC in C.all_super_categories()
625 loops, best of 3: 26 µs per loop
sage: %timeit R in C._all_super_categories_list
625 loops, best of 3: 7.74 µs per loop
sage: %timeit RC in C._all_super_categories_list
625 loops, best of 3: 25.2 µs per loop
sage: %timeit R in C._all_super_categories_tuple
625 loops, best of 3: 7.77 µs per loop
sage: %timeit RC in C._all_super_categories_tuple
625 loops, best of 3: 25.3 µs per loop
sage: %timeit R in C._all_super_categories_set
625 loops, best of 3: 1.47 µs per loop
sage: %timeit RC in C._all_super_categories_set
625 loops, best of 3: 1.58 µs per loop

However, using #11115, it would still make sense to have
C.is_subcategory(R) as a cached method: It would return the cached
answer in a a few hundred nanoseconds.

Best regards,
Simon

Nicolas M. Thiery

unread,
Nov 2, 2011, 11:04:24 AM11/2/11
to sage-...@googlegroups.com
On Mon, Oct 31, 2011 at 06:38:22AM -0700, Simon King wrote:
> Here is an example that shows that your idea is very good, speedwise:
> Timings:
> sage: %timeit R in C.all_super_categories()
> 625 loops, best of 3: 8.69 �s per loop

> sage: %timeit RC in C.all_super_categories()
> 625 loops, best of 3: 26 �s per loop

> sage: %timeit R in C._all_super_categories_list
> 625 loops, best of 3: 7.74 �s per loop

> sage: %timeit RC in C._all_super_categories_list
> 625 loops, best of 3: 25.2 �s per loop

> sage: %timeit R in C._all_super_categories_tuple
> 625 loops, best of 3: 7.77 �s per loop

> sage: %timeit RC in C._all_super_categories_tuple
> 625 loops, best of 3: 25.3 �s per loop

> sage: %timeit R in C._all_super_categories_set
> 625 loops, best of 3: 1.47 �s per loop

> sage: %timeit RC in C._all_super_categories_set
> 625 loops, best of 3: 1.58 �s per loop

Pretty convincing indeed :-)

> However, using #11115, it would still make sense to have
> C.is_subcategory(R) as a cached method: It would return the cached
> answer in a a few hundred nanoseconds.

At a little memory price. But I don't know how many negatives test we
are having in average, so whether this is any problem. An alternative
would be to have is_subcategory cythoned. Actually, maybe all of
Category should be eventually cythoned. But let's keep that for later.

Reply all
Reply to author
Forward
0 new messages