
Apr 25, 2011, 1:57:56 PM

to sage-devel

Hi!

The background of my question is trac ticket #11115. It provides a

Cython version of @cached_method, and it is really fast now.

Apart from the speed up, another problem is addressed: If an element

or parent structure is an extension class that does not allow

attribute assignment then it can still inherit cached methods from the

category framework -- but currently, the cache would break without

attribute assignment. Moreover, calling a cached method has a huge

overhead if attributes can't be assigned.

I suggest introducing a new "cdef public dict" attribute

__cached_methods for both sage.structure.parent.Parent and

sage.structure.element.Element. If the cached method finds that

attribute assignment does not work then it will use __cached_methods

to store stuff. Then, the cache works, and the overhead is reduced

(although it is still faster with proper attribute assignment).
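For readers unfamiliar with the mechanism, here is a rough plain-Python sketch of the proposed fallback. This is illustrative only: the real implementation is Cython, and the names `cached` and `_cached_methods` are invented here; a `__slots__` class stands in for an extension class that forbids arbitrary attribute assignment but still allows assignment to one dedicated slot.

```python
def cached(f):
    """Cache f's results on the instance, falling back to a dedicated slot.

    Illustrative sketch only -- not Sage's actual @cached_method.
    """
    def wrapper(self, *args):
        key = (f.__name__, args)
        try:
            cache = self.__dict__          # fast path: normal attribute storage
        except AttributeError:             # no __dict__ (slots-only instance)
            if self._cached_methods is None:
                self._cached_methods = {}  # lazily created on first use
            cache = self._cached_methods
        if key not in cache:
            cache[key] = f(self, *args)
        return cache[key]
    return wrapper

class Element:
    # Analog of "cdef public dict __cached_methods": one extra pointer per
    # instance, None until a cached method actually needs it.
    __slots__ = ('_cached_methods',)
    def __init__(self):
        self._cached_methods = None

class MyElement(Element):
    __slots__ = ()                         # keep the subclass dict-free too
    calls = 0
    @cached
    def square(self, x):
        MyElement.calls += 1               # count real computations
        return x * x
```

The extra slot costs one pointer per instance and stays None unless a cached method is actually used on an instance that lacks a `__dict__`.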

It works fine and doctests pass. However, Nicolas suggested that we

should discuss on sage-devel whether it is ok to introduce a new

attribute to any element (he said that it should be fine for parents,

as there are far fewer parents than elements).

My opinion:

If you consider an Element x that allows attribute assignment then the

only change with my patch is that x has an additional uninitialised

(i.e., None) attribute. If x does not allow attribute assignment and

x.parent().category().element_class provides a cached method (I don't

know if that exists, yet) then stuff would be stored in

x.__cached_methods that would otherwise be stored in, say, x.__dict__,

if it *would* allow attribute assignment.

So, I believe that the additional attribute can be afforded.

What do you think?

Best regards,

Simon


Apr 27, 2011, 9:20:14 AM

to sage-devel

Sorry to bother you again, but I think that changes on a very basic

level such as sage/structure/element.pxd require sage-devel's

blessing.

On 25 Apr., 19:57, Simon King <simon.k...@uni-jena.de> wrote:

> The background of my question is trac ticket #11115. It provides a

> Cython version of @cached_method, and it is really fast now.

To back my claim up: With the patch, I obtain

sage: class MyClass:

....: def __init__(self):

....: self._cache = {}

....: @cached_method

....: def noarg_cache(self):

....: return True

....: def noarg(self):

....: return True

....: @cached_method

....: def arg_cache(self,x,n=10):

....: return x^n

....: def arg(self,x,n=10):

....: try:

....: return self._cache[x,n]

....: except KeyError:

....: a = self._cache[x,n] = x^n

....: return a

....:

sage: O = MyClass()

sage: timeit('O.noarg_cache()', number=10^6)

1000000 loops, best of 3: 104 ns per loop

sage: timeit('O.noarg()', number=10^6)

1000000 loops, best of 3: 237 ns per loop

sage: timeit('O.arg_cache(3)', number=10^6)

1000000 loops, best of 3: 942 ns per loop

sage: timeit('O.arg(3)', number=10^6)

1000000 loops, best of 3: 844 ns per loop

sage: timeit('O.arg_cache(3,10)', number=10^6)

1000000 loops, best of 3: 933 ns per loop

sage: timeit('O.arg(3,10)', number=10^6)

1000000 loops, best of 3: 1.01 µs per loop

So, if there are no arguments then the @cached_method is actually

faster than simply returning "True". With arguments, it can compete

with a self-made cache written in Python. But of course, using

@cached_method is more convenient than a hand-knitted cache.

Now, the proposed additional attribute is not related to the speedup

in the example above. But it would play a role if you have

* a cdef class derived from Element

* that does not allow attribute assignment and

* that inherits a cached method from the category framework.

Namely, the additional attribute makes the cache work in the situation

above.

One could argue:

1) The situation above is very rare, and if a developer wants it to
work for a cdef class Foo(Element), then s/he can still provide Foo
with a "cdef public dict __cached_methods": there is no need to do
that on the level of sage.structure.element.

or

2) I worked with that patch, and I did not see memory problems. So,

apparently the additional attribute is lightweight enough. Thus we can

afford to make the creation of cached methods via categories as

convenient as possible.

I am in favour of 2).

Cheers,

Simon


Apr 27, 2011, 9:38:04 AM

to sage-...@googlegroups.com

+1: I trust you. That might not be a well-enough informed basis for

backing the proposal, but it might at least generate some other

responses!


John


Apr 27, 2011, 11:06:30 AM

to sage-devel

I still don't have a strong opinion on this. This is mainly because I
would like to see one additional test: Could you compare (timewise)
the creation of, say, 1 or 10 million (easy) objects based on a version
with and without the patch? In principle, I am in favor of your
suggestion, but this seems like a possible issue. If it turns out that
it is not, then you definitely have a +1.

Martin


Apr 27, 2011, 1:18:40 PM

to sage-devel

Hi Martin,

On 27 Apr., 17:06, Martin Raum <Martin.R...@matha.rwth-aachen.de>
wrote:

> I still don't have a strong opinion on this. This is mainly because I
> would like to see one additional test: Could you compare (timewise)
> the creation of say 1 or 10 million (easy) objects based on a version
> with and without patch?

By "objects", I suppose you mean instances of
sage.structure.element.Element, such as elements of a finite field?
You say "timewise", so I suppose that I should just create the
elements, not keep them in memory, and some of them may coincide? I
hope the test below is OK for you.

With sage-4.7.alpha5, I obtain:

sage: K = GF(101)

sage: get_memory_usage()

839.96484375

sage: %time for i in xrange(10^7): a = K(i)

CPU times: user 25.19 s, sys: 0.01 s, total: 25.21 s

Wall time: 25.20 s

sage: get_memory_usage()

839.96484375

sage -startuptime yields

...

== Slowest (including children) ==

1.128 sage.all (None)

0.275 sage.schemes.all (sage.all)

0.197 sage.misc.all (sage.all)

...

With sage-4.7.alpha5 and the patches from #9976 (that's about

introspection and is a dependency) and #11115 (that's the

cached_method stuff), I get

sage: K = GF(101)

sage: get_memory_usage()

841.96875

# show that the new attribute is there, but it is uninitialized

sage: a=K(4)

sage: print a.__cached_methods

None

sage: %time for i in xrange(10^7): a = K(i)

CPU times: user 24.46 s, sys: 0.00 s, total: 24.46 s

Wall time: 24.46 s

sage: get_memory_usage()

841.96875

sage -startuptime yields

...

== Slowest (including children) ==

1.137 sage.all (None)

0.274 sage.schemes.all (sage.all)

0.178 twisted.persisted.styles (sage.all)

0.162 sage.misc.all (sage.all)

...

Conclusion:

The time for the element creation test is actually slightly better

with the patch. The startuptime did not change significantly. However,

the memory consumption did increase.

More test scenarios are welcome!

Cheers,

Simon


Apr 29, 2011, 5:06:23 PM

to sage-...@googlegroups.com

On Wed, Apr 27, 2011 at 10:18:40AM -0700, Simon King wrote:

> By "objects", I suppose you mean instances of

> sage.structure.element.Element, such as elements of a finite field?

> You say "timewise", so, I suppose that I should just create the

> elements, not keep them in memory, and some of them may coincide? I

> hope the test below is OK for you.

>

> With sage-4.7.alpha5, I obtain:

>

> sage: K = GF(101)


> ...

> Conclusion:

>

> The time for the element creation test is actually slightly better

> with the patch. The startuptime did not change significantly. However,

> the memory consumption did increase.

>

> More test scenarios are welcome!


Can you give it a shot with GF(2), so as to hit one of the smallest

possible sage Element?

Cheers,

Nicolas

--

Nicolas M. Thiéry "Isil" <nth...@users.sf.net>

http://Nicolas.Thiery.name/

Apr 30, 2011, 4:27:19 AM

to sage-devel

Hi Nicolas,

On 29 Apr., 23:06, "Nicolas M. Thiery" <Nicolas.Thi...@u-psud.fr>
wrote:

> Can you give it a shot with GF(2), so as to hit one of the smallest
> possible sage Element?

Sage-4.7.alpha5 with patches:

sage: K = GF(2)

sage: get_memory_usage()

842.01171875

sage: %time L = [K(i) for i in xrange(10^7)]

CPU times: user 25.85 s, sys: 0.05 s, total: 25.91 s

Wall time: 25.91 s

sage: get_memory_usage()

920.51171875

sage: 920.51171875 - 842.01171875

78.5000000000000

Sage-4.7.alpha5 without patches:

sage: K = GF(2)

sage: get_memory_usage()

839.96875

sage: %time L = [K(i) for i in xrange(10^7)]

CPU times: user 24.71 s, sys: 0.04 s, total: 24.75 s

Wall time: 24.75 s

sage: get_memory_usage()

918.4609375

sage: 918.4609375 - 839.96875

78.4921875000000

So, the additional memory usage seems affordable to me. But I guess I

should try to find out why the element creation became slower for

GF(2), while it became faster for GF(101), and again slower for

GF(next_prime(10^6)) (see example on the ticket).

Cheers,

Simon


Apr 30, 2011, 1:12:57 PM

to sage-devel

Hi Simon,

I looked over the comments in trac #11115. Thank you for a very
thorough analysis and presentation. One thing I didn't immediately
find in the discussion there is that an increased memory footprint for
elements can also cause cache misses sooner, and so lead to a slowdown
(i.e., memory usage itself can cost time).

Cache performance is very dependent on processor and very tricky to

test for. However, I would expect that an implementation which uses

less memory for its working set will never perform worse than one that

uses more (other things being equal). So my default preference would

be for the mixed approach (3): make your change to parents as proposed

and let people install __cached_methods manually on elements to get

the benefits.

The drawback of that approach is that it takes very specialized

knowledge to get caching on elements working well. It would need to be

featured prominently in the documentation somewhere.


Apr 30, 2011, 4:51:07 PM

to sage-devel

Hi Nils,

On 30 Apr., 19:12, Nils Bruin <nbr...@sfu.ca> wrote:

> I looked over the comments in trac #11115 . Thank you for a very

> thorough analysis and presentation. One thing I didn't immediately

> find in the discussion there is that an increased memory footprint for

> elements can also cause cache-misses sooner, so lead to a slowdown

I don't understand that. Can you elaborate? Perhaps on the ticket?

> Cache performance is very dependent on processor and very tricky to

> test for. However, I would expect that an implementation which uses

> less memory for its working set will never perform worse than one that

> uses more (other things being equal). So my default preference would

> be for the mixed approach (3): make your change to parents as proposed

> and let people install __cached_methods manually on elements to get

> the benefits.

OK, I'll soon post a patch for version (3) on the ticket, as an

alternative to the original implementation.

> The drawback of that approach is that it takes very specialized

> knowledge to get caching on elements working well. It would need to be

> featured prominently in the documentation somewhere.

I somehow disagree: I believe that it would take very specialized

knowledge in order to create an example that exposes the problem: You

need to implement a class whose instances don't accept attributes (so,

that is likely to require Cython), and in addition an instance must

belong to a parent that belongs to a category that provides a cached

element method (does that exist, yet?). And then, you must write code

that relies on that method being cached.

But certainly, if we go for version (3) then I will add an example in

sage.misc.cachefunc.

Best regards,

Simon


May 1, 2011, 4:00:00 PM

to sage-devel

On Apr 30, 1:51 pm, Simon King <simon.k...@uni-jena.de> wrote:

> I don't understand that. Can you elaborate? Perhaps on the ticket?

I did, and I included some striking timings. The final effect for sage

elements will be much smaller because there are so many other factors,

but the timings do show that not having the data you work on in

consecutive memory can be quite expensive, so one should not lightly

adorn elements with fields that are rarely used.

May 31, 2011, 2:13:32 AM

to sage-...@googlegroups.com

> So, the additional memory usage seems affordable to me.

Elements of small finite fields are cached, so this test doesn't
really show anything.

sage: GF(2)(1) is GF(2)(-1)
True

> But I guess I

> should try to find out why the element creation became slower for

> GF(2), while it became faster for GF(101), and again slower for

> GF(next_prime(10^6)) (see example on the ticket).

Noise?

I think the increased memory usage is a bigger issue than performance

(it can impact the latter, but in hard-to-measure ways). I'm not as

worried about one field as a precedent, and I'm glad you brought this

up (even if it just sat starred in my inbox for a while). Perhaps for

infrequently accessed fields a level of indirection would be

worthwhile, a single lazily-initialized field (either an actual Python

dictionary, or some cdef object with special fields (+ a dict too?)).

In fact, some objects (e.g. matrices) already have a "cache" dict,

this could be done at a higher level.

Creating a new element means existing class hierarchies will have to

be changed (e.g. RingElement) to take advantage of this, but that may

be what's required.

Also, blindly marking all self-only methods as cached is a huge change

in the time-memory performance tradeoff. In terms of memory, imagine I

did

    def all_possible_orders(element_list):
        return set(g.order() for g in element_list)

I have now doubled my memory usage if g was small. Some things may be

better to re-compute rather than store (especially if they're biggish

and cheap). If you're setting up a generic infrastructure, perhaps the

default could be enabled/disabled with a flag (and with statement

context).

- Robert
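The flag-plus-with-statement idea above could look roughly like this. This is a hypothetical sketch; none of these names exist in Sage, and a real implementation would hang the switch on the cached-method machinery rather than a module global.

```python
import contextlib

# Hypothetical global switch (invented for illustration).
CACHING_ENABLED = True

def maybe_cached(f):
    """A cache decorator that honours the global switch."""
    cache = {}
    def wrapper(*args):
        if not CACHING_ENABLED:
            return f(*args)        # caching switched off: always recompute
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return wrapper

@contextlib.contextmanager
def caching(enabled):
    """Temporarily enable or disable caching inside a with-block."""
    global CACHING_ENABLED
    old, CACHING_ENABLED = CACHING_ENABLED, enabled
    try:
        yield
    finally:
        CACHING_ENABLED = old      # restore the previous setting
```

Usage would then be `with caching(False): ...` around code whose memory profile matters more than repeated computation.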

May 31, 2011, 3:23:04 AM

to sage-devel

Hi Robert,

Thank you for resuming this thread.

On 31 Mai, 08:13, Robert Bradshaw <rober...@math.washington.edu>
wrote:

> > So, the additional memory usage seems affordable to me.
>
> Elements of small finite fields are cached, so this test doesn't
> really show anything.
>
> sage: GF(2)(1) is GF(2)(-1)
> True

I see.

I am trying to sum up what the few participants of the discussion
agreed upon.

* By default, a cached method or a lazy attribute tries to use

attribute assignment in order to make itself work. That's probably the

fastest one can do, and it is how things already worked before

#11115.

* If attribute assignment fails (that may happen for extension

classes), then the cached method tries to store stuff in an attribute

__cached_methods. That's new in #11115.

* The __cached_methods attribute is added to parents. The rationale is

that (1) cached methods seem to be most frequently used for parents

(not for elements), (2) parents are big anyway, so that an additional

attribute should not count so much. Note that the stuff stored in

__cached_method would otherwise be stored via attribute assignment,

and the new attribute will be None if attribute assignment is possible

-- hence, there should be no memory problem.

* In contrast to what I had originally planned, the patches from

#11115 do *not* add __cached_methods to

sage.structure.element.Element. That means: One can still use cached

methods and lazy attributes on elements that accept attribute

assignment; but on other elements, cached methods and lazy attributes

are not supported.

* If one wants cached methods on an element class that for some reason

can not accept general attribute assignment, #11115 adds a new

subclass of Element, called ElementWithCachedMethod. It provides the

__cached_methods attribute.

By consequence, both cached methods and lazy attributes work for *all*

parents and *most* elements, and there is the big speed up for cached

methods. Just to avoid misunderstanding: The speed-up is not related

to the existence of __cached_methods.

> > But I guess I

> > should try to find out why the element creation became slower for

> > GF(2), while it became faster for GF(101), and again slower for

> > GF(next_prime(10^6)) (see example on the ticket).

>

> Noise?

Perhaps.

> I think the increased memory usage is a bigger issue than performance

> (it can impact the latter, but in hard-to-measure ways).

I did not notice a big increase in memory consumption. But perhaps you

can find a test case where this would be a problem?

> In fact, some objects (e.g. matrices) already have a "cache" dict,

> this could be done at a higher level.

Yep, I guess the purpose of __cached_methods is similar. Originally, I

only intended to use it for cached methods, but then I also used it to

make inherited lazy attributes (such as element_class) work on

extension classes.

> Creating a new element means existing class hierarchies will have to

> be changed (e.g. RingElement) to take advantage of this, but that may

> be what's required.

By "new element" you mean a new class such as ElementWithCachedMethod?

The patches from #11115 do not change the class hierarchy for things

like RingElement -- they offer ElementWithCachedMethod, but it is

currently not more than an offer.

> Also, blindly marking all self-only methods as cached is a huge change

> in the time-memory performance tradeoff.

Yep. Meanwhile I found that it makes more sense to add the

@cached_method decorator only to carefully chosen small frequently

used methods, such as modulus() of finite fields, and in some cases

caching gen() has a positive effect.

> Some things may be

> better to re-compute rather than store (especially if they're biggish

> and cheap).

In what sense do you mean "better"?

TIME-WISE, a cached method post-#11115 even beats a method that does

no computation at all:

sage: class MyClass:

....: def __init__(self,m):

....: self.__modulus = m

....: def modulus(self):

....: return self.__modulus

....: @cached_method

....: def c_modulus(self):

....: return self.__modulus

....: def one(self):

....: return 1

....: @cached_method

....: def c_one(self):

....: return 1

....:

sage: O = MyClass(5)

sage: timeit('O.modulus()', number=10^7)

10000000 loops, best of 3: 247 ns per loop

sage: timeit('O.c_modulus()', number=10^7)

10000000 loops, best of 3: 106 ns per loop

sage: timeit('O.one()', number=10^7)

10000000 loops, best of 3: 419 ns per loop

sage: timeit('O.c_one()', number=10^7)

10000000 loops, best of 3: 106 ns per loop

Or do you mean "better memory-wise"? That may be true, if the result

of a method is big and not often used.

However, there frequently are methods that look like

    def bla(self):
        try:
            return self.__bla
        except AttributeError:
            # ... some lengthy computation producing `result` ...
            self.__bla = result
            return result

Those should be replaced by

    @cached_method
    def bla(self):
        # ... some lengthy computation producing `result` ...
        return result

That's both shorter and faster.
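In plain Python the same refactoring can be illustrated with the standard library: functools.cached_property plays a similar role for argument-less methods. This is only an analogy, not Sage's @cached_method, and the computation here is a stand-in.

```python
import functools

class Handmade:
    """The hand-knitted pattern: stash the result in an attribute."""
    def bla(self):
        try:
            return self._bla
        except AttributeError:
            result = sum(range(10))   # stand-in for a lengthy computation
            self._bla = result
            return result

class Tidy:
    """The decorator pattern: shorter, and the caching logic is uniform."""
    @functools.cached_property
    def bla(self):
        return sum(range(10))         # same stand-in computation
```

One difference from @cached_method: cached_property is read as an attribute (`t.bla`, without parentheses), whereas Sage's decorator keeps the call syntax.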

> If your'e setting up a generic infrastructure, perhaps the

> default could be enabled/disabled with a flag (and with statement

> context).

I don't understand that. Can you give me an example?

Best regards,

Simon

Thank you for resuming this thread.

On 31 Mai, 08:13, Robert Bradshaw <rober...@math.washington.edu>

wrote:

> > So, the additional memory usage seems affordable to me.

>

> Elements of small finite fields are cached, so this test doesn't

> really show anything.

>

> sage: GF(2)(1) is GF(2)(-1)

> True

I see.
>

> Elements of small finite fields are cached, so this test doesn't

> really show anything.

>

> sage: GF(2)(1) is GF(2)(-1)

> True

I am trying to sum up what the few participants of the discussion

agreed upon.

* By default, a cached method or a lazy attribute tries to use

attribute assignment in order to make itself work. That's probably the

fastest what one can do, and it is how things already worked before

#11115.

* If attribute assignment fails (that may happen for extension

classes) then it is tried to store stuff in an attribute

__cached_methods. That's new in #11115.

* The __cached_methods attribute is added to parents. The rationale is

that (1) cached methods seem to be most frequently used for parents

(not for elements), (2) parents are big anyway, so that an additional

attribute should not count so much. Note that the stuff stored in

__cached_method would otherwise be stored via attribute assignment,

and the new attribute will be None if attribute assignment is possible

-- hence, there should be no memory problem.

* In contrast to what I had originally planned, the patches from

#11115 do *not* add __cached_methods to

sage.structure.element.Element. That means: One can still use cached

methods and lazy attributes on elements that accept attribute

assignment; but on other elements, cached methods and lazy attributes

are not supported.

* If one wants cached methods on an element class that for some reason

can not accept general attribute assignment, #11115 adds a new

subclass of Element, called ElementWithCachedMethod. It provides the

__cached_methods attribute.

As a consequence, both cached methods and lazy attributes work for *all*

parents and *most* elements, and there is the big speed-up for cached

methods. Just to avoid misunderstanding: the speed-up is not related

to the existence of __cached_methods.
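The fallback described in the summary above can be illustrated with a small pure-Python sketch. Everything here (the cache_store helper, the SlottedElement class, the single-underscore _cached_methods name) is hypothetical and only mimics the idea; the real #11115 implementation is a "cdef public dict" attribute in Cython:

```python
def cache_store(obj, key, value):
    """Try plain attribute assignment first (the fast path); if the
    object forbids it, fall back to a dedicated cache dict."""
    try:
        setattr(obj, key, value)            # fast path
    except AttributeError:
        # fallback path, analogous to __cached_methods in #11115
        if obj._cached_methods is None:
            obj._cached_methods = {}
        obj._cached_methods[key] = value

class SlottedElement:
    # __slots__ forbids general attribute assignment, much like a
    # Cython extension class without a __dict__
    __slots__ = ('_cached_methods',)
    def __init__(self):
        self._cached_methods = None

class PlainElement:
    pass                                    # ordinary attribute assignment works

e, p = SlottedElement(), PlainElement()
cache_store(e, 'modulus', 5)                # setattr fails -> fallback dict
cache_store(p, 'modulus', 5)                # setattr succeeds
print(e._cached_methods)                    # -> {'modulus': 5}
print(p.modulus)                            # -> 5
```

Note that the fallback dict is created lazily, so an object that never caches anything pays only for one None attribute, as argued above.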

> > But I guess I

> > should try to find out why the element creation became slower for

> > GF(2), while it became faster for GF(101), and again slower for

> > GF(next_prime(10^6)) (see example on the ticket).

>

> Noise?

> I think the increased memory usage is a bigger issue than performance

> (it can impact the latter, but in hard-to-measure ways).

But perhaps you can find a test case where this would be a problem?

> In fact, some objects (e.g. matrices) already have a "cache" dict,

> this could be done at a higher level.

Yep, I guess the purpose of __cached_methods is similar. Originally, I

only intended to use it for cached methods, but then I also used it to

make inherited lazy attributes (such as element_class) work on

extension classes.

> Creating a new element means existing class hierarchies will have to

> be changed (e.g. RingElement) to take advantage of this, but that may

> be what's required.
By "new element" you mean a new class such as ElementWithCachedMethod?

The patches from #11115 do not change the class hierarchy for things

like RingElement -- they offer ElementWithCachedMethod, but it is

currently not more than an offer.

> Also, blindly marking all self-only methods as cached is a huge change

> in the time-memory performance tradeoff.

Yep. Meanwhile I found that it makes more sense to add the

@cached_method decorator only to carefully chosen small frequently

used methods, such as modulus() of finite fields, and in some cases

caching gen() has a positive effect.

> Some things may be

> better to re-compute rather than store (especially if they're biggish

> and cheap).

In what sense do you mean "better"?

TIME-WISE, a cached method post-#11115 even beats a method that does

no computation at all:

sage: class MyClass:

....: def __init__(self,m):

....: self.__modulus = m

....: def modulus(self):

....: return self.__modulus

....: @cached_method

....: def c_modulus(self):

....: return self.__modulus

....: def one(self):

....: return 1

....: @cached_method

....: def c_one(self):

....: return 1

....:

sage: O = MyClass(5)

sage: timeit('O.modulus()', number=10^7)

10000000 loops, best of 3: 247 ns per loop

sage: timeit('O.c_modulus()', number=10^7)

10000000 loops, best of 3: 106 ns per loop

sage: timeit('O.one()', number=10^7)

10000000 loops, best of 3: 419 ns per loop

sage: timeit('O.c_one()', number=10^7)

10000000 loops, best of 3: 106 ns per loop

Or do you mean "better memory-wise"? That may be true, if the result

of a method is big and not often used.

However, there frequently are methods that look like

    def bla(self):
        try:
            return self.__bla
        except AttributeError:
            # do some lengthy computation
            self.__bla = result
            return result

Those should be replaced by

    @cached_method
    def bla(self):
        # do some lengthy computation
        return result

That's both shorter and faster.
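For illustration, here is a minimal pure-Python stand-in for such a decorator (the real cached_method from #11115 is written in Cython and considerably faster); the call counter is only there to show that the body runs just once:

```python
import functools

def cached_method(f):
    """Minimal pure-Python sketch of a cached-method decorator."""
    attr = '_cache_' + f.__name__
    @functools.wraps(f)
    def wrapper(self):
        try:
            return getattr(self, attr)      # cache hit
        except AttributeError:
            result = f(self)                # "some lengthy computation"
            setattr(self, attr, result)
            return result
    return wrapper

class Example:
    calls = 0                               # counts actual computations
    @cached_method
    def bla(self):
        Example.calls += 1
        return 42

x = Example()
print(x.bla(), x.bla(), Example.calls)      # -> 42 42 1
```

The decorator centralises the try/except idiom in one place, which is exactly why the decorated form is shorter.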

> If you're setting up a generic infrastructure, perhaps the

> default could be enabled/disabled with a flag (and with statement

> context).
I don't understand that. Can you give me an example?

Best regards,

Simon

May 31, 2011, 4:23:28 AM5/31/11

to sage-...@googlegroups.com

This is because you were testing with finite fields that create all

their elements ahead of time and re-use them.

sage: M = matrix(InfinitePolynomialRing(QQ), 100, 100, range(10000))

sage: get_memory_usage()

228.75

sage: max(a.max_index() for a in M.list()) # a cached method

-1

sage: get_memory_usage()

265.00390625

> But perhaps you

> can find a test case where this would be a problem?

I can't think of one off the top of my head, but this is a problem in

general (as explained so well by nbruin on the ticket).

>> In fact, some objects (e.g. matrices) already have a "cache" dict,

>> this could be done at a higher level.

>

> Yep, I guess the purpose of __cached_methods is similar. Originally, I

> only intended to use it for cached methods, but then I also used it to

> make inherited lazy attributes (such as element_class) work on

> extension classes.

>

>> Creating a new element means existing class hierarchies will have to

>> be changed (e.g. RingElement) to take advantage of this, but that may

>> be what's required.

>

> By "new element" you mean a new class such as ElementWithCachedMethod?

> The patches from #11115 do not change the class hierarchy for things

> like RingElement -- they offer ElementWithCachedMethod, but it is

> currently not more than an offer.

>

>> Also, blindly marking all self-only methods as cached is a huge change

>> in the time-memory performance tradeoff.

>

> Yep. Meanwhile I found that it makes more sense to add the

> @cached_method decorator only to carefully chosen small frequently

> used methods, such as modulus() of finite fields, and in some cases

> caching gen() has a positive effect.

+1 to thoughtful use. It's a wonderful decorator. (Actually, I wanted

it at work for something just yesterday, but I was using Java :(.)

>> Some things may be

>> better to re-compute rather than store (especially if they're biggish

>> and cheap).

>

> In what sense do you mean "better"?

That depends on how much time you have vs. how much memory you have.

As another example, it'd be a waste to cache parent() or the degree of

a univariate polynomial.

That's because MyClass is implemented in Python, and cached_method in

Cython :).

> Or do you mean "better memory-wise"? That may be true, if the result

> of a method is big and not often used.

Yes.

> However, there frequently are methods that look like
>
>     def bla(self):
>         try:
>             return self.__bla
>         except AttributeError:
>             # do some lengthy computation
>             self.__bla = result
>             return result
>
> Those should be replaced by
>
>     @cached_method
>     def bla(self):
>         # do some lengthy computation
>         return result
>
> That's both shorter and faster.

I agree completely, I was more referring to marking all methods by default.

>> If you're setting up a generic infrastructure, perhaps the

>> default could be enabled/disabled with a flag (and with statement

>> context).

>

> I don't understand that. Can you give me an example?

You'd have a "maybe_cache" decorator that would decide to cache based

on a global flag. Then you'd do

with aggressive_caching():
    [some calculation that uses maybe_cache decorated methods]
[here, all caches are cleared and memory reclaimed]

Probably not worth it, and things will get garbage collected when

you're done with them.

- Robert

May 31, 2011, 4:57:06 AM5/31/11

to sage-...@googlegroups.com

Or perhaps you're referring to the total startup memory? We create

lots of Parents at startup, but (relatively) few elements (compared to

typical use).

In any case, I like the current solution best.
