UniqueRepresentation and cached_method don't mix

Nils Bruin

Apr 6, 2014, 1:21:18 PM
to sage-...@googlegroups.com
The following paradigm happens for "sage.combinat.posets.posets.FinitePoset" and its "linear_extensions", but it's such an easy mistake to make that I'd almost consider it a flaw in our design:

class A(Parent, UniqueRepresentation):
    ...
    @cached_method
    def BfromA(self):
        return B(self)

class B(Parent,UniqueRepresentation):
    ...

If one does

  a = A(....)
  b = a.BfromA()
  del a,b

one leaks memory. This is because the (global!) UniqueRepresentation cache for B holds a strong reference to a (as a key in a WeakValueDictionary), since a is a construction parameter for b, and a holds a strong reference to b because BfromA was declared a cached method. Hence a keeps b alive, and as long as b is alive, the strong key reference in the WeakValueDictionary keeps a alive.

The solution here is probably NOT to cache BfromA, since the lookup performed when constructing B will be very fast anyway (if the thing already exists), but you can see how difficult it now becomes to decide whether you can safely stick a @cached_method decorator somewhere: if the result contains a UniqueRepresentation object that is constructed from something linked to the object the cached method lives on, you probably can't. And this is a highly non-local thing to decide, because now you need to know the implementation details of lots of things you might be using.

The real problem here is UniqueRepresentation: fundamentally, it breaks locality. We try to hide the problems a little by caching in a WeakValueDictionary, but as this example shows, it only takes one extra level of indirection for the problems to resurface. Any ideas on how we can mitigate them? Educating developers, perhaps?
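
For anyone who wants to see the cycle concretely, here is a minimal pure-Python sketch of the pattern. None of the Sage machinery is involved: `_unique_cache`, `B.__new__` and the `_BfromA_cache` attribute are hypothetical stand-ins for the UniqueRepresentation cache and for @cached_method, respectively.

```python
import gc
import weakref

# Stand-in for the global UniqueRepresentation cache: the VALUES are held
# weakly, but the KEYS (the construction parameters) are held strongly.
_unique_cache = weakref.WeakValueDictionary()

class B:
    def __new__(cls, a):
        try:
            return _unique_cache[a]
        except KeyError:
            obj = object.__new__(cls)
            _unique_cache[a] = obj   # strong reference to the key `a`
            return obj

class A:
    def BfromA(self):
        # stand-in for @cached_method: the result is stored on the instance
        if not hasattr(self, "_BfromA_cache"):
            self._BfromA_cache = B(self)
        return self._BfromA_cache

a = A()
b = a.BfromA()
a_alive = weakref.ref(a)
del a, b
gc.collect()
# Leaked: the cache key keeps `a` alive, and `a` keeps `b` alive, so the
# WeakValueDictionary entry is never cleared.
assert a_alive() is not None
```

Note that gc.collect() cannot help here: nothing is cyclic garbage, because everything is reachable from the module-level cache.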

Nathann Cohen

Apr 23, 2014, 6:02:04 PM
to sage-...@googlegroups.com, Simon King, Nicolas M. Thiery, Florent Hivert, Travis Scrimshaw
Please guys... answer this thread, this is a bad problem. It is important to do something about it... :-/

Nathann

Volker Braun

Apr 23, 2014, 6:15:25 PM
to sage-...@googlegroups.com, Simon King, Nicolas M. Thiery, Florent Hivert, Travis Scrimshaw
I agree that it is tricky. There is no need to cache the unique output, since it is already cached by virtue of being unique. This is testable with some effort: in doctest mode, the memoization decorators should perform some additional testing. It takes some work to write, but there is no conceptual problem.

Also, I think that caches should age out eventually. This would also mitigate a lot of possible memory leaks. The downside is of course that you might have to recompute intermediate results, but that is still better than being reaped by the OOM killer ;-)
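
For illustration only (not a proposal for Sage's actual decorators), Python's stdlib functools.lru_cache already shows what such a bounded, ageing cache looks like, including the recompute-on-eviction downside:

```python
from functools import lru_cache

calls = []

@lru_cache(maxsize=2)   # bounded cache: least-recently-used entries age out
def expensive(n):
    calls.append(n)     # record the actual computations performed
    return n * n

expensive(1)
expensive(2)
expensive(3)   # evicts the entry for 1 (least recently used)
expensive(1)   # recomputed: the "recompute intermediate results" downside
assert calls == [1, 2, 3, 1]
assert expensive.cache_info().currsize == 2
```

Shrinking maxsize is exactly the "crank to turn" discussed below: a smaller cache forces any history-dependent behaviour to show up sooner.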

Nils Bruin

Apr 25, 2014, 3:03:58 PM
to sage-...@googlegroups.com, Simon King, Nicolas M. Thiery, Florent Hivert, Travis Scrimshaw
On Wednesday, April 23, 2014 3:15:25 PM UTC-7, Volker Braun wrote:
Also, I think that caches should age out eventually. This would also mitigate a lot of possible memleaks. Downside is of course that you might have to recompute intermediate results, but that is still better than being reaped by the OOM killer ;-)
 
I disagree that this would be desirable default behaviour: ageing the cache would introduce yet another dependency on the history of a computation and hence make buggy behaviour harder to reproduce. Furthermore, there can also be semantic reasons why you want a certain result to be computed only once and then cached for the rest of the object's lifetime (lazily computed attributes are presently often implemented via cached methods, because for a while that seemed faster).

It's of course perfectly fine to have a "limited_cached" decorator that implements some ageing strategy if people see a need for it, but we shouldn't just supplant cached_method etc. with it.

Volker Braun

Apr 26, 2014, 6:22:30 AM
to sage-...@googlegroups.com, Simon King, Nicolas M. Thiery, Florent Hivert, Travis Scrimshaw
On Friday, April 25, 2014 8:03:58 PM UTC+1, Nils Bruin wrote:
the aging of the cache would introduce yet another dependency on history in computations and hence make buggy behaviour harder to reproduce

But it also gives us a crank to turn: by making the LRU cache smaller, you'll most likely be able to force the buggy behaviour earlier.
 
It's of course perfectly fine to have a "limited_cached" decorator that implements some ageing strategy if people see a need for it, but we shouldn't just supplant cached_method etc. with it.

I'd rather have a @forever_nail_into_memory decorator that is used as a last resort than have cached_method default to leaking memory. Though really, the only reason for using it would be to avoid side effects of calling the decorated method repeatedly, and IMHO it's a pretty bad design if you need to rely on the @cached_method decorator for that.

And lazy attributes should just be @cached_property, whose lifetime is then tied to the object.
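
For reference, the stdlib functools.cached_property (Python 3.8+) illustrates that lifetime behaviour; the Poset class and its attribute here are made-up examples, and Sage's own lazy-attribute machinery differs in its details:

```python
import gc
import weakref
from functools import cached_property

class Poset:
    @cached_property
    def hasse_diagram(self):
        # Computed once, then stored in the instance's __dict__, so the
        # cached value is freed together with the instance -- no global
        # cache is involved.
        return {"edges": [(0, 1), (1, 2)]}

p = Poset()
first = p.hasse_diagram
assert p.hasse_diagram is first   # cached: same object on second access
p_alive = weakref.ref(p)
del p, first
gc.collect()
assert p_alive() is None          # nothing global keeps the object alive
```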

Regardless of whether the @cached_method cache ages or not, we should have the doctesting framework try to find caching mistakes. And a @cached_method returning a UniqueRepresentation is never good: it's at best double caching (even if the memory leak were fixed elsewhere).

Nils Bruin

Apr 26, 2014, 12:10:28 PM
to sage-...@googlegroups.com, Simon King, Nicolas M. Thiery, Florent Hivert, Travis Scrimshaw
On Saturday, April 26, 2014 3:22:30 AM UTC-7, Volker Braun wrote:
And @cached_method returning a UniqueRepresentation is never good, its at best double caching (even if the memory leak were fixed elsewhere).

That is not true in the full generality in which you state it. For instance on a number field you can have

class NumberField(...):
    ....
    @cached_method
    def class_group(self):
        <very expensive operations>
        G = AbelianGroup(<invariants>)
        return G

The caching here is preserving the results of deriving the invariants from the number field. Furthermore, there is no problem here: the construction parameters passed to AbelianGroup don't hold a reference to the number field.

On the other hand,

    @cached_method
    def class_field_tower(self):
        return [list of relative extensions of self]

would be bad if number fields were CachedRepresentation (and probably is still bad, because they presumably use some equivalent strategy), and it does not return the offending objects in plain sight: they are wrapped in a list. If they were stored in an attribute of a returned object, it would be equally bad.

Furthermore, the relative extensions wouldn't have canonical representatives, so ageing the cache out would suddenly lead to a change in the representative field, whereas in small tests you always get back the same one.

Volker Braun

Apr 26, 2014, 12:51:35 PM
to sage-...@googlegroups.com, Simon King, Nicolas M. Thiery, Florent Hivert, Travis Scrimshaw
On Saturday, April 26, 2014 5:10:28 PM UTC+1, Nils Bruin wrote:
class NumberField(...):
    ....
    @cached_method
    def class_group(self):
        <very expensive operations>
        G = AbelianGroup(<invariants>)
        return G

Just split it into the cached computation of the invariants and the uncached construction of the AbelianGroup (if the latter is necessary at all).
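
A sketch of that split, using hypothetical stand-ins (AbelianGroupLike mimics a unique-representation class with a simplified cache; an instance attribute plays the role of the cached method):

```python
class AbelianGroupLike:
    _cache = {}   # simplified unique-representation cache, keyed by invariants

    def __new__(cls, invariants):
        key = tuple(invariants)
        if key not in cls._cache:
            obj = object.__new__(cls)
            obj.invariants = key
            cls._cache[key] = obj
        return cls._cache[key]

class NumberFieldLike:
    def __init__(self):
        self.computations = 0

    def class_group_invariants(self):
        # CACHED: this is the expensive part worth memoizing.
        if not hasattr(self, "_inv"):
            self.computations += 1
            self._inv = (2, 4)   # placeholder for <very expensive operations>
        return self._inv

    def class_group(self):
        # UNCACHED: rebuilding the group is just a fast lookup in the unique
        # cache, and the invariants (plain integers) hold no reference back
        # to the field, so no leak cycle can form.
        return AbelianGroupLike(self.class_group_invariants())

K = NumberFieldLike()
G1 = K.class_group()
G2 = K.class_group()
assert G1 is G2             # unique representation deduplicates
assert K.computations == 1  # the expensive step ran only once
```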
 
    @cached_method
    def class_field_tower(self):
        return [list of relative extensions of self]
would be bad if number fields were CachedRepresentation (and probably is still bad because they probably use some equivalent strategy) and does not return the offending objects in plain sight: they are wrapped in a list. If they were stored in an attribute of a returned object, it would be equally bad.

Yes, which is why you'd need to walk the digraph of referred objects, with some depth cutoff, in doctest mode.
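
A rough sketch of such a walk, using gc.get_referents with a depth cutoff; the class Unique here is a made-up stand-in for UniqueRepresentation, and a real doctest-mode check would inspect cached_method results rather than an arbitrary object:

```python
import gc

def find_instances(root, cls, max_depth=3):
    """Breadth-first walk of the objects referred to by `root`, up to
    `max_depth` hops, returning any instances of `cls` encountered."""
    seen = {id(root)}
    frontier = [root]
    found = []
    for _ in range(max_depth):
        nxt = []
        for obj in frontier:
            for ref in gc.get_referents(obj):
                if id(ref) in seen:
                    continue
                seen.add(id(ref))
                if isinstance(ref, cls):
                    found.append(ref)
                nxt.append(ref)
        frontier = nxt
    return found

class Unique:   # stand-in for a UniqueRepresentation subclass
    pass

u = Unique()
cached_result = [{"extension": u}]   # offender hidden inside a list, as above
assert any(r is u for r in find_instances(cached_result, Unique))
```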

Furthermore, the relative extensions wouldn't have canonical representatives, so ageing the cache out would suddenly lead to a change in representative field, whereas in small tests, you are always getting back the same

IMHO that is an unrelated bug. You may only use caching if input/output is essentially unique (equivalent up to unique isomorphism, or so). Otherwise you can break pretty much any sufficiently complicated computation by first priming a subset of the cached methods with equivalent-but-not-canonical input. If you really must pick a representative once and for all, then you should be forced to be explicit about the memory leak you created right there.

