Re: Object#freeze as a basis for caching of method results?

Ezra Zygmuntowicz

unread,

Feb 19, 2008, 1:40:02 PM2/19/08

to

Hi~

On Feb 19, 2008, at 8:01 AM, Shot (Piotr Szotkowski) wrote:

> Hello, ruby-talk.
>
> After profiling my code I figured out I might want to attempt some
> caching of results obtained in ‘heavy’ methods. Is there an idiomatic
> way to do method-results caching in Ruby?
>
> The first thing that came to my mind as a mean to ensure the cache’s
> ‘freshness’ is to freeze the object in question. Does this make sense?
>
> For example, assume that a Blanket is a Set of Blocks, and
> that there’s a computation-intesive method ‘separations’:
>
>
>
> class Blanket < Set
>
> def separations
> seps = Set[]
> # some ‘heavy’ code that builds seps
> seps
> end
>
> end
>
>
>
> Would the below make sense?
>
>
>
> class Blanket < Set
>
> def freeze
> each { |block| block.freeze }
> super
> end
>
> def separations
> return @seps if @seps
> @seps = Set[]
> # the ‘heavy’ code from that builds @seps this time
> freeze
> @seps
> end
>
> end
>
>
>
> I guess my question boils down to what does exactly Object#freeze
> prevent from being modified – simply all the properties?
>
> If so, does it mean I can only have one method like the above (because
> if some other method did any freezing, I couldn’t initialise the @seps
> cache when the first call to separations occurs)?
>
> Is there any ‘best practice’ in Ruby
> with regards to caching method results?
>
> Is there any obvious other approach to such caching? (I thought about
> having a @cache instance variable that would get reset on object
> changes, but I’m not sure how to obtain all the list of methods that
> can change an object – i.e., methods that throw error when a given
> object is frozen.)
>
> Thanks in advance for any replies/suggestions/insights!
>

The usual idiom for caching heavy results is something like this:

def foo
@foo ||= some_long_calculation
end

Cheers-

- Ezra Zygmuntowicz
-- Founder & Software Architect
-- ez...@engineyard.com
-- EngineYard.com

Robert Klemme

unread,

Feb 20, 2008, 5:15:27 AM2/20/08

to

2008/2/19, Shot (Piotr Szotkowski) <sh...@hot.pl>:

> After profiling my code I figured out I might want to attempt some
> caching of results obtained in 'heavy' methods. Is there an idiomatic
> way to do method-results caching in Ruby?

http://raa.ruby-lang.org/project/memoize/

> I guess my question boils down to what does exactly Object#freeze
> prevent from being modified – simply all the properties?

Yes. You cannot assign instance variables any more once an instance is frozen.

> If so, does it mean I can only have one method like the above (because
> if some other method did any freezing, I couldn't initialise the @seps
> cache when the first call to separations occurs)?

An alternative approach would be to use current state as cache key,
i.e. create an immutable copy and stuff that along with calculation
results into a Hash.

> Is there any 'best practice' in Ruby
> with regards to caching method results?

Memoize, see above.

> Is there any obvious other approach to such caching? (I thought about
> having a @cache instance variable that would get reset on object
> changes, but I'm not sure how to obtain all the list of methods that
> can change an object – i.e., methods that throw error when a given
> object is frozen.)

I suggest you do not inherit Set. In that case it's easy: *you*
define which methods change the state of your class.

Kind regards

robert

--
use.inject do |as, often| as.you_can - without end

Robert Klemme

unread,

Feb 21, 2008, 4:15:30 AM2/21/08

to

2008/2/20, Shot (Piotr Szotkowski) <sh...@hot.pl>:
> Robert Klemme:

>
> > 2008/2/19, Shot (Piotr Szotkowski) <sh...@hot.pl>:
>

> >> Is there an idiomatic way to do method-results caching in Ruby?
>
> > http://raa.ruby-lang.org/project/memoize/
>

> Ah, perfect, thanks. From both the gem's docs and the source itself,
> it looks like it nicely handles the caching, provided that either the
> memoized method's result does not depend on the object state or the
> object state doesn't change, right?

Correct. IIRC Memoize uses the method argument array to do cache
lookups. I don't know whether Memoize takes measures to avoid
aliasing effects but that can be tested easily, e.g.

a = [1,2,3]
foo(a)
a << 4
foo(a)

foo must of course print something to the screen or such so you see
when it's invoked.

> > An alternative approach would be to use current state as cache key,
> > i.e. create an immutable copy and stuff that along with calculation
> > results into a Hash.
>

> Hm, that's actually an approach worth considering. I do need to take
> memory use into account, unfortunately, but it seems this is worth
> testing.
>
> I assume this approach highly depends on (a) making sure #freeze freezes
> also all referenced objects (like in my original example) and (b) #==,
> #eql? and #hash are sensibly implemented (because Hash uses them for key
> comparison), right?

Correct. But I believe this is true for Memoize also, i.e. if you
decide to use something as key to determine whether a calculation has
to be redone you better make sure it properly implements #hash, #eql?
etc.

> > I suggest you do not inherit Set. In that case it's easy:
> > *you* define which methods change the state of your class.
>

> Hm, that's true; also, if I get all this right, the ones
> that change the state could simply clear Memoize's cache.
>
> I'll see how much of the stuff inherited from Set I actually use –
> quite a bit, I assume, but then I could simply make an instance variable
> of @set and pass all these method calls to it (while selectively
> invalidating cache)…

Delegator may help here although in this case I'd probably rather
explicitly forward method invocations because automatic delegation
does have issues of its own, for example it does not "correct" return
values:

$ irb -r delegate
irb(main):001:0> Delegat
DelegateClass Delegater Delegator
irb(main):001:0> s=[1,2,3]
=> [1, 2, 3]
irb(main):003:0> so = SimpleDelegator.new(s)
=> [1, 2, 3]
irb(main):004:0> so.size
=> 3
irb(main):005:0> so.object_id
=> 1073413300
irb(main):006:0> s.object_id
=> 1073463120
irb(main):007:0> so.each {}.object_id
=> 1073463120
irb(main):008:0>

/so.each/ should rather return /so/ but it returns /s/.

> Thanks a *ton*, Robert; as usual, your reply is both
> invaluable and makes me think in the right direction.

Thank you! You're welcome! I am glad that my post proved useful.

Robert Klemme

unread,

Feb 21, 2008, 5:47:24 AM2/21/08

to

2008/2/21, Shot (Piotr Szotkowski) <sh...@hot.pl>:
> Robert Klemme:
>

> > An alternative approach would be to use current state as cache key,
> > i.e. create an immutable copy and stuff that along with calculation
> > results into a Hash.
>

> What would you say about the following approach? I'm not sure it scales
> well, but seems to work properly, have the smallest diff impact on the
> existing codebase and take object state into account…
>
> shot@devielle:~/work/PhD/bzr/trunk$ bzr diff
> === modified file 'trunk/lib/art-decomp/fsm.rb'
> --- trunk/lib/art-decomp/fsm.rb 2008-02-17 21:36:39 +0000
> +++ trunk/lib/art-decomp/fsm.rb 2008-02-21 09:03:53 +0000
> @@ -1,5 +1,7 @@
> class ArtDecomp::FSM < ArtDecomp::TruthTable
>
> + @@cache = Hash.new { |hash, key| hash[key] = {} }
> +
> attr_accessor :q, :qp
>
> def self.from_kiss string
> @@ -31,7 +33,7 @@
> end
>
> def beta_f
> - outputs.to_blanket
> + @@cache[Marshal.dump self][:beta_f] ||= outputs.to_blanket
> end
>
> def beta_q

Funny that you mention it: I had thought of copying the key via
Marshal#dump and #load. Using the marshaled string is a nice idea!
You can make this a tad more efficient by doing

@@cache[Marshal.dump(self).freeze][:beta_f] ||= outputs.to_blanket

because there is an optimization in Hash that copies unfrozen Strings
that are used as Hash keys in order to avoid aliasing effects.

Now, whether you use the String or demarshal probably mainly depends
on memory usage. If the String is short enough that approach is
certainly preferable because it incurs less processing overhead
(demarshaling).

> So the Ether Bunny goes hippety hopity down the garden path, waylaying
> innocent fieldmice and anesthetising them, so he can sell their teeth
> to the Tooth Fairy to support his milk-and-cookies habit.

What kind of dope are *you* smoking? :-)

Cheers

Robert Klemme

unread,

Feb 21, 2008, 12:06:16 PM2/21/08

to

2008/2/21, Shot (Piotr Szotkowski) <sh...@hot.pl>:

> // Thanks a lot for that delegate tutorial in the other post –
> // I finally understood what's delegation for and how to do it! :)

Nice side effect of this thread. :-)

> Robert Klemme:

> > 2008/2/21, Shot (Piotr Szotkowski) <sh...@hot.pl>:

> > Now, whether you use the String or demarshal probably mainly depends

> > on memory usage. If the String is short enough that approach is
> > certainly preferable because it incurs less processing overhead
> > (demarshaling).
>

> Ok, I'm totally lost here. I don't see any demarshalling happening,
> just marshalling (and using that as a key)… What am I missing?

Demarshalling was the option I had thought of. A copy through
marshalling _might_ use less memory than the marshaled form (the
String).

> > What kind of dope are *you* smoking? :-)
>

> I'm a PHP programmer² by day.

Uh, oh. *cough* :-)

> ² http://civicrm.org/
>
> -- Shot (seriously, though, it's a sig of unknown origin from Stewart
> Stremler's collection: http://www-rohan.sdsu.edu/~stremler/sigs/sigs.html)