On Feb 19, 2008, at 8:01 AM, Shot (Piotr Szotkowski) wrote:
> Hello, ruby-talk.
>
> After profiling my code I figured out I might want to attempt some
> caching of results obtained in ‘heavy’ methods. Is there an idiomatic
> way to do method-results caching in Ruby?
>
> The first thing that came to my mind as a means to ensure the cache’s
> ‘freshness’ is to freeze the object in question. Does this make sense?
>
> For example, assume that a Blanket is a Set of Blocks, and
> that there’s a computation-intensive method ‘separations’:
>
>
>
> class Blanket < Set
>
>   def separations
>     seps = Set[]
>     # some ‘heavy’ code that builds seps
>     seps
>   end
>
> end
>
>
>
> Would the below make sense?
>
>
>
> class Blanket < Set
>
>   def freeze
>     each { |block| block.freeze }
>     super
>   end
>
>   def separations
>     return @seps if @seps
>     @seps = Set[]
>     # the ‘heavy’ code from above, which builds @seps this time
>     freeze
>     @seps
>   end
>
> end
>
>
>
> I guess my question boils down to what exactly Object#freeze
> prevents from being modified – simply all the properties?
>
> If so, does it mean I can only have one method like the above (because
> if some other method did any freezing, I couldn’t initialise the @seps
> cache when the first call to separations occurs)?
>
> Is there any ‘best practice’ in Ruby
> with regards to caching method results?
>
> Is there any obvious other approach to such caching? (I thought about
> having a @cache instance variable that would get reset on object
> changes, but I’m not sure how to obtain the list of all methods that
> can change an object – i.e., methods that raise an error when a given
> object is frozen.)
>
> Thanks in advance for any replies/suggestions/insights!
>
The usual idiom for caching heavy results is something like this:

  def foo
    @foo ||= some_long_calculation
  end

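A slightly fuller sketch of that idiom, with made-up class and method names (Report, total, missing are not from the original post). One thing to keep in mind: `||=` recomputes whenever the cached value is nil or false, so a result that can legitimately be falsy is better guarded with `defined?`.

```ruby
# Hypothetical example class; only the ||= idiom itself comes from the thread.
class Report
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Cached with ||=: fine as long as the result is never nil or false.
  def total
    @total ||= expensive_total
  end

  # Cached with defined?: also caches nil and false results.
  def missing
    return @missing if defined?(@missing)
    @missing = expensive_nil
  end

  private

  def expensive_total
    @calls += 1
    (1..1_000).reduce(:+)
  end

  def expensive_nil
    @calls += 1
    nil
  end
end

r = Report.new
r.total
r.total      # second call is served from @total
r.missing
r.missing    # second call is served from @missing, even though it is nil
```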
Cheers-
- Ezra Zygmuntowicz
-- Founder & Software Architect
-- ez...@engineyard.com
-- EngineYard.com
http://raa.ruby-lang.org/project/memoize/
> I guess my question boils down to what exactly Object#freeze
> prevents from being modified – simply all the properties?
Yes. You cannot assign instance variables any more once an instance is frozen.
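A small sketch of what that means in practice (Holder is a made-up class): a lazily assigned cache variable can no longer be set after freezing, while objects that the frozen instance merely references stay mutable, because freeze is shallow.

```ruby
# Hypothetical class used only to illustrate Object#freeze.
class Holder
  attr_reader :items

  def initialize
    @items = []
  end

  def answer
    @answer ||= 42   # assigns an instance variable on the first call
  end
end

h = Holder.new
h.freeze

assignment_blocked =
  begin
    h.answer
    false
  rescue StandardError   # FrozenError on current Rubies; 1.8 raised TypeError
    true
  end

# freeze is shallow: the Array referenced by @items is not frozen.
h.items << :still_mutable
```

This is why the freeze override in the original question walks over the contained blocks, and why a cache variable has to be assigned before the object is frozen.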
> If so, does it mean I can only have one method like the above (because
> if some other method did any freezing, I couldn't initialise the @seps
> cache when the first call to separations occurs)?
An alternative approach would be to use current state as cache key,
i.e. create an immutable copy and stuff that along with calculation
results into a Hash.
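A rough sketch of that idea, under a few assumptions of mine: the Blanket here wraps a plain Array rather than inheriting from Set, the class-level cache and the add method are invented names, and the "heavy" computation is a placeholder.

```ruby
# Sketch only: state snapshot as cache key. Names are hypothetical.
class Blanket
  @cache = {}   # frozen state snapshot => computed result

  class << self
    attr_reader :cache
  end

  attr_reader :heavy_calls

  def initialize(blocks)
    @blocks = blocks
    @heavy_calls = 0
  end

  def add(block)
    @blocks << block
    self
  end

  def separations
    key = @blocks.dup.freeze   # immutable snapshot of the current state
    self.class.cache[key] ||= heavy_separations
  end

  private

  def heavy_separations
    @heavy_calls += 1
    @blocks.combination(2).to_a   # placeholder for the real computation
  end
end

b = Blanket.new([1, 2, 3])
b.separations
b.separations   # same state, so the cached result is reused
b.add(4)
b.separations   # state changed: new key, new computation
```

Note that dup is a shallow copy, so this only holds up if the blocks themselves are immutable or are copied as well. The cache also keeps one entry per distinct state, which matters if memory is tight.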
> Is there any 'best practice' in Ruby
> with regards to caching method results?
Memoize, see above.
> Is there any obvious other approach to such caching? (I thought about
> having a @cache instance variable that would get reset on object
> changes, but I'm not sure how to obtain the list of all methods that
> can change an object – i.e., methods that raise an error when a given
> object is frozen.)
I suggest you do not inherit Set. In that case it's easy: *you*
define which methods change the state of your class.
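For illustration, a minimal sketch of a class that holds a Set instead of inheriting from it. The method names (add, delete, invalidate_cache) and the placeholder computation are my assumptions; the point is only that every state-changing method is one you wrote, so each can reset the cache.

```ruby
require 'set'

# Sketch only: composition instead of inheritance, with explicit invalidation.
class Blanket
  attr_reader :heavy_calls

  def initialize(blocks = [])
    @blocks = Set.new(blocks)
    @heavy_calls = 0
  end

  # The only two methods that change state; both drop the cached result.
  def add(block)
    @blocks.add(block)
    invalidate_cache
    self
  end

  def delete(block)
    @blocks.delete(block)
    invalidate_cache
    self
  end

  def size
    @blocks.size
  end

  def separations
    @seps ||= compute_separations
  end

  private

  def invalidate_cache
    @seps = nil
  end

  def compute_separations
    @heavy_calls += 1
    @blocks.to_a.combination(2).to_a   # placeholder for the heavy code
  end
end

blanket = Blanket.new([1, 2, 3])
blanket.separations
blanket.separations   # cached
blanket.add(4)
blanket.separations   # recomputed after the change
```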
Kind regards
robert
--
use.inject do |as, often| as.you_can - without end
Correct. IIRC Memoize uses the method argument array to do cache
lookups. I don't know whether Memoize takes measures to avoid
aliasing effects but that can be tested easily, e.g.

  a = [1,2,3]
  foo(a)
  a << 4
  foo(a)

foo must of course print something to the screen or such so you see
when it's invoked.
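To make that test concrete, here is a sketch with a hand-rolled cache standing in for the library (this is not how Memoize itself is implemented, which I have not checked). The cache key is a deep copy of the argument, so later changes to the caller's array cannot alias into the cache.

```ruby
# Sketch only: CACHE, CALLS and this foo are hypothetical stand-ins.
CACHE = {}
CALLS = []

def foo(arr)
  # Deep copy via Marshal so the key is independent of the caller's object.
  key = Marshal.load(Marshal.dump(arr)).freeze
  CACHE.fetch(key) do
    CALLS << key
    puts "computing for #{arr.inspect}"   # shows when the body really runs
    CACHE[key] = arr.reduce(0, :+)
  end
end

a = [1, 2, 3]
first  = foo(a)          # computed
a << 4
second = foo(a)          # computed again: the argument changed
third  = foo([1, 2, 3])  # served from the cache, nothing printed
```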
> > An alternative approach would be to use current state as cache key,
> > i.e. create an immutable copy and stuff that along with calculation
> > results into a Hash.
>
> Hm, that's actually an approach worth considering. I do need to take
> memory use into account, unfortunately, but it seems this is worth
> testing.
>
> I assume this approach highly depends on (a) making sure #freeze freezes
> also all referenced objects (like in my original example) and (b) #==,
> #eql? and #hash are sensibly implemented (because Hash uses them for key
> comparison), right?
Correct. But I believe this is true for Memoize also, i.e. if you
decide to use something as key to determine whether a calculation has
to be redone you better make sure it properly implements #hash, #eql?
etc.
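As an illustration of "properly implements #hash, #eql?", a minimal sketch with an invented Block class whose instances compare by content, so that two equal blocks find the same Hash entry:

```ruby
# Hypothetical Block class; only the #==/#eql?/#hash contract is the point.
class Block
  attr_reader :elements

  def initialize(*elements)
    @elements = elements.sort.freeze
  end

  def ==(other)
    other.is_a?(Block) && elements == other.elements
  end
  alias eql? ==

  # Objects that are eql? must return the same hash value.
  def hash
    [self.class, elements].hash
  end
end

cache = {}
cache[Block.new(1, 2)] = :result
found = cache[Block.new(2, 1)]   # a different object with equal content
```

Without the #eql? and #hash overrides the lookup would fall back to object identity and miss.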
> > I suggest you do not inherit Set. In that case it's easy:
> > *you* define which methods change the state of your class.
>
> Hm, that's true; also, if I get all this right, the ones
> that change the state could simply clear Memoize's cache.
>
> I'll see how much of the stuff inherited from Set I actually use –
> quite a bit, I assume, but then I could simply make an instance variable
> of @set and pass all these method calls to it (while selectively
> invalidating cache)…
Delegator may help here although in this case I'd probably rather
explicitly forward method invocations because automatic delegation
does have issues of its own, for example it does not "correct" return
values:
$ irb -r delegate
irb(main):001:0> Delegat
DelegateClass Delegater Delegator
irb(main):001:0> s=[1,2,3]
=> [1, 2, 3]
irb(main):003:0> so = SimpleDelegator.new(s)
=> [1, 2, 3]
irb(main):004:0> so.size
=> 3
irb(main):005:0> so.object_id
=> 1073413300
irb(main):006:0> s.object_id
=> 1073463120
irb(main):007:0> so.each {}.object_id
=> 1073463120
irb(main):008:0>
/so.each/ should rather return /so/ but it returns /s/.
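A sketch of the explicit-forwarding alternative, using Forwardable from the standard library for the methods whose return values need no correction and a hand-written each that returns the wrapper. The class name Wrapper is made up.

```ruby
require 'forwardable'

# Sketch only: explicit forwarding with a corrected return value for #each.
class Wrapper
  extend Forwardable
  include Enumerable

  # Methods that can be passed through unchanged.
  def_delegators :@items, :size, :empty?

  def initialize(items)
    @items = items
  end

  # Written by hand so that it returns the wrapper, not the wrapped object.
  def each(&block)
    return enum_for(:each) unless block
    @items.each(&block)
    self
  end
end

s  = [1, 2, 3]
so = Wrapper.new(s)
returned = so.each {}
```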
> Thanks a *ton*, Robert; as usual, your reply is both
> invaluable and makes me think in the right direction.
Thank you! You're welcome! I am glad that my post proved useful.
Funny that you mention it: I had thought of copying the key via
Marshal.dump and Marshal.load. Using the marshaled string is a nice idea!
You can make this a tad more efficient by doing
@@cache[Marshal.dump(self).freeze][:beta_f] ||= outputs.to_blanket
because Hash copies (and freezes) unfrozen Strings that are used as
Hash keys in order to avoid aliasing effects; a String that is already
frozen is stored as is, which saves the copy.
Now, whether you use the String or demarshal probably mainly depends
on memory usage. If the String is short enough that approach is
certainly preferable because it incurs less processing overhead
(demarshaling).
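The key-copying behaviour can be observed directly. A small sketch with throwaway variable names; the last lines show a marshaled, frozen String used as a key the same way as in the line above, with an Array standing in for the real object.

```ruby
plain  = String.new("state")          # an unfrozen String
frozen = String.new("other").freeze   # an already frozen String

h = {}
h[plain]  = 1
h[frozen] = 2

stored_plain, stored_frozen = h.keys

copied    = !stored_plain.equal?(plain)   # Hash stored a frozen copy
kept_as_is = stored_frozen.equal?(frozen) # Hash stored the very same object

# The same with a marshaled state as key (the Array is a stand-in).
state_key = Marshal.dump([1, 2, 3]).freeze
cache = Hash.new { |hash, key| hash[key] = {} }
cache[state_key][:beta_f] ||= :expensive_result
```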
> So the Ether Bunny goes hippety hopity down the garden path, waylaying
> innocent fieldmice and anesthetising them, so he can sell their teeth
> to the Tooth Fairy to support his milk-and-cookies habit.
What kind of dope are *you* smoking? :-)
Cheers
Nice side effect of this thread. :-)
> Robert Klemme:
> > 2008/2/21, Shot (Piotr Szotkowski) <sh...@hot.pl>:
> > Now, whether you use the String or demarshal probably mainly depends
> > on memory usage. If the String is short enough that approach is
> > certainly preferable because it incurs less processing overhead
> > (demarshaling).
>
> Ok, I'm totally lost here. I don't see any demarshalling happening,
> just marshalling (and using that as a key)… What am I missing?
Demarshalling was the option I had thought of. A copy through
marshalling _might_ use less memory than the marshaled form (the
String).
> > What kind of dope are *you* smoking? :-)
>
> I'm a PHP programmer² by day.
Uh, oh. *cough* :-)
> ² http://civicrm.org/
>
> -- Shot (seriously, though, it's a sig of unknown origin from Stewart
> Stremler's collection: http://www-rohan.sdsu.edu/~stremler/sigs/sigs.html)