I have a class Foo which has a number of heavy, CPU-intensive,
number-crunching methods that take parameters.
class Foo
  attr_reader :x, :y, ...

  def calc1(a,b,c)
    ... # complicated, time-consuming calculation...
  end

  def calc2(a,b)
    ... # complicated, time-consuming calculation...
  end

  ...
end
Foo is immutable. If a client calls a method more than once using the
same set of parameters, the same result will be calculated and
returned.
To avoid recalculating a result when a client calls a method more
than once with the same set of parameters, I revised Foo so that each
method caches its results in a hash, one entry per parameter set. The
keys in the hashes are arrays of the parameters.
class Foo
  def calc1(a,b,c)
    param_array = [a,b,c]
    @cache_calc1 ||= {}
    return @cache_calc1[param_array] if @cache_calc1[param_array]
    ans = ... # same complicated, time-consuming calculation...
    @cache_calc1[param_array] = ans
  end

  def calc2(a,b)
    param_array = [a,b]
    @cache_calc2 ||= {}
    return @cache_calc2[param_array] if @cache_calc2[param_array]
    ans = ... # same complicated, time-consuming calculation...
    @cache_calc2[param_array] = ans
  end
end
This works great -- calculations for a given set of parameters on
each method are now done only once. However, the code seems rote,
repetitive and intrusive to the original methods.
Is there a way to do this in Ruby in a more declarative, more DRY way?
I am envisioning it should be possible to add a simple, single line to
the original Foo, a la
class Foo
  cache_calculation :calc1, :calc2
  # rest of Foo unchanged
end
My newbie Ruby skills are not up to the task of writing such a
cache_calculation method on Class, one that wraps the original
method, calls it, and provides the caching, but it seems it should be
simple to do.
class Class
  def cache_calculation(*syms)
    syms.each do |sym|
      ???
    end
  end
end
Can anyone provide the necessary snippet or otherwise advise or point
me to a solution to this?
Thanks!
Brian
require "memoize"
include Memoize
# Inefficient fibonacci method
def fib(n)
return n if n < 2
fib(n-1) + fib(n-2)
end
fib(100) # Slow
memoize(:fib)
fib(100) # Fast
Regards,
Sean
Something like this may be useful to you:

class Module
  def once_with_args(*methods)
    methods.each do |m|
      class_eval <<-"END_OF_EVAL"
        alias_method :__once_with_args__#{m.to_i}__, :#{m.to_s}
        private :__once_with_args__#{m.to_i}__

        def #{m.to_s}(*args, &block)
          fail "Object should be frozen" unless frozen?
          fail "Block handling is not yet implemented in once_with_args" if block_given?

          @@__once_with_args__ ||= {}
          @@__once_with_args__[self] ||= {}
          @@__once_with_args__[self][#{m.to_i}] ||= {}
          @@__once_with_args__[self][#{m.to_i}][args] ||= __once_with_args__#{m.to_i}__(*args)

          # We should handle &block as a parameter, since it might be different each time. I don't know how to do this...
        end
      END_OF_EVAL
    end
  end
end

class Foo
  def calc1(a,b,c)
    p [:org_calc1, [a, b, c]]   # DEBUG
    1   # same complicated, time-consuming calculation...
  end

  def calc2(a,b)
    p [:org_calc2, [a, b]]   # DEBUG
    2   # same complicated, time-consuming calculation...
  end

  once_with_args :calc1
  once_with_args :calc2
end
j.
--
"http://ruby-lang.org -- do you ruby?"
Jeff Wood
Thank you, and wowser! memoize seems to do just what I'd been looking
for. A handful of lines, and I can see it works.
# my test code
f = Foo.new
f.memoize(:calc1)
f.memoize(:calc2)
f.calc1(1,2,3)
f.calc1(1,2,3)   # yep, it's working
f.calc1(7,8,9)   # yep, it's working
If I can ask a Ruby 101 question about it... Isn't "cache" in the
module (pasted below in its entirety, it is so short) only a local
variable? Why does it retain state? How can one, for example, access
the cache to see its state?
Thanks!
--Brian
module Memoize
  MEMOIZE_VERSION = "1.0.0"

  def memoize(name)
    meth = method(name)
    cache = {}
    (class << self; self; end).class_eval do
      define_method(name) do |*args|
        cache[args] ||= meth.call(*args)
      end
    end
  end
end
class Foo
  include Memoize

  def calc1( *args )
    # do something here
  end

  memoize :calc1
end

f = Foo.new
f.calc1( 1,2,3 )
f.calc1( 1,2,3 )
f.calc1( 7,8,9 )
..
It's a mixin: the module's functionality becomes part of the class's
functionality, so memoize wraps calls on the individual object it was
called from.
Hope that makes sense.
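(A toy illustration of a mixin in general; my own example, unrelated to memoize's internals:)

module Greeter
  def hello
    "hi from #{self.class}"
  end
end

class Thing
  include Greeter   # Greeter's instance methods become instance methods of Thing
end

Thing.new.hello   # => "hi from Thing"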
j.
On 10/17/05, Brian Buckley <briank...@gmail.com> wrote:
This is the magic of closures. When you use define_method, it takes a
block as an argument, which defines the method body. This block is a
closure[1]. A nifty thing about closures is that they capture the
environment in which they are defined. In this case, the closure
representing the body of the method can capture the cache variable
from the environment in which it's defined. The cache variable itself
goes out of scope at the end of the memoize method, but the object it
referred to is still referred to within the closure, so the object
remains accessible within that closure.
Jacob Fugal
[1] http://en.wikipedia.org/wiki/Closure_(computer_science)
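(A minimal illustration of that capture; a toy example of mine, not part of memoize:)

def make_counter
  count = 0                # local variable; seemingly gone once the method returns...
  lambda { count += 1 }    # ...but the lambda is a closure that captures it
end

counter = make_counter
counter.call   # => 1
counter.call   # => 2 (count lives on inside the closure)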
If you hadn't noticed, "meth" is also a local variable. It looks like
magic doesn't it? ;)
OK - I'll attempt to explain this.
> module Memoize
> MEMOIZE_VERSION = "1.0.0"
> def memoize(name)
> meth = method(name)
1. Get a reference to the existing method. This is a ~bit~ like a
function pointer.
> cache = {}
2. Set up a hash to store memoized results.
> (class << self; self; end).class_eval do
3. Common idiom to define a method on the singleton class of the object
> define_method(name) do |*args|
Here you are defining a method with a do..end block. A block in Ruby
is a /closure/ - this means it captures any variables in scope at the
time the block is defined (i.e. it captures the /binding/ in effect
inside the memoize method). When you call the newly defined method,
Ruby will evaluate the block in the context of the captured binding,
so it will be able to see any variables visible inside the memoize
method.
> cache[args] ||= meth.call(*args)
4. If cache[args] has been defined, return it, else set cache[args] to
the value of the original method call and return it
> end
> end
> end
> end
>
The key concept to grok is closure. This is a Comp Sci term for a
function that carries around the environment it was defined in.
(Better plain English definition anyone?)
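(As an aside, here is a tiny sketch of that singleton-class idiom from step 3; my example, not part of memoize:)

obj = Object.new

singleton = (class << obj; self; end)   # the singleton class of obj
singleton.class_eval do
  define_method(:hello) { "defined on obj alone" }
end

obj.hello                        # => "defined on obj alone"
Object.new.respond_to?(:hello)   # => false, other objects are unaffected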
HTH,
Regards,
Sean
Yes it is a local variable, but the block used in define_method is a
closure, so the cache variable is accessible from within that block.
It is done this way instead of using an instance variable or other
non-local variables because there needs to be a cache for each method
that is memoized. Since each invocation of memoize creates a new local
cache for that memoized method, each method has its own cache. It is a
very clever use of closures.
There may be some trick to access the cache, but it isn't
very straightforward. This is because the block is used to
define a method, and when you try to get that block back using
"method" to access its binding, the binding is for the object in
question, not for the closure of the original block. Hope that made
sense :)
Do you *really* need to look at the cache? If you hadn't read the
source code for memoize you wouldn't realize it was there, so.......
Ryan
Thank you for the closure explanation. I'll have to experiment.
It appears then that the cache cannot be accessed for inspection
(for example, to see the history of the different calculations made);
only the method created by define_method has use of it.
Brian
OT If anyone knows why my email (gmail) is double-posting, please advise
On 10/17/05, Brian Buckley <briank...@gmail.com> wrote:
>
Not with the current implementation of memoize, no.
> OT If anyone knows why my email (gmail) is double-posting, please advise
Don't worry, it only double posts to you, not to the group. I have the
same issue (also with gmail), but after being assured the doubles
aren't going to the group in general, I have learned to live with the
minor inconvenience.
Jacob Fugal
See below:

> module Memoize
>   MEMOIZE_VERSION = "1.0.0"
>
>   def memoize(name)
>     meth = method(name)
>     cache = {}
>     (class << self; self; end).class_eval do
>       define_method(name) do |*args|
>         cache[args] ||= meth.call(*args)
>       end

        define_method("#{name}_cache") do
          cache
        end

>     end
>   end
> end

Untested, but should work.
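(For example, with that change in place the cache could be inspected like this; a hypothetical usage sketch, reusing Brian's Foo with Memoize included:)

f = Foo.new
f.memoize(:calc1)

f.calc1(1, 2, 3)
p f.calc1_cache   # => the cache hash, e.g. {[1, 2, 3] => <result of calc1(1,2,3)>}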
Regards,
Pit
It does indeed work. You rock!
Hmmm... That would allow pre-populating the cache with commonly used
parameter sets. My brain is spinning.
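(Something like this, perhaps; an untested sketch assuming Pit's calc1_cache accessor, with 42.0 as a stand-in value:)

f = Foo.new
f.memoize(:calc1)

f.calc1_cache[[1, 2, 3]] = 42.0   # seed a precomputed result for this parameter set
f.calc1(1, 2, 3)                  # => 42.0, returned without running the calculation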
Brian
A simple modification to memoize to return the cache would let you see it:

module Memoize
  MEMOIZE_VERSION = "1.0.0"

  def memoize(name)
    meth = method(name)
    cache = {}
    (class << self; self; end).class_eval do
      define_method(name) do |*args|
        cache[args] ||= meth.call(*args)
      end
    end
    cache   # <<< add this line
  end
end
include Memoize

def foo(*args)
  puts "calculating foo(#{args.map{|x| x.inspect}.join(',')})"
  args.inject(0) {|sum, x| sum + x}
end

cache = memoize(:foo)
p cache

puts foo(2)
puts foo(2)
puts foo(2,3,4)
puts foo(2,3,4)

p cache

__END__
{}
calculating foo(2)
2
2
calculating foo(2,3,4)
9
9
{[2, 3, 4]=>9, [2]=>2}
Regards,
Sean
This won't work (calling memoize in the class body, as in the earlier
"memoize :calc1" example): "undefined method `memoize' for Foo:Class
(NoMethodError)"
You could extend Memoize, and define a class method to memoize, I
suppose. I'm also open to suggestions.
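(Roughly along these lines, perhaps; an untested sketch, and memoize_instance_method is a made-up name, not something the memoize library provides:)

module Memoize
  # Hypothetical class-level variant: wraps an *instance* method with a per-object cache.
  def memoize_instance_method(name)
    original = instance_method(name)
    define_method(name) do |*args|
      @__memo_cache ||= {}                                    # one cache hash per object
      @__memo_cache[[name, args]] ||= original.bind(self).call(*args)
    end
  end
end

class Foo
  extend Memoize   # memoize_instance_method becomes a class method of Foo

  def calc1(a, b, c)
    # complicated, time-consuming calculation...
    a + b + c
  end
  memoize_instance_method :calc1
end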
Regards,
Dan
I'm debating between this suggestion and Pit's (where you access the cache via
its name, as an instance method). I dunno - what do people prefer?
This one is certainly simpler :)
Regards,
Dan
Here's one way to do it:

class Foo
  include Memoize

  def initialize
    memoize :foo
  end

  def foo(*args)
    puts "calculating foo(#{args.map{|x| x.inspect}.join(',')})"
    args.inject(0) {|sum, x| sum + x}
  end
end
f = Foo.new
puts f.foo(2)
puts f.foo(2)
__END__
calculating foo(2)
2
2
Regards,
Sean
Sean
Eh? If you memoize in initialize, why wouldn't another instance of Foo get the
benefit?
Regards,
Dan
I prefer Pit's. Otherwise you'll have to hang on to each cache
yourself (should you memoize several methods), instead of just letting
the class do it.
Ryan
Agreed. Pit's is better (even if more complicated ;)
Sean
Try it:

class Foo
  include Memoize

  def initialize
    memoize :foo
  end

  def foo(*args)
    puts "calculating #{self}.foo(#{args.map{|x| x.inspect}.join(',')})"
    args.inject(0) {|sum, x| sum + x}
  end
end

f = Foo.new
puts f.foo(2)
puts f.foo(2)

g = Foo.new
puts g.foo(2)

__END__
calculating #<Foo:0x28705e8>.foo(2)
2
2
calculating #<Foo:0x28703d8>.foo(2)
2
It's because you're defining the memoized method on the singleton class:
(class << self; self; end).class_eval do ... end
Regards,
Sean
> > I prefer Pit's. Otherwise you'll have to hang on to each cache
> > yourself (should you memoize several methods), instead of just letting
> > the class do it.
> >
> > Ryan
>
> Agreed. Pit's is better (even if more complicated ;)
>
> Sean
Alrighty, then. I'll make that change in the next release.
Thanks, all, for the input!
Dan
My code was meant as a quick hack for the OP to be able to experiment
with the cache. I wouldn't include it in the library, though, because I
don't like polluting the namespace with those additional methods.
What are the use cases for accessing the cache? (I remember someone
talking about filling it with predetermined values.)
Maybe there are other ways to implement the desired functionality.
Regards,
Pit
> My code was meant as a quick hack for the OP to be able to experiment
> with the cache. I wouldn't include it in the library, though, because I
> don't like polluting the namespace with those additional methods.
Hm...ok, then I could just return the cache itself, as Sean originally
suggested. Not as slick, but it does avoid namespace pollution.
Dan
Here's another change to memoize for your consideration:
Suppose the result of a method call is nil or false, and suppose it
takes a lot of work to find that out. As written, methods that return
nil or false get no speedup from memoize.
How about changing memoize to something like below?
define_method(name) do |*args|
  # cache[args] ||= meth.call(*args)   # what if the cached value is nil or false?
  cache.has_key?(args) ? cache[args] : cache[args] ||= meth.call(*args)
end
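(A quick illustration of why ||= alone isn't enough; my toy example, not from the library:)

def expensive_nil
  puts "computing..."
  nil                     # an expensive calculation whose answer happens to be nil
end

cache = {}
2.times { cache[[1]] ||= expensive_nil }   # prints "computing..." twice: nil defeats ||=
cache.has_key?([1])                        # => true, so has_key? is the reliable test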
Brian
Good idea! Thanks!
Dan