On Mar 8, 10:00 pm, David Powell <djpow...@djpowell.net> wrote:
> I think that memoization tends to be somewhat misunderstood.
Almost certainly! I count myself humbly among the number :-)
>
> My understanding of a general purpose memoize function is that when you
> call a memoized function, the parameter/results mapping will be stored
> forever, and when the function is called again with the same parameters, the
> memoized result can be returned in constant time, without having to
> re-execute the function. This isn't a shortcoming of memoize - it is just
> what memoize is designed to do.
This is fine as a definition, but it fails under low-memory conditions.
The SoftReference alternative has the same semantics, except that instead
of throwing OutOfMemoryError it discards some or all cached data in
preference to failing. Some values may then need to be recalculated, but
that should be safe, since your function is meant to be referentially
transparent anyway.
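To make the idea concrete, here is a minimal sketch of a soft-reference-backed memoizer. It is written in Java because SoftReference is a `java.lang.ref` class; the name `memoizeSoft` is illustrative, not from any existing library. Cached values are held through a SoftReference, so under memory pressure the JVM may clear them and the function is simply re-run on the next call, rather than the cache contributing to an OutOfMemoryError:

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class SoftMemoize {
    // Hypothetical sketch: like memoize, but each cached value is held
    // via a SoftReference, so the GC may reclaim it under memory pressure.
    public static <A, R> Function<A, R> memoizeSoft(Function<A, R> f) {
        Map<A, SoftReference<R>> cache = new ConcurrentHashMap<>();
        return a -> {
            SoftReference<R> ref = cache.get(a);
            R value = (ref == null) ? null : ref.get();
            if (value == null) {            // never computed, or cleared by the GC
                value = f.apply(a);         // safe to re-run: f is referentially transparent
                cache.put(a, new SoftReference<>(value));
            }
            return value;
        };
    }
}
```

Note that the map entries (keys plus cleared SoftReference wrappers) still accumulate; a fuller version would also drain a ReferenceQueue to prune dead entries, but that is orthogonal to the point being made here.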
>
> Memoization is sometimes used to take recursive O(2^n) algorithms and
> convert them to O(x^2) algorithms. To do this, you don't memoize the
> function and stick the function in a top-level def - you just ephemerally
> construct a memoized copy of the function, and arrange for the recursive
> calls to be made on that copy, discarding the copy when the function finally
> returns. For example, the longest-common-subsequence problem [1] can be
> solved using dynamic programming or it can be solved using memoization.
>
> [1]
http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
>
Agreed. But a SoftReference solution would also provide the same
performance, at least until it hits a low-memory condition, in which case
the O(x^2) algorithm would have failed anyway, so you can hardly
complain...
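The ephemeral-memoization pattern David describes, where the memo table lives only for the duration of one top-level call and is discarded when it returns, can be sketched as follows (again in Java for concreteness; the helper names are illustrative). This turns the naive exponential recursion for the longest-common-subsequence length into a polynomial one:

```java
import java.util.HashMap;
import java.util.Map;

public class Lcs {
    // Top-level entry point: constructs a fresh memo table, threads it
    // through the recursive calls, and lets it be garbage-collected on return.
    public static int lcsLength(String a, String b) {
        return go(a, b, a.length(), b.length(), new HashMap<>());
    }

    private static int go(String a, String b, int i, int j, Map<Long, Integer> memo) {
        if (i == 0 || j == 0) return 0;
        long key = (long) i * (b.length() + 1) + j;   // encode the (i, j) pair as one key
        Integer cached = memo.get(key);
        if (cached != null) return cached;
        int result = (a.charAt(i - 1) == b.charAt(j - 1))
                ? 1 + go(a, b, i - 1, j - 1, memo)
                : Math.max(go(a, b, i - 1, j, memo), go(a, b, i, j - 1, memo));
        memo.put(key, result);
        return result;
    }
}
```

Because the table is never stored in a top-level def, there is no long-lived cache to leak; each call to lcsLength pays only for its own subproblems.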
> That isn't to say that a more intelligent, tunable, application-level cache
> isn't a useful thing to have - it is, and it is probably more useful than
> memoize. But they are different things, with different purposes. Adding
> soft references to memoize wouldn't improve memoize - it would change it
> into something else.
>
> That a memoized function keeps input/output pairs around forever isn't a
> memory leak in memoize - it is a memory leak in whatever application doesn't
> properly manage the life of the function that memoize returns. For
> optimizing recursive algorithms you may need to throw it away, for lazy
> initialisation applications you probably don't care if the range of inputs
> is controlled.
Perhaps we should keep memoize as it is and provide a new function
("cachify"?) that has the same usage as memoize, apart from the
memory-sensitive caching behaviour. That would obviously work.
But then I can't think of a single situation where I would actually use
the original memoize over cachify. Hence I wonder whether it wouldn't be
better just to upgrade the original memoize. This would still be
consistent with how memoize is documented in the Clojure 1.2 API, and
would help prevent less experienced users of memoize from accidentally
creating "memory leaks".
I guess it comes down to what you think is the most reasonable default
behaviour.