memoized functions / methods


Dan Rosen

Jan 30, 2012, 3:23:28 PM
to scala-...@googlegroups.com
I've been thinking about submitting a new SIP, but I want to get a sanity check and some help first...  I would like to be able to memoize the results of referentially transparent functions / methods.  I can do this manually by implementing some caching strategy encapsulated in, say, a MemoizedFunction[-T, +R]...  But that feels unsatisfying; I want language support.

For memoizing methods, I'd propose a new keyword: 'memo' (as in 'memo def', analogous to 'lazy val').  The semantics of "memo def f(a_1: T_1, a_2: T_2, ..., a_n: T_N): R" would be twofold: statically, the transitive closure of all functions called by 'f' must be referentially transparent; and dynamically, the result of invoking 'f' on some arguments (a_1..a_n) would be equivalent to getOrElseUpdate on some in-memory cache (the size of which is, let's say, implementation-dependent) using the tuple of those arguments as the key.  Note that although 'f' will not likely be synchronized (because why would you synchronize a method without side effects?) the getOrElseUpdate implementation would nevertheless require synchronization.

For memoizing functions, I have to admit my proposal is a bit more nebulous...  I think a MemoizedFunction trait is a good idea, and I'd suggest that the eta expansion of a 'memo def' should produce a MemoizedFunction.  But there are a couple things that are unclear to me:

  - In the case of 'val f = new MemoizedFunction[A, B] { def apply(a: A): B = .... }', the compiler would need to inspect the 'apply' method body to analyze for referential transparency, and I don't necessarily like the idea that the compiler would need to treat any library type as special...

  - In the case of 'val f: A => B = { a => ... }', how to indicate syntactically that the anonymous function is referentially transparent?

A couple other scattered thoughts I had, that might be worth mentioning:

  - It's probably a good idea to allow interchangeable caching strategies.  For example, assume users might want to use a cache tuned for their application (such as ehcache).  I figure a 'Cache' typeclass with a sensible default 'implicit val' provided in predef would be the way to go.

  - I worry about the performance cost of this proposal, because it conflates the notion of referential transparency (and being able to reason about it statically) with the runtime strategy of caching results.  For example, if I define 'memo def plus(a: Int, b: Int): Int = a + b', the proposal would say that Int.+() needs to be declared as 'memo def' and consequently cached.  It may be that keeping the two concerns separate is better, so that Int.+() can be annotated as idempotent but without incurring runtime space penalty.
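To make the 'Cache' typeclass idea concrete, here's a rough sketch of what I have in mind (all names here are hypothetical, not a proposed API):

```scala
import scala.collection.mutable

// Hypothetical 'Cache' typeclass; a 'memo def' of arity 2 could then
// desugar into something like 'memoize2' below.
trait Cache[K, V] {
  def getOrElseUpdate(key: K)(compute: => V): V
}

object Cache {
  // Sensible default: an unbounded map, synchronized because the
  // getOrElseUpdate must be atomic even when the function itself is pure.
  implicit def default[K, V]: Cache[K, V] = new Cache[K, V] {
    private val store = mutable.Map.empty[K, V]
    def getOrElseUpdate(key: K)(compute: => V): V =
      store.synchronized(store.getOrElseUpdate(key, compute))
  }
}

// The tuple of arguments is the cache key, as described above.
def memoize2[A, B, R](f: (A, B) => R)(implicit c: Cache[(A, B), R]): (A, B) => R =
  (a, b) => c.getOrElseUpdate((a, b))(f(a, b))
```

So `memoize2((a: Int, b: Int) => a + b)` picks up the default cache implicitly, and an application could shadow it with, say, an ehcache-backed instance.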

Thoughts?
dr

Daniel Sobral

Jan 30, 2012, 3:31:39 PM
to scala-...@googlegroups.com
I haven't looked at the full proposal, though I'd also like
language-level memoization. However, I think "lazy def" would make
perfect sense and not create a new keyword.

--
Daniel C. Sobral

I travel to the future all the time.

Dan Rosen

Jan 30, 2012, 3:51:53 PM
to scala-...@googlegroups.com
On Monday, January 30, 2012 12:31:39 PM UTC-8, Daniel Sobral wrote:
> However, I think "lazy def" would make
> perfect sense and not create a new keyword.

The only beef I have with "lazy def" is that it doesn't hint at the fact that the function being defined is side-effect free.  I'd actually have suggested "pure def", but I know that scalaz uses 'pure' as a method name, and it's called frequently from client code.  And "idem def" just sounds goofy.

Anyway, would love to hear your thoughts on implementation...

dr

Paul Hudson

Jan 30, 2012, 5:26:19 PM
to scala-...@googlegroups.com
On 30 January 2012 20:23, Dan Rosen <dan....@gmail.com> wrote:
> I've been thinking about submitting a new SIP, but I want to get a sanity check and some help first...  I would like to be able to memoize the results of referentially transparent functions / methods.  I can do this manually by implementing some caching strategy encapsulated in, say, a MemoizedFunction[-T, +R]...  But that feels unsatisfying; I want language support.

Can you expand on why it feels unsatisfying? Purely aesthetics, or something else? 

Dan Rosen

Jan 30, 2012, 6:56:12 PM
to scala-debate
Aesthetics is definitely part of it, but aesthetics alone wouldn't
justify other similar language features like 'lazy val' and by-name
parameters... Similar to 'lazy val,' this 'memo def' thing provides a
distinct evaluation strategy with the particular requirement that the
function or method is indeed free of side effects. Either that
requirement must be verified through a static analysis (meaning it's a
language feature, not a library feature), or we drop the requirement
and say "you're free to memoize side-effecting method calls at your
own risk."

Obviously I've been avoiding using the term "effects system" here...
I sort of get the impression that another "hey, how's Scala's new
effects system coming along?" post would be too much of a troll, and
besides, I don't think I care about the granularity described in the
"Side-effect checking for Scala" talk---I think I only care that there
are no side effects. But anyway, that's essentially what I'm trying
to capture, to leverage it for safe memoization.

So I guess the question becomes: is it important to prove that it's
safe to memoize? I'm assuming so, because I think it would be very
difficult to track down bugs due to mis-memoization, but I'm curious
to know what others think.

dr

Daniel Sobral

Jan 30, 2012, 10:51:43 PM
to Dan Rosen, scala-debate
On Mon, Jan 30, 2012 at 21:56, Dan Rosen <dan....@gmail.com> wrote:
> [...]
> So I guess the question becomes: is it important to prove that it's
> safe to memoize?  I'm assuming so, because I think it would be very
> difficult to track down bugs due to mis-memoization, but I'm curious
> to know what others think.

I think proving something in Scala is side-effect free would easily
get its own SIP, much bigger than memoization.

rkuhn

Jan 31, 2012, 2:24:53 AM
to scala-...@googlegroups.com
Hi Dan,

memoization is not something for which the one and only true solution exists, meaning it cannot generically be done right once and for all. Based on this observation and the fact that you can easily memoize functions (albeit not methods directly) using existing infrastructure, I don't see the point of adding "language support". I don't know enough about the proposed macros, but couldn't they possibly abstract over arity?
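For completeness, the function case really is a few lines over a map, e.g. (a sketch, nothing more):

```scala
import scala.collection.mutable

// Memoize any Function1 with a plain synchronized map: no language
// support needed. The cache is unbounded, which is exactly why the
// "right" eviction policy is application-specific.
def memoize[A, B](f: A => B): A => B = {
  val cache = mutable.Map.empty[A, B]
  a => cache.synchronized(cache.getOrElseUpdate(a, f(a)))
}
```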

Regards,

Roland

Eugene Burmako

Jan 31, 2012, 3:09:22 AM
to scala-...@googlegroups.com
If macro annotations make it into Scala, it'd be possible to write "@memo def foo = bar" (or even "@pure def foo = bar", since pure here is a TypeName, not a TermName pure from Scalaz). Then in the macro you'll be able to: 1) rewrite the body of foo to your liking, 2) perform static analysis of your liking. 

The only restriction with this is the fact that you'll only have ASTs for the function itself and for its brothers and sisters from the same compiler run. Analyzing ASTs of program entities from already compiled libraries will require decompilation or having annotations on them (annotations can be picked reflectively, ASTs can not).

Lukas Rytz

Jan 31, 2012, 4:23:38 AM
to Eugene Burmako, scala-...@googlegroups.com

On Tuesday, 31. January 2012 at 09:09, Eugene Burmako wrote:

> If macro annotations make it into Scala, it'd be possible to write "@memo def foo = bar" (or even "@pure def foo = bar", since pure here is a TypeName, not a TermName pure from Scalaz). Then in the macro you'll be able to: 1) rewrite the body of foo to your liking, 2) perform static analysis of your liking.
>
> The only restriction with this is the fact that you'll only have ASTs for the function itself and for its brothers and sisters from the same compiler run. Analyzing ASTs of program entities from already compiled libraries will require decompilation or having annotations on them (annotations can be picked reflectively, ASTs can not).

If that ever becomes a need then it's certainly possible to pickle ASTs, the
infrastructure is already there.

Eugene Burmako

Jan 31, 2012, 4:38:35 AM
to Lukas Rytz, scala-...@googlegroups.com
Will it be a significant performance hit to auto-pickle ALL ASTs?

Dennis Haupt

Jan 31, 2012, 5:18:13 AM
to scala-...@googlegroups.com
Take a look at Clojure: it can wrap any function with a caching one. This is possible because in Clojure you can access all of a function's parameters and statements as an actual list and rearrange it at runtime as you like. In the memoize case, the parameter list is used as the key for a caching map.

in scala, you'd need to write a wrapper for any single one of the function-traits
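Although the per-arity wrappers are mostly mechanical once you have the unary one, via tupling. A sketch:

```scala
import scala.collection.mutable

// unary memoizer over a synchronized map
def memoize[A, B](f: A => B): A => B = {
  val cache = mutable.Map.empty[A, B]
  a => cache.synchronized(cache.getOrElseUpdate(a, f(a)))
}

// the Function2 wrapper just goes through the tupled form, so the
// parameter tuple is the cache key, as in the Clojure case
def memoize2[A, B, R](f: (A, B) => R): (A, B) => R =
  Function.untupled(memoize(f.tupled))
```

(`Function.untupled` only exists up to arity 5, so beyond that the boilerplate comes back.)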


Lukas Rytz

Jan 31, 2012, 5:27:43 AM
to Eugene Burmako, scala-...@googlegroups.com
We would have to find that out by experience. I think serializing the trees into a byte
array should not take too much time, but then there's writing the larger classfiles. And
the size of the classfiles might be a bigger issue.

Eugene Burmako

Jan 31, 2012, 5:33:46 AM
to Dennis Haupt, scala-...@googlegroups.com
> in scala, you'd need to write a wrapper for any single one of the function-traits

Not necessarily, if you use macros.

Jason Zaugg

Jan 31, 2012, 5:44:54 AM
to Eugene Burmako, Dennis Haupt, scala-...@googlegroups.com
On Tue, Jan 31, 2012 at 11:33 AM, Eugene Burmako <eugene....@epfl.ch> wrote:
> > in scala, you'd need to write a wrapper for any single one of the function-traits
> not necessarily, if you use macros.

I was just wondering how macros could be used to conveniently implement single abstract method (SAM) interfaces.

Will it be possible to write a macro that converts:

  trait Compute { def doit(in: Double): Double }
  sam[Compute]((_: Double) * 2)

To:

  new Compute { def doit(in: Double) = in * 2 }

The motivation is two-fold, firstly to avoid writing the wrapper type manually for each SAM interface, and secondly to achieve better performance by avoiding boxing and the megamorphic call to FunctionN#apply. 

Eugene Burmako

Jan 31, 2012, 5:55:18 AM
to scala-debate
If you make "def sam[T](fn) = ..." a macro, then yes.

According to our current vision, a macro can generate whatever code it
likes. Thus, any code written manually can be expressed with a macro,
as long as you're satisfied with the provided entry points (method
invocation for macro defs, type reference for macro types, and
annotations for macro annotations).


Simon Ochsenreither

Jan 31, 2012, 6:37:26 AM
to scala-...@googlegroups.com, Lukas Rytz

Just saying...



Sorry. :-)

Daniel Sobral

Jan 31, 2012, 7:15:49 AM
to Eugene Burmako, scala-debate
On Tue, Jan 31, 2012 at 08:55, Eugene Burmako <eugene....@epfl.ch> wrote:
> If you make "def sam[T](fn) = ..." a macro, then yes.
>
> According to our current vision, a macro can generate whatever code.
> Thus, any code written manually can be expressed with a macro, given
> you're satisfied with provided entry points (method invocation for
> macro defs, type reference for macro types and annotation for macro
> annotations).

This is actually possible right now in the wip tree, isn't it?


Adam Jorgensen

Jan 31, 2012, 7:33:03 AM
to scala-...@googlegroups.com
Hmmmm, I'm not sure building memoization into the language is really a good idea.

While the basics of memoizing a function are simple enough, there are just so many
options when it comes to the mechanics of caching that it seems like any single
solution would be unable to satisfy everyone's needs.

Daniel Sobral

Jan 31, 2012, 8:42:51 AM
to Adam Jorgensen, scala-...@googlegroups.com

But it doesn't need to, does it? There are many ways to do a "lazy
val", but I don't see anyone out there implementing alternatives, or
complaining about the one Scala chose.

If Scala had built-in memoization, that would not prevent people with
specific needs from implementing their own solution. Rather, it would
simply make a lot of code simpler, and get some people who have never
heard of it, or don't want to spend the effort implementing it,
to use and benefit from it.

Johannes Rudolph

Jan 31, 2012, 9:10:47 AM
to Daniel Sobral, Adam Jorgensen, scala-...@googlegroups.com
On Tue, Jan 31, 2012 at 2:42 PM, Daniel Sobral <dcso...@gmail.com> wrote:
> But it doesn't need to, does it? There are many ways to do a "lazy
> val", but I don't see anyone out there implementing alternatives, or
> complaining about Scala-choosen one.

I am.

https://issues.scala-lang.org/browse/SUGGEST-11

--
Johannes

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net

Daniel Sobral

Jan 31, 2012, 9:45:59 AM
to Johannes Rudolph, Adam Jorgensen, scala-...@googlegroups.com
On Tue, Jan 31, 2012 at 12:10, Johannes Rudolph
<johannes...@googlemail.com> wrote:
> On Tue, Jan 31, 2012 at 2:42 PM, Daniel Sobral <dcso...@gmail.com> wrote:
>> But it doesn't need to, does it? There are many ways to do a "lazy
>> val", but I don't see anyone out there implementing alternatives, or
>> complaining about Scala-choosen one.
>
> I am.
>
> https://issues.scala-lang.org/browse/SUGGEST-11

You are the exception that proves the rule! ;-)

By the way, I just read your suggestion, and I'd like to comment on
the last remark you make, about the JVM optimizing the
implementation. Recent benchmarks with code that uses lazy vals
indicate that, when encountering them, the JVM simply gives up on
optimizing. It's an order-of-magnitude switch: replace a val with a
lazy val and see your performance decrease tenfold.

So the claim that the JVM will optimize it fairly well cannot be taken
for granted.
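For context, a Scala 2 `lazy val x = expr` compiles to roughly the following double-checked locking shape (a hand-written approximation, not the exact emitted code), which is the part the JIT reportedly struggles with:

```scala
// Approximation of what the compiler generates for 'lazy val x'.
class LazyCell(expr: => Int) {
  @volatile private[this] var bitmap = false // "x is initialized" flag
  private[this] var value: Int = 0

  def x: Int = {
    if (!bitmap) {
      this.synchronized {          // the lock the JIT has to reason about
        if (!bitmap) {
          value = expr             // evaluate the right-hand side once
          bitmap = true
        }
      }
    }
    value
  }
}
```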

Dan Rosen

Jan 31, 2012, 11:31:15 AM
to Adam Jorgensen, scala-...@googlegroups.com
RKuhn mentioned this concern as well. One thing I guess I should
reiterate from my original post, since I absolutely agree that there
is no "one size fits all" caching policy:

> It's probably a good idea to allow interchangeable caching strategies.
> For example, assume users might want to use a cache tuned for their
> application (such as ehcache). I figure a 'Cache' typeclass with a
> sensible default 'implicit val' provided in predef would be the way to go.

Cheers,
dr

Bernd Johannes

Feb 1, 2012, 1:36:50 AM
to scala-...@googlegroups.com, Dan Rosen

For what it's worth - I would prefer 'deterministic' (borrowed from various SQL dialects). That doesn't say that something is actually memoized, but indicates that it COULD be memoized if the implementation decides to do so.


That opens up the freedom for the implementation to drop memoization at the cost of performance if the need arises (runtime architecture, memory consumption, solution space).


The actual memoization and all the possible behaviour of it could be delegated to a "global" memoization handler with one instance per thread (I don't assume that sharing the caches between multiple threads in a concurrent setup will perform well - but this could be customizable behaviour as well).


Global because this opens up the possibility to implement cache balancing mechanisms over several memoized functions (e.g. a global LRU dropout scheme or something balancing LRU and cost of calculation). From my experience it's not satisfactory to throw just a caching Map at a function when the solution space is non-trivial.


An application could replace the default handler (which, e.g., could do simply nothing, in which case all memoization interception code would silently be erased) with one that implements exactly the memoization behaviour needed for that specific application.


The hard part is to catch nondeterministic operations in the function body as well as possible. The first step would simply be to delegate that completely to the developer - 'deterministic' would then be a mere tag, without guarantees or enforced behaviour restrictions.
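A rough library-level sketch of that handler idea (all names here are invented for illustration):

```scala
import scala.collection.mutable

// Hypothetical global memoization handler, one instance per thread.
// The default recomputes everything, so 'deterministic' stays a mere
// tag; an application can install a real caching policy instead.
trait MemoHandler {
  def intercept[K, V](key: K)(compute: => V): V
}

object NoMemo extends MemoHandler {
  def intercept[K, V](key: K)(compute: => V): V = compute // always recompute
}

final class MapMemo extends MemoHandler {
  private val store = mutable.Map.empty[Any, Any]
  def intercept[K, V](key: K)(compute: => V): V =
    store.getOrElseUpdate(key, compute).asInstanceOf[V]
}

object MemoHandler {
  private val current = new ThreadLocal[MemoHandler] {
    override def initialValue(): MemoHandler = NoMemo
  }
  def get: MemoHandler = current.get()
  def install(h: MemoHandler): Unit = current.set(h)
}
```

A 'deterministic'-tagged function would then route its body through `MemoHandler.get.intercept`, and a global handler could implement cross-function balancing (LRU, cost-aware dropout, etc.).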


Just my 5 cents

Greetings Bernd

Alex Repain

Feb 1, 2012, 2:51:41 PM
to Bernd Johannes, scala-...@googlegroups.com, Dan Rosen
Just my 2 cents about that:

memoization can be useful even without the function being "pure". It
can become tricky if you decide to use it and are a tricky programmer,
but it can actually turn into a feature. One use case I thought of:

you run an online app which simply ranks remote users as they click
on some button in your webpage. Let's assume you have users A, B, C and
so on.

var i = 0
memo def rank(u: User) = {i = i + 1; i}


The result is that A, B, and C can click as many times as they want;
their ranking won't be altered after their first click.

Let's assume they clicked in the order (A,C,B), represented by a
collection, users = (A,C,B). We could have done :

var i = 0
val rank: Map[User, Int] = users.map(u => (u, { i = i + 1; i })).toMap // or
some cleaner way without a var, but let me get my point across please

But we don't have 'users'. The order of the elements will only be
available at runtime, and only after an undetermined time. Memoization
allows for a progressive treatment of this "to-be collection",
possibly using side effects, with fairly easy-to-read code.
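For reference, the same first-click-wins behaviour is expressible today with an explicit cache (a sketch; User is stubbed as a String here):

```scala
import scala.collection.mutable

object Ranks {
  type User = String // stand-in for the real user type

  private var i = 0
  private val cache = mutable.Map.empty[User, Int]

  // First click wins: later clicks hit the cache and keep the rank
  // assigned on the first click, like the hypothetical 'memo def'.
  def rank(u: User): Int =
    cache.getOrElseUpdate(u, { i += 1; i })
}
```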



--
*Alex REPAIN*
Computer Science Engineer
Scala-enthusiast
