[SIP-21] Suggestions for improving SIP-21 - Spores


Jim Powers

Jun 19, 2013, 11:22:52 AM
to scala...@googlegroups.com

I think this SIP (SIP-21) is a good start, but the guarantees and safety can be improved. I would like to highlight the following as considerations for improving the specification for spores. The areas I will address are:

  • Compile-time guarantees
  • Function lifting
  • Run-time sanity checks

Compile-time guarantees

As things stand, the current proposal provides very weak guarantees about correct behavior of spores at run-time. I think this situation can and should be improved. For instance, the current specification allows the following construct with no error:

class Bar(file:File) {
  private val stream:FileInputStream = // ...

  val takeN = spore {
    val localStream = stream
    (n:Int) => {
      val r = new Array[Byte](n)
      localStream.read(r)
      r
    }
  }
}

This can be improved.

What I would propose is that values captured by spores be required to have a specific typeclass associated with their type (this idea is stolen from Cloud Haskell) that constrains what can be captured sanely. For the purpose of this discussion I'll use a typeclass called Sporable. In discussions with Philipp Haller he suggested going directly to Pickleable (based on the new Scala pickling system), which may be fine; I don't want to get too bogged down in serialization details at this point. The issue is that the typeclass needs to be more than a mere marker: you should be required to provide useful functionality necessary for the correct operation of a spore, in particular in a non-local (distributed) environment. A meaningful serialization/deserialization implementation would be a good start.

So, the implementation of the spore macro would examine all of the types of the captured values and search through implicit scope for an implementation of an appropriate typeclass instance of Sporable for the target type. If no such instance can be found then fail in the usual ways.

Sporable typeclass instances can be provided for all scala.collection.immutable types. Users can provide their own instances of Sporable for their own types. Yes, this means that if someone wants to pretend that they can correctly serialize a FileInputStream or an object behind an arbitrary trait/interface then more power to them.
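
To make the typeclass idea concrete, here is a minimal sketch of what such a check could look like with ordinary implicit resolution. Everything in it is an assumption for illustration: the members `write` and `read`, the two instances, and the `capture` helper are hypothetical, and `capture` merely stands in for the lookup the spore macro would perform on each captured value.

```scala
import scala.annotation.implicitNotFound

object SporableSketch {
  // Hypothetical typeclass: evidence that a T can be shipped to another machine.
  @implicitNotFound("No Sporable instance for ${T}; it cannot be captured by a spore")
  trait Sporable[T] {
    def write(value: T): Array[Byte]
    def read(bytes: Array[Byte]): T
  }

  object Sporable {
    // Illustrative instances only; a real library would cover many more types.
    implicit val intSporable: Sporable[Int] = new Sporable[Int] {
      def write(value: Int): Array[Byte] =
        java.nio.ByteBuffer.allocate(4).putInt(value).array()
      def read(bytes: Array[Byte]): Int =
        java.nio.ByteBuffer.wrap(bytes).getInt()
    }
    implicit val stringSporable: Sporable[String] = new Sporable[String] {
      def write(value: String): Array[Byte] = value.getBytes("UTF-8")
      def read(bytes: Array[Byte]): String = new String(bytes, "UTF-8")
    }
  }

  // Stand-in for the macro's check: compiles only if an instance is in scope,
  // then round-trips the value through its serialized form.
  def capture[T](value: T)(implicit ev: Sporable[T]): T = ev.read(ev.write(value))
}
```

With these definitions, `SporableSketch.capture(42)` compiles, whereas capturing a `java.io.FileInputStream` is rejected at compile time unless someone deliberately supplies an instance for it.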

Function lifting

Consider the following code:

def foo[I,O](f:I => O) = spore {
  val localF = f
  (i:I) => localF(i)
}

While one may be quick to point out that for safety higher-order functions should transitively require Spore types, I think this is too restrictive. I propose something lighter-weight that, for instance, does not require changing existing interface types in order to work with spores. In particular, I propose a @sporable annotation that can be used on methods and functions. For example:

@sporable
def doubleString(i:String):String = i+i

Would invoke a compile-time check to indicate that the method doubleString is, in fact, a "sporable" function. In the context of a method/function this would mean that the function body abides by more-or-less the same rules as a spore body: the only values captured are arguments, all other values are introduced in the body of the function/method, arguments to the function must have a Sporable typeclass instance in scope, and function arguments must be tagged @sporable or be of a Spore type. The type of the method would still be Function1[String,String].

The original HOF can now be written:

def foo[I,O](f:I => O @sporable) = spore {
  val localF = f
  (i:I) => localF(i)
}

Which can be used at call sites to ensure that the only functions passed to foo are of type Spore[?,?] or tagged as @sporable essentially enabling safe lifting into the spore context.

I suspect that implementing this will require macro annotations.

Run-Time Sanity checks

Since what a spore generates during serialization can be a binary blob of JVM bytecode and data it may be useful to indicate limits on exactly how big these blobs can get. I would like to propose a way to produce warnings and errors when the run-time properties of a serialized spore exceed thresholds. For example:

def foo[I,O](f:I => O @sporable) = spore(warn = 1 kilobyte) {
  val localF = f
  (i:I) => localF(i)
}

def foo[I,O](f:I => O @sporable) = spore(error = 2 kilobytes) {
  val localF = f
  (i:I) => localF(i)
}

def foo[I,O](f:I => O @sporable) = spore(warn = 1 kilobyte, error = 2 kilobytes) {
  val localF = f
  (i:I) => localF(i)
}

This would cause the implementation of spore to produce useful information when user-defined thresholds are met. A further refinement would be that a specific logging object can be supplied for where to log these messages.

def foo[I,O](f:I => O @sporable) = spore(warn = 1 kilobyte, log = myLogger) {
  val localF = f
  (i:I) => localF(i)
}

Another useful feature would be to just log generated sizes:

def foo[I,O](f:I => O @sporable) = spore(logSize = true, log = myLogger) {
  val localF = f
  (i:I) => localF(i)
}
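
The threshold idea can be approximated today without any spore support. The following is only a sketch under stated assumptions: it measures size with plain Java serialization rather than whatever format spores would actually use, and the names `SizeReport`, `serializedSize`, and `check` are hypothetical.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

object SporeSizeSketch {
  // Hypothetical result type for a size check.
  sealed trait SizeReport
  case object SizeOk extends SizeReport
  final case class SizeWarning(bytes: Int) extends SizeReport
  final case class SizeError(bytes: Int) extends SizeReport

  // Serialize with Java serialization and count the resulting bytes.
  def serializedSize(value: AnyRef): Int = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()
    buffer.size()
  }

  // Compare the measured size against user-defined thresholds (in bytes).
  def check(value: AnyRef, warnBytes: Int, errorBytes: Int): SizeReport = {
    val n = serializedSize(value)
    if (n >= errorBytes) SizeError(n)
    else if (n >= warnBytes) SizeWarning(n)
    else SizeOk
  }
}
```

A caller would pattern match on the result and log or fail as appropriate, for example `SporeSizeSketch.check(payload, 1024, 2048)` for the 1 kilobyte / 2 kilobytes case above.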
--
Jim Powers

Eugene Burmako

Jun 19, 2013, 11:35:11 AM
to <scala-sips@googlegroups.com>, Heather Miller
Hi Jim, thanks a lot for the feedback! Just got a word from Heather - she's travelling today, so most likely she won't be able to respond, but she plans to chime in as soon as possible.


--
You received this message because you are subscribed to the Google Groups "scala-sips" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-sips+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Jim Powers

Jun 19, 2013, 12:18:04 PM
to scala...@googlegroups.com, Heather Miller
No problems.

I want to reiterate something to be clear: the "Sporable" typeclass I mention is a placeholder for "a typeclass that makes the most sense". I'm not tied to any particular name, only the results are of interest to me: you should have to lie to make spores fail. Of course, even with some nice compile-time guarantees a spore running in the context of another machine can fail (for instance, a file may exist on one machine and may not exist on another machine).

--
Jim Powers

Simon Ochsenreither

Jun 19, 2013, 1:01:09 PM
to scala...@googlegroups.com, Heather Miller
I haven't looked at the inner workings yet, but imho the “user interface” needs work.

I think it is especially important to figure out the relationship to Function more clearly, because I'd like to avoid ending up in a state where everyone suggests ignoring plain lambdas/Functions and always picking spores.

(The naming is another issue ... I think Pickles & Spores is cute, but users shouldn't need to be aware of the inner workings to understand the pun.)

The basic question I have is: why are we picking such a seemingly disruptive way with the spores block + val-defs ... isn't there something more C++-like where you can just specify what and how you want things to be captured?

Johannes Rudolph

Jun 21, 2013, 11:17:13 AM
to scala...@googlegroups.com, Heather Miller
On Wed, Jun 19, 2013 at 7:01 PM, Simon Ochsenreither <simon.och...@gmail.com> wrote:
I think it is especially important to figure out the relationship to Function more clearly, because I'd like to avoid ending up in a state where everyone suggests ignoring plain lambdas/Functions and always picking spores.

I think this is a valid and realistic concern. Actually, the SIP says too little about where you would use the Spore type in practice. Would you have libraries where some functions would require Spores? Why would you do that? Especially, when you can't pass any kind of regular functions into that library method any more.

I don't say the spore construct itself is useless. You can use it to make sure not to accidentally capture things you don't want to but make all captures explicit. But that's it.

This seems like the lesser of the problems you usually have with capturing, the biggest one being that you capture a reference to something which isn't allowed to leave the scope. So, IMO closing over something accidentally is much less interesting than preventing references to some values from escaping the scope. Speaking in terms of your examples:

Actors: The closure isn't the problem, the problem is that the `this` reference to a mutable instance is allowed to leave the scope. Spores don't help you to decide which part of the expression you actually want to capture and which not (just `this` or `this.sender`).

Serialization: Again you are closing over `this`. For serialization purposes you don't want a closure to reference a Main instance because it isn't serializable. However, again, spores don't help you to decide which value you actually want to capture. Is it `this`? Why not? Or `this.helper`? Or `this.helper.toString()` or an even bigger expression? 
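
The `this` capture in the Actors case can be reproduced without Akka. The sketch below is an assumption-laden stand-in: `Handler` and its `sender` field are hypothetical and only mimic an actor's mutable state. It contrasts a closure that reads the field through `this` with one that copies the value first, which is the shape a spore header forces.

```scala
object ThisCaptureDemo {
  // Hypothetical stand-in for an actor with mutable per-message state.
  class Handler {
    var sender: String = "alice"

    // Reads `this.sender` when the closure runs, so it captures `this`.
    def leaky(): () => String = () => sender

    // Copies the current value first; the closure captures only the copy.
    def explicit(): () => String = {
      val captured = sender
      () => captured
    }
  }
}
```

If `sender` is reassigned after both closures are created, the leaky one observes the new value while the explicit one still returns the value from creation time. Neither variant answers which expression should be captured; it only shows the difference the choice makes.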


Part of my confusion seems to come from the problem that it's not clear at all what spores are supposed to be. The SIP says

> The main idea behind spores is to provide an alternative way to create closure-like objects, in a way where the environment is controlled.

IMO that's too little to warrant a SIP and maybe a change in the language (if no language change is needed, why not deliver it as just another library). Something like what's in cloud haskell would seem more useful, so Jim's suggestions would apply. However, this starts to look like a bigger project.

--
Johannes

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net

Johannes Rudolph

Jun 21, 2013, 11:23:58 AM
to scala...@googlegroups.com
On Wed, Jun 19, 2013 at 5:22 PM, Jim Powers <j...@casapowers.com> wrote:

def foo[I,O](f:I => O @sporable) = spore {
  val localF = f
  (i:I) => localF(i)
}
If spores are about serializability why would you be required to write "val localF = f" if you have already annotated the parameter f that it is @sporable? Maybe you could even remove the requirement to write out the captures explicitly but instead require some static quality that says that this value (or values of a certain kind) are safe to be captured in this context.

Jim Powers

Jun 21, 2013, 11:31:22 AM
to scala...@googlegroups.com
On Fri, Jun 21, 2013 at 11:23 AM, Johannes Rudolph <johannes...@googlemail.com> wrote:
On Wed, Jun 19, 2013 at 5:22 PM, Jim Powers <j...@casapowers.com> wrote:

def foo[I,O](f:I => O @sporable) = spore {
  val localF = f
  (i:I) => localF(i)
}
If spores are about serializability why would you be required to write "val localF = f" if you have already annotated the parameter f that it is @sporable? Maybe you could even remove the requirement to write out the captures explicitly but instead require some static quality that says that this value (or values of a certain kind) are safe to be captured in this context.

Valid point.  I was just following the syntax slavishly. 

--
Jim Powers

Simon Ochsenreither

Jun 21, 2013, 12:01:09 PM
to scala...@googlegroups.com
Maybe some approach where one would have a separate parameter list for things which are allowed to be captured would be preferable?
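
One way to picture a separate parameter list for captures is an ordinary curried helper. This is only a sketch: the name `capturing` is hypothetical, and as a plain method it cannot stop the body from closing over other values, which is the part that would still need compiler or macro support.

```scala
object CaptureListSketch {
  // First parameter list: the value allowed to be captured.
  // Second parameter list: a body that receives the capture explicitly.
  def capturing[C, T, R](captured: C)(body: (C, T) => R): T => R =
    (t: T) => body(captured, t)
}
```

For example, `CaptureListSketch.capturing(10)((base: Int, x: Int) => base + x)` yields an `Int => Int` whose only intended capture is the `10` passed in the first list.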

Heather Miller

Jun 21, 2013, 12:02:19 PM
to scala...@googlegroups.com
Hi, I'm still on the road, without reliable reception, and writing from a phone, but two points:
- spores aren't only about serializability
- re: Johannes' point about actors, have a look at the SIP: the example illustrates a frequently occurring source of bugs. It shows the use of futures and actors, i.e. closures (futures) accidentally capturing the 'this' reference. So the issue is indeed related to closures capturing unstable/unwanted objects.

As soon as I can get to a computer I can answer the rest of these points more completely. Thanks for your patience.


--
Heather Miller
Doctoral Assistant
EPFL, IC, LAMP
http://people.epfl.ch/heather.miller

-- Please excuse my brevity, sent from mobile device

martin odersky

Jun 21, 2013, 1:07:54 PM
to scala...@googlegroups.com, Heather Miller
Just a quick remark: Major changes to core libraries should also have a SIP. It's not just the language proper. So, if spores should go in the standard library, they should have a SIP.

Cheers

 - Martin
 

--
Martin Odersky
Prof., EPFL and Chairman, Typesafe
PSED, 1015 Lausanne, Switzerland
Tel. EPFL: +41 21 693 6863
Tel. Typesafe: +41 21 691 4967

Chris Marshall

Jun 22, 2013, 4:34:13 AM
to scala...@googlegroups.com, scala...@googlegroups.com
I guess my problem about this is: the example illustrates using spores to make sure you've not captured anything. But you have to explicitly use them! If you are explicitly using them, it's because you know that you need to be careful and are paying attention to detail. If you know you need to be careful, it seems unlikely that you  captured something by accident.

Unless libraries are somehow changed to *force* the use of spores (I don't see how this is possible), it doesn't really seem like they are going to actually prevent many bugs.

Why not just write some compiler plugin/macro that identifies references which unsafely escape their scope when used with actors?

Chris

Eugene Burmako

Jun 22, 2013, 7:48:00 AM
to <scala-sips@googlegroups.com>
I guess that's because then you would have to write such a macro for every library that wants to use spore-like functionality. This SIP, from what I can guess (I'm not involved in its development), aims to provide a definitive solution to such situations.

From my PoV, this SIP establishes a nice discussion point (introduces a dedicated function-like type, describes the capture problem), but omits a lot of practical details.

Let me elaborate. Imagine I'm a library author exposing a function `foo` with the signature `def foo[T, R](f: Function1[T, R]): R = ...`. Now I want to make use of spores to make sure that `f` satisfies certain properties. So I change the signature to `def foo[T, R](f: Spore1[T, R]): R = ...` as suggested on Twitter and then start thinking.

1) What if I want to not only pre-capture free variables used in the spore, but to also verify that free variables satisfy certain properties? E.g. if I'm writing a parallel computing framework I probably would only like to allow capturing primitives, but not arbitrary types.

2) What if I don't want my users to write `foo(spore{ x => ... })`, but simply `foo(x => ...)`? It's easy to write a corresponding implicit conversion from Function to Spore, but then my users would have to bring it in scope every time they call my methods.

Note that both of these problems can be solved by me writing my own MySpore, providing an implicit conversion from Function to MySpore in its companion and then in the myspore macro checking that all the captures have the properties that I want. Okay, so I go and implement all that, but once it's done I suddenly realize that I'm no longer using anything provided by this SIP. Therefore, I think maybe the SIP could be discussed a bit more and then generalized a bit to accommodate scenarios such as the one described above.
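
The companion-object trick can be sketched in a few lines. The code below is an illustration under assumptions, not part of the SIP: `MySpore`, `fromFunction`, and `mapInts` are hypothetical names, and the conversion is a plain implicit def, so it performs none of the capture checks a macro-based version would.

```scala
import scala.language.implicitConversions

object MySporeSketch {
  // Hypothetical library-specific spore type; the private constructor means
  // instances can only be created through the companion.
  final class MySpore[T, R] private (underlying: T => R) extends (T => R) {
    def apply(t: T): R = underlying(t)
  }

  object MySpore {
    // Lives in the companion, so it is in implicit scope wherever a MySpore
    // is expected; callers need no import. A macro-based version would
    // inspect the captures of `f` at this point.
    implicit def fromFunction[T, R](f: T => R): MySpore[T, R] = new MySpore(f)
  }

  // Library method that demands the custom spore type.
  def mapInts(xs: List[Int])(f: MySpore[Int, Int]): List[Int] = xs.map(f)
}
```

A call such as `MySporeSketch.mapInts(List(1, 2, 3))((x: Int) => x * 2)` then type-checks with a plain function literal at the call site.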


ode...@gmail.com

Jun 22, 2013, 11:26:52 AM
to scala...@googlegroups.com, scala...@googlegroups.com


Sent from my iPhone

On 22.06.2013, at 10:34, Chris Marshall <oxbow...@gmail.com> wrote:

I guess my problem about this is: the example illustrates using spores to make sure you've not captured anything. But you have to explicitly use them! If you are explicitly using them, it's because you know that you need to be careful and are paying attention to detail. If you know you need to be careful, it seems unlikely that you  captured something by accident.

I think that's a wildly optimistic assumption. What I have seen in large projects was exactly the opposite. Even experienced and careful people were surprised by what their closures captured.

  Martin

Hanns Holger Rutz

Jun 22, 2013, 4:50:05 PM
to scala...@googlegroups.com
So wouldn't the SIP that's actually needed---i.e., that is sustainable and long term oriented---be one to specify an effect system for Scala?


On 21 Jun 2013, at 17:17, Johannes Rudolph wrote:

> Part of my confusion seems to come from the problem that it's not clear at all what spores are supposed to be. The SIP says
>
> > The main idea behind spores is to provide an alternative way to create closure-like objects, in a way where the environment is controlled.
>

---
"Si hay reelección, Capriles será reelecto"

Lukas Rytz

Jun 22, 2013, 6:05:35 PM
to scala...@googlegroups.com
Isn't that exactly what they propose? About Example 4 the SIP says

"[...] the spore’s closure is invalid, and would be rejected during compilation.
 The reason is that the [captured] variable outer1 is neither a parameter of the closure nor one
 of the spore’s value declarations"


Chris Marshall

Jun 25, 2013, 7:59:13 AM
to scala...@googlegroups.com, scala...@googlegroups.com
But that doesn't invalidate my point. These bugs probably crept in because the author "didn't notice" the boundary. As such, it's unlikely that it would have occurred to them to use a spore.

Chris

Michael Pilquist

Jun 25, 2013, 9:18:00 AM
to scala...@googlegroups.com
I think the idea is that the library authors would declare their methods as taking spores instead of functions. As a result, the library users would be forced into creating a spore, which causes them to explicitly delineate which values to capture.

Regards,
Michael

Roland Kuhn

Jun 25, 2013, 9:23:43 AM
to scala...@googlegroups.com
Yes, that is my main motivation for looking into this SIP: Akka’s Props.apply() needs a Spore, that is a very clear-cut case, and I consider that an important benchmark. Unfortunately I’m currently fully occupied with getting Akka 2.2 released, hence I must defer a more complete contribution to this thread by just a little longer.

Regards,

Roland


Dr. Roland Kuhn
Akka Tech Lead
Typesafe – Reactive apps on the JVM.
twitter: @rolandkuhn


martin odersky

Jun 25, 2013, 9:26:54 AM
to scala...@googlegroups.com
I think, yes, libraries might demand Spore in their APIs. But I argue that even if they don't, Spore as a concept is still useful. I have heard quite a few stories from people at Foursquare, Twitter, in the Spark project and others where people are well aware that care is required yet still get surprised from time to time by what's captured in a closure. Projects like this might well have a coding standard in the future where certain classes of closures should always be spores, even if the library does not (yet) enforce that.

Cheers

 - Martin


Simon Ochsenreither

Jun 25, 2013, 9:31:40 AM
to scala...@googlegroups.com

I think, yes, libraries might demand Spore in their APIs. But I argue that even if they don't, Spore as a concept is still useful. I have heard quite a few stories from people at Foursquare, Twitter, in the Spark project and others where people are well aware that care is required yet still get surprised from time to time by what's captured in a closure. Projects like this might well have a coding standard in the future where certain classes of closures should always be spores, even if the library does not (yet) enforce that.

I agree that the concept is pretty important, but isn't there a more principled way to achieve it?
The use-site syntax feels a lot like a hack and I don't want to imagine the mass-migration of people from Functions to Spores as soon as people realize that non-capturing-by-default might be the thing they prefer.

Rex Kerr

Jun 25, 2013, 5:49:40 PM
to scala...@googlegroups.com
I haven't tried them out, so I could be mistaken, but as they stand do spores do enough to prevent accidental leaks?  For example, in

  f: () => x

it's pretty obvious that I've captured x from somewhere.

But in

class Outer {
  val a = 1
  class Inner { def b = a }
  val i = new Inner
}

val x = (new Outer).i
spore {
  val inner = x
  () => inner
}

it's much less obvious that I've grabbed Outer secretly via Inner.

I wonder what fraction of problems with unintended capture are of the latter type?
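
The hidden reference can be made visible with plain Java serialization. This is a sketch with assumptions spelled out: `Outer` is deliberately left non-serializable, `Inner` is marked `java.io.Serializable`, and the `trySerialize` helper is a hypothetical name introduced only for the demonstration.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object HiddenOuterDemo {
  class Outer {                                   // deliberately NOT serializable
    val a = 1
    class Inner extends java.io.Serializable {    // looks harmless on its own
      def b: Int = a                              // uses Outer.this, so an outer reference is kept
    }
    val i = new Inner
  }

  // Returns the serialized size, or the offending class name on failure.
  def trySerialize(value: AnyRef): Either[String, Int] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try {
      out.writeObject(value)
      out.flush()
      Right(buffer.size())
    } catch {
      case e: NotSerializableException => Left(e.getMessage)
    } finally out.close()
  }

  // An Inner obtained from a throwaway Outer, as in the example above.
  def innerOfFreshOuter(): AnyRef = (new Outer).i
}
```

Serializing the `Inner` instance fails because the serializer follows its reference to the enclosing `Outer`, even though nothing at the capture site mentions `Outer`.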

  --Rex



--

Heather Miller

Jun 26, 2013, 8:15:13 AM
to scala...@googlegroups.com
Hi Jim, all,

Thanks a lot for your feedback.

As an over-arching point, I'd like to reiterate again that spores are not, and should not be thought of as only being applicable to serialization. For parallel and concurrent applications, (not even considering Akka), spores present themselves as a useful abstraction; an example could be for both futures and parallel collections– for both of these, race conditions are often caused by what has been captured in a closure. 

Spores could help in cases like these by:
  • eliminating some races off the bat (as indicated as examples in the SIP), 
  • and for other cases where spores cannot eliminate races (because of a lack of an effect system) they at least force you to make the captured environment explicit which would definitely help in debugging a race condition.

On the compile-time guarantees:

When we were discussing SIP-21 in person at ScalaDays we already mentioned to you our ideas about adding type constraints to restrict the types of captured variables (I also mention it in my live talk [1]). However, all of the proposals that we've come up with or that were proposed by others are still experimental at this point. Naturally, something not well-polished should be left out of a SIP. For reasons I'll explain later, I think it's best to keep the current SIP-21 modest, and defer an addition of type constraints to a point when we have gained experience with applying spores in real code.

Adding a `Sporable` (or whatever it would be called) type class is an interesting idea. Though, it seems it might have far-reaching consequences: for all types that are reasonable to capture in a typical spore closure we/someone would have to provide a `Sporable` type class instance; the number of required type class instances might be impractically large. Though, we don't know without doing the experiment. Some people might argue that instead of explicitly enabling capturing, explicitly disabling capturing could be more practical/useful. At least this approach of disabling is along the current lines of thinking for type constraints in spores (meanwhile it's still not clear how to practically achieve this, so I'm not going to go into any more details, yet).

In some situations, requiring the types of captured variables to be `Sporable` might not be enough. For example, to make spores pickleable using the scala-pickling framework, one would need to have type class instances of `Pickler[T]` for all types `T` of captured variables (to be able to pickle the spore's environment). I don't see how this could be expressed in your current proposal.

On function lifting:

As you've pre-supposed, an @sporable annotation is a bit more heavyweight, since we can't implement this using a def macro. Even if we had macro annotations, this wouldn't be enough, since macro annotations act locally on the annotated definition, whereas in the case of @sporable, you'd need to do something that smells like actual type checking: at invocations of `foo` you'd have to check that the provided argument is @sporable. However, the invocation is definitely not annotated with anything. As a result, checking the constraints of @sporable would require checking not only the annotated definitions themselves, but also all parts of the AST where annotated definitions are used. Thus, it'd either require us to extend the compiler, or to provide a compiler plugin, which is much more heavyweight than a macro.

Furthermore, if we were to proceed with this idea of experimenting with type constraints, there are advantages to giving spores a proper type, so that they can be composed soundly. With a pure annotation-based approach that does not influence the types of functions/spores, we lose the benefit of composability. (I will detail all of these ideas more completely, publicly, once they are a bit better worked out.)

On runtime sanity checks:

As mentioned, I heartily stand behind the idea that spores shouldn't be thought of as only being applicable to serialization. However, say we consider we would tie spores to serialization in order to provide these runtime sanity checks. I still see some non-trivial issues that would arise when trying to achieve this.

The biggest is the fact that, as far as I can see, spores would have to be tied to pickle formats. This means that the pickling framework becomes more complicated for authors of new pickle formats, because they'd be required to implement a number of methods for dealing with the size, just for (or, mainly for) spores. Something we worked very hard to achieve for scala-pickling was a minimal pickle format, which would make writing custom pickle formats easy. For JSON, for example, the  entire pickle format is implemented in about 200 LOC, which is nice because the barrier of entry for writing pickle formats is not so high. If we've got to add another 50 LOC to deal with sizes just for spores, it's a hit to usability of the pickling framework.

General remarks:

In general, I think it's important to keep in mind that spores, as described in the SIP, are already very helpful in a number of scenarios. We've worked with Matei Zaharia, the creator of Spark, as well as with Roland Kuhn and the Akka team to make sure spores support their most important use cases. For example, in Spark it's important that closures do not retain references to their enclosing objects if it's not necessary (otherwise, closures might not be serializable, for no good reason); spores help by enforcing a closure shape that's compiled to a class without problematic outer references (we're working with the scalac backend engineers on guaranteeing this for Scala 2.11 upwards).

Right now the situation is that more and more projects use Spark's "closure cleaner" [2] to null out problematic and unnecessary outer references to make sure serializing closures that should normally be serializable does not throw `NotSerializableException` at runtime. The problems with the closure cleaner are:
  • errors manifest themselves at runtime
  • it relies on scalac compilation details
  • it's not clear whether the closure cleaner is complete; there is only anecdotal evidence that it seems to work "in enough cases".

To address the confusion about the motivation and where spores would be best used: I see two sides to this. First, I couldn't put it better than Martin: spores help make code using closures clearer by making what's captured explicit. Second, spores are also very useful to authors of libraries and frameworks. Using spores in your library gives you several benefits:
  • it helps the library to have better guarantees about the closures that it's dealing with (for example, worries about problematic outer pointers are gone);
  • the library enforces that users write code in a clear style where it is appropriate;
  • the library enforces the absence of certain errors (one example involving Akka and futures is given in the SIP).

Bottom line: Approach this SIP as a modest proposal. We'd like to be conservative. Yes, there are a number of ways to make spores richer, but no, we don't know what The Right (TM) ways to achieve that are, yet. That's more in the realm of research. As the proposal stands, things like type constraints on spores (if shown to be reasonable at some point) could later be added. Let's put spores to use first, in their modest form, where they are already very useful, and then later consider richer additions like these type constraints.


Cheers,
Heather


Johannes Rudolph

Jun 26, 2013, 11:32:14 AM
to scala...@googlegroups.com

Heather, thanks for the clarifications.


The problem I still have is that it is not clear how the SIP should help to write correct programs in practice. And IMO that’s mainly a consequence of two things the SIP is unclear about:


- an analysis of the problem, a problem statement and an explanation how the given motivations fit into the problem statement

- a summary description of the solution and how the solution solves the practical problem

(- the “how does the feature integrate with existing features” section which would have revealed practical problems)


The motivating cases from the SIP show that there are issues related to capturing but it doesn’t explain what the exact relation really is.


Futures: What seems to be proposed is to change the parameter types of Future's higher-order methods to require Spores instead of plain functions. Please try to do that and see what has to be changed in even the simplest Future-based applications. It turns out that being able to close over stuff is one of the big advantages of the Future API. In fact, it's hard to find a legitimate program using Futures that doesn't use capturing. With spores we are basically back in Java-land where closed-over values had to be put into final variables. Actually, it turns out the problem is much more related to the Akka Actor API than to Futures. So, why change Futures?


Serialization: Capturing is still possible with spores, as is referencing other un-serializable values in other ways. So, how do spores help?


I’ll try to rephrase what a spore is: Spore is a new type extending the Function type which guarantees (with a static check at construction) that a function value of this type has an additional property. The property seems to be ‘All values captured by this function have been declared explicitly’.


It’s unclear to me how ‘have been declared explicitly’ will solve ‘problems, somehow related to capturing’.


There are some questions the SIP should answer:


- When would I use the Spore type over the regular function type?

- When would I use the spore constructor instead of an anonymous function?

- What are the disadvantages of using the solution?

- How does this new property solve the problem shown with the motivating examples? I want to know in detail why “Have been declared explicitly” will reduce bugs. What are the steps to get from buggy code using some libraries to a fixed version, how have spores helped and how much correct code had to be changed as well to support the solution?


IMO spores are not a conservative solution at all because the solution offered is much broader than the problem itself [1]. Think about what happens if the SIP is implemented as currently proposed: Either it’s an opt-in solution, i.e. the developer decides when to use spores, in which case it won’t be used at all because you don’t know where exactly to use it to prevent bugs. Or it’s enforced by libraries, in which case all code using the libraries has to be changed. It will be a chore similar to having to make variables final in Java, which, when done mechanically, won’t help you pay attention to the details that actually are the root of the bugs.


Regarding my earlier comments about why this has to be a SIP: IIUC there’s nothing preventing spores from being just another library for everyone to try out. It’s likely that many of these questions would have come up by now if someone had tried spores experimentally. A SIP for inclusion into the standard library should IMO be preceded by practical experiments collecting evidence for the utility of the proposed addition (because it's so easy compared to SIPs regarding compiler changes).


Here’s a simple application using Futures I tried to convert to use spores (assuming that all higher-order functions in `Future` are changed to require spores):


https://github.com/jrudolph/spray/commit/8faa2a6616b3a69e374a41ce3b8be9ff33b35b14


Immediate questions:

* for-comprehensions: do they work together with spores anyhow?

* nested spores, what are the rules?



Johannes



[1] Whatever the problem really is, I find it hard to describe. Many cases seem related to capturing `this` and then exporting code which calls instance methods from inside the class context, and therefore violating the encapsulation classes usually provide. That seems to fit the Actor.sender case but less the serialization case where the problem seems to be that it’s hard to control which implicitly created instances contain outer references and whereto exactly.


Heather Miller

Jun 26, 2013, 12:35:05 PM
to scala...@googlegroups.com
On Wed, Jun 26, 2013 at 5:32 PM, Johannes Rudolph <johannes...@googlemail.com> wrote:

Heather, thanks for the clarifications.


The problem I still have is that it is not clear how the SIP should help to write correct programs in practice. And IMO that’s mainly a consequence of two things the SIP is unclear about:


- an analysis of the problem, a problem statement and an explanation how the given motivations fit into the problem statement

- a summary description of the solution and how the solution solves the practical problem

(- the “how does the feature integrate with existing features” section which would have revealed practical problems)


The motivating cases from the SIP show that there are issues related to capturing but it doesn’t explain what the exact relation really is.


Futures: What seems to be proposed is to replace the parameter types of Future's higher-order methods to require Spores instead of plain functions.


No, that has not been proposed. If that were to be proposed, you'd definitely be aware, as it would be included in a futures-related SIP.
 

Please try to do that and see what has to be changed in even the simplest Future-based applications. It turns out that being able to close over stuff is one of the big advantages of the Future API. In fact, it's hard to find a legitimate program using Futures that doesn't use capturing.


Capturing stuff is certainly allowed. Have another look at the SIP, the slides, or the Parleys presentation. Spores simply require that you're explicit about what you capture.
 

With spores we are basically back in Java-land where closed-over values had to be put into final variables. Actually, it turns out the problem is much more related to the akka Actor API than to Futures. So, why change Futures?


Serialization: Capturing is still possible with spores, as is referencing other un-serializable values in other ways. So, how do spores help?


In my previous email, I think I was pretty explicit about the several ways that spores can help. As Martin and I have pointed out, it's common for most people to have no clue about what they're capturing. Spores prevent many cases of accidental capturing by forcing users to be explicit about what they capture. Furthermore, it's guaranteed that a spore will not have an `outer` reference if it's not necessary.

The bottom line is that spores make it possible to be a bit more confident about using closures in situations where one must be careful about a closure's environment, e.g. concurrent/distributed settings. An immediate benefit is for library authors: they can confidently write APIs which publicly expose functions (spores), and need not worry about how their users might misuse their API in a concurrent/distributed situation, for example.
 

I’ll try to rephrase what a spore is: Spore is a new type extending the Function type which guarantees (with a static check at construction) that a function value of this type has an additional property. The property seems to be ‘All values captured by this function have been declared explicitly’.


It’s unclear to me how ‘have been declared explicitly’ will solve ‘problems, somehow related to capturing’.


Small example. Let's say you have:

    val wikipedia: List[WikipediaPages] = ...

And you want to send a function to another machine which does lots of math and in that math, uses the total number of wikipedia pages. So in your closure, you:

    doCrazyLinearAlgebra(localDataShard)/wikipedia.length

wikipedia.length is supposed to be an Int, right? Well, you just captured all of wikipedia unknowingly (if wikipedia is a field, then, more precisely, you've captured the enclosing object that declares the field). Oops, would suck to send that over the wire.

If you used spores, you'd be required to make a local copy of everything that you use. You definitely would not put `val wikipedia` inside of your spore, would you? No– instead you'd just make a copy of the Int-valued length of `wikipedia`, by including the following in the list of value definitions as part of the spore:

    val length = wikipedia.length

Now you don't have to worry about accidentally sending wikipedia over the wire.
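
The shape described above can be sketched in plain Scala, without the proposed `spore` macro (whose API is not reproduced here). This is only an illustration under stated assumptions: `WikipediaPage`, `Job`, `SporeShapeSketch` and the body of `doCrazyLinearAlgebra` are hypothetical stand-ins, and the "spore" is written by hand as a block of local `val`s followed by a function literal.

```scala
// Hypothetical stand-ins for illustration; not part of the SIP or any library.
final case class WikipediaPage(title: String)

object SporeShapeSketch {
  // Placeholder for the "lots of math" step.
  def doCrazyLinearAlgebra(shard: Seq[Double]): Double = shard.sum
}

final class Job(val wikipedia: List[WikipediaPage]) {
  import SporeShapeSketch.doCrazyLinearAlgebra

  // Plain closure: `wikipedia` is a field, so the body really reads
  // `this.wikipedia`, and the function keeps the whole Job reachable.
  val leaky: Seq[Double] => Double =
    shard => doCrazyLinearAlgebra(shard) / wikipedia.length

  // Spore-like shape: the header copies exactly what the body needs,
  // so the function only holds an Int.
  val tidy: Seq[Double] => Double = {
    val length = wikipedia.length // the only value the body refers to
    shard => doCrazyLinearAlgebra(shard) / length
  }
}
```

Both functions compute the same result; the difference is only in what each one keeps alive (and would drag along if it were serialized).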
 

There are some questions the SIP should answer:



- When would I use the Spore type over the regular function type?

- When would I use the spore constructor instead of an anonymous function?

- What are the disadvantages of using the solution?

- How does this new property solve the problem shown with the motivating examples? I want to know in detail why “Have been declared explicitly” will reduce bugs. What are the steps to get from buggy code using some libraries to a fixed version, how have spores helped and how much correct code had to be changed as well to support the solution?


SIPs aren't really meant to be user guides or tutorials. Have a look at any of the previous SIPs. SIPs are meant to be concise, carefully-stated proposals for language features or libraries. More often than not, SIPs read a lot like a spec. See any of the 2.10 SIPs for an example; value classes (http://docs.scala-lang.org/sips/pending/value-classes.html) or implicit classes (http://docs.scala-lang.org/sips/pending/implicit-classes.html) come to mind. None of the questions you ask would be appropriate for inclusion in a document of this type. The questions you list are indeed important, and these questions are a good example of what should be answered in the user documentation for spores.

Naturally, that doesn't prevent us from discussing these questions here prior to judgement of the SIP. On the contrary, discussion could help develop the future documentation for spores.
 

IMO spores are not a conservative solution at all because the solution offered is much broader than the problem itself [1]. Think about what happens if the SIP is implemented as currently proposed: Either it’s an opt-in solution, i.e. the developer decides when to use spores, in which case it won’t be used at all because you don’t know where exactly to use it to prevent bugs. Or it’s enforced by libraries, in which case all code using the libraries has to be changed. It will be a chore similar to having to make variables final in Java, which, when you do it in a mechanized way, won’t help you pay attention to the details that actually are the root of the bugs.


Regarding my earlier comments about why this has to be a SIP: IIUC there’s nothing preventing spores from being just another library for everyone to try out. It’s likely that many of these questions would have come up by now if someone had tried spores experimentally. A SIP for inclusion into the standard library should IMO be preceded by practical experiments collecting evidence for the utility of the proposed addition (because it's so easy compared to SIPs regarding compiler changes).


As Martin has already said, SIPs are required even for proposed library additions. Have a look at the SIP that others and I wrote for futures (back in 2.10) for an example. This is protocol.

A prototype implementation of spores will be included in the next 2.11 milestone. And we will most certainly try to get as much practical experience with them as soon as that's possible. It goes without saying that we would experiment using them with our own libraries, and with the community libraries that would take advantage of them. Of course, the point of a SIP is to simply seek feedback early on.
 

Here’s a simple application using Futures I tried to convert to use spores (assuming that all higher-order functions in `Future` are changed to require spores):


https://github.com/jrudolph/spray/commit/8faa2a6616b3a69e374a41ce3b8be9ff33b35b14


Immediate questions:

* for-comprehensions: do they work together with spores anyhow?

* nested spores, what are the rules?



Johannes



[1] Whatever the problem really is, I find it hard to describe. Many cases seem related to capturing `this` and then exporting code which calls instance methods from inside the class context, and therefore violating the encapsulation classes usually provide. That seems to fit the Actor.sender case but less the serialization case where the problem seems to be that it’s hard to control which implicitly created instances contain outer references and whereto exactly.




Philipp Haller

Jun 26, 2013, 1:12:32 PM
to scala...@googlegroups.com
I find the SIP is exceptionally clearly written, which is why I'm a bit surprised about your feedback. Of course, I'm biased, since I'm a co-author. It's important that the SIP is very clear, so we're going to address anything that's unclear.

My other responses are inline.

On Wed, Jun 26, 2013 at 5:32 PM, Johannes Rudolph <johannes...@googlemail.com> wrote:

Heather, thanks for the clarifications.


The problem I still have is that it is not clear how the SIP should help to write correct programs in practice. And IMO that’s mainly a consequence of two things the SIP is unclear about:


- an analysis of the problem, a problem statement and an explanation how the given motivations fit into the problem statement


I think the SIP does a good job of stating and analyzing the problem. Unintentionally capturing variables in the environment of a closure can lead to various problems. The SIP explicitly lists some of the most important problems. It even provides two concrete examples exhibiting two different problems. The serialization example directly illustrates a very common problem in real Spark code. The Akka/futures example illustrates another real use case.

Could you let us know which of the above parts are unclear and why?
 

- a summary description of the solution and how the solution solves the practical problem


Which part of the description (section "Design") of spores is unclear? Spores have a very simple definition. What's more, the SIP clearly explains how the solution solves the practical problems (section "Motivating Examples, Revisited"). For both concrete examples, the solution is clearly explained.

Actors/futures:
In this case, the problematic capturing of this is avoided, since the result of this.sender is assigned to the spore’s local value from when the spore is created. The spore conformity checking ensures that within the spore’s closure, only from and d are used.

Serialization:
Similar to example 1, the problematic capturing of this is avoided, since helper has to be assigned to a local value (here, h) so that it can be used inside the spore’s closure. As a result, fun can now be serialized without runtime errors, since h refers to a serializable object (a case class instance).

 

(- the “how does the feature integrate with existing features” section which would have revealed practical problems) 


The motivating cases from the SIP show that there are issues related to capturing but it doesn’t explain what the exact relation really is.


The exact relation is explained directly following the examples. In the case of the serialization example, the explanation is as follows. Please let us know what's unclear about it:

Given the above class definitions, serializing the fun member of an instance of Main throws a NotSerializableException. This is unexpected, since fun refers only to serializable objects: x (an Int) and helper (an instance of a case class).

Here is an explanation of why the serialization of fun fails: since helper is a field, it is not actually copied when it is captured by the closure. Instead, when accessing helper its getter is invoked. This can be made explicit by replacing helper.toString by the invocation of its getter, this.helper.toString. Consequently, the fun closure captures this, not just a copy of helper. However, this is a reference to class Main which is not serializable.
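
The mechanism in this explanation can be reproduced with a hand-written sketch. It is not the SIP's code: the two function classes below spell out by hand roughly what each closure holds on to, so the outcome does not depend on how a particular compiler version encodes lambdas. `NonSerializableMain`, `CapturesOuter`, `CapturesHelper` and `SerializationSketch` are hypothetical names chosen for this illustration.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream, Serializable}

final case class Helper(name: String) // case classes are serializable

// Stands in for `Main` from the SIP example: deliberately NOT Serializable.
final class NonSerializableMain {
  val helper: Helper = Helper("the helper")

  // Like a closure whose body reads `this.helper`: it holds the outer instance.
  val capturesThis: Int => String = new CapturesOuter(this)

  // Like a spore with `val h = helper` in its header: it holds only the copy.
  val capturesCopy: Int => String = new CapturesHelper(helper)
}

final class CapturesOuter(outer: NonSerializableMain) extends (Int => String) with Serializable {
  def apply(x: Int): String = outer.helper.toString + x
}

final class CapturesHelper(h: Helper) extends (Int => String) with Serializable {
  def apply(x: Int): String = h.toString + x
}

object SerializationSketch {
  // Right(byte count) on success, Left(message) if something reachable
  // from `value` is not serializable.
  def trySerialize(value: AnyRef): Either[String, Int] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try { out.writeObject(value); out.flush(); Right(bytes.size) }
    catch { case e: NotSerializableException => Left(e.getMessage) }
    finally out.close()
  }
}
```

Serializing `capturesThis` fails because the outer instance is reachable from the function object, while `capturesCopy` serializes because only the `Helper` value is reachable.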

 

Futures: What seems to be proposed is to replace the parameter types of Future's higher-order methods to require Spores instead of plain functions. Please try to do that and see what has to be changed in even the simplest Future-based applications. It turns out that being able to close over stuff is one of the big advantages of the Future API. In fact, it's hard to find a legitimate program using Futures that doesn't use capturing. With spores we are basically back in Java-land where closed-over values had to be put into final variables. Actually, it turns out the problem is much more related to the akka Actor API than to Futures. So, why change Futures?


Spores do not prohibit capturing at all. A future taking a spore can very well close over variables in its environment. The SIP makes this very clear. In fact, spores use capturing in _all_ examples in the SIP!

The problem is not at all limited to the Akka actor API. In fact, the discussion about spores was initially driven to a large extent by problems occurring in real Spark code where closures were not serializable because of unintentionally captured outer references. What's interesting is that even with Spark's closure cleaner that Heather mentioned, people can shoot themselves in the foot by creating closures that have the wrong shape. With spores most of these problems are avoided.


Serialization: Capturing is still possible with spores, as is referencing other un-serializable values in other ways. So, how do spores help?


Spores are very helpful for this case for two reasons:

1. It is directly visible from the source whether a spore is serializable: you just have to inspect the types of all captured variables (the val defs in the first part of the spore); if all of those are serializable, then the spore is serializable.

What's really important about this is the fact that with conventional closures the problem is that even if you know all the captured variables (you have to inspect the closure's body, though) and even if all their types are serializable, you still don't know whether the closure is serializable! You have to inspect the resulting _compiled_ code to find out whether a problematic outer reference is retained. This is a nightmare in terms of code comprehension!

2. Spores work together with the compiler backend to make sure no references to enclosing objects are retained if it's not absolutely necessary. So, instead of knowing what code the backend produces you can rely on the high-level spore abstraction to guarantee that for you!
 

I’ll try to rephrase what a spore is: Spore is a new type extending the Function type which guarantees (with a static check at construction) that a function value of this type has an additional property. The property seems to be ‘All values captured by this function have been declared explicitly’.


It’s unclear to me how ‘have been declared explicitly’ will solve ‘problems, somehow related to capturing’.


There are some questions the SIP should answer:


- When would I use the Spore type over the regular function type?


I can only reiterate what Martin and Heather have already pointed out. Spores are useful both for library consumers and library producers. For consumers they are good because they improve code clarity and help code comprehension (but please re-read Martin's response, which I find quite clear). Furthermore, they provide useful guarantees, such as more robust serialization. For library producers, they help prevent users from shooting themselves in the foot.

I have seen impressive examples of commercial Spark code where simply enforcing the closure shape that spores enforce would have saved hours of debugging, and would have been of tremendous help to Spark's users. This is why real experience with frameworks like Spark and Akka is so important. And that's why I'm so happy and confident with the current SIP, since it effectively addresses a huge problem in a simple and robust way.
 

- When would I use the spore constructor instead of an anonymous function?

- What are the disadvantages of using the solution?


Disadvantages could be the more verbose syntax and the fact that capturing vars requires using an explicit reference cell. On the other hand, the more verbose syntax is in many cases also an advantage (that's why I said "could"). And I have yet to see an example where it is useful to capture a var in a closure _that's used in a concurrent or distributed application_. Something like that just invites race conditions and other hazards which is why I would consider it a strong code smell!
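
As a rough illustration of what "an explicit reference cell" can look like, here is a sketch in plain Scala. The SIP's exact rule for vars is not reproduced here; `ReferenceCellSketch` and `makeCounter` are hypothetical names, and `java.util.concurrent.atomic.AtomicInteger` is just one possible cell, chosen because it also makes concurrent updates well-defined.

```scala
import java.util.concurrent.atomic.AtomicInteger

object ReferenceCellSketch {
  // Instead of closing over `var count`, the functions close over a stable
  // `val` that refers to a mutable cell. The mutable state is then visible
  // in the types of the captured values.
  def makeCounter(): (() => Int, () => Int) = {
    val cell = new AtomicInteger(0)
    val increment = () => cell.incrementAndGet()
    val read = () => cell.get()
    (increment, read)
  }
}
```

The extra line is the syntactic cost mentioned above; what it buys is that shared mutable state is declared rather than implied.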
 

- How does this new property solve the problem shown with the motivating examples? I want to know in detail why “Have been declared explicitly” will reduce bugs. What are the steps to get from buggy code using some libraries to a fixed version, how have spores helped and how much correct code had to be changed as well to support the solution?


Again, read the section "Motivating Examples, Revisited". It explains exactly how the use of spores resolves their respective problems and why. Since these examples are written in full length you can directly see how much code had to be changed.
 

IMO spores are not a conservative solution at all because the solution offered is much broader than the problem itself [1]. Think about what happens if the SIP is implemented as currently proposed: Either it’s an opt-in solution, i.e. the developer decides when to use spores, in which case it won’t be used at all because you don’t know where exactly to use it to prevent bugs. Or it’s enforced by libraries, in which case all code using the libraries has to be changed. It will be a chore similar to having to make variables final in Java, which, when you do it in a mechanized way, won’t help you pay attention to the details that actually are the root of the bugs.


Even if spores are opt-in they have benefits, as Martin has pointed out.
 

Regarding my earlier comments about why this has to be a SIP: IIUC there’s nothing preventing spores from being just another library for everyone to try out. It’s likely that many of these questions would have come up by now if someone had tried spores experimentally. A SIP for inclusion into the standard library should IMO be preceded by practical experiments collecting evidence for the utility of the proposed addition (because it's so easy compared to SIPs regarding compiler changes).


Rest assured that we've been hammering on spores already for some time, and that there are going to be a lot of practical experiments!

In addition, and that's again very important: the spore macro is a very simple validation macro which means there is absolutely no magic involved. Spores simply enforce a particular closure shape. The Spore type extends Function1. This simple subtyping already provides very strong guarantees, right? The types let you reason about what's guaranteed to work and what isn't. The main experiments that we'll have to expand on are related to the syntax: is it workable in practice or is it too verbose?

Basically everything else we can be extremely confident about, given the simplicity and robustness of the proposal.

Therefore, having spores be just another library for everyone to try out will not help all that much. The situation with spores is very different from something involving a full-featured library (Akka comes to mind): there is no point in delaying the standardization of spores because of the desire to make the implementation more robust or more feature complete. The implementation is very simple and is being reviewed by the same people that are working on the Scala compiler code base on a daily basis. Furthermore, discussing feature-completeness goes against the point of the spores proposal; the whole point is for it to be simple and not feature-rich!

We've been trying to be good citizens by publishing one of the first two SIPs related to Scala 2.11 as early as possible to make comprehensive feedback possible (in the past, people were sometimes unhappy about SIPs being published rather late).


Here’s a simple application using Futures I tried to convert to use spores (assuming that all higher-order functions in `Future` are changed to require spores):


https://github.com/jrudolph/spray/commit/8faa2a6616b3a69e374a41ce3b8be9ff33b35b14


This is a great start- thanks a lot for experimenting!


Immediate questions:

* for-comprehensions: do they work together with spores anyhow?

* nested spores, what are the rules? 


Good point about for-comprehensions, but consider what happens to other subtypes of Function1. I don't see issues with nested spores.

Thanks for your feedback.

Philipp




Johannes



[1] Whatever the problem really is, I find it hard to describe. Many cases seem related to capturing `this` and then exporting code which calls instance methods from inside the class context, and therefore violating the encapsulation classes usually provide. That seems to fit the Actor.sender case but less the serialization case where the problem seems to be that it’s hard to control which implicitly created instances contain outer references and whereto exactly.




Alex Boisvert

Jun 26, 2013, 1:59:14 PM
to scala...@googlegroups.com
Semi-related, I'm wondering if the SIP authors have seen the comments provided at the bottom of the SIP page (http://docs.scala-lang.org/sips/pending/spores.html), particularly with regard to the capturing syntax proposed by Ryan Hendrickson, which appears to be desired by a number of upvoters and commenters.

(It's a bit confusing to have both page-comments + emails to discuss and debate the SIP)

Johannes Rudolph

Jun 27, 2013, 5:38:05 AM
to scala...@googlegroups.com

I apologize, my tone was inviting a defensive answer and I didn't get my point through. I’ll try to rephrase my arguments.


My main complaint: It all boils down to how to decide when to use spores. I say: for your argumentation in the SIP that spores prevent errors you have invoked an oracle to decide when to use them. More specifically, the SIP takes two examples we know are broken and then applies spores to 'reveal' the errors. When and how to use spores to prevent errors (on code of unknown correctness) is not something which should be left to the user documentation but it should be the central part of the motivation. Lacking an oracle, you are left with two choices: use spores everywhere, which is not acceptable, or don’t use them at all.


My formal objection: Language changes and library changes in Scala usually are evaluated to comply with a high standard. One of the main criteria usually is that a new feature was proven to compose well with other features of the language. The most obvious interaction of spores would be with itself (i.e. nested spores), other functions, with methods, with by-name parameters, with for-comprehensions and wrt eta expansion. This is missing from the proposal.


My admission: It’s indeed hard to argue with Roland’s example. `Props.apply` seems to be the poster child for spores. What you usually want to build a Props from is 1) a number of constant values and 2) a constructor for an actor. The `Props.apply` overload with the by-name parameter is just too powerful and invites dangerous captures. Spores seem to fit there quite nicely. My objection is that even there a spore doesn't catch all errors in its domain. But maybe that’s something I have to live with: there are solutions which are not applicable everywhere and which don’t catch all errors but which may still be useful for some purposes.


In summary my objections are:

* spores only help if applied at the right places, but I don’t know where to apply them

* spores’ interaction with other language features is unspecified and, as others have noted, not obviously trouble-free

* spores indicate only some errors related to capturing and always require the developer to be alert not to duct-tape around the issue


Trying to be more constructive I created an example (showcasing the last point from my list) which assumes that `Props.apply` would require spores (I created a wrapper `SProps.apply` which actually does). The situation is simple, it’s an akka application which has a Boot class which holds some buffers used during booting which should be eligible for GC after booting. During booting a series of actors is created. The task here is to avoid leaking the Boot instance into the actor Props. I used spores with imagined best practices to avoid the common pitfalls:


https://github.com/jrudolph/sporophyte/blob/86c4cbce68047186f780fbc90a23c9c0de041565/src/main/scala/sporophyte/Main.scala#L19


Questions for evaluation: Which one of those actor creations avoided capturing the Boot instance? How did spores help? How does the syntax compare with the original one?


Johannes

Chris Marshall

Jun 27, 2013, 6:49:07 AM
to scala...@googlegroups.com, scala...@googlegroups.com
Wouldn't this mean a complete re-write of the futures API?

Lukas Rytz

Jun 27, 2013, 6:58:52 AM
to scala...@googlegroups.com
On Thu, Jun 27, 2013 at 11:38 AM, Johannes Rudolph <johannes...@googlemail.com> wrote:

I apologize, my tone was inviting a defensive answer and I didn't get my point through. I’ll try to rephrase my arguments.


My main complaint: It all boils down to how to decide when to use spores. I say: for your argumentation in the SIP that spores prevent errors you have invoked an oracle to decide when to use them. More specifically, the SIP takes two examples we know are broken and then applies spores to 'reveal' the errors. When and how to use spores to prevent errors (on code of unknown correctness) is not something which should be left to the user documentation but it should be the central part of the motivation. Lacking an oracle, you are left with two choices: use spores everywhere, which is not acceptable, or don’t use them at all.



I would not see it as that black-and-white. The educated programmer is aware of the issues that can
arise with accidentally captured state. Spores are a useful tool that allows them to make captured state
explicit and have it verified in those places where they know they have to be careful.

Of course, I agree that also having an oracle would be better.

Roland Kuhn

Jun 27, 2013, 9:24:45 AM
to scala...@googlegroups.com
Hi Heather,

thanks for this write-up, the goal of a modest and conservative proposal resonates well with me: keep it simple and obvious. We should aim for the simplest form which is still reasonably easy to use and then build on that once we have gained more experience: we need to assess how successful this mechanism is in avoiding common pitfalls, and we also need to make sure that those for whose benefit the feature is introduced (users of the language and its libraries) actually like what they type and see.

Alex pointed out that there are comments on the SIP page: not only in this thread but also on that page one comment appears multiple times—and I have heard that often when asking around privately—which is that the syntax is too clunky for everyday use. When working with concurrency-related code (which does not only entail actors) it is too costly to spend several extra lines for every closure passed around. The best suggestion I saw was from Ryan Hendrickson, and it is one which fixes the problems we discussed in New York in a quite natural way:

spore { () => … captureValue(<expr>) … }

will be translated into

spore {
  val $freshName = <expr>
  () => … $freshName …
}

I took the liberty to rename his proposed marker from “capture” to “captureValue” to make it completely obvious what it does. This construct can be used for capturing the result of a method call without the potential of confusing the user as to what is happening.
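
To make the translation concrete, here is a sketch with the rewrite applied by hand in plain Scala. `captureValue` is only a proposed marker and does not exist, so it appears solely in a comment; `CaptureValueSketch`, `Account` and `snapshot` are hypothetical names used for illustration.

```scala
object CaptureValueSketch {
  final class Account(var balance: Int)

  // Under the proposal one would write (not valid Scala today):
  //   spore { () => "balance was " + captureValue(account.balance) }
  // Applying the translation by hand gives:
  def snapshot(account: Account): () => String = {
    val fresh = account.balance // <expr> is evaluated once, when the function is created
    () => "balance was " + fresh
  }
}
```

The point of the rewrite is timing: the expression is evaluated at creation, so a later `account.balance = 99` does not change what the returned function reports.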

The other proposal I found intriguing is to make the capture of stable values automatic: it has been rightly remarked that the need to invent a new name for the same value has non-negligible cost and no gain. I would argue that this behavior of spores is what should actually be the semantics of normal closures, since stable values are just that—values—and there is no need to capture them via their containing context.

The third proposal I want to make is to add specific support for nullary functions, making it possible to drop the “() =>” boiler-plate code when for example creating Futures or Actors.

The above proposals would reduce the syntactical overhead of using spores considerably—as has been argued in private they would push the SIP from “unusable” to “usable”.

The points originally raised by Jim are certainly worth considering as well, but as you correctly point out we should not make the second step before the first. To illustrate what I mean by this, let us consider a small example:

class X {
  def mutate() = { … }
  Future(spore(mutate())) // will not compile
}

The compiler error would advise the user to use explicit capture, which would then end up for example with

  Future(spore(captureValue(this).mutate()))

Jim’s point is that `this` might want to be allowed explicitly (opt-in) or forbidden (opt-out) by using a type class or annotation, and I think that is something we should definitely explore. But meanwhile the code line above stands out enough to reviewers as to raise the question “why capture `this`?”. Step 1 will not be the perfect solution, it should rather be good enough to allow step 2 to continue the work.

Finally I’d like to say that while I am eager to try this technique in Akka and gain experience with it, I am not sure whether we should rush the inclusion into the language proper. The points raised concerning eta-expansion and for-expressions (and possibly others) deserve thorough treatment before that step could be considered. That is not to say that I find this SIP non-useful, the contrary is true: we should start trying things in anger and possibly scrutinize several solutions. After giving the matter more thought I realize that perhaps one solution does not fit the two problems (serializability and concurrency); I’ll write up more on that later.

Thanks everyone involved for composing this SIP, I think solving these issues will be an important step in Scala’s evolution.

Regards,

Roland

Eric Torreborre

Jun 28, 2013, 2:06:52 AM
to scala...@googlegroups.com
>  Future(spore(captureValue(this).mutate()))

Would it be a good idea to instead capture *all* the values and, only if desired, "uncapture" them (I don't know what a good term for that would be)?

In that sense spores would be a kind of dual of the normal function: a normal function closes over its whole environment, whereas a spore makes clean copies of everything.

E.

Heather Miller

Jun 28, 2013, 9:00:43 PM
to scala...@googlegroups.com
Hi Johannes, all,

First off, thanks, all, for all the feedback everyone's provided so far. All of your input has allowed us to come up with a potential revision of the SIP (the rest of this email covers its main points).

Johannes– thanks for rephrasing your arguments, that clarifies a lot. Btw, I really appreciate your efforts to try and apply spores in various situations- it'll help us with our next revision of the SIP!

Regarding the interaction of spores with other language features: you're right that as it stands spores and for-comprehensions don't integrate (causing your rewriting of your initial future-based code). The question of for-comprehensions specifically has come up several times– it's a question I/we have actually been working hard to answer for several days (solving it in a satisfying way requires re-thinking some aspects of spores). I'm happy to say that, after some careful thought, I think we have some cool ideas.

In order to more seamlessly support for-comprehensions and other closure-based features, we've been thinking about whether or not it'd be a good idea to provide a way to implicitly convert closures to spores (Eugene has been trying really hard to convince us!). This implicit conversion + a twist could give us a way to provide intuitive support for for-comprehensions. Additionally, as a side effect, it should enable us to reduce the syntactical overhead that you, Roland, and others have voiced concerns about.

About your SProps experiment: using an implicit macro for converting a (correctly-shaped) closure to a spore, and making SProps a macro, too, one should be able to write

system.actorOf(SProps {
  val l = this.logger
  () => new LoggingActor(l)
})

The earlier-mentioned "twist" is to introduce the suggested `capture` syntax, but only for local vals. Selections and method invocations (e.g., `this.sender`) that should be evaluated upon spore creation would still have to go into the list of val defs of the spore. The rationale is that this restriction would make sure that we re-use the existing way of expressing the same evaluation semantics:

  {
    val l = this.logger
    () => new LoggingActor(l)
  }

and

  spore {
    val l = this.logger
    () => new LoggingActor(l)
  }

have the same runtime behavior. That's nice because this way, we don't introduce a new syntax for expressing this evaluation semantics.
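The evaluation semantics of the plain block form can be checked with ordinary Scala. The sketch below is only an illustration: `Owner`, `lookups`, `blockForm` and `plainClosure` are hypothetical names, and the spore macro itself is not used, so this demonstrates the block form only (the message above states that the spore form behaves the same way).

```scala
object EvaluationTimingSketch {
  // Hypothetical stand-in for an object with a `logger` member.
  final class Owner {
    var lookups: Int = 0

    // Counts how often it is evaluated, so the timing becomes observable.
    def logger: String = { lookups += 1; s"logger#$lookups" }

    // Block form: `logger` is read once, when the block is evaluated,
    // and the closure only refers to the local val `l`.
    def blockForm: () => String = {
      val l = this.logger
      () => l
    }

    // Plain closure: `logger` is read again on every invocation.
    def plainClosure: () => String = () => this.logger
  }
}
```

With `blockForm`, the selection `this.logger` happens when the function value is created; with `plainClosure`, it happens each time the function runs.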

Returning to the point of how we intend to support for-comprehensions... Suppose we have a type T with flatMap/map/withFilter methods that take spores instead of functions. Then, we can support the for-comprehension syntax as follows (gen: T):

  for {
    x <- gen
    y <- f(capture(x))
  } yield
    g(capture(y))

Basically, whenever you refer to a local variable you'd have to use `capture`. Otherwise, the implicit macro that tries to convert the corresponding closure to a spore fails, because the closure does not have the shape of a spore.
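How such a for-expression reaches `flatMap` and `map` can be sketched with ordinary functions. Everything here is an assumption made for illustration: `Box` is a hypothetical type whose methods take plain functions rather than spores, `f` and `g` are placeholder functions, and `capture` is modelled as an identity function. The standard desugaring of for-expressions turns each generator body into a closure, which is where an implicit closure-to-spore conversion would have to apply.

```scala
object ForSketch {
  // Hypothetical marker, modelled as identity for this sketch.
  def capture[A](a: A): A = a

  // Hypothetical container with the methods a for-expression needs.
  final case class Box[A](value: A) {
    def map[B](fn: A => B): Box[B] = Box(fn(value))
    def flatMap[B](fn: A => Box[B]): Box[B] = fn(value)
  }

  val gen: Box[Int] = Box(1)
  def f(x: Int): Box[Int] = Box(x + 1)
  def g(y: Int): Int = y * 10

  // The for-expression from the message above.
  val sugared: Box[Int] =
    for {
      x <- gen
      y <- f(capture(x))
    } yield g(capture(y))

  // What the compiler rewrites it to: nested closures passed to flatMap/map.
  val desugared: Box[Int] =
    gen.flatMap(x => f(capture(x)).map(y => g(capture(y))))
}
```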

Of course, you could use the `capture` syntax within a normal spore, as suggested by Ryan Hendrickson, however with the above restriction (i.e., only local vals, not expressions). That way the problem of having to explicitly name each captured value that Dan Nugent, Li Haoyi, and others have mentioned goes away, too.

Cheers,
Heather
   




martin odersky

Jun 29, 2013, 5:20:50 AM
to scala...@googlegroups.com
On Sat, Jun 29, 2013 at 3:00 AM, Heather Miller <heather...@epfl.ch> wrote:
The earlier-mentioned "twist" is to introduce the suggested `capture` syntax, but only for local vals. Selections and method invocations (e.g., `this.sender`) that should be evaluated upon spore creation would still have to go into the list of val defs of the spore. The rationale is that this restriction would make sure that we re-use the existing way of expressing the same evaluation semantics:

  {
    val l = this.logger
    () => new LoggingActor(l)
  }

and

  spore {
    val l = this.logger
    () => new LoggingActor(l)
  }

have the same runtime behavior. That's nice because this way, we don't introduce a new syntax for expressing this evaluation semantics.

I think that's an important point. It's already difficult for many people to predict what will be evaluated when in closure-heavy code. But at least the rules are simple - you are either under the lambda or not. I think it would become even more difficult if we complicated the rules, e.g. capture(expr) means that expr will be evaluated outside the closure (what does that even mean in a for-comprehension?). So, I think the capture idea is very cool as long as it does not involve a modification of evaluation order. And if we restrict captured things to paths, that's the case.

Cheers

 - Martin




Roland Kuhn

Jul 2, 2013, 3:11:59 AM
to scala...@googlegroups.com
Hi Heather, Johannes, all,

the proposals below are definitely a step in the right direction, that is intuitively clear to me. What was not so clear before the weekend, and Johannes’ post told me that I’m not alone, is what exactly the goal should be for this feature, in particular viewed from the angle of concurrency. As it turns out, serializability and thread-safety are connected more tightly than I realized on Thursday.

When sending behavior (a closure) over the wire to a remote network node, for example to run it on a Spark cluster, we need two things: we need to get the code there including all data it needs to perform its function, and we need to get the result back. In a local context there are two ways to perform the latter, either by aggregating the result within the closure and returning it as a result or by side-effecting. There are cases where the latter yields better performance and/or readability, as Martin argued in his ScalaDays keynote, and hence Scala is not dogmatic about only allowing the former. But alas, it does not work on a Spark cluster: the side effect would occur on some remote network node and would probably never be propagated back to the original program. If it was indeed propagated back, then the local application of the transported side-effects would have to be carefully synchronized, both between the different cluster nodes and with the locally running program. This is the same issue which arises when transporting the closure “just” onto a different thread on the same JVM; this is the link between the two concerns.
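The two ways of getting a result back can be contrasted on a single JVM with standard `scala.concurrent` futures. This is a sketch under stated assumptions: `sumByResult` and `sumBySideEffect` are hypothetical names, and an `AtomicInteger` stands in for the "carefully synchronized" shared state. The side-effecting variant works here only because all tasks share one heap; on a remote node the shared counter would not be reachable.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ResultVersusSideEffect {
  // Style 1: each task returns its partial result; the caller aggregates.
  def sumByResult(chunks: Seq[Seq[Int]]): Int = {
    val partials: Future[Seq[Int]] =
      Future.sequence(chunks.map(c => Future(c.sum)))
    Await.result(partials, 10.seconds).sum
  }

  // Style 2: each task side-effects on shared state. The update must be
  // synchronized (here via AtomicInteger), and the state must be shared,
  // which only holds within a single JVM.
  def sumBySideEffect(chunks: Seq[Seq[Int]]): Int = {
    val total = new AtomicInteger(0)
    val done: Future[Seq[Unit]] =
      Future.sequence(chunks.map(c => Future { total.addAndGet(c.sum); () }))
    Await.result(done, 10.seconds)
    total.get()
  }
}
```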

There is one way to make closures absolutely safe in these regards. Given an effect system it would be possible to allow only capturing immutable values which do not offer any behavior with side-effects, meaning that the closure would have to be pure. But this overshoots the goal, as can be demonstrated using the ActorRef type: its “tell” method is obviously side-effecting, hence an ActorRef cannot be considered pure; however, Akka ensures that calling that method always works as expected, no matter in which context or on which network node. This brings me to the conclusion that we need something other than purity to express what is allowed and what is not.

In view of the notable absence of an effect system for Scala (and in particular resonating with Martin’s email earlier this morning about SAM types), it is obvious that the above must remain utopian for now; our goal has to stay closer to where we are right now. The best we can achieve, as far as I can see, is to introduce a construct which behaves like a closure but makes capturing problematic names at least obvious and, where possible, rejects it. This feature is a lot smaller in scope, a very incomplete solution in terms of the introduction above. Therefore it should also not have undue syntactic cost; I believe that users will be more inclined to add visual overhead if the improvement in safety is more complete.

The conclusion of this train of thought is thus:
  • that it would be wise to add the implicit conversion from function literal to spore as outlined below
  • that unproblematic values should be captured without any overhead (i.e. “capture()” should not be required, since it does not add more information)
  • that unproblematic values include all stable values (or paths, if I understand that term correctly), not only local vals
  • that it should be possible to explicitly mark problematic types so that capturing them is rejected

This means that user code for a hypothetically changed Future API would look very much the same as today, just with some added safety. Changes would only be required to cache values like Actor.sender in a val, exactly matching the best practices we teach users already today. There should of course also remain the explicit spore{} constructor for cases where the target API is not (yet) converted or where the user wants to ensure non-capturing behavior due to other reasons, or just to be very explicit as a matter of code style. Of course the protection offered by this feature is shallow: if you capture any problematic object reference (including normal closures) then you can shoot yourself in the foot with it, but as I argue above there is nothing we can do about that at this point.

The syntax overhead of this proposal would be close to zero, and the rules for translation by the macro would also be straightforward to specify (I am not so sure about the implementation details, but that is not a cost the user should have to pay). The evaluation semantics of a closure would not be changed by the translation, justifying the absence of specific syntax.
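The last bullet above, explicitly marking problematic types, could be approximated in plain Scala with implicit evidence. This is a sketch only, not the SIP's mechanism: `Capturable`, `Connection` and `capturing` are hypothetical names, and the opt-out relies on the well-known ambiguous-implicits trick, in which two equally specific instances make the implicit search fail for the marked type.

```scala
object OptOutCaptureSketch {
  // Hypothetical evidence that values of type T may be captured.
  trait Capturable[T]
  object Capturable {
    // Default: every type is capturable unless it is marked otherwise.
    implicit def byDefault[T]: Capturable[T] = new Capturable[T] {}
  }

  // A type marked as problematic: two equally specific instances make the
  // implicit search ambiguous, so the compiler rejects capturing it.
  final class Connection
  object Connection {
    implicit val marked1: Capturable[Connection] = new Capturable[Connection] {}
    implicit val marked2: Capturable[Connection] = new Capturable[Connection] {}
  }

  // Hypothetical capture helper: takes the value now, runs `body` later.
  def capturing[T, R](value: T)(body: T => R)(implicit ev: Capturable[T]): () => R =
    () => body(value)

  // An unmarked type is captured without any extra syntax.
  val ok: () => Int = capturing(21)(_ * 2)

  // Expected not to compile (ambiguous implicit values):
  // val bad = capturing(new Connection)(_.toString)
}
```

As the message notes, such protection is shallow: a `Connection` hidden inside an unmarked wrapper type would still pass.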

Regards,

Roland

Havoc Pennington

Jul 2, 2013, 3:34:02 PM
to scala...@googlegroups.com
Hi,

Reading over this thread, I'm not sure exactly when a spore type
should be used, or what specifically defines a "problematic" value as
Roland puts it. We've talked about examples such as serialization and
concurrency, but I'm not sure how to identify a problem via general
principle (vs. by plugging it into examples).

I wonder if the general principle has to do with whether the current
stack frame remains present when the closure runs?

I talked about that in this blog post (some of you have seen it before):
http://blog.ometer.com/2011/07/24/callbacks-synchronous-and-asynchronous/

For that post I defined sync vs. async in a very specific way, which
is whether the closure is executed with the current stack frame still
around.

Talking about "same stack (frame)" execution vs. "another stack",
"another stack" could conceptually be another thread, or this same
thread but in deferred fashion (stack is popped back to the main loop,
which then executes the closure).

It's possible that serialization fits into the same scheme; another
JVM, after all, is a different stack...

Perhaps this terminology (same-stack / different-stack) would be a way
to clarify the intended semantics and when to use which
types/annotations/APIs/whatever.

If we look at Actor methods which are dangerous to capture, such as
`sender` and `self`, they are dangerous precisely because they are
valid only on the current stack (they are presumably set up by the
stack frame that invokes the actor's behavior). If these were somehow
annotated as "same stack only", then a macro could know that they
cannot be captured in a "different stack" closure. So those `sender`
and `self` methods could be precisely described as "stack dependent"
or "valid only for this same stack" or something of that nature.
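The "valid only on the current stack" property can be illustrated without Akka. The sketch below uses a hypothetical `MiniActor` whose `sender` is set only while `handle` is running; it is not how Akka implements `sender`, just a model of the same hazard. Closures that run after `handle` has returned see a stale value unless the value was copied into a local val first.

```scala
import scala.collection.mutable.ArrayBuffer

object StackDependentSketch {
  // Hypothetical miniature of an actor: `sender` is only meaningful while
  // `handle` is on the stack.
  final class MiniActor {
    private var currentSender: String = "nobody"
    def sender: String = currentSender

    // Closures queued here run later, on a "different stack".
    val deferred: ArrayBuffer[() => String] = ArrayBuffer.empty

    def handle(from: String): Unit = {
      currentSender = from
      deferred += (() => sender) // reads `sender` when it eventually runs
      val s = sender             // reads `sender` now, on this stack
      deferred += (() => s)
      currentSender = "nobody"   // the frame is done; `sender` is invalid again
    }
  }

  // Handles two messages, then runs all deferred closures.
  def demo(): List[String] = {
    val a = new MiniActor
    a.handle("alice")
    a.handle("bob")
    a.deferred.map(fn => fn()).toList
  }
}
```

The closures that call `sender` directly return "nobody", while the ones that captured the local val return the sender that was current when they were created.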

Similarly, the subject of my blog post above is to argue that most
APIs should define the execution model of closures they accept - given
foo(closure: A=>B), it should be defined whether `closure` runs with
foo's stack frame still around. So at least for new APIs without
legacy headaches, it might be reasonable to hope that they could
specify this (through the type system, an annotation, or at least
docs).

Some complications:
- some current APIs do NOT promise one way or the other
- if the closure is pure, then it is probably OK not to care what
stack the closure runs on (impure closures can be broken either way -
they can rely on same-stack or rely on different-stack)
- an API could legitimately leave the closure's execution stack
unspecified by requiring the closure to be pure
- for Future, same-vs-different stack is deterministic - but it's
defined by the ExecutionContext and thus configurable

In the context of Actor we've talked a lot about closures which are
invalid when moved to a different stack, but I don't think we've
talked much about closures which are invalid if kept on our own stack
- that's possible too though, e.g. deadlocks can happen, and you can
find mutable-state scenarios that break.

If this same-stack vs. different-stack definition of the problem is
useful, then it seems like we'd want to start to know, for each method
or value, whether its value is stack-dependent... and for each
closure, whether it will execute with the current stack frame
present... and then different warnings and macros could use that
information...

I don't know if this way of thinking about it is consistent with what
you guys have in mind for spores or not, but this is how I typically
think about closures, so I find myself trying to understand spores in
these terms.

Havoc

Johannes Rudolph

Jul 3, 2013, 10:01:38 AM
to scala...@googlegroups.com

Hi Heather,


thanks for having this open discussion and addressing the syntax concerns. I’ve updated both of my examples to the new syntax [1,2]. Indeed, with the new syntax the original for-comprehension can be restored so that seems like progress.


That said: purposely, I tried to refrain from discussing any technical details of the proposal (it’s itching me all over to do that) because I’m still not convinced that it solves the root of the problems you state as a motivation (and I tend to agree with Roland that maybe many of them can’t be solved right now). That’s what my SProps example was about: for each example where spores could prevent a problem, the error which isn’t found by spores is just one abstraction away. *)


With the syntax changes this might actually be amplified by the fact that the least intrusive change needed to make your code compile again is usually the wrong one, i.e. wrapping the wrong expression with `capture`. As Roland noted, when going from functions to spores, wrapping with `capture` means no semantic change, so this alone cannot directly resolve a bug. To be fair, in some cases you will end up with `capture(this)`, which is sometimes a marker that something is wrong. The point is you cannot know without looking at the type of `this`, and even then you may come to the conclusion that `capture(this)` is the way to go. Then you have to mark this with a comment, which then again is not machine-checkable any more.


I repeat myself here: If the proposal were just about serialization, it could work because serializability actually _is_ a property of captured values and you could somehow encode it in the types. Also, it’s quite clear how serializability composes: a compound value is serializable if all its parts are (roughly). And a closure is just an implicit compound value so you could check serializability of the compound by checking the serializability of its components. IIUC that’s what that part of Cloud Haskell is about.
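The compositional rule described above (a compound is serializable if all its parts are) maps directly onto implicit derivation. The following is a minimal sketch under assumptions: `Wire` is a hypothetical typeclass with a toy string encoding, not Scala pickling or Cloud Haskell's actual mechanism, and the closure's environment is modelled as an explicit tuple of captured values.

```scala
object ComposableSerializability {
  // Hypothetical typeclass: evidence that a T can be written to a string.
  trait Wire[T] { def write(value: T): String }

  object Wire {
    implicit val intWire: Wire[Int] =
      new Wire[Int] { def write(v: Int): String = v.toString }

    implicit val stringWire: Wire[String] =
      new Wire[String] { def write(v: String): String = "\"" + v + "\"" }

    // A pair is serializable if both of its parts are.
    implicit def pairWire[A, B](implicit a: Wire[A], b: Wire[B]): Wire[(A, B)] =
      new Wire[(A, B)] {
        def write(v: (A, B)): String =
          "(" + a.write(v._1) + "," + b.write(v._2) + ")"
      }
  }

  // Compiles only if evidence for T can be found or derived.
  def write[T](value: T)(implicit w: Wire[T]): String = w.write(value)

  // A closure's environment, modelled as a tuple of captured values.
  val env: (Int, String) = (1, "log")
  val encoded: String = write(env)
}
```

If any component lacked a `Wire` instance, `write(env)` would fail to compile, which is the type-level check the paragraph above describes for serialization.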


More generally speaking, the problems arising are less about the fact *that* you capture than about *what* you capture. That’s nothing you can properly determine or decide locally at the point where you capture, but something which 1. has to be propagated in the types and must be composable in some way and 2. is often dependent on the actual context of capturing. So Roland’s suggestion to explicitly opt some types out of being captured seems like the minimal version of something like that. I guess it would work in the simple cases, but lacking any composability it still makes it too easy for problems to hide somewhere.



Johannes


*) It may be ok for a solution to catch just a subset of errors (it's this way with most solutions). The proposal should then at least be clear about it to make it easier to understand the trade-offs.


[1] https://github.com/jrudolph/spray/compare/wip;sporification

[2] https://github.com/jrudolph/sporophyte/blob/master/src/main/scala/sporophyte/Main.scala





Roland Kuhn

Jul 8, 2013, 8:17:28 AM
to scala...@googlegroups.com
Hi Havoc,

yes, your characterization looks correct and very useful to me: “problematic closure” boils down to “executed not below originating stack frame”, which means that a `.foreach` of a sequential collection is fine while a parallel collection is problematic.

I find it appealing to visualize the lexical area of code within the closure to be hoisted out of the normal “on-stack” plane, and the cliff which forms is delimited in the source code by the spore { … } marker. In the end it is similar to type inference in that explicit hoisting can be preferable in places while automatic (checked) hoisting avoids clutter where that is not needed.

Regards,

Roland

Suminda Dharmasena

Oct 5, 2013, 3:49:11 AM
to scala...@googlegroups.com
Perhaps the ability to enforce constraints on captured objects would be useful. For example, you would not want a particular object to escape the current context if you close the resource towards the end. Also, if an object is captured, it could be restricted to being captured only once in a closure outliving the current scope (a single access path, i.e. enforce that it is not shared). It would also be useful to be able to specify whether an object is cloned on capture, so that any state changes of the object are isolated to the closure.

Suminda Dharmasena

Mar 19, 2014, 1:18:43 PM
to scala...@googlegroups.com
spore(val h = helper, ...) { ... }

Would be a nicer syntax than what is proposed. Also, you are explicit about what the captured arguments are.