I think this SIP (SIP-21) is a good start, but the guarantees and safety can be improved. I would like to highlight the following as considerations for improving the specification for spores. The areas I will address are:
As things stand, the current proposal provides very weak guarantees about correct behavior of spores at run-time. I think this situation can and should be improved. For instance, the current specification allows the following construct with no error:
    class Bar(file: File) {
      private val stream: FileInputStream = // ...
      val takeN = spore {
        val localStream = stream
        (n: Int) => {
          val r = new Array[Byte](n)
          localStream.read(r)
          r
        }
      }
    }
This can be improved.
What I would propose is that values captured by spores be required to have a specific typeclass associated with their type (this idea is stolen from Cloud Haskell) that constrains what can be captured sanely. For the purpose of this discussion I'll use a typeclass called Sporable. In discussions with Philipp Haller he suggested going directly to Pickleable (based on the new Scala pickling system), which may be fine; I don't want to get too bogged down in serialization details at this point. The issue is that the typeclass needs to be more than a mere marker: you should be required to provide useful functionality necessary for the correct operation of a spore, in particular in a non-local (distributed) environment. A meaningful serialization/deserialization implementation would be a good start.
So, the implementation of the spore macro would examine the types of all captured values and search implicit scope for an instance of the Sporable typeclass for each of those types. If no such instance can be found, compilation fails in the usual way.
Sporable typeclass instances can be provided for all scala.collection.immutable types. Users can provide their own instances of Sporable for their own types. Yes, this means that if someone wants to pretend that they can correctly serialize a FileInputStream or an object behind an arbitrary trait/interface then more power to them.
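To make the idea concrete, here is a minimal sketch of what such a typeclass could look like. The shape of Sporable (a write/read pair over java.io data streams), the instance names and the roundTrip helper are all assumptions made for illustration; neither the SIP nor the pickling library defines them. A real spore macro would do the implicit lookup for each captured value; here an ordinary implicit parameter stands in for that lookup.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

object SporableSketch {
  // Hypothetical typeclass: evidence that values of type A can be written and read back.
  trait Sporable[A] {
    def write(a: A, out: DataOutputStream): Unit
    def read(in: DataInputStream): A
  }

  object Sporable {
    implicit val intSporable: Sporable[Int] = new Sporable[Int] {
      def write(a: Int, out: DataOutputStream): Unit = out.writeInt(a)
      def read(in: DataInputStream): Int = in.readInt()
    }

    // writeUTF is limited to strings whose encoded form fits in 65535 bytes;
    // good enough for a sketch.
    implicit val stringSporable: Sporable[String] = new Sporable[String] {
      def write(a: String, out: DataOutputStream): Unit = out.writeUTF(a)
      def read(in: DataInputStream): String = in.readUTF()
    }

    // An instance for an immutable collection is derived from the element instance.
    implicit def listSporable[A](implicit ev: Sporable[A]): Sporable[List[A]] =
      new Sporable[List[A]] {
        def write(as: List[A], out: DataOutputStream): Unit = {
          out.writeInt(as.size)
          as.foreach(a => ev.write(a, out))
        }
        def read(in: DataInputStream): List[A] = {
          val n = in.readInt()
          List.fill(n)(ev.read(in))
        }
      }
  }

  // Stand-in for what the spore macro would demand of every captured value:
  // without an implicit Sporable[A] in scope, this call does not compile.
  def roundTrip[A](a: A)(implicit ev: Sporable[A]): A = {
    val buf = new ByteArrayOutputStream()
    val out = new DataOutputStream(buf)
    ev.write(a, out)
    out.flush()
    ev.read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray)))
  }
}
```

With these definitions, SporableSketch.roundTrip(List("a", "b")) type-checks because an instance can be derived, while passing a FileInputStream is rejected at compile time unless someone deliberately supplies an instance for it.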
Consider the following code:
    def foo[I,O](f: I => O) = spore {
      val localF = f
      (i: I) => localF(i)
    }
While one may be quick to point out that for safety higher-order functions should transitively require Spore types, I think this is too restrictive. I propose something lighter-weight that, for instance, does not require changing existing interface types in order to work with spores. In particular, I propose a @sporable annotation that can be used on methods and functions. For example:
    @sporable
    def doubleString(i: String): String = i + i
This would invoke a compile-time check that the method doubleString is, in fact, a "sporable" function. In the context of a method/function this would mean that the body abides by more-or-less the same rules as a spore body: the only values captured are arguments, all other values are introduced in the body of the function/method, arguments to the function must have a Sporable typeclass instance in scope, and function arguments must be tagged @sporable or be of a Spore type. The type of the method would still be Function1[String,String].
The original HOF can now be written:
    def foo[I,O](f: I => O @sporable) = spore {
      val localF = f
      (i: I) => localF(i)
    }
This can be used at call sites to ensure that the only functions passed to foo are of type Spore[?,?] or tagged as @sporable, essentially enabling safe lifting into the spore context.
I suspect that implementing this will require macro annotations.
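As a rough sketch of the surface syntax only, the annotation itself can be declared today as a plain marker. Everything below is hypothetical: the sporable class, the lift method and the object name are my own, and no checking is performed (that is the part that would need a macro annotation or a compiler plugin). One detail worth noting, as far as I understand Scala's grammar: in "I => O @sporable" the annotation attaches to the result type O, so the function type has to be parenthesized for the annotation to apply to it.

```scala
import scala.annotation.StaticAnnotation

// Hypothetical marker annotation; nothing in the SIP defines it,
// and on its own it triggers no compile-time check.
final class sporable extends StaticAnnotation

object SporableAnnotationSketch {
  @sporable
  def doubleString(i: String): String = i + i

  // Parenthesized so that the annotation applies to the whole function type
  // rather than to the result type O.
  def lift[I, O](f: (I => O) @sporable): I => O = {
    val localF = f
    (i: I) => localF(i)
  }

  // The method's type is still an ordinary String => String.
  val doubled: String => String = lift(doubleString _)
}
```

Because a plain annotation does not affect type conformance, any function can still be passed to lift here; the proposal's guarantee would come entirely from the additional check.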
Since what a spore generates during serialization can be a binary blob of JVM bytecode and data, it may be useful to indicate limits on exactly how big these blobs can get. I would like to propose a way to produce warnings and errors when the run-time properties of a serialized spore exceed thresholds. For example:
    def foo[I,O](f: I => O @sporable) = spore(warn = 1 kilobyte) {
      val localF = f
      (i: I) => localF(i)
    }

    def foo[I,O](f: I => O @sporable) = spore(error = 2 kilobytes) {
      val localF = f
      (i: I) => localF(i)
    }

    def foo[I,O](f: I => O @sporable) = spore(warn = 1 kilobyte, error = 2 kilobytes) {
      val localF = f
      (i: I) => localF(i)
    }
This would cause the implementation of spore to produce useful information when user-defined thresholds are met. A further refinement would be that a specific logging object can be supplied for where to log these messages.
    def foo[I,O](f: I => O @sporable) = spore(warn = 1 kilobyte, log = myLogger) {
      val localF = f
      (i: I) => localF(i)
    }
Another useful feature would be to just log generated sizes:
    def foo[I,O](f: I => O @sporable) = spore(logSize = true, log = myLogger) {
      val localF = f
      (i: I) => localF(i)
    }
--
You received this message because you are subscribed to the Google Groups "scala-sips" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-sips+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
I think it is especially important to figure out the relationship to Function more clearly, because I'd like to avoid ending up in a state where everyone suggests ignoring plain lambdas/Functions and always picking spores.
I think this is a valid and realistic concern. Actually, the SIP says too little about where you would use the Spore type in practice. Would you have libraries where some functions would require Spores? Why would you do that? Especially, when you can't pass any kind of regular functions into that library method any more.
I don't say the spore construct itself is useless. You can use it to make sure not to accidentally capture things you don't want to but make all captures explicit. But that's it.
This seems like the lesser of the problems you usually have with capturing; the biggest one is that you are capturing a reference to something which isn't allowed to leave the scope. So, IMO closing over something accidentally is much less interesting than preventing references to some values from escaping the scope. Speaking in terms of your examples:
Actors: The closure isn't the problem, the problem is that the `this` reference to a mutable instance is allowed to leave the scope. Spores don't help you to decide which part of the expression you actually want to capture and which not (just `this` or `this.sender`).
Serialization: Again you are closing over `this`. For serialization purposes you don't want a closure to reference a Main instance because it isn't serializable. However, again, spores don't help you to decide which value you actually want to capture. Is it `this`? Why not? Or `this.helper`? Or `this.helper.toString()` or an even bigger expression?
On Wed, Jun 19, 2013 at 5:22 PM, Jim Powers <j...@casapowers.com> wrote:
    def foo[I,O](f: I => O @sporable) = spore {
      val localF = f
      (i: I) => localF(i)
    }
If spores are about serializability, why would you be required to write "val localF = f" if you have already annotated the parameter f as @sporable? Maybe you could even remove the requirement to write out the captures explicitly and instead require some static quality that says that this value (or values of a certain kind) is safe to be captured in this context.
--
Johannes
-----------------------------------------------
Johannes Rudolph
http://virtual-void.net
I guess my problem about this is: the example illustrates using spores to make sure you've not captured anything. But you have to explicitly use them! If you are explicitly using them, it's because you know that you need to be careful and are paying attention to detail. If you know you need to be careful, it seems unlikely that you captured something by accident.
I think, yes, libraries might demand Spore in their APIs. But I argue that even if they don't, Spore as a concept is still useful. I have heard quite a few stories from people at Foursquare, Twitter, in the Spark project and others where people are well aware that care is required yet still get surprised from time to time by what's captured in a closure. Projects like this might well have a coding standard in the future where certain classes of closures should always be spores, even if the library does not (yet) enforce that.
Heather, thanks for the clarifications.
The problem I still have is that it is not clear how the SIP should help to write correct programs in practice. And IMO that’s mainly a consequence of two things the SIP is unclear about:
- an analysis of the problem, a problem statement and an explanation how the given motivations fit into the problem statement
- a summary description of the solution and how the solution solves the practical problem
(- the “how does the feature integrate with existing features” section which would have revealed practical problems)
The motivating cases from the SIP show that there are issues related to capturing but it doesn’t explain what the exact relation really is.
Futures: What seems to be proposed is to replace the parameter types of Future's higher-order methods to require Spores instead of plain functions. Please try to do that and see what has to be changed in even the simplest Future-based applications. It turns out that being able to close over stuff is one of the big advantages of the Future API. In fact, it's hard to find a legitimate program using Futures that doesn't use capturing. With spores we are basically back in Java-land where closed-over values had to be put into final variables. Actually, it turns out the problem is much more related to the akka Actor API than to Futures. So, why change Futures?
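To illustrate how ordinary this kind of capturing is, here is a small sketch using only the standard library. The names FutureCaptureSketch, withFee and demo are made up for this example; the spore form in the comment follows the syntax used elsewhere in this thread and is not compiled here.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureCaptureSketch {
  // The lambda passed to map closes over the method parameter `fee`.
  def withFee(net: Future[Int], fee: Int): Future[Int] =
    net.map(n => n + fee)

  // If map required a spore, the same call would presumably have to be
  // written with the capture spelled out (SIP syntax, not compiled here):
  //   net.map(spore { val localFee = fee; (n: Int) => n + localFee })

  def demo(): Int = Await.result(withFee(Future(100), 5), 2.seconds)
}
```

Nothing escapes unsafely in withFee, yet it would still have to be rewritten, which is the cost being pointed out here.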
Serialization: Capturing is still possible with spores, as is referencing other un-serializable values in other ways. So, how do spores help?
I’ll try to rephrase what a spore is: Spore is a new type extending the Function type which guarantees (with a static check at construction) that a function value of this type has an additional property. The property seems to be ‘All values captured by this function have been declared explicitly’.
It’s unclear to me how ‘have been declared explicitly’ will solve ‘problems, somehow related to capturing’.
There are some questions the SIP should answer:
- When would I use the Spore type over the regular function type?
- When would I use the spore constructor instead of an anonymous function?
- What are the disadvantages of using the solution?
- How does this new property solve the problem shown with the motivating examples? I want to know in detail why “Have been declared explicitly” will reduce bugs. What are the steps to get from buggy code using some libraries to a fixed version, how have spores helped and how much correct code had to be changed as well to support the solution?
IMO spores are not a conservative solution at all because the solution offered is much broader than the problem itself [1]. Think about what happens if the SIP is implemented as currently proposed: either it’s an opt-in solution, i.e. the developer decides when to use spores, and then it won’t be used at all because you don’t know exactly where to use it to prevent bugs; or it’s enforced by libraries, in which case all code using the libraries has to be changed. It will be a chore similar to having to make variables final in Java, which, when you do it in a mechanized way, won’t help you pay attention to the details that actually are the root of the bugs.
Regarding my earlier comments about why this has to be a SIP: IIUC there’s nothing preventing spores from being just another library for everyone to try out. It’s likely that many of these questions would have come up by now if someone had tried spores experimentally. A SIP for inclusion into the standard library should IMO be preceded by practical experiments collecting evidence for the utility of the proposed addition (because that is so easy compared to SIPs regarding compiler changes).
Here’s a simple application using Futures I tried to convert to use spores (assuming that all higher-order functions in `Future` are changed to require spores):
https://github.com/jrudolph/spray/commit/8faa2a6616b3a69e374a41ce3b8be9ff33b35b14
Immediate questions:
* for-comprehensions: do they work together with spores anyhow?
* nested spores, what are the rules?
Johannes
[1] Whatever the problem really is, I find it hard to describe. Many cases seem related to capturing `this` and then exporting code which calls instance methods from inside the class context, and therefore violating the encapsulation classes usually provide. That seems to fit the Actor.sender case but less the serialization case where the problem seems to be that it’s hard to control which implicitly created instances contain outer references and whereto exactly.
Capturing `this` is avoided, since the result of this.sender is assigned to the spore’s local value from when the spore is created. The spore conformity checking ensures that within the spore’s closure, only from and d are used.
Capturing `this` is avoided, since helper has to be assigned to a local value (here, h) so that it can be used inside the spore’s closure. As a result, fun can now be serialized without runtime errors, since h refers to a serializable object (a case class instance).
Given the above class definitions, serializing the fun member of an instance of Main throws a NotSerializableException. This is unexpected, since fun refers only to serializable objects: x (an Int) and helper (an instance of a case class).
Here is an explanation of why the serialization of fun fails: since helper is a field, it is not actually copied when it is captured by the closure. Instead, when accessing helper its getter is invoked. This can be made explicit by replacing helper.toString by the invocation of its getter, this.helper.toString. Consequently, the fun closure captures this, not just a copy of helper. However, this is a reference to class Main which is not serializable.
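The class definitions that the excerpt refers to are not reproduced in this thread, so the following is a hypothetical reconstruction rather than the SIP's own code. It uses plain closures (no spore macro) and Java serialization to show both the failing case and the "copy to a local first" pattern that a spore header enforces. The behaviour described in the comments is what I would expect from Scala 2.12 or later on the JVM.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.util.Try

object CaptureSketch {
  case class Helper(name: String)

  class Main { // deliberately not Serializable
    val helper = Helper("the helper")

    // `helper` here means `this.helper`, so the closure captures `this`.
    val fun: Int => String = (x: Int) => s"$x ${helper.toString}"

    // Copying the field into a local first means that only `h` is captured.
    val fixed: Int => String = {
      val h = helper
      (x: Int) => s"$x ${h.toString}"
    }
  }

  // Attempts Java serialization and reports the size or the failure.
  def serialize(value: AnyRef): Try[Int] = Try {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    try out.writeObject(value) finally out.close()
    buf.size()
  }
}
```

Serializing fun should fail with a NotSerializableException that names Main, while fixed serializes because the only captured value is the Helper instance.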