Martin
--
Martin Odersky
Prof., EPFL and Chairman, Typesafe
PSED, 1015 Lausanne, Switzerland
Tel. EPFL: +41 21 693 6863
Tel. Typesafe: +41 21 691 4967
100+. We need to better control the effects of specialization. In
particular, do something about tupled and curried. Moving to an
implicit decorator or @nospecialize seem both valid options.
I do not see a reason why not, actually.
-- Martin
> However, consider the substantial
> impact this small sounding change has.
>
> final class Foo[@specialized T1, @specialized T2, @specialized R](f: (T1, T2) => R) { }
>
> Right now that generates 10*10*10 = 1000 extra classfiles. If it skipped
> Unit for T1 and T2, which are completely useless because Unit erases to
> BoxedUnit and T1 and T2 are already specialized on AnyRef, then it would
> generate 9*9*10 = 810 classfiles, a 19% reduction. Or flipped around, we're
> eating an almost 25% size gain for nothing.
>
> One can of course enumerate the types (or make one of my fancy new groups in
> the Specializable companion) but the point of the ticket is that it would be
> even better if we didn't have to tell the compiler that we don't need 25%
> more custom code to handle all the incoming Units we intend on processing.
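
The arithmetic in the quoted message can be checked with a few lines of Scala. This is only a sketch: the figure of 10 variants per type parameter (nine primitives including Unit, plus AnyRef) is taken from the message, and the object and value names are made up for illustration. Note that the overhead works out to roughly 23.5%, which is what the message rounds to "almost 25%".

```scala
object ClassfileCount {
  // Variants per type parameter, as assumed in the quoted message:
  // 9 primitives (including Unit) plus AnyRef.
  val allVariants = 10
  val withoutUnit = allVariants - 1

  // Foo[T1, T2, R]: one generated class per combination of variants.
  val current: Int = allVariants * allVariants * allVariants

  // Skip Unit for the argument positions T1 and T2 only; R keeps all 10.
  val skippingUnitArgs: Int = withoutUnit * withoutUnit * allVariants

  // Saving relative to today's count, and today's excess relative to the smaller count.
  val reductionPct: Double = 100.0 * (current - skippingUnitArgs) / current
  val overheadPct: Double = 100.0 * (current - skippingUnitArgs) / skippingUnitArgs

  def main(args: Array[String]): Unit =
    println(f"$current%d vs $skippingUnitArgs%d classfiles: $reductionPct%.1f%% reduction, $overheadPct%.1f%% overhead")
}
```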
--
It might also be good to have the reverse of @nospecialize, i.e. a way
to force specialization on methods whose type signatures don't
immediately suggest specialization.
Here's an example:
import scala.{specialized => spec}

class Foo[@spec A](a1:A, a2:A) {
  def isEqual:Boolean = a1 == a2
  class Bar[@spec B](b:B) {
    def allEqual = a1 == b && a2 == b
  }
}
Here, "isEqual" isn't specialized because the method's arguments and
return type don't mention A. "Bar" is only specialized on B currently,
but even if the class was also specialized on A its "allEqual" method
likely would not be.
Users who write code like this probably expect specialization to copy
*all* methods, and are surprised to find out that they are not, but
that adding an unused parameter of type A to a method *will* cause
scalac to create a specialized overload of that method.
An explicit annotation would probably be easier to reason about in the
long run (assuming a larger refactor along the lines of my previous
proposal [1] isn't taken).
-- Erik
[1] http://groups.google.com/group/scala-debate/browse_thread/thread/94c2ae2daaaa2136
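
One way to observe the behaviour described in this message is to list the methods the compiler actually emitted. The following is a rough sketch, not a definitive test: `isEqualWith` and `ListMethods` are hypothetical names, specialization is restricted to Int to keep the output short, and what gets printed depends on the compiler version (a compiler that ignores @specialized will show no specialized variants at all). If the message is right, a Scala 2 compiler should emit a specialized variant for the method with the unused parameter of type A and none for `isEqual`.

```scala
import scala.{specialized => spec}

class Foo[@spec(Int) A](a1: A, a2: A) {
  // The signature does not mention A.
  def isEqual: Boolean = a1 == a2

  // Workaround described in the message: an unused parameter of type A.
  def isEqualWith(unused: A): Boolean = a1 == a2
}

object ListMethods {
  // Names of the methods declared directly on the given class, via Java reflection.
  def methodNames(cls: Class[_]): List[String] =
    cls.getDeclaredMethods.toList.map(_.getName).sorted

  def main(args: Array[String]): Unit =
    methodNames(classOf[Foo[_]]).foreach(println)
}
```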
Now that Erik has gotten AnyRef specialization working tolerably well, I looked at specializing Function2 on AnyRef and Boolean (knowing that was likely to be too much.) OK, that's quite a bit:

% git diff head^ --diff-filter=A --name-only |wc -l
654

654 new classfiles. Except:

% git diff head^ --diff-filter=A --name-only |egrep -v '(tupled|curried|andThen)' |wc -l
240

Only 240 of those are the ones I'm after (more like 120, but I expect to eat the interface/implclass hit.) And 414 are "unluckily positioned accident."

Those functions are not entirely unuseful, but they're sure not this kind of useful. There's a little chicken/egg here because a lot of us don't use them for performance reasons, and I guess explosively specializing them might help with that. But even in the best case that chicken is godzilla and the egg wouldn't feed a small family of lizards.

Two things to do, both of which make sense to me independently of whether we want to specialize F2 any further and independently of one another.

- @nospecialize annotation for members (no reason we can't do this that I can think of... am I missing one?)
- move those methods out of the instances

I know it bothers our object-oriented purity to have tupled and curried in a separate object, but this way has been all lose: the huge specialization tax, and we weren't even able to get rid of the ones on the object because type inference doesn't work as well on the instance, as seen here.
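
The second option, moving these methods out of the instances, could look roughly like the sketch below. It assumes a compiler with implicit classes, and every name in it (`FunctionDecorators`, `Function2Ops`, `asCurried`, `asTupled`) is hypothetical. The methods are deliberately not called `curried` and `tupled`, because the members that exist on Function2 today would take precedence over the decorator.

```scala
object FunctionDecorators {
  // Hypothetical decorator: the methods live here rather than on Function2 itself,
  // so they would not be copied into every specialized variant of the function class.
  implicit class Function2Ops[A, B, R](private val f: (A, B) => R) extends AnyVal {
    def asCurried: A => B => R = a => b => f(a, b)
    def asTupled: ((A, B)) => R = { case (a, b) => f(a, b) }
  }
}

object DecoratorDemo {
  import FunctionDecorators._

  val add: (Int, Int) => Int = _ + _

  // Both calls resolve through the implicit decorator.
  val curriedResult: Int = add.asCurried(1)(2)
  val tupledResult: Int = add.asTupled((1, 2))

  def main(args: Array[String]): Unit =
    println(s"$curriedResult $tupledResult")
}
```

The trade-off is the one the message already names: the methods stop being ordinary members, and type inference at the call site may behave differently from the member version.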
(As far as I know) without AnyRef specialization you can't mix AnyRef
and AnyVal types without causing boxing. For instance, Function1[Foo,
Int] will be able to use the Function1$mcLI$sp implementation if
Function1 is compiled with AnyRef specialization, but otherwise it will
have to just use Function1 (which will box the return value).
This is what I've always understood the benefit of AnyRef
specialization to be.
-- Erik
True. I haven't considered that, having focused on the array indexing
performance instead.
--
Aleksandar Prokopec,
Doctoral Assistant
LAMP, IC, EPFL
http://people.epfl.ch/aleksandar.prokopec
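
Erik's point about mixing AnyRef and AnyVal types can be illustrated with a hand-written stand-in for the missing variant. This is a sketch under stated assumptions: `RefToInt` is a hypothetical trait that plays the role a Function1$mcLI$sp-style class would play, and `viaErased` only makes visible that the generic Function1 interface hands back a boxed java.lang.Integer.

```scala
// Hypothetical hand-written equivalent of an (AnyRef, Int) specialization:
// the result type is a primitive Int, so nothing is boxed on return.
trait RefToInt[A <: AnyRef] {
  def apply(a: A): Int
}

object BoxingDemo {
  // Function1[String, Int]: String is not a specialized argument type,
  // so calls go through the generic apply and the result is boxed.
  val generic: String => Int = _.length

  val handSpecialized: RefToInt[String] = new RefToInt[String] {
    def apply(s: String): Int = s.length
  }

  // Viewing the function through its erased interface exposes the box.
  def viaErased(f: String => Any, s: String): Any = f(s)

  def main(args: Array[String]): Unit = {
    println(viaErased(generic, "abc").getClass.getName)
    println(handSpecialized("abc"))
  }
}
```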
On Mon, Feb 20, 2012 at 7:42 AM, Erik Osheim <er...@plastic-idolatry.com> wrote:
> On Sun, Feb 19, 2012 at 09:58:18PM +0100, martin odersky wrote:
> > 100+. We need to better control the effects of specialization. In
> > particular, do something about tupled and curried. Moving to an
> > implicit decorator or @nospecialize seem both valid options.
>
> It might also be good to have the reverse of @nospecialize, i.e. a way
> to force specialization on methods whose type signatures don't
> immediately suggest specialization.

I also like this suggestion. It would give the users better control over what gets specialized.
And I've been giving some thought to Matthew Pocock's idea of controlling the specialization to avoid generating the full cartesian product of all specializations. My take on it would be to add a @specializeOn annotation to control the types we want to specialize on. The notation is clearly on the heavyweight side, so any ideas on how to make it less verbose are welcome:
@specializeOn(Int, Int)
@specializeOn(Int, Unit)
@specializeOn(Int, AnyRef)
/* same for Boolean, Float and Double */
@specializeOn(AnyRef, Unit)
@specializeOn(Int, Boolean)
@specializeOn(Float, Boolean)
@specializeOn(Double, Boolean)
trait Function1[@specialized -T1, @specialized +R] extends AnyRef
I guess with 16 specialized classes we could get most of the cases of Function1 (currently Function1 generates 35 specialized classes).
Unfortunately this notation, as it is, wouldn't scale to Function2, so we need to make it leaner.
Back to exploring the idea itself, regardless of the syntax, I would suggest being able to add something like:
@specialized(Int, Any)
which provides a safety net if none of the specialized cases ((Int, Int), (Int, Unit), (Int, AnyRef), (Int, Boolean)) match the type parameters. This would only specialize on the first argument, leaving the second one boxed. It's only a partial solution, but it would cut some of the boxing/unboxing time without generating 10 classes (e.g. Function1[Int, Byte] would match the (Int, Any) specialization, keeping Int unboxed but boxing Byte).
What do you think?
Vlad
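
The selection rule proposed here (exact pair first, then the partial (Int, Any) fallback, then the fully generic class) can be modelled in a few lines. This is a toy model, not compiler code: types are represented by their names, `VariantChoice` and `choose` are hypothetical, and "same for Boolean, Float and Double" is read as three pairs per primitive, (P, P), (P, Unit) and (P, AnyRef), which is one possible reading. Either way the list comes to the 16 classes mentioned in the message.

```scala
object VariantChoice {
  val primitives: List[String] = List("Int", "Boolean", "Float", "Double")

  // The (T1, R) pairs from the annotation list, under the reading described above.
  val exact: Set[(String, String)] =
    primitives.flatMap(p => List((p, p), (p, "Unit"), (p, "AnyRef"))).toSet ++
      Set(("AnyRef", "Unit"), ("Int", "Boolean"), ("Float", "Boolean"), ("Double", "Boolean"))

  // Partial fallbacks that specialize T1 only: models @specialized(Int, Any).
  val partial: Set[String] = Set("Int")

  // Which variant a Function1[t1, r] would use in this model.
  def choose(t1: String, r: String): (String, String) =
    if (exact((t1, r))) (t1, r)               // fully specialized variant
    else if (partial(t1)) (t1, "Any")         // argument unboxed, result boxed
    else ("Any", "Any")                       // generic Function1, everything boxed

  def main(args: Array[String]): Unit = {
    println(exact.size)
    println(choose("Int", "Byte"))
  }
}
```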
I'm waiting for someone to point out that we are straying close to defining a type-level indicator function over tuples of types that should be specialized on, and to come up with some type-class skulduggery that does the trick in an entirely incomprehensible but elegant way, possibly by co-opting a macro or some other bleeding edge functionality to bind types and type-names from the specialized signature to those appearing in the annotation.
object Specializable {
  // No type parameter in @specialized annotation.
  trait SpecializedGroup { }

  // Smuggle a list of types by way of a tuple upon which Group is parameterized.
  class Group[T >: Null](value: T) extends SpecializedGroup { }

  final val Primitives  = new Group(Byte, Short, Int, Long, Char, Float, Double, Boolean, Unit)
  final val Everything  = new Group(Byte, Short, Int, Long, Char, Float, Double, Boolean, Unit, AnyRef)
  final val Bits32AndUp = new Group(Int, Long, Float, Double)
  final val Integral    = new Group(Byte, Short, Int, Long, Char)
  final val AllNumeric  = new Group(Byte, Short, Int, Long, Char, Float, Double)
  final val BestOfBreed = new Group(Int, Double, Boolean, Unit, AnyRef)
}

/** Annotate type parameters on which code should be automatically
 *  specialized. For example:
 *  {{{
 *    class MyList[@specialized T] ...
 *  }}}
 *
 *  Type T can be specialized on a subset of the primitive types by
 *  specifying a list of primitive types to specialize at:
 *  {{{
 *    class MyList[@specialized(Int, Double, Boolean) T] ..
 *  }}}
 *
 *  @since 2.8
 */
// class tspecialized[T](group: Group[T]) extends annotation.StaticAnnotation {
class specialized(group: SpecializedGroup) extends annotation.StaticAnnotation {
  def this(types: Specializable*) = this(new Group(types.toList))
  def this() = this(Everything)
}

... compiler ...

private def specializedOn(sym: Symbol): List[Symbol] = {
  sym getAnnotation SpecializedClass match {
    case Some(AnnotationInfo(_, Nil, _)) => specializableTypes.map(_.typeSymbol)
    case Some(ann @ AnnotationInfo(_, args, _)) => {
      args map (_.tpe) flatMap { tp =>
        tp baseType GroupOfSpecializable match {
          case TypeRef(_, GroupOfSpecializable, arg :: Nil) =>
            arg.typeArgs map (_.typeSymbol)
          case _ =>
            List(tp.typeSymbol)
        }
      }
    }
    case _ => Nil
  }
}
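
The flattening step in `specializedOn` above can be modelled outside the compiler. The sketch below is a simplified stand-in, not the real implementation: annotation arguments are represented by the hypothetical case classes `Single` and `Group`, types by their names, and the list used for a bare @specialized is assumed to match the `Everything` group.

```scala
object GroupExpansion {
  // Hypothetical stand-ins for the arguments of a @specialized annotation.
  sealed trait Arg
  final case class Single(name: String) extends Arg
  final case class Group(members: List[String]) extends Arg

  // Assumed to mirror the Everything group from the message.
  val allSpecializable: List[String] =
    List("Byte", "Short", "Int", "Long", "Char", "Float", "Double", "Boolean", "Unit", "AnyRef")

  val Bits32AndUp: Group = Group(List("Int", "Long", "Float", "Double"))

  // None      = no annotation at all
  // Some(Nil) = bare @specialized, meaning every specializable type
  // Some(as)  = explicit arguments; groups are expanded to their members
  def specializedOn(annotation: Option[List[Arg]]): List[String] = annotation match {
    case Some(Nil) => allSpecializable
    case Some(args) =>
      args.flatMap {
        case Group(members) => members
        case Single(name)   => List(name)
      }
    case None => Nil
  }

  def main(args: Array[String]): Unit =
    println(specializedOn(Some(List(Bits32AndUp, Single("Boolean")))))
}
```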
> Back to exploring the idea itself, regardless of the syntax, I would suggest being able to add something like:
> @specialized(Int, Any)
> which provides a safety net if none of the specialized cases ((Int,Int), (Int, Unit), (Int, AnyRef), (Int, Bool)) match the type parameters, This would only specialize on the first argument, leaving the second one boxed. It's only a partial solution, but it would cut some of the boxing/unboxing time without generating 10 classes. (e.g. Function1[Int, Byte] would match the (Int, Any) specialization, keeping Int unboxed but boxing Byte)
Agreed, it seems like we should be able to do this. I'm not sure how
many assumptions this violates but it's certainly worth looking into.
I guess we should open a ticket?
-- Erik
And I'm waiting for someone to point out... oh I should know better. The real trick was doing this without adding a type parameter to @specialized.

(Notice that you could as easily say new Group[(Int, Char, Double)](null), and were I not trying for seamlessness with @specialized, lose the unnecessary parameter and just give the types: tuples as type-varargs.)
Wonderful idea. It would be a huge improvement to be able to express
the types in terms of other types. I noticed right away that simply
being able to enumerate a list of types was insufficient, in part
because it's so hard to give a meaningful name to a subset of types.
(Thus my not-really-serious names like "BestOfBreed".) Relative to
another group, that's where the meaningful names are.
Perhaps I am being dense, but what assumptions could be violated? It
has been statically typed as an Int => Byte. There is an Object =>
Object point of entry and a specialized Int => Object point of entry,
and if either of those is called with anything other than a (possibly
boxed) Int or expecting anything other than a (possibly boxed) Byte,
all is lost because the type system failed. It better be true that we
can call the Int => Object interface and unbox a Byte.
There's only one implementation, it's just a question of how we get
there. We pick up the train midtown instead of riding all the way
from downtown.
Oh, I don't think there's a conceptual problem. I just meant the amount
of code that would need to change in the implementation to support the
fact that a type parameter bound to an AnyVal type could choose to use
the AnyRef specialization.
To be clear: we should do this. :)
-- Erik
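
The "one implementation, several points of entry" argument from the exchange above can be sketched as follows. This is a hand-written illustration, not what the compiler generates: `IntToByteFn`, `applyGeneric` and `applyIntToObject` are hypothetical names standing in for the Object => Object and Int => Object entry points.

```scala
object EntryPoints {
  final class IntToByteFn(body: Int => Byte) {
    // The single implementation that every entry point ends up in.
    private def impl(x: Int): Byte = body(x)

    // Generic point of entry: Object => Object after erasure.
    def applyGeneric(x: Any): Any = impl(x.asInstanceOf[Int])

    // Partially specialized point of entry: Int => Object.
    // The argument stays unboxed; only the result is boxed.
    def applyIntToObject(x: Int): Any = impl(x)
  }

  val low7Bits: IntToByteFn = new IntToByteFn(x => (x & 0x7f).toByte)

  // The caller knows statically that the result is a Byte, so unboxing it is safe
  // whichever entry point was used.
  val viaGeneric: Byte = low7Bits.applyGeneric(300).asInstanceOf[Byte]
  val viaPartial: Byte = low7Bits.applyIntToObject(300).asInstanceOf[Byte]

  def main(args: Array[String]): Unit =
    println(s"$viaGeneric $viaPartial")
}
```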