optimizing collection creation


Paul Phillips

Aug 15, 2012, 1:16:01 PM
to scala-i...@googlegroups.com
I have a lot of performance things which offer some room for improvement.  I'll start rolling some of them out here.

When you create a collection, a lot of unnecessary work is done.  The cases of particular interest to me are List and Array.

1) List(1, 2, 3) leads to approximately this:

  val arr = new Array[Int](3)
  arr(0) = 1
  arr(1) = 2
  arr(2) = 3
  val wrapped = new WrappedArray(arr)
  List.apply(wrapped) 
  val cbf = implicitly[CanBuildFrom[...]]
  val builder = cbf() // new ListBuffer[Int]
  builder ++= wrapped
  builder.result

When it could be done like this:

  new ::(1, new ::(2, new ::(3, Nil)))
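The hand-built version can be exercised directly from ordinary code; a minimal sketch (using the idiomatic infix spelling of the `::` constructor) confirming it produces the same list:

```scala
object ConsDemo {
  def main(args: Array[String]): Unit = {
    // the sugar-free spelling: three cons cells, no array, no builder
    val manual = 1 :: 2 :: 3 :: Nil
    // structurally equal to what List(1, 2, 3) ultimately produces
    assert(manual == List(1, 2, 3))
  }
}
```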

2) Array(1, 2, 3) leads to approximately this:

  val arr = new Array[Int](3)
  arr(0) = 1
  arr(1) = 2
  arr(2) = 3
  // Hey, stop right there! Scala, stop! Oh no...
  val wrapped = new WrappedArray(arr)
  Array.apply(wrapped) 
  val arr2 = new Array[Int](wrapped.length)
  var i = 0
  // throw in a closure and a boxing trip through Seq#foreach
  for (x <- wrapped.iterator) { arr2(i) = x; i += 1 } 
  arr2

I think I could do List (and all the other collections except Array) with a macro.  I can't figure out Array with a macro because it yells at me if I try to pass an implicit ClassTag, and I'm not sure how I'm supposed to get my hands on it within the macro.  I hope it's possible.
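For reference, outside of macro-land a ClassTag context bound is enough to build the primitive array by hand; this is a hedged sketch (the `makeArray` name is made up for illustration, and note the varargs call itself still goes through Seq wrapping, which is exactly what a macro would eliminate):

```scala
import scala.reflect.ClassTag

object ArrayDemo {
  // builds the target array directly: no second copy through a builder
  def makeArray[T: ClassTag](xs: T*): Array[T] = {
    val arr = new Array[T](xs.length)
    var i = 0
    for (x <- xs) { arr(i) = x; i += 1 }
    arr
  }

  def main(args: Array[String]): Unit = {
    val ints = makeArray(1, 2, 3)
    // the ClassTag ensures we get a primitive int[] at runtime
    assert(ints.getClass.getName == "[I")
    assert(ints.sameElements(Array(1, 2, 3)))
  }
}
```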

However, because we ALREADY have a hack in the typer which translates occurrences of List() into Nil, I went in there and enhanced it to rewrite all "List literals", not only List().  That particular hack still exists because it was accidentally undone once and the performance regression was severe.  I infer from that impact that we might win a lot if we rewrote List applies as I suggest.

Here is the before/after bytecode for this method.  Most of the work in the "before" version is not visible here because it's actually in the call to List.apply.  In the "after" version, however, what you see is everything there is to do.

  def f = List("a", "b", "c")

*** Before ***

 0: getstatic     #19                 // Field scala/collection/immutable/List$.MODULE$:Lscala/collection/immutable/List$;
 3: getstatic     #24                 // Field scala/Predef$.MODULE$:Lscala/Predef$;
 6: iconst_3      
 7: anewarray     #26                 // class java/lang/String
10: dup           
11: iconst_0      
12: ldc           #28                 // String a
14: aastore       
15: dup           
16: iconst_1      
17: ldc           #30                 // String b
19: aastore       
20: dup           
21: iconst_2      
22: ldc           #32                 // String c
24: aastore       
25: checkcast     #34                 // class "[Ljava/lang/Object;"
28: invokevirtual #40                 // Method scala/LowPriorityImplicits.wrapRefArray:([Ljava/lang/Object;)Lscala/collection/mutable/WrappedArray;
31: invokevirtual #44                 // Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala/collection/immutable/List;
34: areturn       

*** After ***

 0: new           #16                 // class scala/collection/immutable/$colon$colon
 3: dup
 4: ldc           #18                 // String a
 6: new           #16                 // class scala/collection/immutable/$colon$colon
 9: dup
10: ldc           #20                 // String b
12: new           #16                 // class scala/collection/immutable/$colon$colon
15: dup
16: ldc           #22                 // String c
18: getstatic     #27                 // Field scala/collection/immutable/Nil$.MODULE$:Lscala/collection/immutable/Nil$;
21: invokespecial #30                 // Method scala/collection/immutable/$colon$colon."<init>":(Ljava/lang/Object;Lscala/collection/immutable/List;)V
24: invokespecial #30                 // Method scala/collection/immutable/$colon$colon."<init>":(Ljava/lang/Object;Lscala/collection/immutable/List;)V
27: invokespecial #30                 // Method scala/collection/immutable/$colon$colon."<init>":(Ljava/lang/Object;Lscala/collection/immutable/List;)V
30: areturn


Just imagine the "after" if we specialize :: (something I have already done successfully in a branch).

There is a potential negative to this rewrite - if you had a ton of arguments, you might blow the stack where you wouldn't the other way.  I am not particularly concerned about this, but pointing it out for completeness.

Erik Osheim

Aug 15, 2012, 2:12:57 PM
to scala-i...@googlegroups.com
On Wed, Aug 15, 2012 at 10:16:01AM -0700, Paul Phillips wrote:
> 1) List(1, 2, 3) leads to approximately this:

...

> 2) Array(1, 2, 3) leads to approximately this:

...

> I think I could do List (and all the other collections except Array) with a
> macro. I can't figure out Array with a macro because it yells at me if I
> try to pass an implicit ClassTag, and I'm not sure how I'm supposed to get
> my hands on it within the macro. I hope it's possible.

That is incredibly good to point out, and obviously pretty troubling!

I will try to look at writing a macro (I feel like it should be
possible?) but if not this seems like one of the few places where it
would be worth doing something else. I think it's important to ensure
that users get something approximating:

int[] ns = {1, 2, 3};
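For what it's worth, the final result (as opposed to the construction path) is already the unboxed equivalent of the Java array; a quick check, runnable as-is:

```scala
object PrimitiveArrayCheck {
  def main(args: Array[String]): Unit = {
    val ns = Array(1, 2, 3)
    // whatever copying happened along the way, the end product is a JVM int[]
    assert(ns.getClass.getName == "[I")
  }
}
```

So the complaint here is about the intermediate allocations and copies, not the runtime representation.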

I'm already slightly freaking out about this (I'm not sure why I
assumed Array(1, 2, 3) would be efficient, but I did).

-- Erik

Paul Phillips

Aug 15, 2012, 2:17:04 PM
to scala-i...@googlegroups.com


On Wed, Aug 15, 2012 at 11:12 AM, Erik Osheim <er...@plastic-idolatry.com> wrote:
I'm already slightly freaking out about this (I'm not sure why I
assumed Array(1, 2, 3) would be efficient, but I did).

Because you have not yet hit bottom.  Only after all hope is lost can the rebuilding truly begin.


Grzegorz Kossakowski

Aug 15, 2012, 6:17:48 PM
to scala-i...@googlegroups.com
On 15 August 2012 19:16, Paul Phillips <pa...@improving.org> wrote:
I have a lot of performance things which offer some room for improvement.  I'll start rolling some of them out here.

When you create a collection, a lot of unnecessary work is done.  The cases of particular interest to me are List and Array.

1) List(1, 2, 3) leads to approximately this:

  val arr = new Array[Int](3)
  arr(0) = 1
  arr(1) = 2
  arr(2) = 3
  val wrapped = new WrappedArray(arr)
  List.apply(wrapped) 
  val cbf = implicitly[CanBuildFrom[...]]
  val builder = cbf() // new ListBuffer[Int]
  builder ++= wrapped
  builder.result

When it could be done like this:

  new ::(1, new ::(2, new ::(3, Nil)))

I agree that this is annoying and expensive. It's not only about object allocation but also about blowing up bytecode unnecessarily.

I think there's a middle ground between what we have now and macro/builtin compiler support. This reminds me of an idea I had (and Miguel also mentioned to me) of early inlining. The current state of affairs is that for code with a foreach call, like List(1,2,3) foreach println, we first expand the lambda to a class, typecheck everything and generate lots of trees, then icode, and only then do we realize (if we are lucky) that we can inline the foreach implementation, inline the apply of the closure, and throw away the whole class we generated for the closure. That's a huge amount of wasted work.

The idea of early inlining is that for things we are sure we'll be able to inline, and (as a consequence) for which we can eliminate the need for closure allocation, we would keep the Function tree node all the way through and handle it properly in the backend. This way we would skip a lot of work.

Now, going back to your example, the same idea should apply. We keep richer trees if we know that we can inline Array.apply, and the backend takes care of the rest. We'd skip all the boxing and other unnecessary work altogether.

Obviously, this is just a sketch of an idea, but I believe it's the way to go for us in the long run. We cannot afford unnecessary work in the Scala compiler. It's too expensive.

--
Grzegorz Kossakowski

Paul Phillips

Aug 15, 2012, 8:34:28 PM
to scala-i...@googlegroups.com
On Wed, Aug 15, 2012 at 3:17 PM, Grzegorz Kossakowski <grzegorz.k...@gmail.com> wrote:
only then do we realize (if we are lucky) that we can inline the foreach implementation, inline the apply of the closure and throw away the whole class we generated for the closure. That's a huge amount of wasted work.

Not to mention a huge source of bugs.  Much of the confusion inside the compiler arises from transforming into too low-level a form, then expending lots of time and code attempting to roll back the transformation to find out what it was in the first place.  There are a lot of variations on this theme.  I would love to see internal abstractions which better preserve the semantics of the source code and which have been rethought with support for optimization at a higher priority.

  class A { def f = List(1) map (_ + 1) }

It still drives me nuts that we never get a clean parse tree.  The very first tree out of the gate has assaulted our source code with hardcoded assumptions, more egregiously so if I'd thrown in a for comprehension or certain other unlucky constructs.

package <empty> {
  class A extends scala.AnyRef {
    def <init>() = {
      super.<init>();
      ()
    };
    def f = List(1).map(((x$1) => x$1.$plus(1)))
  }
}

Then here's the AST at the end of uncurry - and we haven't yet reached specialization, erasure, flatten....

package <empty> {
  class A extends Object {
    def <init>(): A = {
      A.super.<init>();
      ()
    };
    def f(): List[Int] = immutable.this.List.apply[Int](scala.this.Predef.wrapIntArray(Array[Int]{1})).map[Int, List[Int]]({
      @SerialVersionUID(0) final <synthetic> class $anonfun extends scala.runtime.AbstractFunction1[Int,Int] with Serializable {
        def <init>(): anonymous class $anonfun = {
          $anonfun.super.<init>();
          ()
        };
        final def apply(x$1: Int): Int = x$1.+(1)
      };
      (new anonymous class $anonfun(): Int => Int)
    }, immutable.this.List.canBuildFrom[Int]())
  }
}

I am sure we can improve.

Erik Osheim

Aug 16, 2012, 1:46:28 AM
to scala-i...@googlegroups.com
So, I know there is some concern around how exactly to solve this
problem, but for what it's worth I have a macro [1] that seems to work
for Arrays (which seemed like they would be the hardest to support).

I'd love to get feedback on it, both reports of whether it works for
people and also if it's written in "good macro style" whatever that is.
I only barely understand macros so I mostly just guess-and-check over
and over until I get them (seemingly) working.

-- Erik

[1] See attached ArrayUtil.scala.

Miguel Garcia

Aug 16, 2012, 2:32:12 AM
to scala-i...@googlegroups.com

I've created a ticket summarizing insights from this discussion at https://issues.scala-lang.org/browse/SI-6247 

Miguel
http://lampwww.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/

Erik Osheim

Aug 16, 2012, 10:51:17 AM
to scala-i...@googlegroups.com
On Thu, Aug 16, 2012 at 01:46:28AM -0400, Erik Osheim wrote:
> So, I know there is some concern around how exactly to solve this
> problem, but for what it's worth I have a macro [1] that seems to work
> for Arrays (which seemed like they would be the hardest to support).

P.S. I should also add that I'm happy to send a pull request to get
this (and versions for List, Vector, etc) into master (or 2.10.x) but
it seems like the decision of what to do hasn't been made yet.

-- Erik

Josh Suereth

Aug 16, 2012, 10:57:33 AM
to scala-i...@googlegroups.com
You should submit a pull request.  Greg + I are working on different types of performance measurements.  If there's a public branch that's easily findable, we'll be able to measure what kind of performance your patch gives us and ensure community projects build before merging.

That said, I'm not sure if anyone is duplicating this work.  I think, at this point, the answer to that is no, but please chime in anyone who's looking into it.

- Josh

Paul Phillips

Aug 16, 2012, 10:58:59 AM
to scala-i...@googlegroups.com


On Thu, Aug 16, 2012 at 7:57 AM, Josh Suereth <joshua....@gmail.com> wrote:
You should submit a pull request.  Greg + I are working on different types of performance measurements.  If there's a public branch that's easily findable, we'll be able to measure what kind of performance your patch gives us and ensure community projects build before merging.

I think the meta-decision to be made is whether we're willing to become dependent on macros to this extent at this time.

Grzegorz Kossakowski

Aug 16, 2012, 11:10:42 AM
to scala-i...@googlegroups.com
I'm strongly against it. Let's wait until people experiment with them in the wild before we make everybody depend on them through the standard library. We need to be careful about how we manage risk.

--
Grzegorz Kossakowski

Paul Phillips

Aug 16, 2012, 11:15:29 AM
to scala-i...@googlegroups.com


On Thu, Aug 16, 2012 at 8:10 AM, Grzegorz Kossakowski <grzegorz.k...@gmail.com> wrote:
I'm strongly against it. Let's wait until people experiment with them in the wild before we make everybody depend on them through the standard library. We need to be careful about how we manage risk.

I agree.  It's not like the situation with collection creation is a new thing; it's always been like this.  We've lasted this long.

But we should consider my patch to rewrite List literals as ::s.


Some tests fail because of some hardcoded assumptions in reify/macroland, but otherwise it is done.  This is done completely from within the compiler, and the important thing to note for anyone who opposes ugly hacks (I am also in this club) is that this ugly hack is already there and already doing this rewrite.  It's just that at present it only does it for empty lists (i.e. List() becomes Nil) and not for length > 0.

Jason Zaugg

Aug 16, 2012, 11:16:13 AM
to scala-i...@googlegroups.com, Eugene Burmako
On Thu, Aug 16, 2012 at 7:46 AM, Erik Osheim <er...@plastic-idolatry.com> wrote:
> So, I know there is some concern around how exactly to solve this
> problem, but for what it's worth I have a macro [1] that seems to work
> for Arrays (which seemed like they would be the hardest to support).
>
> I'd love to get feedback on it, both reports of whether it works for
> people and also if it's written in "good macro style" whatever that is.
> I only barely understand macros so I mostly just guess-and-check over
> and over until I get them (seemingly) working.
>
> -- Erik
>
> [1] See attached ArrayUtil.scala.

It's not quite a drop-in replacement for Array.apply:

scala> Array[Int](Nil: _*)
res29: Array[Int] = Array()

scala> ArrayUtil.alloc[Int](Nil: _*)
<console>:20: error: no `: _*' annotation allowed here
(such annotations are only allowed in arguments to *-parameters)
ArrayUtil.alloc[Int](Nil: _*)
^

Seems like a general problem with the interaction of macros + varargs.
Eugene: is this a known issue?

-jason

Erik Osheim

Aug 16, 2012, 11:20:23 AM
to scala-i...@googlegroups.com
On Thu, Aug 16, 2012 at 08:15:29AM -0700, Paul Phillips wrote:
> > I'm strongly against it. Let's wait until people experiment with them in
> > wild before we make everybody depending on them through std lib. We need to
> > be careful about how we manage risk.
>
> I agree. It's not like the situation with collection creation is a new
> thing, it's always been like this. We've lasted this long.

In that case I'll just wait, since I'm not excited about doing all the
work to create a pull request and then managing a long-lived branch.

I will add macros like these to Debox [1] so it should be easy to
benchmark the relative performance there, for anyone that is
interested.

-- Erik

Erik Osheim

Aug 16, 2012, 11:26:57 AM
to scala-i...@googlegroups.com
On Thu, Aug 16, 2012 at 05:16:13PM +0200, Jason Zaugg wrote:
> Seems like a general problem with the interaction of macros + varargs.
> Eugene: is this a known issue?

Right, in this case it seems like you can't use the same strategy,
since the user may have done something like:

val ns = List(1, 2, 3, 4).filter(_ % 2 == 0)
ArrayUtil.alloc(ns:_*)

It does seem a bit perverse to be using alloc in these cases instead of
something like toArray. :) But you're right, it's definitely a
deficiency.
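The `: _*` forwarding that trips up the macro works fine on an ordinary varargs method; a sketch of the non-macro baseline (`alloc` here is a stand-in for illustration, not the real ArrayUtil):

```scala
object VarargsDemo {
  // a plain varargs method: sequence-argument syntax is accepted here
  def alloc[T](xs: T*): List[T] = xs.toList

  def main(args: Array[String]): Unit = {
    val ns = List(1, 2, 3, 4).filter(_ % 2 == 0)
    assert(alloc(ns: _*) == List(2, 4))      // `: _*` compiles for a real method
    assert(alloc(1, 2, 3) == List(1, 2, 3))  // literal arguments also work
  }
}
```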

-- Erik

Paul Phillips

Aug 16, 2012, 12:36:54 PM
to scala-i...@googlegroups.com, Eugene Burmako
On Thu, Aug 16, 2012 at 8:16 AM, Jason Zaugg <jza...@gmail.com> wrote:
  scala> ArrayUtil.alloc[Int](Nil: _*)
  <console>:20: error: no `: _*' annotation allowed here
  (such annotations are only allowed in arguments to *-parameters)
              ArrayUtil.alloc[Int](Nil: _*)
                                      ^

We should relax this restriction so the macro can receive the arguments and the wildcard star. 

Erik Osheim

Aug 16, 2012, 12:40:14 PM
to scala-i...@googlegroups.com
On Thu, Aug 16, 2012 at 11:20:23AM -0400, Erik Osheim wrote:
> I will add macros like these to Debox [1] so it should be easy to
> benchmark the relative performance there, for anyone that is
> interested.

So I've done some initial profiling (attached below) which indicates
that for most array literals (1-4 items) the macro is a little more
than twice as fast, but for larger array literals it's 3-5 times as
fast.

The benchmark code lives at:

https://github.com/non/debox/blob/master/benchmark/src/main/scala/debox/benchmark/LiteralMacroBenchmark.scala

Given that the times in question are in nanoseconds, it's easy to
imagine this not making a huge difference for most users, although of
course I can construct cases (i.e. involving codegen) where it would be
a big deal.

-- Erik

Results below, and also at http://plastic-idolatry.com/scala/array.txt

benchmark      ns  linear runtime
Apply0       14.1  =
Macro0       13.7  =
Apply1       31.4  ==
Macro1       14.7  =
Apply2       27.6  =
Macro2       14.8  =
Apply4       35.7  ==
Macro4       15.5  =
Apply8       46.0  ===
Macro8       15.9  =
Apply16      66.5  ====
Macro16      18.1  =
Apply32     111.3  =======
Macro32      26.7  =
Apply64     222.9  ==============
Macro64      48.5  ===
Apply128    450.0  ==============================
Macro128     92.6  ======
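For anyone wanting to eyeball numbers like these without the full benchmark setup, a crude nanoTime harness along these lines works (no warmup or JIT control, so treat any results as indicative only):

```scala
object CrudeBench {
  // average nanoseconds per invocation of `body` over n runs
  def timeIt(n: Int)(body: => Unit): Double = {
    val start = System.nanoTime()
    var i = 0
    while (i < n) { body; i += 1 }
    (System.nanoTime() - start).toDouble / n
  }

  def main(args: Array[String]): Unit = {
    val t = timeIt(100000)(Array(1, 2, 3, 4, 5, 6, 7, 8))
    assert(t >= 0.0)
    println(f"Array.apply(8 elems): $t%.1f ns/op")
  }
}
```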

nafg

Aug 17, 2012, 2:01:04 AM
to scala-i...@googlegroups.com
On Wednesday, August 15, 2012 1:16:01 PM UTC-4, Paul Phillips wrote:
1) List(1, 2, 3) leads to approximately this:

  val arr = new Array[Int](3)
  arr(0) = 1
  arr(1) = 2
  arr(2) = 3
  val wrapped = new WrappedArray(arr)
  List.apply(wrapped) 
  val cbf = implicitly[CanBuildFrom[...]]
  val builder = cbf() // new ListBuffer[Int]
  builder ++= wrapped
  builder.result

When it could be done like this:

  new ::(1, new ::(2, new ::(3, Nil)))


Why do we need to add magic to the compiler, when it's not harder to write the more performant code yourself? Is there some major advantage to the List(...) notation?

Also, at what stage is this taking place --- before or after type inference? Because I've had cases where the :: notation confused or crashed the compiler, mainly with types like trait Mapper[T <: Mapper[T]] and the like. The stack depth cost is multiplied by the types' complexity. Also, if List(...) notation compiles faster, there's a tradeoff between compile time and run time, so it would make more sense to let users choose, rather than magically turning one piece of code into another.

Also I'm curious why there's a "hack" to transform List() into Nil; couldn't @inline accomplish that?
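On that last point, the empty-list case is observable from ordinary code: `List()` yields the Nil singleton (via the typer rewrite, and List.apply's own empty case), so no allocation occurs for the empty literal. @inline alone would not achieve the same, since inlining List.apply still leaves the varargs wrapping at the call site. A quick check:

```scala
object NilCheck {
  def main(args: Array[String]): Unit = {
    // List() is reference-identical to Nil: no fresh empty list is allocated
    assert(List() eq Nil)
  }
}
```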



martin odersky

Aug 18, 2012, 7:17:22 AM
to scala-i...@googlegroups.com
On Thu, Aug 16, 2012 at 12:17 AM, Grzegorz Kossakowski <grzegorz.k...@gmail.com> wrote:
On 15 August 2012 19:16, Paul Phillips <pa...@improving.org> wrote:
I have a lot of performance things which offer some room for improvement.  I'll start rolling some of them out here.

When you create a collection, a lot of unnecessary work is done.  The cases of particular interest to me are List and Array.

1) List(1, 2, 3) leads to approximately this:

  val arr = new Array[Int](3)
  arr(0) = 1
  arr(1) = 2
  arr(2) = 3
  val wrapped = new WrappedArray(arr)
  List.apply(wrapped) 
  val cbf = implicitly[CanBuildFrom[...]]
  val builder = cbf() // new ListBuffer[Int]
  builder ++= wrapped
  builder.result

When it could be done like this:

  new ::(1, new ::(2, new ::(3, Nil)))

I agree that this is annoying and expensive. It's not only about object allocation but also about blowing up bytecode unnecessarily.

I think there's a middle ground between what we have now and macro/builtin compiler support. This reminds me of an idea I had (and Miguel also mentioned to me) of early inlining. The current state of affairs is that for code with a foreach call, like List(1,2,3) foreach println, we first expand the lambda to a class, typecheck everything and generate lots of trees, then icode, and only then do we realize (if we are lucky) that we can inline the foreach implementation, inline the apply of the closure, and throw away the whole class we generated for the closure. That's a huge amount of wasted work.

The idea of early inlining is that for things we are sure we'll be able to inline, and (as a consequence) for which we can eliminate the need for closure allocation, we would keep the Function tree node all the way through and handle it properly in the backend. This way we would skip a lot of work.

I agree that saving work is good, and will also help with compile times. But it strikes me that macros are precisely the early inline mechanism we are after. The macro design has already addressed a number of hard problems such as inlining under separate compilation. So I think we should make them work for this as well, instead of inventing a new mechanism.

Concretely, in the end we might be best off making map and friends macros in a couple of key classes such as List. The other benefit of this is that, if the sequence is generic and no macro applies, we fall back to the generic method in TraversableLike. Since this will be the only method visible to the JIT compiler we will still get the JIT optimizations that come with a monomorphic call site. Specializing @inline methods has the downside that they might confuse the JIT compiler.

If we follow this approach, we should _not_ now make inlined versions of these methods in LinearSeqOptimized or IndexedSeqOptimized. It would worsen performance in the short run and hurt in the long run (by silently widening access of members that we want to keep private).

Cheers

 - Martin

Grzegorz Kossakowski

Aug 18, 2012, 10:17:16 AM
to scala-i...@googlegroups.com
On 18 August 2012 13:17, martin odersky <martin....@epfl.ch> wrote:
I agree that saving work is good, and will also help with compile times. But it strikes me that macros are precisely the early inline mechanism we are after. The macro design has already addressed a number of hard problems such as inlining under separate compilation. So I think we should make them work for this as well, instead of inventing a new mechanism.

Concretely, in the end we might be best off making map and friends macros in a couple of key classes such as List. The other benefit of this is that, if the sequence is generic and no macro applies, we fall back to the generic method in TraversableLike. Since this will be the only method visible to the JIT compiler we will still get the JIT optimizations that come with a monomorphic call site. Specializing @inline methods has the downside that they might confuse the JIT compiler.

If we follow this approach, we should _not_ now make inlined versions of these methods in LinearSeqOptimized or IndexedSeqOptimized. It would worsen performance in the short run and hurt in the long run (by silently widening access of members that we want to keep private).

I find this proposal appealing but I would like to see how it works in practice before we decide on anything. Also, I think we should package the whole macro magic into some lightweight syntactic sugar (like @inline). I'd definitely not want to see macros being declared explicitly all over the collections library.

Since macro annotations are far away, I'd opt for an annotation that gets special treatment in the compiler, in the form of triggering a macro that can be defined in the library. I'm thinking of a universal `early inline` macro that could just be applied to all methods we want to inline.

That's something worth experimenting with. If everything works as expected we might consider doing it for 2.11.

--
Grzegorz Kossakowski

martin odersky

Aug 18, 2012, 11:10:44 AM
to scala-i...@googlegroups.com
On Sat, Aug 18, 2012 at 4:17 PM, Grzegorz Kossakowski <grzegorz.k...@gmail.com> wrote:
On 18 August 2012 13:17, martin odersky <martin....@epfl.ch> wrote:
I agree that saving work is good, and will also help with compile times. But it strikes me that macros are precisely the early inline mechanism we are after. The macro design has already addressed a number of hard problems such as inlining under separate compilation. So I think we should make them work for this as well, instead of inventing a new mechanism.

Concretely, in the end we might be best off making map and friends macros in a couple of key classes such as List. The other benefit of this is that, if the sequence is generic and no macro applies, we fall back to the generic method in TraversableLike. Since this will be the only method visible to the JIT compiler we will still get the JIT optimizations that come with a monomorphic call site. Specializing @inline methods has the downside that they might confuse the JIT compiler.

If we follow this approach, we should _not_ now make inlined versions of these methods in LinearSeqOptimized or IndexedSeqOptimized. It would worsen performance in the short run and hurt in the long run (by silently widening access of members that we want to keep private).

I find this proposal appealing but I would like to see how it works in practice before we decide on anything. Also, I think we should package the whole macro magic into some lightweight syntactic sugar (like @inline). I'd definitely not want to see macros being declared explicitly all over the collections library.

I don't see what's qualitatively different between

  @inline def filter(p: T => Boolean): Repr = ...

and

  def filter(p: T => Boolean): Repr = macro filterImpl

Or is it the filterImpl you object to? But that's an implementation detail; users need not look at it.

Cheers

 - Martin


Since macro annotations are far away, I'd opt for an annotation that gets special treatment in the compiler, in the form of triggering a macro that can be defined in the library. I'm thinking of a universal `early inline` macro that could just be applied to all methods we want to inline.

That's something worth experimenting with. If everything works as expected we might consider doing it for 2.11.

--
Grzegorz Kossakowski




--
Martin Odersky
Prof., EPFL and Chairman, Typesafe
PSED, 1015 Lausanne, Switzerland
Tel. EPFL: +41 21 693 6863
Tel. Typesafe: +41 21 691 4967

Grzegorz Kossakowski

Aug 18, 2012, 11:27:26 AM
to scala-i...@googlegroups.com
On 18 August 2012 17:10, martin odersky <martin....@epfl.ch> wrote:


On Sat, Aug 18, 2012 at 4:17 PM, Grzegorz Kossakowski <grzegorz.k...@gmail.com> wrote:
On 18 August 2012 13:17, martin odersky <martin....@epfl.ch> wrote:
I agree that saving work is good, and will also help with compile times. But it strikes me that macros are precisely the early inline mechanism we are after. The macro design has already addressed a number of hard problems such as inlining under separate compilation. So I think we should make them work for this as well, instead of inventing a new mechanism.

Concretely, in the end we might be best off making map and friends macros in a couple of key classes such as List. The other benefit of this is that, if the sequence is generic and no macro applies, we fall back to the generic method in TraversableLike. Since this will be the only method visible to the JIT compiler we will still get the JIT optimizations that come with a monomorphic call site. Specializing @inline methods has the downside that they might confuse the JIT compiler.

If we follow this approach, we should _not_ now make inlined versions of these methods in LinearSeqOptimized or IndexedSeqOptimized. It would worsen performance in the short run and hurt in the long run (by silently widening access of members that we want to keep private).

I find this proposal appealing but I would like to see how it works in practice before we decide on anything. Also, I think we should package the whole macro magic into some lightweight syntactic sugar (like @inline). I'd definitely not want to see macros being declared explicitly all over the collections library.

I don't see what's qualitatively different between

  @inline def filter(p: T => Boolean): Repr = ...

and

  def filter(p: T => Boolean): Repr = macro filterImpl

Or is it the filterImpl you object to? But that's an implementation detail; users need not look at it.

Yes, that's what bothers me. I agree users won't see it, but we also need to think about our own convenience and our ability to maintain code in the library.

Ideally I'd like to have something like:

def filter(p: T => Boolean): Repr = earlyInlineMacro { p =>
  ... filter implementation ...
}

where earlyInlineMacro would be a macro that returns a macro as its result, one that implements early inlining specialized to the code passed to it as a parameter. Is it possible for macros to return macros as a result in the current macro system?

--
Grzegorz Kossakowski

Eugene Burmako

Aug 18, 2012, 11:51:29 AM
to scala-i...@googlegroups.com
The rhs of a macro can be a macro itself, e.g.

def filter = macro earlyInline { ... }

where earlyInline is a macro that generates a static macro impl and expands into a reference to it. The only problem is with the "generates" part. Currently macros have trouble creating publicly visible stuff, because they lack integration with the namer. I didn't check, but most likely this won't work.

Good news is that we actually have a better solution. I don't follow the inliner stuff, but probably it would suffice to write a single macro implementation and then use it everywhere, e.g.

def filter = macro InlinerMacros.earlyInline

This macro impl can then call c.macroApplication, which represents the entire Apply being expanded, extract a symbol of the expandee and dispatch on it.

Eugene Burmako

unread,
Aug 18, 2012, 12:01:09 PM8/18/12
to scala-i...@googlegroups.com
or if you reeeaally want the notation you outlined, the following might fly:

def filter(p: T => Boolean): Repr = macro earlyInline { p =>
  ... filter implementation ...
}

earlyInline is a macro that expands into a reference to the universal InlinerMacros.earlyInlineImpl (which is also a macro) and (here comes the magic) attaches the AST of its parameter (the "filter implementation" part) as an annotation to filter.

Afterwards, when the compiler sees "foo.filter(bar)", it calls earlyInlineImpl, which gets the symbol of filter from macroApplication, unpacks the annotation with the filter implementation code and proceeds.

Yay or nay?

Jason Zaugg

Aug 18, 2012, 4:43:49 PM
to scala-i...@googlegroups.com
On Sat, Aug 18, 2012 at 1:17 PM, martin odersky <martin....@epfl.ch> wrote:
> Concretely, in the end we might be best off making map and friends macros in
> a couple of key classes such as List. The other benefit of this is that, if
> the sequence is generic and no macro applies, we fall back to the generic method
> in TraversableLike. Since this will be the only method visible to the JIT
> compiler we will still get the JIT optimizations that come with a
> monomorphic call site. Specializing @inline methods has the downside that
> they might confuse the JIT compiler.

I had a similar thought today, which led me to test that a macro could
actually override a non-macro method. Happily, it can.

But regardless, a more conservative approach would be to create
standalone, private[scala], macro versions of `List#{map, flatMap,
...}` and use these from performance critical spots in the compiler.
You would then write `fastmap(xs)(_.foo)` rather than `xs map
(_.foo)`, or, with the help of extension methods, `xs fastmap
(_.foo)`.
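A non-macro stand-in showing the call shape described above (a hand-inlined map plus an extension method; the names are illustrative, and the real proposal would make this a macro rather than a plain method):

```scala
object FastOps {
  // hand-"inlined" map over List: one pass, one buffer, no CanBuildFrom
  def fastmap[A, B](xs: List[A])(f: A => B): List[B] = {
    val b = scala.collection.mutable.ListBuffer.empty[B]
    var rest = xs
    while (rest.nonEmpty) {
      b += f(rest.head)
      rest = rest.tail
    }
    b.toList
  }

  // extension method giving the infix `xs fastmap (_.foo)` spelling
  implicit class FastListOps[A](private val xs: List[A]) extends AnyVal {
    def fastmap[B](f: A => B): List[B] = FastOps.fastmap(xs)(f)
  }
}

object FastOpsDemo {
  import FastOps._
  def main(args: Array[String]): Unit = {
    assert(FastOps.fastmap(List(1, 2, 3))(_ + 1) == List(2, 3, 4))
    assert((List(1, 2, 3) fastmap (_ * 2)) == List(2, 4, 6))
  }
}
```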

I see no reason to be afraid of using macros here; if macros were
(hypothetically) to be removed from the language, you could just fall
back on old fashioned compiler magic.

-jason

Paolo G. Giarrusso

Aug 23, 2012, 12:16:26 PM
to scala-i...@googlegroups.com


On Thursday, August 16, 2012 at 2:34:28 AM UTC+2, Paul Phillips wrote:

On Wed, Aug 15, 2012 at 3:17 PM, Grzegorz Kossakowski <grzegorz.k...@gmail.com> wrote:
only then do we realize (if we are lucky) that we can inline the foreach implementation, inline the apply of the closure and throw away the whole class we generated for the closure. That's a huge amount of wasted work.

Not to mention a huge source of bugs.  Much of the confusion inside the compiler arises from transforming into too low-level a form, then expending lots of time and code attempting to roll back the transformation to find out what it was in the first place.  There are a lot of variations on this theme.  I would love to see internal abstractions which better preserve the semantics of the source code and which have been rethought with support for optimization at a higher priority.

The Glasgow Haskell Compiler (GHC) does its optimization on a typed lambda calculus: Haskell after type inference.* Of course, they are cheating (they don't have side effects), but something close sounds sensible for Scala: using Scala after type inference (and before anonymous functions are expanded to inner classes). I wonder whether that's why macros (which run at a similar stage) look useful for early inlining (according to Martin).

Moreover, side effects seem to make no difference for inlining lambda abstractions (that is, functions or methods); they only complicate inlining value definitions like foo in this example:
val foo = //(potentially) effectful computation
//perform some computation which is side-effect-free in a non-obvious way
//use foo once; this use could be inlined, but that's hard to see.
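A concrete version of that hazard, as a sketch: inlining the single use of foo moves its effect past the intervening code, which is observable even though the computed value is unchanged.

```scala
object InlineEffects {
  def main(args: Array[String]): Unit = {
    val log = scala.collection.mutable.Buffer.empty[String]
    def eff(): Int = { log += "effect"; 42 }

    // original shape: the effect runs before the intervening code
    log.clear()
    val foo = eff()
    log += "between"
    val a = foo + 1
    assert(log.toList == List("effect", "between"))

    // "inlined" shape: same value, but the effect is reordered
    log.clear()
    log += "between"
    val b = eff() + 1
    assert(log.toList == List("between", "effect"))
    assert(a == b)
  }
}
```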

* Look on Google Scholar for "A transformation-based optimiser for Haskell" for the general architecture; "Secrets of the Glasgow Haskell Compiler inliner" is also nice reading. Among the first papers I read during my PhD.