I'm already slightly freaking out about this (I'm not sure why I
assumed Array(1, 2, 3) would be efficient, but I did).
I have a lot of performance things which offer some room for improvement. I'll start rolling some of them out here.

When you create a collection, a lot of unnecessary work is done. The cases of particular interest to me are List and Array.

1) List(1, 2, 3) leads to approximately this:

    val arr = new Array[Int](3)
    arr(0) = 1
    arr(1) = 2
    arr(2) = 3
    val wrapped = new WrappedArray(arr)
    List.apply(wrapped)
    val cbf = implicitly[CanBuildFrom[...]]
    val builder = cbf() // new ListBuffer[Int]
    builder ++= wrapped
    builder.result

When it could be done like this:

    new ::(1, new ::(2, new ::(3, Nil)))
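A minimal sketch (editorial, not part of the original message) showing that the hand-built cons chain is observably the same list as the one produced by the varargs apply path; `::` is the real cons class, aliased in the `scala` package:

```scala
object DirectConsDemo {
  def main(args: Array[String]): Unit = {
    // Goes through WrappedArray + CanBuildFrom + ListBuffer:
    val viaApply = List(1, 2, 3)

    // Three cons-cell allocations and nothing else:
    val viaCons = new ::(1, new ::(2, new ::(3, Nil)))

    // List equality is structural, so the two are indistinguishable:
    println(viaApply == viaCons) // prints true
  }
}
```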
only then we realize (if we are lucky) that we can inline the foreach implementation, inline the apply of a closure and throw away the whole class we generate for the closure. That's a huge amount of wasted work.
You should submit a pull request. Greg + I are working on different types of performance measurement. If there's a public branch that's easily findable, we'll be able to measure what kind of performance your patch gives us and ensure community projects build before merging.
I'm strongly against it. Let's wait until people experiment with them in the wild before we make everybody depend on them through the standard library. We need to be careful about how we manage risk.
scala> ArrayUtil.alloc[Int](Nil: _*)
<console>:20: error: no `: _*' annotation allowed here
(such annotations are only allowed in arguments to *-parameters)
ArrayUtil.alloc[Int](Nil: _*)
^
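The error above means the compiler did not see a repeated parameter at that position: `xs: _*` is only legal as an argument to a `T*` parameter. The real signature of `ArrayUtil.alloc` isn't shown in the thread, but a hypothetical vararg version sketches what would make the splice compile (note that `Nil: List[Nothing]` conforms to any `Seq[T]`):

```scala
import scala.reflect.ClassTag

// Hypothetical ArrayUtil with a proper repeated parameter; the actual
// signature being discussed in the thread is unknown.
object ArrayUtil {
  def alloc[T: ClassTag](xs: T*): Array[T] = {
    val arr = new Array[T](xs.length)
    var i = 0
    for (x <- xs) { arr(i) = x; i += 1 }
    arr
  }
}

object SpliceDemo {
  def main(args: Array[String]): Unit = {
    // An empty Seq spliced into the vararg yields an empty array:
    println(ArrayUtil.alloc[Int](Nil: _*).length)   // prints 0
    println(ArrayUtil.alloc(1, 2, 3).mkString(",")) // prints 1,2,3
  }
}
```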
On 15 August 2012 19:16, Paul Phillips <pa...@improving.org> wrote:
I have a lot of performance things which offer some room for improvement. I'll start rolling some of them out here.

When you create a collection, a lot of unnecessary work is done. The cases of particular interest to me are List and Array.

1) List(1, 2, 3) leads to approximately this:

    val arr = new Array[Int](3)
    arr(0) = 1
    arr(1) = 2
    arr(2) = 3
    val wrapped = new WrappedArray(arr)
    List.apply(wrapped)
    val cbf = implicitly[CanBuildFrom[...]]
    val builder = cbf() // new ListBuffer[Int]
    builder ++= wrapped
    builder.result

When it could be done like this:

    new ::(1, new ::(2, new ::(3, Nil)))

I agree that this is annoying and expensive. It's not only about object allocation but also about blowing up bytecode unnecessarily.

I think there's a middle ground between what we have now and macro/builtin compiler support. This reminds me of an idea I had (and Miguel also mentioned to me) of early inlining. The current state of affairs is that, for code with a foreach call like List(1, 2, 3) foreach println, we first expand the lambda to a class, typecheck everything and generate lots of trees, then icode, and only then do we realize (if we are lucky) that we can inline the foreach implementation, inline the apply of the closure and throw away the whole class we generated for the closure. That's a huge amount of wasted work.

The idea of early inlining would be that, for things we are sure we will be able to inline and (as a consequence) eliminate the need for closure allocation, we would keep the Function tree node all the time and handle it properly in the backend. This way we would skip a lot of work.
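An editorial sketch of what early inlining of foreach would buy: no Function1 class generated, no polymorphic foreach dispatch, just the closure body dropped into a loop over the cons cells.

```scala
object InlinedForeachDemo {
  def main(args: Array[String]): Unit = {
    val xs = List(1, 2, 3)

    // What we write (lambda expanded to a class, foreach dispatched):
    xs foreach println

    // Roughly what an early-inlined version could compile down to:
    var rest = xs
    while (rest.nonEmpty) {
      println(rest.head) // closure body inlined at the call site
      rest = rest.tail
    }
  }
}
```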
I agree that saving work is good, and will also help with compile times. But it strikes me that macros are precisely the early inline mechanism we are after. The macro design has already addressed a number of hard problems such as inlining under separate compilation. So I think we should make them work for this as well, instead of inventing a new mechanism.

Concretely, in the end we might be best off making map and friends macros in a couple of key classes such as List. The other benefit of this is that, if the sequence is generic and no macro applies, we fall back to the generic method in TraversableLike. Since this will be the only method visible to the JIT compiler we will still get the JIT optimizations that come with a monomorphic call site. Specializing @inline methods has the downside that they might confuse the JIT compiler.

If we follow this approach, we should _not_ now make inlined versions of these methods in LinearSeqOptimized or IndexedSeqOptimized. It would worsen performance in the short run and hurt in the long run (by silently widening access of members that we want to keep private).
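An editorial sketch of the kind of code such a hypothetical `map` macro on List could expand to, written here as an ordinary method (`mapIntList` is an illustrative name, not an API from the thread): a tight loop over the cons cells instead of the generic builder-driven map inherited from TraversableLike.

```scala
object SpecializedMapDemo {
  // Hand-written equivalent of an inlined List.map for f: Int => Int.
  def mapIntList(xs: List[Int], f: Int => Int): List[Int] = {
    val b = List.newBuilder[Int]
    var rest = xs
    while (rest.nonEmpty) {
      b += f(rest.head) // f would be the inlined closure body
      rest = rest.tail
    }
    b.result()
  }

  def main(args: Array[String]): Unit = {
    println(mapIntList(List(1, 2, 3), _ + 1)) // prints List(2, 3, 4)
  }
}
```

A generic sequence with no applicable macro would still go through the one TraversableLike.map, which is what keeps that call site monomorphic for the JIT.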
On 18 August 2012 13:17, martin odersky <martin....@epfl.ch> wrote:

I agree that saving work is good, and will also help with compile times. But it strikes me that macros are precisely the early inline mechanism we are after. The macro design has already addressed a number of hard problems such as inlining under separate compilation. So I think we should make them work for this as well, instead of inventing a new mechanism.

Concretely, in the end we might be best off making map and friends macros in a couple of key classes such as List. The other benefit of this is that, if the sequence is generic and no macro applies, we fall back to the generic method in TraversableLike. Since this will be the only method visible to the JIT compiler we will still get the JIT optimizations that come with a monomorphic call site. Specializing @inline methods has the downside that they might confuse the JIT compiler.

If we follow this approach, we should _not_ now make inlined versions of these methods in LinearSeqOptimized or IndexedSeqOptimized. It would worsen performance in the short run and hurt in the long run (by silently widening access of members that we want to keep private).

I find this proposal appealing but I would like to see how it works in practice before we decide on anything. Also, I think we should package the whole macro magic into some lightweight syntactic sugar (like @inline). I'd definitely not want to see macros being declared explicitly all over the collections library.
Since macro annotations are far away, I'd opt for an annotation that gets special treatment in the compiler, in the form of triggering a macro that can be defined in the library. I'm thinking of a universal `early inline` macro that could just be applied to all methods we want to inline.

That's something worth experimenting with. If everything works as expected we might consider doing it for 2.11.

--
Grzegorz Kossakowski
On Sat, Aug 18, 2012 at 4:17 PM, Grzegorz Kossakowski <grzegorz.k...@gmail.com> wrote:
I find this proposal appealing but I would like to see how it works in practice before we decide on anything. Also, I think we should package the whole macro magic into some lightweight syntactic sugar (like @inline). I'd definitely not want to see macros being declared explicitly all over the collections library.

I don't see what's qualitatively different between

    @inline def filter(p: T => Boolean): Repr = ...

and

    def filter(p: T => Boolean): Repr = macro filterImpl

Or is it the filterImpl you object to? But that's an implementation detail; users need not look at it.
On Wed, Aug 15, 2012 at 3:17 PM, Grzegorz Kossakowski <grzegorz.k...@gmail.com> wrote:
only then we realize (if we are lucky) that we can inline foreach implementation, inline apply of a closure and throw away the whole class we generate for the closure. That's a huge amount of wasted work.

Not to mention a huge source of bugs. Much of the confusion inside the compiler arises from transforming into too low-level a form, then expending lots of time and code attempting to roll back the transformation to find out what it was in the first place. There are a lot of variations on this theme. I would love to see internal abstractions which better preserve the semantics of the source code and which have been rethought with support for optimization at a higher priority.