Size of standard library and possible tricks to reduce its size

377 views
Skip to first unread message

Todd Vierling

unread,
Oct 18, 2011, 5:19:26 PM10/18/11
to scala-l...@googlegroups.com
So, the library jar is getting kinda bulky, especially after the addition of parallel collections. I've been noticing that there's a whole lot of repetitive trait-stub instantiations -- scala/collection/Iterator$$anon$* are great examples of these (they're all different subtypes of Iterator itself) -- and I wonder if anyone has thought to create some library-private abstract classes to flatten some of the bytecode bloat.

For instance, my personal library has the following tree of private-use types tucked away in a package object. You'll note that they are basically a parallel to the corresponding traits, inheriting from the largest abstract superclass available "with" the more specific trait.

Effectively, these let me create multiple custom subtypes of the corresponding traits, but the swath of trait stubs are only compiled to bytecode exactly once (at the abstract class level). So instead of consuming 21k of bytecode for every subtype of Iterator that I need, it only takes about 1k each. I get similar savings from the other classes in the tree.

These abstract classes for my use take up ~240k of classfile space with 2.9.1, but that hit is only taken once, not upon creation of every subtype. This saves a pretty significant chunk of bytecode, and the same trick would likely work well in the standard library. Thoughts?

package object collection {
  import scala.collection._

  private[collection] abstract class AbstractTraversableOnce[T]
    extends TraversableOnce[T]

  private[collection] abstract class AbstractTraversable[T]
    extends AbstractTraversableOnce[T] with Traversable[T]

  private[collection] abstract class AbstractTraversableProxy[T]
    extends AbstractTraversable[T] with TraversableProxy[T]

  private[collection] abstract class AbstractIterable[T]
    extends AbstractTraversable[T] with Iterable[T]

  private[collection] abstract class AbstractIterableProxy[T]
    extends AbstractIterable[T] with IterableProxy[T]

  private[collection] abstract class AbstractIterator[T]
    extends AbstractTraversableOnce[T] with Iterator[T]

  private[collection] abstract class AbstractMap[A, B]
    extends AbstractIterable[(A, B)] with Map[A, B]

  private[collection] abstract class AbstractMapProxy[A, B]
    extends AbstractMap[A, B] with MapProxy[A, B]

  private[collection] abstract class AbstractSeq[T]
    extends AbstractIterable[T] with Seq[T]

  private[collection] abstract class AbstractSeqProxy[T]
    extends AbstractSeq[T] with SeqProxy[T]

  private[collection] abstract class AbstractSet[T]
    extends AbstractIterable[T] with Set[T]

  private[collection] abstract class AbstractSetProxy[T]
    extends AbstractSet[T] with SetProxy[T]
}

Paul Phillips

unread,
Oct 18, 2011, 5:31:02 PM10/18/11
to scala-l...@googlegroups.com
On Tue, Oct 18, 2011 at 2:19 PM, Todd Vierling <t...@duh.org> wrote:
> So, the library jar is getting kinda bulky, especially after the addition of
> parallel collections. I've been noticing that there's a whole lot of
> repetitive trait-stub instantiations -- scala/collection/Iterator$$anon$*
> are great examples of these (they're all different subtypes of Iterator
> itself) -- and I wonder if anyone has thought to create some library-private
> abstract classes to flatten some of the bytecode bloat.

https://lampsvn.epfl.ch/trac/scala/changeset/20311
https://issues.scala-lang.org/browse/SI-2876
https://lampsvn.epfl.ch/trac/scala/changeset/20490

That was for views. The issue in 2876 doesn't look so bad from this
vantage, but as I recall there was a more serious compiler bug which
was never resolved which made the whole process too tedious.

It's probably doable, but in my experience one tends to discover new
or (worse) old and unfixed compiler bugs when attempting these things,
so you need to be prepared for battle.

Paul Phillips

unread,
Oct 18, 2011, 5:40:06 PM10/18/11
to scala-l...@googlegroups.com
"This week, on Behind the Bytecode... remember 200K meant something?"

https://lampsvn.epfl.ch/trac/scala/changeset/20311

Date: Wed Dec 23 17:57:36 2009 +0000

Created team of private[collection] abstract classes
and traits in scala.collection.views. Factored boilerplate
and base Transformed traits out of *ViewLike classes.
Executive summary and motivation:

4812029 Dec 23 09:47 scala-library.jar // before
4604150 Dec 23 09:24 scala-library.jar // after

Direct size savings of 4.5%. Review by odersky.

% date
Tue Oct 18 14:39:09 PDT 2011
% ls -l lib/scala-library.jar
-rw-r--r-- 1 paulp admin 9824041 Oct 16 09:02 lib/scala-library.jar

martin odersky

unread,
Oct 18, 2011, 5:50:29 PM10/18/11
to scala-l...@googlegroups.com
I believe it's most likely a linearization problem. That is, I believe the change in 20311 affected linearization so that the wrong method (the one raising the UnsupportedOperationException) was called. Linearization can be tricky in a library that uses traits in intricate ways and Scala collections and in particular views are an example of that. Calling it a compiler bug is a bit premature without further evidence.

 -- Martin

Paul Phillips

unread,
Oct 18, 2011, 6:22:12 PM10/18/11
to scala-l...@googlegroups.com
On Tue, Oct 18, 2011 at 2:50 PM, martin odersky <martin....@epfl.ch> wrote:
> I believe it's most likely a linearization problem. That is, I believe the
> change in 20311 affected linearization so that the wrong method (the one
> raising the UnsupportedOperationException) was called. Linearization can be
> tricky in a library that uses traits in intricate ways and Scala collections
> and in particular views are an example of that. Calling it a compiler bug is
> a bit premature without further evidence.

The bug I was talking about is this.

https://issues.scala-lang.org/browse/SI-2897

I trust I'm allowed to call that a bug.

Todd Vierling

unread,
Oct 18, 2011, 6:26:09 PM10/18/11
to scala-l...@googlegroups.com
On Tuesday, October 18, 2011 5:40:06 PM UTC-4, Paul Phillips wrote:
"This week, on Behind the Bytecode... remember 200K meant something?"

    https://lampsvn.epfl.ch/trac/scala/changeset/20311

Based on my snooping around, I can see that there would be substantial savings just by providing Iterator, Traversable, and Iterable. Those three traits account for a pretty big chunk of anon class trait stubs.

Should I take your responses to mean that it would be worth pulling HEAD and making a candidate diff, if it trimmed things notably? My aim would only be to provide library-private abstract classes -- no new traits, as was done in the changesets you linked -- which should, in theory, not further confuse the compiler. (I hope. At least in my private library, it's working fine... fingers crossed.)

martin odersky

unread,
Oct 18, 2011, 6:27:42 PM10/18/11
to scala-l...@googlegroups.com
Yes, I think that would be very worthwhile doing.

Thanks!

 -- Martin

Paul Phillips

unread,
Oct 18, 2011, 6:28:22 PM10/18/11
to scala-l...@googlegroups.com
On Tue, Oct 18, 2011 at 3:26 PM, Todd Vierling <t...@duh.org> wrote:
> Based on my snooping around, I can see that there would be substantial
> savings just by providing Iterator, Traversable, and Iterable. Those three
> traits account for a pretty big chunk of anon class trait stubs.
> Should I take your responses to mean that it would be worth pulling HEAD and
> making a candidate diff, if it trimmed things notably?

Yes, absolutely. (But also take my responses to mean: please see that
the test suite completely passes from scratch, that is "ant all.clean
test" gets all the way to the part where it says yay.)

martin odersky

unread,
Oct 18, 2011, 6:33:54 PM10/18/11
to scala-l...@googlegroups.com
You certainly are. But I do not see yet what it has to do with the problem that was reported earlier. That problem manifested itself with an UnsupportedOperationException at runtime whereas 2897 is a crash at compile time. Also, I do not see how locally defined traits as shown in 2897 enter the picture. There's no use I can see of them in the collection library. But given enough time, I am sure we'll figure it out.

 -- Martin

Paul Phillips

unread,
Oct 18, 2011, 6:37:40 PM10/18/11
to scala-l...@googlegroups.com
On Tue, Oct 18, 2011 at 3:33 PM, martin odersky <martin....@epfl.ch> wrote:
> You certainly are. But I do not see yet what it has to do with the problem
> that was reported earlier.

As it says in the ticket, "this bug may have stood between me and a
solution to SI-2876." In other words, I was restructuring the views to
get the linearization right, and ran directly into that bug. Not
having the luxury of shaving that particular yak, I reverted.

Todd Vierling

unread,
Oct 19, 2011, 1:49:37 AM10/19/11
to scala-l...@googlegroups.com
On Tuesday, October 18, 2011 6:28:22 PM UTC-4, Paul Phillips wrote:

Yes, absolutely.  (But also take my responses to mean: please see that
the test suite completely passes from scratch, that is "ant all.clean
test" gets all the way to the part where it says yay.)

Oh, of course. Well, after throwing only a couple hours of tweaking at it:

$ ls -l dists/scala-2.10.0.r25850-b20111018225736/lib/scala-library.jar 
-rw-r--r-- 1 tvierling tvierling 8697074 2011-10-18 23:13 dists/scala-2.10.0.r25850-b20111018225736/lib/scala-library.jar

$ ls -l build/pack/lib/scala-library.jar 
-rw-r--r-- 1 tvierling tvierling 7825478 2011-10-19 01:48 build/pack/lib/scala-library.jar

The main changes so far: adding AbstractIterator, AbstractIterable, AbstractSet, and some *ViewLike.AbstractTransformed.

Looks like I can probably trim another 200-400k from what I see remaining so far in the collections tree. Will post more status later in the week after giving it some hefty workouts with both the tests and eyeball-glazing proofreading (mainly to ensure that the Abstract* types don't accidentally leak out into public API through methods' return type signatures).

Some of this leads me to wonder whether certain specific types (e.g., AbstractIterator) should be part of the public API after all, as convenience classes for the user. For now, they're confined to private[collection], but if they prove useful and not a speed burden, that's another debate we can have after-the-fact.

Jason Zaugg

unread,
Oct 19, 2011, 3:05:09 AM10/19/11
to scala-l...@googlegroups.com
On Wed, Oct 19, 2011 at 7:49 AM, Todd Vierling <t...@duh.org> wrote:
Oh, of course. Well, after throwing only a couple hours of tweaking at it:

$ ls -l dists/scala-2.10.0.r25850-b20111018225736/lib/scala-library.jar 
-rw-r--r-- 1 tvierling tvierling 8697074 2011-10-18 23:13 dists/scala-2.10.0.r25850-b20111018225736/lib/scala-library.jar

$ ls -l build/pack/lib/scala-library.jar 
-rw-r--r-- 1 tvierling tvierling 7825478 2011-10-19 01:48 build/pack/lib/scala-library.jar

The main changes so far: adding AbstractIterator, AbstractIterable, AbstractSet, and some *ViewLike.AbstractTransformed.

Looks like I can probably trim another 200-400k from what I see remaining so far in the collections tree. Will post more status later in the week after giving it some hefty workouts with both the tests and eyeball-glazing proofreading (mainly to ensure that the Abstract* types don't accidentally leak out into public API through methods' return type signatures).

Some of this leads me to wonder whether certain specific types (e.g., AbstractIterator) should be part of the public API after all, as convenience classes for the user. For now, they're confined to private[collection], but if they prove useful and not a speed burden, that's another debate we can have after-the-fact.

Hi Todd,

Very encouraging results so far!

Martin suggested [1] that one reason for the slowdowns in compilation with Scala 2.9.0 might have been the longer base type sequence for commonly used types after the addition of the GenXxx traits. A bunch of optimizations followed that restored (nay, bettered!) performance for 2.9.1. But you ought to verify that building the compiler with your new compiler isn't slower than before.

-jason

Ismael Juma

unread,
Oct 19, 2011, 7:01:45 AM10/19/11
to scala-l...@googlegroups.com
On Wed, Oct 19, 2011 at 6:49 AM, Todd Vierling <t...@duh.org> wrote:
> Oh, of course. Well, after throwing only a couple hours of tweaking at it:

Great!

I am hopeful that the reduced indirection will also lead to improved
runtime performance. It will be interesting to do some benchmarking.

Best,
Ismael

√iktor Ҡlang

unread,
Oct 19, 2011, 7:21:06 AM10/19/11
to scala-l...@googlegroups.com

Indeed, since it doesn't need to warm up the different implementations. :-)
 

Best,
Ismael



--
Viktor Klang

Akka Tech Lead
Typesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang

Ismael Juma

unread,
Oct 19, 2011, 9:17:21 AM10/19/11
to scala-l...@googlegroups.com
2011/10/19 √iktor Ҡlang <viktor...@gmail.com>:

> Indeed, since it doesn't need to warm up the different implementations. :-)

There is also the JIT inlining aspect. HotSpot relies on bytecode size
to compute what to inline and the indirection introduced by the trait
forwarders may cause problems. I haven't had the chance to do specific
benchmarking around this, but I've seen some performance results that
could be explained by that.

Best,
Ismael

√iktor Ҡlang

unread,
Oct 19, 2011, 9:22:51 AM10/19/11
to scala-l...@googlegroups.com

You mean that the bloat is pushing other optimizable things out of the boundary of the machinecode budget?
 

Best,
Ismael

Ismael Juma

unread,
Oct 19, 2011, 9:43:44 AM10/19/11
to scala-l...@googlegroups.com
2011/10/19 √iktor Ҡlang <viktor...@gmail.com>:

> You mean that the bloat is pushing other optimizable things out of the
> boundary of the machinecode budget?

The bytecode inlining budget, yes. There is also a limit to the number
of methods HotSpot will inline, which is also a potential issue.

If HotSpot relied on the generated code to compute its inlining
budget, then things would be different (and probably better, although
there are other complications in that case).

Best,
Ismael

Todd Vierling

unread,
Oct 19, 2011, 11:57:55 AM10/19/11
to scala-l...@googlegroups.com
On Wednesday, October 19, 2011 1:49:37 AM UTC-4, Todd Vierling wrote:
$ ls -l dists/scala-2.10.0.r25850-b20111018225736/lib/scala-library.jar 
-rw-r--r-- 1 tvierling tvierling 8697074 2011-10-18 23:13 dists/scala-2.10.0.r25850-b20111018225736/lib/scala-library.jar

Another hour of tweaking brought this to:

$ ls -l build/pack/lib/scala-library.jar 
-rw-r--r-- 1 tvierling tvierling 7229411 2011-10-19 11:39 build/pack/lib/scala-library.jar

Yep, that's a ~17% drop so far. Also, by marking a few Abstract*s as private[scala] rather than private[collection], I trimmed scalap, swing, and the compiler a tiny bit too.

There's still more to trim and other tasks to do. After I'm done injecting all the 'Abstract...' layers, I'll review it for possible type linearization issues, run the ant tests, then post it for review and community benchmarking (hell, I'm not sure what to test!) on my webserver: a vanilla dist, the source diff, and a dist with the diff applied.

While doing this, I ran across some cases outside of scala.collection that were pretty useful as well. In particular, the compiler, upon seeing { (x) => foo }, emits an anon class that extends scala.runtime.AbstractFunction1; but explicit inheritances from (A => B) instantiate all the trait stubs in Function1 (and there are a lot of them, thanks to @specialized). So I was able to trim even more by making use of AbstractFunction1 explicitly in a few places.

More news to come...

Todd Vierling

unread,
Oct 19, 2011, 12:38:19 PM10/19/11
to scala-l...@googlegroups.com
On Wednesday, October 19, 2011 11:57:55 AM UTC-4, Todd Vierling wrote:
There's still more to trim and other tasks to do. After I'm done injecting all the 'Abstract...' layers, I'll review it for possible type linearization issues, run the ant tests, then post it for review and community benchmarking (hell, I'm not sure what to test!) on my webserver: a vanilla dist, the source diff, and a dist with the diff applied.

I forgot to mention one pretty big caveat with this size-optimization work: It's going to make scaladoc a little more annoying.

Today, there are some cases where library-private classes exist in a class/trait hierarchy, but they're fairly rare. These changes will make that happen quite a bit more, meaning that some of the supertypes in the type declaration won't be clickable. In practice, all the types are available via the expandable linearization, but as you probably know, that list is pretty huge for a collection class.

I'm open to opinions on how this will affect users in practice. A workaround would be to continue to include the trait immediately after the abstract class type, even if the abstract class is an instantiation of exactly that trait ("extends AbstractSeq[A] with Seq[A] with ..."). This means a little more verbosity in the code, but no bytecode changes in practice -- would you all prefer to see it done this way?

Ismael Juma

unread,
Oct 19, 2011, 12:39:52 PM10/19/11
to scala-l...@googlegroups.com
On Wed, Oct 19, 2011 at 4:57 PM, Todd Vierling <t...@duh.org> wrote:
While doing this, I ran across some cases outside of scala.collection that were pretty useful as well. In particular, the compiler, upon seeing { (x) => foo }, emits an anon class that extends scala.runtime.AbstractFunction1; but explicit inheritances from (A => B) instantiate all the trait stubs in Function1 (and there are a lot of them, thanks to @specialized). So I was able to trim even more by making use of AbstractFunction1 explicitly in a few places.

Yes, this is a known issue. Great that you're looking into it too!

Best,
Ismael

Todd Vierling

unread,
Nov 7, 2011, 4:48:13 PM11/7/11
to scala-l...@googlegroups.com
On Tuesday, October 18, 2011 5:19:26 PM UTC-4, Todd Vierling wrote:
So, the library jar is getting kinda bulky, especially after the addition of parallel collections. I've been noticing that there's a whole lot of repetitive trait-stub instantiations -- scala/collection/Iterator$$anon$* are great examples of these (they're all different subtypes of Iterator itself) -- and I wonder if anyone has thought to create some library-private abstract classes to flatten some of the bytecode bloat.

So after some public and private discussion, and a bit of fiddly work in the scala source tree (mmm, "manual search and replace"), a lot of the ideas here were implemented and committed to trunk today.

To explain what I did, how it managed to reduce scala-library.jar by about a meg and a half, and how this could be applicable to other codebases, I blogged the gory details here: http://blog.duh.org/2011/11/scala-pitfalls-trait-bloat.html

Derek Williams

unread,
Nov 7, 2011, 8:19:15 PM11/7/11
to scala-l...@googlegroups.com
On Mon, Nov 7, 2011 at 2:48 PM, Todd Vierling <t...@duh.org> wrote:
To explain what I did, how it managed to reduce scala-library.jar by about a meg and a half, and how this could be applicable to other codebases, I blogged the gory details here: http://blog.duh.org/2011/11/scala-pitfalls-trait-bloat.html


Good info in there. Thanks for blogging it.

--
Derek Williams

Ruslan Shevchenko

unread,
Nov 9, 2011, 4:11:38 AM11/9/11
to scala-l...@googlegroups.com
Amazing. Interesting, can compiler compute common minimal set of
traits for set of classes, to do such optimization authomatically [?]

Jason Zaugg

unread,
Nov 9, 2011, 4:33:29 AM11/9/11
to scala-l...@googlegroups.com
On Wed, Nov 9, 2011 at 10:11 AM, Ruslan Shevchenko <ruslan.s....@gmail.com> wrote:
Amazing. Interesting, can compiler compute common minimal set of
traits for set of classes, to do such optimization authomatically [?]

Separate compilation probably rules this out.

-jason 

Todd Vierling

unread,
Nov 9, 2011, 4:58:45 PM11/9/11
to scala-l...@googlegroups.com
That, and multiple inheritance. The traits chosen for explicit instantiation were picked carefully by hand, to get maximum benefit with minimal overhead. The compiler may not be able to figure out, efficiently, where the "best" place to insert the abstract class layer is.

RomKal

unread,
Nov 10, 2011, 3:11:06 AM11/10/11
to scala-language
On 9 Lis, 16:58, Todd Vierling <t...@duh.org> wrote:

> The compiler may not be able to figure out, efficiently,
> where the "best" place to insert the abstract class layer is.

I'm wondering if the compiler can do such thing at all. I was not
reading through the whole scala spec, but in case of Java some
standards clearly say what is the mapping of some language constructs
to classes generated from the compiler (how to name anonymous classes
for example). If it is also the case for scala I don't think we can
freely generate additional classes just because it might be more
efficient from the memory and disk usage point of view.

Moreover such automatically generated classes might need to be somehow
hidden. When you look at the inheritance tree you don't want to see
classes you don't inherit from or ones that were never in any code.

Roman

Pavel Pavlov

unread,
Dec 1, 2011, 5:48:11 AM12/1/11
to scala-l...@googlegroups.com
What if the compiler will create 'abstract class Foo$AC extends Foo' for every trait Foo?
Then it will be able to rewrite every definition of the form 'class C extends Foo ...' to 'class C extends Foo$AC ...', i.e. replace only the first supertrait by generated superclass.
This will catch all the uses like 'new Iterator {...}' or 'C extends A => B' in both library and user's code.
Comparing to the now used scheme, extra code will be generated only for those traits which are never used as first superclass.
 
Moreover, it makes sense to combine Foo$class and Foo$AC into one class. This will help to futher reduce bytecode size, as most of constant pool entries will be unified in these classes.
So, Foo$class will contain two versions of every non-abstract Foo's method: static method with original method's body (as now) and instance method with stub that invokes that static.
 
What do you think of this?
 

Paul Phillips

unread,
Dec 1, 2011, 11:21:58 AM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 2:48 AM, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> What do you think of this?

Offhand, I think that's a freaking great idea.

Todd Vierling

unread,
Dec 1, 2011, 11:25:51 AM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 5:48 AM, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> What if the compiler will create 'abstract class Foo$AC extends Foo' for
> every trait Foo?

I'm wary of this increasing size in the standard library, though it
would indeed help the case of user code.

It makes me wonder if perhaps an annotation to enable this behavior
per-trait might be a better choice.

--
-- Todd Vierling <t...@duh.org> <t...@pobox.com>

√iktor Ҡlang

unread,
Dec 1, 2011, 11:29:08 AM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 5:25 PM, Todd Vierling <t...@duh.org> wrote:
On Thu, Dec 1, 2011 at 5:48 AM, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> What if the compiler will create 'abstract class Foo$AC extends Foo' for
> every trait Foo?

I'm wary of this increasing size in the standard library, though it
would indeed help the case of user code.

It makes me wonder if perhaps an annotation to enable this behavior
per-trait might be a better choice.

The point is the other way around, avoiding splicing in implementations in all subtypes.
It wouldn't need to emit the Foo$AC if it's a purely virtual trait.
 

--
-- Todd Vierling <t...@duh.org> <t...@pobox.com>

Todd Vierling

unread,
Dec 1, 2011, 11:37:25 AM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 11:29 AM, √iktor Ҡlang <viktor...@gmail.com> wrote:
>> I'm wary of this increasing size in the standard library, though it
>> would indeed help the case of user code.
>>
>> It makes me wonder if perhaps an annotation to enable this behavior
>> per-trait might be a better choice.
>
> The point is the other way around, avoiding splicing in implementations in
> all subtypes.

Yes, I know. I like the idea, but I wonder if it will cause more
strangeness than it solves. Proof of concept would probably be the
only way to find out.

> It wouldn't need to emit the Foo$AC if it's a purely virtual trait.

True, but nearly all traits in the standard library have at least one
implemented method. I would definitely prefer overloading Foo$class
instead, though, since it will already be in use (for static methods)
and would mean one less class file added to the mix.

Pavel Pavlov

unread,
Dec 1, 2011, 11:43:46 AM12/1/11
to scala-l...@googlegroups.com
Note that all redirection stubs will disappear from all classes (and traits) which implement trait Foo as first supertrait (super-supertrait etc.), only one copy of these stubs will remain - in the class Foo$AC.
So I doubt the library size has any chance to be increased.
 

Paul Phillips

unread,
Dec 1, 2011, 11:50:14 AM12/1/11
to scala-l...@googlegroups.com

I agree, it was this aspect which was so immediately appealing. All
those forwarders add a ton of weight. Also, my (unsubstantiated)
guess is hotspot will do much better at optimizing with an instance
method calling static in the same class vs. a forwarder passing 'this'
to a separate class. At least, I seriously doubt it'll do worse.

Pavel Pavlov

unread,
Dec 1, 2011, 11:58:03 AM12/1/11
to scala-l...@googlegroups.com
Yes, I know. I like the idea, but I wonder if it will cause more strangeness than it solves.
 
To keep inheritance hierarchy consistent it can be done in slightly different manner:
1) Compiler generates "abstract class Foo$class extends Foo" with instance stubs and static impl. methods as proposed above.
2) Scala's "class C extends Foo with Bar { ... }" is translated to Java's "class C extends Foo$class implements Foo, Bar { ... }"
  instead of "class C extends Object implements Foo, Bar { ... }" as it's done now.
 
This way we'll have symmetric interface inheritance at JVM level: "C <: Foo, C <: Bar", instead of "C <: Foo$class, Foo$class <: Foo, C <: Bar" as I proposed at first.
 

Pavel Pavlov

unread,
Dec 1, 2011, 12:21:04 PM12/1/11
to scala-l...@googlegroups.com
As regards to JIT/HotSpot optimizations and overall performance:
1) Having one forwarder method instead of many leads to faster warm-up of these forwarder(s) and thus earlier compilation and optimization of these forwarders by JIT.
2) More polymorfic/megamorfic call sites (virtual/interface calls to forwarders) became monomorfic at run-time, it will directly improve aggressivenes of inline and other optimizations.
3) Because of (2) and well-known optimistic optimization strategy of HotSpot's JIT the number of deoptimization/repeated recompilation cases may somewhat decrease.
4) At the hardware level, pollution of code cache and branch prediction cache should somewhat decrease.
5) Small static methods (getters/setters) will be inlined by HotSpot into forwarders early, at the first compilation of the forwarder, if they will reside in the same class.
 

martin odersky

unread,
Dec 1, 2011, 12:38:02 PM12/1/11
to scala-l...@googlegroups.com
I don't know about doing it everywhere. Many traits are meant to be mixed into other traits before they are instantiated. Examples are all ...Like traits in the collection classes. Generating an abstract class for each of them would be pure waste, since no class ever inherits from them in first position.

On the other hand, maybe an annotation could do the trick of giving more precise control to implementers without having to do a lot of typing. Something like

  @classbacked trait Iterator { ... }

And in that case, I agree we should try to combine the abstract class with the implementation class. So both abstract class and implementation class could be called Iterator$class, which is neat -  no new classfiles!

There's one tricky issue: A static method (like the ones in the implementation class) may not collide with a dynamic method (like the ones in the abstract class), where collide means "have the same name and type signature". Collisions are possible, for instance in the following case:

  @classbacked trait T {
     def foo(x: T) = ???
     def foo() = ???
  }

This will generate:

  class T$class {
     def foo(x: T) = T$class.foo(this, x)
     def foo() = T$class.foo(this)

     static def foo(_this: T, x: T) = ???
     static def foo(_this: T) = ???
   }

Note the collision between the first instance foo and the last static foo.

The compiler needs to check for collisions when generating T$class.
Fortunately, there's a remedy: In case of collision, simply do not generate the instance member.

Cheers

 -- Martin





--
Martin Odersky
Prof., EPFL and Chairman, Typesafe
PSED, 1015 Lausanne, Switzerland
Tel. EPFL: +41 21 693 6863
Tel. Typesafe: +41 21 691 4967

Pavel Pavlov

unread,
Dec 1, 2011, 12:43:38 PM12/1/11
to scala-l...@googlegroups.com
Such scheme also solves previously discussed problem with scaladoc and private classes in the collections hierarchy:
Foo$class is superclass of C only at JVM level. At Scala level, it is invisible at all, just like interface ScalaObject is invisible now.

Paul Phillips

unread,
Dec 1, 2011, 12:44:27 PM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 9:38 AM, martin odersky <martin....@epfl.ch> wrote:
> And in that case, I agree we should try to combine the abstract class with
> the implementation class. So both abstract class and implementation class
> could be called Iterator$class, which is neat -  no new classfiles!

One issue with this is that it makes it unavailable to users unless
they're willing to extend a class with a '$' in the name, which should
definitely make them very nervous. So I think there should be a non-$
class which extends the $class class, the name of which could be
derived automatically or could be supplied by the annotation.

Paul Phillips

unread,
Dec 1, 2011, 12:48:51 PM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 9:43 AM, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> Foo$class is superclass of C only at JVM level. At Scala level, it is
> invisible at all, just like interface ScalaObject is invisible now.

Oh, and that reminds me, it might give us a shot at a subset of these:

https://issues.scala-lang.org/browse/SI-2296

Basically, we get burned on accessing protected members in java
classes because the jvm requires you be in a subclass of the actual
class whereas in scala the code performing the access, if defined in a
trait, will not be in a subclass but called through a forwarder.

Todd Vierling

unread,
Dec 1, 2011, 12:49:10 PM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 12:44 PM, Paul Phillips <pa...@improving.org> wrote:
> One issue with this is that it makes it unavailable to users unless
> they're willing to extend a class with a '$' in the name, which should
> definitely make them very nervous.  So I think there should be a non-$
> class which extends the $class class, the name of which could be
> derived automatically or could be supplied by the annotation.

I believe the point would be for the compiler to detect the presence*
of the backing class and pull it in automatically by linearization
order: first inheritance entry. So inheritance would still be simply
"class MyClass extends MyTrait ...".

(* assuming that this pre-instantiation might not be universal to all
traits - see my and Martin's notes about possibly controlling this
code generation via annotation.)

martin odersky

unread,
Dec 1, 2011, 12:52:34 PM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 6:49 PM, Todd Vierling <t...@duh.org> wrote:
On Thu, Dec 1, 2011 at 12:44 PM, Paul Phillips <pa...@improving.org> wrote:
> One issue with this is that it makes it unavailable to users unless
> they're willing to extend a class with a '$' in the name, which should
> definitely make them very nervous.  So I think there should be a non-$
> class which extends the $class class, the name of which could be
> derived automatically or could be supplied by the annotation.

I believe the point would be for the compiler to detect the presence*
of the backing class and pull it in automatically by linearization
order: first inheritance entry. So inheritance would still be simply
"class MyClass extends MyTrait ...".

I agree. It would be best if we did not need to invent a user-visible name for the class.

Cheers

 -- Martin

Pavel Pavlov

unread,
Dec 1, 2011, 1:23:09 PM12/1/11
to scala-l...@googlegroups.com

On Friday, December 2, 2011 12:38:02 AM UTC+7, martin odersky wrote:
I don't know about doing it everywhere. Many traits are meant to be mixed into other traits before they are instantiated. Examples are all ...Like traits in the collection classes. Generating an abstract class for each of them would be pure waste, since no class ever inherits from them in first position.

On the other hand, maybe an annotation could do the trick of giving more precise control to implementers without having to do a lot of typing. Something like

  @classbacked trait Iterator { ... }
 
It would be great to measure effect of both techniques (annotation-based and generate-forwarders-by-default).
 
I am not a big fan of generating useless code but generate-by-default approach have one appealing property, I think:
No matter how many times you use (extend) a trait you'll have only one copy of forwarders.
So you never will run into the situation as with Iterator now: one simple line "new Iterator { ...}" in the source code leads to generation of unpredictable large amount of bytecode (how many methods will have Iterator in Scala 3.33?).
 
Yes, when we generate forwarders by default the size of bytecode can increase comparing to what we have now, but overall bytecode size will remain proportional to source code size. On the other hand, we never see disproportional/unpredictable code size bloat (in case of single trait inheritance) as we have now. Annotation-based scheme will not cure this problem.
 
Also, generate-by-default scheme will somewhat weaken binary incompatibility:
In case of adding new concrete method to the Iterator, you won't need to recompile all the code that single-inherit Iterator.
 
 
 

There's one tricky issue: A static method (like the ones in the implementation class) may not collide with a dynamic method (like the ones in the abstract class), where collide means "have the same name and type signature". Collisions are possible, for instance in the following case:

  @classbacked trait T {
     def foo(x: T) = ???
     def foo() = ???
  }

This will generate:

  class T$class {
     def foo(x: T) = T$class.foo(this, x)
     def foo() = T$class.foo(this)

     static def foo(_this: T, x: T) = ???
     static def foo(_this: T) = ???
   }

Note the collision between the first instance foo and the last static foo.

The compiler needs to check for collisions when generating T$class.
Fortunately, there's a remedy: In case of collision, simply do not generate the instance member.
 
...or use some sort of mangling magic:
 
  class T$class {
     def foo(x: T) = T$class.foo$(this, x)
     def foo() = T$class.foo$(this)

     static def foo$(_this: T, x: T) = ???
     static def foo$(_this: T) = ???
   }

Todd Vierling

unread,
Dec 1, 2011, 2:02:48 PM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 1:23 PM, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> I am not a big fan of generating useless code but generate-by-default
> approach have one appealing property, I think:
> No matter how many times you use (extend) a trait you'll have only one copy
> of forwarders.

Only in cases where it's the first trait. For multiple inheritance
cases, the additional traits will always have additional forwarders
(there's no way around this).

Pavel Pavlov

unread,
Dec 1, 2011, 2:17:22 PM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 1:23 PM, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> I am not a big fan of generating useless code but generate-by-default
> approach have one appealing property, I think:
> No matter how many times you use (extend) a trait you'll have only one copy
> of forwarders.

Only in cases where it's the first trait. For multiple inheritance
cases, the additional traits will always have additional forwarders
(there's no way around this).

Of course.

Interesting, how many there are such cases comparing to single inheritance of traits?

Note that if you have
  trait A extends B with C
  trait D extends A with E
  class F extends D with G with A with E

then inheritance of A and D will be "single" in the terms of generating forwarders, i.e. in this sample:
forwarders for A, B, D will be generated once;
for C - twice (in C$class & A$class);
for E - twice (in E$class & D$class);
for G - twice (in G$class & F).

Todd Vierling

unread,
Dec 1, 2011, 2:54:44 PM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 2:17 PM, Pavel Pavlov <pavel.e...@gmail.com> wrote:
>> Only in cases where it's the first trait. For multiple inheritance
>> cases, the additional traits will always have additional forwarders
>> (there's no way around this).
>
> Of course.
>
> Interesting, how many there are such cases comparing to single inheritance
> of traits?

Look at any collection class's inheritance hierarchy. :)

That doesn't make the idea bad; it just clarifies that the generated
forwarders will still happen in some cases.

Pavel Pavlov

unread,
Dec 1, 2011, 3:10:54 PM12/1/11
to scala-l...@googlegroups.com
> Interesting, how many there are such cases comparing to single inheritance
> of traits?

Look at any collection class's inheritance hierarchy. :)

That doesn't make the idea bad; it just clarifies that the generated
forwarders will still happen in some cases.

Ok, I just noted that even in case of multiple inheritance there are still some "good cases" like in the example above.

Anyway, we always have the "ultimate" method available: to rewrite the code manually like you already did with collection classes :)

John Nilsson

unread,
Dec 1, 2011, 3:10:48 PM12/1/11
to scala-l...@googlegroups.com

Out of general curiosity, two questions:

Is it jar-size or permgen-size that is the issue?

If the former, why not generate the classes at runtime?

BR,
John

Pavel Pavlov

unread,
Dec 1, 2011, 3:20:48 PM12/1/11
to scala-l...@googlegroups.com

Out of general curiosity, two questions:

Is it jar-size or permgen-size that is the issue?

If the former, why not generate the classes at runtime?

If you ask me, then I'm is very strongly biased against this idea.

Maybe because I'm writing AOT native compiler in Scala and loosing self-applicability is my worst nightmare (we will be constrained to downgrade our codebase from Scala to Java in that case), maybe it is just a matter of taste. :)

Todd Vierling

unread,
Dec 1, 2011, 3:54:46 PM12/1/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 3:10 PM, John Nilsson <jo...@milsson.nu> wrote:
> Out of general curiosity, two questions:
>
> Is it jar-size or permgen-size that is the issue?
>
> If the former, why not generate the classes at runtime?

Bytecode generation should be avoided if possible, as it prevents:
- targeting alternative VMs (Android Dalvik)
- ahead-of-time compilation (gcj)
- linkage needed to run obfuscation/optimization tools (RetroGuard)

It can also make debugging significantly more difficult, and may bloat
JIT memory footprint.

John Nilsson

unread,
Dec 1, 2011, 5:01:00 PM12/1/11
to scala-l...@googlegroups.com

Ok, fair enough. I guess most concerns could be addressed by different compilation backends or an optional postprocessing step, but I can see how its enough of an issue to not warrant the added complexity.

Thanks for the info!
John

Pavel Pavlov

unread,
Dec 1, 2011, 9:43:22 PM12/1/11
to scala-l...@googlegroups.com
I don't know about doing it everywhere. Many traits are meant to be mixed into other traits before they are instantiated. Examples are all ...Like traits in the collection classes. Generating an abstract class for each of them would be pure waste, since no class ever inherits from them in first position.

I've just noticed that I haven't stated this clearly enough:
There's no need to generate forwarders for traits used as first supertrait in another traits, as well as in classes.


Maybe someone will be able to collect some stats on the forwarders counts?
I place the sketch of (pseudo)code here:

======================================
type Klazz // class or trait
type Trait <: Klazz
type Class <: Klazz
type MethodHost = Klazz
type Method = (MethodName, MethodSig, MethodHost)

def methodDecls(k: Klazz): Set[Method] = ??? // builds set of all non-abstract methods for k; linearization magic works here

def klazzFwdCount(k: Klazz, skipFirstSuperTrait: Boolean): Int = {
  var mset = methodDecls(k)
  val sup = k.supers.toList
  sup match {
    case (s1: Trait) :: _ if skipFirstSuperTrait =>
      mset = mset &~ methodDecls(s1) // filter out non-overriden methods from first supertrait
    case (s1: Class) :: _ =>
      mset = mset &~ methodDecls(s1) // filter out non-overriden methods from the superclass
    case _ =>
  }
  k match {
    case _: Class =>
      mset = mset collect { case m@(_, _, host) if host != k => m } // filter out methods declared in k
    case _ =>
  }
  mset.size

}

def allClasses = allKlazzes collect { case c: Class => c }
def allTraits = allKlazzes collect { case t: Trait => t }

def notUsedAsFirstSuper: Seq[Trait] = {
  var notUsed: Set[Trait] = Set.empty
  def isFirstSuper(t: Trait) = {
    val p = allKlazzes find { k => !notUsed(k) && k.supers startsWith Seq(t) }
    !p.isEmpty
  }
  var changed = true
  while (changed) {
    changed = false
    for (t <- allTraits; if !isFirstSuper(t)) {
      notUsed += t
      changed = true
    }
  }
  notUsed.toSeq
}

def countStat() {
  val currFwds = allClasses.map(klazzFwdCount(_, false)).sum
  val smartFwds = allKlazzes.map(klazzFwdCount(_, true)).sum
  val notUsedFwds = notUsedAsFirstSuper.map(klazzFwdCount(_, true)).sum
  println("Forwarders count: %d; expected: %d, of which not used: %d", currFwds, smartFwds, notUsedFwds)
}
======================================

martin odersky

unread,
Dec 2, 2011, 2:52:39 AM12/2/11
to scala-l...@googlegroups.com
I prefer not to mangle the static foo's because sometimes Java interop depends on them. But either way it's not a big problem.

Cheers

 -- Martin

RomKal

unread,
Dec 2, 2011, 6:10:34 AM12/2/11
to scala-language
Btw will all those problems be solved easily by introduction of java 8
and modified JVM?

There is this feature with default interface implementations [1] that
AFAIK requires modified JVM, but... should drastically reduce the size
of scala library.

Maybe it's better to wait a little for java 8 and have it solved
ultimatelly? Or then at least have separate java8 version or standard
scala library?

Roman

[1] http://stackoverflow.com/questions/7857832/are-defaults-in-jdk-8-a-form-of-multiple-inheritance-in-java

Johannes Rudolph

unread,
Dec 2, 2011, 6:37:02 AM12/2/11
to scala-l...@googlegroups.com
Here are the corresponding slides:

http://www.wiki.jvmlangsummit.com/images/a/a1/2011_Goetz_Extension_Slides.pdf

Scala may benefit from this feature but it will be some years until
the Java 8 JVM would feasibly be a target for Scala.

--
Johannes

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net

Paul Butcher

unread,
Dec 2, 2011, 6:48:28 AM12/2/11
to scala-l...@googlegroups.com
In addition, consider that this problem most severely affects limited platforms like Android/Dalvik. I suspect that Java 8 will take even longer to become available on that kind of platform.

--
paul.butcher->msgCount++

Snetterton, Castle Combe, Cadwell Park...
Who says I have a one track mind?

http://www.paulbutcher.com/
LinkedIn: http://www.linkedin.com/in/paulbutcher
MSN: pa...@paulbutcher.com
AIM: paulrabutcher
Skype: paulrabutcher

Todd Vierling

unread,
Dec 2, 2011, 9:49:34 AM12/2/11
to scala-l...@googlegroups.com
On Thu, Dec 1, 2011 at 9:43 PM, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> I've just noticed that I haven't stated this clearly enough:
> There's no need to generate forwarders for traits used as first supertrait
> in another traits, as well as in classes.

How would the compiler know the difference?

In my (maybe temporary?) changes to the collection classes, there is a
hierarchy that looks like this:

abstract class AbstractTraversable[+A] extends Traversable[A]
abstract class AbstractIterable[+A] extends AbstractTraversable[A]
with Iterable[A]
abstract class AbstractSeq[+A] extends AbstractIterable[A] with Seq[A]

This does involve an inheritance hierarchy because the number of new
forwarders at each level is non-trivial. However, I didn't create an
AbstractTraversableOnce because its methods are almost completely
overridden at both the Iterator and Traversable subtrait levels. I
also didn't create classes for Gen* because those are also mostly
overridden, and not directly inherited.

If this were done only as the following, the bytecode (and number of
redundant forwarders) would be notably larger:

abstract class AbstractTraversable[+A] extends Traversable[A]
abstract class AbstractIterable[+A] extends Iterable[A]
abstract class AbstractSeq[+A] extends Seq[A]

This is the I prefer the idea of a compiler annotation. It allows for
finer-grain control over the process, so that there is not a
proliferation of either extra superclasses or redundant forwarders.

Pavel Pavlov

unread,
Dec 2, 2011, 11:27:32 PM12/2/11
to scala-l...@googlegroups.com

пятница, 2 декабря 2011 г. 21:49:34 UTC+7 пользователь Todd Vierling написал:
On Thu, Dec 1, 2011 at 9:43 PM, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> I've just noticed that I haven't stated this clearly enough:
> There's no need to generate forwarders for traits used as first supertrait
> in another traits, as well as in classes.

How would the compiler know the difference?

 
Ok, to be more specific on this, let us dive into all the tedious details. I will use Scala 2.9.1 here for getting experimental data.

Starting from the top of collections hierarchy, GenTraversableOnce contains 1 non-abstract method ("/:\") and thus its $class now contain 2 static methods ("/:\" and "$init$") and will contain 1 extra instance method with the proposed scheme (forwarder for "/:\").
Its contents at bytecode level can be approximated here by contents of the "abstract class GenTraversableOnce$AC[+T] extends AnyRef with GenTraversableOnce[T]" plus current contents of the GenTraversableOnce$class.

Now let's skip some traits and look at Traversable.

If we declare its forwarders class as "abstract class Traversable$AC[+T] extends AnyRef with Traversable[T]" then it will contain 111 forwarders (including those for all supertraits), whereas its $class now contains only 4 static impl. methods.
But if we replace "AnyRef" here with first Traversable supertrait's forwarder class (TraversableLike$AC), then Traversable$AC will have only 13 forwarders, of which 4 are forwarders for Traversable's own static impl. methods, and other 9 comes from all supertraits but first (GenTraversable, TraversableOnce, GenericTraversableTemplate).
The size of Traversable$AC.class is reduced then from 26647 to 4334 bytes.

Loosing weight for Traversable requires creation of class TraversableLike$AC, which I just defined as "abstract class TraversableLike$AC[+T, +R] extends AnyRef with TraversableLike[T, R]". This version of TraversableLike$AC contain now 98 forwarders, class size is 22632 bytes, so all fat from Traversable$AC is moved here.
Doing the same operation with TraversableLike (replacing AnyRef with first supertrait) won't give us much as first supertrait of TraversableLike is (unfortunately!) HasNewBuilder. The situation here can be made better only by manual reordering of TraversableLike supertraits.
But anyway, the code size of these two classes (22632 + 4334 bytes) is virtually the same as code size of _only one use_ of Traversable as first supertrait (like in "new Traversable {...}") in the current scheme (~26647 bytes)!
So I think this is clear win.

Now let's see what we have for other collection traits:

Iterable:
32383 bytes, 134 forwarders (added on every use) becomes 11336 bytes, 47 forwarders (added once)
Seq (again unlucky case, first supertrait is PartialFunction):
54186 bytes, ~280 forwarders (added on every use) => 44151 bytes, 200 forwarders (added once)
LinearSeq:         55220 b,~280 fwd =>  4114 b,  20 fwd
IndexedSeq:        55283 b, 280 fwd =>  4379 b,  21 fwd
imm.Traversable:   27344 b, 112 fwd =>  2054 b,   6 fwd
imm.Iterable:      33314 b, 136 fwd => 12174 b,  50 fwd
imm.Seq:           55415 b,~280 fwd => 27589 b, 168 fwd
imm.LinearSeq:     56529 b, 284 fwd =>  4431 b,  21 fwd
imm.IndexedSeq:    56599 b, 284 fwd =>  4971 b,  23 fwd
muta.Traversable:  27250 b, 112 fwd =>  2024 b,   6 fwd
muta.Iterable:     33196 b, 136 fwd => 12110 b,  50 fwd
muta.Seq:          56018 b,~285 fwd => 28283 b, 170 fwd
muta.LinearSeq:    57208 b,~290 fwd =>  4283 b,  20 fwd
muta.IndexedSeq:   57972 b, 290 fwd =>  5587 b,  27 fwd
muta.Buffer:       63377 b, 314 fwd => 10807 b,  47 fwd
Set:               47927 b,~243 fwd => 38691 b,~168 fwd (1st super: Function1)
imm.Set:           49007 b, 245 fwd => 21355 b, 133 fwd (1st super: imm.Iterable)
muta.Set:          55370 b,~290 fwd => 27534 b, 164 fwd (1st super: muta.Iterable)
imm.SortedSet:     52284 b, 261 fwd =>  6702 b,  32 fwd
Map:               49041 b, 237 fwd => 20572 b, 123 fwd
imm.Map:           52760 b,~264 fwd => 24523 b, 139 fwd
muta.Map:          58590 b,~291 fwd => 29672 b, 165 fwd
Iterator:          21479 b, ~95 fwd => 11409 b,  52 fwd
PartialFunction:   11075 b,  78 fwd =>  1892 b,   4 fwd
Function1:         10213 b,  75 fwd => 10213 b,  75 fwd

So, you can see that even in unlucky cases (when first supertrait does not provide major part of non-abstract methods, comparing to other supertraits), the overall number of forwarders as well as class file size is still decreased.

Now let's look at some real uses of traits in the library's code.

I've grep'ed collections/immutable and found 10 "good" subclasses of immutable.Set:
BitSet, HashSet, ListSet, EmptySet, Set1, Set2, Set3, Set4, MapProxy.keySet, SetProxy.newProxy (both through SetProxy).
Having forwarders class for imm.Set will lead to (10*49007-21355) = ~458K reduction of code size, (10*245-133) = 2317 lesser methods.

On the other hand, trait imm.SortedSet is never used as first supertrait in collections/immutable, so there we'll have extra unused 32 forwarders weighting 6702 bytes. Nevertheless, these 6K is worth to be generated, if there are some "good uses" in other parts of the library (having only 1 such use is enough to recoup all costs) or there are good chances such uses may occur in the code of library users.


So, as I wrote in my first message, code bloat for such unused-as-first-supertrait traits is the only (and unavoidable!) weak spot of proposed scheme, as far as I can see.
Anyway, I hope the amount of such unused code will remain much smaller than removed code duplication for most of real-world code bases.

I still cannot see how one can remove comparable amount of duplicated code without generate-forwarders-by-default approach.
Annotation-based scheme can help reduce code duplication in already known problem places (which must be found manually), but it will not prevent arising such problems in the future (or worse, at the library user's side).

Maybe, annotation with opposite default ("@NoForwarders trait A") will be enough there?

Todd Vierling

unread,
Dec 3, 2011, 12:05:48 AM12/3/11
to scala-l...@googlegroups.com
On Fri, Dec 2, 2011 at 11:27 PM, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> Loosing weight for Traversable requires creation of class
> TraversableLike$AC, which I just defined as "abstract class
> TraversableLike$AC[+T, +R] extends AnyRef with TraversableLike[T, R]". This
> version of TraversableLike$AC contain now 98 forwarders, class size is 22632
> bytes, so all fat from Traversable$AC is moved here.

That's the issue here: There's a much more complicated hierarchy than
just one or two traits. It's pretty extensive inside scala.collection,
and can result in some pretty deep inheritance trees for the final
concrete classes.

So I'm not so sure that it's a clear win. It it could even be
detrimental (to actual runtime overhead) to create abstract class
layers for everything, including traits like TraversableLike which are
never meant to be inherited directly.

Pavel Pavlov

unread,
Dec 3, 2011, 1:20:38 AM12/3/11
to scala-l...@googlegroups.com
You are right, deep inheritance hierarchy can be a problem.
In this case hovewer, we already have deep inheritance hierarchy through interfaces at JVM level, and adding forwarder classes won't make class inheritance height deeper than current interface inheritance height.

As I can see now, bigger height can affect three aspects adversely at JVM level:
1) slower virtual/interface calls due to more casts/call site cache violations.
2) slower class/interface casts
3) bigger class metadata, bigger number of VMTs/IMTs.

Here in Scala collections we now do most of the calls via invokeinterface (as formal types almost everywhere are traits, not classes). These calls will not be affected, as count of superinterfaces and interface method tables remain the same at VM level.
Interestingly enough, with forwarder classes some of the invokeinterface calls can be translated into cheaper invokevirtual ones.
Class casts there also seems not to be a big problem.

Third issue however can be a problem for such VMs as Dalvik. Here we factor out many identical forwarder methods and pay for this by increased size of metadata and increaed number of VMTs.
Note however, that overall count of loaded classes will not increase - we already load T.class & T$class.class for each inherited interface, and initialize every T$class on object creation (as we call each T$class.$init$). The overall number of declared methods in all those classes will also not be increased substantially. Thus more VMTs (more precisely, same number of bigger VMTs) seems to be major potential problem for me.

Anyway, it's better to have all this issues to be profiled and checked in reality.



Pavel Pavlov

unread,
Dec 3, 2011, 2:18:01 AM12/3/11
to scala-l...@googlegroups.com


суббота, 3 декабря 2011 г. 13:20:38 UTC+7 пользователь Pavel Pavlov написал:
3) bigger class metadata, bigger number of VMTs/IMTs.
...
Third issue however can be a problem for such VMs as Dalvik. Here we factor out many identical forwarder methods and pay for this by increased size of metadata and increaed number of VMTs.
Note however, that overall count of loaded classes will not increase - we already load T.class & T$class.class for each inherited interface, and initialize every T$class on object creation (as we call each T$class.$init$). The overall number of declared methods in all those classes will also not be increased substantially. Thus more VMTs (more precisely, same number of bigger VMTs) seems to be major potential problem for me.

Just remembered of so obvious optimization of abstract classes st JVM level: full elimination of IMT tables as they will never be used at runtime. Don't know if Dalvik does it or not.

Pavel Pavlov

unread,
Dec 3, 2011, 6:41:02 AM12/3/11
to scala-l...@googlegroups.com
 
...

Its contents at bytecode level can be approximated here by contents of the "abstract class GenTraversableOnce$AC[+T] extends AnyRef with GenTraversableOnce[T]" plus current contents of the GenTraversableOnce$class.
...
If we declare its forwarders class as "abstract class Traversable$AC[+T] extends AnyRef with Traversable[T]" ...
But if we replace "AnyRef" here with first Traversable supertrait's forwarder class (TraversableLike$AC), ...
...

Loosing weight for Traversable requires creation of class TraversableLike$AC, which I just defined as "abstract class TraversableLike$AC[+T, +R] extends AnyRef with TraversableLike[T, R]".

Of course, all these transformations should be done by the compiler and will not require any assistance from the programmer.
All user's code will be automatically & transparently transformed, i.e. all the changes will be visible on the JVM level, on the Scala level nothing will change.

Daniel Sobral

unread,
Dec 5, 2011, 9:23:03 AM12/5/11
to scala-l...@googlegroups.com
A question on Stack Overflow led me to a new thought on this matter.
Assume the following got implemented:

1. A class is generated for every trait.
2. If something "extends" a trait, then the corresponding class is
used instead of forwarders.

Now, assume this _also_ gets implemented:

3. Add constructor parameters to traits.

In such a case, would there be any point in keeping "class" as a
language construct instead of an implementation detail?

On Thu, Dec 1, 2011 at 15:38, martin odersky <martin....@epfl.ch> wrote:
>
>
> On Thu, Dec 1, 2011 at 11:48 AM, Pavel Pavlov <pavel.e...@gmail.com>
> wrote:
>>
>> What if the compiler will create 'abstract class Foo$AC extends Foo' for
>> every trait Foo?
>> Then it will be able to rewrite every definition of the form 'class C
>> extends Foo ...' to 'class C extends Foo$AC ...', i.e. replace only the
>> first supertrait by generated superclass.
>> This will catch all the uses like 'new Iterator {...}' or 'C extends A =>
>> B' in both library and user's code.
>> Comparing to the now used scheme, extra code will be generated only for
>> those traits which are never used as first superclass.
>>
>> Moreover, it makes sense to combine Foo$class and Foo$AC into one class.
>> This will help to futher reduce bytecode size, as most of constant pool
>> entries will be unified in these classes.
>> So, Foo$class will contain two versions of every non-abstract Foo's
>> method: static method with original method's body (as now) and instance
>> method with stub that invokes that static.
>>
>> What do you think of this?


>>
>
> I don't know about doing it everywhere. Many traits are meant to be mixed
> into other traits before they are instantiated. Examples are all ...Like
> traits in the collection classes. Generating an abstract class for each of
> them would be pure waste, since no class ever inherits from them in first
> position.
>
> On the other hand, maybe an annotation could do the trick of giving more
> precise control to implementers without having to do a lot of typing.
> Something like
>
>   @classbacked trait Iterator { ... }
>

> And in that case, I agree we should try to combine the abstract class with
> the implementation class. So both abstract class and implementation class
> could be called Iterator$class, which is neat -  no new classfiles!
>

> There's one tricky issue: A static method (like the ones in the
> implementation class) may not collide with a dynamic method (like the ones
> in the abstract class), where collide means "have the same name and type
> signature". Collisions are possible, for instance in the following case:
>
>   @classbacked trait T {
>      def foo(x: T) = ???
>      def foo() = ???
>   }
>
> This will generate:
>
>   class T$class {
>      def foo(x: T) = T$class.foo(this, x)
>      def foo() = T$class.foo(this)
>
>      static def foo(_this: T, x: T) = ???
>      static def foo(_this: T) = ???
>    }
>
> Note the collision between the first instance foo and the last static foo.
>
> The compiler needs to check for collisions when generating T$class.
> Fortunately, there's a remedy: In case of collision, simply do not generate
> the instance member.
>

> Cheers
>
>  -- Martin
>
>
>
>
>
> --
> Martin Odersky
> Prof., EPFL and Chairman, Typesafe
> PSED, 1015 Lausanne, Switzerland
> Tel. EPFL: +41 21 693 6863
> Tel. Typesafe: +41 21 691 4967
>

--
Daniel C. Sobral

I travel to the future all the time.

Pavel Pavlov

unread,
Dec 5, 2011, 9:50:22 AM12/5/11
to scala-l...@googlegroups.com
Hi Daniel,

It's very interesting point.

However one issue remains unlear for me:
How will you solve the "diamond problem" with constructors?
C++ has so intricate semantics of virtual bases initialization right because of this.

There is an example:
================================
trait A(val x: Int, val y: Int)
trait B(b: Int) extends A(b, 1)
trait C(c: Int) extends A(c, 2)

trait D(b: Int, c: Int) extends B(b) with C(c)

val d = new D(3, 4)
println(d.x) // ???
println(d.y) // ???
================================

The diamond problem with instance methods is solved in Scala by linearization of supertraits, but the same problem with constructors still persists.

Regards,
Pavel.

понедельник, 5 декабря 2011 г. 21:23:03 UTC+7 пользователь Daniel Sobral написал:
A question on Stack Overflow led me to a new thought on this matter.
Assume the following got implemented:

1. A class is generated for every trait.
2. If something "extends" a trait, then the corresponding class is
used instead of forwarders.

Now, assume this _also_ gets implemented:

3. Add constructor parameters to traits.

In such a case, would there be any point in keeping "class" as a
language construct instead of an implementation detail?

--

Todd Vierling

unread,
Dec 5, 2011, 12:21:26 PM12/5/11
to scala-l...@googlegroups.com
On Monday, December 5, 2011 9:23:03 AM UTC-5, Daniel Sobral wrote:

Now, assume this _also_ gets implemented:

3. Add constructor parameters to traits.

Not a good idea. Traits are different from classes for a reason: they avoid both the diamond problem and initialization order problem by not having initialization parameters.

Daniel Sobral

unread,
Dec 5, 2011, 2:23:57 PM12/5/11
to scala-l...@googlegroups.com
On Mon, Dec 5, 2011 at 12:50, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> Hi Daniel,
>
> It's very interesting point.
>
> However one issue remains unlear for me:
> How will you solve the "diamond problem" with constructors?

The same way it is already solved:

scala> trait A[X, Y]
defined trait A

scala> trait B[T] extends A[T, Int]
defined trait B

scala> trait C[T] extends A[T, Double]
defined trait C

scala> trait D[T1, T2] extends B[T1] with C[T2]
<console>:10: error: illegal inheritance;
trait D inherits different type instances of trait A:
A[T2,Double] and A[T1,Int]
trait D[T1, T2] extends B[T1] with C[T2]
^

By making it illegal.

Note that this doesn't impose any additional restriction over what
classes already offer. Classes *already* make this kind of
double-inheritance illegal by making it impossible.

Todd Vierling

unread,
Dec 5, 2011, 3:07:59 PM12/5/11
to scala-l...@googlegroups.com
On Mon, Dec 5, 2011 at 2:23 PM, Daniel Sobral <dcso...@gmail.com> wrote:
>> However one issue remains unlear for me:
>> How will you solve the "diamond problem" with constructors?
>
> The same way it is already solved:
>
> scala> trait A[X, Y]
> defined trait A
>
> scala> trait B[T] extends A[T, Int]
> defined trait B
>
> scala> trait C[T] extends A[T, Double]
> defined trait C
>
> scala> trait D[T1, T2] extends B[T1] with C[T2]
> <console>:10: error: illegal inheritance;
>  trait D inherits different type instances of trait A:
> A[T2,Double] and A[T1,Int]
>       trait D[T1, T2] extends B[T1] with C[T2]
>             ^

However, say you have

trait A[T](foo: T)
trait B[T] extends A[T]
trait C[T] extends B[T] with A[T]
class D[T] extends C[T] with B[T]

Where's the correct place to initialize A.foo?

This is the reason C++ has virtual base classes, something that I feel
would really complicate Scala if added.

Pavel Pavlov

unread,
Dec 5, 2011, 4:11:28 PM12/5/11
to scala-l...@googlegroups.com
 
Where's the correct place to initialize A.foo?

Guys,

I've just got to have a totally crazy idea about traits initialization.

What if we allow secondary (but not primary!) constructors in traits?

Then define a simple rule:

If a trait has any secondary constructors, exactly one of them must be called in any inheritance hierarchy containing this trait.

So, if we have
  trait A {
    val x: Int = _
    def this(xx: Int) { x = xx }
  }

  trait T1 extends A
  trait T2 extends A {
    def this(xx: Int) { A.super(xx) }
    def this(z: Double) {/*empty*/}
  }


then these programs will be correct:
  class C extends A(42)
  class C0(x: Int) extends A(x)
  class C1(x: Int) extends T1 with A(x)
  class C2(x: Int) extends T2(x)   // ok - A is initialized in T2
  class C3(x: Int) extends T2(x) with A   // ok - A is initialized in T2
  class C4(x: Int) extends T2(3.14) with A(x)   // ok - A is not initialized in T2

and these are ill-formed:
  class B extends A    // A is not initialized
  class B1 extends T1   // A is not initialized
  class B2(x: Int) extends T2(x) with A(x)   // A is initialized twice

This way we can:
 - define trait initialization requirements in Scala code (and check them in compiler);
 - define initialization dependence of class/trait and its supertraits;
 - shift the responsibility on trait initialization down the heirarchy to the point where all arguments can be provided.

Will this work or not? Has it any holes? It if works, can it replace pre-initialized fields?

Any thoughts?

Bernd Johannes

unread,
Dec 6, 2011, 12:58:35 AM12/6/11
to scala-l...@googlegroups.com, Pavel Pavlov

It sounds very intriguing. Somehow I never liked the "early initialization"
syntax. The possibility of providing trait constructor parameters and
secondary constructors in the way you described is a lot more pleasant to my
eyes.
If it does not interact badly with other features this would be a very nice
addition. And as it mimics super class constructor calls it's in itself not a
new syntactic feature.

Nice idea!
Greetings
Bernd

Daniel Sobral

unread,
Dec 6, 2011, 8:49:38 AM12/6/11
to scala-l...@googlegroups.com
On Mon, Dec 5, 2011 at 18:07, Todd Vierling <t...@duh.org> wrote:
> On Mon, Dec 5, 2011 at 2:23 PM, Daniel Sobral <dcso...@gmail.com> wrote:
>>> However one issue remains unlear for me:
>>> How will you solve the "diamond problem" with constructors?
>>
>> The same way it is already solved:
>>
>> scala> trait A[X, Y]
>> defined trait A
>>
>> scala> trait B[T] extends A[T, Int]
>> defined trait B
>>
>> scala> trait C[T] extends A[T, Double]
>> defined trait C
>>
>> scala> trait D[T1, T2] extends B[T1] with C[T2]
>> <console>:10: error: illegal inheritance;
>>  trait D inherits different type instances of trait A:
>> A[T2,Double] and A[T1,Int]
>>       trait D[T1, T2] extends B[T1] with C[T2]
>>             ^
>
> However, say you have
>
> trait A[T](foo: T)
> trait B[T] extends A[T]
> trait C[T] extends B[T] with A[T]
> class D[T] extends C[T] with B[T]
>
> Where's the correct place to initialize A.foo?

Well, since no one is passing a parameter to A, this is plainly incorrect.

If someone is passing parameters to A, then apply the same rules as
Scala applies for *type* parameters (with the appropriate changes) as
shown above.

Todd Vierling

unread,
Dec 6, 2011, 10:07:35 AM12/6/11
to scala-l...@googlegroups.com
On Tue, Dec 6, 2011 at 8:49 AM, Daniel Sobral <dcso...@gmail.com> wrote:
>> However, say you have
>>
>> trait A[T](foo: T)
>> trait B[T] extends A[T]
>> trait C[T] extends B[T] with A[T]
>> class D[T] extends C[T] with B[T]
>>
>> Where's the correct place to initialize A.foo?
>
> Well, since no one is passing a parameter to A, this is plainly incorrect.

You don't get my point: There's two places where A[T] is inherited.
Which one should have the initialization parameter?

If you don't fully grok the problem introduced by having
initialization parameters for traits, read up on C++ virtual base
classes.

martin odersky

unread,
Dec 6, 2011, 10:10:52 AM12/6/11
to scala-l...@googlegroups.com
But that would mean the compiler needs to be able to decide equality
of value parameters. That looks tough to me :-)

I think the only possible alternative is to make value parameters to
traits optional and to check that only one instance of a trait is
parameterized. For instance, this would be OK:

trait A(a: T)
trait B(b: T) extends A // no parameters here
trait C(c: T) extends A(c) with B(c)
trait D extends B(d1) with C(d2)

And so would this:

trait A(a: T)
trait B(b: T) extends A(b)
trait C(c: T) extends A with B(c)
trait D extends B(d1) with C(d2)

But this would be illegal:

trait A(a: T)
trait B(b: T) extends A(b)
trait C(c: T) extends A(c) with B(c)
trait D extends B(d1) with C(d2)

Another possibility (which we explored earlier) would be to
translatetrait parameters to abstract vals and let linearization
decide which definition is current. That would have accepted the last
code with `a` in A initialized to d2. So the alternative is more
general, but it's also considerably harder to implement and I am not
sure it's what one wants in the end.

Cheers

-- Martin

Ittay Dror

unread,
Dec 6, 2011, 11:36:17 AM12/6/11
to scala-l...@googlegroups.com, martin odersky
I guess this means that 'class E[T](e: T) extends B(e)' is illegal, right? I think it may confuse people (esp. if you consider that you can't have a class extend 'trait F[T](f: T) extends B(f)' either)

Maybe require that traits like B are defined as 'abstract'? Then it would be more obvious why they can't be extended alone.

+1 for having a way for trait "constructor parameters". I need this all the time and defining an abstract 'val' for each, requiring a val definition in classes that extend is a  lot of boilerplate.

Ittay

Pavel Pavlov

unread,
Dec 6, 2011, 12:25:58 PM12/6/11
to scala-l...@googlegroups.com

What if we allow secondary (but not primary!) constructors in traits?

Of course, there is no difference between primary and secondary constructors. It is enough that trait should be inittialized exactly once for any instantiated subclass.

Daniel Sobral

unread,
Dec 6, 2011, 12:59:41 PM12/6/11
to scala-l...@googlegroups.com

I grok, and I have answered, and you seem completely oblivious to my
point. So, I'll try a different tack.

A type parameter is a parameter.

It's a special kind of parameter, sure, but it still is a parameter.
So, replace "type parameter" by "parameter", and thing of "[T]" as
"(x)".

So the example I gave, and showed how Scala handles by running it on REPL, was:

trait A[X, Y]
trait B[T] extends A[T, Int]


trait C[T] extends A[T, Double]

trait D[T1, T2] extends B[T1] with C[T2]

Translated it would become:

trait A(x, y)
trait B(b) extends A(b, 1) // replacing Int with 1
trait C(c) extends A(c, 2) // replacing Double with 2
trait D(b, c) extends B(b) with C(c)

And the example I based it on, which was the "diamond problem example", was:

trait A(val x: Int, val y: Int)
trait B(b: Int) extends A(b, 1)
trait C(c: Int) extends A(c, 2)
trait D(b: Int, c: Int) extends B(b) with C(c)

See? Same thing. And it would be treated exactly how Scala already
treats the first example, aside from obvious differences between types
and values.

Pavel Pavlov

unread,
Dec 6, 2011, 1:07:53 PM12/6/11
to scala-l...@googlegroups.com


вторник, 6 декабря 2011 г. 22:10:52 UTC+7 пользователь martin odersky написал:
What about secondary constructors for traits?
What do you think - should we able to define a trait that can be initialized in two (or more) alternative ways?
Like this:
  trait Alt(s: String) {
    def this(x: Int, flag: Boolean) {
      this(if (flag) x+"%" else "1/"+x)
    }
  }
  val a = new Alt(123, true); val b = new Alt("?")

Should we have an ability to define conditionally initilatization-dependent traits?
Like in this example:
  trait A(x: Int)
  trait B(z: Double) extends A {
    def this(x: Int) {
      A.super(x)
    }
  }
  val x1 = new B(1)                // ok - A & B initialized
  val x2 = new B(3.14) with A(42)  // ok - A & B initialized
  val x3 = new B(3.14)             // error: A isn't initialized
  val x4 = new B(1) with A(42)     // error: A initialized twice

Pavel Pavlov

unread,
Dec 6, 2011, 1:26:15 PM12/6/11
to scala-l...@googlegroups.com
Ah, now I see your point.

However, there are two difficulties there:
1) As Martin already said, to force compiler decide equality of values is overkill (if it's decidable at all)
2) How do you deal with side effects?

Consider an example:


trait A(val x: Int, val y: Int)

trait B(b: Int) extends A(b, 1)
trait C(c: Int) extends A(c, 2)
trait D(b: Int, c: Int, f: Int => Int, g: Int => Int) extends B(f(b)) with C(g(c))

def foo = 123
def bar(x: Int) = x
def baz(x: Int) = { launchMissiles(); x }

val y = new D(123, foo, bar, bar) // is this correct or not?
val x = new D(123, 123, baz, baz) // and this?

Pavel Pavlov

unread,
Dec 6, 2011, 1:34:45 PM12/6/11
to scala-l...@googlegroups.com
Sorry, read this as:
.....

trait B(b: Int) extends A(b, 1)
trait C(c: Int) extends A(c, 1)
.....

Daniel Sobral

unread,
Dec 6, 2011, 2:46:54 PM12/6/11
to scala-l...@googlegroups.com
On Tue, Dec 6, 2011 at 16:26, Pavel Pavlov <pavel.e...@gmail.com> wrote:
> Ah, now I see your point.
>
> However, there are two difficulties there:
> 1) As Martin already said, to force compiler decide equality of values is
> overkill (if it's decidable at all)

Not necessarily. Let's take your example:

trait A(val x: Int, val y: Int)

trait B(b: Int) extends A(b, 1) // A.x = B.b, A.y = 1
trait C(c: Int) extends A(c, 2) // A.x = C.c, A.y = 2


trait D(b: Int, c: Int, f: Int => Int, g: Int => Int) extends B(f(b))
with C(g(c))

B.b = f(b)
C.c = g(c)

A.x = B.b = f(D.b)
A.x = C.c = g(D.c) // not direct parameter passing, so invalid

A.y = 1
A.y = 2 // different, so invalid, but, given your correction for it to
mean 1, this would be allowed

Note that even if C.c were equal to f(D.c) or f(D.b), it would _still_
be disallowed. However, this would be ok:

trait A(val x: Int, val y: Int)
trait B(b: Int) extends A(b, 1)

trait C(c: Int) extends A(c, 1) // A.x = C.c, A.y = 2
trait D(b: Int, c: Int, f: Int => Int, g: Int => Int) extends B(b) with C(b)

A.x = B.b = D.b
A.x = C.c = D.b

A.y = 1
A.y = 1

There's no need to evaluate anything here. For any trait, all
constructor parameters for parent traits will either be a literal,
equal to one of the trait's own parameters, or something else.
Literals can be compared, a trait's own parameters can be traced, and
everything else should simply be disallowed in case of conflict.

Note that this can deal with everything that is possible in Scala
right now, since Scala simply does not allow a class constructor
parameter to be initialized by more than one value.

> 2) How do you deal with side effects?

Side effects fall into the "something else" case. If there's any
conflict, where conflict is taken to mean more than one initialization
path, then it is disallowed.

>
> Consider an example:
>
>
> trait A(val x: Int, val y: Int)
> trait B(b: Int) extends A(b, 1)
> trait C(c: Int) extends A(c, 2)
> trait D(b: Int, c: Int, f: Int => Int, g: Int => Int) extends B(f(b)) with
> C(g(c))

This is declared invalid here, at declaration site.

>
> def foo = 123
> def bar(x: Int) = x
> def baz(x: Int) = { launchMissiles(); x }
>
> val y = new D(123, foo, bar, bar) // is this correct or not?
> val x = new D(123, 123, baz, baz) // and this?

It doesn't matter if it _could_ be correct at usage site or not. It is
allowed or disallowed at declaration.

I'll grant that the rules are more complex than the simple rules that
class offers. However, I think eliminating "class" altogether would
result in an overall _simplification_ of the language. You add the
initialization rules for values, and get rid of everything about
classes.

Archontophoenix Quar

unread,
Dec 6, 2011, 2:50:48 PM12/6/11
to scala-l...@googlegroups.com
An alternative to letting the compiler decide the equality of value
parameters is to check the equality at runtime. This would allow:

trait A(a: T)
trait B(b: T) extends A(b)
trait C(c: T) extends A(c) with B(c)
trait D extends B(d1) with C(d2)

throwing some sort of exception if d1 != d2. I think accords well with
the general intuition that there's only one A in a D, so that that A
can be initialized just one way. But it would be a pain to implement,
and you're still left with the question of whether it's really what
you want.

A

Reply all
Reply to author
Forward
0 new messages