Scala specialized class field layout problem

叶先进

Mar 18, 2015, 6:38:23 AM
to scala-i...@googlegroups.com
Hi community:

I am working on a SizeEstimator PR for Spark (https://github.com/apache/spark/pull/4783).

I have run into a size problem with a specialized Scala class, specifically Tuple2(Int, Int). On the JVM, a Tuple2(Int, Int) should take 12 (object header) + 4 (Int _1) + 4 (Int _2) = 20 bytes, rounded up to 24 bytes.
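The arithmetic above can be sketched as a small helper. This is a rough estimate only: the 12-byte header and 8-byte alignment are assumptions for a 64-bit HotSpot JVM with compressed oops, and `ShallowSize`, `alignUp`, and `estimate` are hypothetical names that are not part of Spark's SizeEstimator.

```scala
object ShallowSize {
  // Assumptions for this sketch: 64-bit HotSpot with compressed oops,
  // i.e. a 12-byte object header and 8-byte object alignment.
  val HeaderBytes = 12
  val AlignmentBytes = 8

  // Round n up to the next multiple of `alignment`.
  def alignUp(n: Int, alignment: Int = AlignmentBytes): Int =
    (n + alignment - 1) / alignment * alignment

  // Header plus the raw field sizes, padded to the alignment boundary.
  def estimate(fieldSizes: Seq[Int]): Int =
    alignUp(HeaderBytes + fieldSizes.sum)

  def main(args: Array[String]): Unit = {
    // Two 4-byte Int fields: 12 + 4 + 4 = 20, padded to 24.
    println(estimate(Seq(4, 4)))
  }
}
```

The estimate only holds if the object really has exactly the fields passed in; fields inherited from a superclass have to be included as well.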

However, I used javap to inspect the specialized Tuple2 class, Tuple2$mcII$sp. It appears that this specialized version is a subclass of Tuple2. The javap output follows.
~/scala-library/scala javap Tuple2\$mcII\$sp.class
Compiled from "Tuple2.scala"
public class scala.Tuple2$mcII$sp extends scala.Tuple2<java.lang.Object, java.lang.Object> implements scala.Product2$mcII$sp {
  public final int _1$mcI$sp;
  public final int _2$mcI$sp;
  public int _1$mcI$sp();
  public int _1();
  public int _2$mcI$sp();
  public int _2();
  public scala.Tuple2<java.lang.Object, java.lang.Object> swap();
  public scala.Tuple2<java.lang.Object, java.lang.Object> swap$mcII$sp();
  public <T1, T2> int copy$default$1();
  public <T1, T2> int copy$default$1$mcI$sp();
  public <T1, T2> int copy$default$2();
  public <T1, T2> int copy$default$2$mcI$sp();
  public boolean specInstance$();
  public java.lang.Object copy$default$2();
  public java.lang.Object copy$default$1();
  public java.lang.Object _2();
  public java.lang.Object _1();
  public scala.Tuple2$mcII$sp(int, int);
}

In that case, based on OpenJDK 8's class loader code and the JOL (http://openjdk.java.net/projects/code-tools/jol/) tool, the field layout for Tuple2(1, 2) will be:
scala.Tuple2$mcII$sp object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                    VALUE
      0     4        (object header)                01 00 00 00 (0000 0001 0000 0000 0000 0000 0000 0000)
      4     4        (object header)                00 00 00 00 (0000 0000 0000 0000 0000 0000 0000 0000)
      8     4        (object header)                05 c3 00 f8 (0000 0101 1100 0011 0000 0000 1111 1000)
     12     4 Object Tuple2._1                      null
     16     4 Object Tuple2._2                      null
     20     4    int Tuple2$mcII$sp._1$mcI$sp       1
     24     4    int Tuple2$mcII$sp._2$mcI$sp       2
     28     4        (loss due to the next object alignment)
Instance size: 32 bytes (reported by Instrumentation API)
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

The superclass Tuple2 contributes two additional object references (_1 and _2). Hence, the size of the specialized Tuple2 will be 32 bytes, as the JOL field layout indicates.
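One way to cross-check the inherited fields without JOL is plain Java reflection. The following is a minimal sketch, assuming a Scala 2 standard library in which `(1, 2)` is compiled to the specialized class shown by javap above; `FieldLayout` and `instanceFields` are hypothetical names introduced for illustration.

```scala
import java.lang.reflect.{Field, Modifier}

object FieldLayout {
  // Collect the non-static fields declared by `cls` and all of its
  // superclasses, starting with the most specific class.
  def instanceFields(cls: Class[_]): List[Field] =
    if (cls == null) Nil
    else {
      val own = cls.getDeclaredFields.toList
        .filterNot(f => Modifier.isStatic(f.getModifiers))
      own ++ instanceFields(cls.getSuperclass)
    }

  def main(args: Array[String]): Unit = {
    val pair = (1, 2)
    // Runtime class of the tuple; expected to be the specialized subclass.
    println(pair.getClass.getName)
    // One line per instance field: declaring class, field name, field type.
    instanceFields(pair.getClass).foreach { f =>
      println(s"${f.getDeclaringClass.getName}.${f.getName}: ${f.getType.getSimpleName}")
    }
  }
}
```

If the listing shows two int fields on the specialized class plus two reference fields on Tuple2, the shallow size with compressed oops is 12 + 4 × 4 = 28 bytes, padded to 32, which agrees with the JOL output.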

However, when I used the method described in http://www.javaworld.com/article/2077496/testing-debugging/java-tip-130--do-you-know-your-data-size-.html, the Tuple2(Int, Int) measured only 24 bytes.
So I am a bit confused. I hope someone can help me understand the gap.
Does Scala use the specialized version of the Tuple2 class when Tuple(1, 2) is involved? If so, why are the superclass's fields not taken into account?

叶先进

Mar 18, 2015, 11:43:22 AM
to scala-i...@googlegroups.com
| Tuple2(Int, Int) for scala in JVM should be 12(Object Header) + 4(Int _1) + 4(Int _2) = 20 => 24 bytes.

I should add that this is a 64-bit JVM with UseCompressedOops on. The field layout of Tuple(1, 2) above was taken on the same JVM, as that is the default JVM mode on my machine.


To restate my question clearly: does Scala use the specialized version of the Tuple2 class when Tuple(1, 2) is involved?

Vlad Ureche

Mar 21, 2015, 1:43:31 AM
to scala-internals
On Wed, Mar 18, 2015 at 11:38 AM, 叶先进 <advan...@gmail.com> wrote:

Does scala uses the specialized version of Tuple2 class is Tuple(2, 2) involved? If that's true, how does the superclass's fields are not taken into account?

Hi,

You bumped into issue SI-3585, which is a fundamental limitation of the current approach to specialization.

HTH,
Vlad

叶先进

Mar 23, 2015, 2:24:05 AM
to scala-i...@googlegroups.com
Hi,

Thanks for the info. I have figured out that the specialized version of Tuple2 for Tuple2(2, 1) does take 32 bytes (on a 64-bit VM with UseCompressedOops). The specialized version Tuple2$mcII$sp is a subclass of the generic Tuple2 class, and the generic Tuple2 has two reference fields that the specialized class duplicates with its own primitive fields. The miniboxing project seems promising.

Rex Kerr

Mar 23, 2015, 3:16:05 AM
to scala-i...@googlegroups.com
Actually, it's not a fundamental limitation of the "whole" approach, just a consequence of a particular decision: do you promise vals or not?  If _1, _2, etc. were defs in Tuple, then the specialized versions would just forward-and-box when used in a non-specialized context.  There is an efficiency argument for vals if Tuple allows inheritance, but the decision has already been made not to in the future, at which point _1 and _2 can be defs, and even the present specialization could save space.  Miniboxing will save less space.  But it avoids the explosion of specialized class types, which means you can have efficient tuples of larger size than you could with specialization.

  --Rex



Vlad Ureche

Mar 23, 2015, 5:38:57 AM
to scala-internals, Rex Kerr
On Mon, Mar 23, 2015 at 8:16 AM, Rex Kerr <ich...@gmail.com> wrote:

Actually, it's not a fundamental limitation of the "whole" approach, just a consequence of a particular decision: do you promise vals or not?  If _1, _2, etc. were defs in Tuple, then the specialized versions would just forward-and-box when used in a non-specialized context. There is an efficiency argument for vals if Tuple allows inheritance, but the decision has already been made not to in the future, at which point _1 and _2 can be defs, and even the present specialization could save space. 

I'm not sure what you mean Rex. If you have defs in Tuple, then you would also have to write the implementations that have vals at one point, right? And as far as I can tell, you have two options: (1) you write them with specialization, and you're back to square 1 or (2) you write them by hand, which is quite tedious. Am I missing something here?

Cheers,
Vlad

Adriaan Moors

Mar 23, 2015, 9:54:13 AM
to scala-i...@googlegroups.com, Rex Kerr
I guess you (or the compiler) could implement TupleN as a (generic) trait, with TupleNImpl a (still generic) subclass that has the vals, and specialized subclasses with specialized vals. The downside is that you're going through invokevirtuals for the field accesses unless you know you're dealing with TupleNImpls.
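A hand-written sketch of that layout for the two-element case might look like the following. All names (`Pair`, `PairImpl`, `IntIntPair`, `PairDemo`) are hypothetical; this is not how scala.Tuple2 is defined, and it only illustrates the idea of keeping the storage out of the generic supertype.

```scala
// Generic interface: accessors are defs, so the supertype holds no fields.
trait Pair[+A, +B] {
  def _1: A
  def _2: B
}

// Generic implementation: the only class that stores object references.
final class PairImpl[+A, +B](val _1: A, val _2: B) extends Pair[A, B]

// Hand-written Int/Int variant: stores two primitive ints and nothing else.
// Calls made through Pair[Int, Int] go via a compiler-generated bridge
// method that boxes the value on demand.
final class IntIntPair(val _1: Int, val _2: Int) extends Pair[Int, Int]

object PairDemo {
  // Works against the generic interface, so each access is a virtual call.
  def sum(p: Pair[Int, Int]): Int = p._1 + p._2

  def main(args: Array[String]): Unit = {
    println(sum(new IntIntPair(1, 2)))
    println(sum(new PairImpl[Int, Int](1, 2)))
    // The Int/Int variant declares only its two primitive fields.
    classOf[IntIntPair].getDeclaredFields.foreach(f => println(f))
  }
}
```

Because `IntIntPair` does not inherit from a class with reference fields, it avoids the duplicated storage seen in Tuple2$mcII$sp, at the cost of dispatching every accessor through the trait.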

Rex Kerr

Mar 23, 2015, 5:40:44 PM
to Adriaan Moors, scala-i...@googlegroups.com
Exactly, and you then count on the JVM to hide the difference, which it is pretty good at.  It's not obviously a better choice, but it is a possible choice.

  --Rex

Vlad Ureche

Mar 25, 2015, 12:38:43 AM
to scala-internals, Adriaan Moors

On Mon, Mar 23, 2015 at 10:40 PM, Rex Kerr <ich...@gmail.com> wrote:
Exactly, and you then count on the JVM to hide the difference, which it is pretty good at.  It's not obviously a better choice, but it is a possible choice.

Thanks for explaining Adriaan and Rex! And I imagine you would also want specialized accessors in TupleN, so you don't box when the accessor is not inlined (afair this is a cool thing about PyPy -- they can specialize the storage without the accessors -- being tracing-based, their JIT eliminates boxing in accessors).

I wouldn't worry about invokevirtuals, but in this case we'd need invokeinterface, which I remember is slower (at least in the interpreter if not also when compiled and not inlined).

Cheers,
Vlad

Scott Carey

Mar 30, 2015, 1:00:54 PM
to scala-i...@googlegroups.com, adr...@typesafe.com

Amazing blog post on JVM method dispatch: http://shipilev.net/blog/2015/black-magic-method-dispatch

Highly recommended.  I thought I knew it all already, but after reading that I realize I only knew about 80% of it.
 
