As of half an hour ago, the backend that emits 1.6 classfiles bootstraps and passes all tests

Miguel Garcia

Apr 3, 2012, 9:50:29 AM
to scala-i...@googlegroups.com

That's the backend at  https://github.com/magarciaEPFL/scala/tree/GenASM2

The problem I mentioned yesterday, obtaining "the" LUB given the internal names of JVM class files, can be solved taking into account that:

Quoting from gallium.inria.fr/~xleroy/publi/bytecode-verification-JAR.pdf

--- start quote ---
The simplest solution to the [least upper bound of] interface problem is to be found in Sun’s implementation of the
JDK bytecode verifier. (This approach is documented nowhere, but can easily be inferred by
experimentation.) Namely, bytecode verification ignores interfaces, treating all interface types as
the class type Object. Thus, the type algebra used by the verifier contains only proper classes
and no interfaces, and subtyping between proper classes is simply the inheritance relation between
them. Since Java has single inheritance (a class can implement several interfaces, but inherit from
one class only), the subtyping relation is tree-shaped and trivially forms a semi-lattice: the least
upper bound of two classes is simply their closest common ancestor in the inheritance tree.
--- end quote ---

That's the logic I've encoded in "jvmWiseLUB" (see https://github.com/magarciaEPFL/scala/commit/da9e0eb590bb6d4f487388327b5fd2dbc2e10eb6).
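For readers who want to see the quoted rule in isolation, here is a minimal sketch using plain Java reflection. It is not the jvmWiseLUB code from the commit above: the class name `JvmWiseLubSketch` and the method `lub` are made up for illustration, and it works on loaded `Class` objects rather than on internal names. It only handles class and interface types (no primitives).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of "closest common ancestor, interfaces count as Object".
final class JvmWiseLubSketch {
    private JvmWiseLubSketch() {}

    static Class<?> lub(Class<?> a, Class<?> b) {
        // The verifier ignores interfaces: any interface is treated as Object.
        if (a.isInterface() || b.isInterface()) {
            return Object.class;
        }
        // Collect the superclass chain of a (a itself included).
        Set<Class<?>> ancestorsOfA = new HashSet<>();
        for (Class<?> c = a; c != null; c = c.getSuperclass()) {
            ancestorsOfA.add(c);
        }
        // Walk up from b; the first hit is the closest common ancestor.
        for (Class<?> c = b; c != null; c = c.getSuperclass()) {
            if (ancestorsOfA.contains(c)) {
                return c;
            }
        }
        // Only reachable for inputs outside the sketch's scope (e.g. primitives).
        return Object.class;
    }
}
```

With this sketch, `lub(Integer.class, Long.class)` is `Number.class`, while `lub(StringBuilder.class, String.class)` is `Object.class` even though both implement CharSequence, because shared interfaces are never considered.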

Summarized below are some of the problems I found when trying to use both global.lub and icodes.lub for that task. I'm reproducing them for future reference, or for those wanting to learn how it works (say, a student looking for a topic?).

As a next step, I'll create a branch that allows running the old and new GenJVM side by side, for merging into trunk. Comments are welcome.

Miguel
http://lampwww.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/

---------------------------------------
Snippet 1: global.lub prefers interfaces over super classes:
---------------------------------------

global.lub(List(scala.Left, scala.Right)) returns scala.Product instead of the common direct super class scala.Either

This was the problem that icodes.lub set out to solve. However, for the purposes of overriding ASM's getCommonSuperClass, it leads to other problems, as shown below:



---------------------------------------
Snippet 2.A: icodes.lub returns scala.Any
---------------------------------------

For (scala/runtime/BoxedUnit, java/lang/StringBuilder)
whose parents are
  List(Object, java.io.Serializable)
  List(AbstractStringBuilder, java.io.Serializable, CharSequence)
resp.

---------------------------------------
Snippet 2.B: more scala.Any returned by icodes.lub
---------------------------------------

For (scala/collection/immutable/IntMap$Tip, scala/collection/immutable/IntMap$Nil$)

whose parents are:
  List(scala.collection.immutable.IntMap, Product, Serializable)
  List(scala.collection.immutable.IntMap, Product, Serializable)
resp.



Paul Phillips

Apr 3, 2012, 10:40:20 AM
to scala-i...@googlegroups.com
On Tue, Apr 3, 2012 at 6:50 AM, Miguel Garcia <miguel...@tuhh.de> wrote:
> The problem I mentioned yesterday, obtaining "the" LUB given the internal
> names of JVM class files, can be solved taking into account that:

I must have missed that, or misunderstood it. You have seen the
tickets and comments on this matter, I hope?

https://issues.scala-lang.org/browse/SI-3872

Where iulian says "There is no one 'right' lub after erasure,
unfortunately. This is one of the reasons we still don't have a jvm6
backend."

Also something to be aware of, in TypeKinds:

/** The compiler's lub calculation does not order classes before traits.
* This is apparently not wrong but it is inconvenient, and causes the
* icode checker to choke when things don't match up. My attempts to
* alter the calculation at the compiler level were failures, so in the
* interests of a working icode checker I'm making the adjustment here.
*
* Example where we'd like a different answer:
*
* abstract class Tom
* case object Bob extends Tom
* case object Harry extends Tom
* List(Bob, Harry) // compiler calculates "Product with Tom" rather than "Tom with Product"
*
* Here we make the adjustment by rewinding to a pre-erasure state and
* sifting through the parents for a class type.
*/

Miguel Garcia

Apr 3, 2012, 10:54:40 AM
to scala-i...@googlegroups.com


On Tuesday, April 3, 2012 4:40:20 PM UTC+2, Paul Phillips wrote:
>> The problem I mentioned yesterday, obtaining "the" LUB given the internal
>> names of JVM class files, can be solved taking into account that:
>
> You have seen the tickets and comments on this matter, I hope?
>
> https://issues.scala-lang.org/browse/SI-3872


Now that you mention it, yes, that's related. Also related, http://comments.gmane.org/gmane.comp.java.vm.languages/2293

Moving forward, besides having the new backend available behind a compiler switch, I'll try to improve its performance somewhat, add documentation, and integrate classNameToSymbol (it looks to me like that's only needed for programs using things like java.util.Map/Entry).

Getting feedback from field testing would also be great, but I guess that will have to wait until it has been merged into trunk.

Miguel
http://lampwww.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/

 

Simon Ochsenreither

Apr 3, 2012, 12:32:41 PM
to scala-i...@googlegroups.com
Hi Miguel,

congratulations and thanks for all your work!

Is there any work left to target higher versions of the bytecode (Java7/8...)?

Thanks and bye,

Simon

Miguel Garcia

Apr 3, 2012, 1:54:00 PM
to scala-i...@googlegroups.com
Simon,

Adding goodies for Java 7/8 might be fun in the future; however, what helps the most now with the new backend is testing (over and beyond the library and compiler, which is all I've tested so far).

If you want to delve into the code, adding asserts where you think they could provide valuable feedback also seems like a good idea.


Miguel
http://lampwww.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/

Daniel Sobral

Apr 3, 2012, 2:23:47 PM
to scala-i...@googlegroups.com
Will it be on the next 2.10 milestone (and, therefore, part of 2.10.0)?

--
Daniel C. Sobral

I travel to the future all the time.

Miguel Garcia

Apr 3, 2012, 2:41:03 PM
to scala-i...@googlegroups.com

Daniel,

my goal is to have the ASM-based backend ready for inclusion in the next milestone, for the purpose of emitting 1.6 classfiles when -target:jvm-1.6 is used (and only then). No Java 7/8 stuff. It looks like the current (i.e., FJBG-based) backend will remain the default in the 2.10 release too.

Having said that, adoption can go faster with growing positive feedback. Some questions I haven't looked into yet:

  (a) size of the resulting binaries (larger? If so, this may be due to the LocalVariableTable I'm emitting for methods where the FJBG-based GenJVM didn't emit one). If anything, the debugging experience should be better than before.

  (b) emitting BeanInfos and Java annotations; in general, this has not yet been tested in projects that heavily combine Java and Scala.

  (c) binary compatibility (not against a 2.9 but against binaries for the same version, emitted by the FJBG-based GenJVM). If you ask me, I doubt there's any incompatibility to discover, but one can always fire up http://typesafe.com/resources/getting-started/tutorials/migration-manager-step-by-step-guide.html or your favorite JAR-diffing tool.

Feedback is welcome.

Miguel
http://lampwww.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/


Miguel Garcia

Apr 3, 2012, 2:50:13 PM
to scala-i...@googlegroups.com

I almost forgot. Another area where feedback is welcome: the driving force for 1.6 classfiles is faster class loading when using the split verifier. Quantifying that would be great (some ideas: faster REPL, faster apps with lots of dynamic code loading, as in templates in web apps?).

To make sure the new verifier is used, I pass
  -XX:-FailOverToOldVerifier -XX:+UseSplitVerifier
I'm also testing with -Xverify:all, but only the first time, so as not to distort classloading time measurements.
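As a starting point for quantifying this, a rough timing harness could look like the sketch below. It is only an illustration: `ClassLoadTimer` and `timeLoadNanos` are hypothetical names, the class names to load are whatever your application needs, and a single wall-clock reading like this is noisy, so runs would have to be repeated in fresh JVMs with and without the flags above.

```java
// Hypothetical harness: times loading and initializing a list of classes by name.
final class ClassLoadTimer {
    private ClassLoadTimer() {}

    static long timeLoadNanos(String... classNames) {
        ClassLoader loader = ClassLoadTimer.class.getClassLoader();
        long start = System.nanoTime();
        for (String name : classNames) {
            try {
                // initialize = true, so linking (and thus verification) is forced to happen here
                Class.forName(name, true, loader);
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException("No such class: " + name, e);
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Pass fully qualified class names on the command line.
        long nanos = timeLoadNanos(args);
        System.out.println("Loaded " + args.length + " classes in " + (nanos / 1_000_000.0) + " ms");
    }
}
```

Classes that are already loaded return immediately, so each measurement needs a fresh JVM to be meaningful.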

Miguel
http://lampwww.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/

Daniel Sobral

Apr 3, 2012, 3:05:00 PM
to scala-i...@googlegroups.com
It would be good if this information is made available in the
milestone announcement, so people will know to test it.

--

Josh Suereth

Apr 3, 2012, 4:19:06 PM
to scala-i...@googlegroups.com
I can take care of that when I make the next milestone announcement.

Miguel:  Where are you pulling the ASM jar from?  I didn't have much time to poke around in your branch today.   I just added the ability to resolve artifacts from maven central, if you're using a standard distribution.

- Josh

Miguel Garcia

Apr 3, 2012, 5:19:47 PM
to scala-i...@googlegroups.com

Josh,

The ASM library I'm using is not standard, but the changes are minimal (an additional constructor in asm.Attribute, and promoting to public three existing methods in asm.util.CheckMethodAdapter). I'm downloading the sources from http://forge.ow2.org/plugins/scmsvn/index.php?group_id=23

I guess in the future the contents of asm.jar and asm-util.jar (with the above changes) would either go to lib/extras or be included in scala-compiler.jar. Taken together, asm.jar and asm-util.jar total 90 kilobytes.

Miguel
http://lampwww.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/

Jason Zaugg

Apr 3, 2012, 5:40:43 PM
to scala-i...@googlegroups.com
On Tue, Apr 3, 2012 at 11:19 PM, Miguel Garcia <miguel...@tuhh.de> wrote:
> The ASM library I'm using is not standard, but the changes are minimal (an
> additional constructor in asm.Attribute, and promoting to public three
> existing methods in asm.util.CheckMethodAdapter). I'm downloading the
> sources from http://forge.ow2.org/plugins/scmsvn/index.php?group_id=23
>
> I guess in the future the contents of asm.jar and asm-util.jar (with the
> above changes) would either go to lib/extras, or be included in
> scala-compiler.jar allright. Taken together, asm.jar and asm-util.jar total
> 90 Kilobytes.

I would suggest repackaging ASM to scala.tools.asm._ or similar.
JARJAR can perform the translation. This eliminates the possibility of
clashes with another version of ASM on the user's classpath, and the
practice is suggested in ASM's FAQ [1]

Admittedly this is only a concern if the user is programmatically
using the compiler, but this might become more common with the runtime
compiler toolbox offered in 2.10.

-jason

[1] http://asm.ow2.org/doc/faq.html

Josh Suereth

Apr 3, 2012, 6:14:13 PM
to scala-i...@googlegroups.com
Yes.  We need to either namespace ASM or use a public version.

Sounds like we'll be namespacing.  Can we try to get the patch into ASM trunk so that's a temporary solution?   I'd rather relieve ourselves of ASM-related maintenance longer-term.

- Josh

Paul Phillips

Apr 3, 2012, 8:56:56 PM
to scala-i...@googlegroups.com
Yes please let's try to update asm and use standard artifacts. I know we will likely end up sucking it in, but let us wait until there's no option.

Jason Zaugg

Apr 4, 2012, 1:45:08 AM
to scala-i...@googlegroups.com
On Wed, Apr 4, 2012 at 12:14 AM, Josh Suereth <joshua....@gmail.com> wrote:
> Yes.  We need to either namespace ASM or use a public version.
>
> Sounds like we'll be namespacing.  Can we try to get the patch into ASM
> trunk so that's a temporary solution?   I'd rather relieve ourselves of
> ASM-related maintenance longer-term.

I would suggest namespacing it even if you use a public version. It
is too commonly used to force a particular version on anyone that
needs the compiler on the classpath. (I remember the pain this caused
users of Hibernate 3.x before they turned to JARJAR.)

-jason

Josh Suereth

Apr 4, 2012, 7:41:22 AM
to scala-i...@googlegroups.com

Does jarjar auto namespace it?

What's the best means of accomplishing this?

Jason Zaugg

Apr 4, 2012, 8:06:24 AM
to scala-i...@googlegroups.com
1. Create rules.txt containing "rule org.objectweb.asm.** scala.tools.asm.@1"

2. java -jar jarjar.jar process rules.txt asm-orig.jar asm-scala.jar

3. Write your code against the new package name.

More info:

http://code.google.com/p/jarjar/w/list

-jason

Miguel Garcia

Apr 4, 2012, 8:06:58 AM
to scala-i...@googlegroups.com

Some work in progress: I'm refactoring the new backend to use the standard asm-4.0.jar and asm-util-4.0.jar. The "refactoring" consists of subclassing ASM classes and using the subclasses. For now I'm not using jarjar, but the recipes at http://code.google.com/p/jarjar/wiki/GettingStarted seem close enough to what we need. Once the tests pass I'll commit to my GenASM2 branch.

Miguel
http://lampwww.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/


On Wednesday, April 4, 2012 1:41:22 PM UTC+2, Josh Suereth wrote:
> Does jarjar auto namespace it?
>
> What's the best means of accomplishing this?

Miguel Garcia

Apr 4, 2012, 8:36:09 AM
to scala-i...@googlegroups.com

The workaround to use standard asm jars can be seen in
  https://github.com/magarciaEPFL/scala/commit/cebc82deea58a770c9828c2a8cd417758dd1464e

With that, dependencies are:
  asm-4.0.jar and asm-util-4.0.jar go in lib/extra
  two Java classes have been added in src/fjbg that depend on the above (for now, the asm jars haven't been namespaced to org.scala.asm; when that happens, these two additional classes will need to be updated).

Before getting to automating the asm download and follow-up namespacing, a quick way to bootstrap is: `ant newlibs build`, copy the updated fjbg.jar to scala/lib, then `ant all.clean build`

I know more polishing is needed, but the above already makes progress on using standard ASM jars. A minor detail regarding the "standard" (i.e., shrinked) and "debug info" versions of ASM can be gleaned from the source comment below:

public class CustomAttr extends Attribute {

    public CustomAttr(final String type, final byte[] value) {
        super(type);
        /* The next line depends on asm-4.0.jar ie the shrinked version.
           When using, say, asm-debug-all-4.0.jar, the assignment should read `super.value = value;` */
        super.b = value;
    }

}


Miguel
http://lampwww.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/


Josh Suereth

Apr 4, 2012, 10:11:10 AM
to scala-i...@googlegroups.com
If you look at trunk, you don't need to download any jar in lib/extra.

Inside the build you can specify a new <artifact name="asm" group="??" version="4.0"/> and it will automatically be downloaded, added to lib/extra and included in the distribution.

When I get time later today, I'll send you a pull request with this fix in place.  We can work on namespacing later.

- Josh

Paul Phillips

Apr 4, 2012, 10:20:54 AM
to scala-i...@googlegroups.com
On Wed, Apr 4, 2012 at 7:11 AM, Josh Suereth <joshua....@gmail.com> wrote:
> Inside the build you can specify a new <artifact name="asm" group="??"
> version="4.0"/> and it will automatically be downloaded, added to lib/extra
> and included in the distribution.

Can it not be added to lib/extra? I intended that specifically and
only for developer-local jars.

Josh Suereth

Apr 4, 2012, 10:24:41 AM
to scala-i...@googlegroups.com
Technically, it's only added to lib/extra classpath.  Sorry, confusing ant properties with physical locations.

It goes in lib/ in the distribution.

iulian dragos

Apr 4, 2012, 12:08:33 PM
to scala-i...@googlegroups.com
On Tue, Apr 3, 2012 at 3:50 PM, Miguel Garcia <miguel...@tuhh.de> wrote:

> That's the backend at https://github.com/magarciaEPFL/scala/tree/GenASM2
>
> The problem I mentioned yesterday, obtaining "the" LUB given the internal names of JVM class files, can be solved taking into account that:
>
> Quoting from gallium.inria.fr/~xleroy/publi/bytecode-verification-JAR.pdf
>
> --- start quote ---
> The simplest solution to the [least upper bound of] interface problem is to be found in Sun’s implementation of the
> JDK bytecode verifier. (This approach is documented nowhere, but can easily be inferred by
> experimentation.) Namely, bytecode verification ignores interfaces, treating all interface types as
> the class type Object. Thus, the type algebra used by the verifier contains only proper classes
> and no interfaces, and subtyping between proper classes is simply the inheritance relation between
> them. Since Java has single inheritance (a class can implement several interfaces, but inherit from
> one class only), the subtyping relation is tree-shaped and trivially forms a semi-lattice: the least
> upper bound of two classes is simply their closest common ancestor in the inheritance tree.
> --- end quote ---


That's a very nice paper, but it describes the old (data flow analysis) bytecode verifier. I'm not sure the same holds for the new type-checking verifier. It'd be great if it does. If so, is the ASM backend able to generate verifiable code in this case (taken from the ticket Paul cited earlier)?


 abstract class Tom
 case object Bob extends Tom
 case object Harry extends Tom

  def moo(x: Tom) {}
  def moo2(x: Product) {}
  moo(if (c) Bob else Harry)
  moo2(if (c) Bob else Harry)

IIUC, the lub would be `Tom` for the two ifs, so the call to moo2 would have to typecheck with something that does not implement Product. I can't build your branch right now, so apologies if this actually works.

iulian



--
« Je déteste la montagne, ça cache le paysage »
Alphonse Allais

Josh Suereth

Apr 4, 2012, 2:11:31 PM
to scala-i...@googlegroups.com
I'm having trouble getting locker to build.  Is there something I'm missing?  Granted, I haven't toyed much, but here's my branch with the maven fixes (trying to get JarJar in there):


I know there's an issue where MSIL requires locker to be built before it can rebuild, so I'm guessing that's the issue here.   While FJBG *could* be built and used for locker, MSIL can't, and that causes issues.   Honestly, I'd gut this, but it's fixed in the SBT build, if only I could fix the SBT build....

- Josh

Miguel Garcia

Apr 4, 2012, 2:15:25 PM
to scala-i...@googlegroups.com

Josh,

I'm in the process of simplifying the build steps, in a new branch (GenASM3) that also allows choosing between old and new backends via -target. That should be ready soon.

Miguel
http://lampwww.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/

Josh Suereth

Apr 4, 2012, 2:18:20 PM
to scala-i...@googlegroups.com
Great.  Ping me when I can fix up the build.  I'd love the bootstrapping so we're not hurting in the near future.  Please check out my 1 commit if you want to see how to resolve artifacts from maven rather than manually downloading.

Miguel Garcia

Apr 4, 2012, 3:45:39 PM
to scala-i...@googlegroups.com

Iulian,

My reading of http://comments.gmane.org/gmane.comp.java.vm.languages/2293 is that the new verifier can't reject bytecode that passed the old verifier, and thus computing stack maps the old way (if there's a newer way, it's unknown to me) is safe.

The example:


abstract class Tom
case object Bob extends Tom
case object Harry extends Tom


class Test {

  def main(args: Array[String]) {
    val c = args.isEmpty

    moo( if (c)  Bob else  Harry)
    moo2(if (c)  Bob else  Harry)

  }
 
  def moo(x: Tom) {}
  def moo2(x: Product) {}
}

passes "java -XX:-FailOverToOldVerifier -XX:+UseSplitVerifier -Xverify:all ... " and results in the following stack maps:

public void main(java.lang.String[]);
  Code:
   Stack=2, Locals=3, Args_size=2
   0:   getstatic       #16; //Field scala/Predef$.MODULE$:Lscala/Predef$;
   3:   aload_1
   4:   checkcast       #18; //class "[Ljava/lang/Object;"
   7:   invokevirtual   #22; //Method scala/Predef$.refArrayOps:([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
   10:  invokeinterface #28,  1; //InterfaceMethod scala/collection/IndexedSeqOptimized.isEmpty:()Z
   15:  istore_2
   16:  aload_0
   17:  iload_2
   18:  ifeq    27
   21:  getstatic       #33; //Field Bob$.MODULE$:LBob$;
   24:  goto    30
   27:  getstatic       #38; //Field Harry$.MODULE$:LHarry$;
   30:  invokevirtual   #42; //Method moo:(LTom;)V
   33:  aload_0
   34:  iload_2
   35:  ifeq    44
   38:  getstatic       #33; //Field Bob$.MODULE$:LBob$;
   41:  goto    47
   44:  getstatic       #38; //Field Harry$.MODULE$:LHarry$;
   47:  invokevirtual   #46; //Method moo2:(Lscala/Product;)V
   50:  return
  LocalVariableTable:
   Start  Length  Slot  Name   Signature
   0      51      0    this       LTest;
   0      51      1    args       [Ljava/lang/String;
   16      34      2    c       Z

  LineNumberTable:
   line 10: 0
   line 11: 16
   line 12: 33

  StackMapTable: number_of_entries = 4
   frame_type = 255 /* full_frame */
     offset_delta = 27
     locals = [ class Test, class "[Ljava/lang/String;", int ]
     stack = [ class Test ]
   frame_type = 255 /* full_frame */
     offset_delta = 2
     locals = [ class Test, class "[Ljava/lang/String;", int ]
     stack = [ class Test, class Tom ]
   frame_type = 77 /* same_locals_1_stack_item */
     stack = [ class Test ]
   frame_type = 255 /* full_frame */
     offset_delta = 2
     locals = [ class Test, class "[Ljava/lang/String;", int ]
     stack = [ class Test, class Tom ]
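Why a frame with `class Tom` on the stack is accepted for a parameter of type scala.Product: as far as I understand the JVM specification's assignability rules, the verifier treats an interface target like Object, so any reference type is assignable to it, and whether the object really implements the interface is only checked later, at run time. The sketch below models that rule with plain Java reflection. `VerifierAssignability` and `isVerifierAssignable` are hypothetical names, and the model is deliberately rough (reference types only, arrays handled loosely); it is not the verifier's actual code.

```java
// Hypothetical model of "interfaces count as Object" when checking assignability.
final class VerifierAssignability {
    private VerifierAssignability() {}

    static boolean isVerifierAssignable(Class<?> from, Class<?> to) {
        if (from.isPrimitive() || to.isPrimitive()) {
            // Primitives are out of scope for this sketch; only identical types match.
            return from == to;
        }
        if (to.isInterface()) {
            // Interface targets accept any reference type at verification time.
            return true;
        }
        // For proper classes, ordinary subclassing decides.
        return to.isAssignableFrom(from);
    }
}
```

In this model `isVerifierAssignable(Number.class, Comparable.class)` holds even though Number does not implement Comparable, which mirrors `Tom` being accepted where `Product` is expected in the stack maps above.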


