---------- Forwarded message ----------
From: John Rose <John...@sun.com>
Date: Oct 9, 2007 11:27 PM
Subject: Project proposal: Multi-Language VM
To: anno...@openjdk.java.net
Hello world. I propose a new OpenJDK project[1],
the Multi-Language VM, to be abbreviated "mlvm",
and to be sponsored by the HotSpot group[2].
This project will be open for prototyping
JVM features aimed at efficiently supporting
languages other than Java.
The emphasis will be on completing the existing
bytecode and execution architecture with general
purpose extensions, as opposed to a new feature
for just one language, or adjoining an unrelated
new execution model.
The emphasis will also be on work which removes
"pain points" already observed by implementors
of successful or influential languages, as opposed
to more speculative work on unproven features or
niche languages.
Virtual machines produced by this project will
be standards-conforming, in that they will not change
the meaning or behavior of existing Java classes
and classfile formats. They may define variations
or extensions of the class format, or new kinds of
objects, whose meaning and behavior are beyond
the scope of current Java and JVM specifications.
However, these extended codes and data structures
will interoperate as much as possible with Java objects.
In addition, as a way of delimiting separate prototyping
efforts, each new feature will come with a switch which
turns it off, and that switch will be "off" by default.
This is the approach used in the Kitchen Sink Language
project.[3]
This proposal refines and completes a partial proposal
I sent earlier this year to the HotSpot project,
a proposal for a "Kitchen Sink VM"[4]. The present
proposal is more specifically directed at supporting
new languages (i.e., those languages which are
new to the JVM).
Here are some examples of features that could be
prototyped in this project, if developers were found
who are willing and able:
- tail calls and tail recursion [5]
- continuations and coroutines [6]
- tuples and value-oriented types [7]
- lightweight method objects [8]
- runtime support for closures [9]
- invokedynamic [10]
Prototyping for JSR 292[11] is likely to occur as
a part of this project. Note that none of the above
suggested features is specific to any single language.
As the current OpenJDK Project guidelines request,
please send followups to the discussion list.[12]
Thanks very much for your attention to this matter,
-- John Rose
http://blogs.sun.com/jrose/
[1] http://openjdk.java.net/projects/
[2] http://openjdk.java.net/groups/hotspot/
[3] http://ksl.dev.java.net/ (Kitchen Sink Language)
[4] http://mail.openjdk.java.net/pipermail/hotspot-dev/2007-July/000091.html (Kitchen Sink VM)
[5] http://blogs.sun.com/jrose/entry/tail_calls_in_the_vm
[6] http://lambda-the-ultimate.org/node/1002 (Continuations for Java)
[7] http://blogs.sun.com/jrose/entry/tuples_in_the_vm
[8] http://groups.google.com/group/jvm-languages/t/dbc3a4a382868904
(Lightweight Methods)
[9] http://www.javac.info/ (Java Closures)
[10] http://groups.google.com/group/jvm-languages/web/implementation-of-multimethods-in-jvm-languages
[11] http://jcp.org/en/jsr/detail?id=292#2 (Original JSR 292 request)
[12] http://mail.openjdk.java.net/mailman/listinfo/discuss
Many of the ideas floated on this list deserve to be
prototyped on the OpenJDK mlvm.
If I had a decade's leisure, I'd try them all personally.
Right now I have time mainly for dynamic invoke work,
treating neighboring ideas (like other invocation features)
as targets of opportunity. Stay tuned...
If you want to get into the sources, get used to Mercurial.
The mlvm will be a Mercurial workspace.
Best,
-- John
P.S. The shift of OpenJDK to Mercurial is happening as we speak:
http://weblogs.java.net/blog/kellyohair/archive/2007/10/openjdk_mercuri_4.html
This is an amazing transition for us at Sun. We've been using
SCCS/Teamware since the beginning of time, and now we're
switching to an open-source, dynamic-language-based SCM
system. I've been a Teamware fan for a long time, and I'm glad
there's finally a modern replacement for it. (The key property
of both systems is fully symmetric distribution; you can do all
your code management disconnected from the local and/or
global nets. There is no special server that claims to know
everything. When you do a pull you get all the bits.)
Neal
Would the work from the Java Grande group apply, or is there other
relevant research?
Thanks
Patrick
I'm not aware of any work on improving the performance/correctness
tradeoff of integral arithmetic on the Java platform. If you're
generating byte-code for the Java VM, you have to choose between good
performance but limited precision with silent overflow or poor
performance but flexible precision. The general problem was "solved"
long ago, including in the StrongTalk system that was a predecessor to
HotSpot. Unfortunately those results have not been applied to the Java
platform.
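The tradeoff described above can be sketched in Java. This is only an illustration, not the Strongtalk technique and not something the JVM does on its own: it assumes Math.addExact (added to the JDK in Java 8, well after this thread), and the class name CheckedAdd is hypothetical. The fast path stays in 64-bit arithmetic and the slow path switches to BigInteger only when an overflow is detected.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration; not part of any runtime discussed here.
final class CheckedAdd {
    // Returns a Long when the sum fits in 64 bits, otherwise a BigInteger.
    static Number add(long a, long b) {
        try {
            return Math.addExact(a, b);   // fast path: throws ArithmeticException on overflow
        } catch (ArithmeticException overflow) {
            // slow path: flexible precision, no silent wraparound
            return BigInteger.valueOf(a).add(BigInteger.valueOf(b));
        }
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2));               // prints 3
        System.out.println(add(Long.MAX_VALUE, 1));  // prints 9223372036854775808
    }
}
```

A language implementor generating bytecode would still pay for boxing the result as a Number here, which is part of the cost being complained about.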
Continuations!
Lightweight methods!
You have me hooked. I guess lightweight methods will get JRuby folks
hooked as well really fast :-)
As for multilinguality in a managed runtime -- at JAOO this year, Charlie
Nutter, Erik Meijer, and myself were interviewed by Kresten Thorup about
dynamic languages on managed runtimes, and one of the ideas that came up
was that (analogously to how today you have class loader mechanisms to
load executable content dynamically into the runtime) it would be great to
have a runtime where you can go one level lower and actually load
metaobject protocols into the runtime dynamically. I don't know yet
whether this makes sense (it feels to me it does), but is definitely
something to think about.
Of course, there's always the danger of losing focus if the goal is too
broad; see history of Parrot for terrifying examples. Although I'm
optimistic -- building on a foundation of the HotSpot JVM architecture
gets you immediately past lots of project infancy problems.
Attila.
--
home: http://www.szegedi.org
weblog: http://constc.blogspot.com
On Wed, 10 Oct 2007 08:57:42 +0200, Patrick Wright <pdou...@gmail.com>
wrote:
Any other cool features from StrongTalk that haven't made it into the
JVM yet? StrongTalk, from what I've been able to find about it (and of
that, the stuff that doesn't go completely over my head), sounds
awesome.
James
> it would be great to
> have a runtime where you can go one level lower and actually load
> metaobject protocols into the runtime dynamically. I don't know yet
> whether this makes sense (it feels to me it does), but is definitely
> something to think about.
There are projects to soften up the JVM's object architecture,
to give it a MOP, but that sort of thing probably requires
a redesign from the ground up.
With HotSpot, I hope to make the bytecode set more flexible,
so that it can be used as a type-safe, GC-able assembly language,
and then build MOPs on top.
But (thinking here...) there might be MOP-like capabilities we could
push downward into the current hardwired runtime. For example,
it might be reasonable, given method handles, to have a VM-level
operation for creating a class and populating its vtable with closures.
(I once wrote a Scheme system which did this to C++ base classes.)
But that's probably secondary, compared with a primary feature of
method handles. (With purely structural (signature-based) calling
sequence and provision for making closures, of course.)
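For readers who want something concrete, here is a minimal sketch of a signature-based method handle and a closure-like bound handle. It assumes the java.lang.invoke API that later shipped with Java 7 as part of JSR 292, so it is not the design under discussion at the time of this message; the class name MethodHandleSketch and its two methods are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical illustration class; only the java.lang.invoke calls are real API.
final class MethodHandleSketch {
    // Looks up String.length() and calls it through a handle of type (String)int.
    static int lengthViaHandle(String s) {
        try {
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            return (int) length.invokeExact(s);   // call site must match (String)int exactly
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Binds a receiver to String.concat, giving a closure-like handle of type (String)String.
    static String greetViaBoundHandle(String name) {
        try {
            MethodHandle concat = MethodHandles.lookup()
                    .findVirtual(String.class, "concat",
                            MethodType.methodType(String.class, String.class));
            MethodHandle hello = concat.bindTo("hello, ");
            return (String) hello.invokeExact(name);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

The point of the sketch is that the caller needs to know only the handle's signature, not the class that declares the target method.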
The fast out-of-line dynamic calling sequence in HotSpot is called
a monomorphic inline cache. It involves having the caller pass a
token (expected callee class) which the callee checks quickly on
method entry. (This is called the "verified entry point", or VEP,
as opposed to the "unverified entry point", or UEP.)
Calling a method handle probably requires a similar check,
of a receiver signature rather than a receiver class.
(After all, closures are classified by their signatures.)
Introducing such a calling sequence into the JVM would
be very profitable.
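The caching idea can be modelled in plain Java. This is a toy model under stated assumptions, not HotSpot's implementation (which patches machine code at the call site): the class MonomorphicCallSite, its fields, and the slowLookup table are all hypothetical. Each call does one class comparison against the single expected class and goes to a slow lookup only on a miss.

```java
import java.util.function.Function;

// Hypothetical model of one call site; names are illustrative, not HotSpot's.
final class MonomorphicCallSite {
    private Class<?> expectedClass;            // the token the caller expects to match
    private Function<Object, String> target;   // method resolved for that one class
    int misses = 0;                            // how often the slow path ran

    String call(Object receiver) {
        // One comparison on the fast path.
        if (receiver.getClass() != expectedClass) {
            misses++;
            expectedClass = receiver.getClass();
            target = slowLookup(expectedClass);  // full lookup, then re-cache
        }
        return target.apply(receiver);
    }

    // Stand-in for a full dynamic method lookup.
    private static Function<Object, String> slowLookup(Class<?> c) {
        if (c == Integer.class) return r -> "int:" + r;
        if (c == String.class) return r -> "str:" + r;
        return r -> "obj:" + r;
    }
}
```

Calling the site repeatedly with Integer receivers costs one miss in total; switching to a String receiver costs one more miss and replaces the cached entry.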
I'll post more on method handles when I get some blog time.
> Of course, there's always the danger of losing focus if the goal is too
> broad; see history of Parrot for terrifying examples.
Yes. That initial list of ideas, if fully explored, would cost
decades of work. (And that's just my own pick of favorites.)
So there is (as always) a need to choose the most profitable
projects first. I'm working on dynamic invoke, and I agree that
some sort of method handles are also a big need.
You mentioned continuations. I have a low-level design
and a rough implementation sketch for the suspension,
pickling, and resumption of coroutines, but (short on time
and expertise) I left out the all-important part of resumption,
which HotSpot would call on-stack replacement.
HotSpot does perform several kinds of stack frame editing.
That code is not easy to extend. (Surprised?) It would be
lovely to have (someday) a flexible system ("MsfP") for
stack frame management, so we could do things like
reoptimization, evacuation to the heap on overflow,
or work stealing directly from the execution stack.
> Although I'm
> optimistic -- building on a foundation of the HotSpot JVM architecture
> gets you immediately past lots of project infancy problems.
Yes, that's the really appealing thing to me about this.
We've spent 10 years working on this platform,
investing in Java, and now it's time to make that
investment pay off for the newer crop of languages.
One big investment, BTW, is the JITs.
The server compiler is not Java-specific at all.
It has a (now-classic) SSA sea of low-level nodes.
As a good citizen of the JVM, it knows how to leverage
type profiles and invariants provided by the object architecture.
But it's really a pretty general purpose compiler.
(Recently we added some vectorization optimizations.)
-- John
Google, CiteSeer and I know about *poly*morphic inline caching, but not
about *mono*morphic inline caching: any chance of a quick sketch of the
contrast and a pointer to the literature?
Cheers,
Miles
Actually, google does seem to know about it, and it's just the obvious
restriction to an inline cache for exactly one <class, method> pair.
Apologies for the noise.
Cheers,
Miles
> Actually, google does seem to know about it, and it's just the obvious
> restriction to an inline cache for exactly one <class, method> pair.
Quite so, and the idea is that on highly pipelined modern CPUs,
you don't want to pay the price for more than a single conditional
branch.
--
GMail doesn't have rotating .sigs, but you can see mine at
http://www.ccil.org/~cowan/signatures
[11] http://www.weiqigao.com/blog/2007/01/20/java_generics_let_the_other_shoe_drop.html
cheers,
Rémi
and reified generics (generics at runtime) [11]
Neal, could you sketch out the correct general solution, for those of us
not familiar with Strongtalk?
To my VM-tuned ears, it sounds like a job for "class splitting".
I.e., the JVM currently has a one-to-one relation between
bytecoded classes and the klass IDs which are stored
in object headers. This need not be the case.
I know of several possible reasons to split classes:
- saving parameters from erasure
- distinguishing between instances with different creation paths
(constructors, etc.)
- distinguishing between optimized and general-case instances
(short & long number formats, etc.)
- distinguishing between immediate and ordinary objects
- adding instance-specific methods (Ruby, etc.)
- other behavioral customizations on well-known instances like enums
The cost tradeoffs are the usual balance between copying hot
information and indirecting to shared information.
Sharing vtables requires some sort of extra check on method call.
In the extreme case of immediate (non-oop) pseudo-pointers,
the processing of object references is somewhat complicated
by detecting non-oop tag bits. (E.g., if a subrange of Integer
were encoded into an immediate pointer, a corresponding
klass split from Integer would make sure never to indirect the
'this' pointer and instead decode the 'value' field from a bitfield
therein.)
These design questions need to be explored in a VM-centric way.
By that I mean what low-level general purpose API will allow
applications (or optimization packages) to organize split classes,
in such a way that use cases like the above are well-served.
-- John
P.S. "oop" comes from Smalltalk and means "ordinary object pointer",
as opposed to a primitive value bit-encoded into a pseudo-pointer.
In HotSpot, all oops have 2 or 3 low zero bits, so it's a relatively
simple matter to set one of those bits to distinguish pseudo-pointers.
On 64-bit systems the possibilities are impressive.
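The tag-bit trick can be simulated with a plain 64-bit word. This is a sketch under assumptions, not HotSpot code: it assumes real oops always have a zero low bit, picks the lowest bit as the tag, and uses the hypothetical class name TaggedWord. Java itself gives no access to raw object pointers, so a long stands in for the machine word.

```java
// Hypothetical simulation of an immediate (non-oop) pseudo-pointer in a 64-bit word.
final class TaggedWord {
    static final long TAG_IMMEDIATE = 1L;   // assumed tag: low bit set means "not an oop"

    // Packs a small integer into a pseudo-pointer.
    // No range check is done here; the value is assumed to fit in 63 bits.
    static long encodeImmediate(long value) {
        return (value << 1) | TAG_IMMEDIATE;
    }

    // True when the word carries an encoded value rather than an address.
    static boolean isImmediate(long word) {
        return (word & TAG_IMMEDIATE) != 0;
    }

    // Decodes the value from the bitfield; the arithmetic shift preserves the sign.
    static long decodeImmediate(long word) {
        return word >> 1;
    }
}
```

Every place that follows a reference would need the isImmediate check first, which is the extra processing cost mentioned above.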
That link is for the ACM, which (although a great resource) requires a
subscription to view. I contacted Mirko Viroli, one of the authors,
and he sent me PDFs of that paper and an earlier one he worked on.
They're posted in our Files section
http://groups.google.com/group/jvm-languages/files
Regards
Patrick
At $100 per year, it's a bargain for anybody with an ongoing need to
access research publications.
But my experience with other restricted publishers, say Springer or
Elsevier, is that with a little Web searching, you can often find other
copies (sometimes only pre-publication drafts) on the wild Web.
> I contacted Mirko Viroli, one of the authors,
> and he sent me PDFs of that paper and an earlier one he worked on.
That often works, too. I once got a Japanese researcher to send me a print
copy of a publication for which he had no digital counterpart!
> They're posted in our Files section
> http://groups.google.com/group/jvm-languages/files
>
>
> Regards
> Patrick
Randall Schulz
My bad. Immediately after posting the two files, Mirko wrote me to
tell me they were actually under ACM copyright. I checked, and they
are (it is in the footnote). I have removed the links. This was a
misunderstanding on my part in my email with Mirko.
If you downloaded these since I posted them, please observe the
copyright and do not post them on servers or distribute in similar
fashion (see copyright notice on first page, footnote).
Sorry for the trouble--I actually did a Google sweep yesterday but
could find no other copies (outside the ACM).
Patrick
It's not your browser, it's an agreement between your university and the
ACM. Stanford has this arrangement with many publishers, including
Springer and Elsevier. From their network you can retrieve digital
versions of all their publications (that is, all those distributed in
digital form).
> ...
>
> Rémi
Randall Schulz
JRuby's inline cache is currently monomorphic, and after removing a few
roadblocks it appears that HotSpot has really picked that up and run
with it. However I have a PIC patch hanging around that improved
polymorphic dispatch by almost 50%, and didn't appear to impact
monomorphic inline caching perf at all. We will probably put it in place
some time soon, along with a "hotness" measure to allow re-sorting the
cache occasionally.
- Charlie
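A polymorphic cache with a hotness counter could look roughly like the following. This is a guess at the general shape, not JRuby's patch: the class PolymorphicCallSite, the hits counter, and the resort method are hypothetical, and the cache is left unbounded for brevity where a real one would cap the number of entries.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

// Hypothetical polymorphic inline cache; not taken from JRuby.
final class PolymorphicCallSite {
    private static final class Entry {
        final Class<?> type;
        final Function<Object, String> target;
        long hits;   // "hotness": number of calls dispatched through this entry

        Entry(Class<?> type, Function<Object, String> target) {
            this.type = type;
            this.target = target;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private final Function<Class<?>, Function<Object, String>> slowLookup;

    PolymorphicCallSite(Function<Class<?>, Function<Object, String>> slowLookup) {
        this.slowLookup = slowLookup;
    }

    String call(Object receiver) {
        Class<?> type = receiver.getClass();
        for (Entry e : entries) {             // linear probe, front entries are cheapest
            if (e.type == type) {
                e.hits++;
                return e.target.apply(receiver);
            }
        }
        Entry added = new Entry(type, slowLookup.apply(type));   // miss: resolve and cache
        added.hits = 1;
        entries.add(added);
        return added.target.apply(receiver);
    }

    // Moves the hottest receiver types to the front; meant to run only occasionally.
    void resort() {
        entries.sort(Comparator.comparingLong((Entry e) -> e.hits).reversed());
    }

    // First type probed on the next call, or null if nothing is cached yet.
    Class<?> firstCachedType() {
        return entries.isEmpty() ? null : entries.get(0).type;
    }
}
```

With one Integer call followed by three String calls, Integer sits at the front until resort() runs, after which String is probed first.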