Yesterday was the International Invokedynamic Day


Attila Szegedi

Aug 27, 2008, 4:07:31 AM
to jvm-la...@googlegroups.com
Folks,

as John Rose reported in his weblog, yesterday was the International
Invokedynamic Day: <http://blogs.sun.com/jrose/entry/international_invokedynamic_day>,
defined as the day when (quote) "... VM has for the first time
processed a full bootstrap cycle for invokedynamic instructions,
linking the constant pool entries, creating the reified call site
object, finding and calling the per-class bootstrap method, linking
the reified call site to a method handle, and then calling the linked
call site 999 more times through the method handle, at full speed. ..."
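For anyone who hasn't followed the JSR 292 work closely, here's a rough
sketch of what that bootstrap handshake looks like from the language
runtime's side, written against the java.lang.invoke API as it eventually
shipped in JDK 7 (the early prototype lived in java.dyn and differed in
detail); the class and method names are invented purely for illustration:

import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BootstrapSketch {
    // Called by the JVM the first time a matching invokedynamic
    // instruction executes: it receives the call site's name and type,
    // looks up a target method, and returns the reified CallSite.
    public static CallSite bootstrap(MethodHandles.Lookup lookup,
                                     String name,
                                     MethodType type) throws Throwable {
        MethodHandle target =
                lookup.findStatic(BootstrapSketch.class, name, type);
        // Every later execution of the instruction (the "999 more times")
        // goes straight through the linked method handle.
        return new ConstantCallSite(target);
    }

    // A hypothetical method a dynamic call might resolve to.
    public static int addOne(int x) {
        return x + 1;
    }
}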

(Apologies, John, if you wished to post it here too in addition to
mlvm-dev, but I thought I'd just go ahead and bring the good news here too.)

John's original announcement on mlvm-dev:
<http://mail.openjdk.java.net/pipermail/mlvm-dev/2008-August/000174.html>

I probably don't need to illustrate the significance of this to anyone
on this list :-)

Attila.

John Rose

Aug 27, 2008, 4:22:21 AM
to jvm-la...@googlegroups.com
On Aug 27, 2008, at 1:07 AM, Attila Szegedi wrote:

> (Apologies, John, if you wished to post it here too in addition to
> mlvm-dev, but I thought I'd just go ahead and bring the good news here too.)

No apology needed at all... Thanks, Attila.

Folks, invokedynamic, continuations and tailcalls are brewing (in
three different locations) and interface injection will (I hope)
become real after method handles go mainstream. Great things are
happening with the JVM. Oh, and we on the HotSpot team continue to
work on "classic" things like GC and compiler improvements, and
better use of emerging hardware.

It's my personal hope that, with such a strong and rich
infrastructure and lively community, the next great programming
paradigm will be invented on the JVM. That's exciting enough to make
me work the long hours on language plumbing.

Best wishes,
-- John

Charles Oliver Nutter

Aug 27, 2008, 1:56:41 PM
to jvm-la...@googlegroups.com
John Rose wrote:
> Folks, invokedynamic, continuations and tailcalls are brewing (in
> three different locations) and interface injection will (I hope)
> become real after method handles go mainstream. Great things are
> happening with the JVM. Oh, and we on the HotSpot team continue to
> work on "classic" things like GC and compiler improvements, and
> better use of emerging hardware.
>
> It's my personal hope that, with such a strong and rich
> infrastructure and lively community, the next great programming
> paradigm will be invented on the JVM. That's exciting enough to make
> me work the long hours on language plumbing.

I'm hoping to wire up a prototype JRuby using invokedynamic starting
this weekend, and I'm also looking forward to these other features. I'll
report back here as I start that work and of course it will all be
available while it's in progress.

- Charlie

Ben Loud

Aug 28, 2008, 7:38:18 AM
to jvm-la...@googlegroups.com
Attila,
 
I've been wondering, are you planning on trying to do a version of Rhino with invokedynamic/ACL etc.?
 
There's been a lot of buzz about TraceMonkey recently, and a lot of people promoting it by arguing that HotSpot is 'too heavyweight' and that, because JavaScript is dynamic, it wouldn't do as good a job as a 'specially designed' JIT (see http://ejohn.org/blog/tracemonkey/ for some amusing comments, including "We just applied the most relevant research (Andreas's) to JavaScript. Sun would not and will not do that, because JS is not Java." Seems Mr Eich is not aware of John's work!).
 
I'm hoping their criticisms will be proven wrong, so I'd be very interested to see how they compare.


Charles Oliver Nutter

Aug 28, 2008, 12:51:31 PM
to jvm-la...@googlegroups.com
Ben Loud wrote:
> Attila,
>
> I've been wondering, are you planning on trying to do a version of Rhino
> with invokedynamic/ACL etc.?
>
> There's been a lot of buzz about TraceMonkey recently, and a lot of
> people promoting it by arguing that HotSpot is 'too heavyweight' and
> that, because JavaScript is dynamic, it wouldn't do as good a job as
> a 'specially designed' JIT (see http://ejohn.org/blog/tracemonkey/ for
> some amusing comments, including "We just applied the most relevant
> research (Andreas's) to JavaScript. Sun would not and will not do that,
> because JS is not Java." Seems Mr Eich is not aware of John's work!).
>
> I'm hoping their criticisms will be proven wrong, so I'd be very
> interested to see how they compare.

FWIW, there's a *lot* of room for improvement in most of the dynamic
languages on the JVM, including JRuby. Groovy, Rhino, and Jython, as far as
I know, still use reflection for all calls, which means boxed argument
lists and a lot more call overhead. JRuby, without reflection, can at
times achieve dumb benchmark performance only a few times slower than
equivalent fully-boxed-Integer Java performance. With more research and
time and unboxing smarts, it could approach Java perf. So the techniques
we've used in JRuby combined with invokedynamic's support for
reflection-free handles and active enlistment in HotSpot's method
dispatch process will no doubt increase the performance of all dynlangs,
elevating them at least to the level of fully-boxed Java, and
potentially beyond that with a bit more work.
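
To make that concrete, here's a rough comparison of a reflective call
with a method handle call (sketched with the java.lang.invoke API as it
later shipped in JDK 7; the square method is just a stand-in):

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

public class CallOverheadSketch {
    public static int square(int x) { return x * x; }

    public static void main(String[] args) throws Throwable {
        // Reflective style: arguments travel in an Object[] and the int
        // result comes back boxed as an Integer.
        Method m = CallOverheadSketch.class.getMethod("square", int.class);
        Object boxed = m.invoke(null, 21);

        // Method handle style: the call keeps its (int)int signature,
        // so there is no argument array and no boxing.
        MethodHandle mh = MethodHandles.lookup().findStatic(
                CallOverheadSketch.class, "square",
                MethodType.methodType(int.class, int.class));
        int unboxed = (int) mh.invokeExact(21);

        System.out.println(boxed + " " + unboxed);
    }
}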

- Charlie

Attila Szegedi

Aug 28, 2008, 2:51:05 PM
to jvm-la...@googlegroups.com
Well, as much as I'd love to, I personally don't really have the
required spare cycles to work on that in the foreseeable future.

Also, while invokedynamic goes a long way toward running dynamic code
efficiently on the JVM, there are some things that TraceMonkey does
specifically for JavaScript (and that other dynamic languages could
probably also greatly use) and that HotSpot currently doesn't, mostly
to do with efficient type narrowing of values (so not everything has
to be treated as a generic, heap-allocated object all the time). Truth
be told, HotSpot doesn't actually need to do most of that, as the
language-specific source-to-bytecode compiler can take care of it;
indeed the bytecode compiler in Rhino with the optimization level set
to 1 or above will do some such things -- strictly at the
intraprocedural level. Now, I haven't looked into TraceMonkey, but I
have it on my (fairly large, unfortunately) todo list to read through
the peer-reviewed papers it's based on, and I fully expect I'll find
things like type specialization for functions and similar techniques.
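
To give a flavor of the kind of narrowing I mean, here's a hand-written
sketch (plain Java, invented names) of what a compiler might emit for a
dynamically typed add: an int fast path behind a type guard, with a
fully generic fallback:

public class TypeNarrowingSketch {
    // Fully generic semantics: every value is a heap-allocated object.
    static Object genericAdd(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) {
            // Narrowed path: once the guard passes, operate on primitive
            // ints and re-box only the result.
            return (Integer) a + (Integer) b;
        }
        // Generic fallback (string concatenation here, just for brevity).
        return String.valueOf(a) + b;
    }

    public static void main(String[] args) {
        System.out.println(genericAdd(2, 3));       // 5, via the int path
        System.out.println(genericAdd("fo", "o"));  // "foo", via the fallback
    }
}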

Also note how the balance of performance goals is usually fairly
different for Rhino and for JS runtimes in browsers: in browsers, you
want fast startup time for the scripts, so you won't want to run a
full gamut of optimizations on your code; the HotSpot JIT, especially the
server compiler, will, as it doesn't mind sacrificing startup performance
for long-run throughput.

I'll make some further comments on Charlie's follow-up to your mail.

Attila.

Attila Szegedi

Aug 28, 2008, 2:56:32 PM
to jvm-la...@googlegroups.com

That's actually a very promising direction to go in -- there's a bunch
of bytecode-level optimizations that apply across the board for all
(or at least, most) dynamic languages. I'm not referring only to
calling Java from a dynalang, but rather to optimizing the bytecode
produced from dynalang source code; inferring types (even
speculatively) within methods, having multiple type-optimized versions
of methods when they can be invoked with arguments of different types,
etc. It almost sounds as if it would make sense to come up with a
common framework for such optimizations; using a higher-level
"dynamic" intermediate bytecode format that all dynalang compilers
would compile to, and then having a single dynamic bytecode to JVM
bytecode optimizing compiler ticking underneath them all. (I believe
one of your goals with this list was to actually gather people to
identify such common components and establish a cooperative for
creating a single good implementation for them).

Attila.


John Rose

Aug 28, 2008, 3:27:12 PM
to jvm-la...@googlegroups.com
On Aug 28, 2008, at 9:51 AM, Charles Oliver Nutter wrote:

> With more research and time and unboxing smarts, it could approach Java perf.

The invokedynamic instruction removes signature-related bottlenecks from dynamically typed JVM calls, allowing dynamic calls to be compiled with any desired signature, not just reflective style (varargs with boxed primitives) or Smalltalk style (fixed argument count, with boxed primitives).  It also provides a vm-supported way to plug-and-play with method references, which until now everybody had to invent for themselves (since java.lang.reflect.Method does too much).

So if the dynamic language backend can do enough type inference to make calls "optimistically" with some unboxed arguments, and if the call target actually accepts a signature that is close to the caller's signature, a cheap adapter can be inserted by the runtime, and the call can go through with a minimum of data motion.  If the signatures match exactly (or closely enough so that no dynamic checks are needed to verify data integrity), then the call can be as direct as a normal Java call, and can be inlined just the same.
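Here is a rough sketch of that adapter step, expressed in the java.lang.invoke API as it eventually shipped (the sum method and the signatures are invented for illustration):

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class AdapterSketch {
    // Hypothetical callee with a primitive signature.
    static long sum(long a, long b) { return a + b; }

    public static void main(String[] args) throws Throwable {
        MethodHandle target = MethodHandles.lookup().findStatic(
                AdapterSketch.class, "sum",
                MethodType.methodType(long.class, long.class, long.class));

        // Suppose the caller optimistically compiled the call site as
        // (int,int)long.  asType inserts cheap int->long widening adapters
        // so the call goes through with a minimum of data motion; if the
        // signatures matched exactly, no adapter would be needed at all.
        MethodHandle adapted = target.asType(
                MethodType.methodType(long.class, int.class, int.class));

        long r = (long) adapted.invokeExact(20, 22);   // 42
        System.out.println(r);
    }
}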

Also, because the JSR 292 JVM supports method handles directly, there can be close coupling between the handle used by the dynamic runtime and the underlying "real" method, to the point where optimizers routinely "see through" the handle to the method.  Because of the complexity and semantic mismatch of java.lang.reflect.Method, this close coupling has been rare in the past.

The statefulness of invokedynamic (as seen through CallSite.setTarget) is intended to be exactly enough for the JIT to process dynamic call sites statically.  It can optimistically inline the method handle at every invokedynamic call site, and take corrective action if a new target comes along and invalidates inlined code.  We've been doing this optimization in HotSpot for years, where the same statefulness is in the class hierarchy.  (Did an overriding method just get loaded?  Gee, gotta recompile.)  Now the same state change hazards will also be in each call site, where dynamic languages need the help.
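A minimal sketch of that statefulness, using MutableCallSite.setTarget as the final java.lang.invoke API spells it (the greet methods stand in for a guest-language method before and after redefinition):

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class CallSiteStateSketch {
    static String greetV1(String who) { return "hello, " + who; }
    static String greetV2(String who) { return "HELLO, " + who + "!"; }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType type = MethodType.methodType(String.class, String.class);

        // The runtime links the site to an initial target; the JIT is free
        // to inline that target at the call site.
        MutableCallSite site = new MutableCallSite(
                lookup.findStatic(CallSiteStateSketch.class, "greetV1", type));
        MethodHandle invoker = site.dynamicInvoker();
        System.out.println((String) invoker.invokeExact("world"));

        // A "method redefined" event in the guest language: setTarget swaps
        // the target, and any code that inlined the old one must be fixed up.
        site.setTarget(
                lookup.findStatic(CallSiteStateSketch.class, "greetV2", type));
        System.out.println((String) invoker.invokeExact("world"));
    }
}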

By itself, invokedynamic is not a full replacement for fixnums (small integers packed into a tagged pointer), since (for example) it doesn't help you create a list of ints without boxing.  But it does provide full speed paths for important functions like generic arithmetic and sequence references (where the index types are often plain ints, and do not benefit from boxing).  Boxed integers in the JVM are of reasonable performance now, and can be improved transparently if we put in fixnums some day (regarding which see my blog post on fixnums).
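As an illustration of such a full-speed path for generic arithmetic, here is a guarded fast path built with guardWithTest (again in the eventual java.lang.invoke spelling, with invented helper methods):

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class GenericArithmeticSketch {
    // Guard: are both operands fixnum-like Integer values?
    static boolean bothInts(Object a, Object b) {
        return a instanceof Integer && b instanceof Integer;
    }

    // Fast path: unbox, add as primitive ints, re-box only the result.
    static Object addInts(Object a, Object b) {
        return (Integer) a + (Integer) b;
    }

    // Slow path: whatever generic dispatch the language runtime performs
    // (string concatenation here, just as a placeholder).
    static Object addGeneric(Object a, Object b) {
        return String.valueOf(a) + b;
    }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType binop = MethodType.methodType(Object.class, Object.class, Object.class);
        MethodType test  = MethodType.methodType(boolean.class, Object.class, Object.class);

        MethodHandle add = MethodHandles.guardWithTest(
                lookup.findStatic(GenericArithmeticSketch.class, "bothInts", test),
                lookup.findStatic(GenericArithmeticSketch.class, "addInts", binop),
                lookup.findStatic(GenericArithmeticSketch.class, "addGeneric", binop));

        System.out.println(add.invoke(2, 40));      // 42, via the int fast path
        System.out.println(add.invoke("4", "2"));   // "42", via the generic path
    }
}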

-- John

Charles Oliver Nutter

Aug 29, 2008, 12:22:38 AM
to jvm-la...@googlegroups.com
Attila Szegedi wrote:
> That's actually a very promising direction to go in -- there's a bunch
> of bytecode-level optimizations that apply across the board for all
> (or at least, most) dynamic languages. I'm not referring only to
> calling Java from a dynalang, but rather to optimizing the bytecode
> produced from dynalang source code; inferring types (even
> speculatively) within methods, having multiple type-optimized versions
> of methods when they can be invoked with arguments of different types,
> etc. It almost sounds as if it would make sense to come up with a
> common framework for such optimizations; using a higher-level
> "dynamic" intermediate bytecode format that all dynalang compilers
> would compile to, and then having a single dynamic bytecode to JVM
> bytecode optimizing compiler ticking underneath them all. (I believe
> one of your goals with this list was to actually gather people to
> identify such common components and establish a cooperative for
> creating a single good implementation for them).

Yes, I've started to lean toward the notion that designing a
meta-bytecode layer that's "JVM bytecode ++" would be a more realistic
target for many language developers to work toward than what the DLR
folks have been aiming for. Put simply, I still believe that coming up
with an "abstract semantic tree" that would ever be general enough to
support "all" dynamic languages is impossible without making it so
finely-chopped and disconnected that it's impossible to realistically
optimize. But I digress.

I'm getting more to the point where just emitting bytecode is like a
second language. I've debated writing portions of JRuby in a Ruby-based
bytecode DSL of mine, just so I'd have that bare-metal control over it
without all the JVM noise:

static_method(:main, Void::TYPE, String[]) do
  aload 0
  ldc_int 0
  aaload
  invokestatic this, :foo, [this, String]
  invokevirtual this, :getList, ArrayList
  aprintln
  returnvoid
end
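
For comparison, here's roughly the same method emitted through the ASM
library directly (ASM 3-style API; the Example owner class and the
descriptors for foo and getList are invented), which is exactly the kind
of JVM noise the DSL hides:

import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import static org.objectweb.asm.Opcodes.*;

public class EmitMainSketch {
    public static byte[] emit() {
        ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);
        cw.visit(V1_5, ACC_PUBLIC, "Example", null, "java/lang/Object", null);

        MethodVisitor mv = cw.visitMethod(ACC_PUBLIC | ACC_STATIC, "main",
                "([Ljava/lang/String;)V", null, null);
        mv.visitCode();
        mv.visitVarInsn(ALOAD, 0);                      // aload 0
        mv.visitLdcInsn(0);                             // ldc_int 0
        mv.visitInsn(AALOAD);                           // aaload
        mv.visitMethodInsn(INVOKESTATIC, "Example", "foo",
                "(Ljava/lang/String;)LExample;");       // invokestatic
        mv.visitMethodInsn(INVOKEVIRTUAL, "Example", "getList",
                "()Ljava/util/ArrayList;");             // invokevirtual
        // aprintln: print the object on top of the stack
        mv.visitFieldInsn(GETSTATIC, "java/lang/System", "out",
                "Ljava/io/PrintStream;");
        mv.visitInsn(SWAP);
        mv.visitMethodInsn(INVOKEVIRTUAL, "java/io/PrintStream", "println",
                "(Ljava/lang/Object;)V");
        mv.visitInsn(RETURN);                           // returnvoid
        mv.visitMaxs(0, 0);
        mv.visitEnd();
        cw.visitEnd();
        return cw.toByteArray();
    }
}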

And at some level my little "duby" language is really just layering the
prettiness of Ruby's syntax directly on top of JVM bytecode, with
pluggable type inference and compilation layers to allow compile-time
magic. So with what we've done with JRuby and what I'm trying to do with
Duby I've come to the realization that we could build bytecode-level
libraries or APIs that can cater to both while taking into consideration
the vagaries of running well on the JVM and supporting new MLVMish
features as they come along.

I'm also more and more of the opinion that language design up to the
level of emitting bytecode is a very personal experience. Every language
designer is going to have quirks and features they want to represent in
new and peculiar ways...and though we can come up with useful utility
libraries we might all share, trying to form a common runtime all
languages can use seems like a wild goose chase, especially in light of
so many existing languages and so many wildly different language
implementers. But where we have commonality is at the "metal", when we
need to make the language fit into the JVM and execute well. This is
where we all share most of our common challenges, and I think it's here
we can get the most out of sharing efforts.

- Charlie

Charles Oliver Nutter

Aug 29, 2008, 12:34:18 AM
to jvm-la...@googlegroups.com
John Rose wrote:
> On Aug 28, 2008, at 9:51 AM, Charles Oliver Nutter wrote:
>
>> With more research and time and unboxing smarts, it could approach
>> Java perf.
>>
>
> The invokedynamic instruction removes signature-related bottlenecks from
> dynamically typed JVM calls, allowing dynamic calls to be compiled with
> any desired signature, not just reflective style (varargs with boxed
> primitives) or Smalltalk style (fixed argument count, with boxed
> primitives). It also provides a vm-supported way to plug-and-play with
> method references, which until now everybody had to invent for
> themselves (since java.lang.reflect.Method does too much).
>
> So if the dynamic language backend can do enough type inference to make
> calls "optimistically" with some unboxed arguments, and if the call
> target actually accepts a signature that is close to the caller's
> signature, a cheap adapter can be inserted by the runtime, and the call
> can go through with a minimum of data motion. If the signatures match
> exactly (or closely enough so that no dynamic checks are needed to
> verify data integrity), then the call can be as direct as a normal Java
> call, and can be inlined just the same.

I literally get goosebumps every time I think about this. We keep track
of compiler logs in JRuby, watching for inlining opportunities and "too
big" methods and exploring ways to reduce trap occurrences, but even
the cleanest, simplest benchmarks never have enough room to inline more
than one dynamic recursion, even with all we've done to reduce the
complexity of the call path. The promise of invokedynamic is that we'll
be able to remove almost all of that dynamic dispatch logic from all
calls and give HotSpot the "keys to the castle" right at the call site
itself. That's really huge.

> Also, because the JSR 292 JVM supports method handles directly, there
> can be close coupling between the handle used by the dynamic runtime and
> the underlying "real" method, to the point where optimizers routinely
> "see through" the handle to the method. Because of the complexity and
> semantic mismatch of java.lang.reflect.Method, this close coupling has
> been rare in the past.

I'm also absolutely thrilled that both handles and anonymous
classloading are there, with or without invokedynamic, since the
limiting factor for continued JRuby optimization has largely been the gross
overhead incurred by generating and regenerating bytecode in the form of
call adapters, custom call sites (like DLR's DynamicSite) and
type/arity-specific invocation handles. I hope this will finally free us
from those bonds.

> The statefulness of invokedynamic (as seen through CallSite.setTarget)
> is intended to be exactly enough for the JIT to process dynamic call
> sites statically. It can optimistically inline the method handle at
> every invokedynamic call site, and take corrective action if a new
> target comes along and invalidates inlined code. We've been doing this
> optimization in HotSpot for years, where the same statefulness is in the
> class hierarchy. (Did an overriding method just get loaded? Gee, gotta
> recompile.) Now the same state change hazards will also be in each call
> site, where dynamic languages need the help.

And for those of us who have struggled with our own call-site
invalidation mechanisms (struggled as in bashed our heads against the
walls to get a good combination of fast code paths, simple guards,
concurrency, and accuracy all to line up), this is more welcome news.

> By itself, invokedynamic is not a full replacement for fixnums (small
> integers packed into a tagged pointer), since (for example) it doesn't
> help you create a list of ints without boxing. But it does provide full
> speed paths for important functions like generic arithmetic and sequence
> references (where the index types are often plain ints, and do not
> benefit from boxing). Boxed integers in the JVM are of reasonable
> performance now, and can be improved transparently if we put in fixnums
> some day (regarding which see my blog post on fixnums).

As you know, I'm interested in fixnums. One of the up-and-coming Ruby
implementations is based on Gemstone's Smalltalk VM, which I presume
already has true fixnums. As a result of this, they basically smoke all
other Ruby implementations when it comes to fixnum benchmarks. And I
think the cost in JRuby largely comes from two things: fixnum being a
boxed primitive type, and a call path that's too long for HotSpot to see
it's just a boxed primitive type. Ultimately, the ability to wire in a
fixnum type that's just a tagged int would push us well over the line
for math performance, without the need to resort to really ugly tricks
(over-hacks, in my opinion) to figure out when we can hold the JVM's
hand and unbox for it.

- Charlie

Jim White

Aug 29, 2008, 1:47:27 AM
to jvm-la...@googlegroups.com
Charles Oliver Nutter wrote:

> ...


> Yes, I've started to lean toward the notion that designing a
> meta-bytecode layer that's "JVM bytecode ++" would be a more realistic
> target for many language developers to work toward than what the DLR
> folks have been aiming for. Put simply, I still believe that coming up
> with an "abstract semantic tree" that would ever be general enough to
> support "all" dynamic languages is impossible without making it so
> finely-chopped and disconnected that it's impossible to realistically
> optimize. But I digress.
>
> I'm getting more to the point where just emitting bytecode is like a
> second language. I've debated writing portions of JRuby in a Ruby-based
> bytecode DSL of mine, just so I'd have that bare-metal control over it
> without all the JVM noise:

> ...

While I still think AOP is a better approach for this problem, your
observation prompts me to point out that llava is a dandy JVM bytecode
language that has been around for a while.

http://llava.org/

And of course Kawa's gnu.bytecode has been around forever and has spiffy
higher level stuff that uses it (not just Scheme but also a multilingual
abstract layer).

Jim

Brendan Eich

Sep 6, 2008, 8:21:09 PM
to JVM Languages
On Aug 28, 4:38 am, "Ben Loud" <loubs...@gmail.com> wrote:
> There's been a lot of buzz about TraceMonkey recently, and a lot of people
> promoting it by arguing that HotSpot is 'too heavyweight' and that, because
> JavaScript is dynamic, it wouldn't do as good a job as a 'specially
> designed' JIT (see http://ejohn.org/blog/tracemonkey/ for some amusing
> comments, including "We just applied the most relevant research (Andreas's)
> to JavaScript. Sun would not and will not do that, because JS is not Java."
> Seems Mr Eich is not aware of John's work!).

I haven't seen John in 12 years; sorry for not keeping up. But
seriously, I was replying to someone on that blog whose behavior
quickly identified him as a troll, yet who seemed at first to fault us
on technical grounds for not using HotSpot in Firefox(!). Let's get
real.

Even with John's work, we're not able to switch to such a VM. We need
a very small JIT, specialized to JS as a source language, one that
avoids method inlining and similar code bloat. And it has to compete
with the likes of V8, which is not easy (we're apparently the only
browser not caught flat-footed on this particular front).

I'm glad the JVM is getting better dynamic and functional programming
language support. But a JVM is simply not an alternative to the built-
in browser VMs out now or to be released soon. Even Tamarin, in many
ways a Java-1-like VM, is not competitive at untyped JS, which is the
only kind of JS that there is on the Web.

/be