invokedynamic newbie question

Rob N

unread,

Aug 5, 2012, 12:26:38 AM8/5/12

to JVM Languages

Hi,

I'm trying to learn about invokedynamic and the supporting classes,
and I don't yet see how they help implement the dynamically typed
language I have in mind. From what I've read so far, when you emit an
invokedynamic instruction, you specify a bootstrap method in your
language's runtime, but that is only run once, and it must return a
MethodHandle for the CallSite. Lets say I wanted to emit something
for a CLOS-like multiple dispatch method (foo x y), that will select
an implementation method based on the classes of x and y. Would I not
have to return, from the bootstrap method, a handle to a method that
did that dispatch? So why not just emit an invokevirtual or
invokestatic directly to that method? What is the advantage of
invokedynamic here?

thanks,
Rob

Charles Oliver Nutter

unread,

Aug 5, 2012, 3:18:34 AM8/5/12

to jvm-la...@googlegroups.com

A simple way to do what you want might be sketched out like this:

* The bootstrap method returns a CallSite + handle for your
multi-dispatch logic contained in another method.
* The multi-dispatch logic receives the actual arguments for that first call.
* Based on those arguments, you build a guarded chain of method
handles that calls the right target for the incoming types, and falls
back on the multi-dispatch lookup logic again if new types come in.

From here, you can do various combinations of guards, multiple
targets, mechanisms for quickly calculating whether the incoming types
have already been negotiated, and so on.

You want to look at the bootstrap as your way to point future calls
toward your own specialized logic; the bootstrap does not necessarily
contain that logic.

There's an example implementation of multi-dispatch here:
http://code.google.com/p/jsr292-cookbook/source/browse/trunk/multi-dispatch/src/jsr292/cookbook/mdispatch/RT.java

- Charlie

> --
> You received this message because you are subscribed to the Google Groups "JVM Languages" group.
> To post to this group, send email to jvm-la...@googlegroups.com.
> To unsubscribe from this group, send email to jvm-language...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/jvm-languages?hl=en.
>

Rob N

unread,

Aug 5, 2012, 10:54:46 AM8/5/12

to JVM Languages

Thanks. It will take me a while to understand that code. Once I do,
maybe I can write a performance test. I'd like to see a comparison
between invokedynamic with guarded/chained method handles, as you
described, vs calling the dispatch logic the old fashioned way.

Rob

On Aug 5, 3:18 am, Charles Oliver Nutter <head...@headius.com> wrote:
> A simple way to do what you want might be sketched out like this:
>
> * The bootstrap method returns a CallSite + handle for your
> multi-dispatch logic contained in another method.
> * The multi-dispatch logic receives the actual arguments for that first call.
> * Based on those arguments, you build a guarded chain of method
> handles that calls the right target for the incoming types, and falls
> back on the multi-dispatch lookup logic again if new types come in.
>
> From here, you can do various combinations of guards, multiple
> targets, mechanisms for quickly calculating whether the incoming types
> have already been negotiated, and so on.
>
> You want to look at the bootstrap as your way to point future calls
> toward your own specialized logic; the bootstrap does not necessarily
> contain that logic.
>

> There's an example implementation of multi-dispatch here:http://code.google.com/p/jsr292-cookbook/source/browse/trunk/multi-di...

Matt Fowles

unread,

Aug 5, 2012, 11:46:08 AM8/5/12

to jvm-la...@googlegroups.com

Rob~

Such performance tests can be tricky to write correctly. You should make certain that the test actually involves resolving multiple dispatch targets rather than just a single one. The place where this architecture is a big win, is when you have multiple callsites, each of which resolves to one or two targets. The naive implementation will have a single call site (inside the dispatch method) that resolves to every possible target (this is known as a megamorphic call site), whereas invokedynamic will have multiple callsites each of which resolves to one or two targets (this is know as either monomorphic or bimorphic).

Matt

Charles Oliver Nutter

unread,

Aug 5, 2012, 12:14:12 PM8/5/12

to jvm-la...@googlegroups.com

Yes, this is the key point to using invokedynamic. In JRuby, for example, we have the "old way": a CachingCallSite class that does lookups and invocation for every method call in the system. The logic it contains can inline, but from there the actual invocation appears to touch every target in the system. With invokedynamic, we can match up call sites with their one or few targets and the JVM is able to inline. In fact, we have even made it configurable how many target methods with will attempt to cache before failing over to a plain inline cache again...and as a result a Ruby method call can often inline and optimize more targets at a given call site than Java can :-)

Pretty amazing stuff.

- Charlie (mobile)

Rémi Forax

unread,

Aug 5, 2012, 4:27:45 PM8/5/12

to jvm-la...@googlegroups.com

On 08/05/2012 04:54 PM, Rob N wrote:
> Thanks. It will take me a while to understand that code. Once I do,
> maybe I can write a performance test. I'd like to see a comparison
> between invokedynamic with guarded/chained method handles, as you
> described, vs calling the dispatch logic the old fashioned way.
>
> Rob

Hi Rob,
I'm the guy that write this piece of code an because I'm stuck in Salt
Lake City airport for the next three hours,
I suppose I can write something about this code.

The implementation is based on two papers, [1] and [2] and
yes, I've written my Phd thesis on multi-method in Java a long time ago.

The first paper describes a way to do the multi-dispatch using a table
based approach
(like a vtable dispatch but more complex), so it provides a way to
'cover your ass', i.e.
a way to have if no optimization tricks work to fallback to a dispatch
algorithm which is linear to the number of parameter
(so it doesn't depend on the number of implementations of the
multi-method which is something important).

It works like this, let say we have two methods with B extends A,
void foo(A a, A a2) { ... } // impl1
void foo(A a, B b) { ... } // impl2

the algorithm first sort the methods using their types, here because B
is more specific that A,
impl2 is more specific than impl1, so the method array is { impl2, impl1 }.

Then for each parameter index, you record in a hashmap for each
available type,
if a method implementation is applicable (can be called).
Here, the hashmap for the first parameter contains only one key, A, and
the corresponding
bitset is 11 because impl2 and impl1 can be called if A is the first
parameter.
Using the same algorithm for the second parameters you get:
hash1 = { A -> 11 },
hash2 = { A -> 01, B -> 11 }

now when the multi-dispatch is called, let say with an object of class B
and another of class A,
call.foo(B,A),
the first hashmap, hash1, doesn't contains B, in that case, follow the
supertype hierarchy and
try to find a supertype which is a key, here because B inherits from A,
B will act as A,
so you can cache (for the next call) that B -> 11 in hash1.
the second parameter is A, so the hash entry is A -> 01,
The result is
call.foo(B,A) = {11 } & { 01} = 01, so method implementation at index
1 should be called -> impl1

so calling a multi-method should never be slower than accessing to the
hashmap for each arguments,
do a bitwise-and and get the lower bit set (the more specific).

The real algorithm is more convoluted because you can have interfaces,
so you can have more than
one bitset for a class at runtime, when the array is sorted, you may
have to introduce new implementation
by example, for foo(A,B) and foo(B,A) the Java semantics require to
include a third method foo(B,B)
that will just throw an exception because foo(A,B) and foo(B,A) are not
comparable.

Moreover, there is a bunch of special case that can be used to faster
the algorithm.
If there is only one implementation, multi-dispatch is not necessary, if
there is several implementation
but only one is ever call for a callsite (the callsite is monomorphic) a
guard with test will be faster,
(maybe it's also the case for bi-morphic, and so on if only the receiver
class changed)

Also, if the language you want to implement is object oriented, then the
number method applicables
depends on the receiver class, you can use a GuardWithTest for this or
consider the receiver class
as a parameter too.

Also, maybe for a parameter, there is only one possible class, for foo
of our running example,
the first parameter is always an A, in that case, it's better to not
process a bitset for it but
just do an instanceof check (or no check at all if you know that for a
callsite the first parameter
is always an A.

Also, if you have only one parameter (one that can changed), in that
case, it's better to not
compute the set but directly store the method handle in the hash.
For foo, it will be
hash2 = { A -> impl1, B -> impl2 }
and to use the result of hashmap.get() with a fold + exactInvoker.

Also, for a specific callsite, you can have primitive type in the
signature, in that case,
you can pre-process the bitset that will be used and insert it as a
constant (with a insertArguments).

Also, depending on your CPU, some provide an instruction for the highest
bit, some for the number
of trailing zeroes (latest AMD), so if the corresponding intrinsic
exists in hotspot
(I don't think it's true now but it may change) so you may have to sort
the method in one order or
in reverse order.

So currently, as far as I remember, I've implemented the bitset
algorithm but only
if there is less than 32 implementation, so store the bitset on an int.
The hash is thread-safe but it use a splay-tree to avoid to consume to
much memory
but it may be better for performance to use a hashmap (without a
volatile access
if there is no cache miss).
Note that there is two flavor of the hash, one if there is no interface,
so calling getSuperclass()
is enough and a second one works with interfaces.

And I've also implemented all the tricks listed above.

cheers,
R�mi

[1] http://webdocs.cs.ualberta.ca/~paullu/Papers/COOTS2001-HTML/mdj.html
[2] http://www.jot.fm/issues/issue_2005_12/article3/

Rob N

unread,

Aug 5, 2012, 9:30:19 PM8/5/12

to JVM Languages

Thank you everybody for the explanations; they've helped a lot. And
I'm glad you were stuck in the airport Rémi. :|

Rob

Reply all

Reply to author

Forward