Use invokedynamic instead of the reflection API when possible

Rémi Forax

May 22, 2019, 7:16:58 PM
to Clojure
Hi all,
Now that Clojure is compatible with Java 8, it can use invokedynamic where it makes sense, i.e. where the compiler has no information to generate the call directly in bytecode, instead of using the reflection API.

In fact, it's a little more complex than that:
- you cannot replace every call to the reflection API with invokedynamic, because there are restrictions on the methods you can call (you cannot call a method annotated with @CallerSensitive, for security reasons), and
- using the method handle API doesn't necessarily mean that the calls will be faster than with the reflection API if the JIT is not able to inline them.

So the idea of the patch is to always generate invokedynamic at compile time, but at runtime to use the method handle API only if there is a good chance that the call will be inlined, and to fall back to calling the Reflector API otherwise.
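To make the "method handle first, reflection as fallback" idea concrete, here is a minimal Java sketch. It is not the code from the patch: the class names, the `resolve` and `reflectiveCall` helpers, and the restriction to zero-argument methods are all assumptions made for illustration, and "can the method be linked through a public lookup" is used here as a simple stand-in for the patch's real criterion (whether the call is likely to be inlined).

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

final class IndyFallbackSketch {
    // Hypothetical stand-in for the reflective slow path (not clojure.lang.Reflector).
    static Object reflectiveCall(String name, Object receiver) throws Exception {
        Method m = receiver.getClass().getMethod(name);
        return m.invoke(receiver);
    }

    // Resolve a zero-argument method to a handle of type (Object)Object.
    static MethodHandle resolve(Class<?> receiverClass, String name) {
        MethodType generic = MethodType.methodType(Object.class, Object.class);
        try {
            Method m = receiverClass.getMethod(name);
            // Fast path: a direct method handle that the JIT can inline.
            return MethodHandles.publicLookup().unreflect(m).asType(generic);
        } catch (ReflectiveOperationException e) {
            // Not linkable directly: fall through to the reflective path.
        }
        try {
            MethodHandle slow = MethodHandles.lookup().findStatic(
                IndyFallbackSketch.class, "reflectiveCall",
                MethodType.methodType(Object.class, String.class, Object.class));
            // Pre-bind the method name so the handle also has type (Object)Object.
            return MethodHandles.insertArguments(slow, 0, name);
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    // Convenience wrapper that hides the checked Throwable of invoke().
    static Object call(Object receiver, String name) {
        try {
            return resolve(receiver.getClass(), name).invoke(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(call("abc", "length"));    // linked as a direct method handle
        System.out.println(call(new Hidden(), "hi")); // linked through the reflective fallback
    }
}

// Package-private on purpose: publicLookup() cannot link to this class,
// so calls on it take the reflective fallback.
final class Hidden {
    public String hi() { return "hi"; }
}
```

Both paths produce a handle with the same type, so the call site does not need to know which one it got. In a real compiler the handle would be installed as the target of an invokedynamic call site rather than invoked by hand.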

Obviously, I had not read how to contribute before writing the patch, so the patch is currently available on GitHub.

So now that I've read how to contribute, I think the first question to ask is:
does it make sense to allow Clojure to use invokedynamic?

regards,
Rémi

Alex Miller

May 22, 2019, 9:47:53 PM
to Clojure
Hi Rémi! Thanks for all your work on ASM and other JVM stuff over the years by the way.

We have of course been poking at indy off and on over the years and there have been a number of experiments with it. Ghadi Shayban has probably done the most work on that and he probably has more useful feedback than I would on the technical aspects of the code.

Based on where we are in the release cycle right now, I expect that we probably aren't ready to engage with this today and it might be a couple months before we are in the meat of the next release and open to it. Some quick thoughts and questions though...

1. As with anything we work on, we don't do stuff just because it's possible to do but because it satisfies some objective that seems worth doing. I assume the target benefit here is performance, but is that really it? Are there other benefits? Are there downsides to using indy vs reflection?

One big thing here is that generally we expect people to remove reflective calls in performance-sensitive code via type hints (and some people more thoroughly try to remove all use of reflection). Thus my expectation would be that the majority of users would experience no improvement or improvements only in parts of the code that are considered not important from a performance perspective. If we're adding code that increases ambiguity (via having multiple invocation paths which might have different failure modes) without a lot of benefit to users, then that prioritizes this pretty far down the list for me.

2. You mentioned the caller sensitive stuff - can you point at some resources about what those are? I guess those are calls checking security policy, etc? 

3. We did some work in the last release to avoid reflective calls to module-private methods, by modifying our reflective search to prefer interfaces or classes that are module accessible. Does module accessibility affect indy calls as well?

4. Clojure has very few compiler / RT flags (and it's most common to use none of them), and that's pretty intentional. Ideally we would not want a clojure.compiler.emit-indy flag but maybe you added this just to make the new work switchable. 

5. We are somewhat sensitive to trying to make AOT-compiled code work with later Clojure runtimes as much as possible (we don't guarantee binary compatibility but we have a very good track record here and try to never break that expectation). As such, we generally never change signatures in RT or Reflector or other important interfaces and make only additive changes. I think this patch is additive in that way, so that's good, but would want to carefully consider the new publicly accessible methods in Reflector (as we'd be supporting them forever), like the change in toAccessibleSuperMethod (which I'm not sure is needed?). 

There are other imo far more interesting possible uses for indy in places where we care a great deal about performance and those are places where I would place a lot higher priority. Ghadi, in particular, has investigated options for lazy-initing vars which could have a noticeable impact on startup performance while minimizing the effect on subsequent calls like other approaches that have been tried. Anyways, he can probably chime in on that more.

Alex

Ghadi Shayban

May 22, 2019, 11:43:46 PM
to Clojure
Hi Rémi! What a pleasant surprise to see your name here.  The whole community owes you a great deal of gratitude for your work.

I'm hoping to be at JVMLS this summer to talk about some indy/condy experiments. Alex summed up the philosophy well: we don't do stuff just because it's possible to do, but because it satisfies some objective that seems worth doing.  There are others, but improving peak performance (which is already quite good, thanks to Rich's design and HotSpot) and improving startup time are two interesting objectives.

One complaint often heard in the community is long startup time.  Clojure can be AOT compiled, which helps.  We could also cache bytecode in a way that is sensitive to Maven or git dependencies.  The rest of this post assumes that we have bytecode and not raw s-exprs.  We could probably improve startup by 30-50% with bytecode changes alone, but I don't think that is a big enough number to assuage complaints, and personally I don't think this is one of the "large challenges" that Clojure programmers face day-to-day. It is important for containerization and AWS Lambda to have strategies for fast startup; maybe that needs more assistance from either the JVM (Class Data Sharing, Application CDS, etc.) or execution substrates like AWS Lambda. (The #1 way to help with startup is to depend on fewer dependencies and load fewer classes, and that's a cultural thing that you can't solve in a compiler.) Some users are looking toward Graal native-image for fast startup, but there are so many restrictions with that tool that I don't even know what to say. A Lisp with eval is an open-world assumption, native-image is closed world, and it's Not Java.

That being said, if you look at Clojure bytecode, in general a lot of work happens in <clinit> that can be deferred until an indy instruction bootstraps. (The current strategy predates indy and certainly condy). For example, in https://gist.github.com/ghadishayban/72a87c8e12cd66b0f4e285c1754157f5 there are two constants (a Pattern and a Var) which get initialized in <clinit> and stuffed into final fields. During the load of Clojure namespaces, we load a lot of similar, larger datastructures that serve as Vars' metadata.

Another constant example is https://gist.github.com/ghadishayban/f7b4c2206836f29d7e9f8cd614cdd2d1
With condy, ldc's can take bootstrap methods, so we can get rid of the array construction and defer the <clinit> work, making the meat of the code degenerate to:

ldc `:id` (Keyword.class)
aload 1
ldc `:byte`
aload 2
ldc `:is`
aload 3
invokestatic IPersistentMap.of(Object,Object,Object,Object,Object,Object) (this API doesn't exist, but should!)
areturn

We can significantly reduce the bytecode size of regular Clojure functions, which is a proxy for better inlining+peak performance, and defer all the clinit setup, improving startup.
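Java source cannot emit condy directly, so the following is only an analogy for the "defer the <clinit> work" part: the holder-class idiom delays building a constant until first use, which is roughly the laziness a condy-backed ldc would give. The class name, the `compiled` counter and the regex are made up for illustration.

```java
import java.util.regex.Pattern;

final class DeferredInitSketch {
    // Counts how many patterns have actually been compiled.
    static int compiled;

    static Pattern compile(String regex) {
        compiled++;
        return Pattern.compile(regex);
    }

    // Lazy: DigitsHolder is only initialized by the first call to digits(),
    // so loading DeferredInitSketch itself does not pay for the Pattern.
    private static final class DigitsHolder {
        static final Pattern DIGITS = compile("\\d+");
    }

    static Pattern digits() {
        return DigitsHolder.DIGITS;
    }

    public static void main(String[] args) {
        System.out.println(digits().matcher("42").matches());
        System.out.println(digits().matcher("4x2").matches());
        System.out.println(compiled); // the pattern was compiled once, on first use
    }
}
```

An eager `static final Pattern` field in the outer class would instead run `compile` inside the outer class initializer, which is the situation described above for the current bytecode.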

(Aside: notice that three of the six components in the map are static, only the right hand side is dynamic. We could pre-fab an array factory indy that has the static parts filled in already.  I've tried this, and it didn't pay off except with larger maps.)

There are a couple of other invocation types (protocol invokes and keyword invokes) that open-code a PIC in the bytecode, with static fields for the caches.  There are various strategies like MutableCallSite with GWTs to handle this, but hey you wrote the cookbook on this subject.
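For readers who have not seen the MutableCallSite + guardWithTest (GWT) pattern, here is a textbook monomorphic inline cache in Java. It is a sketch, not the code from the patch or from the Clojure compiler: the class name, the `misses` counter and the restriction to zero-argument methods returning String are assumptions made to keep it short.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class InlineCacheSketch extends MutableCallSite {
    private static final MethodHandle FALLBACK;
    private static final MethodHandle CHECK_CLASS;
    static {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            FALLBACK = l.findVirtual(InlineCacheSketch.class, "fallback",
                MethodType.methodType(Object.class, Object.class));
            CHECK_CLASS = l.findStatic(InlineCacheSketch.class, "checkClass",
                MethodType.methodType(boolean.class, Class.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private final String methodName;
    int misses; // number of times the slow path ran

    InlineCacheSketch(String methodName) {
        super(MethodType.methodType(Object.class, Object.class));
        this.methodName = methodName;
        setTarget(FALLBACK.bindTo(this)); // start unlinked: every call misses
    }

    // Guard: is the receiver still of the class we linked against?
    static boolean checkClass(Class<?> expected, Object receiver) {
        return receiver.getClass() == expected;
    }

    // Slow path: look up the method for this receiver class, then relink the
    // call site as "if same class then direct call else slow path again".
    Object fallback(Object receiver) throws Throwable {
        misses++;
        Class<?> receiverClass = receiver.getClass();
        MethodHandle target = MethodHandles.publicLookup()
            .findVirtual(receiverClass, methodName, MethodType.methodType(String.class))
            .asType(type());
        MethodHandle guard = CHECK_CLASS.bindTo(receiverClass);
        setTarget(MethodHandles.guardWithTest(guard, target, FALLBACK.bindTo(this)));
        return target.invoke(receiver);
    }

    // Convenience wrapper standing in for the invokedynamic instruction.
    Object call(Object receiver) {
        try {
            return dynamicInvoker().invoke(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        InlineCacheSketch site = new InlineCacheSketch("toString");
        System.out.println(site.call("a"));
        System.out.println(site.call("b"));
        System.out.println(site.call(42));
        System.out.println(site.misses);
    }
}
```

The cache state lives in the call site's target instead of in static fields of the generated class, which is the difference from the open-coded PIC described above. A polymorphic version would chain several guards before giving up.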

In Clojure 1.8, the compiler learned "direct linking", which made calls to Clojure functions call a static method.  Previously:

getstatic clojure.lang.Var
invokevirtual clojure.lang.Var.getRawRoot()
checkcast clojure.lang.IFn
push arguments...
invokeinterface clojure.lang.IFn.invoke(.....)

With direct linking:
push arguments
invokestatic someFunction.invokeStatic(....)

Direct linking is not the default (although clojure.core itself comes direct linked), and it trades away dynamicity for performance. (You can't reload things that are direct linked.)  Since a Var is essentially a box around a volatile field, there are other ways of getting the performance of the invokestatic without losing the dynamicity.
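The trade-off can be shown with a small Java analogy. The `Var` class below is a simplified stand-in for clojure.lang.Var (a box around a volatile root), and `incStatic` plays the role of a function's invokeStatic; none of this is the actual compiler output.

```java
import java.util.function.IntUnaryOperator;

final class DirectLinkSketch {
    // Simplified stand-in for clojure.lang.Var: a box around a volatile root.
    static final class Var {
        private volatile IntUnaryOperator root;
        Var(IntUnaryOperator root) { this.root = root; }
        IntUnaryOperator getRawRoot() { return root; }
        void bindRoot(IntUnaryOperator f) { root = f; }
    }

    // Plays the role of someFunction.invokeStatic(...).
    static int incStatic(int x) { return x + 1; }

    static final Var INC = new Var(DirectLinkSketch::incStatic);

    // Indirect call: volatile read + interface call, but sees redefinitions.
    static int callThroughVar(int x) { return INC.getRawRoot().applyAsInt(x); }

    // Direct-linked call: a plain static call that ignores redefinitions.
    static int callDirect(int x) { return incStatic(x); }

    public static void main(String[] args) {
        System.out.println(callThroughVar(1) + " " + callDirect(1));
        INC.bindRoot(x -> x + 10); // "reload" the function
        System.out.println(callThroughVar(1) + " " + callDirect(1));
    }
}
```

After the rebind, only the call that goes through the Var observes the new definition, which is the dynamicity that direct linking gives up.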

Reflection is another use-case, but as Alex mentioned, the general suggestion to users is: don't write reflective code. One of the few compiler flags that exist is:
(set! *warn-on-reflection* true)

Anyways, thanks for starting this discussion

Remi Forax

May 23, 2019, 10:25:28 AM
to clo...@googlegroups.com
From: "Alex Miller" <al...@puredanger.com>
To: "Clojure" <clo...@googlegroups.com>
Sent: Thursday, May 23, 2019 03:47:53
Subject: Re: Use invokedynamic instead of the reflection API when possible
> Hi Rémi! Thanks for all your work on ASM and other JVM stuff over the years by the way.
> We have of course been poking at indy off and on over the years and there have been a number of experiments with it. Ghadi Shayban has probably done the most work on that and he probably has more useful feedback than I would on the technical aspects of the code.
>
> Based on where we are in the release cycle right now, I expect that we probably aren't ready to engage with this today and it might be a couple months before we are in the meat of the next release and open to it. Some quick thoughts and questions though...
>
> 1. As with anything we work on, we don't do stuff just because it's possible to do but because it satisfies some objective that seems worth doing. I assume the target benefit here is performance, but is that really it? Are there other benefits? Are there downsides to using indy vs reflection?
>
> One big thing here is that generally we expect people to remove reflective calls in performance-sensitive code via type hints (and some people more thoroughly try to remove all use of reflection). Thus my expectation would be that the majority of users would experience no improvement or improvements only in parts of the code that are considered not important from a performance perspective. If we're adding code that increases ambiguity (via having multiple invocation paths which might have different failure modes) without a lot of benefit to users, then that prioritizes this pretty far down the list for me.

Yes, using indy instead of the reflection API gives you better performance because
- you don't have to wrap the arguments in an array
- you generate specialized code for each call site, so you usually get inlining.

The other benefits are cleaner bytecode and fewer public API entry points: you have only one public API entry point per bootstrap method rather than one per method call, so you can change the logic and still be binary compatible.

The "beauty" of this patch is that it doesn't add another failure mode: if an error occurs at runtime, instead of propagating the error and creating another way to get errors, it falls back to calling the reflection API, so from a user's point of view you only ever get an error from the reflection API. And given that indy is not visible in the stack trace, you get exactly the same error with exactly the same stack trace.


> 2. You mentioned the caller sensitive stuff - can you point at some resources about what those are? I guess those are calls checking security policy, etc?

A caller-sensitive method is a method that calls Reflection.getCallerClass() to get the caller class, usually to do a security check, but sometimes because the semantics of the method need it (getting the caller class for a Logger, for example).
The MethodHandle API cannot call those methods because internally it works by inserting a kind of trampoline between the invokedynamic and the called method, so the called method sees the trampoline code instead of the real caller when it asks for the caller class. It doesn't help that the trampoline code is in java.lang.invoke, which is a privileged package.
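As a small illustration of what "caller sensitive" means, the sketch below uses the public StackWalker API (Java 9+) rather than the internal Reflection.getCallerClass(). The class and method names are invented, and this only shows a result that depends on the calling class; it does not reproduce the JDK's @CallerSensitive machinery or the trampoline problem.

```java
final class CallerSensitiveSketch {
    // Behaves like a caller-sensitive method: the result depends on who calls it.
    static Class<?> whoCalledMe() {
        return StackWalker
            .getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE)
            .getCallerClass();
    }

    static final class Alice {
        static Class<?> ask() { return whoCalledMe(); }
    }

    static final class Bob {
        static Class<?> ask() { return whoCalledMe(); }
    }

    public static void main(String[] args) {
        System.out.println(Alice.ask().getSimpleName());
        System.out.println(Bob.ask().getSimpleName());
    }
}
```

Any extra frame inserted between the caller and such a method changes what it observes, which is why an intermediate trampoline is a problem for these methods.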


> 3. We did some work in the last release to avoid reflective calls to module-private methods, by modifying our reflective search to prefer interfaces or classes that are module accessible. Does module accessibility affect indy calls as well?

Modules are not an issue with indy. When an indy invocation calls the bootstrap method, it passes a Lookup object (MethodHandles.Lookup). This Lookup object represents the caller class, so you have the same rights as if you were inside the caller class.
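The point about the Lookup carrying the caller's rights can be seen with plain method handles. In this sketch (class and method names are made up), a Lookup created inside `Secretive` can link to its private method, while the public lookup cannot. A bootstrap method receives a Lookup of the first kind, for the class containing the invokedynamic instruction.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class LookupRightsSketch {
    static final class Secretive {
        private static String secret() { return "hidden"; }

        // A Lookup created here carries the access rights of Secretive.
        static MethodHandles.Lookup ownLookup() { return MethodHandles.lookup(); }
    }

    // Try to link to Secretive.secret() with the rights of the given Lookup.
    static String callSecret(MethodHandles.Lookup lookup) {
        try {
            MethodHandle mh = lookup.findStatic(
                Secretive.class, "secret", MethodType.methodType(String.class));
            return (String) mh.invokeExact();
        } catch (ReflectiveOperationException e) {
            return "access denied"; // the lookup has no right to link to this method
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callSecret(Secretive.ownLookup()));
        System.out.println(callSecret(MethodHandles.publicLookup()));
    }
}
```

Access is checked once, when the handle is created from the Lookup, and not on each invocation.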


> 4. Clojure has very few compiler / RT flags (and it's most common to use none of them), and that's pretty intentional. Ideally we would not want a clojure.compiler.emit-indy flag but maybe you added this just to make the new work switchable.

Yes, so I can compare with and without indy. Also, indy is not supported by the GraalVM native-image tool.


> 5. We are somewhat sensitive to trying to make AOT-compiled code work with later Clojure runtimes as much as possible (we don't guarantee binary compatibility but we have a very good track record here and try to never break that expectation). As such, we generally never change signatures in RT or Reflector or other important interfaces and make only additive changes. I think this patch is additive in that way, so that's good, but would want to carefully consider the new publicly accessible methods in Reflector (as we'd be supporting them forever), like the change in toAccessibleSuperMethod (which I'm not sure is needed?).

Oops, my blunder, it's not needed.


> There are other imo far more interesting possible uses for indy in places where we care a great deal about performance and those are places where I would place a lot higher priority. Ghadi, in particular, has investigated options for lazy-initing vars which could have a noticeable impact on startup performance while minimizing the effect on subsequent calls like other approaches that have been tried. Anyways, he can probably chime in on that more.

Rémi


--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/clojure/75e34ba0-995d-4e70-9fca-b031d844ec1a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Remi Forax

May 23, 2019, 10:56:35 AM
to clojure
From: "Ghadi Shayban" <gsha...@gmail.com>
To: "clojure" <clo...@googlegroups.com>
Sent: Thursday, May 23, 2019 05:43:46
Subject: Re: Use invokedynamic instead of the reflection API when possible
> Hi Rémi! What a pleasant surprise to see your name here.  The whole community owes you a great deal of gratitude for your work.
>
> I'm hoping to be at JVMLS this summer to talk about some indy/condy experiments. Alex summed up the philosophy well: we don't do stuff just because it's possible to do, but because it satisfies some objective that seems worth doing.  There are others, but improving peak performance (which is already quite good, thanks to Rich's design and HotSpot) and improving startup time are two interesting objectives.

I've submitted a talk to JVMLS about adding support for lazy static final in Java, which may interest you :)



> One complaint often heard in the community is long startup time.  Clojure can be AOT compiled, which helps.  We could also cache bytecode in a way that is sensitive to Maven or git dependencies.  The rest of this post assumes that we have bytecode and not raw s-exprs.  We could probably improve startup by 30-50% with bytecode changes alone, but I don't think that is a big enough number to assuage complaints, and personally I don't think this is one of the "large challenges" that Clojure programmers face day-to-day. It is important for containerization and AWS Lambda to have strategies for fast startup; maybe that needs more assistance from either the JVM (Class Data Sharing, Application CDS, etc.) or execution substrates like AWS Lambda. (The #1 way to help with startup is to depend on fewer dependencies and load fewer classes, and that's a cultural thing that you can't solve in a compiler.) Some users are looking toward Graal native-image for fast startup, but there are so many restrictions with that tool that I don't even know what to say. A Lisp with eval is an open-world assumption, native-image is closed world, and it's Not Java.

It's not Java, but it's moving in the direction of being more Java.



> That being said, if you look at Clojure bytecode, in general a lot of work happens in <clinit> that can be deferred until an indy instruction bootstraps. (The current strategy predates indy and certainly condy). For example, in https://gist.github.com/ghadishayban/72a87c8e12cd66b0f4e285c1754157f5 there are two constants (a Pattern and a Var) which get initialized in <clinit> and stuffed into final fields. During the load of Clojure namespaces, we load a lot of similar, larger datastructures that serve as Vars' metadata.

Here the Var looks more like an indy than a condy, because you want Var + getRawRoot + checkcast, while the Pattern is more of a condy.



> Another constant example is https://gist.github.com/ghadishayban/f7b4c2206836f29d7e9f8cd614cdd2d1
> With condy, ldc's can take bootstrap methods, so we can get rid of the array construction and defer the <clinit> work, making the meat of the code degenerate to:
>
> ldc `:id` (Keyword.class)
> aload 1
> ldc `:byte`
> aload 2
> ldc `:is`
> aload 3
> invokestatic IPersistentMap.of(Object,Object,Object,Object,Object,Object) (this API doesn't exist, but should!)
> areturn

Yes, the keywords are condy (or an indy + a constant method handle). For the Map, you can have an indy that works like the String concatenation introduced in Java 9: an indy plus a String recipe explaining how to create the map, something like ":id,:byte,:is" if all keys are constant, or ":is,?,:is" if the second key is not constant.
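A rough Java sketch of the recipe idea, under several assumptions: keys are plain Strings instead of Keywords, the result is a java.util.LinkedHashMap instead of a PersistentMap, only the all-keys-constant form of the recipe is handled (no "?"), and the bootstrap method is called by hand because Java source cannot emit invokedynamic. All names are illustrative.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.LinkedHashMap;
import java.util.Map;

final class MapRecipeSketch {
    // Bootstrap-style method: the recipe lists the constant keys; the call
    // site's dynamic arguments supply the values, one per key.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name,
                              MethodType type, String recipe)
            throws ReflectiveOperationException {
        String[] keys = recipe.split(",");
        MethodHandle build = MethodHandles.lookup().findStatic(
            MapRecipeSketch.class, "build",
            MethodType.methodType(Map.class, String[].class, Object[].class));
        MethodHandle target = MethodHandles
            .insertArguments(build, 0, (Object) keys) // bake the keys into the handle
            .asCollector(Object[].class, keys.length)  // gather the values into an array
            .asType(type);
        return new ConstantCallSite(target);
    }

    static Map<String, Object> build(String[] keys, Object[] values) {
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], values[i]);
        }
        return map;
    }

    // Stands in for an invokedynamic instruction with recipe ":id,:byte,:is".
    static Object demo(Object id, Object bytes, Object is) {
        try {
            MethodType type = MethodType.methodType(
                Map.class, Object.class, Object.class, Object.class);
            CallSite site = bootstrap(MethodHandles.lookup(), "map", type, ":id,:byte,:is");
            return site.dynamicInvoker().invoke(id, bytes, is);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(1, "b", 3));
    }
}
```

With a real invokedynamic instruction the bootstrap runs once per call site, so the recipe parsing is paid only at link time and the call site itself pushes just the three dynamic values.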



> We can significantly reduce the bytecode size of regular Clojure functions, which is a proxy for better inlining+peak performance, and defer all the clinit setup, improving startup.

yes.



> (Aside: notice that three of the six components in the map are static, only the right hand side is dynamic. We could pre-fab an array factory indy that has the static parts filled in already.  I've tried this, and it didn't pay off except with larger maps.)

The idea with indy is to transform the recipe String into a series of method calls that create the PersistentMap, and let the JIT propagate the constants.



> There are a couple of other invocation types (protocol invokes and keyword invokes) that open-code a PIC in the bytecode, with static fields for the caches.  There are various strategies like MutableCallSite with GWTs to handle this, but hey you wrote the cookbook on this subject.

Take a look at the patch; it has the code for the inline cache.



> In Clojure 1.8, the compiler learned "direct linking", which made calls to Clojure functions call a static method.  Previously:
>
> getstatic clojure.lang.Var
> invokevirtual clojure.lang.Var.getRawRoot()
> checkcast clojure.lang.IFn
> push arguments...
> invokeinterface clojure.lang.IFn.invoke(.....)
>
> With direct linking:
> push arguments
> invokestatic someFunction.invokeStatic(....)
>
> Direct linking is not the default (although clojure.core itself comes direct linked), and it trades away dynamicity for performance. (You can't reload things that are direct linked.)  Since a Var is essentially a box around a volatile field, there are other ways of getting the performance of the invokestatic without losing the dynamicity.




> Reflection is another use-case, but as Alex mentioned, the general suggestion to users is: don't write reflective code. One of the few compiler flags that exist is:
> (set! *warn-on-reflection* true)
>
> Anyways, thanks for starting this discussion

I think a good start is to take a look at Var + Keyword. I will see what I can do :)

Rémi


Remi Forax

May 23, 2019, 11:20:17 AM
to clojure