Clojure 1.7 and invokeStatic


Robin Heggelund Hansen

Aug 6, 2014, 6:54:32 AM
to clo...@googlegroups.com
I just read this blog post about Oxcart (http://arrdem.com/2014/08/05/of_oxen,_carts_and_ordering/?utm_source=dlvr.it&utm_medium=twitter). It mentions that Rich is re-introducing invokeStatic to achieve a possible 10% performance increase for Clojure 1.7.

I couldn't find any information about this. Does anyone know where I can find out more?

Jozef Wagner

Aug 6, 2014, 7:05:40 AM
to clo...@googlegroups.com

Robin Heggelund Hansen

Aug 6, 2014, 12:22:36 PM
to clo...@googlegroups.com
I don't understand the compiler that well. Could you provide a short description of what is being done?

Reid McKenzie

Aug 6, 2014, 2:41:56 PM
to clo...@googlegroups.com
Functions are objects implementing the IFn interface. This interface defines a set of methods named "invoke" which return an object given up to 20 positional arguments, plus a variadic overload for more. Once the compiler is done emitting any given function, an IFn object has been created. Def is a general operation which creates a value and binds it to a var named by the current (*ns*, symbol) pair. So for defn, an instance of this IFn object is what the bound var points to. As an example:

user=> (defn foo [x y] (+ x y 1))
; macroexpanded (def foo (fn* ([x y] (+ x y 1))))
#'user/foo

If you inspect the user namespace, you will find that the symbol foo now maps to the var #'user/foo. Subsequent textual occurrences of the symbol foo in this namespace will at compile time be mapped to the var #'user/foo, and the emitted code will take the var #'user/foo and dereference it to get an IFn object implementing the foo function which can be invoked.
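The symbol, var and IFn layers described above can be poked at from a REPL. This is only a minimal sketch: the name foo-var is hypothetical and introduced purely for illustration, and the explicit .invoke call is there to show the interface method, not as something you would normally write.

```clojure
(defn foo [x y] (+ x y 1))

;; The symbol foo resolves to a var, not to the function object itself.
(def foo-var (resolve 'foo))   ; hypothetical helper name
(var? foo-var)                 ;=> true

;; Dereferencing the var yields the object that implements IFn.
(instance? clojure.lang.IFn @foo-var)   ;=> true

;; A call through the symbol is a var dereference followed by an invoke.
(.invoke ^clojure.lang.IFn @foo-var 1 2)   ;=> 4
(= (foo 1 2) (@foo-var 1 2))               ;=> true
```

The last line is the point: `(foo 1 2)` and "dereference the var, then invoke the result" are the same operation.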

Because dereferencing a var has some overhead, and some code, such as clojure.core/*, is not expected to be redefined by users, the ^:static annotation in Clojure 1.3 directed the compiler to emit `public static invokeStatic` methods in addition to the normal `public invoke` methods. This allowed functions on potentially hot paths to invoke each other statically rather than through var indirection. This static linking of function calls is how Oxcart achieves the reported 24% speedup, and the linked direct branch is Rich implementing invokeStatic again, presumably for Clojure 1.7.

This static linking feature was introduced in Clojure 1.3 and removed in Clojure 1.4 because, as I mentioned in my linked blog post, static linking makes live development and code redefinition harder or impossible. Since 1.7 is projected to introduce compilation profiles, I expect this problem to be mitigated by separate builds or profiles of Clojure that enable or disable static linking in a user-visible manner. For an application deployment build you might choose [org.clojure/clojure "1.7.0-static"], which can use ^:static annotations for a speedup, while for development you might use [org.clojure/clojure "1.7.0"], which may ignore ^:static in exchange for a better REPL experience, as Clojure 1.5 and 1.6 do.
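The redefinition trade-off can be imitated by hand, without any compiler support. The sketch below is not how invokeStatic is implemented. It only contrasts a call that goes through the var with a call that captured the function object once, which is roughly the situation a statically linked caller is in. The names bar, via-var and direct are hypothetical, and the results assume the default behaviour where calls go through vars.

```clojure
(defn bar [x] (inc x))

;; Goes through #'bar on every call, so it sees later redefinitions.
(def via-var (fn [x] (bar x)))

;; Captures the current function object once; later redefinitions are invisible.
(def direct (let [f bar] (fn [x] (f x))))

;; Redefine bar, as you would during live development at the REPL.
(defn bar [x] (dec x))

(via-var 1)   ;=> 0, the new definition
(direct 1)    ;=> 2, still the old definition
```

The second result is the development-time cost: a caller bound directly to the old code keeps running the old code until it is itself recompiled.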

Hope this helps,
Reid

Robin Heggelund Hansen

Aug 6, 2014, 2:49:31 PM
to clo...@googlegroups.com
Perfect explanation, thanks!

Mike Thvedt

Aug 6, 2014, 2:58:48 PM
to clo...@googlegroups.com
It's worth pointing out that var indirection is already cheap in Java--it is generally dominated by IO, memory access, object construction, dynamic dispatch... The JIT compiler will inline any var access if the var doesn't visibly change, and only needs to check one word of memory per var each time the JIT compiled function is invoked. I've replaced vars with Java methods and found a 0% speedup.

Mike Thvedt

Aug 6, 2014, 2:59:47 PM
to clo...@googlegroups.com
Last sentence should be: "I've replaced vars with Java methods in some high-performance cases and found a 0% speedup."

Ghadi Shayban

Aug 6, 2014, 3:30:06 PM
to clo...@googlegroups.com
Var indirection is not super cheap: a var holds its value in a volatile field, and reading that field acts as a memory fence. I have been working on Clojure with invokedynamic, and I have a demonstrable improvement on microbenchmarks. Obviously your application will have IO and myriad other costs, but I just want to echo that it isn't a trivial cost.

Mike Thvedt

Aug 6, 2014, 3:42:36 PM
to clo...@googlegroups.com
I don't want to question your microbenchmarks, but I'm not sure you have the correct interpretation.

Read memory fences have little to no cost. In particular, read memory fences are a no-op (literally) on x86 unless the cache line is invalidated.


Gary Trakhman

Aug 6, 2014, 3:44:02 PM
to clo...@googlegroups.com
I think that also implies that the JVM can't inline across the fences, so there's another cost.


Alex Miller

Aug 6, 2014, 4:07:15 PM
to clo...@googlegroups.com
I've compared HotSpot inlining for current vars (through a volatile reference), lazy vars à la the fastload branch, and static calls à la the direct branch, and inlining is indeed affected once things get hot. fastload avoids early loads but is ultimately slower. direct is about twice as fast, due (I think) to tighter inlining.
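For anyone who wants to try a comparison like this on their own machine, here is a crude timing sketch. It is not the benchmark used for the numbers above. The names add1, time-ns, call-via-var and call-captured are hypothetical, and a hand-rolled loop like this is easily distorted by JIT warm-up and dead-code elimination, so treat any result as a rough hint at best.

```clojure
(defn add1 [x] (inc x))

(defn time-ns
  "Calls the zero-argument function f n times and returns the elapsed
  nanoseconds. Crude: no statistics, no GC control."
  [f n]
  (let [start (System/nanoTime)]
    (dotimes [_ n] (f))
    (- (System/nanoTime) start)))

;; Calls add1 through its var on every invocation.
(def call-via-var (fn [] (add1 1)))

;; Calls a function object captured once, bypassing the var.
(def call-captured (let [f add1] (fn [] (f 1))))

;; Warm up both paths so the JIT has a chance to compile them.
(time-ns call-via-var 1000000)
(time-ns call-captured 1000000)

(println "via var: " (time-ns call-via-var 10000000) "ns")
(println "captured:" (time-ns call-captured 10000000) "ns")
```

A proper benchmarking library such as Criterium would give more trustworthy numbers than this loop.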

Mike Thvedt

Aug 6, 2014, 8:13:01 PM
to clo...@googlegroups.com
I didn't want to start a flame war, I just didn't want people being misled into thinking static vars are a big perf improvement for most code. It's better to use ordinary dynamic vars unless you're sure it will be beneficial for some tight loop somewhere. The usual case is that the JIT inlines the var access and inserts a single safety check, and many kinds of tight loops see no benefit (even when highly CPU bound).

Regarding memory barriers, I believe the JIT does the same whether the variable is volatile or not, because it can't write to memory if the inlining is invalidated. But I could be wrong. Reducing code size can also help inlining for some "inlining shapes".


Alex Miller

Aug 6, 2014, 9:57:35 PM
to clo...@googlegroups.com
No flame war here, just healthy discussion. :)  I don't mean to over-state the benefit; I have not yet found a program where var invocation was actually the bottleneck and in general a lot of these differences are noise compared to the real meat of the code.