GSoC Report: CinC, port of the clojure compiler in clojure

1381 views
Skip to first unread message

Nicola Mometto

unread,
Sep 22, 2013, 4:59:30 PM9/22/13
to clojure

For the past 3 months I've been working as part of my GSoC project,
on a port of the clojure compiler and analyzer to clojure.

Tomorrow the GSoC will end, so this is a report of what I've accomplished
with this project thus far; note that I'm not going to stop working on
CinC now that the GSoC is over.

First, a link to the project if anybody is interested: https://github.com/Bronsa/CinC

What's there?
The project is made of 2 major components: the analyzer and the emitter.

The analyzer is based heavly on the clojurescript analyzer, I've used
(successfully) the :children-keys approach to write a generic walker
over the AST and to implement multiple passes in order to
annotate the AST with all the information needed in order to compile to
JVM bytecode.
It should be noted that this analyzer provides all the information
collected by the clojure compiler (info on locals-clearing, loop-locals
invalidation etc) in a much more accessible way, exposing it all as
fields in a hash-map.

The compiler takes this AST and constructs an AST representing the class
to be emitted, expressing the byte-code as a data-structure too, and it
subsequently interprets this AST evaluating the expression.

The `doc` folder contains some further documentation on how the
analyzer/compiler works, it's not much yet but more is to come.

What's not there yet?
While `cinc.compiler.jvm.bytecode/eval` is capable of evaluating all of
clojure special forms, primitive support mostly not in place, and should
be expected to be broken.
This means that some expression that need primitive support to work, for
example `defrecord` will not work.

What's going to happen?
CinC is a project I wanted to work on for quite some time, and I'm not
going to abandon it now that the GSoC is over, I have things I want to
experiment with CinC (and I hope I'm not the only one), my current to-do
list is:

* get primitive support fully working (including invokePrim)
* refactor :tag/:cast/:box handling
* standardize the AST format between CinC, clojurescript and
jvm.tools.analyzer, David Nolen and Ambrose B.S. are obviously who I'm
most looking forward to talking to about this, but any opinion will be
be greatly appreciated.
* have 2 compile target: one "normal" target that won't do any type
specialization and emit a dynamic bytecode (mostly for repl
experimentations), one "optimized" target that will do aggressive
tag inference and compile to more static code trying to avoid most of
the runtime reflection.
* experiment dynamically emitting invokePrim interfaces for primitive
types other than long or doubles


Ideas/feedbacks/contributions are greatly appreciated!

Thanks,
Nicola

Ambrose Bonnaire-Sergeant

unread,
Sep 22, 2013, 10:26:27 PM9/22/13
to clojure
Excellent work! To hear CinC is a long term project is music to my ears!

Ambrose


--
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Colin Fleming

unread,
Sep 23, 2013, 4:03:00 AM9/23/13
to clo...@googlegroups.com
Congratulations Nicola, and thanks for all the hard work - this is a great project!

Mikera

unread,
Oct 1, 2013, 8:37:43 PM10/1/13
to clo...@googlegroups.com
Hi Nicola,

Congratulations - great work!

My main idea/feedback is that I'd love to see the "optimised" mode become able to emit code that matches Java / Scala performance in all cases. This would be a huge boost for people (like me) who do a lot of numerical / realtime work and would love to do this in pure Clojure rather than dropping down to Java all the time. 

Mostly, the difference in performance comes from using the correct static types, avoiding boxing, and avoiding all the dynamic checks/casting that Clojure currently does.

1) To make that work, we need to make more use of static typing / type inference to emit the right optimised bytecode for each situation. Sounds like that is in your plan already.

2) This in turn means we are going to have to "recompile" if we want to retain any sort of dynamic behaviour. The current system of independently mutable vars won't work (for obvious reasons: changing a var could easily violate the previous type inference assumptions) 

3) Doing recompilation efficiently and incrementally implies keeping a dependency graph of compiled code., as well as retaining the source code in memory.

4) For sanity and correctness reasons, I think the best approach to this is having an "immutable environment", i.e. any action that redefines a var (e.g. "def") produces a new modified immutable environment, including updated dependencies. Fortunately persistent data structures make this cheap. 

5) Immutable environments also offer a number of other advantages over the current mutable namespaces, e.g. better ability to handle circular references, the ability to store snapshots of past environments and the ability to fork environments to create new independent application instances.

That's a rough sketch of what I think would work for a truly great Clojure 2.0 "optimised mode". If we do all this, then we will be in a very good place:
a) We'll have the compiled performance equal to Java (or maybe even better, if we can exploit some JVM features that Java doesn't)
b) We'll retain all the dynamic / interactive coding capabilities of Lisp, which is part of what makes Clojure great now.
c) Immutable environments can evolve to replace the current mutable namespace system - which I think would be a big win in the long term, and is the right way to go philosophically if we take the "programming with values" stuff seriously.

If you are interested and able to make even part of this happen in CinC, happy to discuss and contribute where I can!

white...@polyc0l0r.net

unread,
Oct 7, 2013, 8:15:21 AM10/7/13
to clo...@googlegroups.com
Hello Nicola et al.,

I have only learned about  your CinC effort during last week with the news about core.typed.
I have ported the analyzer of the self-hosted cljs compiler to ClojureC:  https://github.com/schani/clojurec/pull/33 and can explore it from the metacircular scheme REPL already. It seems to work basically, but dynamic runtime primitives like namespaces, dynamic protocols, records, ... needed for the compiler are lacking. 

I want to proceed to a native REPL and statically compiled primitives in C which then can be used to interactively build and explore runtime design on this native REPL (linked C-libraries can be called natively and will be garbage collected as well if done properly). I *don't* see this as competition for any established runtime and I still would define ClojureC as the host and the environment as "hosted" as I really appreciate Clojure's hosted approach of decoupling the runtime from the language. I haven't sorted that out yet, though, as it means deciding how to link against libraries of the host (e.g. glib).

The motivation for the project is/was to write a Clojure runtime in Clojure itself with the help of ClojureC (mostly to learn about compilers and runtimes as I only develop with Clojure for 10 months now). I can picture it as a simple barebone which could be used to hook in https://github.com/halgari/clojure-metal/ or http://blog.danieljanus.pl/blog/2013/05/26/lithium-revisited/ into an already working REPL and explore (mixed) runtime building from there and lift JIT compiling and different runtime optimizations independently from this minimal Clojure implementation while adding the necessary static support for performance. The immutability aspects of Clojure also potentially allow more aggresive JIT optimizations through memoization, but this is only my speculation.

Up until now I just wanted to hack the compiler together to get some minimal Clojure REPL going before asking for help on this list (working with statically compiled C sucks feedback-wise). My idea was to use ClojureC's runtime closures like SICP does instead of an explicit bytecode to emit. I pictured that as Clojure in Clojure defined only in terms of the lambda-calculus (closures). But now I have doubts. I thought that a bytecode compiler could be done additionally/independtly and through rebinding of vars in the ClojureC runtime additional JIT compilers like the LLVM one of halgari could use their own bytecode (even different compilers/bytecodes during one runtime) to optimize these functions transparently. It would implement Clojure in Clojure completely without bytecode, but then this possibly defeats performance and does not make the barebone suitable for embedded applications (as memory consumption might bloat). It will be fairly slow anyway, but this is atm. no problem for me.
Now I am thinking about porting CinC instead and defining a native bytecode with you inspired by CinC needs and other bytecodes. I want to do that as a long-term project, the runtime will have little use in the near future as NodeJs and Lua are probably better hosts even for embedded applications.

What do you think? 

Christian

Nicola Mometto

unread,
Oct 7, 2013, 7:53:59 PM10/7/13
to clo...@googlegroups.com
Hi Mike,
First of all, thanks for the report and sorry for the late reply.

Your "proposal" looks like something that should definitely be tried out
in the long-term, however it implies adding a notable amount of
complexity, moving away from the simple "read top level form -> analyze
-> compile and then forget about it" model of the current compiler.

My "short"-term idea is to have 3 different compile targets:
- a normal target, fields are untyped and everything is dynamic
- a locally typed target, local fields are typed and we use some bits of
local-type inference, this should avoid a good amount of casting and
still preserve the globally dynamic capabilities of clojure
- an optimizing target, tags every top-level with its inferred tag,
losing some dynamicity but decreases the runtime-reflection

This way one can use the unoptimizing evaluation when e.g. developing
at the repl and then compile the code using the optimizing mode for
depoloyment; this is not unlike how clojurescript handles it with its
:simple/:whitespace/:advanced optimization levels.

Regarding the immutable environment idea -- honestly, that's (currently)
out of scope for CinC as it implies extensive changes on all the clojure runtime;
CinC is currently (only) a self-hosting analyzer/compiler with the
analyzer aiming for cross-runtime flexibility.
A complete clojure in clojure re-implementation implies a porting of the
whole runtime and it's not something I plan to do in the near future,
I've developed and will continue to develop tools.reader,
tools.analyzer, tools.analyzer.jvm and tools.emitter.jvm hoping that
they will be part of a clojure-in-clojure project some time but that will
require a lot of time and I'm not going to focus on that.

Anyway you're (and everybody) is more than welcome to contribute to CinC
and I'm open to discuss its development and the directions it should
take.

Nicola
> --

Mikhail Kryshen

unread,
Oct 8, 2013, 6:00:52 PM10/8/13
to clo...@googlegroups.com
Mikera <mike.r.an...@gmail.com> writes:

> 2) This in turn means we are going to have to "recompile" if we want to
> retain any sort of dynamic behaviour. The current system of independently
> mutable vars won't work (for obvious reasons: changing a var could easily
> violate the previous type inference assumptions)

The HotSpot JIT compiler already performs adaptive optimization based on
runtime profile and dynamicaly recompiles bytecode when necessary. The
problem is the way Clojure calls functions through vars is not
transparent to the compiler's optimizer. Invokedynamic can help here,
see
http://blog.headius.com/2011/10/why-clojure-doesnt-need-invokedynamic.html

--
Mikhail

Nicola Mometto

unread,
Oct 8, 2013, 6:09:02 PM10/8/13
to clo...@googlegroups.com

I don't think that's what Mike was talking about.
Say we have (defn x ^long [] 1), clojure will use the IFn$L and emit an
"public long invokePrim()" method.

When we do (defn y [] (let [a (x)] a) the compiler will call .invokePrim
instead of invoke.

If we redefine (defn x [] "") then y won't work because the new version
of x doesn't implement IFn$L, we'd need to recompile y too.

Mikera

unread,
Oct 8, 2013, 9:59:49 PM10/8/13
to clo...@googlegroups.com
Yes Nicola, that's exactly the problem I was talking about.

As far as I'm aware, the only way to make this maximally efficient at runtime on the JVM (and probably any other non-trivial runtime with typed method dispatch?) is to use some kind of recompilation strategy.

invokedynamic is nice but doesn't get you there either: it still has a non-trivial runtime overhead.

Scala has an incremental recompilation strategy using dependencies, though I think their compilation units are a bit larger than individual Clojure forms. See for more information:

Mikhail Kryshen

unread,
Oct 9, 2013, 8:29:46 AM10/9/13
to clo...@googlegroups.com
Nicola Mometto <brob...@gmail.com> writes:

> I don't think that's what Mike was talking about.
> Say we have (defn x ^long [] 1), clojure will use the IFn$L and emit an
> "public long invokePrim()" method.
>
> When we do (defn y [] (let [a (x)] a) the compiler will call .invokePrim
> instead of invoke.
>
> If we redefine (defn x [] "") then y won't work because the new version
> of x doesn't implement IFn$L, we'd need to recompile y too.

Actually I am talking about the same problem. For virtual calls the JIT
compiler can determine that only one implementation of the method is
being used, inline the call and perform object explosion, effectively
removing boxing of primitives. When new implementation becomes
available it dynamically recompiles affected bytecode. I don't know if
the current implementation does this for invokedynamic calls, but there
is a potential. It just does not feel right to invent a whole new
complex optimization layer on the Clojure side until we fully use the
optimization capabilities of the JVM.

--
Mikhail

Timothy Baldridge

unread,
Oct 9, 2013, 9:39:33 AM10/9/13
to clo...@googlegroups.com
One of the problems with these sort of discussions is the total lack of hard facts. Words such as "could do this" or "doesn't do that" when referring to HotSpot are very rarely backed up by actual benchmarks, or ASM printouts. I would imagine that any serious attempt at an optimization layer will be prefixed by lots of research into where the current compiler fails, and what the proper enhancements should be. 

Timothy
--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

Mikera

unread,
Oct 9, 2013, 10:20:08 PM10/9/13
to clo...@googlegroups.com
The JVM is certainly amazing at optimising virtual calls (to be expected since that's the usual calling convention for majority of Java OOP code....)

Clojure currently doesn't use virtual calls for functions however: that's part of the problem here. It pretty much always emits invokeinterface calls (via IFn and IFn's primitive variants). invokeinterface is slower than invokevirtual in many cases. They can certainly perform the same in cases where the JIT can eliminate the possibility of multiple implementations, but that's probably not going to be the case for IFn :-)

So part of making Clojure faster / making the most of the JVM should probably involve finding ways to use invokevirtual (or invokespecial) instead of invokeinterface where possible.

P.S. in the name of having some real performance data and to reassure myself that my knowledge of JVM behaviour is roughly correct, I made a quick caliper microbenchmark in Java to demonstrate the difference between interface and virtual dispatch:

 0% Scenario{vm=java, trial=0, benchmark=Interface} 13.04 ns; σ=0.19 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=Virtual} 7.39 ns; σ=0.26 ns @ 10 trials

benchmark    ns linear runtime
Interface 13.04 ==============================
  Virtual  7.39 =================

vm: java
trial: 0

Test code here for those interested:



Mikera

unread,
Oct 9, 2013, 10:42:02 PM10/9/13
to clo...@googlegroups.com
Good point Timothy - 100% agreed on the value of hard facts / benchmarks.

Though I think it's still useful to have some discussions first, in order to determine where to start / what to benchmark / where the problems are likely to be hiding. Good research should start with an understanding of the relevant issues, some hypotheses to test and a sensible experimental design - ideally before you start collecting too much data.

Also we shouldn't underestimate the value of sharing the wisdom and intuition gained from many years of optimising code on the JVM, which I'm sure many people around here can contribute :-)
Reply all
Reply to author
Forward
0 new messages