Consider the following code:
(defn foo [x]
  "Result!")
(reduce + (map (comp count foo) (range 5)))
=> 35
A smart compiler would recognise that it is dealing with a constant expression and compile it down to a constant. But Clojure currently compiles it to quite an expensive operation that allocates temporary structures, makes some interface calls, boxes various numbers, etc.
(time (dotimes [i 1000000] (reduce + (map (comp count foo) (range 5)))))
"Elapsed time: 1049.207845 msecs"
i.e. we are taking about 1050ns per iteration for an operation that should be a constant load in less than 1ns. Not very good, and the JVM JIT can't help us much because it doesn't know enough about the structure of the code to perform the necessary higher-level code transformations.
A key root cause of the problem is that functions such as "foo" are opaque to the compiler. foo has important properties (it is a pure function that always evaluates to a constant of type java.lang.String) that the compiler would need to see if it were going to optimise the given expression correctly. Likewise, the compiler would need to know facts about the other functions involved (e.g. "comp of a pure function like count and a constant function like foo is itself a constant function", "(range n) produces a sequence of length n", etc.).
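To make that chain of reasoning concrete, here is a rough sketch of the folding done by hand. The names per-element and folded are mine, purely for illustration; nothing like this is emitted by the current compiler.

```clojure
(defn foo [x]
  "Result!")

;; Fact 1: foo ignores its argument, so (comp count foo) always
;; returns (count "Result!").
(def per-element (count "Result!"))

;; Fact 2: (range 5) has exactly 5 elements, so summing the mapped
;; values is just 5 times the per-element constant.
(def folded (* 5 per-element))

;; The hand-folded constant agrees with the original expression.
(= folded (reduce + (map (comp count foo) (range 5))))
```

Each step relies on a property (purity, constant result, known sequence length) that the compiler cannot currently see from the compiled function objects.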
It seems to me that we could get this kind of optimisation if we added an AST representation as (hidden?) metadata to functions. The compiler could then use this for all sorts of valuable things:
- inlining for small functions
- spotting opportunities to use primitive versions of functions (avoiding boxing)
- avoiding reflection (often eliminating the need for explicit type hints)
- constant expression folding as above
- using core.logic to reason about possible higher level optimisations
- optionally producing type-check errors at compile time
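As a very rough user-level approximation of the idea, the sketch below keeps the source form of a function in the var's metadata and runs a trivial analysis over it. Everything here is an assumption for illustration: defn-with-ast, the :ast key and constant-body? are hypothetical names, and the raw quoted form stands in for what would really be an analysed AST held by the compiler.

```clojure
(defmacro defn-with-ast
  "Hypothetical: like defn, but also stores the quoted fn form
  under the :ast key of the var's metadata."
  [fn-name & fdecl]
  `(do
     (defn ~fn-name ~@fdecl)
     (alter-meta! (var ~fn-name) assoc :ast '(~'fn ~@fdecl))
     (var ~fn-name)))

(defn constant-body?
  "True when the stored form is a single-arity fn whose body is a
  single string, number or keyword literal. Deliberately simplistic:
  docstrings, multiple arities and nested constants are not handled."
  [v]
  (let [[_ params & body] (:ast (meta v))]
    (boolean
      (and (vector? params)
           (= 1 (count body))
           (let [b (first body)]
             (or (string? b) (number? b) (keyword? b)))))))

(defn-with-ast foo [x]
  "Result!")

(defn-with-ast bar [x]
  (str x "!"))

(constant-body? #'foo) ; true: the body is one string literal
(constant-body? #'bar) ; false: the body is a function call
```

A real implementation would presumably live in the compiler and attach the information to the function object itself, but even this toy version shows how facts like "foo is a constant function" become visible once the form is kept around.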
The only downside I can see is some extra memory usage for keeping the ASTs around after initial compilation. But I don't think it would be a big issue, and you could even provide a "clear-compiled-ast" function if you wanted to null out the ASTs and get your memory back at some later stage (e.g. after your app has fully loaded).
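For what it's worth, such a function could be quite small. The sketch below assumes (hypothetically) that the AST is stored under an :ast key in each var's metadata; the key and the per-namespace granularity are my assumptions, not an existing Clojure feature.

```clojure
(defn clear-compiled-ast
  "Hypothetical: remove the :ast metadata entry from every var
  interned in the given namespace (a namespace object or symbol)."
  [ns-or-sym]
  (doseq [v (vals (ns-interns ns-or-sym))]
    (alter-meta! v dissoc :ast)))

;; Illustration only: a var carrying a stored form under :ast.
(def ^{:ast '(fn [x] "Result!")} foo
  (fn [x] "Result!"))
```

Calling (clear-compiled-ast (:ns (meta #'foo))) would then drop the stored form while leaving foo itself callable.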
It seems that we would want something like this in order to achieve the twin goals of having a high-level, dynamic functional language while still providing the ability to achieve high performance in production code.
Thoughts? A reasonable idea for Clojure 2.0 / Clojure in Clojure?