Re: What is the status of Clojure on LLVM or C?


Timothy Baldridge

Mar 27, 2013, 5:21:56 PM
to clo...@googlegroups.com
What use-case do you have for such an implementation? Is there something that Clojure on LLVM will give you that Clojure on the JVM or on V8 won't allow you to do?

Timothy


On Wed, Mar 27, 2013 at 2:05 PM, Joe Graham <josg...@gmail.com> wrote:
Hi Group,
Good afternoon, I hope everyone is well.  I just wanted to reach out to this group and ask about the current status of Clojure on LLVM or a C-based implementation.  Has anyone looked into a Julia implementation?  Just trying to get a roadmap of the main forks before searching on every permutation of this question.  Thanks so much for your help and for the valuable input of this group.

BR_joe

--
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

Mark Rathwell

Mar 27, 2013, 5:47:18 PM
to clo...@googlegroups.com
There is a previous thread that covers a lot of ground and should give you much of the information you are looking for [1].  There aren't too many use cases that couldn't be covered with ClojureScript+V8 or some of the other suggestions.


Mikera

Mar 27, 2013, 9:57:57 PM
to clo...@googlegroups.com
On Thursday, 28 March 2013 04:05:03 UTC+8, Joe Graham wrote:
> Hi Group,
> Good afternoon, I hope everyone is well.  I just wanted to reach out to this group and ask about the current status of Clojure on LLVM or a C-based implementation.  Has anyone looked into a Julia implementation?  Just trying to get a roadmap of the main forks before searching on every permutation of this question.  Thanks so much for your help and for the valuable input of this group.
>
> BR_joe

You may be interested in mjolnir: https://github.com/halgari/mjolnir 

I haven't used it yet, but it appears to be a pretty well designed library that enables Clojure to compile and run native code via LLVM.

It addresses what is probably the best use case for native code compilation in Clojure - i.e. run on the JVM to get the benefits of the JVM infrastructure and library ecosystem but generate native code where needed to achieve specific objectives (presumably performance or direct hardware access...).


John Szakmeister

Mar 28, 2013, 5:15:34 PM
to clo...@googlegroups.com
On Wed, Mar 27, 2013 at 5:21 PM, Timothy Baldridge <tbald...@gmail.com> wrote:
> What use-case do you have for such an implementation? Is there something
> that Clojure on LLVM will give you that Clojure on the JVM or on V8 won't
> allow you to do?

Clojure on C would likely allow me to use Clojure in a deeply embedded
environment, such as an ARM processor with 32MB of flash and 64MB of
RAM. Running the JVM there may require licensing, and V8 doesn't allow
for threads.

I'm not the OP, but I thought I'd share my view too.

-John

Marko Topolnik

Mar 28, 2013, 5:25:13 PM
to clo...@googlegroups.com, jo...@szakmeister.net
Or you may have just a trivial requirement for a program that both starts and executes quickly.

-marko

Laurent PETIT

Mar 28, 2013, 5:45:53 PM
to clo...@googlegroups.com, jo...@szakmeister.net
2013/3/28 Marko Topolnik <marko.t...@gmail.com>:
> Or you may have just a trivial requirement for a program that both starts
> and executes quickly.

To what extent would an LLVM / C version of a Clojure program not
incur a startup penalty as the JVM does?

As far as I understand it, the startup cost is manyfold:
1/ JVM startup
2/ loading of Clojure Core
3/ loading of non-lazy parts of your application (generally from
loading a global namespace to invoke its -main function)

I know AOT compilation can somehow reduce load-time of 2/ and 3/, but
not bring them to zero. As far as I understand it, all the namespaces
involved in your application will still have to be linearly executed,
in a depth-first manner following the graph of namespace dependencies
+ loaded configuration files etc. Only the compilations of functions
will be optimized into loading of their corresponding classes.

So, short of having an "image-like" environment, I wonder what the time
taken to do 2/ + 3/ would be in LLVM / C versions of Clojure.

Just asking, not even sure the above makes sense,

-- Laurent

Mikera

Mar 28, 2013, 9:26:54 PM
to clo...@googlegroups.com, jo...@szakmeister.net
On Friday, 29 March 2013 05:45:53 UTC+8, Laurent PETIT wrote:
> 2013/3/28 Marko Topolnik <marko.t...@gmail.com>:
> > Or you may have just a trivial requirement for a program that both starts
> > and executes quickly.
>
> To what extent would an LLVM / C version of a Clojure program not
> incur a startup penalty as the JVM does?
>
> As far as I understand it, the startup cost is manyfold:
> 1/ JVM startup
> 2/ loading of Clojure Core
> 3/ loading of non-lazy parts of your application (generally from
> loading a global namespace to invoke its -main function)

In my experience 1) is a small fraction of the total. A trivial "hello world" Java program runs in less than 0.1sec on my machine, which proves that JVM startup isn't really important. Or at least, far less important than most people think.
 

> I know AOT compilation can somehow reduce load-time of 2/ and 3/, but
> not bring them to zero. As far as I understand it, all the namespaces
> involved in your application will still have to be linearly executed,
> in a depth-first manner following the graph of namespace dependencies
> + loaded configuration files etc. Only the compilations of functions
> will be optimized into loading of their corresponding classes.
>
> So, short of having an "image-like" environment, I wonder what the time
> taken to do 2/ + 3/ would be in LLVM / C versions of Clojure.

It might even be slower in LLVM / C, unless you can at least match the JVM in terms of JIT optimisation and garbage collector efficiency, which in turn affects the runtime for 2+3 (I believe a garbage collector is a requirement to execute Clojure?). Beating the JVM isn't an easy feat.

Something I would be very interested in would be enhancements to Clojure that allow for lazy compilation, i.e. deferring compilation of parts of your application or Clojure Core until they are directly invoked for the first time. This is probably going to be the most promising approach for reducing Clojure startup time, although I expect it would require some breaking changes. 
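
A rough sketch of the deferral idea, written in Java since that is the host being measured in this thread. The `Lazy` helper and the `LazyDemo` class are hypothetical names, not part of Clojure or its compiler; the helper is loosely analogous to Clojure's `delay`. The only point is that the expensive initializer runs on first use rather than at load time.

```java
import java.util.function.Supplier;

class LazyDemo {
    static int initCount = 0; // counts how often the expensive work ran

    // The table is described at class-load time but not computed yet.
    static final Lazy<int[]> TABLE = new Lazy<>(() -> {
        initCount++;
        int[] squares = new int[1000];
        for (int i = 0; i < squares.length; i++) squares[i] = i * i;
        return squares;
    });

    public static void main(String[] args) {
        System.out.println("work done at startup: " + initCount);
        System.out.println("TABLE[12] = " + TABLE.get()[12]);
        System.out.println("work done after first use: " + initCount);
    }
}

// Hypothetical helper: the supplier runs at most once, on the first get().
final class Lazy<T> implements Supplier<T> {
    private Supplier<? extends T> init; // cleared after first use
    private T value;
    private boolean done;

    Lazy(Supplier<? extends T> init) { this.init = init; }

    @Override
    public synchronized T get() {
        if (!done) {
            value = init.get();
            done = true;
            init = null; // let the initializer be garbage collected
        }
        return value;
    }
}
```

Running `LazyDemo` should show that no work is done at startup and that the initializer runs exactly once, on the first `get()`. Doing the same for every var inside the compiler is a much bigger job than this toy suggests.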

Timothy Baldridge

Mar 29, 2013, 12:04:33 AM
to clo...@googlegroups.com
This is something I've thought/talked about for some time now. In reality this is one of the reasons I started Mjolnir. I would like to see an implementation of Clojure on LLVM. Mjolnir is several months away from being able to handle a project like this, but I took the time tonight to type up my thoughts on the topic. 

https://github.com/halgari/clojure-metal/blob/master/README.md

I'd love to hear anyone's input on this doc. I just typed this up, so it's a bit rough, but it should communicate some of the ideas I have.

Timothy Baldridge




Marko Topolnik

Mar 29, 2013, 4:24:15 AM
to clo...@googlegroups.com, jo...@szakmeister.net


On Thursday, March 28, 2013 10:45:53 PM UTC+1, Laurent PETIT wrote:
> 2013/3/28 Marko Topolnik <marko.t...@gmail.com>:
> > Or you may have just a trivial requirement for a program that both starts
> > and executes quickly.
>
> To what extent would an LLVM / C version of a Clojure program not
> incur a startup penalty as the JVM does?
>
> As far as I understand it, the startup cost is manyfold:
> 1/ JVM startup
> 2/ loading of Clojure Core
> 3/ loading of non-lazy parts of your application (generally from
> loading a global namespace to invoke its -main function)

Yes, the problems are wider than just JVM startup. My point is that Clojure can't be used to build "small-is-beautiful" programs that contribute to the standard *nix toolchain---at least not those that don't do enough massive work to dwarf the initialization costs. I am comparing this to Common Lisp in the '80s, where both startup time and execution speed were goals held in high regard. Someone looking for an LLVM implementation may be having just such a use case in mind.

-marko

John Szakmeister

Mar 29, 2013, 5:02:34 AM
to clo...@googlegroups.com
On Thu, Mar 28, 2013 at 9:26 PM, Mikera <mike.r.an...@gmail.com> wrote:
> On Friday, 29 March 2013 05:45:53 UTC+8, Laurent PETIT wrote:
>>
>> 2013/3/28 Marko Topolnik <marko.t...@gmail.com>:
>> > Or you may have just a trivial requirement for a program that both
>> > starts
>> > and executes quickly.
>>
>> To what extent would an LLVM / C version of a Clojure program not
>> incur startup penalty as the JVM does.
>>
>> As far as I understand it, the startup cost is manyfold:
>> 1/ JVM startup
>> 2/ loading of Clojure Core
>> 3/ loading of non-lazy parts of your application (generally from
>> loading a global namespace to invoke its -main function)
>
> In my experience 1) is a small fraction of the total. A trivial "hello
> world" Java program runs in less than 0.1sec on my machine, which proves
> that JVM startup isn't really important. Or at least, far less important
> than most people think.

I certainly don't see that. I've measured this more than a few times,
and it's several seconds for a simple "Hello World" Java application
on any machine that I can touch. Additionally, on an embedded system,
I'm not going to have the same kind of CPU power. For instance, the
current processor we use runs at 400MHz instead of your desktop's
3GHz.

[snip]
> It might even be slower in LLVM / C, unless you can at least match the JVM
> in terms of JIT optimisation and garbage collector efficiency, which in turn
> affects the runtime for 2+3 (I believe a garbage collector is a requirement
> to execute Clojure?). Beating the JVM isn't an easy feat.

You could argue the same for any application written in C, though I
think in practice C keeps up pretty well. However, raw execution
speed isn't necessarily my goal. More interesting to me is having
better tools to use. Clojure's approach to concurrent programming is
worlds better than the "share everything" approach used in C, and
it's that facility that I'd like to use the most. But requiring the
JVM to use it--in my environment--is just too high of a price to pay.
To be honest, LLVM might be too high as well. LLVM is certainly far
from small and lightweight. :-)

-John

Marko Topolnik

Mar 29, 2013, 5:28:47 AM
to clo...@googlegroups.com, jo...@szakmeister.net

> I certainly don't see that.  I've measured this more than a few times,
> and it's several seconds for a simple "Hello World" Java application
> on any machine that I can touch.  Additionally, on an embedded system,
> I'm not going to have the same kind of CPU power.  For instance, the
> current processor we use runs at 400MHz instead of your desktop's
> 3GHz.

In fairness to Java, your measurements are the exception, not Mikera's. For example:

$ java -version
java version "1.7.0_17"
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)
$ echo 'public class Test { public static void main(String... args) { System.out.println("Hello"); } }' > Test.java
$ javac Test.java
$ time java Test
Hello

real 0m0.137s
user 0m0.097s
sys 0m0.034s
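
Note that `time java Test` measures the whole process, including launch and exit. To see only how long the JVM takes to reach `main`, one option is to ask the runtime for its own start timestamp. This is a minimal sketch: `StartupProbe` is a made-up class name, and the value reported by `RuntimeMXBean.getStartTime()` is approximate, with millisecond resolution.

```java
import java.lang.management.ManagementFactory;

class StartupProbe {
    // Milliseconds between JVM start (as reported by the runtime MXBean)
    // and the moment this method is called.
    static long millisSinceJvmStart() {
        long jvmStart = ManagementFactory.getRuntimeMXBean().getStartTime();
        return System.currentTimeMillis() - jvmStart;
    }

    public static void main(String[] args) {
        System.out.println("reached main after ~" + millisSinceJvmStart() + " ms");
    }
}
```

Calling the same method at the top of a Clojure `-main` (via interop) would give a rough split between bare JVM startup and everything loaded before application code runs.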

-marko

John Szakmeister

Mar 29, 2013, 5:40:48 AM
to clo...@googlegroups.com
Hmmm... perhaps I was testing something more full-fledged (though not
a server application) that could be used as a skeleton for a
command-line tool. I'll have to check. I'm not seeing it be that
fast on my current machine, but it is less than 1 second with Java
6, though this machine is rather powerful.

Still, it's not going to be that fast on a 400MHz ARM. :-)

-John

Mikera

Mar 29, 2013, 5:49:47 AM
to clo...@googlegroups.com, jo...@szakmeister.net
I decided to benchmark JVM startup again, in case of any doubt, and because I see plenty of FUD on this issue.

The timecmd bat file used for the benchmark is in this SO answer: http://stackoverflow.com/a/6209392/214010

C:\xxx> java -version
java version "1.7.0_11"
Java(TM) SE Runtime Environment (build 1.7.0_11-b21)
Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)

C:\xxx> timecmd java -cp miscellania.jar hello.world.App
Hello World!
command took 0:0:0.10 (0.10s total)

As you can see JVM startup is 0.1s or less: in fact the entire execution of Hello World took 0.1 sec on my machine. Note that this is the server VM. I did about 20 successive runs which were all in the 0.08 to 0.12 sec range.

On Friday, 29 March 2013 17:02:34 UTC+8, John Szakmeister wrote:
> On Thu, Mar 28, 2013 at 9:26 PM, Mikera <mike.r.an...@gmail.com> wrote:
> > On Friday, 29 March 2013 05:45:53 UTC+8, Laurent PETIT wrote:
> >>
> >> 2013/3/28 Marko Topolnik <marko.t...@gmail.com>:
> >> > Or you may have just a trivial requirement for a program that both
> >> > starts
> >> > and executes quickly.
> >>
> >> To what extent would an LLVM / C version of a Clojure program not
> >> incur a startup penalty as the JVM does?
> >>
> >> As far as I understand it, the startup cost is manyfold:
> >> 1/ JVM startup
> >> 2/ loading of Clojure Core
> >> 3/ loading of non-lazy parts of your application (generally from
> >> loading a global namespace to invoke its -main function)
> >
> > In my experience 1) is a small fraction of the total. A trivial "hello
> > world" Java program runs in less than 0.1sec on my machine, which proves
> > that JVM startup isn't really important. Or at least, far less important
> > than most people think.
>
> I certainly don't see that.  I've measured this more than a few times,
> and it's several seconds for a simple "Hello World" Java application
> on any machine that I can touch.  Additionally, on an embedded system,
> I'm not going to have the same kind of CPU power.  For instance, the
> current processor we use runs at 400MHz instead of your desktop's
> 3GHz.

I suggest you recheck your measurement approach / configuration. See above :-)

Agreed that any embedded processor is likely much slower than a PC (mine is a laptop in fact). And IO speed probably makes a big difference as well if there are caching effects / large .jar files to load.
 

> [snip]
> > It might even be slower in LLVM / C, unless you can at least match the JVM
> > in terms of JIT optimisation and garbage collector efficiency, which in turn
> > affects the runtime for 2+3 (I believe a garbage collector is a requirement
> > to execute Clojure?). Beating the JVM isn't an easy feat.
>
> You could argue the same for any application written in C, though I
> think in practice C keeps up pretty well.  However, raw execution
> speed isn't necessarily my goal.  More interesting to me is having
> better tools to use.  Clojure's approach to concurrent programming is
> worlds better than the "share everything" approach used in C, and
> it's that facility that I'd like to use the most.  But requiring the
> JVM to use it--in my environment--is just too high of a price to pay.
> To be honest, LLVM might be too high as well.  LLVM is certainly far
> from small and lightweight. :-)

Agreed - the JVM is a poor fit for very tightly constrained environments. Excited to see what you can achieve here! 

Just don't knock the JVM unfairly, it is one of the best tools we have :-)

John Szakmeister

Mar 29, 2013, 6:37:36 AM
to clo...@googlegroups.com
On Fri, Mar 29, 2013 at 5:49 AM, Mikera <mike.r.an...@gmail.com> wrote:
> I decided to benchmark JVM startup again, in case of any doubt, and because
> I see plenty of FUD on this issue.

Sorry, I don't mean to spread any FUD. I'm just being loose with the
phrase "start-up time". You're right, I should be more precise in my
terminology. FWIW, I think when people talk about this, they're doing
the same thing: including the class loading overhead for something
other than the most trivial of examples.

[snip]
> As you can see JVM startup is 0.1s or less: in fact the entire execution of
> Hello World took 0.1 sec on my machine. Note that this is the server VM. I
> did about 20 successive runs which were all in the 0.08 to 0.12 sec range.

Yes. It's about double that on my machine, but within reach.

[snip]
>> I certainly don't see that. I've measured this more than a few times,
>> and it's several seconds for a simple "Hello World" Java application
>> on any machine that I can touch. Additionally, on an embedded system,
>> I'm not going to have the same kind of CPU power. For instance, the
>> current processor we use runs at 400MHz instead of your desktop's
>> 3GHz.
>
> I suggest you recheck your measurement approach / configuration. See above
> :-)

Okay. I pushed up a barebones example of a command line application
here: https://github.com/jszakmeister/barebones

Really, it's just Clojure plus tools.cli, and a small snippet in main.
Running this 10 times, I'm seeing about 3.07s on my machine to
execute this example. I built it with "lein2 uberjar", and did:

:: time java -jar target/barebones-0.1.0-SNAPSHOT-standalone.jar
{:faux bar, :help false}
Hello, World!
java -jar target/barebones-0.1.0-SNAPSHOT-standalone.jar 3.09s user
0.28s system 186% cpu 1.804 total

:: java -version
java version "1.6.0_43"
Java(TM) SE Runtime Environment (build 1.6.0_43-b01-447-10M4203)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01-447, mixed mode)

> Agreed that any embedded processor is likely much slower than a PC (mine is
> a laptop in fact). And IO speed probably makes a big difference as well if
> there are caching effects / large .jar files to load.

Definitely.

[snip]
> Agreed - the JVM is a poor fit for very tightly constrained environments.
> Excited to see what you can achieve here!
>
> Just don't knock the JVM unfairly, it is one of the best tools we have :-)

I don't mean to do that. The JVM is an amazing piece of software. I
just happen to be in circles where folks believe it's the answer to
everything, and unfortunately, it's not. It has some limitations and
it isn't well-suited to every problem. Even at 100ms for the start-up
time, that's still pretty non-trivial for a command line application.
The real question is whether we can get something like the barebones
example to fire up and run in a similar amount of time.

As you said: I'm excited to see what can be achieved in this space!

-John

Marko Topolnik

Mar 29, 2013, 6:44:26 AM
to clo...@googlegroups.com, jo...@szakmeister.net
On Friday, March 29, 2013 11:37:36 AM UTC+1, John Szakmeister wrote:
> On Fri, Mar 29, 2013 at 5:49 AM, Mikera <mike.r.an...@gmail.com> wrote:
> > I decided to benchmark JVM startup again, in case of any doubt, and because
> > I see plenty of FUD on this issue.
>
> Okay.  I pushed up a barebones example of a command line application
> here: https://github.com/jszakmeister/barebones
>
> Really, it's just Clojure plus tools.cli, and a small snippet in main.
> Running this 10 times, I'm seeing about 3.07s on my machine to
> execute this example.  I built it with "lein2 uberjar", and did:
>
> :: time java -jar target/barebones-0.1.0-SNAPSHOT-standalone.jar
> {:faux bar, :help false}
> Hello, World!
> java -jar target/barebones-0.1.0-SNAPSHOT-standalone.jar  3.09s user
> 0.28s system 186% cpu 1.804 total
>
> :: java -version
> java version "1.6.0_43"
> Java(TM) SE Runtime Environment (build 1.6.0_43-b01-447-10M4203)
> Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01-447, mixed mode)

Yes, you are involving Clojure startup here, which turns the tables altogether. This is far more work than just Java startup: all the namespaces must be initialized: all their def'd values calculated at runtime and assigned. Some of these may involve quite heavyweight service startup. This is the real issue in the Clojure startup story: it is not aggressively optimized towards zero startup time. It is a problem that carries over to other underlying implementations.

-marko

Rich Morin

Mar 29, 2013, 11:55:11 AM
to clo...@googlegroups.com
On Mar 29, 2013, at 03:44, Marko Topolnik wrote:
> Yes, you are involving Clojure startup here, which turns the tables
> altogether. This is far more work than just Java startup: all the
> namespaces must be initialized: all their def'd values calculated
> at runtime and assigned. Some of these may involve quite heavyweight
> service startup. This is the real issue in the Clojure startup story:
> it is not aggressively optimized towards zero startup time. It is a
> problem that carries over to other underlying implementations.

So, a naive question. How much of this work could be pre-calculated
and/or deferred until the values are needed?

-r

--
http://www.cfcl.com/rdm Rich Morin
http://www.cfcl.com/rdm/resume r...@cfcl.com
http://www.cfcl.com/rdm/weblog +1 650-873-7841

Software system design, development, and documentation


Herwig Hochleitner

Mar 29, 2013, 12:37:19 PM
to clo...@googlegroups.com
2013/3/29 Marko Topolnik <marko.t...@gmail.com>
> Yes, you are involving Clojure startup here, which turns the tables altogether. This is far more work than just Java startup: all the namespaces must be initialized: all their def'd values calculated at runtime and assigned. Some of these may involve quite heavyweight service startup. This is the real issue in the Clojure startup story: it is not aggressively optimized towards zero startup time. It is a problem that carries over to other underlying implementations.

The opposite is also true and weighs more heavily in this case, I think. Java is not optimized for good startup times. In particular, you can't embed any composite constants in byte code. Not even arrays. That means every last piece of metadata has to be allocated and built from scratch at startup. Every class for every toplevel fn has to be loaded (a process that involves deserialization as well), initialized and instantiated.

In a native code implementation of clojure, all the statically known functions, data and metadata structures could be directly embedded into the binary. Thus the init cost of a namespace with defn as the only toplevel forms could be near zero.
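
The constant-embedding limitation can be seen from plain Java. In this sketch (all class names are hypothetical), reading a primitive `static final` constant does not even initialize the owning class, because javac inlines the value at the use site. Reading a `static final` array does initialize it, since the array has to be allocated and filled by initializer code when the class is initialized.

```java
import java.util.ArrayList;
import java.util.List;

class ConstantDemo {
    // Records when the nested class below actually gets initialized.
    static final List<String> LOG = new ArrayList<>();

    static class Consts {
        static final int SCALAR = 42;         // compile-time constant, inlined by javac
        static final int[] TABLE = {1, 2, 3}; // allocated and filled at class init
        static { LOG.add("Consts initialized"); }
    }

    static int readScalar() { return Consts.SCALAR; }   // does not trigger init
    static int readTable()  { return Consts.TABLE[1]; } // triggers init

    public static void main(String[] args) {
        System.out.println(readScalar() + " " + LOG);
        System.out.println(readTable() + " " + LOG);
    }
}
```

The log should still be empty after the scalar read and contain one entry after the array read. Clojure metadata maps, keywords and vars are all in the second category on the JVM, which is where the per-namespace init cost comes from.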

Jean Niklas L'orange

Mar 29, 2013, 10:50:04 PM
to clo...@googlegroups.com
On Friday, March 29, 2013 5:04:33 AM UTC+1, tbc++ wrote:
> This is something I've thought/talked about for some time now. In reality this is one of the reasons I started Mjolnir. I would like to see an implementation of Clojure on LLVM. Mjolnir is several months away from being able to handle a project like this, but I took the time tonight to type up my thoughts on the topic.
>
> https://github.com/halgari/clojure-metal/blob/master/README.md
>
> I'd love to hear anyone's input on this doc. I just typed this up, so it's a bit rough, but it should communicate some of the ideas I have.
>
> Timothy Baldridge

Looks interesting to me, and what you describe is in my eyes a sound approach. Seems like you've been thinking about this for some time, esp. considering the talk you gave recently. Unfortunately I wasn't there, but when one gives a talk on this topic, it's evident one has thought a lot about it.

What I found lacking was how one should design the Clojure core of Clojure-metal. Should one simply convert Clojure as it is right now, or should one take into account the lessons learned from creating Clojure? I'm sure Rich has some ideas on what should be done differently from what is currently done in Clojure right now, though perhaps aimed for the JVM implementation, and not in general. (Maybe you talked about this at Clojure/West?)

-- Jean Niklas L'orange

Timothy Baldridge

Mar 29, 2013, 11:10:35 PM
to clo...@googlegroups.com
I really didn't discuss this in my talk at all. My talk focused more around Mjolnir and how it allows for construction of compilers in general. That being said, yes, this has been the hidden agenda behind the library from the start. 

For those who didn't see the talk, Mjolnir exposes access to LLVM through a set of typed S-expressions. This allows us to use macros, clojure namespaces, and the entire clojure runtime to construct compilers. An example of a "bare-bones" lisp compiler can be found here: https://github.com/halgari/mjolnir/blob/master/src/examples/simple_lisp.clj

clojure-metal is a unique beast compared with the other implementations of Clojure. Most Clojure implementations are built on top of OOP VMs. If you are writing a VM from scratch you don't have an OOP system to build on. So I think the answer to your question is somewhere between ClojureScript and Clojure on the JVM. Protocol based from the bottom up, but with a custom runtime.

The guiding idea would be something like this:

* Nothing but protocols (like ClojureScript)
* First-class Namespaces (like Clojure)
* Everything is assigned to a namespaced var (like Clojure)
* Ints, Floats, Doubles, BigDecimals (via GMP), etc. (like Clojure)
* Refrain from "if instance(foo, IBar)" dispatching, instead use protocols. This would remove much of RT.java
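
To illustrate the last bullet, here is a hedged sketch in Java of dispatch through a protocol table instead of an `instanceof` chain. The `Protocol` class, its `extend` and `invoke` methods, and the `COUNT` protocol are hypothetical names for illustration only; the sketch makes no claim about how clojure-metal or ClojureScript implement protocols.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class ProtocolDemo {
    // Hypothetical single-method "protocol": implementations are looked up by
    // the concrete class of the argument, so new types can be added without
    // editing a central instanceof chain.
    static final class Protocol<R> {
        private final String name;
        private final Map<Class<?>, Function<Object, R>> impls = new HashMap<>();

        Protocol(String name) { this.name = name; }

        <T> void extend(Class<T> type, Function<? super T, ? extends R> impl) {
            impls.put(type, x -> impl.apply(type.cast(x)));
        }

        R invoke(Object x) {
            // Exact-class lookup only; a fuller version would also walk
            // superclasses and interfaces.
            Function<Object, R> f = impls.get(x.getClass());
            if (f == null) {
                throw new IllegalArgumentException(
                    "No implementation of " + name + " for " + x.getClass().getName());
            }
            return f.apply(x);
        }
    }

    static final Protocol<Integer> COUNT = new Protocol<>("count");
    static {
        COUNT.extend(String.class, String::length);
        COUNT.extend(int[].class, a -> a.length);
    }

    public static void main(String[] args) {
        System.out.println(COUNT.invoke("hello"));
        System.out.println(COUNT.invoke(new int[] {1, 2, 3}));
    }
}
```

The design choice being sketched is that the set of supported types is open: extending `COUNT` to a new type is a table insert, not a change to runtime code like RT.java.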


Timothy Baldridge




Aaron

Mar 30, 2013, 1:19:35 AM
to clo...@googlegroups.com
I'm working on a Clojure DSL for generating C code called c-in-clj: https://github.com/aaronc/c-in-clj.  Right now it only works on ClojureCLR, but could be ported to Clojure JVM if someone desired.  The API should be considered unstable but I have been using it to generate real production code.

This isn't a Clojure in C implementation, it just lets me take advantage of Clojure features for generating the C code the way I want to.  For me, I feel like it covers (or will cover) most of the use cases I have for a Clojure C.  At the same time, it could be used as the foundation for writing Clojure in C.  If someone does feel inspired to take on this task, I can explain how to do it with this library.

Marko Topolnik

Mar 30, 2013, 2:11:13 AM
to clo...@googlegroups.com
An excellent point. Indeed, the semantics of the JVM are not specifically geared towards quick startup. Do mind, however, that this is just specification we're talking about. An implementation may perform any optimization as long as it maintains the semantics. As an obvious example, take the startup of the JVM itself, which loads and initializes many hundreds of classes, and sets up a myriad of internal data structures, within 100-200 ms. This kind of optimization is unfortunately not something client code can participate in, and that's where many problems start. Clojure would basically need its own custom JVM derivation which would provide an accessible way to "fossilize" a namespace and turn it into a binary blob that can be loaded in microseconds, resulting in a fully initialized image in memory.

At a certain point in this departure from the JVM comes a moment where it is saner to start from scratch and build your own runtime, but that means you undo some 15 years of experience in building and optimizing a runtime for heavyweight production systems. Some lessons can be carried over, but a huge spectrum of fine details will be left for you to reinvent.

As long as the JVM is alive and strong I doubt we shall ever see a native implementation that wouldn't cause frustration to its users, because it won't be the universally best choice. For many years to come nothing will be able to dethrone the JVM at the server side, where startup time means nothing and long-running stability is precious. So whatever runtime you choose, it will take you only that far before its specific limitations start giving you a headache.

But hey, that's life: no language has conquered all the bases. Otherwise C would by now be just a tale from the elders.

-marko

Dax Fohl

May 30, 2013, 2:20:31 AM
to clo...@googlegroups.com
Would an alternate approach be to write a Clojure interpreter in RPython and have the PyPy toolchain create everything for you?  That way you get an interpreter with a tracing JIT for free, plus it looks like they've got STM working now.  It seems like that could save a lot of work.  Am I missing something?  What are the downsides of this approach?

Michael Klishin

May 30, 2013, 6:11:38 AM
to clo...@googlegroups.com
2013/5/30 Dax Fohl <dax....@gmail.com>

> Am I missing something?  What are the downsides of this approach?

Is RPython garbage collected? Key ideas in Clojure pretty much assume memory management is not something you
have to worry about.

What about concurrency primitives? Clojure builds its reference types on top of JDK/.NET ones (and mimics them
in ClojureScript).

Dax Fohl

May 30, 2013, 6:51:06 AM
to clo...@googlegroups.com
I've got no idea about RPython itself, but the PyPy toolchain takes an interpreter for any language specified in RPython, and generates a native interpreter for that language, complete with JIT, garbage collection, etc.  (Here's an example http://morepypy.blogspot.com/2011/04/tutorial-writing-interpreter-with-pypy.html).  

Given there's an RPython interpreter for Python (aka PyPy), which dynamically generates the JIT, GC, etc, it seems logical that one could write an RPython interpreter for Clojure in the same way and get all the benefits of their toolchain.  Apparently they've got STM working there too.  And my understanding is that their generated tracing JIT is awesome for things marked as immutable (I forget where I read that), which would have insane performance consequences for Clojure.

Timothy Baldridge

May 30, 2013, 8:15:29 AM
to clo...@googlegroups.com
No, you're not missing something. In the past I've turned down the idea of using RPython due to the lack of threading support. But in the past year major, major headway has been made (as you mentioned) so perhaps RPython isn't as crazy of an idea after all. 

As far as a GC goes, yes, RPython can use one of several GCs. With a simple command-line switch, the RPython translator can create binaries that use reference counting, the Boehm GC, or a custom mark-and-sweep generational (compacting?) GC. The only downside is that IIRC the more complex GCs are not yet thread-safe. But once again, major work is being done there.


Timothy



Gary Trakhman

unread,
May 30, 2013, 8:21:36 AM5/30/13
to clo...@googlegroups.com
I just thought about this recently, but does the value-oriented nature of Clojure mostly void the need for a cycle-aware GC?  It seems like you won't ever have cycles without identities, or pointers (Java references).  Maybe this would be a problem only when you need identities, i.e. deftype or defprotocol implementing objects.

Dax Fohl

unread,
May 30, 2013, 8:45:02 AM5/30/13
to clo...@googlegroups.com
So what do you see as the advantage in going the clojure-metal path?  Is it that RPython is such a pain to debug that it ends up not being worth it in the end?  Is the tradeoff essentially being able to do things exactly how you want in LLVM versus having to put up with warts that might not quite fit in PyPy?  Or is there something the clojure-metal path will make easier than going with RPython?  (Also didn't I see in the clojure-py blog once that you overcame the lack of threading by launching separate processes?  Is there a reason that wouldn't work in the real world?)

Jean Niklas L'orange

unread,
May 30, 2013, 9:44:33 AM5/30/13
to clo...@googlegroups.com


On Thursday, May 30, 2013 2:21:36 PM UTC+2, Gary Trakhman wrote:
I just thought about this recently, but does the value-oriented nature of Clojure mostly void the need for a cycle-aware GC?  It seems like you won't ever have cycles without identities, or pointers (Java references).  Maybe this would be a problem only when you need identities, i.e. deftype or defprotocol implementing objects.

Sure thing, the value-oriented nature removes a lot of cycles in practice. However, you may for instance have an atom which contains itself, so cycles are not nonexistent. As such, the GC cannot be completely cycle-ignorant, but perhaps it doesn't have to be efficient at finding them either.

Another place where cycles happen is in (mutually) recursive functions; these may be iffy if you define many recursive anonymous functions at runtime.
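
The self-containing atom case can be sketched with a hand-rolled reference count. This is only a simulation under stated assumptions: `Cell`, `incref`, `decref` and `store` are hypothetical names standing in for a naive, cycle-ignorant collector, not anything from Clojure or from a real runtime.

```python
class Cell:
    """Hypothetical ref-counted mutable cell, loosely standing in for an atom."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.refcount = 0
        self.freed = False


def incref(cell):
    cell.refcount += 1


def decref(cell):
    """Drop one reference; free the cell (and release what it holds) at zero."""
    cell.refcount -= 1
    if cell.refcount == 0:
        cell.freed = True
        held = cell.value
        cell.value = None
        if isinstance(held, Cell):
            decref(held)


def store(cell, new_value):
    """Point the cell at new_value, adjusting counts as a naive collector would."""
    if isinstance(new_value, Cell):
        incref(new_value)
    old = cell.value
    cell.value = new_value
    if isinstance(old, Cell):
        decref(old)


# A cell with no cycle is reclaimed as soon as its last reference goes away.
plain = Cell("plain")
incref(plain)            # one reference from a local variable
decref(plain)            # the variable goes out of scope

# A cell that holds itself keeps its own count above zero forever.
looped = Cell("looped")
incref(looped)           # one reference from a local variable
store(looped, looped)    # the cell now refers to itself
decref(looped)           # the variable goes out of scope

print(plain.freed, looped.freed, looped.refcount)
```

In this simulation the plain cell is freed while the self-referencing one is left with a count of one and nothing pointing at it from outside, which is the leak a cycle-ignorant collector cannot see.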

Gary Trakhman

unread,
May 30, 2013, 10:47:41 AM5/30/13
to clo...@googlegroups.com
At first glance, those issues seem like they could be mitigated, so I think I see room here for some real-time ref-counted Clojure.  I imagine it'd still have a lot of allocations so it wouldn't be that great for low-memory systems.



Timothy Baldridge

unread,
May 30, 2013, 10:56:42 AM5/30/13
to clo...@googlegroups.com
There are two things I see that reduce the viability of ref-count GCs with Clojure:

a) Clojure is insanely alloc heavy. Look at the source of the data structures: merging two hash-maps (for instance) requires about 2-3 allocations per assoc, per kv in the merged map:

(merge map1 map2)
allocs = ~((tree-depth of map1) * (count map2))

b) Every time you traverse a persistent structure's tree you'll need to increment and decrement the refcounts. Consider multi-core: now the refcounts have to be thread safe (atomic inc & dec). So if you have two cores reading (not just writing) from the same hash-map, they'll be fighting over the cache lines that the refcount is stored in, and that'll kill performance.
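
The estimate in (a) can be illustrated with a rough sketch of path copying. Note the swap: Clojure's real hash maps are wide hash array mapped tries, whereas this is a deliberately simple fixed-depth binary trie over small integer keys, and `Node`, `assoc`, `get`, `merge`, `BITS` and the `ALLOCATIONS` counter are all hypothetical names for this example. The point is only that each assoc copies one node per tree level, so a merge costs roughly depth times the number of entries merged in.

```python
ALLOCATIONS = 0   # global counter of Node objects created
BITS = 4          # depth of the toy trie; keys are ints in range(2 ** BITS)


class Node:
    __slots__ = ("left", "right")

    def __init__(self, left=None, right=None):
        global ALLOCATIONS
        ALLOCATIONS += 1
        self.left = left
        self.right = right


def assoc(node, key, value, depth=BITS):
    """Return a new trie with key -> value, sharing every untouched subtree."""
    if depth == 0:
        return value                      # leaves hold the value directly
    bit = (key >> (depth - 1)) & 1
    left = node.left if node is not None else None
    right = node.right if node is not None else None
    if bit == 0:
        return Node(assoc(left, key, value, depth - 1), right)
    return Node(left, assoc(right, key, value, depth - 1))


def get(node, key, depth=BITS):
    """Look up key; returns None when the key is absent."""
    while depth > 0 and node is not None:
        bit = (key >> (depth - 1)) & 1
        node = node.left if bit == 0 else node.right
        depth -= 1
    return node


def merge(trie, pairs):
    """Merge key/value pairs into trie, one assoc per pair."""
    for key, value in pairs:
        trie = assoc(trie, key, value)
    return trie


map1 = merge(None, [(k, k * 10) for k in range(8)])
before = ALLOCATIONS
merged = merge(map1, [(3, "x"), (9, "y"), (12, "z")])
merge_allocs = ALLOCATIONS - before       # BITS new nodes per merged pair
print(merge_allocs, get(map1, 3), get(merged, 3))
```

The old map stays intact because nothing is mutated, which is also why every update has to allocate a fresh path from the root.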

Timothy

Gary Trakhman

unread,
May 30, 2013, 11:16:32 AM5/30/13
to clo...@googlegroups.com
Yes, it's a half-baked idea, but I'm curious if it might be worth an experiment.
re: a) yeah, I suspected this could get pretty bad, and your comment about having to mutate counts while traversing it definitely amplifies the effect.

b) real-time is about latency and jitter and such, if throughput is sufficient. I'm curious how bad it would actually be.  Something that comes to mind is to simply be careful about sharing data between threads, and to perform full copies in memory when it's worthwhile to avoid cache misses, or some thread-local optimizations if possible (I'll go google some papers before I speculate some more on this). I imagine multi-process python and clojure would have a similar coordination problem.

I appreciate the feedback, I've been wistfully interested in the topic for a long time.

atucker

unread,
May 30, 2013, 4:00:59 PM5/30/13
to clo...@googlegroups.com
Hi!  I'm an interested spectator but understand very little :)  I wonder if anyone would take a moment to explain?
E.g. I can't see why reading from a data structure should ever lead to a change in the refcounts.
A

Gary Trakhman

unread,
May 30, 2013, 4:25:27 PM5/30/13
to clo...@googlegroups.com
Well, ref-counting in C++ is used by things like smart pointers. The implementation overloads the pointer dereference operator *, and it manages an internal pointer to the actual value.  Instantiating a smart pointer increases the count for that object, and once that smart-pointer object goes out of scope (free/delete or popping off the stack), its destructor automatically decrements the ref count again.

The thing I'm proposing would use something like that, by my understanding, and data structures could either be implemented in native bits or further up once deftypes and protocols are defined on top of something like this.

So, in order to actually read from a memory value and guarantee that the object still exists, you have to allocate a wrapper that manages these reference counts. HTH.
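
Here is a rough sketch of why a pure read ends up writing to reference counts under this scheme. It simulates the smart-pointer behaviour in Python with a context manager; `RcNode`, `Handle`, `contains` and the `count_writes` field are hypothetical names for this illustration, and the counts are plain integers rather than the atomic counters a multi-threaded runtime would need.

```python
class RcNode:
    """Hypothetical tree node carrying its own reference count."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.refcount = 1       # held by its parent (or by the root variable)
        self.count_writes = 0   # how many times a reader wrote to the count


class Handle:
    """Smart-pointer-like guard: keeps a node alive while it is in scope."""

    def __init__(self, node):
        self.node = node

    def __enter__(self):
        self.node.refcount += 1
        self.node.count_writes += 1
        return self.node

    def __exit__(self, exc_type, exc, tb):
        self.node.refcount -= 1
        self.node.count_writes += 1
        return False


def contains(root, key):
    """Read-only binary-search lookup that pins each node while inspecting it."""
    node = root
    while node is not None:
        with Handle(node) as pinned:
            if key == pinned.key:
                return True
            node = pinned.left if key < pinned.key else pinned.right
    return False


tree = RcNode(8, RcNode(4, RcNode(2), RcNode(6)), RcNode(12))
found = contains(tree, 6)
writes = (tree.count_writes
          + tree.left.count_writes
          + tree.left.right.count_writes)
print(found, writes)
```

Every node on the lookup path gets one increment and one decrement even though nothing in the tree changed, and with several threads doing this on shared nodes those writes are what would contend for the same cache lines.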

atucker

unread,
May 30, 2013, 4:27:09 PM5/30/13
to clo...@googlegroups.com
Wait... maybe I do :)  Perhaps I was thinking that you needn't increment the refcount of a node when you're just looking at it, but only if you're going to return it or attach it to something else...  Sorry to know so little...

Gary Trakhman

unread,
May 30, 2013, 4:46:24 PM5/30/13
to clo...@googlegroups.com
I might have missed some details about the implementation of smart pointers there, (whoops, I guess I failed the C++ interview), but I think it's got the basic idea.

atucker

unread,
May 30, 2013, 4:49:45 PM5/30/13
to clo...@googlegroups.com
Thanks for this!  I do see now that it's probably a little trickier than I first thought :)  Still, like you, I am left with the feeling it might be possible to do well...

Dax Fohl

unread,
May 30, 2013, 10:26:56 PM5/30/13
to clo...@googlegroups.com
So what I'm gathering (I'm still trying to grok) is that clojure-metal is an approach that somewhat parallels PyPy, except in Clojure, and except that instead of defining a type-inferable subset RClojure, you instead define an internal DSL via mjolnir that allows you to specify the types within Clojure?  But unlike RPython, which can run as a CPython program without any special handling, the mjolnir DSL is only intended to be run as a bytecode generator.  Is that about right?

I'm wondering, if the intention is to parallel PyPy, but use Clojure instead, would a more generic thing to do be to start with the PyPy toolchain, but abstract it so that the input is not RPython, but rather any type-inferable language and a corresponding AST generator, and make it so that PyPy is no longer Python-specific at all?  (Well, except the toolchain code would still be Python, but that's just an implementation detail.)  So from that toolchain it'd be possible to define RuRu, CloClo, maybe dynamic MLML or CC, or of course cross-version CloML or RuC?  Or is the PyPy toolchain still so specific to "Python-like" languages that it'd produce something suboptimal for Clojure and other languages?


On Thursday, May 30, 2013 8:15:29 PM UTC+8, tbc++ wrote:

Dax Fohl

unread,
May 31, 2013, 1:23:17 AM5/31/13
to clo...@googlegroups.com
(or the slightly hackier but probably easier version: create a tool that translates a subset of Clojure to RPython)
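
A very rough sketch of what the front end of such a translator might look like, assuming a tiny subset with only binary arithmetic, `<` and `if`. The function names (`tokenize`, `parse`, `emit`) and the supported forms are invented for this example; it emits ordinary Python expression strings and makes no attempt to satisfy RPython's actual restrictions or to handle Clojure symbols that are not valid Python identifiers.

```python
def tokenize(src):
    """Split an s-expression string into parens and atoms."""
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return a nested list or an atom."""
    token = tokens.pop(0)
    if token == "(":
        form = []
        while tokens[0] != ")":
            form.append(parse(tokens))
        tokens.pop(0)   # drop the closing paren
        return form
    return token


BINARY_OPS = {"+": "+", "-": "-", "*": "*", "<": "<"}


def emit(form):
    """Translate a parsed form into a Python expression string."""
    if isinstance(form, str):
        return form
    head, args = form[0], form[1:]
    if head in BINARY_OPS and len(args) == 2:
        return "(%s %s %s)" % (emit(args[0]), BINARY_OPS[head], emit(args[1]))
    if head == "if" and len(args) == 3:
        return "(%s if %s else %s)" % (emit(args[1]), emit(args[0]), emit(args[2]))
    raise ValueError("unsupported form: %r" % (head,))


source = "(if (< x 10) (* x 2) (- x 1))"
python_expr = emit(parse(tokenize(source)))
print(python_expr)
```

The hard parts (persistent data structures, vars, macros, and keeping the output type-inferable) are exactly what this leaves out.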

Michał T. Lorenc

unread,
May 10, 2014, 12:14:21 AM5/10/14
to clo...@googlegroups.com
Hi,
How about porting Clojure on top of Go (golang)?

To get Clojure running on LLVM, some ideas could perhaps be borrowed from Julia (http://julialang.org/).

Mic