Clojure in Clojure?

tmountain

Jul 9, 2009, 11:10:53 AM
to Clojure
I just finished watching the Bay Area Clojure Meetup video, and Rich
spent a few minutes talking about the possibility of Clojure in
Clojure. The prospect of having Clojure self-hosted is incredibly
cool, but it brought a few questions to mind. For one, Rich mentions
that it would potentially open up additional target platforms for the
language, citing Objective-C, ActionScript, and JavaScript as potential
host languages. As awesome as this sounds, wouldn't it first require a
native implementation to be created for each language prior to Clojure
in Clojure running on the platform? Perhaps there's some magic
bootstrapping stuff that can be done to avoid a full port? I'm also
wondering if Clojure would take a big performance hit as a result of
being self-hosted? Either way, this seems like a really neat idea.

If anyone wants to see the video, it's here:

(talks about Clojure in Clojure at around 47:00)
http://blip.tv/file/2301367

Paul Mooser

Jul 9, 2009, 12:07:43 PM
to Clojure
Since Clojure is a compiled language, and is going to just end up
generating Java bytecode, I wouldn't expect it to be noticeably
slower if it were written in itself. Maybe that's naive?

Daniel Lyons

Jul 9, 2009, 12:18:16 PM
to clo...@googlegroups.com


It's not naive. This is called self-hosting, and it is very common in
programming language implementation. To be safe, one often retains a
stub compiler for some subset of the language, written in another
language, and then implements the rest of the language in that
subset. This is what GHC does for Haskell with Core and PyPy does
with RPython for Python (though GHC ultimately converts all Haskell
into Core before compiling it). GCC works similarly: the host's C
compiler first builds xgcc, a first-stage GCC, and then GCC
recompiles itself with that, which is why a full bootstrap is such a
time-consuming process.

Another approach is to go whole-hog and depend on a previous version
of the language to build the language. This is what CMU Common Lisp
has been doing (not sure if they've changed this recently or not). I
think Erlang is in a similar situation (the original host language was
Prolog, believe it or not).

Other languages retain a C or Java implementation forever. This is the
approach of scripting languages such as Python and Ruby.
There's nothing wrong with that either.

IMO, the principal advantages of self-hosting are that it forces you
to optimize in places you might not want to and that it gives you a
nice language to write your language in. :) It's also a good exercise
in general, and it makes it easier for someone who only knows the
language to work on the language.


Daniel Lyons

tmountain

Jul 9, 2009, 12:33:36 PM
to Clojure
> To be safe one often retains a
> stub compiler for some subset of the language written in another
> language, and then implements the rest of the language in the stub
> version.

This makes a lot of sense. So basically, a subset of Clojure could be
ported to whatever language you'd want to target, and then that could
be used to bootstrap the rest of the language? Sounds like a neat
route to go.

Mark Addleman

Jul 9, 2009, 2:08:45 PM
to Clojure
By the by, I believe Squeak Smalltalk has a 'compiler' written in
Squeak that it uses to generate C code which is then used to bootstrap
the rest of the language.

John Harrop

Jul 9, 2009, 4:24:17 PM
to clo...@googlegroups.com
On Thu, Jul 9, 2009 at 11:10 AM, tmountain <TinyMo...@gmail.com> wrote:

> I just finished watching the Bay Area Clojure Meetup video, and Rich
> spent a few minutes talking about the possibility of Clojure in
> Clojure. The prospect of having Clojure self-hosted is incredibly
> cool, but it brought a few questions to mind. For one, Rich mentions
> that it would potentially open up additional target platforms for the
> language, citing Objective-C, ActionScript, and JavaScript as potential
> host languages. As awesome as this sounds, wouldn't it first require a
> native implementation to be created for each language prior to Clojure
> in Clojure running on the platform? Perhaps there's some magic
> bootstrapping stuff that can be done to avoid a full port? I'm also
> wondering if Clojure would take a big performance hit as a result of
> being self-hosted? Either way, this seems like a really neat idea.

The difficult thing would be preserving the inability of bad Clojure code to crash the process, and most especially, providing all of Swing, AWT, JDBC, JAXP, and all of the rest of the goodies from the Java class library. Being JVM-hosted has its advantages.

Wilson MacGyver

Jul 10, 2009, 12:14:04 AM
to clo...@googlegroups.com
Yeah, for me, being on the JVM is one of Clojure's biggest selling points.
I don't know that I would've learned and used Clojure were it not on the
JVM.
--
Omnem crede diem tibi diluxisse supremum.

Chouser

Jul 11, 2009, 12:17:27 AM
to clo...@googlegroups.com
On Thu, Jul 9, 2009 at 4:24 PM, John Harrop<jharr...@gmail.com> wrote:
>
> The difficult thing would be preserving the inability of bad Clojure code to
> crash the process, and most especially, providing all of Swing, AWT, JDBC,
> JAXP, and all of the rest of the goodies from the Java class library. Being
> JVM-hosted has its advantages.

I don't think Clojure will be abandoning the JVM any time
soon. There haven't been many specifics anywhere about
what Clojure-in-Clojure actually is, so I wrote up what
I think I know. I hope it helps:

http://blog.n01se.net/?p=41

--Chouser

Tom Faulhaber

Jul 11, 2009, 1:33:36 AM
to Clojure
> As awesome as this sounds, wouldn't it first require a
> native implementation to be created for each language prior to Clojure
> in Clojure running on the platform?

No, you don't need to write a native port for each platform.

Typically, you break the compiler into two broad parts: the
platform-independent compiler and the code generator for each platform.
(Each of these has several parts internally.)

The idea is that most of the compiler is in the platform-independent
part. This creates some sort of intermediate (but fairly low-level)
representation of your program. Then you have a smaller part for each
system that generates the exact instruction set for that system. So
you have one big compiler/library bundle and n code generators, one
for each platform. All of this is written in the source language (in
this case Clojure).
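
As a rough illustration of that split (not how Clojure's compiler is actually organized), here is a minimal Python sketch. It assumes a toy language of parenthesized `+` and `*` expressions; every name in it (`tokenize`, `parse`, `emit_python`, `emit_stack`, `OPCODES`) is made up for the example. One shared front end produces an intermediate form, and two small code generators consume it.

```python
# Toy sketch only: one platform-independent front end, two code generators.
# All function names and the "stack machine" target are hypothetical.

def tokenize(src):
    """Split "(+ 1 (* 2 3))" into ["(", "+", "1", "(", "*", ...]."""
    return src.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Front end: turn tokens into a nested-list intermediate form."""
    tok = tokens.pop(0)
    if tok == "(":
        form = []
        while tokens[0] != ")":
            form.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return form
    # Integers become ints; anything else stays a symbol string.
    return int(tok) if tok.lstrip("-").isdigit() else tok

def emit_python(form):
    """Code generator 1: emit Python expression text."""
    if isinstance(form, list):
        op, *args = form
        return "(" + f" {op} ".join(emit_python(a) for a in args) + ")"
    return str(form)

OPCODES = {"+": "ADD", "*": "MUL"}

def emit_stack(form):
    """Code generator 2: emit postfix instructions for an imaginary stack machine."""
    if isinstance(form, list):
        op, first, *rest = form
        code = emit_stack(first)
        for arg in rest:
            code += emit_stack(arg) + [OPCODES[op]]
        return code
    return [f"PUSH {form}"]

ir = parse(tokenize("(+ 1 (* 2 3))"))  # shared intermediate form
python_text = emit_python(ir)           # target 1
stack_code = emit_stack(ir)             # target 2
```

In this sketch, adding a third target would mean writing one more `emit_*` function; `tokenize` and `parse` stay untouched.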

Now, the important thing is that the code generator for a target
machine X doesn't have to run *on* machine X. It can run on any
machine that supports the source language.

So say we have Clojure-in-Clojure on the JVM and we decide that we
want to run Clojure in Flash by compiling to ActionScript. What we
need to do is create an ActionScript code generator for the Clojure
compiler. Then, on the JVM, we compile all of Clojure using the
ActionScript code generator instead of the JVM code generator. The
resulting output *is* native Clojure for Flash. Typically (though not
always), you'll then recompile the compiler on Flash itself, using the
Flash-hosted compiler, to produce a "pure" Flash compiler.

Thus, we never go through a separate "bootstrap" process on each new
target platform. Instead, we start by building a cross-compiler (which
is identical to the native compiler, but running on a platform other
than the target) and use that to get our bootstrap.
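
The cross-compilation idea can be sketched in a few lines of Python, under loud assumptions: the Python process stands in for the host (the JVM in the discussion), JavaScript stands in for the target (ActionScript in the discussion), and the function `emit_js` and the nested-list program format are invented for illustration. The point is only that the generator runs on the host and produces code that the host itself never executes.

```python
# Hypothetical sketch: this Python process is the "host"; the emitted
# JavaScript text is for the "target". Nothing here runs the JavaScript.

def emit_js(form):
    """Translate a nested-list form into JavaScript source text."""
    if isinstance(form, list):
        head = form[0]
        if head == "defn":
            # (defn name (params...) body) -> function name(params) { return body; }
            _, name, params, body = form
            return f"function {name}({', '.join(params)}) {{ return {emit_js(body)}; }}"
        if head in ("+", "-", "*"):
            # (op a b ...) -> (a op b ...)
            return "(" + f" {head} ".join(emit_js(a) for a in form[1:]) + ")"
        # Anything else is treated as a call: (f a b) -> f(a, b)
        return f"{head}({', '.join(emit_js(a) for a in form[1:])})"
    return str(form)

# A tiny "program" in the assumed nested-list format.
program = [
    ["defn", "square", ["x"], ["*", "x", "x"]],
    ["defn", "sum_of_squares", ["a", "b"],
     ["+", ["square", "a"], ["square", "b"]]],
]

# Compile on the host; the result is plain text destined for the target.
js_source = "\n".join(emit_js(form) for form in program)
```

The host only ever handles `js_source` as text; running it is the job of a JavaScript engine on the target side. If the program being compiled were the compiler itself, the emitted text would be a compiler that runs on the target.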

In reality, porting Clojure-in-Clojure will be more difficult than
this because of things like differences in GC between platforms. Also,
I suspect the first versions of Clojure-in-Clojure won't be quite so
nicely divided as that. But this is the basic theory.

Tom

Jules

Jul 11, 2009, 5:36:34 AM
to Clojure
Another potential problem is the data structure library. Can you
implement vectors, maps, etc. in Clojure with acceptable performance?

Jules