Clojure and C++ and a bit more


nathaniel

Dec 20, 2009, 1:22:05 PM
to Clojure
Hi: I've recently discovered Clojure and have loosely followed some
of the discussions here. First of all, I think Clojure is a great
language, since I also love Lisp, and I feel that the Java platform is
the most robust for web development. But I perhaps come from a
background a little different than most Clojure fans because in most
situations my language of choice is C++. I've therefore thought about
how Clojure and C++ can benefit from each other. I actually have two
ideas:

First, I've been working on a parsing library for C++ which is
actually based on Perl6, in the sense that I try to implement a Perl6-
like rule system in C++. I decided to make Clojure the basis of the
"output" language for these grammars. In other words, a grammar (a
set of named rules) is compiled into a C++ class, with rule names
becoming methods. This class exposes a method which takes a given
file or string and matches it against its rules. The results of these
matches are then presented as Clojure-compatible data structures. If
there are Clojure functions or macros matching the rule names, the
output could be directly executed. In effect, any structured text for
which a grammar has been defined can be compiled to Clojure code. I
think there are several applications for this technology, such as
using Clojure for data transfer, and creating a Lisp-like macro system
for non-Lisp-like languages.
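
A minimal sketch of what such a generated class might look like, written by hand for a two-rule toy grammar. Everything here is an assumption made for illustration: the class name SumGrammar, the rules "sum" and "number", and the choice to emit the result as a Clojure-readable string. It is not the library described above.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a generated grammar class.
// Illustrative grammar:
//   sum    ::= number ('+' number)*
//   number ::= digit+
class SumGrammar {
public:
    // Matches the whole input. On success, writes a Clojure-readable form
    // such as "(sum (number 1) (number 2))" to out and returns true.
    bool match(const std::string& input, std::string& out) {
        text_ = input;
        pos_ = 0;
        std::string form;
        if (!sum(form) || pos_ != text_.size()) return false;
        out = form;
        return true;
    }

private:
    // Rule "sum": one number followed by zero or more "+ number" pairs.
    bool sum(std::string& form) {
        std::string item;
        if (!number(item)) return false;
        form = "(sum " + item;
        while (pos_ < text_.size() && text_[pos_] == '+') {
            ++pos_;
            if (!number(item)) return false;
            form += " " + item;
        }
        form += ")";
        return true;
    }

    // Rule "number": one or more decimal digits.
    bool number(std::string& form) {
        const std::size_t start = pos_;
        while (pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
        }
        if (pos_ == start) return false;
        form = "(number " + text_.substr(start, pos_ - start) + ")";
        return true;
    }

    std::string text_;
    std::size_t pos_ = 0;
};
```

Each rule name becomes the head of a list in the output, so if Clojure functions named sum and number were defined by the user, the emitted form could be read and evaluated directly.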

What I'd like to do is use this to create C++ extensions, syntactic
shortcuts I can put in C++ code to make it more readable, and as hints
for code generators to automate things like serialization, as if C++
had Java-style annotations. This might be a little like GNU's "Melt"
project ("Middle End Lisp Translator"), but using Clojure in lieu of
Melt's own language, which I think is Scheme-like but not compatible.
Using Clojure simplifies both portability (Melt is only on Unix etc.)
and storing information about source code and compilation-related info
in a database.

The other thing I've been thinking about is whether a full C++
implementation of Clojure is feasible. I've studied some low-level
Clojure implementation classes and I have not seen any prohibitive
challenges to porting Clojure to C++, much as Xronos tried to port it
to .NET. Obviously Clojure will not have access to arbitrary C++
classes, but (maybe using the parsing tools I described above), there
could be easy-to-use development tools to give C++ classes Clojure-
interoperability capabilities. Does anyone know of Clojure features
which rely on Java features that would be prohibitively difficult to
implement in C++? Even a restricted subset of Clojure would be
helpful: I've actually gotten a fair amount of Clojure code to run as
Lisp, with some readtable manipulation, so there's a Lisp-like subset
of Clojure which doesn't rely on any Java features to run.

While I love Clojure, I think the development tools for Clojure are a
little primitive, and I've sensed a little frustration along these
lines from some posts here. Members of this group may disagree, but
I'm still not sold on Java as an application development language (as
opposed to a web development language), and IDEs which run on Java
seem a little slower and harder to use than native-compiled ones. So a
native Clojure IDE might just make Clojure programming a little
easier, but there'd obviously have to be a native Clojure runtime.
Perhaps if this project is actively promoted or advocated for, there'd
be a chance to bring a wider pool of programmers under the Clojure
umbrella, those coming from C++ or Lisp or maybe even Arc, Haskell,
etc.

pmf

Dec 21, 2009, 9:09:17 AM
to Clojure
On Dec 20, 7:22 pm, nathaniel <nathan...@photino.org> wrote:
> Does anyone know of Clojure features
> which rely on Java features that would be prohibitively difficult to
> implement in C++?

You might run into the problem that any C++ garbage collector you find
will probably not be quite as efficient as the JVM's garbage collector
(I don't think it would be possible to implement Clojure without a
GC). Additionally, a lot of Clojure's concurrency features rely on
Java's concurrency mechanisms and how these are mapped to the JVM's
concurrency semantics and memory model, for which you will also have
to find a suitable C++ library.
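
To make the concurrency point concrete, here is a rough sketch of how the retry loop behind a Clojure atom's swap! could be expressed in C++. It assumes a compiler with std::atomic from C++11, which postdates this thread; the class name IntAtom and its methods are hypothetical, and it holds only an int rather than an arbitrary immutable value.

```cpp
#include <atomic>

// Hypothetical analogue of a Clojure atom that holds an int.
class IntAtom {
public:
    explicit IntAtom(int initial) : value_(initial) {}

    // Reads the current value.
    int deref() const { return value_.load(); }

    // Like swap!: applies f to the current value and installs the result
    // with compare-and-swap. If another thread changed the value in the
    // meantime, f is applied again to the fresh value, so f may run more
    // than once and should be free of side effects.
    template <typename F>
    int swap(F f) {
        int old_value = value_.load();
        int new_value = f(old_value);
        while (!value_.compare_exchange_weak(old_value, new_value)) {
            // On failure, compare_exchange_weak reloads old_value.
            new_value = f(old_value);
        }
        return new_value;
    }

private:
    std::atomic<int> value_;
};
```

Holding a pointer to an immutable structure instead of an int is where the garbage-collection question comes back in, since the old value may still be in use by other threads when it is replaced.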

mac

Dec 22, 2009, 6:43:22 AM
to Clojure

Yes, if you want to port Clojure to native code it might be easier to
use a host language that already has these features but closely
resembles C++.
Candidates are D and the new Google Go.
Go in particular seems interesting because one of their goals is to
make a very efficient GC and the language is somewhat multicore aware.

/mac

Andrzej

Dec 24, 2009, 2:23:28 AM
to clo...@googlegroups.com
On Tue, Dec 22, 2009 at 8:43 PM, mac <markus.g...@gmail.com> wrote:
>
>> You might run into the problem that any C++ garbage collector you find
>> will probably not be quite as efficient as the JVM's garbage collector
>> (I don't think it would be possible to implement Clojure without a
>> GC). Additionally, a lot of Clojure's concurrency features rely on
>> Java's concurrency mechanisms and how these are mapped to the JVM's
>> concurrency semantics and memory model, for which you will also have
>> to find a suitable C++ library.
>
> Yes, if you want to port Clojure to native code it might be easier to
> use a host language that already has these features but closely
> resembles C++.
> Candidates are D and the new Google Go.
> Go in particular seems interesting because one of their goals is to
> make a very efficient GC and the language is somewhat multicore aware.

Is a generic GC the best tool for managing persistent data
structures? I imagine that each such data structure could be, for
example, stored in a separate circular data buffer (so that new chunks
can overwrite old ones without added effort from the GC).

Clojure exercises the GC quite a lot, but it does so in a somewhat
predictable manner (it uses a lot of linked data structures). In
contrast, standard GCs are tuned for managing a smaller number of
irregular and relatively independent objects. Maybe writing a custom
GC is not such a bad idea after all?

Andrzej

atucker

Dec 24, 2009, 12:14:50 PM
to Clojure
I am also curious about this. Apologies, possibly naive question
ahead :)

My background is in C++. By choosing to work with immutable values
(i.e. with a lot of consts), I found that I was able to avoid most
explicit memory management, pointers etc. Exceptions were:

(a) when interfacing with other people's code, and
(b) when making optimisations, e.g. to save a needless copy-on-return.

Otherwise everything happened behind the scenes as variables with
"automatic" lifespan, passing into and out of scope, had their memory
allocated and deallocated on the stack.

Surely this is the most performant approach to memory management where
it is possible? And doesn't Clojure's pure-functional ethos make it
possible?

Alistair
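
A small sketch of the style described above, assuming C++11 for the initializer list. The function names are made up for illustration. Note that a std::vector keeps its elements on the heap, but their lifetime is still tied to the scope of the variable, so no explicit new or delete appears.

```cpp
#include <numeric>
#include <vector>

// Returns a new vector instead of mutating its argument.
std::vector<int> with_appended(const std::vector<int>& xs, int x) {
    std::vector<int> result(xs);  // explicit copy of the input
    result.push_back(x);
    return result;  // typically moved or elided, not copied again
}

int sum_of_two_versions() {
    const std::vector<int> a = {1, 2, 3};
    const std::vector<int> b = with_appended(a, 4);
    // a is unchanged, and both vectors are released automatically when
    // this function returns.
    return std::accumulate(a.begin(), a.end(), 0) +
           std::accumulate(b.begin(), b.end(), 0);
}
```

The copy inside with_appended is the price of this approach: every "modified" version duplicates the whole container, which is the cost that shared structure is meant to avoid.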



mac

Dec 25, 2009, 12:38:38 PM
to Clojure

On Dec 24, 6:14 pm, atucker <agjf.tuc...@googlemail.com> wrote:
> I am also curious about this.  Apologies, possibly naive question
> ahead :)
>
> My background is in C++.  By choosing to work with immutable values
> (i.e. with a lot of consts), I found that I was able to avoid most
> explicit memory management, pointers etc.  Exceptions were:
>
> (a) when interfacing with other people's code, and
> (b) when making optimisations, e.g. to save a needless copy-on-return.
>
> Otherwise everything happened behind the scenes as variables with
> "automatic" lifespan, passing into and out of scope, had their memory
> allocated and deallocated on the stack.
>
> Surely this is the most performant approach to memory management where
> it is possible?  And doesn't Clojure's pure-functional ethos make it
> possible?
>
> Alistair
>

It's great until you want to have more than one reference to a
particular piece of data.
Then you have two options:
a) Copy
b) Make another pointer to it

Option a) is often good enough. But if it's a huge object (like sound
or picture data) you probably can't afford copying it.

Option b) means that you now have lost the ability to manage the
memory through scope alone. You will need manual management or some
kind of GC mechanism. In C++ you often use boost::shared_ptr<> but
that can't handle circular references so there is still a manual
management component to it.

When option a) is too expensive and option b) is too complicated,
that's when you need a real GC.
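
The circular-reference limitation can be sketched as follows. This uses std::shared_ptr and std::weak_ptr from C++11 rather than boost::shared_ptr, on the assumption that the two behave the same way here; the Node type and function names are invented for the example.

```cpp
#include <memory>

struct Node {
    std::shared_ptr<Node> next;  // owning reference
    std::weak_ptr<Node> prev;    // non-owning reference
};

// Builds a two-node cycle out of owning pointers and reports whether the
// nodes were freed when the local handles went out of scope.
bool cycle_of_owners_is_freed() {
    std::weak_ptr<Node> observer;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;  // a and b now keep each other alive
        observer = a;
    }
    const bool freed = observer.expired();
    // Manual step: break the cycle by hand so the nodes are released.
    if (auto a = observer.lock()) {
        a->next.reset();
    }
    return freed;
}

// Same shape, but the back link is weak, so it does not own its target.
bool cycle_with_weak_link_is_freed() {
    std::weak_ptr<Node> observer;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->prev = a;  // back link does not keep a alive
        observer = a;
    }
    return observer.expired();
}
```

The first function returns false and the second returns true. Deciding which links should be weak, or resetting them by hand as in the first function, is the manual component that remains with reference counting.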

atucker

Dec 28, 2009, 12:57:51 PM
to Clojure
I see. Thanks, that makes a lot of sense.

So just because this sort of multiple reference isn't explicit (or
even visible) in Clojure, that doesn't mean it's not happening. Under
the hood, a derived data structure is more than likely to share memory
with its progenitor. And that's for very good performance reasons
(option b in my C++ comparison).

But doesn't all that data sharing happen in tree-like structures
under the careful control of the compiler? I find myself wondering
whether circular references need remain a possibility in this
context. Perhaps we could still consider a simple reference-counting
mechanism (much like the one you mention, boost::shared_ptr<>).
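
A sketch of that idea, assuming C++11 and std::shared_ptr in place of boost::shared_ptr. The names Cell, List, and cons are chosen for illustration and are not taken from Clojure's implementation. Each cell is immutable once built, so it can only point at cells that already existed when it was created, and a cycle cannot be formed through these links.

```cpp
#include <memory>
#include <utility>

struct Cell;
using List = std::shared_ptr<const Cell>;

// One immutable cell of a singly linked persistent list.
struct Cell {
    int head;
    List tail;
    Cell(int h, List t) : head(h), tail(std::move(t)) {}
};

// Prepends a value. The existing list is shared, not copied, and its
// reference count goes up by one.
List cons(int head, List tail) {
    return std::make_shared<Cell>(head, std::move(tail));
}
```

For example, given List base = cons(2, cons(3, nullptr)), both cons(1, base) and cons(0, base) point at the same two cells, and those cells are freed when the last list referring to them goes away. This argument covers only eagerly built immutable data; whether it still holds once closures, lazy sequences, or mutable reference types are involved is a separate question.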

Thanks
Alistair

nathaniel

Dec 31, 2009, 4:01:35 PM
to Clojure
I'm trying to think of scenarios where circular references would be a
problem in Clojure. When does memory actually have to be allocated?
Inside a let block, most often. When lexically scoped variables are
passed to a function, their reference count increases as they are
bound to its parameters, but they're also immutable.

One thing about a language like Clojure (as compared to C++) is that
lexical scopes themselves can be first-class objects (at least from the
implementation's point of view). A let block could manage the memory
it needs for its own variables, and passing variables to a function
could be some kind of transaction between a let block and a lambda
block. An object would have only one reference from its initial let
block, and all other references would be from lambda blocks, which
would always call some cleanup method before the let block does.

Obviously things get more complicated when concurrency features like
refs and atoms get involved, but, as I understand it, there must be a
transaction object in memory for a Clojure variable to be nonconstant,
and that object could notify the let block that allocated memory for
the variable when that memory can be deallocated.

It sounds like the issues other writers have raised (and thanks a lot
for the feedback) are that concurrency in C++ would be slower than
concurrency in normal Java Clojure, but even if this is true, I am
interested in a C++ version of Clojure not primarily for speed, but for
the flexibility of not needing a JVM to run Clojure code.

BTW, does anyone know what kind of GC algorithms (reference counting or
thread-based or what) are used by other Lisps? I know Perl used to use
a reference-counting-based system, but Perl6 now says
"reference-counting is dead". Setting aside concurrency features, I'm
not sure how Clojure memory management would differ substantially from
other Lisps; Clojure let blocks use seqable objects to contain the
variables being declared, but I don't see how that would affect memory
management for them.

Mike Meyer

Dec 31, 2009, 7:29:46 PM
to clo...@googlegroups.com
On Thu, 31 Dec 2009 13:01:35 -0800 (PST)
nathaniel <nath...@photino.org> wrote:

> BTW, does anyone know what kind of GC algorithms (reference counting
> or thread-based or what) are used by other Lisps?

Reference-counting GCs in most LISPs are pretty much a thing of the
past. Between needing to do cycle detection and having to lock the
reference counters in concurrent environments, they just lose in too
many ways. Generational garbage collectors were big last time I
looked. Given today's large address spaces, DDI might even be
acceptable again.

<mike
--
Mike Meyer <m...@mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
