Observations from a real-world Clojure project

Jamie

Oct 29, 2009, 11:57:16 AM
to Clojure
First, thank you Rich & Co for doing Clojure.

We've been using Clojure for a real-world project, and I thought I'd
share some observations.

The project does some large-scale (but not really huge) data
analysis. Example: ~10M records are processed and transformed,
various computations occur, and ~100K records are spit out. Lots of
statistics. One type of run takes 12 hours on an 8GB, 4-core Linux
box. A few thousand lines of our Clojure code in total.

My background: I'm an old (?) Lisp hacker. I've used Symbolics, ACL,
CMUCL, schemes (including JScheme and SISC). Also Java since its
early days. And my share of C, C++, SQL, Perl, etc.

So here are some observations. None are news flashes.

1. Clojure works. We were obviously cautious about using a relatively
new system for real work. We tried Clojure, and then we tried some
more. Everything* just worked. We took a snapshot at 1.1.0, and we
didn't update it. [* The JVM does SEGV on us occasionally. Internal
error, which we have not diagnosed. Probably a GC issue. Java
1.6.0_14 with HotSpot 14.0-b16 64-bit server. But this problem is
another topic.]

2. Clojure-the-language is a good fit for real work. The language has
a distinct DWIM feel, which I like. Destructuring everywhere, auto-
gensym-ing, various syntactic sugar, and the advanced features (e.g.,
concurrency primitives with STM, tries) are all convenient, effective,
and useful. Clojure is practical.

3. We dropped into Java once to implement an LRU cache to back custom
memoization. We presumably could have implemented the cache in
Clojure, but Java seemed simpler. Probably mostly due to our
inexperience with Clojure. Clojure's Java interop is of course
excellent -- as claimed and widely reported.

4. Reminder: concurrent work on distinct data structures uses memory
concurrently! Obvious and obviously not specific to Clojure. But
Clojure makes it so easy to do things in parallel that it's easy to
forget the implications. We found ourselves having to do some
judicious doall's and such to avoid running out of memory in a
subsequent -- so to speak -- stage. (Imagine a parallel aggregation
of a lot of data that results in a small object, which is then
subsequently used with different big data. Might not be able to work
on the former and latter at the same time.) So watch out when you
work on distinct data in stages in parallel on that 100-core Tilera
board. Aside: We might like the option of selective non-laziness, but
we're not sure. That's yet another topic.

5. The functionality of the docs hasn't kept up with Clojure. We
often resorted to text searches of the various sources. Need links
and see-also's. Clojure has grown/matured so much that it needs a doc
system of some sort.

6. Debugging facilities also have not kept up with the state of
Clojure. We use Slime and JDB and some contributed tools, but the
result isn't that convenient. I miss the good Lisp debuggers, but I'm
old-fashioned I guess. We need to re-survey what's available for
tracing, logging, restarts, etc. A state-of-the-art tutorial would be
great.

7. We use Incanter (http://incanter.org/), which worked well for some
of the statistics we need. BTW, for different work, we still use
Mathematica, but we haven't yet needed the slick-looking Clojuratica
(http://clojuratica.weebly.com/). We probably will, and I'm eager to
use it. For us, Clojure is becoming the application-level, all-
purpose glue. We can't throw away Mathematica or Matlab or R, and we
don't need to.
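
To make observation 3 concrete, here is a minimal sketch of an LRU cache backing memoization in Java. It is not our actual code. The class names (LruCache, LruMemo, LruMemoDemo) are made up for illustration, and it assumes a modern JDK (Java 8 or later, for java.util.function and lambdas) rather than the Java 6 we run. It leans on LinkedHashMap's access-order mode, which is one common way to get LRU eviction.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Fixed-capacity LRU cache built on LinkedHashMap's access-order mode. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put; returning true evicts the least recently used entry.
        return size() > capacity;
    }
}

/** Wraps a function so repeated calls are served from an LRU cache. */
class LruMemo<K, V> implements Function<K, V> {
    private final Function<K, V> fn;
    private final LruCache<K, V> cache;

    LruMemo(Function<K, V> fn, int capacity) {
        this.fn = fn;
        this.cache = new LruCache<>(capacity);
    }

    // Synchronized because even get() reorders an access-ordered LinkedHashMap.
    // Simplification: a null result is never cached and is recomputed each time.
    @Override
    public synchronized V apply(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = fn.apply(key);
            cache.put(key, value);
        }
        return value;
    }

    synchronized int size() {
        return cache.size();
    }
}

class LruMemoDemo {
    public static void main(String[] args) {
        int[] calls = {0}; // counts how often the wrapped function really runs
        LruMemo<Integer, Integer> square =
            new LruMemo<>(n -> { calls[0]++; return n * n; }, 2);

        square.apply(3);
        square.apply(3); // cache hit, no recomputation
        square.apply(4);
        square.apply(5); // capacity is 2, so the entry for 3 is evicted
        square.apply(3); // recomputed after eviction
        System.out.println("computations: " + calls[0]); // prints "computations: 4"
    }
}
```

Since LruMemo is an ordinary Java class, it can be constructed and called from Clojure through the usual interop. The coarse lock is the simplest safe choice here; a workload with heavy parallel lookups might want something finer-grained.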

That's it for now. We'll look for ways we can contribute.

Thanks again for the excellent system.

--Jamie

Rich Hickey

Oct 30, 2009, 7:26:49 AM
to clo...@googlegroups.com
On Thu, Oct 29, 2009 at 11:57 AM, Jamie <jsm...@gmail.com> wrote:
>
> First, thank you Rich & Co for doing Clojure.
>
> We've been using Clojure for a real-world project, and I thought I'd
> share some observations.
>

Thanks - these kinds of experience reports are very useful.

Rich

Spencer Schumann

Oct 30, 2009, 3:15:48 PM
to Clojure
On Oct 29, 9:57 am, Jamie <jsmo...@gmail.com> wrote:
> 5. The functionality of the docs hasn't kept up with Clojure.  We
> often resorted to text searches of the various sources.  Need links
> and see-also's.  Clojure has grown/matured so much that it needs a doc
> system of some sort.

I recently started learning Clojure myself, and I agree that the docs
could use more work. The book "Programming Clojure" was a good
introduction to the language, but it's not a complete reference.

I'd like to see something like Lua's language manual for Clojure.
Lua's manual (http://www.lua.org/manual/5.1) is concise, complete, and
accurate. It's extremely helpful to have a single, de-facto guide to
every feature in the language. I wish every language had a freely
available document of similar quality, and perhaps it could serve as a
model for a Clojure language manual.

Tom Faulhaber

Oct 30, 2009, 6:25:16 PM
to Clojure

>
> 5. The functionality of the docs hasn't kept up with Clojure.  We
> often resorted to text searches of the various sources.  Need links
> and see-also's.  Clojure has grown/matured so much that it needs a doc
> system of some sort.
>


This has been recognized for a while now. I have promised to extend the
documentation system we use for clojure.contrib (see
http://richhickey.github.com/clojure-contrib/) to clojure itself. This
will mean that we have up-to-the-minute doc on the latest Clojure (as
well as info for historical versions and, eventually, all the
branches) with links to the source code, the ability to attach more
detailed docs, link to URLs, full index, etc.

This has been taking a while because my real life (both work and
personal) has been ridiculously crazy the last few months. But I have
begun to lean in on it and think I should have a preliminary version
running within the next couple of weeks.

The goal is that this system will be generalizable for all projects
and can be used as a sort of "javadoc" for Clojure libraries (with
extra bonuses like direct github support).

Tom