Re: Clojure for large programs

Mark Engelberg

Jul 2, 2011, 9:26:21 PM
to clo...@googlegroups.com
Ideally, I was hoping to start a more in-depth discussion about the
pros and cons of "programming in the large" in Clojure than just
waxing poetic about Clojure/Lisp's capabilities in the abstract :)

Yes, much of the initial excitement around Clojure comes from the
feeling of "Wow, I can do so much with so little code". But at some
point, all projects grow. I'm thinking that by now, there may be
enough people using Clojure in large projects and on large teams to
offer some good feedback about how well that works.

My Clojure codebase is somewhere around 2-3kloc and I already feel
like I'm bumping up against some frustration when it comes time to
refactor, maintain, and extend the code, all while keeping up with
ongoing changes to libraries, contrib structures, and Clojure
versions.

I want to hear war stories from those with even larger code bases than
mine. Has it proven to be a major hassle on large projects to avoid
circular dependencies in the modules? Is the lack of debugging
tools, documentation tools, and refactoring tools holding you back?
Anyone miss static typing?

One of my main gripes is that some of Clojure's built-ins return
nonsensical results (or nil), rather than errors, for certain classes
of invalid inputs. To me, one of the main benefits of functional
programming is that debugging is generally easier, in large part
because failures usually occur within close proximity of the flaw that
triggered the failure. Erlang, in particular, has really promoted the
idea of "fail fast" as a way to build robust systems. But Clojure's
lack of a "fail-fast" philosophy has burned me several times, with
hard-to-track-down bugs that were far-removed from the actual cause.
The larger my code grows, the more this annoys me, reminding me too
much of my days tracking down bugs in imperative programs.

One specific example of this is get, which returns nil whenever the
first input isn't something that supports get. For example, (get 2 2)
produces nil. This becomes especially problematic when you pass
something to get that seems like it should support get, but doesn't.
For example, (get (transient #{1}) 1) produces nil, when there's
absolutely no reason to think that (get (transient #{1}) 1) would
behave any differently from ((transient #{1}) 1).
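
A minimal sketch of the fail-slow behaviour described above. The functions port-of and describe are hypothetical names made up for illustration; only get itself is from the original post. The wrong argument type never raises an error, so the mistake only becomes visible later, as a missing value in the output.

```clojure
;; Hypothetical lookup: the caller was supposed to pass a map,
;; but passes a string by mistake.
(defn port-of [config]
  (get config :port))

(defn describe [config]
  ;; The bad input surfaces here, far from the call that caused it.
  (str "port: " (port-of config)))

(get 2 2)                     ; => nil, no exception
(port-of {:port 8080})        ; => 8080
(port-of "localhost:8080")    ; => nil, although the argument is not a map
(describe "localhost:8080")   ; => "port: "
```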

Brian Marick

Jul 2, 2011, 10:25:35 PM
to clo...@googlegroups.com

On Jul 2, 2011, at 8:26 PM, Mark Engelberg wrote:
> My Clojure codebase is somewhere around 2-3kloc and I already feel
> like I'm bumping up against some frustration when it comes time to
> refactor, maintain, and extend the code, all while keeping up with
> ongoing changes to libraries, contrib structures, and Clojure
> versions.


I have a codebase with 2.6kloc of production code and 4.8kloc of tests, and I feel your pain (even despite having been a Lisp programmer in the early 80's). I'm not sure yet how to navigate the transition to 1.3 while retaining backwards compatibility. And organizing things into namespaces is something I still haven't figured out.

Russ Olsen said on this list: "The community behind a language and the techniques that it develops are as much a part of the language as the syntax." I think we, the community, need to step up and figure out these techniques and *publicize* them. I hope the core team can provide the infrastructure/support to make that work.

I was moderately heavily involved in the Ruby world starting in 2001 up until some time before Rails took the world by storm. There was a ton of inadvertent preparatory work done by people like Pragmatic Dave Thomas, Chad Fowler, Nathaniel Talbott, and Jim Weirich. We'd do well to learn from their oral histories of the early days of Ruby.

-----
Brian Marick, Artisanal Labrador
Contract programming in Ruby and Clojure
Occasional consulting on Agile
www.exampler.com, www.twitter.com/marick

Glen Stampoultzis

Jul 2, 2011, 10:52:40 PM
to clo...@googlegroups.com
On 3 July 2011 11:26, Mark Engelberg <mark.en...@gmail.com> wrote:
> But Clojure's
> lack of a "fail-fast" philosophy has burned me several times, with
> hard-to-track-down bugs that were far-removed from the actual cause.
> The larger my code grows, the more this annoys me, reminding me too
> much of my days tracking down bugs in imperative programs.

 
I wonder if many people use the pre and post assertions when coding Clojure?  Assertions (& pre/post-conditions) seem to have lost favour as a go-to tool for programmers.  Most coders instead seem to go for unit testing exclusively.  It seems to me that assertions could provide a lot of benefit in dynamic programs, both as a way to fail fast and as a way to document intention. In my limited experiments with them I've found them to be helpful.  I do wish Clojure would give greater detail as to what went wrong when an assertion fails, though.  See Groovy's assert statement for an example of a very helpful error report [1].
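
For readers who have not used them, here is a minimal sketch of pre- and post-conditions on a function. The function average is a hypothetical example, not code from this thread. The condition map goes right after the argument vector, and a failing condition throws an AssertionError at the call, assuming *assert* has its default value of true.

```clojure
(defn average
  "Mean of a non-empty collection of numbers."
  [xs]
  {:pre  [(seq xs) (every? number? xs)]   ; checked before the body runs
   :post [(number? %)]}                   ; % is the return value
  (/ (reduce + xs) (count xs)))

(average [1 2 3])   ; => 2
;; (average [])     ; throws AssertionError instead of dividing by zero
;; (average [1 :a]) ; throws AssertionError instead of failing inside +
```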

Milton Silva

Jul 2, 2011, 11:11:22 PM
to Clojure
I have done 4 projects. One was with another person and came to ~1k
lines (with a lot of Java calls). The others were done solo and ranged
from ~300 to ~500 lines.

Now that I think about it, that (fail-slow/returns nil) behaviour also
really annoys me, but it normally only pops up when I have lots of side
effects chained (which makes the code a lot harder to test, and
generally leads to a lot more code being written before it's tested).

Otherwise, I iteratively build things on the REPL. This, coupled with
unit tests, forces a bottom-up approach, and partially because of that
the code I write is generally a lot more maintainable than Java code.

I would definitely like to hear from people with more experience to
understand if, how and why things change as the projects get larger
(in team size and code size).

Luc Prefontaine

Jul 2, 2011, 11:19:40 PM
to clo...@googlegroups.com
On Sat, 2 Jul 2011 18:26:21 -0700
Mark Engelberg <mark.en...@gmail.com> wrote:

> Ideally, I was hoping to start a more in-depth discussion about the
> pros and cons of "programming in the large" in Clojure than just
> waxing poetic about Clojure/Lisp's capabilities in the abstract :)
>
> Yes, much of the initial excitement around Clojure comes from the
> feeling of "Wow, I can do so much with so little code". But at some
> point, all projects grow. I'm thinking that by now, there may be
> enough people using Clojure in large projects and on large teams to
> offer some good feedback about how well that works.
>
> My Clojure codebase is somewhere around 2-3kloc and I already feel
> like I'm bumping up against some frustration when it comes time to
> refactor, maintain, and extend the code, all while keeping up with
> ongoing changes to libraries, contrib structures, and Clojure
> versions.

We have over 6.5K lines of Clojure (src only), still growing, and it's all structured with namespaces.
We still have a mixed code base here (Java + Clojure + JRuby), and we already had
namespaces to structure the code.
The code base is structured in 10 different projects.
We use Eclipse and CounterClockWise for dev. Dev coding/testing is done in Eclipse
by specifying projects in dependencies.

We use Leiningen to build these for Q/A and prod.

Moving from 1.0 to 1.2 was not painful. We did it methodically. With basic tests in each
project, we spotted issues quite fast. We rolled this over a week roughly.

>
> I want to hear war stories from those with even larger code bases than
> mine. Has it proven to be a major hassle on large projects to avoid
> circular dependencies in the modules? Is the lack of debugging
> tools, documentation tools, and refactoring tools holding you back?
> Anyone miss static typing?

Again, using namespaces and individual projects is the key here to avoiding circular dependencies.

We do not miss static typing at all, in fact we are in the process of
getting rid of the Java code. The goal is to clear this by next fall.

For debugging when it's serious, we use the Eclipse JVM debugger
and look at the Clojure runtime context when needed.

As far as documentation tool we rely on (doc ...) and document our code accordingly.

Since the code ratio versus Java is around one to 10, refactoring is not
a big deal even without the heavy assistance you may get in Java from
your IDE.


>
> One of my main gripes is that some of Clojure's built-ins return
> nonsensical results (or nil), rather than errors, for certain classes
> of invalid inputs. To me, one of the main benefits of functional
> programming is that debugging is generally easier, in large part
> because failures usually occur within close proximity of the flaw that
> triggered the failure. Erlang, in particular, has really promoted the
> idea of "fail fast" as a way to build robust systems. But Clojure's
> lack of a "fail-fast" philosophy has burned me several times, with
> hard-to-track-down bugs that were far-removed from the actual cause.
> The larger my code grows, the more this annoys me, reminding me too
> much of my days tracking down bugs in imperative programs.

Where did you find the link between functional languages and close proximity of
errors? That's a language design decision. You may want to use assertions
on your fns to validate inputs. That should improve your ability to track errors
before they carry things too far from the spot where it failed.
I would not trade this for systematic exception reporting.

>
> One specific example of this is get, which returns nil whenever the
> first input isn't something that supports get. For example, (get 2 2)
> produces nil. This becomes especially problematic when you pass
> something to get that seems like it should support get, but doesn't.
> For example, (get (transient #{1}) 1) produces nil, when there's
> absolutely no reason to think that (get (transient #{1}) 1) would
> behave any differently from ((transient #{1}) 1).
>

The choice was made not to throw exceptions. Agreed, it may feel frustrating
at the beginning. That's a choice that accommodates some people while
frustrating the others.

For your specific case, the first arg does not support the interface that get expects,
however you may do this:

(get 1 1 "WHATTHE...")

The third parameter is the "not found" value. That may shed some light if your code starts to carry this value
elsewhere. Or add assertions to your fns or create a wrapper fn.
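
One way such a wrapper fn could look, as a sketch only. The name strict-get and the ::not-found sentinel are assumptions made up for this example, not part of Clojure. It relies on the three-argument form of get shown above, so a key that is present with a nil value is still distinguished from a failed lookup.

```clojure
(defn strict-get
  "Like get, but throws when the key is absent or when coll does not
  support lookup at all. Hypothetical helper, not a core function."
  [coll k]
  (let [v (get coll k ::not-found)]
    (if (= v ::not-found)
      (throw (IllegalArgumentException.
              (str "No value for key " (pr-str k) " in " (pr-str coll))))
      v)))

(strict-get {:a 1} :a)     ; => 1
(strict-get {:a nil} :a)   ; => nil, the key is present
;; (strict-get {:a 1} :b)  ; throws IllegalArgumentException
;; (strict-get 2 2)        ; throws IllegalArgumentException
```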

As for transient sets, pretty sure this is a bug in 1.2.1:

user=> (get (transient {:a 1}) :a)
1
user=> ((transient {:a 1}) :a)
1
user=> (get [1] 0)
1
user=> (get (transient [1]) 0)
1
user=> (get #{1} 1)
1
user=> (get (transient #{1}) 1)
nil <--- Oops...
user=>

Dunno if it is fixed in 1.3, no time to play with it these days.

--
Luc P.

================
The rabid Muppet

Sean Corfield

Jul 3, 2011, 4:13:27 AM
to clo...@googlegroups.com
On Sat, Jul 2, 2011 at 7:25 PM, Brian Marick <mar...@exampler.com> wrote:
> I have a codebase with 2.6kloc of production code and 4.8kloc of tests, and I feel your pain (even despite having been a Lisp programmer in the early 80's). I'm not sure yet how to navigate the transition to 1.3 while retaining backwards compatibility. And organizing things into namespaces is something I still haven't figured out.

Since I mostly work with 50-100kloc projects, I think 5-10kloc
projects are kinda small :)

Given the compression ratio between Clojure and other languages, I
have to say that I'm not very worried about dealing with 10kloc of
Clojure. It was reassuring to see comments about Emacs being 3 million
lines of code.
--
Sean A Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/
World Singles, LLC. -- http://worldsingles.com/
Railo Technologies, Inc. -- http://www.getrailo.com/

"Perfection is the enemy of the good."
-- Gustave Flaubert, French realist novelist (1821-1880)

Sean Corfield

Jul 3, 2011, 4:37:24 AM
to clo...@googlegroups.com
On Sun, Jul 3, 2011 at 1:13 AM, Sean Corfield <seanco...@gmail.com> wrote:
> On Sat, Jul 2, 2011 at 7:25 PM, Brian Marick <mar...@exampler.com> wrote:
>> I'm not sure yet how to navigate the transition to 1.3 while retaining backwards compatibility.

At World Singles we moved to 1.3 pretty much as a matter of course,
mostly because I'm used to "planning for the future" and trying to
work with (b)leading edge builds. When I was at Macromedia, I pushed
hard for us to take prerelease versions of our own products live so
that we could get early real world feedback on them. Since then I've
always tried to work with the latest version of tools because that
brings both the best set of features as well as allowing more
influence and more input on tools - and the feedback is useful to the
projects.

The biggest problem has been 3rd party libraries being slower to move
to 1.3. I was pleased to see Chas Emerick's tweet about this recently,
because he's working hard to ensure all the libraries referenced in
his book are all up to date. At World Singles, we're using CongoMongo
and I approached that team and they were very open about changes to
enable it to run on 1.2 and 1.3. More recently we wanted to use
clojure-csv and, again, the folks behind that were keen to get
compatible with 1.3. Both libraries are working great for us on 1.3
now.

Overall, whilst there's clearly going to be a lot of churn getting
everyone up to 1.3, I think it's still early in Clojure's cycle and we
should all be a bit more aggressive about getting on to the latest
version.

Shantanu Kumar

Jul 3, 2011, 6:43:40 AM
to Clojure


On Jul 3, 7:52 am, Glen Stampoultzis <gst...@gmail.com> wrote:
> On 3 July 2011 11:26, Mark Engelberg <mark.engelb...@gmail.com> wrote:
>
> >  But Clojure's
> > lack of a "fail-fast" philosophy has burned me several times, with
> > hard-to-track-down bugs that were far-removed from the actual cause.
> > The larger my code grows, the more this annoys me, reminding me too
> > much of my days tracking down bugs in imperative programs.
>
> I wonder if many people use the pre and post assertions when coding Clojure?

I too think the real problem the OP is facing is due to not using pre/
post/assertions and also possibly due to non-idiomatic style of
writing code. Show us example code and maybe somebody can suggest how
to write the same code safer.

Regards,
Shantanu

James Keats

Jul 3, 2011, 7:26:44 AM
to Clojure


On Jul 3, 2:26 am, Mark Engelberg <mark.engelb...@gmail.com> wrote:
> Ideally, I was hoping to start a more in-depth discussion about the
> pros and cons of "programming in the large" in Clojure than just
> waxing poetic about Clojure/Lisp's capabilities in the abstract :)


I have yet to do a large program in Clojure, and I still need to be
convinced on the "ok, so far so good, but where is this going?"
question, but I have this to say: large programs are primarily an
architectural and secondarily a managerial/organizational concern, not
a language issue, and large programs have been my prime driving
consideration over the years.

In terms of "where is this going?", I would be quite concerned if the
clojure community develops an unreasonably negative attitude towards
OO (I don't believe Rich Hickey himself has a negative attitude, I
believe his attitude is reasonable and balanced) and, on the other
hand, believes it would "do well to learn from the oral histories of
the early days of Ruby". Well, I believe it would do well to learn
from Ruby, but as a cautionary tale.

All those little niggling issues you mention cause me no worry, either
they could be worked around - or perhaps even properly understood - or
they're easily fixable as the language implementation/tools mature,
with the exception of "in large part because failures usually occur
within close proximity of the flaw that triggered the failure"; I
mentioned some concerns I had about datatypes/protocols, I'm yet to
make my mind up on that, in particular with regard to "failures
usually occur within close proximity of the flaw", I still need to
study them more.

I wish the Clojure community to learn from two sources. 1) Java
itself, and in particular what is happening with the service component
architecture (SCA). Clojure makes those good engineering practices of
services and contracts feasible for a small team or even an individual
developer. I'm not saying that clojure would necessarily work with
those frameworks, for that I believe Scala is better positioned, but I
believe clojure should be mindful of what's happening there, as I
believe that to be the biggest threat and hurdle Clojure faces in
terms of its enterprise utility and adoption. The Service Component
Architecture is incredibly well thought out, and it already has
industry titans signed up to it. 2) RDF/OWL, or otherwise called the
resource oriented architecture, or the global giant graph. I believe
if Clojure plays its cards well then that - the semantic web - could be
its killer application. This too is a well thought out and compelling
architecture, and I believe Clojure is uniquely well positioned for
it.

Timothy Washington

Jul 3, 2011, 10:39:23 AM
to clo...@googlegroups.com
I'm using pre / post assertions quite a bit in a project I'm building. And I too would love to see better or custom error messages for each assertion. 

They do work great btw, as a way of failing fast. 

Tim 

Shantanu Kumar

Jul 3, 2011, 11:42:48 AM
to Clojure


On Jul 3, 7:39 pm, Timothy Washington <twash...@gmail.com> wrote:
> I'm using pre / post assertions quite a bit in a project I'm building. And I
> too would love to see better or custom error messages for each assertion.

That should be possible with a macro. For example, I use this:
https://bitbucket.org/kumarshantanu/clj-miscutil/src/acfb97c662d9/src/main/clj/org/bituf/clj_miscutil.clj#cl-1009

Maybe you need something like this(?):

(defmacro verify-my-arg
  "Like assert, except for the following differences:
  1. does not check for *assert* flag
  2. throws IllegalArgumentException"
  [err-msg arg]
  `(if ~arg true
     (throw (IllegalArgumentException. ~err-msg))))

Then use it thus:

(defn foo [m]
  {:pre [(verify-my-arg "m must be a map" (map? m))]}
  (println m))

Regards,
Shantanu

Christian Schuhegger

Jul 3, 2011, 12:21:39 PM
to Clojure
I still have to do my personal large scale project in Clojure, but I
would like to share my thoughts so far. (10 years ago I implemented a
60k Common Lisp project; I never worked on more than 5k Clojure code
so far; the C++ and Java projects I was involved in reached 800k to 1M
lines of code).

I think one important ingredient for large scale Clojure projects
would be the literate programming approach that Tim Daly described
some time ago:
https://groups.google.com/group/clojure/browse_thread/thread/460417fe45f314c3?hl=de
In addition I think midje is a very good unit testing framework. I am
currently working on a maven archetype that makes the construction of
projects with literate programming and midje easy.

The "toy project" that I am thinking of implementing in order to test
the large scale characteristics of Clojure is the ERP like system
described in "Java Modeling In Color With UML":
http://www.amazon.com/Java-Modeling-Color-UML-Enterprise/dp/013011510X/
The biggest trouble I have in larger Clojure projects that deal with
large data graphs is the lack of a "data schema" like a
Java class definition or a SQL DDL schema definition or an XML schema.
I just want to know what the data structures have to look like even
before creating the first instance. In addition, a "reflection"-like
query mechanism and other metadata like validation rules would
be helpful.
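
To make the wish concrete, here is a minimal sketch of a schema kept as plain data, written under the assumption that a map from key to predicate is enough. The names customer-schema and schema-errors are hypothetical and do not refer to any existing library. The schema can be inspected before any entity exists, and the same map drives validation.

```clojure
;; Hypothetical schema for a "customer" entity: key -> predicate.
(def customer-schema
  {:id    integer?
   :name  string?
   :email string?})

(defn schema-errors
  "Returns a map of key -> offending value for every key in schema
  whose value in entity is missing or fails its predicate."
  [schema entity]
  (into {}
        (for [[k pred] schema
              :let [v (get entity k)]
              :when (not (pred v))]
          [k v])))

;; "Reflection": the expected keys are known before any instance exists.
(set (keys customer-schema))
;; => #{:id :name :email}

(schema-errors customer-schema {:id 1 :name "Ann" :email "ann@example.com"})
;; => {}
(schema-errors customer-schema {:id "1" :name "Ann"})
;; => {:id "1", :email nil}
```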

I for myself came to the conclusion that Clojure is not made for large
nested data structures on its own. I personally feel much better with
a combined approach like the Functional Relational Programming
approach recently mentioned here on the group:
https://groups.google.com/group/clojure/browse_frm/thread/ba3da253f6358ac9?hl=de
http://web.mac.com/ben_moseley/frp/frp.html
I think one could use either SQL (e.g. an embedded h2sql database) or
something like datalog:
http://code.google.com/p/clojure-contrib/wiki/DatalogOverview
to describe "entities" and their relations. Those entities would be
maps in clojure with *few*! key value pairs. The nice characteristics
of Clojure as a pure functional language that deals with immutable
data structures and is easy to test would be kept.

You could use SQL or datalog as a query language to link the smaller
structures, the entities, together. You could even write a
runtime documentation feature that would display a graphical
representation of an E/R like diagram. Clojure would be the pure
functional aspect of the FRP approach and the data structures would be
handled by the relational algebra part.
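
A small sketch of the idea, with one substitution stated plainly: instead of SQL or datalog it uses the relational operations in clojure.set, which ship with Clojure. The entities and the function total-purchases are invented for illustration. Each entity is a small flat map, and the relation between entities is expressed by a join on the shared key rather than by nesting.

```clojure
(require '[clojure.set :as set])

;; Small, flat entities; relations are sets of maps.
(def customers #{{:customer-id 1 :name "Ann"}
                 {:customer-id 2 :name "Bob"}})

(def sales #{{:sale-id 10 :customer-id 1 :total 25}
             {:sale-id 11 :customer-id 1 :total 40}
             {:sale-id 12 :customer-id 2 :total 5}})

(defn total-purchases
  "Sum of the sale totals for the customer with the given name."
  [customer-name]
  (->> (set/join customers sales)            ; natural join on :customer-id
       (filter #(= customer-name (:name %)))
       (map :total)
       (reduce +)))

(total-purchases "Ann")   ; => 65
(total-purchases "Bob")   ; => 5
```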

I believe in what Alan Perlis said: "It is better to have 100
functions operate on one data structure than 10 functions on 10 data
structures." That's the reason why I have my difficulties with Python.
Nevertheless for large connected data graphs I think something like a
data-schema is needed. Clojure would still follow its approach to only
deal with maps, but there is a descriptive meta-data level in addition
that explains the connection between those maps.

I would agree with what was said elsewhere: the Clojure community has to
come up with idioms on how to deal with large scale projects.

James Keats

Jul 3, 2011, 3:46:02 PM
to Clojure


On Jul 3, 5:21 pm, Christian Schuhegger
<christian.schuheg...@gmail.com> wrote:

> Nevertheless for large connected data graphs I think something like a
> data-schema is needed. Clojure would still follow its approach to only
> deal with maps, but there is a descriptive meta-data level in addition
> that explains the connection between those maps.
>
> I would agree to what was said elsewhere: the Clojure community has to
> come up with idioms on how to deal with large scale projects.

Christian, your thoughts, generally speaking, chime with mine. I would
suggest though that the clojure community does not try to reinvent the
wheel where a well-engineered one has been made elsewhere (Rich
Hickey's reluctance to give clojure a yet-another-distribution/
clustering-story and instead suggest a look at existing ones is one of
the many reasons I believe he has admirable and reassuring
sensibilities: "Given the diversity, sophistication, maturity,
interoperability, robustness etc of these options, it's unlikely I'm
going to fiddle around with some language-specific solution."
http://groups.google.com/group/clojure/msg/4a7a866c45dc2101 - btw
Rich, if you're reading this, slightly on a tangent, I do agree that
I'm yet to be convinced by Erlang's distributed-at-all-time model,
which you expressed reservations about based on your com/dcom
experience. You may be interested to know that the SCA architecture
takes these into account, and this is detailed in Jim Marino's book
"Understanding SCA", which contains a discussion that echoes your
view, from which I'll quote: "It may seem odd that a technology
designed for building distributed applications specifies local service
contracts as the default when defined in Java. This was a conscious
decision on the part of the SCA authors. Echoing Jim Waldo’s seminal
essay, “The Fallacies of Distributed Computing,” location transparency
is a fallacy...")

Christian, with regard to large data graphs, metadata, datalog/prolog
and logic programming, I would suggest you take a look at RDF/OWL. It
is a burgeoning field of research and my view would be that the
Clojure community should embrace it rather than attempt to duplicate it.

Is Clojure's Json-esque data model suitable for large data graphs? No,
it isn't. Nor is sql or xml. That's the end of that story. But RDF/OWL
is specifically designed for that, and it is very well designed.
Clojure though is an ideal complement to that.

Brian Marick

Jul 3, 2011, 4:00:31 PM
to clo...@googlegroups.com

On Jul 3, 2011, at 3:13 AM, Sean Corfield wrote:
> Since I mostly work with 50-100kloc projects, I think 5-10kloc
> projects are kinda small :)


My point was that I'm running into interesting questions even with a small program. The answers are not obvious to me. There's evidence I'm not alone, so those to whom the answers *are* obvious would help the community by describing them.

* An example: organizing code into namespaces (skippable)

I was uncertain that Midje's "sweet" (syntactically sugared) interface would catch on, so I organized it by translation layers. I wrote the "unprocessed" layer first; it had functions that worked solely on maps. The "semi-sweet" layer provided macros that introduced some useful conventions but had only one syntactic innovation. It was easy to translate the `expect` and `fake` macros into "unprocessed" function calls on maps. Then I added the "sweet" layer that has a considerably more ambitious set of macros that translate `facts` into `expects` and `fakes`.

As time went on, I pulled out utility functions into namespaces like [midje.util thread-safe-var-nesting laziness file-position]. But that organization failed. When I divide things up into files, I want the division such that I usually find things in the first place I look. That wasn't happening.

So I started migrating to an organization based on verbs (this is a functional language, right?). So I have namespaces like [midje.midje-forms recognizing translating building]. Two problems: 1) New features require recognizing, translating, and building, so all the hopping around files was annoying. 2) The functions didn't fall into such clear-cut categories that I could reliably find things in the first place I look. (Unsurprising, since clear-cut categories are rare in nature: http://www.exampler.com/testing-com/writings/pnsqc-2005-communication.pdf)

Now I'm moving toward an organization around nouns, which feels a bit too OO to me, but at least I'm far enough in the project that the key concepts/nouns are likely to stay stable.

This progression feels a lot more wasteful than it would have been in Java (which has IDE support) or Ruby (which lets you mention a file once and have it be available throughout the program). So I'd have preferred to get it (more) right in the first place.

* What would help

It'd be useful for people happy with their multi-namespace codebases to volunteer them as exemplars. What's grouped together and why? What are the dependencies? How'd you arrive at this structure? A really interesting thing to do would be to implement a feature and narrate how you decide where to put things, where existing things must be, and so forth. [I spend a fair amount of time parachuting into projects and learning the code structure by pairing. Works pretty nicely.]

Sean Corfield

Jul 3, 2011, 4:22:16 PM
to clo...@googlegroups.com
On Sun, Jul 3, 2011 at 1:00 PM, Brian Marick <mar...@exampler.com> wrote:
> My point was that I'm running into interesting questions even with a small program. The answers are not obvious to me. There's evidence I'm not alone, so those to whom the answers *are* obvious would help the community by describing them.

I don't think there are obvious answers to most questions around large
programs. If those answers were obvious, we wouldn't have shelves full
of books talking about how to tackle the problems of large scale
software development :)

FWIW, at World Singles, we have namespaces for high-level concerns -
config, data, interop, logging - and nested namespaces either for
implementation (indicating only intended to be used from the API
namespace, e.g., worldsingles.config.impl.* files are only used by
worldsingles.config.* files) or specialization / layering, much like
you describe in midje (e.g., we have worldsingles.data.crud for a
high-level CRUD API for persistence that is exposed to our non-Clojure
code and worldsingles.data.crud.core which implements it and is
intended to be used elsewhere in our Clojure code). I expect we'll add
namespaces for more of our business concerns as our use of Clojure
expands: worldsingles.membership, worldsingles.search and
worldsingles.commerce are probably the three most obvious candidates
right now.

We're in an unusual place, I suspect, since we're inherently polyglot
so our top-level namespaces contain code we expose to non-Clojure code
and nested namespaces contain code we use internally within our
Clojure code. That said, I wouldn't be surprised if we refactored
extensively as our Clojure codebase grows larger and larger.
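
A rough sketch of the API-over-core layering described above. Every name here (myapp.data.crud, save-member!, the in-memory store) is hypothetical and is not the World Singles code. In a real project each namespace would live in its own file, such as src/myapp/data/crud/core.clj. They are shown together only so the example can be pasted into a REPL.

```clojure
;; Lower layer: generic persistence, meant for use from other Clojure code.
(ns myapp.data.crud.core)

;; Stand-in for a real database, an assumption made to keep this runnable.
(defonce ^:private store (atom {}))

(defn save-row! [table id row]
  (swap! store assoc-in [table id] row)
  row)

(defn find-row [table id]
  (get-in @store [table id]))

;; Upper layer: the small, stable API exposed to callers outside Clojure.
(ns myapp.data.crud
  (:require [myapp.data.crud.core :as core]))

(defn save-member! [id member]
  (core/save-row! :member id member))

(defn find-member [id]
  (core/find-row :member id))
```

Because the dependency runs in one direction only, from the API namespace down to the core namespace, a layout like this also avoids the circular dependencies asked about at the start of the thread.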

Phil Hagelberg

Jul 3, 2011, 5:44:28 PM
to clo...@googlegroups.com
Brian Marick <mar...@exampler.com> writes:

> This progression feels a lot more wasteful than it would have been in
> Java (which has IDE support) or Ruby (which lets you mention a file
> once and have it be available throughout the program). So I'd have
> preferred to get it (more) right in the first place.

Have you tried Slamhound? http://technomancy.us/148

It allows you to rebuild your ns clauses based on searching the
classpath for the vars that are referenced. I wrote it because we were
going through similar pains at work: shuffling things around in order to
improve modularity while trying not to break things. Basically it lets
you move a given defn and then it can automate the modifications needed
to ns forms to deal with the move. Of course, if you're renaming
functions or splitting them up it won't help, but it eased a lot of the
pain we were having.

-Phil

Christian Schuhegger

Jul 4, 2011, 12:42:20 AM
to Clojure
This is an unfinished thought: I think that the Single-Level-of-
Abstraction (SLA) principle promoted in OO needs to have a prominent
place in functional programming, too!

Each function should talk about the problem in its level of
abstraction, e.g. in its language. Functions related to the same level
of abstraction can be put into a package.

The problem of course is to understand what a level of abstraction is.
Therefore this thought is yet unfinished. Just a parallel I came
across some time ago. I prefer designing software along "features". A
lot of people have different ideas of what a feature is. I came across
a good way to identify features in the already mentioned book "Java
Modeling in Color with UML". They use a language template "<action>
the <result> <by|for|of|to> a(n) <object>" e.g. "Calculate the total
of a sale" or "Calculate the total purchases by a customer". The key
here was to find a language template. Language is at the core of this
concept. I feel that language may be also the core to find out a level
of abstraction.

In the past, a rule of thumb for me was to look for groups of functions
that I start to add a name prefix to, e.g. "excel-number-cell",
"excel-format", ... I put them in a package "...excel". In general I am
very reluctant to create namespaces. I often feel that namespaces are as
much in my way to create software as static type systems are. A
certain fraction of my brain is constantly involved in thinking about
"which grouping should I use" instead of thinking about the problem
and the level of abstraction at hand. By the way a similar experience
is between using C++ and using a language with a garbage collection. A
fraction of your brain is constantly busy with thinking about memory
allocation (extremely low level of abstraction) instead of the level
of abstraction you should think about.

Perhaps somebody else has already thought further in that direction,
and I would be happy to learn from your experience :)

Christian Schuhegger

Jul 4, 2011, 12:45:43 AM7/4/11
to Clojure
Thanks for your feedback. I already have RDF/OWL in my toolkit. I am
just not sure whether an ERP-like system should be modeled along those
lines. But I have not put enough thought into that direction yet. Would
you base an ERP-like system on top of RDF/OWL?

Mark Engelberg

Jul 4, 2011, 5:04:20 AM7/4/11
to clo...@googlegroups.com
On Sat, Jul 2, 2011 at 8:19 PM, Luc Prefontaine
<lprefo...@softaddicts.ca> wrote:
> Where did you find the link between functional languages and close proximity of
> errors ? That's a language design decision. You may want to use assertions
> on your fns to validate inputs. That should improve your ability to track errors
> before they carry things too far from the spot where it failed.
> I would not trade this for systematic exception reporting.

Sorry if I wasn't clear about this. One time I was rereading a book
about the art of debugging (I think it was this book:
http://www.amazon.com/Why-Programs-Fail-Second-Systematic/dp/0123745152),
and realized that the main theme of the book is that the #1 reason
that debugging is hard is that most bugs result from some sort of
mutation of state in one part of your code that inadvertently violates
some assumption or invariant you had in your mind. But your program
doesn't crash right away, it keeps quietly chugging along with that
corrupted state until some completely separate portion of your program
tries to do something with that data that no longer makes sense and
KA-BOOM. But the line your debugger shows you just shows you where
the crash happened; it can't show you the series of steps that led to
the corruption of state that actually caused the crash. Thus, you
need to do a lot of detective work and step through the program. This
is precisely why, for example, most programmers will gladly pay the
performance penalty for bounds-checking on array reads and writes --
it's incredibly valuable to have your program crash where the problem
actually occurs, rather than continuing for a while with spurious
values or corrupted memory and getting a delayed crash with no clear
connection to the cause.

I had a personal a-ha moment when I read that, which made me realize
that one of the reasons I enjoy functional programming so much more is
that this class of bug just doesn't happen. Generally speaking,
crashes have good locality with respect to the flaw in the code that
causes them because there's no "state" to get corrupted and eventually
cause a delayed crash.

Of course, often the hardest bugs of all to find are the ones that are
the result of deep logical flaws. The program may be an exact
implementation of what you had in mind, but what you had in mind
doesn't quite accomplish what you expected it to.

And that's the problem I have with some of Clojure's core functions --
they can turn a blatant mismatch (between a function's input
requirements and the inputs that actually get passed) into a deep
logical flaw. The get example I raised is a perfect example of this.
When I passed a transient set to a function that used get, I
reasonably assumed that transient sets implement whatever interface
get requires. But rather than raise an error because the object
didn't support the desired interface, get just returned nil -- which
is the exact same value that is returned in ordinary usage when you
test whether something is in the set and it isn't! So now, I have
sets that are quietly being passed around, and returning sensible
values but behaving as if they don't have any elements. What should
be an easy bug has turned into a deep logical flaw in my program.
Everything appears to be working, but my program generates completely
bogus outputs because at some stage of its processing it tested for
membership in a set and got back nil for something that was actually
in the set. This is the kind of thing that is a real nuisance to
track down, requiring detailed detective work and a careful analysis
of the entire chain of logic to find the spot where things actually go
wrong. Given that get creates the illusion of working even when it
doesn't, I fail to see how a pre or post condition in my own code
could have picked up on this or validated the input, short of having a
deep understanding of all the interfaces required by every core
function and testing every input explicitly for support of those
interfaces (in which case, I might as well be using a statically typed
language).
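
To make the failure mode concrete, here is a minimal sketch of what a fail-fast variant of get could look like. The name strict-get and its contract are assumptions made up for this illustration, not anything in clojure.core, and the check is deliberately narrower than what get really accepts (it rejects nil, strings, arrays and Java maps as well as transients).

```clojure
;; Illustration only: strict-get is a hypothetical helper, not part of clojure.core.
(defn strict-get
  "Like get, but throws instead of quietly returning nil when coll is
  not a persistent Clojure map, vector, or set."
  [coll k]
  (if (or (associative? coll) (set? coll))
    (get coll k)
    (throw (IllegalArgumentException.
            (str "strict-get: unsupported collection type: " (class coll))))))

(get 2 2)               ; nil, and no error
(strict-get {:a 1} :a)  ; 1
(strict-get #{1 2} 3)   ; nil, but this time a legitimate "not found"
;; (strict-get 2 2) and (strict-get (transient #{1}) 1) both throw
;; IllegalArgumentException at the call site.
```

The cost is exactly the one described above: the caller has to decide up front which kinds of collection it is willing to accept.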

James Keats

Jul 4, 2011, 8:26:20 AM7/4/11
to Clojure


On Jul 4, 5:45 am, Christian Schuhegger
My immediate instinct would suggest you already use an existing one,
but I note that you said it was a "toy project". ERP is a problem
that's historically been well-suited for prolog and logic programming.
Yes, I believe RDF/OWL(/Sparql and semantic web reasoners) is a major
advancement in that field and would recommend you look at it. A good
book to get you started would be SEMANTIC WEB for the WORKING ONTOLOGIST,
of which a second edition has recently come out. :-)

James Keats

Jul 4, 2011, 8:40:39 AM7/4/11
to Clojure


On Jul 4, 1:26 pm, James Keats <james.w.ke...@gmail.com> wrote:
> On Jul 4, 5:45 am, Christian Schuhegger
> A good
> book to get you started would be SEMANTIC WEB for the WORKING ONTOLOGIST,
> of which a second edition has recently come out. :-)

Sorry about the unintentional "to get you started" figure of speech; I
note you said you already had rdf/owl in your kit. It's not out of
underestimating your knowledge (though it might be out of my sense of
being mildly overwhelmed by the still remaining reading list I already
have of semantic web books, Springer just keeps dropping them like
rain. :-)

Islon Scherer

Jul 4, 2011, 9:41:27 AM7/4/11
to Clojure
I think the issue with large programs is not the language but
software engineering.
A large program should be well designed and architected, and this is a
problem (I think) many people in Clojure and functional programming in
general have: "Clojure is a very high level and concise language, so
I'll grow my program as I type".
I'm not proposing UML or any specific tool or technique, but analysis
and design are an important part of any large piece of software.
It's easier to understand your problem by looking at your high-level
documentation/diagrams than by looking at code. Of course some problems
and refactoring will happen no matter how well you designed, but you'll
understand better what you did and what you should do.

Islon

Christian Schuhegger

Jul 4, 2011, 10:49:46 AM7/4/11
to Clojure
No worries. I have the book on my shelf. The first version. But thanks
for making me aware of the second version.

Timothy Washington

Jul 4, 2011, 3:17:28 PM7/4/11
to clo...@googlegroups.com
Yes, exactly. I'm going to check that out. 

Thanks Shantanu 
Tim 



Stuart Halloway

Jul 5, 2011, 9:01:36 AM7/5/11
to clo...@googlegroups.com
On large projects I do the following:

(1) Use "require :as prefix" everywhere. This felt ugly at first, but puts pressure on naming in a way that is beneficial as the codebase grows.

(2) Think of the consumer of the lib, not the author. As a user of Midje, I would want all the utility fns in a single namespace (if they were separated from the domain API at all).

In general, I have found that namespaces should be larger than my OO intuition would have them be.
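
As a rough sketch of point (1): the namespace myapp.report and the two functions below are hypothetical names chosen for illustration only. Every call into another namespace carries its alias, so the origin of each function is visible at the call site.

```clojure
;; myapp.report, normalize-tag and shared-tags are made-up names.
(ns myapp.report
  (:require [clojure.string :as string]
            [clojure.set :as set]))

(defn normalize-tag
  "Trims and lower-cases a tag; both calls are visibly from clojure.string."
  [s]
  (string/lower-case (string/trim s)))

(defn shared-tags
  "Tags present in both collections; set/intersection is the aliased
  clojure.set function, while plain set is the clojure.core function."
  [a b]
  (set/intersection (set a) (set b)))
```

With this style a short function name such as trim or join never has to be unique across the whole codebase, only within its own namespace.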

Stu


Stuart Halloway
Clojure/core
http://clojure.com

Laurent PETIT

Jul 5, 2011, 9:18:21 AM7/5/11
to clo...@googlegroups.com
2011/7/5 Stuart Halloway <stuart....@gmail.com>:

> On large projects I do the following:
> (2) Think of the consumer of the lib, not the author. As a user of Midje, I
> would want all the utility fns in a single namespace (if they were separated
> from the domain API at all).
> In general, I have found that namespaces should be larger than my OO
> intuition would have them be.
> Stu

Yes, and this is IMHO driven by the fact that there are fewer
dependencies between two functions in a namespace than between two
methods in a class (which may share state via the instance).

Ken Wesson

Jul 5, 2011, 12:59:55 PM7/5/11
to clo...@googlegroups.com
On Tue, Jul 5, 2011 at 9:01 AM, Stuart Halloway
<stuart....@gmail.com> wrote:
> In general, I have found that namespaces should be larger than my OO
> intuition would have them be.

One problem with scaling up namespaces, though, is that ongoing
"invalid constant tag 32" issue with big enough input files (see other
thread). For now, until it's fixed, there's an effective size cap on
namespaces that is hit at around 1kloc (typically no more than a few
hundred functions).

--
Protege: What is this seething mass of parentheses?!
Master: Your father's Lisp REPL. This is the language of a true
hacker. Not as clumsy or random as C++; a language for a more
civilized age.

Sean Corfield

Jul 5, 2011, 1:33:29 PM7/5/11
to clo...@googlegroups.com
On Tue, Jul 5, 2011 at 6:01 AM, Stuart Halloway
<stuart....@gmail.com> wrote:
> (1) Use "require :as prefix" everywhere. This felt ugly at first, but puts
> pressure on naming in a way that is beneficial as the codebase grows.

I've also started leaning toward that approach. At first I tended to
:use clojure.* namespaces and :require our own code with aliases but
now I'm moving more to :require on all namespaces, often without an
alias (on short ns names) and then using the long form in calls. In
other words, only using an alias if it really cleans up the code (one
tooling deficiency I noticed is that CCW won't recognize clojure.*
namespace functions if you use an alias and I find the color-coding is
worth more than the conciseness of the code).

> In general, I have found that namespaces should be larger than my OO
> intuition would have them be.

I'm beginning to find that. At first I was creating namespaces much as
I would have for classes but that soon produced long (ns) forms
requiring all the small namespaces so I backed off to less granular
namespaces and I'm finding that easier to manage.

I just saw Ken's note come in about "invalid constant tag 32" and
looking at the threads behind that, it looks like folks hit it when
they have "large" files but I'd be concerned about any single
namespace-based-API that grew that large - I would have expected to
break it down into a "public" API and a "private" implementation
namespace before files got that large. I guess it will be interesting
to see how this pans out as Clojure adoption continues to grow...
maybe that limitation should be endorsed and the compiler could issue
an "Error: your namespace is too big - please modularize your code!"
message as a way to keep namespaces to a maintainable length... :)

David Nolen

Jul 5, 2011, 2:58:14 PM7/5/11
to clo...@googlegroups.com
On Tue, Jul 5, 2011 at 12:59 PM, Ken Wesson <kwes...@gmail.com> wrote:
> On Tue, Jul 5, 2011 at 9:01 AM, Stuart Halloway
> <stuart....@gmail.com> wrote:
>> In general, I have found that namespaces should be larger than my OO
>> intuition would have them be.
>
> One problem with scaling up namespaces, though, is that ongoing
> "invalid constant tag 32" issue with big enough input files (see other
> thread). For now, until it's fixed, there's an effective size cap on
> namespaces that is hit at around 1kloc (typically no more than a few
> hundred functions).

Why doesn't this limitation affect clojure.core, which is 6k+ loc?

David 

Ken Wesson

Jul 6, 2011, 2:38:42 AM7/6/11
to clo...@googlegroups.com

I have no idea. But clojure.core is hardly typical; for one thing, it
loads during rather than after bootstrap. Possibly AOT-compiled
namespaces don't have the problem, or only hit it at larger sizes than
ones JIT-compiled with load-file. It *is* interesting that core.clj is over
200k and doesn't fail whereas others have reported the error happening
consistently for any .clj file exceeding exactly 64k. Perhaps there is
some way to make larger .clj files palatable that could be used by us
normal folk, though if so it isn't obvious what.

Peter Taoussanis

Jul 6, 2011, 3:49:56 AM7/6/11
to Clojure
Don't know if it counts as "large", but I'm running a 20,000+ LOC
project for a 100%-Clojure web app at www.wusoup.com.

My 2c: I'm not an experienced developer by any stretch of the
imagination; this is something I'm working on completely alone, and
yet I've so far found the whole thing incredibly manageable. I'd
attribute that largely to Clojure.

Then again, I only noticed this thread because of its relation to the
"unknown constant tag" one ;p

I'd like to open-source the whole app at some stage (or at least some
large parts of it), but I'm also always happy to answer any questions
from the perspective of someone using exclusively Clojure for a small
(but hopefully growing) "production" application.

One of the things I've most enjoyed about Clojure (and it being
functional) is the ease with which I can bash on a function in the
REPL during development: testing it with all sorts of weird/nil input,
making sure that it'll be well behaved even if something else along
the way gets confused.

The modularity I can get with "functional" functions is reassuring for
me as a lone developer since once I've written something and it's gone
through that "bashing" stage- I'm normally pretty confident that it's
more or less "right". I very rarely end up needing to come back to fix
problems related to unexpected input, etc.

Most of the time when I need to "fix" a function it's because I simply
had the wrong idea about what it actually needed to do, rather than
because it was doing it wrong. If that makes any sense.


For a large project I think you probably need to be more disciplined
with something like Clojure than, say, Java. But that's the whole
"with great power" thing again: I think you get something valuable in
return for being asked to exercise some discipline.

Can't really comment on how easily Clojure works for large groups of
developers as such. The flexibility thing might start losing its
charm when you have 10 different coding styles competing with one
another under time constraints, etc. (where discipline starts to go
out the window in favour of "getting stuff done").

- Peter Taoussanis

Ken Wesson

Jul 6, 2011, 4:05:31 AM7/6/11
to clo...@googlegroups.com
On Wed, Jul 6, 2011 at 3:49 AM, Peter Taoussanis <ptaou...@gmail.com> wrote:
> Can't really comment on how easily Clojure works for large groups of
> developers as such. The flexibility thing might start losing its
> charm when you have 10 different coding styles competing with one
> another under time constraints, etc. (where discipline starts to go
> out the window in favour of "getting stuff done").

So far, the Clojure culture has strongly encouraged a sense for
particular idiomatic coding conventions for most common tasks; so
hopefully "10 different coding styles competing with one another"
won't be the sort of issue it might be if you were using, say, Common
Lisp.

Johan Gardner

Jul 6, 2011, 6:18:28 AM7/6/11
to clo...@googlegroups.com
What you say especially resonates with me regarding the 'ease of use' wrt hammering code in a highly iterative/productive way, and I have approached a number of 'enterprise' size solutions in exactly that way with extremely robust results (IMO of course :-)).  

Raoul Duke

Jul 6, 2011, 1:50:09 PM7/6/11
to clo...@googlegroups.com
On Wed, Jul 6, 2011 at 3:18 AM, Johan Gardner <jgar...@vikingstorm.com> wrote:
> What you say especially resonates with me regarding the 'ease of use' wrt
> hammering code in a highly iterative/productive way, and I have approached a
> number of 'enterprise' size solutions in exactly that way with extremely
> robust results (IMO of course :-)).

for those of you who
(a) find such reports tantalizing
and
(b) don't totally dislike static typing, especially when done with inference
i suggest checking into things in the ML family of languages. i found
that when i used SML it gave me the same amazing feeling only more so.

sincerely.
$0.02.

Zach Tellman

Jul 6, 2011, 7:39:53 PM7/6/11
to clo...@googlegroups.com
I agree that namespaces should be designed to be consumed, but that can be pretty taxing on the developer.  In my libraries, I tend to split the functions into whatever sub-namespaces I want to keep the organization easy for me, and then import all the functions I want to expose into a higher-level namespace.

For example, in Aleph I have HTTP functionality implemented in aleph.http.client, aleph.http.server, aleph.http.websocket, etc. but all the useful functions are gathered together into aleph.http.  This means that I don't have to navigate a monolithic namespace, but the users of my library don't have to declare a dozen namespaces to get anything done.  I find this approach scales for me pretty well, and I haven't heard any complaints from the people using my libraries about the organization.
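
A minimal sketch of that layout, using plain def to re-export functions. All names here (mylib.http and its sub-namespaces, http-request, start-server) are hypothetical stand-ins, not Aleph's real API, and this is not necessarily the mechanism Aleph itself uses. In a real project each ns form would live in its own file; they are shown together only to keep the sketch short.

```clojure
;; Hypothetical implementation namespaces, organized for the author.
(ns mylib.http.client)
(defn http-request [url] {:url url :method :get})

(ns mylib.http.server)
(defn start-server [port] {:port port :running? true})

;; Hypothetical facade namespace, organized for the consumer.
(ns mylib.http
  (:require [mylib.http.client :as client]
            [mylib.http.server :as server]))

;; Re-export by binding new vars to the existing function values.
(def http-request client/http-request)
(def start-server server/start-server)
```

A consumer then requires only mylib.http. One caveat of this simple approach: def copies the function value at load time, so docstrings and arglists are not carried over, and redefining the original in the REPL does not update the facade.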

Zach

Glen Stampoultzis

Jul 6, 2011, 10:50:11 PM7/6/11
to clo...@googlegroups.com
On 7 July 2011 09:39, Zach Tellman <ztel...@gmail.com> wrote:
> I agree that namespaces should be designed to be consumed, but that can be pretty taxing on the developer.  In my libraries, I tend to split the functions into whatever sub-namespaces I want to keep the organization easy for me, and then import all the functions I want to expose into a higher-level namespace.
>
> For example, in Aleph I have HTTP functionality implemented in aleph.http.client, aleph.http.server, aleph.http.websocket, etc. but all the useful functions are gathered together into aleph.http.  This means that I don't have to navigate a monolithic namespace, but the users of my library don't have to declare a dozen namespaces to get anything done.  I find this approach scales for me pretty well, and I haven't heard any complaints from the people using my libraries about the organization.

I think that's a fairly sane way to organise things.  I tend to get annoyed when a library has several use/requires to make use of it.

Feng Shen

Jul 7, 2011, 2:45:57 AM7/7/11
to Clojure
Our codebase is 6.8 kloc of production code and 4 kloc of test code. We
use Emacs with SLIME + swank to develop.
The editor is great, the REPL is great. But the lack of debugging and
refactoring support is a pain.

On Jul 3, 9:26 am, Mark Engelberg <mark.engelb...@gmail.com> wrote:
> Ideally, I was hoping to start a more in-depth discussion about the
> pros and cons of "programming in the large" in Clojure than just
> waxing poetic about Clojure/Lisp's capabilities in the abstract :)
>
> Yes, much of the initial excitement around Clojure comes from the
> feeling of "Wow, I can do so much with so little code".  But at some
> point, all projects grow.  I'm thinking that by now, there may be
> enough people using Clojure in large projects and on large teams to
> offer some good feedback about how well that works.
>
> My Clojure codebase is somewhere around 2-3kloc and I already feel
> like I'm bumping up against some frustration when it comes time to
> refactor, maintain, and extend the code, all while keeping up with
> ongoing changes to libraries, contrib structures, and Clojure
> versions.
>
> I want to hear war stories from those with even larger code bases than
> mine.  Has it proven to be a major hassle on large projects to avoid
> circular dependencies in the modules?  Are the lack of debugging
> tools, documentation tools, and refactoring tools holding you back?
> Anyone miss static typing?
>
> One of my main gripes is that some of Clojure's built-ins return
> nonsensical results (or nil), rather than errors, for certain classes
> of invalid inputs.  To me, one of the main benefits of functional
> programming is that debugging is generally easier, in large part
> because failures usually occur within close proximity of the flaw that
> triggered the failure.  Erlang, in particular, has really promoted the
> idea of "fail fast" as a way to build robust systems.  But Clojure's
> lack of a "fail-fast" philosophy has burned me several times, with
> hard-to-track-down bugs that were far-removed from the actual cause.
> The larger my code grows, the more this annoys me, reminding me too
> much of my days tracking down bugs in imperative programs.
>
> One specific example of this is get, which returns nil whenever the
> first input isn't something that supports get.  For example, (get 2 2)
>  produces nil.  This becomes especially problematic when you pass
> something to get that seems like it should support get, but doesn't.
> For example, (get (transient #{1}) 1) produces nil, when there's
> absolutely no reason to think that (get (transient #{1}) 1) would
> behave any differently from ((transient #{1}) 1).

Scott Jaderholm

Jul 7, 2011, 7:55:31 PM7/7/11
to clo...@googlegroups.com
On Thu, Jul 7, 2011 at 2:45 AM, Feng Shen <she...@gmail.com> wrote:
> But lacking debugging and
> refactoring support is a pain.

In case you're not familiar with these (not saying they're full-featured):

https://github.com/pallet/ritz
http://www.youtube.com/watch?v=d_L51ID36w4

https://github.com/tcrayford/clojure-refactoring

Scott
