Largest Clojure codebases?


dilettan...@live.com

Nov 15, 2015, 2:22:12 PM
to Clojure
I've been having a (friendly) argument with a friend who is an old-school OOP programmer.  He insists that you need objects to make large-scale codebases legible and maintainable over the long run.  Quite apart from this argument's virtues or lack thereof, this made me curious -- what are the largest programs written in Clojure in terms of LOC?  I know I've seen mentions of 50k-100k LOC projects (World Singles, if I'm remembering correctly), but are there any that are larger?

   Vikram

Marc O'Morain

Nov 15, 2015, 2:48:40 PM
to clo...@googlegroups.com, Clojure
We have a large app at CircleCI - as of September:

"The repo for our main app contains 93,983 lines of Clojure code. The src directory of our main app contains 369 namespaces."





Timothy Baldridge

Nov 15, 2015, 4:27:28 PM
to clo...@googlegroups.com
It's funny, because most of the larger OOP projects I worked on were large because of class bloat, not because of business concerns. For example, the C# app I used to work on was a more or less simple CRUD app. We had an ORM that served up objects to a Silverlight UI. So if we wanted to display information about a person on the UI we normally had to modify around 5 classes in 5 files. We had the following layers:

ORM - 1.4MB of generated C# for interfacing with the 200 or so SQL tables we had
DTO - Data Transfer Object, where we used "normal" C# objects to abstract the ORM. This is where we had the "Person" object
API - A web service that served up DTOs over HTTP
Data Model - Processed views of DTOs formatted in a way that was easily viewable by the UI
View Model - UI classes that would take data from a Data Model and emit UI controls.

All of that ceremony....that is replaced by one thing in Clojure...data. Hashmaps and vectors replace all the junk you see above.

So that's where I often assert "Yes, you may have 1 million lines of Java code....but that would only be ~10,000 lines of Clojure." With proper application of data-driven systems (data that configures pipelines and writes code) you can easily get a 100:1 ratio of Java to Clojure code.

Timothy

--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

Gregg Reynolds

Nov 15, 2015, 5:00:18 PM
to clo...@googlegroups.com

I'm reminded of the old joke: to err is human; to really screw up you need a computer (read: OO).

Colin Yates

Nov 15, 2015, 6:23:01 PM
to clo...@googlegroups.com
Exactly this. I couldn’t find a reliable way of counting LOC, but the (Clojure/ClojureScript) src tree (excluding tests) of the project I have to hand is 1.5MB.

dennis zhuang

Nov 15, 2015, 7:06:54 PM
to Clojure
I use cloc (http://cloc.sourceforge.net/) to count the LOC of our projects; the total is about 41,025 lines of Clojure code.





庄晓丹
Email:        killm...@gmail.com xzh...@avos.com
Site:           http://fnil.net
Twitter:      @killme2008


Kyle R. Burton

Nov 16, 2015, 9:54:23 AM
to clo...@googlegroups.com
At the last company I was with I used sloccount [1] to analyze the codebase.  I concatenated all the clj files to a .lisp file so sloccount could analyze it.  I was curious about the cost estimate that sloccount performs to see how the team measured up (size varied from 2 to 7 over 5 years).  When I did the analysis (over a year ago) we had about 130k lines of Clojure that represented about two dozen libraries and about six services.  Including the JavaScript, Java, C, Ruby and other languages in our repositories, sloccount estimated over 5x the person-years we actually spent.  This team was also responsible for the whole stack - production operations, releases, etc.  If someone is doing research, I'd be happy to reach out to a colleague to see if they would run the analysis again.
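
The concatenation step described above might look like the following sketch. The function name `concat_clj_for_sloccount`, the output file name `all.lisp`, and the directory layout are assumptions made for illustration; the post does not show the commands that were actually used.

```shell
#!/bin/sh
# concat_clj_for_sloccount SRC_DIR OUT_DIR
# Concatenate every .clj file under SRC_DIR into OUT_DIR/all.lisp so that a
# counter which recognises the .lisp extension can process the Clojure code.
# Hypothetical helper: the name and file layout are illustrative only.
concat_clj_for_sloccount() {
    src_dir="$1"
    out_dir="$2"
    mkdir -p "$out_dir"
    # -exec cat {} + batches the files, so file names with spaces are safe.
    find "$src_dir" -type f -name '*.clj' -exec cat {} + > "$out_dir/all.lisp"
}

# Example (assumes sloccount is installed; it is given a directory to scan):
#   concat_clj_for_sloccount ./src /tmp/clj-as-lisp
#   sloccount /tmp/clj-as-lisp
```

Writing the combined file into its own directory keeps the counter from also picking up the other languages in the repository.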


Michael Willis

Nov 16, 2015, 10:58:53 AM
to Clojure
For what it's worth, here's the way that I always count lines of code, should work on any unix-like system:

find . -name "*.clj" -exec cat {} + | wc -l
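
A raw `wc -l` total also includes blank lines and comments. Below is a minimal sketch of a stricter count, assuming that whitespace-only lines and lines that start with `;` should not be counted. The function name `count_clj_loc` and the set of extensions are illustrative choices, not something from this thread, and the filter does not recognise `#_` forms, `(comment ...)` blocks, or docstrings.

```shell
#!/bin/sh
# count_clj_loc [DIR]
# Print the number of non-blank, non-comment lines in all .clj, .cljs and
# .cljc files under DIR (default: current directory).
# Hypothetical helper: only full-line ";" comments are skipped.
count_clj_loc() {
    dir="${1:-.}"
    find "$dir" -type f \( -name '*.clj' -o -name '*.cljs' -o -name '*.cljc' \) \
        -exec cat {} + |
        # -v inverts the match, -c counts: keep lines that are neither
        # whitespace-only nor a comment starting with ";".
        grep -c -v -E '^[[:space:]]*(;.*)?$'
}

# Example:
#   count_clj_loc src
```

Numbers produced this way are still only roughly comparable with the cloc and sloccount figures quoted in this thread, since each tool applies its own rules for what counts as a line of code.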

Sean Corfield

Nov 23, 2015, 2:29:38 PM
to clo...@googlegroups.com
> I know I've seen mentions of 50k-100k LOC projects (World Singles, if I'm remembering correctly), but are there any that are larger?

Our code base is "only" about 30kloc of production Clojure so far (and 7.5kloc of unit tests and 2kloc WebDriver tests). As we refactor our legacy code base, we’re rewriting quite a bit of it from a dynamic OO scripting language to Clojure and finding we need less code — sometimes a lot less — for FP compared to OOP. If we were using Java, our OOP codebase would be even bigger so the "compression" from Java to Clojure would be even more extreme. That means that a "large" codebase in OOP is likely to be much, much larger than the equivalent "large" codebase in FP — in general — making it somewhat hard to compare numbers in any meaningful way.

I’m not quite sure what you mean by "old-school OOP programmer" but, for comparison, I’ve been doing software development commercially since the early 80’s and OOP specifically since the early 90’s. I did FP at college in the 80’s and I’ve been doing it commercially for the last six years now. Based on that comparative experience, I’m pretty comfortable asserting that a) you do not need objects to make large-scale codebases legible and maintainable and b) an object-based codebase is likely to be larger than the equivalent functional codebase (and a smaller codebase is more legible and maintainable anyway).

Sean Corfield -- (904) 302-SEAN
World Singles -- http://worldsingles.com/

Nicolas Herry

Nov 24, 2015, 6:16:09 AM
to clo...@googlegroups.com

On 16/11/2015 15:54, Kyle R. Burton wrote:
> At the last company I was with I used sloccount [1] to analyze the
> codebase. I concatenated all the clj files to a .lisp file so sloccount
> could analyze it. I was curious about the cost estimate that sloccount
> performs to see how the team measured up (size varied from 2 to 7 over
> 5 years). When I did the analysis (over a year ago) we had about 130k
> lines of Clojure that represented about two dozen libraries and about
> six services. Including the javascript, java, C, Ruby and other
> languages in our repositories, sloccount estimated over 5x the person
> years we actually spent. This team was also responsible for the whole
> stack - production operations, releases, etc. If someone is doing
> research, I'd be happy to reach out to a colleague to see if they would
> run the analysis again.

Is sloccount reliable for anything beyond counting lines of code? In
the past, I've tried it on simple code bases in various languages, and
it always produced cost estimates far higher than reality
(one-off Perl things I would do over a weekend would be estimated at
something like 150 days).

I confess I've never configured sloccount (actually, I have no idea
whether this is possible at all) and have always relied on the stock,
default execution. I'm genuinely interested in knowing whether sloccount
can be used as a serious cost estimator.

Nicolas
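
For context on the estimates discussed above: as far as I know, sloccount's default effort figure comes from the Basic COCOMO model in organic mode, which depends only on the line count. The 10 KSLOC input below is an arbitrary illustration, not a number from this thread.

```latex
\[
  E = 2.4 \times (\mathrm{KSLOC})^{1.05} \quad \text{person-months}
\]
\[
  \mathrm{KSLOC} = 10 \;\Rightarrow\;
  E = 2.4 \times 10^{1.05} \approx 2.4 \times 11.22 \approx 26.9 \ \text{person-months}
\]
```

Because the model sees only size, it cannot tell a weekend script from production code of the same length, which would be consistent with the inflated estimates reported here.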
