Fun with Clojure, 2nd Edition

210 views
Skip to first unread message

Meikel Brandmeyer

unread,
Aug 2, 2008, 2:14:02 PM8/2/08
to clo...@googlegroups.com
Dear Clojurians,

this is the second post of my "Fun with Clojure" series. In the first
part I described my experiences with parser combinators. Now I have to
write about interfacing with Excel on Windows. (Another topic of limited
interest, though. Maybe I should think about a blog.)

Background:

I'm working in the quality assurance group of the electronic development
department of a rather large automotive supplier. My job involves a lot
of data mining on quality issues. On the one hand we do monthly and
weekly reports and on the other hand we do analysis on request, e.g..
for
a specific customer or device. This is all done in Excel, since we use a
home-grown method of analysis.

Long blabla, short idea: I had to get rid of the manual Excel
hanky-panky. So the requirements for an automatic solution were:

- the resulting Excel file must be 100% interactive (for flexibility
and exploratory data analysis)
- must be easy to use also for non-programmers (at-least as easy as
possible)
- the solution *must* support pivot tables

The last point disqualifies virtually all freely available solutions I
found on the net. So after reading an article about DSLs and Ruby I
thought, that this thing can't be too hard and started to write our own
generator. That was nearly 3 years ago and today we have some kind of
Excel "compiler" in Ruby compiling a DSL form description into Excel
files. Including pivot tables and charts!

(Didn't he say something about "fun" and "Clojure"? I'm missing this
somehow!)

Ok, ok! You are right. So how does Clojure fit into the picture?

Well. The Ruby solution basically works. But it is quite quirky. The GUI
blocks when we do the COM calls to Excel so a hairy Thread.pass orgy was
necessary to get it at least a bit responsive. Then I wanted to
reference columns used in other columns by name like a variable. Not by
the column itself (think of inserting a column somewhere -> *BOOM*). So
I had to do a complicated regex magic to build valid Excel formulae.
Also setting up the environment parameters only worked with globals due
to the way the DSL was implemented. If you don't take care the Excel
instance keeps running. etc.

Yes, yes! Now the Clojure part: Given the issues, which are mostly
related to the way of implementation (I have no clue whatsoever, how to
implement sanely a DSL), I decided rewrite the whole thing. And to get
rid also of the Ruby dependency (Java is ubiquitous at work) I decided
to do it in Clojure. A student worker does the GUI in Java, I do the
backend in Clojure.

Now we have Clojure. Where is the fun?

In the productivity boost: In three days I wrote an almost fully
functional new version in Clojure, which

- is independent of the local Excel language version (which the Ruby
version wasn't)
- is able to provide a fully configurable environment (which was at
least difficult and hairy in Ruby)
- stays fully interactive during Excel file generation (which the Ruby
version wasn't)
- provides a tight integration into the host language (which the Ruby
version wasn't)

Please let me demonstrate this in a short example. We define a Excel
workbook with two columns.

(defworkbook TheWorkbook
(sheet "A"
(column XYZ
(length 3)
(formula
(cond
(<= 1 row 2) row
:else (excel-sum (this 1 2) (ABC 1)))))
(column ABC
(length 1)
(value 7))))

Note the tight integration of Clojure and Excel functions: cond is
Clojure, excel-sum is the Excel SUM function. Also the columns are given
a name, which results in a function available in the formulae, which can
be used to obtain a reference to the column, independent of the
"physical" column in the workbook (or even the workbook itself).

"Compiling" this description is a two-step process. First the form above
is expanded into a set of nested maps which might be thought of some
kind
of AST. These maps are then used to create the Excel file itself.
Generating the example will give an Excel file called "TheWorkbook.xls"
containing a worksheet "A" which roughly looks like this:

A B
1 1 7
2 2
3 = SUM($A$1:$A$2, $B$1)

In a german Excel the formula would be automatically translated to
"= SUMME($A$1:$A$2; $B$1)". Note the function name and the semicolon.
(This is something Visual Basic does for the developer, but other
languages don't.)

Even given that I have now three years of experience in the problem
domain, the three days are still *much* faster than compared to the Ruby
version, which took me some weeks. With some simple applications of some
macros and lots of functional programming it was very easy to get the
basic functionality up and running. The new version is still not
finished, but the remaining parts are simply setting up the wrappers and
emitter functions for the charts and the pivots.

Clojure is fun, FUN, *FUN*!

Thanks again, Rich!

Sincerely
Meikel

PS: I'm sorry for this long post. But it's just another success story.
I simply was in the mood to share this.

PPS: Ruby is not a bad language. The Ruby version saved me months of
manual work! The problems (besides the thread issues) mainly arose
because of my crappy implementation.

Rich Hickey

unread,
Aug 4, 2008, 7:53:10 AM8/4/08
to Clojure
On Aug 2, 2:14 pm, Meikel Brandmeyer <m...@kotka.de> wrote:
> Dear Clojurians,
>
> this is the second post of my "Fun with Clojure" series. In the first
> part I described my experiences with parser combinators. Now I have to
> write about interfacing with Excel on Windows.

Thanks Meikel,

It's nice hearing about people having success (and fun) with Clojure.

Rich

Shawn Hoover

unread,
Aug 4, 2008, 12:23:55 PM8/4/08
to clo...@googlegroups.com
On Sat, Aug 2, 2008 at 11:14 AM, Meikel Brandmeyer <m...@kotka.de> wrote:
Long blabla, short idea: I had to get rid of the manual Excel
hanky-panky. So the requirements for an automatic solution were:

- the resulting Excel file must be 100% interactive (for flexibility
and exploratory data analysis)
- must be easy to use also for non-programmers (at-least as easy as
possible)
- the solution *must* support pivot tables

Hi Meikel,

This is cool. One thing I don't see from your description is how you interfaced to Excel from Clojure. Are you using COM via Java? It would be cool if you could put a snippet in the wiki showing how to get a handle to an Excel workbook in Clojure.

Shawn

Meikel Brandmeyer

unread,
Aug 4, 2008, 2:28:17 PM8/4/08
to clo...@googlegroups.com
Hello Shawn

Am 04.08.2008 um 18:23 schrieb Shawn Hoover:

> This is cool. One thing I don't see from your description is how you
> interfaced to Excel from Clojure. Are you using COM via Java? It would
> be cool if you could put a snippet in the wiki showing how to get a
> handle to an Excel workbook in Clojure.

Yes. I'm using COM. That limits the whole solution to Windows, but pivot
tables are far more important to me than portability. (Not in general,
but because of my situation. Programming is not my full-time job after
all.)

To access the COM interface I use Jacob[1]. Then the basic access code
look like this:

---8<---
(import '(com.jacob.com Dispatch ComThread))

(def xl (new Dispatch "Excel.Application"))
(def wbl (.. Dispatch (get xl "Workbooks") (toDispatch)))
(def wb (.. Dispatch (call wbl "Add") (toDispatch)))
(def wsl (.. Dispatch (get wb "Worksheets") (toDispatch)))
(def ws (.. Dispatch (call wsl "Item" 1) (toDispatch)))
(def c (.. Dispatch (call ws "Cells" 1 1) (toDispatch)))
(. Dispatch put c "Value" "Hello, World!")
(. Dispatch call wb "SaveAs" "Hello.xls")
(. Dispatch call wb "Close" false)
(. Dispatch call xl "Quit")
(. ComThread Release)
---8<---

This is *sooooo* ugly. After shutting down everything you also
have to run some cleanup code per thread, were you used these things.
So the first I did, was to lock away any COM stuff as good as possible.

The ugly (toDispatch) calls in the code above are needed, because you
always get a Variant. So wrapping all the calls into a variant->clj
function was necessary. It just looks inside and translates the
variant's contents into something native for Clojure.

The next step was to get rid of the Dispatch in front. This is also ugly
like hell, but necessary due to the way Jacob and COM work. So showtime
for another macro, which takes care of this.

---8<---
(def-ole-object-class
application
[Visible Workbooks]
[Quit])

(def-ole-object-class
workbook
[Worksheets]
[Close SaveAs])
---8<---

The macro itself is too long, to remember it right now. It simply
creates wrappers with the right names to call the right things. Together
with some other small wrappers, we can then re-write the above code to
the following.

---8<---
(def xl (application-new))
(def wb (workbook-new xl "Hello.xls"))
(def ws (worksheet-new wb "Sheet1"))
(def c (worksheet-cells ws 1 1))
(range-value c "Hello, World!")
(workbook-close wb true)
(application-quit xl)
(. ComThread Release)
---8<---

This is more concise and concentrated on the task, then on getting
Dispatch right, but it still suffers from exceptions and cleaning up
stuff. The with-excel macro to the rescue!

---8<---
(defn with-excel*
[thunk]
(let [worker (fn
[]
(let [xl (new Dispatch "Excel.Application")]
(try
(thunk xl)
(finally
(application-quit xl)
(. ComThread Release))))) ; Clean from Jacob
t (new Thread worker)]
(. t start)
(. t join)))

(defmacro with-excel
[xl-var & body]
`(with-excel* (fn [~xl-var] ~@body)))
---8<---

This takes care of starting up an Excel, do what it is told to and
finally
cleaning up the mess afterwards. In this case it is synchronous to the
calling thread, but you may also fire off an asynchronous request and
come back to it later on. Also we don't get anything out, yet. But since
I just want to put stuff in, I don't care right now.

Now we can finally come up with a half-way usable version of the first
example.

---8<---
(with-excel xl
(let [wb (workbook-new xl "Hello.xls")
ws (worksheet-new wb "Sheet1")
c (worksheet-cells ws 1 1)]
(range-value c "Hello, World!")
; Check that it worked.
(println "A1 =" (range-value c))
(workbook-close wb true)))
---8<---

So, long answer to your question: it is fairly easy to access Excel via
COM, but is low-level and tedious. One could also have a look at
jacobgen (also from the Jacob project). But the above solution fits my
needs and I have Clojure functions instead of Java . calls and full
control over how things are named.

I will ask my boss, in how far I can publish things. The above is fairly
trivial, but to disclose more internals or even release the whole thing
I have to check the legal consequences first.

What I can point you to, however, is the Excel VBA help. It's named
something like vbaxl10.chm. One can more or less translate the VBA calls
directly to Jacob.

Hope this helps.

Sincerely
Meikel

[1]: http://sf.net/projects/jacob-project

Shawn Hoover

unread,
Aug 4, 2008, 2:57:49 PM8/4/08
to clo...@googlegroups.com
On Mon, Aug 4, 2008 at 11:28 AM, Meikel Brandmeyer <m...@kotka.de> wrote:
Am 04.08.2008 um 18:23 schrieb Shawn Hoover:


This is cool. One thing I don't see from your description is how you
interfaced to Excel from Clojure. Are you using COM via Java? It would
be cool if you could put a snippet in the wiki showing how to get a
handle to an Excel workbook in Clojure.

Yes. I'm using COM. That limits the whole solution to Windows, but pivot
tables are far more important to me than portability. (Not in general,
but because of my situation. Programming is not my full-time job after
all.)

To access the COM interface I use Jacob[1].

 
Thanks, Meikel. That answers my question completely and is a great example of using Clojure. I also liked how you showed the steps of cleaning up the initial code by breaking out macros and functions.

Shawn
Reply all
Reply to author
Forward
0 new messages