Right now the data structures I'm using for XML are pure Clojure. I
wanted to make it as natural as possible to use the built-in Clojure
functions, so the data is very "vanilla". Earlier, on another thread,
Rich had some good comments about why he thinks a plain-data strategy
like that is good, and I tend to agree.

Elements are either Clojure sequences (usually lists) or vectors.
Tags are symbols or keywords, or [uri sym|kwd] pairs if
namespace-qualified. The processing library doesn't care about the
exact form within these bounds, which Clojure makes pretty easy, and
the parser can produce any variant depending on how it's called. I
think the sequence representation would be preferred when lazy
consumption is helpful, and in other situations the vector
representation might give better performance (not tested yet).
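
To make the "doesn't care about the exact form" point concrete, here is a
minimal sketch of accessors that work on either representation. The names
tag, attrs and children are made up for illustration and are not the
library's API. The sketch assumes the attribute map, when present, is
always the second item of an element.

```clojure
;; Hypothetical helpers, not part of the library: uniform accessors that
;; work on both the list and the vector representation.

(defn tag
  "Returns the element's tag: a symbol, a keyword, or a [uri name] pair."
  [elem]
  (first elem))

(defn attrs
  "Returns the attribute map, or nil when the element has none.
  Assumes the attribute map, if any, is the second item."
  [elem]
  (let [x (second elem)]
    (when (map? x) x)))

(defn children
  "Returns the child nodes (elements and strings), skipping the
  attribute map when there is one."
  [elem]
  (if (attrs elem)
    (drop 2 elem)
    (rest elem)))

(tag '(balance {currency "USD"} "3212.12"))      ; => balance
(attrs [:balance {:currency "USD"} "3212.12"])   ; => {:currency "USD"}
(children '(owner "12398"))                      ; => ("12398")
```

Since first, second, rest and drop all work on any sequential
collection, nothing here has to know whether it was handed a list or a
vector.
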
Here's an example document written two different ways:

; as a Clojure list:
'(*TOP*
  (*PI* "myapp" "a processing instruction")
  (account {title "Savings 1" created "5/5/2008"}
    (owner "12398")
    (balance {currency "USD"} "3212.12")
    (descr-html "Main " (b "short term savings") " account.")
    (report-separator " ")
    (*PI* "myapp" "another processing instruction")))

; as a Clojure vector, with keyword tags:
'[*TOP*
  [*PI* "myapp" "a processing instruction"]
  [:account {:title "Savings 1", :created "5/5/2008"}
    [:owner "12398"]
    [:balance {:currency "USD"} "3212.12"]
    [:descr-html "Main " [:b "short term savings"] " account."]
    [:report-separator " "]
    [*PI* "myapp" "another processing instruction"]]]
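
Because the document is plain data, the built-in functions already get
you a long way without any path library. A rough sketch using only
clojure.core (the names doc, all-nodes and elements-tagged are made up
for this example, and the document is a trimmed-down version of the one
above):

```clojure
;; A trimmed-down version of the vector-form document.
(def doc
  '[*TOP*
    [:account {:title "Savings 1"}
     [:balance {:currency "USD"} "3212.12"]
     [:descr-html "Main " [:b "short term savings"] " account."]]])

(defn all-nodes
  "Depth-first seq of every node. Lists and vectors are branches;
  strings, tags and attribute maps are leaves."
  [node]
  (tree-seq sequential? seq node))

(defn elements-tagged
  "All elements with tag t, anywhere under node."
  [node t]
  (filter #(and (sequential? %) (= t (first %)))
          (all-nodes node)))

(elements-tagged doc :balance)
;; => ([:balance {:currency "USD"} "3212.12"])

;; Text content of the first :descr-html element, in document order.
(apply str (filter string?
                   (all-nodes (first (elements-tagged doc :descr-html)))))
;; => "Main short term savings account."
```

The same two functions work unchanged on the list form, since
sequential? is true for both lists and vectors.
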
I get your point about it being nonstandard in the Java world. I
wouldn't mind doing a translator, though, to the more common forms (I
might need some help though, haven't done one yet ...)
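
For what it's worth, here is roughly what I imagine such a translator
could look like, assuming "more common forms" includes a W3C DOM
document. This is an untested-in-anger sketch, not something in the
library: ->dom and append-node! are hypothetical names, it uses only
the standard javax.xml.parsers and org.w3c.dom classes, and it does not
handle namespace-qualified [uri name] tags.

```clojure
(import '(javax.xml.parsers DocumentBuilderFactory)
        '(org.w3c.dom Document Node))

(defn- append-node!
  "Appends the DOM equivalent of node to parent. doc is the owning
  Document. Handles strings, (*PI* target data) and plain elements."
  [^Document doc ^Node parent node]
  (cond
    (string? node)
    (.appendChild parent (.createTextNode doc node))

    (= '*PI* (first node))
    (let [[_ target data] node]
      (.appendChild parent (.createProcessingInstruction doc target data)))

    :else
    (let [[tag & more] node
          attrs (when (map? (first more)) (first more))
          kids  (if attrs (rest more) more)
          elem  (.createElement doc (name tag))]
      ;; name works for both symbol and keyword tags / attribute keys.
      (doseq [[k v] attrs]
        (.setAttribute elem (name k) (str v)))
      (doseq [kid kids]
        (append-node! doc elem kid))
      (.appendChild parent elem))))

(defn ->dom
  "Converts a (*TOP* ...) document, list or vector form, to an
  org.w3c.dom.Document."
  [[_top & nodes]]
  (let [doc (.newDocument
              (.newDocumentBuilder (DocumentBuilderFactory/newInstance)))]
    (doseq [node nodes]
      (append-node! doc doc node))
    doc))

(def dom
  (->dom '[*TOP*
           [*PI* "myapp" "a processing instruction"]
           [:account {:title "Savings 1"}
            [:balance {:currency "USD"} "3212.12"]]]))

(.getTagName (.getDocumentElement dom))   ; => "account"
```
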
If you want to see example path expressions in Clojure and their
outputs, take a look at:
http://github.com/scharris/cxpath/tree/master/tutorial.clj

That tutorial is a work in progress. It doesn't yet show how to deal
with namespaces or how to integrate regular Clojure functions into
path expressions (which is probably the biggest advantage over W3C
XPath). The syntax will probably (hopefully) change somewhat too (I
don't like *at* for attributes; I wish I could use @, but that's not
allowed in symbols or keywords).

A couple of notes:
- This whole approach and some of the code derives from SXML and
SXPath, on which there's a lot more info at
http://okmij.org/ftp/Scheme/xml.html.
- There are other approaches to doing XML in Clojure, esp. Rich's
map-based xml.clj in the Clojure distribution. I chose this different
approach because I've been using SXPath in Scheme (and like it).
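
To show how the two representations relate, here is a small sketch
that turns one element from the list or vector form into nested maps
with :tag, :attrs and :content keys, which is the shape the map-based
xml.clj uses. The function name ->xml-map is made up, and the sketch
ignores *TOP*, *PI* and namespace-qualified tags.

```clojure
(defn ->xml-map
  "Converts an element (list or vector form) into nested
  {:tag ... :attrs ... :content ...} maps. Strings pass through as-is."
  [node]
  (if (string? node)
    node
    (let [[tag & more] node
          attrs (when (map? (first more)) (first more))
          kids  (if attrs (rest more) more)]
      {:tag     (keyword (name tag))
       ;; Normalize symbol or keyword attribute names to keywords.
       :attrs   (when attrs
                  (into {} (for [[k v] attrs] [(keyword (name k)) v])))
       :content (mapv ->xml-map kids)})))

(->xml-map '(balance {currency "USD"} "3212.12"))
;; => {:tag :balance, :attrs {:currency "USD"}, :content ["3212.12"]}
```
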
- Steve