
Alternative (better) xml parser available for ACL??


Frank Sonnemans

Mar 13, 2003, 11:15:09 AM
I am looking for a better XML parser to use with ACL. The parser in ACL
doesn't work for me.

What I am trying to do is process an xml file containing the instance
variables from smalltalk objects stored in xml. I would like to create a
list of structures or objects in lisp.

The xml file looks like:

<target version="1">
<customer>Frank</customer>
<product>LW5</product>
</target>
<target version="1">
<customer>.......

The problem with the ACL parser is that it creates an s-expression like

((target version "1") (customer "frank")(product "lw5"))

However I expect:

(target (customer "frank") (product "lw5"))

It turns out that the attribute "version" messes things up. Manually
removing it gives the expected result. I think this is the wrong way for
the parser to work, because the output does not show the relation between
the 'parent' and 'child' tags. So I am looking for an alternative parser,
or maybe a better approach to import the information.

Best Regards,


Frank


Kenny Tilton

Mar 13, 2003, 11:30:48 AM

Frank Sonnemans wrote:
> I am looking for a better XML parser to use with ACL. The parser in ACL
> doesn't work for me.
>
> What I am trying to do is process an xml file containing the instance
> variables from smalltalk objects stored in xml. I would like to create a
> list of structures or objects in lisp.
>
> The xml file looks like:
>
> <target version="1">
> <customer>Frank</customer>
> <product>LW5</product>
> </target>
> <target version="1">
> <customer>.......
>
> The problem with the ACL parser is that it creates an s-expression like
>
> ((target version "1") (customer "frank")(product "lw5"))
>
> However I expect:
>
> (target (customer "frank") (product "lw5"))
>
> It turns out that the attribute "version" messes things up. Manually
> removing it results in the expected result. I believe this is an
> inappropriate way to work for the parser because it does not recognize the
> relation between the 'parent' and 'child' tags.

I don't see any difference between ((target <mo stuff>) ...) and your
preferred (target ....) in re showing the parent relationship. The
parent relationship does not come from target being un-nested, it comes
from it being first. 'target' is still first in ((target version "1")
...), you just have to excavate it from a long-form (if you will)
description.
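
To make the point concrete, here is a minimal sketch of that excavation. The names lxml-tag, lxml-attributes and lxml-children are made up for this illustration and are not part of the ACL parser; the sketch only assumes elements shaped like the examples in this thread, where the head of an element is either a bare tag or a list of the tag followed by attribute names and values.

```lisp
;; Hypothetical accessors (not part of ACL) for the parser's output.
;; An element looks like (tag child...) or ((tag att val...) child...).

(defun lxml-tag (element)
  "Return the tag symbol, whether or not attributes are present."
  (let ((head (car element)))
    (if (consp head) (car head) head)))

(defun lxml-attributes (element)
  "Return the attribute list (att val ...), or NIL if there is none."
  (let ((head (car element)))
    (if (consp head) (cdr head) nil)))

(defun lxml-children (element)
  "Return the child elements and strings; identical in both forms."
  (cdr element))
```

With these, (lxml-tag '((target version "1") (customer "frank"))) and (lxml-tag '(target (customer "frank"))) both return TARGET, so code that walks the tree never needs to care which form it was handed.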

--

kenny tilton
clinisys, inc
http://www.tilton-technology.com/
---------------------------------------------------------------
"Cells let us walk, talk, think, make love and realize
the bath water is cold." -- Lorraine Lee Cudmore

Bob Bane

Mar 13, 2003, 12:31:17 PM
Frank Sonnemans wrote:

> I am looking for a better XML parser to use with ACL. The parser in ACL
> doesn't work for me.
>

The ACL XML parser produces structure similar to the stuff in ACL's htmlgen, where


<foo bar="baz"><mumble>grumble</mumble></foo>

produces

((foo bar "baz") (mumble "grumble"))

The parser has to do *something* with the attributes, or you couldn't
reproduce the original XML from the Lisp structure. The htmlgen-ish
syntax is OK for writing HTML, but I found it painful for processing XML
because it makes it harder to find the tags. I wrote two little
converter functions that change the above structure to and from:

(foo (bar "baz") (mumble nil "grumble"))

i.e., XML is always represented as (tag attribute-list content...).


Enjoy:

;; For structural processing purposes, I prefer XML rendered as
;;
;; (foo (att val...) stuff)
;; (foo nil stuff)
;;
;; rather than as
;;
;; ((foo att val...) stuff)
;; (foo stuff)
;;
;; These functions convert between these two representations,
;; consing minimally by smashing structure. Nasty, but fast.

(defun flatten-lxml (lxml)
  (typecase lxml
    (cons
     (typecase (car lxml)
       (symbol
        (mapc #'flatten-lxml (cdr lxml))
        (setf (cdr lxml) (cons nil (cdr lxml))))
       (cons
        (if (symbolp (caar lxml))
            (let ((subnode (car lxml))
                  (stuff (cdr lxml)))
              (mapc #'flatten-lxml stuff)
              (setf (car lxml) (car subnode))
              (setf (cdr lxml) subnode)
              (setf (car subnode) (cdr subnode))
              (setf (cdr subnode) stuff))
            (error "Malformed lxml: ~s" lxml)))
       (t (error "malformed lxml: ~s" lxml)))))
  lxml)

(defun unflatten-lxml (uxml)
  (typecase uxml
    (cons
     (unless (and (symbolp (car uxml))
                  (consp (cdr uxml)))
       (error "malformed flat uxml: ~s" uxml))
     (if (cadr uxml)
         ;; Attributes present: (tag (att val...) stuff) -> ((tag att val...) stuff)
         (let* ((subnode (cdr uxml))
                (stuff (cdr subnode)))
           (mapc #'unflatten-lxml stuff)   ; convert the children too
           (setf (cdr subnode) (car subnode))
           (setf (car subnode) (car uxml))
           (setf (car uxml) subnode)
           (setf (cdr uxml) stuff))
         ;; No attributes: (tag nil stuff) -> (tag stuff)
         (progn
           (mapc #'unflatten-lxml (cddr uxml))
           (setf (cdr uxml) (cddr uxml))))))
  uxml)

Tim Bradshaw

Mar 13, 2003, 2:55:34 PM
* Frank Sonnemans wrote:

> It turns out that the attribute "version" messes things up. Manually
> removing it results in the expected result. I believe this is an
> inappropriate way to work for the parser because it does not
> recognize the relation between the 'parent' and 'child' tags. So I
> am looking for an alternative parser, or maybe a better approach to
> import the information.

Erm? Surely:

(defun de-attributize (form)
  (typecase form
    (cons
     (destructuring-bind (tag/attr . body) form
       (cons (typecase tag/attr
               (cons (car tag/attr))
               (t tag/attr))
             (mapcar #'de-attributize body))))
    (t form)))
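
Once the attributes are gone, the remaining step in the original question (getting structures out of the forms) could be sketched as below. This is only an illustration under stated assumptions: the target structure and the helpers child-text and form->target are hypothetical names, the slots simply mirror the element names in the sample XML, and tags are compared by name so the sketch does not depend on which package the parser interns its symbols in.

```lisp
;; Hypothetical structure for one <target> record; the slot names
;; mirror the element names in the sample XML.
(defstruct target customer product)

(defun child-text (form name)
  "Return the first string inside the child element of FORM named NAME.
Assumes an attribute-free form such as (target (customer \"Frank\") ...).
Returns NIL if there is no such child."
  (let ((child (find-if (lambda (x)
                          (and (consp x)
                               (symbolp (car x))
                               (string-equal (car x) name)))
                        (cdr form))))
    (find-if #'stringp (cdr child))))

(defun form->target (form)
  "Build a TARGET structure from one de-attributized <target> form."
  (make-target :customer (child-text form "customer")
               :product  (child-text form "product")))
```

Mapping form->target over the list of de-attributized target forms then gives the list of structures asked for; whitespace strings between elements are skipped because only list children are inspected.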

--tim
