XML problem


yvan

Feb 10, 2010, 5:24:40 AM
to Clojure
Hello Clojure group,

I am testing Clojure and I get an error when parsing the XML excerpt
below.
Is this a SAX bug, a Clojure bug, or my mistake?

Thanks for your help.

IN REPL
********
(ns x (:require [clojure.xml :as xml]))

x=> (try (xml/parse "exampleSortieXML.xml") (catch Exception e (.printStackTrace e)))


ANSWER
*******
org.xml.sax.SAXParseException: The reference to entity "utmn" must end with the ';' delimiter.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:174)

Etc...

XML file
******
<traffic>
<entry statusCode="200" method="GET" url="http://www.google-analytics.com/__utm.gif?utmwv=4.6.5&utmn=1786408720&utmhn=www.witbe.net&utmcs=UTF-8&utmsr=1680x1050&utmsc=24-bit&utmul=fr&utmje=1&utmfl=10.0%20r42&utmcn=1&utmdt=Witbe%20-%20v%C3%A9ritable%20supervision%20de%20bout%20en%20bout%20et%20monitoring%20de%20la%20Qualit%C3%A9%20d%27Exp%C3%A9rience%20%3A%20Syst%C3%A8mes%20d%27Information%20et%20Services%20Multi-play&utmhid=2134295609&utmr=-&utmp=%2Fqoe%2Findex.php%2FAccueil.html&utmac=UA-7415175-1&utmcc=__utma%3D218258335.1952450742.1265618759.1265618759.1265618759.1%3B%2B__utmz%3D218258335.1265618759.1.1.utmcsr%3D(direct)%7Cutmccn%3D(direct)%7Cutmcmd%3D(none)%3B" bytes="35" start="2010-02-08T09:45:58.811+0100" end="2010-02-08T09:45:58.922+0100" timeInMillis="111">
</entry>
</traffic>

Laurent PETIT

Feb 10, 2010, 9:54:14 AM
to clo...@googlegroups.com
Hello Yvan,

I guess it's neither a Clojure nor a Java SAX parser problem, but
rather a problem in the XML file itself.

A bare ampersand is illegal in an attribute value.

Every & in an attribute value should be replaced by &amp;. Otherwise
the XML parser treats whatever follows the & as the name of an entity
reference, which must end with ; so that the parser can replace it
with the entity's value. That is why the error complains about "utmn".

Was the XML produced "by hand" or by string concatenation, rather
than by an XML library?
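
To illustrate the point, here is a minimal sketch that parses two tiny
documents from strings. The helper names parse-str and parse-error and
the example.com URL are made up for this illustration; only
clojure.xml/parse and the exception class come from the original post.

```clojure
(require '[clojure.xml :as xml])
(import 'java.io.ByteArrayInputStream)

;; Hypothetical helper: clojure.xml/parse accepts an InputStream,
;; so wrap the string's bytes in one.
(defn parse-str [s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

;; Hypothetical helper: returns the parser's error message,
;; or nil when the document parses.
(defn parse-error [s]
  (try
    (parse-str s)
    nil
    (catch org.xml.sax.SAXParseException e
      (.getMessage e))))

;; Same attribute, once with a bare & and once with &amp;.
(def bad  "<entry url=\"http://example.com/?a=1&b=2\"/>")
(def good "<entry url=\"http://example.com/?a=1&amp;b=2\"/>")

(parse-error bad)                        ; a SAXParseException message
(get-in (parse-str good) [:attrs :url])  ; => "http://example.com/?a=1&b=2"
```

Note that the parsed attribute value contains a plain & again: the
escaping only exists in the serialized file.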


Laurent PETIT

Feb 10, 2010, 10:02:16 AM
to clo...@googlegroups.com
Here is the reference I was looking for:

http://www.w3.org/TR/xml/#NT-AttValue


Laurent PETIT

Feb 10, 2010, 10:02:46 AM
to clo...@googlegroups.com

Alexandre Patry

Feb 10, 2010, 9:16:19 AM
to clo...@googlegroups.com
Hi,

yvan wrote:
> Hello Clojure group,
>
> I am testing Clojure and I get an error when parsing the XML excerpt
> below.
> Is this a SAX bug, a Clojure bug, or my mistake?
>
> Thanks for your help.
>
> IN REPL
> ********
> (ns x (:require [clojure.xml :as xml]))
>
> x=> (try (xml/parse "exampleSortieXML.xml") (catch Exception e (.printStackTrace e)))
>
>
> ANSWER
> *******
> org.xml.sax.SAXParseException: The reference to entity "utmn" must end with the ';' delimiter.
>

Your XML is not well-formed: every & in it must be escaped as &amp;.

The excerpt "UTF-8&utmsr" should thus be "UTF-8&amp;utmsr".
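
If the program producing the file cannot be fixed, one possible
workaround is to repair the text before handing it to the parser. The
sketch below is only an assumption about how that could look: the
names bare-amp and escape-bare-ampersands are invented here, and the
regex simply leaves alone anything that already looks like an entity
or character reference. It does not understand CDATA sections or
comments, so fixing the producer remains the better option.

```clojure
(require '[clojure.string :as str])

;; Matches an & that does NOT start something shaped like a reference:
;; a named entity (&amp;), a decimal (&#38;) or a hex (&#x26;) one.
(def bare-amp
  #"&(?!(?:[A-Za-z_:][\w.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")

;; Hypothetical helper: escape every bare & in the string s.
(defn escape-bare-ampersands [s]
  (str/replace s bare-amp "&amp;"))

(escape-bare-ampersands "UTF-8&utmsr")
;; => "UTF-8&amp;utmsr"
```

The repaired string could then be parsed from an InputStream wrapping
its bytes instead of passing the file name to xml/parse.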

Alex
