Helping out with clojure.data.xml

pepijn (aka fliebel)

Nov 23, 2011, 11:12:46 AM
to Clojure Dev
Hi,

What is the status of data.xml, and how can I help?

Much of the effort so far seems to have gone into parsing XML. I
spent the last couple of days, and a couple of days a while back,
implementing the generation of XML, especially with support for
namespaces[1].

This work currently lives under https://github.com/pepijndevos/ArmageDOM
but I'd be happy to contribute it to data.xml instead, or work with
what's there, depending on the state of the current code[2].

One philosophical difference I should point out is that data.xml seems
to use [tag attr content] for syntax, while I use ^{attr} [tag
content], since attributes are in fact exactly that: metadata.

Pepijn

[1]: https://github.com/clojure/data.xml/issues/2
[2]: at first glance, mine doesn't have 20-line functions.

Alan Malloy

Nov 23, 2011, 1:49:06 PM
to Clojure Dev
I strongly disagree that XML attributes are metadata - they are
exactly the data this library is supposed to manipulate. And {:keys
[tag attrs content]} is a structure already used by a few other
libraries, such as Enlive, so it's a natural and convenient choice for
a core package.
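
To make the two shapes concrete, here is a minimal sketch in plain Clojure data (no data.xml calls; the example element and the `summarize` helper are made up for illustration). It destructures the map shape with `:keys`, and shows what changes when attributes move into metadata: Clojure's `=` ignores metadata, so elements that differ only in their attributes compare as equal.

```clojure
;; Map-based element: the {:tag :attrs :content} shape.
(def link {:tag :a
           :attrs {:href "http://example.com"}
           :content ["example"]})

(defn summarize
  "Hypothetical helper: destructures an element map with :keys."
  [{:keys [tag attrs content]}]
  {:tag tag :attr-count (count attrs) :child-count (count content)})

;; Metadata-based variant: attributes ride along as metadata on a vector.
(def meta-link-1 (with-meta [:a "example"] {:href "http://example.com"}))
(def meta-link-2 (with-meta [:a "example"] {:href "http://other.example"}))

;; Equality ignores metadata, so these two compare as equal even though
;; their attributes differ.
(def same? (= meta-link-1 meta-link-2))
```

In the map shape the attributes are ordinary data and take part in equality; in the metadata shape they do not.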

Anyway, data.xml is owned by Chouser, with some help from me. Right
now what it really needs is an official release so that it can get
some field testing. And what's holding that up is that one test, the
indentation test, is failing. This is because, apparently, Java has
pluggable XML handlers, which all have different ways to configure
things like whitespace handling. The tests work for me and Chouser,
but the indentation test fails on the CI machine, probably because it
has a different indenter. So if someone could figure out a reliable
way to detect what handler we're using and how to set up indentation,
a release would magically make its way to some maven repo.

data.xml certainly attempts to be namespace-aware, but I don't think
there were any tests for that yet. I added a test just now, at
https://github.com/clojure/data.xml/tree/namespaces, but it fails. I
suspect that's because my test is incorrect, though: I don't know a
lot about namespaces. You could probably also help by getting that
straightened out.

On Nov 23, 8:12 am, "pepijn (aka fliebel)" <pepijnde...@gmail.com>
wrote:


> Hi,
>
> What is the status of data.xml, and how can I help?
>
> Much of the effort so far seems to have been put into parsing XML, I
> spend the last couple of days and a couple of days a while back
> implementing the generation of XML, especially with support for
> namespaces[1].
>

> This work currently lives under https://github.com/pepijndevos/ArmageDOM

Christophe Grand

Nov 24, 2011, 11:31:36 AM
to cloju...@googlegroups.com
On Wed, Nov 23, 2011 at 7:49 PM, Alan Malloy <al...@malloys.org> wrote:
> data.xml certainly attempts to be namespace-aware, but I don't think
> there were any tests for that yet. I added a test just now, at
> https://github.com/clojure/data.xml/tree/namespaces, but it fails. I
> suspect that's because my test is incorrect, though: I don't know a
> lot about namespaces. You could probably also help by getting that
> straightened out.

The problem with namespaces is that most (all) clojure xml libs are oblivious to the semantics of xmlns and xmlns:* attributes.
In your test, :api/method, api is an alias for http://some.uri.com/location. Aliases are serialization artifacts and, as such, I think they should not be part of the Clojure XML model.

It also follows that a node is not context-free anymore: you need to know its parents to resolve prefixes.

One way to fix this issue would be, in the presence of namespaces, to qualify everything[1] with the full URI (keyword (munge uri) name) and add aliases mapping in metadata (which implies, I think, the removal of xmlns and xmlns:* attributes from the model)

^{:xmlns {"api" "http://some.uri.com/location"}} {:tag :body :content [{:tag :http://some.uri.com/location/method :attrs {:args "" :ret ""}}]}

Hence nodes would be equal even if parsed from serializations using different aliases, and a node could be serialized correctly (by gensyming aliases) even if its context has been lost (e.g. an XML node copied from one doc to another).
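
A minimal sketch of that equality claim, under a few assumptions: `qualified-tag` is a hypothetical helper, it uses the two-argument `keyword` call directly, and the munge step from the proposal is left out. The same document parsed from two serializations that chose different aliases yields equal nodes, because the aliases live only in metadata.

```clojure
(def uri "http://some.uri.com/location")

(defn qualified-tag
  "Hypothetical helper: builds a tag keyword whose namespace part is the
  full URI. The munge step from the proposal is omitted here."
  [uri local-name]
  (keyword uri local-name))

;; One document, serialized once with the alias api and once with x.
(def parsed-with-api
  (with-meta {:tag :body
              :content [{:tag (qualified-tag uri "method")
                         :attrs {:args "" :ret ""}}]}
    {:xmlns {"api" uri}}))

(def parsed-with-x
  (with-meta {:tag :body
              :content [{:tag (qualified-tag uri "method")
                         :attrs {:args "" :ret ""}}]}
    {:xmlns {"x" uri}}))

;; Aliases are metadata only, so they do not affect equality.
(def nodes-equal? (= parsed-with-api parsed-with-x))
```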

What do you think?

Christophe

[1] except unqualified attributes.

Chris Perkins

Nov 24, 2011, 3:37:24 PM
to cloju...@googlegroups.com, chris...@cgrand.net
On Thursday, November 24, 2011 11:31:36 AM UTC-5, Christophe Grand wrote:

> The problem with namespaces is that most (all) clojure xml libs are oblivious to the semantics of xmlns and xmlns:* attributes.
> In your test, :api/method, api is an alias for http://some.uri.com/location. Aliases are serialization artifacts and, as such, I think they should not be part of the Clojure XML model.
>
> It also follows that a node is not context-free anymore: you need to know its parents to resolve prefixes.
>
> One way to fix this issue would be, in the presence of namespaces, to qualify everything[1] with the full URI (keyword (munge uri) name) and add aliases mapping in metadata (which implies, I think, the removal of xmlns and xmlns:* attributes from the model)
>
> ^{:xmlns {"api" "http://some.uri.com/location"}} {:tag :body :content [{:tag :http://some.uri.com/location/method :attrs {:args "" :ret ""}}]}
>
> Hence nodes would be equal even if parsed from serializations using different aliases and a node could be serialized correctly (by gensyming aliases) even if its context has been lost (eg a xml node copied from one doc to another).
>
> What do you think?

 
While I agree that what you propose is the right way to do it, I would also like simple things to remain simple.

As an example, I don't want to be forced to change simple code like this:

(if (= (:tag e) :a) ...)

to something like:

(if (= (:tag e) :http://www.w3.org/1999/xhtml/a) ...)

In many cases - especially where there is only one namespace, with no prefix, and it's on the root element - it's nice to be able to simply ignore namespaces altogether.

I'm not sure what API I would want to support this, but it may be as simple as a flag to xml/parse to flip it back to the current, blissfully namespace-unaware, behavior.

 - Chris

Christophe Grand

Nov 25, 2011, 1:46:27 AM
to cloju...@googlegroups.com
On Thu, Nov 24, 2011 at 9:37 PM, Chris Perkins <chrispe...@gmail.com> wrote:

> While I agree that what you propose is the right way to do it, I would also like simple things to remain simple.
>
> As an example, I don't want to be forced to change simple code like this:
>
> (if (= (:tag e) :a) ...)
>
> to something like:
>
> (if (= (:tag e) :http://www.w3.org/1999/xhtml/a) ...)
>
> In many cases - especially where there is only one namespace, with no prefix, and it's on the root element - it's nice to be able to simply ignore namespaces altogether.
>
> I'm not sure what API I would want to support this, but it may be as simple as a flag to xml/parse to flip it back to the current, blissfully namespace-unaware, behavior.

 
I'm fully aware that it would bring complexity -- and that's why Enlive still relies on clojure.xml behavior.

You have a point that namespaces support should be toggleable -- it's a separate specification and most parsers provide a setting for enabling namespaces -- and by separating the two modes it may be easier to come up with a solution.

In namespace-aware mode, one may choose to store under :tag only the local name of the tag (that is, stripping the alias) and store the namespace under another key. However, this can't be done for attributes (short of changing attributes from keywords to maps or vectors), so for attributes namespace-munging still seems the answer. This is alleviated by the fact that attributes without an alias are not to be resolved, since the default namespace doesn't apply (http://www.w3.org/TR/REC-xml-names/#defaulting).

My proposal now is:

<body xmlns:api="http://some.uri.com/location"
      xmlns:s="http://other.uri.com/security">
  <api:method args="" ret="int" s:role="*"/>
</body>


without namespace (current state of affairs) is parsed into:

{:tag :body
 :attrs {:xmlns:api "http://some.uri.com/location"
         :xmlns:s "http://other.uri.com/security"}
 :content [{:tag :api:method
            :attrs {:args "" :ret "int"
                    :s:role "*"}}]}


with namespaces into:

^{:xmlns {"api" "http://some.uri.com/location"
          "s" "http://other.uri.com/security"}}
 {:tag :body
  :content [{:tag :method
             :ns "http://some.uri.com/location"
             :attrs {:args "" :ret "int"
                     :http://other.uri.com/security/role "*"}}]}
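
The mapping between the two shapes can be sketched as a tree walk. This is only an illustration under stated assumptions: `resolve-ns` and its helpers are hypothetical names, qualified attribute keys are built with the two-argument `keyword` call since the exact encoding is undecided, a default `xmlns` declaration is applied to element names only, and the sketch attaches the accumulated prefix map as metadata on every element it produces.

```clojure
(require '[clojure.string :as str])

(defn- xmlns-attr?
  "True for :xmlns and :xmlns:* attribute keys."
  [k]
  (let [n (name k)]
    (or (= n "xmlns") (str/starts-with? n "xmlns:"))))

(defn- declared-prefixes
  "Prefix -> URI map declared in attrs (the empty string is the default namespace)."
  [attrs]
  (into {}
        (for [[k uri] attrs
              :when (xmlns-attr? k)]
          [(if (= (name k) "xmlns") "" (subs (name k) 6)) uri])))

(defn- split-qname
  "Splits :prefix:local into [prefix local]; prefix is nil when absent."
  [k]
  (let [[a b] (str/split (name k) #":" 2)]
    (if b [a b] [nil a])))

(defn- resolve-attr
  "Qualifies a prefixed attribute key; unprefixed keys are left untouched."
  [prefixes k]
  (let [[prefix local] (split-qname k)]
    (if prefix
      (keyword (get prefixes prefix) local)
      k)))

(defn resolve-ns
  "Converts a namespace-unaware node into the namespace-aware shape."
  ([node] (resolve-ns {} node))
  ([inherited node]
   (if-not (map? node)
     node                                   ; text content passes through
     (let [{:keys [tag attrs content]} node
           prefixes (merge inherited (declared-prefixes attrs))
           [prefix local] (split-qname tag)
           ns-uri (get prefixes (or prefix ""))
           attrs' (into {}
                        (for [[k v] attrs
                              :when (not (xmlns-attr? k))]
                          [(resolve-attr prefixes k) v]))]
       (with-meta
         (cond-> {:tag (keyword local)}
           ns-uri (assoc :ns ns-uri)
           (seq attrs') (assoc :attrs attrs')
           (seq content) (assoc :content (mapv #(resolve-ns prefixes %) content)))
         {:xmlns prefixes})))))

(def unaware
  {:tag :body
   :attrs {:xmlns:api "http://some.uri.com/location"
           :xmlns:s "http://other.uri.com/security"}
   :content [{:tag :api:method
              :attrs {:args "" :ret "int" :s:role "*"}}]})

(def aware (resolve-ns unaware))
```

Because :tag ends up holding only the local name, code such as (= (:tag e) :method) keeps working on the namespace-aware tree.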

Feedback welcome.

Christophe

Ben Smith-Mannschott

Nov 25, 2011, 7:18:11 AM
to cloju...@googlegroups.com

(0) This representation will break round-trip-ability for some XML
vocabularies.

The intention of declared namespace prefixes seems to have been as a
lexically scoped shortcut to keep element and attribute names compact
while using multiple namespaces. This scenario would otherwise require
a plethora of xmlns="..." declarations throughout the tree, or as one
early proposal had it: "{" some-ns-uri "}" local-name.

This implies that namespace prefixes are arbitrary and need not be
preserved by an XML parser. Round-tripping is preserved by simply
inserting synthesized xmlns declarations as necessary when writing the
DOM tree back out again.

Unfortunately, the "inspired" designers of XSD and XSLT chose to use
these namespace prefixes not just in names, but in *content*.
(Attribute values, specifically). Now suddenly, parsers must preserve
the particular prefixes chosen by the document author for fear of
otherwise breaking the document because those prefixes might be used
in content that is opaque to a general XML parser.

This presents another problem because while the prefix must be
preserved for the sake of sick vocabularies like XSD and XSLT it
shouldn't participate in determining equality of qualified names. So,
a Clojure representation of qualified names that preserves NS URI,
local name, and prefix would either need to define its own
qualified-xml-names-are-equal? predicate and use it in place of =
where required, or it would have to store the *prefix* as meta-data
attached to an object representing the tuple (namespace-url,
local-name). And be careful not to lose it during transformations and
such.

(2) It doesn't seem advisable to mash the NS URI and the local name
together in this fashion as it's common to want to get at them
separately. Also, this prevents us from storing the NS URI only once
on the heap and reusing it for every element instead of duplicating
the potentially lengthy NS URI as part of every element name.

This is what you're already doing with element names. Why are
attribute names being handled differently? Is it because attribute
names are generally not namespace qualified? (The fact that they are
scoped lexically to the element within which they occur generally
makes that unnecessary, I guess.)

(3) At first glance :http://other.uri.com/security/role looks like it
could be potentially ambiguous. Local name must contain no slashes
(this is true [1]). The ns URI should not end in a slash (this is not
promised by RFC3986 3.3 [2]).

[1] http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Name
[2] http://www.rfc-editor.org/rfc/rfc3986.txt

(p.s. I've been searching for a tool that lets me do non-lossy
editing/transformation of XML in Clojure; I've even sketched out ideas
for a few approaches -- but nothing has really gelled yet. One factor
is that the conventional XML representation established
by Clojure core isn't very smart about namespaces. p.p.s. XML makes
me grumpy; Sorry.)

// Ben

> Feedback welcome.
>
> Christophe
>
> --
> You received this message because you are subscribed to the Google Groups
> "Clojure Dev" group.
> To post to this group, send email to cloju...@googlegroups.com.
> To unsubscribe from this group, send email to
> clojure-dev...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/clojure-dev?hl=en.
>

Christophe Grand

Nov 25, 2011, 7:49:25 AM
to cloju...@googlegroups.com
Don't be sorry, XML and XML NSs especially make me grumpy too

On Fri, Nov 25, 2011 at 1:18 PM, Ben Smith-Mannschott <bsmit...@gmail.com> wrote:
> with namespaces into:
>
> ^{:xmlns {"api" "http://some.uri.com/location"
>           "s" "http://other.uri.com/security"}}
>  {:tag :body
>   :content [{:tag :method
>              :ns "http://some.uri.com/location"
>              :attrs {:args "" :ret "int"
>                      :http://other.uri.com/security/role "*"}}]}

> (0) This representation will break round-trip-ability for some XML
> vocabularies.
>
> Unfortunately, the "inspired" designers of XSD and XSLT chose to use
> these namespace prefixes not just in names, but in *content*.
> (Attribute values, specifically).

*sigh* true I forgot them.
What if we propagate metadata on all elements?

 
> (2) It doesn't seem advisable to mash the NS URI and the local name
> together in this fashion as it's common to want to get at them
> separately. Also, this prevents us from storing the NS URI only once
> on the heap and reusing it for every element instead of duplicating
> the potentially lengthy NS URI as part of every element name.

I disagree, :http://other.uri.com/security/role is a namespaced keyword and is made of two strings:
"http://other.uri.com/security"  and role (the slash is just part of the representation).

So only one occurrence of "http://other.uri.com/security" on the heap
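
A short sketch of the two-part structure, using only clojure.core (`keyword`, `namespace`, `name`). The keywords are built with the two-argument call rather than written as literals, since this sketch makes no claim about how the reader would parse such a literal. Whether distinct keywords with the same namespace share one underlying string is an implementation detail that is not tested here.

```clojure
(def role (keyword "http://other.uri.com/security" "role"))

(namespace role) ;=> "http://other.uri.com/security"
(name role)      ;=> "role"

;; A namespace URI ending in a slash is still kept apart from the local
;; name when the keyword is built from two strings.
(def trailing (keyword "http://other.uri.com/security/" "role"))

(namespace trailing) ;=> "http://other.uri.com/security/"

;; Keywords are interned: building the same qualified keyword again
;; returns the very same object.
(def interned?
  (identical? role (keyword "http://other.uri.com/security" "role")))
```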


 
> This is what you're already doing with element names. Why are
> attribute names being handled differently? Is it because attribute
> names are generally not namespace qualified?

Exactly, and to avoid switching to a composite representation for attribute names -- or worse, something irregular: a keyword when there is no namespace, something else when namespaced.
Plus it would make it easier to write code which works on namespaced or non-namespaced trees.
 
> (3) At first glance :http://other.uri.com/security/role looks like it
> could be potentially ambiguous. Local name must contain no slashes
> (this is true [1]). The ns URI should not end in a slash (this is not
> promised by RFC3986 3.3 [2]).
>
> [1] http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Name
> [2] http://www.rfc-editor.org/rfc/rfc3986.txt


Attribute keywords would be generated by (keyword (encode namespace) local-name) where encode is currently undefined.

The example namespaces don't need to be escaped, but I'm certainly abusing the reader by using such keywords. I really don't care about the exact encoding used to make URLs fit into a keyword namespace.

David Powell

Nov 25, 2011, 7:56:11 AM
to cloju...@googlegroups.com

What about using a [localName prefix nsUri] vector as a replacement for the munged keyword?

Or maybe ^{:prefix prefix} [localName nsUri] - as that would preserve the prefix without making it part of the data model that users would have to search on.
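
A small sketch of the second variant, assuming localName and nsUri are plain strings (the var names are made up for illustration). Since vector equality and hashing ignore metadata, an attribute stored under one prefix can be looked up with a name that carries another.

```clojure
(def security-ns "http://other.uri.com/security")

;; The same qualified name, as it might come from two documents that
;; chose different prefixes.
(def role-s   (with-meta ["role" security-ns] {:prefix "s"}))
(def role-sec (with-meta ["role" security-ns] {:prefix "sec"}))

(def attrs {:args "" :ret "int" role-s "*"})

;; Lookup succeeds despite the different prefix metadata.
(def found (get attrs role-sec))

;; The original prefix is still recoverable from the name itself.
(def original-prefix (:prefix (meta role-s)))
```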


fyi: http://www.w3.org/TR/xml-infoset/ describes the set of information associated with each concept in an XML document - though you wouldn't want to go out of your way to preserve some of it.

-- 
Dave

Chris Perkins

Nov 25, 2011, 9:07:33 AM
to cloju...@googlegroups.com, chris...@cgrand.net
A few thoughts:

1) I agree that xmlns metadata is needed on every element.  But it would also be nice to be able to do exact, textual round-tripping of XML, which means knowing on which elements the namespaces were actually declared in the original.

2) Should the xmlns metadata have the keys and values the other way around, mapping uri -> prefix?  That's what is needed for writing out the XML, anyway.

3) I sometimes need to keep comments and processing instructions, and possibly cdata sections too, when I work with XML.  Any opinions on supporting these? (maybe too much for now? get the basics first?)

Point 1 is motivated by having worked with Python's ElementTree API in the past. That library does (or did, at the time) a great job of preserving "xml infoset correctness", while still managing to write out XML that is completely unusable from a practical point of view - tagnames munged to "{http://whatever/}name", xmlns declarations re-generated at emit-time in different places, etc.  I think metadata is the perfect tool to allow clojure's XML handling to have correctness without sacrificing practicality. Your proposal, Christophe, sounds to me like the way to get there.

- Chris

Christophe Grand

Nov 27, 2011, 1:24:54 PM
to cloju...@googlegroups.com
Well, earlier in this thread I argued in favor of keeping all attribute keys as keywords for uniformity's sake. But uniformity doesn't buy us anything: you can't destructure :http://other.uri.com/security/role with :keys and nobody in his right mind is going to type such a keyword. The only thing that uniformity gives us is access to the local name through #'name (namespace access requires unmunging -- unless one decides to roll out a new type implementing c.l.Named but I don't think it's a good idea for a broadly used schema to use custom datatypes). It's better/simpler to have some xml/local-name and xml/namespace functions.

All that to say I have changed my mind and ^{:prefix prefix} [localName nsUri] seems ok (I like that localName comes first) albeit at first I would have discarded the prefix metadata.

However I still think that non-namespaced attributes should still be keywords and that [localName], [localName nil] or [localName ""] should not be legal representations of such attributes names. (For implementors, Postel's Law may apply but better not to mandate it at the schema level).
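
Those accessors could look roughly like this. The names `local-name`, `namespace-uri` and `valid-attr-name?` are placeholders, not an existing data.xml API, and the sketch assumes localName and nsUri are strings.

```clojure
(defn local-name
  "Local part of an attribute name: a keyword (no namespace) or a
  [localName nsUri] vector."
  [attr-name]
  (if (keyword? attr-name)
    (name attr-name)
    (first attr-name)))

(defn namespace-uri
  "Namespace URI of an attribute name, or nil for a plain keyword."
  [attr-name]
  (when (vector? attr-name)
    (second attr-name)))

(defn valid-attr-name?
  "True for a keyword, or for a two-element vector with a non-empty
  namespace URI. [localName], [localName nil] and [localName \"\"] are rejected."
  [attr-name]
  (or (keyword? attr-name)
      (and (vector? attr-name)
           (= 2 (count attr-name))
           (string? (second attr-name))
           (not= "" (second attr-name)))))
```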

Christophe




--
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.cgrand.net/ (en)

Chouser

Nov 27, 2011, 2:28:08 PM
to cloju...@googlegroups.com

This is a fantastic discussion. Keep it up! I'd love to see data.xml
have solid support for namespaces, comments, processing instructions,
etc. Several of you clearly understand the issues involved here
better than I do. It sounds like you're close to a plan, so don't
stop now.

I just want to clarify a non-technical point: I have no particular
interest in being the owner (whatever that means) of data.xml. When I
wrote the old contrib lib it was because I needed it. I'm not in
particular need of xml processing these days, which unfortunately
comes out in my poor stewardship of the lib. Both Alan Malloy and
Ryan Senior have done more recently to add features and push data.xml
toward a release.

Please don't let me be any kind of impediment to getting the design
and implementation you need, and pushing out releases.

Let me know how I can help, as needs arise.
--Chouser

Christophe Grand

Nov 27, 2011, 3:47:25 PM
to cloju...@googlegroups.com
On Fri, Nov 25, 2011 at 3:07 PM, Chris Perkins <chrispe...@gmail.com> wrote:
> 1) I agree that xmlns metadata is needed on every element.  But it would also be nice to be able to do exact, textual round-tripping of XML, which means knowing on which elements the namespaces were actually declared in the original.

Exact round-tripping is a harsh requirement. How exact do you want to be? Up to the entities used? Attributes order?
However, your point regarding ElementTree is important, and maintaining the original ns aliases (and the place where they are declared) as far as possible should be a goal.
 
> 2) Should the xmlns metadata have the keys and values the other way around, mapping uri -> prefix?  That's what is needed for writing out the XML, anyway.

It should, definitely -- I got it backwards.
 

> 3) I sometimes need to keep comments and processing instructions, and possibly cdata sections too, when I work with XML.  Any opinions on supporting these? (maybe too much for now? get the basics first?)

I already have support of comments in Enlive: it's a necessity in HTML between javascript in comments and IE conditional comments. It's simply {:type :comment :data "xxx"}. 
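
As an illustration of how such comment nodes sit alongside element nodes, here is a minimal sketch. Only the {:type :comment :data ...} shape comes from the message above; `comment-node?` and `strip-comments` are hypothetical helpers, not Enlive functions.

```clojure
(defn comment-node?
  "True for a node of the shape {:type :comment :data ...}."
  [node]
  (and (map? node) (= :comment (:type node))))

(defn strip-comments
  "Removes comment nodes from a node's content, recursively."
  [node]
  (if (and (map? node) (contains? node :content))
    (update node :content
            #(vec (remove comment-node? (map strip-comments %))))
    node))

(def doc
  {:tag :div
   :content ["before"
             {:type :comment :data " a comment "}
             {:tag :p :content ["text" {:type :comment :data "nested"}]}]})

(def stripped (strip-comments doc))
```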

Chris Perkins

Nov 27, 2011, 5:06:51 PM
to cloju...@googlegroups.com, chris...@cgrand.net
On Sunday, November 27, 2011 1:24:54 PM UTC-5, Christophe Grand wrote:
> Well, earlier in this thread I argued in favor of keeping all attribute keys as keywords for uniformity's sake. But uniformity doesn't buy us anything: you can't destructure :http://other.uri.com/security/role with :keys and nobody in his right mind is going to type such a keyword. The only thing that uniformity gives us is access to the local name through #'name (namespace access requires unmunging -- unless one decides to roll out a new type implementing c.l.Named but I don't think it's a good idea for a broadly used schema to use custom datatypes). It's better/simpler to have some xml/local-name and xml/namespace functions.

It's not clear to me what sort of munging and unmunging would be necessary.  Isn't (keyword "anything-at-all" "localName") legal clojure? Why do you need to munge anything? Regardless, I agree that using a vector of [localName uri] would probably be more convenient.
 
> All that to say I have changed my mind and ^{:prefix prefix} [localName nsUri] seems ok (I like that localName comes first) albeit at first I would have discarded the prefix metadata.

I don't think you need the prefix metadata because that information is in the element's metadata (assuming you don't need to be able to write out or otherwise process an attribute independent of its containing element).
 
> However I still think that non-namespaced attributes should still be keywords and that [localName], [localName nil] or [localName ""] should not be legal representations of such attributes names. (For implementors, Postel's Law may apply but better not to mandate it at the schema level).

I don't quite follow what you're saying here.  Postel's Law is "be liberal in what you accept...", so why should [localName], [localName nil], and [localName ""] be illegal? Or do you just mean that a parser should be required not to emit those forms for unprefixed attributes, but that other code (eg: code to write out XML) should accept them?

- Chris

Laurent PETIT

Nov 27, 2011, 6:20:43 PM
to cloju...@googlegroups.com


2011/11/27 Christophe Grand <chris...@cgrand.net>

> On Fri, Nov 25, 2011 at 3:07 PM, Chris Perkins <chrispe...@gmail.com> wrote:
>> 1) I agree that xmlns metadata is needed on every element.  But it would also be nice to be able to do exact, textual round-tripping of XML, which means knowing on which elements the namespaces were actually declared in the original.
>
> Exact round-tripping is a harsh requirement. How exact do you want to be? Up to the entities used? Attributes order?
> However your point regarding elementtree is important and maintaining original ns aliases (and the place where they are declared) as far as possible should be a goal.
>
>> 2) Should the xmlns metadata have the keys and values the other way around, mapping uri -> prefix?  That's what is needed for writing out the XML, anyway.
>
> It should, definitely -- I got it backwards.

Is it legal to have the same uri referred to by different aliases?  If so, the prefix -> uri mapping may be a better fit?
 
 

>> 3) I sometimes need to keep comments and processing instructions, and possibly cdata sections too, when I work with XML.  Any opinions on supporting these? (maybe too much for now? get the basics first?)
>
> I already have support of comments in Enlive: it's a necessity in HTML between javascript in comments and IE conditional comments. It's simply {:type :comment :data "xxx"}.


Chris Perkins

Nov 27, 2011, 7:01:00 PM
to cloju...@googlegroups.com
On Sunday, November 27, 2011 6:20:43 PM UTC-5, lpetit wrote:

 
>>> 2) Should the xmlns metadata have the keys and values the other way around, mapping uri -> prefix?  That's what is needed for writing out the XML, anyway.
>>
>> It should, definitely -- I got it backwards.
>
> Is it legal to have the same uri referred to by different aliases?  If so, the prefix -> uri mapping may be a better fit?
 
Good point.  Yes, it is legal.  OK, now XML is making me grumpy too.

But having two prefixes for the same uri is probably so rare in the real world that it would be fine to just keep one of them, more-or-less at random.  You may not get perfect round-tripping, but you don't sacrifice correctness, xml-infoset-wise.

- Chris
 

Laurent PETIT

Nov 28, 2011, 1:09:53 AM
to cloju...@googlegroups.com


2011/11/28 Chris Perkins <chrispe...@gmail.com>
I was thinking of a scenario when one would assemble / merge, etc., xml from different sources, where presumably each source could have used different prefix for the same URI. Could that happen in the real world ? 

 


Cosmin Stejerean

Nov 28, 2011, 3:06:35 AM
to cloju...@googlegroups.com
On Mon, Nov 28, 2011 at 5:09 PM, Laurent PETIT <lauren...@gmail.com> wrote:
> I was thinking of a scenario when one would assemble / merge, etc., xml from
> different sources, where presumably each source could have used different
> prefix for the same URI. Could that happen in the real world ?

In those cases though presumably the assembled XML will be sent to
some other system and it's unlikely that the other system would
appreciate different prefixes for the same URI.


--
Cosmin Stejerean
http://offbytwo.com

Christophe Grand

Nov 28, 2011, 3:28:02 AM
to cloju...@googlegroups.com

*sigh*, XML -- I hope nobody is going to bring more fun to the table with validating XML processor...

Beyond the case of several aliases for one url, there is another argument in favor of prefix -> uri mapping: it ensures we are not going to have a xmlns map where two urls are bound to the same prefix!
The uri->prefix mapping is needed while serializing, so the serializer can maintain it (from the prefix->uri maps found while walking the tree) and pass it to the relevant functions. Plus, in such a setup it would be the serializer's job to pick the best alias for a given namespace.
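
Deriving the serializer-side map can be sketched as a plain inversion. `uri->prefix` is a hypothetical helper, and "alphabetically first prefix wins" is just one arbitrary but stable way of picking an alias when several are bound to the same URI.

```clojure
(defn uri->prefix
  "Inverts a prefix -> URI map. When several prefixes are bound to the
  same URI, the alphabetically first one is kept."
  [prefix->uri]
  (reduce (fn [acc [prefix uri]]
            (if (contains? acc uri)
              acc
              (assoc acc uri prefix)))
          {}
          (into (sorted-map) prefix->uri)))

(def inverted
  (uri->prefix {"foo2" "http://foo"
                "foo"  "http://foo"
                "s"    "http://other.uri.com/security"}))
;; inverted is {"http://foo" "foo", "http://other.uri.com/security" "s"}
```

Going the other way is not possible without losing information, which is the argument for storing prefix -> uri in the metadata and deriving uri -> prefix at serialization time.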

Christophe Grand

Nov 28, 2011, 4:13:05 AM
to cloju...@googlegroups.com
On Sun, Nov 27, 2011 at 11:06 PM, Chris Perkins <chrispe...@gmail.com> wrote:

> It's not clear to me what sort of munging and unmunging would be necessary.  Isn't (keyword "anything-at-all" "localName") legal clojure? Why do you need to munge anything?

The result of such a call to keyword may not be readable.

 
> Regardless, I agree that using a vector of [localName uri] would probably be more convenient.
>
>> All that to say I have changed my mind and ^{:prefix prefix} [localName nsUri] seems ok (I like that localName comes first) albeit at first I would have discarded the prefix metadata.
>
> I don't think you need the prefix metadata because that information is in the element's metadata (assuming you don't need to be able to write out or otherwise process an attribute independent of its containing element).

That was my initial line of thought, but it can't do any harm to keep this information for the rare use case where one copies attributes from one document to another.
Anyway, it's metadata, so after transformations it may be out of date or it may have disappeared; metadata is just going to act as a hint for the serializer. Serializer implementors are free to adopt any strategy (including not looking at metadata).
 

>> However I still think that non-namespaced attributes should still be keywords and that [localName], [localName nil] or [localName ""] should not be legal representations of such attributes names. (For implementors, Postel's Law may apply but better not to mandate it at the schema level).
>
> I don't quite follow what you're saying here.  Postel's Law is "be liberal in what you accept...", so why should [localName], [localName nil], and [localName ""] be illegal?

To be conservative in what we send.
 
> Or do you just mean that a parser should be required not to emit those forms for unprefixed attributes, but that other code (eg: code to write out XML) should accept them?

Exactly: other code MAY (or even SHOULD, but definitely not MUST -- in the strict RFC 2119 sense of those verbs) accept them, but one SHALL NOT emit them.
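
A sketch of what "liberal in, conservative out" could mean for a consumer such as a serializer. `normalize-attr-name` is a made-up name, and local names are assumed to be strings.

```clojure
(defn normalize-attr-name
  "Accepts liberal forms of an attribute name and returns the strict one:
  a keyword for names without a namespace, the [localName nsUri] vector
  (unchanged, metadata included) otherwise."
  [attr-name]
  (if (keyword? attr-name)
    attr-name
    (let [[local-name ns-uri] attr-name]
      (if (or (nil? ns-uri) (= "" ns-uri))
        (keyword local-name)
        attr-name))))

;; Liberal inputs that all denote the same non-namespaced attribute.
(def normalized
  (mapv normalize-attr-name [:role ["role"] ["role" nil] ["role" ""]]))
;; normalized is [:role :role :role :role]
```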

Christophe Grand

Nov 30, 2011, 1:02:10 PM
to cloju...@googlegroups.com
I started a design page http://dev.clojure.org/display/DXML/Fuller+XML+support to summarize the current proposal. Contributions and feedback are more than welcome.

Ben Smith-Mannschott

Nov 30, 2011, 2:52:12 PM
to cloju...@googlegroups.com
On Mon, Nov 28, 2011 at 00:20, Laurent PETIT <lauren...@gmail.com> wrote:
>
>
> 2011/11/27 Christophe Grand <chris...@cgrand.net>
>>
>> On Fri, Nov 25, 2011 at 3:07 PM, Chris Perkins <chrispe...@gmail.com>
>> wrote:
>>>
>>> 1) I agree that xmlns metadata is needed on every element.  But it would
>>> also be nice to be able to do exact, textual round-tripping of XML, which
>>> means knowing on which elements the namespaces were actually declared in the
>>> original.
>>
>>
>> Exact round-tripping is a harsh requirement. How exact do you want to be?
>> Up to the entities used? Attributes order?
>> However your point regarding elementtree is important and maintaining
>> original ns aliases (and the place where thay are declared) as far as
>> possible should be a goal.
>>
>>>
>>> 2) Should the xmlns metadata should have the keys and values the other
>>> way around, mapping uri -> prefix?  That's what is needed for writing out
>>> the XML, anyway.
>>
>>
>> It should, definitely -- I got it backwards.
>
>
> Is it legal to have the same uri referred to by different aliases?  If so,
> the prefix -> uri mapping may be a better fit?

Yes, you can have the same uri referred to by different prefixes:

<root xmlns="http://foo" xmlns:foo="http://foo" xmlns:foo2="http://foo">
<element/>
<foo:element/>
<foo2:element/>
</root>

Is perfectly legal. All three "element" elements belong to the
"http://foo" namespace (as does the "root" element).

Ben Smith-Mannschott

Nov 30, 2011, 3:04:57 PM
to cloju...@googlegroups.com, chris...@cgrand.net
On Sun, Nov 27, 2011 at 23:06, Chris Perkins <chrispe...@gmail.com> wrote:
> On Sunday, November 27, 2011 1:24:54 PM UTC-5, Christophe Grand wrote:
>>
>> Well earlier in this thread I argued in favor of keeping all attributes
>> keys as keywords for uniformity sake. But uniformity doesn't buy us
>> anything: you can't destructure :http://other.uri.com/security/role with
>> :keys and nobody in is right mind is going to type such a keyword. The only
>> thing that uniformity gives us is access to the local name through #'name
>> (namespace access requires unmunging -- unless one decides to roll out a new
>> type implementing c.l.Named but I don't think it's a good idea for a broadly
>> used schema to use custom datatypes). It's better/simpler to have some
>> xml/local-name and xml/namespace functions.
>>
> It's not clear to me what sort of munging and unmunging would be necessary.
>  Isn't (keyword "anything-at-all" "localName") legal clojure? Why do you
> need to munge anything? Regardless, I agree that using a vector of
> [localName uri] would probably be more convenient.
>
>>
>> All that to say I have changed my mind and ^{:prefix prefix} [localName
>> nsUri] seems ok (I like that localName comes first) albeit at first I would
>> have discarded the prefix metadata.
>>
> I don't think you need the prefix metadata because that information is in
> the element's metadata (assuming you don't need to be able to write out or
> otherwise process an attribute independent of its containing element).

I don't believe that's accurate.

<element xmlns:foo="http://foo" a="" foo:a=""/>

- element belongs to no namespace
- attribute a belongs to no namespace (but is contained in element)
- attribute foo:a belongs to namespace "http://foo" and has the local name "a"

<element xmlns="http://foo" a=""/>

- element belongs to namespace "http://foo"
- attribute a belongs to no namespace.
(If you require an attribute to belong to a particular namespace, the only
way to accomplish this is to declare a prefix for that namespace and use
that prefix.)

At least that's how I have always understood section 6.2
http://www.w3.org/TR/REC-xml-names/#defaulting:

# A default namespace declaration applies to all unprefixed element names
# within its scope. Default namespace declarations do not apply directly
# to attribute names; the interpretation of unprefixed attributes is
# determined by the element on which they appear.

See also 6.3: http://www.w3.org/TR/REC-xml-names/#uniqAttrs which gives
the following example:

# However, each of the following is legal, the second because the default
# namespace does not apply to attribute names:
#
# <!-- http://www.w3.org is bound to n1 and is the default -->
# <x xmlns:n1="http://www.w3.org"
# xmlns="http://www.w3.org" >
# <good a="1" b="2" />
# <good a="1" n1:a="2" />
# </x>
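
The spec example can be checked with a namespace-aware parser. This is a minimal sketch using the JDK's JAXP DOM API through interop, not data.xml; the helper names `parse-ns-aware` and `good-attr-info` are made up for illustration.

```clojure
(import '(javax.xml.parsers DocumentBuilder DocumentBuilderFactory)
        '(java.io ByteArrayInputStream)
        '(org.w3c.dom Document Element))

(defn parse-ns-aware
  "Parses an XML string with a namespace-aware JAXP DOM parser."
  [^String xml]
  (let [^DocumentBuilderFactory factory (DocumentBuilderFactory/newInstance)
        _ (.setNamespaceAware factory true)
        ^DocumentBuilder builder (.newDocumentBuilder factory)]
    (.parse builder (ByteArrayInputStream. (.getBytes xml "UTF-8")))))

;; The second <good> element from section 6.3 of the namespaces spec.
(def spec-example
  (str "<x xmlns:n1=\"http://www.w3.org\" xmlns=\"http://www.w3.org\">"
       "<good a=\"1\" n1:a=\"2\"/>"
       "</x>"))

(defn good-attr-info
  "Reports the namespaces seen on <good> and its two attributes named a."
  []
  (let [^Document doc (parse-ns-aware spec-example)
        ^Element good (.item (.getElementsByTagNameNS doc "*" "good") 0)]
    {:element-ns    (.getNamespaceURI good)
     :unprefixed-a  (.getAttributeNS good nil "a")
     :prefixed-a    (.getAttributeNS good "http://www.w3.org" "a")
     :unprefixed-ns (.getNamespaceURI (.getAttributeNode good "a"))}))

(good-attr-info)
;; => {:element-ns "http://www.w3.org", :unprefixed-a "1",
;;     :prefixed-a "2", :unprefixed-ns nil}
```

The element picks up the default namespace, while the unprefixed attribute a stays in no namespace, which is why the two attributes do not clash.
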

>>
>> However I still think that non-namespaced attributes should still be
>> keywords and that [localName], [localName nil] or [localName ""] should not
>> be legal representations of such attributes names. (For implementors,
>> Postel's Law may apply but better not to mandate it at the schema level).
>>
> I don't quite follow what you're saying here.  Postel's Law is "be liberal
> in what you accept...", so why should [localName], [localName nil], and
> [localName ""] be illegal? Or do you just mean that a parser should be
> required not to emit those forms for unprefixed attributes, but that other
> code (eg: code to write out XML) should accept them?
>
> - Chris
>
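
To make the two positions concrete, here is a minimal sketch of the representation under discussion: keywords for attributes in no namespace, and ^{:prefix prefix} [localName nsUri] vectors for namespaced ones. The function names `local-name`, `namespace-uri` and `canonical-attr-name` are hypothetical and not part of data.xml. The accessors are lenient about [localName], [localName nil] and [localName ""], while the canonical form never produces them.

```clojure
(defn local-name
  "Local part of an attribute name, for both keyword and vector forms."
  [attr-name]
  (if (keyword? attr-name)
    (name attr-name)
    (first attr-name)))

(defn namespace-uri
  "Namespace URI of an attribute name, or nil when it has none.
  A missing, nil or empty URI in a vector is treated as no namespace."
  [attr-name]
  (when (vector? attr-name)
    (let [uri (second attr-name)]
      (when (seq uri) uri))))

(defn canonical-attr-name
  "Strict form: a keyword when there is no namespace, otherwise the
  original vector (which keeps any :prefix metadata)."
  [attr-name]
  (if (namespace-uri attr-name)
    attr-name
    (keyword (local-name attr-name))))

(def role ^{:prefix "sec"} ["role" "http://other.uri.com/security"])

(map canonical-attr-name [:a ["a"] ["a" nil] ["a" ""]])
;; => (:a :a :a :a)

[(local-name role) (namespace-uri role) (:prefix (meta (canonical-attr-name role)))]
;; => ["role" "http://other.uri.com/security" "sec"]
```

Under this reading a parser would only ever emit the canonical forms, and a writer could call `canonical-attr-name` first if it wants to accept the looser ones.
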

> --
> You received this message because you are subscribed to the Google Groups
> "Clojure Dev" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/clojure-dev/-/FswmvaV7VKYJ.

Herwig Hochleitner

Nov 30, 2011, 4:12:29 PM
to cloju...@googlegroups.com
2011/11/30 Christophe Grand <chris...@cgrand.net>:

> I started a design page
> http://dev.clojure.org/display/DXML/Fuller+XML+support to summarize the
> current proposal, contributions and feedback more than welcome.

The page seems not to be public right now.

Christophe Grand

Dec 1, 2011, 3:41:36 AM
to cloju...@googlegroups.com
Indeed, you need to be logged in :-/ I didn't find where to change that on the page -- I think the whole DXML space is behind identification.
Can someone with more Confluence skills help?


Christopher Redinger

Dec 2, 2011, 11:44:23 PM
to cloju...@googlegroups.com
On Thu, Dec 1, 2011 at 3:41 AM, Christophe Grand <chris...@cgrand.net> wrote:
> Indeed, you need to be logged in :-/ I didn't find where to change that on the page -- I think the whole DXML space is behind identification.
> Can someone with more Confluence skills help?

DXML space permissions have been fixed to allow anonymous users to read. 

Christophe Grand

Dec 3, 2011, 5:52:57 AM
to cloju...@googlegroups.com
Thanks!
