Andreas Leitgeb <a...@logic.at> writes:
> Not an XML expert, either.
> Something *else* still looks strange:
>
> set orig {<a><b>1</b><![CDATA[test of & <bad> format]]><c>12</c></a>}
>
> set d1 [dom parse $orig]
> set d1X [$d1 asXML]
>
> set d2 [dom parse $d1X]
> set d2X [$d2 asXML]
>
> string equal $d1X $d2X ;# --> 0
>
> So, they're apparently not "equivalent". [...]
>
> [...] The parse&asXML transformation obviously isn't "idempotent"
Please try:
set orig {<a><b>1</b><![CDATA[test of & <bad> format]]><c>12</c></a>}
set d1 [dom parse -keepEmpties $orig]
set d1X [$d1 asXML -indent none]
set d2 [dom parse -keepEmpties $d1X]
set d2X [$d2 asXML -indent none]
string equal $d1X $d2X ;# --> 1
So, if you want "idempotency", just ask for it. (The -keepEmpties flag
isn't needed in this example, but in general it is.)
The default for [dom parse ...] is to throw away every text node that
consists only of white space. With -keepEmpties these white-space-only
nodes are kept as TEXT_NODEs.
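A minimal sketch of the difference, assuming tDOM is installed; the sample document and the variable names are made up for illustration. It counts the child nodes of the root element with and without -keepEmpties:

```tcl
package require tdom

# Hypothetical sample: elements separated by white-space-only text.
set xml "<a>\n  <b>1</b>\n  <c>12</c>\n</a>"

# Default parse: white-space-only text nodes are dropped.
set doc1 [dom parse $xml]
set n1 [llength [[$doc1 documentElement] childNodes]]

# With -keepEmpties: the same white space is kept as text nodes.
set doc2 [dom parse -keepEmpties $xml]
set n2 [llength [[$doc2 documentElement] childNodes]]

puts "default: $n1 child nodes, -keepEmpties: $n2 child nodes"

$doc1 delete
$doc2 delete
```

The second count should come out larger, because the line breaks and indentation between <b> and <c> survive as nodes of their own.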
The default for [$doc asXML] is to generate some "pretty printed"
serialization. With -indent none (or -indent no, same result) no white
space whatsoever will be added in between the string representations of
the nodes of the DOM tree.
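As a small sketch of the two serializations (again assuming tDOM is installed; the sample document is made up for illustration):

```tcl
package require tdom

set doc [dom parse {<a><b>1</b><c>12</c></a>}]

# Default: "pretty printed", with white space added between nodes.
set pretty [$doc asXML]

# -indent none: nothing is added between the nodes.
set compact [$doc asXML -indent none]

puts $pretty
puts $compact

$doc delete
```

Only the second form is suitable for comparing a serialization with its re-parsed and re-serialized copy.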
Both defaults are of course not the right thing if you look at them from
an XML zealot viewpoint. If you need full strictness in this detail
you have the options.
(Well, if you do XSLT transformations it is recommended to parse both
source and stylesheet with -keepEmpties, unless you know for sure that
you don't need the white-space-only text nodes.)
These defaults show a bias toward "XML as a data format" (versus "XML as a
document format"), which is where tDOM came from.
> PS:
>>> % package require tdom
>>> 0.8.3
> ditto here.
This doesn't matter. It has been this way for more than 15 years.
The zealots have an argument for claiming that the defaults are the wrong
way around. (What is this argument? "The principle of least surprise". If
you know your recommendations, you expect to get even the white space in
your XML data as text nodes. And if you do XSLT, you expect that in any
case.)
Turning these defaults around isn't an option (it would introduce
data-driven bugs in a lot of code). I could add a global flag to switch
the defaults for those who prefer that. But isn't that mostly
bikeshedding?