
asymmetry in tDOM XML parsing and rendering


Aric Bills

Apr 11, 2008, 2:23:07 AM
It appears that when tDOM renders an XML document (using [$doc asXML])
where an element node and a text node are siblings and the element node
precedes the text node, the text node gets rendered on its own line.
When you look at the value of the text node (using [$textnode
nodeValue]), however, it is clear that the newline is not considered
part of the value.

However, if you parse the output of [$doc asXML] into a new dom tree,
the newline is now included in the value of the text node in question.

Here is some code to demonstrate what I mean:

# start code
package require tdom

proc demonstrate {} {
    # Node-creating commands for use inside appendFromScript
    dom createNodeCmd -returnNodeCmd element elementNode
    dom createNodeCmd -returnNodeCmd text textNode
    dom createDocument test doc

    # Build <test><elementNode/>some text</test>
    [$doc documentElement] appendFromScript {
        elementNode {}
        set textnode [textNode "some text"]
    }

    puts "Original document as XML:"
    puts [$doc asXML]
    puts "Value of original text node between slashes: /[$textnode nodeValue]/\n"

    # Reparse the serialized output and inspect the same text node
    dom parse [$doc asXML] doc2
    puts "Reparsed document as XML:"
    puts [$doc2 asXML]
    set newtextnode [$doc2 selectNodes {/test/text()}]
    puts "Value of new text node between slashes: /[$newtextnode nodeValue]/\n"
}

demonstrate
# end code

Is there a way to get tDOM to parse its own output into an input
identical to the original DOM tree, without gratuitous newlines?

Thanks,
Aric

Koen Danckaert

Apr 11, 2008, 3:34:20 AM
On 11 apr, 08:23, Aric Bills <aric.bi...@gmail.com> wrote:
> It appears that when tDOM renders an XML document (using [$doc asXML])
> where an element node and a text node are siblings and the element node
> precedes the text node, the text node gets rendered on its own line.
> When you look at the value of the text node (using [$textnode
> nodeValue]), however, it is clear that the newline is not considered
> part of the value.
>
> However, if you parse the output of [$doc asXML] into a new dom tree,
> the newline is now included in the value of the text node in question.
>
> [...]

> Is there a way to get tDOM to parse its own output into an input
> identical to the original DOM tree, without gratuitous newlines?

I believe parsing with -keepEmpties and serializing with -indent none
should preserve all whitespace and linebreaks:

dom parse -keepEmpties [$doc asXML -indent none] doc2
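
A minimal sketch of how the difference could be checked, rebuilding the
document from the original post at global level. The variable names
(indented, exact, fromIndented, fromExact) are illustrative and not from
the thread, and string() is used in the XPath expressions so that only the
first text child of <test> is compared:

```tcl
package require tdom

# Build <test><elementNode/>some text</test>, as in the original example.
dom createNodeCmd -returnNodeCmd elementNode elementNode
dom createNodeCmd -returnNodeCmd textNode textNode
set doc [dom createDocument test]
[$doc documentElement] appendFromScript {
    elementNode {}
    set textnode [textNode "some text"]
}

# Default serialization indents the output, which adds whitespace
# around the text node.
set indented [dom parse [$doc asXML]]

# -indent none adds no whitespace when serializing; -keepEmpties keeps
# whitespace-only text nodes when parsing.
set exact [dom parse -keepEmpties [$doc asXML -indent none]]

set original     [$textnode nodeValue]
set fromIndented [$indented selectNodes {string(/test/text())}]
set fromExact    [$exact selectNodes {string(/test/text())}]

puts "original:     /$original/"
puts "default:      /$fromIndented/"
puts "-indent none: /$fromExact/"
puts "round trip identical: [expr {$original eq $fromExact}]"
```

If the slashes in the "default" line enclose a line break and the ones in
the "-indent none" line do not, the extra newline comes from the indenting
serializer rather than from the parser.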

--Koen

Aric Bills

Apr 11, 2008, 4:35:37 AM
On Apr 10, 11:34 pm, Koen Danckaert <koen.n...@gmail.com> wrote:
>
> I believe parsing with -keepEmpties and serializing with -indent none
> should preserve all whitespace and linebreaks:
>
> dom parse -keepEmpties [$doc asXML -indent none] doc2
>
> --Koen

Indeed! In fact, it looks like -indent none alone does the trick, at
least in the minimal example I posted. Thanks for the quick and helpful
response.
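
For the record, here is a small sketch of a case where -keepEmpties would
presumably still matter: a document that already contains whitespace-only
text nodes. The sample XML string and the variable names are made up for
illustration; the sketch only counts the child nodes of <test> under each
parsing mode.

```tcl
package require tdom

# Hypothetical input with whitespace-only text around the child element.
set xml "<test> <a/> </test>"

# By default the parser drops text nodes that contain only whitespace.
set stripped [dom parse $xml]

# With -keepEmpties those text nodes are retained.
set kept [dom parse -keepEmpties $xml]

set strippedCount [llength [$stripped selectNodes {/test/node()}]]
set keptCount     [llength [$kept selectNodes {/test/node()}]]

puts "children without -keepEmpties: $strippedCount"
puts "children with -keepEmpties:    $keptCount"
puts "reserialized: [$kept asXML -indent none]"
```

So -indent none takes care of whitespace added on output, while
-keepEmpties takes care of whitespace that was in the input to begin with.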

Aric
