Long story short, I'm manipulating OpenOffice ODT, which in its
autogenerated form does not contain any inter-tag whitespace.
Example:
<office:text xmlns:office="foo" xmlns:text="bar"><text:p>Plain</text:p><text:p><text:span text:style-name="T2">Bold</text:span></text:p></office:text>
If I load that in Nokogiri and save with to_xml, I get this:
irb(main):001:0> @doc.to_xml
=> "<?xml version=\"1.0\"?>\n<office:text xmlns:office=\"foo\" xmlns:text=\"bar\">\n <text:p>Plain</text:p>\n <text:p>\n <text:span text:style-name=\"T2\">Bold</text:span>\n </text:p>\n</office:text>\n"
Or, in human-readable form (namespace declarations omitted):
<office:text>
  <text:p>Plain</text:p>
  <text:p>
    <text:span text:style-name="T2">Bold</text:span>
  </text:p>
</office:text>
And now, if I load that in OO, it inserts a spurious single space
after "Bold" that wasn't there before. If I then resave the file
from OO, it changes into this:
<office:text ...><text:p>Plain</text:p><text:p><text:span text:style-name="T2">Bold</text:span> </text:p></office:text>
In other words, other whitespace is stripped out, but the whitespace
after the bold span is apparently considered meaningful and kept,
albeit collapsed to a single space. Whether OO *should* interpret the
input this way is certainly debatable, but unfortunately that's the
way it is now.
Cheers,
-jani
On Dec 7, 3:39 pm, Mike Dalessio <
mike.dales...@gmail.com> wrote: