Does it surprise anyone else that both minidom and lxml (possibly using the same parser under the covers) don't treat a literal "\r\n" and the corresponding character references (e.g. "&#13;&#10;") as equivalent? I would have expected, based on
http://www.w3.org/TR/REC-xml/#sec-line-ends, to get back "\n" as the value of the text node in both cases. That's not what happens, however. If the input the parser receives contains a literal "\r\n", what comes out is "\n" (as I expected), but if the same two characters are serialized as character references, what comes out is "\r\n"! This seems like a flaw in either the parser(s) or the spec. If the spec (which I don't claim to have fully understood) really says these two representations of the two-character sequence should be treated differently, I haven't been able to find any rationale for why line-ending normalization wouldn't operate on the characters represented by either serialization. Any clues?
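
For anyone who wants to reproduce this, here is a minimal sketch using only the standard library's xml.dom.minidom. The helper name text_of and the one-element test documents are made up for illustration, and the decimal form "&#13;&#10;" is assumed for the character references; the hex form should behave the same way, but I have only sketched the decimal one here.

```python
from xml.dom import minidom


def text_of(xml_bytes):
    """Parse a tiny document and return the text content of its root element.

    Hypothetical helper for this reproduction only.
    """
    doc = minidom.parseString(xml_bytes)
    root = doc.documentElement
    # Join all text children in case the parser reports the data in pieces.
    return "".join(
        node.data for node in root.childNodes
        if node.nodeType == node.TEXT_NODE
    )


# Case 1: the two characters CR LF appear literally in the input bytes.
literal = text_of(b"<a>\r\n</a>")

# Case 2: the same two characters are written as character references.
referenced = text_of(b"<a>&#13;&#10;</a>")

print(repr(literal))     # '\n'    (line ending normalized)
print(repr(referenced))  # '\r\n'  (carriage return preserved)
```

The same pair of inputs can be fed to lxml to compare, though that is not shown here.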