
XmlWriter, encoding, and document declaration


Zoe Hart

Jul 11, 2007, 8:02:24 AM
I am trying to use an XmlWriter object to generate the XML for a SOAP post
to a web service. The sample requests documented at the web service all
include the XML document declaration <?xml version="1.0" encoding="utf-8"?>.
By default, the XmlWriter seems to generate <?xml version="1.0"
encoding="utf-16"?> even when I specify UTF-8 encoding when I create the
XmlWriter. I've found in the .NET Framework documentation that I can use
XmlWriter.WriteProcessingInstruction to control the document declaration,
but I found it odd that the documentation states:
This method can be used to write the XML declaration (rather than
WriteStartDocument). This could result in the encoding attribute being
incorrectly written. For example, the following C# code would result in an
invalid XML document because the default encoding is UTF-8.

XmlWriter writer = XmlWriter.Create("output.xml");
writer.WriteProcessingInstruction("xml", "version='1.0' encoding='UTF-16'");
writer.WriteStartElement("root");
writer.Close();

If the default encoding is UTF-8, why does the XmlWriter by default emit
<?xml version="1.0" encoding="utf-16"?> if I don't explicitly specify that
processing instruction?

Any enlightenment greatly appreciated.

Zoe


Martin Honnen

Jul 11, 2007, 8:32:27 AM
Zoe Hart wrote:

> If the default encoding is UTF-8, why does the XmlWriter by default emit
> <?xml version="1.0" encoding="utf-16"?> if I don't explicitly specify that
> processing instruction?

It does not; the default encoding is UTF-8, at least when you create a
file (or write to a stream), e.g. with

using (XmlWriter xmlWriter = XmlWriter.Create(@"file.xml"))
{
    xmlWriter.WriteStartDocument();
    xmlWriter.WriteStartElement("root");
    xmlWriter.WriteElementString("foo", "bar");
    xmlWriter.WriteEndElement();
    xmlWriter.WriteEndDocument();
}

then the contents of the written file are
<?xml version="1.0" encoding="utf-8"?><root><foo>bar</foo></root>

The encoding UTF-16 is used when you use an XmlWriter over a
StringWriter as in

StringWriter stringWriter = new StringWriter();
using (XmlWriter xmlWriter = XmlWriter.Create(stringWriter))
{
    xmlWriter.WriteStartDocument();
    xmlWriter.WriteStartElement("root");
    xmlWriter.WriteElementString("foo", "bar");
    xmlWriter.WriteEndElement();
    xmlWriter.WriteEndDocument();
}
Console.WriteLine(stringWriter.ToString());
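
For what it's worth, the UTF-16 comes from the StringWriter itself: an XmlWriter created over a TextWriter takes the encoding name for the declaration from TextWriter.Encoding, and a plain StringWriter reports UTF-16 there. A minimal sketch to check this (the class and method names are made up for illustration):

```csharp
using System;
using System.IO;

static class StringWriterEncodingCheck
{
    // Hypothetical helper: returns the encoding name a default StringWriter reports.
    public static string DefaultStringWriterEncodingName()
    {
        using (StringWriter stringWriter = new StringWriter())
        {
            // WebName is the IANA-style name, the form used in XML declarations.
            return stringWriter.Encoding.WebName;
        }
    }

    static void Main()
    {
        Console.WriteLine(DefaultStringWriterEncodingName()); // prints "utf-16"
    }
}
```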


If you want to change that, use your own subclass of StringWriter, e.g.

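// Needs: using System.IO; using System.Text;
public class StringWriterWithEncoding : StringWriter {
    private Encoding myEncoding;

    // XmlWriter reads this property to pick the encoding name for the declaration.
    public override Encoding Encoding {
        get {
            return myEncoding;
        }
    }

    public StringWriterWithEncoding (Encoding encoding) : base() {
        myEncoding = encoding;
    }
}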

then you can do

StringWriter stringWriter = new StringWriterWithEncoding(Encoding.UTF8);
using (XmlWriter xmlWriter = XmlWriter.Create(stringWriter))
{
    xmlWriter.WriteStartDocument();
    xmlWriter.WriteStartElement("root");
    xmlWriter.WriteElementString("foo", "bar");
    xmlWriter.WriteEndElement();
    xmlWriter.WriteEndDocument();
}
Console.WriteLine(stringWriter.ToString());
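
One caveat, sketched under the assumption that the string is later sent over the wire: the subclass only changes what the declaration says. The result is still an ordinary .NET string, so the bytes become UTF-8 only when you encode the string yourself. The StringWriterWithEncoding class is repeated here so the sketch compiles on its own; DeclarationVersusBytes and BuildXml are illustrative names.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Same idea as the StringWriterWithEncoding class above.
class StringWriterWithEncoding : StringWriter
{
    private readonly Encoding myEncoding;
    public StringWriterWithEncoding(Encoding encoding) { myEncoding = encoding; }
    public override Encoding Encoding { get { return myEncoding; } }
}

static class DeclarationVersusBytes
{
    // Hypothetical helper: builds the sample document as a string.
    public static string BuildXml()
    {
        StringWriter stringWriter = new StringWriterWithEncoding(Encoding.UTF8);
        using (XmlWriter xmlWriter = XmlWriter.Create(stringWriter))
        {
            xmlWriter.WriteStartDocument();
            xmlWriter.WriteStartElement("root");
            xmlWriter.WriteElementString("foo", "bar");
            xmlWriter.WriteEndElement();
            xmlWriter.WriteEndDocument();
        }
        return stringWriter.ToString();
    }

    static void Main()
    {
        string xml = BuildXml();
        // Encode with the same encoding the declaration names,
        // otherwise the declaration and the actual bytes disagree.
        byte[] body = Encoding.UTF8.GetBytes(xml);
        Console.WriteLine(xml);
        Console.WriteLine(body.Length + " bytes");
    }
}
```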
--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/

Zoe Hart

Jul 11, 2007, 8:52:28 AM
I had to read your response several times before it all sank in, but it
really cleared things up. I created my XmlWriter initially with
XmlWriter.Create(StringBuilder). Not StringWriter, but apparently the same
behavior you describe in which UTF-16 is the default encoding. I tried using
XmlWriterSettings.Encoding, setting it to UTF8Encoding and then creating my
XmlWriter with XmlWriter.Create(StringBuilder, XmlWriterSettings). That may
(I assume) have changed the encoding used, but appeared to have no impact on
the encoding specified in the XML document declaration. In both of the above
scenarios I did not have anywhere in my code either
XmlWriter.WriteStartDocument or XmlWriter.WriteProcessingInstruction, but the
declaration is included in the output automatically (if you don't specify
XmlWriterSettings.OmitXmlDeclaration = True). So, since I want UTF-8
encoding and I'm streaming to a StringBuilder, I'll keep my
XmlWriterSettings.Encoding and use XmlWriter.WriteProcessingInstruction to
explicitly set the contents of the XML declaration to reflect the UTF-8
encoding.
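
A possible alternative, offered as a sketch and not as the only way: as far as I know, XmlWriterSettings.Encoding only takes effect when the XmlWriter does the byte encoding itself, that is, when it writes to a Stream or a file. With a StringBuilder or TextWriter the setting is ignored, which would match the behavior described above. Writing to a MemoryStream gives bytes that really are UTF-8, with a matching declaration and no manual processing instruction. SoapXmlSketch and BuildUtf8Xml are made-up names, and the root/foo document is just the sample from earlier in the thread.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class SoapXmlSketch
{
    // Hypothetical helper: serializes the sample document to UTF-8 bytes.
    public static byte[] BuildUtf8Xml()
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        // UTF8Encoding(false) means UTF-8 without a byte order mark.
        settings.Encoding = new UTF8Encoding(false);

        using (MemoryStream stream = new MemoryStream())
        {
            using (XmlWriter xmlWriter = XmlWriter.Create(stream, settings))
            {
                xmlWriter.WriteStartDocument();
                xmlWriter.WriteStartElement("root");
                xmlWriter.WriteElementString("foo", "bar");
                xmlWriter.WriteEndElement();
                xmlWriter.WriteEndDocument();
            } // disposing the writer flushes everything into the stream
            return stream.ToArray();
        }
    }

    static void Main()
    {
        byte[] payload = BuildUtf8Xml();
        // Decode only for display; the byte array is what would go into the request body.
        Console.WriteLine(Encoding.UTF8.GetString(payload));
    }
}
```

The byte array can be written to the request stream directly, so the declaration and the bytes on the wire agree.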

Thanks for your quick response. It helped to make sense of the behavior I was
seeing.

Zoe

"Martin Honnen" <maho...@yahoo.de> wrote in message
news:%23UUXOe7...@TK2MSFTNGP03.phx.gbl...
