I'm having problems with Firefox 3 beta 5 when sending UTF-8
characters via an XMLHttpRequest.
I have read the following page: http://developer.mozilla.org/en/docs/XMLHttpRequest,
which states:
Note: Versions of Firefox prior to version 3 always send the request
using UTF-8 encoding; Firefox 3 properly sends the document using the
encoding specified by data.xmlEncoding, or UTF-8 if no encoding is
specified.
In my test example, I attempt to explicitly set the Content-Type
header to "text/xml;charset=utf-8", as well as verify that the
xmlEncoding is null.
However, when I view the request headers in FF3 I see:
Content-Type text/xml; charset=ISO-8859-1
and the content on the server side is garbled.
In FF2 I see:
Content-Type text/xml;charset=UTF-8
and the content is correctly encoded as UTF-8 on the server side.
Here's an example:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title></title>
<script language="javascript">
var dom = document.implementation.createDocument( "", "", null );
var element = dom.createElement( "content" );
element.appendChild( dom.createTextNode( "üßäöüß" ) );
dom.appendChild( element );
var request = new XMLHttpRequest();
var url = "someURL";
request.open( "POST", url, true );
request.setRequestHeader( "Content-Type", "text/xml;charset=utf-8" );
/* Firefox 3 should take its cue from data.xmlEncoding, according to
   http://developer.mozilla.org/en/docs/XMLHttpRequest */
alert( "xmlEncoding? " + dom.xmlEncoding );
request.send( dom );
</script>
</head>
<body>
</body>
</html>
Does anyone have any idea on how to control this? Or how to set the
xmlEncoding property if necessary? Or what I'm doing wrong ;) ?
Thanks,
Peter
First of all, the documentation doesn't seem to be correct. Firefox will use
the encoding of the Document you pass to send(). If the document doesn't have
an XML prolog, its xmlEncoding will always be null, whereas the encoding of the
document may well not be. What XMLHttpRequest is actually using in DOM terms is
the non-standard document.characterSet.
In your example, you create a document via createDocument(). Sadly, there is no
way to specify the encoding to use for such documents, and it defaults to
ISO-8859-1.
> However, when I view the request headers in FF3 I see:
> Content-Type text/xml; charset=ISO-8859-1
Right. That's the change: the actual encoding of the document is now used no
matter what the page claims it's encoded as....
> and the content on the server side is garbled
Only if you decode it as UTF-8 without paying attention to what the data
actually is, right?
> Does anyone have any idea on how to control this? Or how to set the
> xmlEncoding property if necessary? Or what I'm doing wrong ;) ?
I don't think you can change the encoding of an existing Document object. You
can create a document with an arbitrary encoding by using DOMParser on a string
with the proper XML prolog, I would think.
-Boris
Though perhaps it really should default to UTF-8. Might be worth filing a bug
on this...
-Boris
I did that: https://bugzilla.mozilla.org/show_bug.cgi?id=431701
--
Martin Honnen
http://JavaScript.FAQTs.com/
Yes, that's right; I had assumed that all content was coming in as
UTF-8.
>
> I don't think you can change the encoding of an existing Document object. You
> can create a document with an arbitrary encoding by using DOMParser on a string
> with the proper XML prolog, I would think.
>
Yes, I have implemented this using something similar to:
var domDocument = new DOMParser().parseFromString(
    "<?xml version='1.0' encoding='UTF-8'?><foobar/>", "text/xml" );
and everything works as desired, although I'm somewhat concerned about
the performance tradeoff between these two implementations.
A similar bug was raised under:
https://bugzilla.mozilla.org/show_bug.cgi?id=407213
May wish to mark as duplicate of 431701.
Thanks for your help.
-Peter