? in XMLHttpRequest.ResponseText

Elizabeth

Feb 25, 2006, 3:35:40 PM

I'm fetching some HTML files with XMLHttpRequest and dumping the
responseText into block elements; this works fine except that single and
double quotes are displayed as question marks (inside a black diamond in
Firefox).

What's going on ? What is the workaround ? I've tried this:

divElement.innerHTML = x.responseText.replace(/\?/g, "'")

but it does nothing ... even if it did work it would not be distinguishing "
from '

Thanks for any help ...

-E-

TheBagbournes

Feb 26, 2006, 4:08:30 AM

Elizabeth wrote:
> I'm fetching some HTML files with XMLHttpRequest and dumping the
> responseText into block elements; this works fine except that single and
> double quotes are displayed as question marks (inside a black diamond in
> Firefox).

It sounds like the server is telling the browser the wrong character set
in its response headers.

If you are in control of the server (i.e., you're writing the
servlet/script which produces the text on the server), ensure that the
"Content-Type" header has the appropriate charset parameter.

e.g.

Content-Type: text/html; charset=UTF-8

or whatever the char set is.
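
To check what the server actually declares, the header can be read back from the response in script. The sketch below is only an illustration: charsetFromContentType is a hypothetical helper name, and it assumes the header follows the usual "type/subtype; charset=value" layout.

```javascript
// Hypothetical helper: pull the charset parameter out of a Content-Type value.
// Returns the charset in lower case, or null when none is declared.
function charsetFromContentType(headerValue) {
  if (!headerValue) {
    return null;
  }
  // Parameters follow the media type, separated by ";" and written name=value.
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(headerValue);
  return match ? match[1].toLowerCase() : null;
}

// In a browser the value would come from the finished request, for example:
//   charsetFromContentType(x.getResponseHeader("Content-Type"))
```

If this returns null, no charset was declared and the browser has to pick one on its own.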

Elizabeth

Feb 26, 2006, 4:33:06 PM

"TheBagbournes" <no...@noway.com> wrote in message
news:dtrr6e$98b$1...@nwrdmz01.dmz.ncs.ea.ibs-infra.bt.com...

> > I'm fetching some HTML files with XMLHttpRequest and dumping the
> > responseText into block elements; this works fine except that single and
> > double quotes are displayed as question marks (inside a black diamond
> > in Firefox).
>
> It sounds like the server is telling the browser the wrong character set
> in its response headers.
>
> If you are in control of the server (i.e., you're writing the
> servlet/script which produces the text on the server), ensure that the
> "Content-Type" header has the appropriate charset parameter.

The Content-Type charset header is UTF-8 ... that doesn't appear to be the
problem; I'm still baffled by it. Thanks for replying and please let me
know if you have any other ideas/suggestions ..

-E-

VK

Feb 26, 2006, 4:58:09 PM

Elizabeth wrote:
> The Content-Type charset header is UTF-8 ... that doesn't appear to be the
> problem; I'm still baffled by it. Thanks for replying and please let me
> know if you have any other ideas/suggestions ..

My wild guess is that you're retrieving a page with "typography" quotes
in UTF-8 Embedded or Microsoft Embedded form.

Open the file and check that all quotes in the text are escaped
properly: &quot; and &#39;

If you see any "typography" quotes (where the left quote differs from the
right one) or some strange entities like \“ or \” then I
bet you've found the reason.
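
If the text does contain typographic quotes, one option is to turn them into numeric character references before the markup is served or inserted. This is a minimal sketch, not a complete sanitizer: escapeTypographicQuotes is a hypothetical name, and it assumes the string has already been decoded correctly, so that the curly quotes are real U+2018, U+2019, U+201C and U+201D characters.

```javascript
// Hypothetical helper: replace curly quotes with numeric character references
// so that the markup itself contains only ASCII.
function escapeTypographicQuotes(text) {
  // U+2018/U+2019 are the single curly quotes, U+201C/U+201D the double ones.
  return text.replace(/[\u2018\u2019\u201C\u201D]/g, function (ch) {
    return "&#" + ch.charCodeAt(0) + ";";
  });
}
```

Plain ASCII quotes are left untouched, so already clean text passes through unchanged.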

Not a JavaScript problem though.

Elizabeth

Feb 26, 2006, 5:29:18 PM

"VK" <school...@yahoo.com> wrote in message
news:1140991089....@t39g2000cwt.googlegroups.com...

>
> Elizabeth wrote:
> > The Content-Type charset header is UTF-8 ... that doesn't appear to be
> > the problem; I'm still baffled by it. Thanks for replying and please let me
> > know if you have any other ideas/suggestions ..
>
> My wild guess is that you're retrieving a page with "typography" quotes
> in UTF-8 Embedded or Microsoft Embedded form.

Thanks. Your wild guess is correct. I guess someone had given me that text
from a word processor file. But why does it display correctly when I just
navigate to the HTML file in the browser and NOT when I GET it via
XMLHttpRequest and assign it to <block>.innerHTML ?

-E-

Ian Collins

Feb 26, 2006, 10:23:09 PM

Maybe because the HTML is parsed when you assign it with innerHTML.

--
Ian Collins.

Elizabeth

Feb 26, 2006, 10:32:55 PM

"Ian Collins" <ian-...@hotmail.com> wrote in message
news:11410105...@drone2-svc-skyt.qsi.net.nz...

> >>My wild guess is that you're retrieving a page with "typography" quotes
> >>in UTF-8 Embedded or Microsoft Embedded form.
> >
> >
> > Thanks. Your wild guess is correct. I guess someone had given me that
> > text from a word processor file. But why does it display correctly when I
> > just navigate to the HTML file in the browser and NOT when I GET it via
> > XMLHttpRequest and assign it to <block>.innerHTML ?
> >
> Maybe because the HTML is parsed when you assign it with innerHTML.

maybe, but what piece of code is doing the parsing ? and why can't it
understand the "typography quotes" ?

VK

Feb 27, 2006, 4:12:11 AM

Elizabeth wrote:
> Thanks. Your wild guess is correct. I guess someone had given me that
> text from a word processor file. But why does it display correctly when I
> just navigate to the HTML file in the browser and NOT when I GET it via
> XMLHttpRequest and assign it to <block>.innerHTML ?

For a number of reasons which would take a page to lay out, plus I
might switch to profanity - as happens every time I need to
explain the Unicode.org production.

To save your ears :-) I'll say only the basics. UTF-8 is not something to
display - it is an encoding to deliver multi-byte chars as single-byte
sequences to be transformed back into Unicode characters on the
recipient side. With a direct server <=> browser stream, UTF-8 is
parsed by the browser's HTML parser, which is aware of Unicode-16 as well as
of Unicode-24 and other Unicode.org sick fantasies. With any ajaxoid,
this stream is parsed by the JavaScript Unicode-16 parser. That
leaves hell's doors wide open, and not only for quotes.
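
One way to probe whether the decoding step is at fault is to tell the request which charset to use before sending it. The sketch below is an experiment under stated assumptions, not a confirmed fix for this thread: fetchTextAs and "page.html" are hypothetical names, and windows-1252 is only a guess at what a word processor export might contain. overrideMimeType is a real XMLHttpRequest method, though older Internet Explorer versions do not provide it.

```javascript
// Hypothetical helper: fetch a URL and decode the body with a chosen charset.
// XHR is optional and exists only so a stand-in can be used outside a browser.
function fetchTextAs(url, charset, onText, XHR) {
  const Ctor = XHR || XMLHttpRequest;
  const xhr = new Ctor();
  xhr.open("GET", url, true);
  // Must be called before send(); it overrides the charset the server declared.
  xhr.overrideMimeType("text/html; charset=" + charset);
  xhr.onreadystatechange = function () {
    // readyState 4 means the response is complete.
    if (xhr.readyState === 4 && xhr.status === 200) {
      onText(xhr.responseText);
    }
  };
  xhr.send(null);
}

// Browser usage, assuming divElement is the target block element:
//   fetchTextAs("page.html", "windows-1252", function (text) {
//     divElement.innerHTML = text;
//   });
```

If the quotes come out right with an overridden charset, the bytes in the file and the charset used for responseText did not agree.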

As a side note: you still can have typography quotes in your script by
using Unicode-16 \u201C and \u201D escape sequences.
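
A short illustration of those escape sequences, with curlyQuote as a hypothetical helper name. The point is that the script file itself stays plain ASCII while the resulting string still holds the real curly quote characters.

```javascript
// \u201C and \u201D are the left and right double curly quotes.
const LEFT_DOUBLE_QUOTE = "\u201C";
const RIGHT_DOUBLE_QUOTE = "\u201D";

// Hypothetical helper: wrap a piece of text in curly double quotes.
function curlyQuote(text) {
  return LEFT_DOUBLE_QUOTE + text + RIGHT_DOUBLE_QUOTE;
}
```

Each escape produces exactly one character in the string, so curlyQuote("Hi") has a length of 4.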

VK

Feb 27, 2006, 4:37:52 AM

VK wrote:
> As a side note: you still can have typography quotes in your script by
> using Unicode-16 \u201C and \u201D escape sequences.

Also: the "?" sign you see on the page is not a question mark, so there is
no use looking for it with a RegExp. It's the Unicode replacement character
(U+FFFD), which is used in place of any unrecognised character.
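
The effect can be reproduced in a few lines. This sketch assumes the quotes arrived as the single bytes 0x93 and 0x94, which is how Windows-1252 stores curly double quotes; that byte choice is an assumption for illustration, not something confirmed about the file in question. TextDecoder is used here as a stand-in for whatever decoding the browser applies to responseText.

```javascript
// 0x93 and 0x94 are curly double quotes in Windows-1252, but on their own
// they are not valid UTF-8. 0x48 and 0x69 are the ASCII letters "H" and "i".
const bytes = new Uint8Array([0x93, 0x48, 0x69, 0x94]);

// Decoding the bytes as UTF-8 substitutes U+FFFD for each invalid byte.
const decoded = new TextDecoder("utf-8").decode(bytes);

// The replace() from the first post looks for a literal "?" and finds none.
const attempted = decoded.replace(/\?/g, "'");

// Matching U+FFFD itself does work, but by then the information about which
// quote character was there originally is already gone.
const patched = decoded.replace(/\uFFFD/g, "'");
```

So a replace on U+FFFD can hide the symptom, but it cannot tell " from ', which is why the decoding has to be fixed at the source.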

Ian Collins

Feb 27, 2006, 3:19:15 PM

Elizabeth wrote:
>>
>>Maybe because the HTML is parsed when you assign it with innerHTML.
>
>
> maybe, but what piece of code is doing the parsing ? and why can't it
> understand the "typography quotes" ?
>

The UA parses the string you pass to innerHTML and adds it to the DOM for
the current page.

--
Ian Collins.

Thomas 'PointedEars' Lahn

Feb 27, 2006, 7:46:50 PM

VK wrote:

> To save your ears :-) I'll say only the basics. UTF-8 is not something to
> display - it is an encoding to deliver multi-byte chars as single-byte
> sequences to be transformed back into Unicode characters on the
> recipient side.

Nonsense. The Unicode characters U+0000 to U+007F are encoded with one
UTF-8 code unit, therefore one byte.
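
The byte counts are easy to check with TextEncoder, which always encodes to UTF-8. In this sketch utf8Bytes is a hypothetical helper name.

```javascript
// TextEncoder always produces UTF-8.
const encoder = new TextEncoder();

// Hypothetical helper: return the UTF-8 bytes of a string as a plain array.
function utf8Bytes(text) {
  return Array.from(encoder.encode(text));
}

// An ASCII apostrophe (U+0027) takes a single byte.
const asciiQuote = utf8Bytes("'");

// A left curly double quote (U+201C) takes three bytes: 0xE2 0x80 0x9C.
const curlyQuoteBytes = utf8Bytes("\u201C");
```

The same three bytes, read with a single-byte charset instead of UTF-8, show up as three unrelated characters, which is the other common symptom of a charset mismatch.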

> [snipped further nonsense]

Read <http://unicode.org/faq/>. NOW.

PointedEars

Elizabeth

Feb 27, 2006, 9:58:14 PM

"VK" <school...@yahoo.com> wrote in message

news:1141031531.0...@p10g2000cwp.googlegroups.com...

>
> Elizabeth wrote:
> > Thanks. Your wild guess is correct. I guess someone had given me that
> > text from a word processor file. But why does it display correctly when I
> > just navigate to the HTML file in the browser and NOT when I GET it via
> > XMLHttpRequest and assign it to <block>.innerHTML ?
>
> For a number of reasons which would take a page to lay out, plus I
> might switch to profanity - as happens every time I need to
> explain the Unicode.org production.
>
> To save your ears :-) I'll say only the basics. UTF-8 is not something to
> display - it is an encoding to deliver multi-byte chars as single-byte
> sequences to be transformed back into Unicode characters on the
> recipient side. With a direct server <=> browser stream, UTF-8 is
> parsed by the browser's HTML parser, which is aware of Unicode-16 as well as
> of Unicode-24 and other Unicode.org sick fantasies. With any ajaxoid,
> this stream is parsed by the JavaScript Unicode-16 parser. That
> leaves hell's doors wide open, and not only for quotes.

huh ... guess I don't really understand why a string assignment needs to be
parsed ... or do you consider "parse" to be a synonym for "decode" ? I
guess that might be a form of parsing ... hadn't really thought about it ...

you don't need to spare the profanity on my account; I'm not politically
correct ... it keeps you from learning things

-E-