
Weird characters


Nemo

Feb 9, 2012, 12:10:45 AM
Content over at
http://www.ducea.com/2006/05/14/tip-how-to-sort-folders-by-size-with-one-command-line-in-linux/
illustrates a problem I encounter not infrequently.

There's a comment near the top of the page by Jason dated 20th July 2006.
Everything displays fine when I open the page with Firefox 11 (beta,
Ubuntu) directly.

If I save the page to disk, edit the html file with gedit or geany and
then view the edited page offline in Firefox, part of that comment
displays weird characters instead of the variety of "smart quotes" and
"dashes" that Jason has used.

Here are examples of the weird stuff:

’
ââ
¬â„¢s
“-
“
â€
″
Â
—–

Looking further down the original page, one can see that someone else
(Kunchok, post dated 2nd November 2007) isn't too happy either.

I'm making my own "glossary" to deal with this problem, but maybe there
is an easier, not too technical approach?

(I realize that this may have nothing to do with Firefox, per se, so even
pointers to a solution elsewhere are welcome.)

Greywolf

Feb 9, 2012, 9:07:43 AM
On 09/02/2012 12:10 AM, Nemo wrote:
> (I realize that this may have nothing to do with Firefox, per se, so even
> pointers to a solution elsewhere are welcome.)

You're right, it's not a FF issue, it's a character-code and font issue.
The operating system handles this, but an application can specify a
code-set different from the default. I'll start with a (partial, not
guaranteed) cure, and add a brief explanation.

Cure:
"Unicode" is supposed to cure this, but AFAICT it isn't 100% effective.
You can select a different character encoding for FF to use:
Options > Content pane > Fonts line > Advanced pane > Character
Encodings line > select the one you want.

FF will use the encoding you specify, but that won't eliminate weird
characters. Personally, I just put up with the occasional weird characters.

Here are the dirty details as I understand them:

a) Characters are represented by binary codes. There are several
code-sets or "encodings". The default is set so that the system can
display the language(s) you presumably use. Since languages are written
using some common and some different characters, each code-set includes
codes for the characters of a given language.

b) Fonts consist of "glyphs", i.e., graphic shapes, one for each symbol
used in a particular writing system.

c) The OS matches the character code with the glyph, and displays it.

d) Several mismatches are possible:
i) the code set at the source (web page) is for a different character
set than yours;
ii) the font doesn't have glyphs for all the characters specified in a
code-set;
iii) the font has different glyphs for some of the codes.

Result: some characters will not display correctly on your system.
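
Mismatch (i) can be reproduced in a few lines of Python. This is only a sketch: it assumes the page's bytes are UTF-8 and that the viewer falls back to Windows-1252 (`cp1252`). The thread does not say which fallback was actually in effect, and the sample string is made up.

```python
# Sample text with a curly apostrophe (U+2019) and an em dash (U+2014),
# the kind of "smart" punctuation mentioned in the original post.
original = "it\u2019s \u2014 fine"

# The page is stored as UTF-8 bytes ...
utf8_bytes = original.encode("utf-8")

# ... but is read back with the wrong code-set (assumed here: Windows-1252).
garbled = utf8_bytes.decode("cp1252")
print(garbled)

# Reversing the wrong step recovers the text, as long as every byte
# has a mapping in cp1252 (a few byte values do not).
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)
```

Each multi-byte UTF-8 character turns into two or three single-byte characters, which is why one apostrophe becomes a short run of accented letters and symbols.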

HTH
Wolf K.

»Q«

Feb 9, 2012, 12:14:28 PM
On Wed, 08 Feb 2012 23:10:45 -0600
Nemo <chim...@in.com> wrote:

> Content over at
> http://www.ducea.com/2006/05/14/tip-how-to-sort-folders-by-size-with-one-command-line-in-linux/
> illustrates a problem I encounter not infrequently.
>
> There's a comment near the top of the page by Jason dated 20th July
> 2006. Everything displays fine when I open the page with Firefox 11
> (beta, Ubuntu) directly.
>
> If I save the page to disk, edit the html file with gedit or geany
> and then view the edited page offline in Firefox, part of that
> comment displays weird characters instead of the variety of "smart
> quotes" and "dashes" that Jason has used.

[snip]

When a browser gets a page from a web server, the web server tells the
browser what encoding the page uses. When you view the page offline,
the browser doesn't get that info and uses its fallback setting.
Here's the way to add that info to the page itself, so no server is
needed.

When you're viewing a page on the web, you can use Tools » Page Info to
get the page's encoding. In this case it's utf-8.
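
For the curious: the server announces the encoding in the HTTP Content-Type header, using the same "text/html; charset=..." value that appears in a meta tag. A minimal sketch of pulling the charset out of such a value with Python's standard library follows. The function name `charset_from_content_type` is made up for illustration, and the header strings are examples, not captured from the site above.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset named in a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the name and returns None when
    # the header carries no charset parameter.
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

When the result is None, nothing in the header tells the browser which encoding to use, which is the situation a page saved to disk ends up in.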

Then after saving, add a meta tag to the head section of the page. In
this case,

<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
...
</head>

When dealing with HTML instead of XHTML, leave out the slash that closes
the meta tag, e.g.,

<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
...
</head>
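
If there are many saved pages, the edit can be scripted. The following is a rough sketch, not a robust HTML tool: the helper name `add_charset_meta` is hypothetical, it uses a simple regular expression rather than a real parser, and it assumes the page really is UTF-8 (check with Page Info first).

```python
import re

# The HTML form of the tag shown above (no closing slash).
META = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'


def add_charset_meta(html):
    """Insert META right after the opening <head> tag.

    The text is returned unchanged if a meta tag already mentions a
    charset, or if no <head> tag is found.
    """
    if re.search(r"<meta[^>]+charset", html, flags=re.IGNORECASE):
        return html
    # (\s[^>]*)? allows attributes on <head> but does not match <header>.
    return re.sub(
        r"<head(\s[^>]*)?>",
        lambda m: m.group(0) + "\n" + META,
        html,
        count=1,
        flags=re.IGNORECASE,
    )


sample = "<html><head><title>Saved page</title></head><body></body></html>"
print(add_charset_meta(sample))
```

To apply it to a file, read the file with `open(path, encoding="utf-8")`, pass the text through the function, and write the result back using the same encoding.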

Nemo

Feb 11, 2012, 12:17:07 PM
Thank you, Greywolf and >>Q<<! I'm glad I asked even though the question
was not strictly related to Firefox. Including the code >>Q<< suggests
solves much of the problem.
In short, sticking "<meta http-equiv="Content-Type" content="text/html;
charset=utf-8">" into the head tag does the trick!
Thanks again for your replies!