Character Encoding for Incoming Emails

Cy Burnot

Oct 31, 2011, 7:16:25 PM
I have character encoding for incoming emails set to user-defined.

Yet, there will be graphical characters in some of the emails. I thought
that user-defined would eliminate those. If not, what is user-defined?
Where do I make the definition?

Thanks.

Greywolf

Nov 1, 2011, 10:54:30 AM
Well, actually, all characters are "graphical". Your system passes the
image-data to the display subsystem so that the characters show on your
screen in the font(s) you've selected. So what do you mean by "graphical
characters"? Smileys? Non-letters? "Weird little boxes"?

NB that in HTML it's possible to use any font supplied by the system on
which the message is composed. When your system doesn't have the font
specified in the message, it will use one that the OS designers think is
"close", or else use the default font. If there's a character mismatch,
the system does its best, which can result in strange stuff.

AFAIK, "user defined" means "default" if you don't specify anything
else. That's Western ISO-8859-1. This encodes the characters for the
standard alphabet. There's also Western-ISO-8859-15, which presumably
encodes more characters. You could try that. If it does what you want,
please report back.
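
One way to check how the two Western encodings differ is to decode every byte value under both and list the mismatches. The sketch below uses Python's built-in codecs (Python is an assumption here; nothing in this thread uses it, and `differing_bytes` is a made-up helper name). It suggests that ISO-8859-15 is the same size as ISO-8859-1 and only swaps a handful of symbols, most notably adding the euro sign, rather than encoding more characters.

```python
# Compare ISO-8859-1 and ISO-8859-15 byte by byte.
def differing_bytes(enc_a="iso-8859-1", enc_b="iso-8859-15"):
    """Return {byte: (char_in_a, char_in_b)} for bytes the two encodings map differently."""
    diffs = {}
    for value in range(256):
        raw = bytes([value])
        a, b = raw.decode(enc_a), raw.decode(enc_b)
        if a != b:
            diffs[value] = (a, b)
    return diffs


if __name__ == "__main__":
    # Print each byte whose meaning changes between the two encodings.
    for value, (a, b) in differing_bytes().items():
        print(f"0x{value:02X}: {a!r} -> {b!r}")
```

So switching between the two would only change how a few symbols are shown; it would not help with characters that neither encoding contains.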

Finally, some graphical characters are used by default: TB uses bars to
show threading in an HTML message.

HTH
Wolf K.

Cy Burnot

Nov 1, 2011, 7:40:08 PM
Greywolf has written on 11/1/2011 10:54 AM:
> On 31/10/2011 7:16 PM, Cy Burnot wrote:
>> I have character encoding for incoming emails set to user-defined.
>>
>> Yet, there will be graphical characters in some of the emails. I thought
>> that user-defined would eliminate those. If not, what is user-defined?
>> Where do I make the definition?
>>
>> Thanks.
>
> Well, actually, all characters are "graphical". Your system passes the
> image-data to the display subsystem so that the characters show on your
> screen in the font(s) you've selected. So what do you mean by "graphical
> characters"? Smileys? Non-letters? "Weird little boxes"?

Black diamond containing a white question mark. =A0

Message header says:

Content-type: text/html; charset=ISO-8859-1
Content-transfer-encoding: quoted-printable

> NB that in HTML it's possible to use any font supplied by the system on
> which the message is composed. When your system doesn't have the font
> specified in the message, it will use one that the OS designers think is
> "close", or else use the default font. If there's a character mismatch,
> the system does its best, which can result in strange stuff.
>
> AFAIK, "user defined" means "default" if you don't specify anything
> else. That's Western ISO-8859-1. This encodes the characters for the
> standard alphabet. There's also Western-ISO-8859-15, which presumably
> encodes more characters. You could try that. If it does what you want,
> please report back.

How do I specify that other than by selecting it in place of the "user
defined" selection?


Greywolf

Nov 1, 2011, 10:43:57 PM
On 01/11/2011 7:40 PM, Cy Burnot wrote:
> Greywolf has written on 11/1/2011 10:54 AM:

>> [...] So what do you mean by "graphical
>> characters"? Smileys? Non-letters? "Weird little boxes"?
>
> Black diamond containing a white question mark. =A0

That means that the system (not Tbird) doesn't know how to display the
character specified at that location. Keep in mind that when it comes to
displaying plain text on screen, Tbird has to rely on the operating
system to display characters. Every character is represented by a code,
and the code is interpreted by the OS in terms of the character encoding
specified, and the selected font. Problem is, some encodings and some
fonts are incomplete: some characters are missing. I think that's what's
happening.

I've noticed that the black diamond with a question mark shows up when I
receive messages from some of my European relatives. Try Central
European for incoming messages. It includes the 26 letters plus umlauts
and such. Or you may be able to get rid of the black diamond by choosing
a different font for displaying messages (I use Times Roman).
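
For what it's worth, the black diamond with a question mark is usually the Unicode replacement character (U+FFFD), which a decoder emits when it meets bytes that are invalid for the encoding it has been told to assume. A minimal Python sketch of that effect, assuming the "=A0" above is a quoted-printable escape for the byte 0xA0 (the sample text and the helper name `decode_qp` are hypothetical):

```python
import quopri


def decode_qp(raw: bytes, charset: str) -> str:
    """Undo quoted-printable, then decode the resulting bytes with the given charset."""
    return quopri.decodestring(raw).decode(charset, errors="replace")


if __name__ == "__main__":
    sample = b"Hello=A0world"  # hypothetical body fragment containing byte 0xA0
    # Declared charset: 0xA0 is a no-break space in ISO-8859-1.
    print(repr(decode_qp(sample, "iso-8859-1")))
    # Wrong charset: 0xA0 alone is not valid UTF-8, so U+FFFD appears instead.
    print(repr(decode_qp(sample, "utf-8")))
```

If that is what is happening, the fix lies in which encoding gets applied to the message rather than in the font.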

> Message header says:
>
> Content-type: text/html; charset=ISO-8859-1
> Content-transfer-encoding: quoted-printable

OK, that means the message body is HTML, declared as ISO-8859-1, and
sent in quoted-printable transfer encoding (non-ASCII bytes are written
as =XX). How it's displayed, as text or HTML, depends on how the
recipient's e-mail client (program) handles it. In Tbird, you specify that the message be
displayed as plain text or HTML. Try

[...]
>> AFAIK, "user defined" means "default" if you don't specify anything
>> else. That's Western ISO-8859-1. This encodes the characters for the
>> standard alphabet. There's also Western-ISO-8859-15, which presumably
>> encodes more characters. You could try that. If it does what you want,
>> please report back.
>
> How do I specify that other than by selecting it in place of the "user
> defined" selection?

That's the only way to specify it:
Tools > Options > Display > Formatting > Fonts Advanced > Character
Encodings > Incoming Mail > select the encoding.

As for "user defined", I believe that refers to some character-code set
that you've created or imported from another source. If so, I don't know
how to incorporate that code set into TBird's list.

HTH
Wolf K.

Greywolf

Nov 1, 2011, 10:47:44 PM
On 01/11/2011 7:40 PM, Cy Burnot wrote:
[...]
> How do I specify that other than by selecting it in place of the "user
> defined" selection?
>
>

Also experiment with Font Control settings, and "When possible, allow
messages..." at bottom of ..Fonts Advanced.

HTH
Wolf K.

Horatio

Dec 18, 2011, 8:35:44 AM
I'm also having an issue with character display in some messages that
were displaying correctly before TB8, e.g.:

"He was his country’s first democratically elected president after the
nonviolent “Velvet Revolution†that ended four decades of repression
by a regime he ridiculed as “Absurdistan.†"

That example is from a New York Times News Alert - it seems some special
characters are not interpreted correctly. My setup for message display
and character encoding is default. Any idea what is causing this?

Thanks,

Horatio
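
The pattern in that example (an accented "a", a euro sign, then a third symbol in place of each curly quote) is what UTF-8 text tends to look like when it is decoded as a Western single-byte charset. A small Python sketch reproducing it, assuming the Western charset behaves like Windows-1252 (`garble` and `ungarble` are made-up names for illustration):

```python
def garble(text: str, wrong: str = "cp1252") -> str:
    """Encode as UTF-8, then decode the bytes with the wrong single-byte charset."""
    return text.encode("utf-8").decode(wrong, errors="replace")


def ungarble(text: str, wrong: str = "cp1252") -> str:
    """Reverse garble(); only works if every byte survived the wrong decode."""
    return text.encode(wrong).decode("utf-8")


if __name__ == "__main__":
    original = "country\u2019s \u201cVelvet"  # curly apostrophe and opening quote
    broken = garble(original)
    print(broken)                        # shows the three-character clusters
    print(ungarble(broken) == original)  # the damage is reversible here
```

The closing curly quote contains the byte 0x9D, which Python's cp1252 codec leaves undefined, so it cannot be round-tripped this way; that may be why it looks different from the other two in the example.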

Dave Pyles

Dec 18, 2011, 9:22:28 AM
Tools>Options>Display>Formatting tab>Advanced button. Try setting the
character encoding for incoming mail to Western (ISO-8859-1).
Dave Pyles

Horatio

Dec 18, 2011, 9:37:02 AM
On 2011-12-18 09:22, Dave Pyles wrote:
> Tools>Options>Display>Formatting tab>Advanced button. Try setting the
> character encoding for incoming mail to Western (ISO-8859-1).
> Dave Pyles

I've tried that already. The problem is still there - the only
difference is the kind of symbols showing up in place of the correct
characters.

Horatio

Dave Pyles

Dec 18, 2011, 9:50:22 AM
Open the message, hit <CTRL>U to see the message source, then hit
<CTRL>F and type charset into the search box. Set the incoming mail
encoding to whatever encoding is shown on that line of the headers.
Dave Pyles
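
The same check can be done outside Thunderbird on a saved copy of the message source. A rough Python sketch using the standard `email` package; the sample message is invented, and `declared_charsets` is a hypothetical helper name:

```python
from email import message_from_string


def declared_charsets(source: str):
    """Return (content_type, charset) for each non-container part of a message."""
    msg = message_from_string(source)
    return [
        (part.get_content_type(), part.get_content_charset())
        for part in msg.walk()
        if not part.is_multipart()
    ]


if __name__ == "__main__":
    # Invented message source, shaped like what CTRL+U would show.
    sample = (
        "Subject: News Alert\n"
        "Content-Type: text/html; charset=UTF-8\n"
        "Content-Transfer-Encoding: quoted-printable\n"
        "\n"
        "<p>country=E2=80=99s</p>\n"
    )
    print(declared_charsets(sample))
```

A charset of None in the result would mean that part declares no encoding at all, in which case the reader has to fall back on its default.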

Horatio

Dec 18, 2011, 10:01:12 AM
On 2011-12-18 09:50, Dave Pyles wrote:
> Open the message, hit <CTRL>U to see the message source, then hit
> <CTRL>F and type charset into the search box. Set the incoming mail
> encoding to whatever encoding is shown on that line of the headers.
> Dave Pyles
Thanks, I'll try that and report back.

Horatio

Horatio

Dec 18, 2011, 10:16:50 AM
On 2011-12-18 10:01, Horatio wrote:
> On 2011-12-18 09:50, Dave Pyles wrote:
>> Open the message, hit <CTRL>U to see the message source, then hit
>> <CTRL>F and type charset into the search box. Set the incoming mail
>> encoding to whatever encoding is shown on that line of the headers.
>> Dave Pyles
> Thanks, I'll try that and report back.
>
> Horatio
Well, it looks like the problematic message uses the UTF-8 charset.
Unfortunately, if I change TB's default incoming charset to that, other
incoming messages that use the Western (ISO-8859-1) charset show the
same type of problem. So this looks like an unsolvable issue at the moment.

Horatio
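
In principle the two kinds of message need not conflict: each one can be decoded with the charset it declares, with the configured default used only for messages that declare none. A minimal Python sketch of that idea (not Thunderbird's actual code; `body_text` and both sample messages are made up for illustration, and only single-part messages are handled):

```python
from email import message_from_bytes


def body_text(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode a single-part message with its own charset, else the fallback."""
    msg = message_from_bytes(raw)
    payload = msg.get_payload(decode=True)  # undoes quoted-printable/base64
    charset = msg.get_content_charset() or fallback
    return payload.decode(charset, errors="replace")


if __name__ == "__main__":
    utf8_msg = (
        b"Content-Type: text/plain; charset=UTF-8\n"
        b"Content-Transfer-Encoding: quoted-printable\n\n"
        b"country=E2=80=99s\n"
    )
    latin1_msg = (
        b"Content-Type: text/plain; charset=ISO-8859-1\n"
        b"Content-Transfer-Encoding: quoted-printable\n\n"
        b"caf=E9\n"
    )
    # Each message is decoded by its own header, so both come out intact.
    print(body_text(utf8_msg).strip())
    print(body_text(latin1_msg).strip())
```

Forcing one charset onto every message is the case where one of the two samples would come out garbled.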

Dave Pyles

Dec 18, 2011, 10:58:52 AM
You could try a method that was suggested in this discussion:
http://getsatisfaction.com/mozilla_messaging/topics/why_cant_thunderbird_detect_the_character_encoding
(Watch the wrap in the link)

Tools> Options> Advanced> General tab> Config editor button:
Type mailnews.force_charset_override in the filter. If it's set to
true, double click to toggle it to false which is the default.
Dave Pyles

Horatio

Dec 18, 2011, 11:03:41 AM
On 2011-12-18 10:58, Dave Pyles wrote:
> [...]
I'll try that. Thanks for the suggestions.

Horatio