
Why is the displayed subject line completely different from the one in the message source?


Jack

Aug 31, 2008, 10:58:55 PM
The displayed subject line is:
email рассылки
(I copied it after clicking the Reply button on that message and left out
the "Re:" part.)

The relevant part of message source is:
Subject: =?koi8-r?B?ZW1haWwg0sHT09nMy8k=?=

How is that possible?
Jack


Bruce Hagen

Aug 31, 2008, 11:15:42 PM
I block posts with Cyrillic characters in the Subject field. Just a guess,
but I believe the subject is in Cyrillic and some sort of source
transformation takes place on the news server. I never see them, thankfully.
--

Bruce Hagen
MS-MVP Outlook Express
Imperial Beach, CA


"Jack" <replyto@it> wrote in message
news:OGJHC79C...@TK2MSFTNGP06.phx.gbl...

st

Sep 1, 2008, 8:33:17 AM
That's called encoding, Jack. Back in the '70s, computers didn't transmit 8-bit characters reliably, so e-mail and news messages were limited to Latin letters (7-bit ASCII), and that became the standard. Nowadays, we transmit Cyrillic letters by encoding them:
=? + <charset> + ? + <B=base64> + ? + <a-zA-Z0-9+/>* + <=>* + ?=
Your news reader decodes such fields automatically.

See also RFC 2047 (http://www.ietf.org/rfc/rfc2047.txt)
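
For anyone who wants to see the decoding happen, here is a minimal sketch using Python's standard `email.header` module on the Subject value from Jack's message source. The function name `decode_subject` is just an illustrative choice, and this is not necessarily how any particular news reader does it internally.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject: str) -> str:
    """Turn RFC 2047 encoded-words in a raw Subject value into readable text."""
    # decode_header splits the value into (bytes_or_str, charset) chunks;
    # make_header joins them again, decoding each chunk with its own charset.
    return str(make_header(decode_header(raw_subject)))


# The raw header value quoted from the message source in this thread.
raw = "=?koi8-r?B?ZW1haWwg0sHT09nMy8k=?="
subject = decode_subject(raw)  # "email рассылки"
```

The base64 payload decodes to the ASCII bytes for "email " followed by eight KOI8-R bytes, which is why the source and the displayed subject look nothing alike.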

"Jack" <replyto@it> wrote in message news:OGJHC79C...@TK2MSFTNGP06.phx.gbl...
> The displayed subject line is:
> email рассылки
>

st

Sep 1, 2008, 8:54:34 AM
By the way, you could try setting up a message rule to delete those messages automatically:
look for '=?koi8-r', '=?koi8-u', '=?iso-8859-5', or '=?windows-1251' in the Subject field.

P.S. Your sample subject reads: E-mailing lists :)
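
If your filter runs somewhere you can script, the same check might look like the sketch below. It tests the raw (undecoded) Subject value, since that is where the '=?charset' marker lives. The names `CYRILLIC_MARKERS` and `subject_uses_cyrillic_charset` are made up for this example. The comparison is case-insensitive because RFC 2047 treats charset names as case-independent.

```python
# Markers suggested above; each one starts an RFC 2047 encoded-word.
CYRILLIC_MARKERS = ("=?koi8-r", "=?koi8-u", "=?iso-8859-5", "=?windows-1251")


def subject_uses_cyrillic_charset(raw_subject: str) -> bool:
    """Return True if the raw Subject value contains one of the markers."""
    lowered = raw_subject.lower()  # charset names are case-independent
    return any(marker in lowered for marker in CYRILLIC_MARKERS)
```

Note that the input must be the Subject as it appears in the message source, not the decoded text shown in the message list.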

"st" <s...@sandy.localdomain> wrote in message news:eO6ng7C...@TK2MSFTNGP04.phx.gbl...

Jack

Sep 1, 2008, 11:17:00 AM
Thank you very much for your explanation.
Does that mean that if I decide to set up an email filter in the future, I
should use the trap characters as they appear in the message source rather
than in the visible subject line?
I have noticed that when I set up a mail filter online (on my ISP's web
site) and select a trap character from the subject line, it is
automatically converted into something else.
For example:
If I copy this character: ?
then it is being saved as ?H
and the filter does not work properly.
I do not know how to set up such filtering on the web site.
Thanks,
Jack

"st" <s...@sandy.localdomain> wrote in message
news:ugLLZHD...@TK2MSFTNGP06.phx.gbl...

VanguardLH

Sep 1, 2008, 1:59:00 PM
st wrote:

> By the way, you could try setting up a message rule to delete those messages automatically:
> look for '=?koi8-r', '=?koi8-u', '=?iso-8859-5', or '=?windows-1251' in the Subject field.

It is possible to encode the Subject header using a character set that
is different from the one used for the body of the message. Whether they
encoded the Subject header by itself or encoded the body using a
non-English character set, I don't want those e-mails (since I can't
read them). I have a rule that checks for several non-English character
sets, and I test both the "charset" parameter in the Content-Type
header and the encoding in the Subject header. In Outlook, I
defined a rule that looks for the following strings in the "message
header" (which doesn't specify a particular header but looks
anywhere in the headers). The 'with ... in the message header' clause
in my Outlook rule has the following strings that it searches for (as
OR'ed parameters):

charset="Big5"
=?Big5
charset="ChineseBig (can be followed by other chars so only 1 quote)
=?ChineseBig
charset="EUC-KR"
=?EUC-KR
charset="GB2312
=?GB2312
charset="ISO-2022-JP"
=?ISO-2022-JP
charset="ISO-2022-KR"
=?ISO-2022-KR
charset="KOI8 (can be followed by other chars so only 1 quote)
=?KOI8
charset="KS_C_5601_1987"
=?KS_C_5601_1987
charset="Windows-1250"
=?Windows-1250
charset="Windows-1251"
=?Windows-1251
charset="Windows-1254"
=?Windows-1254
charset="Windows-1256"
=?Windows-1256
charset="Windows-1257"
=?Windows-1257
charset="Windows-1258"
=?Windows-1258
charset="Windows-874"
=?Windows-874

Basically, when I add another character set on which the rule should
fire, I add it as a pair of parameters: one for the charset (in
case the body is encoded in a MIME part) and another for the Subject
header encoding (which could be done separately from the body encoding).
Although I'm searching for substrings in all headers, the "=?" encoding is
only used in the Subject header (for what I want to filter on). It is
entirely possible that only the Subject is encoded but not the body, or
vice versa, so I catch both.
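
Outside Outlook, the paired test described above could be approximated with one regular expression over the raw header block. This is only a sketch: `BLOCKED_CHARSETS`, `build_header_pattern`, and `headers_match` are hypothetical names, the list is a shortened subset of the one above, and the code assumes you can get at the unparsed headers as a single string.

```python
import re

# A shortened subset of the character sets listed above. "koi8" is a
# prefix, so it covers both koi8-r and koi8-u.
BLOCKED_CHARSETS = ["big5", "euc-kr", "gb2312", "iso-2022-jp", "koi8", "windows-1251"]


def build_header_pattern(charsets):
    """Build one pattern covering both halves of each pair."""
    names = "|".join(re.escape(name) for name in charsets)
    # Either charset=NAME (opening quote optional) or an encoded-word =?NAME.
    return re.compile(r'(?:charset\s*=\s*"?|=\?)(?:' + names + ")", re.IGNORECASE)


def headers_match(raw_headers: str, pattern) -> bool:
    """Return True if any header line contains a blocked charset marker."""
    return pattern.search(raw_headers) is not None


pattern = build_header_pattern(BLOCKED_CHARSETS)
```

Matching on a name prefix mirrors the single-quote entries in the list, so variants of a charset name are caught without listing each one.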

There are more non-English character sets than I list in my rule but
these have nailed all the foreign language e-mails that I've received,
so far. My rule marks the message as read, assigns it to the
"Non-English Charset" category (because I add the Category column to the
view to see later why a message got handled as spam), and moves it to
the Junk folder which has auto-archiving enabled to permanently delete
items over a week old (although I fluctuate down to 3 days if I start
getting loads of spam).

Jack

Sep 3, 2008, 11:05:57 AM
Can that rule be implemented somehow in OE, or does it have to be Outlook?
Thanks,
Jack

"VanguardLH" <V...@nguard.LH> wrote in message
news:em4dEwFD...@TK2MSFTNGP04.phx.gbl...

Michael Santovec

Sep 3, 2008, 5:03:20 PM
My experience has been that an OE message rule will not match on the
character set name (e.g. '=?koi8-r') in the Subject, just as it won't match
on HTML tags in the message body.

I've found this Outlook Express message rule fairly effective against
foreign spam:

Apply this rule after the message arrives
Where the Subject line contains 'ä' or '±' or '¥' or 'À' or 'Ç' or '¤'
or 'Æ' or 'Á' or 'Ò' or '¶' or '¯' or 'ª'
Delete it
and Stop processing more rules

If it misses a message, I copy/paste another unique character from the
subject into the rule. The characters have to be added one at a time.
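
For comparison, a character-based rule like the one above might be sketched in Python as follows. This assumes the check runs against the decoded subject text, which may not be what OE itself compares; `TRAP_CHARS` and `subject_has_trap_char` are names invented for the example.

```python
from email.header import decode_header, make_header

# The characters from the rule above; extend the string as new ones turn up.
TRAP_CHARS = frozenset("ä±¥ÀÇ¤ÆÁÒ¶¯ª")


def subject_has_trap_char(raw_subject: str, trap_chars=TRAP_CHARS) -> bool:
    """Decode the Subject value, then look for any listed character."""
    decoded = str(make_header(decode_header(raw_subject)))
    return any(ch in trap_chars for ch in decoded)
```

A subject that is decoded correctly as Cyrillic contains none of these characters, so a check like this only fires when the text reaching the filter is in a Western character set.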

--

Mike - http://pages.prodigy.net/michael_santovec/techhelp.htm

"Jack" <replyto@it> wrote in message

news:ONkCWadD...@TK2MSFTNGP02.phx.gbl...

VanguardLH

Sep 4, 2008, 1:31:37 AM
Jack wrote:

> Can that rule be implemented somehow in OE, or does it have to be Outlook?

Alas, the rule set in OE is not as potent as in Outlook. You won't be
able to search in any header, just in the few select headers where OE
will look, like the Subject header. So, in OE, you're stuck looking
only for the "=?<string>" in the Subject header.

I have only received a few spams or non-English e-mails that use
encoding in the content but still use an English charset in the Subject
header; however, they usually match. Yet I have run into lots of spams
that used an encoded charset in the Subject header but an English
charset in the body. Yet since I cannot read the non-English subject
header, I'm not interested in reading the body, either.

So you're stuck with the limited rule set provided in OE, and OE only
lets you test a few headers. If you use an anti-spam program
that lets you look for substrings in the headers, then you could match
the Content-Type charset in, say, a regex rule in your anti-spam
program to tag those e-mails with, say, "**SPAM**" in the Subject header,
which you can then catch with a rule in OE.

You're limited by the e-mail client that you use. I don't think Windows
Live Mail got much more added to its rule set than is available in OE.
Thunderbird's rules suck for newsgroup filters (worse than in OE) but,
as I recall, its mail rules are a bit better than OE's. Users of Pegasus
Mail claim it has potent rules (but I've never got around to trialing
that program).
