
Base64 encoded Google Groups spam in rec.arts.tv


Adam H. Kerman

Oct 11, 2023, 11:34:26 AM
Message-ID: <2e088a2e-5834-462b...@googlegroups.com>

I don't know what use, if any, this is to you. It's clearly the same
pattern as in other newsgroups.

I report email spam by forwarding it with all headers in an attachment to
an address set up for SpamAssassin to analyze it.

I'd hate to suggest that you set up something like that for your
subscribers to use as it could get abused.

Ray Banana

Oct 11, 2023, 1:09:55 PM
* Adam H. Kerman wrote:
> Message-ID: <2e088a2e-5834-462b...@googlegroups.com>
>
> I don't know what use, if any, this is to you. It's clearly the same
> pattern as in other newsgroups.

Thanks, the M-ID is all I need.


> I report email spam by forwarding it with all headers in an attachment to
> an address set up for SpamAssassin to analyze it.
>
> I'd hate to suggest that you set up something like that for your
> subscribers to use as it could get abused.

The E-S newsserver can talk to SpamAssassin, so there's no need for
an email. All I have to do is run a command with just the M-ID
as an argument.

--
Пу́тін — хуйло́
http://www.eternal-september.org

Adam H. Kerman

Oct 26, 2023, 12:09:27 AM
It's back.

Message-ID: <a5a34fe9-5a88-472e...@googlegroups.com>

Ray, the atomic LART you are working on cannot be ready soon enough.

Lynn McGuire

Oct 26, 2023, 2:33:37 AM
The flood today in comp.lang.fortran and comp.lang.c has been unreal.

Shoot, maybe a rule that says if there is any non-ASCII in the subject
would catch many of them.

Lynn

Chris Schram

Oct 26, 2023, 4:39:44 AM
Once upon a time one of the groups I frequent was temporarily infested
with a couple of goofball troublemakers who liked to insert strings of
emoji into subject lines. "Subject: UTF-8" cleaned them out for me.
YMMV.

--
chri...@me.com is a filtered spam magnet. Email replies may be lost.
You're better off replying to this newsgroup.

Chris Schram

Oct 26, 2023, 4:45:09 AM
Never mind. I just remembered that I have seen a few Thai spams here and
there, and the UTF-8 filter does NOTHING.

Paul

Oct 26, 2023, 5:28:15 AM
Out of 48672 posts, 7077 of the spam ones don't begin with this.
The other 40000 or so, do begin with this. A few of the posts,
were normal [ASCII] C group postings (amazing!). If you construct a
non-ASCII filter, that would knock out some of the spam headers incoming.
But not all of them.

"Subject: =?UTF-8" <=== maybe 84% effective or so

I use stuff like this for the determination.

grep "Subject:" comp.lang.c.txt | grep -v "Subject: =?UTF-8" | wc -l ==> 7077
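The same count can be sketched in Python. This is only an illustration of the grep pipeline above, assuming the capture is a text file with one header per line and that 48672 is the number of Subject lines in it; the function names and sample lines are made up for the example. Unlike the grep, it only matches "Subject:" at the start of a line.

```python
import re

# A Subject header whose value starts with a UTF-8 encoded-word.
ENCODED_SUBJECT = re.compile(r"^Subject: =\?UTF-8\?", re.IGNORECASE)


def count_subjects(lines):
    """Return (all Subject lines, Subject lines starting with =?UTF-8?)."""
    total = 0
    encoded = 0
    for line in lines:
        if line.startswith("Subject:"):
            total += 1
            if ENCODED_SUBJECT.match(line):
                encoded += 1
    return total, encoded


def encoded_share(total, not_encoded):
    """Fraction of subjects that do start with the encoded-word prefix."""
    return (total - not_encoded) / total


if __name__ == "__main__":
    # Made-up sample lines, not taken from the real capture.
    sample = [
        "Subject: =?UTF-8?B?4LiX4LiU4Liq4Lit4Lia?=",
        "Subject: Re: pointer arithmetic question",
        "From: someone@example.invalid",
    ]
    print(count_subjects(sample))
    # Figures quoted in the post: 48672 posts, 7077 without the prefix.
    print(f"{encoded_share(48672, 7077):.1%}")
```

With the two figures from the post, the share of subjects carrying the prefix works out to roughly 85%, which is in line with the rough estimate given above.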

When you capture a box of messages (like from comp.lang.c), the
ones with BASE64 are not decoded for you. Converting those
so the character set is exposed, would be an extra step of scripting.
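As a rough idea of what that extra scripting step could look like, here is a minimal sketch using Python's standard email.header module to turn an encoded-word Subject back into readable text. The helper name decode_subject is an assumption for this example, and the sample header is built in the code rather than copied from a real article.

```python
import base64
from email.header import decode_header, make_header


def decode_subject(raw_value):
    """Decode RFC 2047 encoded-words in a header value into a plain str."""
    return str(make_header(decode_header(raw_value)))


if __name__ == "__main__":
    # Build a Base64 encoded-word from known text so the round trip is visible.
    text = "观看完整版高清"
    b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
    raw = "=?UTF-8?B?" + b64 + "?="
    print(raw)
    print(decode_subject(raw))
```

Once the Subject is decoded, the characters themselves (rather than the encoding) are available to whatever filter rule comes next.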

The Content-Transfer-Encoding is not a constant either. Some English-speaking
users (not necessarily in comp.lang.c) have managed to generate BASE64 postings
while doing nothing suspicious at all. It is hard to figure out why Thunderbird
does this, but I'm sure there is some (convoluted) logic for it. The end result,
is you can't be completely careless, with filtering on BASE64 Encoding. Or
you might catch one particular user and throw their post away.

*******

Some day, we may have to de-camp and "use newsgroups that Google doesn't have" :-)
That will work too. Google can add some groups automatically, but not all of them,
because they're not clever enough for that.

Paul

candycanearter07

Oct 26, 2023, 11:26:49 AM
Also, I've seen legit users use emojis in the subject line.
--
user <candycane> is generated from /dev/urandom

Adam H. Kerman

Oct 26, 2023, 11:57:09 AM
Adam H. Kerman <a...@chinet.com> wrote:
>Ray Banana <ray...@raybanana.net> wrote:
Two more:

Message-ID: <36c6f6f0-ec6a-4aa8...@googlegroups.com>
Message-ID: <16a8af65-daab-4d05...@googlegroups.com>

Adam H. Kerman

Oct 26, 2023, 12:09:42 PM
Emojis aren't plain text. There is no legitimate use of an emoji in
plain text communication.

Learn to express yourself using... words.

Nothing is less universally implemented than newsreaders' ability to
parse non-ASCII characters on Subject. Everybody knows this, yet too few
people will stick to ASCII and nothing but ASCII on Subject.

Encoded-word portion of RFC 2822 breaks backwards compatibility and
isn't universally implemented.

Any thread I participate in with UTF-8 has a Subject variation from one
followup to the next, quickly rendering Subject partly unreadable.

My favorite examples are newsreaders that can indeed decode UTF-8
encoded word on Subject for the purpose of allowing the user to write a
followup, but do not then parse the proto article for non-ASCII
characters on Subject to encode them. They post with unencoded
non-ASCII characters on Subject.
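For illustration, the step those newsreaders skip can be sketched with Python's standard email.header module (the encoded-word syntax itself is specified in RFC 2047). The function name encode_subject is hypothetical, and this is a sketch of the idea, not how any particular newsreader does it.

```python
from email.header import Header, decode_header, make_header


def encode_subject(text):
    """Return an ASCII-only Subject value, using encoded-words only if needed."""
    try:
        text.encode("ascii")
        return text  # plain ASCII goes out untouched
    except UnicodeEncodeError:
        # Non-ASCII present: emit an RFC 2047 encoded-word instead of raw bytes.
        return Header(text, charset="utf-8").encode()


if __name__ == "__main__":
    for subject in ("Re: plain old ASCII", "Grüße"):
        wire = encode_subject(subject)
        # Decode again to show what a capable reader would display.
        shown = str(make_header(decode_header(wire)))
        print(repr(wire), "->", repr(shown))
```

A followup built this way never carries unencoded non-ASCII on Subject, whatever the reader on the other end makes of the encoded-word.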

Adam H. Kerman

Oct 26, 2023, 12:13:44 PM
Paul <nos...@needed.invalid> wrote:

>>. . .

>Out of 48672 posts, 7077 of the spam ones don't begin with this.
>The other 40000 or so, do begin with this. A few of the posts,
>were normal [ASCII] C group postings (amazing!). If you construct a
>non-ASCII filter, that would knock out some of the spam headers incoming.
>But not all of them.

>"Subject: =?UTF-8" <=== maybe 84% effective or so

>I use stuff like this for the determination.

> grep "Subject:" comp.lang.c.txt | grep -v "Subject: =?UTF-8" | wc -l ==> 7077

>When you capture a box of messages (like from comp.lang.c), the
>ones with BASE64 are not decoded for you. Converting those
>so the character set is exposed, would be an extra step of scripting.

>The Content-Transfer-Encoding is not a constant either. Some
>English-speaking users (not necessarily in comp.lang.c) have managed to
>generate BASE64 postings while doing nothing suspicious at all. It is
>hard to figure out why Thunderbird does this, but I'm sure there is some
>(convoluted) logic for it. The end result, is you can't be completely
>careless, with filtering on BASE64 Encoding. Or you might catch one
>particular user and throw their post away.

It's a feature not a bug!

Plenty of ENGLISH-speaker Thunderbird users manage to post the body in
BASE64 encoding. I've seen it plenty of times on Usenet. I've yet to
spot it in email, however. When I bring it to their attention, they
insist they have no idea what triggers it nor how to turn off that
feature. I've asked in the Thunderbird group.

It is a terrible newsreader.

Bobbie Sellers

Oct 26, 2023, 12:22:23 PM
All they have to do is use Uuencode which is all ascii.

bliss
--
bliss dash SF 4 ever at dslextreme dot com

Adam H. Kerman

Oct 26, 2023, 12:35:04 PM

Don Vito Martinelli

Oct 26, 2023, 12:49:25 PM
There are legitimate reasons for non-ascii characters in the Subject.

Ascii covers (most of) the English alphabet but it falls short when it
comes to other languages, German language groups - for example -
routinely see messages containing the characters Ä Ö Ü ä ö ü ß. We see
Thai as the language of spammers but allegedly it also has other users.

J. P. Gilliver

Oct 26, 2023, 1:00:10 PM
In message <uhe303$1n02a$1...@dont-email.me> at Thu, 26 Oct 2023 16:09:39,
Adam H. Kerman <a...@chinet.com> writes
>candycanearter07 <n...@thanks.net> wrote:
[]
>>Also, I've seen legit users use emojis in the subject line.
>
>Emojis aren't plain text. There is no legitimate use of an emoji in
>plain text communication.
>
>Learn to express yourself using... words.

There may be a confusion here.

Some users of more modern (but arguably not compliant) software may be
including emoji _single characters_ in the subject; others may be using
the old made-out-of-(usually-)three-characters type, like (-:, which
_is_ plain text. (Yes, sure, one maybe should avoid using those too, and
write a perfectly formed essay, but life's too short.)
>
>Nothing is less universally implemented than newsreaders' ability to
>parse non-ASCII characters on Subject. Everybody knows this, yet too few
>people will stick to ASCII and nothing but ASCII on Subject.
>
>Encoded-word portion of RFC 2822 breaks backwards compatibility and
>isn't universally implemented.
>
>Any thread I participate in with UTF-8 has a Subject variation from one
>followup to the next, quickly rendering Subject partly unreadable.

Agreed.
>
>My favorite examples are newsreaders that can indeed decode UTF-8
>encoded word on Subject for the purpose of allowing the user to write a
>followup, but do not then parse the proto article for non-ASCII
>characters on Subject to encode them. They post with unencoded
>non-ASCII characters on Subject.

Definitely!

Non-ASCII characters in the _body_ are a pain too - granted, on the
whole most software _does_ have the ability to handle them, but when
they're just a non-ASCII version of the apostrophe, the double quote, or
- the most ridiculous - the space, there's absolutely no reason for them
to be substituted. But a lot of modern software uses those versions by
default - I think it's Microsoft Word that calls them "smart" (IMO
they're anything but). Sure, for the odd thing like ±, ×, ½, and so on,
and the accented and umlauted characters, fine - but substituting for
characters where the average reader wouldn't even notice, is ... I'm not
sure what.
--
J. P. Gilliver. UMRA: 1960/<1985 MB++G()AL-IS-Ch++(p)Ar@T+H+Sh0!:`)DNAf

"Purgamentum init, exit purgamentum." Translation: "Garbage in, garbage out."

J. P. Gilliver

Oct 26, 2023, 1:10:11 PM
In message <uhe5aj$1nkll$1...@dont-email.me> at Thu, 26 Oct 2023 18:49:22,
Don Vito Martinelli <hyperspac...@vogon.gov.invalid> writes
[]
>There are legitimate reasons for non-ascii characters in the Subject.
>
>Ascii covers (most of) the English alphabet but it falls short when it
>comes to other languages, German language groups - for example -
>routinely see messages containing the characters Ä Ö Ü ä ö ü ß. We see
>Thai as the language of spammers but allegedly it also has other users.

The German characters can be written in plain text - an umlaut means a
following e (e. g. the word for beautiful can be written as schoen,
Cologne is Koeln), and ß means ss. I'm unaware of any similar convention
for the French (and other languages) accents, though, let alone Thai.
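The German convention described here is simple enough to sketch as a lookup table. This is a minimal illustration covering only the seven characters mentioned in the thread; the names GERMAN_ASCII and german_to_ascii are made up for the example.

```python
# Umlaut becomes the base vowel plus "e"; eszett becomes "ss".
GERMAN_ASCII = {
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ä": "ae", "ö": "oe", "ü": "ue",
    "ß": "ss",
}
_TABLE = str.maketrans(GERMAN_ASCII)


def german_to_ascii(text):
    """Replace German umlauts and eszett with their ASCII spellings."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    for word in ("schön", "Köln", "Straße"):
        print(word, "->", german_to_ascii(word))
```

Anything outside the table, such as French accents or Thai, passes through unchanged, which matches the point that no comparable convention is known for those.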

Adam H. Kerman

Oct 26, 2023, 2:26:48 PM
Thanks for completely ignoring everything I just wrote about RFC 2822
breaking backwards compatibility and bad newsreader implementation.

>Ascii covers . . .

I don't need the lecture that ASCII is mostly used for writing in
English. Use ASCII on Subject even if you have to substitute ASCII
characters for the character you really need. It guarantees readability.

RFC 2822 is not and never will be implemented universally in newsreaders
in actual use on Usenet.

DrunkenThon

Oct 26, 2023, 5:50:03 PM
Adam H. Kerman <a...@chinet.com> wrote:
Why should people care what encoding their software uses? They
don't know what ASCII is, they don't know what *encoding*
means and why should they? Most people are not computer
technicians/programmers/admins. People just want to talk.

In the end, UTF was created to be universal *for the people* from
around the world: German, French, Russians, Spanish, Greek, whoever,
and to forget about this "pain-in-the-butt" word *encoding* once
and for all! If some German speaking person wants to enter a symbol
in the subject and the software doesn't allow him (i.e. enters some
ASCII alternative), he probably will consider it as bad software. And
software will generally go after people, for the people and (mostly)
follow their demands.

Perhaps UTF is not perfect, but hey, what's perfect in our world? :)

--
Best regards,
DrunkenThon.

candycanearter07

Oct 26, 2023, 5:57:23 PM
I thought that uuencode was deprecated?

Adam H. Kerman

Oct 26, 2023, 8:22:25 PM
>Why should people care what encoding their software uses? . . .

Huh? That's not how it works. That's not how any of it works. The user
is supposed to declare a character set to use on Subject, then the
newsreader encodes it. The newsreader doesn't use an encoding except
upon instruction from the user.

>In the end, UTF was created to be universal . . .

I wasn't stating that UTF-8, in and of itself, is bad. Once again, I am
stating that there are lots and lots and lots of newsreaders in use that
badly implement RFC 2822 or don't implement it at all, which results in
UNREADABLE characters on Subject that make communication worse.

Why cannot people write a followup addressing what I actually wrote
instead of what they believe I wrote?

Adam H. Kerman

Nov 8, 2023, 11:18:57 PM
Various Google Groups-originating foreign language spam in rec.arts.tv
over the last day or so, either base64 or QP encoded

Message-ID: <f25358f7-0c47-41fc...@googlegroups.com>
Message-ID: <48fb1aeb-ccd5-4f35...@googlegroups.com>
Message-ID: <88a85dcb-5b60-471e...@googlegroups.com>
Message-ID: <9cb162aa-c0bb-40eb...@googlegroups.com>
Message-ID: <15181548-471c-49cc...@googlegroups.com>
Message-ID: <f5f5f571-4e2f-4388...@googlegroups.com>

Ray Banana

Nov 8, 2023, 11:44:55 PM
Thus spake "Adam H. Kerman" <a...@chinet.com>

> <48fb1aeb-ccd5-4f35...@googlegroups.com>

These spammers are multilingual, probably using Google Translate.
They have now switched to Japanese, and some of them even mix English
and Hindi, so it's harder to determine the language.

Filtering on BASE64 and QP alone creates far too many false positives,
but it increases the spam score.
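The idea of letting an encoding raise a score instead of triggering an outright rejection can be sketched as follows. This is not SpamAssassin's rule syntax or API; the weights, the threshold and the function name spam_score are all hypothetical values picked for the illustration.

```python
from email import message_from_string

# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {"encoded_body": 1.5, "encoded_subject": 1.0, "google_groups": 1.0}
THRESHOLD = 3.0


def spam_score(raw_article):
    """Add up weak indicators; no single one is enough to reject."""
    msg = message_from_string(raw_article)
    score = 0.0
    cte = str(msg.get("Content-Transfer-Encoding", "")).strip().lower()
    if cte in ("base64", "quoted-printable"):
        score += WEIGHTS["encoded_body"]
    if str(msg.get("Subject", "")).lstrip().startswith("=?"):
        score += WEIGHTS["encoded_subject"]
    if str(msg.get("Message-ID", "")).strip().endswith("@googlegroups.com>"):
        score += WEIGHTS["google_groups"]
    return score


if __name__ == "__main__":
    # Made-up article: Base64 body from a non-Google poster.
    article = (
        "Message-ID: <abc@example.invalid>\n"
        "Subject: Re: hello\n"
        "Content-Transfer-Encoding: base64\n"
        "\n"
        "SGVsbG8=\n"
    )
    score = spam_score(article)
    print(score, "reject" if score >= THRESHOLD else "accept")
```

With these made-up numbers a legitimate Base64 post from an ordinary newsreader stays well under the threshold, while an article that trips every indicator crosses it.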

Jesse Rehmer

Nov 9, 2023, 6:51:35 PM
On Nov 8, 2023 at 10:44:05 PM CST, "Ray Banana" <ray...@raybanana.net> wrote:

> Thus spake "Adam H. Kerman" <a...@chinet.com>
>
>> <48fb1aeb-ccd5-4f35...@googlegroups.com>
>
> These spammers are multilingual, probably using Google Translate.
> They have now switched to Japanese, and some of them even mix English
> and Hindi, so it's harder to determine the language.
>
> Filtering on BASE64 and QP alone creates far too many false positives,
> but it increases the spam score.

I've wondered if there are any legitimate Base64 encoded messages coming from
Google Groups? I use Diablo for my feeder and it rejects those articles using
its binary detection. Typically when I check a handful of rejected articles
they appear to be spam so I don't think twice, but maybe I should.

Ray Banana

Nov 9, 2023, 10:43:02 PM
I haven't found out yet what triggers the encoding in the Google web interface,
but legitimate articles containing only text seem to be encoded in BASE64
or QP, too.

Adam H. Kerman

Nov 10, 2023, 10:04:09 AM
This morning, I saw

sporge
Message-ID: <a9de36c5-1080-4739...@googlegroups.com>

movie download with quoted-printable
Message-ID: <cfd1a425-0c53-44cb...@googlegroups.com>
Message-ID: <2e2cf9ff-2381-44f6...@googlegroups.com>
Message-ID: <e7f28d08-fd4f-46b8...@googlegroups.com>
Message-ID: <5182c5a1-d479-49a0...@googlegroups.com>
Message-ID: <ffcf64d5-752f-41be...@googlegroups.com>

Jesse Rehmer

Nov 10, 2023, 8:00:01 PM
On Nov 9, 2023 at 9:43:01 PM CST, "Ray Banana" <ray.ba...@googlemail.com>
wrote:
Do you have any Message-IDs for a QP encoded article? Diablo doesn't
specifically know about this type, but I'm curious what it thinks they are on my
end.

Does Cleanfeed not reject the Base64 encoded articles? I thought it did, but
they don't make it through to INN on my end to see, Diablo drops them as
misplaced binaries:

2023-11-10 18:52:52.973 feed-stl-a - <9642371a-dd35-45ae...@googlegroups.com> 16685 000b00 IncomingFilter

Admittedly, I haven't decoded any of the articles to examine the contents;
I've always assumed they were spam of some sort. If Cleanfeed or pyClean
doesn't filter them maybe I should allow them through the feed from Diablo,
but I'm hesitant to do that.

Adam H. Kerman

Nov 11, 2023, 2:45:18 PM

Don Vito Martinelli

Nov 11, 2023, 3:56:34 PM
Adam H. Kerman wrote:
> From: mia khalifa <kmia...@gmail.com> in QP
> Note: This is a pr0n star's name.

According to https://www.bbc.com/news/av/entertainment-arts-49453376 it
was for three months around 2014/15, when she was 21.

Orange

Nov 11, 2023, 4:12:29 PM
On 11/11/2023 19:45, A known bastard called Adam H. Kerman wrote:
> From: mia khalifa <kmia...@gmail.com> in QP
> Note: This is a pr0n star's name.
>
>

Did your prostitute mother told you this or did you watch her films? I'm
assuming she was a porn film star and not some online porn star.

Your mum must have made a fortune in her prostitution career which can
enable you to not work at all and simply spend her money. You have
become an expert in trolling these newsgroups to know about spam in
newsgroups.



Adam H. Kerman

Nov 11, 2023, 4:23:42 PM
Your troll feeding always brightens up my day. It's so cute that you
make yourself feel secure, hiding behind would-be anonymity to insult me,
as if the rest of us don't know exactly where to find you.

Adam H. Kerman

Nov 17, 2023, 2:58:37 PM
Fresh Google Groups base64 encoded spam from the last few minutes. It's
the gift that keeps on giving. Ray, please nuke from orbit:

Message-ID: <b3ce34e0-1917-4be5...@googlegroups.com>
Message-ID: <cbce753f-dabb-4b9c...@googlegroups.com>
Message-ID: <ed5466e4-acb8-498f...@googlegroups.com>
Message-ID: <c004e12a-c889-4419...@googlegroups.com>
Message-ID: <ca2c6e20-45fc-4eb1...@googlegroups.com>
Message-ID: <50974833-89a3-4348...@googlegroups.com>
Message-ID: <d61f6c1c-eedd-4280...@googlegroups.com>
Message-ID: <2bad8456-55fe-4b6c...@googlegroups.com>
Message-ID: <a9da2518-b0ec-49bf...@googlegroups.com>
Message-ID: <068520a9-ef68-423c...@googlegroups.com>
Message-ID: <9c784bc8-334f-4346...@googlegroups.com>
Message-ID: <880c4138-4116-40b3...@googlegroups.com>
Message-ID: <c29e55f5-78ab-43f0...@googlegroups.com>
Message-ID: <d454e40e-c393-431a...@googlegroups.com>
Message-ID: <e1181813-ec3c-4929...@googlegroups.com>
Message-ID: <4fc14f59-f0d3-4d84...@googlegroups.com>
Message-ID: <7707805f-68bc-44ba...@googlegroups.com>
Message-ID: <ba3ebbb0-f73c-427a...@googlegroups.com>
Message-ID: <c941798b-7548-4e32...@googlegroups.com>
Message-ID: <1c16153d-8b33-49e0...@googlegroups.com>
Message-ID: <38ece7a3-3fa3-4564...@googlegroups.com>
Message-ID: <a3fa21eb-7b9a-4db3...@googlegroups.com>
Message-ID: <e7935743-4a83-4c3f...@googlegroups.com>
Message-ID: <66c1db99-553d-477b...@googlegroups.com>
Message-ID: <096d341f-c1c6-4c29...@googlegroups.com>
Message-ID: <d44978d9-4fe0-44f7...@googlegroups.com>
Message-ID: <504591ef-0f56-4281...@googlegroups.com>
Message-ID: <1da6fc1e-241a-4cfa...@googlegroups.com>
Message-ID: <285ac108-02df-4d9c...@googlegroups.com>
Message-ID: <848ae8ce-d078-4a16...@googlegroups.com>
Message-ID: <ec702fc8-e324-4bbd...@googlegroups.com>
Message-ID: <15f0f871-2d44-4c7f...@googlegroups.com>
Message-ID: <c5c9e5c7-4be6-4c47...@googlegroups.com>
Message-ID: <24923b92-c153-46cb...@googlegroups.com>
Message-ID: <5c8c1914-077f-419b...@googlegroups.com>
Message-ID: <891ab03f-a10d-415c...@googlegroups.com>
Message-ID: <3571fb20-e9ff-49af...@googlegroups.com>
Message-ID: <81b19691-8fe4-4a3e...@googlegroups.com>
Message-ID: <980d8ad7-5180-4bd0...@googlegroups.com>
Message-ID: <4e3ecf4e-e1c1-4fbf...@googlegroups.com>
Message-ID: <17e72ede-b612-46d9...@googlegroups.com>
Message-ID: <dd3ea727-7833-44cc...@googlegroups.com>
Message-ID: <ab1214cb-0b37-408c...@googlegroups.com>
Message-ID: <c2c8d272-7ff2-410b...@googlegroups.com>
Message-ID: <f52ed819-e181-4a29...@googlegroups.com>
Message-ID: <4c17a930-26b2-4cfe...@googlegroups.com>

Ray Banana

Nov 17, 2023, 4:03:09 PM
* Adam H. Kerman wrote:
> Fresh Google Groups base64 encoded spam from the last few minutes. It's
> the gift that keeps on giving. Ray, please nuke from orbit:

观看完整版高清

Cool, now I know how to say "Watch the full version in HD"
in Japanese ;-)

Adam H. Kerman

Nov 17, 2023, 4:20:39 PM
Ray Banana <ray...@raybanana.net> wrote:
>* Adam H. Kerman wrote:

>>Fresh Google Groups base64 encoded spam from the last few minutes. It's
>>the gift that keeps on giving. Ray, please nuke from orbit:

>观看完整版高清

>Cool, now I know how to say "Watch the full version in HD"
>in Japanese ;-)

Heh

Kaz Kylheku

Nov 17, 2023, 6:32:09 PM
On 2023-11-17, Ray Banana <ray...@raybanana.net> wrote:
> * Adam H. Kerman wrote:
>> Fresh Google Groups base64 encoded spam from the last few minutes. It's
>> the gift that keeps on giving. Ray, please nuke from orbit:
>
> 观看完整版高清
>
> Cool, now I know how to say "Watch the full version in HD"
> in Japanese ;-)

You know from 观 that this is Chinese. Japanese has not simplified the
見 ("seeing") character/component to 见.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazi...@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

Andy Burns

Nov 18, 2023, 5:23:53 AM
Ray Banana wrote:

> 观看完整版高清
>
> Cool, now I know how to say "Watch the full version in HD"
> in Japanese

When I search for that string on an English google page, as expected I
get lots of results in Japanese, but it translates the usual
Shopping/News/Images/Maps/Videos links into Japanese.

Even when I try to force the language parameters to English in the URL,
I still get Japanese ...

<https://google.com/search?hl=en&lr=lang_en&q=%E8%A7%82%E7%9C%8B%E5%AE%8C%E6%95%B4%E7%89%88%E9%AB%98%E6%B8%85>

p.s. Happy to accept Kaz's explanation that this is Chinese, not Japanese

p.p.s Authentication issues again, will wait before retrying to avoid dupes.
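For anyone wondering what that URL actually asks for, its query string can be unpacked with Python's standard urllib.parse. This is only a sketch showing the three parameters; the function name search_params is made up, and nothing here explains why the results still come back in another language.

```python
from urllib.parse import parse_qs, urlsplit

URL = (
    "https://google.com/search?hl=en&lr=lang_en"
    "&q=%E8%A7%82%E7%9C%8B%E5%AE%8C%E6%95%B4%E7%89%88%E9%AB%98%E6%B8%85"
)


def search_params(url):
    """Return the query parameters with percent-encoding decoded as UTF-8."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


if __name__ == "__main__":
    for key, value in search_params(URL).items():
        print(key, "=", value)
```

The q parameter decodes to the same seven characters quoted above, so the percent-encoded form is just their UTF-8 bytes written out in hex.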

Ray Banana

Nov 18, 2023, 6:36:59 AM
Thus spake Andy Burns <use...@andyburns.uk>

> Even when I try to force the language parameters to English in the
> URL, I still get Japanese ...
> <https://google.com/search?hl=en&lr=lang_en&q=%E8%A7%82%E7%9C%8B%E5%AE%8C%E6%95%B4%E7%89%88%E9%AB%98%E6%B8%85>

SpamAssassin says it's Japanese when it considers the complete article
and Chinese when only looking at the subject.

> p.p.s Authentication issues again, will wait before retrying to avoid dupes.

The virtual servers used for database backends still have issues with
network connectivity and after my conversation with the hoster I have
decided to move them away from this particular hoster. I have already
cancelled the server used as the feeder server (even got a refund for
it) and I'm in the process of moving the databases to a different platform.

Bdb

Nov 18, 2023, 9:24:57 AM
Would it be helpful if users of Eternal-September paid you a fee for the
services which you provide?

Adam H. Kerman

Nov 18, 2023, 11:11:54 AM
Kaz Kylheku <864-11...@kylheku.com> wrote:

>>. . .

>You know from 观 that this is Chinese. Japanese has not simplified the
>見 ("seeing") character/component to 见.

Gah. How many kanji characters have they "simplified"? I had no idea,
but I'm hardly surprised that they would have pushed such a thing without
consulting linguists in the other two countries. Is there another obvious
way to spot differences between Japanese and Korean, or Mandarin versus
Cantonese?

Ok, ok, I'll use MIME headers.

suzeeq

Nov 18, 2023, 11:17:30 AM
Chinese is identically written in all its dialects, the difference is in
how it's spoken. I have a Chinese-American friend who speaks Mandarin,
but can't understand Cantonese so if she encounters someone with another
dialect, they write it down.

Japanese Spammer Here

Nov 18, 2023, 11:40:49 AM
On 18/11/2023 14:24, Bdb wrote:
>
> Would it be helpful if users of Eternal-September paid you a fee for
> the services which you provide?
>

There is nothing to stop you from making a contribution. You can send
the money to me and I will make sure it reaches the right person
providing you the service.

Let me know so that I can give you my paypal account details to send it to.

Best regards,

Takahiro

Adam H. Kerman

Nov 18, 2023, 11:53:17 AM
I thought Cantonese and Mandarin were unrelated languages, not dialects
of the same language. You're saying they use the same pictograph for
comparable words? I didn't know.

suzeeq

Nov 18, 2023, 12:10:35 PM
Yep, that's right. I'd known that years ago even before my friend
mentioned it.

Kaz Kylheku

Nov 18, 2023, 12:28:30 PM
Korean is mainly made of Hangeul characters. While that has some
strokes in it similar to Chinese characters (on purpose), one
readily distinguishing feature is the presence of circles: 정, 음.

In Japanese, you will almost always see some hiragana and maybe katakana
characters: ひらがな. These are very curvy, having descended from
calligraphic forms of Chinese characters. Headline Japanese omits
grammar particles (similarly to headline English), and so Japanese
news headlines and such are sometimes just runs of kanji with no
hiragana.

The Chinese simplifications of characters introduce some strokes
that don't appear in the traditional characters. E.g. 語(language,
story, talk) is simplified to 语, and other characters that have
言 (say) as the left hand component are similarly simplified.

I don't know of any traditional characters which have that form on the
left, like 语 does. You may see that in cursive calligraphy but not
in print. It is somewhat similar to the left/bottom component in 道
(way, road), but has no bottom stroke.

If you see any character with that left-side component, like 请, that's
Chinese.

I believe those simplifications are not found in Cantonese writing.

Someone unfamiliar with these languages might not be easily able to
distinguish a Japanese news headline (row of nothing but kanji) from
Cantonese.
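These rules of thumb can be turned into a rough script check. The sketch below uses Unicode block ranges for Hangul syllables, kana and the main CJK ideograph block; the set of simplified-only characters is just the handful of examples mentioned in this thread, not a real table, and the function name guess_script is made up.

```python
# Tiny illustrative sample taken from this thread, NOT a complete list.
SIMPLIFIED_SAMPLE = set("观见语请")


def guess_script(text):
    """Guess the writing system from the kinds of characters present."""
    has_hangul = has_kana = has_han = has_simplified = False
    for ch in text:
        cp = ord(ch)
        if 0xAC00 <= cp <= 0xD7A3:      # Hangul syllables
            has_hangul = True
        elif 0x3040 <= cp <= 0x30FF:    # Hiragana and Katakana
            has_kana = True
        elif 0x4E00 <= cp <= 0x9FFF:    # CJK Unified Ideographs
            has_han = True
            if ch in SIMPLIFIED_SAMPLE:
                has_simplified = True
    if has_hangul:
        return "Korean"
    if has_kana:
        return "Japanese"
    if has_simplified:
        return "Chinese (simplified)"
    if has_han:
        return "Han characters only (ambiguous)"
    return "unknown"


if __name__ == "__main__":
    for sample in ("정음", "ひらがな", "观看完整版高清", "語道"):
        print(sample, "->", guess_script(sample))
```

The last case is the ambiguous one described above: a run of nothing but traditional-form characters could be a Japanese headline or traditional Chinese, and a check like this cannot tell them apart.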

Kaz Kylheku

Nov 18, 2023, 12:29:44 PM
I seem to be under the impression that in Hong Kong, they don't use
simplified characters? Maybe they do now, in government texts and
other official writing?

Adam H. Kerman

Nov 18, 2023, 12:47:54 PM
Kaz Kylheku <864-11...@kylheku.com> wrote:

>>. . .

>I seem to be under the impression that in Hong Kong, they don't use
>simplified characters? Maybe they do now, in government texts and
>other official writing?

You're right. They didn't use simplified characters. But they also speak
Cantonese, a language whose usage is actively suppressed in Cantonese-
speaking areas in mainland China.

I'm sure that they are totally screwed in all sorts of ways.

Adam H. Kerman

Nov 18, 2023, 12:49:17 PM
Kaz Kylheku <864-11...@kylheku.com> wrote:

>>. . .

>Korean is mainly made of . . .

Thank you for the explanation.

Adam H. Kerman

Nov 18, 2023, 2:09:28 PM
Just spotted one.

Message-ID: <967cb955-fe4b-452e...@googlegroups.com>

Adam H. Kerman

Nov 19, 2023, 1:36:19 PM

Vir Campestris

Nov 20, 2023, 10:39:07 AM
On 18/11/2023 16:53, Adam H. Kerman wrote:
> I thought Cantonese and Mandarin were unrelated languages, not dialects
> of the same language. You're saying they use the same pictograph for
> comparable words? I didn't know.

AIUI the written language is Mandarin, not Cantonese. People who are
native Cantonese speakers have to learn to write in a different
language. **** f I know how they can do it. I don't know either of them.

<https://en.wikipedia.org/wiki/Written_Chinese>

OK, so it's now the Beijing dialect of Mandarin.

Andy

Adam H. Kerman

Nov 23, 2023, 9:32:00 PM
Have some spam Message-IDs for Thanksgiving consumption!

Message-ID: <ab8181bb-a8f5-45f4...@googlegroups.com>

test from a spammer
Message-ID: <8be7533d-a853-4e30...@googlegroups.com>

quoted-printable
Message-ID: <620ad6ac-b0de-4894...@googlegroups.com>
Message-ID: <b5610800-cc18-4c43...@googlegroups.com>
Message-ID: <fbbbdd64-90c5-47c3...@googlegroups.com>
Message-ID: <1438f3c7-ad4a-4bd5...@googlegroups.com>
Message-ID: <ad4d0830-86f9-42ab...@googlegroups.com>

Adam H. Kerman

Nov 27, 2023, 2:43:32 PM
One spam article yesterday.
Message-ID: <8d613e1c-0950-4f76...@googlegroups.com>

Adam H. Kerman

Dec 17, 2023, 12:36:29 PM

Ray Banana

Dec 17, 2023, 12:47:37 PM
Thus spake "Adam H. Kerman" <a...@chinet.com>
How would you decide that this is spam without opening the URL and
translating the Chinese text in the upper left corner of the page?

--
Пу́тін — хуйло́
https://www.eternal-september.org

Adam H. Kerman

Dec 17, 2023, 1:20:12 PM
I was guessing that the URL to a Google map had nothing to do with the
group's topic without translating.

Ray Banana

Dec 17, 2023, 2:02:12 PM
Thus spake "Adam H. Kerman" <a...@chinet.com>

>>How would you decide that this is spam without opening the URL and
>>translating the chinese text in the upper left corner of the page?
> I was guessing that the URL to a Google map had nothing to do with the
> group's topic without translating.

I will observe the difference between Chinese (all flavours) and Korean,
if you acknowledge the difference between off-topic postings and spam ;-)

It's the text in the map that makes it spam.

Adam H. Kerman

Dec 17, 2023, 2:28:42 PM
Ray Banana <ray...@raybanana.net> wrote:
>Thus spake "Adam H. Kerman" <a...@chinet.com>

>>>How would you decide that this is spam without opening the URL and
>>>translating the chinese text in the upper left corner of the page?

>>I was guessing that the URL to a Google map had nothing to do with the
>>group's topic without translating.

>I will observe the difference between Chinese (all flavours) and Korean,
>if you acknowledge the difference between off-topic postings and spam ;-)

I promise I will not label one-off off-topic articles as spam.

Spam is a number of similar off-topic articles over a certain threshold.
I'm not using skirv's 30 year old "current" cancellable spam FAQ as the
only spam definition. I don't even recall if skirv mentioned NoCeMs
in it since that was written for spam cancellers.

>It's the text in the map that makes it spam.

I saw five very similar off-topic articles posted within a very short
period of time. The quantity made it spam.

The one-off off-topic articles I've mentioned in the past, with sporge,
appeared to be spammers testing your filters.

Adam H. Kerman

Jan 1, 2024, 3:22:16 AM