Thread for reporting infiltrated SPAM

Anton Shepelev

Oct 10, 2023, 6:38:14 AM

Since Ray says such reports are useful, here we go:

<e07ad8b6-30e3-4eb8...@googlegroups.com>
<375f5e92-e249-4761...@googlegroups.com>
<794a5d96-31d1-45ea...@googlegroups.com>
<fba7a7a8-e306-4eb9...@googlegroups.com>
<8e51ac8c-25e7-4a48...@googlegroups.com>
<2119a5d6-1a3b-445b...@googlegroups.com>
<6507c544-bb32-4b20...@googlegroups.com>
<97800f3f-715e-4db1...@googlegroups.com>
<107bb8a3-eb2c-4652...@googlegroups.com>

Shall we provide anything besides the Message-IDs?

Ray Banana

Oct 10, 2023, 6:46:16 AM

* Anton Shepelev wrote:
> Since Ray says such reports are useful, here we go:
>
><e07ad8b6-30e3-4eb8...@googlegroups.com>

[...]

> Shall we provide anything besides the Message-IDs?

Message-ID will be sufficient:

grephistory '<e07ad8b6-30e3-4eb8...@googlegroups.com>' | sm -R | spamassassin -L
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
news.eternal-september.org
X-Spam-Flag: YES
X-Spam-Level: ************
X-Spam-Status: Yes, score=12.7 required=10.0 tests=BAYES_99,BAYES_999,
CONTENT_QP,PDS_OTHER_BAD_TLD autolearn=no autolearn_force=no
version=3.4.6
X-Spam-Report:
* 10 BAYES_99 BODY: Bayes spam probability is 99 to 100%
* [score: 1.0000]
* 0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
* [score: 1.0000]
* 0.5 CONTENT_QP No description available.
* 2.0 PDS_OTHER_BAD_TLD Untrustworthy TLDs
* [URI: bily.top (top)]

--
Пу́тін — хуйло́
http://www.eternal-september.org
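
The report above lists one line per matching rule with the points it contributed, and the points add up to the score in X-Spam-Status. A minimal sketch of reading such a report back, assuming the line layout shown in this post (the helper names `parse_report` and `is_spam` are made up for illustration and are not part of SpamAssassin):

```python
import re

# Report lines copied from the post above.
REPORT = """\
* 10 BAYES_99 BODY: Bayes spam probability is 99 to 100%
* [score: 1.0000]
* 0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
* [score: 1.0000]
* 0.5 CONTENT_QP No description available.
* 2.0 PDS_OTHER_BAD_TLD Untrustworthy TLDs
* [URI: bily.top (top)]
"""

# A rule line looks like "* <points> <RULE_NAME> <description>";
# continuation lines such as "* [score: 1.0000]" do not match.
RULE_LINE = re.compile(r"^\*\s+(-?\d+(?:\.\d+)?)\s+([A-Z][A-Z0-9_]*)\b")


def parse_report(text):
    """Return a list of (rule_name, points) pairs found in a report."""
    hits = []
    for line in text.splitlines():
        match = RULE_LINE.match(line)
        if match:
            hits.append((match.group(2), float(match.group(1))))
    return hits


def is_spam(hits, required):
    """Flag the article when the summed points reach the threshold."""
    return sum(points for _, points in hits) >= required


hits = parse_report(REPORT)
total = round(sum(points for _, points in hits), 1)
print(hits)
print(total, is_spam(hits, required=10.0))
```

With the four rules above the total comes to 12.7 against a required score of 10.0, which matches the X-Spam-Status line.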

Anton Shepelev

Oct 10, 2023, 7:16:26 AM

Ray Banana to Anton Shepelev

> > <e07ad8b6-30e3-4eb8...@googlegroups.com>
> > [...]
> > Shall we provide anything besides the Message-IDs?
>
> Message-ID will be sufficient:
>
> grephistory '<e07ad8b6-30e3-4eb8...@googlegroups.com>' | sm -R | spamassassin -L
> X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
> news.eternal-september.org
> X-Spam-Flag: YES
> X-Spam-Level: ************
> X-Spam-Status: Yes, score=12.7 required=10.0 tests=BAYES_99,BAYES_999,
> CONTENT_QP,PDS_OTHER_BAD_TLD autolearn=no autolearn_force=no
> version=3.4.6
> X-Spam-Report:
> * 10 BAYES_99 BODY: Bayes spam probability is 99 to 100%
> * [score: 1.0000]
> * 0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
> * [score: 1.0000]
> * 0.5 CONTENT_QP No description available.
> * 2.0 PDS_OTHER_BAD_TLD Untrustworthy TLDs
> * [URI: bily.top (top)]

I am not conversant in this, but to my unprepared eye it
looks like this message should have been filtered out, yet
it wasn't. It is there, in comp.lang.c, as viewed via E.-S.

--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments

Anton Shepelev

Oct 10, 2023, 7:25:06 AM

<b9f785d9-ed13-428d...@googlegroups.com>
<6b165f21-c9eb-49de...@googlegroups.com>
<6d35ddd3-7ead-4b0a...@googlegroups.com>
<bd6946a8-e69e-4c00...@googlegroups.com>

Anton Shepelev

Oct 10, 2023, 7:37:08 AM

Some messages appear when I first open the group, but are gone
after I refresh it later... These were present last time I checked:

<64bcaeba-1658-4957...@googlegroups.com>
<ae9db2d6-fd65-48d0...@googlegroups.com>
<96e294e0-2fc4-4468...@googlegroups.com>
<4afdde94-5e70-45a7...@googlegroups.com>
<a2d98793-fae5-4e21...@googlegroups.com>

Paul

Oct 10, 2023, 12:05:46 PM

A little light humor for you.

https://www.cnet.com/tech/services-and-software/deja-news-joins-antispam-war/

Dec. 8, 1997 1:15 p.m. PT

"Deja News joins antispam war"

"We will eliminate most spam before it makes it into our database,"
Deja News founder and chief technology officer Steve Madere said today.

"The site also will filter incoming spam with in-house technology
that uses artificial intelligence to look for machine-generated <=== prescient!
postings"

Really.

Paul

Ray Banana

Oct 10, 2023, 12:15:56 PM

* Mickey wrote:

> Do you also want to know about false positives?
> If yes, the following are not spam but are not available on
> eternal-september
>
><831a8cfc-05e7-41ed...@googlegroups.com>
><eff6fef5-e026-4ce4...@googlegroups.com>

Thanks. Hamified. Which does not mean these articles will
reappear ;-) I just told SpamAssassin they were not spam.

Chris Schram

Oct 10, 2023, 3:16:58 PM

Nice that there's a thread for this. Here's a new one from alt.fan.cecil.adams:
Message-ID: <7f23a51e-e4b3-492e...@googlegroups.com>

--
chri...@me.com is a filtered spam magnet. Email replies may be lost.
You're better off replying to this newsgroup.

Anton Shepelev

Oct 12, 2023, 5:15:10 AM

<77aa81e3-8297-408d...@googlegroups.com>
<cd428538-8688-4dd3...@googlegroups.com>
<2fad24c1-6d71-494c...@googlegroups.com>
<f4394c83-5be1-471a...@googlegroups.com>
<882e6c30-8837-46fb...@googlegroups.com>
<d578af1b-0145-4912...@googlegroups.com>
<7be74e29-b5c3-426d...@googlegroups.com>
<44f49603-c43f-454b...@googlegroups.com>

Spammers sending probing articles:

<af5d7d3c-9c4d-4906...@googlegroups.com>
<68869296-875d-41ea...@googlegroups.com>

Ray Banana

Oct 12, 2023, 5:31:02 AM

* Anton Shepelev wrote:
> Spammers sending probing articles:
>
><af5d7d3c-9c4d-4906...@googlegroups.com>
><68869296-875d-41ea...@googlegroups.com>

Thanks for the list. The above articles have too little flesh
on the bone to be useful for training SpamAssassin ;-)

Anton Shepelev

Oct 12, 2023, 7:18:36 AM

<e6f51b7b-ab59-4b01...@googlegroups.com>
<557c5304-d482-443a...@googlegroups.com>
<ea62feed-2c04-4c30...@googlegroups.com>
<bb24db7f-72ec-4a34...@googlegroups.com>

Anton Shepelev

Oct 12, 2023, 8:07:30 AM

<af778bc0-e29f-4c77...@googlegroups.com>
<1875ca77-6ac1-4bab...@googlegroups.com>
<f939a482-a3d7-456f...@googlegroups.com>
<b4f88997-6d0d-4dc8...@googlegroups.com>

This one is small, but the subject is a dead giveaway:
<b3cfdc2f-8fb8-45e2...@googlegroups.com>

Anton Shepelev

Oct 12, 2023, 3:54:35 PM

<13226d46-1d02-45aa...@googlegroups.com>
<86d729f0-5e9c-4474...@googlegroups.com>
<c191e5f3-1699-41e1...@googlegroups.com>
<2156525e-7f4a-4603...@googlegroups.com>
<09a4aa28-b4ff-47a1...@googlegroups.com>
<fd5f4410-eecc-45bb...@googlegroups.com>
<0902a0f7-2709-46a0...@googlegroups.com>
<80a8e4e5-018a-48d9...@googlegroups.com>

Anton Shepelev

Oct 13, 2023, 4:19:02 AM

<4e1e8481-704e-4b6e...@googlegroups.com>
<85881fe5-1612-4704...@googlegroups.com>
<42bf14c9-1d39-49f4...@googlegroups.com>
<8c7f7688-6fd5-459a...@googlegroups.com>

And there is a batch of very suspicious "test" posts:

<51289b7e-6abd-4927...@googlegroups.com>
<ea489010-9da0-4ec7...@googlegroups.com>
<778ac2c0-05b3-43f4...@googlegroups.com>
<fd06dd81-4cee-453d...@googlegroups.com>
<5b1c6111-b36d-4675...@googlegroups.com>
<e06a83a1-8a02-415c...@googlegroups.com>

Ray Banana

Oct 13, 2023, 5:28:12 AM

Thanks. I'm aware of the current situation in comp.lang.c.
My friends from Kuala Lumpur are obviously working very
hard to avoid filters, but, for whatever reason, are
determined to continue posting to comp.lang.c.

They have switched the language from Thai to Indonesian,
have stopped encoding their spam with Base64 and
post links to their spam articles on Google Groups
in addition to posting the spam to Google Groups.

http://al.howardknight.net/?ID=169718924200

Anton Shepelev

Oct 13, 2023, 5:24:26 PM

Anton Shepelev

Oct 13, 2023, 5:39:07 PM

Ray Banana:

> They have switched the language from Thai to Indonesian,
> have stopped encoding their spam with Base64 and post
> links to their spam articles on Google Groups in addition
> to posting the spam to Google Groups.

Do you think they check whether their SPAM gets through to
E.-S.?

I cannot express my feelings about these spammers, so I'll
quote from Suer:

I had long forefelt that, slowly and unpreventably,
wrath was ripening somewhere.

Like a baby serpent, in an egg of white-hot sand, like
the fetus of a lightning in a cloud far away, like a
tuber of potato, like a mangel wurzel, like ginseng,
like an image, in the delirious brain of a poet,
somewhere near by -- there was ripening wrath.

Ray Banana

Oct 14, 2023, 1:09:32 AM

* Anton Shepelev wrote:
> Ray Banana:
>
>> They have switched the language from Thai to Indonesian,
>> have stopped encoding their spam with Base64 and post
>> links to their spam articles on Google Groups in addition
>> to posting the spam to Google Groups.
>
> Do you think they check whether their SPAM gets through to
> E.-S.?

I know they do, as I recently caught one of them (hence
I know where they come from) trying to spam through E-S.

They also read news.admin.net-abuse.usenet, as they changed
a small detail just 2 hours after I had inadvertently
published it there.

Apd

Oct 14, 2023, 9:03:16 AM

alt.comp.freeware

<178df89954e8a083$1276470$2903823$a92e...@news.vipernews.com>
<178df8d3d5b04b76$1276489$2903823$a92e...@news.vipernews.com>

Ray Banana

Oct 14, 2023, 9:38:50 AM

* Apd wrote:
> alt.comp.freeware
>
><178df89954e8a083$1276470$2903823$a92e...@news.vipernews.com>
><178df8d3d5b04b76$1276489$2903823$a92e...@news.vipernews.com>

You do realize that this is a completely different type of spam?
Anyway, I'm now including vipernews in the spam filter in addition to
googlegroups.com

Apd

Oct 14, 2023, 9:49:40 AM

"Ray Banana" wrote:
>* Apd wrote:
>> alt.comp.freeware
>>
>><178df89954e8a083$1276470$2903823$a92e...@news.vipernews.com>
>><178df8d3d5b04b76$1276489$2903823$a92e...@news.vipernews.com>
>
> You do realize that this is a completely different type of spam?

It's not Google or far Eastern but I thought it worth reporting since
you appear to have made special provision for that group and the
spammers may catch on.

> Anyway, I'm now including vipernews in the spam filter in addition to
> googlegroups.com

Great, the posts are gone!

Anton Shepelev

Oct 14, 2023, 6:10:20 PM

Ray Banana to Anton Shepelev:

> > Do you think they check whether their SPAM gets through
> > to E.-S.?
>
> I know they do

Surprising, considering how big and welcoming an audience
they find among Usenetters...

Anton Shepelev

Oct 14, 2023, 6:12:59 PM

I wrote:

> What new possible rules I can propose: --
>

> 2. When a message contains /many/ URLS to GG, e.g.
> https://groups.google.com/g/comp.lang.c/c/sdAJO_V92pk

Definitely sounds like a good filter, e.g. for:
<69eb7be3-b7e1-4d8c...@googlegroups.com>

Adam H. Kerman

Oct 14, 2023, 6:23:13 PM

Anton Shepelev <anto...@gmail.moc> wrote:
>Ray Banana to Anton Shepelev:

>>>Do you think they check whether their SPAM gets through
>>>to E.-S.?

>>I know they do

>Surprising, considering how big and welcoming an audience
>they find among Usenetters...

Snarf

Ray Banana

Oct 15, 2023, 2:35:21 AM

* Anton Shepelev wrote:
> I wrote:
>
>> What new possible rules I can propose: --
>>
>> 2. When a message contains /many/ URLS to GG, e.g.
>> https://groups.google.com/g/comp.lang.c/c/sdAJO_V92pk
>
> Definitely sounds like a good filter, e.g. for:
><69eb7be3-b7e1-4d8c...@googlegroups.com>

I noticed that, too. Unfortunately, each filter rule in SpamAssassin
stops at the first hit and thus doesn't see the URLs on the lines
following the first hit. I will try maybe 15 different rules
for URLs, where each hit adds a small score to the article.
Let's see how that works out.

Adam H. Kerman

Oct 15, 2023, 5:09:33 AM

Dear gawd, I cannot believe how much work you are putting into this.

Thank you

Are those who work on the SpamAssassin project willing to tweak it a bit
so it better meets your needs? You may be the first person to use it for
Usenet.

Ray Banana

Oct 15, 2023, 5:53:06 AM

* Adam H. Kerman wrote:

> Are those who work on the SpamAssassin project willing to tweak it a bit
> so it better meets your needs? You may be the first person to use it for
> Usenet.

As Mail::SpamAssassin is designed as a spam filter for e-mail, I do not
expect the Apache project to extend it to Usenet. OTOH, SpamAssassin
has a very flexible and modular design, so it's easy to add rules and
extensions/plug-ins to make it usable for Usenet as well.

It is also possible to have a single installation of SpamAssassin on a
server and have different configurations depending on the user running
it.

AFAIK the first Usenet servers to use SpamAssassin as a spam filter
were alphanet.ch, pasdenom.fr and usenet.ovh (in alphabetic order),
focussing mainly on the French fr.* hierarchy and, in the case of
alphanet.ch and pasdenom.fr, extending their scope beyond spam to please
their own personal preferences. I started my work when alphanet.ch
announced it would shut down in September because of these personal quarrels
(see fr.usenet.abus.d for details).

Retro Guy

Oct 15, 2023, 11:07:42 AM

You may wish to take a look at (to be used with meta):
tflags __HAS_REPEAT_SOMETHING_HITS multiple maxhits=<some number>

I use this rule quite a bit and it's helpful.

I don't want to post actual in-use rules here, just to have people see how to get around them. If you want more info, just drop me an email and I'll share some of these rules.

--
Retro Guy
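
The difference between a rule that stops at the first hit and one that counts every hit up to a cap can be sketched outside SpamAssassin. The following is only an illustration of that counting idea in Python, not a SpamAssassin rule; the link pattern, the per-hit points and the cap of 15 are assumptions chosen for the example:

```python
import re

# Hypothetical pattern for links into Google Groups threads.
GG_LINK = re.compile(r"https://groups\.google\.com/g/[\w.+-]+/c/[\w-]+")


def first_hit_score(body, points=1.0):
    """Score like a plain rule: one hit counts, however many links follow."""
    return points if GG_LINK.search(body) else 0.0


def counted_score(body, points_per_hit=1.0, maxhits=15):
    """Score every link, capped at maxhits so the total stays bounded."""
    hits = len(GG_LINK.findall(body))
    return points_per_hit * min(hits, maxhits)


# Twenty made-up thread links, one per line.
body = "\n".join(
    "https://groups.google.com/g/comp.lang.c/c/thread%d" % i for i in range(20)
)
print(first_hit_score(body))  # 1.0
print(counted_score(body))    # 15.0
```

An article with a single link scores the same either way; only articles that pile up many links are pushed towards the threshold.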

Ray Banana

Oct 15, 2023, 11:20:30 AM

Thus spake Retro Guy <retr...@i2pn2.org>

> You may wish to take a look at (to be used with meta):
> tflags __HAS_REPEAT_SOMETHING_HITS multiple maxhits=<some number>
> I use this rule quite a bit and it's helpful.

> I don't want to post actual in-use rules here, just to have people see
> how to get around them. If you want more info, just drop me an email
> and I'll share some of these rules.

Great, I haven't looked at meta rules yet, but this sounds interesting.

Ray Banana

Oct 15, 2023, 12:05:12 PM

Thus spake Ray Banana <ray...@raybanana.net>

> Thus spake Retro Guy <retr...@i2pn2.org>

>> You may wish to take a look at (to be used with meta):
>> tflags __HAS_REPEAT_SOMETHING_HITS multiple maxhits=<some number>
>> I use this rule quite a bit and it's helpful.

> Great, I haven't looked at meta rules yet, but this sounds interesting.

X-Spam-Report:
* 11 ES_COUNT_URIS URI: A multiple match used to count URIs in a
* message

This is exactly what I was looking for. Thanks

Retro Guy

Oct 15, 2023, 12:13:46 PM

On Sun, 15 Oct 2023 18:05:06 +0200
Ray Banana <ray...@raybanana.net> wrote:

> Thus spake Ray Banana <ray...@raybanana.net>
>
> > Thus spake Retro Guy <retr...@i2pn2.org>
>
> >> You may wish to take a look at (to be used with meta):
> >> tflags __HAS_REPEAT_SOMETHING_HITS multiple maxhits=<some number>
> >> I use this rule quite a bit and it's helpful.
> > Great, I haven't looked at meta rules yet, but this sounds interesting.
>
> X-Spam-Report:
> * 11 ES_COUNT_URIS URI: A multiple match used to count URIs in a
> * message
>
> This is exactly what I was looking for. Thanks

Happy to help. Feel free to email at any time with any comments or suggestions
on filtering. I'm going to send you an email later today with a few lines that
I use, just in case it's useful to you.

--
Retro Guy

Siri Cruise

Oct 15, 2023, 8:58:45 PM

Ray Banana wrote:
> I noticed that, too. Unfortunately, each filter rule in SpamAssassin
> stops at the first hit and thus doesn't see the URLs on the lines
> following the first hit. I will try maybe 15 different rules
> for URLs, where each hit adds a small score to the article.
> Let's see how that works out.
>

Can SpamAssassin deal with an ancient problem by sending a
troll-seeking missile up Rudy Canoodle's exhaust pipe?

--
Siri Seal of Disavowal #000-001. Disavowed. Denied. @
'I desire mercy, not sacrifice.' /|\
The Church of the Holey Apple .signature 3.1 / \
of Discordian Mysteries. This post insults Islam. Mohamed

Chris Schram

Oct 16, 2023, 5:12:17 PM

Another one in alt.fan.cecil.adams crafted to resemble a followup:

<14981152-45bc-4ab6...@googlegroups.com>

sticks

Oct 16, 2023, 5:41:50 PM

On 10/15/2023 11:05 AM, Ray Banana wrote:
> Thus spake Ray Banana <ray...@raybanana.net>
>
>> Thus spake Retro Guy <retr...@i2pn2.org>
>
>>> You may wish to take a look at (to be used with meta):
>>> tflags __HAS_REPEAT_SOMETHING_HITS multiple maxhits=<some number>
>>> I use this rule quite a bit and it's helpful.
>> Great, I haven't looked at meta rules yet, but this sounds interesting.
>
> X-Spam-Report:
> * 11 ES_COUNT_URIS URI: A multiple match used to count URIs in a
> * message
>
> This is exactly what I was looking for. Thanks
>

So I'm kinda wondering how to explain some odd behavior I'm seeing. I
thought everything from google groups was being blocked/deleted. Now,
it appears I was wrong, and work is being done on some fronts to
identify legitimate spam and block those. In the process, some posts
from "legitimate" users might get caught up in it and also blocked
(which bothers me very little).

Am I in the game now, or still not getting it?

Ray Banana

Oct 17, 2023, 1:54:00 AM

Thus spake sticks <wolve...@charter.net>

> So I'm kinda wondering how to explain some odd behavior I'm seeing. I

Just describe what you are seeing and I will try to explain.

> thought everything from google groups was being blocked/deleted. Now,

No. The goal of the spam filter is to identify *spam* articles from
Google Groups and block these without interfering with non-spam
articles. For this purpose, many different aspects of articles are
tested for spam indicators that will add up to a final score
that marks the article as either spam or ham (not spam).

As the number and complexity of tests increases, the risk of false
positives also increases. I just had a little typo in one of the rules,
which caused all articles mentioning mushrooms to be rejected rather
than flagging articles mentioning mushrooms in connection with
psilocybin or magic. This caused some legitimate articles in
rec.food.cooking to get dumped.
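
The actual rule and its typo were not posted, so the following is only a guess at the shape of the problem: a check that fires on the keyword alone versus one that also requires a context word. The patterns and function names are hypothetical, and the sketch is plain Python rather than SpamAssassin rule syntax:

```python
import re

MUSHROOM = re.compile(r"\bmushrooms?\b", re.IGNORECASE)
CONTEXT = re.compile(r"\b(?:psilocybin|magic)\b", re.IGNORECASE)


def too_broad(text):
    """Fires on any mention of mushrooms, like the mistyped rule."""
    return bool(MUSHROOM.search(text))


def intended(text):
    """Fires only when mushrooms appear together with a context word."""
    return bool(MUSHROOM.search(text)) and bool(CONTEXT.search(text))


recipe = "Saute the mushrooms in butter before adding the cream."
advert = "Buy magic mushrooms online, discreet shipping."

print(too_broad(recipe), intended(recipe))  # True False
print(too_broad(advert), intended(advert))  # True True
```

The recipe is a false positive only under the broad check, which is the kind of collateral damage described above for rec.food.cooking.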

Ivo Gandolfo

Oct 17, 2023, 6:46:31 AM

-------- Original Message --------
From: Ray Banana <ray...@raybanana.net>
> AFAIK the first Usenet servers to use SpamAssassin as a spam filter
> were alphanet.ch, pasdenom.fr and usenet.ovh (in alphabetic order),

Mine too, some time ago alphanet shared their code with me.

Sincerely

--
Ivo Gandolfo

Ray Banana

Oct 17, 2023, 7:18:56 AM

Thus spake Ivo Gandolfo <use...@bofh.team>

> From: Ray Banana <ray...@raybanana.net>
>> AFAIK the first Usenet servers to use SpamAssassin as a spam filter
>> were alphanet.ch, pasdenom.fr and usenet.ovh (in alphabetic order),
> Mine too, some time ago alphanet shared their code with me.

Sorry, I assumed you just inherited the Miakibot ;-)

sticks

Oct 17, 2023, 10:51:27 AM

Thanks for the explanation. I'll leave it to the others to report false
positives. A UDP on google.groups would be fine with me. If some of
those getting caught up in this come back, fine. Otherwise, good
riddance.

Apd

Oct 18, 2023, 2:35:40 PM

Anton Shepelev

Oct 18, 2023, 3:48:49 PM

Ray Banana

Oct 18, 2023, 10:21:58 PM

Thus spake "Apd" <n...@all.invalid>

There was a problem with the spam filter yesterday for about
1 hour, when the news server disabled the filter without an
error message and spam could flow in unfiltered. I'm still
investigating, but I suspect the cause was a shortage of
memory.

Anton Shepelev

Oct 19, 2023, 6:19:48 AM

Some fresh SPAM in comp.lang.c :

<e5660417-9160-4796...@googlegroups.com>
<72eb8ceb-6fe7-4034...@googlegroups.com>

Apd

Oct 19, 2023, 7:28:31 AM

"Ray Banana" wrote:
> There was a problem with the spam filter yesterday for about
> 1 hour, when the news server disabled the filter without an
> error message and spam could flow in unfiltered. I'm still
> investigating, but I suspect the cause was a shortage of
> memory.

Ah, I did wonder if something had borked. I was able to retrieve
articles for those but some others (different GG spam) appearing
around the same time had gone.

Chris Schram

Oct 24, 2023, 4:13:55 PM

Newsgroups: alt.fan.cecil-adams
Message-ID: <263c6c39-753c-487c...@googlegroups.com>

Retro Guy

Oct 24, 2023, 4:31:42 PM

On Tue, 24 Oct 2023 20:13:53 -0000 (UTC)
Chris Schram <chri...@me.com> wrote:

> Newsgroups: alt.fan.cecil-adams
> Message-ID: <263c6c39-753c-487c...@googlegroups.com>

I see this article on both news.eternal-september.org and news.i2pn2.org

--
Retro Guy

Ray Banana

Oct 24, 2023, 4:43:09 PM

It's not spam and it wasn't cancelled: i2pn2: 1, E-S: 1 ;-)

Win-win

Chris Schram

Oct 24, 2023, 5:12:30 PM

On 2023-10-24, Ray Banana <ray...@raybanana.net> wrote:
> * Retro Guy wrote:
>> On Tue, 24 Oct 2023 20:13:53 -0000 (UTC)
>> Chris Schram <chri...@me.com> wrote:
>>
>>> Newsgroups: alt.fan.cecil-adams
>>> Message-ID: <263c6c39-753c-487c...@googlegroups.com>
>>
>> I see this article on both news.eternal-september.org and
>> news.i2pn2.org
>
> It's not spam and it wasn't cancelled: i2pn2: 1, E-S: 1 ;-)
>
> Win-win

I beg to differ. The message is "crafted" to resemble a followup to a
very old message thread, but then contains a link to a totally unrelated
commercial site. That's spam in my book, and I suspect that type of
message, though quite common, is very hard to filter.

Retro Guy

Oct 24, 2023, 6:09:56 PM

On Tue, 24 Oct 2023 21:12:28 -0000 (UTC)

Chris Schram <chri...@me.com> wrote:

> On 2023-10-24, Ray Banana <ray...@raybanana.net> wrote:
> > * Retro Guy wrote:
> >> On Tue, 24 Oct 2023 20:13:53 -0000 (UTC)
> >> Chris Schram <chri...@me.com> wrote:
> >>
> >>> Newsgroups: alt.fan.cecil-adams
> >>> Message-ID: <263c6c39-753c-487c...@googlegroups.com>
> >>
> >> I see this article on both news.eternal-september.org and
> >> news.i2pn2.org
> >
> > It's not spam and it wasn't cancelled: i2pn2: 1, E-S: 1 ;-)
> >
> > Win-win
>
> I beg to differ. The message is "crafted" to resemble a followup to a
> very old message thread, but then contains a link to a totally unrelated
> commercial site. That's spam in my book, and I suspect that type of
> message, though quite common, is very hard to filter.

Sorry, I misunderstood your message to be listing a false positive.

I only speak for i2pn2, but this type of message will probably get through
the filters, as you mention above. Spamassassin generally doesn't know what
a link goes to unless it's in a list of "bad" links.

I have added this ggroups user to be filtered in the future, but I expect
this type of message to get through from other users. It just doesn't look
"spammy" to spamassassin.

--
Retro Guy

Ray Banana

Oct 25, 2023, 4:22:06 AM

Thus spake Chris Schram <chri...@me.com>

>>> Chris Schram <chri...@me.com> wrote:
>>>> Newsgroups: alt.fan.cecil-adams
>>>> Message-ID: <263c6c39-753c-487c...@googlegroups.com>
>>> I see this article on both news.eternal-september.org and
>>> news.i2pn2.org
>> It's not spam and it wasn't cancelled: i2pn2: 1, E-S: 1 ;-)

> I beg to differ. The message is "crafted" to resemble a followup to a
> very old message thread, but then contains a link to a totally unrelated
> commercial site. That's spam in my book, and I suspect that type of
> message, though quite common, is very hard to filter.

It is impossible to filter on the criteria you just named:

1. Find References: header
2. Check age of references
3. Define a maximum age for articles replied to
4. Look up links and determine nature of web site
5. Compare to charter of newsgroups the article is posted to

And as a single article, it still is not cancellable spam,
so you need to find out the Breidbart index based on other
articles from the same poster.

I think a reliability of 99.5% is very good for a spam filter;
anything above this ratio will inevitably increase the rate of
false positives drastically. And it is my first priority to
avoid false positives that destroy legitimate content.
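
For readers unfamiliar with the Breidbart index mentioned above: it is usually defined as the sum, over all copies of a substantively identical article, of the square root of the number of newsgroups each copy was posted to. A minimal sketch under that definition follows; the threshold of 20 is the commonly cited convention and is an assumption here, not something stated in this thread, and the poster data is invented:

```python
import math

# Commonly cited cancellation threshold (assumption, not from this thread).
THRESHOLD = 20


def breidbart_index(crosspost_counts):
    """Sum the square roots of the number of groups each copy was posted to."""
    return sum(math.sqrt(n) for n in crosspost_counts)


def is_cancellable(crosspost_counts, threshold=THRESHOLD):
    """True when the index reaches the threshold."""
    return breidbart_index(crosspost_counts) >= threshold


# Hypothetical poster: one article in a single group ...
single = [1]
# ... versus the same text again in ten copies, each cross-posted to four groups.
campaign = [1] + [4] * 10

print(breidbart_index(single), is_cancellable(single))      # 1.0 False
print(breidbart_index(campaign), is_cancellable(campaign))  # 21.0 True
```

This is why a single article, however unwelcome, cannot be judged on its own: the index only becomes meaningful once other copies from the same poster are counted.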

Chris Schram

Oct 25, 2023, 5:52:56 AM

I have no quarrel with any of what you stated above. It probably is
impossible to filter the type of message I submitted. There is stylistic
similarity in the one or two lines preceding the spammy URL, but you
would need sophisticated AI in your filter, and still be at risk for too
many false positives.

True story (and drifting into off-topic land): My late wife during the
late '80s - early '90s worked phone support for a prominent MSDOS
anti-virus software company. One day she came home and told me the
latest version of their constantly-updated software was flagging
WordPerfect as a virus. Yes, false-positives were a constant headache in
that industry.

Paul

Nov 6, 2023, 1:29:55 AM

On 10/10/2023 6:38 AM, Anton Shepelev wrote:
> Since Ray says such reports are useful, here we go:

<d10302ed-3d79-45ec...@googlegroups.com>

(comp.text.tex)

Might need a slight tweak.

http://al.howardknight.net/?STYPE=msgid&MSGI=%3Cd10302ed-3d79-45ec-9dc4-778f62aea9f6n%40googlegroups.com%3E

Paul
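
The lookup link above is the Message-ID with its angle brackets and "@" percent-encoded. A small sketch of building such a link, assuming the query layout seen in that URL stays as it is (the function name `lookup_url` is made up for illustration):

```python
from urllib.parse import quote


def lookup_url(message_id):
    """Build a lookup link, percent-encoding '<', '>' and '@'."""
    base = "http://al.howardknight.net/?STYPE=msgid&MSGI="
    return base + quote(message_id, safe="")


# Full Message-ID as it appears, encoded, in the link above.
mid = "<d10302ed-3d79-45ec-9dc4-778f62aea9f6n@googlegroups.com>"
print(lookup_url(mid))
```

Passing `safe=""` matters because `quote` would otherwise leave "/" unescaped, which could break IDs that contain one.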

Ray Banana

Nov 6, 2023, 1:51:16 AM

* Paul wrote:
> On 10/10/2023 6:38 AM, Anton Shepelev wrote:
>> Since Ray says such reports are useful, here we go:
>
><d10302ed-3d79-45ec...@googlegroups.com>
>
> (comp.text.tex)
>
> Might need a slight tweak.

;-)

llp

Jan 18, 2024, 5:38:59 PM

Mickey wrote :
> I consider this spam from google groups, but maybe it isn't and
> therefore was allowed through the spam filters,
>
> Message-ID: <591abab7-e7e5-4c17...@googlegroups.com>
> Message-ID: <84182870-310b-43e5...@googlegroups.com>
> Message-ID: <52ce9741-442c-44ec...@googlegroups.com>
> Message-ID: <9ed84278-887e-4501...@googlegroups.com>
> Message-ID: <63d83c80-966e-441e...@googlegroups.com>
> Message-ID: <c1eb9ab7-a251-4d97...@googlegroups.com>
> Message-ID: <f10663d8-3550-4aed...@googlegroups.com>
>
> Otherwise all is working quite well. Many thanks to all that are
> fighting the spam from google.

I see nocems for these messages.
For example:
<https://www.novabbs.com/SEARCH/search_nocem.php?msgid=%3Cf10663d8-3550-4aed-b116-6b2185554f4bn%40googlegroups.com%3E>