
Moderation AI?


Mikko

Oct 23, 2022, 5:51:29 AM
Unmoderated groups are spammed so much that many have become unusable
and unused. Moderated groups need moderators who must work hard, and
if they fail to do so, or fail to moderate in the right way, the group
becomes uninteresting and unused.

Is it already possible (or will it be in the near future) to construct
an AI that could moderate a discussion group so that the number of
off-topic messages stays acceptable, but acceptable messages are not
rejected too often?

Mikko

Mikko

Oct 25, 2022, 9:26:52 AM
On 2022-10-23 14:01:00 +0000, Doc O'Leary said:

> For your reference, records indicate that
> Mikko <mikko....@iki.fi> wrote:
>
>> Unmoderated groups are spammed so much that many have become unusable
>> and unused.
>
> If you’re talking about Usenet itself, I would dispute that premise. There
> are plenty of online forums that are still used despite being full of spam;
> I could even argue that the sum total of social media exists *to* be a
> channel for spam, and that’s where the bulk of Usenet traffic has gone.
> Network effects are a better explanation for why nobody goes where nobody
> goes.
>
>> Is it already possible (or will it be in the near future) to construct
>> an AI that could moderate a discussion group so that the number of
>> off-topic messages stays acceptable, but acceptable messages are not
>> rejected too often?
>
> It has been possible to stop spam for decades, and no AI is required to do
> it. It doesn’t even require natural language processing of message content!
> Spam (and other forms of abuse) has a source, and using that metadata to
> block bad actors is all that is required to stop the abuse. The problem is
> that, if you do said analysis, you’ll quickly discover that the source of
> abuse turns out to be the same “too big to fail” companies that exploit
> network effects for their own benefit. For Usenet, that means Google
> Groups; if you have the courage to acknowledge Google is a hostile actor,
> cut them off and you’ll eliminate 90% of the spam on Usenet.
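
For concreteness, the source-based blocking described above might look
something like this sketch (Python). Injection-Info and Path are
standard Usenet headers, but the googlegroups.com match string is only
an illustrative assumption, not a vetted blocklist:

    # Reject an article from its injection metadata alone,
    # without any natural language processing of the body.
    BLOCKED_SOURCES = ("googlegroups.com",)  # assumed blocklist entry

    def is_blocked(headers: dict) -> bool:
        source = (headers.get("Injection-Info", "") + " "
                  + headers.get("Path", ""))
        return any(host in source for host in BLOCKED_SOURCES)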

That approach depends on identifying spam and spam sources. But my
question about the possibility of identifying on-topic messages is
still unanswered.

Mikko

Chris Buckley

Oct 26, 2022, 9:45:46 AM
Is it possible to do better than random?
Absolutely.

Is it impossible to do perfectly?
Absolutely.

So even to begin with, you have to say what quality is acceptable.

Then after that, to even measure quality you need a definition of
spam. Then you'll find that humans will disagree far more often than
you would expect. In the well-established field of experimental
test-collection information retrieval, where the goal is to find
documents relevant to a user's query, the relevant sets (A, B) of two
human professionals will typically agree only 60% of the time (the
Jaccard overlap, |A intersection B| / |A union B|, is about 0.6).
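
As a quick worked example of that overlap figure (Python; the document
IDs are made up):

    A = {1, 2, 3, 4, 5, 6, 7, 8}       # assessor A's relevant set
    B = {3, 4, 5, 6, 7, 8, 9, 10}      # assessor B's relevant set
    jaccard = len(A & B) / len(A | B)  # 6 / 10 = 0.6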

Then after that, the biggest problem is that you are in an adversarial
relationship with the spammers. Once you start interfering with the
spammers, they will change their approach. Retrospectively, given a
decent training set, current machine learning approaches will do a
decent job of identifying spam in those past sets. But as the spammers
learn what is accepted and what is not, reliance on past spam becomes
less and less useful. In the 2000s, some 30% of Google's search effort
was spent on this cat-and-mouse game with the spammers.
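
As a minimal sketch of that retrospective case (Python with
scikit-learn; the corpus and labels are placeholders, not real data):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Train on labeled past posts: 1 = spam, 0 = legitimate.
    past_posts = ["cheap meds shipped overnight",
                  "question about NNTP injection headers"]
    labels = [1, 0]

    vec = TfidfVectorizer()
    clf = MultinomialNB().fit(vec.fit_transform(past_posts), labels)

    # Classifies posts that resemble the training data well enough,
    # but spammers who probe the filter will drift away from that
    # distribution, so the model has to be retrained continually.
    print(clf.predict(vec.transform(["cheap meds, no prescription"])))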

All that is the theory. In practice, any barrier at all to spam on
Usenet will reduce spam, since the return from the spam is so small;
there are better places for the spammers. What would doom an effort
such as you suggest is the complaints from borderline-legitimate
posters about posts improperly identified as spam. Usenet is dying
fast enough as it is; it can't afford to send those posters packing!

Chris