Dealing with the spam in discussion groups

Gervase Markham

Jul 14, 2010, 8:47:14 PM

We've had several complaints about the spam (unsurprisingly).

Here's a document which explains why it's not a trivial problem to
solve, and suggests a way of solving it:
https://wiki.mozilla.org/Discussion_Forums/Proposal

Comments very welcome.

Gerv

Phillip Jones

Jul 14, 2010, 10:12:27 PM

Sounds like a good idea.

For those of us who have always accessed the groups through NNTP, how
does this plan work? I seem to remember having to request to sign up
for groups back when we started with Netscape Navigator 3.0.1, at
Netscape.

Oh, BTW: I notice the date of the wiki is 00:30 July 15, 2010. We have
time-traveled into the future. <grin>

--
Phillip M. Jones, C.E.T. "If it's Fixed, Don't Break it"
http://www.phillipmjones.net mailto:pjo...@kimbanet.com

johnjbarton

Jul 15, 2010, 1:24:15 AM

I'd just say that John Resig's comments aren't really applicable to the
mozilla groups I read. I moderate the Firebug google group and while
John's comments are correct, the problems he describes are only really
bad if you have high traffic. Sure, they are annoying, but the spam on
the mozilla newsgroups is as bad or worse. At the traffic rate you have,
I don't think it would be so hard to use moderated groups.

On the other hand, google groups are horribly slow to read compared to
Thunderbird with newsgroups. So even if Resig didn't write one of the
great diatribes of all time, I would still hope you don't use Google groups.

jjb

»Q«

Jul 15, 2010, 1:45:29 AM

From that page:

Groups which, by their nature, have a regular influx of new posters
(e.g. support groups) might have trouble. What we do is an open
question - we might use good moderator coverage, or we might exclude
them from this (trading off spam for easy access).

I'd think that blocking new (legit) posters in any of the forums might
be a problem. How would the case-by-case whitelisting happen -- would
somebody have to monitor spamassassin's logs picking out false
positives?

I assume that spamassassin could be configured to let everything
posted via NNTP and ML through, only taking a hard look at the stuff
that comes from Google, since that's the source of all the spam.

Would it be possible to set the proposed system up on some groups and
temporarily configure spamassassin to pass everything through but
generate logs about what would have been blocked?
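
A rough sketch of the dry-run idea above, in Python. The `Post` record, the source labels, and the 5.0 threshold are assumptions made for illustration. Nothing here is SpamAssassin's or Mailman's actual API, and the score is assumed to have been computed already by whatever filter is in use.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("shadow_filter")


@dataclass
class Post:
    sender: str
    source: str        # hypothetical labels: "nntp", "mailing-list", "google-groups"
    spam_score: float  # assumed to come from a spam filter such as SpamAssassin


def would_block(post, threshold=5.0):
    """Only posts that arrive via Google Groups are judged on their score."""
    return post.source == "google-groups" and post.spam_score >= threshold


def shadow_filter(posts, threshold=5.0):
    """Deliver every post, but record the ones a live filter would have blocked."""
    flagged = []
    for post in posts:
        if would_block(post, threshold):
            log.info("would block: %s (score %.1f)", post.sender, post.spam_score)
            flagged.append(post)
    # In dry-run mode nothing is dropped: all posts are returned for delivery.
    return list(posts), flagged
```

Reviewing the flagged list after a few weeks would show how many legitimate posters a real block would have caught, before anything is actually held or dropped.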

--
»Q«                            /"\
ASCII Ribbon Campaign          \ /
against html e-mail             X
<http://www.asciiribbon.org/>  / \

Robert Kaiser

Jul 15, 2010, 9:32:20 AM

johnjbarton schrieb:

> On the other hand, google groups are horribly slow to read compared to
> Thunderbird with newsgroups.

That's true on one hand, and on the other, anyone making his
communication dependent on Google is doomed.

Robert Kaiser

--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community needs answers to. And most of the time,
I even appreciate irony and fun! :)

fantasai

Jul 15, 2010, 9:42:07 AM

Add "moderator time" to the list of required resources?

BTW, the way W3C deals with this is by sending a message to
every new poster explaining that their post is about to be
publicly archived. They have the choice of approving that
particular post, approving all future posts from that email
address, or letting the post expire and finding some other
route for feedback. Asking permission like this has the
nice side-effect of blocking most spam from the W3C lists.
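
The confirmation flow described above could be modelled roughly as follows. This is only a sketch of the three choices (approve one post, approve all future posts, let it expire). The class and method names are made up for illustration and are not taken from the W3C system.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveApproval:
    approved_senders: set = field(default_factory=set)
    pending: dict = field(default_factory=dict)  # post_id -> (sender, text)

    def submit(self, post_id, sender, text):
        """Return True if the post is published immediately."""
        if sender in self.approved_senders:
            return True
        # A real system would mail a confirmation request to the sender here.
        self.pending[post_id] = (sender, text)
        return False

    def confirm(self, post_id, always=False):
        """The sender approves one held post, optionally all future posts too."""
        sender, _text = self.pending.pop(post_id)
        if always:
            self.approved_senders.add(sender)
        return True

    def expire(self, post_id):
        """No answer from the sender: the held post is silently dropped."""
        self.pending.pop(post_id, None)
```

Since spammers rarely answer the confirmation request, their posts simply expire, which is the side-effect mentioned above.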

http://www.w3.org/2002/09/aa/

~fantasai

Gervase Markham

Jul 15, 2010, 3:07:00 PM

On 14/07/10 19:12, Phillip Jones wrote:
> For those of us who have always accessed the groups through NNTP, how
> does this plan work? I seem to remember having to request to sign up
> for groups back when we started with Netscape Navigator 3.0.1, at
> Netscape.

You will need a one-time approval to post, then things will continue
much as before.

Gerv

Chris Ilias

Jul 16, 2010, 3:21:30 AM

I naturally look at it from the perspective of support newsgroups, which
is a different animal than the rest of the groups:

* Almost everyone who starts a new thread is a first time poster, and
will usually not take part in other threads.

* Because there are other (better) web-based support forums
(support.mozilla.com, getsatisfaction, mozillazine, etc.), I don't see
web-access as mandatory for the support newsgroups.

So my initial thought is that making the Google Groups end read-only
would be a better option for the support newsgroups.

However, we still get the odd spam posted via NNTP (Subject includes
"IVÁN"), and there are instances where mailing list members respond to
posts that have been removed from the news server.

So your proposal may be the better solution. We would need a team of
moderators to make sure new questions get posted with as little latency
as possible.

When the lists were first set up, I had to deal with a lot of users
thinking the list address was a private support address. They would post
messages without subscribing to the list. The solution was to
auto-reject messages from non-members, with a rejection message explaining
what the list was for. If the newsgroup is moderated and going through
mailman, auto-rejecting messages from non-members would have to be
turned off. Is there a way to make messages sent to the list from
non-members automatically rejected /and/ messages from the news feed
held for moderation?

Justin Wood (Callek)

Jul 18, 2010, 1:05:05 AM

On 7/14/2010 8:47 PM, Gervase Markham wrote:
> We've had several complaints about the spam (unsurprisingly).
>
> Here's a document which explains why it's not a trivial problem to
> solve, and suggests a way of solving it:
> https://wiki.mozilla.org/Discussion_Forums/Proposal
>

My rough comments from my read when this was first posted.

We should allow _all_ posts through after being run against SpamAssassin
checks/bounds. As we currently do not have any such "whitelist"
restriction on the groups as they stand now, we can probably tweak the
settings to try and get the lowest possible false-positive rate.

Alternatively, we can use the whitelist approach in conjunction with the
above, and let the whitelist have the lowest-possible SpamAssassin
scoring, while anyone who did not elect to be whitelisted
will get a slightly more aggressive check.

This accounts for:
"Anyone can post" (NNTP, today).
-- Currently, lists need a subscription first.

"No barrier for entry" (Support groups)
-- Support group users need not jump through hoops to get their post
read/seen, provided their post doesn't look like spam. (We can probably
tweak SpamAssassin settings to also auto-reply, "We have detected your
post as potentially being spam. If you are not a spammer, visit <url>
to get your message marked for priority approval. If you intend to do
frequent posting to our lists you can also <link>file a bug</link> to
get added to our whitelist", or some such.)

"Ease of understanding"
-- Why didn't my post show up when I posted 5 hours ago, but
Callek's, posted 10 min ago, has already shown up?

And probably easier on the moderator team too!

Gerv, I brought this up in passing on IRC; but I think I am forgetting
an idea I had then, if you recall it please help refresh my memory :-)

--
~Justin Wood (Callek)

Axel Hecht

Jul 18, 2010, 4:47:10 AM

As a side note against the usefulness of poster whitelists, I get tons
of spam on my mozilla accounts claiming to be sent by other mozillians.
I'm not sure if there's that much of a win in terms of spam trapping if
we whitelist the folks we know.

Axel

Justin Wood (Callek)

Jul 18, 2010, 10:28:21 PM

What I understood by whitelist is more of a tiered catch-system, where
being on the whitelist is required for the post to even be considered for
posting and sent off to the main part of SpamAssassin. I might have read
too much into it, though.

--
~Justin Wood (Callek)

»Q«

Jul 19, 2010, 1:41:41 AM

I understood it differently, with non-whitelisted posts being scanned
by spamassassin and whitelisted posts bypassing spamassassin, so
that the only problem would occur when a non-whitelisted post also got a
false positive from spamassassin.

A flowchart might help visualize the current proposal(s).

Gervase Markham

Jul 21, 2010, 1:37:37 PM

to Chris Ilias

On 16/07/10 00:21, Chris Ilias wrote:
> So my initial thought is that making the Google Groups end read-only
> would be a better option for the support newsgroups.

Unfortunately, that means that people who try and post on GG seem to
succeed, but their post gets dropped in the bit-bucket. This is not a
good user experience :-(

> So your proposal may be the better solution. We would need a team of
> moderators to make sure new questions get posted with as little latency
> as possible.

We would. :-|

(This is not a full response to your message, I know.)

Gerv

Gervase Markham

Jul 21, 2010, 1:38:52 PM

to Justin Wood (Callek)

On 17/07/10 22:05, Justin Wood (Callek) wrote:
> We should allow _all_ posts through after being run against SpamAssassin
> checks/bounds. As we currently do not have any such "whitelist"
> restriction on the groups as they stand now, we can probably tweak the
> settings to try and get the lowest possible false-positive rate.

Dave implies that even so, we would get significant spam if we used this
approach.

> Alternatively, we can use the whitelist approach in conjunction with the
> above, and let the whitelist have the lowest-possible SpamAssassin
> scoring, while anyone who did not elect to be whitelisted
> will get a slightly more aggressive check.

That may well be the way things combine. I have asked him to clarify here.

> -- Support group users need not jump through hoops to get their post
> read/seen, provided their post doesn't look like spam. (We can probably
> tweak SpamAssassin settings to also auto-reply, "We have detected your
> post as potentially being spam. If you are not a spammer, visit <url>
> to get your message marked for priority approval. If you intend to do
> frequent posting to our lists you can also <link>file a bug</link> to
> get added to our whitelist", or some such.)

Apparently auto-responding to spam can get you labelled as a spammer,
because of joe jobbing (forged From).

Gerv

Dave Miller

Jul 21, 2010, 2:07:14 PM

In article <20100715004...@bellgrove.remarqs.net>, »Q«
<box...@gmx.net> wrote:

> I'd think that blocking new (legit) posters in any of the forums might
> be a problem. How would the case-by-case whitelisting happen -- would
> somebody have to monitor spamassassin's logs picking out false
> positives?

Pretty much.

> I assume that spamassassin could be configured to let everything
> posted via NNTP and ML through, only taking a hard look at the stuff
> that comes from Google, since that's the source of all the spam.

You know, I think we could actually do that; that's not a bad idea. I
know the anti-spam stuff in Mailman (the built-in stuff, not what it
shells out to SpamAssassin for) can be configured to hold messages
based on headers, so if we can figure out a header pattern that's
always present in stuff from Google Groups, we could set it to
automatically hold for moderation anything originating from there, and
then leave it otherwise open.

--
Dave Miller
Systems Administrator, Mozilla Corporation

Dave Miller

Jul 21, 2010, 4:10:54 PM

In article <v6mdnb644_vbJd7R...@mozilla.org>, Callek
<Cal...@gmail.com> wrote:

> What I understood by whitelist is more of a tiered catch-system, where
> being on the whitelist is required for the post to even be considered for
> posting and sent off to the main part of SpamAssassin. I might have read
> too much into it, though.

Callek is correct here. Everything will go through SpamAssassin, even
if it comes from a known email address. If you're not on the
whitelist, it would just get automatically held for the moderator,
regardless of what SpamAssassin had to say about it.
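
In place of the flowchart asked for earlier in the thread, the routing as described here might be sketched in Python like this. The function name, the `Action` values, and the 5.0 threshold are illustrative assumptions. The thread also does not say what happens to a whitelisted post that SpamAssassin flags, so holding it is a guess.

```python
from enum import Enum


class Action(Enum):
    POST = "post to the group"
    HOLD = "hold for the moderator"


def route(sender, spam_score, whitelist, threshold=5.0):
    """Decide what happens to a post that SpamAssassin has already scored.

    Every post is scored first, whether or not the sender is known.
    """
    known = sender.lower() in {address.lower() for address in whitelist}
    if not known:
        # Not on the whitelist: held regardless of what the score says.
        return Action.HOLD
    if spam_score >= threshold:
        # Assumption: a flagged post from a known (possibly forged) address
        # is held rather than posted.
        return Action.HOLD
    return Action.POST
```

Under this reading the whitelist is a gate in front of posting, not a bypass around the spam filter, which also covers the case of spam forged to look like it comes from a known address.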

»Q«

Jul 28, 2010, 11:16:12 PM

Posts made through Google Groups always have a Message-ID field ending
in ".googlegroups.com>".
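
A minimal Python sketch of testing for that pattern, using only the standard library. The function name is hypothetical, and the check takes the observation above at face value. A Message-ID can be forged, so this identifies where a post claims to come from and is not proof of origin.

```python
import re
from email import message_from_string

# Matches a Message-ID value that ends in ".googlegroups.com>".
GOOGLE_GROUPS_ID = re.compile(r"\.googlegroups\.com>\s*$", re.IGNORECASE)


def posted_via_google_groups(raw_message: str) -> bool:
    """Return True if the message's Message-ID ends in .googlegroups.com>."""
    msg = message_from_string(raw_message)
    message_id = msg.get("Message-ID", "")  # header lookup is case-insensitive
    return bool(GOOGLE_GROUPS_ID.search(message_id))
```

A filter that holds messages based on headers could apply the same regular expression to the Message-ID line and leave everything else open.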

Chris Ilias

Jul 31, 2010, 1:28:48 PM

On 10-07-21 1:37 PM, Gervase Markham wrote:
> On 16/07/10 00:21, Chris Ilias wrote:
>> So my initial thought is that making the Google Groups end read-only
>> would be a better option for the support newsgroups.
>
> Unfortunately, that means that people who try and post on GG seem to
> succeed, but their post gets dropped in the bit-bucket. This is not a
> good user experience :-(

Actually, people who try to post via Google Groups would not see a way
to post. For example, the mozilla.support.general newsgroup was removed
and is now read-only on Google:
<http://groups.google.com/group/mozilla.support.general>.

Gervase Markham

Aug 4, 2010, 6:25:07 AM

On 31/07/10 18:28, Chris Ilias wrote:
> Actually, people who try to post via Google Groups would not see a way
> to post. For example, the mozilla.support.general newsgroup was removed
> and is now read-only on Google:
> <http://groups.google.com/group/mozilla.support.general>.

Being removed is not the same as being set to read-only.

Unless Dave contradicts me, I am fairly sure I remember him telling me
about this problem with GG. Unless you know of a counter-example (a
group which is still live, but is mirrored read-only to GG, and where
users do not get a "post" UI)?

Gerv

Chris Ilias

Aug 4, 2010, 4:44:44 PM

That's how all Mozilla newsgroups looked before posting through Google
Groups was enabled. There was even a bug filed because the posting UI
appeared on a couple of support groups.
https://bugzilla.mozilla.org/show_bug.cgi?id=326634

bando?ers@gmail

Aug 21, 2010, 9:03:04 AM

For the access groups, traffic is low for sure... I have a couple of low
traffic groups, and had to jump in and ban a spammer or two, but I
need to read the article to see what we can do, since these groups come
from usenet, not just Google. The thing I use Google for is to send me
digests of low volume groups that are not worth spending the time to
check, and to tell the truth, I spend less time on usenet than I used
to... If I want to post on a given topic, or need to find something out
on the t'bird support group, then I arrow down to my News account, and
if the mailing lists get dealt with quickly, I may take a look-see.
The bottom line is that of late I may delete the digests without even
looking at what is being discussed because of the spam... I'd better get
some coffee; not writing well...
Burt Henry

Justin Wood (Callek)

Sep 8, 2010, 12:04:32 AM

to Gervase Markham

I wonder, it has only seemed to get worse lately... can we move forward
with *some* form of this?

--
~Justin Wood (Callek)

Chris Ilias

Sep 27, 2010, 2:54:20 PM

On 10-07-16 3:21 AM, Chris Ilias wrote:
> When the lists were first set up, I had to deal with a lot of users
> thinking the list address was a private support address. They would post
> messages without subscribing to the list. The solution was to
> auto-reject messages from non-members, with a rejection message explaining
> what the list was for. If the newsgroup is moderated and going through
> mailman, auto-rejecting messages from non-members would have to be
> turned off. Is there a way to make messages sent to the list from
> non-members automatically rejected /and/ messages from the news feed
> held for moderation?

On support-firefox and support-thunderbird, I've set
generic_nonmember_action to hold rather than reject. I'll see if we are
still getting private support requests. If we're not, I think we should try
"spam filter --> auto-approve" for the support newsgroups.

Chris Ilias

Oct 16, 2010, 2:38:08 PM

On 10-09-27 2:54 PM, Chris Ilias wrote:
> On support-firefox and support-thunderbird, I've set
> generic_nonmember_action to hold rather than reject. I'll see if we are
> still getting private support requests. If we're not, I think we should try
> "spam filter --> auto-approve" for the support newsgroups.

We're still getting private support requests. Many more to the
thunderbird support list, which makes me think it may have something to
do with how they find the list address.

Maybe we can have an auto-response for first-time posters, which
explains that the list is not a private support address and gives other
introductory info like rules of etiquette.

Chris Ilias

Dec 9, 2010, 2:43:23 PM

to mozill...@lists.mozilla.org

I haven't seen any objections to this plan for the support newsgroups,
so if there are no responses today, I'll post the proposal in the
support newsgroups for feedback.