Cleanfeed update

Steve

Feb 13, 2008, 10:25:31 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hi all,

I thought I'd publish a couple of additional changes I've made to my
Cleanfeed recently.

http://www.mixmin.net/cleanfeed
http://www.mixmin.net/cleanfeed.diff (Seeded from Cleanfeed-20020501)
http://www.mixmin.net/cleanfeed.asc (Detached signature)

What's new:
* Added phl_exclude regex for excluding groups from the PHL filter
* Added fsl_exclude regex for excluding groups from the FSL filter
* Added support for a bad_hosts_central file. This is treated exactly
like the bad_hosts file. Adding it provides support for optionally
downloading a frequently updated list of bad hosts (via cron). The
central list itself is something I'm currently working on.
* Added an additional format mode to the saveart function to write just
NNTP-Posting-Host information to a file.

Feedback welcomed

Steve

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHswvqtHGA1SKHYecRCkpEAJ9OdwyQmR/q/mlTSXOF22z4HQ7b+gCgvS20
dp6n7Fo9vIEX/Miuf0PAItM=
=eWS6
-----END PGP SIGNATURE-----

--
pub 1024D/228761E7 2003-06-04 Steven Crook
Key fingerprint = 1CD9 95E1 E9CE 80D6 C885 B7EB B471 80D5 2287 61E7
uid Steven Crook <st...@mixmin.net>

Sven Gottwald

Feb 13, 2008, 4:17:21 PM
* Quoting Steve <st...@mixmin.net>:

> I thought I'd publish a couple of additional changes I've made to my
> Cleanfeed recently.

Could you please include the fix from <em4el1$q19$1...@guepard.ecp.fr> [1] ?

______
1: <URL:http://preview.tinyurl.com/2gaaun>
<URL:http://groups.google.com/group/news.software.nntp/msg/f18f44082a066a79?dmode=source&output=gplain>

--
The truth may be out there, but lies are inside your head.
-- Terry Pratchett

Julien ÉLIE

Feb 13, 2008, 4:44:05 PM
Hi Sven,

> Could you please include the fix from <em4el1$q19$1...@guepard.ecp.fr> [1] ?

> 1: <URL:http://preview.tinyurl.com/2gaaun>

Oh, thanks :)


Some remarks regarding that old message:

> adding and removing bad paths (softly, or with a configurable way [the
> newsmaster tells the level he wants])

Please also see Xavier Roche's idea <fkbt7t$9du$1...@news.trigofacile.com>:

http://groups.google.fr/group/news.software.nntp/msg/5bbc57f654521c27

> Better organizing the FSL/PHL filters (with newsgroup exceptions): for example,
> the default cleanfeed configuration does not know that « comp.lang.ruby » is
> gated bi-directionally into a mailing list. Cleanfeed's default PHL filters
> produce broken threads because many postings from the mailing list are dropped.
> Likewise FSL.
> A lot of work remains to be done in gatewayed hierarchies (like « linux.* » or « bit.* »).

Sure.


> Another example is the tiny bot of Netvisao.pt in « misc.test » (and another one
> in « microsoft.public.test.here ») which produces lots of false positives.

The first bot is still there and sends a message every two minutes:

Subject: NNTP Monitor Test Message
Date: Wed, 13 Feb 2008 21:44:31 +0000 (UTC)
Organization: Netvisao, A sua Internet por Cabo
Message-ID: <fovobv$ej8$1...@newshub.netvisao.pt>


However, the second bot does not seem to exist any longer. By the way,
the newsgroup "microsoft.public.test.here" is now "microsoft.public.test".

--
Julien ÉLIE

« Only wimps use tape backup: _real_ men just upload their important stuff
on ftp, and let the rest of the world mirror it ;) » (Linus Torvalds, 1996)

Steve

Feb 14, 2008, 6:04:04 AM
On Wed, 13 Feb 2008 22:17:21 +0100 (CET), Sven Gottwald wrote in
Message-Id: <fovmp0$4af$3...@news.svengo.de>:

> Could you please include the fix from <em4el1$q19$1...@guepard.ecp.fr> [1] ?

Done. Thanks very much.

Steve

Feb 14, 2008, 6:10:05 AM
On Wed, 13 Feb 2008 22:44:05 +0100, Julien ÉLIE wrote in
Message-Id: <fovobv$45m$1...@news.trigofacile.com>:

Nice idea, I'll give this some thought.

> The first bot is still there and sends a message every two minutes:

> However, the second bot does not seem to exist any longer. By the way,
> the newsgroup "microsoft.public.test.here" is now "microsoft.public.test".

I've added comp.lang.ruby to the fsl and phl exclusions. The Microsoft
one will also be excluded as it matches the generic '\.test' that's also
in fsl and phl. I couldn't think of any good reason for not excluding
all test newsgroups from these filters.

Julien ÉLIE

Feb 14, 2008, 6:48:01 AM
Hi Steve,

>> Please also see Xavier Roche's idea <fkbt7t$9du$1...@news.trigofacile.com>:
>>
>> http://groups.google.fr/group/news.software.nntp/msg/5bbc57f654521c27
>
> Nice idea, I'll give this some thought.

Basically, I believe that Cleanfeed should listen to special articles
and act according to them (if they have a good signature and if Cleanfeed
is told to do that with a configuration variable). Perhaps
something like cleanfeed.ctl should specify the rights (like nocem.ctl)
and you can find some code for parsing articles in perl-nocem.
Cleanfeed internals should then be updated (its arrays) and the bad_*
files be written. I do not know the format they should have (especially
if there is a notion of expiry in that system).

But maybe you have better ideas for dealing with that.


> I've added comp.lang.ruby to the fsl and phl exclusions.

Thanks. But I do not know the exact list of similar groups which should
also be added there.


> The Microsoft
> one will also be excluded as it matches the generic '\.test' that's also
> in fsl and phl. I couldn't think of any good reason for not excluding
> all test newsgroups from these filters.

And also "^es\.pruebas$".

--
Julien ÉLIE

« O farmers, fortunate beyond measure, if only they knew their own blessings. » (Virgil)

Steve

Feb 14, 2008, 11:35:21 AM
On Thu, 14 Feb 2008 12:48:01 +0100, Julien ÉLIE wrote in
Message-Id: <fp19qa$oa$1...@news.trigofacile.com>:

> Basically, I believe that Cleanfeed should listen to special articles
> and act according to them (if they have a good signature and if Cleanfeed
> is told to do that with a configuration variable). Perhaps
> something like cleanfeed.ctl should specify the rights (like nocem.ctl)
> and you can find some code for parsing articles in perl-nocem.
> Cleanfeed internals should then be updated (its arrays) and the bad_*
> files be written. I do not know the format they should have (especially
> if there is a notion of expiry in that system).
>
> But maybe you have better ideas for dealing with that.

At the moment I'm just experimenting with a much simpler issue of
generating an automated list of bad_hosts. I'm seeding this from
various filter output that Cleanfeed produces via saveart.

Most of the problem is deciding on a policy of what is and what is not
acceptable usage. For example, jobcircle.com posts thousands of
articles every day to job newsgroups. Is this a reasonable use of
Usenet, a medium for discussion, or is it not?

Dynamic addresses are also a point for consideration. Is it worth
blocking a hipcrime abuser's address when the next day he'll likely have
a different one?

Such a list needs to err very much on the side of caution. If it
indiscriminately blocks too much then it becomes more of a problem than
a solution. In the end I might just decide that it's not a worthwhile
exercise but I'm not convinced yet. :)

Steve

Feb 15, 2008, 12:33:58 PM
On Wed, 13 Feb 2008 22:17:21 +0100 (CET), Sven Gottwald wrote in
Message-Id: <fovmp0$4af$3...@news.svengo.de>:

> Could you please include the fix from <em4el1$q19$1...@guepard.ecp.fr> [1] ?

Actually I think this element is an error:

[...]
+ [Jj][Pp][Ee]?[Gg]|
+ [Gg][Ii][Ff]|
+ [Pp][Nn][Gg]

Putting those file types in the uuencoded filter will actually prevent
them from being posted to any groups, regardless of whether they are
defined as binary groups.

The regex currently in there is:
[Tt][Ee]?[Xx][Tt]|
[Hh][Tt][Mm][Ll]?|
[Ee][Xx][Ee]|
[Uu][Rr][Ll]

These file types are to prevent uuencoded spam and viruses.

Sven Gottwald

Feb 16, 2008, 11:57:56 AM
* Quoting Steve <st...@mixmin.net>:
[...]

> Actually I think this element is an error:
>
> [...]
> + [Jj][Pp][Ee]?[Gg]|
> + [Gg][Ii][Ff]|
> + [Pp][Nn][Gg]
>
> Putting those file types in the uuencoded filter will actually prevent
> them from being posted to any groups, regardless of whether they are
> defined as binary groups.

Okay, I've forgotten that some sites do want binary content (at least in
some groups). What do you think of
<URL:http://paste.pocoo.org/show/28153/>? I restored the original
regular expression and changed is_binary() a little bit.

Marco d'Itri

Feb 16, 2008, 5:46:57 PM
iul...@nom-de-mon-site.com.invalid wrote:

>Basically, I believe that Cleanfeed should listen to special articles
>and act according to them (if they have a good signature and if Cleanfeed
>is told to do that with a configuration variable). Perhaps

Probably not, because cleanfeed processing is synchronous with article
processing, so it should be as fast as possible.
This kind of thing should happen in a different process, like
perl-nocem.

--
ciao, |
Marco | * The Internet is full. Go away. -- Joel Furr *

Julien ÉLIE

Feb 17, 2008, 5:45:28 AM
Hi Marco,

>>Basically, I believe that Cleanfeed should listen to special articles
>>and act according to them (if they have a good signature and if Cleanfeed
>>is told to do that with a configuration variable). Perhaps
>
> Probably not, because cleanfeed processing is synchronous with articles
> processing so it should be as fast as possible.

Indeed.


> This kind of things should happen in a different process, like
> perl-nocem.

Therefore, after each update, should a

ctlinnd reload filter.perl 'updated conf'

be executed?
Otherwise, how could Cleanfeed's variables be updated?

--
Julien ÉLIE

« -- You pick mushrooms without knowing which ones they are?
-- So what? They are not for eating, they are for selling. »

Steve

Feb 17, 2008, 6:12:01 AM
On Sun, 17 Feb 2008 11:45:28 +0100, Julien ÉLIE wrote in
Message-Id: <fp938l$m6o$1...@news.trigofacile.com>:

> Therefore, after each update, should a
>
> ctlinnd reload filter.perl 'updated conf'
>
> be executed?
> Otherwise, how could Cleanfeed's variables be updated?

Yes, a reload needs to be issued in order to reread the bad_* files. If
you're downloading a central file via cron, it's easy to schedule the
reload immediately after the download.

Julien ÉLIE

Feb 17, 2008, 7:54:55 AM
Hi Steve,

>> Therefore, after each update, should a
>>
>> ctlinnd reload filter.perl 'updated conf'
>>
>> be executed?
>> Otherwise, how could Cleanfeed's variables be updated?
>
> Yes, a reload needs to be issued in order to reread the bad_* files. If
> you're downloading a central file via cron, it's easy to schedule the
> reload immediately after the download.

My main concern was for something like perl-nocem for Cleanfeed.
I do not know whether it is good to reload the Perl filter
inside such a program whenever the configuration is updated.

Perhaps one would not want to reload it if he is currently modifying
some rules in his cleanfeed.local and has not yet tested them.


As for downloading a central file, it is not reactive enough (I think
articles like NoCeM notices would be better).

--
Julien ÉLIE

« -- It is thirty days' march away, and you will also have to cross the desert!
-- I have not yet tried that kind of crossing but,
by Toutatis, I am sure I will get through it very quickly! » (Astérix)

Marco d'Itri

Feb 17, 2008, 8:35:40 AM
iul...@nom-de-mon-site.com.invalid wrote:

>Therefore, after each update, should a
>
> ctlinnd reload filter.perl 'updated conf'
>
>be executed?
>Otherwise, how could Cleanfeed's variables be updated?

Either this, or cleanfeed itself could stat the files every N articles
processed.

Julien ÉLIE

Feb 17, 2008, 12:34:16 PM
Hi Marco,

>>Therefore, after each update, should a
>>
>> ctlinnd reload filter.perl 'updated conf'
>>
>>be executed?
>>Otherwise, how could Cleanfeed's variables be updated?
>
> Either this, or cleanfeed itself could stat the files every N articles
> processed.

Yes, that is a good solution.
(like controlchan, which stats control.ctl whenever a control article
is fed to it -- but here, every N articles is better)

Thanks, Marco.

--
Julien ÉLIE

« -- Stop pushing at the back!
-- Not so fast at the front! » (Astérix)

Steve

Feb 19, 2008, 7:46:33 AM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 17 Feb 2008 14:35:40 +0100, Marco d'Itri wrote in
Message-Id: <fp9d7c$50f$1...@bongo.bofh.it>:

> Either this, or cleanfeed itself could stat the files every N articles
> processed.

I've implemented this so that bad_* files are reloaded every N
*accepted* articles, where N is defined in $config{bad_rate_reload}.

I decided on accepted articles to prevent high volumes of rejected
articles from causing an undesired increase in the frequency of reloads.
At the moment bad_rate_reload defaults to 10000 articles, which on my full
text feed works out at (very roughly) once per hour.

The only other very minor update is the introduction of a $gr{alltest}
which returns True if all the groups in a given article match '\.test'.

http://www.mixmin.net/cleanfeed
http://www.mixmin.net/cleanfeed.asc
http://www.mixmin.net/cleanfeed.diff

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHus+ptHGA1SKHYecRCipTAJwPbPL5XY+b1EcHFHe63/zT7zUWIQCffkHl
fum6VJMjOiJdooV/nqNulew=
=khoA
-----END PGP SIGNATURE-----
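
The reload-every-N-accepted-articles idea described above can be sketched roughly as follows. This is a minimal Python illustration, not Cleanfeed's Perl code: the class name, the one-entry-per-line file format and the modification-time check (Marco's "stat the files" suggestion) are all assumptions, and `reload_rate` merely stands in for $config{bad_rate_reload}.

```python
import os


class BadFileReloader:
    """Re-read a bad_* style list every `reload_rate` accepted articles.

    Hypothetical helper for illustration only.
    """

    def __init__(self, path, reload_rate=10000):
        self.path = path
        self.reload_rate = reload_rate  # stands in for $config{bad_rate_reload}
        self.accepted = 0
        self.entries = set()
        self.mtime = None
        self._load()

    def _load(self):
        # Assumed format: one entry per line, blank lines ignored.
        with open(self.path, encoding="utf-8") as fh:
            self.entries = {line.strip() for line in fh if line.strip()}
        self.mtime = os.stat(self.path).st_mtime_ns

    def article_accepted(self):
        # Only accepted articles are counted, so a flood of rejects
        # cannot drive up the reload frequency.
        self.accepted += 1
        if self.accepted % self.reload_rate == 0:
            # Cheap stat first; re-read only if the file actually changed.
            if os.stat(self.path).st_mtime_ns != self.mtime:
                self._load()
```

Counting only accepted articles keeps the check rate tied to legitimate traffic, and the stat call makes an unchanged file cost almost nothing at each checkpoint.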

kjell

Feb 21, 2008, 1:22:03 PM
Steve wrote on 2008-02-19:

> http://www.mixmin.net/cleanfeed

How do I install the new filter?
Should I replace the old filter_innd.pl with cleanfeed
and rename it to filter_innd.pl? Or what?
I understand I have to change directory where
configuration files live and $MODE to inn.

--
Academics, highly educated, well-paid people high up in the
prosperity league have suddenly become the ones having a hard time in society;
the worst off will now have to pay for this.


Steve

Feb 21, 2008, 3:29:38 PM
On Thu, 21 Feb 2008 19:22:03 +0100, kjell wrote in
Message-Id: <mn.ac8a7d82a...@uuyuy.se>:

> Steve wrote on 2008-02-19:
>
>> http://www.mixmin.net/cleanfeed
>
> How do I install the new filter?
> Should I replace the old filter_innd.pl with cleanfeed

Hi Kjell,
Yes, you need to copy the downloaded cleanfeed file over filter_innd.pl.

> I understand I have to change directory where
> configuration files live and $MODE to inn.

Correct.

Please be aware that the new version of the filter with default options
will reject postings that previously would have gone through. After
first installing it you need to carefully monitor the rejects to ensure
its behaviour is aligned with your policies.

An example of this behaviour occurred today when Eweka Internet Services
started adding a header of: "NNTP-Posting-Host: Eweka Internet Services"
to all their outgoing messages. This is inevitably going to cause false
positives in cleanfeed and my updates are likely to exacerbate the
situation.

D. Stussy

Feb 21, 2008, 9:39:28 PM
"Steve" <st...@mixmin.net> wrote in message
news:fpkmvi$d3s$1...@news.mixmin.net...
> ...

> An example of this behaviour occurred today when Eweka Internet Services
> started adding a header of: "NNTP-Posting-Host: Eweka Internet Services"
> to all their outgoing messages. This is inevitably going to cause false
> positives in cleanfeed and my updates are likely to exacerbate the
> situation.

Why would that be a false positive? They should be using the "Organization"
header for that - and I don't see rejecting malformed articles as a bad
thing.


Steve

Feb 22, 2008, 4:15:36 AM
On Thu, 21 Feb 2008 18:39:28 -0800, D. Stussy wrote in
Message-Id: <fplckq$e0r$1...@snarked.org>:

> Why would that be a false positive? They should be using the "Organization"
> header for that - and I don't see rejecting malformed articles as a bad
> thing.

I agree, it's not a bad thing to reject malformed articles. I just
don't want people to install this update and then immediately back it
out because it seems to be wildly rejecting stuff.

Julien ÉLIE

Feb 29, 2008, 1:21:53 PM
Hi Steve,

I have just found a bug in Cleanfeed. If you could fix it in your version,
it would be great.

In fact, I wanted to try the block_user_cancels feature and it did not work.
Here is the (easy) patch:


@@ -943,12 +943,12 @@
             if exists $Bad_Cancel_Path{$_};
     }

-    reject('User-issued spam cancel')
+    return reject('User-issued spam cancel')
         if $config{block_user_spamcancels}
         and $hdr{'X-Trace'} and $hdr{'NNTP-Posting-Host'}
         and $hdr{Path} =~ /!cyberspam!/;

-    reject('User-issued cancel')
+    return reject('User-issued cancel')
         if $config{block_user_cancels}
         and not $hdr{Path} =~ /!cyberspam!/;
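
Why the missing return matters: a filter routine that calls reject() as a bare statement throws the verdict away and falls through to whatever accepts the article. The sketch below reproduces that pattern in Python. The function names are hypothetical, and the "empty string means accept" convention is assumed here for illustration rather than taken from Cleanfeed's source.

```python
def reject(reason):
    """Return a rejection verdict; by itself it does not stop the caller."""
    return f"reject: {reason}"


def filter_cancel_buggy(is_user_cancel):
    if is_user_cancel:
        reject("User-issued cancel")  # verdict computed, then discarded
    return ""  # assumed convention: empty string means "accept"


def filter_cancel_fixed(is_user_cancel):
    if is_user_cancel:
        return reject("User-issued cancel")  # verdict reaches the caller
    return ""


print(repr(filter_cancel_buggy(True)))
print(repr(filter_cancel_fixed(True)))
```

With the bare call the user cancel is accepted, which would match the symptom that block_user_cancels "did not work".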

--
Julien ÉLIE

« -- It's pretty, this avenue along the sea... What is it called?
-- The Promenade des Bretons. » (Astérix)

Steve

Feb 29, 2008, 2:18:46 PM
On Fri, 29 Feb 2008 19:21:53 +0100, Julien ÉLIE wrote in
Message-Id: <fq9ig6$bmi$1...@news.trigofacile.com>:

> Hi Steve,
>
> I have just found out a bug in Cleanfeed. If you could fix it in your version,
> it would be great.
>
> In fact, I wanted to try the block_user_cancels feature and it did not work.
> Here is the (easy) patch:

Great, thanks. I've checked it in and re-released it. It's surprising that
it hasn't been picked up before, but it certainly looks like a bug to me.

Julien ÉLIE

Mar 1, 2008, 3:19:53 AM
Hi Steve,

In your Cleanfeed:

* #TODO We treat pictures as binaries. There should be a seperate check that
-> separate

* Why has '\.test(?:$|\.)' just vanished from md5exclude?

* In your PHR check, there is a superfluous "}" (but it still works):
if $PHRhistory->add("$server}");

--
Julien ÉLIE

« -- The information desk?
-- No idea. Ask at the information desk, they will inform you. » (Astérix)

Steve

Mar 1, 2008, 4:35:47 AM
On Sat, 1 Mar 2008 09:19:53 +0100, Julien ÉLIE wrote in
Message-Id: <fqb3jf$4h3$1...@news.trigofacile.com>:

> In your Cleanfeed:
>
> * #TODO We treat pictures as binaries. There should be a seperate check that
> -> separate

#TODO Correct my bad spelling :)

> * Why has '\.test(?:$|\.)' just vanished from md5exclude?

I added a $gr{alltest} variable that returns true if all the groups in a
posting are test groups. This is checked in the MD5 filter:-
if ($config{do_md5} and not $gr{md5skip} and not $gr{alltest}
This is better than just checking it via a regex as a distribution that
contained a single test group was matching.
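
The difference between the two checks can be sketched in Python (Cleanfeed itself is Perl). The helper names are hypothetical and the pattern is the generic '\.test' mentioned earlier in the thread. The point is only that "any group matches" and "all groups match" disagree on a crosspost that includes a single test group.

```python
import re

# Generic test-group pattern; an assumption for this sketch.
TEST_GROUP = re.compile(r"\.test")


def split_groups(newsgroups_header):
    """Split a comma-separated Newsgroups header into group names."""
    return [g.strip() for g in newsgroups_header.split(",") if g.strip()]


def any_test(newsgroups_header):
    """Regex-style exclusion: true as soon as one group is a test group."""
    return any(TEST_GROUP.search(g) for g in split_groups(newsgroups_header))


def all_test(newsgroups_header):
    """alltest-style check: true only if every group is a test group."""
    groups = split_groups(newsgroups_header)
    return bool(groups) and all(TEST_GROUP.search(g) for g in groups)


crosspost = "comp.lang.ruby,misc.test"
print(any_test(crosspost), all_test(crosspost))
```

Under the "any" rule, adding one test group to the distribution would be enough to skip the MD5 filter; under the "all" rule the crosspost is still checked.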

> * In your PHR check, there is a superfluous "}" (but it still works):
> if $PHRhistory->add("$server}");

Thanks, corrected that now.

Julien ÉLIE

Mar 1, 2008, 4:54:51 AM
Hi Steve,

>> * Why has '\.test(?:$|\.)' just vanished from md5exclude?
>
> I added a $gr{alltest} variable that returns true if all the groups in a
> posting are test groups. This is checked in the MD5 filter:-
> if ($config{do_md5} and not $gr{md5skip} and not $gr{alltest}
> This is better than just checking it via a regex as a distribution that
> contained a single test group was matching.

All right. I see:

$gr{test}++ if /\.test\b/;

But some test groups are missing (like es.pruebas, borland.public.test2 or
cern.testnews).
I have not tested but perhaps "2" is considered as a "word boundary" in the
Perl regexp "\b".
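
For reference, \b matches only between a word character (letter, digit or underscore) and a non-word character or the end of the string, so a digit directly after "test" does not count as a boundary. A quick check in Python, whose \b behaves the same way as Perl's for these ASCII names (the list of sample groups is taken from this thread):

```python
import re

# Same pattern as the Perl check quoted above.
pattern = re.compile(r"\.test\b")

groups = [
    "misc.test",
    "microsoft.public.test.here",
    "borland.public.test2",
    "cern.testnews",
    "es.pruebas",
]

# Map each group name to whether the pattern finds a match in it.
matches = {g: bool(pattern.search(g)) for g in groups}

for name, hit in matches.items():
    print(f"{name}: {hit}")
```

Only the first two names match: "test2" and "testnews" continue with a word character, and es.pruebas does not contain ".test" at all.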

--
Julien ÉLIE

« Veni, uidi, and I understood. » (César)

Steve

Mar 1, 2008, 9:32:25 AM
On Sat, 1 Mar 2008 10:54:51 +0100, Julien ÉLIE wrote in
Message-Id: <fqb95i$ajr$1...@news.trigofacile.com>:

> All right. I see:
>
> $gr{test}++ if /\.test\b/;
>
> But some test groupes are missing (like es.pruebas, borland.public.test2 or
> cern.testnews).

Yes, this was a poor method of checking for test groups, I've changed it
now by adding a test_groups config option:-

test_groups =>
'\.test(ing)?(?:$|\.)|^es\.pruebas|^borland\.public\.test2'.
'|^cern\.testnews',
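
A quick way to sanity-check a pattern like this is to run it over a handful of group names. The sketch below does so in Python with the pattern copied from the option above; the helper name and the sample group names are chosen for illustration only.

```python
import re

# Pattern copied from the test_groups option above.
TEST_GROUPS = re.compile(
    r"\.test(ing)?(?:$|\.)|^es\.pruebas|^borland\.public\.test2"
    r"|^cern\.testnews"
)


def is_test_group(name):
    """Return True if the group name matches the test_groups pattern."""
    return bool(TEST_GROUPS.search(name))


samples = [
    "misc.test",
    "microsoft.public.test.here",
    "es.pruebas",
    "borland.public.test2",
    "cern.testnews",
    "comp.lang.ruby",
    "fr.soc.politique",
]

for group in samples:
    print(f"{group}: {is_test_group(group)}")
```

The first five names match and the last two do not, so the groups that the earlier \b check missed are now covered.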

Rather than me re-signing and publishing each change like this, I've now
put the svn repository online. It's a little crude at the moment but
should do the job for now. You can access it at:
http://www.mixmin.net/svn/cleanfeed/trunk/

Julien ÉLIE

Mar 23, 2008, 5:52:24 AM
Hi Steve,

> http://www.mixmin.net/cleanfeed
> http://www.mixmin.net/cleanfeed.diff (Seeded from Cleanfeed-20020501)
> http://www.mixmin.net/cleanfeed.asc (Detached signature)

The PHN filter is perhaps too strong by default.
I had several rejects yesterday:

Mar 22 20:18:52.854 - <64l4crF...@mid.individual.net> 437 EMP (phn path)
Mar 22 20:19:57.559 - <1j8qxvue...@mgr.nhr> 437 EMP (phn path)
Mar 22 20:20:55.477 - <asso22ha...@mgr.nhr> 437 EMP (phn path)
Mar 22 20:25:07.188 - <64l4ohF...@mid.individual.net> 437 EMP (phn path)
Mar 22 20:42:30.649 - <64l5p4F...@mid.individual.net> 437 EMP (phn path)
Mar 22 20:47:52.890 - <64l638F...@mid.individual.net> 437 EMP (phn path)
Mar 22 20:53:07.175 - <64l6d2F...@mid.individual.net> 437 EMP (phn path)
Mar 22 20:58:41.363 - <64l6nfF...@mid.individual.net> 437 EMP (phn path)
Mar 22 21:00:29.553 - <64l6qsF...@mid.individual.net> 437 EMP (phn path)
Mar 22 21:08:27.237 - <64l79pF...@mid.individual.net> 437 EMP (phn path)
Mar 22 21:37:38.115 - <1e225eh2gsibb$.dlg@mgr.nhr> 437 EMP (phn path)
Mar 22 22:11:37.174 - <64lb08F...@mid.individual.net> 437 EMP (phn path)
Mar 22 22:15:55.220 - <64lb8aF...@mid.individual.net> 437 EMP (phn path)

Two days ago too:

Mar 20 07:02:19.472 - <mn.a1a57d837...@evc.net> 437 EMP (phn path)
Mar 20 07:37:33.348 - <mn.a1c97d83c...@evc.net> 437 EMP (phn path)
Mar 20 07:45:36.686 - <mn.a1d17d834...@evc.net> 437 EMP (phn path)
Mar 20 10:14:58.307 - <mn.a2667d83a...@evc.net> 437 EMP (phn path)
Mar 20 10:59:32.194 - <64eqs1F...@mid.individual.net> 437 EMP (phn path)
Mar 20 11:04:09.196 - <64er4nF...@mid.individual.net> 437 EMP (phn path)
[...]


More often than not they are in fr.soc.politique, which carries heavy traffic.

--
Julien ÉLIE

« An AZERTY keyboard is worth two. »

Steve

Mar 25, 2008, 4:31:09 PM
On Sun, 23 Mar 2008 10:52:24 +0100, Julien ÉLIE wrote in
Message-Id: <fs5991$ogi$1...@news.trigofacile.com>:

> The PHN filter is perhaps too strong by default.
> I had several rejects yesterday:

Hi Julien,
Thanks for the feedback. At the moment the defaults are:-
PHNRateCutoff => 150,
PHNRateCeiling => 200,
PHNRateBaseInterval => 3600,

I'm reluctant to set these any higher as they lose effectiveness in
non-aggressive mode, or when the post includes an NNTP-Posting-Host
header.

I'm wondering if the best solution is to split the filter into two, one
based on Newsgroup/NNTP-Posting-Host, the other on Newsgroup/Path. This
approach would allow for more granular control of the thresholds. The
downside of this is the introduction of yet another filter adding
complexity to the configuration.

Any thoughts?

Marco d'Itri

Mar 26, 2008, 1:42:47 PM
st...@mixmin.net wrote:

>Any thoughts?
This part of the filter has worked well for years. Do not touch it.

Julien ÉLIE

Mar 26, 2008, 2:14:48 PM
Hi Marco,

>>Any thoughts?
>
> This part of the filter has worked well for years. Do not touch it.

I think he was speaking about the PHN filter which is new. And obviously
it needs some tweaks since it rejects good articles.

--
Julien ÉLIE

« A square is a figure that has a right angle in every corner. » (Jean-Charles)

Steve

Mar 27, 2008, 7:08:19 AM
On 26 Mar 2008 18:42:47 +0100, Marco d'Itri wrote in
Message-Id: <fse1un$fi4$1...@bongo.bofh.it>:

> This part of the filter has worked well for years. Do not touch it.

Hi Marco,

The filter we were discussing isn't in the standard cleanfeed, it was
added by me in an attempt to address hipcrime type floods that were
plaguing a number of groups at the time.

In order to comply with the terms of the Artistic License, I've been
advertising any updates in news.software.nntp and making them available
here:-
http://www.mixmin.net/cleanfeed
http://www.mixmin.net/cleanfeed.asc (PGP Signature)
http://www.mixmin.net/cleanfeed.diff (Diff file seeded on
cleanfeed-20020501)

The original announcement was in: <fkbbae$anq$1...@news.mixmin.net>

The subversion repository I've used is also available at:
http://www.mixmin.net/svn/cleanfeed/trunk/
