I was surprised by how many articles are rejected even with the threshold
set at 25. Since applying the patch yesterday, about 24 hours ago, ~400
articles were rejected by it. The limit could be much lower. It wouldn't
be out of line to make the limit 12, or even 6.
This patch was generated from an INN1.4-sec distribution.
---
Eric Pettersen
pe...@cgl.ucsf.edu (NeXTmail capable)
*** art.c.orig Wed Jun 05 13:42:43 1996
--- art.c Tue Jun 04 18:59:26 1996
***************
*** 1902,1912 ****
* If ngp == GroupPointers, then all the new articles newsgroups are
* "j" entries in the active file. In that case, we have to file it
* under junk so that downstream feeds can get it. */
! if (!Accepted || ngptr == GroupPointers) {
! if (!Accepted) {
! (void)sprintf(buff, "%d Unwanted newsgroup \"%s\"",
! NNTP_REJECTIT_VAL,
! MaxLength(HDR(_newsgroups), HDR(_newsgroups)));
ARTlog(&Data, ART_REJECT, buff);
#if defined(DONT_WANT_TRASH)
#if defined(DO_REMEMBER_TRASH)
--- 1902,1919 ----
* If ngp == GroupPointers, then all the new articles newsgroups are
* "j" entries in the active file. In that case, we have to file it
* under junk so that downstream feeds can get it. */
! #define MAX_CROSSPOST 25
! if (!Accepted || Data.Groupcount > MAX_CROSSPOST || ngptr == GroupPointers){
! if (!Accepted || Data.Groupcount > MAX_CROSSPOST) {
! if (!Accepted)
! (void)sprintf(buff, "%d Unwanted newsgroup \"%s\"",
! NNTP_REJECTIT_VAL,
! MaxLength(HDR(_newsgroups), HDR(_newsgroups)));
! else
! (void)sprintf(buff,
! "%d Unwanted large crosspost \"%s\" (%d groups)",
! NNTP_REJECTIT_VAL, MaxLength(HDR(_subject), HDR(_subject)),
! Data.Groupcount);
ARTlog(&Data, ART_REJECT, buff);
#if defined(DONT_WANT_TRASH)
#if defined(DO_REMEMBER_TRASH)
***************
*** 1924,1930 ****
* we should throw the article away: if you have define
* DO_WANT_TRASH, then you want all trash except that which
* you explicitly excluded in your active file. */
! if (!GroupMissing) {
if (distributions)
DISPOSE(distributions);
ARTreject(buff, article);
--- 1931,1937 ----
* we should throw the article away: if you have define
* DO_WANT_TRASH, then you want all trash except that which
* you explicitly excluded in your active file. */
! if (!GroupMissing || Data.Groupcount > MAX_CROSSPOST) {
if (distributions)
DISPOSE(distributions);
ARTreject(buff, article);
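
The patch above relies on Data.Groupcount, which innd fills in while it parses the Newsgroups header. For readers who want to try the threshold test outside of innd, here is a minimal standalone sketch. The function names count_newsgroups and exceeds_crosspost_limit are made up for illustration and are not part of INN; only MAX_CROSSPOST comes from the patch. It assumes the header value is a plain comma-separated list.

```c
#include <stddef.h>

#define MAX_CROSSPOST 25    /* same default as the patch */

/* Count the non-empty, comma-separated entries in a Newsgroups value.
 * Blanks around names are ignored; empty entries ("a,,b") are skipped. */
int count_newsgroups(const char *newsgroups)
{
    int count = 0;
    int in_name = 0;
    const char *p;

    for (p = newsgroups; *p != '\0'; p++) {
        if (*p == ',') {
            in_name = 0;            /* next non-blank starts a new name */
        } else if (!in_name && *p != ' ' && *p != '\t') {
            in_name = 1;
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: 1 if the article is over the limit, else 0.
 * Like the patch, it uses a strict "greater than" comparison. */
int exceeds_crosspost_limit(const char *newsgroups)
{
    return count_newsgroups(newsgroups) > MAX_CROSSPOST;
}
```

With the default of 25, an article posted to exactly 25 groups is still accepted; the 26th group is what triggers rejection.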
If I recall, there's a FAQ posted to news.answers which is crossposted
to 14 newsgroups.
--Dave
In article <4p4s4t$o...@cgl.ucsf.edu>,
pe...@cgl.ucsf.edu (Eric Pettersen) wrote:
>
>I got tired of receiving binaries (and other articles) crossposted to
>dozens of groups. Here is a patch to the art.c file of innd to reject
>articles posted to more than MAX_CROSSPOST groups (defined to be 25,
>redefine it if you like).
>
>I was surprised by how many articles are rejected even with the threshold
>set at 25. Since applying the patch yesterday, about 24 hours ago, ~400
>articles were rejected by it. The limit could be much lower. It wouldn't
>be out of line to make the limit 12, or even 6.
>
>This patch was generated from an INN1.4-sec distribution.
I've been hoping for a patch of this nature for INN. Dave Barr, any
chance you could modify this patch to work with INN1.4unoff4 (maybe
even change it so that MAX_CROSSPOST is defined in config.data)?
To quote a famous net.guy: I see a great need. :-)
--Dave
--
Concerned about your message security? Read alt.security.pgp!
David Guntner Internet: dav...@netcom.com Finger or key server
Vicksburg, MS GEnie: Just say NO! for PGP Public key
GO d? s:+ !a C++(++++) US(++) P+ E- W+ N++(+++)@ !o K w(---) M-- V-- PS
PE Y+ PGP++ t+ 5++ X R tv+ b+ DI+ !D G e/* h r* y+(*)
A humble suggestion - rejecting massively crossposted articles is
kinda severe IMHO. I'd like to see a patch which truncated the
crosspost list to a selectable value. It is reasonable to assume
that only the first few groups listed are really applicable.
As for _MY_ preference on the value: 6 is definitely as high as
a message _NEEDS_ to _EVER_ be crossposted. Actually, more than
two is suspicious to me, but I'd be generous.
--
Andrew E. Mileski
mailto:a...@ott.hookup.net http://www.redhat.com/~aem/
Linux Plug-and-Play Project http://www.redhat.com/pnp/
Red Hat Software sponsors these pages - I have no other affiliation
with Red Hat Software, and I have never used any of their products.
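
The truncation idea suggested above can be sketched in a few lines of C. This shows only the string handling, under the assumption that the Newsgroups value is a plain comma-separated list held in a writable buffer; truncate_newsgroups is a hypothetical name, and nothing here is wired into innd or taken from any INN release.

```c
#include <string.h>

/* Truncate a comma-separated Newsgroups value in place so that it keeps
 * at most max_groups entries.  Returns the number of entries kept. */
int truncate_newsgroups(char *newsgroups, int max_groups)
{
    int kept = 0;
    char *p = newsgroups;

    if (max_groups <= 0 || *p == '\0') {
        *p = '\0';
        return 0;
    }
    for (;;) {
        char *comma = strchr(p, ',');

        kept++;
        if (comma == NULL)
            return kept;        /* list was already short enough */
        if (kept == max_groups) {
            *comma = '\0';      /* drop everything after this entry */
            return kept;
        }
        p = comma + 1;
    }
}
```

Whether a transport should rewrite a header at all is a separate question; the sketch only shows that the mechanics are simple.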
Checking my spool (I've been screening for 2 days but I keep FAQs for 14),
there are 6 FAQs posted to 14 or more groups, with the widest one going
to 26 groups. Just for the curious, that one is:
ISO 8859-1 National Character Set FAQ
which goes to:
comp.unix.questions,comp.unix.admin,comp.windows.x,comp.std.internat,
comp.software.international,at.general,soc.culture.german,
soc.culture.french,soc.culture.belgium,soc.culture.quebec,
soc.culture.nordic,soc.culture.spain,soc.culture.portuguese,
soc.culture.latin-american,soc.culture.brazil,soc.culture.argentina,
soc.culture.mexico,soc.culture.colombia,soc.culture.venezuela,
soc.culture.peru,soc.culture.chile,soc.culture.italian,
bit.listserv.catala,comp.answers,soc.answers,news.answers
So, if you cranked the limit down as low as I mentioned above, you might
want to have exemptions for *.answers and *.announce. Using an idea
suggested to me by Jason Fesler in e-mail, the patch would then be:
*** art.c.orig Wed Jun 05 13:42:43 1996
--- art.c Thu Jun 06 11:37:54 1996
***************
*** 1680,1685 ****
--- 1680,1686 ----
BOOL CrossPosted;
BOOL ToGroup;
BOOL GroupMissing;
+ BOOL OverCrossposted;
BUFFER *article;
char linkname[SPOOLNAMEBUFF];
char **groups;
***************
*** 1902,1912 ****
* If ngp == GroupPointers, then all the new articles newsgroups are
* "j" entries in the active file. In that case, we have to file it
* under junk so that downstream feeds can get it. */
! if (!Accepted || ngptr == GroupPointers) {
! if (!Accepted) {
! (void)sprintf(buff, "%d Unwanted newsgroup \"%s\"",
! NNTP_REJECTIT_VAL,
! MaxLength(HDR(_newsgroups), HDR(_newsgroups)));
ARTlog(&Data, ART_REJECT, buff);
#if defined(DONT_WANT_TRASH)
#if defined(DO_REMEMBER_TRASH)
--- 1903,1923 ----
* If ngp == GroupPointers, then all the new articles newsgroups are
* "j" entries in the active file. In that case, we have to file it
* under junk so that downstream feeds can get it. */
! #define MAX_CROSSPOST 25
! OverCrossposted = Data.Groupcount > MAX_CROSSPOST
! && strstr(HDR(_newsgroups), ".announce")
! && strstr(HDR(_newsgroups), ".answers");
! if (!Accepted || OverCrossposted || ngptr == GroupPointers){
! if (!Accepted || OverCrossposted) {
! if (!Accepted)
! (void)sprintf(buff, "%d Unwanted newsgroup \"%s\"",
! NNTP_REJECTIT_VAL,
! MaxLength(HDR(_newsgroups), HDR(_newsgroups)));
! else
! (void)sprintf(buff,
! "%d Unwanted large crosspost \"%s\" (%d groups)",
! NNTP_REJECTIT_VAL, MaxLength(HDR(_subject), HDR(_subject)),
! Data.Groupcount);
ARTlog(&Data, ART_REJECT, buff);
#if defined(DONT_WANT_TRASH)
#if defined(DO_REMEMBER_TRASH)
***************
*** 1924,1930 ****
* we should throw the article away: if you have define
* DO_WANT_TRASH, then you want all trash except that which
* you explicitly excluded in your active file. */
! if (!GroupMissing) {
if (distributions)
DISPOSE(distributions);
ARTreject(buff, article);
--- 1935,1941 ----
* we should throw the article away: if you have define
* DO_WANT_TRASH, then you want all trash except that which
* you explicitly excluded in your active file. */
! if (!GroupMissing || OverCrossposted) {
if (distributions)
DISPOSE(distributions);
ARTreject(buff, article);
---
Post in haste, repent at leisure. The comparison test in the patch is
bass ackwards. They should be !strstr(... instead of strstr(... . The
corrected patch:
*** art.c.orig Wed Jun 05 13:42:43 1996
--- art.c Thu Jun 06 12:33:47 1996
! && !strstr(HDR(_newsgroups), ".announce")
! && !strstr(HDR(_newsgroups), ".answers");
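
A note for anyone adapting the exemption test: the C library's strstr() takes the string to be searched first and the substring to look for second. The following standalone sketch shows one way to express the exemption, assuming a plain comma-separated Newsgroups value with no embedded blanks. The names any_group_ends_with and over_crossposted are hypothetical and not part of INN. Unlike a plain strstr() test, it matches only groups whose names end in the suffix, which is a stricter rule than the one in the patch.

```c
#include <string.h>

/* Return 1 if any group in the comma-separated list ends with suffix
 * (for example ".answers"), else 0. */
int any_group_ends_with(const char *newsgroups, const char *suffix)
{
    size_t slen = strlen(suffix);
    const char *p = newsgroups;

    while (*p != '\0') {
        size_t glen = strcspn(p, ",");      /* length of this group name */

        if (glen >= slen && strncmp(p + glen - slen, suffix, slen) == 0)
            return 1;
        p += glen;
        if (*p == ',')
            p++;                            /* skip the separator */
    }
    return 0;
}

/* Hypothetical helper mirroring the intent of OverCrossposted:
 * over the limit and not exempted by *.announce or *.answers. */
int over_crossposted(const char *newsgroups, int groupcount, int max)
{
    return groupcount > max
        && !any_group_ends_with(newsgroups, ".announce")
        && !any_group_ends_with(newsgroups, ".answers");
}
```

The suffix test avoids exempting an article merely because ".answers" appears somewhere inside a longer group name.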
: I was surprised by how many articles are rejected even with the threshold
: set at 25. Since applying the patch yesterday, about 24 hours ago, ~400
: articles were rejected by it. The limit could be much lower. It wouldn't
: be out of line to make the limit 12, or even 6.
: This patch was generated from an INN1.4-sec distribution.
: ---
: Eric Pettersen
: pe...@cgl.ucsf.edu (NeXTmail capable)
Doesn't this just encourage people to post the *SAME* message multiple times
to different sets of news groups to get the same distribution???
This will require *MORE* disk space to house the same article multiple
times.
I would rather allow the crossposting because innd will save only a
single copy of the article and use links (hard or symbolic) so that
the article can be seen from the other news groups.
-- darylg
>In article <4p4s4t$o...@cgl.ucsf.edu>, Eric Pettersen <pe...@cgl.ucsf.edu> wrote:
>>I was surprised by how many articles are rejected even with the threshold
>>set at 25. Since applying the patch yesterday,
You want to read man 5 newsfeeds, the paragraph on Gcount, and tell the
sites that feed you about it. Now if you had created a new option (say)
Jcount which worked like Gcount but granted an exception to any article
crossposted to news.answers, WOW ...
>>about 24 hours ago, ~400
>>articles were rejected by it. The limit could be much lower. It wouldn't
>>be out of line to make the limit 12, or even 6.
>If I recall, there's a FAQ posted to news.answers which is crossposted
>to 14 newsgroups.
Groups FAQs
12: 4
13: 2
14: 2
15: 3
17: 4
18: 1
19: 1
26: 1 internationalization/iso-8859-1-charset
Since the Perl script that produces the above listing (and more) is only
a 50-line hack, I append it here. Use at your own risk; in case of side
effects, hit your doctor or pharmacist.
#!/usr/bin/perl
#
# Usage: find <archive-directory> -type f | ngostat.pl

$lim = 14;    # report all articles with #(newsgroups) >= $lim
#$|=1;

while ($fn = <>) {
    chomp($fn);
    if (!open(ARTIKEL, $fn)) {
        $no_open++;    # could not open the article file
    } else {
        while (<ARTIKEL>) {
            chomp;
            if (/^$/) {
                # blank line: headers ended without a Newsgroups line
                $n_no_ng++;
                print "N $fn\n";
                last;
            } elsif (m/^Newsgroups:\s+(.*)$/io) {
                $n_mit_ng++;
                $ngln = $1;
                @ngs = split(/,/, $ngln);
                $ngno = scalar(@ngs);
                $ngnos{$ngno}++;    # histogram: group count => articles
                if ($ngno >= $lim) {
                    print "L $fn $ngno $ngln\n";
                    $res1 = (($dev, $ino, $mode, $nlink, $uid, $gid, $rdev,
                              $size, $atime, $mtime, $ctime, $blksize, $blocks)
                             = stat($fn));
                    $res2 = (($sec, $min, $hour, $mday, $mon, $year,
                              $wday, $yday, $isdst) = localtime($mtime));
                    if ($res1 && $res2) {
                        $mon++;
                        printf "M %s %4d/%02d/%02d %02d.%02d.%02d\n",
                            $fn, 1900+$year, $mon, $mday, $hour, $min, $sec;
                    }
                }
                last;
            }
        }
        close(ARTIKEL);
    }
}

# $n_no_ng is the counter incremented above (the original printed the
# never-set $no_no_ng here)
printf "\ntotal %5d (%d not found, %d without newsgroups line)\n",
    $n_mit_ng, $no_open, $n_no_ng;
foreach $i (sort {$a <=> $b} keys %ngnos) {
    printf "%3d: %5d\n", $i, $ngnos{$i};
}
--
Wolfgang Schelongowski w...@xivic.ruhr.de
Mustela locuta, causa finita. (With apologies to St. Chris)
pe...@cgl.ucsf.edu (Eric Pettersen) wrote:
> >I got tired of receiving binaries (and other articles) crossposted to
> >dozens of groups. Here is a patch to the art.c file of innd to reject
> >articles posted to more than MAX_CROSSPOST groups (defined to be 25,
> >redefine it if you like).
dav...@netcom.com (David Guntner) writes:
> I've been hoping for a patch of this nature for INN. Dave Barr, any
> chance you could modify this patch to work with INN1.4unoff4 (maybe
> even change it so that MAX_CROSSPOST is defined in config.data)?
> To quote a famous net.guy: I see a great need. :-)
I don't :-(. Considering that this is something that can be done in
the user interface with zero chance of error, there is no reason at
all to do it in the transport. In fact, there are good reasons *not*
to do it in the transport:
- It encourages spammers to post the same article many times, with few
enough newsgroups to squeak under the limit.
- It nails FAQs, conference announcements and such that are legitimately
posted to lots of groups.
- You don't save bandwidth since you have to receive the whole article
anyway, in order to check the newsgroup count. (Getting your
neighbors to use the G flag in their feed for you might be nice.)
- As I mentioned, such articles can be caught by a killfile.
Do you *really* want the binaries posters to post each article 26
times, rather than a single article cross-posted to 26 groups?
--
Tom Fitzgerald Thinking Machines Corp, Bedford MA, USA A3FC3545C031E735
fi...@think.com (617)276-0400 x4848 3DE72FB31F6028D1
>A humble suggestion - rejecting massively crossposted articles is
>kinda severe IMHO.
Not really. A baseball bat to the kneecaps of the spammers would be
"kinda severe".
> I'd like to see a patch which truncated the
>crosspost list to a selectable value. It is reasonable to assume
>that only the first few groups listed are really applicable.
My experience with massive Xposts is that apart from FAQs, they seldom
if ever are relevant to ANY of the groups posted to.
>As for _MY_ preference on the value: 6 is definitely as high as
>an message _NEEDS_ to _EVER_ be crossposted. Actually, more than
>two is suspicious to me, but I'd be generous.
I have my onfeeds set up to not feed crossposted articles which hit 10 or
more groups (G10). Being able to make it fewer while allowing posts to
*.answers,* would be a great step forward IMHO as I know I'm restricting
a few FAQs from going downstream but 10 is a reasonable compromise.
AB
--
"Something always goes wrong when things are going right, you've swallowed
your pride to quell the pain inside. Someone captured your heart like a
thief in the night and squeezed all the juice out until it ran dry."
In article <4p7j2n$8...@hpax.cup.hp.com>, dar...@cup.hp.com (Daryl Gaumer) writes:
|> Doesn't this just encourage to post the *SAME* message multiple times
|> to different sets of news groups to get the same distribution???
1) The vast majority of the people who post massively cross-posted
articles don't know they're doing it. The most common cause of such
posts is followups to trolls. So no, they would not bother to take the
time to repost their followups multiple times to smaller lists of
groups; after all, they didn't even bother to take the time to trim
their Newsgroups list in the first place.
2) The people who *do* post many times to separate groups to get around
cross-posting limits end up exceeding spam-cancel limits, which causes
their articles to get cancelled by Chris Lewis, JEM et al. Most sites
never end up receiving such cancelled articles, let alone storing them.
Followups to news.admin.net-abuse.misc.
I wouldn't use the word "encourage". It does make it the only way to get
the same distribution, yes.
> This will require *MORE* disk space to house the same article multiple
> times.
>
> I would rather allow the crossposting because innd will save only a
> single copy of the article and use links (hard or symbolic) so that
> the article can be seen from the other news groups.
Yes, true. I am not suggesting putting this patch in universally. It is a
matter of preference for the local admin. You also have to consider that
wide crossposts frequently generate a lot of followups (usually angry
ones), and that these followups frequently go to all the original groups.
From my logs for the last several days I get:
2083 total cross posts to more than 25 groups
of which:
783 are apparently "originals" (no leading "Re:")
165 are cancels
1135 are followups
A more general method of handling the crosspost problem needs to be found
(perhaps NoCeM). I offer the patch only as a stop-gap measure for some
sites.
I wouldn't use the word "encourage", but you are right that it makes it
the only way to guarantee that a spam survives the transport filter.
> - It nails FAQs, conference announcements and such that are legitimately
> posted to lots of groups.
The amended patch exempts *.answers and *.announce.
> - You don't save bandwidth since you have to receive the whole article
> anyway, in order to check the newsgroup count. (Getting your
> neighbors to use the G flag in their feed for you might be nice.)
Yes, true. On the other hand, you can adjust your crosspost threshold
level without bothering your neighbors, and you can exempt *.answers and
*.announce. It's a tradeoff that has to be decided by the individual admin.
> - As I mentioned, such articles can be caught by a killfile.
I am not impressed by the killfile argument. If killfiles were doing the
job, wouldn't spammers _already_ be posting their articles individually
to every group, instead of wide crossposting? Wouldn't there be zero
followups to such crossposts since everyone's killfiles would have nuked
them? There are just way too many naive users on Usenet now to assert
that killfiles are going to solve the problem. Empirically, they haven't.
>
> Do you *really* want the binaries posters to post each article 26
> times, rather than a single article cross-posted to 26 groups?
>
If it weren't _easy_ to post to 26 groups simultaneously, perhaps they
would limit the post to the 6 most pertinent groups and not bother with
the rest.
As you may have seen by now, I posted an amended patch that exempts
*.answers and *.announce. The "G" option has a bandwidth advantage in
that the article doesn't have to be transmitted to be screened, but it
can't exempt *.answers and *.announce. Also, I have to bother my neighbors
if I want the threshold adjusted. Creating a "J" option would allow
exemption, but would still require I bother my neighbors to adjust the
limit AND the neighbor would have to install a patched server(!).
I didn't really offer the patch as a contribution to the official release.
The official release may want to just enhance the "G" flag to allow some
newsgroups to be exempted. I offered the patch for individual admins to
use as a stopgap measure until a better method of handling problem
crossposting is found.
> > - It encourages spammers to post the same article many times, with few
> > enough newsgroups to squeak under the limit.
pe...@cgl.ucsf.edu (Eric Pettersen) writes:
> I wouldn't use the word "encourage", but you are right that it makes it
> the only way to guarantee that a spam survives the transport filter.
Then perhaps a better word is "require", or maybe "force". Spammers will
do whatever it takes to get their message out.
> > - You don't save bandwidth since you have to receive the whole article
> > anyway, in order to check the newsgroup count. (Getting your
> > neighbors to use the G flag in their feed for you might be nice.)
> Yes, true. On the other hand, you can adjust your crosspost threshold
> level without bothering your neighbors, and you can exempt *.answers and
> *.announce. It's a tradeoff that has to be decided by the individual admin.
It may be less of a bother to your upstream neighbors, but it's more
of a bother to any of your downstream neighbors who might have been
expecting a reliable feed from you, but are no longer getting it. If
they really want a full feed, they now have to get another feed site.
This is much more of a bother than making a small change to the
newsfeeds file. (This is between you and your feeds, of course - if
they know about your customization of the feed, and are happy with it,
it's your business.)
> > - As I mentioned, such articles can be caught by a killfile.
>
> I am not impressed by the killfile argument. If killfiles were doing the
> job, wouldn't spammers _already_ be posting their articles individually
> to every group, instead of wide crossposting?
If they do that, then the spamcancellers will get them (eventually,
after hundreds of thousands of people have already read them, and
followed up to randomly selected singleposted articles individually).
The whole point of the spamcancellers and cumulative Breidbart index
etc, was to encourage crossposting over mass singleposting.
If killfiles aren't doing the job, it's because people are choosing
not to use them, not because they're not capable of doing the job. If
you personally don't want to read any of the followups, then set your
own killfile to ignore them - you won't see the original, or any of
the followups.
> Wouldn't there be zero
> followups to such crossposts since everyone's killfiles would have nuked
> them? There are just way too many naive users on Usenet now to assert
> that killfiles are going to solve the problem. Empirically, they haven't.
Newbies also followup, in mobs, to singleposted spam articles. How is
this any better than following up to a crossposted article? The advantage
of keeping the spam crossposted is that a single killfile entry can nuke
both the original post and the followups from newbies.
Part of the process of becoming a non-newbie is learning how to use
killfiles - you can't save people the trouble of doing this, and once they
learn how to do this, they don't need your patch. "Protecting" newbies
from crossposted spam is not a solution, they need to learn how to defend
themselves against it.
> If it weren't _easy_ to post to 26 groups simultaneously, perhaps they
> would limit the post to the 6 most pertinent groups and not bother with
> the rest.
That's placing a great deal of faith in the goodwill of spammers.
They'll post to 24 groups, and send out a second post to the 24th and
25th. It's all automated to them, so why not?
You can do it with a triplet of entries:
foo!1:*,!*.announce,!*.answers:G5,Tm:foo
foo!2:!*,*.announce,*.answers:Tm:foo
foo:!*:...however foo should be fed...
>I forgot to mention one other advantage the patch has over the 'G'
>flag: your feed doesn't have to be running INN.
The disadvantage, tho, is that the article must be sent in the first
place, right?
/r$