A simple definition of spam


Mitchell Golden

Sep 30, 1994, 5:53:42 PM

I would like to propose the following simple definition of spam:

"Spam is any mass posting of off-topic articles. The articles can
either be posted separately or cross-posted. They may be posted
at once or over the course of days. The content of the articles is
not relevant: the only determination that need be made is whether or
not they are off-topic in most groups in which they are posted."

How big a "mass posting" is can be debated, but it strikes me that
this definition gets at the heart of the matter because it singles
out what is really wrong with spamming: it disrupts the topicality
of the newsgroups on a massive scale.

Comments?

Dave Hayes

Sep 30, 1994, 7:12:15 PM

gol...@harpo.harvard.edu (Mitchell Golden) writes:
>"Spam is any mass posting of off-topic articles. The articles can
>either be posted separately or cross-posted. They may be posted
>at once or over the course of days. The content of the articles is
>not relevant: the only determination that need be made is whether or
>not they are off-topic in most groups in which they are posted."

But...how do you determine if a message is off-topic if the content
is not relevant?
--
Dave Hayes -- Institutional NETworks - Section 394 -- JPL/NASA - Pasadena CA
da...@elxr.jpl.nasa.gov da...@jato.jpl.nasa.gov ...usc!elroy!dxh

He who has self-conceit in his head -
Do not imagine that he will ever hear the truth.

Seth Breidbart

Sep 30, 1994, 7:21:24 PM

In article <Cwyq5...@das.harvard.edu>,

Cross-posting isn't as bad as multi-posting. I propose weighting each
post by a factor of, say, sqrt(#newsgroups), and adding the weights.
10 or more is spam. Articles properly crossposted to
news.announce.newgroups aren't spam, by definition.

Seth
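
A minimal sketch of the weighting proposed above, in Python. The function names and the input shape (one newsgroup count per physically separate article) are assumptions made for illustration, and the news.announce.newgroups exemption is not modeled.

```python
import math


def spam_score(newsgroup_counts):
    """Sum sqrt(#newsgroups) over every physically separate article.

    newsgroup_counts holds one integer per article: the number of groups
    on that article's Newsgroups line (hypothetical input format).
    """
    return sum(math.sqrt(n) for n in newsgroup_counts)


def is_spam(newsgroup_counts, threshold=10.0):
    """Apply the proposed rule: a total weight of 10 or more is spam."""
    return spam_score(newsgroup_counts) >= threshold


# One article crossposted to 100 groups weighs the same as
# ten separate single-group articles.
one_big_crosspost = [100]
ten_multiposts = [1] * 10
```

Under this rule a single 100-group crosspost and ten separate single-group articles both score exactly 10, so multiposting reaches the threshold with far fewer groups than crossposting does.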

Wilf Leblanc

Oct 1, 1994, 8:50:04 AM

se...@panix.com (Seth Breidbart) writes:
[...]

>Cross-posting isn't as bad as multi-posting. I propose weighting each
>post by a factor of, say, sqrt(#newsgroups), and adding the weights.
>10 or more is spam. Articles properly crossposted to
>news.announce.newgroups aren't spam, by definition.

I think 20 or more was the number being used. Anything less might
be net-abuse but not cancelled.

Wrt the original poster, content is totally irrelevant. No special
cases.

Of course this is all IMHO...

>Seth

Al Black

Oct 2, 1994, 6:24:01 PM

Yep, and I like it as is, without a number. If the definition specifies a
number, we'll see mass postings of N-1 articles. Besides, from all the
discussion seen here about cancelling and definitions of spam, both the
degree to which the articles are off topic and the number of articles
are considered in the definition.

al

--
a...@debra.dgbt.doc.ca ae...@freenet.carleton.ca

Mitchell Golden

Oct 3, 1994, 10:47:16 AM

Seth Breidbart (se...@panix.com) wrote:
: Cross-posting isn't as bad as multi-posting. I propose weighting each
: post by a factor of, say, sqrt(#newsgroups), and adding the weights.
: 10 or more is spam. Articles properly crossposted to
: news.announce.newgroups aren't spam, by definition.

This is the point I'm trying to discuss. I claim that the major
difference between crossposting and multiposting is really a rather
minor technical difference that could be fixed by software(*). The
problem with spam is that it disrupts the discussion going on in the
groups. That would happen no matter what the technical aspects of
the reader and server software were.

My claim is that the real problem with spam is purely human, not one
of resources. If there were infinite Usenet resources at every site
and bandwidth were free, spam would still be bad.

(As a side issue, you might say that crossposting is _worse_ than
multiposting if the followup-to line is the same as the Newsgroups
line. That means that whatever flamewar develops in response to the
spam happens in all the groups it was crossposted to, not just one.)

(*) Suppose when a site downloaded an article, it computed a hash
function for it. If the hash of the article was the same as one of
the other articles currently on the site, the article would be saved
as though it were crossposted. Moreover, the article could then be
propagated as though it were crossposted. This would remove all the
difference between crossposting and multiposting with respect to
issues of bandwidth and storage.
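
The footnote's idea can be sketched as follows. This is only an illustration under stated assumptions: the `Spool` class, its method names, and the choice of SHA-256 over the article body are all hypothetical, and real news software keys articles by Message-ID rather than by content.

```python
import hashlib


class Spool:
    """Toy article store that collapses identical bodies into one entry."""

    def __init__(self):
        # digest of body -> {"body": str, "newsgroups": list of str}
        self.articles = {}

    def receive(self, newsgroups, body):
        """Store an article; return True if it is new and should propagate."""
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        stored = self.articles.get(digest)
        if stored is None:
            self.articles[digest] = {"body": body, "newsgroups": list(newsgroups)}
            return True
        # Same body already on the site: treat it as a crosspost by
        # extending the stored Newsgroups list instead of saving a copy.
        for group in newsgroups:
            if group not in stored["newsgroups"]:
                stored["newsgroups"].append(group)
        return False


spool = Spool()
first = spool.receive(["comp.misc"], "Buy our widgets today.")
second = spool.receive(["rec.misc"], "Buy our widgets today.")
```

Only the first copy is stored and forwarded; the second merely adds its group to the stored entry. Because the match is on an exact digest, any change to the body, however small, produces a separate entry.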

Wilf Leblanc

Oct 3, 1994, 12:30:03 PM

gol...@harpo.harvard.edu (Mitchell Golden) writes:
>Seth Breidbart (se...@panix.com) wrote:
>: Cross-posting isn't as bad as multi-posting. I propose weighting each
>: post by a factor of, say, sqrt(#newsgroups), and adding the weights.
>: 10 or more is spam. Articles properly crossposted to
>: news.announce.newgroups aren't spam, by definition.

>This is the point I'm trying to discuss. I claim that the major
>difference between crossposting and multiposting is really a rather
>minor technical difference that could be fixed by software(*). The
>problem with spam is that it disrupts the discussion going on in the
>groups. That would happen no matter what the technical aspects of
>the reader and server software were.

The difference between spam and not receiving *anything* is only
a minor technical difference. My newsreader could relatively
easily detect when an article or a series of articles were spam
and torch them automatically.

>My claim is that the real problem with spam is purely human, not one
>of resources. If there were infinite Usenet resources at every site
>and bandwidth were free, spam would still be bad.

In a sense I agree with you... most admins would not know if their
sites or a neighbouring site was spamming unless someone alerted
them. Spam is highly antisocial, but the impact on news transport
is a little overestimated (IMHO).

>(As a side issue, you might say that crossposting is _worse_ than
>multiposting if the followup-to line is the same as the Newsgroups
>line. That means that whatever flamewar develops in response to the
>spam happens in all the groups it was crossposted to, not just one.)

Well, I still disagree, crossposts are not as bad as multi-postings
for the simple reason that *many* newsreaders will only show you
a crossposted article once. I admit that I could very well be biased
since my newsreader has this feature.

However, as Al Black has pointed out, defining what spam is a little
too carefully will just cause the knowledgeable spammers to spam
a little less than the minimum limit. I'm not really all that sure
how good spam is as a marketing tool... and the knowledgeable spammer
probably wouldn't spam.

--
wilf

Nico Garcia

Oct 3, 1994, 6:00:31 PM

In article <Cx3qE...@das.harvard.edu> gol...@harpo.harvard.edu (Mitchell Golden) writes:

   This is the point I'm trying to discuss. I claim that the major
   difference between crossposting and multiposting is really a rather
   minor technical difference that could be fixed by software(*). The

   [...]

   (*) Suppose when a site downloaded an article, it computed a hash
   function for it. If the hash of the article was the same as one of
   the other articles currently on the site, the article would be saved
   as though it were crossposted. Moreover, the article could then be
   propagated as though it were crossposted. This would remove all the
   difference between crossposting and multiposting with respect to
   issues of bandwidth and storage.

No. It still takes the transmission of downloading it, it makes the
victim responsible for protecting themselves from the depredations of
the perpetrator, and any such broadly used filter will, sadly, be
worked around within minutes of publication (by adding a few garbage
bytes or spaces, for example, to the individual posts). Nice try, but
like many hardware solutions for behavioral problems, it will only
work for a little bit *and* adds additional burdens for someone.

Nico Garcia
ra...@athena.mit.edu
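
The workaround described above is easy to demonstrate. In this sketch the two digest functions and the sample strings are hypothetical, and SHA-256 stands in for whatever hash a site might use: padding with spaces defeats an exact-match digest, whitespace normalization recovers that case, and a few garbage characters defeat both.

```python
import hashlib


def exact_digest(body):
    """Digest of the body exactly as received."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def normalized_digest(body):
    """Digest after collapsing every run of whitespace to a single space."""
    collapsed = " ".join(body.split())
    return hashlib.sha256(collapsed.encode("utf-8")).hexdigest()


original = "Buy our widgets today."
padded = original + "  \n"    # trailing spaces added to one copy
salted = original + " xq7"    # a few garbage characters added

padding_beats_exact = exact_digest(original) != exact_digest(padded)
normalizing_catches_padding = normalized_digest(original) == normalized_digest(padded)
garbage_beats_both = normalized_digest(original) != normalized_digest(salted)
```

All three flags come out true, which is the objection in a nutshell: each refinement of the filter handles one evasion and leaves the next one open.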

Seth Breidbart

Oct 3, 1994, 8:35:19 PM

In article <Cx3qE...@das.harvard.edu>,

Mitchell Golden <gol...@harpo.harvard.edu> wrote:
>Seth Breidbart (se...@panix.com) wrote:
>: Cross-posting isn't as bad as multi-posting. I propose weighting each
>: post by a factor of, say, sqrt(#newsgroups), and adding the weights.
>: 10 or more is spam. Articles properly crossposted to
>: news.announce.newgroups aren't spam, by definition.
>
>This is the point I'm trying to discuss. I claim that the major
>difference between crossposting and multiposting is really a rather
>minor technical difference that could be fixed by software(*). The
>problem with spam is that it disrupts the discussion going on in the
>groups. That would happen no matter what the technical aspects of
>the reader and server software were.

As a newsreader, I see a crosspost once, a multipost many times.
That's a major difference. That's why I weight multiposts much
higher.

Seth

Rahul Dhesi

Oct 3, 1994, 8:33:35 PM

In <Cx3qE...@das.harvard.edu> gol...@harpo.harvard.edu (Mitchell
Golden) writes:

>I claim that the major
>difference between crossposting and multiposting is really a rather
>minor technical difference that could be fixed by software(*). The
>problem with spam is that it disrupts the discussion going on in the
>groups. That would happen no matter what the technical aspects of
>the reader and server software were.

Not really: Usenet news readers show the user each cross-posted
article only once. (I won't say anything about non-Usenet news
readers.)

My definition remains valid:

More than five physically distinct postings with substantially
identical content posted within a period of ten days.
--
Rahul Dhesi <dh...@rahul.net>
also: dh...@cirrus.com

Mitchell Golden

Oct 4, 1994, 9:19:19 AM

Nico Garcia (ra...@athena.mit.edu) wrote:
: No. It still takes the transmission of downloading it, it makes the
: victim responsible for protecting themselves from the depredations of
: the perpetrator, and any such broadly used filter will, sadly, be
: worked around within minutes of publication (by adding a few garbage
: bytes or spaces, for example, to the individual posts). Nice try, but
: like many hardware solutions for behavioral problems, it will only
: work for a little bit *and* adds additional burdens for someone.

While I agree that my "solution" for multiposting doesn't work in
all cases, it would have reduced the Canter and Siegel spam to a
single post. And yes, someone who was determined to get around the
feature undoubtedly could. They probably wouldn't, however, because
most multiposting spammers have all the technical abilities of the
typical newbie. Besides, they probably don't care if you see the ad
more than once, they just want everyone to see it.

But the point I'm trying to make _isn't_ that such a solution should
be implemented. Let me say it again: I am _not_ advocating _any_
change of this type to the software. What I'm trying to say is that
the resource based arguments - how much disk space and bandwidth are
used at the recipients' sites, etc - _aren't_ what's wrong with spam.
The problem with it is the massive way it disrupts the discussions
in all the different groups.

The problem also _isn't_ that you see the ad more than once. Again,
a smarter newsreader could doubtless fix this too. But since
spam is frequently posted to a number of very disparate groups, you
might not come across the ad more than once anyway. You _will_
however, come across the flamefest that the spam usually incites.
Thus, what's wrong with spam is that it incites such flamewars all
over the place. It's a small offense (an off topic post) on a
repeated, massive scale.

Wilf Leblanc

Oct 4, 1994, 10:40:04 AM

gol...@harpo.harvard.edu (Mitchell Golden) writes:
[...]

Wrt the "resource" arguments I tend to agree. Furthermore,
a cancel-bot also is a resource hog (arguably as bad as the
original spam) so the resource argument against spamming isn't
all that good.

>The problem also _isn't_ that you see the ad more than once. Again,
>a smarter newsreader could doubtless fix this too. But since
>spam is frequently posted to a number of very disparate groups, you
>might not come across the ad more than once anyway. You _will_
>however, come across the flamefest that the spam usually incites.
>Thus, what's wrong with spam is that it incites such flamewars all
>over the place. It's a small offense (an off topic post) on a
>repeated, massive scale.

Spam should really be ignored, and it isn't the spammer's fault
that a bunch of idiots decide to start a flame war over an off-topic
post. As far as I'm concerned, people who continue the off-topic
discussion publicly are idiots. Arguments such as "he started it"
should be limited to grade school playgrounds.

Personally, I don't think spamming is a huge problem yet, but it is
probably a good thing to stop before things get out of hand.

--
wilf

Nico Garcia

Oct 4, 1994, 4:04:07 PM

In article <Cx5H0...@das.harvard.edu> gol...@harpo.harvard.edu (Mitchell Golden) writes:

   But the point I'm trying to make _isn't_ that such a solution should
   be implemented. Let me say it again: I am _not_ advocating _any_
   change of this type to the software. What I'm trying to say is that
   the resource based arguments - how much disk space and bandwidth are
   used at the recipients' sites, etc - _aren't_ what's wrong with spam.
   The problem with it is the massive way it disrupts the discussions
   in all the different groups.

It's much *easier* to restrict it on the basis of waste of disk space
and bandwidth. That's why the multi-post spam definition works so
well, and it's one of the flaws of your description. Multi-post spam
still requires downloading it to generate that hash table, then
checking the hash table. That can suck up a considerable chunk of
CPU time and space as well, just to decide whether any incoming
post is spam. Plus someone has to write it, and install it everywhere.

Also remember that people like C&S will publish how to get around such
filters, and I'm sure it would appear in things like phrack or 2600
within days after its installation. It creates a security through
obscurity, and an extra burden for news sites to protect themselves
from something that shouldn't be done in the first place.

Your ideas are quite reasonable. I just don't think they work, since
they are insufficient to protect against even a slightly clever
spammer, and they suggest that because we don't lock our systems with
better locks, it's our fault if someone breaks in and steals our TV.
Or that the resulting riot when the thief is beaten while being
arrested is our fault, too, for letting him in so easily. There's
something wrong here....

Nico Garcia
ra...@athena.mit.edu

har...@ulogic.com

Oct 4, 1994, 4:52:44 PM

In article <36i60f$n...@elxr.jpl.nasa.gov> da...@elxr.jpl.nasa.gov (Dave Hayes) writes:
>gol...@harpo.harvard.edu (Mitchell Golden) writes:
>>"Spam is any mass posting of off-topic articles. The articles can
>>either be posted separately or cross-posted. They may be posted
>>at once or over the course of days. The content of the articles is
>>not relevant: the only determination that need be made is whether or
>>not they are off-topic in most groups in which they are posted."
>
>But...how do you determine if a message is off-topic if the content
>is not relevant?
>--

For the picky:

"Content is relevant solely in determining whether a given posting
is on-topic for the newsgroup(s) in which it was found. Content
may not be used for the cancellation on the basis of "objectionable
content" -- just about any content may be considered on-topic in
_some_ newsgroup."

Improvements to this qualification are welcome -- I am sure most of
you recognize the intent behind it.


-Richard Hartman
har...@ulogic.COM

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
"You cannot have the right to do what is wrong." -A. Lincoln

Nico Garcia

Oct 6, 1994, 4:09:45 PM

In article <Cx987...@das.harvard.edu> gol...@harpo.harvard.edu (Mitchell Golden) writes:

   In my hypothetical solution, _one_ site has to do the hashing etc,
   the article then propagates as though it were crossposted. The
   waste of net resources happens only on or near the poster's site.

This requires a centralized server, and a major revamp of the news
protocols. Possible, but I don't think desirable, and it's not
reasonable to reduce the blame for the damage of spam because
we *could* do this. It's like blaming us for car theft because
we *could* remove the spark plugs every time we park.

   articles. I bet there are many people who used more net resources
   in one week than does the typical (100 group, say) spam. We don't
   object to them, because they are just well behaved, active
   participants in Usenet. If we define spam via resources, a spammer

It's not the net use of resources, I think. It's the sudden and
over-whelming use of resources. A sudden burst of postage can and does
overwhelm servers, particularly when it adds to a *lot* of low-traffic
groups, and when it requires all these multiple copies. A modest,
although high-level flow can easily be coped with. I think there's
a very real difference.

   Hence my definition: spam is massive disruption of the newsgroup
   structure.

Maybe. I prefer the precise technical definition of "50 or more
duplicate posts at the same time". This will catch a large number,
most even, of the newbie or malicious C&S style spammers. Any other
cases can be examined in light of other standards, such as if it's
on-topic or libelous or off-charter or false advertising or violates
user agreements.

Nico Garcia
ra...@athena.mit.edu

Mitchell Golden

Oct 6, 1994, 5:21:21 PM

Nico Garcia (ra...@athena.mit.edu) wrote:
: In article <Cx987...@das.harvard.edu> gol...@harpo.harvard.edu
(Mitchell Golden) writes:

:: In my hypothetical solution, _one_ site has to do the hashing etc,
:: the article then propagates as though it were crossposted. The
:: waste of net resources happens only on or near the poster's site.

: This requires a centralized server, and a major revamp of the news
: protocols. Possible, but I don't think desirable, and it's not
: reasonable to reduce the blame for the damage of spam because
: we *could* do this. It's like blaming us for car theft because
: we *could* remove the spark plugs every time we park.

No centralized server is needed; not if it's implemented cleverly.
When a site receives an article, it hashes it to see if it already
has the article. If it does, then it just modifies the header of
the saved article to include the new group. When the site
propagates the article further, it just propagates it once. Thus,
either the posting site or some site "nearby" collapses the
multipost down to a single crossposted article, and no other site
has to do it again.

Again - I'm not arguing for this. I just am pointing out that the
amount of Usenet resources used by Usenet isn't a fixed given thing.
If there were some useful purpose to be served by allowing this kind
of posting, someone would write software to implement something akin
to what I'm suggesting. It's just that there _isn't_ any purpose to
be served by doing this, because spam really is bad for reasons
other than the resources it uses.

:: articles. I bet there are many people who used more net resources
:: in one week than does the typical (100 group, say) spam. We don't
:: object to them, because they are just well behaved, active
:: participants in Usenet. If we define spam via resources, a spammer

: It's not the net use of resources, I think. It's the sudden and
: over-whelming use of resources.

Canter and Siegel used resources overwhelmingly, most other spams
don't. Suppose for example that David Rhodes posts his chain letter
into 10 groups every day. That's not much of a use of resources -
I've seen people who (properly) post 10 messages a day into one
group! The difference between the Rhodes letter and the others _is
their effect on the group_, not their use of resources.

Ron Newman

Oct 6, 1994, 6:37:51 PM

In article <Cx9sn...@das.harvard.edu>,

Mitchell Golden <gol...@harpo.harvard.edu> wrote:
>When a site receives an article, it hashes it to see if it already
>has the article. If it does, then it just modifies the header of
>the saved article to include the new group. When the site
>propagates the article further, it just propagates it once. Thus,
>either the posting site or some site "nearby" collapses the
>multipost down to a single crossposted article, and no other site
>has to do it again.

Let's say you receive a 10-newsgroup multi-post. The articles
are sent individually to newsgroups N1 through N10, with Message-IDs
M1 through M10 respectively.

You propose to create a new article, whose Message-ID is M1,
and whose Newsgroups line contains all of N1 through N10.

The result is that you have two different versions of
an article, both with Message-ID M1, floating around the Net: the original,
and your cross-posted copy. Meanwhile, M2 through M10 are still
around, and some other site is converting one of THEM to a
cross-post.

I predict lots of nasty confusion resulting from this kind of scheme.
--
Ron Newman MIT Media Laboratory
rne...@media.mit.edu
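
The confusion predicted above can be shown with a toy model. News sites keep a history of Message-IDs already seen and refuse repeats, so which version of M1 a site ends up holding depends on which one arrives first. The `Site` class, its method names, and the placeholder IDs and group names below are hypothetical and only sketch that behaviour.

```python
class Site:
    """Toy news site that accepts each Message-ID only once."""

    def __init__(self):
        # Message-ID -> Newsgroups list of the version stored first
        self.history = {}

    def offer(self, message_id, newsgroups):
        """Return True if the article was accepted, False if already seen."""
        if message_id in self.history:
            return False
        self.history[message_id] = list(newsgroups)
        return True


# The original article and the rewritten crosspost share Message-ID M1.
ORIGINAL = ("<M1>", ["N1"])
MERGED = ("<M1>", ["N%d" % i for i in range(1, 11)])

site_a, site_b = Site(), Site()

# Site A hears the original first, so it refuses the merged copy.
site_a.offer(*ORIGINAL)
site_a.offer(*MERGED)

# Site B hears the merged copy first, so it refuses the original.
site_b.offer(*MERGED)
site_b.offer(*ORIGINAL)
```

After both deliveries, site A files M1 under one group and site B files it under ten, and neither will ever accept the other version.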

Mitchell Golden

Oct 6, 1994, 9:59:29 AM

Nico Garcia (ra...@athena.mit.edu) wrote:
: It's much *easier* to restrict it on the basis of waste of disk space
: and bandwidth. That's why the multi-post spam definition works so
: well, and it's one of the flaws of your description. Multi-post spam
: still requires downloading it to generate that hash table, then
: checking the hash table. That can suck up a considerable chunk of
: CPU time and space as well, just to decide whether any incoming
: post is spam. Plus someone has to write it, and install it everywhere.

In my hypothetical solution, _one_ site has to do the hashing etc,
the article then propagates as though it were crossposted. The
waste of net resources happens only on or near the poster's site.

And yes, for it to work, it would have to be installed most places.
But since it was only a point made for the sake of argument, we
don't have to go about campaigning for it yet.

To go back to the beginning: the problem with defining spam by the
amount of net resources it uses is this: There are people who use a
lot of net resources. They just post a lot, and some post very long
articles. I bet there are many people who used more net resources
in one week than does the typical (100 group, say) spam. We don't
object to them, because they are just well behaved, active
participants in Usenet. If we define spam via resources, a spammer
will be able to argue that since others use more resources than
they, what they're doing is okay.

Rahul Dhesi

Oct 6, 1994, 6:49:00 PM

In <Cx987...@das.harvard.edu> gol...@harpo.harvard.edu (Mitchell
Golden) writes:

>Hence my definition: spam is massive disruption of the newsgroup
>structure.

Some would argue that this is the nature of Usenet. That disruption
of the newsgroup structure *is* the normal state.

But more importantly, the above definition would not be precise enough
to be used in defining a crime.

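My definition is precise enough:

     More than five physically distinct postings with substantially
     identical content posted within a period of ten days.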

Tom Haapanen

Oct 7, 1994, 12:29:15 PM

Rahul Dhesi <dh...@rahul.net> writes:
> My definition is precise enough:
> More than five physically distinct postings with substantially
> identical content posted within a period of ten days.

Sorry, too strict. My Windows newsgroup "how-to" postings are substantially
identical, and are posted as three separate articles to keep the Newsgroups:
line below 256 characters.

In September, I autoposted these three times a week to reduce misposting
by the new students coming in. That's 12-15 postings per ten days.

Or was that spamming, too?

--
[ /tom haapanen -- to...@metrics.com -- software metrics inc -- waterloo, ont ]
[ "time flies like an arrow, fruit flies like a banana" ]

Rahul Dhesi

Oct 7, 1994, 8:42:49 AM

In <RAOUL.94O...@cacciatore.mit.edu> ra...@athena.mit.edu (Nico
Garcia) writes:

>Maybe. I prefer the precise technical definition of "50 or more
>duplicate posts at the same time".

Too kind, and too vague. I see no justification for even 10 duplicate
postings, if by duplicate you mean physically distinct but with the
same content. And note that C&S didn't send their 5000+ postings at
the same time: they were sent at small intervals. I still like my
definition better:

More than five physically distinct postings with substantially
identical content posted within a period of ten days.

Dave Hayes

Oct 7, 1994, 7:53:32 PM

Rahul Dhesi <dh...@rahul.net> writes:
>same content. And note that C&S didn't send their 5000+ postings at
>the same time: they were sent at small intervals. I still like my
>definition better:
> More than five physically distinct postings with substantially
> identical content posted within a period of ten days.

I have a better one:

"Any posting that causes more than 50 net.people to react vehemently
and irrationally against the poster and the site that originated the
posting."


--
Dave Hayes -- Institutional NETworks - Section 394 -- JPL/NASA - Pasadena CA
da...@elxr.jpl.nasa.gov da...@jato.jpl.nasa.gov ...usc!elroy!dxh

Do not take life too seriously; you will never get out of it alive.

Alexander Lehmann

Oct 8, 1994, 2:00:12 PM

Ron Newman (rne...@media.mit.edu) wrote:
: In article <Cx9sn...@das.harvard.edu>,

: Mitchell Golden <gol...@harpo.harvard.edu> wrote:
: >When a site receives an article, it hashes it to see if it already
: >has the article. If it does, then it just modifies the header of
: >the saved article to include the new group. When the site
: >propagates the article further, it just propagates it once. Thus,
: >either the posting site or some site "nearby" collapses the
: >multipost down to a single crossposted article, and no other site
: >has to do it again.

: Let's say you receive a 10-newsgroup multi-post. The articles
: are sent individually to newsgroups N1 through N10, with Message-IDs
: M1 through M10 respectively.

: You propose to create a new article, whose Message-ID is M1,
: and whose Newsgroups line contains all of N1 through N10.

: The result is that you have two different versions of
: an article, both with Message-ID M1, floating around the Net: the original,
: and your cross-posted copy. Meanwhile, M2 through M10 are still
: around, and some other site is converting one of THEM to a
: cross-post.

If identical articles should be combined into a crossposted article, it
would be necessary to enumerate the message-ids of all articles to avoid
receiving the combined articles along other paths. This would certainly be
possible (technically), maybe with a header line Also-Message-IDs, but given
the slowness with which additions to the news standard are introduced, there
is no chance at all that this will get implemented in the millennium (this
estimate might be a bit drastic, but given other new features, e.g. ISO
charsets, I would say it is about valid).

While this is technically possible, the effects of this are almost none,
since `knowledgeable' spammers would just add a few junk lines or maybe the
current time to each posting. Back to square one.

bye, Alexander

--
Alexander Lehmann, | "On the Internet,
al...@hal.rhein-main.de (plain, MIME, NeXT) | nobody knows
alex...@rbg.informatik.th-darmstadt.de (plain) | you're a dog."

Rahul Dhesi

Oct 9, 1994, 4:29:37 AM

In <CxB9s...@metrics.com> to...@metrics.com (Tom Haapanen) writes:

>My Windows newsgroup "how-to" postings are substantially
>identical, and are posted as three separate articles to keep the Newsgroups:
>line below 256 characters.

>In September, I autoposted these three times a week to reduce misposting
>by the new students coming in. That's 12-15 postings per ten days.

>Or was that spamming, too?

That is spamming indeed, though just barely so.

The original once a week and occasional brief pointers to it more often
would be marginally proper. Periodic postings should be generally
limited to no more often than once a month. A longer expiration time
will help it stick around on most Usenet sites.

Tom Haapanen

Oct 10, 1994, 1:07:50 PM

> to...@metrics.com (Tom Haapanen) writes:
>> My Windows newsgroup "how-to" postings are substantially identical,
>> and are posted as three separate articles to keep the Newsgroups:
>> line below 256 characters.

>> In September, I autoposted these three times a week to reduce misposting
>> by the new students coming in. That's 12-15 postings per ten days.
>> Or was that spamming, too?

Rahul Dhesi <dh...@rahul.net> writes:
> That is spamming indeed, though just barely so.
>
> The original once a week and occasional brief pointers to it more often
> would be marginally proper. Periodic postings should be generally
> limited to no more often than once a month. A longer expiration time
> will help it stick around on most Usenet sites.

The "how-to" postings are indeed very brief: there is a 50-line "How to
get the FAQ" posting, and a 100-line guide to Windows newsgroups. But
does briefness make it non-spam? Or are you saying there is a difference
between "good spam" and "bad spam"? Absolute definitions for identifying
spams should be *very* thoroughly thought out...

The reason for the frequent September postings was not the expiration --
it's the fact that most people don't bother reading through old postings
when they first start reading a newsgroup. These frequent reminders
reduced the number of extremely basic newbie questions significantly.

Incidentally, I did not receive a single complaint about the frequency
of the postings.

--
[ /tom haapanen -- to...@metrics.com -- software metrics inc -- waterloo, ont ]

[ "everything that can be invented has been invented." ]
[ -- charles h. duell, u.s. patent commissioner, 1899 ]

har...@ulogic.com

Oct 11, 1994, 8:24:26 PM

In article <CxAzB...@rahul.net>, Rahul Dhesi <dh...@rahul.net> wrote:
>In <RAOUL.94O...@cacciatore.mit.edu> ra...@athena.mit.edu (Nico

>
> More than five physically distinct postings with substantially
> identical content posted within a period of ten days.

First you must define "physically distinct". Since each posting
has a unique message ID, perhaps this could be used in your
definition to clean up that particular vagueness.

-rmh

Mitchell Golden

Oct 13, 1994, 9:31:33 AM

Alexander Lehmann (alex...@rbg.informatik.th-darmstadt.de) wrote:

: Ron Newman (rne...@media.mit.edu) wrote:
: : In article <Cx9sn...@das.harvard.edu>,
[del]
: : The result is that you have two different versions of
: : an article, both with Message-ID M1, floating around the Net: the original,
: : and your cross-posted copy. Meanwhile, M2 through M10 are still
: : around, and some other site is converting one of THEM to a
: : cross-post.
[del]
: If identical articles should be combined into a crossposted article, it
: would be necessary to enumerate the message-ids of all articles to avoid
: receiving the combined articles along other paths. This would certainly be
: possible (technically), maybe with a header line Also-Message-IDs, but given
: the slowness with which additions to the news standard are introduced, there
: is no chance at all that this will get implemented in the millennium (this
: estimate might be a bit drastic, but given other new features, e.g. ISO
: charsets, I would say it is about valid).

: While this is technically possible, the effects of this are almost none,
: since `knowledgeable' spammers would just add a few junk lines or maybe the
: current time to each posting. Back to square one.

My point in making this suggestion wasn't to start a debate about
how to actually implement the proposal. I know that there are
technical problems that would need to be overcome, but _if there
were any reason to do it_, I'm sure the technical issues could be
addressed.

My point was something else. People have been claiming that the
problem with spam is its use of net resources. My claim is that the
use of net resources has nothing to do with what's wrong with spam.
The other point I'm addressing is that whether the articles are
cross-posted or not is a _secondary_ concern. It's rather like the
issue of whether a big or small gun was used to hold up a
convenience store.

Now the definitions that have been proposed on this thread have the
property of being more "exact" than mine, but they don't follow from
any ethical theory of what is wrong with spam. So let me address
those who follow the thread this question: You say 50 (or whatever)
posts is too many. What is your theory of what is wrong with spam,
and how did that lead you to choose 50?

Rahul Dhesi

unread,
Oct 13, 1994, 7:17:12 PM10/13/94
to
In <onetouchC...@netcom.com> har...@ulogic.com writes:

>First you must define "physically distinct". Since each posting
>has a unique message ID, perhaps this could be used in your
>definition to clean up that particular vagueness.

Indeed, distinct message-ids would easily define physically distinct
postings.
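
For what it's worth, the rule quoted above (more than five physically
distinct postings with substantially identical content within ten days)
is easy enough to sketch once "physically distinct" is pinned to the
message ID. The Python below is only an illustration, not anybody's
actual cancel-bot: the names `normalize` and `exceeds_limit` are made
up, and treating "substantially identical" as "equal after collapsing
case and whitespace" is an assumption, since the thread never settles
that point.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def normalize(body):
    # Assumption: "substantially identical" means equal once case and
    # runs of whitespace are collapsed. A real test would need more.
    return " ".join(body.lower().split())


def exceeds_limit(postings, limit=5, window=timedelta(days=10)):
    """Return True if more than `limit` distinct message IDs carry the
    same normalized body within any span of `window`.

    postings: iterable of (message_id, body, posted_at) tuples.
    """
    by_content = defaultdict(dict)
    for message_id, body, posted_at in postings:
        # Keyed by message ID, so one cross-posted article counts once
        # no matter how many newsgroups it appears in.
        by_content[normalize(body)][message_id] = posted_at

    for ids in by_content.values():
        times = sorted(ids.values())
        start = 0
        for end in range(len(times)):
            # Shrink the window until it spans at most `window`.
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 > limit:
                return True
    return False
```

Because postings are keyed by message ID, a single article cross-posted
to many groups counts once, while six separately posted copies inside
ten days trip the limit.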

Wilf Leblanc

unread,
Oct 13, 1994, 3:00:13 PM10/13/94
to
gol...@harpo.harvard.edu (Mitchell Golden) writes:
>Alexander Lehmann (alex...@rbg.informatik.th-darmstadt.de) wrote:
>: Ron Newman (rne...@media.mit.edu) wrote:
>: : In article <Cx9sn...@das.harvard.edu>,
[...]

>My point was something else. People have been claiming that the
>problem with spam is its use of net resources. My claim is that the
>use of net resources has nothing to do with what's wrong with spam.
>The other point I'm addressing is that whether the articles are
>cross-posted or not is a _secondary_ concern. It's rather like the
>issue of whether a big or small gun was used to hold up a
>convienience store.

Spam really isn't all that bad yet, and the operative word is *YET*.
I don't really find it all that annoying and to tell you the truth
I see the real problem as something which could happen in the future.
If the current rate of growth of spamming isn't curtailed it might
not be all that long before spamming is using a lot of net-resources.

>Now the definitions that have been proposed on this thread have the
>property of being more "exact" than mine, but they don't follow from
>any ethical theory of what is wrong with spam. So let me address
>those who follow the thread this question: You say 50 (or whatever)
>posts is too many. What is your theory of what is wrong with spam,
>and how did that lead you to choose 50?

I think Al Black brought up the point that you can't really
define spam all that carefully since then the knowledgeable
spammers will just fit under the defn. I think that's a good
point. Spam is a large number of posts seemingly spread out
across the net randomly with no regards to where it lands.

Personally, I like this definition:

"Spam is what gets cancelled".

--
wilf

Scott Southwick

unread,
Oct 15, 1994, 4:14:55 PM10/15/94
to
In article <moose.101...@achilles.net>,
Wilf Leblanc <wi...@zeus.achilles.net> wrote:

>I think Al Black brought up the point that you can't really
>define spam all that carefully since then the knowledgeable
>spammers will just fit under the defn. I think that's a good
>point.

One of the unspoken reasons for the "20" definition seems to be: "What
could possibly come after the first twenty except twenty more?"
There's a point at which they just don't look like they're going to
stop; twenty is a large enough number to serve as that indicator,
while being low enough to allow early warnings.

So it's not that 20 is the Nearest Border of Pure Evil. It's just that
at 20, odds are real good that they're shooting for the 500 club.

And if they stop at 18 to get around a definition? Good. Maybe they'll
even pick their 18 groups with care.

yrs,
Scotty

p.s: BTW, this is yet another reason massive cross-posting just
doesn't bother me. When somebody cross-posts to twenty groups, does
anybody really quake with fear that there'll be hundreds more?

* disclaimer: I never ever speak for IU *
* *
* On the Internet, nobody knows you're a dog, *
* until you start barking. *

sco...@ancho.ucs.indiana.edu

unread,
Oct 15, 1994, 7:14:00 PM10/15/94
to

Mitchell Golden

unread,
Oct 16, 1994, 11:18:28 PM10/16/94
to
Wilf Leblanc (wi...@zeus.achilles.net) wrote:
: gol...@harpo.harvard.edu (Mitchell Golden) writes:
: >Now the definitions that have been proposed on this thread have the
: >property of being more "exact" than mine, but they don't follow from
: >any ethical theory of what is wrong with spam. So let me address
: >those who follow the thread this question: You say 50 (or whatever)
: >posts is too many. What is your theory of what is wrong with spam,
: >and how did that lead you to choose 50?

: I think Al Black brought up the point that you can't really
: define spam all that carefully since then the knowledgeable
: spammers will just fit under the defn. I think that's a good
: point. Spam is a large number of posts seemingly spread out
: across the net randomly with no regards to where it lands.

You answered one question, but not the other. The question is,
simply put

WHAT'S WRONG WITH SPAM?

What's wrong with posting randomly, with no regard to where the
posts land? Why is a spammed use of resources worse than any other
use of net resources? I don't ask this in jest, I'm trying to
understand the ethical theory behind the condemnation of spam.

: Personally, I like this definition:

: "Spam is what gets cancelled".

Would have been a good definition except that some people have taken
to cancelling anything they think is an ad, whether it's off topic or
not, and even if it's just a single posting.

Nico Garcia

unread,
Oct 17, 1994, 3:59:31 AM10/17/94
to
In article <Cxsru...@das.harvard.edu> gol...@harpo.harvard.edu (Mitchell Golden) writes:

You answered one question, but not the other. The question is,
simply put

WHAT'S WRONG WITH SPAM?

What's wrong with posting randomly, with no regard to where the
posts land? Why is a spammed use of resources worse than any other
use of net resources? I don't ask this in jest, I'm trying to
understand the ethical theory behind the condemnation of spam.

It's a fair question, but I think with an obvious answer. Spam
destroys the communications of the Net by burying news servers in
duplicates of irrelevant posting, wastes news client's time and money
by forcing downloading of these duplicates, wastes the reader's time
to jump past them in all the inappropriate newsgroups, buries mail
servers in the resulting flame wars, mail bombs, complaints, etc.,
wastes sys-admin's time dealing with the consequences, brings machines
crashing to a halt under the tidal wave of news and mail messages,
etc., etc.

It is not an ethical but a physical problem: it damages the Net and wastes our
time and money. It is usually also fraudulent or wrong in other ways:
the Green Card posting, as an example, offered to sell legal services
available elsewhere for the price of a stamp, and to people who were
not eligible for it (Canadians, for example). The Herbal spam was for
an herbal weight loss program, involving a series of false
testimonials under different names all testifying to its effectiveness
that were really all from the same guy.

These people also get slammed for spamming for the same reason Al
Capone got busted for income tax evasion, and the same reason
dangerous drivers get tickets for speeding. It's much easier to prove,
and busting them for that severely cuts into their ability to do other
dangerous or unethical things. It also costs a lot less to the people
punishing or stopping them.

Nico Garcia
ra...@athena.mit.edu

Brendan Dunn

unread,
Oct 18, 1994, 12:59:58 AM10/18/94
to
In article <Cxsru...@das.harvard.edu>,
Mitchell Golden <gol...@harpo.harvard.edu> wrote:
>You answered one question, but not the other. The question is,
>simply put
>
> WHAT'S WRONG WITH SPAM?

Not since 1982 at least.

>What's wrong with posting randomly, with no regard to where the
>posts land?

Somewhere in the vicinity of 1.24 million, I believe.

> Why is a spammed use of resources worse than any other
>use of net resources?

I think in this case you mean "tensile", but what I'm really looking for
is a list of the scores (by quarter) of every Bud Bowl.

> I don't ask this in jest, I'm trying to
>understand the ethical theory behind the condemnation of spam.

For spam, repeat this message 5000 times. For regular usenet, add 10%
signal.

--Brendan

Bob Allison

unread,
Oct 17, 1994, 11:46:18 PM10/17/94
to

Mitchell Golden <gol...@harpo.harvard.edu> wrote:
>
>You answered one question, but not the other. The question is,
>simply put
>
> WHAT'S WRONG WITH SPAM?
>
>What's wrong with posting randomly, with no regard to where the
>posts land? Why is a spammed use of resources worse than any other
>use of net resources? I don't ask this in jest, I'm trying to
>understand the ethical theory behind the condemnation of spam.


There's a wide spectrum between use and abuse of the Net, and what is
use or abuse is a matter of opinion, of course.

On the one end of the spectrum, you have a group of people carrying on a
two-way, inter-personal discussion. There are relationships among
them. The conversation is appropriate to the group. They may exchange
useful information, and they do not charge each other.

On the other end there's somebody spewing forth ads to any group,
whether appropriate or not. It usually does not offer useful information
without a charge. It causes a waste of time and money to busy people and
those who pay for their Net connection by the hour or byte. All with
little or no concern on the part of the spammer, because they are out to
make money.

--
Home Page: http://gagme.wwa.com/~boba
Finger ASCII ART FAQ: asci...@wwa.com
ASCII ART FTP: ftp.wwa.com/pub/Scarecrow
Email: bo...@wwa.com - Group: rec.arts.ascii

Nico Garcia

unread,
Oct 19, 1994, 7:14:40 PM10/19/94
to

Two years from now those 50 articles go to many more machines,
driving the price of the disk space and transfer time up. And many
more people will flame it, driving the mail traffic up. And the time
and bandwidth used by individual users having to flush it from their
news-reading will also be increased by the number of people it hits.

So the price dropping in the future is not as powerful as you might think.
The dollar amount will certainly fluctuate, but the waste of disk, bandwidth,
and time on user's and sys-admin's parts will remain fairly high. Moreover,
I don't think anyone *does* spam in 50 post chunks. It's merely a convenient
number, since we can't picture any legitimate, on-topic reason to do that
many distinct copies.

The simple advantage of cross-posting is that the net only has to transmit
it once, and you only have to kill it once. It doesn't waste the bandwidth,
disk, or recipient's time to deal with it except to delete it *once*.

There is no strong reason to get into the logic of why spam is bad:
pointing to the physical and technical aspects makes it easy to define
and restrict, rather than getting into dangerous and subtle problems
of deciding what is "on-topic" and who makes that judgment. I may not
*like* what people shout about, but I'll try to protect their right to
say so, even if it's off-topic, as long as they don't drown out others.

Nico Garcia
ra...@athena.mit.edu

John Payson

unread,
Oct 17, 1994, 10:19:41 PM10/17/94
to
In article <Cxsru...@das.harvard.edu>,
Mitchell Golden <gol...@harpo.harvard.edu> wrote:
>
> WHAT'S WRONG WITH SPAM?
>
>What's wrong with posting randomly, with no regard to where the
>posts land? Why is a spammed use of resources worse than any other
>use of net resources? I don't ask this in jest, I'm trying to
>understand the ethical theory behind the condemnation of spam.

Usenet works because, on average, most posts add value. If I post a review
of a new Animaniacs episode on alt.tv.animaniacs this post will use up some
resources on each of thousands of machines and will take some time from those
who read that group. On the other hand, that post will also have value to a
fraction (hopefully a large one) of the people that read it.

Part of the reason Usenet works so well is that posting is [from most servers
anyway] free. Thus, there is no financial disincentive to those who would
have something useful to post. Unfortunately, however, this also means that
the net gets seen by some as "*** FREE ADVERTISING ***".

Note that advertising is not, in and of itself, evil. After all, if I have
something to sell and I post an ad in a suitable newsgroup my posting had
value to the person(s) who bought my product(s). Of course, the posting
imposed a cost on everyone else. Which is to say that one needs to draw a
very fine line between what is acceptable and what is not, since many things
are highly borderline.

Spam, however, generally has a very low average value to the reader and, if
multi-posted rather than cross-posted, a rather high cost. Thus it usually
falls well beyond the line between "good" and "evil" advertising.
--
-------------------------------------------------------------------------------
supe...@mcs.com | "Je crois que je ne vais jamais voir... | J\_/L
John Payson | Un animal si beau qu'un chat." | ( o o )

Wilf Leblanc

unread,
Oct 17, 1994, 9:00:03 AM10/17/94
to
gol...@harpo.harvard.edu (Mitchell Golden) writes:
>Wilf Leblanc (wi...@zeus.achilles.net) wrote:
>: gol...@harpo.harvard.edu (Mitchell Golden) writes:
[...]

>: I think Al Black brought up the point that you can't really
>: define spam all that carefully since then the knowledgeable
>: spammers will just fit under the defn. I think that's a good
>: point. Spam is a large number of posts seeming spread out
>: across the net randomly with no regards to where it lands.

>You answered one question, but not the other. The question is,
>simply put

> WHAT'S WRONG WITH SPAM?

I did. "Spam is a large number of posts seemingly spread out
across the net randomly with no regards to where it lands".

It isn't a *huge* resource hog, but could become one if not
held in check. People tend to follow-up to it, and flame
the poster (which is not really the fault of the spammer),
which adds more noise. Spams tend to be ads, in which case
the users of the system (i.e. us) have to pay for the ads.
I don't care if this costs me 2 cents per month ... why should
I have to pay for *their* ads ?

[...]


>: "Spam is what gets cancelled".

>Would have been a good definition except that some people have taken
>to cancelling anything they think is an ad, whether it's off topic or
>not, and even if it's just a single posting.

First of all, single ads are not spam. Cancelling posts is unwise
and although it is done, it doesn't mean it's right. Single posts
might be construed as abusive or even marginally illegal (i.e. M.M.F),
but that doesn't mean they should be cancelled. I suppose it depends
how it's done though ... if you see MAKE.MONEY.FAST, save it,
cancel it, and then contact the author, explain why this is unwise,
and offer to repost it if the author refuses to see why it is wrong,
then I don't have a big problem with cancelling articles of that
nature.

Anyway, cancelling posts based on content is net-abuse.

--
wilf

Mitchell Golden

unread,
Oct 18, 1994, 9:01:38 PM10/18/94
to
Nico Garcia (ra...@athena.mit.edu) wrote:
: It's a fair question, but I think with an obvious answer. Spam
: destroys the communications of the Net by burying news servers in
: duplicates of irrelevant posting, wastes news client's time and money
: by forcing downloading of these duplicates,

While all these things are true, I think it's rather like arguing
that what's wrong with phone solicitations is that they wear out my
answering machine. The use of physical resources by spam is rather
small, and as the resources get cheaper over time, that aspect of
spam will be less and less of a problem.

: wastes the reader's time
: to jump past them in all the inappropriate newsgroups, buries mail
: servers in the resulting flame wars, mail bombs, complaints, etc.,
: wastes sys-admin's time dealing with the consequences, brings machines
: crashing to a halt under the tidal wave of news and mail messages,
: etc., etc.

Now we're talking. Even if the net were free, this part would still
be there. Let me say it this way. Suppose we use your definition
of spam from a past post: "50 or more duplicate posts at the same
time". Suppose that the reason for this is that that wastes some
dollar amount. Okay, two years from now that 50 post spam will cost
the net half as much (say) because disks are cheaper, bandwidth is
cheaper, etc. Does the limit now go to 100 posts? No, that 50 post
spam is still just as destructive to the net culture as it ever was.

So the underlying issue is not the use of net resources. The reason
I don't like your definition is that it doesn't state what the
problem is. How about a compromise like this:

"Spam is a mass posting of off-topic articles. A post is spam when
it disrupts the topicality of the newsgroups on a massive scale. A
group of substantially similar messages that appears in fifty groups
is nearly always spam; the exceptions mostly being posts devoted to
the maintenance of the group structure itself. Whether the
articles are posted separately or cross-posted is not important.
They may be posted at once or over the course of days. The content
of the articles is not relevant: the only determination that need be
made is whether or not they are off-topic in most groups in which
they are posted."

This definition has the property that it is relatively short,
(though not as short as I'd like) and fairly precise. It gives the
figure of 50 groups as a figure of merit, not as a hard-and-fast
number. (We don't want lots of 49 group spams.) It also lays out
IMHO what the real issue is.

Stephen Samuel

unread,
Oct 23, 1994, 6:54:37 AM10/23/94
to
In article <CxwAu...@das.harvard.edu>, gol...@harpo.harvard.edu
(Mitchell Golden) wrote:

> Nico Garcia (ra...@athena.mit.edu) wrote:
+ : It's a fair question, but I think with an obvious answer. Spam
+ : destroys the communications of the Net by burying news servers in
+ : duplicates of irrelevant posting, wastes news client's time and money
+ : by forcing downloading of these duplicates,


> While all these things are true, I think it's rather like arguing
> that what's wrong with phone solicitations is that they wear out my
> answering machine. The use of physical resources by spam is rather
> small, and as the resources get cheaper over time, that aspect of
> spam will be less and less of a problem.

One notable difference between phone solicitation and spamming is that
phone solicitation has roughly the same cost for the recipient as it does
for the transmitter. With spamming, on the other hand, it is the recipient
(net society, as a whole) who pays, rather than the transmitter.

If spammers were willing to pay all the recipient sites (half) their costs
of receiving the spam, then there MIGHT be some mitigating sentiment (but
then you'd still get into the question of overloading of resources, time,
etc.)

_______ (now I climb on my soap-box -- be warned!) _________

The net is probably the world's oldest anarchy. It has managed to
survive as such because of the unwritten (and often unrecognized)
rule of the net: Contribute more than you take out of it. (I've most
often read it as "pay your own leg" in the earlier days of the net,
but it was, at that time, a presumption that beyond paying their own
way, anybody adding on would be contributing something to the net as
well.)

In an analogy, we can view the difference between the European
powers, and the natives of north America.
The native people seem to have had, as an intrinsic part of their
culture (on the west coast, at least), the concept of adding more
to mother nature than they took out. (similarly, anybody who does
any sort of personal financial planning). Europeans, on the other
hand, had an attitude more like "take what you can before somebody
else does", which is fine, as long as you've got a resource base to
eat from.

Europeans had the attitude: "Once it's mine I can do what I want
with it", while natives were more like "I am the caretaker of it
for future generations"

This would explain why Europeans were so shocked at the rich
abundance of the new world. -- similarly why the newbies to the
net seem shocked at the rich abundance created by the internet
community. Used to the "gimme-gimme" attitude of the capitalistic
world at large, some have a hard time understanding why some people
would simply "give" to the community at large with no apparent care
for what was given back in direct form.

This, I think, is why the natives were so receptive of the Europeans
when they first arrived. Because it was so ingrained into their
society, they had an implicit (as opposed to an explicit)
expectation that the Europeans would only add to the welfare of the
group as a whole. I don't think they believed that anything else
was possible.

I think that the internet, as a whole, could learn from the lessons
of the native people. I think we should start making the implicit
explicit, and request of those coming into the net fold (most
notably, the commercial providers) that they start to explicitly
contribute more than they use. Otherwise, much like the redwood
forests of California, we may someday find the internet reduced to
little more than museum patches of what was once an awesome and
flourishing community.

I think that the "always contribute more than you take" philosophy
of the net is why the net has always managed, so far, to avoid the
oft-repeated predictions of its imminent collapse under its own
weight. Like the trick of having 20 people lift a person using one
finger each, any weight is sustainable if you distribute the support
among enough members.

I think that, once the implicit is made explicit, most people will
understand the value of contributing to the net, rather than simply
taking from it (often with this strange underlying question of
"what's wrong with this picture, how could it possibly work like
this?").

If I am correct, in my postulations, however, the net may be at a
cusp point:
If we simply allow rampant extractive commercialization of the
resources of the internet community, then I think that the
oft-repeated "the net-world is about to crash" prophecies may
finally come to pass.

On the other hand, if we can bring the concept of contribution to the
community into the conscious conversation of the net, then it may
continue to grow and flourish beyond our wildest dreams.

--
Stephen Samuel (604)876-0426 sam...@cs.ubc.ca
Only when the last fish is caught, the last river poisoned, the last tree cut-Only then will you realize that money cannot be eaten. (Cree prophesy)
