Thus spake Chris Schram <chri...@me.com>
> I beg to differ. The message is "crafted" to resemble a followup to a
> very old message thread, but then contains a link to a totally unrelated
> commercial site. That's spam in my book, and I suspect that type of
> message, though quite common, is very hard to filter.
It is impossible to filter on the criteria you just named. A filter
would have to:
1. Find References: header
2. Check age of references
3. Define a maximum age for articles replied to
4. Look up links and determine nature of web site
5. Compare to charter of newsgroups the article is posted to
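To illustrate: steps 1 to 3 could arguably be mechanized if you have a local article archive to look up dates in. Steps 4 and 5 are the ones that need a judgment no filter can make reliably. Below is a minimal Python sketch of steps 1 to 3 only. The function name, the one-year limit and the `archive_dates` lookup table are assumptions made for the example, not part of any real filter.

```python
import re
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical limit for step 3; any real value would be a policy decision.
MAX_PARENT_AGE = timedelta(days=365)

def replies_to_old_thread(raw_article, archive_dates, max_age=MAX_PARENT_AGE):
    """Return True if the newest referenced article we know of is older
    than max_age relative to the article's own Date: header.

    archive_dates is an assumed dict mapping Message-IDs (with angle
    brackets) to timezone-aware datetimes of the referenced articles.
    """
    msg = message_from_string(raw_article)
    # Step 1: pull the Message-IDs out of the References: header.
    refs = re.findall(r"<[^<>\s]+>", msg.get("References", ""))
    posted = parsedate_to_datetime(msg["Date"])
    # Step 2: look up the age of whatever references the archive knows.
    known = [archive_dates[r] for r in refs if r in archive_dates]
    if not known:
        return False  # nothing to compare against, so do not flag
    # Step 3: compare against the maximum allowed age.
    return posted - max(known) > max_age

raw = (
    "From: someone@example.invalid\n"
    "Date: Mon, 01 Mar 2010 12:00:00 +0000\n"
    "References: <old1@example.invalid> <old2@example.invalid>\n"
    "Subject: Re: something\n"
    "\n"
    "body\n"
)
archive = {"<old2@example.invalid>": datetime(2005, 1, 1, tzinfo=timezone.utc)}
flagged = replies_to_old_thread(raw, archive)
```

Even when this flags an article, it only says the thread is old. It says nothing about whether the link in the body is on topic for the group.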
And as a single article, it still is not cancellable spam,
so you need to find out the Breidbart index based on other
articles from the same poster.
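For reference, the Breidbart index is the sum, over all copies of a substantively identical article, of the square root of the number of newsgroups each copy was posted to. A minimal Python sketch follows. The sample counts are made up, and the threshold of 20 (usually counted over a 45-day window) is the commonly cited convention, stated here as an assumption rather than a rule of any particular server.

```python
import math

# Commonly cited cancel threshold; treat as an assumption.
CANCEL_THRESHOLD = 20

def breidbart_index(crosspost_counts):
    """Sum of sqrt(n) over all copies, where n is the number of
    newsgroups that copy was crossposted to."""
    return sum(math.sqrt(n) for n in crosspost_counts)

# Hypothetical poster: three copies, crossposted to 9, 4 and 1 groups.
copies = [9, 4, 1]
bi = breidbart_index(copies)          # 3 + 2 + 1
is_cancellable = bi >= CANCEL_THRESHOLD
```

A single article posted to one group has an index of 1, which is why other articles from the same poster have to be counted before a cancel can be justified.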
I think a reliability of 99.5% is very good for a spam filter;
anything above this ratio will inevitably increase the rate of
false positives drastically. And it is my first priority to
avoid false positives that destroy legitimate content.