[AMaViS-user] Stock tips emails

CRi...@checkfree.com

Nov 2, 2006, 9:17:56 AM

Lately we seem to have a pretty good handle on all spam threats except for
the same old stock tip emails. The format of the email is:

1. Random subject, but usually with "you have to read." somewhere in there
(not always though).
2. Talks about the stock, great volume, projected increase, per quarter
postings, its rating.
3. At the end of the body some random CNN news paragraph, apparently to
poison bayes.

The content of the email is ASCII, no HTML, no images, pretty clean.
Currently we are using Fuzzy_OCR, DCC, Pyzor, all the reliable RBL's and
SBL's and they still slip under the radar. What are you guys doing to
block these? I've grabbed a couple of SARE rulesets (kam, sare_stocks)
and they still don't seem to catch these specific ones.

I appreciate the help

_______________________________________________
AMaViS-user mailing list
AMaVi...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amavis.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/howto/

Jakob Curdes

Nov 2, 2006, 9:36:06 AM

Do you use Bayes in SpamAssassin? Without a self-adapting filter such
as Bayes or dspam, a lot of stuff will slip through nowadays. You can tell
SA to train Bayes from the rule-based score. Then you get quite a good
database without much manual training. Any spam run with slightly
varying text is ideal for Bayes: after a dozen or so messages, Bayes
will have learnt the lesson.
But remember to also run Bayes expiry runs, or else your SA will take
longer and longer to process.

dspam is still better but more complicated to install and configure.

Yours, JC

Clifton Royston

Nov 2, 2006, 3:11:05 PM

On Thu, Nov 02, 2006 at 03:38:11PM +0100, Jakob Curdes wrote:
> Do you use Bayes in SpamAssassin? Without a self-adapting filter such
> as Bayes or dspam, a lot of stuff will slip through nowadays. You can tell
> SA to train Bayes from the rule-based score. Then you get quite a good
> database without much manual training. Any spam run with slightly
> varying text is ideal for Bayes: after a dozen or so messages, Bayes
> will have learnt the lesson.

I beg to differ with you. On the contrary, there have been a number
of recent posts to the SA list from admins all of whose spam is now
identified as BAYES_00 and AWL, because they enabled autolearn and let
it run on its own while a lot of spam was slipping through.

If significant amounts of spam are being missed in the first place,
they are ipso facto being scored low, so if you aren't *manually*
training them as spam they are more likely to get learned as ham than
as spam.
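
To illustrate the point, here is a minimal sketch, assuming a simplified
autolearn decision that looks only at the final score. The real
SpamAssassin logic applies further conditions, so this is not its actual
code. The function name and the sample scores are hypothetical; the two
thresholds are the documented defaults of
bayes_auto_learn_threshold_nonspam and bayes_auto_learn_threshold_spam.

```python
def autolearn_decision(score, ham_threshold=0.1, spam_threshold=12.0):
    """Simplified autolearn: decide from the final message score alone.

    Returns "ham", "spam", or None (the message is not learned at all).
    """
    if score < ham_threshold:
        return "ham"
    if score > spam_threshold:
        return "spam"
    return None


# Hypothetical scores of stock spam that slipped past the rules.
missed_spam_scores = [-0.5, 0.0, 1.2, 3.8]
decisions = [autolearn_decision(s) for s in missed_spam_scores]
print(decisions)
```

With these sample scores, none of the missed messages is learned as spam,
and the two lowest-scoring ones are learned as ham.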

> >The content of the email is ASCII, no HTML, no images, pretty clean.
> >Currently we are using Fuzzy_OCR, DCC, Pyzor, all the reliable RBL's and
> >SBL's and they still slip under the radar. What are you guys doing to
> >block these? I've grabbed a couple of SARE rulesets (kam, sare_stocks)
> >and they still don't seem to catch these specific ones.
> >
> >I appreciate the help

I know the kind of spam you mean, but I have not checked to see
what's catching or blocking it.

How long ago did you grab the stocks ruleset? It was updated only a
couple weeks ago, and I've been having pretty good results since then.

It may also be that the recently added graylisting on our mailservers
is deflecting the spamware used by this particular stock spammer.
Graylisting does seem to offer a huge benefit.

-- Clifton

--
Clifton Royston -- clif...@iandicomputing.com / clif...@lava.net
President - I and I Computing * http://www.iandicomputing.com/
Custom programming, network design, systems and network consulting services

CRi...@checkfree.com

Nov 2, 2006, 3:37:03 PM

amavis-us...@lists.sourceforge.net wrote on 11/02/2006 02:48:03 PM:

> On Thu, Nov 02, 2006 at 03:38:11PM +0100, Jakob Curdes wrote:
> > Do you use Bayes in SpamAssassin? Without a self-adapting filter such
> > as Bayes or dspam, a lot of stuff will slip through nowadays. You can tell
> > SA to train Bayes from the rule-based score. Then you get quite a good
> > database without much manual training. Any spam run with slightly
> > varying text is ideal for Bayes: after a dozen or so messages, Bayes
> > will have learnt the lesson.
>
> I beg to differ with you. On the contrary, there have been a number
> of recent posts to the SA list from admins all of whose spam is now
> identified as BAYES_00 and AWL, because they enabled autolearn and let
> it run on its own while a lot of spam was slipping through.

This was my issue, and the reason I turned off Bayes and DSPAM: we did not
spend the time to train it manually and left it on autolearn, so it was
causing more harm than good. We have a large user base (5K+ users in
different states and countries), so it was not clear-cut to implement a
mechanism that lets users manually flag email as spam or ham and train the
DB. Additionally, we use Notes on the back end.

>
> If significant amounts of spam are being missed in the first place,
> they are ipso facto being scored low, so if you aren't *manually*
> training them as spam they are more likely to get learned as ham than
> as spam.
>
>
> > >The content of the email is ASCII, no HTML, no images, pretty clean.
> > >Currently we are using Fuzzy_OCR, DCC, Pyzor, all the reliable RBL's and
> > >SBL's and they still slip under the radar. What are you guys doing to
> > >block these? I've grabbed a couple of SARE rulesets (kam, sare_stocks)
> > >and they still don't seem to catch these specific ones.
> > >
> > >I appreciate the help
>
> I know the kind of spam you mean, but I have not checked to see
> what's catching or blocking it.
>
> How long ago did you grab the stocks ruleset? It was updated only a
> couple weeks ago, and I've been having pretty good results since then.

I grabbed them about 2 weeks ago, but they are up to date; they flag plenty
of emails, just not these specific ones. Again, I feel we have everything
under control right now, except for these emails.

>
> It may also be that the recently added graylisting on our mailservers
> is deflecting the spamware used by this particular stock spammer.
> Graylisting does seem to offer a huge benefit.

Greylisting is a double-edged sword: although beneficial in deferring
illegitimate messages, it also breaks misconfigured mail servers, and the
users scream bloody murder when they don't get email at the speed of IM.
Sad to say, this is my case here. We did, however, have it turned on
(SQLgrey) for a month before it got voted off because of the latency it
incurred. I do, however, use SQLgrey at other sites, and it is still not
stopping these emails :-(

Ed W

Nov 2, 2006, 6:26:37 PM

> Greylisting is a double-edged sword: although beneficial in deferring
> illegitimate messages, it also breaks misconfigured mail servers, and the
> users scream bloody murder when they don't get email at the speed of IM.
> Sad to say, this is my case here. We did, however, have it turned on
> (SQLgrey) for a month before it got voted off because of the latency it
> incurred.

I have seen very few servers break based on this, but agree that it does
happen occasionally.

I set my "ok" threshold to just a single email though - my thinking was
that if the mailserver does actually retry at all, then it looks real
enough to keep. This way there is just a small one-time penalty for
each mail server, and stuff generally comes through with just a small delay.
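
As a rough illustration of that idea, here is a minimal sketch of
greylisting with an "ok" threshold of one successful retry. It is not
SQLgrey's implementation: the class, its parameters, the 300-second delay
and the choice to whitelist by client IP are all assumptions made for the
example.

```python
class Greylist:
    """Toy greylister: defer unknown triplets, trust hosts that retry."""

    def __init__(self, min_delay=300, ok_threshold=1):
        self.min_delay = min_delay        # seconds a sender must wait
        self.ok_threshold = ok_threshold  # retries needed to trust a host
        self.first_seen = {}              # triplet -> time of first attempt
        self.passed = {}                  # client_ip -> successful retries

    def check(self, client_ip, sender, recipient, now):
        # Hosts that have already proven they retry are accepted at once.
        if self.passed.get(client_ip, 0) >= self.ok_threshold:
            return "accept"
        key = (client_ip, sender, recipient)
        first = self.first_seen.get(key)
        if first is None:
            self.first_seen[key] = now    # first attempt: remember and defer
            return "defer"
        if now - first < self.min_delay:
            return "defer"                # retried too quickly
        del self.first_seen[key]
        self.passed[client_ip] = self.passed.get(client_ip, 0) + 1
        return "accept"


gl = Greylist()
triplet = ("192.0.2.10", "sender@example.org", "user@example.net")
print(gl.check(*triplet, now=0))     # first attempt
print(gl.check(*triplet, now=600))   # retry after the delay
print(gl.check("192.0.2.10", "other@example.org", "user@example.net", now=601))
```

The first attempt is deferred, the retry is accepted, and after that a
different sender on the same host goes straight through, so each mail
server pays the delay only once.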

Also you can add some rulesets so that only certain domains (which look
like they might be dialup IPs) get greylisted and that way it's mainly
only spammers who get hit anyway.

I see very little stock spam slip through greylisting right now - that
which does comes in via my backup MX which doesn't greylist...

I also grepped the log files, and the dialup domains which are just
huge spam nests I block with a notice that they should use their
ISP's mailserver... Not quite as harsh as using a DUL dsbl, but it hits
the big offenders.

Ed W

Jakob Curdes

Nov 3, 2006, 1:26:05 AM

>> I beg to differ with you. On the contrary, there have been a number
>>of recent posts to the SA list from admins all of whose spam is now
>>identified as BAYES_00 and AWL, because they enabled autolearn and let
>>it run on its own while a lot of spam was slipping through.
>>
>>
>
>This was my issue and the reason I turned off Bayes and DSPAM, because of
>our lack of spending the time manually training it and leaving it on
>autolearn, it was causing more harm than good. We have a large user base,
>5K+ users in different states/countries, it was not clear cut to implement
>a user mechanism where they could manually flag email as spam or ham and
>train the DB, additionally we use Notes on the back end.
>
>
>
>
>> If significant amounts of spam are being missed in the first place,
>>they are ipso facto being scored low, so if you aren't *manually*
>>training them as spam they are more likely to get learned as ham than
>>as spam.
>>
>>
>>

We do not use AWL; this has proven to be dangerous. We use dspam and
Bayes in parallel, train Bayes with ham/spam levels of 0.0 and 5.0, and
score both with 3.5 on spam and -1.0 on nonspam. In conjunction with
the "postfix mini-greylister" aka reject_unverified_sender and some
other sanity rules on the domain name, MX etc., we have very rare cases
of spam slipping through with no manual intervention. We are rejecting
about 95% of mail, so it's not because we don't get spam....
But I agree mileage might vary widely. The last argument might be true,
but in practice I found we catch most things, because at least dspam
starts to score mails when several similar ones have arrived, and with
that score Bayes will soon start to score too.

Yours,
Jakob Curdes

Benny Pedersen

Nov 3, 2006, 9:30:26 AM

On Fri, November 3, 2006 07:28, Jakob Curdes wrote:

> We do not use AWL; this has proven to be dangerous.

how ?

did "auto_whitelist_factor" not help ?

the reason i use it is that it caches first time senders nicely, which imho
are mostly spam senders anyway :-)

> We use dspam and Bayes in parallel,

how does this work from amavisd-new ?

using the patched version for amavisd-new for dspam ?

> train Bayes with ham/spam levels of 0.0 and 5.0 and
> score both with 3.5 on spam and -1.0 on nonspam. In conjunction with
> the "postfix mini-greylister" aka reject_unverified_sender

this is "dangerous" since it's possible to disable verify in postfix ?

> and some other sanity rule on the domain name,
> mx etc. we have very rare cases of spam slipping through with no manual
> intervention.

that's life :)

> We are rejecting about 95 % of mail so it's not because we
> don't get spam....

but only 5% mail ?

> But I agree mileages might vary widely. The last argument might be true
> but in practice I found we catch most things because at least dspam
> starts to score mails when several similar ones have arrived and with
> that score bayes will soon start to score too.

yes true, but bayes can do this itself if it is trained manually with spam /
ham; then dspam is not needed. the effectiveness of bayes depends on how it's
configured and used

--
"This message was sent using 100% recycled spam mails."

Jakob Curdes

Nov 3, 2006, 12:25:20 PM

Benny Pedersen wrote:

>On Fri, November 3, 2006 07:28, Jakob Curdes wrote:
>
>
>
>>We do not use AWL; this has proven to be dangerous.
>>
>>
>
>how ?
>
>did "auto_whitelist_factor" not help ?
>
>
>
>the reason i use it is that it caches first time senders nicely, which imho
>are mostly spam senders anyway :-)
>
>
>

I would not agree to that. Our customers do not want to get the first
mail of an important business partner rejected.
Let's put it this way: if you use statistical filters, you should not in
parallel use other autotraining methods such as AWL because the results
are unpredictable. I ended up with a lot of whitelisted spam and
switched AWL off.

>>We use dspam and Bayes in parallel,
>>
>>
>
>how does this work from amavisd-new ?
>
>using the patched version for amavisd-new for dspam ?
>
>
>

No, just integrate dspam into spamassassin. This is documented in the
RELEASE-Notes of amavisd some versions ago.

>>train Bayes with ham/spam levels of 0.0 and 5.0 and
>>score both with 3.5 on spam and -1.0 on nonspam. In conjunction with
>>the "postfix mini-greylister" aka reject_unverified_sender
>>
>>
>
>this is "dangerous" since it's possible to disable verify in postfix ?
>
>
>

It is unclear to me what you mean by "disable verify".

>
>but only 5% mail ?
>
>
>

Yup.

>yes true, but bayes can do this itself if it is trained manually with spam /
>ham; then dspam is not needed. the effectiveness of bayes depends on how it's
>configured and used
>
>
>

But if the customers don't want to spend time with training....

JC

Benny Pedersen

Nov 3, 2006, 1:26:27 PM

On Fri, November 3, 2006 18:27, Jakob Curdes wrote:

>>> We do not use AWL; this has proven to be dangerous.
>> how ?
>> did "auto_whitelist_factor" not help ?
>> the reason i use it is that it caches first time senders nicely, which imho
>> are mostly spam senders anyway :-)
> I would not agree to that. Our customers do not want to get the first
> mail of an important business partner rejected.

if it gets rejected in the first place it's not awl's fault

> Let's put it this way : if you use statistical filters you should not in
> parallel use other autotraining methods such as AWL because the results
> are unpredictable.

how ?

> I ended up with a lot of whitelisted spam and switched AWL off.

change the whitelist scores in spamassassin, and never use whitelist_from
if it's not with trusted rcvd headers

whitelist_from gives -100, which will poison awl scores
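
A minimal sketch of the arithmetic behind that claim, using the formula
documented for auto_whitelist_factor (finalscore = score + (mean - score) *
factor, default factor 0.5). The function name and the sample numbers are
hypothetical, and whether a whitelist hit really ends up in the AWL history
is taken from the statement above, not verified here.

```python
def awl_adjust(score, history, factor=0.5):
    """Return the score after regressing it toward the sender's mean.

    history: earlier scores recorded for this sender (may be empty).
    """
    if not history:
        return score  # no history yet: nothing to regress toward
    mean = sum(history) / len(history)
    return score + (mean - score) * factor


# Assumed history: one earlier mail that hit a -100 whitelist rule
# plus 5 points from other rules.
history = [-95.0]

# A later mail from the same sender scores 8.0 before the AWL adjustment.
adjusted = awl_adjust(8.0, history)
print(adjusted)
```

Under these assumptions the later mail ends up far below zero, which is
the "whitelisted spam" effect described earlier in the thread.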

>>> We use dspam and Bayes in parallel,
>> how does this work from amavisd-new ?
>> using the patched version for amavisd-new for dspam ?
>>
> No, just integrate dspam into spamassassin. This is documented in the
> RELEASE-Notes of amavisd some versions ago.

okay, i will test when amavisd-new releases the modified dspam version

>>> train Bayes with ham/spam levels of 0.0 and 5.0 and
>>> score both with 3.5 on spam and -1.0 on nonspam. In conjunction with
>>> the "postfix mini-greylister" aka reject_unverified_sender
>> this is "dangerous" since it's possible to disable verify in postfix ?
> It is unclear to me what you mean by "disable verify".

postconf -d | grep disable | grep verify

>> but only 5% mail ?
> Yup.

maybe its time to make snail mail :-)

>> yes true, but bayes can do this itself if it is trained manually with spam /
>> ham; then dspam is not needed. the effectiveness of bayes depends on how it's
>> configured and used
> But if the customers don't want to spend time with training....

most customers don't have a life and will sit deleting mails on their own,
which could have been done for them on the mailserver :-)

well, in the end some mails need to be deleted anyway

--
"This message was sent using 100% recycled spam mails."