On 04/22/2018 02:11 PM, Vsevolod Stakhov wrote:
> On 22.04.2018 19:29, Dave Jones wrote:
>> My experience so far after a few weeks is Rspamd takes a lot of trial
>> and error to figure out the configuration layout/structure to put files
>> properly formatted in the local.d and override.d directories.
>
> Well, after some years of "configs" in Perl, your mind should be indeed
> shifted to treat weirdness as convenience.
I am not even talking about the weirdness of SA's Perl rules. I am
referring to taking general principles of mail filtering and turning
them into something that makes rspamd do what you want/expect.
Rspamd doesn't complain or tell you when you have put settings in the
wrong place. They are just ignored, so the only way to learn that you
got something wrong is to wait for messages that should have triggered
some rule/symbol and notice that they didn't.
> I had something similar when
> used KDB (https://kx.com/kdb-advanced/) - you just start to reject
> normal analytical tools (e.g. pandas) just because you can do something
> like:
>
> .[p;();,;u@:iasc u@:where not(u:distinct enlist val)in v:$[type key
> p:(`)sv tabledir,`sym;get p;0#`]];`sym!(v,u)?val}
>
> But I personally think it is a toxic experience. SA configs are not bad
> for writing regular expression rules. Aside from that, I found it
> extremely hard to do anything different.
>
>> Rspamd
>> also does not have some features implemented yet that SA does.
>
> Which ones?
>
- Many of the content rules that block obvious spam like SA's
LOTS_OF_MONEY and DRUGS_ERECTILE
- internal_networks and trusted_networks so the proper Received: header
can be checked against RBLs
- simple globbing in whitelist/blacklists to use existing lists created
for SA (I have thousands of entries)
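To illustrate the globbing point: one stopgap would be to translate the
existing SA entries into anchored regular expressions that a regexp-based
map could consume. The following is only a minimal sketch under stated
assumptions: the input is one bare SA-style pattern per line (only `*` and
`?` are treated as wildcards), the function names are hypothetical, and
whether the output can be dropped into an rspamd map unchanged is something
I have not verified.

```python
import re


def sa_glob_to_regex(glob):
    """Translate an SA-style glob into an anchored regex string.

    '*' matches any run of characters, '?' matches exactly one character;
    every other character is escaped and matched literally.
    """
    parts = []
    for ch in glob:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "^" + "".join(parts) + "$"


def convert_entries(lines):
    """Convert an iterable of glob lines, skipping blanks and '#' comments."""
    converted = []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        converted.append(sa_glob_to_regex(entry))
    return converted


if __name__ == "__main__":
    # Hypothetical sample entries, not taken from a real list.
    for regex in convert_entries(["*@example.com", "news?@lists.example.org"]):
        print(regex)
```

The patterns are case-sensitive as written; case-insensitive matching would
have to be enabled on whatever consumes them.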
>> I am running SA and Rspamd side-by-side where rspamd is a "second
>> opinion" only to add/subtract a small amount to SA's score. So far
>> rspamd has some potential but it's not ready for taking over a mature,
>> well-tuned SA platform.
>
> See my first point: it's all about your experience. Both Rspamd and SA
> require some learning curve to be passed. I'm trying to do my best to
> make Rspamd's one not so complicated. Nonetheless, from your point of
> view it's all weird, difficult and not obvious, but it is not likely
> an Rspamd issue, it's just your experiences with SA. Local.d/override.d are
> excellent to maintain large installations using automatic deployment
> (ansible, puppet etc). I can say that as we did something similar for SA
> configs and it was a mess.
>
I agree the local.d/override.d files are easier to maintain once you
have figured out the exact filename and format to get rspamd to do what
you want.
>> Rspamd does have a few extra features/modules that SA doesn't have and I
>> am trying to take advantage of them in SA. I am not sure if some of
>> these extra features provide enough extra value from a well-tuned MTA
>> which should be blocking the majority of the easy spam where SA/rspamd
>> only need to detect and block a very small percentage of what makes it
>> past the MTA checks primarily based on content.
>
> I always ask one question in this case: why the fuck do you need to
> filter spam in your MTA? Rspamd can do it much more efficiently than an
> MTA as it is written just for this purpose and I did my best to make it
> fast. In fact, I know no MTA that can beat Rspamd in terms of speed. If
> you want to filter crap before the DATA command, then Rspamd provides a
> no-content mode to do all regexps, SPF, reputation and other checks in
> an efficient manner.
>
You should do as much as you possibly can in the MTA to give standard
rejection codes/messages that can be Googled by the sender's tech
support. Postfix gives excellent response texts that are well documented
and searchable.
Postfix's postscreen with weighted RBLs is the best. Combine the power
of basic DNS checks, 20+ RBLs in postscreen, and postwhite from GitHub
and you can easily/safely block the majority of spam without SA or rspamd.
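For anyone unfamiliar with what "weighted RBLs" means here: each list gets
a weight, the weights of every list that has the client IP are summed, and
the connection is blocked once the sum reaches a threshold. The sketch
below only models that arithmetic in Python. It is not Postfix code, and
the list names, weights and threshold are made-up placeholders.

```python
# Hypothetical lists and weights; a negative weight models a whitelist.
DNSBL_WEIGHTS = {
    "rbl-a.example": 3,
    "rbl-b.example": 2,
    "rbl-c.example": 1,
    "dnswl.example": -2,
}

# Hypothetical threshold; the comparison is inclusive (score >= threshold).
THRESHOLD = 3


def combined_score(listed_on, weights=DNSBL_WEIGHTS):
    """Sum the weights of every list that returned a hit for the client."""
    return sum(weights.get(name, 0) for name in listed_on)


def should_block(listed_on, threshold=THRESHOLD, weights=DNSBL_WEIGHTS):
    """Block when the combined score reaches the threshold."""
    return combined_score(listed_on, weights) >= threshold


if __name__ == "__main__":
    # One low-weight hit alone is not enough to block.
    print(should_block(["rbl-b.example"]))
    # Two hits together reach the threshold.
    print(should_block(["rbl-b.example", "rbl-c.example"]))
```

The point of weighting is that no single low-confidence list can cause a
rejection on its own, while a whitelist hit can offset a listing.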
>> You still have to train your Bayesian database _well_ then increase the
>> scores on both ends for the BAYES_HAM and BAYES_SPAM hits. This is the
>> best thing you can do for rspamd's content checking.
>
> Bayes is one of the methods to do content checking but it is obviously
> not the only one: you have fuzzy checks, URL plugins, regular
> expressions (that are blazingly fast due to Hyperscan) and even a bloody
> neural network on top of it. Aside from that, there are hundreds of
> functions to do content scan using Lua API.
>
Those of us who didn't write the rspamd code have to learn how to use
all of these cool features and how to put them in the proper conf file.
The documentation needs to have a complete listing of all options that
are available in each section/module so we know the structure without
having to spend hours on trial and error to get one new feature
working.
>> I do like how you can set up a multimap rule based on the subject header
>> then simply add lines to the map file with complex regex to detect odd
>> patterns of spammer subjects. This part is much nicer than how you have
>> to do it in SA.
>
> Multimap can do much more than this but I'm glad that you've found
> something nicer to configure than in SA.
>
I am sure it's very powerful but I am having to fumble my way around to
figure out all of rspamd's power.
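Since a bad map line may not produce any visible complaint, it can help to
sanity-check subject regexes offline before adding them to the map file.
The following is a rough sketch, assuming the map holds one bare regex per
line with `#` comments. It uses Python's `re` module, whose syntax is not
identical to the regex engine rspamd uses, so a pass here is only a hint.
The function name and sample data are hypothetical.

```python
import re


def check_map_lines(lines, subjects):
    """Compile each map line and record which sample subjects it matches.

    Returns (hits, errors): hits maps pattern -> list of matching subjects,
    errors is a list of (line_number, pattern, message) for bad patterns.
    """
    hits = {}
    errors = []
    for lineno, raw in enumerate(lines, start=1):
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue
        try:
            compiled = re.compile(pattern, re.IGNORECASE)
        except re.error as exc:
            errors.append((lineno, pattern, str(exc)))
            continue
        hits[pattern] = [s for s in subjects if compiled.search(s)]
    return hits, errors


if __name__ == "__main__":
    # Hypothetical map lines and subjects for illustration only.
    map_lines = ["# pill spam", r"v[i1]agra\s+\d+%", "([unclosed"]
    samples = ["Cheap V1agra 80% off", "Meeting notes"]
    found, broken = check_map_lines(map_lines, samples)
    for pattern, matched in found.items():
        print(pattern, "->", matched)
    for lineno, pattern, message in broken:
        print("line", lineno, "failed to compile:", pattern, message)
```

A pattern that matches none of the known-spam samples, or one that matches
a known-ham subject, is worth a second look before it goes live.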