On 01/11/2016 07:52, Andris Reinman wrote:
> Hey all,
>
> We at www.zone.ee (local hosting company) have been using Rspamd in a
> single server to filter some of the outbound email traffic. MTA server
> that uses Rspamd is home-brewed ZoneMTA
> (https://github.com/zone-eu/zone-mta). The current volume processed by
> Rspamd is about 200k emails a day and so far it has performed really
> well, CPU usage has been nearly non-existent. Based on the current
> experience we plan to replace SpamAssassin with Rspamd for inbound
> emails as well.
Thank you for your valuable feedback; please see my replies below.
> We have also run into some problems with Rspamd:
>
> 1. Redis usage. Using Redis did not seem to work for Bayes classifier in
> stable 1.3 (Rspamd did not give any bayes ham or spam points to
> messages), so we upgraded to the development 1.4. In the end we had to
> turn it off though, Rspamd was opening so many connections to Redis over
> time (thousands of connections), that it exhausted Redis connection limits.
Since 1.4, Rspamd uses a Redis connection pool for Lua connections but not
for Bayes. Your issue looks strange because I have never observed such
problems with Bayes. As a temporary workaround, you could set an idle
connection timeout in Redis itself:
# Close the connection after a client is idle for N seconds (0 to disable)
timeout 3
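
To check whether connections really are piling up before and after setting the timeout, a small helper along these lines might be useful. It is only a sketch: the function names are made up for illustration, the INFO text is assumed to come from `redis-cli INFO clients`, and the limit from `redis-cli CONFIG GET maxclients`.

```python
def parse_redis_info(info_text):
    """Parse the text returned by the Redis INFO command into a dict of strings."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Clients".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def connection_usage(info_text, maxclients):
    """Return connected_clients as a fraction of maxclients (hypothetical helper)."""
    connected = int(parse_redis_info(info_text)["connected_clients"])
    return connected / maxclients


if __name__ == "__main__":
    # Sample INFO output; real text would come from `redis-cli INFO clients`.
    sample = "# Clients\r\nconnected_clients:9500\r\nblocked_clients:0\r\n"
    print(connection_usage(sample, 10000))
```

A value close to 1.0 would suggest that the connection limit is about to be exhausted.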
> 2. Large RAM use. Normally Rspamd process RAM usage was around 200-300MB
> but switching on PhishTank support increased RAM usage up to 700-900MB
> per "normal worker" process (main process still takes around 300MB). The
> server has a lot of cores and thus Rspamd created a lot of worker
> processes. We really wanted PhishTank support, so we reduced the number
> of Rspamd workers which kind of fixed the problem - the server has
> enough RAM that having 5 worker processes taking 4-5GB RAM is not a
> problem. It seems weird though.
Do you mean RSS or VSZ for a process? VSZ doesn't mean anything on a
64-bit system. Here is what I see on a production system:
  PID USER     PR NI   VIRT   RES  SHR S %CPU %MEM   TIME+ COMMAND
17895 _rspamd  20  0 170704 24316 7008 R 28,6  0,0 0:06.70 rspamd
So a process actually uses about 24 MB of memory + 7 MB of shared memory,
which looks quite reasonable. Over time, it usually increases to about
50-100 MB due to memory fragmentation.
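
To compare the two numbers for a given worker on Linux, both can be read from /proc/<pid>/status, where VmSize corresponds to VSZ and VmRSS to RSS. The following is a minimal sketch with hypothetical function names, not part of Rspamd:

```python
def parse_proc_status(status_text):
    """Extract VmSize (virtual size) and VmRSS (resident set), in kB,
    from the text of /proc/<pid>/status."""
    wanted = {"VmSize", "VmRSS"}
    result = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            # Lines look like "VmRSS:     24316 kB".
            result[key] = int(rest.split()[0])
    return result


def read_memory_kb(pid="self"):
    """Read VmSize/VmRSS for a process (Linux only; "self" is the caller)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_proc_status(fh.read())
```

Calling read_memory_kb(17895) for the worker shown above would return both values in kB; only VmRSS reflects memory that is actually resident.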
Phishtank is different indeed. I've recently added some fixes to reduce
Rspamd memory (and network bandwidth) footprint when using it but it is
still quite high.
> 3. Understanding default configuration. It took a while until we
> realised that Rspamd has its own rate limiting. At first we thought that
> 504 errors for spam checks were happening because Rspamd itself was
> having issues. We do not need rate limiting for outbound so once we
> figured out what was actually going on we were able to disable it.
Can you suggest something to improve this situation?
> 4. Quirks with HTTP protocol. We use chunked uploads for sending
> messages to Rspamd and it seems that you can't use chunks larger than
> 12kB, otherwise some kind of data loss happened which in turn started
> returning strange issues for messages like all DKIM keys failed
> validation, HTML parts were different for Text parts etc.
That's interesting. I've never used chunked encoding myself, so I cannot
add more for now. I'll try to write a test, thanks for the report.
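
A round-trip check along the following lines might be a starting point for such a test. It is a sketch of standard HTTP/1.1 chunked framing only, with hypothetical function names and no Rspamd-specific code; the 16384-byte chunk size in the usage note is an arbitrary value above the reported 12 kB threshold.

```python
def encode_chunked(body: bytes, chunk_size: int = 8192) -> bytes:
    """Frame a message body using HTTP/1.1 chunked transfer coding."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    out = bytearray()
    for offset in range(0, len(body), chunk_size):
        chunk = body[offset:offset + chunk_size]
        # Each chunk: size in hex, CRLF, data, CRLF.
        out += b"%x\r\n" % len(chunk)
        out += chunk + b"\r\n"
    # A zero-length chunk terminates the body (no trailers here).
    out += b"0\r\n\r\n"
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble a chunked body; chunk extensions and trailers are ignored."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            return bytes(body)
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
```

Encoding a message with encode_chunked(message, 16384), sending it, and comparing what the server parsed against decode_chunked of the same bytes should show whether data is lost with larger chunks.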
> In general we are really happy with it and plan to extend using it.
Do not hesitate to ask me about your concerns or bugs you've found. You
can use this list or IRC for these purposes. I can also add your badge
to 'rspamd.com' site if you'd like.
--
Vsevolod Stakhov