The right values for somaxconn and Redis's tcp-backlog depend entirely on the connection behavior your Redis servers see from clients: specifically, on spikes in the rate of new connections, and on whether events on the server temporarily block Redis from accepting them. If you use software that opens a new connection to Redis for every command, then you will certainly see these kinds of spikes in the rate of new connections. If you use connection pools that keep connections open and send new commands through established connections (instead of opening new ones), then you won't see these spikes nearly as often.
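To make the contrast concrete, here's a minimal sketch of the pooled style, assuming the Python redis-py client (any pooling client behaves the same way): the pool is created once, and every command travels over an already-established connection, so the server sees no burst of new SYNs.

```python
import redis

# Sketch of the pooled approach, assuming the redis-py client.
# One pool is created at startup; every command reuses an already
# established TCP connection instead of opening a new one.
pool = redis.ConnectionPool(host="localhost", port=6379, max_connections=50)
r = redis.Redis(connection_pool=pool)

r.set("greeting", "hello")   # no new TCP handshake here...
print(r.get("greeting"))     # ...or here
```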
Here are two pretty good explanations of the application's listen() backlog setting and the kernel's somaxconn setting:
http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/023/2333/2333s2.html
http://veithen.github.io/2014/01/01/how-tcp-backlog-works-in-linux.html

The Linux Journal one is older but more general, and I find it has an easier-to-understand explanation of the concepts. The Veithen one has more technical detail, but might be easier to get lost in.
This quote from the Linux Journal page is the overall summary:
The backlog has an effect on the maximum rate at which a server can accept new TCP connections on a socket.
If you have large spikes in new connections from clients, a longer buffer can prevent some of those connections from being dropped (a dropped SYN means slow client-side waits and TCP retransmits). Similarly, if something briefly blocks Redis from accepting new connections out of the buffer, a longer buffer can keep those connections queued instead of dropping them.
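To see where the application's half of this comes from, here's a minimal sketch of a listener in Python (the port is arbitrary; 511 happens to be Redis's default tcp-backlog value):

```python
import socket

# Sketch: the argument to listen() is the backlog the application
# *requests*; the kernel silently caps it at somaxconn.
BACKLOG = 511  # Redis's default tcp-backlog value

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 7777))  # hypothetical port, for illustration
srv.listen(BACKLOG)

# While this process is busy (e.g., blocked before calling accept()),
# up to min(BACKLOG, somaxconn) completed handshakes can wait in the
# queue instead of being dropped.
conn, addr = srv.accept()
```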
It's pretty clear that a shorter buffer can produce performance issues when the connection rate is high, but are there issues when the buffer is longer? Is there such a thing as a setting that's too long? I haven't found any descriptions of performance problems.

The buffer length the kernel actually uses for each socket is the shorter of the global somaxconn setting and the length the software requests in its listen() call. If you have a large somaxconn and many, many applications that listen on sockets and request a large length in their listen() calls, the kernel could use more memory. Back when servers had 8 megabytes of RAM (SPARCstation 5, anyone?), that memory consumption could have been an issue; these days I don't see it happening. So I'm not aware of any drawbacks to configuring large values in somaxconn and in Redis's tcp-backlog setting.
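One detail worth making explicit: the kernel applies that min() silently, so a generous tcp-backlog in redis.conf buys you nothing if somaxconn is still small. Here's a sketch of computing the effective value on Linux, assuming the usual /proc layout:

```python
# Sketch: compute the backlog the kernel will actually use for a given
# listen() request on Linux, assuming /proc is mounted as usual.
def effective_backlog(requested: int) -> int:
    with open("/proc/sys/net/core/somaxconn") as f:
        somaxconn = int(f.read())
    return min(requested, somaxconn)

# On older kernels the default somaxconn was 128, so Redis's requested
# tcp-backlog of 511 would be silently truncated to 128.
print(effective_backlog(511))
```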
The next question is, "How do I tell if/when I need to increase these two parameters?"
A:
- The kernel will often tell you, with log messages complaining about possible SYN flooding
- netstat -s can tell you in the counters labeled "N times the listen queue of a socket overflowed" and "SYNs to LISTEN sockets dropped"
It's not a situation where "any number above 0 is bad". Just like packet errors in "netstat -i", what matters is the ratio of good to bad, and whether the bad counters are increasing regularly (a quick way to sample those counters is sketched below).
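If you want to watch the raw counters behind those netstat -s lines, here's a minimal sketch that reads them on Linux. The TcpExt field names ListenOverflows and ListenDrops are what netstat itself parses out of /proc/net/netstat; sample them over time, since they're cumulative since boot:

```python
# Sketch: pull the two listen-queue counters that `netstat -s` reports,
# straight from /proc/net/netstat. The file alternates header lines
# (field names) with value lines.
def listen_queue_counters():
    with open("/proc/net/netstat") as f:
        lines = f.read().splitlines()
    for header, values in zip(lines[::2], lines[1::2]):
        if header.startswith("TcpExt:"):
            fields = dict(zip(header.split()[1:],
                              map(int, values.split()[1:])))
            return fields.get("ListenOverflows"), fields.get("ListenDrops")
    return None, None

overflows, drops = listen_queue_counters()
print(f"listen queue overflows: {overflows}, SYNs dropped: {drops}")
```

Graphing the deltas between samples (as the first link below recommends) makes a slow, steady increase much easier to spot than eyeballing the totals.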
These three pages have a lot of good info about measuring and tuning these parameters. The first link is long, but has great advice about graphing the measurements and checking the graphs:
http://engineering.chartbeat.com/2014/01/02/part-1-lessons-learned-tuning-tcp-and-nginx-in-ec2/
https://access.redhat.com/solutions/30453
https://serverfault.com/questions/646604/what-causes-syn-to-listen-sockets-dropped