I have encountered the following situation twice now, and would like to share what I know.
On a website, I'm using Redis for sessions, and it has now frozen out of nowhere twice. The first time was 5 months ago; the second time was just now. There is nothing special happening on the server: it is just a web server serving a single website.
Then, out of nowhere, Redis freezes. What I can see once it has frozen:
1. The website (using "redis" package in Python 2.7) is down, with "ConnectionError: Error 32 while writing to socket. Broken pipe." or "Error 61 connecting to localhost:6379. Connection refused." or "Error while reading from socket: (54, 'Connection reset by peer')"
2. When I log in via ssh, "service redis restart" hangs.
3. When I try to kill or kill -9 the redis process, it hangs.
4. With no other options left, I have to run "reboot", which reboots the whole server. That works.
In the Redis logs there is nothing special, except that the background save (backup) entries stop appearing.
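The three client errors in item 1 look different, so it can help to check from outside the web app whether the port refuses connections or accepts them and then stays silent. Below is a minimal sketch of such a probe, using only the Python standard library. The function name probe_redis and the 2-second timeout are my own choices for illustration, and it assumes Redis listens on localhost:6379 without a password, as in the error messages above.

```python
import socket


def probe_redis(host="localhost", port=6379, timeout=2.0):
    """Send an inline PING and describe how the server reacts."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        return "connect timed out"
    except socket.error as exc:
        # For example "Connection refused" when nothing accepts on the port.
        return "connect failed: %s" % exc
    try:
        sock.settimeout(timeout)
        sock.sendall(b"PING\r\n")  # Redis accepts inline commands
        reply = sock.recv(64)
    except socket.timeout:
        # The kernel completed the TCP handshake, but the process never answered.
        return "connected, but no reply"
    except socket.error as exc:
        return "connected, then error: %s" % exc
    finally:
        sock.close()
    if reply.startswith(b"+PONG"):
        return "ok"
    return "unexpected reply: %r" % reply
```

Calling print(probe_redis()) from a cron job or a monitoring script would at least record when the freeze starts, instead of finding out from the website being down.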
65673:M 25 Aug 16:11:37.085 * 10 changes in 300 seconds. Saving...
65673:M 25 Aug 16:11:37.086 * Background saving started by pid 19403
19403:C 25 Aug 16:11:37.100 * DB saved on disk
65673:M 25 Aug 16:11:37.189 * Background saving terminated with success
65673:M 25 Aug 16:16:38.095 * 10 changes in 300 seconds. Saving...
65673:M 25 Aug 16:16:38.095 * Background saving started by pid 19418
19418:C 25 Aug 16:16:38.116 * DB saved on disk
65673:M 25 Aug 16:16:38.202 * Background saving terminated with success
65673:signal-handler (1472153405) Received SIGTERM scheduling shutdown...
65673:signal-handler (1472153623) Received SIGTERM scheduling shutdown...
Somehow the "signal-handler" / logging part is still functioning, as it can write those lines to the log.
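The number in parentheses on the signal-handler lines is a Unix timestamp, unlike the formatted times on the other log lines. A small sketch to convert the two values to UTC (the helper name epoch_to_utc is just for illustration):

```python
import time


def epoch_to_utc(ts):
    """Format a Unix timestamp as a UTC date and time."""
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(ts))


sigterms = [1472153405, 1472153623]
for ts in sigterms:
    print("%d -> %s UTC" % (ts, epoch_to_utc(ts)))
# 1472153405 -> 2016-08-25 19:30:05 UTC
# 1472153623 -> 2016-08-25 19:33:43 UTC
print("gap: %d seconds" % (sigterms[1] - sigterms[0]))
# gap: 218 seconds
```

The other log lines are in the server's local time, so comparing these UTC values with the last save at 16:16:38 requires knowing the server's time zone.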
About an hour after the last backup line, there are entries in debug.log saying:
Aug 25 17:15:43 maphub-web kernel: sonewconn: pcb 0xfffff80062dd9000: Listen queue overflow: 193 already in queue awaiting acceptance (1 occurrences)
The OS is FreeBSD 10.2-RELEASE-p9
Redis is: Redis server v=3.0.7 sha=00000000:0 malloc=libc bits=64 build=b9586831a13c9f53
The redis build is the stock one supplied by FreeBSD pkg.
The environment is KVM on Xeon / ECC hardware, which should be really reliable.
Before this occurrence, the bug had happened only once, 5 months ago, so it is really uncommon. On the other hand, it is quite critical, as it brings the whole website down with it.