unclosed connection detected on sentinels

Chen Teng-Jun

Oct 2, 2015, 11:12:52 AM
to Redis DB
We have this configuration in our production environment:
1 master, 2 slaves, 3 Sentinels.

We noticed that the number of connections on the 3 Sentinel servers (listening on port 26379) keeps increasing:
the number of connections in the ESTABLISHED state goes up every 7897 seconds.

On the client side, the following messages also appear every 7897 seconds: "Lost connection to Sentinel at sentinel_server1:26379 ..." "Lost connection to Sentinel at sentinel_server2:26379 ..." "Lost connection to Sentinel at sentinel_server3:26379 ..."

Do you have any ideas?

Greg Andrews

Oct 2, 2015, 1:23:44 PM
to redi...@googlegroups.com
When TCP connections are lost and the client detects the trouble but the server does not, the cause is usually a network device between the client and the server that is breaking the connection.  A router or other device is performing address translation (NAT) or proxying/tunneling, and when it thinks the connections have been idle too long, it breaks them.  Usually no "shut down this connection" packets are sent to the client or server when this is done.

The client notices the next time it tries to send a command through the connection.  It gets back a reject, or times out waiting for the reply to the command.  The server doesn't notice because it was waiting for the next command to arrive through the connection, and none are arriving.

A router/firewall or proxy (or etherswitch performing those functions) is not the only way you can get these symptoms, but it's the most common way.
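
If an idle timeout on an intermediate device is the cause, one common mitigation is to enable TCP keepalive on the client's sockets so that an otherwise idle connection still exchanges packets. The following is a minimal sketch at the raw socket level, not the configuration of any particular Redis client library. The function name `enable_keepalive` and the values 60/10/3 are assumptions for illustration, and the probes only help if they are sent more often than the device's idle timeout.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=3):
    """Turn on TCP keepalive so an idle connection still exchanges packets.

    The default values are illustrative assumptions, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning constants below exist on Linux; other platforms may lack
    # some of them, so each one is applied only if this build exposes it.
    for name, value in (("TCP_KEEPIDLE", idle),       # seconds idle before first probe
                        ("TCP_KEEPINTVL", interval),  # seconds between probes
                        ("TCP_KEEPCNT", probes)):     # failed probes before giving up
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock
```

Keepalive also lets whichever end enables it notice a dead peer after the probes fail, instead of waiting forever for traffic that will never arrive.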

7897 seconds is 131 minutes and 37 seconds (2 hours 11 minutes, 37 seconds).  Perhaps that time period will match up to an idle timeout parameter somewhere in your network.
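
The conversion above can be checked with a few lines of Python, and the same snippet can compare the observed period against a candidate timeout. The candidate used here is an assumption, not something established in this thread: the Linux default keepalive settings (7200 seconds idle, then 9 probes 75 seconds apart). It is shown only as one value worth comparing against, since it is not known whether keepalive is enabled on the hosts involved.

```python
def split_seconds(total):
    """Break a duration in seconds into (hours, minutes, seconds)."""
    minutes, seconds = divmod(total, 60)
    hours, minutes = divmod(minutes, 60)
    return hours, minutes, seconds


observed = 7897  # period reported in the original post
print(split_seconds(observed))  # (2, 11, 37)

# Assumption: Linux defaults of tcp_keepalive_time=7200,
# tcp_keepalive_probes=9, tcp_keepalive_intvl=75.
linux_default_dead_after = 7200 + 9 * 75
print(linux_default_dead_after)             # 7875
print(observed - linux_default_dead_after)  # 22
```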

  -Greg

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/d/optout.

Greg Andrews

Oct 2, 2015, 1:43:11 PM
to redi...@googlegroups.com
I want to add that it's possible the network device disconnected at 131 minutes, but your client didn't notice and log the error until 37 seconds later. 

Perhaps your client polls Sentinel every 30 seconds, and the remaining seven seconds cover the first try (30 seconds after the disconnect) running into a 3.5-second TCP timeout, followed by a retry and another timeout.

  -Greg

Chen Teng-Jun

Oct 12, 2015, 3:29:15 AM
to Redis DB
Thank you for your help,
but on the Sentinel side the old connections always stay open: every 7897 seconds, 3 new connections are opened on each Sentinel server.
Consequently, the number of established connections increases steadily.

Is this a bug in Redis?

Greg Andrews

Oct 12, 2015, 7:02:57 AM
to redi...@googlegroups.com
As I said, the most common cause of the symptoms you describe is a network device breaking the connections. Since you asked, my answer is no. I do not think this is a bug in Sentinel or Redis. I think it's a network device that is not handling the connections the way you need.

Do your client programs have to go through a firewall or a load balancer or a proxy to reach the Sentinel servers?

  -Greg

Chen Teng-Jun

Oct 13, 2015, 6:02:33 AM
to Redis DB
Yes, the clients go through a firewall, but I don't understand why Sentinel has no timeout for cleaning up these unclosed connections...

Greg Andrews

Oct 13, 2015, 12:10:29 PM
to redi...@googlegroups.com
What would that timeout be?  How much time passes between the client's last successful query through a connection to Sentinel and the firewall breaking that connection?
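
One way to measure that interval is to open a connection, leave it idle for a known time, and then see whether a PING still gets answered. The following is a rough sketch under stated assumptions: the helper names `survives_idle` and `first_failing_idle` are hypothetical, the command is sent in Redis's inline form (`PING` followed by CRLF), and the host, port, and candidate durations are whatever you choose to test with.

```python
import socket
import time


def survives_idle(host, port, idle_seconds, reply_timeout=5.0):
    """Return True if a connection still answers PING after sitting idle."""
    with socket.create_connection((host, port), timeout=reply_timeout) as conn:
        conn.sendall(b"PING\r\n")      # confirm the link works before idling
        if not conn.recv(64).startswith(b"+PONG"):
            raise RuntimeError("unexpected reply before the idle period")
        time.sleep(idle_seconds)       # send nothing, like an idle client
        try:
            conn.sendall(b"PING\r\n")
            # An empty read (peer closed) also counts as a failure here.
            return conn.recv(64).startswith(b"+PONG")
        except OSError:                # reset, timeout, or unreachable
            return False


def first_failing_idle(host, port, candidates):
    """Return the shortest idle duration that breaks the connection, or None."""
    for idle in sorted(candidates):
        if not survives_idle(host, port, idle):
            return idle
    return None
```

Running `first_failing_idle` from a client host against one Sentinel with durations on either side of a suspected firewall timeout (for example 3600, 7200, and 7800 seconds) would show roughly where connections stop surviving. Each candidate is tested on a fresh connection, so a full run takes the sum of the durations.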

  -Greg
