Hi guys,
I've been testing the above scenario for quite a while now and, although HAProxy v1.5-dev20 and newer added Redis support (option redis-check, which goes beyond a plain TCP check and actually speaks the Redis protocol), there is still a problem with the setup.
Sentinel is working as expected and, although it takes about 8 seconds, it will eventually fail over properly, according to my tests.
HAProxy configured with:
.....
backend b_redis
balance leastconn
option tcplog
option tcp-check
# tcp-check connect port 26379
tcp-check send PING\r\n
tcp-check expect string +PONG
# tcp-check expect string PONG
# option redis-check
tcp-check send info\ replication\r\n
# tcp-check send sentinel\ masters\r\n
tcp-check expect string role:master
# tcp-check expect string "redis-master"
tcp-check send QUIT\r\n
tcp-check expect string +OK
...............
will detect the new master and everything works beautifully.
If the redis-master dies in any of the ways I was able to test, except for a network partition, everything looks good.
However, if I just isolate the redis-master at the network level (e.g. with an iptables rule that drops all packets to it), the process stays up and running (same pid, same run_id) and is unaware that Sentinel has promoted another slave to replace it. I do run Sentinel on each Redis node, including the initial master, and I do not block the Sentinel communication.
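For reference, the kind of isolation I mean can be simulated with something like this on the redis-master (just an illustration, not necessarily the exact rule I used; 6379 is the default Redis data port, and the Sentinel port 26379 is left alone so the Sentinels keep talking to each other):

iptables -A INPUT -p tcp --dport 6379 -j DROP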
Once I remove the iptables rule and make the old/initial master reachable again by the HAProxy checks, HAProxy sees it as a valid master and starts sending requests to it. Sentinel does eventually detect that the old master is back and reconfigures it as a slave, but a little too late: definitely later than HAProxy detects it and starts using it with the above configuration.
As far as I know, there are two solutions:
1. Increase the rise in the HAProxy config to about 30; combined with inter 1s, this makes HAProxy consider the newly detected master a valid backend server only after 30 seconds, which gives Sentinel enough time to reconfigure this master as a slave, so the problem would be solved. However, the 30-second delay will apply to every other failure scenario as well, so you get HA but the fail-over is very slow (to be honest, I would have expected Sentinel to fail over in less than a second...), and you can never be 100% sure that 30 seconds will be long enough...
2. Write a script (run as a daemon) that constantly queries a Sentinel instance (which one? or maybe all of them?), finds the current master, changes the HAProxy configuration and soft-reloads HAProxy; see the rough sketch below. Not a good idea at all in my opinion: many things can go wrong here, and it is not really elegant either.
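Just to make option 2 concrete, here is a rough sketch of what such a daemon could look like. It assumes redis-py is installed; the Sentinel addresses, the master name "mymaster", the file paths and the {master} placeholder in the config template are all placeholders to adapt. This is an illustration of the idea, not something I would call production-ready:

#!/usr/bin/env python
# Sketch of option 2: ask Sentinel for the current master and
# soft-reload HAProxy whenever it changes. All addresses, names and
# paths below are placeholders for illustration only.
import subprocess
import time

from redis.sentinel import Sentinel  # redis-py

SENTINELS = [("10.0.0.1", 26379), ("10.0.0.2", 26379), ("10.0.0.3", 26379)]
MASTER_NAME = "mymaster"
TEMPLATE = "/etc/haproxy/haproxy.cfg.tmpl"  # contains a {master} placeholder
CONFIG = "/etc/haproxy/haproxy.cfg"
PIDFILE = "/var/run/haproxy.pid"


def current_master(sentinel):
    # Uses SENTINEL get-master-addr-by-name under the hood and returns
    # the address the Sentinels currently agree on, e.g. "10.0.0.12:6379".
    ip, port = sentinel.discover_master(MASTER_NAME)
    return "%s:%s" % (ip, port)


def reload_haproxy(master):
    # Render the config from the template, then soft-reload HAProxy:
    # -sf asks the old process to finish its connections and exit.
    with open(TEMPLATE) as f:
        rendered = f.read().format(master=master)
    with open(CONFIG, "w") as f:
        f.write(rendered)
    with open(PIDFILE) as f:
        old_pid = f.read().strip()
    subprocess.check_call(["haproxy", "-f", CONFIG, "-p", PIDFILE, "-sf", old_pid])


def main():
    sentinel = Sentinel(SENTINELS, socket_timeout=0.5)
    last = None
    while True:
        try:
            master = current_master(sentinel)
            if master != last:
                reload_haproxy(master)
                last = master
        except Exception:
            pass  # master unknown / Sentinel unreachable: keep the last config
        time.sleep(1)


if __name__ == "__main__":
    main()

The -sf soft reload at least keeps existing connections alive, but you still end up rewriting the config and spawning a new HAProxy process on every fail-over, which is exactly the part that feels fragile to me.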
I tried making HAProxy health-check the Sentinels instead of the Redis masters, and it can be done, but all Sentinel instances hold exactly the same data, so every one of them returns a string that makes it look like "they" have the master role, and the check cannot tell the backends apart. HAProxy, at this moment, cannot reconfigure itself based on information (like the IP of the current Redis master) that Sentinel can provide.
I agree that this is more of an HAProxy issue/opportunity to improve its Redis support, but maybe people here who are more familiar with Redis have come up with a better idea for this.
What do you think?
Thanks,