Redis - Manual failover works but automatic doesn't


Junaid Subhani

Oct 29, 2017, 1:32:11 PM
to Redis DB
I have a 3-node Redis cluster with 1 master and 2 slaves. Master, Slave1 and Slave2 are each deployed on their own physical box.

Master          Slave1          Slave2
   |  ----------   |  ----------   |
Sentinel1       Sentinel2       Sentinel3


In my below config I have set quorum to 1 since I want just 1 sentinel to make the decision.


master 172.29.245.6
slave1 172.29.240.163
slave2 172.29.225.104


With my master up, this is the status of my cluster.

Version --> redis-3.2.10-2.el7.x86_64


Master

# Replication
role:master
connected_slaves:2
slave0:ip=172.29.225.104,port=6379,state=online,offset=486,lag=1
slave1:ip=172.29.240.163,port=6379,state=online,offset=633,lag=0
master_repl_offset:925
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:924


Slave1

# Replication
role:slave
master_host:172.29.245.6
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:17719
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0


Slave2

# Replication
role:slave
master_host:172.29.245.6
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:20367
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0


My aim is that when I shut down the master Redis process, one of the slaves should be promoted to master.

But when I try to trigger an automatic failover this way, Sentinel logs the following messages.


Oct 27 14:18:22 redis-2 redis-sentinel[4866]: 4866:X 27 Oct 14:18:22.175 # +new-epoch 27
Oct 27 14:18:22 redis-2 redis-sentinel[4866]: 4866:X 27 Oct 14:18:22.175 # +try-failover master redis-master 172.29.245.6 6379
Oct 27 14:18:22 redis-2 redis-sentinel[4866]: 4866:X 27 Oct 14:18:22.185 # +vote-for-leader 61c51a37f2db1673ddcf5dc3fe6816c9ba83a408 27
Oct 27 14:18:33 redis-2 redis-sentinel[4866]: 4866:X 27 Oct 14:18:33.142 # -failover-abort-not-elected master redis-master 172.29.245.6 6379
Oct 27 14:18:33 redis-2 redis-sentinel[4866]: 4866:X 27 Oct 14:18:33.218 # Next failover delay: I will not start a failover before Fri Oct 27 14:20:23 2017
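
When comparing logs from several Sentinels, it can help to reduce each line to its event name. The following is a minimal sketch, not part of the original post: the function name `sentinel_events` and the regular expression are illustrative assumptions based only on the line layout shown above.

```python
import re

# Assumed layout: "... <time> # +event details" or "... <time> * +event details".
# Only lines whose message starts with +event or -event are treated as events.
EVENT_RE = re.compile(r"[#*] ([+-][a-z-]+)(?: (.*))?$")

def sentinel_events(lines):
    """Return (event, details) pairs for every line that carries a Sentinel event."""
    events = []
    for line in lines:
        match = EVENT_RE.search(line.strip())
        if match:
            events.append((match.group(1), match.group(2) or ""))
    return events

# Log lines copied from the post above.
log = [
    "Oct 27 14:18:22 redis-2 redis-sentinel[4866]: 4866:X 27 Oct 14:18:22.175 # +new-epoch 27",
    "Oct 27 14:18:22 redis-2 redis-sentinel[4866]: 4866:X 27 Oct 14:18:22.175 # +try-failover master redis-master 172.29.245.6 6379",
    "Oct 27 14:18:22 redis-2 redis-sentinel[4866]: 4866:X 27 Oct 14:18:22.185 # +vote-for-leader 61c51a37f2db1673ddcf5dc3fe6816c9ba83a408 27",
    "Oct 27 14:18:33 redis-2 redis-sentinel[4866]: 4866:X 27 Oct 14:18:33.142 # -failover-abort-not-elected master redis-master 172.29.245.6 6379",
    "Oct 27 14:18:33 redis-2 redis-sentinel[4866]: 4866:X 27 Oct 14:18:33.218 # Next failover delay: I will not start a failover before Fri Oct 27 14:20:23 2017",
]

events = sentinel_events(log)
for event, details in events:
    print(event, details)
```

In this excerpt the Sentinel on redis-2 votes for itself (the ID after +vote-for-leader is its own myid) and then aborts with -failover-abort-not-elected, so the same time window in the other Sentinels' logs is the place to look for whether they received the vote request.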


However, when I trigger a manual failover, it works.


[root@redis-1 ~]# redis-cli -p 26379 SENTINEL failover redis-master
OK

Oct 27 14:19:40 redis-3 redis-sentinel[1211]: 1211:X 27 Oct 14:19:40.503 # -sdown master redis-master 172.29.245.6 6379
Oct 27 14:19:40 redis-3 redis-sentinel[1211]: 1211:X 27 Oct 14:19:40.503 # -odown master redis-master 172.29.245.6 6379
Oct 27 14:24:56 redis-3 redis-sentinel[1211]: 1211:X 27 Oct 14:24:56.982 # +new-epoch 28
Oct 27 14:24:57 redis-3 redis-sentinel[1211]: 1211:X 27 Oct 14:24:57.851 # +config-update-from sentinel 5080eb7a0e27f9fb6ea6a197bf1abd334608943f 172.29.245.6 26379 @ redis-master 172.29.245.6 6379
Oct 27 14:24:57 redis-3 redis-sentinel[1211]: 1211:X 27 Oct 14:24:57.852 # +switch-master redis-master 172.29.245.6 6379 172.29.225.104 6379
Oct 27 14:24:57 redis-3 redis-sentinel[1211]: 1211:X 27 Oct 14:24:57.852 * +slave slave 172.29.240.163:6379 172.29.240.163 6379 @ redis-master 172.29.225.104 6379
Oct 27 14:24:57 redis-3 redis-sentinel[1211]: 1211:X 27 Oct 14:24:57.852 * +slave slave 172.29.245.6:6379 172.29.245.6 6379 @ redis-master 172.29.225.104 6379

I have gone through the documentation and followed it exactly, but automatic failover does not work for me. Any ideas?

My sentinel.conf on each node looks like this:


Master

port 26379
sentinel myid 5080eb7a0e27f9fb6ea6a197bf1abd334608943f
sentinel monitor redis-master 172.29.245.6 6379 1
sentinel down-after-milliseconds redis-master 5000
sentinel failover-timeout redis-master 60000
sentinel auth-pass redis-master XXXXXXXXXXXXXXXXX
# Generated by CONFIG REWRITE
dir "/"
sentinel config-epoch redis-master 29
sentinel leader-epoch redis-master 29
sentinel known-slave redis-master 172.29.225.104 6379
sentinel known-slave redis-master 172.29.240.163 6379
sentinel known-sentinel redis-master 172.29.240.163 26379 61c51a37f2db1673ddcf5dc3fe6816c9ba83a408
sentinel known-sentinel redis-master 172.29.225.104 26379 85e2ce3707304f7b3b4d0f7281dcd6932094c8b6
sentinel current-epoch 29


Slave1

port 26379
sentinel myid 61c51a37f2db1673ddcf5dc3fe6816c9ba83a408
sentinel monitor redis-master 172.29.245.6 6379 1
sentinel down-after-milliseconds redis-master 5000
sentinel failover-timeout redis-master 60000
sentinel auth-pass redis-master XXXXXXXXXXXXXXXX

# Generated by CONFIG REWRITE
dir "/"
sentinel config-epoch redis-master 29
sentinel leader-epoch redis-master 27
sentinel known-slave redis-master 172.29.225.104 6379
sentinel known-slave redis-master 172.29.240.163 6379
sentinel known-sentinel redis-master 172.29.245.6 26379 5080eb7a0e27f9fb6ea6a197bf1abd334608943f
sentinel known-sentinel redis-master 172.29.225.104 26379 85e2ce3707304f7b3b4d0f7281dcd6932094c8b6
sentinel current-epoch 29


Slave2

port 26379
sentinel myid 85e2ce3707304f7b3b4d0f7281dcd6932094c8b6
sentinel monitor redis-master 172.29.245.6 6379 1
sentinel down-after-milliseconds redis-master 5000
sentinel failover-timeout redis-master 60000
sentinel auth-pass redis-master XXXXXXXXXXXX

# Generated by CONFIG REWRITE
dir "/"
sentinel config-epoch redis-master 29
sentinel leader-epoch redis-master 26
sentinel known-slave redis-master 172.29.225.104 6379
sentinel known-slave redis-master 172.29.240.163 6379
sentinel known-sentinel redis-master 172.29.245.6 26379 5080eb7a0e27f9fb6ea6a197bf1abd334608943f
sentinel known-sentinel redis-master 172.29.240.163 26379 61c51a37f2db1673ddcf5dc3fe6816c9ba83a408
sentinel current-epoch 29
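
To cross-check files like the three above, here is a minimal sketch (not from the original post) that pulls the quorum and the list of known peer Sentinels out of a sentinel.conf. The function name `summarize_sentinel_conf` is hypothetical, and the directive layouts are assumed from the files shown here (Redis 3.2 style).

```python
def summarize_sentinel_conf(text):
    """Extract the monitored master, the quorum and the known peer Sentinels."""
    summary = {"master": None, "quorum": None, "known_sentinels": []}
    for raw in text.splitlines():
        parts = raw.split()
        if len(parts) < 2 or parts[0] != "sentinel":
            continue  # skip comments, blank lines and non-sentinel directives
        if parts[1] == "monitor" and len(parts) == 6:
            # sentinel monitor <name> <ip> <port> <quorum>
            summary["master"] = parts[2]
            summary["quorum"] = int(parts[5])
        elif parts[1] == "known-sentinel" and len(parts) >= 5:
            # sentinel known-sentinel <name> <ip> <port> [<runid>]
            summary["known_sentinels"].append((parts[3], int(parts[4])))
    return summary

# Excerpt of the Slave1 file from the post above.
conf = """\
port 26379
sentinel myid 61c51a37f2db1673ddcf5dc3fe6816c9ba83a408
sentinel monitor redis-master 172.29.245.6 6379 1
sentinel down-after-milliseconds redis-master 5000
sentinel known-sentinel redis-master 172.29.245.6 26379 5080eb7a0e27f9fb6ea6a197bf1abd334608943f
sentinel known-sentinel redis-master 172.29.225.104 26379 85e2ce3707304f7b3b4d0f7281dcd6932094c8b6
"""

summary = summarize_sentinel_conf(conf)
total_sentinels = len(summary["known_sentinels"]) + 1  # peers plus this Sentinel
print(summary["master"], summary["quorum"], total_sentinels)
```

Running this against each node's file makes it easy to confirm that all Sentinels agree on the quorum and that each one knows about every other one.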


My Sentinel config also looks good:


[root@redis-1 ~]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=redismaster,status=ok,address=172.29.245.6:6379,slaves=2,sentinels=3

Although it shows slaves=2, it is still unable to elect a leader. What might be wrong with my config?
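
The master0 line can be checked programmatically on each Sentinel. This is a minimal sketch, assuming the comma-separated key=value layout shown in the output above; `parse_master_line` is a hypothetical helper name. Note that the pasted output reads name=redismaster while the config files use redis-master. Whether that is a transcription slip or a real mismatch cannot be told from the post.

```python
def parse_master_line(line):
    """Turn one 'masterN:...' line of INFO sentinel output into a dict."""
    _, _, value = line.partition(":")  # drop the leading "master0" key
    fields = dict(item.split("=", 1) for item in value.split(","))
    for key in ("slaves", "sentinels"):
        fields[key] = int(fields[key])
    return fields

# Line copied from the INFO sentinel output in the post above.
line = "master0:name=redismaster,status=ok,address=172.29.245.6:6379,slaves=2,sentinels=3"
master = parse_master_line(line)
print(master["name"], master["status"], master["slaves"], master["sentinels"])
```

Comparing the `sentinels` value reported by every Sentinel is a quick way to see whether they have all discovered each other.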



Tuco

Oct 29, 2017, 11:49:19 PM
to Redis DB
Have you renamed any commands, such as the CONFIG command?

Junaid Subhani

Oct 30, 2017, 10:01:04 AM
to Redis DB
Nope. I have done no other customizations. Everything is default.

hva...@gmail.com

Oct 30, 2017, 11:24:04 PM
to Redis DB
The error in your Sentinel log mentions voting.  Are you using the recommended quorum setting for your size/type of server group?

Junaid Subhani

Oct 31, 2017, 1:42:03 PM
to Redis DB
Yes. I have 3 Sentinels and the quorum is set to 2. That's the size.

What do you mean by type?

hva...@gmail.com

Oct 31, 2017, 11:28:40 PM
to Redis DB
The reason I asked about quorum is your opening post in the thread said you have the quorum set to 1:


In my below config I have set quorum to 1 since I want just 1 sentinel to make the decision.

By type, I was thinking about whether the Sentinels are on the same servers as the Redis processes, or on the client servers, or on their own servers.

Junaid Subhani

Nov 1, 2017, 7:05:21 PM
to Redis DB
Ah ok.

Every Sentinel is running on its own virtual machine. I even created an extra Sentinel. I have 4 Sentinels now and still the master does not fail over. This is very strange...

hva...@gmail.com

Nov 2, 2017, 1:05:15 PM
to Redis DB
Are you setting quorum to 1 (as your first post said), or 2?
Do the logfiles from the Sentinels still have the complaint about voting trouble when they fail to perform the automatic failover?

Junaid Subhani

Nov 2, 2017, 3:29:23 PM
to Redis DB
OK, so:

1) Initially I had 2 nodes: Redis master and Redis slave, with quorum set to 1. Failover doesn't work. (2 Sentinels)
2) Then I added a node: Redis master and 2 slaves, with quorum set to 2. Failover doesn't work. (3 Sentinels)
3) Then I added another Sentinel: Redis master, 3 slaves, quorum set to 2. Still doesn't work. (4 Sentinels)

For case #1, in the logs I see +sdown and then +odown, and it is unable to elect a leader.
For cases #2 and #3, the logs show ONLY a +sdown and that's it. Nothing happens after that.
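
The three cases can be compared against the two thresholds Sentinel applies. The configured quorum decides when +sdown becomes +odown, and the Sentinel that starts the failover must then be elected by a majority of all known Sentinels (and by no fewer votes than the quorum). Here is a minimal sketch of that arithmetic. The function names are hypothetical, and the counts passed in are assumptions about how many Sentinels actually agree and vote, since the post does not show that.

```python
def reaches_odown(agreeing_sentinels, quorum):
    """+sdown is promoted to +odown once at least `quorum` Sentinels agree."""
    return agreeing_sentinels >= quorum

def votes_needed(total_sentinels, quorum):
    """A failover leader needs a majority of all Sentinels, and at least `quorum` votes."""
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)

def can_fail_over(total_sentinels, quorum, agreeing_sentinels, votes_received):
    """True only if both the detection step and the election step succeed."""
    return (reaches_odown(agreeing_sentinels, quorum)
            and votes_received >= votes_needed(total_sentinels, quorum))

# Case 1 (assumed): 2 Sentinels, quorum 1, only the local Sentinel votes.
case1 = can_fail_over(total_sentinels=2, quorum=1, agreeing_sentinels=1, votes_received=1)

# Case 2 (assumed): 3 Sentinels, quorum 2, only one Sentinel reports the master down.
case2 = can_fail_over(total_sentinels=3, quorum=2, agreeing_sentinels=1, votes_received=0)

print(case1, case2)
```

Under these assumptions, case #1 stops at the election step because one vote is not a majority of two Sentinels, and cases #2 and #3 stall before +odown because a single report does not meet a quorum of 2. That would point at the Sentinels not reaching each other, or not all seeing the master as down, rather than at the quorum value itself.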

Rahul Singh

Aug 17, 2018, 5:29:22 PM
to Redis DB
Hello,

Were you by any chance able to solve this issue? I am facing the same issue.


On Sunday, October 29, 2017 at 11:02:11 PM UTC+5:30, Junaid Subhani wrote: