Redis failover is not happening with Sentinel


Roshan Pradeep

Jun 2, 2015, 1:44:42 AM
to redi...@googlegroups.com
Hi

I am using Redis 3.0.0 in master-slave mode (1 master, 2 slaves). I followed the exact steps described in http://redis.io/topics/sentinel, only with different IPs.

sentinel.conf
==========

port 26379
daemonize yes
pidfile "/var/run/redis/sentinel.pid"
logfile "/var/log/redis/sentinel.log"
bind 10.1.161.246

dir "/data/redis"

sentinel monitor mymaster 10.1.161.244 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1


Sentinel 1 log
===========
3507:X 02 Jun 15:31:22.823 # Sentinel runid is 73834c1b6d787d92729cb8aee8b260e4ab9efed9
3507:X 02 Jun 15:31:22.823 # +monitor master mymaster 10.1.161.244 6379 quorum 2
3507:X 02 Jun 15:31:22.825 * +slave slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
3507:X 02 Jun 15:31:22.825 * +slave slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
3507:X 02 Jun 15:31:28.543 * +sentinel sentinel 10.1.161.245:26379 10.1.161.245 26379 @ mymaster 10.1.161.244 6379
3507:X 02 Jun 15:31:31.007 * +sentinel sentinel 10.1.161.246:26379 10.1.161.246 26379 @ mymaster 10.1.161.244 6379
3507:X 02 Jun 15:34:10.581 # +sdown master mymaster 10.1.161.244 6379
3507:X 02 Jun 15:34:11.324 # +new-epoch 1
3507:X 02 Jun 15:34:11.326 # +vote-for-leader f685c8cd0a4b210f61f4330f5205c4dc899b33f8 1
3507:X 02 Jun 15:34:11.695 # +odown master mymaster 10.1.161.244 6379 #quorum 3/2
3507:X 02 Jun 15:34:11.695 # Next failover delay: I will not start a failover before Tue Jun  2 15:36:12 2015
3507:X 02 Jun 15:36:11.729 # +new-epoch 2
3507:X 02 Jun 15:36:11.731 # +vote-for-leader f685c8cd0a4b210f61f4330f5205c4dc899b33f8 2
3507:X 02 Jun 15:36:11.769 # Next failover delay: I will not start a failover before Tue Jun  2 15:38:11 2015
3507:X 02 Jun 15:38:11.860 # +new-epoch 3
3507:X 02 Jun 15:38:11.861 # +try-failover master mymaster 10.1.161.244 6379
3507:X 02 Jun 15:38:11.862 # +vote-for-leader 73834c1b6d787d92729cb8aee8b260e4ab9efed9 3
3507:X 02 Jun 15:38:11.866 # 10.1.161.245:26379 voted for 73834c1b6d787d92729cb8aee8b260e4ab9efed9 3
3507:X 02 Jun 15:38:11.867 # 10.1.161.246:26379 voted for 73834c1b6d787d92729cb8aee8b260e4ab9efed9 3
3507:X 02 Jun 15:38:11.917 # +elected-leader master mymaster 10.1.161.244 6379
3507:X 02 Jun 15:38:11.917 # +failover-state-select-slave master mymaster 10.1.161.244 6379
3507:X 02 Jun 15:38:11.983 # +selected-slave slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
3507:X 02 Jun 15:38:11.984 * +failover-state-send-slaveof-noone slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
3507:X 02 Jun 15:38:12.067 * +failover-state-wait-promotion slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379

Sentinel 2 log
===========
31983:X 02 Jun 15:31:26.485 # Sentinel runid is f685c8cd0a4b210f61f4330f5205c4dc899b33f8
31983:X 02 Jun 15:31:26.486 # +monitor master mymaster 10.1.161.244 6379 quorum 2
31983:X 02 Jun 15:31:26.486 * +slave slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
31983:X 02 Jun 15:31:26.486 * +slave slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
31983:X 02 Jun 15:31:26.837 * +sentinel sentinel 10.1.161.244:26379 10.1.161.244 26379 @ mymaster 10.1.161.244 6379
31983:X 02 Jun 15:31:30.998 * +sentinel sentinel 10.1.161.246:26379 10.1.161.246 26379 @ mymaster 10.1.161.244 6379
31983:X 02 Jun 15:34:11.246 # +sdown master mymaster 10.1.161.244 6379
31983:X 02 Jun 15:34:11.312 # +odown master mymaster 10.1.161.244 6379 #quorum 2/2
31983:X 02 Jun 15:34:11.312 # +new-epoch 1
31983:X 02 Jun 15:34:11.312 # +try-failover master mymaster 10.1.161.244 6379
31983:X 02 Jun 15:34:11.314 # +vote-for-leader f685c8cd0a4b210f61f4330f5205c4dc899b33f8 1
31983:X 02 Jun 15:34:11.317 # 10.1.161.244:26379 voted for f685c8cd0a4b210f61f4330f5205c4dc899b33f8 1
31983:X 02 Jun 15:34:11.318 # 10.1.161.246:26379 voted for f685c8cd0a4b210f61f4330f5205c4dc899b33f8 1
31983:X 02 Jun 15:34:11.397 # +elected-leader master mymaster 10.1.161.244 6379
31983:X 02 Jun 15:34:11.397 # +failover-state-select-slave master mymaster 10.1.161.244 6379
31983:X 02 Jun 15:34:11.459 # +selected-slave slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
31983:X 02 Jun 15:34:11.460 * +failover-state-send-slaveof-noone slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
31983:X 02 Jun 15:34:11.518 * +failover-state-wait-promotion slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
31983:X 02 Jun 15:35:11.584 # -failover-abort-slave-timeout master mymaster 10.1.161.244 6379
31983:X 02 Jun 15:35:11.667 # Next failover delay: I will not start a failover before Tue Jun  2 15:36:11 2015
31983:X 02 Jun 15:36:11.716 # +new-epoch 2

Sentinel 3 log
===========
32240:X 02 Jun 15:31:28.978 # Sentinel runid is a59ce934bf30cbd3fbfe9035ba258744dd229169
32240:X 02 Jun 15:31:28.978 # +monitor master mymaster 10.1.161.244 6379 quorum 2
32240:X 02 Jun 15:31:28.978 * +slave slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
32240:X 02 Jun 15:31:28.978 * +slave slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
32240:X 02 Jun 15:31:30.577 * +sentinel sentinel 10.1.161.245:26379 10.1.161.245 26379 @ mymaster 10.1.161.244 6379
32240:X 02 Jun 15:31:30.925 * +sentinel sentinel 10.1.161.244:26379 10.1.161.244 26379 @ mymaster 10.1.161.244 6379
32240:X 02 Jun 15:34:11.323 # +new-epoch 1
32240:X 02 Jun 15:34:11.324 # +vote-for-leader f685c8cd0a4b210f61f4330f5205c4dc899b33f8 1
32240:X 02 Jun 15:34:11.415 # +sdown master mymaster 10.1.161.244 6379
32240:X 02 Jun 15:34:11.486 # +odown master mymaster 10.1.161.244 6379 #quorum 3/2
32240:X 02 Jun 15:34:11.486 # Next failover delay: I will not start a failover before Tue Jun  2 15:36:11 2015
32240:X 02 Jun 15:36:11.728 # +new-epoch 2
32240:X 02 Jun 15:36:11.729 # +vote-for-leader f685c8cd0a4b210f61f4330f5205c4dc899b33f8 2
32240:X 02 Jun 15:36:11.738 # Next failover delay: I will not start a failover before Tue Jun  2 15:38:12 2015
32240:X 02 Jun 15:38:11.863 # +new-epoch 3


After I killed the master, I asked Sentinel for the current master, but it returned the same old master IP. Why is that? Did I miss something?

Thanks. 
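
For readers who want to repeat the "ask Sentinel for the master" step from a script, here is a minimal sketch in Python. It speaks the Redis wire protocol over a plain socket and sends `SENTINEL get-master-addr-by-name`. The helper names (`encode_command`, `parse_addr_reply`, `get_master_addr`) are made up for this illustration, and the parser assumes the whole reply arrives in a single `recv`, which is a simplification. A reply that still names the old master would be consistent with the logs above, where the promotion never completes.

```python
import socket

def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)

def parse_addr_reply(reply):
    # Expected shape: *2 / $<len> / <ip> / $<len> / <port>, each CRLF-terminated.
    # A null array (*-1) or an error reply yields None.
    parts = reply.split(b"\r\n")
    if not parts[0].startswith(b"*") or parts[0] in (b"*-1", b"*0"):
        return None
    return parts[2].decode(), int(parts[4])

def get_master_addr(sentinel_host, sentinel_port, master_name, timeout=2.0):
    """Ask one Sentinel which address it currently considers the master."""
    with socket.create_connection((sentinel_host, sentinel_port), timeout=timeout) as s:
        s.sendall(encode_command("SENTINEL", "get-master-addr-by-name", master_name))
        reply = s.recv(4096)  # assumption: the short reply fits in one read
    return parse_addr_reply(reply)

# Example call against the setup in this post (needs a reachable Sentinel):
# print(get_master_addr("10.1.161.246", 26379, "mymaster"))
```

Asking each of the three Sentinels in turn is a quick way to see whether they agree on the master.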


Jan-Erik Rediger

Jun 2, 2015, 3:36:43 AM
to redi...@googlegroups.com
31983:X 02 Jun 15:35:11.584 # -failover-abort-slave-timeout master mymaster 10.1.161.244 6379

For some reason the slave failed to respond in a timely manner.
Anything in the log of the slaves?
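
To make the timing of that abort easier to see, here is a rough Python sketch that pulls the event names and timestamps out of Sentinel log lines and measures how long Sentinel waited between `+failover-state-wait-promotion` and `-failover-abort-slave-timeout`. The regular expression is an assumption based only on the log layout pasted in this thread, and the function names are invented for the example. The log lines carry no year, so a placeholder year is prepended before parsing.

```python
import re
from datetime import datetime

# Layout assumed from the logs in this thread:
# "<pid>:X <day> <mon> <HH:MM:SS.mmm> <#|*> <+event|-event> ..."
LINE_RE = re.compile(r"^\d+:X (\d{2} \w{3} \d{2}:\d{2}:\d{2}\.\d{3}) [#*] ([+-][\w-]+)")

def parse_events(log_text):
    """Return (timestamp, event_name) pairs for lines that carry an event."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            # Placeholder year 2000: the log has none, only deltas matter here.
            ts = datetime.strptime("2000 " + m.group(1), "%Y %d %b %H:%M:%S.%f")
            events.append((ts, m.group(2)))
    return events

def promotion_wait_seconds(log_text):
    """Seconds between each wait-promotion event and the abort that follows it."""
    waits = []
    started = None
    for ts, name in parse_events(log_text):
        if name == "+failover-state-wait-promotion":
            started = ts
        elif name == "-failover-abort-slave-timeout" and started is not None:
            waits.append((ts - started).total_seconds())
            started = None
    return waits

# Two lines copied from the Sentinel 2 log in the first post.
SAMPLE = """\
31983:X 02 Jun 15:34:11.518 * +failover-state-wait-promotion slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
31983:X 02 Jun 15:35:11.584 # -failover-abort-slave-timeout master mymaster 10.1.161.244 6379
"""

if __name__ == "__main__":
    print(promotion_wait_seconds(SAMPLE))
```

On those two lines the wait comes out at roughly 60 seconds, which lines up with `sentinel failover-timeout mymaster 60000` in the posted sentinel.conf: Sentinel appears to wait the full timeout for the slave to report itself as master and then gives up.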

Roshan Pradeep

Jun 2, 2015, 6:23:48 AM
to redi...@googlegroups.com, jan...@fnordig.de
Nothing special in the logs. I also tried Redis 3.0.1; the logs from that run are below.

Sentinel 1
=======
10569:X 02 Jun 20:14:57.431 # Sentinel runid is 00b88bb29080539f1ccba58c80e7c71ddeae50b6
10569:X 02 Jun 20:14:57.431 # +monitor master mymaster 10.1.161.244 6379 quorum 2
10569:X 02 Jun 20:14:57.432 * +slave slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
10569:X 02 Jun 20:14:57.446 * +slave slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
10569:X 02 Jun 20:15:17.736 * +sentinel sentinel 10.1.161.245:26379 10.1.161.245 26379 @ mymaster 10.1.161.244 6379
10569:X 02 Jun 20:15:36.443 * +sentinel sentinel 10.1.161.246:26379 10.1.161.246 26379 @ mymaster 10.1.161.244 6379
10569:X 02 Jun 20:18:50.757 # +sdown master mymaster 10.1.161.244 6379
10569:X 02 Jun 20:18:50.771 # +new-epoch 1
10569:X 02 Jun 20:18:50.774 # +vote-for-leader 7947a8b01f35479a45705c20a0c9877e203b0120 1
10569:X 02 Jun 20:18:50.833 # +odown master mymaster 10.1.161.244 6379 #quorum 3/2
10569:X 02 Jun 20:18:50.833 # Next failover delay: I will not start a failover before Tue Jun  2 20:20:50 2015
10569:X 02 Jun 20:20:50.921 # +new-epoch 2
10569:X 02 Jun 20:20:50.924 # +vote-for-leader 63358cbf283fa3254df259516beba218c434c22f 2
10569:X 02 Jun 20:20:50.954 # Next failover delay: I will not start a failover before Tue Jun  2 20:22:51 2015


Sentinel 2
=======
6560:X 02 Jun 20:15:15.697 # Sentinel runid is 7947a8b01f35479a45705c20a0c9877e203b0120
6560:X 02 Jun 20:15:15.697 # +monitor master mymaster 10.1.161.244 6379 quorum 2
6560:X 02 Jun 20:15:15.698 * +slave slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
6560:X 02 Jun 20:15:15.702 * +slave slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
6560:X 02 Jun 20:15:15.960 * +sentinel sentinel 10.1.161.244:26379 10.1.161.244 26379 @ mymaster 10.1.161.244 6379
6560:X 02 Jun 20:15:36.433 * +sentinel sentinel 10.1.161.246:26379 10.1.161.246 26379 @ mymaster 10.1.161.244 6379
6560:X 02 Jun 20:18:50.704 # +sdown master mymaster 10.1.161.244 6379
6560:X 02 Jun 20:18:50.757 # +odown master mymaster 10.1.161.244 6379 #quorum 2/2
6560:X 02 Jun 20:18:50.757 # +new-epoch 1
6560:X 02 Jun 20:18:50.757 # +try-failover master mymaster 10.1.161.244 6379
6560:X 02 Jun 20:18:50.759 # +vote-for-leader 7947a8b01f35479a45705c20a0c9877e203b0120 1
6560:X 02 Jun 20:18:50.764 # 10.1.161.244:26379 voted for 7947a8b01f35479a45705c20a0c9877e203b0120 1
6560:X 02 Jun 20:18:50.765 # 10.1.161.246:26379 voted for 7947a8b01f35479a45705c20a0c9877e203b0120 1
6560:X 02 Jun 20:18:50.842 # +elected-leader master mymaster 10.1.161.244 6379
6560:X 02 Jun 20:18:50.842 # +failover-state-select-slave master mymaster 10.1.161.244 6379
6560:X 02 Jun 20:18:50.913 # +selected-slave slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
6560:X 02 Jun 20:18:50.913 * +failover-state-send-slaveof-noone slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
6560:X 02 Jun 20:18:50.975 * +failover-state-wait-promotion slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
6560:X 02 Jun 20:19:50.999 # -failover-abort-slave-timeout master mymaster 10.1.161.244 6379
6560:X 02 Jun 20:19:51.075 # Next failover delay: I will not start a failover before Tue Jun  2 20:20:51 2015
6560:X 02 Jun 20:20:50.910 # +new-epoch 2
6560:X 02 Jun 20:20:50.912 # +vote-for-leader 63358cbf283fa3254df259516beba218c434c22f 2
6560:X 02 Jun 20:20:50.958 # Next failover delay: I will not start a failover before Tue Jun  2 20:22:51 2015
6560:X 02 Jun 20:22:51.220 # +new-epoch 3
6560:X 02 Jun 20:22:51.220 # +try-failover master mymaster 10.1.161.244 6379
6560:X 02 Jun 20:22:51.223 # +vote-for-leader 7947a8b01f35479a45705c20a0c9877e203b0120 3
6560:X 02 Jun 20:22:51.228 # 10.1.161.244:26379 voted for 7947a8b01f35479a45705c20a0c9877e203b0120 3
6560:X 02 Jun 20:22:51.228 # 10.1.161.246:26379 voted for 7947a8b01f35479a45705c20a0c9877e203b0120 3
6560:X 02 Jun 20:22:51.289 # +elected-leader master mymaster 10.1.161.244 6379
6560:X 02 Jun 20:22:51.289 # +failover-state-select-slave master mymaster 10.1.161.244 6379
6560:X 02 Jun 20:22:51.360 # +selected-slave slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
6560:X 02 Jun 20:22:51.360 * +failover-state-send-slaveof-noone slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
6560:X 02 Jun 20:22:51.431 * +failover-state-wait-promotion slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379

Sentinel 3
=======

7002:X 02 Jun 20:15:34.428 # Sentinel runid is 63358cbf283fa3254df259516beba218c434c22f
7002:X 02 Jun 20:15:34.428 # +monitor master mymaster 10.1.161.244 6379 quorum 2
7002:X 02 Jun 20:15:34.430 * +slave slave 10.1.161.245:6379 10.1.161.245 6379 @ mymaster 10.1.161.244 6379
7002:X 02 Jun 20:15:34.440 * +slave slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
7002:X 02 Jun 20:15:34.893 * +sentinel sentinel 10.1.161.244:26379 10.1.161.244 26379 @ mymaster 10.1.161.244 6379
7002:X 02 Jun 20:15:36.037 * +sentinel sentinel 10.1.161.245:26379 10.1.161.245 26379 @ mymaster 10.1.161.244 6379
7002:X 02 Jun 20:18:49.814 # +sdown master mymaster 10.1.161.244 6379
7002:X 02 Jun 20:18:50.772 # +new-epoch 1
7002:X 02 Jun 20:18:50.773 # +vote-for-leader 7947a8b01f35479a45705c20a0c9877e203b0120 1
7002:X 02 Jun 20:18:50.960 # +odown master mymaster 10.1.161.244 6379 #quorum 3/2
7002:X 02 Jun 20:18:50.960 # Next failover delay: I will not start a failover before Tue Jun  2 20:20:50 2015
7002:X 02 Jun 20:20:50.914 # +new-epoch 2
7002:X 02 Jun 20:20:50.914 # +try-failover master mymaster 10.1.161.244 6379
7002:X 02 Jun 20:20:50.916 # +vote-for-leader 63358cbf283fa3254df259516beba218c434c22f 2
7002:X 02 Jun 20:20:50.921 # 10.1.161.245:26379 voted for 63358cbf283fa3254df259516beba218c434c22f 2
7002:X 02 Jun 20:20:50.922 # 10.1.161.244:26379 voted for 63358cbf283fa3254df259516beba218c434c22f 2
7002:X 02 Jun 20:20:50.978 # +elected-leader master mymaster 10.1.161.244 6379
7002:X 02 Jun 20:20:50.978 # +failover-state-select-slave master mymaster 10.1.161.244 6379
7002:X 02 Jun 20:20:51.041 # +selected-slave slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
7002:X 02 Jun 20:20:51.041 * +failover-state-send-slaveof-noone slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
7002:X 02 Jun 20:20:51.141 * +failover-state-wait-promotion slave 10.1.161.246:6379 10.1.161.246 6379 @ mymaster 10.1.161.244 6379
7002:X 02 Jun 20:21:51.146 # -failover-abort-slave-timeout master mymaster 10.1.161.244 6379
7002:X 02 Jun 20:21:51.202 # Next failover delay: I will not start a failover before Tue Jun  2 20:22:51 2015
7002:X 02 Jun 20:22:51.234 # +new-epoch 3
7002:X 02 Jun 20:22:51.237 # +vote-for-leader 7947a8b01f35479a45705c20a0c9877e203b0120 3
7002:X 02 Jun 20:22:51.239 # Next failover delay: I will not start a failover before Tue Jun  2 20:24:51 2015

Any help is greatly appreciated.

Thanks.

IV

Aug 14, 2017, 6:53:12 PM
to Redis DB
Did you ever figure out what the problem was?
I ran into the exact same thing today.

Tuco

Aug 15, 2017, 12:14:18 AM
to Redis DB
Have you renamed any commands in Redis?

George Chilumbu

Aug 22, 2017, 3:13:21 AM
to Redis DB
I have written step-by-step instructions on how to configure Redis and Sentinel and perform failovers. Please refer to the two links below:


If you experience any problems, let me know and I will be more than happy to help.

George.

Dhirendra Patil

Jan 11, 2018, 1:37:42 PM
to Redis DB
I am not able to do a failover.


==> sentinel.log <==
6924:X 11 Jan 17:53:03.793 # +elected-leader master redis-cluster xx.xx.xx.60 6377
6924:X 11 Jan 17:53:03.793 # +failover-state-select-slave master redis-cluster 1xx.xx.xx.60 6377
6924:X 11 Jan 17:53:03.845 # +selected-slave slave xx.xx.xx.64:6379 xx.xx.xx.64 6379 @ redis-cluster xx.xx.xx.60 6377
6924:X 11 Jan 17:53:03.845 * +failover-state-send-slaveof-noone slave xx.xx.xx.64:6379 xx.xx.xx.64 6379 @ redis-cluster xx.xx.xx.60 6377
6924:X 11 Jan 17:53:03.897 * +failover-state-wait-promotion slave xx.xx.xx.64:6379 10.50.216.64 6379 @ redis-cluster xx.xx.xx.60 6377

Dhirendra Patil

Jan 11, 2018, 9:37:39 PM
to Redis DB
I was able to resolve the issue.
I had some rename-command directives in my redis.conf; after commenting those out, the failover worked.
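
For anyone checking their own setup for the same cause, here is a small Python sketch that lists the active `rename-command` directives in a redis.conf. The function name and the sample config are hypothetical and only illustrate the check; which renamed commands actually get in Sentinel's way is not spelled out in this thread, although the `+failover-state-send-slaveof-noone` event suggests that SLAVEOF is one Sentinel needs to issue.

```python
def find_renamed_commands(conf_text):
    """Map each renamed command (upper-cased) to its new name.

    An empty new name means the command was disabled with "".
    Commented-out lines are ignored.
    """
    renamed = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 3 and parts[0].lower() == "rename-command":
            renamed[parts[1].upper()] = parts[2].strip('"')
    return renamed

# Hypothetical config fragment, not taken from any poster's real file.
SAMPLE_CONF = """\
# rename-command FLUSHALL ""
rename-command CONFIG ""
rename-command SLAVEOF b840fc02
maxmemory 2gb
"""

if __name__ == "__main__":
    for command, new_name in find_renamed_commands(SAMPLE_CONF).items():
        print(command, "->", new_name or "(disabled)")
```

Running it against the redis.conf of every master and slave and commenting out whatever it reports is the same fix described above, just automated.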