Hi Y'All,
Placing the sentinels on separate hosts did not solve this problem in my case. I
have 3 redis instances and 3 sentinels. The master is up and 2 slaves are
connected, yet the sentinels try to perform a failover every minute.
Restarting all sentinels and removing their temp configs at the same time
did not solve the problem.
I have updated to the latest stable version 5.0.3, but it did not fix my
problem.
Also, I found a similar problem here:
http://redis-db.2338650.n4.nabble.com/Sentinel-problem-td4410.html
but that thread was never resolved.
Additional note: there is no firewall between redis and sentinel hosts.
redis1.blue
# Replication
role:slave
master_host:redis2.blue
master_port:6380
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
redis2.blue
# Replication
role:master
connected_slaves:2
min_slaves_good_slaves:2
slave0:ip=172.24.23.123,port=6380,state=online,offset=89955552,lag=1
slave1:ip=172.24.23.121,port=6380,state=online,offset=89955466,lag=1
master_replid:29cd21012b52cda54c999ff70da7432abe6077e7
redis3.blue
# Replication
role:slave
master_host:redis2.blue
master_port:6380
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
sentinel1.blue
master0:name=rds-6380,status=odown,address=172.24.23.121:6380,slaves=0,sentinels=6
sentinel2.blue
master0:name=rds-6380,status=odown,address=172.24.23.121:6380,slaves=0,sentinels=6
sentinel3.blue
master0:name=rds-6380,status=odown,address=172.24.23.121:6380,slaves=0,sentinels=6
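To make the mismatch in the sentinel info lines above easier to spot, here is a minimal Python sketch that splits one `masterN:` line into fields and compares them with what the deployment should look like. The function names `parse_sentinel_master_line` and `find_anomalies` are made up for this illustration, and the expected values (2 slaves, 3 sentinels) are taken from the setup described in this post.

```python
def parse_sentinel_master_line(line):
    """Split one 'masterN:...' line from Sentinel's INFO output into a dict."""
    _, _, payload = line.strip().partition(":")
    fields = dict(item.split("=", 1) for item in payload.split(","))
    host, _, port = fields["address"].rpartition(":")
    fields["host"] = host
    fields["port"] = int(port)
    fields["slaves"] = int(fields["slaves"])
    fields["sentinels"] = int(fields["sentinels"])
    return fields


def find_anomalies(fields, expected_slaves, expected_sentinels):
    """Return human-readable differences between the sentinel view and the expected setup."""
    problems = []
    if fields["status"] != "ok":
        problems.append("status is %s, expected ok" % fields["status"])
    if fields["slaves"] != expected_slaves:
        problems.append("slaves=%d, expected %d" % (fields["slaves"], expected_slaves))
    if fields["sentinels"] != expected_sentinels:
        problems.append("sentinels=%d, expected %d" % (fields["sentinels"], expected_sentinels))
    return problems


# Line copied from the sentinel output in this post (joined onto one line).
line = "master0:name=rds-6380,status=odown,address=172.24.23.121:6380,slaves=0,sentinels=6"
fields = parse_sentinel_master_line(line)
problems = find_anomalies(fields, expected_slaves=2, expected_sentinels=3)
for problem in problems:
    print(problem)
```

For the line above, all three checks fire. The `sentinels=6` value stands out, since only 3 sentinels are deployed.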
sentinel1.blue
19869:X 30 Jan 2019 13:15:44.779 # +new-epoch 10026
19869:X 30 Jan 2019 13:15:44.779 # +try-failover master rds-6380 172.24.23.121 6380
19869:X 30 Jan 2019 13:15:44.779 # +vote-for-leader 2d642654e88b852b7952a85682def21c51f8b714 10026
19869:X 30 Jan 2019 13:15:44.782 # fed493d9c17d8419b4c8f99ae5885714976cd522 voted for 2d642654e88b852b7952a85682def21c51f8b714 10026
19869:X 30 Jan 2019 13:15:44.783 # 025b20a18f6770d1e8532302bcf7394dba1bad0a voted for 2d642654e88b852b7952a85682def21c51f8b714 10026
19869:X 30 Jan 2019 13:15:55.165 # -failover-abort-not-elected master rds-6380 172.24.23.121 6380
19869:X 30 Jan 2019 13:15:55.220 # Next failover delay: I will not start a failover before Wed Jan 30 13:16:45 2019
sentinel2.blue
21469:X 30 Jan 2019 13:15:44.782 # +new-epoch 10026
21469:X 30 Jan 2019 13:15:44.782 # +vote-for-leader 2d642654e88b852b7952a85682def21c51f8b714 10026
21469:X 30 Jan 2019 13:15:44.833 # Next failover delay: I will not start a failover before Wed Jan 30 13:16:45 2019
sentinel3.blue
21970:X 30 Jan 2019 13:15:44.781 # +new-epoch 10026
21970:X 30 Jan 2019 13:15:44.782 # +vote-for-leader 2d642654e88b852b7952a85682def21c51f8b714 10026
21970:X 30 Jan 2019 13:15:44.861 # Next failover delay: I will not start a failover before Wed Jan 30 13:16:45 2019
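The sentinel1 log above shows three votes for the same leader, followed by `-failover-abort-not-elected`. A small Python sketch for tallying votes from such log lines is given below. The names `VOTE_RE`, `tally_votes` and `majority` are hypothetical. The `majority` helper assumes, as far as I understand the Sentinel election, that a leader needs more than half of the sentinels a given sentinel believes exist; please treat that as an assumption to verify rather than a statement about the implementation.

```python
import re
from collections import Counter

# Matches both "+vote-for-leader <leader> <epoch>" (own vote)
# and "<voter> voted for <leader> <epoch>" (vote received from a peer).
VOTE_RE = re.compile(
    r"# (?:\+vote-for-leader|(?P<voter>[0-9a-f]{40}) voted for) "
    r"(?P<leader>[0-9a-f]{40}) (?P<epoch>\d+)"
)


def tally_votes(log_lines):
    """Count votes per (epoch, leader run ID) as seen in one sentinel's log."""
    votes = Counter()
    for line in log_lines:
        match = VOTE_RE.search(line)
        if match:
            votes[(int(match.group("epoch")), match.group("leader"))] += 1
    return votes


def majority(known_sentinels):
    """Votes needed for a strict majority (assumed election rule, see lead-in)."""
    return known_sentinels // 2 + 1


# Lines copied from the sentinel1.blue log in this post (joined onto one line each).
LEADER = "2d642654e88b852b7952a85682def21c51f8b714"
sentinel1_log = [
    "19869:X 30 Jan 2019 13:15:44.779 # +vote-for-leader " + LEADER + " 10026",
    "19869:X 30 Jan 2019 13:15:44.782 # fed493d9c17d8419b4c8f99ae5885714976cd522 voted for " + LEADER + " 10026",
    "19869:X 30 Jan 2019 13:15:44.783 # 025b20a18f6770d1e8532302bcf7394dba1bad0a voted for " + LEADER + " 10026",
    "19869:X 30 Jan 2019 13:15:55.165 # -failover-abort-not-elected master rds-6380 172.24.23.121 6380",
]
votes = tally_votes(sentinel1_log)
received = votes[(10026, LEADER)]
print("votes received:", received)
print("needed with 3 known sentinels:", majority(3))
print("needed with 6 known sentinels:", majority(6))
```

If the assumed rule holds, 3 votes would be enough with 3 known sentinels but not with the 6 reported in the sentinel info, which might be related to the aborted election.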
Manual failover does not work either, I suspect because the sentinels report
the master as status=odown with slaves=0.
# redis-cli -h sentinel1.blue -p 26380 SENTINEL failover rds-6380
(error) NOGOODSLAVE No suitable replica to promote
# redis-cli -h sentinel1.blue -p 26380 SENTINEL get-master-addr-by-name rds-6380
1) "172.24.23.121"
2) "6380"
# redis-cli -h sentinel1.blue -p 26380 SENTINEL slaves rds-6380
(empty list or set)
[ro...@redis1.blue] [2019-01-31 09:24:46] ~
# redis-cli -h redis3.blue -p 6380 subscribe __sentinel__:hello
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "__sentinel__:hello"
3) (integer) 1
^C
[ro...@redis1.blue] [2019-01-31 09:24:58] ~
# redis-cli -h redis2.blue -p 6380 subscribe __sentinel__:hello
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "__sentinel__:hello"
3) (integer) 1
^C
[ro...@redis1.blue] [2019-01-31 09:25:09] ~
# redis-cli -h redis1.blue -p 6380 subscribe __sentinel__:hello
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "__sentinel__:hello"
3) (integer) 1
1) "message"
2) "__sentinel__:hello"
3) "172.24.23.203,26380,fde792596cb11c017abd77df018148a55074620b,1103,rds-6380,172.24.23.121,6380,0"
1) "message"
2) "__sentinel__:hello"
3) "172.24.23.201,26380,825d65e0e56816cc8c666400341fb6c186741c54,1103,rds-6380,172.24.23.121,6380,0"
1) "message"
2) "__sentinel__:hello"
3) "172.24.23.202,26380,7459706aa08c2de1516fb862ec4030581f5dc48b,1103,rds-6380,172.24.23.121,6380,0"
1) "message"
2) "__sentinel__:hello"
3) "172.24.23.203,26380,fde792596cb11c017abd77df018148a55074620b,1103,rds-6380,172.24.23.121,6380,0"
1) "message"
2) "__sentinel__:hello"
3) "172.24.23.201,26380,825d65e0e56816cc8c666400341fb6c186741c54,1103,rds-6380,172.24.23.121,6380,0"
[ro...@redis1.blue] [2019-01-31 09:43:41] ~
# redis-cli -p 6380 client list | grep sentinel
id=186741 addr=172.24.23.203:46902 fd=30 name=sentinel-fde79259-pubsub age=67640 idle=0 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
id=186737 addr=172.24.23.201:47722 fd=7 name=sentinel-825d65e0-cmd age=67640 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=publish
id=186738 addr=172.24.23.201:57516 fd=28 name=sentinel-825d65e0-pubsub age=67640 idle=0 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
id=186744 addr=172.24.23.202:38082 fd=31 name=sentinel-7459706a-cmd age=67640 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=publish
id=186745 addr=172.24.23.202:54021 fd=32 name=sentinel-7459706a-pubsub age=67640 idle=0 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
id=186740 addr=172.24.23.203:55403 fd=29 name=sentinel-fde79259-cmd age=67640 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping
[ro...@redis2.blue] [2019-01-31 09:45:08] ~
# redis-cli -p 6380 client list | grep sentinel
[ro...@redis3.blue] [2019-01-31 09:45:13] ~
# redis-cli -p 6380 client list | grep sentinel
[ro...@redis1.blue] [2019-01-31 09:44:15] ~
# redis-cli -p 6380 pubsub channels
1) "__sentinel__:hello"
[ro...@redis2.blue] [2019-01-31 09:47:59] ~
# redis-cli -p 6380 pubsub channels
(empty list or set)
[ro...@redis3.blue] [2019-01-31 09:45:19] ~
# redis-cli -p 6380 pubsub channels
(empty list or set)
[ro...@redis1.blue] [2019-01-31 09:49:26] ~
# redis-cli -p 6380 monitor | grep sentinel
1548924655.315747 [0 172.24.23.201:47722] "PUBLISH" "__sentinel__:hello" "172.24.23.201,26380,825d65e0e56816cc8c666400341fb6c186741c54,1129,rds-6380,172.24.23.121,6380,0"
1548924655.723286 [0 172.24.23.203:55403] "PUBLISH" "__sentinel__:hello" "172.24.23.203,26380,fde792596cb11c017abd77df018148a55074620b,1129,rds-6380,172.24.23.121,6380,0"
1548924656.006182 [0 172.24.23.202:38082] "PUBLISH" "__sentinel__:hello" "172.24.23.202,26380,7459706aa08c2de1516fb862ec4030581f5dc48b,1129,rds-6380,172.24.23.121,6380,0"
[ro...@redis2.blue] [2019-01-31 09:49:47] ~
# redis-cli -p 6380 monitor | grep sentinel
^C
[ro...@redis3.blue] [2019-01-31 09:50:32] ~
# redis-cli -p 6380 monitor | grep sentinel
^C
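The `__sentinel__:hello` payloads seen in the subscribe and monitor output above are comma-separated. A minimal Python sketch for decoding them follows. The field names in `Hello` reflect my understanding of the message layout (announcing sentinel's IP, port and run ID, its current epoch, then the master name, address and config epoch) and should be treated as an assumption; `parse_hello` is a hypothetical helper.

```python
from typing import NamedTuple


class Hello(NamedTuple):
    # Field meanings are assumed, see the note above.
    sentinel_ip: str
    sentinel_port: int
    sentinel_runid: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload):
    """Decode one comma-separated hello payload into a Hello tuple."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError("expected 8 fields, got %d" % len(parts))
    ip, port, runid, epoch, name, master_ip, master_port, config_epoch = parts
    return Hello(ip, int(port), runid, int(epoch), name,
                 master_ip, int(master_port), int(config_epoch))


# Payloads copied from the monitor output in this post.
payloads = [
    "172.24.23.201,26380,825d65e0e56816cc8c666400341fb6c186741c54,1129,rds-6380,172.24.23.121,6380,0",
    "172.24.23.203,26380,fde792596cb11c017abd77df018148a55074620b,1129,rds-6380,172.24.23.121,6380,0",
    "172.24.23.202,26380,7459706aa08c2de1516fb862ec4030581f5dc48b,1129,rds-6380,172.24.23.121,6380,0",
]
hellos = [parse_hello(p) for p in payloads]
run_ids = {h.sentinel_runid for h in hellos}
masters = {(h.master_ip, h.master_port) for h in hellos}
print("distinct sentinels announcing:", len(run_ids))
print("master address they announce:", masters)
```

This makes it easy to compare the number of distinct announcing sentinels and the master address they advertise against the `INFO replication` output of the redis instances.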
[ro...@redis1.blue] [2019-01-31 09:54:33] ~
# redis-cli -p 6380 config get slaveof
1) "slaveof"
2) "redis2.blue 6380"
[ro...@redis2.blue] [2019-01-31 09:57:08] ~
# redis-cli -p 6380 config get slaveof
1) "slaveof"
2) ""
[ro...@redis1.blue] BLUE/WRO [2019-01-31 09:56:01] ~
# cat /var/run/redis/6380.conf
daemonize yes
pidfile /var/run/redis/6380.pid
logfile /var/log/redis/6380.log
dir /var/lib/redis/6380/
port 6380
bind 0.0.0.0
timeout 0
loglevel notice
slave-serve-stale-data yes
maxclients 1024
maxmemory 256mb
maxmemory-policy volatile-lru
appendonly no
appendfsync no
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 128mb
slowlog-log-slower-than 10000
slowlog-max-len 1024
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
min-slaves-to-write 1
slave-priority 313
I use the same config on redis2 and redis3 (except for the slave-priority).
In addition to this redis instance, I have more instances on ports
6381-6384. 3 of those can properly perform the failover, and the sentinels talk
to all redis instances on all redis hosts. One instance (on port 6382) has
the same failover problem as the 6380 instance.
*Why do the sentinels not subscribe to the 6380 instances on redis2 and redis3?
Why do the sentinels report the master as "odown" with no slaves while the redis
master is up and has 2 slaves connected?*
Regards,
Daniel Andrzejewski