Redis sentinels - detection

José Coelho

Nov 12, 2015, 10:05:56 AM
to Redis DB
Hi,

I'm trying to set up the following Redis structure:

redis servers:
redis1: 172.10.10.10
redis2: 172.10.10.11 (slaveof redis1)
redis3: 172.10.10.12 (slaveof redis1)

An instance of Redis Sentinel is also running on each server.

redis1: redis.conf:
pidfile /var/run/redis/redis-server.pid
loglevel verbose
daemonize yes
bind 172.10.10.10
dir /var/lib/redis/
port 6379

redis2: redis.conf:
pidfile /var/run/redis/redis-server.pid
loglevel verbose
slaveof 172.10.10.10 6379
daemonize yes
bind 172.10.10.11
dir /var/lib/redis/
port 6379

redis3: redis.conf:
pidfile /var/run/redis/redis-server.pid
loglevel verbose
slaveof 172.10.10.10 6379
daemonize yes
bind 172.10.10.12
dir /var/lib/redis/
port 6379

redis sentinel is the same for all servers: sentinel.conf
sentinel monitor master 172.10.10.10 6379 2
daemonize yes
pidfile "/var/run/redis/redis-sentinel.pid"
sentinel down-after-milliseconds master 5000
port 26379
logfile "/var/log/redis/redis-sentinel.log"
sentinel failover-timeout master 5000
dir "/var/lib/redis/sentinel"

Redis replication is working fine:
redis-cli -h 172.10.10.10 info replication
# Replication
role:master
connected_slaves:2
slave0:ip=172.10.10.12,port=6379,state=online,offset=50252,lag=1
slave1:ip=172.10.10.11,port=6379,state=online,offset=5479,lag=1
master_repl_offset:50252
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:50251

redis-cli -h 172.10.10.11 info replication
# Replication
role:slave
master_host:172.10.10.10
master_port:6379
master_link_status:up
master_last_io_seconds_ago:56
master_sync_in_progress:0
slave_repl_offset:5479
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

redis-cli -h 172.10.10.12 info replication
# Replication
role:slave
master_host:172.10.10.10
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:54446
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

The issue is with sentinels:

sentinel1 on redis1:
redis-cli -h 172.10.10.10 -p 26379 sentinel sentinels master
1)  1) "name"
    2) "172.10.10.11:26379"
    3) "ip"
    4) "172.10.10.11"
    5) "port"
    6) "26379"
    7) "runid"
    8) "a0d8cf68b5b5a0afd5ba6c215b8fc0f974926996"
    9) "flags"
   10) "sentinel"
   11) "pending-commands"
   12) "0"
   13) "last-ping-sent"
   14) "0"
   15) "last-ok-ping-reply"
   16) "26"
   17) "last-ping-reply"
   18) "26"
   19) "down-after-milliseconds"
   20) "5000"
   21) "last-hello-message"
   22) "1027"
   23) "voted-leader"
   24) "?"
   25) "voted-leader-epoch"
   26) "0"
2)  1) "name"
    2) "172.10.10.12:26379"
    3) "ip"
    4) "172.10.10.12"
    5) "port"
    6) "26379"
    7) "runid"
    8) "a336afe851fa844e56d2ef4d5ccc68e189737a93"
    9) "flags"
   10) "sentinel"
   11) "pending-commands"
   12) "0"
   13) "last-ping-sent"
   14) "0"
   15) "last-ok-ping-reply"
   16) "561"
   17) "last-ping-reply"
   18) "561"
   19) "down-after-milliseconds"
   20) "5000"
   21) "last-hello-message"
   22) "231"
   23) "voted-leader"
   24) "?"
   25) "voted-leader-epoch"
   26) "0"

$ redis-cli -h 172.10.10.10 -p 26379 sentinel slaves master
1)  1) "name"
    2) "172.10.10.12:6379"
    3) "ip"
    4) "172.10.10.12"
    5) "port"
    6) "6379"
    7) "runid"
    8) ""
    9) "flags"
   10) "s_down,slave"
   11) "pending-commands"
   12) "31"
   13) "last-ping-sent"
   14) "833056"
   15) "last-ok-ping-reply"
   16) "833056"
   17) "last-ping-reply"
   18) "833056"
   19) "s-down-time"
   20) "827958"
   21) "down-after-milliseconds"
   22) "5000"
   23) "info-refresh"
   24) "1447340231535"
   25) "role-reported"
   26) "slave"
   27) "role-reported-time"
   28) "833056"
   29) "master-link-down-time"
   30) "0"
   31) "master-link-status"
   32) "err"
   33) "master-host"
   34) "?"
   35) "master-port"
   36) "0"
   37) "slave-priority"
   38) "100"
   39) "slave-repl-offset"
   40) "0"
2)  1) "name"
    2) "172.10.10.11:6379"
    3) "ip"
    4) "172.10.10.11"
    5) "port"
    6) "6379"
    7) "runid"
    8) ""
    9) "flags"
   10) "s_down,slave"
   11) "pending-commands"
   12) "88"
   13) "last-ping-sent"
   14) "943399"
   15) "last-ok-ping-reply"
   16) "943399"
   17) "last-ping-reply"
   18) "943399"
   19) "s-down-time"
   20) "938321"
   21) "down-after-milliseconds"
   22) "5000"
   23) "info-refresh"
   24) "1447340231535"
   25) "role-reported"
   26) "slave"
   27) "role-reported-time"
   28) "943399"
   29) "master-link-down-time"
   30) "0"
   31) "master-link-status"
   32) "err"
   33) "master-host"
   34) "?"
   35) "master-port"
   36) "0"
   37) "slave-priority"
   38) "100"
   39) "slave-repl-offset"
   40) "0"

sentinel2 on redis2:
redis-cli -h 172.10.10.11 -p 26379 sentinel slaves master
(empty list or set)

redis-cli -h 172.10.10.11 -p 26379 sentinel sentinels master
1)  1) "name"
    2) "172.10.10.10:26379"
    3) "ip"
    4) "172.10.10.10"
    5) "port"
    6) "26379"
    7) "runid"
    8) "71e4e5a2b0252e2d08b0666a766b61e0414533ad"
    9) "flags"
   10) "sentinel"
   11) "pending-commands"
   12) "0"
   13) "last-ping-sent"
   14) "0"
   15) "last-ok-ping-reply"
   16) "546"
   17) "last-ping-reply"
   18) "546"
   19) "down-after-milliseconds"
   20) "5000"
   21) "last-hello-message"
   22) "701"
   23) "voted-leader"
   24) "?"
   25) "voted-leader-epoch"
   26) "0"

sentinel3 on redis3:
$ redis-cli -h 172.10.10.12 -p 26379 sentinel slaves master
(empty list or set)

$ redis-cli -h 172.10.10.12 -p 26379 sentinel sentinels master
1)  1) "name"
    2) "172.10.10.10:26379"
    3) "ip"
    4) "172.10.10.10"
    5) "port"
    6) "26379"
    7) "runid"
    8) "71e4e5a2b0252e2d08b0666a766b61e0414533ad"
    9) "flags"
   10) "sentinel"
   11) "pending-commands"
   12) "0"
   13) "last-ping-sent"
   14) "0"
   15) "last-ok-ping-reply"
   16) "978"
   17) "last-ping-reply"
   18) "978"
   19) "down-after-milliseconds"
   20) "5000"
   21) "last-hello-message"
   22) "430"
   23) "voted-leader"
   24) "?"
   25) "voted-leader-epoch"
   26) "0"

Redis Pub/Sub for sentinels:
1) "pmessage"
2) "*"
3) "__sentinel__:hello"
4) "172.10.10.10,26379,71e4e5a2b0252e2d08b0666a766b61e0414533ad,0,master,172.10.10.10,6379,0"
1) "pmessage"
2) "*"
3) "__sentinel__:hello"
4) "172.10.10.10,26379,71e4e5a2b0252e2d08b0666a766b61e0414533ad,0,master,172.10.10.10,6379,0"
1) "pmessage"
2) "*"
3) "__sentinel__:hello"
4) "172.10.10.10,26379,71e4e5a2b0252e2d08b0666a766b61e0414533ad,0,master,172.10.10.10,6379,0"

If I manually add a known-sentinel, for example in redis3, sentinel.conf:
sentinel monitor master 172.10.10.10 6379 2
daemonize yes
pidfile "/var/run/redis/redis-sentinel.pid"
sentinel down-after-milliseconds master 5000
port 26379
logfile "/var/log/redis/redis-sentinel.log"
sentinel failover-timeout master 5000
dir "/var/lib/redis/sentinel"
# Generated by CONFIG REWRITE
maxclients 4064
sentinel config-epoch master 0
sentinel leader-epoch master 6
sentinel known-sentinel master 172.10.10.10 26379 71e4e5a2b0252e2d08b0666a766b61e0414533ad
sentinel known-sentinel master 172.10.10.11 26379 5227b445ef69689d70e29315fb8d284d0622b24f
sentinel current-epoch 6

However:
$ redis-cli -h 172.10.10.12 -p 26379 sentinel sentinels master
1)  1) "name"
    2) "172.10.10.10:26379"
    3) "ip"
    4) "172.10.10.10"
    5) "port"
    6) "26379"
    7) "runid"
    8) "71e4e5a2b0252e2d08b0666a766b61e0414533ad"
    9) "flags"
   10) "sentinel"
   11) "pending-commands"
   12) "0"
   13) "last-ping-sent"
   14) "0"
   15) "last-ok-ping-reply"
   16) "894"
   17) "last-ping-reply"
   18) "894"
   19) "down-after-milliseconds"
   20) "5000"
   21) "last-hello-message"
   22) "113"
   23) "voted-leader"
   24) "345ed562b502dd28bc6ae0554fa0838642792485"
   25) "voted-leader-epoch"
   26) "14"
2)  1) "name"
    2) "172.10.10.11:26379"
    3) "ip"
    4) "172.10.10.11"
    5) "port"
    6) "26379"
    7) "runid"
    8) "5227b445ef69689d70e29315fb8d284d0622b24f"
    9) "flags"
   10) "sentinel,master_down"
   11) "pending-commands"
   12) "0"
   13) "last-ping-sent"
   14) "0"
   15) "last-ok-ping-reply"
   16) "1067"
   17) "last-ping-reply"
   18) "1067"
   19) "down-after-milliseconds"
   20) "5000"
   21) "last-hello-message"
   22) "1069"
   23) "voted-leader"
   24) "345ed562b502dd28bc6ae0554fa0838642792485"
   25) "voted-leader-epoch"
   26) "14"

Still no slaves:
$ redis-cli -h 172.10.10.12 -p 26379 sentinel slaves master
(empty list or set)

And no exchange in pub/sub:
3) "__sentinel__:hello"
4) "172.10.10.10,26379,71e4e5a2b0252e2d08b0666a766b61e0414533ad,22,master,172.10.10.10,6379,0"
1) "pmessage"
2) "*"
3) "__sentinel__:hello"
4) "172.10.10.10,26379,71e4e5a2b0252e2d08b0666a766b61e0414533ad,22,master,172.10.10.10,6379,0"
1) "pmessage"
2) "*"
3) "__sentinel__:hello"
4) "172.10.10.10,26379,71e4e5a2b0252e2d08b0666a766b61e0414533ad,22,master,172.10.10.10,6379,0"


Any clues how to solve or debug this?

chilumb...@gmail.com

Nov 12, 2015, 9:57:59 PM
to Redis DB
Why not just set up a Redis Cluster with three master nodes instead?
...

The Baldguy

Nov 12, 2015, 11:51:04 PM
to Redis DB
Not everybody needs cluster mode, and cluster mode doesn't work for everybody. Just for starters, there are commands and patterns you cannot use with cluster mode.

The Baldguy

Nov 13, 2015, 12:06:25 AM
to Redis DB

Adding known sentinels won't do anything to help a given sentinel detect the slaves of a master, nor does slave detection take place in the PUBSUB channel. All that happens in PUBSUB is that the sentinels issue their hello messages. In your PUBSUB section I don't see any of the other sentinels, which tells me they probably haven't been able to connect to the master successfully. To figure out why, you should check the logs from those slaves. If nothing pops out from there, increase the verbosity of the server and restart it, then look in its logs.
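
To check which sentinels are actually announcing themselves, the hello payloads can be parsed. This is a minimal Python sketch, assuming the eight comma-separated fields are sentinel ip, sentinel port, run ID, current epoch, master name, master ip, master port and master config epoch, which matches the messages quoted in this thread. The names parse_hello and announcing_sentinels are made up for illustration and are not part of Redis.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Hello(NamedTuple):
    sentinel_ip: str
    sentinel_port: int
    sentinel_runid: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload: str) -> Hello:
    """Split one __sentinel__:hello payload into named fields."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}: {payload!r}")
    ip, port, runid, epoch, name, m_ip, m_port, m_epoch = parts
    return Hello(ip, int(port), runid, int(epoch), name, m_ip, int(m_port), int(m_epoch))


def announcing_sentinels(payloads: Iterable[str]) -> Set[Tuple[str, int]]:
    """Return the distinct (ip, port) pairs seen in the hello messages."""
    return {(h.sentinel_ip, h.sentinel_port) for h in map(parse_hello, payloads)}


# The three payloads captured in the original post (all from the same sentinel).
payloads = [
    "172.10.10.10,26379,71e4e5a2b0252e2d08b0666a766b61e0414533ad,0,master,172.10.10.10,6379,0",
] * 3

# The sentinels that are supposed to exist according to the setup described.
expected = {("172.10.10.10", 26379), ("172.10.10.11", 26379), ("172.10.10.12", 26379)}

missing = expected - announcing_sentinels(payloads)
print(sorted(missing))
```

Any address left in missing is a sentinel that never published on the master's channel, so its connection to the master is the first thing to look at.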

As it stands, I'd guess your slaves don't have the right connectivity or are firewalled away on the server.
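
The "sentinel slaves master" output from sentinel1 can be read the same way. This is a rough Python sketch, assuming each entry is the flat field/value list that redis-cli prints; pair_fields and unreachable_replicas are hypothetical helper names, and the sample entries are abbreviated copies of the values posted above.

```python
from typing import Dict, List, Sequence


def pair_fields(flat: Sequence[str]) -> Dict[str, str]:
    """Turn ['name', 'x', 'ip', 'y', ...] into {'name': 'x', 'ip': 'y', ...}."""
    if len(flat) % 2 != 0:
        raise ValueError("field/value list must have an even length")
    return dict(zip(flat[0::2], flat[1::2]))


def unreachable_replicas(entries: Sequence[Sequence[str]]) -> List[str]:
    """Names of replicas that this sentinel has flagged as subjectively down."""
    names = []
    for entry in map(pair_fields, entries):
        flags = entry.get("flags", "").split(",")
        if "s_down" in flags:
            names.append(entry["name"])
    return names


# Abbreviated from the sentinel1 output in the original post.
entries = [
    ["name", "172.10.10.12:6379", "runid", "", "flags", "s_down,slave",
     "master-link-status", "err"],
    ["name", "172.10.10.11:6379", "runid", "", "flags", "s_down,slave",
     "master-link-status", "err"],
]

print(unreachable_replicas(entries))
```

Both replicas come back flagged, and the empty runid suggests that sentinel1 learned their addresses from the master but never got a usable reply from the replicas themselves.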

I would also highly recommend setting the sentinels up on nodes other than where the redis servers are and try that. I've seen several issues with the way you've set it up here.

José Coelho

Nov 13, 2015, 12:49:28 PM
to Redis DB
After some debugging, the issue turned out to be caused by network problems between the nodes.

When a sentinel tried to issue the INFO command, the connection would time out, but the info replication command, for example, worked fine.
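
For anyone debugging the same symptom, here is a minimal Python sketch of that check, run from each sentinel host against each Redis node. It assumes the server answers INFO with a plain RESP bulk string and needs no AUTH; probe_info, encode_command and parse_replicas are hypothetical names, and the 5 second timeout is only an example value.

```python
import socket
from typing import List, Optional, Tuple


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(out)


def probe_info(host: str, port: int, timeout: float = 5.0,
               section: Optional[str] = None) -> str:
    """Send INFO (optionally one section); raises a timeout error if the reply stalls."""
    args = ("INFO",) if section is None else ("INFO", section)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_command(*args))
        reader = sock.makefile("rb")
        header = reader.readline()
        if not header.startswith(b"$"):
            raise RuntimeError(f"unexpected reply: {header!r}")
        body = reader.read(int(header[1:]) + 2)  # payload plus trailing CRLF
        return body[:-2].decode()


def parse_replicas(info_text: str) -> List[Tuple[str, int, str]]:
    """Extract (ip, port, state) from the slaveN: lines of a master's INFO output."""
    replicas = []
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.startswith("slave") and key[5:].isdigit():
            fields = dict(item.split("=", 1) for item in value.split(","))
            replicas.append((fields["ip"], int(fields["port"]), fields["state"]))
    return replicas


# Sample taken from the master's output in the original post.
sample = (
    "role:master\r\n"
    "connected_slaves:2\r\n"
    "slave0:ip=172.10.10.12,port=6379,state=online,offset=50252,lag=1\r\n"
    "slave1:ip=172.10.10.11,port=6379,state=online,offset=5479,lag=1\r\n"
)
print(parse_replicas(sample))

# Example calls, to compare the full INFO with a single section:
#   probe_info("172.10.10.10", 6379)
#   probe_info("172.10.10.10", 6379, section="replication")
```

If the full INFO call times out while the replication section returns, that reproduces the behaviour described above and points at the network path rather than at the Sentinel configuration.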

Thanks.