Redis HA using Redis Sentinel


ESWAR RAO

Nov 26, 2013, 1:06:17 AM
to redi...@googlegroups.com
Hi All,

I want to configure HA for redis-server in a 2-node setup.
I see that Redis Sentinel can achieve this behavior.

Can someone please help me with the Sentinel configuration on the 2 nodes?

Should I run both redis-server and Sentinel on both nodes?

machine 1: [192.168.100.120]
===========================

#redis-server --port 6400
#redis-server sentinel.conf --sentinel

sentinel.conf
++++++++++++++++
port 26379
sentinel monitor mymaster 127.0.0.1 6400 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 900000
sentinel can-failover mymaster yes
sentinel parallel-syncs mymaster 1


machine 2: [192.168.100.121]
=============================
# redis-server --port 6400 --slaveof 192.168.100.120 6400
# redis-server sentinel.conf --sentinel

Can I use the same sentinel.conf on both machines?
Should I mention the slave redis-server's IP address in sentinel.conf?


Thanks
Eswar

CharSyam

Nov 26, 2013, 2:00:29 AM
to redi...@googlegroups.com
Don't use 127.0.0.1; use 192.168.100.120,
and just use the same sentinel.conf on each node.
Sentinel will find the slaves automatically.
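
For example, a minimal sentinel.conf that both nodes could share (a sketch based on the settings already in this thread, assuming the master initially runs on 192.168.100.120):

port 26379
# Monitor the master by its real, routable address; Sentinel discovers
# the slaves and the other Sentinels from there.
sentinel monitor mymaster 192.168.100.120 6400 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 900000
sentinel parallel-syncs mymaster 1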


2013/11/26 ESWAR RAO <eswa...@gmail.com>


ESWAR RAO

Nov 26, 2013, 3:48:08 AM
to redi...@googlegroups.com
Hi CharSyam,

Thanks for the response.

I have the below configuration on two machines.

But if I stop the redis-server on the master node, the slave node continuously emits logs like
[1399] 26 Nov 00:42:37.045 # Error condition on socket for SYNC: Connection refused
..................................................................

and the slave node is not promoted to master.
127.0.0.1:6400> set 4 def
(error) READONLY You can't write against a read only slave.


on machine 1: [192.168.100.120]

=========================
#redis-server --port 6400
#redis-server sentinel.conf --sentinel

sentinel.conf
++++++++++++++++
port 26379
sentinel monitor mymaster 192.168.100.120 6400 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 900000
sentinel can-failover mymaster yes
sentinel parallel-syncs mymaster 1

on machine 2: [192.168.100.121]
=========================
# redis-server --port 6400 --slaveof 192.168.100.120 6400
# redis-server sentinel.conf --sentinel

sentinel.conf
++++++++++++++++
port 26379
sentinel monitor mymaster 192.168.100.121 6400 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 900000
sentinel can-failover mymaster yes
sentinel parallel-syncs mymaster 1



Thanks
Eswar




CharSyam

Nov 26, 2013, 8:41:20 AM
to redi...@googlegroups.com
How long did you wait for the switch? Maybe it needs some time to switch...

and change

sentinel monitor mymaster 192.168.100.121 6400 1
to 
sentinel monitor mymaster 192.168.100.120 6400 1

ESWAR RAO

Nov 27, 2013, 4:43:07 AM
to redi...@googlegroups.com
Hi CharSyam,

Thanks for the inputs.

I waited a long time but it didn't succeed.

Do the sentinels on both machines monitor the master node?

I killed the master redis-server on 100.120 and expected the slave on 100.121 to become the master automatically.

on machine 1: [192.168.100.120]
=========================
#redis-server --port 6400
#redis-server sentinel.conf --sentinel

sentinel.conf
++++++++++++++++
port 26379
sentinel monitor mymaster 192.168.100.120 6400 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 900000
sentinel can-failover mymaster yes
sentinel parallel-syncs mymaster 1

on machine 2: [192.168.100.121]
=========================
# redis-server --port 6400 --slaveof 192.168.100.120 6400
# redis-server sentinel.conf --sentinel

sentinel.conf
++++++++++++++++
port 26379
sentinel monitor mymaster 192.168.100.120 6400 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 900000
sentinel can-failover mymaster yes
sentinel parallel-syncs mymaster 1

Thanks
Eswar

CharSyam

Nov 27, 2013, 9:07:27 AM
to redi...@googlegroups.com
Could you upload the full sentinel log?

On Wednesday, November 27, 2013, ESWAR RAO wrote:

Salvatore Sanfilippo

Nov 27, 2013, 9:09:34 AM
to Redis DB
Note: make sure to use Redis Sentinel available in Redis 2.8.1.
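
For example, a quick way to confirm which version you are actually running (both are standard commands):

$ redis-server --version
$ redis-cli INFO server | grep redis_version
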
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

We suspect that trading off implementation flexibility for
understandability makes sense for most system designs.
— Diego Ongaro and John Ousterhout (from Raft paper)

ESWAR RAO

Nov 28, 2013, 2:25:36 AM
to redi...@googlegroups.com
Hi CharSyam,

I am currently using redis-2.8.1.

on machine 1: [192.168.100.187]
=========================
#redis-server
#redis-server sentinel.conf --sentinel

sentinel.conf
++++++++++++++++

port 26379
sentinel monitor mymaster 192.168.100.187 6379 1

sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 9000
sentinel parallel-syncs mymaster 1


on machine 2: [192.168.2.94]
=========================
# redis-server (changed in config file as: slaveof 192.168.100.187 6379)

# redis-server sentinel.conf --sentinel

sentinel.conf
++++++++++++++++
port 26379
sentinel monitor mymaster 192.168.100.187 6400 1

sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 9000
sentinel parallel-syncs mymaster 1


I killed the redis-server on 100.120, but this time I didn't see anything in the sentinel logs and they are stuck.

100.187
======
[22687] 27 Nov 23:23:07.534 # Sentinel runid is be6867252c4e5f8a075e9eb23b929d95edd6969a
[22687] 27 Nov 23:23:12.541 # +sdown sentinel 192.168.2.94:26379 192.168.2.94 26379 @ mymaster 192.168.100.187 6379
[22687] 27 Nov 23:23:12.941 # -sdown sentinel 192.168.2.94:26379 192.168.2.94 26379 @ mymaster 192.168.100.187 6379
[22687] 27 Nov 23:23:14.390 * -dup-sentinel master mymaster 192.168.100.187 6379 #duplicate of 192.168.2.94:26379 or 6c68df92d72081870de4a13296028b2859671f0c
[22687] 27 Nov 23:23:14.390 * +sentinel sentinel 192.168.2.94:26379 192.168.2.94 26379 @ mymaster 192.168.100.187 6379


2.94
=======
[17231] 28 Nov 12:53:12.094 # Sentinel runid is 6c68df92d72081870de4a13296028b2859671f0c
[17231] 28 Nov 12:53:12.259 * -dup-sentinel master mymaster 192.168.100.187 6379 #duplicate of 192.168.100.187:26379 or be6867252c4e5f8a075e9eb23b929d95edd6969a
[17231] 28 Nov 12:53:12.260 * +sentinel sentinel 192.168.100.187:26379 192.168.100.187 26379 @ mymaster 192.168.100.187 6379



Thanks
Eswar




Salvatore Sanfilippo

Nov 28, 2013, 2:59:49 AM
to Redis DB
The port of the monitored master is different in the two configs...

Salvatore

ESWAR RAO

Nov 28, 2013, 3:30:48 AM
to redi...@googlegroups.com
Hi Salvatore,

Sorry for pasting it wrongly in the mail.
Actually I kept the same port number in both sentinel config files, i.e., 6379.

The sentinel logs also show the same port number.

I don't know if I am missing anything or configuring something wrongly.

Thanks
Eswar

Salvatore Sanfilippo

Nov 28, 2013, 3:49:39 AM
to Redis DB
Hello again,

In your logs, what I see is that the Sentinels are able to connect to
the master, as they detect each other.
However, there is no +slave event, so apparently they are not able to
sense that a slave is actually connected.
Also, after you kill the master no +sdown / +odown events are
generated, which is very odd.

Try the following in order to provide some more information.

1) Set up your environment and start the Redis servers and the two Sentinels.
2) Wait 30 seconds or more.
3) Send the following commands to Sentinels and provide the output here:
SENTINEL masters
SENTINEL slaves mymaster
4) Also send the following command to the master and report the
output: INFO replication
5) Now kill the master Redis server. Make sure you kill it by sending
the "DEBUG SEGFAULT" command to it.
6) Wait 10 seconds or more.
7) Report the output of:
SENTINEL masters
SENTINEL slaves mymaster

Thanks! Just use a single Sentinel to run the commands, no need to
have the output of both sentinels.
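
For example, assuming the default ports used in this thread, the commands can be sent with redis-cli like this:

$ redis-cli -h 192.168.100.187 -p 26379 SENTINEL masters
$ redis-cli -h 192.168.100.187 -p 26379 SENTINEL slaves mymaster
$ redis-cli -h 192.168.100.187 -p 6379 INFO replication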

Regards,
Salvatore

CharSyam

Nov 28, 2013, 3:50:08 AM
to redi...@googlegroups.com
When you start the slave, is it shown in the log at that time?

Sentinel will log when a slave connects to the master.


2013/11/28 ESWAR RAO <eswa...@gmail.com>

ESWAR RAO

Nov 28, 2013, 4:20:09 AM
to redi...@googlegroups.com
Hi Salvatore,

Thanks for the inputs.

Please find the information inline with your questions.

Now from the command outputs I can see the other node has become the master.
But I tried the same steps killing the redis-server with kill <pid> instead of "DEBUG SEGFAULT", and the sentinels didn't even recognize that the node was down.
Please let me know if I am missing anything.

Thanks
Eswar


On Thu, Nov 28, 2013 at 2:19 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
Hello again,

In your logs, what I see is that the Sentinels are able to connect to
the master, as they detect each other.
However, there is no +slave event, so apparently they are not able to
sense that a slave is actually connected.
Also, after you kill the master no +sdown / +odown events are
generated, which is very odd.

Try the following in order to provide some more information.

1) Set up your environment and start the Redis servers and the two Sentinels.
2) Wait 30 seconds or more.
3) Send the following commands to Sentinels and provide the output here:
SENTINEL masters
 
127.0.0.1:26379> SENTINEL masters
1)  1) "name"
    2) "mymaster"
    3) "ip"
    4) "192.168.100.187"
    5) "port"
    6) "6379"
    7) "runid"
    8) "034ab843bdb6055b009eb9423666ee7effd2771d"
    9) "flags"
   10) "master"
   11) "pending-commands"
   12) "4"
   13) "last-ok-ping-reply"
   14) "117"
   15) "last-ping-reply"
   16) "117"
   17) "info-refresh"
   18) "7856"
   19) "role-reported"
   20) "master"
   21) "role-reported-time"
   22) "178028"
   23) "config-epoch"
   24) "0"
   25) "num-slaves"
   26) "1"
   27) "num-other-sentinels"
   28) "1"
   29) "quorum"
   30) "1"

 
SENTINEL slaves mymaster
127.0.0.1:26379> SENTINEL slaves mymaster
1)  1) "name"
    2) "192.168.2.94:6379"
    3) "ip"
    4) "192.168.2.94"
    5) "port"
    6) "6379"
    7) "runid"
    8) "872014e161667d60e5a157edf7f47a072d1342ea"
    9) "flags"
   10) "slave"
   11) "pending-commands"
   12) "0"
   13) "last-ok-ping-reply"
   14) "653"
   15) "last-ping-reply"
   16) "653"
   17) "info-refresh"
   18) "4062"
   19) "role-reported"
   20) "slave"
   21) "role-reported-time"
   22) "214591"
   23) "master-link-down-time"
   24) "0"
   25) "master-link-status"
   26) "ok"
   27) "master-host"
   28) "192.168.100.187"
   29) "master-port"
   30) "6379"
   31) "slave-priority"
   32) "100"
   33) "slave-repl-offset"
   34) "57131"
127.0.0.1:26379>
 
4) Also send the following command to the master and report the
output: INFO replication
 
127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:192.168.2.94,6379,online
127.0.0.1:6379>

 
5) Now kill the master Redis server. Make sure you kill it by sending
the "DEBUG SEGFAULT" command to it.
 127.0.0.1:6379> DEBUG SEGFAULT
Could not connect to Redis at 127.0.0.1:6379: Connection refused
(1.46s)
6) Wait 10 seconds or more.
7) Report the output of:
 
SENTINEL masters
 
127.0.0.1:26379> SENTINEL masters
1)  1) "name"
    2) "mymaster"
    3) "ip"
    4) "192.168.2.94"
    5) "port"
    6) "6379"
    7) "runid"
    8) "872014e161667d60e5a157edf7f47a072d1342ea"
    9) "flags"
   10) "master"
   11) "pending-commands"
   12) "0"
   13) "last-ok-ping-reply"
   14) "493"
   15) "last-ping-reply"
   16) "493"
   17) "info-refresh"
   18) "2700"
   19) "role-reported"
   20) "master"
   21) "role-reported-time"
   22) "570730"
   23) "config-epoch"
   24) "1"
   25) "num-slaves"
   26) "1"
   27) "num-other-sentinels"
   28) "1"
   29) "quorum"
   30) "1"

SENTINEL slaves mymaster
127.0.0.1:26379> SENTINEL slaves mymaster
1)  1) "name"
    2) "192.168.100.187:6379"
    3) "ip"
    4) "192.168.100.187"
    5) "port"
    6) "6379"
    7) "runid"
    8) ""
    9) "flags"
   10) "s_down,slave"
   11) "pending-commands"
   12) "1"
   13) "last-ok-ping-reply"
   14) "70248"
   15) "last-ping-reply"
   16) "70248"
   17) "s-down-time"
   18) "65214"
   19) "info-refresh"
   20) "1385629582996"
   21) "role-reported"
   22) "slave"
   23) "role-reported-time"
   24) "70248"
   25) "master-link-down-time"
   26) "0"
   27) "master-link-status"
   28) "err"
   29) "master-host"
   30) "?"
   31) "master-port"
   32) "0"
   33) "slave-priority"
   34) "100"
   35) "slave-repl-offset"
   36) "0"
127.0.0.1:26379>
 

Salvatore Sanfilippo

Nov 28, 2013, 4:34:55 AM
to Redis DB
On Thu, Nov 28, 2013 at 10:20 AM, ESWAR RAO <eswa...@gmail.com> wrote:
> But I tried the same steps killing the redis-server with kill <pid>
> instead of "DEBUG SEGFAULT", and the sentinels didn't even recognize that
> the node was down.
> Please let me know if I am missing anything.

Hello,

the most obvious reason is that you did not kill it the right way,
like using the wrong pid or something.
If you retry step by step, and after the kill you actually verify that
Redis is down using redis-cli (you should get connection refused),
you'll likely get the same exact effect as with DEBUG SEGFAULT.
The reason why I told you to kill it with DEBUG SEGFAULT is just that
it forces you to kill the instance by connecting to it, so it is a
less error-prone process.
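
For example, a minimal check after the kill (assuming the master was listening on 192.168.100.187:6379 as in your setup):

$ redis-cli -h 192.168.100.187 -p 6379 PING
Could not connect to Redis at 192.168.100.187:6379: Connection refused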

Cheers,
Salvatore

ESWAR RAO

Nov 28, 2013, 4:49:42 AM
to redi...@googlegroups.com
Thanks Salvatore.

I am afraid I did do it correctly, but I will still redo it.

I was killing the redis-server instances using "DEBUG SEGFAULT".

The sequence I followed is below:

100.187(M) - 2.94(S)
DEBUG SEGFAULT on  100.187
2.94(M)
restart 100.187
100.187(S) -2.94(M)
DEBUG SEGFAULT on 2.94
100.187(M)
restart 2.94
100.187(M) - 2.94(S)
DEBUG SEGFAULT on 100.187

After this,
On 2.94 I am continuously observing the logs below:
================================================================
[17818] 28 Nov 15:09:00.292 * Caching the disconnected master state.
[17818] 28 Nov 15:09:00.668 * Connecting to MASTER 192.168.100.187:6379
[17818] 28 Nov 15:09:00.668 * MASTER <-> SLAVE sync started
[17818] 28 Nov 15:09:00.935 # Error condition on socket for SYNC: Connection refused
[17818] 28 Nov 15:09:01.671 * Connecting to MASTER 192.168.100.187:6379
[17818] 28 Nov 15:09:01.671 * MASTER <-> SLAVE sync started
[17818] 28 Nov 15:09:01.940 # Error condition on socket for SYNC: Connection refused
...........................................

2.94 sentinel logs:
====================
[17809] 28 Nov 15:09:04.649 # +sdown master mymaster 192.168.100.187 6379
[17809] 28 Nov 15:09:04.649 # +odown master mymaster 192.168.100.187 6379 #quorum 1/1
[17809] 28 Nov 15:09:04.649 # +new-epoch 7
[17809] 28 Nov 15:09:04.649 # +try-failover master mymaster 192.168.100.187 6379
[17809] 28 Nov 15:09:04.649 # +vote-for-leader 5a06b0dceed5d703dd8806f8045da80bad3242fb 7
[17809] 28 Nov 15:09:14.681 # -failover-abort-not-elected master mymaster 192.168.100.187 6379


100.187 sentinel logs:
======================
[23511] 28 Nov 01:39:04.916 # +sdown master mymaster 192.168.100.187 6379
[23511] 28 Nov 01:39:04.916 # +odown master mymaster 192.168.100.187 6379 #quorum 1/1
[23511] 28 Nov 01:39:04.916 # +new-epoch 7
[23511] 28 Nov 01:39:04.916 # +try-failover master mymaster 192.168.100.187 6379
[23511] 28 Nov 01:39:04.916 # +vote-for-leader 2654abbc4a315a84f0e25ad8f2291308938d6774 7
[23511] 28 Nov 01:39:14.952 # -failover-abort-not-elected master mymaster 192.168.100.187 6379


Thanks
Eswar



Salvatore Sanfilippo

Nov 28, 2013, 5:58:18 AM
to Redis DB
Was this result (a split brain in the election, so nobody gets elected since no
majority is reached) produced by following the steps above?

I'm asking because you need to be sure that the two Sentinels are indeed
connected. To verify this, send:

SENTINEL sentinels mymaster

to both Sentinels, to see how each senses the other.

You should make sure the two Sentinels are able to communicate.
I understand you wrote "1" as the quorum, but if you check the new
Sentinel documentation you'll see that the quorum only configures how many
instances must believe the master is not reachable in order to reach the
"ODOWN" state. The election can't be won (and the failover not
performed) without a true majority.
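
For example, with the usual majority rule of floor(N/2) + 1 out of N Sentinels:

N = 2 Sentinels -> majority is 2: if the master's box dies and takes its Sentinel
                   with it, the survivor has only 1 vote and can never be elected.
N = 3 Sentinels -> majority is 2: the election can still be won with one box down.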

Do you have the ability to write text messages on IRC, Hangouts, Skype,
or the like? We can try interactive debugging.

The best would be you joining #redis on Freenode IRC network. I'm in.

Thanks,
Salvatore



Jacob Herbst

Jan 10, 2014, 4:39:08 PM
to redi...@googlegroups.com
Hoping not to dig up a dead thread here, but I'm encountering a similar issue.

My master and slave seem to be working correctly. I have 2 sentinels running -- 1 on the master node and 1 on the slave node -- but it looks like each sentinel sees the other as "SDOWN" -- http://pastebin.com/Pzc7bgHZ

Any recommendations?
Thanks,
Jake (irc: jh__)

Andy Bowes

Feb 6, 2014, 5:39:55 AM
to redi...@googlegroups.com
I have just started investigating using Sentinel for HA (migrating from TwemProxy) and I am seeing similar issues: the automated failover and election of a new master is not occurring.

I have a cluster of 4 redis server instances (1 master & 3 slaves) with 4 instances of redis sentinel running (one per redis server node). If I shut down the Redis master node, I expect Sentinel to automatically detect this and elect one of the 3 remaining slave nodes as the new master.

Redis Sentinel Configuration (Same for all Sentinels except port, pid & log)
=========================================================
port 26350
sentinel monitor usersessionsmaster 127.0.0.1 6350 2
sentinel down-after-milliseconds usersessionsmaster 5000
sentinel failover-timeout usersessionsmaster 18000
sentinel parallel-syncs usersessionsmaster 1

pidfile /var/run/redis_26350.pid
loglevel verbose
logfile /var/log/redis_26350.log


When I shut down the master node, I see the following details in the sentinel logs:
[25654] 06 Feb 10:30:42.234 # +sdown master usersessionsmaster 127.0.0.1 6350
[25654] 06 Feb 10:30:42.445 # +odown master usersessionsmaster 127.0.0.1 6350 #quorum 2/2

These lines seem to indicate that the failure of the Master Node has been detected but there is no subsequent automated failover to a new master node.

If I use redis-cli to attach to one of the Sentinel servers I get the following response to 'sentinel slaves usersessionsmaster' request:

3)  1) "name"
    2) "127.0.0.1:6351"
    3) "ip"
    4) "127.0.0.1"
    5) "port"
    6) "6351"
    7) "runid"
    8) "b85b9f4049c3ffbf20d91e013a59b8c0e333dea1"
    9) "flags"
   10) "slave"
   11) "pending-commands"
   12) "0"
   13) "last-ok-ping-reply"
   14) "907"
   15) "last-ping-reply"
   16) "907"
   17) "info-refresh"
   18) "804"
   19) "master-link-down-time"
   20) "134000"
   21) "master-link-status"
   22) "err"
   23) "master-host"
   24) "localhost"
   25) "master-port"
   26) "6350"
   27) "slave-priority"
   28) "100"

I can force a failover to a new master by issuing the 'sentinel failover usersessionsmaster' command but this should be triggered automatically.

Is there some configuration property that I am missing as I just can't get the automated failover to work?

Thanks 

Salvatore Sanfilippo

Feb 6, 2014, 5:52:33 AM
to Redis DB
On Fri, Jan 10, 2014 at 10:39 PM, Jacob Herbst <jmhe...@gmail.com> wrote:
> My master and slave seem to be working correctly. I have 2 sentinels
> running -- 1 on the master node and 1 on the slave node -- but it looks like
> each sentinel sees the other as "SDOWN" -- http://pastebin.com/Pzc7bgHZ


Hello, just open port 26379, since Sentinel runs on this separate port.
This is starting to be like a FAQ, so I'll add it to the doc ASAP.
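
For example, on a Linux box managed with iptables (just a sketch; adapt to whatever firewall you actually use):

$ sudo iptables -A INPUT -p tcp --dport 26379 -j ACCEPT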

Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

To "attack a straw man" is to create the illusion of having refuted a
proposition by replacing it with a superficially similar yet
unequivalent proposition (the "straw man"), and to refute it
-- Wikipedia (Straw man page)

Andy Bowes

Feb 6, 2014, 10:02:26 AM
to redi...@googlegroups.com
I identified that I was using an out-of-date version of Redis.

I have retested using 2.8.5 and it all seems to be working as expected.

regards
Andy

Salvatore Sanfilippo

Feb 6, 2014, 10:10:41 AM
to Redis DB
Thanks for the follow up Andy.

Salvatore
To "attack a straw man" is to create the illusion of having refuted a
proposition by replacing it with a superficially similar yet
unequivalent proposition (the "straw man"), and to refute it
— Wikipedia (Straw man page)

Wils Gomes

Feb 7, 2014, 9:45:08 AM
to redi...@googlegroups.com
Hi Folks,

I was thinking of using Sentinel as well, but I am worried about this note from the documentation:
WARNING: Redis Sentinel is currently a work in progress.

I am wondering whether you are using it in production and how it has been so far.

Thank you
Wils

Salvatore Sanfilippo

Feb 7, 2014, 9:49:34 AM
to Redis DB
Hello Wils, Redis Sentinel has become a lot more stable recently and
I'm receiving very positive reports about real-world setups.
I believe it is your best bet for Redis HA and you should use it.
Just as a precaution, take some time to test it with your setup,
simulating the failure modes you are most worried about.

There is for sure room to improve the Sentinel documentation, however;
there are many details not covered very well, like example
deployments, failure modes, and how you can improve consistency under
partitions using the new options to refuse writes when replicas
are not acking back, and so forth. I'll do this ASAP... hopefully.
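
(For reference: the options referred to here are presumably min-slaves-to-write and min-slaves-max-lag, introduced in Redis 2.8. A minimal redis.conf sketch for the master:

min-slaves-to-write 1
min-slaves-max-lag 10

With these settings the master stops accepting writes when fewer than 1 slave has acked within the last 10 seconds.)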

Salvatore

Leo Volinier

Feb 28, 2014, 11:11:55 AM
to redi...@googlegroups.com
Hello Salvatore!

Regarding:
On Fri, Feb 7, 2014 at 11:49:34 UTC-3, Salvatore Sanfilippo wrote:
Hello Wils, Redis Sentinel has become a lot more stable recently and
I'm receiving very positive reports about real-world setups.

Are you talking about the latest unstable changes or about 2.8.6?

In my case, I'm still having issues with the failover actions. In general, in network-failure scenarios, when the sentinel leader fails, the other sentinels don't seem to vote for a new leader. No action is taken when the master goes out.

I'm still trying to recover information from the incidents I had. The latest three were using version 2.8.6.


I will open a new discussion thread when I reach some conclusions. 

Cheers,
Leo

Salvatore Sanfilippo

Feb 28, 2014, 11:53:19 AM
to Redis DB
Hi Leo,

yes, the implementation in 2.8.6 has passed a number of tests so far
without big issues. Sorry to hear you had problems.
Btw, in Sentinel there is no concept of a permanent leader. Every Sentinel can
fail over; a Sentinel is only the leader for the time needed to get a new
configuration epoch and start a failover. There is no need for a
change of role for another Sentinel to try the failover again if the
first one failed, just a timeout that a Sentinel will try to respect
before starting a new attempt for a master that already received a try.

Could you please describe your setup? Keep in mind that Sentinel can't
fail over without majority agreement EVEN if you set the "quorum" to
less than a majority.
The quorum is only used to reach the ODOWN state, which is just a trigger
for the actual failover.

Salvatore



--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

To "attack a straw man" is to create the illusion of having refuted a
proposition by replacing it with a superficially similar yet
unequivalent proposition (the "straw man"), and to refute it

Salvatore Sanfilippo

Feb 28, 2014, 11:55:12 AM
to Redis DB
p.s. I forgot to mention that there is now a unit test for Sentinel
(in unstable and the latest 2.8 branch commit, not yet released).
I'm adding more tests and I'll start adding randomized tests in the
next few days, so I hope this will help discover new issues and provide a
solid base for modifications without huge regressions.
The same framework will be used for Cluster.

Salvatore

Leo Volinier

Feb 28, 2014, 2:13:13 PM
to redi...@googlegroups.com
Hi Salvatore!

> Btw in Sentinel there is no concept of leader. Every Sentinel can
> failover, a Sentinel is only leader for the time needed to get a new
> configuration epoch and start a failover. There is no need for a
> change of role for another Sentinel to try to failover again if the
> first failed, but just a timeout the Sentinel will try to respect
> before starting a new one for a master that already received a try.

OK. But the documentation and some commands still describe a sentinel leader.
I know that the documentation may be a little behind the current release, but,
for instance, the command "SENTINEL sentinels" still shows a "voted-leader"
param.


> Please could you describe your setup? Take in mind that Sentinel can't
> failover without majority agreement EVEN if you set the "quorum" to
> less than majority.
> The quorum is only used to reach ODOWN state, that is just a trigger
> for the actual failover.

Running in a virtualized OpenStack environment, we currently have 3 boxes,
with 1 server and 1 sentinel each. Both slaves are attached to the master. Sentinels are
configured with quorum=2.


When the box with the master fails (network-wise: the processes are still working
on each box, but the boxes cannot reach each other), the remaining 2 sentinels
just see the sdown state, but don't vote for a new master. Also, both slaves detect
the "master link down" state. But nothing reacts. It just stays the same way.


But, as I said, I'm still looking for the exact reason for the failure. I'll let you know if there
is something related to Redis.


Thanks a lot!

Cheers,
L.


Salvatore Sanfilippo

Mar 1, 2014, 3:39:53 AM
to Redis DB
On Fri, Feb 28, 2014 at 8:13 PM, Leo Volinier
<leonardo...@gmail.com> wrote:

> OK. But the documentation and some commands still describe a sentinel leader.
> I know that the documentation may be a little behind the current release, but,
> for instance, the command "SENTINEL sentinels" still shows a "voted-leader"
> param.

Because the leader actually exists, what I wanted to say (and what
should be explained better) is that the leader is not a "stable" role.
The current Sentinel documentation is suboptimal, and after finishing
with the unit test, this is my main concern in the next weeks.
So what is the leader?

The Sentinel algorithm is leader-less in the sense that when a master
is down, any of the Sentinels monitoring it can start a failover.
Every Sentinel will try to fail over the master after a random delay,
hoping to avoid a split-brain condition (which would of course cause the
failover to be retried).
So when the first Sentinel wakes up, it tries to get voted to
reconfigure the master and perform the failover.

If it is able to get the majority of votes, it is the "failover
leader", and in the course of the voting process it receives a configEpoch
that is guaranteed to be unique.
Now it can start to broadcast a new updated configuration for this
master, but it will do this only after being sure that the selected
slave was able to accept the SLAVEOF NO ONE command and is actually
acting as a master, reporting the new role in INFO.

However, the Sentinel acting as "failover leader" can fail at any time,
so all the other Sentinels will try to fail over the master if it is still
down. There are a couple of aspects to consider about this:
1) When a Sentinel votes for another Sentinel for the failover of a
given master, it extends the courtesy of not trying to get itself
voted to perform the failover of the same master for a couple of
seconds. There is no point in all the Sentinels trying at the same
time to fail over the same master.
2) If the Sentinel that was elected leader succeeded, it
will broadcast the new config, so all the other Sentinels will point
to the new master and will no longer detect the master as down. End of
the story until the next failure.
3) If a Sentinel gets voted but fails in the middle, for example
AFTER it already sent SLAVEOF NO ONE but BEFORE it was able to
broadcast the new config, the other
Sentinels will see a config mismatch for some time; since there is no
change in the configuration, they'll force the instance to return
to its slave role. After a couple of seconds another Sentinel will try
to perform the failover.

So what we call the "failover leader" here is actually just a Sentinel
authorized to proceed with the failover.
In a system with a real leader, things work in a different way: there is
always the leader to act, and when the leader fails it
gets replaced by some other node.
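
As a rough sketch, a successful failover as described above shows up in the winning Sentinel's log as a sequence of events along these lines (the event names are the ones emitted by 2.8 Sentinel; the addresses and epoch are placeholders):

+sdown master mymaster <master-ip> 6379
+odown master mymaster <master-ip> 6379 #quorum 2/2
+new-epoch 1
+try-failover master mymaster <master-ip> 6379
+vote-for-leader <runid> 1
+elected-leader master mymaster <master-ip> 6379
+failover-state-select-slave master mymaster <master-ip> 6379
+selected-slave slave <slave-ip>:6379 ...
+failover-state-send-slaveof-noone slave <slave-ip>:6379 ...
+failover-state-reconf-slaves master mymaster <master-ip> 6379
+failover-end master mymaster <master-ip> 6379
+switch-master mymaster <master-ip> 6379 <slave-ip> 6379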

> Running in a virtualized OpenStack environment, we currently have 3 boxes,
> with 1 server and 1 sentinel each. Both slaves are attached to the master.
> Sentinels are configured with quorum=2.
>
> When the box with the master fails (network-wise: the processes are still
> working on each box, but the boxes cannot reach each other), the remaining 2
> sentinels just see the sdown state, but don't vote for a new master. Also,
> both slaves detect the "master link down" state. But nothing reacts. It just
> stays the same way.

For a Sentinel to try to get voted, ODOWN must be reached; otherwise
the failover can't be triggered.
If it is not reached, usually it is a problem of the Sentinels not
being able to communicate because port 26379 is closed.
But it could be a bug as well; the system gets more mature every week,
but it is not immune to bugs at all...

Salvatore

Wils Gomes

Mar 24, 2014, 5:12:28 PM
to redi...@googlegroups.com
Hi folks,

I am trying to set up Redis Sentinel on 2 VMs; this is my setup:

- 10.1.8.36 mymaster
- 10.1.8.46 resque (redis.conf = slaveof 10.1.8.36) 

- on both machines, /etc/redis-sentinel.conf:
port 26379

sentinel monitor mymaster 10.1.8.36 6379 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1


- This is my log:
[3447] 24 Mar 16:26:55.561 # +new-epoch 6
[3447] 24 Mar 16:26:55.561 # +try-failover master mymaster 10.1.8.36 6379
[3447] 24 Mar 16:26:55.561 # +vote-for-leader 0ff5a5f69e7b420a666ca6f10544152e70f4676d 6
[3447] 24 Mar 16:26:55.561 # +elected-leader master mymaster 10.1.8.36 6379
[3447] 24 Mar 16:26:55.561 # +failover-state-select-slave master mymaster 10.1.8.36 6379
[3447] 24 Mar 16:26:55.638 # -failover-abort-no-good-slave master mymaster 10.1.8.36 6379

- However, I keep getting the following error:
[3447] 24 Mar 16:26:55.638 # -failover-abort-no-good-slave master mymaster 10.1.8.36 6379

- I am not sure why it keeps saying: -failover-abort-no-good-slave master mymaster 10.1.8.36 6379

=P

Any thoughts?
tkx



Andy Bowes

Mar 24, 2014, 5:28:35 PM
to redi...@googlegroups.com
Hi Wils,

I think the problem is that you only have 2 nodes, which means that if 1 fails there is no way to get sufficient votes to elect a new master node. This is deliberate: if there is a network partition and the 2 nodes stop communicating, there is no way of determining whether the other node has failed or there has simply been a network error.

HTH
Andy



Wils Gomes

Mar 24, 2014, 6:13:58 PM
to redi...@googlegroups.com

Hi Andy,

I see your point; however, this is what I'm trying to achieve. It can actually be done in application code by sending "SLAVEOF NO ONE" to the slave in case the master fails, but I thought sentinel would do this too...

2 machines: one is the master, the other is the slave.

The master fails: then the slave becomes master.

That's it :)

Andy Bowes

Mar 24, 2014, 6:22:49 PM
to redi...@googlegroups.com
Sentinel will only allow you to do this automatically if you have a larger cluster of nodes. We are using 4 nodes and need 3 instances to vote on a new master. In general, if you have X nodes you will need a minimum of (X/2) + 1 votes to elect a new master safely.

Any way that you could spin up more VMs or possibly have multiple Redis Server & Sentinel instances on each node?
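
For example, a second Sentinel instance on an existing node only needs its own port, pidfile, and logfile (a sketch reusing the master address from the config above; keep in mind that Sentinels co-located on one box still fail together with that box):

port 26380
pidfile /var/run/redis-sentinel-26380.pid
logfile /var/log/redis-sentinel-26380.log
sentinel monitor mymaster 10.1.8.36 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1

# started with: redis-server /etc/redis-sentinel-26380.conf --sentinel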

Jan-Erik Rediger

Mar 24, 2014, 6:42:40 PM
to redi...@googlegroups.com
In my understanding, with a quorum of 1 (as he set it), this should work
just fine with 1 master, 1 slave, and 2 sentinels. It worked in a local
test on my laptop.

Srinivas Kotamarti

Apr 1, 2014, 9:19:08 AM
to redi...@googlegroups.com
I have a similar need to set up Redis on two nodes as a master and slave with two instances of redis-sentinel. Can HA be supported in this type of scenario, or do we need at least 3 nodes? If this cannot be achieved with redis-sentinel, are there any alternatives to achieve HA with just two nodes (an active/standby setup)?

--Srinivas

Olivier Jeannet

Apr 23, 2014, 11:51:45 AM
to redi...@googlegroups.com
Hi Salvatore!

Thanks for your explanations of the sentinel algorithm.

I have problems with sentinel and master failover. I am testing sentinel with 3 VMs (1 master and 2 slaves), each with 1 redis server and 1 sentinel, on standard ports.
The configuration of the sentinels is as follows:

 ===
 sentinel monitor redis-v01 172.19.247.1 6379 2
 sentinel down-after-milliseconds redis-v01 5000
 sentinel failover-timeout redis-v01 15000
 sentinel parallel-syncs redis-v01 2

 sentinel monitor redis-v02 172.19.247.2 6379 2
 [ and the same 3 lines ]

 sentinel monitor redis-v03 172.19.247.3 6379 2
 [ and the same 3 lines ]
 ===


On the VM with the current master, I simulate unavailability with an "ifdown eth2", in order to have one of the 2 slaves promoted by the remaining 2 sentinels.
It works correctly half of the time: after around 5 seconds I get a new master and the configuration is stable.

But the rest of the time, I see the following cases happening:
1) I have 2 masters for 10 to 20 seconds, then one of the masters becomes a slave again
2) One of the slaves is promoted after 5 seconds, but 20 seconds later the new master becomes a slave again, and the slave becomes master
3) I have 2 slaves for 30 seconds, each trying to synchronise with the other, then one of them gets promoted to master

I don't know how this is possible. The VMs are pre-production servers running only redis.
Has anyone experienced that?

I also see this in the redis logs, sometimes even 4 "CONFIG REWRITE" in a row, and in the sentinel logs I can see that sentinel asked several times for "SLAVEOF NO ONE":

 [28682] 23 Apr 17:07:18.488 * MASTER MODE enabled (user request)
 [28682] 23 Apr 17:07:18.489 # CONFIG REWRITE executed with success.
 [28682] 23 Apr 17:07:18.969 # CONFIG REWRITE executed with success.

As for case 3 mentioned above (where after a slave gets promoted, it is quickly demoted and there are 2 slaves), here are the logs of the 2 servers:

 === server 1 logs ===

[28682] 23 Apr 17:07:18.488 # Connection with master lost.
[28682] 23 Apr 17:07:18.488 * Caching the disconnected master state.
[28682] 23 Apr 17:07:18.488 * Discarding previously cached master state.
[28682] 23 Apr 17:07:18.488 * MASTER MODE enabled (user request)
[28682] 23 Apr 17:07:18.489 # CONFIG REWRITE executed with success.
[28682] 23 Apr 17:07:18.969 # CONFIG REWRITE executed with success.
[28682] 23 Apr 17:07:19.473 * Slave asks for synchronization
[28682] 23 Apr 17:07:19.473 * Full resync requested by slave.
[28682] 23 Apr 17:07:19.473 * Starting BGSAVE for SYNC
[28682] 23 Apr 17:07:19.475 * Background saving started by pid 56814
[56814] 23 Apr 17:07:19.491 * DB saved on disk
[56814] 23 Apr 17:07:19.492 * RDB: 4 MB of memory used by copy-on-write
[28682] 23 Apr 17:07:19.520 # Connection with slave 172.19.247.2:6379 lost.
[28682] 23 Apr 17:07:19.570 * Background saving terminated with success
[28682] 23 Apr 17:07:19.705 * SLAVE OF 172.19.247.2:6379 enabled (user request)
[28682] 23 Apr 17:07:19.706 # CONFIG REWRITE executed with success.
[28682] 23 Apr 17:07:19.971 * Connecting to MASTER 172.19.247.2:6379
[28682] 23 Apr 17:07:19.971 * MASTER <-> SLAVE sync started
[28682] 23 Apr 17:07:19.971 * Non blocking connect for SYNC fired the event.
[28682] 23 Apr 17:07:19.971 * Master replied to PING, replication can continue...
[28682] 23 Apr 17:07:19.972 * Partial resynchronization not possible (no cached master)
[28682] 23 Apr 17:07:19.972 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
[28682] 23 Apr 17:07:19.972 * Retrying with SYNC...
[28682] 23 Apr 17:07:19.974 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
 [ I then get a loop of "connecting to MASTER" every second for 30 seconds ]

 === END server 1 logs ===


 === server 2 logs ===

[51316] 23 Apr 17:07:19.355 # Connection with master lost.
[51316] 23 Apr 17:07:19.356 * Caching the disconnected master state.
[51316] 23 Apr 17:07:19.356 * Discarding previously cached master state.
[51316] 23 Apr 17:07:19.356 * SLAVE OF 172.19.247.1:6379 enabled (user request)
[51316] 23 Apr 17:07:19.357 # CONFIG REWRITE executed with success.
[51316] 23 Apr 17:07:19.470 * Connecting to MASTER 172.19.247.1:6379
[51316] 23 Apr 17:07:19.470 * MASTER <-> SLAVE sync started
[51316] 23 Apr 17:07:19.471 * Non blocking connect for SYNC fired the event.
[51316] 23 Apr 17:07:19.471 * Master replied to PING, replication can continue...
[51316] 23 Apr 17:07:19.472 * Partial resynchronization not possible (no cached master)
[51316] 23 Apr 17:07:19.473 * Full resync from master: 0b6dd95b536e670e5c211a20b73a8a076b145b55:3894193
[51316] 23 Apr 17:07:19.520 * MASTER MODE enabled (user request)
[51316] 23 Apr 17:07:19.521 # CONFIG REWRITE executed with success.
[51316] 23 Apr 17:07:19.530 * SLAVE OF 172.19.247.1:6379 enabled (user request)
[51316] 23 Apr 17:07:19.531 # CONFIG REWRITE executed with success.
[51316] 23 Apr 17:07:20.472 * Connecting to MASTER 172.19.247.1:6379
[51316] 23 Apr 17:07:20.472 * MASTER <-> SLAVE sync started
[51316] 23 Apr 17:07:20.473 * Non blocking connect for SYNC fired the event.
[51316] 23 Apr 17:07:20.473 * Master replied to PING, replication can continue...
[51316] 23 Apr 17:07:20.473 * Partial resynchronization not possible (no cached master)
[51316] 23 Apr 17:07:20.474 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
[51316] 23 Apr 17:07:20.474 * Retrying with SYNC...
[51316] 23 Apr 17:07:20.476 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
 [ I then get a loop of "connecting to MASTER" every second for 30 seconds ]

 === END server 2 logs ===


Any help would be appreciated.

Best regards,
Olivier.
[...]

barroudjo

Apr 30, 2014, 12:00:03 PM
to redi...@googlegroups.com
I have a similar problem:
I have three nodes which end up configured similarly after playing around a bit (disconnecting the master a few times), in a configuration that makes it impossible to get a new master!

logfile "/var/log/redis/sentinel.log"
sentinel monitor mymaster 127.0.0.1 6379 1
sentinel down-after-milliseconds mymaster 10000
sentinel failover-timeout mymaster 60000
sentinel config-epoch mymaster 85
# Generated by CONFIG REWRITE
port 26379
dir "/"
sentinel leader-epoch mymaster 102
sentinel known-slave mymaster 172.31.5.239 6379
sentinel known-slave mymaster 172.31.27.41 6379
sentinel known-slave mymaster 172.31.34.37 6379
sentinel known-sentinel mymaster 172.31.5.239 26379 7780d81949530d62aa6898eae2fce2c389a3ed73
sentinel known-sentinel mymaster 172.31.34.37 26379 43b3cf45b6297040e06c461b55e3dd85200a3fb7
sentinel current-epoch 104

That would seem to be a major issue...

barroudjo

Apr 30, 2014, 1:12:09 PM
to redi...@googlegroups.com
I might have found the issue in my case: sentinel makes the redis server a slave of itself: 

127.0.0.1:6379> config get slaveof
1) "slaveof"
2) "127.0.0.1 6379"
(3.72s)

Olivier Jeannet

May 7, 2014, 11:11:05 AM
to redi...@googlegroups.com
Hi all,

I found the problem: I did not correctly understand the way sentinel is supposed to be configured.

So for 3 redis instances (1 master + 2 slaves), instead of every sentinel watching all instances and defining 3 "mymaster" names, the configuration is the same for all 3 sentinels, with one "mymaster" name (like the name of the "cluster" made of the 3 instances) and the initial master redis IP, like this:

 sentinel monitor myRedisCluster 172.19.247.1 6379 2
 sentinel down-after-milliseconds myRedisCluster 5000
 sentinel failover-timeout myRedisCluster 15000


And the failover works perfectly well: 5 seconds and a few tenths after the master shutdown, I have a new master elected and the remaining slave synchronised with the new master.

Best regards.
Olivier.




Salvatore Sanfilippo

May 7, 2014, 11:16:07 AM
to Redis DB
Good that it works, and sorry for reading the thread only now that the
problem is fixed...



--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

To "attack a straw man" is to create the illusion of having refuted a
proposition by replacing it with a superficially similar yet
unequivalent proposition (the "straw man"), and to refute it
— Wikipedia (Straw man page)

barroudjo .

May 7, 2014, 11:29:51 AM
to redi...@googlegroups.com
Hello all,

Congrats on solving your problem! Actually, one of us should have been able to spot it if we'd been more careful.

I also have an issue with sentinel. I use the same configuration: 3 servers, each with redis and sentinel running, and all sentinels set up with the same sentinel.conf.
The redis instances are also properly set up, one master and two slaves, the master being the one defined in sentinel.conf.
It starts fine. But when I kill the master (I stop both sentinel and redis), the slaves aren't able to do a failover. It seems the sentinel instances can't communicate with each other.
I haven't seen anyone having the same problem, and what's more, while the master is up the sentinels are able to communicate...
Help would be greatly appreciated!

Here is the sentinel configuration (same for all):

logfile "/var/log/redis/sentinel.log"
sentinel monitor mymaster 172.31.34.37 6379 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 20000
sentinel parallel-syncs mymaster 1

Here is the log for sentinel, slave 1:

[3695] 01 May 12:45:23.551 # Sentinel runid is 38278ae
[3695] 01 May 12:45:23.551 # +monitor master mymaster 172.31.34.37 6379 quorum 1
[3695] 01 May 12:45:23.995 * +sentinel sentinel 172.31.34.37:26379 172.31.34.37 26379 @ mymaster 172.31.34.37 6379
[3695] 01 May 12:45:29.020 # +sdown sentinel 172.31.34.37:26379 172.31.34.37 26379 @ mymaster 172.31.34.37 6379
[3695] 01 May 12:45:31.844 * +sentinel sentinel 172.31.27.41:26379 172.31.27.41 26379 @ mymaster 172.31.34.37 6379
[3695] 01 May 12:45:33.569 * +slave slave 172.31.5.239:6379 172.31.5.239 6379 @ mymaster 172.31.34.37 6379
[3695] 01 May 12:45:33.569 * +slave slave 172.31.27.41:6379 172.31.27.41 6379 @ mymaster 172.31.34.37 6379
[3695] 01 May 12:45:36.851 # +sdown sentinel 172.31.27.41:26379 172.31.27.41 26379 @ mymaster 172.31.34.37 6379
[3695] 01 May 12:47:48.882 # +sdown master mymaster 172.31.34.37 6379
[3695] 01 May 12:47:48.882 # +odown master mymaster 172.31.34.37 6379 #quorum 1/1
[3695] 01 May 12:47:48.882 # +new-epoch 1
[3695] 01 May 12:47:48.882 # +try-failover master mymaster 172.31.34.37 6379
[3695] 01 May 12:47:48.908 # +vote-for-leader 38278ae 1
[3695] 01 May 12:47:59.313 # -failover-abort-not-elected master mymaster 172.31.34.37 6379
[3695] 01 May 12:48:29.304 # +new-epoch 2
[3695] 01 May 12:48:29.304 # +try-failover master mymaster 172.31.34.37 6379
[3695] 01 May 12:48:29.311 # +vote-for-leader 38278ae 2
[3695] 01 May 12:48:40.126 # -failover-abort-not-elected master mymaster 172.31.34.37 6379
[3695] 01 May 12:49:10.133 # +new-epoch 3
[3695] 01 May 12:49:10.133 # +try-failover master mymaster 172.31.34.37 6379
[3695] 01 May 12:49:10.148 # +vote-for-leader 38278ae 3
[3695] 01 May 12:49:20.240 # -failover-abort-not-elected master mymaster 172.31.34.37 6379
[3695] 01 May 12:49:50.270 # +new-epoch 4
[3695] 01 May 12:49:50.271 # +try-failover master mymaster 172.31.34.37 6379
[3695] 01 May 12:49:50.277 # +vote-for-leader 38278ae 4
[3695] 01 May 12:49:52.447 # +new-epoch 5
[3695] 01 May 12:50:00.756 # -failover-abort-not-elected master mymaster 172.31.34.37 6379
[3695] 01 May 12:50:30.282 # +new-epoch 6
[3695] 01 May 12:50:30.685 # +new-epoch 7
[3695] 01 May 12:50:30.685 # +try-failover master mymaster 172.31.34.37 6379
[3695] 01 May 12:50:30.695 # +vote-for-leader 38278ae 7
[3695] 01 May 12:50:41.531 # -failover-abort-not-elected master mymaster 172.31.34.37 6379

Here is the log for sentinel, slave 2:

[3101] 01 May 12:45:31.572 # Sentinel runid is 180dd82
[3101] 01 May 12:45:31.572 # +monitor master mymaster 172.31.34.37 6379 quorum 1
[3101] 01 May 12:45:31.575 * +slave slave 172.31.5.239:6379 172.31.5.239 6379 @ mymaster 172.31.34.37 6379
[3101] 01 May 12:45:31.601 * +sentinel sentinel 172.31.5.239:26379 172.31.5.239 26379 @ mymaster 172.31.34.37 6379
[3101] 01 May 12:45:31.912 * +sentinel sentinel 172.31.34.37:26379 172.31.34.37 26379 @ mymaster 172.31.34.37 6379
[3101] 01 May 12:45:36.604 # +sdown sentinel 172.31.5.239:26379 172.31.5.239 26379 @ mymaster 172.31.34.37 6379
[3101] 01 May 12:45:36.982 # +sdown sentinel 172.31.34.37:26379 172.31.34.37 26379 @ mymaster 172.31.34.37 6379
[3101] 01 May 12:45:41.621 * +slave slave 172.31.27.41:6379 172.31.27.41 6379 @ mymaster 172.31.34.37 6379
[3101] 01 May 12:47:50.601 # +sdown master mymaster 172.31.34.37 6379
[3101] 01 May 12:47:50.601 # +odown master mymaster 172.31.34.37 6379 #quorum 1/1
[3101] 01 May 12:47:50.601 # +new-epoch 1
[3101] 01 May 12:47:50.601 # +try-failover master mymaster 172.31.34.37 6379
[3101] 01 May 12:47:50.612 # +vote-for-leader 180dd82 1
[3101] 01 May 12:48:01.113 # -failover-abort-not-elected master mymaster 172.31.34.37 6379
[3101] 01 May 12:48:31.100 # +new-epoch 2
[3101] 01 May 12:48:31.100 # +try-failover master mymaster 172.31.34.37 6379
[3101] 01 May 12:48:31.120 # +vote-for-leader 180dd82 2
[3101] 01 May 12:48:41.405 # -failover-abort-not-elected master mymaster 172.31.34.37 6379
[3101] 01 May 12:49:11.395 # +new-epoch 3
[3101] 01 May 12:49:11.396 # +try-failover master mymaster 172.31.34.37 6379
[3101] 01 May 12:49:11.412 # +vote-for-leader 180dd82 3
[3101] 01 May 12:49:21.551 # -failover-abort-not-elected master mymaster 172.31.34.37 6379
[3101] 01 May 12:49:51.545 # +new-epoch 4
[3101] 01 May 12:49:51.546 # +try-failover master mymaster 172.31.34.37 6379
[3101] 01 May 12:49:51.560 # +vote-for-leader 180dd82 4
[3101] 01 May 12:49:54.233 # +new-epoch 5
[3101] 01 May 12:50:01.751 # -failover-abort-not-elected master mymaster 172.31.34.37 6379
[3101] 01 May 12:50:31.757 # +new-epoch 6
[3101] 01 May 12:50:31.757 # +try-failover master mymaster 172.31.34.37 6379
[3101] 01 May 12:50:31.772 # +vote-for-leader 180dd82 6
[3101] 01 May 12:50:33.961 # +new-epoch 7

Salvatore Sanfilippo

May 7, 2014, 11:31:54 AM
to Redis DB
Hello, make sure TCP port 26379 is open on all your servers.
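
For example, a quick connectivity check from each box against the others (using one of the sentinel addresses from the logs above):

$ redis-cli -h 172.31.34.37 -p 26379 PING
PONG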

barroudjo .

May 7, 2014, 2:02:29 PM
to redi...@googlegroups.com
Thanks a lot! That did the trick.
You might want to add that to the sentinel documentation.

Regards,

Jonathan

Salvatore Sanfilippo

May 8, 2014, 9:31:19 AM
to Redis DB
Good call, added to Sentinel doc.

Salvatore