Sentinel 2.8.4: NOGOODSLAVE in manual failover


Karl

Jan 29, 2014, 5:46:48 AM
to redi...@googlegroups.com
Dear redis community, 

I am currently trying to gain a deeper understanding of Sentinel 2.8.4, mainly of the conditions that produce the NOGOODSLAVE message when attempting a failover.

The following scenario: 
Let's say all my sentinel servers report SDOWN for a master that has one replicated slave. (ODOWN won't be reached because I deliberately set a higher quorum; I do not want Sentinel to perform an immediate failover at the moment.) There is no real traffic on either machine. After waiting one to three minutes, I execute the following on one of the sentinels: sentinel failover <mastername>.
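The manual step in this scenario can be sketched in Python. This is only an illustration under assumptions: the helper name `manual_failover` is hypothetical, and the client is assumed to be any object exposing `execute_command`, as redis-py's `Redis` class does when connected to a Sentinel port. It sends `SENTINEL FAILOVER` and reports a NOGOODSLAVE reply separately from other errors.

```python
def manual_failover(sentinel_client, master_name):
    """Ask one Sentinel to fail over `master_name`.

    Returns "started" if the command was accepted, or "nogoodslave" if
    Sentinel refused because it found no promotable slave.
    Any other error is re-raised unchanged.
    """
    try:
        sentinel_client.execute_command("SENTINEL", "FAILOVER", master_name)
    except Exception as exc:
        # Assumption: the client raises an error whose text carries the
        # server reply, e.g. "NOGOODSLAVE No suitable slave to promote."
        if "NOGOODSLAVE" in str(exc):
            return "nogoodslave"
        raise
    return "started"
```

With redis-py installed, the client could be created with something like `redis.Redis(host="127.0.0.1", port=26379)`; the host and port here are placeholders, not values from this thread.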

When I do this in sentinel 2.6 it works fine and as expected (well, at least by me):  
The one and only existing slave is promoted to master no matter what (there is no alternative anyway), and the sentinel servers start to communicate with each other. The failover succeeds. Each sentinel knows about the new master. I have a running master and an updated sentinel config on all servers.

In Sentinel 2.8 I cannot do the manual failover with the same settings described above (waiting a few minutes, …). Instead I receive: (error) NOGOODSLAVE No suitable slave to promote.

So I am wondering about two things: 

* Why has this slave become bad? (maybe a question too unspecific …)
* How could I tell sentinel not to judge in this case (with one single slave)? I do not have a better slave anyway :-) 
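One way to investigate the first question is to look at the output of SENTINEL slaves <mastername> and compare it against the slave-selection rules described in the Redis Sentinel documentation (slaves flagged as down or disconnected, slaves with priority 0, and slaves whose master link has been down for more than ten times down-after-milliseconds plus the time the master has been in SDOWN are skipped). The sketch below is a hedged diagnostic, not Sentinel's actual code: the helper name `skip_reasons` is hypothetical, the default priority of 100 is an assumption for missing fields, and whether 2.8.4 applies exactly these rules is also an assumption.

```python
def skip_reasons(slave, down_after_ms, master_sdown_ms=0):
    """Return reasons why Sentinel might skip `slave` during promotion.

    `slave` is one entry of SENTINEL slaves <mastername>, given as a
    dict of field name -> string value.
    """
    reasons = []
    flags = set(slave.get("flags", "").split(","))
    for bad in ("s_down", "o_down", "disconnected"):
        if bad in flags:
            reasons.append("flag " + bad)
    # Assumption: a missing slave-priority field means the default of 100.
    if int(slave.get("slave-priority", 100)) == 0:
        reasons.append("slave-priority is 0")
    # Documented limit: 10 * down-after-milliseconds plus the time the
    # master has already spent in SDOWN (pass 0 if unknown).
    limit = 10 * down_after_ms + master_sdown_ms
    if int(slave.get("master-link-down-time", 0)) > limit:
        reasons.append("master link down longer than %d ms" % limit)
    return reasons
```

An empty list means none of these particular checks explains the rejection; Sentinel applies further checks that this sketch does not cover.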

And somewhat connected to the story above: can I still tell the Sentinel servers not to fail over even if the judgement is ODOWN? In 2.6 there was this option: "can-failover no" in the config file ...

Any hints would be very much appreciated! Thank you very much!

Best regards, 
Karl

Ting Lei

Apr 4, 2014, 4:52:32 AM
to redi...@googlegroups.com
I'm experiencing the same issue, on sentinel 2.8.5.

The reason I wanted to try this is that I'd like sentinel failover to be controlled by a human rather than triggered automatically.

I also set the quorum of the master to a very high value in sentinel, shut down the master, waited a few minutes and ran 'sentinel failover old-master'. The sentinel complains '(error) NOGOODSLAVE No suitable slave to promote.'

If I shut down the master and run the manual failover within just a few seconds, the failover process seems to be ok.

Is this intended behavior? Are there guidelines for using sentinel for manual failover only?

Best regards,
Lei Ting

Daniel Mezzatto

Apr 4, 2014, 1:56:43 PM
to redi...@googlegroups.com
I had a similar problem with my setup. I have 3 redis-server instances running (one master, 2 slaves). I issued the "sentinel failover <mastername>" command to one of the slaves. It was almost immediately changed to a master as expected. But after a while I ended up with 3 slaves! When I tried the sentinel failover command again, I received the NOGOODSLAVE error.

I had to send a "slaveof no one" command to one of the instances and wait for the sentinel to "understand" the change; then the "sentinel failover <mastername>" command started working again (no longer returning the NOGOODSLAVE error).
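The workaround described here can be sketched in Python. This is an illustration under assumptions, not a recommended procedure: the helper name `promote_if_no_master` is hypothetical, each client is assumed to expose `info()` and `execute_command()` as redis-py's `Redis` class does, and the choice of which instance to promote is arbitrary in this sketch.

```python
def promote_if_no_master(clients):
    """If no instance reports role 'master', promote one of them.

    `clients` maps an instance label to a client for that redis-server.
    Returns the label that was promoted, or None if nothing was done.
    """
    roles = {label: c.info("replication").get("role")
             for label, c in clients.items()}
    if not clients or "master" in roles.values():
        return None
    # Arbitrary choice: the first instance in the mapping. In practice
    # the instance should be picked deliberately (e.g. the most up to date).
    label = next(iter(clients))
    clients[label].execute_command("SLAVEOF", "NO", "ONE")
    return label
```

After promoting an instance this way, Sentinel still needs some time to notice the change before "sentinel failover <mastername>" can be retried, as noted above.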

Ostap36

Apr 4, 2014, 2:57:54 PM
to redi...@googlegroups.com
Hello, 

I also encountered a similar issue with Redis Sentinel 2.8.6.

Looking through the logs, it seems that the sentinel elected as leader for that failover fails to switch the master to the newly promoted one, even though promotion of the selected slave was successful. Each of the sentinels then sends slaveof commands to its counterpart master, causing both redis instances (the old master and the new one) to become slaves of each other. This causes the new master to be marked +ODOWN and leads to the NOGOODSLAVE error during the next failover attempt.

Detailed Logs and description are here https://github.com/antirez/redis/issues/1651 
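The "slaves of each other" state can be checked for mechanically. The sketch below is a minimal illustration in Python: the function name `find_replication_cycle` is hypothetical, and the input mapping is assumed to have been built beforehand, for example from the master_host and master_port fields that each instance reports in INFO replication. The addresses in the example are placeholders.

```python
def find_replication_cycle(master_of):
    """Return the instances forming a replication cycle, or [] if none.

    `master_of` maps each instance address to the address it replicates
    from, or to None if the instance is a master.
    """
    for start in master_of:
        path = []
        node = start
        # Follow the "replicates from" links until we reach a master,
        # an unknown instance, or an address we have already visited.
        while node is not None and node not in path:
            path.append(node)
            node = master_of.get(node)
        if node is not None:
            return path[path.index(node):]
    return []


# Example: old and new master pointing at each other.
broken = {"10.0.0.1:6379": "10.0.0.2:6379", "10.0.0.2:6379": "10.0.0.1:6379"}
print(find_replication_cycle(broken))
```

A non-empty result means no instance in that group can act as master until one of them is sent "slaveof no one".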

Thank you