Possible split-brain when using Sentinel


Andrei Lukovenko

May 28, 2014, 4:32:07 AM
to redi...@googlegroups.com
Hi,

  Consider the following configuration (a config sketch follows the list):
        * master A (slaveof no one)
        * slave B (slaveof master-A-ip master-A-port)
        * slave C (slaveof master-A-ip master-A-port)
        * Sentinel (quorum=1), considers A as the master
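
  For reference, a minimal sketch of the corresponding config files; the IPs, ports and the master name "mymaster" are hypothetical placeholders:

        # redis.conf on B (10.0.0.2) and C (10.0.0.3); A has no slaveof directive
        slaveof 10.0.0.1 6379

        # sentinel.conf, quorum of 1 as above
        sentinel monitor mymaster 10.0.0.1 6379 1
        sentinel down-after-milliseconds mymaster 5000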

  If master A fails, Sentinel promotes one of the slaves and changes the configs accordingly (roughly the commands sketched below). So the configuration becomes:
        * master A (down, slaveof no one)
        * master B (slaveof no one)
        * slave C (slaveof master-B-ip master-B-port)
        * Sentinel (quorum=1), considers B as the master
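
  The reconfiguration Sentinel performs is roughly equivalent to these commands (issued here via redis-cli for illustration, with the same hypothetical addresses):

        # promote B to master
        redis-cli -h 10.0.0.2 -p 6379 SLAVEOF NO ONE
        # repoint C at the new master B
        redis-cli -h 10.0.0.3 -p 6379 SLAVEOF 10.0.0.2 6379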

  Now, for some reason, Sentinel goes down and restarts. As sentinel.conf has not been rewritten, Sentinel still believes host A is the master. As host A is down, the system becomes unresponsive, and our system administrator recovers host A, leaving (a quick check follows the list):
        * master A (slaveof no one)
        * master B (slaveof no one)
        * slave C (slaveof master-B-ip master-B-port)
        * Sentinel (quorum=1), considers A as the master
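
  At this point the split is easy to verify: both A and B report themselves as masters (same hypothetical addresses):

        redis-cli -h 10.0.0.1 -p 6379 INFO replication | grep ^role    # role:master
        redis-cli -h 10.0.0.2 -p 6379 INFO replication | grep ^role    # role:master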

   This is a very possible split-brain case, and currently there is no straightforward way to avoid it. I have considered the following workarounds:
a) Restoring redis.conf to its original state before restarting an instance. Not good, as we lose all the benefits of rewriting it.
b) Manually resolving these conflicts.

   In my opinion, it would be better if either:
a) We could explicitly describe our network configuration, including slaves, in sentinel.conf. Then, after restarting, a Sentinel would turn B and C back into slaves of A.
b) Sentinel would rewrite sentinel.conf after changing the configuration. In this example, after promoting B to master and changing B's and C's configs, it would also rewrite its own config to consider B the master (see the sentinel.conf sketch below).
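
  For illustration, such a rewritten sentinel.conf could record the new topology with lines like these (directive names as used by Sentinel 2.8's config rewriting; addresses hypothetical):

        sentinel monitor mymaster 10.0.0.2 6379 1
        sentinel known-slave mymaster 10.0.0.3 6379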

  What do you think?

Salvatore Sanfilippo

May 28, 2014, 8:20:23 AM
to Redis DB
Hello,

what you describe is AFAIK not possible, and moreover it is not
technically what is called a split-brain condition.
A split-brain condition happens when multiple processes should agree
on some value, but instead they disagree and actually hold two
distinct values.
In eventually consistent systems like Sentinel, split-brain conditions
are possible during partitions, but there is the guarantee that when
all the partitions heal, all the Sentinels agree about which node is
the master.

What you describe instead is a loss of state during a crash-recovery
event. However, AFAIK this is not possible, because when a new
configuration is considered valid (we receive an acknowledgment, by
examining the INFO output, that the promoted slave actually switched
to the master role), the configuration is persisted and fsync()-ed to
disk before it is propagated to other nodes or advertised by Sentinel
to any client.
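
The ordering is the important part. Here is a minimal sketch of the
idea in Python; this is not Sentinel's actual code, and the file path,
master name and broadcast helper are invented for illustration:

    import os

    def commit_failover(conf_path, new_master_ip, new_master_port):
        # 1. Write the updated configuration to disk...
        with open(conf_path, "w") as f:
            f.write("sentinel monitor mymaster %s %d 1\n"
                    % (new_master_ip, new_master_port))
            f.flush()
            os.fsync(f.fileno())  # 2. ...and fsync() it before advertising.
        # 3. Only now advertise the new master to clients and other
        #    Sentinels (stub standing in for the real broadcast).
        advertise(new_master_ip, new_master_port)

    def advertise(ip, port):
        print("advertising new master %s:%d" % (ip, port))

A crash before step 3 leaves the old configuration in place, which is
safe: as far as clients and other Sentinels are concerned, the
failover never happened.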

However, it is possible that the Sentinel sends a SLAVEOF NO ONE to
the promoted slave and restarts before the slave is able to confirm
the role change.
But this case is exactly like a Sentinel observing a slave switched
from slave to master externally (for instance, manually).

In this case, because of the Sentinel liveness property of always
trying to apply the current logical configuration to any instance
diverging from it, such a slave with a role equal to master is, after
a small delay, converted back to the slave role.
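
Concretely, the reconfiguration Sentinel applies amounts to sending
the stray master a plain SLAVEOF pointing back at the logical master;
with the hypothetical addresses from your example:

    redis-cli -h 10.0.0.2 -p 6379 SLAVEOF 10.0.0.1 6379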

Regards,
Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

To "attack a straw man" is to create the illusion of having refuted a
proposition by replacing it with a superficially similar yet
unequivalent proposition (the "straw man"), and to refute it
— Wikipedia (Straw man page)

Andrei Lukovenko

May 28, 2014, 9:26:14 AM
to redi...@googlegroups.com
Hello,

  First of all, thank you for your response.

  Regarding the definition of split-brain, I am still not convinced. In my example both instances A and B consider themselves masters. Both are able to serve clients, including writes. If that is not a split-brain, then what is?

  The sequence described above is not imaginary. I've actually seen this exact situation during my tests; it is very real, and what I really want is to find a way to prevent it from happening in production.

  So far it seems that Sentinel is able to change (and actually save on disk) the configuration of an instance (master or slave), but does not change its own configuration. Is that correct?
Best regards, Andrei

Salvatore Sanfilippo

May 28, 2014, 9:36:38 AM
to Redis DB
On Wed, May 28, 2014 at 3:26 PM, Andrei Lukovenko <al...@cordeo.ru> wrote:
> Hello,
>
> First of all, thank you for your response.
>
> Regarding the definition of split-brain, I am still not convinced. In
> my example both instances A and B consider themselves masters. Both
> are able to serve clients, including writes. If that is not a
> split-brain, then what is?

Split-brain conditions must be evaluated from the point of view of
whoever is the source of authority in the distributed system.
In this case, that is the set of Sentinel instances, so as long as
there is no split-brain condition among the Sentinels themselves, the
split-brain condition you see in the Redis instances is not a problem,
because of the Sentinel property of always (with a delay) applying the
logical configuration to the instances.

> The sequence described above is not imaginary. I've actually seen this
> exact situation during my tests; it is very real, and what I really want
> is to find a way to prevent it from happening in production.

Probably what you observed is what I described in the previous email;
that's definitely possible.

1) A failover starts.
2) The Sentinel sends SLAVEOF NO ONE to the slave.
3) The Sentinel gets killed before getting the acknowledgment.
4) The Sentinel restarts with the old config (which is correct, since
the previous failover technically never finished, and the Sentinel
never advertised the new master).

At this point you have two masters if you check the instances, but for
Sentinel the master is still the old one.
After some time (8 seconds, that is, four times the configuration
broadcasting period) it should detect that one of the slaves is
misconfigured and reconfigure it accordingly; if this does not happen,
there is a bug.
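
If you want to watch this window yourself, here is a quick sketch
using the redis-py client (pip install redis; addresses hypothetical,
any client that can send INFO works the same way):

    import time
    import redis

    a = redis.Redis(host="10.0.0.1", port=6379)  # old master
    b = redis.Redis(host="10.0.0.2", port=6379)  # promoted slave
    while True:
        # INFO replication exposes the role each instance claims.
        print("A:", a.info("replication")["role"],
              "B:", b.info("replication")["role"])
        time.sleep(1)

Within a bit more than 8 seconds of the Sentinel restart, one of the
two roles should flip back to slave.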

All this, of course, in Sentinel >= 2.8.
The Sentinel shipped with 2.6 is broken and deprecated; in the latest
2.6 branch it is actually a dummy binary that warns you to use 2.8.
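
If in doubt about which version a given binary is, it reports it
directly:

    redis-server --version
    redis-cli INFO server | grep redis_version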

> So far it seems that Sentinel is able to change (and actually save on
> disk) the configuration of an instance (master or slave), but does not
> change its own configuration. Is that correct?

Yes and no. It does not save the new configuration on purpose, because
it has not yet received the acknowledgment.
But what is interesting here is that it always saves the updated
configuration (with fsync) *before* advertising the new configuration
to clients and other Sentinels.

If it is not able to get the ack, it will reconfigure the new master
back to the slave role again.

If this does not happen, then there is a bug in the implementation,
but the designed semantics are very clear; the problem is if you find
a case where, because of an implementation bug, things do not work as
expected.

I'm trying to reproduce this right now. Thanks for posting; it is
vital that we try to remove all the bugs in order to end up with a
system that behaves as the specification claims.

Salvatore

Salvatore Sanfilippo

May 28, 2014, 10:11:04 AM
to Redis DB
Update: I tried to simulate the problem, and Sentinel always
reconfigures the instance that claims to be a master but is not the
logical master back to a slave, after a bit more than 8 seconds.
During all this time, Sentinel never advertised the non-logical master
as a master to clients.

So apparently I'm not able to trigger the bug. If you are able to find
a sequence of operations where instances known by Sentinel can
simultaneously be masters for more than a few seconds, please post the
exact sequence and I'll try to reproduce and track down the issue.
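
For anyone who wants to try: one way to approximate the race is to
trigger a failover by hand and kill the Sentinel before it completes
(the Sentinel port and pid here are placeholders):

    # force a failover, then kill Sentinel before it can observe the ack
    redis-cli -p 26379 SENTINEL failover mymaster
    kill -9 <sentinel-pid>
    # restart Sentinel with its old config and watch the roles converge

SENTINEL failover forces a failover without requiring agreement from
the other Sentinels.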

Cheers,
Salvatore

Andrei Lukovenko

May 28, 2014, 10:29:22 AM
to redi...@googlegroups.com
Hi,

It seems that you are right. I haven't been able to reproduce this bug so far. Let's consider it a rare occasional fluke.

Thank you for your support!

Salvatore Sanfilippo

May 28, 2014, 10:34:27 AM
to Redis DB
On Wed, May 28, 2014 at 4:29 PM, Andrei Lukovenko <al...@cordeo.ru> wrote:
> It seems that you are right. I haven't been able to reproduce this bug
> so far. Let's consider it a rare occasional fluke.

That's the problem: the current implementation of Sentinel, as it is
specified, should never ever have occasional flukes.
It is more likely that you used an older version that contained bugs
(there are known issues in past versions), or that there is a
not-yet-discovered issue which is hard to trigger.

One characteristic of eventually consistent systems is that they are
very resilient, because eventually there is always a unique piece of
information that wins over the others and gets applied, so it should
be technically impossible to put such a system into a state where it
does not act in an obvious way. All this, modulo implementation bugs.
(For the same reason, some bugs in EC systems are hard to discover,
since the system eventually converges, masking issues; but Sentinel
tends to log everything so that bugs can be traced more easily.)

Please, if you happen to discover some problem, ping me, and thanks
for reporting this possible problem.

Salvatore