Message from discussion Sentinel Operations

From: Salvatore Sanfilippo <anti...@gmail.com>
Date: Fri, 28 Sep 2012 19:05:32 +0200
Message-ID: <CA+XzkVeYtYaK50zmzciCD5ATxKFV8OX-W1qT1Wn3nz+xpyn...@mail.gmail.com>
Subject: Re: Sentinel Operations
To: redis-db@googlegroups.com

On Fri, Sep 28, 2012 at 6:06 PM, Tom Coupland <tcoupl...@gmail.com> wrote:
> Afternoon,

Hello Tom,

> I'm trying to get my head around the Sentinel and just hoping someone could
> clear up/confirm my idea of what it does. Will have to resort to a list i'm
> afraid:
>
>  1. Sentinels should be configured to point at Masters only, they will
> discover the Slaves. Otherwise you risk something like the following, may
> not be a problem, but certainly looks weird:
>      # Sentinel
>      sentinel_masters:3
>      master0:name=slave1,status=ok,address=.32:6379,slaves=1,sentinels=2
>      master1:name=slave2,status=ok,address=.32:6379,slaves=1,sentinels=2
>      master2:name=mymaster,status=ok,address=.32:6379,slaves=1,sentinels=2

I'm not sure about this. How did you obtain such an effect?

If a Sentinel is configured with the address of a slave, what should
happen is a redirection, so it should start monitoring the master
instead.
By the way, was your output modified in some way? I see ".32", which
is not a valid address.
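
As a rough illustration of that check, here is a minimal Python sketch
that parses one line of the output quoted above and tests whether the
address part is a well-formed IP. The helper names are made up for this
example and the line format is assumed from the quoted text only.

```python
import ipaddress


def parse_master_line(line):
    """Parse one 'masterN:key=value,...' line from the quoted Sentinel output."""
    label, _, fields = line.strip().partition(":")
    info = dict(field.split("=", 1) for field in fields.split(","))
    # The address field is 'host:port'; split on the last colon.
    host, _, port = info["address"].rpartition(":")
    info["host"], info["port"] = host, int(port)
    return label, info


def is_valid_ip(host):
    """Return True if host is a well-formed IPv4 or IPv6 literal."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


label, info = parse_master_line(
    "master2:name=mymaster,status=ok,address=.32:6379,slaves=1,sentinels=2"
)
print(label, info["name"], info["host"], is_valid_ip(info["host"]))
```

With the line as quoted, the host part is ".32", which fails the check.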

>  2. When a master fails (been using 'debug segfault' for this) the sentinels
> have a chat and promote one of the slaves to master. However if the master
> is rebooted it is not promoted back to master, nor is it reconfigured to a
> slave, at least that's what i'm seeing when testing this. Is it therefore a
> requirement that any failed master needs to be reconfigured to a slaveof
> before rebooting? Of course any clients must keep their distance.

That's correct. Right now Redis Sentinel does not try to fix instances
that were marked as having issues. What to do in this case is not yet
clear, and there are several strategies. One is to do nothing, as
today, and wait for the system administrator to fix it (who was
notified, in theory). Another is to actively try to disable the
instance by sending it a SHUTDOWN if it comes back for some reason, so
that clients will not be able to talk to it.

There is also another option that looks promising. What we are going
to have ASAP is a way to configure Redis to be a slave of a master
that is not a hardcoded ip:port, but is discovered via Sentinel.
Something like this:

slaveof sentinel://192.168.1.50:2679,192.168.1.51:2679 mymaster

With such a configuration, Redis asks the listed Sentinels (one after
the other, until one replies) for the address of the master, and uses
this address to configure replication.
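
To make the lookup order concrete, here is a minimal Python sketch of
the idea, not the actual Redis implementation (the option does not
exist yet). The function names, the `ask` callable and the master
address 192.168.1.60 are all assumptions made for the example.

```python
def parse_sentinel_source(spec):
    """Split 'sentinel://h1:p1,h2:p2 name' into ([(host, port), ...], name)."""
    url, master_name = spec.split()
    prefix = "sentinel://"
    if not url.startswith(prefix):
        raise ValueError("expected a sentinel:// address list")
    sentinels = []
    for item in url[len(prefix):].split(","):
        host, _, port = item.rpartition(":")
        sentinels.append((host, int(port)))
    return sentinels, master_name


def discover_master(sentinels, master_name, ask):
    """Try each Sentinel in order and return the first master address obtained.

    `ask(sentinel, master_name)` is a hypothetical callable supplied by the
    caller. It returns (host, port), or raises OSError if that Sentinel
    cannot be reached.
    """
    for sentinel in sentinels:
        try:
            return ask(sentinel, master_name)
        except OSError:
            continue  # this Sentinel did not reply, try the next one
    raise RuntimeError("no Sentinel replied for %r" % master_name)


def fake_ask(sentinel, master_name):
    # Stand-in for a network query: pretend the first Sentinel is down.
    if sentinel == ("192.168.1.50", 2679):
        raise ConnectionRefusedError("sentinel down")
    return ("192.168.1.60", 6379)  # made-up master address


sentinels, name = parse_sentinel_source(
    "sentinel://192.168.1.50:2679,192.168.1.51:2679 mymaster"
)
print(discover_master(sentinels, name, fake_ask))
```

Passing the query in as a callable keeps the ordering logic separate
from the network code, so it can be exercised without a running
Sentinel.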

Once this is available, you configure all the Redis instances like
this, and manually switch one to master only the first time (SLAVEOF
NO ONE). So basically if an instance is rebooted it will always try to
be the slave of whatever the Sentinels tell it.

You also mentioned the clients in the context of an intermittent
failure / reboot. For instance, a master does not work properly,
Sentinel promotes a slave, and then the old master comes back online
for some reason, possibly after a reboot.
When this happens the clients should avoid engaging with it.

There are different solutions in this regard:

1) With the new SLAVEOF sentinel:// ... thing, the instance will
basically start as a slave.
2) If the client is Sentinel-aware, when the master is failing and it
asks Sentinel for info about the new master, it should probably update
its table of working masters / slaves, so that the old one is no
longer in it. This basically means that every time there is a
connection error a client should refresh its config according to what
the Sentinels are telling it.
3) In the failover script, to be safe, it is possible to add commands
to filter the IP of the failing master at layer 3.
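
Option 2 could look roughly like the following Python sketch. It is
only an illustration under stated assumptions: the class name and the
`resolve` / `connect` callables are hypothetical, and the addresses are
made up. Nothing here is an existing client API.

```python
class SentinelAwareClient:
    """Caches the master address and refreshes it after a connection error."""

    def __init__(self, resolve, connect):
        self._resolve = resolve  # () -> (host, port), asks the Sentinels
        self._connect = connect  # (host, port) -> connection object
        self._master = None

    def connection(self, retries=1):
        """Return a connection to the current master, re-resolving on failure."""
        for _ in range(retries + 1):
            if self._master is None:
                self._master = self._resolve()
            try:
                return self._connect(*self._master)
            except ConnectionError:
                # Forget the stale entry so the old master leaves the table.
                self._master = None
        raise ConnectionError("master unreachable after refresh")


# Stand-ins for the Sentinel query and the Redis connection.
addresses = iter([("10.0.0.1", 6379), ("10.0.0.2", 6379)])


def fake_resolve():
    return next(addresses)


def fake_connect(host, port):
    if host == "10.0.0.1":
        raise ConnectionRefusedError("old master is gone")
    return "connected to %s:%d" % (host, port)


client = SentinelAwareClient(fake_resolve, fake_connect)
print(client.connection())
```

The first attempt fails against the old master, the cached address is
dropped, and the second attempt uses whatever the Sentinels report next.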

>  3. When a master fails clients possible client reactions are:
>    - Receive a message from the sentinels via configured scripts and rebuild
> connection pools.
>    - Store the replication information on connection to master, which they
> can then use to find the new master should it pop.

Yes, my idea is that in the long term having "Sentinel aware" clients
is better than going for the configured script.
Every time a new Redis connection is created (new connection, link
reconnection after error or timeout) Sentinels should be queried.

So the client configuration should no longer be a list of Redis
instances, but a list of Sentinel addresses.
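
For reference, the query such a client would send to a Sentinel can be
sketched in Python as below. This assumes the
SENTINEL get-master-addr-by-name command and the usual Redis protocol
framing. The function names are made up for the example, and a single
recv() is a simplification that a real client should not rely on.

```python
import socket


def encode_command(*args):
    """Encode a command as a Redis protocol array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_master_reply(reply):
    """Parse the two-element array reply into (host, port), or None if unknown."""
    lines = reply.split(b"\r\n")
    if lines[0] != b"*2":
        return None  # e.g. a null reply when the master name is not known
    # Layout: *2, $<len>, <host>, $<len>, <port>, ''
    return lines[2].decode(), int(lines[4])


def ask_sentinel(sentinel, master_name, timeout=0.5):
    """Ask one Sentinel, given as (host, port), for the current master address."""
    with socket.create_connection(sentinel, timeout=timeout) as conn:
        conn.sendall(
            encode_command("SENTINEL", "get-master-addr-by-name", master_name)
        )
        return parse_master_reply(conn.recv(4096))


# Offline demonstration of the framing, no Sentinel needed.
print(encode_command("SENTINEL", "get-master-addr-by-name", "mymaster"))
print(parse_master_reply(b"*2\r\n$9\r\n127.0.0.1\r\n$4\r\n6379\r\n"))
```

A client configured with a list of Sentinel addresses would call
something like ask_sentinel() on each one in turn whenever it opens a
new connection.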

> I know sentinel is very much a work in progress, so I guess i'm really
> asking if these are design decisions or aspects that are on the list for
> completion.
>
> Keep thinking up places where Redis would be a great solution in the
> platform i'm a part of (its a great tool kit), just need to get my head
> around the high availability aspects before trying for a production use
> case.

Your email is very much appreciated, because Redis Sentinel is
currently a working system but is far from providing everything. The
idea is to collectively start using it for what it can do, and find
together the way for further development to make it more useful and
simpler / safer to operate.

Cheers,
Salvatore

>
> Tom



-- 
Salvatore 'antirez' Sanfilippo
open source developer - VMware
http://invece.org

Beauty is more important in computing than anywhere else in technology
because software is so complicated. Beauty is the ultimate defence
against complexity.
       -- David Gelernter