Sentinel design resists deployment in docker cluster


Sheldon Hearn

Sep 9, 2014, 8:25:08 AM
to redi...@googlegroups.com
Synopsis
========

For sentinel to work in an environment where services are only available through discovery (not at known ip:port locations), it would have to allow me to describe the slaves in configuration, instead of trying to discover them from the master. If it did this without supporting DNS SRV records, it would have to be restarted when redis instances relocate, but that could be made to work. As it stands, the current design resists deployment into a discovery-based environment.

Background
==========

I am building a Docker cluster (based on CoreOS) in which containers are assigned ephemeral ports. Containerized redis processes listen on port 6379, but Docker maps that to an ephemeral port on the Docker host.

Each redis container is linked to cohosted ambassador containers that use Consul service discovery to locate other redis containers and proxy connections to them. The redis slave containers are passed the ip and port of the linked ambassadors via environment variables that are used to create their initial config. This works okay, and I am able to change the master-slave topology manually, using out of band service discovery.
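As a rough illustration of the "environment variables to initial config" step described above, here is a minimal Python sketch. The variable names REDIS_MASTER_HOST and REDIS_MASTER_PORT and the sample address are hypothetical and not taken from the actual setup; only the `port` and `slaveof` directives are real redis.conf options.

```python
def render_slave_config(env):
    """Build a minimal redis.conf fragment for a slave.

    `env` is any mapping (for example os.environ). The variable names
    below are hypothetical; they stand in for whatever the linked
    ambassador container actually exports.
    """
    host = env["REDIS_MASTER_HOST"]
    port = int(env["REDIS_MASTER_PORT"])
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    # Inside the container redis still listens on 6379; the slave
    # replicates from the ambassador's address, not the real master's.
    return f"port 6379\nslaveof {host} {port}\n"


if __name__ == "__main__":
    # Placeholder values for demonstration only.
    demo_env = {"REDIS_MASTER_HOST": "172.17.42.1", "REDIS_MASTER_PORT": "49153"}
    print(render_slave_config(demo_env), end="")
```

Passing a plain mapping rather than reading os.environ directly keeps the function easy to exercise outside a container.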

It's not ideal. It would be much more awesome if the clients supported DNS SRV records in the SLAVEOF command. But it works.

Now, sentinel.

The sentinel configuration cannot provide the ip:port of each slave; sentinel must discover them from the configured master. Unfortunately, each redis instance thinks it's listening on port 6379, when in fact it is not reachable on that port. Docker has mapped that port to another, ephemeral port that is reachable. So sentinel is unable to reach the slaves.

The bare minimum change to support deployment into discovery-driven environments would be for sentinel configuration to support declaration of the slaves. Combined with linked ambassador proxy containers, this would be enough. For replicated redis to be a pleasure to deploy into these environments, redis and sentinel would additionally need to support DNS SRV records. This would remove the need for the linked ambassador containers (which are a docker-specific work-around).

I discussed this with someone on #redis on freenode, who suggested I mention the issue here. I appreciate that this would be a huge change, so I'm not exactly making a feature request. I'm just highlighting a stumbling block for replicated redis in discovery-based environments, so that it can be considered when deployment into such environments becomes attractive to the community.

Ciao,
Sheldon.

Salvatore Sanfilippo

Sep 9, 2014, 8:56:08 AM
to Redis DB
Hello,

the following commits into unstable (soon to be ported to 2.8) should
fix the issue:

0a6cbab Sentinel: don't set announce-ip if is empty.
c9437fe Sentinel: clarify announce-ip/port options in sentinel.conf.
cd576a1 Sentinel: announce ip/port changes + rewrite.
3d93926 sentinel: Decouple bind address from address sent to other sentinels

Basically you can tell every Sentinel which address / port it should
use to advertise itself.
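
For illustration, the announce options named in the commits above would be set in sentinel.conf roughly as follows. The address and port shown are placeholders, not values from this thread; in a Docker setup they would be the host address and the mapped port at which other Sentinels can actually reach this one.

```
# sentinel.conf (placeholder values)
sentinel announce-ip 10.0.0.5
sentinel announce-port 49200
```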

I don't agree that the default is not good. It works
automatically in normal setups, and we fixed the case of unusual
setups.

Salvatore
> --
> You received this message because you are subscribed to the Google Groups
> "Redis DB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to redis-db+u...@googlegroups.com.
> To post to this group, send email to redi...@googlegroups.com.
> Visit this group at http://groups.google.com/group/redis-db.
> For more options, visit https://groups.google.com/d/optout.



--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

"One would never undertake such a thing if one were not driven on by
some demon whom one can neither resist nor understand."
— George Orwell

Sheldon Hearn

Sep 9, 2014, 9:17:22 AM
to redi...@googlegroups.com
Hi Salvatore,

On Tue, Sep 9, 2014 at 2:55 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> Basically you can tell every Sentinel which address / port it should
> use to advertise itself.

I don't think that fixes the problem I'm describing, although it sounds like it fixes a problem I would have run into if I'd gotten as far as trying to get sentinel working. I hadn't even thought about how sentinels discover each other. :-)

The problem I'm describing is not the location of sentinels, but the location of slaves. On a master, I see this in the INFO output:

role:master
connected_slaves:2
slave0:ip=172.17.8.102,port=6379,state=online,offset=1712,lag=0
slave1:ip=172.17.8.103,port=6379,state=online,offset=1712,lag=0

The slaves are not reachable at these locations, and so I can't use sentinel to handle fail over, because sentinel can't reach the slaves.
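
To make the mismatch concrete, here is a small Python sketch that extracts slave addresses from INFO output like the above. It is only an illustration of the discovery idea, not Sentinel's actual implementation, and the function name parse_slaves is made up for this example.

```python
def parse_slaves(info_text):
    """Return (ip, port) pairs from the slaveN: lines of INFO output."""
    slaves = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Only lines such as "slave0:..." describe a connected slave.
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        fields = dict(item.split("=", 1) for item in value.split(","))
        slaves.append((fields["ip"], int(fields["port"])))
    return slaves


INFO_SAMPLE = """\
role:master
connected_slaves:2
slave0:ip=172.17.8.102,port=6379,state=online,offset=1712,lag=0
slave1:ip=172.17.8.103,port=6379,state=online,offset=1712,lag=0
"""

if __name__ == "__main__":
    for ip, port in parse_slaves(INFO_SAMPLE):
        print(f"{ip}:{port}")
```

Every address recovered this way carries port 6379, the port each slave believes it listens on inside its container, rather than the ephemeral port Docker mapped on the host.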

> I don't agree that the default is not good. It works
> automatically in normal setups, and we fixed the case of unusual
> setups.

Absolutely! I'm sorry that's how my email came across. I certainly wasn't suggesting that the defaults aren't sensible. Discovery-based environments are (still) niche. I'm just saying that the current design doesn't support deployment into these niche environments.
 
Ciao,
Sheldon.

Salvatore Sanfilippo

Sep 9, 2014, 10:09:35 AM
to Redis DB
On Tue, Sep 9, 2014 at 3:17 PM, Sheldon Hearn <shel...@starjuice.net> wrote:
> The slaves are not reachable at these locations, and so I can't use sentinel
> to handle fail over, because sentinel can't reach the slaves.

Oh, I get it (sorry, I did not read the previous message carefully). It
is a different instance of the same problem, more or less. Let me
think a bit about it, but the fix should be pretty easy, since we
already have mechanisms for the slave to communicate its coordinates to
the master (the way slaves are already able to communicate their TCP
port).

Thanks,
Salvatore

Vincent Chavelle

Oct 29, 2014, 6:01:32 AM
to redi...@googlegroups.com
Hi Salvatore,

I have the same problem. I want to deploy sentinel to a Docker environment (CoreOS), and the only solution I have found is to set --net=host, which is not really what we expect in a discovery-based environment :-)
Do you have any update?

Thanks

Sheldon Hearn

Oct 29, 2014, 7:20:18 AM
to redi...@googlegroups.com
Hi Vincent,

A demo of the best I've come up with for redis in a Docker cloud is
encoded here:

https://github.com/sheldonh/coreos-vagrant/redis

It expects the Docker cluster (provided from the parent directory) to
give it DNS discovery. It allows you to publish your desired
master/slave topology into etcd, and it does its best to apply that
topology in spite of rebooting cluster hosts, restarted containers,
etc. It doesn't do automatic failover -- my use case doesn't tolerate
the kind of data loss that can arise from a network partition. But
automating failover would not be hard.

Feel free to email me about it privately. It's probably a bit off
topic for the list.

Ciao,
Sheldon.

Sheldon Hearn

Oct 29, 2014, 7:21:10 AM
to redi...@googlegroups.com
On Wed, Oct 29, 2014 at 1:20 PM, Sheldon Hearn <shel...@starjuice.net> wrote:
> A demo of the best I've come up with for redis in a Docker cloud is
> encoded here:

I provided a broken link, sorry. Should have been:
https://github.com/sheldonh/coreos-vagrant/tree/master/redis

Ciao,
Sheldon.