On Sat, May 4, 2013 at 2:45 AM, Andy McCurdy <sed...@gmail.com> wrote:
> Hey Salvatore. Some replies below.
Thank you, Andy.
> The occasional check to Sentinel is just part of what a client will have to do. Clients will also have to accept additional configuration options (at least a list/array of Sentinel ip:port values) and use that information to look up the real Redis servers. A sentinel aware client constructor might look something like this:
>
>>>> Redis(hostname, port, ..., sentinel_servers=[])
>
> This sounds trivial at first, but consider all the libraries (not client libs, but libs that are using a client lib to talk to Redis) out there that have code like this:
>
>>>> redis = Redis(REDIS_HOSTNAME, REDIS_PORT)
>>>> redis.get(...)
>
> where REDIS_HOSTNAME and REDIS_PORT come from environment variables, config files, etc. Even if the underlying client supports Sentinel, libs that use this common pattern won't work in Sentinel-based deployments until those libs also add more config options for Sentinel. This really sucks.
In environments where HA is needed, the only thing you can do is to
make things at least a bit more complex; there is no escape from that.
Even if we used proxies, you would still need to provide the list of
Sentinels, because if the client can't connect to the first one, it is
not OK to just fail: it needs to connect to the next.
Basically every reliable distributed system requires that every client
knows about multiple access points to the system. There is no escape
from that, whatever solution you use, otherwise the proxy itself turns
into a single point of failure.
That said, fortunately there are many things that can be done in
interesting ways.
1) Redis, as specified in the Sentinel doc (I don't remember if it's
in the spec or the standard doc), will feature a command that lists
the ip:port pairs of all the Sentinels monitoring a given instance.
This is trivial, as Sentinels are already required to send Pub/Sub
messages for auto-discovery. This means that clients, in case the user
did not specify a list of a few Sentinels (they should, as it is safer
for a cold start), can do a best-effort thing and at least fetch the
list of Sentinels once the object is created, assuming the master is
up at the moment the Redis object is created.
2) Users don't need to list all the Sentinels: it is enough to list a
few Sentinels in different conceptual availability zones, so that the
client will likely be able to contact at least one. So the list does
not *need* to be exhaustive (see the sketch below).
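
To show how little client-side code this takes, here is a rough
sketch in Python against a redis-py style client. The addresses and
master name are made up, and the exact SENTINEL subcommand names may
differ from what ends up implemented; take it as an illustration of
the flow, not as a final API:

    import redis

    # A few Sentinels in different availability zones; the list does
    # not need to be exhaustive.
    SENTINELS = [('10.0.0.1', 26379), ('10.0.1.1', 26379)]
    MASTER_NAME = 'mymaster'

    def discover(sentinels, master_name):
        for host, port in sentinels:
            try:
                s = redis.StrictRedis(host=host, port=port,
                                      socket_timeout=0.5)
                # Ask this Sentinel for the current master address.
                master_addr = s.execute_command(
                    'SENTINEL', 'get-master-addr-by-name', master_name)
                # Best effort: also refresh our view of the other
                # Sentinels monitoring this master (assuming a
                # subcommand like this, per the doc).
                known = s.execute_command(
                    'SENTINEL', 'sentinels', master_name)
                return master_addr, known
            except redis.ConnectionError:
                continue  # this Sentinel is down, try the next one
        raise redis.ConnectionError('no Sentinel reachable')

The same loop works for the cold start and to re-discover the master
after a failover.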
> A single connection might see a slight drop in throughput, but assuming multiple Sentinels would be used to proxy connections to the Redis server, the system as a whole will still perform at the same level. In fact, in relational database communities, adding reverse proxies is often a good way to *increase* performance because they pool connections to the database server. Some clients like redis-py already do this, but it would be nice for everyone to have access to this functionality, regardless of client lib.
Well, here I don't want to argue much, as this should be backed by
tests, but I can't agree that a middle layer can increase performance
with the Redis architecture, unless you do sub-optimal things in the
client that a proxy can fix in some way. Proxies are also another
component that can fail, and another component where you need to
implement subtle things like TCP keep-alive, client timeouts, and so
forth. They also mask information about the client peer address and
port. I think it's really a shame to lose all of that for something
that is easily implementable client-side. I can understand using a
proxy because of sharding concerns, if the alternative is to implement
sharding client-side: that is a fair trade-off, but for HA I don't see
the win at all.
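
Connection pooling is a good example of what I mean by easily
implementable client-side. A minimal sketch with redis-py, which
already ships a pool (the parameters here are just illustrative):

    import redis

    # One pool shared by the application: connections are created
    # lazily, checked out per command, and returned afterwards, so no
    # external proxy is needed to get pooling.
    pool = redis.ConnectionPool(host='127.0.0.1', port=6379,
                                max_connections=20)
    r = redis.Redis(connection_pool=pool)
    r.set('foo', 'bar')
    print(r.get('foo'))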
> I haven't used it yet, but twemproxy looks great. Unfortunately, it doesn't seem to know anything about master/slave relationships. If twemproxy becomes Sentinel aware, that's great. But then I would need to run 3 different daemons (redis-server, sentinel, and twemproxy) to get high availability.
It's three daemons, but it would allow you to run a setup with a
different pick on the trade-offs, one that I don't like, as stated
already, but that many other very skilled people do like. So I think
there is really great value in having a Sentinel-aware Twemproxy.
About the three daemons: that adds sysop complexity and is a good
point indeed, but to use this example to show my point of view, I'm a
lot more concerned with the *three conceptual roles* added by the
proxy, even if it runs in the same daemon as Sentinel, as this raises
the conceptual complexity of the system.
Long story short, while I'll not go for the proxy, I think the
arguments you used for the proxy are the ones we could use to improve
the current system in different ways.
Cheers,