Redis High Availability using Corosync+Pacemaker

Cal

Nov 12, 2012, 5:05:31 PM
to redi...@googlegroups.com
Hi everyone,

I thought I'd post my findings here in case this helps anyone out.  After only a brief investigation into Sentinel, I wanted to find something more stable and quicker to recover.  Plus, I don't think I like all of the client-side logic necessary when using Sentinel.

Pacemaker does HA at the OS level, using (gratuitous) ARP to move a clustered IP address between nodes.  On top of that, it has a big state machine of events that trigger actions when a failover occurs.

The downside to Pacemaker is that it can be difficult to set up.  It also behaves more reliably when you have at least three nodes, each with an IPMI card.  (It "fences" a downed node by triggering a hardware reset, to ensure the node is truly down.)
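
In crm syntax, a fencing primitive for that kind of IPMI setup ends up looking roughly like this (the node name, IPMI address, and credentials are placeholders, and it isn't part of the minimal config further down):

primitive fence-node1 stonith:fence_ipmilan \
        params pcmk_host_list="node1" ipaddr="192.168.1.101" \
               login="admin" passwd="secret" lanplus="1" \
        op monitor interval="60s"
# never run node1's fence device on node1 itself
location fence-node1-placement fence-node1 -inf: node1
property stonith-enabled="true"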

Despite the setup headaches, the plus side is that I've found the failover time from master to slave is about 3 seconds.  We also use a similar cluster setup for our Memcache cluster, and it works great.  The Redis and Memcache clients don't need any logic other than retrying the connection to an IP address on failure, then maybe giving up after 10 seconds or so.

The (very rough) guideline for a Red Hat 6 setup is:

* Install packages pacemaker, corosync, cluster-glue, fence-agents
* Download the Redis OCF resource agent script and copy it to /usr/lib/ocf/resource.d/redhat
* Modify the file to point at your Redis locations, and replace the start-stop-daemon lines with however you usually start/stop Redis
* This very minimal CRM config should get you started, with less hair pulling than I went through:

node node1
node node2
# define the IP address clients will connect to
primitive cluster-ip ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.10" cidr_netmask="24" \
        op monitor interval="1s" timeout="20s" \
        op start interval="0" timeout="20s" \
        op stop interval="0" timeout="20s" \
        meta is-managed="true" target-role="Started" resource-stickiness="500"
# define the Redis resource from the OCF script
primitive redis ocf:redhat:redis \
        meta target-role="Master" is-managed="true" \
        op monitor interval="1s" role="Master" timeout="5s" on-fail="restart"
# "clone" the Redis primitive across all available nodes
ms redis_clone redis \
        meta notify="true" is-managed="true" ordered="false" interleave="false" globally-unique="false" target-role="Master" migration-threshold="1"                                                                           
# Ensure cluster IP always has a running Master copy of Redis
colocation ip-on-redis inf: cluster-ip redis_clone:Master
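
To load a config like that, something along these lines should work with the crm shell (the filename is arbitrary):

crm configure load update redis-ha.crm
crm_mon -1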

Enjoy!

--Cal

Cal

Nov 13, 2012, 4:11:00 PM
to redi...@googlegroups.com
Two small follow-up modifications to that configuration:

primitive redis ocf:redhat:redis \
        op monitor interval="1s" role="Master" timeout="5s" on-fail="restart" \
        op monitor interval="2s" role="Slave" timeout="5s" on-fail="restart"
ms redis_clone redis \
        meta notify="true" is-managed="true" ordered="false" interleave="false" globally-unique="false" target-role="Started" migration-threshold="5"                            

That turns on active monitoring for both the master and slave Redis instances.  Also, migration-threshold=5 enforces a limit of 5 restarts before the cluster determines a node is down.  If the Redis process dies but the node is still running fine, it's quicker to restart the daemon than to fail over.  After 5 failures without your intervention, something more sinister must be happening, and that node will be fenced (rebooted).

Intervention in this case means running "crm resource cleanup redis_clone" -- that will reset all of the failure counts to 0 and restart services if necessary.
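
If you want to eyeball the fail counts before (or instead of) cleaning up, something like this should show them:

crm_mon -1 --failcounts
crm resource failcount redis show node1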

--Cal

Razvan Oncioiu

Jun 16, 2014, 3:57:37 PM
to redi...@googlegroups.com
Hello Cal, 

I don't suppose you still have a working link to that resource? It seems it no longer exists on GitHub.

Thank You.

Salvatore Sanfilippo

Jun 17, 2014, 8:47:06 AM
to Redis DB
Hello,

I want to hijack this thread a little bit. Basically, I think that
people should use Sentinel, which is far better than a home-made
solution based on a CP store.
Note that the original message in this thread is from 2012. Sentinel
back then was broken (and unstable). The new Sentinel can fail over in
a matter of hundreds of milliseconds, providing the guarantee of an
eventually consistent configuration (from the point of view of which
node is the current master) and eventually consistent behavior from
the point of view of the data set (eventually every node replicates
the same master).

If you use a CP system for Redis failover, and you believe that this
provides a CP system from the point of view of Redis itself, you are
obviously wrong: an eventually consistent configuration during
partitions is the best you can get.

Example: you use ZooKeeper for failovers. A client and the current
master get partitioned into the minority side. ZK can either fail to
reply with an updated configuration, or reply with stale data, since
the failover can only happen on the majority side. So your client will
still write to the old master. (The solution is to bound node
desynchronization during partitions, which can be achieved by other
means using the Redis configuration.)
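
For example, something like this in redis.conf bounds how long an
isolated master keeps accepting writes (the numbers are just an
example):

# stop accepting writes when fewer than 1 slave is connected,
# or when the connected slaves lag by more than 10 seconds
min-slaves-to-write 1
min-slaves-max-lag 10
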
Moreover, home-made solutions will not bother to do what Sentinel does,
which is to also reconfigure all the reachable instances, imposing the
current configuration on the Redis nodes. So it is easy to imagine
old masters remaining configured as masters, slaves not reconfigured
correctly, or a failover interrupted half-way that remains inconsistent
forever.

So if you plan to use a non-Sentinel based solution, ask yourself
"why". Either you have sound technical reasons (if so, please share
them with us), or the choice is an arbitrary one.

Salvatore



--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

To "attack a straw man" is to create the illusion of having refuted a
proposition by replacing it with a superficially similar yet
unequivalent proposition (the "straw man"), and to refute it
— Wikipedia (Straw man page)

Cal Heldenbrand

Jun 17, 2014, 8:58:02 AM
to redi...@googlegroups.com
I would concur with Salvatore.  My Corosync/Pacemaker machines are long gone; I became pretty frustrated with the stack.  If you don't have at least 3 nodes in a cluster with hardware-controlled STONITH devices, the cluster can become unstable.  Sort of like a cheap RAID card, I found it was less dependable than a single machine.

I haven't had any experience with Sentinel, though.  But I've had success with keepalived, and it seems stable and functional.
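
A minimal keepalived sketch along those lines looks something like this (the interface name, VIP, priority, and health check are placeholders rather than my exact config):

vrrp_script chk_redis {
    script "/usr/bin/redis-cli -p 6379 ping"
    interval 2
}

vrrp_instance REDIS_VIP {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 101        # use a lower priority on the other node
    advert_int 1
    virtual_ipaddress {
        192.168.1.10/24
    }
    track_script {
        chk_redis
    }
}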

--Cal



Cal Leeming [Simplicity Media Ltd]

Jun 17, 2014, 9:08:50 AM
to redi...@googlegroups.com
I got really confused for a moment there... thinking that old age was messing with my memory lol.

First time I've seen someone else on a mailing list with the same name!

Cal

Razvan Oncioiu

Jun 17, 2014, 6:00:32 PM
to redi...@googlegroups.com

Hello,

I would say:

1 - I know and am decent at Pacemaker and Corosync; I have used it on multiple servers and I find it suits my needs very well. (And I don't necessarily have the time right now to hack at the Redis resource agent I found to get it running, or to build one from scratch.)

2 - When investigating the problem, I ran into Sentinel, and the Redis site states: "Sentinel is currently developed in the unstable branch of the Redis source code at Github". Now I ask you, should I be a test subject and place this in production? No, I don't think I should. Of course, you would ask, would I test a Corosync/Pacemaker solution? Yes, of course I would. And those tests would be satisfied a lot quicker than testing something I am not familiar with. But as it matures, I would give Sentinel a try, as it seems to be a nice solution.

But I'll repeat my request: if anyone has a new link, or another way to get the Redis HA resource agent mentioned earlier in this thread, it would be greatly appreciated.

Thank You.



Salvatore Sanfilippo

Jun 17, 2014, 6:13:02 PM
to Redis DB
Hello Razvan,

Sentinel actually is considered stable. The full sentence in the
documentation is: "Sentinel is currently developed in the unstable
branch of the Redis source code at Github. However an updated copy of
Sentinel is provided with every patch release of Redis 2.8.".

It means that new commits are always made to the unstable branch
first, but the stable releases are synced with those commits as soon
as they are tested.
Maybe this is not clear enough; I'll state clearly in the documentation
that Sentinel as shipped in Redis 2.8 is stable. The documentation is
a bit unclear because, years ago, it used to say that Sentinel was
*only* available in the unstable branch.

So you would not be a test subject: there are multiple users running
Sentinel in production, and all the latest bug reports received about
Sentinel turned out to be problems with the configuration, never
Sentinel itself acting in an unwanted way. So bugs are not impossible,
but the project looks solid.

Of course you are free to implement your own solution; there are
multiple ways to do things. However, since this thread started over
problems with Sentinel that no longer exist (here the specific issue
was that it was slow to react, but it had other issues as well, so it
was reimplemented from scratch), I wanted to clarify that today's
Redis Sentinel is stable, has easy-to-understand semantics (you can
easily predict what it does in different conditions), and is, IMHO,
the go-to solution for Redis HA.

Salvatore

Salvatore Sanfilippo

Jun 17, 2014, 6:19:16 PM
to Redis DB
I clarified the documentation.

Thanks,
Salvatore

Razvan Oncioiu

Jun 17, 2014, 6:33:06 PM
to redi...@googlegroups.com
Very cool, I will give this a try as it seems to do what we want it to do. Thank you for your work!

Thank You,

Razvan Oncioiu

Salvatore Sanfilippo

Jun 17, 2014, 6:35:51 PM
to Redis DB
Thank you. Please report any issues here; I'll do my best to fix
things ASAP if there are any.

Salvatore
"One would never undertake such a thing if one were not driven on by
some demon whom one can neither resist nor understand."
— George Orwell

can.oe...@gmail.com

Nov 3, 2015, 2:25:49 AM
to Redis DB
Hello,


> Of course you are free to implement your own solution; there are
> multiple ways to do things. [...] I wanted to clarify that today's
> Redis Sentinel is stable, has easy-to-understand semantics, and is,
> IMHO, the go-to solution for Redis HA.
>
> Salvatore

I'm also considering a Corosync/Pacemaker resource agent to provide automatic failover in case of an outage.
The reason is that both Sentinel and Redis Cluster require at least 3 machines, and I only have 2. With Corosync I can set the two machines up so that they communicate via 2 separate network interfaces.
If a machine goes offline, the other one will notice and can promote a new master.
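
A rough corosync.conf sketch for that kind of two-ring setup (the addresses are placeholders, not my real networks):

totem {
    version: 2
    rrp_mode: passive
    interface {
        ringnumber: 0
        bindnetaddr: 10.0.0.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 192.168.100.0
        mcastaddr: 239.255.2.1
        mcastport: 5407
    }
}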

br,
Can

Josiah Carlson

Nov 5, 2015, 12:55:16 AM
to redi...@googlegroups.com
Or if you have a netsplit, both will think they are the master.

Answer: you don't just run Sentinel on the machines running Redis; you also run it on every machine your clients are running on. That's how you get 3+ machines running Sentinel, even when you've only got 2 servers.
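
For example, a minimal sentinel.conf that could run unchanged on all three boxes (the master address is a placeholder; quorum 2 means two of the three Sentinels must agree the master is down before a failover starts):

port 26379
sentinel monitor mymaster 192.168.1.11 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1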

 - Josiah



can.oe...@gmail.com

Nov 6, 2015, 8:40:24 AM
to Redis DB

> Or if you have a netsplit, both will think they are the master.
>
> Answer: you don't just run Sentinel on the machines running Redis; you also run it on every machine your clients are running on. That's how you get 3+ machines running Sentinel, even when you've only got 2 servers.
>
>  - Josiah


In my setup the servers are also the clients.
I could set up a third machine in a different datacenter, but then network outages could occur because the link is not local.

A netsplit is unlikely in the current setup, since I have 2 network interfaces: the machines would have to lose sight of each other over both interfaces for this to happen.

Can

Salvatore Sanfilippo

Nov 6, 2015, 9:32:07 AM
to Redis DB
Old thread... and after so much time we have better docs. The Sentinel
documentation hopefully now does a good job of explaining things.
Please read it and report the exact part that is unclear if you need
further info.



--
Salvatore 'antirez' Sanfilippo
open source developer - Redis Labs https://redislabs.com

"If a system is to have conceptual integrity, someone must control the
concepts."
— Fred Brooks, "The Mythical Man-Month", 1975.