|Riak behind a Load Balancer||Matt Black||6/24/12 11:36 PM|
Does anyone have an opinion on the concept of putting a Riak cluster behind a load balancer?
We wish to be able to automatically add/remove nodes from the cluster, so adding an extra layer at the front is desirable. We should also benefit from incoming requests being shared across all nodes.
Can anyone see any drawbacks / problems with doing this?
|Re: Riak behind a Load Balancer||Samuel Elliott||6/25/12 1:40 AM|
It has been done before. There are various results when searching
"riak haproxy" in your favourite search engine.
If your load balancer falls over, what do you do then? High
availability may go down the pan. Having more than one would be the obvious answer.
What do you do when you want to transparently add more machines to
your load balancer?
Maybe it might be better to have a list of riak nodes stored in a
separate registry (I'm thinking something like zookeeper), that your
application servers can then poll for changes (or even subscribe to
changes) to the list of servers.
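A minimal sketch of the polling idea above, in Python. It assumes some callable (here the hypothetical `fetch_nodes`) that returns the current node list from whatever registry is used; no real ZooKeeper client API is shown.

```python
from typing import Callable, Iterable, List, Set, Tuple


class NodeRegistryWatcher:
    """Tracks the set of Riak nodes reported by an external registry."""

    def __init__(self, fetch_nodes: Callable[[], Iterable[str]]) -> None:
        # fetch_nodes is a hypothetical callable that asks the registry
        # (ZooKeeper, a config service, ...) for the current node list.
        self._fetch_nodes = fetch_nodes
        self._nodes: Set[str] = set()

    @property
    def nodes(self) -> List[str]:
        """Nodes seen at the last poll, sorted for stable output."""
        return sorted(self._nodes)

    def poll(self) -> Tuple[List[str], List[str]]:
        """Fetch the list once and return (added, removed) since the last poll."""
        current = set(self._fetch_nodes())
        added = sorted(current - self._nodes)
        removed = sorted(self._nodes - current)
        self._nodes = current
        return added, removed
```

An application server would call `poll()` on a timer and open or close connections for the nodes it reports as added or removed.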
|Re: Riak behind a Load Balancer||Sean Cribbs||6/25/12 4:44 AM|
Another typical setup is to have each client node run its own haproxy. When Riak nodes are added or removed (not a common occurrence, mind you), a configuration management tool like Chef/Puppet/cfengine/etc. can adjust the config and signal the process to reload it (I think it's `kill -HUP`). Then your client code only ever needs to connect to localhost, and doesn't have to be reconfigured itself.
Sean Cribbs <se...@basho.com>
Basho Technologies, Inc.
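As a rough illustration of the config-management step described above, the Python sketch below renders an haproxy `listen` section from a list of Riak nodes. The section name `riak_pb`, the bind address, and the node addresses are assumptions made for the example; port 8087 is used on the assumption that clients talk to Riak over Protocol Buffers.

```python
from typing import Sequence


def render_riak_listen(
    nodes: Sequence[str],
    bind: str = "127.0.0.1:8087",  # assumed local address clients connect to
    name: str = "riak_pb",         # hypothetical section name
) -> str:
    """Return an haproxy 'listen' section that balances TCP across nodes."""
    lines = [
        f"listen {name}",
        f"    bind {bind}",
        "    mode tcp",
        "    balance roundrobin",
    ]
    # One 'server' line per Riak node, with health checks enabled.
    for index, address in enumerate(nodes, start=1):
        lines.append(f"    server riak{index} {address} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_riak_listen(["10.0.0.1:8087", "10.0.0.2:8087"]), end="")
```

A tool like Chef or Puppet would do the equivalent with its own templates, write the result to the haproxy config file, and then trigger the reload.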
|Re: Riak behind a Load Balancer||Anthony Molinaro||6/25/12 11:21 AM|
This is almost exactly what we do with our riak clusters (as well as
all our subclusters). It's actually nice for developers because when they
need to talk to a network service they just reference 127.0.0.1 and some
port, and haproxy gets them to the right place. This also allows an
operations team to make network changes without impacting applications
(via the use of the haproxy reload).
The only downside is that if your connections are long lived and you restart
your riak nodes, all requests will end up on the last node. And if you restart
haproxy they will all end up on the first node initially. I sort of wish
there was a configuration parameter to haproxy to randomize the start
point for the roundrobin so you could keep things a little more balanced
across many machines running haproxy.
Anthony Molinaro <anth...@alumni.caltech.edu>
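One possible workaround for the uneven start point, sketched in Python and not taken from the thread: when generating each machine's haproxy config, list the servers in a different order per host. This assumes that the order of the server entries determines where the roundrobin starts, which is what the behaviour described above suggests. The function name and the use of the hostname as a seed are illustrative choices.

```python
import random
from typing import List, Sequence


def server_order_for_host(nodes: Sequence[str], hostname: str) -> List[str]:
    """Return the nodes in an order that is shuffled but repeatable per host."""
    ordered = sorted(nodes)  # stable starting point regardless of input order
    # Seeding with the hostname keeps the order the same across config runs
    # on one machine, while different machines tend to get different orders.
    random.Random(hostname).shuffle(ordered)
    return ordered
```

Because the order is stable for a given hostname, regenerating the config does not cause needless changes or reloads.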
|Re: Riak behind a Load Balancer||Swinney, Austin||6/25/12 11:41 AM|
I use the Amazon Elastic Load Balancer (ELB) on my EC2 Riak cluster. I understand the concerns about LB failure, but for me, using Riak is largely about ease of use: easy to deploy, grow, shrink, replace, etc. Using the ELB allows me to do those things without consulting those who manage the services that use the cluster. And in the turbulent cloud space, if a node goes unresponsive, it just gets kicked out without any action on my part.
A few weeks ago, I had some episode where I replaced all the nodes in the riak cluster. It sucked for me (because I don't know what I'm doing!), but the service owner who actually uses the cluster came to my desk a week later and asked, "So how is everything going with the riak cluster?" He hadn't even noticed, nor had anyone else unless I told them about it. To me, that is pure ops gold.
Performance-wise, I get what I need out of it. Response times through the ELB are all pretty similar across the board. I never tested it without the ELB, so I am not aware of any performance hit from using the ELB.
My ELB health check settings:
If I didn't have ELB, I'd look at the zookeeper route.
|Re: Riak behind a Load Balancer||Michael Clemmons||6/25/12 12:01 PM|
If you're running multiple clients, use an LB on each and fail over if the local LB is down. -Michael
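A minimal Python sketch of that failover, assuming the client holds an ordered list of endpoints (local LB first, fallbacks after). The names `tcp_probe` and `pick_endpoint` are made up for this example; a real client library may offer its own mechanism.

```python
import socket
from typing import Callable, Optional, Sequence, Tuple

Endpoint = Tuple[str, int]


def tcp_probe(endpoint: Endpoint, timeout: float = 0.5) -> None:
    """Raise OSError if a TCP connection to the endpoint cannot be opened."""
    with socket.create_connection(endpoint, timeout=timeout):
        pass


def pick_endpoint(
    endpoints: Sequence[Endpoint],
    probe: Callable[[Endpoint], None] = tcp_probe,
) -> Optional[Endpoint]:
    """Return the first endpoint that accepts a connection, or None."""
    for endpoint in endpoints:
        try:
            probe(endpoint)
        except OSError:
            continue  # this LB is down or unreachable, try the next one
        return endpoint
    return None
```

Calling `pick_endpoint([("127.0.0.1", 8087), ("10.0.0.9", 8087)])` would prefer the local LB and fall back to the second address only when the local one refuses connections; both addresses are placeholders.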
|Re: Riak behind a Load Balancer||Eric Moritz||6/25/12 12:24 PM|
|RE: Riak behind a Load Balancer||ad...@forcecontent.com||6/26/12 7:10 PM|
Sorry for top posting you, but I run a riak cluster behind two separate virtual IPs on my load balancer (an F5 LTM), one on a private IP that I write to and one on a public IP that can be read from. I control that behavior with iRules, though I'm sure other LBs have their own mechanism for the same. It may be overkill but it was an easy way to control where reads/writes came from.
|RE: Riak behind a Load Balancer||Dave Greenstein||6/27/12 6:48 AM|
We're running smoothly behind nginx in a round robin config. One thing
to remember is eventual consistency... So, if you have two very quick
serial operations, one dependent on the results of the other, be sure
to make sure both requests are hitting the same node. Also, it was
necessary to bypass the lb for any streaming keys operation.
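One way to get two dependent operations onto the same node, sketched in Python under the assumption that the client can bypass the round robin and address nodes directly: derive the node from the key, so every request for that key goes to the same place. The function name is hypothetical, and the simple modulo mapping changes whenever the node list changes.

```python
import hashlib
from typing import Sequence


def node_for_key(key: str, nodes: Sequence[str]) -> str:
    """Pick a node deterministically from the key."""
    if not nodes:
        raise ValueError("nodes must not be empty")
    # Hash the key so that keys spread evenly over the node list.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]
```

A write followed immediately by a read of the same key would then both use `node_for_key(key, nodes)` instead of going through the load balancer.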
|Re: Riak behind a Load Balancer||Sean Carey||8/30/12 2:08 PM|
Haproxy is my load balancer of choice. You can always run multiple copies of haproxy and use some type of dynamic dns with it.
We do this in many cases. Haproxy scales well. I've seen a single node sustain multiple gigabits per second with almost no sweat.
|Re: Riak behind a Load Balancer||Dave Brady||8/31/12 1:56 AM|
There's a reference to an article on Basho's site, written by Amazon, about Dynamo:
Section 6.4 explains why they do not use load balancers.
The rest of the article is good reading, too.
|Re: Riak behind a Load Balancer||Guido Medina||8/31/12 2:07 AM|
We use HAProxy on a 4-node cluster for a J2EE application. Before using HAProxy we relied on the Java Riak cluster client config to handle the 4 nodes, but too many errors were generated. Once we switched to HAProxy and a single Java Riak client config, the errors were reduced dramatically.
I don't know how accurate an article can be, but my experience comes from transferring 10+ million records from PostgreSQL to Riak several times, running processes for 48 to 72 hours continuously, so IMHO, HAProxy + the new client (the one that comes with Protobuf 2.4.1+, not the old 2.3.0) is the way to go. To add more, we have autossh tunnels on the Java application server connected to each Riak node and, behind it, HAProxy.
With the new protobuf I have had the transfer running for the last 48 hours without a single error.
|Re: Riak behind a Load Balancer||Armon Dadgar||8/31/12 10:46 AM|
We have been running Riak behind HAProxy since day one, using the
Protocol Buffers interface. Hasn't ever been a problem, and makes cluster
changes transparent to the application.
|Re: Riak behind a Load Balancer||Sebastian Cohnen||9/8/12 7:09 AM|
AFAIK Riak does not expose these data via API in order to implement a "client-driven coordination", right?
This sounds quite interesting and would be a nice way to reduce latencies when talking to Riak.
|Re: Riak behind a Load Balancer||Sean Cribbs||9/8/12 9:02 AM|
There are plans to have client-driven request routing (that is, at
least sending the request to a member of the preflist) in the future,
but that is currently vaporware. An interim solution we have discussed
was to send the client a "hint" as to where to send a request for that
key on the next time around, but that is also unimplemented. On
well-performing client and Riak machines, the difference will be
small-to-negligible anyway; the primary effect of sending a request to
the member of the preflist would be reduced network traffic between
the Riak nodes themselves.
|Re: Riak behind a Load Balancer||Sebastian Cohnen||9/8/12 9:18 AM|
How do you explain these rather huge improvements Amazon presents in their Dynamo paper?
|Re: Riak behind a Load Balancer||Sean Cribbs||9/8/12 9:52 AM|
My point was mainly that the performance benefits vs. complexity
overhead you would get from client-side routing are sometimes not as
great as simply getting better hardware or more nodes. That is a big,
complex feature to implement on both sides of the connection.
Clearly, doing a local disk I/O to fetch the key, and then waiting for
only one replica from the network before replying is going to be more
efficient than going to the remote nodes for both replicas needed to
meet the default quorum. Additionally, requesting fewer remote
replicas would reduce the effects of TCP incast, should you be under
high load. On the other hand, if the value is not in the local disk
cache or a buffer maintained by the storage engine, you might very
well be seek-bound (in some engines multiple seeks), making the
network overhead negligible in comparison. For example, we have seen
several Riak users on AWS become bound by latency to the EBS volume
(even when in RAID), while the network is comparatively idle. Most
people also store small values in Riak, meaning a response from or
request to a replica will often fit inside a single MTU.
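The replica arithmetic in the paragraph above can be written out as a small Python sketch. It is a simplified model, assuming the usual defaults of three replicas (`n_val = 3`) and a read quorum of a majority; the function names are illustrative and not part of any Riak API.

```python
def quorum(n_val: int) -> int:
    """Majority of n_val replicas, e.g. 2 out of 3."""
    return n_val // 2 + 1


def remote_replies_needed(r: int, coordinator_holds_replica: bool) -> int:
    """Replies that must cross the network before the read can return."""
    # If the node handling the request is in the preflist, its own local
    # read counts toward r, so one fewer remote reply is needed.
    return r - 1 if coordinator_holds_replica else r


if __name__ == "__main__":
    r = quorum(3)
    print(r, remote_replies_needed(r, True), remote_replies_needed(r, False))
```

With three replicas this gives one remote reply when the request lands on a preflist member and two when it does not, which is the saving client-side routing would buy.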
This is certainly something we want to do in the future, but without
it being the greatest bottleneck, we'll focus our performance
improvements elsewhere in the meantime. There are still many gains to
be made in our storage engines and other core components.
|Re: Riak behind a Load Balancer||gibraltar||9/8/12 10:25 AM|
I wonder what would happen if one ran a Riak cluster on 5/10/15+ EC2 micro (or small) Linux machines behind Elastic Load Balancing. Do you think it would perform well enough for a web site with moderate traffic? The idea is to have many "small" machines rather than a couple of "big" machines.
Does anyone have any experience with something similar?
|Re: Riak behind a Load Balancer||Matt Black||9/9/12 5:57 PM|
I'm currently running six Riak nodes on EC2 small instances behind an ELB. This works fine for us - although when running a large map reduce task we connect directly to a single node rather than routing through the ELB. I don't have any actual performance statistics to hand, but I could get some if the list is interested.