We’re developing a riak-core application that has no persistence and works entirely in memory, and we are wondering what the best use cases are for testing riak-core and Erlang itself in large-scale deployments (>100 physical nodes).
For example, some map-reduce frameworks (like Hadoop) have performance tests such as TeraSort, which show to what extent the whole framework can be scaled.
So could you share some ideas on best practices for testing large-scale deployments of riak-core and Erlang applications? What synthetic tests and benchmarks can be run to answer the following questions:
1. Does the system scale well?
2. Can the system be considered linearly scalable?
3. Is the system truly fault-tolerant?
The information contained in this message may be privileged and confidential and protected from disclosure. If you are not the original intended recipient, you are hereby notified that any review, retransmission, dissemination, or other use of, or taking of any action in reliance upon, this information is prohibited. If you have received this communication in error, please notify the sender immediately by replying to this message and delete it from your computer. Thank you for your cooperation. Troika Dialog, Russia. If you need assistance please contact our Contact Center (+7495) 258 0500 or go to www.troika.ru/eng/Contacts/system.wbp
We (OpenX) have a riak_core based application that's running on a 125 node
cluster (there are also other smaller clusters). We never really tested to
see where it would fall over (and the cluster was much smaller when it
started), but I see no indicators that it will fall over when we add the
126th node. FWIW, it's running riak_core 0.13.0, and I assume the newer
versions of riak_core have only gotten better. Answers to some of your
other questions (based solely on my experience) in-line below.
On Fri, Sep 21, 2012 at 6:29 AM, Zhemzhitsky Sergey <
> We’re developing a riak-core application, that does not include any
> persistence and works in-memory, and are wondering what are the best use
> cases to test riak-core and erlang itself in large-scale deployments (>100
> physical nodes).
>
> For example some of the map-reduce frameworks (like hadoop) have
> performance tests like terasort, etc., which can show to what extent the
> whole framework can be scaled.
>
> So could you share some ideas what are the best practices to test
> large-scale deployments of riak-core and erlang applications? What
> synthetic tests and benchmarks can be executed to answer the following
> questions:
>
> 1. Does the system scale well?
Yes, so far it has scaled well.
> 2. Can the system be considered as linearly scalable?
Yes, the riak_core portion can be considered linearly scalable. The overall
behavior is largely dependent on what you're doing in your vnodes and how
well you hash the things you want distributed. In theory, if you hash
poorly you can get hot-spots that will prevent linear scalability, but I
haven't seen that happen with our workload.
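One way to look for such hot-spots before deploying is to replay a sample of real keys through the hash offline and compare per-partition load. The following is a minimal sketch, not riak_core's actual code: it assumes keys are hashed with SHA-1 onto a 2^160 ring split into equal partitions (which is how riak_core's consistent hashing is generally described), and the names `partition_for` and `imbalance`, the partition count of 64 and the sample keys are all made up for illustration.

```python
import hashlib
from collections import Counter

RING_SIZE = 2 ** 160  # size of the SHA-1 output space (assumed ring size)


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition index by hashing it onto the ring."""
    h = int.from_bytes(hashlib.sha1(key).digest(), "big")
    return h * num_partitions // RING_SIZE


def imbalance(keys, num_partitions: int) -> float:
    """Busiest partition's load divided by the mean load (1.0 = perfectly even)."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    mean = len(keys) / num_partitions
    return max(counts.values()) / mean


if __name__ == "__main__":
    partitions = 64  # hypothetical ring size for the experiment
    # High-cardinality keys: load should spread almost evenly.
    unique = [f"user-{i}".encode() for i in range(100_000)]
    # Low-cardinality keys: only four distinct values, so at most four
    # partitions receive any traffic at all.
    skewed = [f"region-{i % 4}".encode() for i in range(100_000)]
    print("unique keys imbalance:", imbalance(unique, partitions))
    print("skewed keys imbalance:", imbalance(skewed, partitions))
```

A ratio close to 1.0 suggests the key choice will not stand in the way of linear scaling; a ratio many times larger means a handful of vnodes would take most of the traffic no matter how many nodes are added.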
> 3. Is the system truly fault-tolerant?
For the most part, 'yes', but that again depends on how you implement your
vnode. The problems that I've encountered were due to my own inexperience
with Erlang when implementing my vnode.
In general I've been very happy with riak_core and we're definitely looking
at using it more for places where it's the right solution.
> We (OpenX) have a riak_core based application that's running on a 125 node cluster (there are also other smaller clusters). We never really tested to see where it would fall over (and the cluster was much smaller when it started), but I see no indicators that it will fall over when we add the 126th node. FWIW, it's running riak_core 0.13.0, and I assume the newer versions of riak_core have only gotten better. Answers to some of your other questions (based solely on my experience) in-line below.
However, as mentioned in the thread, a fully connected network of nodes (fully connected because distributed Erlang is used) has a natural scalability limit: each node has to keep exchanging ticks with every other node within the net tick time, and network speed bounds how many peers it can keep up with. You can always increase the net tick time, but then failures will take longer to detect.
So, your success may depend on your fault-tolerance requirements.
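To put rough numbers on that trade-off, here is a small back-of-the-envelope sketch. It assumes the behaviour described in the Erlang kernel documentation as I understand it: `net_ticktime` defaults to 60 seconds, and an unresponsive node is declared down somewhere between 75% and 125% of that value. The function names are made up for this example.

```python
from typing import Tuple


def mesh_connections(nodes: int) -> int:
    """Number of node-to-node links in a fully connected cluster."""
    return nodes * (nodes - 1) // 2


def detection_window(net_ticktime: float) -> Tuple[float, float]:
    """Earliest and latest time (seconds) at which a dead node is noticed,
    assuming the documented +/- 25% window around net_ticktime."""
    quarter = net_ticktime / 4
    return (net_ticktime - quarter, net_ticktime + quarter)


if __name__ == "__main__":
    for n in (125, 126, 250):
        print(n, "nodes ->", mesh_connections(n), "links")
    for tick in (60, 120):
        print("net_ticktime", tick, "-> detection window", detection_window(tick))
```

The link count grows quadratically with cluster size (7750 links at 125 nodes, 7875 at 126), while doubling the tick time doubles the window in which a dead node goes unnoticed, so the acceptable cluster size really does come down to how quickly failures must be detected.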
Best Regards,
Sergey Zhemzhitsky
Phone. +7 495 2580500 ext. 1246
From: Michael Truog [mailto:mjtr...@gmail.com]
Sent: Saturday, September 22, 2012 5:58 AM
To: Joel Meyer; Zhemzhitsky Sergey
Cc: riak-us...@lists.basho.com; erlang-questions
Subject: Re: [erlang-questions] Large scale deployments testing
On 09/21/2012 03:00 PM, Joel Meyer wrote:
Hi Sergey,
We (OpenX) have a riak_core based application that's running on a 125 node cluster (there are also other smaller clusters). We never really tested to see where it would fall over (and the cluster was much smaller when it started), but I see no indicators that it will fall over when we add the 126th node. FWIW, it's running riak_core 0.13.0, and I assume the newer versions of riak_core have only gotten better. Answers to some of your other questions (based solely on my experience) in-line below.