Message from discussion Project Voldemort vs Cassandra for write heavy apps
Sean  
From: Sean <sean.bigdata...@gmail.com>
Date: Mon, 31 Jan 2011 22:47:13 -0800 (PST)
Local: Tues, Feb 1 2011 1:47 am
Subject: Re: Project Voldemort vs Cassandra for write heavy apps
I see a vast performance difference in a published benchmark. Can
someone share their thoughts on it?

On Jan 31, 12:23 pm, Alex Feinberg <feinb...@gmail.com> wrote:

> Hi Sean,

> By default, Voldemort uses BerkeleyDB Java Edition as the storage
> engine. BerkeleyDB Java Edition actually uses a log-structured B+Tree,
> which is the same design principle as Log Structured Merge Trees used
> by SSTables in BigTable/Cassandra. If you'd like to learn more, I
> suggest reading the BerkeleyDB Java Edition Architecture white paper
> (http://www.oracle.com/go/?&Src=4945225&Act=7) from Oracle/Sleepycat.
> If you'd like to understand log structured systems in general, Mendel
> Rosenblum's Ph.D. dissertation is a good start: http://www.eecs.berkeley.edu/~brewer/cs262/LFS.pdf (the paper is about
> a file system, but in the grand scheme of things file systems and
> databases are remarkably similar).
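
To make the log-structured idea above concrete, here is a minimal sketch in Python. It is not how BerkeleyDB Java Edition or Cassandra is implemented; the class name AppendOnlyLog and its methods are hypothetical, and an in-memory buffer stands in for the on-disk log. It only illustrates the shared principle: writes always append, an index points at the latest record, and compaction reclaims space taken by overwritten records.

```python
import io
import struct


class AppendOnlyLog:
    """Toy log-structured key-value store (hypothetical, for illustration)."""

    HEADER = struct.Struct(">II")  # key length, value length

    def __init__(self):
        self.log = io.BytesIO()  # stands in for an on-disk log file
        self.index = {}          # key -> offset of its latest record

    def put(self, key: bytes, value: bytes) -> None:
        # Writes never update in place; they append a new record.
        offset = self.log.seek(0, io.SEEK_END)
        self.log.write(self.HEADER.pack(len(key), len(value)) + key + value)
        self.index[key] = offset

    def get(self, key: bytes) -> bytes:
        # Raises KeyError if the key was never written.
        self.log.seek(self.index[key])
        klen, vlen = self.HEADER.unpack(self.log.read(self.HEADER.size))
        self.log.seek(klen, io.SEEK_CUR)  # skip over the key bytes
        return self.log.read(vlen)

    def size(self) -> int:
        return self.log.seek(0, io.SEEK_END)

    def compact(self) -> None:
        # Copy only the live (latest) records into a fresh log.
        fresh = AppendOnlyLog()
        for key in list(self.index):
            fresh.put(key, self.get(key))
        self.log, self.index = fresh.log, fresh.index
```

Overwriting a key twice and then calling compact() shrinks the log while get() keeps returning the latest value. The cost of that rewrite is the "log compaction" overhead discussed further down.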

> In terms of actual benchmarks, here's one: http://blog.medallia.com/2010/05/choosing_a_keyvalue_storage_sy.html

> Both Voldemort and Cassandra are also supported by the YCSB (Yahoo!
> Cloud Serving Benchmark). We provide a slightly modified version of
> YCSB with Voldemort as the performance tool: https://github.com/voldemort/voldemort/wiki/Performance-Tool

> As far as I recall, writes are slightly faster in Cassandra and reads
> are slightly faster in Voldemort. At least with version 0.6 and
> earlier, I believe, out of the box, the performance impact of log
> compaction is somewhat less visible in Voldemort than in Cassandra (of
> course it entirely depends on your environment and configuration in
> both cases).

> HBase and Hypertable also use LSM trees and, in a normal scenario,
> also have very high write performance. I am not very familiar with
> Hypertable, but HBase is also able to do fast range scans. There are
> some very interesting applications that leverage the write
> performance and data model of HBase, e.g., OpenTSDB.
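
As a rough illustration of why LSM trees give both cheap writes and range scans, here is a small Python sketch. It is an assumption-laden toy, not HBase's or Hypertable's actual design: the TinyLSM class, its memtable_limit parameter, and the in-memory "runs" standing in for on-disk SSTables are all hypothetical.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: an in-memory memtable plus sorted immutable runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []  # newest first; each run is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write only touches the memtable; no sorted file is modified in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into a new sorted run (an SSTable stand-in).
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:  # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def scan(self, start, end):
        """Return sorted (key, value) pairs with start <= key < end."""
        merged = {}
        for run in reversed(self.runs):  # oldest first, so newer values overwrite
            lo = bisect.bisect_left(run, (start,))
            hi = bisect.bisect_left(run, (end,))
            merged.update(run[lo:hi])
        merged.update((k, v) for k, v in self.memtable.items() if start <= k < end)
        return sorted(merged.items())

    def compact(self):
        # Merge all runs into one, dropping overwritten values.
        merged = {}
        for run in reversed(self.runs):
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

Because every run is sorted, scan() can binary-search each run for the range boundaries. The more runs pile up, the more places a read has to check, which is why compaction matters for read latency.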

> The key difference between Dynamo and BigTable is the behaviour in a
> failure scenario: in the case of BigTable, when a node responsible for
> a partition goes down, there is a period when read and write
> availability is lost until another node takes over.  Using WAL
> shipping (which I believe is either supported now or may be supported by
> HBase in the future), it's possible to achieve high availability for
> reads, and there's ongoing work to minimize the "transition" period for
> a failed node down to a few seconds (presently, if I am correct, it's
> around 1-2 minutes?). Once this is done, it would mean that upon
> failure, you will see latency spikes as the clients retry writes until
> one succeeds.  The advantage of this design is the ability to do more atomic
> operations: e.g., to implement a counter in Voldemort, you have to use
> an "optimistic lock" with a vector clock (see the applyUpdate() method
> in the StoreClient interface), but this can be done atomically in HBase.
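
The optimistic-lock counter described above can be sketched as a read, modify, conditional-write loop. The Python below is only an illustration of that pattern, not Voldemort's API: VersionedStore, ObsoleteVersionError, and increment() are hypothetical names, and a single integer version stands in for a vector clock.

```python
class ObsoleteVersionError(Exception):
    """Raised when a write is based on a version that is no longer current."""


class VersionedStore:
    """Hypothetical in-memory store; an integer version replaces a vector clock."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        # Unknown keys read as version 0 with no value.
        return self._data.get(key, (0, None))

    def put(self, key, expected_version, value):
        current_version, _ = self.get(key)
        if current_version != expected_version:
            # Someone else wrote since we read: reject instead of overwriting.
            raise ObsoleteVersionError(key)
        self._data[key] = (expected_version + 1, value)


def increment(store, key, max_attempts=3):
    """Optimistically increment a counter, retrying on version conflicts."""
    for _ in range(max_attempts):
        version, value = store.get(key)
        try:
            store.put(key, version, (value or 0) + 1)
            return True
        except ObsoleteVersionError:
            continue  # re-read the latest version and try again
    return False
```

The increment is never atomic on the server here; correctness comes from the store rejecting stale writes and the client retrying. Under heavy contention on one key, clients may exhaust max_attempts, which is the trade-off against a server-side atomic increment.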

> Of course, keep in mind that I'm talking about the architecture here;
> the implementation details change.

> Thanks,
> - Alex

> On Sun, Jan 30, 2011 at 10:30 PM, Sean <sean.bigdata...@gmail.com> wrote:
> > People seem to agree that the Bigtable model (HBase/Hypertable)
> > is good for range queries, and the Dynamo model (Cassandra/Voldemort) is
> > good for writes. OK, let's start from this consensus:

> > For write-heavy apps, is there any benchmark comparing Project Voldemort
> > and Cassandra? I suppose the consistency model and DHT routing are
> > probably similar in these two systems. Does the performance have a lot to do
> > with the data node storage (BDB vs. SSTable)?

> > Is there any theoretical or empirical comparison? Or benchmark results?


