Is Couchbase suitable for a small-record write-intensive billion-record application?

yassen

Oct 23, 2012, 1:38:22 PM
to couc...@googlegroups.com

Hi everyone,

I am completely new to NoSQL solutions and to Couchbase (my apologies), though I have extensive experience with SQL RDBMSs.

We are considering a NoSQL DB deployment for a mission-critical application where we need to store several hundred million data records, each consisting of about 6 string fields with a total record length of 160 bytes. Each record has a unique key that seems suitable for hashing (a 20+ byte string, e.g. "cle01_tpls01_2105328884").

The application should be able to write several hundred new records per second, but it must first check whether the unique key already exists. A record is written only if the key is not there. If it is, the app needs to retrieve the whole record and return it to the client, and no write is done.

We need a cluster of at least 2-3 nodes, which must be able to grow easily if need be.

I need to know whether Couchbase would be suitable for such an application. Please advise, thank you!

Chad Kouse

Oct 23, 2012, 3:01:33 PM
to couc...@googlegroups.com
Well, shallowly, there is an "add" method in couchbase that you could use -- it only adds a record if one doesn't already exist.  You can check the result code to detect if it already existed or not.

From there it's just a "get" to retrieve the document
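As a rough illustration of that add-then-get flow, here is a minimal Python sketch. It does not use the Couchbase client: `InMemoryStore` and `write_if_absent` are hypothetical names, and the dictionary-backed store only imitates the "add fails if the key exists" behaviour described above.

```python
class InMemoryStore:
    """Hypothetical stand-in for a key-value bucket; NOT the Couchbase client API."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        """Store value only if key is absent; return True if it was stored."""
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def get(self, key):
        """Return the stored value for key (raises KeyError if missing)."""
        return self._data[key]


def write_if_absent(store, key, record):
    """Try to add the record; if the key already exists, fetch the stored one.

    Returns (created, record_in_store).
    """
    if store.add(key, record):
        return True, record
    return False, store.get(key)


store = InMemoryStore()
key = "cle01_tpls01_2105328884"
first = write_if_absent(store, key, {"field1": "a"})   # key is new: record is written
second = write_if_absent(store, key, {"field1": "b"})  # key exists: stored record is returned
```

With a real cluster the "add" would be a single server-side operation, so the only extra round trip is the "get" on the already-exists path.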

The part I'm not great at answering is how much space will be eaten up by metadata on such small keys.  Maybe one of the couchbase pros can do that.

-- 
Chad Kouse

Zeming Jin

Oct 23, 2012, 3:36:34 PM
to couc...@googlegroups.com
57-bytes for each entry is a good number to estimate metadata size.
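For a back-of-the-envelope sizing with that figure, something like the sketch below could be used. It rests on assumptions not confirmed in this thread: that the key is stored in addition to the 57 bytes, that each replica carries the same overhead, and that the workload is 1 billion records with 23-byte keys (the length of the example key) and 160-byte values. `estimate_bytes` is a hypothetical helper name.

```python
METADATA_BYTES_PER_ENTRY = 57  # estimate quoted in this thread


def estimate_bytes(n_records, key_bytes, value_bytes, replicas=0):
    """Return (metadata_bytes, value_bytes_total) across active and replica copies.

    Assumes the key is stored on top of the per-entry metadata and that
    replicas cost the same as the active copy (both are assumptions).
    """
    copies = 1 + replicas
    metadata = n_records * (METADATA_BYTES_PER_ENTRY + key_bytes) * copies
    values = n_records * value_bytes * copies
    return metadata, values


# 1 billion records, 23-byte keys, 160-byte values, 1 replica
metadata, values = estimate_bytes(1_000_000_000, 23, 160, replicas=1)
print(metadata / 1e9, "GB of metadata + keys")  # 160.0
print(values / 1e9, "GB of values")             # 320.0
```

Under these assumptions the per-entry overhead is half the size of the values themselves, which matters for records this small.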

Chad Kouse

Oct 23, 2012, 5:03:00 PM
to couc...@googlegroups.com
By the way, this sounds similar to our use case (see screenshot: http://grab.by/gZ0C). These 2 buckets both reside on the same 16 nodes, and we've been really happy with performance, although I'm not sure where those 2 disk fetches per second are creeping in from. Must investigate!

We also carry 1 replica copy so couchbase is handling over a billion data points for us (by the way our document sizes can be a lot larger than 160 bytes)

-- 
Chad Kouse

yassen

Oct 29, 2012, 8:35:21 AM
to couc...@googlegroups.com
Chad, Musician, thank you guys!

> 57-bytes for each entry is a good number to estimate metadata size

This sounds great; 57 bytes is really nice metadata overhead :)

> Well, shallowly, there is an "add" method in couchbase that you could use -- it only adds a record if one doesn't already exist.

That sounds very nice as well!

Can anyone tell me something about b-tree vs. hashing for the primary key? Do we have both options in Couchbase? How about different data stores for a similar use case? (Pointing me to the proper documentation sources would be great.)

Thank you!
Yassen