I am completely new (and ignorant) when it comes to NoSQL solutions and Couchbase (my apologies), though I have extensive experience with SQL RDBMSs.
We are considering a NoSQL DB deployment for a mission-critical application that needs to store several hundred million data records. Each record consists of about 6 string fields, and the total record length is 160 bytes. Each record has a unique key that seems suitable for hashing (a string of 20+ bytes, e.g. "cle01_tpls01_2105328884").
The application should be able to write several hundred new records per second, but it must first check whether the unique key already exists. A record is written only if its key is not already there. If it is, the app needs to retrieve the whole record and return it to the client, and nothing is written in this case.
We need a cluster of at least 2-3 nodes that can grow easily if need be. I need to know whether Couchbase would be suitable for such an application. Please advise, thank you!
Well, at a shallow level, there is an "add" method in Couchbase that you could use -- it only adds a record if one doesn't already exist. You can check the result code to detect whether the record already existed.
From there it's just a "get" to retrieve the document.
The part I'm not great at answering is how much space will be eaten up by metadata on such small keys. Maybe one of the Couchbase pros can answer that.
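To make the add-then-get flow concrete, here is a minimal sketch in Python. It does not use the Couchbase client: `InMemoryStore`, `KeyExistsError`, and `store_if_absent` are hypothetical names invented for illustration, and the store is a plain dictionary that only mimics "add fails if the key exists". With a real client you would map its "key already exists" result onto the same branch.

```python
class KeyExistsError(Exception):
    """Raised by the stand-in store when add() hits an existing key."""


class InMemoryStore:
    """Hypothetical stand-in for a bucket; NOT the Couchbase client API."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        # Succeeds only when the key is absent, like the "add" described above.
        if key in self._data:
            raise KeyExistsError(key)
        self._data[key] = value

    def get(self, key):
        return self._data[key]


def store_if_absent(store, key, record):
    """Try to add first; on a key collision, fetch the stored record instead.

    Returns (record_in_store, created).
    """
    try:
        store.add(key, record)
        return record, True
    except KeyExistsError:
        return store.get(key), False


store = InMemoryStore()
first = store_if_absent(store, "cle01_tpls01_2105328884", {"field1": "a"})
second = store_if_absent(store, "cle01_tpls01_2105328884", {"field1": "b"})
```

Attempting the add first, rather than doing a get and then a write, avoids a window in which two clients both see the key as missing. That only holds if the server performs the add atomically; the dictionary stand-in here is not thread-safe.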
On Tuesday, October 23, 2012 at 1:38 PM, yassen wrote:
> Hi everyone,
On Tue, Oct 23, 2012 at 2:01 PM, Chad Kouse <c...@tunewiki.com> wrote:
By the way, this sounds similar to our use case (see screenshot: http://grab.by/gZ0C). These 2 buckets both reside on the same 16 nodes, and we've been really happy with performance, although I'm not sure where those 2 disk fetches per second are creeping in from. Must investigate!
We also carry 1 replica copy, so Couchbase is handling over a billion data points for us (and our document sizes can be a lot larger than 160 bytes).
On Tuesday, October 23, 2012 at 3:36 PM, Zeming Jin wrote:
> 57 bytes per entry is a good number for estimating metadata size.
This sounds great; 57 bytes is really nice metadata overhead :)
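As a rough back-of-the-envelope check of what that overhead means at this scale, here is a small sketch. The 57-byte figure, the 160-byte record, and the 23-byte example key come from this thread; the record count of 300 million (one reading of "several hundred million") and the single replica are assumptions, and `estimate_bytes` is a hypothetical helper. It gives raw data volume only and is not a RAM or disk sizing for a real cluster.

```python
def estimate_bytes(records, key_bytes, value_bytes, metadata_bytes=57, replicas=1):
    """Raw bytes for keys + values + per-entry metadata across all copies."""
    per_record = key_bytes + value_bytes + metadata_bytes
    copies = 1 + replicas  # the active copy plus each replica
    return records * per_record * copies


key_bytes = len("cle01_tpls01_2105328884")  # example key from the question
total = estimate_bytes(
    records=300_000_000,  # assumed count, for illustration only
    key_bytes=key_bytes,
    value_bytes=160,
)
print(f"{total / 1e9:.0f} GB raw across the cluster")
```

Under these assumptions each entry is 240 bytes, so metadata accounts for a little under a quarter of the total, which is noticeable with records this small.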
> Well, shallowly, there is an "add" method in couchbase that you could use -- it only adds a record if one doesn't already exist.
That sounds very nice as well!
Can anyone tell me something about B-tree vs. hashing for the primary key? Do we have both options in Couchbase? And are there different data stores to consider for a similar use case? (Pointing me to the proper documentation sources would be great.)