Q limits scalability?

152 views
Skip to first unread message

ooi

unread,
Oct 11, 2012, 6:16:14 PM10/11/12
to bigcou...@googlegroups.com
BigCouch documentation recommends to overshard because you can't (currently) add shards later, and the number of shards limits your scalability.  You can only have up to Q x N nodes in a cluster, and they will have 1 shard each.  Adam@cloudant's talk on BigCounch says it makes sense to overprovision as long as there are file descriptors to handle it and uses 64 as an example. 

So I want to pick a really big value for Q, right?  Our system is expected to grow by TB daily, and exceed PB over 25 years.  I figure thousands of shards would be overkill for now, but let us continue to scale as scientists around the world try to read huge data sets from our multi-PB store.  But…

Our puny test system was created with only 256 shards (Q=256, N=3, W=R=2).  It crashed and burned on fairly small batch updates (<500KB) on a 3-node cluster (each with 3 cores, 6gb RAM).  I didn't expect Q to have such an impact on performance, but it really does.  As Q goes up, the time to do the bulk insert goes up dramatically:

Q    sec
4      6.4
8      7.7
16    10.8
32    16.9
64    37.0

When I started with BigCouch a couple months ago, I thought this would be able to scale from a couple of regular PCs to a sizable server farm as long as Q was big enough.  Now it looks like you either have to pick Q low enough to run on those regular PCs (but then you can't scale past a couple dozen nodes) or you need to start with a big HW investment to support Q high enough to scale into petabytes.

Am I missing something?  Has anyone else run into this?  Why would it take so much more CPU?  Has anyone worked around this?

And sorry adam@cloudant, this isn't limited by file descriptors.  Tried upping that limit too...

Idan Shechter

unread,
Nov 11, 2012, 2:34:46 AM11/11/12
to bigcou...@googlegroups.com
I am interested to hear any opinion about it too.

Michael Miller

unread,
Nov 11, 2012, 12:42:24 PM11/11/12
to bigcou...@googlegroups.com, bigcou...@googlegroups.com
Hi, this is a fair question and worthy of a longer response than I can hack out on a phone. I'll try to turn around something more helpful later today, but in the short term:

1) you are correct.  The optimal value for q depends on both DB scale and the resources available. 

2) while bigcouch doesn't support automatic resharding, it is standard practice to manually reshard as you grow. Manual here means creating a new database and replicating, so it's not that brutal, nor that manual. 

3) bigcouch is built to support both a single massive database or millions of small databases. In the latter case it is routine to have a node count far exceeding n*q. 

I'll try to add more specifics later. 

Thanks, Mike Miller (Cloudant)
Reply all
Reply to author
Forward
0 new messages