Bigtable unsuitable for smaller datasets?


Igor Clark

Nov 23, 2016, 12:49:32 PM
to Google Cloud Bigtable Discuss
Hi there,

I'm weighing up GCP database options and it seems like Bigtable is the best fit for what I want to do - the data is fairly simple, fits the column model well, needs high throughput and has no complex relationships, and I don't need ACID and don't need/want SQL.

However, it's not going to be a huge amount of data - probably only in the order of several hundred thousand or maybe low millions of rows - and the Bigtable docs say "it is not a good solution for less than 1 TB of data".

Why is that? Just pricing? Or is there some performance factor that makes it only really come into its own when you start to throw lots of data at it? Or a penalty when you don't have enough data in there to get some aspect warmed up, or something like that?

Thanks,
Igor

Douglas Mcerlean

Nov 23, 2016, 1:25:25 PM
to Google Cloud Bigtable Discuss, Misha Brukman
You've got the general idea. If you only put in, say, a few GB, Bigtable will only produce a handful of shards in the steady state, which causes two main problems:
 * Rebalancing the data over your CBT nodes isn't as effective, as the sharding is coarse
 * A transient issue with a single shard will cause a large fraction of your table to be temporarily unavailable
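
To put rough numbers on the two problems above, here is a minimal back-of-the-envelope sketch. It assumes equal-sized shards spread as evenly as possible over the nodes; the function name and the shard counts are hypothetical and purely illustrative, not how Bigtable actually sizes or places shards.

```python
import math


def shard_impact(num_shards: int, num_nodes: int = 3) -> dict:
    """Rough arithmetic for a table split into equal-sized shards.

    Assumption (not from Bigtable itself): shards are equal in size and
    load, and are distributed over the nodes as evenly as possible.
    """
    if num_shards < 1 or num_nodes < 1:
        raise ValueError("num_shards and num_nodes must be positive")

    # One shard with a transient issue takes out this fraction of the table.
    unavailable_fraction = 1 / num_shards

    # The busiest node holds ceil(shards / nodes) shards; a perfectly even
    # split would give each node shards / nodes. A ratio of 1.0 means an
    # even balance is possible; higher means one node is overloaded.
    busiest_node_shards = math.ceil(num_shards / num_nodes)
    busiest_node_load_vs_ideal = busiest_node_shards * num_nodes / num_shards

    return {
        "unavailable_fraction": unavailable_fraction,
        "busiest_node_shards": busiest_node_shards,
        "busiest_node_load_vs_ideal": busiest_node_load_vs_ideal,
    }


# Compare a hypothetical small table (4 shards) with a larger one (300 shards).
for shards in (4, 300):
    print(shards, shard_impact(shards))
```

With only a few shards, a single shard is a large slice of the table and the shards cannot be divided evenly over three nodes; with hundreds of shards both effects shrink.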

Bigtable also adjusts the sharding in response to your access patterns, so if you expect to be constantly throwing high throughput at your tables (enough to stress the minimum 3-node footprint, at least) things won't be quite so dire. However, if you're talking about throughput that could be supported by a single server for the foreseeable future, you probably want to consider a cheaper and simpler system. Bigtable also reduces the number of shards after a while if there's no longer any load, so intermittent batch jobs over small data are likely to encounter reduced performance upon starting up again.

Feel free to contact our PM (Misha, cc'd) or me off-list with more details about your use case, if you still think Bigtable may be a viable option for you.


Igor Clark

Nov 23, 2016, 8:54:52 PM
to Google Cloud Bigtable Discuss, mbru...@google.com
Got it. Super helpful. Thank you!