Comparison of various data stores (SQL or NoSQL) with emphasis on scalability/benchmarks

Gitted

Jun 24, 2016, 7:40:30 PM
to Google App Engine
Hello,

Is there a chart that compares the various data stores that App Engine offers, with a focus on scaling/benchmarks?

If my app will require hundreds of writes per second, which store can I use?
What if it requires 2-3K writes per second?


Thanks.

Nick (Cloud Platform Support)

Jun 27, 2016, 3:33:24 PM
to Google App Engine
Hey Gitted,

These are excellent questions. I'll do my best to answer you:

Users and third parties have published numerous benchmarks, which you can find through Google searches. As far as I know, we don't host any benchmark data of our own. But that doesn't mean it's impossible to estimate the capacity of a service. We publish Service-Level Agreement information for most services, and the pricing/quotas information for a given service can also give an indication - you can use this information to roughly determine what each option is capable of handling.

I'll try to list each of the main storage options and discuss the general outlook on scaling for them. To determine whether a service can handle hundreds or thousands of queries / inserts / etc. per second, you'll need to consider any rate-limiting quotas in effect and, where scaling is managed by the user (as for a Cloud SQL replica pool or BigTable cluster), the state of the system as provisioned.

1. Datastore

Scaling: Automatic. In theory* you will never exhaust the system as long as your billing account continues to work. You may want to contact us if you're planning any critical, extremely large workloads (especially since by default an overall limit of 100 million requests per day applies), even just to get some best-practices advice, but in general this service is highly scalable, effectively* unlimited. You can read about all this in the documentation [1] [2].
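To relate the default daily limit mentioned above to the per-second rates in the question, here is a rough back-of-the-envelope sketch. It assumes the load is sustained evenly for a full 24 hours and that each write counts as exactly one request against the limit; both are simplifying assumptions, not statements about how the quota is actually metered, and the function names are made up for illustration.

```python
# Back-of-the-envelope check: sustained per-second rate vs. a daily request limit.
# Assumption: one write == one request, load spread evenly over 24 hours.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Default overall limit quoted in the post above.
DAILY_LIMIT = 100_000_000


def average_rate_per_second(daily_limit: int) -> float:
    """Average requests per second that would use up the daily limit exactly."""
    return daily_limit / SECONDS_PER_DAY


def daily_total(rate_per_second: float) -> float:
    """Total requests per day at a constant per-second rate."""
    return rate_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"Limit as an average rate: {average_rate_per_second(DAILY_LIMIT):.0f} req/s")
    for rate in (100, 2000, 3000):
        total = daily_total(rate)
        verdict = "within" if total <= DAILY_LIMIT else "over"
        print(f"{rate} writes/s -> {total:,.0f} requests/day ({verdict} the default limit)")
```

Under these assumptions, a steady 100 writes per second stays well below the default limit, while a steady 2-3K writes per second would exceed it, which is one reason to get in touch before running a workload of that size.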

Benchmarking: In addition to reading the benchmarks that are out there, you can run your own. You'll want to write tests that exercise the main operations: put, get, query, multi-put, multi-get.
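As a starting point for such tests, here is a minimal timing harness. It is only a sketch: `benchmark`, `fake_put` and `fake_get` are hypothetical names, and the in-memory dict merely stands in for a real store so that the example runs anywhere. To measure a real service you would replace the two stand-in functions with calls to whichever client library you use.

```python
# Minimal throughput/latency harness. The dict-backed "store" is a stand-in;
# swap fake_put/fake_get for real client calls to benchmark an actual service.
import time
from statistics import median


def benchmark(op, n_ops):
    """Call op(i) for i in range(n_ops); return throughput and latency stats."""
    latencies = []
    start = time.perf_counter()
    for i in range(n_ops):
        t0 = time.perf_counter()
        op(i)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "ops": n_ops,
        "ops_per_second": n_ops / elapsed,
        "median_latency_s": median(latencies),
        "max_latency_s": max(latencies),
    }


# Hypothetical stand-in store (NOT a real Datastore client).
fake_store = {}


def fake_put(i):
    fake_store[f"key-{i}"] = {"value": i}


def fake_get(i):
    return fake_store.get(f"key-{i}")


put_stats = benchmark(fake_put, 10_000)
get_stats = benchmark(fake_get, 10_000)

if __name__ == "__main__":
    print("put:", put_stats)
    print("get:", get_stats)
```

This version is single-threaded, so against a remote service it mostly measures round-trip latency. To approach hundreds or thousands of writes per second you would typically run many such loops concurrently and add up their throughput.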

2. Memcache

Scaling: Not applicable. While there is no guaranteed capacity for shared memcache, dedicated memcache has from 1 to 100 GB of capacity. You can contact cloud-acc...@google.com to request more capacity.

Benchmarking: The documentation [3] provides information on the above and more: it describes the basic operations (set/get), their rate limits, and advice for scaling. As in all other cases, benchmarking is as much an art as a science, and the distribution over the key space that you choose for your tests will affect the observed performance. Reading the docs is highly recommended, as for all services.
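To illustrate the point about key distribution, here is a small sketch that generates two test workloads over the same key space: one uniform, and one skewed so that a few "hot" keys receive most of the traffic. The 1/rank weighting is just one convenient assumption for producing skew, and the function names and sizes are made up for illustration.

```python
# Two synthetic key workloads for a cache benchmark: uniform vs. skewed.
# The 1/rank weighting is an illustrative choice, not a model of real traffic.
import random
from collections import Counter


def uniform_keys(rng, key_space, n):
    """Every key in the key space is equally likely."""
    return [f"key-{rng.randrange(key_space)}" for _ in range(n)]


def skewed_keys(rng, key_space, n):
    """Key at rank r is drawn with weight 1/(r+1), so low ranks are 'hot'."""
    weights = [1.0 / (rank + 1) for rank in range(key_space)]
    picks = rng.choices(range(key_space), weights=weights, k=n)
    return [f"key-{i}" for i in picks]


def hottest_share(keys):
    """Fraction of all requests that hit the single most popular key."""
    return Counter(keys).most_common(1)[0][1] / len(keys)


rng = random.Random(42)  # fixed seed so that runs are repeatable
uniform = uniform_keys(rng, key_space=1000, n=10_000)
skewed = skewed_keys(rng, key_space=1000, n=10_000)

if __name__ == "__main__":
    print(f"uniform: hottest key gets {hottest_share(uniform):.1%} of requests")
    print(f"skewed:  hottest key gets {hottest_share(skewed):.1%} of requests")
```

Running the same set/get test against both workloads gives a feel for how much hot keys change the numbers you observe.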

3. Cloud SQL 

Scaling: The scaling properties of your Cloud SQL database will be determined by your own choices in replication and sharding, two topics important to MySQL servers in general. In many ways, Cloud SQL acts like a thin layer of management and provisioning, with API access and integration with the Cloud Platform, on top of a MySQL server. You can read about replication in the docs [4], and a Google search will bring up great general resources on scaling MySQL databases.

Benchmarking: MySQL benchmarking is a huge topic. There are many great frameworks and libraries designed to help get a MySQL benchmark running, whatever infrastructure the database is running on, and to perform fairly exhaustive tests across many kinds of queries.

4. BigTable

Scaling: Determined by your own monitoring [5] and provisioning [6] of nodes in the cluster. You should also take a look at the Quota documentation [7].

Benchmarking: The methodology is very similar to that for memcache: you'll want to simply test get/set operations.

5. BigQuery

Scaling: An extremely large amount of data can be stored in BigQuery. I suggest reading the "Resources" section of the docs to see which rate-limit quotas exist [8].

Benchmarking: As for the other services, there are benchmarks available online, probably more for BigQuery than for many other services. You can of course also run your own tests using a relatively simple setup of a few instances that run batched inserts, queries, etc.

***

I hope this information has been helpful. Feel free to let me know if you have any further questions; I'll be happy to assist.

Cheers,

Nick
Cloud Platform Community Support

