Data storage service recommendation (cost vs query latency) when using large time series dataset?


Floris Calkoen

Oct 18, 2020, 4:07:08 PM
to Google App Engine
I'm designing a small Flask web application/dashboard on Google App Engine that will present the evolution of some climate data over time at a global level.

The dataset consists of approx. 1.5 million time series, one per sensor. Each time series has around 50 observations, so the dataset has approx. 75 million rows in total.

The end user should be able to explore all observations from a given sensor. Such a query would therefore have to return all observations matching the sensor ID provided by the end user.

Unfortunately, the budget for this application is small. I don't expect many users (perhaps a few per day), but query latency should nevertheless be as low as possible.

Considering all these constraints, which Google data storage service would be optimal? Just Cloud Bigtable? Or would that be relatively expensive compared to, for example, Cloud SQL?
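Whichever backend is chosen, the access pattern described above is a single indexed lookup by sensor ID. Here is a minimal sketch of that pattern, using SQLite as a local stand-in for Cloud SQL; the table and column names (observations, sensor_id, ts, value) are illustrative assumptions, not part of the actual schema:

```python
# Sketch of the access pattern: fetch all observations for one sensor.
# SQLite stands in for Cloud SQL; names are illustrative assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (sensor_id TEXT, ts TEXT, value REAL)")
# An index on sensor_id keeps the per-sensor lookup fast even at ~75M rows.
conn.execute("CREATE INDEX idx_sensor ON observations (sensor_id)")

# Two toy sensors with a few observations each.
rows = [("sensor-001", f"2020-{m:02d}-01", 20.0 + m) for m in range(1, 4)]
rows += [("sensor-002", f"2020-{m:02d}-01", 10.0 + m) for m in range(1, 4)]
conn.executemany("INSERT INTO observations VALUES (?, ?, ?)", rows)

# The query the dashboard would issue for a user-selected sensor.
result = conn.execute(
    "SELECT ts, value FROM observations WHERE sensor_id = ? ORDER BY ts",
    ("sensor-001",),
).fetchall()
print(result)  # → [('2020-01-01', 21.0), ('2020-02-01', 22.0), ('2020-03-01', 23.0)]
```

Since every user query filters on the same single column, any store that can index (or key) on sensor ID will serve this pattern; the trade-off in the question is mostly about cost at this data volume rather than query shape.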

George (Cloud Platform Support)

Oct 20, 2020, 11:16:20 AM
to Google App Engine
You may consider BigQuery, depending on how you plan to handle your data. A small example is provided on the "Quickstart: Using client libraries" page [1].

This discussion group is oriented more toward general opinions, trends, and issues touching App Engine. For coding and program architecture, as well as decisions on how best to handle large datasets, you may be better served in dedicated forums such as Stack Overflow, where experienced programmers are within reach and ready to help.
