I was planning to use the Firestore database for one of my applications, which deals with a data set of 30 to 40 million records. Certain queries run over result sets of about 100,000 records, roughly 50MB in size.
I have tried running these queries inside Cloud Functions and in an Android application, and it looks like performing such operations is tedious in either place. Here are my observations:
Cloud Functions:
- Querying a single record from the Firestore database takes nearly 500ms, which I think is a little on the high side, but one can live with it.
- The bigger problem is the handling of large amounts of data. Most of the time, my Cloud Function that deals with a large amount of data times out. Needless to say, it also takes a considerable amount of time.
- I had a simple query on a single field ("area-code") that would return 33K records. When I executed it, Firestore took nearly 20 seconds, and most of the time the Cloud Function timed out while performing this operation. This means I cannot have a Cloud Function that deals with a large volume of data. One more point to note is that I have only 33K records in my database.
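For reference, the alternative to fetching all 33K records in one call is to read them page by page. Below is a minimal sketch of that loop, not what I am running today. `Page`, `FetchPage` and `forEachPage` are hypothetical names made up for illustration, and `fetchPage` stands in for whatever actually talks to Firestore. I have not measured whether this avoids the timeout.

```typescript
// Hypothetical page shape: the records in this page plus a cursor to resume from.
interface Page<T, C> {
  items: T[];
  nextCursor: C | null; // null means there are no more pages
}

// Hypothetical callback: fetches at most `limit` records after `cursor`.
type FetchPage<T, C> = (cursor: C | null, limit: number) => Promise<Page<T, C>>;

// Walks the whole result set one page at a time and hands each page to `onPage`,
// so the full result set never has to be held in memory at once.
// Returns the total number of records seen.
async function forEachPage<T, C>(
  fetchPage: FetchPage<T, C>,
  pageSize: number,
  onPage: (items: T[]) => Promise<void> | void,
): Promise<number> {
  let cursor: C | null = null;
  let total = 0;
  while (true) {
    const page: Page<T, C> = await fetchPage(cursor, pageSize);
    if (page.items.length > 0) {
      await onPage(page.items);
      total += page.items.length;
    }
    if (page.nextCursor === null) {
      break; // last page reached
    }
    cursor = page.nextCursor;
  }
  return total;
}
```

With the Firestore SDK, I would expect `fetchPage` to be an ordered query on "area-code" combined with `startAfter(lastSnapshot)` and `limit(pageSize)`, with the last document snapshot of each page serving as the cursor. Whether the total time across pages stays within the function timeout is a separate question.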
Android Application:
- The results are so frustrating that they are hardly worth mentioning here. Did I say the queries take many minutes?
Considering the various experiments I have done so far, in my opinion Firestore could be a good database for use cases where the application may hold a large amount of data but does not need to run queries over result sets larger than a few thousand records (fewer than 10K).
For example, there are 26K restaurants in New York City (https://www.quora.com/How-many-restaurants-are-there-in-New-York-City). If I just want to store the locations of these restaurants and make them available offline to a mobile customer, it becomes a herculean task. One can ask why someone would want to know all 26K restaurants in the city. Well, maybe because I am a taxman and want to know the location of each restaurant that has not paid property tax this year, and I want to see those details even when my data plan has run out or I am on a choppy network. Is that not one of the use cases Firebase promotes (offline sync)? Now suppose that, at the same time, health inspectors are also visiting these restaurants and giving grades on different parameters. A consolidated report of restaurants by grade, plotted on a map, will be tough to produce if I do not already have some master data available on my device. My actual use case involves about four times as many records as there are restaurants in New York.
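To make the consolidated report concrete, here is a minimal sketch of what I would like to do on the device, assuming the restaurant master data is already available locally. The record shapes and field names (`Restaurant`, `Inspection`, `restaurantId`, `grade`) are assumptions for illustration, not my real schema.

```typescript
// Hypothetical record shapes; field names are assumptions for illustration.
interface Restaurant {
  id: string;
  lat: number;
  lng: number;
}

interface Inspection {
  restaurantId: string;
  grade: string;
}

interface MapPoint {
  lat: number;
  lng: number;
}

// Joins inspection grades against locally held restaurant master data and
// groups the map coordinates by grade. Inspections whose restaurant is missing
// from the local master data are skipped, since they cannot be plotted.
function pointsByGrade(
  restaurants: Restaurant[],
  inspections: Inspection[],
): Map<string, MapPoint[]> {
  const byId = new Map<string, Restaurant>();
  for (const r of restaurants) {
    byId.set(r.id, r);
  }
  const result = new Map<string, MapPoint[]>();
  for (const insp of inspections) {
    const r = byId.get(insp.restaurantId);
    if (r === undefined) {
      continue; // no location known locally for this restaurant
    }
    const bucket = result.get(insp.grade) ?? [];
    bucket.push({ lat: r.lat, lng: r.lng });
    result.set(insp.grade, bucket);
  }
  return result;
}
```

The join itself is cheap once the data is local. The hard part is getting all of the master records onto the device in the first place.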
Any ideas for a solution would be welcome.