Firebase Query: Get Multiple Documents in "One Round Trip" with Python


Alessandra Bartolo

Jan 21, 2022, 8:15:55 PM
to Firebase Google Group
Hi there everyone!

My name is Alessandra, and I am working on a project where we want to develop a Cloud Function, triggered by HTTP, that will receive a file (.csv or JSON) containing multiple document IDs to be looked up in a Firestore database, using Python.

As the 'IN' option supports only up to 10 comparison values at a time, we cannot use it.
We've heard there is a method, Firestore.getAll(), but it is only available for Java, isn't it?

We are wondering whether we can load the file of document IDs into a collection in Firestore, match it against the collection we want to search, and retrieve the document details. Is that possible?

Or maybe there are other ways to do what we need. We just don't want to read each document in a loop, since that would be slower and more expensive.

Do you know if there is a workaround for the Firebase Query "IN" limit of 10?
Has anyone already faced this same problem? If so, how do you suggest we deal with it?

I will be truly grateful for your help.
Regards.


Kato Richardson

Jan 24, 2022, 11:24:53 AM
to Firebase Google Group
Hi Alessandra,

Assuming the total number of documents is fairly small (fewer than a few hundred), fetching individual documents or using multiple IN calls will work fine. There is no cost benefit to fetching them in a single query, since each document read is billed either way, although fewer round trips does give some efficiency gain. Note that you don't have to read each document sequentially, of course. You can make multiple calls in parallel.
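To illustrate the "multiple calls in parallel" idea, here is a minimal Python sketch, not a definitive implementation. The names `chunked`, `fetch_in_parallel`, and `make_firestore_fetcher` are hypothetical helpers invented for this example. The Firestore part assumes the `google-cloud-firestore` package and uses its `Client.get_all()` method, which takes a list of document references. The default chunk size of 10 simply mirrors the IN limit discussed above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterator, List


def chunked(ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each with at most `size` items."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_in_parallel(
    ids: List[str],
    fetch_chunk: Callable[[List[str]], Dict[str, dict]],
    chunk_size: int = 10,
    max_workers: int = 8,
) -> Dict[str, dict]:
    """Run `fetch_chunk` on each slice concurrently and merge the results."""
    results: Dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for partial in pool.map(fetch_chunk, chunked(ids, chunk_size)):
            results.update(partial)
    return results


def make_firestore_fetcher(client, collection_name: str):
    """Build a `fetch_chunk` callable that looks documents up by ID.

    Assumption: `client` is a google.cloud.firestore.Client instance.
    """
    def fetch_chunk(chunk: List[str]) -> Dict[str, dict]:
        refs = [client.collection(collection_name).document(i) for i in chunk]
        # get_all() yields one snapshot per reference; IDs with no
        # document come back with exists == False and are skipped here.
        return {s.id: s.to_dict() for s in client.get_all(refs) if s.exists}
    return fetch_chunk
```

With a real client this might be called as `fetch_in_parallel(ids, make_firestore_fetcher(client, "people"))`, where `"people"` is a placeholder collection name. Because `get_all()` is not bound by the IN limit, a larger `chunk_size` may be reasonable there; every document returned is still billed as one read.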

Given that this is an XY problem, perhaps you should include the use case? There might be better solutions. For example, presumably whatever the client is viewing that they want to export as a CSV is a list of some sort, generated from some query/filter criteria (e.g. recent changes or something)? Perhaps that could be reconstituted on the server instead of individual keys? If not, then there is probably nothing better here.

☼, Kato

--

Kato Richardson | Developer Programs Eng | kato...@google.com | 775-235-8398

Arthur Thompson

Jan 24, 2022, 1:16:09 PM
to Firebase Google Group
Hi Alessandra,

Would it be possible to add a field to the documents that will allow you to query using a filter? It seems similar to keeping a list of document IDs but might be easier to manage.

Arthur.

Alessandra Bartolo

Jan 24, 2022, 7:08:48 PM
to Firebase Google Group
Hi Kato,

Thanks for your answer. You are right, I left out some information when I submitted my question. The use case here is a Data Marketplace. We are going to upload sensitive data about people and companies from different providers and sell it to companies interested in that information, to complement their own data.

So the amount of data to be filtered/read is in the millions of documents at a time. The source data is not updated very frequently (at most once every 6 months), so we are not worried about recent changes.

We initially chose Firestore for its read speed and scalability. The key of each document will be the National Identity Document (DNI) of a person/company. The read process would be done in an HTTP-triggered Cloud Function that reads the Firestore documents chosen by the client.

Is Firestore really suitable for our use case? Or maybe we should think about another Google Cloud Platform service?

What would you recommend?

Regards,
Alessandra.

Kato Richardson

Jan 25, 2022, 11:14:51 AM
to Firebase Google Group
Given that you're going to read millions of documents at a time, based on individual keys, this isn't a solid fit for Firestore; Firebase is focused on mobile apps and responsive/realtime updates, which aren't really the point here. 

I'm not sure querying a database on millions of keys is a great fit for any database model. So maybe some sort of intermediary/processing queue that receives the list of ids and builds out the file is a more appropriate solution. But this isn't my area of expertise of course.
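As a rough illustration of the "receives the list of ids and builds out the file" step, here is a small Python sketch using only the standard library. It is not a queue itself, only the batch job a worker behind such a queue might run. The function names, the one-column CSV layout for the ID file, and the `lookup` callable (which stands in for whatever backing store is chosen) are all assumptions made for this example.

```python
import csv
from itertools import islice
from typing import Callable, Dict, Iterator, List, Sequence, TextIO


def read_id_batches(id_file: TextIO, batch_size: int) -> Iterator[List[str]]:
    """Stream IDs from a one-column CSV in lists of up to `batch_size`."""
    ids = (row[0].strip() for row in csv.reader(id_file)
           if row and row[0].strip())
    while True:
        batch = list(islice(ids, batch_size))
        if not batch:
            return
        yield batch


def build_export(
    id_file: TextIO,
    out_file: TextIO,
    lookup: Callable[[List[str]], Dict[str, dict]],
    fields: Sequence[str],
    batch_size: int = 500,
) -> int:
    """Look up each batch of IDs and append the matches to a CSV file.

    Returns the number of rows written. IDs that `lookup` does not
    return are skipped, so the whole ID list never sits in memory.
    """
    writer = csv.DictWriter(out_file, fieldnames=["id", *fields],
                            extrasaction="ignore")
    writer.writeheader()
    written = 0
    for batch in read_id_batches(id_file, batch_size):
        for doc_id, record in lookup(batch).items():
            writer.writerow({"id": doc_id, **record})
            written += 1
    return written
```

The point of the design is that the ID list is consumed as a stream and the output file grows incrementally, so the same loop works whether `lookup` is backed by a key-value read or a bulk query against some other store.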

Might be worth reaching out to Google Cloud Sales and seeing what they would recommend. I'm guessing some combination of Spanner or BigQuery with Cloud Run would be ideal.

☼, Kato


