Which is the best way to delete bulk data in Firestore?

Diros Apps

Feb 13, 2021, 12:09:33 PM
to Firebase Google Group

Hello,

I have been working with Firestore and I need to find out which of the following ways to delete unnecessary data is better in terms of cost:

- The first option is to recursively delete all documents from the collections until they're empty. I tried this for a day and found that it incurs both entity read and entity delete costs in Firestore. Given the amount of data, it would take a very long time to delete everything I need to erase, and it doesn't seem to lower the storage quota usage at all.

- The second option is to create a new Firestore DB, migrate specific data to it, populate it with user data from that point onwards, and after a month delete the old Firestore DB with a CLI command. Migrating the data would take a while, and I don't know how much it would cost to run firebase firestore:delete --all-collections. I assume the cost scales with the volume of data deleted, but I'm unsure whether it only incurs entity delete costs or also entity read costs, like the first option.

Does someone have information regarding this? I would really appreciate it.

Thanks for your time and attention.

Mauricio Walters

Feb 14, 2021, 7:44:47 PM
to Firebase Google Group
The Firebase CLI uses the Firestore REST API to delete documents. It looks like it still executes a read for every doc that it deletes.
The docs actually suggest using the Firebase CLI from a Cloud Function to delete collections recursively.

I don't think Firebase includes a way of avoiding the read costs when deleting collections that way. The problem is that you can't know which documents a collection contains unless you read them all.

If you plan on deleting large amounts of data often, you could store the path to every document you want to delete in Redis. When you are ready to delete documents of a certain age, you would fetch those paths and compose a DocumentReference for each one. Firestore.batch would be the best way to delete them in this case, as it only requires a ref to the document.
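As a rough illustration of the stored-paths idea, here is a minimal Python sketch. It assumes the paths have already been fetched from wherever they are kept (Redis in the suggestion above) and that `client` behaves like a `google.cloud.firestore.Client`, i.e. it offers `batch()` and `document(path)`. The helper names `chunked` and `delete_by_paths` are made up for this example, and the 500-operation batch size reflects the documented batch limit around the time of this thread, so treat it as an assumption to verify.

```python
from typing import Iterable, Iterator, List

# Assumption: 500 writes per batch was the documented Firestore limit
# around the time of this thread; check the current quota before relying on it.
MAX_BATCH_SIZE = 500


def chunked(paths: Iterable[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` paths."""
    chunk: List[str] = []
    for path in paths:
        chunk.append(path)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def delete_by_paths(client, paths: Iterable[str]) -> int:
    """Delete documents by known path without reading them first.

    `client` is expected to look like google.cloud.firestore.Client.
    Returns the number of delete operations committed.
    """
    deleted = 0
    for chunk in chunked(paths):
        batch = client.batch()
        for path in chunk:
            # Building a reference from a path does not read the document.
            batch.delete(client.document(path))
        batch.commit()
        deleted += len(chunk)
    return deleted
```

Because only references are built, this route should be billed for deletes but not for document reads, at the price of maintaining the path list elsewhere.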

Sam Stern

Feb 15, 2021, 5:38:54 AM
to Firebase Google Group
Hi there,

In order to delete a document you have to know its path. In almost all cases this means you first have to read the document to delete it. The only exceptions would be if you can "guess" the path (doc 1, doc 2, etc.) or if you have stored a list of the paths somewhere else (which would be unusual). So yes, if you want to delete 1M documents using the Firebase CLI, you will pay for 1M reads and 1M deletes. There is no more efficient way to do this right now.
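To make the "1M reads and 1M deletes" arithmetic concrete, here is a small Python sketch. The function name `estimate_delete_cost` is hypothetical, and the per-100k prices passed in the usage lines are illustrative placeholders, not current Firestore pricing; substitute the rates for your own region and plan.

```python
def estimate_delete_cost(doc_count, read_price_per_100k, delete_price_per_100k,
                         known_paths=False):
    """Estimate operation counts and cost of deleting `doc_count` documents.

    If the document paths are already known (guessable or stored elsewhere),
    no reads are needed; otherwise each document costs one read and one delete.
    Prices are supplied by the caller and are assumptions, not official rates.
    """
    reads = 0 if known_paths else doc_count
    deletes = doc_count
    total = (reads / 100_000) * read_price_per_100k \
        + (deletes / 100_000) * delete_price_per_100k
    return {"reads": reads, "deletes": deletes, "total": total}


# Placeholder prices for illustration only (USD per 100k operations).
via_cli = estimate_delete_cost(1_000_000, 0.06, 0.02)
via_stored_paths = estimate_delete_cost(1_000_000, 0.06, 0.02, known_paths=True)
print(via_cli["reads"], via_cli["deletes"], round(via_cli["total"], 2))
print(via_stored_paths["reads"], round(via_stored_paths["total"], 2))
```

With those placeholder rates, the read charges make up most of the bill, which is why the stored-paths exception matters at all.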

Storage cost should absolutely go down after you delete data, although there may be some lag in the reporting as storage usage is not calculated in real time. If you think your storage costs are incorrect please reach out to Firebase support.

- Sam

Diros Apps

Feb 15, 2021, 9:39:00 AM
to Firebase Google Group
Hello everyone,

Thank you for your answers, Mauricio and Sam. So it seems there's no difference, cost-wise, between using the CLI and recursively deleting documents from a server script written in, for example, Node or PHP.

Given that, I'm wondering now if executing the deletions via CLI is the best option time-wise; it should work faster to delete everything via CLI than to iterate through all documents in a collection from a server, right?

Thanks for your time and attention.

Sam Stern

Feb 15, 2021, 9:46:43 AM
to Firebase Google Group
Yes, that's correct. The CLI doesn't do anything special; you could replicate it on your own servers. In most cases the CLI is faster than your own code, though, since we have heavily optimized it to be as fast as possible.
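For reference, a server-side version of the same read-then-delete loop might look roughly like the Python sketch below. This is not the CLI's actual implementation. It assumes `client` and `coll_ref` behave like `google.cloud.firestore.Client` and a `CollectionReference` (using `limit().stream()`, `batch()`, and each snapshot's `.reference`), and unlike the CLI's recursive mode it does not descend into subcollections. The name `delete_collection` and the default page size are choices made for this example.

```python
def delete_collection(client, coll_ref, page_size=500):
    """Delete every document in one collection, one page at a time.

    Each page is read first (billed as document reads) and then removed
    with a single batched write (billed as deletes). Subcollections are
    NOT handled by this sketch. Returns the number of documents deleted.
    """
    deleted = 0
    while True:
        # Reading a page is the only way to discover which documents exist.
        snapshots = list(coll_ref.limit(page_size).stream())
        if not snapshots:
            return deleted
        batch = client.batch()
        for snap in snapshots:
            batch.delete(snap.reference)
        batch.commit()
        deleted += len(snapshots)
```

A sequential loop like this is simple but slow for millions of documents, which is one plausible reason the CLI tends to finish sooner.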

- Sam

Diros Apps

Feb 15, 2021, 1:40:42 PM
to Firebase Google Group
Ok, that's good to know. Thank you Sam!