Backups are useless if you can't restore from them


br...@hatchlings.com

Jul 24, 2018, 7:30:54 PM
to Firebase Google Group
I've wasted the past week of my life trying to ensure that we have the ability to restore from our Realtime Database automated backups in the event that we need to. Long story short: it looks like there is no good way to do this if your database is over 256MB. Ours is 12GB -- what is the point of having a backup if there is no way to restore it?

This saga all started because I wanted to set up a mirror of our database for one of our new developers to play with, so he wasn't messing with production data. Trying to upload a backup via the console gives an error: "413 Request Entity Too Large".

Looking at the restoring from backups support page, it says "If you are having trouble restoring a backup from a very large database, please reach out to our support team." So I messaged support... they are absolutely ZERO help. They pointed me back at the same support page that said to contact them and said "You'll have to upload your data by using the REST API by chunking it into smaller payloads that can fit within the limits (250MB)."
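For anyone else stuck doing this, the "chunk it" advice boils down to something like the following sketch (the database URL, token handling, and size threshold here are my own assumptions, not anything support gave me): walk the tree, split it into subtrees that serialize under the payload limit, and PUT each one to its own path via the REST API.

```python
import json
import urllib.request

def plan_chunks(tree, max_bytes, path=""):
    """Split a JSON tree into (path, subtree) writes whose serialized
    size stays under max_bytes. A non-dict leaf bigger than max_bytes
    is emitted as-is, since it can't be split any further."""
    if len(json.dumps(tree)) <= max_bytes or not isinstance(tree, dict):
        yield path or "/", tree
        return
    for key, child in tree.items():
        yield from plan_chunks(child, max_bytes, f"{path}/{key}")

def upload_chunks(db_url, tree, max_bytes=200 * 1024 * 1024, token=None):
    """PUT each chunk to <db_url><path>.json via the RTDB REST API.
    db_url is a placeholder like "https://my-project.firebaseio.com".
    No retries or throttling here -- a real run needs both."""
    for path, value in plan_chunks(tree, max_bytes):
        url = f"{db_url}{path}.json"
        if token:
            url += f"?access_token={token}"  # OAuth2 token with DB scope
        req = urllib.request.Request(
            url,
            data=json.dumps(value).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="PUT",
        )
        with urllib.request.urlopen(req) as resp:
            resp.read()
```

Because the chunks are disjoint subtrees, the PUTs collectively reconstruct the original tree. The catch: this still loads the whole backup into memory before splitting it, which is exactly the problem with a 12GB file.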

They also mentioned the "streaming import library", a project that's been dead for 4 years and doesn't work at all (so much so that the link from `firebase-import` was removed).

Judging by the docs and this thread, it looks like at one time there was a way for Firebase to do it manually... if that's the case, someone needs to tell the people manning support so they don't send people like me down rabbit holes trying to get non-working "official" code to run that hasn't worked in years. (That would help for now, but we also want to be able to run queries against non-production data, so if we want current data we will need to repeat this process on an ongoing basis.)

So since it was pretty clear I wasn't going to get any help beyond support repeating "chunk your data" I started looking into how to do that... there's really nothing out there that works for arbitrary data. If your backup is bigger than your system memory you need to stream it. And every library I've found requires you to manually tell it what "paths" you want to stream.

So I've spent the last several days adapting one on my own to get something that mostly works. At least I think I'm close to getting something that will stream data chunks small enough to write to Firebase.

Except it looks like I'm hitting Firebase's (new??) limits and writes are INCREDIBLY slow. It looks like it's going to take almost 24 hours to write this data.

I'm losing sleep over what the heck we're going to do if we ever need to actually restore our production firebase from a backup. I had no idea for the past couple of years that there was no way to restore the automated backups! I expect a lot more from a service I'm paying so much money for.

Patryk Lesiewicz

Jul 25, 2018, 12:42:13 PM
to fireba...@googlegroups.com
Hi Brad -

I apologize for your bad experience with Firebase Database. We are aware of the difficulties with importing large data into RTDB. Currently our recommended way is to contact Firebase Support and use our internal backup restore procedure, which bypasses the front doors and uploads data from our hourly snapshots directly into the persistence layer.

I found your support case and will follow up there to get you going ASAP.
I will also review all support playbooks and update them to clarify the latest recommendations.

Best regards,
Patryk

--
You received this message because you are subscribed to the Google Groups "Firebase Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to firebase-tal...@googlegroups.com.
To post to this group, send email to fireba...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/firebase-talk/0f4d2c61-3a49-4af6-a3c1-af5f1c72ce15%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Piotr Kaminski

Jul 25, 2018, 5:27:07 PM
to fireba...@googlegroups.com
Hi Patryk,

For the record, restoring a backup is not the only use case for importing large chunks of data so having Firebase support do it from an internal copy isn't a panacea.  For example, I'd like to be able to download-transform-upload an entire datastore to fix some data structures or rotate encryption keys.  While I've written code to do such transforms online it gets tricky to work around the various size and throughput limitations, so it would be much easier to run them offline and upload the results.

Thanks,

    -- P.


--
  Piotr Kaminski <pi...@ideanest.com>
  "That bun is dirty.  Don't eat that bun."

Patryk Lesiewicz

Jul 27, 2018, 2:51:58 PM
to fireba...@googlegroups.com
Hi Piotr,

I understand that projects may need to do batch processing on the entire data set from time to time. I'm not sure the download-transform-upload approach is scalable enough for big data. I wish we could support Cloud Dataflow on RTDB; unfortunately our current architecture makes it very difficult to support. Dataflow for Firestore is on our roadmap, though.

Thank you for your feedback and understanding,
Patryk


Piotr Kaminski

Jul 27, 2018, 3:08:58 PM
to fireba...@googlegroups.com
On Fri, Jul 27, 2018 at 11:51 AM, 'Patryk Lesiewicz' via Firebase Google Group <fireba...@googlegroups.com> wrote:
I'm not sure if download-transform-upload approach is scalable enough for big data.

I'm not sure it's ideal either, but it beats online transforms.  :)  And since the download and transform part is already taken care of, you'd "just" need to add a bulk loading process for RTDB...

Anyway, thanks for responding and keeping this use case in mind.

    -- P.

Philip Ashton

Sep 21, 2018, 10:23:32 AM
to Firebase Google Group
As Brad mentioned above, being able to copy data is essential when working with production systems and developers. Being able to run tests on a copy of a production data set is essential (even without transformations). Actually, being able to copy a whole Firebase project to another project would be useful (but that might be too infrequently used to get any Google development time).

I'll second Brad's point about offline transformation. I am using Firestore and therefore accept some limitations, but the same issues exist in RTDB, so the teams may be able to share a single developed solution.

In Firestore, because of nested collections, it becomes even more useful to be able to do offline transformations. It has become clear that I may need to restructure the data after going live, and at the moment that will be almost impossible to do in a timely manner, as it would require all the data to be backed up, transformed, and then loaded document by document.

A bulk load has always been the answer with all data storage technologies, so add my vote for this functionality.

Philip

Kiana McNellis

Sep 21, 2018, 6:56:02 PM
to fireba...@googlegroups.com
Firestore backups are much easier, actually! They export directly to Cloud Storage, and can be imported directly back into your project or copied to a new one.
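Concretely, it looks roughly like this (the bucket and project names are placeholders, and on older gcloud versions these commands live under `gcloud beta`):

```shell
# Export all collections to a Cloud Storage bucket
gcloud firestore export gs://my-backup-bucket/2018-09-21

# Import the export back into the current project
gcloud firestore import gs://my-backup-bucket/2018-09-21

# Or point gcloud at another project first to copy the data there
gcloud config set project my-staging-project
gcloud firestore import gs://my-backup-bucket/2018-09-21
```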


Mike Sparr

Oct 18, 2018, 8:50:56 PM
to Firebase Google Group
I ran into this earlier this year and found this repo and it works: https://github.com/FirebaseExtended/firebase-import
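From its README as I remember it (the database URL and key file below are placeholders -- double-check the flag names against the current repo), usage looks something like:

```shell
# Install the CLI
npm install -g firebase-import

# Stream a large JSON backup into the database in pieces
firebase-import \
  --database_url https://my-project.firebaseio.com \
  --path / \
  --json backup.json \
  --service_account service-account-key.json
```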

Follow the instructions and you should be in good shape. I agree that having automated backups you cannot restore is a failure of the product and should be addressed, however. I hope this workaround works for you as it did for me.

Cheers!