Trying to export Firebase data to BigQuery to use in Data Studio

Spry

Dec 22, 2017, 9:43:10 PM
to Firebase Google Group
Hey Everyone,

Currently, my data is stored in the Firebase DB, but I want to be able to use Google Data Studio. Right now my data is all over the place and really difficult to use.

Any suggestions on how I can easily port it into BigQuery and have it cleaned up in a usable way?

See the images below (mock data). Each user will have an 'inquiry = onboarding profile', email conditions / sign up, and then a 'symptoms log' where we can find what conditions they have.

Looking for an easy way to parse this data out into BigQuery so it is easy to use, so I can use date as a filter to select

Thanks,

Dave

Kato Richardson

Dec 27, 2017, 11:17:31 AM
to Firebase Google Group
Hi Dave,

There is an Admin SDK for Firebase and a REST API you could use to migrate data between data stores. Additionally, you could set up Cloud Functions to automate the copy of data, but it could be pricey depending on volume.

Another alternative would be to reverse the pipeline: feed the data into a server or HTTP function first, and have it dual-write to both stores up front.

☼, Kato

--
You received this message because you are subscribed to the Google Groups "Firebase Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to firebase-talk+unsubscribe@googlegroups.com.
To post to this group, send email to fireba...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/firebase-talk/08292153-ab8e-488f-9fbd-60d7ee5598cc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--

Kato Richardson | Developer Programs Eng | kato...@google.com | 775-235-8398

Tony Meng

Dec 27, 2017, 3:18:24 PM
to Firebase Google Group
Hi Dave,

I'm currently working on open-sourcing a tool that will help with exporting RTDB data from backups to BigQuery. I'll post back here when it's ready.

Tony

Matt Silverlock

Dec 28, 2017, 11:30:11 AM
to Firebase Google Group
Another option: run a small application on App Engine Flex as a cron job[1] that connects to your Realtime Database or Firestore and exports records to BigQuery at a fixed interval.

This is the approach I'm looking at taking, at least until I can run a Cloud Function on a timer (à la Lambda) rather than only on triggers.

[1]: https://cloud.google.com/appengine/docs/flexible/nodejs/scheduling-jobs-with-cron-yaml

ja...@thebabyboxco.com

Dec 29, 2017, 10:58:51 AM
to Firebase Google Group
This is fairly tricky to implement. We did it at our company, but there are some limitations. For example, even with the Admin SDK, it's very expensive to retrieve all keys for a large object. If you have an object with >100,000 keys and you try to grab them all at once to bulk export to BigQuery, it starts to choke both Firebase and your App Engine cron process.
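
One way to avoid pulling a very large node in a single request is to page through it by key using the Realtime Database REST API's orderBy="$key", startAt, and limitToFirst query parameters. The following is only a minimal sketch under stated assumptions: the helper names, the example database URL, and the page size are made up for illustration, the node is assumed to hold string keys such as push IDs (numeric-looking keys sort differently in Firebase), and a real export would also have to send credentials with each request.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Iterator, Optional, Tuple


def page_url(base_url: str, path: str, page_size: int,
             start_key: Optional[str] = None) -> str:
    """Build a REST URL returning at most `page_size` children, ordered by key."""
    # The REST API expects JSON-encoded (quoted) values for orderBy and startAt.
    params = {"orderBy": json.dumps("$key"), "limitToFirst": str(page_size)}
    if start_key is not None:
        params["startAt"] = json.dumps(start_key)
    query = urllib.parse.urlencode(params)
    return f"{base_url.rstrip('/')}/{path.strip('/')}.json?{query}"


def fetch_json(url: str) -> Optional[Dict[str, Any]]:
    """Default fetcher; unauthenticated, so it only works on publicly readable data."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_children(base_url: str, path: str, page_size: int = 1000,
                  fetch: Callable[[str], Optional[Dict[str, Any]]] = fetch_json
                  ) -> Iterator[Tuple[str, Any]]:
    """Yield (key, value) for every child of `path`, one page at a time."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2")
    last_key: Optional[str] = None
    while True:
        page = fetch(page_url(base_url, path, page_size, last_key)) or {}
        # startAt is inclusive, so every page after the first repeats last_key.
        keys = sorted(k for k in page if k != last_key)
        for key in keys:
            yield key, page[key]
        if len(page) < page_size:
            return  # a short page means the end of the node was reached
        last_key = keys[-1]
```

Calling iter_children("https://example.firebaseio.com", "users") would then walk the hypothetical users node in chunks of 1,000 records rather than loading it all at once.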

We got around the bulk-read problem by using the Firebase data backup and streaming the JSON, but because you can't create a Firebase backup on demand, data only reaches BigQuery after the next backup runs.

App Engine cron in general doesn't have the best monitoring and notification system, so when a job fails you have to check it yourself. You'd be better off using RunDeck or some other job server.

BigQuery JSON is different from Firebase JSON, so you have to write some transformers and make sure their output can be auto-loaded into BigQuery.

Make sure you export your JSON to a file and then import it into BigQuery with a bulk load job. If you stream it to BigQuery, it'll be rather expensive, as BQ charges for streaming.

It would be great to open source a tool as @Tony mentioned above.
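
As an illustration of what such a transformer might look like: Firebase stores a collection as one object keyed by ID, while BigQuery wants one record per row, with column names limited to letters, digits, and underscores. The sketch below is a hedged example, not the code we ran; the function names and the id column are assumptions, and nested keyed collections (such as a per-user log) would still need their own handling to fit your schema.

```python
import re
from typing import Any, Dict, Iterator

_INVALID_CHARS = re.compile(r"[^A-Za-z0-9_]")


def bq_name(name: str) -> str:
    """Turn an arbitrary Firebase key into a BigQuery-safe column name."""
    cleaned = _INVALID_CHARS.sub("_", name)
    # Column names must start with a letter or an underscore.
    return cleaned if re.match(r"[A-Za-z_]", cleaned) else "_" + cleaned


def clean(value: Any) -> Any:
    """Recursively sanitize field names inside nested objects and arrays."""
    if isinstance(value, dict):
        return {bq_name(k): clean(v) for k, v in value.items()}
    if isinstance(value, list):
        # Firebase arrays with gaps contain nulls; drop them here.
        return [clean(v) for v in value if v is not None]
    return value


def collection_to_rows(collection: Dict[str, Any],
                       key_field: str = "id") -> Iterator[Dict[str, Any]]:
    """Convert {key: child, ...} into one row per child, keeping the key as a column."""
    for key, child in collection.items():
        row = clean(child) if isinstance(child, dict) else {"value": child}
        # The Firebase key wins if the child already has a field named key_field.
        yield {**row, key_field: key}
```

For example, a child stored under the key "-Kabc" with a field called "sign-up date" becomes a row with id "-Kabc" and a column sign_up_date.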
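
A rough sketch of the file-then-load approach, assuming the google-cloud-bigquery Python client is installed and credentials are configured. The function names, file path, and table ID are placeholders, and WRITE_TRUNCATE (replace the table on every load) and schema autodetection are choices made only to keep the example short.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def ndjson_lines(rows: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Serialize rows as newline-delimited JSON, one record per line."""
    for row in rows:
        yield json.dumps(row, separators=(",", ":"), ensure_ascii=False) + "\n"


def write_ndjson(rows: Iterable[Dict[str, Any]], path: str) -> int:
    """Write rows to `path` and return how many records were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for line in ndjson_lines(rows):
            out.write(line)
            count += 1
    return count


def load_ndjson(path: str, table_id: str) -> int:
    """Run a batch load job (not streaming inserts) from a local NDJSON file."""
    # Imported lazily so the helpers above work without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,  # assumption: let BigQuery infer the schema
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    )
    with open(path, "rb") as source:
        job = client.load_table_from_file(source, table_id, job_config=job_config)
    job.result()  # block until the load job finishes; raises on failure
    return job.output_rows
```

Usage would be along the lines of write_ndjson(rows, "users.ndjson") followed by load_ndjson("users.ndjson", "my-project.my_dataset.users"), where both names are hypothetical.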

Jason 

Kato Richardson

Dec 29, 2017, 1:43:55 PM
to Firebase Google Group
Since Firebase automated backups are stored in GCS, you can also set up a trigger event to run whenever the storage bucket is updated and import that into BigQuery. The difficult part about this process is that you usually want to do some form of data transform to make the data useful in BigQuery, so that means running it through a GAE or GCE instance, or creating a Functions trigger to handle the transform.
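
A sketch of what such a storage-triggered function could look like, written as a Python background Cloud Function with the (event, context) signature. Several things here are assumptions to verify against your own bucket: that data backups end in _data.json or _data.json.gz, that the node of interest is a top-level node called users, and the destination table name. The transform is deliberately crude (one row per child, with the child stored as a JSON string), and it reads the whole backup into memory, so very large backups would need a streaming parser instead.

```python
import gzip
import json
from typing import Any, Dict, List, Optional

TABLE_ID = "my-project.firebase_export.users"  # hypothetical destination table
NODE = "users"  # hypothetical top-level node to export


def is_data_backup(object_name: str) -> bool:
    """Assumed naming convention: data backups end in _data.json(.gz)."""
    return object_name.endswith(("_data.json", "_data.json.gz"))


def decode_backup(raw: bytes) -> Dict[str, Any]:
    """Parse a backup file, decompressing it if it starts with the gzip magic bytes."""
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return json.loads(raw)


def rows_for(backup: Dict[str, Any], node: str) -> List[Dict[str, str]]:
    """One row per child of `node`, with the child kept as a JSON string."""
    children = backup.get(node) or {}
    return [{"id": key, "payload": json.dumps(value, sort_keys=True)}
            for key, value in children.items()]


def on_backup_finalized(event: Dict[str, Any], context: Any) -> Optional[int]:
    """Entry point for a Cloud Storage 'finalize' trigger on the backup bucket."""
    name = event["name"]
    if not is_data_backup(name):
        return None  # e.g. a rules backup; nothing to import
    # Imported lazily so the pure helpers above can be tested without the SDKs.
    from google.cloud import bigquery, storage

    blob = storage.Client().bucket(event["bucket"]).blob(name)
    rows = rows_for(decode_backup(blob.download_as_bytes()), NODE)
    if not rows:
        return 0
    job_config = bigquery.LoadJobConfig(
        schema=[bigquery.SchemaField("id", "STRING"),
                bigquery.SchemaField("payload", "STRING")],
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    )
    job = bigquery.Client().load_table_from_json(rows, TABLE_ID, job_config=job_config)
    job.result()  # wait for the batch load to complete
    return len(rows)
```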

☼, Kato
