How to parse Cloud Datastore backups


Soeren Balko

Jul 31, 2019, 1:59:49 AM
to Google App Engine
Hey there,

We are using the Datastore batch export feature to run nightly ETL jobs from our Datastore instance into BigQuery. Unfortunately, BigQuery skips Blob and JSON properties (which we use to store large JSON strings). To work around this limitation, we are thinking of first converting the Datastore export to CSV and then importing that into BigQuery.

As I understand it, Cloud Datastore backups are in LevelDB format. So far, my attempts to open the file(s) with LevelDB client libraries have been unsuccessful. Has anyone succeeded in parsing or processing raw Datastore backups? If so, can you recommend any client libraries (for Node or Python), and do you have some sample code?

Thanks heaps for any assistance!
Soeren

Julie (cloud platform support)

Jul 31, 2019, 10:59:35 AM
to Google App Engine
Please check through the suggested method for exporting from Datastore to BigQuery to see whether its limitations and requirements are compatible with your use case. That document also details how to use Datastore backups to update BigQuery. App Engine's Python standard runtime connects to Cloud Datastore using the NDB Client Library; if you plan to use App Engine flex, you can use the client library, which also provides a code sample. The same applies to Node.js in both the standard and flex environments, each of which provides applicable samples.
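
On the question of reading the raw export files: the following is only a minimal sketch, assuming the files are written in the LevelDB log (record) format rather than as a full LevelDB database, which could explain why LevelDB client libraries fail to open them. The function name `read_records` is hypothetical, and checksums are not verified. It only splits a file into raw records and does not decode the entities inside them.

```python
import struct

BLOCK_SIZE = 32768   # LevelDB log files are split into 32 KiB blocks
HEADER_SIZE = 7      # 4-byte checksum, 2-byte length, 1-byte record type
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4


def read_records(data):
    """Yield each logical record (as bytes) from LevelDB log-format data.

    Assumption: `data` is the full content of one export output file.
    Checksums are read but not verified in this sketch.
    """
    pos = 0
    pending = []  # fragments of a record that spans several physical records
    while pos < len(data):
        left_in_block = BLOCK_SIZE - pos % BLOCK_SIZE
        if left_in_block < HEADER_SIZE:
            pos += left_in_block  # block trailer is padding, skip it
            continue
        if pos + HEADER_SIZE > len(data):
            break  # truncated header at end of file
        _crc, length, rtype = struct.unpack_from("<IHB", data, pos)
        pos += HEADER_SIZE
        payload = data[pos:pos + length]
        pos += length
        if rtype == FULL:
            yield payload
        elif rtype == FIRST:
            pending = [payload]
        elif rtype == MIDDLE:
            pending.append(payload)
        elif rtype == LAST:
            pending.append(payload)
            yield b"".join(pending)
            pending = []
        else:
            break  # type 0 (zero padding) or unknown type: stop reading
```

Each yielded record would then still need to be decoded as a serialized entity protocol buffer before it can be written out as CSV. That step depends on having the entity proto definition available (for example from the legacy App Engine Python SDK) and is not shown here.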