How to create a consistent backup of the datastore?


Attila-Mihaly Balazs

Apr 26, 2017, 1:01:41 AM
to Google App Engine
Let's assume that I have a "properly" written App Engine application (writes to the datastore are wrapped in transactions, tasks are put in the task queue and retried, etc.) and I want to make a "consistent" backup (i.e. I don't want half-committed transactions in the backup). The documentation [1] suggests disabling writes and running a backup from the Datastore Admin, which sounds sensible but fails because the backup map-reduce job itself needs to write to the datastore. Any other suggestions?

On a related note, the backup is just a dump in protobuf format. I can write a hacky Python script to extract the data, but is there a recommended/supported way to transform it into something like JSON or CSV?

Attila

[1] https://cloud.google.com/datastore/docs/console/datastore-backing-up-restoring

Nicholas (Google Cloud Support)

May 5, 2017, 4:55:05 PM
to Google App Engine
Interesting. If the map-reduce job requires writing to the datastore and the article suggests disabling writes while it runs, the documentation and the job's actual behavior are inconsistent. Either the map-reduce job should not require writes, or the article is incorrect and should be changed. Either way, this may be an issue with the platform.

I would recommend filing a public issue on the Issue Tracker. When doing so, please include the relevant error logs along with a minimal reproduction that demonstrates this error. Be sure to also post a link to the issue here so that others in the community can follow along.

As for encoding protobuf data as JSON, I'd recommend using Google's protobuf library and looking at its json_format module to achieve your goals.
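To illustrate the json_format suggestion, here is a minimal sketch in Python. It assumes the records can be parsed into a message class from Google's protobuf library; whether the entity records in a Datastore Admin backup can be loaded that way is not confirmed in this thread. The well-known Struct type is used purely as a stand-in for a real record, and the helper names message_from_bytes and message_to_json are hypothetical.

```python
from google.protobuf import json_format
from google.protobuf import struct_pb2


def message_from_bytes(message_cls, raw):
    """Parse serialized bytes into an instance of the given message class."""
    message = message_cls()
    message.ParseFromString(raw)
    return message


def message_to_json(message):
    """Render a protobuf message as a JSON string with sorted keys."""
    return json_format.MessageToJson(message, sort_keys=True)


# Stand-in for one backup record (assumption: a real record would use
# whatever message class matches the backup's entity format).
record = struct_pb2.Struct()
record.update({"kind": "Greeting", "content": "hello"})
raw = record.SerializeToString()

# Round trip: bytes -> message -> JSON text.
decoded = message_from_bytes(struct_pb2.Struct, raw)
json_text = message_to_json(decoded)
print(json_text)
```

json_format.MessageToDict returns a plain dict instead of a string, which may be a more convenient starting point if the goal is CSV rather than JSON.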