FYI: how to parse datastore export


Attila-Mihaly Balazs

Nov 20, 2018, 7:07:25 AM
to Google App Engine
There is a relatively new managed service for exporting and importing Datastore entities: https://cloud.google.com/datastore/docs/export-import-entities#starting_managed_export_and_import_operations (previously this could be done from the "datastore admin"). The documentation says:

"The output of a managed export uses the LevelDB log format." and links to this page: https://github.com/google/leveldb/blob/master/doc/log_format.md

However, this doesn't seem to be entirely accurate. For one, they seem to be doing something nonstandard with the CRC, and the way the end of the file is marked doesn't match the documentation. Fortunately, the Cloud SDK has code which produces such dumps (see RecordsWriter in google/appengine/ext/mapreduce/records.py) and, furthermore, has a reader (!). So here is a small Python snippet to read the models from such a dump:

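import sys
# Make the App Engine SDK bundled with the Cloud SDK importable (adjust the path to your installation)
sys.path.append('/usr/lib/google-cloud-sdk/platform/google_appengine')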

from google.appengine.datastore import entity_pb
from google.appengine.ext.mapreduce import records
from google.appengine.ext import ndb


class TestModel(ndb.Model):  # we need the definition of the model we want to read
    foobar = ndb.StringProperty(indexed=False)


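# sys.argv[1] is the path to one output file of the export
with open(sys.argv[1], 'rb') as f:
    for r in records.RecordsReader(f, strict=True):
        # each record is a serialized EntityProto; decode it into the model
        entity = TestModel._from_pb(entity_pb.EntityProto(r))
        print(entity)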
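
For anyone who can't use the SDK's reader, here is a rough sketch of what reading the record framing looks like according to the linked LevelDB log format page: 32 KB blocks, each chunk prefixed by a 7-byte header (4-byte checksum, 2-byte little-endian length, 1-byte type), with FULL/FIRST/MIDDLE/LAST chunk types. Everything here is an assumption based on that page, not on what the export really writes. The names read_records, HEADER and BLOCK_SIZE are made up for the example, the checksum is deliberately ignored because of the CRC differences mentioned above, and treating an all-zero header as padding is a guess. I have not verified it against a real export file.

```python
import io
import struct

BLOCK_SIZE = 32 * 1024           # block size from the LevelDB log format doc
HEADER = struct.Struct('<IHB')   # checksum (ignored), payload length, chunk type
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4


def read_records(stream):
    """Yield logical records from a LevelDB-log-style stream, skipping CRC checks."""
    parts = []
    while True:
        block = stream.read(BLOCK_SIZE)
        if not block:
            return
        pos = 0
        # Fewer than HEADER.size bytes left in a block can only be padding.
        while len(block) - pos >= HEADER.size:
            _crc, length, kind = HEADER.unpack_from(block, pos)
            pos += HEADER.size
            if kind == 0 and length == 0:
                break  # assumption: an all-zero header means the rest is padding
            data = block[pos:pos + length]
            if len(data) != length:
                raise ValueError('truncated chunk')
            pos += length
            if kind == FULL:
                yield data
            elif kind == FIRST:
                parts = [data]
            elif kind == MIDDLE:
                parts.append(data)
            elif kind == LAST:
                parts.append(data)
                yield b''.join(parts)
                parts = []
            else:
                raise ValueError('unknown chunk type %d' % kind)
```

If the framing does match, each yielded byte string should be usable where the snippet above passes r to entity_pb.EntityProto. When the SDK is available, records.RecordsReader is the safer choice, since it comes from the same code base as the writer.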

Katayoon (Cloud Platform Support)

Nov 20, 2018, 3:48:14 PM
to Google App Engine
Thank you for your attention to this matter; however, you may send your feedback on any documentation via the "SEND FEEDBACK" link located at the top right of each page.

Attila-Mihaly Balazs

Nov 22, 2018, 12:56:46 AM
to Google App Engine
Thank you for reminding me about that option. I've now sent the feedback that way (I didn't actually see the link in the top-right corner, but rather at the end of the page), but I also felt it was important to post the information in a "public" place so that it can be found by others facing the same issue.

Attila