ndb to cloud ndb migration experience


Aert van de Hulsbeek

Jan 15, 2020, 8:27:12 AM
to Google App Engine
We are currently looking to migrate a reasonable number of gae std services from py2 to py3 and are tossing up between a datastore and a cloud ndb implementation.
Has anyone done a somewhat complex migration from ndb to cloud ndb and what was this experience like? I am mainly interested in unforeseen and undocumented gotchas / blockers that might direct us down the datastore path.
If anyone can share some hands-on experience, that would be fantastic. Thanks, Aert.

Elliott (Cloud Platform Support)

Jan 15, 2020, 12:31:37 PM
to Google App Engine
Hello Aert,

Could you share some specifics about your setup, and what makes the migration complex? Those details will help you get more useful responses.

Ryan B

Jan 16, 2020, 4:47:03 PM
to Google App Engine
i recently migrated a number of nontrivial apps from python 2 to python 3 (standard). it wasn't easy, especially since i used a number of the old APIs - task queues, logging, memcache, ndb, etc - that have changed significantly or disappeared entirely. it took a significant amount of effort, coding, and workarounds, but it was doable in the end. kudos in particular to chris rossi, danny hermes, carlos de la guardia, et al for the new python 3 ndb library!

i've posted many of the bigger gotchas i hit here on this list, and over on SO, and on the python 3 ndb issue tracker. beyond all those, two key things i ended up doing for ndb were:

* replace testbed with the datastore emulator for running unit tests without mocking out ndb. example commands.
* this WSGI middleware to run all HTTP request handlers inside an NDB client context. without it, i would have had to add a ton of ugly new with ndb_client.context(): ... blocks and indents everywhere.
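
For readers without the linked code at hand, here is a minimal sketch of what such a middleware can look like. It is not the linked implementation. It assumes `client` is a `google.cloud.ndb.Client` (or anything else whose `context()` method returns a context manager); `FakeClient` and `hello_app` are hypothetical stand-ins so the example runs without Cloud NDB installed.

```python
from contextlib import contextmanager


def ndb_context_middleware(wsgi_app, client):
    """Wrap a WSGI app so every request handler runs inside client.context()."""
    def wrapper(environ, start_response):
        # Assumption: client.context() returns a context manager, as
        # google.cloud.ndb.Client.context() does.
        with client.context():
            return wsgi_app(environ, start_response)
    return wrapper


class FakeClient:
    """Hypothetical stand-in for ndb.Client that records enter/exit events."""
    def __init__(self):
        self.events = []

    @contextmanager
    def context(self):
        self.events.append("enter")
        try:
            yield
        finally:
            self.events.append("exit")


def hello_app(environ, start_response):
    """Trivial WSGI app used only for the demonstration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]
```

In a real app the wrapping would be something like `app = ndb_context_middleware(app, ndb.Client())`. One caveat with this simple version: if the wrapped app returns a lazy iterator that touches the datastore while the body is being streamed, that work happens after the context has already exited.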

also, shameless plug, feel free to test things out in shell-py3.appspot.com, an interactive REPL inside python 3 GAE standard that includes the new ndb.

Jon Grover

Jan 21, 2020, 10:31:00 AM
to Google App Engine
On Thursday, January 16, 2020 at 1:47:03 PM UTC-8, Ryan B wrote:
* this WSGI middleware to run all HTTP request handlers inside an NDB client context. without it, i would have had to add a ton of ugly new with ndb_client.context(): ... blocks and indents everywhere

I feel like you just saved my life. I was beginning to truly despair. 

Jon Grover

Feb 18, 2020, 8:42:26 AM
to Google App Engine
On Thursday, January 16, 2020 at 1:47:03 PM UTC-8, Ryan B wrote:
* replace testbed with the datastore emulator for running unit tests without mocking out ndb. example commands.

Ryan, I was wondering if you could elaborate a bit more on what you did here. I'm working on migrating an app with several hundred unit tests, but there are a couple of items that I'm either a little stuck on or that have me on edge:

1) First and foremost, I had repeated issues with dev_appserver connecting to my production datastore despite having set all the correct environment variables and setting the --support_datastore_emulator flag. I was finally able to solve it by explicitly passing --env_var DATASTORE_EMULATOR_HOST=localhost:8081, but I'm not immediately sure how to do that from within our unittest framework and I'm very worried about our unit tests writing to the production datastore. (For the time being, I used the gcloud CLI to change my project to a dummy one.)

2) This is maybe my own unfamiliarity with NDB context, but I'm not sure what the correct way is to use the same NDB context across all unit tests. I borrowed your WSGI middleware trick for the app itself, but I haven't yet quite figured out how to apply a single context across our entire unittest suite. I tried creating a new NDB context in our base setUp method, which looked like it was working but dragged the test speed to a crawl.

Just curious if you had any additional insights or example code available. I've already learned quite a bit by reading through your other postings, so thanks regardless.

Ryan B

Feb 18, 2020, 2:30:32 PM
to Google App Engine
hi jon!


On Tuesday, February 18, 2020 at 5:42:26 AM UTC-8, Jon Grover wrote:
On Thursday, January 16, 2020 at 1:47:03 PM UTC-8, Ryan B wrote:
1) First and foremost, I had repeated issues with dev_appserver connecting to my production datastore despite having set all the correct environment variables and setting the --support_datastore_emulator flag. I was finally able to solve it by explicitly passing --env_var DATASTORE_EMULATOR_HOST=localhost:8081, but I'm not immediately sure how to do that from within our unittest framework and I'm very worried about our unit tests writing to the production datastore. (For the time being, I used the gcloud CLI to change my project to a dummy one.)

yes! this one was/is disappointing. it's a hole in dev_appserver that most likely will never be closed, since dev_appserver itself is basically deprecated. i literally monkey patch the host/port into the ndb context manually: https://github.com/snarfed/webutil/blob/master/appengine_config.py#L39 . background: https://github.com/googleapis/python-ndb/issues/238
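
On the unittest side of the same worry, a small guard can make a test run fail fast rather than reach production. The sketch below is an assumption-laden illustration, not the monkey patch linked above: `require_datastore_emulator` is a hypothetical helper name, and it relies only on the `DATASTORE_EMULATOR_HOST` environment variable mentioned in the question.

```python
import os
from urllib.parse import urlsplit

# Hosts treated as "local"; adjust if the emulator runs somewhere else.
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def require_datastore_emulator(environ=None):
    """Return DATASTORE_EMULATOR_HOST, or raise if it is unset or not local."""
    environ = os.environ if environ is None else environ
    host = environ.get("DATASTORE_EMULATOR_HOST", "")
    # Parse "host:port" by treating it as the network location of a URL.
    hostname = urlsplit("//" + host).hostname if host else None
    if hostname not in LOCAL_HOSTS:
        raise RuntimeError(
            "Refusing to run tests: DATASTORE_EMULATOR_HOST=%r does not "
            "point at a local emulator" % host
        )
    return host
```

Calling `require_datastore_emulator()` at the top of a shared test module, before any NDB client is created, would stop the run when the variable is missing. It only checks the environment; it does not verify that the client actually connects to the emulator.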
 
 
2) This is maybe my own unfamiliarity with NDB context, but I'm not sure what the correct way is to use the same NDB context across all unit tests. I borrowed your WSGI middleware trick for the app itself, but I haven't yet quite figured out how to apply a single context across our entire unittest suite. I tried creating a new NDB context in our base setUp method, which looked like it was working but dragged the test speed to a crawl.

right! i do exactly that, and also reset the datastore and call __enter__(), and then __exit__() in tearDown: https://github.com/snarfed/bridgy/blob/master/tests/testutil.py#L282-L289
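
A minimal sketch of that setUp/tearDown pattern, for readers who can't open the link. It is not the linked code: it leaves out the datastore reset, uses `addCleanup` in place of an explicit `tearDown`, and assumes `ndb_client` is a `google.cloud.ndb.Client` whose `context()` returns a context manager. `NdbContextTestCase`, `FakeClient`, and `ExampleTest` are hypothetical names, with `FakeClient` standing in so the example runs without Cloud NDB.

```python
import unittest
from contextlib import contextmanager


class NdbContextTestCase(unittest.TestCase):
    """Base class that opens one NDB context per test and always closes it."""

    # Assumption: subclasses or the test module set this to ndb.Client().
    ndb_client = None

    def setUp(self):
        super().setUp()
        context = self.ndb_client.context()
        context.__enter__()
        # addCleanup runs even when a later setUp step or tearDown raises,
        # so the context is not left open for the next test.
        self.addCleanup(context.__exit__, None, None, None)


class FakeClient:
    """Hypothetical stand-in for ndb.Client that records enter/exit events."""
    def __init__(self):
        self.events = []

    @contextmanager
    def context(self):
        self.events.append("enter")
        try:
            yield
        finally:
            self.events.append("exit")


class ExampleTest(NdbContextTestCase):
    ndb_client = FakeClient()

    def test_context_is_open(self):
        # The context was entered in setUp and has not been exited yet.
        self.assertEqual(self.ndb_client.events[-1], "enter")
```

Registering the exit with `addCleanup` is one way to avoid a context staying open when a subclass overrides `tearDown()` without calling the base implementation.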

unit tests are definitely slower now with the emulator vs testbed in the python 2 runtime. one test suite of mine that does datastore operations in almost every test, often many per test, takes ~50s to run 526 tests. not great, but i usually iterate by running individual tests, so it doesn't hurt too much.

noverlyjoseph

Feb 19, 2020, 11:34:03 AM
to Google App Engine
Google does recommend using a local environment for testing instead of dev_appserver, and using pytest for unit testing. [1]
dev_appserver was recommended for first-generation App Engine.


[1] https://cloud.google.com/appengine/docs/standard/python3/testing-and-deploying-your-app#running_locally

Jon Grover

Feb 21, 2020, 2:01:45 PM
to google-a...@googlegroups.com
Yes, but we’re still on the Python 2 environment, hence the migration. Once we're on Cloud NDB we need to finish eliminating testbed and move off taskqueues, though I think our usage of the latter is small enough that I can do it at the same time we move to Python 3.

Ryan, thanks for the additional suggestions. I ran into an error when trying to set the context in the unittest setUp() method that said there was already a context open, and I wasn't immediately able to figure out why the context wasn't getting closed in tearDown(). I was able to get things unstuck a bit by overwriting the unittest.TestCase run() method as below, though that seems like an ugly approach. But you gave me some new things to try out, so thank you!

    def run(self, result=None):
        # Run the whole test, including setUp() and tearDown(), inside one NDB context.
        with ndb_client.context():
            return super(BackstabbrTestCase, self).run(result)
