What is the status of the NDB project?

Kien Nguyen Trung

Sep 25, 2016, 2:16:58 AM

to appengine-ndb-discuss

Is the NDB project still supported by Google?

I checked the GitHub repo https://github.com/GoogleCloudPlatform/datastore-ndb-python

The repository has not been updated in the last 4 months, and many GitHub issues have gone unanswered.

Beech Horn

Sep 25, 2016, 2:20:34 AM

to appengine-...@googlegroups.com

Only Google/Alphabet could comment officially. Personally, I think it has not been fully developed in the open since Guido van Rossum left.

--
You received this message because you are subscribed to the Google Groups "appengine-ndb-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to appengine-ndb-discuss+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Guido van Rossum

Sep 25, 2016, 5:31:06 PM

to appengine-...@googlegroups.com, Patrick Costello

I don't see a reason to believe they're no longer supporting it. The
docs are still there and the tracker has many lively discussions
mentioning NDB. But the open source project does seem unsupported.
Maybe Patrick can comment?

--Guido

--
--Guido van Rossum (python.org/~guido)

Kien Nguyen Trung

Sep 26, 2016, 12:56:07 AM

to appengine-ndb-discuss

OMG, Guido van Rossum answered my question.

I am still waiting for an official answer from Google/Alphabet.

At our company, we use NDB a lot. The auto-batching feature of NDB is great. Thank you very much, Guido van Rossum.

The library helps us keep our code clean without sacrificing performance.

But there are some cases where we want to tune the auto batcher.

E.g., currently in NDB a query runs without batching, which means the actions that follow the query are not batched either.

Let me make this clearer.

In our app, we have the models User and Debt:

class User(ndb.Model):
    name = ndb.StringProperty()

class Debt(ndb.Model):
    user_key = ndb.KeyProperty(kind='User')

So now, if we run code like this:

@ndb.tasklet
def get_debt_count(user_key):
    key = '%s:debt_count' % user_key.urlsafe()
    ctx = ndb.get_context()
    value = yield ctx.memcache_get(key, use_cache=True)
    if value is None:
        value = yield Debt.query(Debt.user_key == user_key).count_async()
        yield ctx.memcache_set(key, value, use_cache=True)
    raise ndb.Return(value)

@ndb.tasklet
def get_user(user_key):
    user, debt_count = yield user_key.get_async(), get_debt_count(user_key)
    raise ndb.Return(user, debt_count)

@ndb.tasklet
def get_users_info(user_keys):
    result = yield map(get_user, user_keys)
    raise ndb.Return(result)

Now, if we want to get the info for 10 users, the code

get_users_info(user_keys).get_result()

will run with

+ 1 memcache.get for the users and the cached debt counts

+ 10 queries on Debt to get the debt counts, if the debt counts are not cached

+ 10 memcache.set calls to write the debt counts into the cache

What we want is an option to tell the context to wait until the queries finish, and then batch all 10 memcache.set calls into 1 call.

To do that, I think I could write an auto batcher for these Debt queries, where the flush of that auto batcher runs all the queries in the batch in parallel.

Is there any way to do that without changing the internals of the NDB library?
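One possible way to get this effect without touching library internals might be to restructure the work into phases, so that every cache write is issued in the same step. The sketch below is not NDB code: it uses Python 3 asyncio, a hypothetical in-memory `FakeMemcache` that only counts round trips, made-up data in `DEBTS`, and a `count_debts` coroutine standing in for the Debt count query. It only illustrates the shape: one batched read, one query per cache miss run concurrently, then one batched write.

```python
import asyncio


class FakeMemcache:
    """Hypothetical in-memory cache that counts round trips (not the App Engine API)."""

    def __init__(self):
        self.data = {}
        self.calls = {"get_multi": 0, "set_multi": 0}

    async def get_multi(self, keys):
        self.calls["get_multi"] += 1
        return {k: self.data[k] for k in keys if k in self.data}

    async def set_multi(self, mapping):
        self.calls["set_multi"] += 1
        self.data.update(mapping)


DEBTS = {"u1": 3, "u2": 0, "u3": 7}  # made-up debt counts per user
query_calls = 0


async def count_debts(user_id):
    """Stand-in for the Debt count query; one call per user."""
    global query_calls
    query_calls += 1
    await asyncio.sleep(0)  # yield to the event loop, as a real RPC would
    return DEBTS[user_id]


async def get_debt_counts(cache, user_ids):
    keys = {uid: "%s:debt_count" % uid for uid in user_ids}
    # Phase 1: one batched cache read for every user.
    cached = await cache.get_multi(list(keys.values()))
    missing = [uid for uid in user_ids if keys[uid] not in cached]
    # Phase 2: run the queries for the cache misses concurrently.
    counts = await asyncio.gather(*(count_debts(uid) for uid in missing))
    fresh = {keys[uid]: n for uid, n in zip(missing, counts)}
    # Phase 3: one batched cache write, issued only after all queries finish.
    if fresh:
        await cache.set_multi(fresh)
    cached.update(fresh)
    return {uid: cached[keys[uid]] for uid in user_ids}


cache = FakeMemcache()
result = asyncio.run(get_debt_counts(cache, ["u1", "u2", "u3"]))
```

The trade-off of this shape is that the cache write waits for the slowest query, so no count is cached until all of them have finished.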

Beech Horn

Sep 26, 2016, 4:47:04 AM

to appengine-...@googlegroups.com

Hi Kien,

You'll have to excuse me for not remembering this very well. Last time I looked at NDB, the context did do the auto-batching behaviour you describe.

Is there a chance that in-lining the yield on the future assignment lines is causing this? I had similar behaviour before getting to grips with the way it worked.

I would end up replacing lines like:

result = yield map(get_user, user_keys)

with ones more like

futures = map(get_user, user_keys)

results = yield futures

You can then use hooks to record the number of API calls up until that point, etc., to narrow down the issue.

This is on the understanding that it auto-batches correctly by itself and would only break if I'd broken it myself. If this behaviour has changed significantly in recent years, then please ignore me.

Kien Nguyen Trung

Sep 26, 2016, 7:02:10 AM

to appengine-ndb-discuss

Hi Beech

I can confirm that there is no difference between

result = yield map(get_user, user_keys)

and

futures = map(get_user, user_keys)

results = yield futures

I think it is because a Query is not batchable (in the Context class, I only see auto batchers for memcache get/set/cas/off and datastore get/put/delete).

So a Query is run independently by the event loop, and when it finishes, the event loop runs the next statements in the current generator.

I can understand why it was designed like this: there is the case where one Query follows another Query, so the best the scheduler can do is try to run each one as fast as possible.

I just think it would be great if I could add more auto batchers to the current Context without monkey patching it.

I have some ideas for customizing NDB, but the status of the NDB repository on GitHub makes me feel like this project is not active.

In fact, the Git history shows that after Guido left Google the NDB project did not change much; only a few commits were added in 4 years.

I do hope we can bring this project back, because I think the concept of auto batching is really great.
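The effect described above, where calls that reach a batcher at the same moment share a batch while calls that follow independently completing queries each flush alone, can be sketched with a toy model. This is not NDB's implementation: the `AutoBatcher` class here is hypothetical, it is written with Python 3 asyncio, and it flushes on the next event-loop tick purely for simplicity, which is an assumption and not a claim about how NDB schedules its flushes. The `asyncio.Event` objects stand in for queries that finish at different times.

```python
import asyncio


class AutoBatcher:
    """Toy batcher: queues items and flushes them together on the next loop tick."""

    def __init__(self):
        self.pending = []   # (item, future) pairs waiting for the next flush
        self.batches = []   # record of every flushed batch, for inspection

    def add(self, item):
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        if not self.pending:
            # First item since the last flush: schedule exactly one flush.
            loop.call_soon(self._flush)
        self.pending.append((item, fut))
        return fut

    def _flush(self):
        batch, self.pending = self.pending, []
        self.batches.append([item for item, _ in batch])
        for item, fut in batch:
            fut.set_result(item)


async def set_directly(batcher, i):
    # Every task reaches the batcher in the same tick, so they share a batch.
    await batcher.add(i)


async def set_after_query(batcher, i, done_events):
    # Each task first waits for its own "query", which finishes at its own time.
    await done_events[i].wait()
    await batcher.add(i)


async def main():
    direct = AutoBatcher()
    await asyncio.gather(*(set_directly(direct, i) for i in range(3)))

    after = AutoBatcher()
    events = [asyncio.Event() for _ in range(3)]
    tasks = [asyncio.ensure_future(set_after_query(after, i, events))
             for i in range(3)]
    for ev in events:
        ev.set()                # "query" i completes
        await asyncio.sleep(0)  # let the waiting task run and enqueue its set
        await asyncio.sleep(0)  # let the scheduled flush run
    await asyncio.gather(*tasks)
    return direct.batches, after.batches


direct_batches, after_batches = asyncio.run(main())
```

In this toy, the three direct calls end up in a single batch, while the three calls made after separate "queries" are flushed one at a time.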

Beech Horn

Sep 26, 2016, 7:16:20 AM

to appengine-...@googlegroups.com

Hi Kien,

For queries I'd do an ID-only query, then use the IDs to get the database objects. There should be an option for ID-only queries. Apologies again if this isn't useful or if I am reading this incorrectly.

Kien Nguyen Trung

Sep 26, 2016, 8:15:30 AM

to appengine-ndb-discuss

Hi Beech

I think the ID-only query you mean is `get` in the datastore. For that operation, NDB works like a charm :)

Beech Horn

Sep 26, 2016, 12:30:45 PM

to appengine-...@googlegroups.com

Hi Kien,

On the query, specify:

keys_only=True

This makes sure the query performs only the API call for the query itself.

You can then get the objects from the datastore at a point of your choosing, using batch operations for instance.

If it's still giving you a larger number of API calls than expected, add logging and look up the number of calls performed so far at different stages in your code.

My general survival technique for keeping stair-stepping/sequential API calls down was to use keys_only queries and to yield futures at the last minute (hence the technique of not yielding them inline). I appreciate the latter doesn't help in this case, but creating the futures and then yielding them as a separate step saves re-factoring for efficiency later on. YMMV.
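As a rough illustration of the keys-only pattern, the sketch below collects keys from several keys-only queries and then fetches all the entities in one batched call. It is plain Python and not NDB: `FakeDatastore`, its `query_keys` and `get_multi` methods, and the sample data are all hypothetical, and the `rpc_count` counter only models round trips.

```python
class FakeDatastore:
    """Hypothetical in-memory store that counts round trips (not the NDB API)."""

    def __init__(self, entities):
        self.entities = entities  # key -> entity dict
        self.rpc_count = 0

    def query_keys(self, user_id):
        """Keys-only query: returns matching keys, not entities. One round trip."""
        self.rpc_count += 1
        return [k for k, e in sorted(self.entities.items())
                if e["user"] == user_id]

    def get_multi(self, keys):
        """Batched get: any number of keys in one round trip."""
        self.rpc_count += 1
        return [self.entities[k] for k in keys]


store = FakeDatastore({
    "debt1": {"user": "u1", "amount": 10},
    "debt2": {"user": "u2", "amount": 25},
    "debt3": {"user": "u1", "amount": 5},
})

# One keys-only query per user; only keys come back at this stage.
all_keys = []
for user_id in ["u1", "u2"]:
    all_keys.extend(store.query_keys(user_id))

# A single batched get for every entity, at a point of our choosing.
debts = store.get_multi(all_keys)
amounts = [d["amount"] for d in debts]
```

Here the number of round trips is one per query plus one for the batched get, regardless of how many entities the queries matched.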

Beech Horn

Sep 26, 2016, 12:31:45 PM

to appengine-...@googlegroups.com

If I remember correctly, doing a keys_only=True query and then getting the entities from the datastore costs the same as doing a query that returns the objects there and then (but this is from memory and not to be trusted as the official answer).

Kien Nguyen Trung

Sep 26, 2016, 10:15:36 PM

to appengine-ndb-discuss

Hi Beech

Thank you for showing me the new technique with `keys_only`.

Do you mean I should code it like this?

@ndb.tasklet
def get_debt_count(user_key):
    key = '%s:debt_count' % user_key.urlsafe()
    ctx = ndb.get_context()
    value = yield ctx.memcache_get(key, use_cache=True)
    if value is None:
        items = Debt.query(Debt.user_key == user_key).fetch(keys_only=True)
        value = len(items)
        yield ctx.memcache_set(key, value, use_cache=True)
    raise ndb.Return(value)

But as I understand it, if I code it like this, these `Debt.query` queries will run sequentially rather than in parallel.
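The difference between a blocking call and a yielded asynchronous call inside concurrent workers can be shown with a small analogy. This sketch is Python 3 asyncio, not NDB: `blocking_query` stands in for a synchronous `fetch()`, `async_query` stands in for yielding on `fetch_async()`, and both names are hypothetical. The `log` list records the order in which the stand-in queries start and end.

```python
import asyncio

log = []


def blocking_query(i):
    """Stand-in for a synchronous fetch(): never yields to the event loop."""
    log.append("start %d" % i)
    log.append("end %d" % i)
    return i


async def async_query(i):
    """Stand-in for yielding on fetch_async(): other work can run while waiting."""
    log.append("start %d" % i)
    await asyncio.sleep(0)  # other coroutines may run here
    log.append("end %d" % i)
    return i


async def worker_blocking(i):
    return blocking_query(i)


async def worker_async(i):
    return await async_query(i)


async def run(worker):
    log.clear()
    await asyncio.gather(*(worker(i) for i in range(3)))
    return list(log)


blocking_order = asyncio.run(run(worker_blocking))
async_order = asyncio.run(run(worker_async))
```

With the blocking stand-in, each query starts and ends before the next one begins; with the asynchronous stand-in, all three start before any of them ends. In NDB terms, the tasklet-friendly form of the code above would presumably yield `fetch_async(keys_only=True)` instead of calling `fetch(keys_only=True)`.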
