NDB Parallel Tasklets

Moises Belchin

Jan 30, 2013, 12:00:31 PM
to Google App Engine
Hi all,

Please take a look at this parallel tasklet code snippet #1.


@ndb.tasklet
def get_data_parallel(e):
  usr, det = yield (e.user.get_async(),
                    MyKind.query(ancestor = e.key).fetch_async())
  raise ndb.Return((e, usr, det))
  

If e.user is None, this raises an exception.

  
So I tried snippet #2 below, but I still get an exception: "TypeError: Expected Future, received <type 'NoneType'>: None"


@ndb.tasklet
def get_data_parallel(e):
  usr, det = yield (e.user.get_async() if e.user else None, 
                    MyKind.query(ancestor = e.key).fetch_async())
  raise ndb.Return((e, usr, det))
  
How can I do something like snippet #2? How can I return something like future(None) or future('')?

Thanks and regards
Moisés Belchín.

Guido van Rossum

Jan 30, 2013, 1:03:51 PM
to google-a...@googlegroups.com, appengine-...@googlegroups.com
You can factor it out into two yields, one of which is optional. First
create a future for the query that you always want to run:

f = MyKind.query(ancestor = e.key).fetch_async() # No yield!

Then conditionally yield the other async request:

if e.user:
  usr = yield e.user.get_async()
else:
  usr = None

Finally yield the future:

det = yield f

The trick is that the query will run when you yield the other operation.
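
NDB only runs inside App Engine, so the following is a rough analogy in Python 3's standard asyncio rather than NDB code. It is a sketch under assumptions: fetch_user and fetch_details are hypothetical stand-ins for e.user.get_async() and the MyKind query, and asyncio.create_task plays the role of calling an _async method without yielding. The shape is the same as above: start the request that is always needed, conditionally wait for the optional one, then collect the first.

```python
import asyncio

async def fetch_user(user_key):
    # Hypothetical stand-in for e.user.get_async().
    await asyncio.sleep(0.01)
    return "user-%s" % user_key

async def fetch_details(key):
    # Hypothetical stand-in for MyKind.query(ancestor=e.key).fetch_async().
    await asyncio.sleep(0.01)
    return ["detail-for-%s" % key]

async def get_data(key, user_key):
    # Start the request that is always needed, but do not wait for it yet.
    details_task = asyncio.create_task(fetch_details(key))
    # Only issue the optional request when there is a key to fetch.
    if user_key is not None:
        usr = await fetch_user(user_key)
    else:
        usr = None
    # Collect the first request; it has been in flight in the meantime.
    det = await details_task
    return key, usr, det
```

Calling asyncio.run(get_data("e1", None)) takes the else branch, so nothing ever tries to wait on None.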

--
--Guido van Rossum (python.org/~guido)

Jim Morrison

Jan 30, 2013, 4:46:55 PM
to google-a...@googlegroups.com, appengine-...@googlegroups.com
We added the following method:
@staticmethod
@ndb.tasklet
def get_key_async(key):
  """
  Returns a future that, upon calling get_result(), will either
  return the result of key.get_async()
  or None if the key was None.
  """
  if not key:
    raise ndb.Return(None)
  result = yield key.get_async()
  raise ndb.Return(result)

to our code. A similar tasklet (which we've since removed from our code) could be:
@ndb.tasklet
def future_or_none(future):
  if future:
    result = yield future
    raise ndb.Return(result)
  raise ndb.Return(None)

Then in your case you'd do
yield (future_or_none(e.user.get_async() if e.user else None), ...)
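
For readers who want to try the pattern outside App Engine, here is a minimal sketch of the same idea in Python 3's standard asyncio. It is an analogy, not NDB code: value_or_none mirrors future_or_none, asyncio.gather stands in for yielding a tuple, and fetch_user and fetch_details are hypothetical stand-ins for the two datastore calls.

```python
import asyncio

async def fetch_user(user_key):
    # Hypothetical stand-in for e.user.get_async().
    await asyncio.sleep(0.01)
    return "user-%s" % user_key

async def fetch_details(key):
    # Hypothetical stand-in for MyKind.query(ancestor=e.key).fetch_async().
    await asyncio.sleep(0.01)
    return ["detail-for-%s" % key]

async def value_or_none(awaitable):
    # Wrap "an awaitable or None" so the caller always gets something awaitable.
    if awaitable is None:
        return None
    return await awaitable

async def get_data_parallel(key, user_key):
    # Both entries are awaitables, so they can be waited on together
    # even when there is no user to fetch.
    usr, det = await asyncio.gather(
        value_or_none(fetch_user(user_key) if user_key else None),
        fetch_details(key),
    )
    return key, usr, det
```

The wrapper keeps the None check in one place, so the call site stays a single combined wait.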

--
Jim

Moises Belchin

Jan 31, 2013, 3:36:07 AM
to Google App Engine
Hi Guido, 

Thanks for the answer and thanks for the trick. 

I worked through your code. In that sample the yields run concurrently, not in parallel. What I'm really trying to do is run the async calls in parallel by yielding a tuple, but one of the keys may be None, and in that case it throws an exception: "None object has no attribute .get_async()".

How can I avoid this exception?

Thanks in advance and regards.


Saludos.
Moisés Belchín.



Moises Belchin

Jan 31, 2013, 3:39:09 AM
to Google App Engine
Hi Jim,

thanks for the answer and for the code.

I think your code is a better approach for running the yields in parallel and for checking whether one of them is None.

Thanks in advance and regards.


Saludos.
Moisés Belchín.



Moises Belchin

Jan 31, 2013, 5:05:59 AM
to Google App Engine
Hi all again,

Is there any significant difference between Code#1 and Code#2 (both at the end of this message)? According to the docs, code like Code#2 runs in parallel:

@ndb.tasklet
def get_cart_plus_offers(acct):
  cart, offers = yield get_cart_async(acct), get_offers_async(acct)
  raise ndb.Return((cart, offers))

That yield x, y is important but easy to overlook. If that were two separate yield statements, they would happen in series. But yielding a tuple of tasklets is a parallel yield: the tasklets can run in parallel and the yield waits for all of them to finish and returns the results. (In some programming languages, this is known as a barrier.)


In Appstats there is no significant difference between the two options, or maybe I just don't see it. It's possible I'm missing something.

Could someone shed more light on this?

Thanks in advance and best regards.

Code#1:

@ndb.tasklet
def get_data(e):  
  usr = yield e.user.get_async()
  det = yield MyKind.query(ancestor = e.key).fetch_async()
  raise ndb.Return((e, usr, det))

Code#2:

@ndb.tasklet
def get_data_parallel(e):  
  usr, det = yield (e.user.get_async(), MyKind.query(ancestor = e.key).fetch_async())
  raise ndb.Return((e, usr, det))



Saludos.
Moisés Belchín.



Guido van Rossum

Jan 31, 2013, 11:28:35 PM
to google-a...@googlegroups.com
Are you looking at Appstats in the dev appserver? It does not give
results (in cases like this) that match production. The dev appserver
does not really execute RPCs in parallel (which is the same as
concurrently, here).

I promise you that in production the "yield f1, f2" form runs the
tasks represented by futures f1 and f2 concurrently (== in parallel).

I should also explain (again) that there is a huge difference between this:

f1 = foo_async()
f2 = bar_async()
yield f1, f2

vs.

yield foo_async()
yield bar_async()

the latter is equivalent to

f1 = foo_async()
yield f1
f2 = bar_async()
yield f2

Because the first future is yielded before the second is even created,
nothing runs concurrently (== in parallel) here. However, now compare
to this:

f1 = foo_async()
f2 = bar_async()
yield f1
yield f2

This runs both futures in parallel (== concurrently) even though they
are yielded separately! The reason is that when you yield *any*
future, *all* futures that exist at that point are allowed to run. But
futures that haven't been created yet can't run!

Hope this helps. It is important to "get" this. (Also that no future
runs until you yield something. Futures are buffered in the app's
memory until a yield forces all buffered futures out to the servers.)
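
The ordering difference described above can be observed outside App Engine with a small analogy in Python 3's standard asyncio. This is a sketch under assumptions, not NDB code: rpc is a hypothetical stand-in for foo_async() and bar_async(), asyncio.create_task plays the role of creating a future without yielding it, and each call appends to a log so the interleaving is visible.

```python
import asyncio

async def rpc(name, log):
    # Hypothetical stand-in for an async datastore call; logs when it
    # starts and when it finishes.
    log.append("start " + name)
    await asyncio.sleep(0.01)
    log.append("end " + name)
    return name

async def one_at_a_time(log):
    # Like "yield foo_async(); yield bar_async()": the second call is
    # not even created until the first one has finished.
    a = await rpc("foo", log)
    b = await rpc("bar", log)
    return a, b

async def created_first(log):
    # Like "f1 = foo_async(); f2 = bar_async(); yield f1; yield f2":
    # both exist before the first wait, so they overlap.
    f1 = asyncio.create_task(rpc("foo", log))
    f2 = asyncio.create_task(rpc("bar", log))
    a = await f1
    b = await f2
    return a, b

def run_both():
    serial_log, overlap_log = [], []
    asyncio.run(one_at_a_time(serial_log))
    asyncio.run(created_first(overlap_log))
    return serial_log, overlap_log
```

In the serial log, foo ends before bar starts. In the overlapping log, both starts appear before either end, even though the two tasks are awaited separately.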

Moises Belchin

Feb 1, 2013, 4:43:52 AM
to Google App Engine
Hi Guido,

Thanks a lot for the great explanation. Sorry for mixing up concurrent and parallel; that was a fast-typing error!

Thanks again and regards.


Saludos.
Moisés Belchín.

