--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
We just learned that there is a limit to how fast you can defer to the task queue. You can only have 10 async requests open at a time, so if you try to defer 100 things, numbers 11-100 will fail whenever the scheduler for task 1 hasn't responded by the time you get to task 11.
This is really annoying: if the scheduler responds half as fast as you try to add tasks, you end up with something like 1-12, 15-18, 20-23, 27-29, and so on. We spent a fair bit of time figuring out why our updates had chunks missing, especially since on days when the defers were running fast everything worked.
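One way to smooth over those dropped defers is to retry each one with a short backoff, giving the scheduler time to acknowledge earlier tasks. Here is a minimal, generic sketch: `defer_fn` is a hypothetical stand-in for whatever defer call you use (e.g. google.appengine.ext.deferred.defer), and the `_max_attempts`/`_initial_delay` knobs are invented for illustration, not part of any App Engine API.

```python
import time

def defer_with_retry(defer_fn, *args, **kwargs):
    """Call a defer-like function, retrying with exponential backoff.

    defer_fn is any callable that may raise transiently (e.g. when too
    many async task-queue requests are already in flight). The retry
    loop gives the scheduler a chance to catch up before giving up.
    """
    max_attempts = kwargs.pop("_max_attempts", 5)
    delay = kwargs.pop("_initial_delay", 0.1)
    for attempt in range(1, max_attempts + 1):
        try:
            return defer_fn(*args, **kwargs)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts; surface the failure
            time.sleep(delay)
            delay *= 2  # back off: 0.1s, 0.2s, 0.4s, ...
```

Wrapping each of the 100 defer calls like this would turn the 13-14, 19, 24-26 style gaps into a short stall instead of silent loss, at the cost of slowing the enqueue loop down.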
Instead of iterating over all keys at once you could also use the hidden __scatter__ property to determine proper split points for your key range in advance. That is what the mapreduce library does.
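The idea behind scatter-based splitting can be sketched without any Datastore access: because entities carrying a __scatter__ value form a roughly uniform random sample of the key space, sorting a sample of their keys and taking evenly spaced ones yields balanced shard boundaries. This is a simplified illustration of the approach, not the mapreduce library's actual code:

```python
def choose_split_points(sampled_keys, num_shards):
    """Pick shard boundaries from a random sample of keys.

    sampled_keys: keys drawn from a uniform random sample of the
    population (what a keys-only query ordered by __scatter__ gives you).
    Returns num_shards - 1 boundary keys defining contiguous key ranges.
    """
    keys = sorted(sampled_keys)
    if num_shards <= 1 or not keys:
        return []
    step = len(keys) / float(num_shards)
    # One boundary between each pair of adjacent shards.
    return [keys[int(step * i)] for i in range(1, num_shards)]
```

Each resulting range can then be handed to its own task, so no single task has to walk the whole key space with a cursor.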
I haven't understood what you can do with the __scatter__ property. As the MapReduce docs say, "We do not allow retrieving this value directly (it's stripped from the entity before it's returned to the application by the Datastore)"...
Also, I haven't understood what's wrong with using tasks and cursors.
A task can simply iterate over the data, fetching N entities at a time (in this case 1000), then launch a task that updates them (either by itself or by splitting the work again). At worst, if 10 minutes isn't enough to iterate over all the data, this "main" task can fetch only a piece of the data, start an update task, then re-launch itself on the next chunk of data...
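The chunked iteration described above can be sketched in plain Python. Here `fetch_page(cursor, limit)` is a hypothetical stand-in for a cursor-based datastore query and `handle_batch` for the update task each chunk is handed to; in a real App Engine app the loop body would instead re-enqueue the "main" task (e.g. via deferred.defer) with the new cursor, so no single task hits the 10-minute limit:

```python
def process_in_chunks(fetch_page, handle_batch, chunk_size=1000):
    """Walk a dataset page by page, dispatching work per page.

    fetch_page(cursor, limit) -> (entities, next_cursor) models a
    datastore query resumed from a cursor; handle_batch models the
    update task launched for each chunk of entities.
    """
    cursor = None
    while True:
        entities, cursor = fetch_page(cursor, chunk_size)
        if not entities:
            break  # nothing left to process
        handle_batch(entities)
        if cursor is None:
            break  # last page reached
```

The same structure works whether the loop runs inside one long task or is unrolled into a chain of short tasks, one per cursor position.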
Obviously, if it's possible to split the work using only key information (for example, if keys are numbered from 1 to 400,000), this works even better, because fetching data by key is always cheaper than querying for the same data...
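For the numeric-key case this splitting is trivial. A minimal sketch, assuming keys are roughly dense integers as in the 1-to-400,000 example; each sub-range can then be fetched by key range from its own task:

```python
def split_key_range(lo, hi, num_tasks):
    """Split an inclusive integer key range into contiguous sub-ranges.

    Returns num_tasks (start, end) pairs covering [lo, hi], with sizes
    differing by at most one, so each task gets a near-equal share.
    """
    total = hi - lo + 1
    base, extra = divmod(total, num_tasks)
    ranges, start = [], lo
    for i in range(num_tasks):
        size = base + (1 if i < extra else 0)  # spread the remainder
        ranges.append((start, start + size - 1))
        start += size
    return ranges
```

Note this only balances well when keys are reasonably dense; with sparse or skewed IDs, sampling (as with __scatter__ above) gives fairer shards.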