Monitoring Datastore Read / Write Time in Google App Engine

Karan Tongay

Dec 5, 2016, 10:39:30 AM
to Google App Engine

I need to perform large insert operations from an App Engine application to Datastore. At present the insertion speed is around 80 rows per minute for a dataset with 11 columns and 80,000 rows. That is far too slow; I need 80 rows per second instead of per minute.

I have read about batch operations, but I still see no considerable improvement when I create batches of 50 rows, and if I add more rows to a batch the operations fail.

Therefore, I need a mechanism by which I can monitor which block of code takes more time.

I tried integrating Jamon, but could not run it because of JSP errors during deployment. I also used Appstats, but found that it does not report the time taken by the datastore queries, and it does not monitor tasks running on the task queue.

Is there any tool from Google that would help monitor the time taken by a block of code in App Engine, as well as the time taken by datastore read and write operations?

Please acknowledge.

Vitaly Bogomolov

Dec 5, 2016, 2:33:25 PM
to Google App Engine
Hi Karan,

Batch operations are the key.
 


5. You can batch put, get and delete operations for efficiency

Every time you make a datastore request, such as a query or a get() operation, your app has to send the request off to the datastore, which processes the request and sends back a response. This request-response cycle takes time, and if you're doing a lot of operations one after the other, this can add up to a substantial delay in how long your users have to wait to see a result.

Fortunately, there's an easy way to reduce the number of round trips: batch operations. The db.put(), db.get(), and db.delete() functions all accept lists in addition to their more usual singular invocation. When passed a list, they perform the operation on all the items in the list in a single datastore round trip and they are executed in parallel, saving you a lot of time. For example, take a look at this common pattern:

for entity in MyModel.all().filter("color =",
    old_favorite).fetch(100):
  entity.color = new_favorite
  entity.put()

Doing the update this way requires one datastore round trip for the query, plus one additional round trip for each updated entity - for a total of up to 101 round trips! In comparison, take a look at this example:

updated = []
for entity in MyModel.all().filter("color =",
    old_favorite).fetch(100):
  entity.color = new_favorite
  updated.append(entity)
db.put(updated)

By adding two lines, we've reduced the number of round trips required from 101 to just 2!
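
For a large import, the same idea can be applied in fixed-size chunks while timing each write, which may help show where the time goes. The following is only a minimal sketch under stated assumptions: `timed`, `chunks`, and `put_in_batches` are hypothetical helper names that do not come from the App Engine SDK, and `put_fn` stands in for whatever batch write call the application uses (for example `db.put`). The default batch size of 50 simply mirrors the figure mentioned in the question.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.time()
    try:
        yield
    finally:
        results.setdefault(label, []).append(time.time() - start)


def chunks(items, size):
    """Yield successive slices of `items`, each holding at most `size` elements."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def put_in_batches(entities, put_fn, batch_size=50, timings=None):
    """Call `put_fn` once per batch of `entities`; return the number of calls."""
    timings = {} if timings is None else timings
    calls = 0
    for batch in chunks(entities, batch_size):
        # Hypothetical hook: inside App Engine, put_fn could be db.put.
        with timed("put", timings):
            put_fn(batch)
        calls += 1
    return calls
```

With 120 entities and a batch size of 50, this makes three calls to `put_fn` (50, 50, and 20 entities) and leaves three durations in `timings["put"]`, which can then be logged to compare the write time against the rest of the request.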


also, see this thread:
https://groups.google.com/forum/#!topic/appengine-ndb-discuss/04wiYYI26pc

WBR, Vitaly

