Datastore write performance question


I.K.

Sep 1, 2008, 8:46:29 AM
to Google App Engine
Hi all,

In the belief that it will improve system performance, I have been
reducing the number of Datastore writes by creating a few larger
models with lots of data in them, rather than a larger number of
smaller models. Am I correct in my assumption?

This obviously affects my design and code, so I just want to be sure
I'm not making extra work for myself.

Thanks

I.K.

Sep 4, 2008, 10:46:07 AM
to Google App Engine
Nobody?

Can somebody point me to any relevant blog entries, posts, or documents,
so I can figure it out myself?

Bill

Sep 4, 2008, 8:50:49 PM
to Google App Engine
The few benchmarks I've run show puts to be very expensive, so
minimizing them will help in general. Your question, though, is a bit
more complex. First, you are increasing the size of your puts. I've
not done any benchmarks checking that tradeoff (number of puts vs.
size of puts). Second, if you increase the size of your models,
depending on your application, you might also increase the chance of
requiring transactions, which are much more expensive than vanilla
puts.

I would suggest running a benchmark experiment in the cloud. Use two
app versions, one with chunky models, the other with properties
distributed. Then time the puts. That would give you a more
definitive answer. If I get the time, I might test it out myself.
-Bill
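
A minimal sketch of how such a comparison could be timed, assuming
nothing beyond the standard library. FakeStore, make_chunky,
make_distributed and time_puts are hypothetical names, and FakeStore is
only an in-memory stand-in: it says nothing about real datastore cost.
To get meaningful numbers you would deploy this and pass the real put
call in place of store.put.

```python
import time


class FakeStore(object):
    """In-memory stand-in for the datastore (hypothetical, local runs only)."""

    def __init__(self):
        self.writes = 0

    def put(self, entity):
        # Count one write per call; a real store would do network I/O here.
        self.writes += 1


def make_chunky(n_records, props_per_record):
    """Build a single large record that holds every property."""
    record = {}
    for r in range(n_records):
        for p in range(props_per_record):
            record["r%d_p%d" % (r, p)] = "x"
    return [record]


def make_distributed(n_records, props_per_record):
    """Build many small records with a few properties each."""
    return [
        dict(("p%d" % p, "x") for p in range(props_per_record))
        for _ in range(n_records)
    ]


def time_puts(put_fn, entities):
    """Return wall-clock seconds spent calling put_fn once per entity."""
    start = time.time()
    for entity in entities:
        put_fn(entity)
    return time.time() - start


store = FakeStore()
chunky_secs = time_puts(store.put, make_chunky(10, 3))
distributed_secs = time_puts(store.put, make_distributed(10, 3))
print("chunky: %.6fs, distributed: %.6fs" % (chunky_secs, distributed_secs))
```

Both variants carry the same 30 property values, so any difference in
a real run would come from the number and size of the puts.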

I.K.

Sep 5, 2008, 4:23:28 AM
to Google App Engine
Thanks for the advice.

Unfortunately I need to do a large number of writes. I have, in effect,
a batch process to run a couple of times a day. I think some
profiling and some proof-of-concept tests might be a good idea, or
I'll run out of free CPU cycles ;)

I'll post a follow up if I get a chance. So much to do and so little
time ;)

Cheers

ryan

Sep 5, 2008, 12:44:45 PM
to Google App Engine
in general, bill is right. writes are more expensive than reads. we
also don't (yet) have dedicated support for large offline or batch
processing.

have you tried passing multiple entities to put(), as opposed to
calling put() individually on each entity? if not, it's worth a try.
it's noticeably more efficient:

http://code.google.com/appengine/docs/datastore/functions.html#put
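
A minimal sketch of the batching idea, assuming the put function
accepts a list of entities as described at the link above.
put_in_batches is a hypothetical helper, and the batch size of 100 is
an arbitrary illustration, not a documented limit. On App Engine,
put_fn would be db.put from google.appengine.ext.db; here a plain
Python function stands in so the sketch runs anywhere.

```python
def put_in_batches(entities, put_fn, batch_size=100):
    """Call put_fn once per slice of at most batch_size entities.

    Returns the number of put_fn calls made.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    calls = 0
    for start in range(0, len(entities), batch_size):
        # One call writes the whole slice instead of one call per entity.
        put_fn(entities[start:start + batch_size])
        calls += 1
    return calls


# Stand-in for the real put call: it just records each batch it receives.
received = []
n_calls = put_in_batches(list(range(29)), received.append, batch_size=10)
print("%d entities written in %d calls" % (sum(len(b) for b in received), n_calls))
```

With individual puts, the 29 stand-in entities would need 29 calls;
grouped in tens they need 3.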

Feris Thia

Sep 19, 2008, 3:16:03 PM
to google-a...@googlegroups.com
Hi,

I've tried to benchmark put performance in the cloud and noticed that writing 29 entities, each with 3 properties and each written with its own put call, took 1 second in one request thread.

My question is: does this limit also apply across parallel requests? In other words, can the cloud handle only about 29 clients' datastore writes per second in total?

Thanks,

Feris