I did some performance testing (and profiling) to figure out the best
strategy for loading multiple entities by key name. I thought it would
be nice to share the results, so here they are...
I've put an excerpt of the source used for the test at the following
place:
http://nopaste.info/31b3652941_nl.html
I have compared the following strategies:
- multiple calls to db.get_by_key_name() or datastore.Get()
(respectively _loadDocumentsAsModel and _loadDocumentsAsRaw below)
- single calls to db.get_by_key_name() or datastore.Get() passing
multiple key_names (respectively _loadDocumentsAsModelBatch and
_loadDocumentsAsRawBatch below)
- a modified version of datastore.Get() using my own Entity._FromPB
(_loadDocumentsAsCustomBatch below)
Note that db.get_by_key_name() uses datastore.Get() in its
implementation.
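Since the App Engine SDK isn't available here, the difference between the per-key and batch calling patterns can be sketched in plain Python. Everything below (`fake_get`, `DATA`, `RPC_CALLS`) is a hypothetical stand-in for a datastore round trip, not the actual API; the point is only that the one-by-one pattern issues one RPC per key while the batch pattern issues a single RPC for the whole list.

```python
# Hypothetical stand-ins: DATA plays the datastore, fake_get a batch
# Get RPC, RPC_CALLS counts simulated round trips.
DATA = {"doc%d" % i: {"title": "t%d" % i} for i in range(5)}
RPC_CALLS = 0

def fake_get(key_names):
    """Simulated batch Get: one 'RPC' regardless of how many keys."""
    global RPC_CALLS
    RPC_CALLS += 1
    return [DATA.get(k) for k in key_names]

def load_one_by_one(key_names):
    # Pattern of _loadDocumentsAsModel / _loadDocumentsAsRaw:
    # one RPC per key name.
    return [fake_get([k])[0] for k in key_names]

def load_batch(key_names):
    # Pattern of _loadDocumentsAsModelBatch / _loadDocumentsAsRawBatch:
    # a single RPC carrying every key name.
    return fake_get(list(key_names))

keys = sorted(DATA)
one_by_one = load_one_by_one(keys)  # 5 simulated RPCs
batched = load_batch(keys)          # 1 simulated RPC
assert one_by_one == batched
print(RPC_CALLS)  # prints 6
```

The per-call overhead is what the batch strategies amortize, which matches the timing gap between the first two rows and the batch rows in the results below.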
The model has about 13 properties, among them three string properties
that are usually about 30 characters long and a text property about
200 characters long.
Results:
Load strategy \ number of entities:  300    1000   2000
_loadDocumentsAsModel                2.53s  -      -
_loadDocumentsAsRaw                  2.48s  -      -
_loadDocumentsAsModelBatch           0.73s  2.60s  -
_loadDocumentsAsRawBatch             0.70s  2.47s  -
_loadDocumentsAsCustomBatch          0.42s  1.48s  2.97s
Notes:
- Missing entries are due to exceeding the request deadline
(processing time > 3s).
Conclusion:
- Batch gets, where you pass the whole list of key names, are a must.
- datastore.Get() gives only a small ~5% advantage over
db.get_by_key_name().
- _loadDocumentsAsCustomBatch, using the modified datastore.Get() I
dubbed get_as_pb(), seems worth the pain when a significant number of
entities must be retrieved. It provides a good 30% reduction in CPU
usage compared to datastore.Get(), as it replaces the costly
Entity._FromPB() mapping with a more efficient (and specific) one.
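The custom mapping itself lives in the pasted source, but the idea can be sketched in plain Python. All names here (`raw_records`, `generic_decode`, `specific_decode`) are hypothetical illustrations, not the SDK's API: the generic path walks and materializes every property of each record, while a purpose-built decoder pulls out only the fields the handler actually needs.

```python
# `raw_records` is a hypothetical stand-in for decoded protocol-buffer
# payloads returned by the datastore.
raw_records = [
    {"title": "doc-a", "body": "x" * 200, "author": "alice",
     "rev": 3, "tags": ["a", "b"]},
    {"title": "doc-b", "body": "y" * 200, "author": "bob",
     "rev": 1, "tags": ["c"]},
]

def generic_decode(rec):
    """Generic mapping (the Entity._FromPB role): builds a full
    entity, touching every property of the record."""
    entity = {}
    for name, value in rec.items():
        entity[name] = value  # real code would also dispatch on type
    return entity

def specific_decode(rec):
    """Specific mapping: extracts only the two fields this view
    needs, skipping the rest of the record entirely."""
    return (rec["title"], rec["author"])

docs = [specific_decode(r) for r in raw_records]
print(docs)  # prints [('doc-a', 'alice'), ('doc-b', 'bob')]
```

Skipping the generic property-by-property hydration for a task-specific projection is what (in this interpretation) accounts for the CPU savings reported above.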