Better solution for get_multi_async

Fredrik Bonander

Mar 6, 2012, 10:56:12 AM3/6/12
to appengine-...@googlegroups.com
Hi, 

In the application I'm currently developing, I'm fetching a number of records based on an array of ndb.Key objects with get_multi_async. The number of keys in the array is somewhere between 10 and 200 (on some occasions more), depending on the action needed. 

When using Appstats in production, it shows me that it does a datastore_v3.Get for each key in the array, which makes sense considering it needs to fetch one record per unique key. However, it uses a lot of CPU time. 

In a test request, it totals 94 RPCs with an RPC total of 302239ms (14275ms api). 

So my questions are:
1. Is there any possibility to reduce the number of RPCs (get_multi_async is responsible for 78 of the datastore_v3.Get calls)?
2. When inspecting the datastore_v3.Get calls, they all run async, but the first datastore_v3.Get seems to take as long as it takes for the last one to finish, in this case almost 10 seconds. Is this correct?
3. Is there any way to simulate the same behavior in the dev_appserver? When looking at the same request there, it only makes one datastore_v3.Get.

The core concept of the application is a simple index for search terms. I take a string property of a datastore entity, break it down into substrings, and store each substring as an entity with the substring as its key and a property called reference_list (a model.KeyProperty(repeated=True)) holding the key of the original datastore entity. If the same substring is found in another datastore entity, that entity's key is appended to the reference_list as well.
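
Roughly, the substring breakdown looks like this (a simplified sketch; the helper name is made up here just for illustration):

def substrings(word):
    """Return all contiguous substrings, e.g. 'abc' -> set(['a', 'b', 'c', 'ab', 'bc', 'abc'])."""
    return set(word[i:j] for i in range(len(word))
               for j in range(i + 1, len(word) + 1))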

My IndexModel looks like:

class SearchIndex(model.Model):
    name = model.StringProperty(required=True) 
    reference_list = model.KeyProperty(repeated=True)

And the code for storing the substrings:

    @classmethod
    @ndb.tasklet
    def add_verbs_to_index(cls, verb_list, datastore_entity):
        """Create or update an index entity for each verb in verb_list.

        A verb is either an entire word or a substring of a word.

        Args:
            verb_list: A set of verbs to be indexed for the current datastore_entity.
            datastore_entity: The datastore entity to associate with the current set of verbs.

        Example of a verb_list for the string "abc":
            verb_list = set(['a', 'b', 'c', 'ab', 'bc', 'abc'])
        """
        # Sets have no defined order, so freeze the verbs into a list and
        # build the keys from that list so positions line up.
        verbs = list(verb_list)
        key_list = [ndb.Key(cls._SEARCH_INDEX_DB_MODEL, verb) for verb in verbs]
        # Fetch all entities associated with the keys.
        # The result contains None at each position where the key has no entity yet.
        index_entities = yield ndb.get_multi_async(key_list)
        # Entities that need to be written back to the datastore.
        new_index_entities = []

        for n, ent in enumerate(index_entities):
            if ent is None:
                # The entity doesn't exist in the datastore yet, so create it.
                verb = verbs[n]
                ent = cls._SEARCH_INDEX_DB_MODEL(
                    key=key_list[n], name=verb,
                    reference_list=[datastore_entity.key])
                new_index_entities.append(ent)
            else:
                # The entity already exists; record datastore_entity.key in its
                # reference_list so all entities associated with this verb can be found.
                if datastore_entity.key not in ent.reference_list:
                    ent.reference_list.append(datastore_entity.key)
                    new_index_entities.append(ent)

        # If anything changed, write it back.
        if new_index_entities:
            yield ndb.put_multi_async(new_index_entities)

        raise ndb.Return(new_index_entities)



I know this is a lot to take in, but any ideas to get around the high CPU usage for this would be very helpful!

If it makes sense to upload appstats screenshots, just give me a shout and I'll upload them.

Thanks, 

..fredrik







Jakob Holmelund

Mar 6, 2012, 11:18:35 AM3/6/12
to appengine-...@googlegroups.com
Hmm, couldn't you just save the broken-down string in a list and save that on the entity? That way you don't even need the search index. An entity would then hold a list of strings, and you would get all the entities belonging to a certain string (query_terms) by doing

Entity.query(Entity.query_terms == "query")

This can even be extended so that you can query for items containing multiple query terms:

q = "spam eggs"
q_tokens = q.split()
query =  Entity.query() 

for token in q_tokens:
    query =  query.filter(Entity.query_terms ==  token)

Unless you have some weird use case for your specific method, this is the way to go.
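
If I understand your entities right, the model would look something along these lines (Entity and query_terms are just the placeholder names from the snippet above):

class Entity(ndb.Model):
    # ...your existing properties...
    # the broken-down substrings of the string property, indexed by default
    query_terms = ndb.StringProperty(repeated=True)

results = Entity.query(Entity.query_terms == "spam").fetch()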

Jakob Holmelund

Mar 6, 2012, 11:22:41 AM3/6/12
to appengine-...@googlegroups.com
And if your items are regularly fetched, you could even utilize the keys_only argument in fetch(), and then do a get_multi() on the keys.
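
Roughly like this (just a sketch):

# Fetch only the keys (cheap), then get the full entities in one go;
# the gets can be answered from NDB's caches for frequently fetched items.
keys = Entity.query(Entity.query_terms == "spam").fetch(keys_only=True)
entities = ndb.get_multi(keys)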

Fredrik Bonander

Mar 6, 2012, 11:31:18 AM3/6/12
to appengine-...@googlegroups.com
Thanks for the quick response. 

I'll try your method out. 

The main reason for my approach is that I want fast queries against the index. In my solution I can simply get the index entity with the search term as its key. 

Also, doesn't querying a list property cause a datastore_v3.Get for each item in the list?

..fredrik

Jakob Holmelund

Mar 6, 2012, 12:07:31 PM3/6/12
to appengine-...@googlegroups.com
Nope, only one datastore_v3.Get per query if you don't use OR. The trick is that the query doesn't execute until you call fetch() on it. You could even add more filters for other properties. This method only scales so much, but it seems it would be fine for your use case.

Guido van Rossum

Mar 6, 2012, 12:12:43 PM3/6/12
to appengine-...@googlegroups.com
I am guessing you are using the HR Datastore. In this case, the
lowest-level datastore API code (in
google/appengine/datastore/datastore_rpc.py -- this is not part of
NDB) splits your requests up by entity group. Assuming that all your
keys are root keys, each is in its own entity group, and you end up
with as many parallel Get RPCs as you have keys. The engineers
responsible for the HRD implementation assure me this is more
efficient than issuing a single multi-key Get RPC. Nevertheless, you
can always try issuing fewer RPCs by passing
max_entity_groups_per_rpc=N to the get_multi_async() call. I would try
small values first and see if it improves the real time taken by your
requests: N=1, N=2, N=5, N=10. You can try the same thing for the
put_multi_async() call.
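
Concretely, in the tasklet you posted, that would look something like:

    index_entities = yield ndb.get_multi_async(
        key_list, max_entity_groups_per_rpc=10)  # try 1, 2, 5, 10 and compare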

I wouldn't try to put all keys in a single entity group, since AFAIK
it doesn't scale to the volume I assume you are interested in
(millions of verbs, I presume).

The times reported by Appstats for overlapping async RPCs aren't all
that meaningful; all you can really tell for sure is how many RPCs
happened and how long they took all together in real time. The sum of
their real times is meaningless; the api time is somewhat meaningful
as it is a measure for how much work the backend had to do to satisfy
your request. (However, since the introduction of new billing, this
number is not directly related to the price charged for that work.
This is a known issue that we hope to address.)

The actual CPU time (not real time) used by your request is most
likely due to deserialization costs. This should be pretty much
independent of how many RPCs are being executed in real time. The only
way to reduce this is to have fewer properties in your entities. (Even
if the request is satisfied from memcache, you still incur exactly the
same deserialization overhead -- each entity still has to be converted
from a string to a structured object.)

It seems you are not getting the data from memcache. Did you
explicitly turn it off in NDB? Or are these verbs simply too "fresh"
or too infrequently used to be found in memcache?

--Guido

--
--Guido van Rossum (python.org/~guido)

Fredrik Bonander

Mar 6, 2012, 3:12:53 PM3/6/12
to appengine-...@googlegroups.com
Thanks. 

I tried your solution and it works great for my current project. 

One drawback was that I can no longer do queries like SearchIndex.query(SearchIndex.name >= qry, SearchIndex.name <= query_stop). But that's a bit overkill for my current project anyway. 

And one odd thing I noticed: when adding 10-15 entities with a substring list of about 200 substrings, it took about ~4000 write operations, which is about 8% of the daily quota of write operations.
Is this because each datastore value is written as a key-value pair, hence one write operation per item in the substring list?

..fredrik 

Fredrik Bonander

Mar 6, 2012, 3:51:30 PM3/6/12
to appengine-...@googlegroups.com


On Tuesday, March 6, 2012 6:12:43 PM UTC+1, Guido van Rossum wrote:
I am guessing you are using the HR Datastore. In this case, the
lowest-level datastore API code (in
google/appengine/datastore/datastore_rpc.py -- this is not part of
NDB) splits your requests up by entity group. Assuming that all your
keys are root keys, each is in its own entity group, and you end up
with as many parallel Get RPCs as you have keys. The engineers
responsible for the HRD implementation assure me this is more
efficient than issuing a single multi-key Get RPC. Nevertheless, you
can always try issuing fewer RPCs by passing
max_entity_groups_per_rpc=N to the get_multi_async() call. I would try
small values first and see if it improves the real time taken by your
requests: N=1, N=2, N=5, N=10. You can try the same thing for the
put_multi_async() call.

Yes, using HRD. I made some simple tests changing max_entity_groups_per_rpc from 1 to 4 and then 10. For both 4 and 10 I saw a dramatic increase in speed: the get operations went from ~3300ms in total down to ~1300ms, with about half as many RPCs, for max_entity_groups_per_rpc=4, and down to ~450ms for max_entity_groups_per_rpc=10. I'll verify this more carefully to make sure it's not due to the random text I'm testing with containing only stop-words etc. But unfortunately my write operations quota is at 100%, so I have to wait for billing to get enabled. One thing I'm curious about: if I increase max_entity_groups_per_rpc from 1 to 10, what are the drawbacks? Or is it simply better?
 

I wouldn't try to put all keys in a single entity group, since AFAIK
it doesn't scale to the volume I assume you are interested in
(millions of verbs, I presume).

I sure hope that I'm able to build something that can scale to that! :)

 

The times reported by Appstats for overlapping async RPCs aren't all
that meaningful; all you can really tell for sure is how many RPCs
happened and how long they took all together in real time. The sum of
their real times is meaningless; the api time is somewhat meaningful
as it is a measure for how much work the backend had to do to satisfy
your request. (However, since the introduction of new billing, this
number is not directly related to the price charged for that work.
This is a known issue that we hope to address.)

The actual CPU time (not real time) used by your request is most
likely due to deserialization costs. This should be pretty much
independent of how many RPCs are being executed in real time. The only
way to reduce this is to have fewer properties in your entities. (Even
if the request is satisfied from memcache, you still incur exactly the
same deserialization overhead -- each entity still has to be converted
from a string to a structured objects.)

Do you mean entities in general? Or my SearchIndex (it only has 2 properties)?

 

It seems you are not getting the data from memcache. Did you
explicitly turn it off in NDB? Or are these verbs simply too "fresh"
or too infrequently used to be found in memcache?

In this test the verbs are too "fresh"; I'm trying to do my tests on a clean datastore to ensure that my code handles a lot of new "fresh" verbs.
 

--Guido


Jakob Holmelund

Mar 6, 2012, 6:27:51 PM3/6/12
to appengine-...@googlegroups.com

Actually, you can break the string down like

[s, sp, spa, spam, e, eg, egg, eggs]

You could also add phonetic codes that can catch many spelling errors.
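
A rough sketch of that per-word prefix breakdown (the helper name is mine):

def word_prefixes(text):
    """Yield every prefix of every word, e.g. 'spam eggs' ->
    s, sp, spa, spam, e, eg, egg, eggs."""
    for word in text.lower().split():
        for i in range(1, len(word) + 1):
            yield word[:i]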

Jakob Holmelund

Mar 6, 2012, 6:31:10 PM3/6/12
to appengine-...@googlegroups.com

If you don't have too many query terms in your entity, I guess you could also add

[am, pam, gs, ggs]

Guido van Rossum

Mar 6, 2012, 7:01:37 PM3/6/12
to appengine-...@googlegroups.com
> On Tuesday, March 6, 2012 6:12:43 PM UTC+1, Guido van Rossum wrote:
>> I am guessing you are using the HR Datastore. In this case, the
>> lowest-level datastore API code (in
>> google/appengine/datastore/datastore_rpc.py -- this is not part of
>> NDB) splits your requests up by entity group. Assuming that all your
>> keys are root keys, each is in its own entity group, and you end up
>> with as many parallel Get RPCs as you have keys. The engineers
>> responsible for the HRD implementation assure me this is more
>> efficient than issuing a single multi-key Get RPC. Nevertheless, you
>> can always try issuing fewer RPCs by passing
>> max_entity_groups_per_rpc=N to the get_multi_async() call. I would try
>> small values first and see if it improves the real time taken by your
>> requests: N=1, N=2, N=5, N=10. You can try the same thing for the
>> put_multi_async() call.

On Tue, Mar 6, 2012 at 12:51, Fredrik Bonander
<carl.fredr...@gmail.com> wrote:
> Yes using HRD. I made some simple tests with
> changing max_entity_groups_per_rpc from 1, 4 and then 10. For both 4 and 10
> I saw a dramatic increase in speed. From the get operations taking in total
> ~3300ms down to ~1300ms and about half as many RPC
> for max_entity_groups_per_rpc=4. And down to ~450ms
> for max_entity_groups_per_rpc=10. I'll verify this better so it's not due to
> random text I'm testing did contain only stop-words etc. But unfortunately
> my write operations quota is at 100%. So have to wait for billing to get
> enabled. One thing I'm curios about is if I
> increase max_entity_groups_per_rpc from 1 to 10, what's the drawbacks? Or is
> it simply just better ?

It's a tuning parameter. If it's better it's better. Enjoy. :-)

>> I wouldn't try to put all keys in a single entity group, since AFAIK
>> it doesn't scale to the volume I assume you are interested in
>> (millions of verbs, I presume).
>
> I sure hope that I'm able to build something that can scale to that! :)

>> The times reported by Appstats for overlapping async RPCs aren't all
>> that meaningful; all you can really tell for sure is how many RPCs
>> happened and how long they took all together in real time. The sum of
>> their real times is meaningless; the api time is somewhat meaningful
>> as it is a measure for how much work the backend had to do to satisfy
>> your request. (However, since the introduction of new billing, this
>> number is not directly related to the price charged for that work.
>> This is a known issue that we hope to address.)
>>
>> The actual CPU time (not real time) used by your request is most
>> likely due to deserialization costs. This should be pretty much
>> independent of how many RPCs are being executed in real time. The only
>> way to reduce this is to have fewer properties in your entities. (Even
>> if the request is satisfied from memcache, you still incur exactly the
>> same deserialization overhead -- each entity still has to be converted
>> from a string to a structured objects.)
>
> Do you mean entities in general? Or my SearchIndex (it only has 2
> properties) ?

Remember that a repeated property with 10 items in the list costs the
same as 10 single properties.
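
Back-of-the-envelope, if I have the write-op accounting right (roughly 2
writes per new entity plus 2 writes per indexed property value): 10
entities x (2 + 2 x 200 indexed list items) comes to about 4000 write
ops, which lines up with what you saw.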

>> It seems you are not getting the data from memcache. Did you
>> explicitly turn it off in NDB? Or are these verbs simply too "fresh"
>> or too infrequently used to be found in memcache?
>
> In this test the verbs are too "fresh", trying to do my tests on a clean
> datastore to ensure that my code handles a lot of new "fresh" verbs.

Cool.

Guido van Rossum

Mar 6, 2012, 7:38:22 PM3/6/12
to appengine-...@googlegroups.com
On Tue, Mar 6, 2012 at 16:01, Guido van Rossum <gu...@google.com> wrote:
> Remember that a repeated property with 10 items in the list costs the
> same as 10 single properties.

Two more thoughts.

1. Unindexed properties cost fewer write ops than indexed properties.

2. Instead of KeyProperty(repeated=True), consider either a
(non-repeated) JsonProperty containing a list of verbs (it's easy to
reconstitute the key from the verb in your app, IIUC) or even a
TextProperty containing the verbs concatenated with spaces -- it's
easy to split that into a list of verbs, from which you can, again,
easily construct the keys. (This only makes sense if the model
containing the list of key properties is the one whose deserialization
is slowing you down, of course.)
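
A rough sketch of option 2, reusing the SearchIndex model from earlier in
the thread (the property name "references" is just for illustration):

class SearchIndex(model.Model):
    name = model.StringProperty(required=True)
    # One unindexed blob instead of a repeated, indexed KeyProperty.
    # Store whatever lets you rebuild the keys, e.g. a JSON list of the
    # verbs or of urlsafe key strings.
    references = model.JsonProperty()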
