querying for entites returns empty list

2,155 views
Skip to first unread message

Stefan Keil

unread,
Mar 29, 2017, 5:01:38 AM3/29/17
to objectify-appengine
Hey,

I have the problem that querying for entities sometimes returns an empty list, even though there exists data that should be returned. When I go into the browser https://console.cloud.google.com/datastore/entities/query?... and do the query there it works and I can see the data. Then, after a while, performing the query in my code works again and returns the appropriate results.

In more detail:

I have an entity like this:

@Entity
@Cache
public class Entity {
@Parent private Key<ParentClass> parentKey;
@Id private String id;
@Index private String myIndexedString;
@Index private Date myIndexedDate;
...
}

Then I do this query (also sometimes with order("myIndexedDate")):

public List<Entity> queryEntites(Key<ParentClass> parentKey, String theIndexedValue, int offset, int limit) {
return ofy().consistency(Consistency.STRONG).load()
.type(Entity.class)
.ancestor(parentKey)
.filter("myIndexedString", theIndexedValue)
.order("-myIndexedDate")
.offset(offset)
.limit(limit)
.list();
}

I have these entries in my datastore-index.xml:
<datastore-index kind="Entity" ancestor="true" source="manual">
    <property name="myIndexedString" direction="asc"/>
    <property name="myIndexedDate" direction="desc"/>
</datastore-index>

<datastore-index kind="Entity" ancestor="true" source="manual">
    <property name="myIndexedString" direction="asc"/>
    <property name="myIndexedDate" direction="asc"/>
</datastore-index>

Then I call

queryEntites(specificParentKey, specificIndexedValue, 0, 100);

and the returned list is empty.

Immediately after this I use the following function

public List<Entity> queryEntityOnlyByIndexedValueWithoutParentAndOrder(String theIndexedValue, int offset, int limit) {
return ofy().consistency(Consistency.STRONG).load().type(Entity.class)
.filter("myIndexedString", theIndexedValue)
.offset(offset)
.limit(limit)
.list();
}

via calling

queryEntityOnlyByIndexedValue(specificIndexedValue, 0, 100);

and also the returned result is also an empty list.

My hope was that this might work, as there is no multi-index involved.

So what is happening? As it seems, the index is not available. Why is there not an exception is this case, but a returned empty list?

In both cases there should have been returned something. I could check in the browser by querying for "myIndexedString" that there are indeed objects (and I know for other reasons as well). Why can I look at the objects in the browser, even though queryEntityOnlyByIndexedValue() was not working?

I actually also tried the first query queryEntites() 50 times in a row with waiting half a second between each try. In all cases the returned list was empty. Then I tried it again 30min later and it worked.

Some remarks:

I have a lot of Entities in my database (millions), and even though a single entity is normally not changed that often (maybe 5 times in 24h and then possibly never changed again), it can be that for a short period of time a single entity is changed multiple times in a row (like 10 times in 5min), or that fixing a parentKey, that there are many changes to entities with that parentKey (like 10.000 new entries or changes in 1 hour).

Questions:

Why do I get an empty list via the query, even though I can watch the data in the browser?
Why is there no exception thrown?
Is this a problem with the index, or are there other reasons?

Thanks a lot
Stefan

Jeff Schnitzer

unread,
Mar 30, 2017, 4:26:53 PM3/30/17
to objectify-appengine
This sounds like a cache contamination issue - especially the 30m delay (long enough for cache eviction). Perhaps you have those negative cache entries.

First test should be to disable the @Cache annotation and see if this problem goes away.

Second: Are you running a bunch of queries with ofy().cache(false) mixed with regular ofy().load() operations?

Jeff

--
You received this message because you are subscribed to the Google Groups "objectify-appengine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to objectify-appengine+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Stefan Keil

unread,
Mar 31, 2017, 3:11:49 AM3/31/17
to objectify-appengine, je...@infohazard.org
Hey Jeff,

thanks for your answer!

So, first of all from your answer I deduce that you are not expecting the index to be wrong or unavailable, you assume it is a memcache problem. That is kind of good news.

How do queries work with cache?

I imagine it like this:
The query accesses the index and gets a list of keys. This always works without memcache, right?
Then objectify tries to load the entities with the returned keys. Here it will use the memcache if and only if we have ofy().cache(true), right? (Thus it will return those entities that are already in memcache and accesses those from the database that are not in memcache and updates the memcache for those entities, right?)
Then it puts the loaded entities in a list omitting those entities that did not exist, right?
Hence, the return value is at least an empty list and it may contain fewer elements than the list of keys which was returned by the index, right?

Is this how queries work?

Best
Stefan

Additional question:
What exactely happens when I remove the @Cache annotation? Then my entity will never be put in the memcache, independent of whether ofy().cache(true)?
What happens if I remove this annottation and already have some entities in memcache? Assume I will save such an entity again. Then it will not clear the existing memcache entry at that moment? If I would then put the @Cache annotation back to my entity and load again my entity, will it then return the entry in my memcache, which existed there before (and which is not equal to the entity in my database?

Jeff Schnitzer

unread,
Mar 31, 2017, 9:41:12 AM3/31/17
to objectify-appengine
I believe I answered the question about how hybrid queries work in the other email - let me know if further explanation would help.

Regarding the way the cache works:

 * Writes (including deletes) clear the cache entry. They do not populate memcache.
 * Reads check memcache, fetching into it if necessary.

If you have @Cache entities in memcache, then remove the annotation + deploy, then save some enttiies, then try to add the annotation back, yes you may have stale cache entries. Flush the cache. Similarly, you can cause a problem by doing ofy().cache(false).save().enttiies(…) so watch out for that - it literally just ignores the cache for that operation, which means the cache may have stale entries.

If you have having persistent problems, you might just want to remove @Cache from entities or give them a short expiration. Usually memcache is reliable, and an expiration backstop of a few minutes will guarantee that staleness is never severe. But if your data cannot tolerate that kind of “really-really-eventual” consistency, don’t use the cache.

Jeff

--

Stefan Keil

unread,
Mar 31, 2017, 10:10:44 AM3/31/17
to objectify-appengine, je...@infohazard.org
Thanks for your explanations! I hope it is ok, that I ask a follow up question to the hybrid queries.

So what I described

"
The query accesses the index and gets a list of keys. This always works without memcache, right?
Then objectify tries to load the entities with the returned keys. Here it will use the memcache if and only if we have ofy().cache(true), right? (Thus it will return those entities that are already in memcache and accesses those from the database that are not in memcache and updates the memcache for those entities, right?)
Then it puts the loaded entities in a list omitting those entities that did not exist, right?
Hence, the return value is at least an empty list and it may contain fewer elements than the list of keys which was returned by the index, right?
"

is exactely a hybrid query, right?

What I dont understand: what is the difference in a non-hybrid query?

Jeff Schnitzer

unread,
Mar 31, 2017, 10:45:17 AM3/31/17
to objectify-appengine
The naive answer is that non-hybrid queries perform a full query (not keys-only) and iterate through whole entities. It’s faster than doing keysonly + batch get. This will bypass memcache.

The somewhat more complicated answer is that, deep under the covers, google itself is doing a “keys-only query” (aka looking up keys in an index) followed by a “batch get” (fetching entity values for those keys). It’s faster to do this in the bowels of GAE than in userspace, but not radically faster. So the performance penalty of query hybridization (ignoring caching costs) is not as bad as it sounds.

Jeff

--
You received this message because you are subscribed to the Google Groups "objectify-appengine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to objectify-appengine+unsubscribe...@googlegroups.com.

Stefan Keil

unread,
Apr 3, 2017, 4:58:15 AM4/3/17
to objectify-appengine, je...@infohazard.org
So in short, a query is ALWAYS first a keys-only process which happens in an index. And then the result obtained from the index, which is a list of keys, is used for a batch-by-keys-load in the datastore. This second step can happen with or without memcache.

Correct?

Jeff Schnitzer

unread,
Apr 3, 2017, 11:57:32 AM4/3/17
to objectify-appengine
_Very_ crudely, yes. Although when you do a normal (non-keys-only and not-hybrid) query, the work is done by the underlying platform (not in “user space”).

The datastore is built on bigtable. There’s one table that maps your keys to a serialized blob of your entity. Then there are other tables that map index values to keys. Queries walk indexes and fetch the serialized blobs (which may be stored on any of thousands of machines in the cluster). Keys-only (ad projection) queries are fast because they can skip the second fetch step.

Hybrid queries are a bit slower than “native" queries because Objectify does the work in user space so there are additional RPCs. But if the cache hits it will be faster (and cheaper).

Jeff

On Mon, Apr 3, 2017 at 1:58 AM, Stefan Keil <stefa...@fanpagekarma.com> wrote:
So in short, a query is ALWAYS first a keys-only process which happens in an index. And then the result obtained from the index, which is a list of keys, is used for a batch-by-keys-load in the datastore. This second step can happen with or without memcache.

Correct?

--
You received this message because you are subscribed to the Google Groups "objectify-appengine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to objectify-appengine+unsub...@googlegroups.com.

Stefan Keil

unread,
Apr 4, 2017, 3:48:52 AM4/4/17
to objectify-appengine, je...@infohazard.org
Hey Jeff,

thanks a lot for answering!

So a query starts with the "other tables that map index values to keys", and then in a hybrid query, it would return the keys to objectify, which then starts a batch-by-keys-load using first the session cache, and then accesses the database (possibly with or without memcache) in a second RPC, which then accesses the "one table that maps your keys to a serialized blob of your entity". Then the serialized blobs are returned and then objectify converts these blobs to java-objects and returns it to the user.
In the non-hybrid version, there would be no return of keys in the middle, there would be no second RPC, there would be no use of a session cache and also no use of memcache (or is memcache used?). GAE will return the blobs to objectify directly.

Did I get you right?

Jeff Schnitzer

unread,
Apr 4, 2017, 8:53:52 AM4/4/17
to objectify-appengine
The session cache is always used no matter what. If an entity already exists in the session and you load it via a (non-hybrid) query, the data loaded is ignored and the session value is used. This works just like a JPA/Hibernate session.

The other caveat is that while you do have the number of RPCs correct with respect to the boundary between the application and Google’s service tier, there are plenty of additional RPCs within the service tier. You might query for 200 entities; those entities might all exist on separate tablet servers within the bigtable cluster. It’s very difficult to predict the actual amount of work that any query might do.

Jeff

--

Stefan Keil

unread,
Apr 6, 2017, 5:57:09 AM4/6/17
to objectify-appengine, je...@infohazard.org
Hey Jeff,

I started a new mail concerning the details about Datastore, Memcache and Objectify (https://groups.google.com/forum/#!topic/objectify-appengine/Z-3tB0eeHdo).

I'd like to come back to my original question.

I am doing a query (with memcache, with parent, with type, with 1 filter and 1 order) and the returned result is an empty list. The query looks like this:

ofy().consistency(Consistency.STRONG).load()
.type(Entity.class)
.ancestor(parentKey)
.filter("myIndexedString", theIndexedValue)
.order("-myIndexedDate")
.offset(offset)
.limit(limit)
.list();

You were thinking about negative cache entries.

Now I tried the following: Directly after the empty list was returned I did a keys-only-query with a new ofy-object (to avoid any session cache issues):

factory().begin().consistency(Consistency.STRONG).load()
.type(Entity.class)
.ancestor(parentKey)
.filter("myIndexedString", theIndexedValue)
.order("-myIndexedDate")
.offset(offset)
.limit(limit)
.keys()
.list();

The returned list was empty, too.

When I got you right, in the keys-only-query there is no memcache involved. So it can't be a problem with negative cache entries in this case, right?

Maybe there is a problem with the index of the Datastore?

Best
Stefan

Jeff Schnitzer

unread,
Apr 6, 2017, 1:08:11 PM4/6/17
to objectify-appengine
This is probably a data issue. To satisfy this query you need:

 * A multi-property index on {ancestor, myIndexedString, myIndexedDate DESC}
 * @Index on myIndexedString
 * @Index on myIndexedDate

I’m assuming you have the multiproperty index configured because you usually get errors if you don’t. The most likely scenario is that you haven’t indexed all the single properties. In order for an entity to participate in a multi-index, all of its single indexes must be indexed. Note that you can’t just add @Index to the Java class, you have to do that and then resave the entity.

Also, get rid of consistency(STRONG). That’s the default behavior. There are no circumstances in which ofy().consistency(STRONG) has any effect (unless it’s to undo a previous setting of EVENTUAL).

Jeff

--

Stefan Keil

unread,
May 9, 2017, 6:57:04 AM5/9/17
to objectify-appengine, je...@infohazard.org
Hey Jeff,

of course I had the multi-index configured properly and also the @Index-annotation was there from the very beginning and was never removed in the meantime.

I will give you a complete summary of what I did, so it should be easier for you to see the whole picture. Unfortunately, I still don't have a solution to my problem, and neither do I know why it is happening.

(I will keep all the consisteny-commands as they are, just to be sure that the code I present here is exactly what was running in my app, but I am pretty aware now about your recommendations, so no need to emphasize this again ;) )

The database set-up is the following:

I have an entity like this:

@Entity
@Cache
public class Entity {
@Parent private Key<ParentClass> parentKey;
@Id private String id;
@Index private String myIndexedString;
@Index private Date myIndexedDate;
...
}

And I have these entries in my datastore-index.xml:
<datastore-index kind="Entity" ancestor="true" source="manual">
    <property name="myIndexedString" direction="asc"/>
    <property name="myIndexedDate" direction="desc"/>
</datastore-index>

<datastore-index kind="Entity" ancestor="true" source="manual">
    <property name="myIndexedString" direction="asc"/>
    <property name="myIndexedDate" direction="asc"/>
</datastore-index>

Then I do the following ancestor-multi-index-entities-query:
ofy().consistency(Consistency.STRONG).load()
.type(Entity.class)
.ancestor(specificParentKey)
.filter("myIndexedString", specificIndexedValue)
.order("-myIndexedDate")
.offset(0)
.limit(100)
.list();

-> In very rare cases the returned List is empty. But I know that this is wrong. 

Then immideately afterwards, I tried two different things. Both were pure keys-queries (because I wanted to exclude that it is a memcache problem, like negative entries). In both cases I used a new ofy-object, to make sure that there is no problem with the session-cache.

In the first try I was doing an ancestor-multi-index-keys-query:
factory().begin().consistency(Consistency.STRONG).load()
.type(Entity.class)
.ancestor(parentKey)
.filter("myIndexedString", specificIndexedValue)
.order("-myIndexedDate")
.offset(0)
.limit(100)
.keys()
.list();

The result was an empty list.

In the second try I was doing a non-ancestor-orderless-single-index-keys-query:
factory().begin().consistency(Consistency.STRONG).load()
.type(Entity.class)
.filter("myIndexedString", specificIndexedValue)
.offset(0)
.limit(100)
.keys()
.list();

The result was again an empty list.

So it looks like all the Entities of the "specificIndexedValue" are not (yet) indexed.

BUT: When I check the datastore in the browser directly and make a single-index query for "specificIndexedValue" I get all the results!
--------------------------------------

So what is the problem? To me it looks like "the datastore index is not up to date/not available".

If I got you right, then the second try actually uses eventual consistency, so there it might very well be the case, that you get "an old view on the index", but still then it is strange, that the returned list of keys is empty. The indexed Entities are usually not all saved or deleted at the same time. They come and go. So if I would expect 100 keys in the index, then I'd rather expect that with eventual consistency I am missing the changes from the last secs/mins, so there should be some returned keys, but not none.

And the first query actually is a strong consistency query, right?. But of course it uses "another index" as the second try, so there is a problem with two indexes. (At least they are consistent in their error.) I am not doing this query in a transaction. Does this matter in any way?

Interestingly, this phenomena only happens very sporadic on specific "specificIndexedValue", and is not happening on several "myIndexedString"s at the same time.

One final thought: The very first entities-queries with which I start, this is a hybrid query, right? So internally objectify should actually start exactely with what I am doing in my "first keys-only-query", followed by loading the entities by the obtained keys, right? So ofy gets this empty list of keys within the hybrid query, is this result stored somewhere? Does it effect my two tries with the keys-only-queries afterwards?

Any ideas why I am having this issue?

Thanks
Stefan

Jeff Schnitzer

unread,
May 9, 2017, 11:46:52 AM5/9/17
to objectify-appengine
Are there any other factors involved that you haven’t mentioned? Is there a transaction running? (you know that queries in a transaction always reflect the state of the universe at the start of the transaction, even if you have saved something earlier in the txn?) Are you seeing this behavior in local dev or in production GAE? When you say the results are missing, how long is the duration since they were saved? Do you see inconsistent data that has been there for a long time, or is there some sort of producer-consumer thing going on here with rapid entity churn? Are you running this from a normal request (with the ObjectifyFilter installed) or are you doing something “odd” like using the remote api?

There is no query cache in Objectify. With a keys-only query, what you see is exactly what the low-level API returns.

Yes, without the ancestor(), the query results are always eventual. You can’t change that.

The chance that you are tickling some kind of bug in either Objectify or the datastore is vanishingly small; this code is well tested, very widely used, and hasn’t changed much in years. And with a keys-only query, there isn’t even very much code involved. As described, a simple use case would have to be broken and we would have heard a lot about it already. This isn’t an edge case.

So there must be something essential missing from the description of the issue that is getting stripped out when you distill the problem down to a generic example. What else is going on in your system? How does the data in that list get generated? Is there anything deleting it at the same time, and if so, how do you know that your queries should include the data? Is it possible the bug/issue is in the code that saves the data, not in the query? We need to zoom out a little.

Jeff

--

Nicholas Okunew

unread,
May 9, 2017, 6:51:45 PM5/9/17
to objectify...@googlegroups.com
Can you share the code that saves these entities - its possible that they're just being saved async or something else odd is happening


To unsubscribe from this group and stop receiving emails from it, send an email to objectify-appen...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "objectify-appengine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to objectify-appen...@googlegroups.com.

Stefan Keil

unread,
May 10, 2017, 12:37:48 PM5/10/17
to objectify-appengine
Hey Jeff and Nick,

I am saving the entities like this:

ofy().save().entity(Entity).now();
ofy().save().entities(Collection<Entity>).now();

The Entities are only saved in transactions. Never outside. But they are loaded a lot outside of transactions. E.g., in the scenario I depicted above there is no transaction involved.

And to be precise, the entities are never deleted, but they are stored again with the "myIndexedString" being null. I actually also index the entites in case of "myIndexedString" being null (for no special reason, but maybe it is important to tell). It is a likely situation that several of my Entities of same parentKey get loaded in a single transaction, all get their "myIndexedString" set to null and then they are saved within the single transaction. In rare cases several of my Entites with "myIndexedString" being null are loaded in one transaction and then being saved with values for "myIndexedString" being different to null (very likely with the same value for "myIndexedString"). In most cases a new Entity is saved on its own in a single transaction. In peak situations, it can be that for a fixed parentKey there are like 10,000 new Entites within 1 hour, each in a sigle transaction, so I am actually reaching sometimes the maximum rate for transactions of a fixed parentKey. Usually after an Entity is created, there are not many changes to it anymore. It is a user-input that changes the value of "myIndexedString" (to null, or from null to another value), it is not happening in an automatic way.

To answer a few questions:

This is not a problem I can "create". It happens very rarly. I never had it locally, only in production. 

I can't say at the moment how long is the duration between the error and the Entities being saved. I have to add more logs first. But I assume, that when the error happens there was a saving of some Entities of that parentKey and "myIndexedString==specificIndexedValue" a few seconds before. Assume a user sees the first 10 Entities with "myIndexedString==specificIndexedValue", changes the value of "myIndexedString" to null, and then paginates to the next 10 Entities (with "myIndexedString==specificIndexedValue"). Now the error occurs and the returned list is empty and I cant show the user the next 10 Entities.

Maybe there is a thing with the ObjectifyFilter. When I use threads I do something like this:

new Callable<Boolean>() {

@Override
public Boolean call() throws Exception {
ObjectifyService.run(new VoidWork() {
public void vrun() {
<THE CODE OF MY THREAD>
}
});
return true;
}
});

And whenever a user wants to set the value of "myIndexedString" to null for several Entities of different parentKeys, then I do this parallel in different threads, one for each parentKey.

I don't understand what you mean by "Do you see inconsistent data that has been there for a long time, or is there some sort of producer-consumer thing going on here with rapid entity churn?".

The reason why I know that there must be some Entities for my "myIndexedString==specificIndexedValue" (and given parentKey) is the following:
I have another object of another class (let's call it ParentClass) for each non-null value of the "myIndexedString". This object has the same parentKey as the Entities and is modified within each transaction in which I modify the Entities. Within this ParentClass I store (among other data) the number of Entities of the given "myIndexedString" (matching the parentKey, of course).

Thanks a lot
Stefan

Jeff Schnitzer

unread,
May 10, 2017, 1:39:46 PM5/10/17
to objectify-appengine
Thanks for elaborating — there’s nothing that obviously stands out to me as an issue. Using run() from threads is correct. I presume you’re managing the threads properly using the GAE APIs.

My question about churn is wondering if you’re seeing this behavior on entities that haven’t changed in a while or if they’re rapidly mutating. It sounds like they’re rapidly mutating. Can you describe in more detail how you establish that there is an inconsistency? It sounds like you are comparing a count in the parent entity with the expected number of elements. Is this read being done in a transaction? Is it possible that you are doing this without a transaction and the data is actually changing between reading the parent entity and reading the children?

Jeff

--
You received this message because you are subscribed to the Google Groups "objectify-appengine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to objectify-appengine+unsub...@googlegroups.com.

Stefan Keil

unread,
May 10, 2017, 5:50:32 PM5/10/17
to objectify-appengine, je...@infohazard.org
Hey Jeff,

thanks for yor reply.

The scenario is the following: The ParentEntity is loaded first, without a transaction. Then the ancestor-query for the Entities is performed, again without a transaction. In most cases the cached number of entities in the ParentEntity match with the number of returned Entities. In a few cases there is a minor difference. When I load the ParentEntity again with a new ofy-object immediately after the query is performed and compare the numbers again, then it usually matches, but even then sometimes there is a small difference, but it is all correct. As you noted, the Entities of the fixed parentKey and "myIndexedString" are (sometimes) rapidly changing, so this is all correct. At least, as long as the returned list of Entities is not empty. (Additional note: The ParentEntity gets actually deleted in case the number of Entities for this parentKey and value for "myIndexedString" is 0.) Hence, as long as there is a ParentEntity, I expect Entities. And I am not talking about a small difference in the actual number of returned Entities and the cached number of Entities. This is perfectly explained by the fact that I load the data outside of a transaction. I am talking about an existing ParentEntity, stating that there should be more than 100 Entities and none is returned. I also tried setting the request to sleep for 20secs and then doing the query again and no Entity is returned. But when I check 3min afterwards in the datastore directly (via a browser), then I see that there are all these 100 Entities that should be there. It looks like a structural mistake. To me this doesnt look like a "the-data-is-not-within-a-transaction-and-thus-there-is-a-difference-between-the-actual-number-and-the-previously-loaded-cached-number". It is always this 0 returned Enities (or Keys).

Maybe we should dig deeper into this "I presume you’re managing the threads properly using the GAE APIs.", as I don't know what you mean.

Any further ideas?

Best
Stefan

Nicholas Okunew

unread,
May 10, 2017, 6:11:21 PM5/10/17
to objectify...@googlegroups.com
Regarding pagination:

Assume a user sees the first 10 Entities with "myIndexedString==specificIndexedValue", changes the value of "myIndexedString" to null, and then paginates to the next 10 Entities (with "myIndexedString==specificIndexedValue"). Now the error occurs and the returned list is empty and I c...

How are you performing this query? Is the index you query on the the one they're removing items from? 


--

Jeff Schnitzer

unread,
May 10, 2017, 6:22:58 PM5/10/17
to objectify-appengine
If this data is rapidly mutating, what prevents this scenario?

1) Process A loads parent entity, shows 5 children
2) Process B deletes children and parent
3) Process A queries for children, sees none

Threads in GAE Standard must be created using a special API so that the thread-local ApiProxy gets set up. You wouldn’t be able to access services if you weren’t doing this correctly, so I’m guessing that is probably not a problem.

Jeff


--

Stefan Keil

unread,
May 11, 2017, 3:04:57 AM5/11/17
to objectify-appengine, je...@infohazard.org
(As far as I can tell, the threads are precisely doing what they should do. So I assume they are set up correctly, too.)

I will do further investigation. If I am still convinced afterwards that there is an error, I will come back to you.

Thanks a lot so far for your help. I really appreciated it.

Stefan

Stefan Keil

unread,
May 11, 2017, 4:48:07 AM5/11/17
to objectify-appengine, je...@infohazard.org
I just realized that the transactions where I save all my Entities are created with the consistency flag set to EVENTUAL. Can this be the issue? Or is this flag ignored when you start a transaction, as internally everything is done with STRONG consistency?

ofy().cache(true).consistency(Consistency.EVENTUAL).transactNew(limitTries, new Work<Foo>() {

@Override
public Foo run() {
...
}
});

Jeff Schnitzer

unread,
May 11, 2017, 12:00:02 PM5/11/17
to objectify-appengine
consitency() should not have any effect in a transaction; they are always strongly consistent. Really it’s best to purge your code of consistency() calls. I should put a big warning in the javadocs that this is an advanced feature that doesn’t do what you probably think it does :-)  I only included it because GAE allows it, but maybe this was a mistake.

Perhaps I’m misreading you, but from what you have described it sounds like you have a straightforward race condition in your code. 

Jeff

--

Stefan Keil

unread,
May 12, 2017, 8:50:35 AM5/12/17
to objectify-appengine, je...@infohazard.org
I have the following real situation which was happening this morning for a fixed parentKey:

It starts at time 0. At this moment there are 26 Entities in my Database of that fixed parentKey. The "indexedString" is null for 13 of them, and not null (with the same value) for the remaining 13. The last change to any of these Entities at that moment was long ago, presumable 4 hours, so we are in a pretty stable situation at that point.

Then a user wants to see the data with the indexedString being this 1 special value. For this I load the ParentEntity for this parentKey and this fixed indexedString. It says that there are 13 Entities and it contains also 10 cached keys of these entities. So I don't query for the Entities, but do a keys-load for these 10 cached Entity-keys, displaying the loaded Entities to the user. Then the user modifies 8 of them individually within 40 secs, setting the indexedString to null for these 8 Entities. This is happening in 8 single Transactions. So afterwards, there are 5 remaining Entities with the indexedString not being null. The View for the user is only updated in that way, that these 8 Entities he modified are removed. No further loading of Entities was happening. So the user ends up with seeing 2 more Entities of the 10 he started with.

Then at time 40secs, the user clicks the button "show me all Entities" (which leads to a paginated view of 100 elements per page), hence one would expect a single page with 5 elements.

And now, there is the error! I am querying for the first 100 Entities matching the parentKey and the specific "indexedString", and an empty list is returned.

--------

In more detail:

Beside this user input within the last 40 secs, there was no change in any Entity with that parentKey at all (independent of the value of the indexedString). (To be even more precise, the user was actually changing all these 8 Entities he was modifying twice. All these 16 changes happened in individual transactions. In all cases, the Entity was first modified at a value, which was not the indexedString, and then in a second user input the indexedString was set to null. 15 of the 16 transaction had no Concurrency problems, 1 transaction was actually performed 3 times until it committed.) All these 16 transactions took 40 secs together to successfully complete.

The time between the last of the 16 transactions being completed and the user clicking on the "show all Entities button" was short, like half a second to a second.

The procedure of the "show all button" is the following.

* The ParentEntitiy is loaded (transaction-less, with memcache, strong consistency). Is says correctly, that there are 5 Entities to be displayed.

* Then (after some time) the query for the Entities is performed like this (transactionless-ancestor-multi-indexed-with-order):
ofy().consistency(Consistency.STRONG).load()
.type(Entity.class)
.ancestor(specificParentKey)
.filter("myIndexedString", specificIndexedValue)
.order("-myIndexedDate")
.offset(0)
.limit(100)
.list();

-> "ERROR": The returned list of Entities is empty.

* Then, immediately afterwards, I reload the ParentEntity (with a new ofy to have fresh session-cache, again transaction-less, with memcache, strong consistency), to check again whether it still exists, and whether the cached numberOfEntities match better with the number of loaded Entities. The returned ParentEntitiy exists and is the same as before, stating that there should be 5 Entities.

* Then, immediately afterwards, I am doing a keys-only query with a new ofy (non-ancestor-orderless-single-index-keys-query):
factory().begin().consistency(Consistency.STRONG).load()
.type(Entity.class)
.filter("myIndexedString", specificIndexedValue)
.offset(0)
.limit(100)
.keys()
.list();

-> again "ERROR", as the returned list of Keys is empty.

* Then, immediately afterwards, I was doing something new that I added yesterday in the code. I start a transaction to load the ParentEntity and query the keys within that transaction, to see whats happening. The code looks as follows (no need to comment the Consistency.EVENTUAL, I got your point, simply putting it here to not change the whole picture):

List<Key<Entity>> keysFromTransaction = ofy().cache(true).consistency(Consistency.EVENTUAL).transact(new Work<List<Key<Entity>>>() {
@Override
public List<Key<Entity>> run() {

List<Key<Entity>> keys = ofy().consistency(Consistency.STRONG).load()
.type(Entity.class)
.ancestor(specificParentKey)
.filter("myIndexedString", specificIndexedValue)
.order("-myIndexedDate")
.offset(0)
.limit(100)
.keys()
.list();

ParentEntity parentEntity = ofy().consistency(Consistency.STRONG).load().key(parentEntityKey).now();

int numberOfKeysInTransaction = keys.size();

if (parentEntity == null) {
if (numberOfKeysInTransaction == 0) {
return null; //Everything is fine. There is no parentEntity and no keys.
}

String errorMessage = "Transaction: ERROR. There is no parentEntity, but there are " + numberOfKeysInTransaction + " keys: " + keys;
throw new RuntimeException(errorMessage);
}

int numberOfEntitiesInParentEntityInTransaction = parentEntity.getNumberOfEntities();
if (numberOfKeysInTransaction == numberOfEntitiesInParentEntityInTransaction) {
return ticketKeys; //Everything is fine. There is a parentEntity and there are keys, and the numbers match.
}

String errorMessage = "Transaction: ERROR. There is a parentEntity but the numbers do NOT match! numberOfKeysInTransaction: " + numberOfKeysInTransaction
+ ", numberOfEntitiesInParentEntityInTransaction: " + numberOfEntitiesInParentEntityInTransaction 
+ ", parentEntity: " + parentEntity + ", keys: " + keys;
throw new RuntimeException(errorMessage); // <- HERE!
}
});

Guess what was happening - the second RuntimeException was thrown, meaning that within the transaction there was a parentEntity and the number of queried keys was not matching the cached number in the parentEntity. Actually, the loaded parentEntity in the transaction was exactly the same as the parentEntity which was loaded twice before that transaction, stating that there should be 5 Entities. But the queried list of Entity-keys within the transaction was empty... How is that possible???

The whole procedure of the user clicking the "show all button" took 700ms. There was no other activity going on for that special parentKey during that time. Thus, this special scenario I am presenting here is not a race condition (or data race) at all!

------

Interesting things to add:

When the user was doing these actions that set the indexedString to null for one of the entities, then within that transaction I perform precisely the same query for entities that was failing when the user clicked the "show all button". Because I want to update the 10 cached Keys, thus in the transaction I query for the 11th Entity and add its Key to the ParentEntity (and removing the Key of the Entity the user was modifying). This worked perfectly fine. I checked the 5 cached Keys in the ParentEntity, which was logged in the RuntimeException. These 5 Keys are precisely what was expected. 2 of these are the remaining 2 entities that were not modified by the user, and the other 3 are those Keys which were not among the cached Keys in the beginning. So actually, I only had to do this query for a new Entity in 3 of the 8 transactions. In the remaining 5 cases I knew due to the cached total number of Entities that I already have cached all Entities.)

So 3 times in a row querying for Entities within a transaction worked fine, and then 1 sec later doing the same query outside of a transaction was returning an empty list instead of the 5 expected elements, and also the keys-only-query outside of a transaction returned no result and further within a new transaction a query for the keys also returned no result. Super strange!

One final strange thing: This problem never occurs within the "normal" transaction (when a user sets the indexedString to null), when I have to load the next Entity to cache its Key (among other values). It only happens when the user clicks the "show all button". There I am calling the exact same method, the only difference is, that it is outside of a transaction. Of course you could argue at that point with race conditions / data races, but it is a strong consistent ancestor query, at a moment when there were 13 indexed elements for four hours and then within the last 40 seconds 8 of these 13 elements got removed from the index. So if the query would return more than the expected 5 Entities I would be perfectly fine with "race condition etc", but not with a returned empty list.

I am still shocked that within the transaction I got a contradiction ("// <- HERE!"). There must be a mistake in what I am doing in this "show all button". I never have a contradiction in the "normal" transactions modifying the Entities and querying for the next cached Entities in the ParentEntity. I do many double-checks during the "normal" transaction and would never allow it to commit, if the cached number of Entities in the ParentEntity says that there are more Entities to be queried for (to be cached), and the actual query would return no Entity. This is never happening. But I do have this problem in the "show all button" regularly. And repeatable. If the user would click the button again a few seconds later, I am very sure that the same error will happen again.

Stefan Keil

unread,
May 12, 2017, 1:55:58 PM5/12/17
to objectify-appengine, je...@infohazard.org
And just to make this clear, the error is not happening all the time. The "show all Entities" buttons works whenever I click on it. When I change a few of the Entities and then click on the button, it works. I was never able to create the error on my own. It is always coming from users using the software. So there is no obvious bug in my code.

The only bug I can imagine in my code, would be that somehow the cached parentKey and specificIndexedString was altered, so I am actually performing a wrong query. But I can't see how this would happen.

Stefan Keil

unread,
May 13, 2017, 5:50:07 AM5/13/17
to objectify-appengine, je...@infohazard.org
Ok, I solved it. I was actually doing the wrong query. So simple. So embarrassing.

-----

This is what happened: There also exists "anotherIndexedString" in the Entity. The algorithm (which was used when the user clicks the "show all button"), that decides for which of the two indexed Strings I was querying for, had a very subtle bug. The problem was, that the algorithm was relying on a value obtained from the User-Session. But the framework we use to control the User-Session has a "bug" (or let's call it a "counter-intuitive behaviour"), that might lead to a "wrong" state of the User-Session. As my backend process was simple accepting (and relying on) the value from the User-Session, the error occurred whenever the User-Session was "wrong", which only happened in rare cases. Instead, I should simply compute this value on the backend side again, instead of relying on "frontend-data". Well, I had my lesson.

-----

Thank you so much for all your answers and explanations, Jeff. It was very helpful and I learnt a lot.

Best
Stefan

Jeff Schnitzer

unread,
May 13, 2017, 10:18:42 AM5/13/17
to objectify-appengine
I’m just glad to hear you have it working! :-)

Cheers,
Jeff

--
Reply all
Reply to author
Forward
0 new messages