did the writes become more async in 1.5.1 (and can i force them to be synchronous?)


tfannon

Jul 1, 2011, 8:03:03 PM
to objectify...@googlegroups.com
I have some sample data generation that I run in my local environment. It saves objects very quickly, and the tests depend on a previous save completing. It seems that with 1.5.1, saves in a tight loop are problematic: the data is not quite there yet when the next save/processing occurs.

I am not sure if this is new to 1.5.1 or if it's unique to my development environment, but I am wondering what others are doing to safeguard against this condition. To explain it in a little more detail:

user does some stuff and saves an object.
user does some more stuff and saves another object.

This second save has to pull the first object out of the db for a comparison. If a save has gone through but the object does not show up when the next one is pulled, that's a problem. I guess this is the nature of async, so if necessary, can I force a save to be synchronous with Objectify?

tfannon

Jul 4, 2011, 1:35:01 AM
to objectify...@googlegroups.com
Is anyone else experiencing this kind of strangeness in their development environment? Basically, writes look like they are not showing up right away.

Sometimes the delay is more noticeable than others. I just took Indigo/SDK 1.5.1.

Thoughts?



tfannon

Jul 4, 2011, 6:24:00 PM
to objectify...@googlegroups.com
Hoping that Jeff will come on here and offer some insight.

What I have discovered is that even though the datastore writes an entity, and that entity can be retrieved directly by its key immediately, it does not become available to queries for an indeterminate amount of time. I was able to verify this in my test environment by writing some blocking code which basically queries until the object shows up.

Example:

void saveResistanceAndBlock(Resistance r) {
    // The put returns the key immediately, and the entity is readable
    // by key right away -- but the index the query reads lags behind.
    Key<Resistance> aKey = dao.ofy().put(r);
    // Re-run the keys-only query until the new key shows up in the index.
    while (true) {
        for (Key<Resistance> key : dao.ofy().query(Resistance.class).fetchKeys()) {
            if (key.equals(aKey))
                return;
        }
        System.out.println("Blocked: " + aKey.getName());
    }
}

I can see it outputting the blocked message a few times for some items.

This is odd, because prior to taking Eclipse Indigo and App Engine 1.5.1 I did not see this in my development environment. I would have noticed, as it causes my app to not work.

So this is an interesting conundrum for me. I don't think this is a particularly great thing to do all over my production code. Anyone else come up with an idea for this? I guess this is another one of those gotchas about the scalable datastore: the objects will show up in queries when they damn well please, thank you very much.
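If a wait-until-visible approach like this does end up anywhere near production code, a bounded retry with exponential backoff at least avoids spinning forever and hammering the datastore. A sketch with a hypothetical helper in plain Java (the `BooleanSupplier` would wrap the keys-only query above):

```java
import java.util.function.BooleanSupplier;

/** Retry a condition with exponential backoff instead of spinning forever. */
class Backoff {
    static boolean waitUntil(BooleanSupplier condition, int maxAttempts, long initialDelayMs) {
        long delay = initialDelayMs;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (condition.getAsBoolean()) return true;  // e.g. "key shows up in the query"
            try {
                Thread.sleep(delay);  // back off so we don't hammer the datastore
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            delay *= 2;
        }
        return false;  // give up rather than block a request thread forever
    }
}
```

A caller would then handle the `false` case explicitly (retry later, error out) instead of hanging.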


Broc Seib

Jul 4, 2011, 8:51:23 AM
to objectify...@googlegroups.com

Can you post an isolated test case? When I get home later this week I can try to duplicate and confirm dev environment behavior with 1.5.1.

Jeff Schnitzer

Jul 5, 2011, 4:51:33 AM
to objectify...@googlegroups.com
Sorry, I was camping in the Nevada desert riding dirt bikes and
blowing stuff up :-)

It sounds like you have discovered the new eventual-consistency
behavior of the High-Replication Datastore. This isn't a 1.5.1 thing;
it should affect any HRD app. Basically, indexes are written
asynchronously, so queries might show stale data. If you want to force
consistency you can do an ancestor() query to fetch it, or just do a
fetch by key.
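The distinction here can be made concrete with a toy, self-contained model in plain Java (invented names, not the App Engine API): the entity write is applied immediately, so a get by key is strongly consistent, but the index that queries read is updated asynchronously:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the HRD consistency model: entity writes apply
 *  immediately, but the index that queries read is updated later. */
class ToyDatastore {
    private final Map<String, String> entities = new HashMap<>(); // key -> entity
    private final List<String> index = new ArrayList<>();         // what queries scan
    private final List<String> pending = new ArrayList<>();       // unapplied index jobs

    void put(String key, String value) {
        entities.put(key, value);  // visible to get() right away
        pending.add(key);          // index write is deferred
    }

    String get(String key) { return entities.get(key); }          // strongly consistent

    List<String> queryKeys() { return new ArrayList<>(index); }   // may be stale

    void applyPendingIndexWrites() {  // stands in for the async catch-up
        index.addAll(pending);
        pending.clear();
    }
}
```

In this model a `get` right after a `put` always succeeds, while `queryKeys` misses the new entity until the pending index write applies, which is exactly the gap the blocking loop earlier in the thread was papering over.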

There's a fair amount of talk about this here:
http://code.google.com/appengine/docs/java/datastore/hr/overview.html

Jeff

tfannon

Jul 5, 2011, 7:42:14 AM
to objectify...@googlegroups.com
Thanks Jeff. I hope your 4th was fun. Blowing stuff up sounds fun. We got rained out here in Tampa, so no blowing up stuff this year.

In reading that section of the GAE docs and the section in your docs about ancestor queries, it looks like my entities could use a parent relationship in order to get queries with immediate results. The interesting thing is my app has been around long enough that it is still a master-slave configuration.

I am NOT seeing this problem in production yet, only in development. In production, it is unlikely that data gets put in the rapid succession it takes to cause this.
(I wrote some sample data generation code which blasted a bunch of data in locally to test things.)

Fetch by key will not work, because I perform some processing over a list of entities that meet the conditions of a filter, so I need to be able to see things that happened in the past.

What happens to the data in production if I add the @parent to my class and deploy?  Can I safely load/save each entity pair in production?

Thanks for all your help.  You really should be paid by Google :)



tfannon

Jul 5, 2011, 7:45:57 AM
to objectify...@googlegroups.com
The interesting thing is that there seem to be 3 distinct environments:

1) production
2) local using local_db.bin on disk
3) junit which uses its own place for temporarily storing stuff

I find it frustrating that I cannot combine 2 and 3. It would be nice to be able to load up data with junit and then inspect it with the app. I have been unable to figure out how to do that.

As far as a test case goes, it is pretty simple:
Create an entity.
Create an index on one of its properties to use in a filter condition.
Use some code in a tight loop to put about 10 of these into the local database.
Run the query with a filter.
Result: not all the entities come back right away. The longer you wait, the more likely they all come back. Or you can write the code as I did above, which basically does a put followed by querying until it comes back.

-tf

tfannon

Jul 5, 2011, 10:36:00 AM
to objectify...@googlegroups.com
Just found this on the Google blog:


"High Replication in the SDK: Since releasing the High Replication Datastore, we’ve wanted to provide tools that help developers understand and test the new consistency model while developing their applications. The 1.5.1 SDKs for Java and Python can now emulate the HRD consistency model. This means that now, by setting the appropriate SDK config options, queries across entity groups will occasionally return results that don’t reflect the most recent data written. This should allow you to develop your application to be more resilient to this consistency model."

Guess I need to figure out the settings to make it behave like the master/slave configuration my app is actually set up as.
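For what it's worth, the knob the blog post alludes to appears to be a JVM system property on the Java dev server; setting the unapplied-job percentage to 0 should restore the old always-consistent local behavior. The flag name below is as I understand it from the 1.5.1 release notes, so verify it against your SDK version:

```shell
# Emulate HRD eventual consistency locally: ~20% of index writes lag behind
dev_appserver.sh --jvm_flag=-Ddatastore.default_high_rep_job_policy_unapplied_job_pct=20 myapp/

# Or set it to 0 for the old master/slave-like, immediately consistent behavior
dev_appserver.sh --jvm_flag=-Ddatastore.default_high_rep_job_policy_unapplied_job_pct=0 myapp/
```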


Jeff Schnitzer

Jul 5, 2011, 12:38:51 PM
to objectify...@googlegroups.com
Ah yes, that would be the problem. Odd that they didn't make this an
option for local development. I guess Google really wants to push
people towards the HRD.

FWIW, I've found latency to be vastly better on the HRD. Since it's
soon to be priced identically to M/S, you might seriously want to
consider switching... although you'd have to figure out a solution for
the query consistency issue.

Jeff

tfannon

Jul 5, 2011, 12:47:40 PM
to objectify...@googlegroups.com
Well, for metadata (the stuff that doesn't change that often) I can just use my strategy of blocking until the query returns, as I discovered earlier.

However, I was thinking about it some more, and it seems like I really should be using the @Parent property on the entity to group my items.

My relationship is as follows:

Lifter->*LogEntry. These log entries belong to a lifter and never need to be accessed outside the context of the owning Lifter. I think the only reason I didn't use a parent was because the objectify docs said "don't use parent" :) What would the process be like to convert these to an owned relationship? I am envisioning something like this:

List<LogEntry> entries = getAllLogEntries();
for (LogEntry entry : entries) {
    entry.setLifter(lifter);   // becomes the @Parent key
}
Objectify ofy = ObjectifyService.beginTransaction();
ofy.put(entries);              // re-put under the new parented keys
ofy.getTxn().commit();

I just want to make sure I don't upload the code with the @parent attribute set and lose all my data somehow.

Thanks a lot for your time.
-tf

Jeff Schnitzer

Jul 5, 2011, 10:08:26 PM
to objectify...@googlegroups.com
Yeah, that looks about right... and remember you will need to delete
the old LogEntry objects (the ones with null parents) afterwards. And
you will have to stop creating new log entries in the old format
before you start the conversion.

Jeff
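The subtlety behind this advice is that adding @Parent changes each entity's key, so the "conversion" is really a copy-then-delete: the old record cannot be updated in place. A toy sketch over a plain map (invented string keys standing in for datastore keys, not the real API) shows the shape of the migration:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy re-parenting migration: entities under old parentless keys are
 *  copied under new parented keys, then the originals are deleted --
 *  which is why the null-parent entries must be cleaned up afterwards. */
class ParentMigration {
    // Keys are modeled as strings: "LogEntry(id)" vs "Lifter(p)/LogEntry(id)".
    static void reparent(Map<String, String> store, String lifterId) {
        Map<String, String> migrated = new HashMap<>();
        for (Map.Entry<String, String> e : store.entrySet()) {
            if (!e.getKey().contains("/")) {             // old, parentless key
                migrated.put("Lifter(" + lifterId + ")/" + e.getKey(), e.getValue());
            }
        }
        store.keySet().removeIf(k -> !k.contains("/"));  // delete the old entries
        store.putAll(migrated);                          // put the re-keyed copies
    }
}
```

The same two-phase shape (put copies, delete originals) applies to the real datastore version, with the added wrinkle that new writes in the old format must stop before the copy begins.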
