ReadPolicy.Consistency.EVENTUAL under HR?

Peter Murray

Jul 5, 2011, 1:51:19 PM
to google-a...@googlegroups.com

Greetings,

In a small test retrieving 100 independent entities (none sharing an entity group) by key (e.g. ds.get(keys)) in an HR datastore, we found that the ReadPolicy significantly affected performance: with Consistency.STRONG set, the entities were returned in about 500ms (plus or minus); with Consistency.EVENTUAL set, they were returned in about 50ms. Could someone please describe the practical differences between the two settings under an HR datastore? I had thought, from the I/O talk on the subject, that in an HR environment you were not guaranteed strong consistency under any circumstances unless the entities are in the same entity group. Of course, something is taking 10x longer, so the question is: what am I getting for the 10x?

Best,

pete

Stephen Johnson

Jul 7, 2011, 3:33:18 PM
to google-a...@googlegroups.com
Hi Pete,
Yes, I've also tested this and the speed improvement is VERY
noticeable. From the way the documentation explains the difference
between the HR datastore's STRONG and EVENTUAL settings, you would
assume that a get by key takes the same time under either setting,
because according to the documentation gets always return the most
recent data. You would instead expect queries to show the time
difference, because queries with the EVENTUAL setting can use the
replicas (unless they are ancestor queries) and could return stale
results. So something else is going on here that hasn't been
explained. Given the speed increase, it seems that with EVENTUAL
consistency a get by key is served by the closest datastore, which
could be a replica or the master, whereas with STRONG consistency
every datastore operation goes to the master, even a get by key that
a nearby replica could have handled. Or it could be something else
entirely. Hopefully an informed Googler can explain the subtle
differences a little better.
Stephen


Robert Kluin

Jul 8, 2011, 6:35:52 AM
to google-a...@googlegroups.com
Hi,
This is because doing a strongly consistent fetch by key uses a
transaction -- which means you'll always get the latest version. If
you fetch a bunch of keys, these transactions are currently done
serially which significantly slows the request. An eventually
consistent fetch goes to the fastest datastore node and returns
whatever version it has.
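
As a rough illustration of the explanation above, the arithmetic can be sketched in plain Java. This is only a back-of-the-envelope model, not App Engine code: the class `ReadLatencyModel`, its method names, and the per-operation costs (5 ms per transactional get, 50 ms for one batched read) are all assumptions, chosen so the totals land near the roughly 500ms and 50ms reported earlier in this thread.

```java
// Illustrative latency model only; it does not call the App Engine API.
// Every cost used here is a hypothetical figure, not a measured one.
final class ReadLatencyModel {

    // STRONG: one transactional read per key, executed one after another,
    // so the total grows linearly with the number of keys.
    static long serialMillis(int keyCount, long perKeyMillis) {
        return (long) keyCount * perKeyMillis;
    }

    // EVENTUAL: a single batched read answered by one node with whatever
    // version it holds, so the total is one round trip.
    static long batchedMillis(long roundTripMillis) {
        return roundTripMillis;
    }

    public static void main(String[] args) {
        int keys = 100;
        long strong = serialMillis(keys, 5);  // assumed 5 ms per transactional get
        long eventual = batchedMillis(50);    // assumed 50 ms for one batched read
        System.out.println("STRONG   ~" + strong + " ms");
        System.out.println("EVENTUAL ~" + eventual + " ms");
        System.out.println("ratio    ~" + (strong / eventual) + "x");
    }
}
```

Under these assumptions the serial cost scales with the number of keys while the batched cost does not, which would be consistent with a gap that widens as more keys are fetched in one call.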


Robert

--
------
Robert Kluin
Ezox Systems, LLC

Stephen Johnson

Jul 8, 2011, 12:37:33 PM
to google-a...@googlegroups.com
Hi Robert,
This is what I kind of thought, but I didn't want to go that far in a statement without official clarification. If that is the case, then I think the documentation should clarify it, because the last sentence of the Usage Notes at http://code.google.com/appengine/docs/java/datastore/hr/overview.html (quoted below) makes it seem that a get, which I'm assuming means a get by key, will see the most recently written data even under EVENTUAL consistency. Also, many times when people have asked about this, the response has been that queries are eventually consistent but a get by key is strongly consistent (at least that has been my impression). I do believe you're right, but IMHO this should be better documented.
Stephen

With eventual consistency, more than 99.9% of your writes are available for queries within a few seconds. The goal is to find a caching solution for your application that provides the data for the current user for the period of time in which they're posting to your app. The caching solution might involve memcache, a cache in a cookie, some state you put in the URL, or something else entirely. The point is that, if the solution provides the data for the current user in context of their posts, it will likely be sufficient to make the eventual consistency of High Replication completely acceptable. Remember, if you do a get(), put(), or a transaction, you will always see the most recently written data.

Ikai Lan (Google)

Jul 8, 2011, 2:59:36 PM
to google-a...@googlegroups.com
Hey guys,

This is my fault. I wrote a really long groups post about this a while ago and never had the chance to write a version that goes into the official docs:


Ikai Lan 
Developer Programs Engineer, Google App Engine

Stephen Johnson

Jul 8, 2011, 4:09:10 PM
to google-a...@googlegroups.com
Thanks Ikai!

Peter Murray

Jul 8, 2011, 10:58:51 PM
to google-a...@googlegroups.com
Thanks, Ikai.  I read the post and think I understand the situation.  

Too bad STRONG consistency costs as much as 10x in performance. Is there any hope on the horizon for improving that?

Cheers,

pete


Stephen Johnson

Jul 8, 2011, 11:16:16 PM
to google-a...@googlegroups.com
Or you could view it the other way: EVENTUAL consistency is 10x cheaper :) You can also use both consistencies in your app. Do you really need STRONG consistency for every query and fetch? For example, if you're selling products, do the product descriptions change that often? I don't know your application, so maybe yours do, but I think a lot of apps could use EVENTUAL for a lot of things and be fine with it.



Robert Kluin

Jul 11, 2011, 1:10:20 AM
to google-a...@googlegroups.com
Hey Stephen,
Yeah that last sentence looks like an incorrect, or at least
seriously misleading, statement to me; there is a chance that an
eventually consistent get will not see the most recent data (it also
will not initiate a catch-up cycle if it finds stale data). Hopefully
Ikai will get a chance to fix that. ;)


Robert
