Advantages of Google Datastore Entity Groups


Paul Mazzuca

Oct 4, 2016, 8:23:24 PM
to Google App Engine
Is there any reason to use an Entity Group in Google Datastore other than for enabling transactions? 

For example, does having entities in the same entity group speed up queries? The situation that I run into often is whether to have X be a parent of Y or have X be a property of Y. Both cases allow Y to be queried given X, but when transactions aren't needed perhaps the entity group is not necessary. I guess another way of stating the question is whether ancestor queries provide a speedup over property-based queries without ancestors.

Evan Jones

Oct 6, 2016, 1:31:43 PM
to Google App Engine
As far as I am aware, the only documented reason to use an entity group is that it can give you strong consistency, which does not necessarily require a transaction. To take your example, let's imagine we are trying to decide if a user's "favourite colors" should be a list property on the User entity, a separate entity with a "user" reference (not in the same entity group), or a separate entity with the User as its parent. Suppose a user takes the following actions:

1. Add "blue" to my list of favourite colors.
2. Query for my favourite colors.


Case: Single entity. Result: You will always see the last color that was added, since the read and the write are of the same entity.

Case: Multiple entities, not in an entity group. Result: Unknown! You might see "blue", you might not. If there are many concurrent additions and deletions, you might see any combination of them. That is because queries across entities are weakly consistent.

Case: Multiple entities in an entity group, using an ancestor query. Result: You will see blue. This query gives you strong consistency, even without the transaction.
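The three cases can be sketched with a toy in-memory model. This is not the Datastore API: `ToyDatastore`, its methods, and the tuple-shaped keys are all hypothetical names made up for illustration, and the "lagging index" is only a stand-in for eventual consistency.

```python
class ToyDatastore:
    """Toy model: entities are written at once, the global index lags behind."""

    def __init__(self):
        self.entities = {}      # key (tuple of (kind, id) pairs) -> properties
        self.global_index = {}  # possibly stale copy used by non-ancestor queries

    def put(self, key, props):
        self.entities[key] = dict(props)

    def apply_pending_index_updates(self):
        # Stand-in for the asynchronous index update of a real system.
        self.global_index = {k: dict(v) for k, v in self.entities.items()}

    def get(self, key):
        # Lookup by key reads the entity itself, so it is always current.
        return self.entities.get(key)

    def query(self, kind, **filters):
        # Non-ancestor query: answered from the possibly stale global index.
        return [v for k, v in self.global_index.items()
                if k[-1][0] == kind
                and all(v.get(name) == want for name, want in filters.items())]

    def ancestor_query(self, kind, ancestor):
        # Ancestor query: limited to one entity group, reads current data.
        n = len(ancestor)
        return [v for k, v in self.entities.items()
                if k[:n] == ancestor and len(k) > n and k[-1][0] == kind]


db = ToyDatastore()
user = (("User", "paul"),)

# Case 1: list property on the User entity, read back by key.
db.put(user, {"colors": ["blue"]})
case1 = db.get(user)["colors"]

# Case 2: separate root entity that references the user, found by a property query.
db.put((("Color", 1),), {"user": "paul", "name": "blue"})
case2_before = [c["name"] for c in db.query("Color", user="paul")]
db.apply_pending_index_updates()
case2_after = [c["name"] for c in db.query("Color", user="paul")]

# Case 3: child entity in the user's entity group, found by an ancestor query.
db.put(user + (("Color", "blue"),), {"name": "blue"})
case3 = [c["name"] for c in db.ancestor_query("Color", user)]
```

In this toy, `case2_before` is empty until the index catches up, while `case1` and `case3` see "blue" right after the write.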


Note that this has nothing to do with latency or throughput. I have not seen any suggestions about either in the App Engine documentation (if they exist, let me know!)


For details, see:



Alex Martelli

Oct 6, 2016, 1:50:21 PM
to google-a...@googlegroups.com
On Tue, Oct 4, 2016 at 5:23 PM, Paul Mazzuca <paul.j....@gmail.com> wrote:
Is there any reason to use an Entity Group in Google Datastore other than for enabling transactions? 

Strong consistency: ancestor queries are strongly consistent. That's a key semantic difference compared with the eventual consistency of most queries (though, of course, you do pay for it in other ways).

And yes, you MAY get some performance benefits due to locality under very specific circumstances -- see https://cloud.google.com/datastore/docs/articles/balancing-strong-and-eventual-consistency-with-google-cloud-datastore/#h.3loc7ynqbw6i for MUCH more.

 

For example, does having entities in the same entity group speed up queries? The situation that I run into often is whether to have X be a parent of Y or have X be a property of Y. Both cases allow Y to be queried given X, but when transactions aren't needed perhaps the entity group is not necessary. I guess another way of stating the question is whether ancestor queries provide a speedup over property-based queries without ancestors.

Not in the general case, but, in specific ones (since, per the URL I quoted, "entities are sorted and stored by the lexicographical order of the keys"), it sure might help.
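To make the key-ordering point concrete, here is a small sketch, assuming keys can be modeled as tuples of (kind, name) pairs (a simplification; the real key encoding and storage layout are not modeled). Because tuples sort lexicographically, every descendant of an ancestor ends up in one contiguous run, so an ancestor lookup becomes a range scan.

```python
import bisect

# Hypothetical keys; Python compares tuples lexicographically.
keys = sorted([
    (("User", "alice"),),
    (("User", "alice"), ("Color", "green")),
    (("User", "bob"),),
    (("User", "alice"), ("Color", "blue")),
    (("User", "bob"), ("Color", "red")),
    (("Color", "teal"),),
])


def descendants(sorted_keys, ancestor):
    """Return the contiguous run of keys that lie strictly under `ancestor`."""
    # Everything after the ancestor's own key that still shares its prefix.
    start = bisect.bisect_right(sorted_keys, ancestor)
    end = start
    while end < len(sorted_keys) and sorted_keys[end][:len(ancestor)] == ancestor:
        end += 1
    return sorted_keys[start:end]


alice_group = descendants(keys, (("User", "alice"),))
```

Whether that adjacency turns into a measurable speedup for a given workload is a separate question that the sketch does not answer.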

Me, in this case like in many others, I'm partial to benchmarking -- make a toy app closely simulating your specific use case, implemented both ways, run a large number of equivalent queries in either way, get the precise delay numbers and ponder their histograms. I'll take real data over theoretical considerations any day of the week:-).
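A minimal timing harness along those lines might look like the following. The two `*_variant` functions are placeholders, not real Datastore calls; swap in the actual ancestor query and property query from the toy app. It assumes Python 3.8+ for `statistics.quantiles`.

```python
import statistics
import time


def measure(fn, runs=200):
    """Call fn() `runs` times and return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Median plus 95th and 99th percentiles, since the tail is often what hurts."""
    cuts = statistics.quantiles(latencies, n=100)  # 99 cut points
    return {"p50": statistics.median(latencies), "p95": cuts[94], "p99": cuts[98]}


# Placeholders standing in for the two query implementations being compared.
def ancestor_variant():
    sum(range(1000))


def property_variant():
    sum(range(2000))


results = {
    name: summarize(measure(fn))
    for name, fn in [("ancestor", ancestor_variant), ("property", property_variant)]
}
for name, stats in results.items():
    print(name, {k: round(v, 4) for k, v in stats.items()})
```

Keeping the raw latency lists around (rather than only the summary) makes it easy to plot the histograms as well.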

Alex
 

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-appengine+unsubscribe@googlegroups.com.
To post to this group, send email to google-appengine@googlegroups.com.
Visit this group at https://groups.google.com/group/google-appengine.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-appengine/846d1d10-af2c-4c9a-80eb-bf29708017da%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Paul Mazzuca

Oct 6, 2016, 2:43:18 PM
to google-a...@googlegroups.com
Agreed on all points.  Some additional thoughts that I have had...

The locality of Entity Group data provides "strong consistency" and potentially some speedup on ancestor queries (pending benchmarks). The "strong consistency" enables these Entity Groups to operate in "transactions", though only to the point of about 1 write/sec.

So for me, if I am trying to figure out if two kinds of entities should belong in the same group, it merely comes down to whether or not ALL these statements are true:

1) I need to perform an ancestor query such that the data retrieved is strongly consistent.  For example, give me all the account balances in the bank at a point in time.
2) I need to perform a transaction that affects two or more entities.  For example, transferring money from member A to member B. 
3) My data write requirements do not exceed 1 write/sec in a transaction

I say ALL because if the first point is not needed, XG (cross-group) transactions would still allow the entities to be updated transactionally even if they weren't put in the same group (assuming you stay under the limit of 25 entity groups per transaction).
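Written out as code, the rule of thumb above is just a conjunction of the three statements. The names below (`Requirements`, `same_entity_group`, `write_limit`) are hypothetical, and the 1 write/sec figure is the approximate limit mentioned in this thread, not a measured value.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    needs_strongly_consistent_query: bool  # statement 1
    needs_multi_entity_transaction: bool   # statement 2
    peak_writes_per_sec: float             # statement 3, for the prospective group


def same_entity_group(req, write_limit=1.0):
    """True only when ALL three statements hold."""
    return (req.needs_strongly_consistent_query
            and req.needs_multi_entity_transaction
            and req.peak_writes_per_sec <= write_limit)


# Needs a consistent snapshot and a transfer, with a low write rate.
bank = Requirements(True, True, 0.2)
# Needs only the transfer: an XG transaction would do, so no shared group.
transfer_only = Requirements(False, True, 0.2)
# Needs everything but writes too often for a single group.
busy = Requirements(True, True, 5.0)

decisions = [same_entity_group(r) for r in (bank, transfer_only, busy)]
```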




Nicholas (Google Cloud Support)

Oct 7, 2016, 11:34:57 AM
to Google App Engine
Thanks for providing more of a use case. You mention wanting to determine if two entity kinds belong in the same group. From your three requirements, it seems like there are several entity kinds: account balance, bank, member, money transfer. I am unclear on how entities of these kinds relate to one another and what dependencies they have.
  • How are entities of each of these kinds related to one another (hierarchy and such)?
  • How often do entities of each kind get changed?
  • Which CRUD operations with each entity kind require strong consistency?
Knowing the above may help with providing more specific architecture suggestions regarding entity groups, transactions and the use of ancestor queries.

