Architecting entity groups, is this the correct approach?

Rob Curtis

May 12, 2017, 1:08:41 AM
to Google App Engine

Hi,

I'm trying to understand which approach is better with regard to write throughput in transactions.

We currently have multiple models that all use the same parent key. 
As an example, we have the following models:

  • Foo
  • FooLog (which uses a parent key from Foo)
  • FooAudit (which uses a parent key from Foo)

My understanding is that writes to FooLog, FooAudit and Foo are all on the same entity group and thus there'll be limited throughput and possible failed transactions.
We still want to be able to query in transactions, so a parent key is required.

As a way around this, could we create a parent key using Foo's Id?
E.g. 
FooLog would have Key("FooLog", foo.key.id())
and 
FooAudit would have Key("FooAudit", foo.key.id()).

By doing this, Foo, FooLog and FooAudit would all be in different entity groups, but I can still get consistent results when querying in a transaction.

Is this the correct approach?

Jordan (Cloud Platform Support)

May 15, 2017, 4:08:23 PM
to Google App Engine
An Entity Group consists of a root ancestor Entity and all of its descendants. Writes are limited to roughly one transaction per second per Entity Group, because transactions on a given entity group are executed serially, one after another. If too many concurrent modifications are attempted on the same entity group, Datastore will return an error.

Your code should handle failed transactions by retrying them with an exponential backoff strategy. You can also write to Memcache before performing a transaction, as this makes the written results available immediately. Your reads would then first attempt a read from Memcache, and fall back to reading from the Datastore if the results are not cached.
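As a rough illustration of the retry advice above, here is a minimal backoff sketch in plain Python. `TransactionFailedError` and `run_with_backoff` are hypothetical names made up for this example, not part of the App Engine SDK; in real code you would catch whatever contention error your datastore client raises.

```python
import random
import time


class TransactionFailedError(Exception):
    """Hypothetical stand-in for the contention error raised by the client."""


def run_with_backoff(txn, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call txn(), retrying with exponential backoff plus jitter on contention."""
    for attempt in range(max_attempts):
        try:
            return txn()
        except TransactionFailedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # The base delay doubles each attempt: base, 2*base, 4*base, ...
            delay = base_delay * (2 ** attempt)
            # Random jitter keeps concurrent clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay))
```

The `sleep` parameter is only there so the delay can be replaced in tests; the default is `time.sleep`.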

Alternatively, as you suggested, you can store the parent 'Foo' entity ID in a custom property of 'FooLog' and 'FooAudit', so that 'FooLog' and 'FooAudit' entities each become their own root entity (as they will no longer have a parent). This in turn removes the shared entity group, and with it the write throughput limit. Note that this also removes the strong-consistency benefits of entity groups.

  


Rob Curtis

May 15, 2017, 10:08:42 PM
to Google App Engine
Hi,

Thanks Jordan.
I explained the use of the constructed parent key incorrectly.

So in the example of Foo and FooLog, I had meant to say: 
Construct a fake parent key for FooLog:    parent = Key("FooLog",foo.key.id()) 

So a FooLog entity would be created with
 parent = Key("FooLog", foo.key.id())
e.g. foolog = FooLog(parent=parent, some_prop="property")

Is it OK to use a parent key that doesn't exist? 

Thanks
Rob

Attila-Mihaly Balazs

May 15, 2017, 11:43:50 PM
to Google App Engine
Yes, using parent keys that don't exist is absolutely OK.
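
To make the effect on entity groups concrete, here is a toy model in plain Python. It is not the NDB API: `make_key` and `entity_group` are hypothetical helpers, and the only assumption is that a key is a path of (kind, id) pairs whose first element identifies the entity group.

```python
from collections import namedtuple

# Toy stand-in for a datastore key: a path of (kind, id) pairs, root first.
Key = namedtuple("Key", ["path"])


def make_key(kind, id_, parent=None):
    """Build a key whose path extends the parent's path, if a parent is given."""
    prefix = parent.path if parent is not None else ()
    return Key(prefix + ((kind, id_),))


def entity_group(key):
    """The entity group is identified by the first (root) element of the path."""
    return key.path[0]


foo_key = make_key("Foo", 42)

# Original layout: the log entry is a child of Foo, so it shares Foo's group.
child_log = make_key("FooLog", 1, parent=foo_key)

# Proposed layout: a synthetic parent of kind "FooLog" that reuses Foo's ID.
# No entity has to be stored under log_root for it to act as the group root.
log_root = make_key("FooLog", 42)
grouped_log = make_key("FooLog", 1, parent=log_root)

print(entity_group(child_log))    # ('Foo', 42)
print(entity_group(grouped_log))  # ('FooLog', 42)
```

All FooLog entries for the same Foo still share one root, so they can be queried by ancestor, but they no longer contend with writes to Foo itself.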

Attila

Rob Curtis

May 15, 2017, 11:53:49 PM
to Google App Engine
Great, thanks!

Rob Curtis

May 17, 2017, 8:37:47 AM
to Google App Engine
Since recreating the entities with a new parent (as outlined above), I have noticed occasional ID allocation issues (BadRequestError: the id allocated for a new entity was already in use, please try again).

I find this strange, as I would expect the ID and the parent key together to form the key, and so wouldn't expect a conflict.

Why would changing the parent key cause ID allocation issues when new entities are created? Short of deleting the old entities, what is another approach for this situation (where ID allocation is failing)?

Thanks
Rob

Jordan (Cloud Platform Support)

May 17, 2017, 2:31:57 PM
to Google App Engine
Google Groups is meant for general product discussions and not for technical support. 

I recommend posting your code, along with the stack trace you are getting, to Stack Overflow using one of the supported Google Cloud tags. Our technical community support team is active there.

The two common reasons for the error you are seeing are:
1. Manually allocating IDs and accidentally reusing the ID of an existing entity. The solution would be to issue IDs using a method that gives you values spread uniformly across the key space (the Python uuid module, for example).
2. Hot tablets. This can happen with the Datastore automatic ID generator if you have high write throughput (100s of writes/sec on the same entity). The solution would be to batch write ops to lower QPS.
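
For the first case, a minimal sketch of generating IDs with the standard-library uuid module could look like the following. `new_string_id` is a hypothetical helper name; how the value is attached to a new key depends on your datastore client.

```python
import uuid


def new_string_id():
    """Return a random 32-character hex string to use as an entity ID."""
    # uuid4 values are random, so they are spread across the key space
    # rather than clustered the way sequential counters are.
    return uuid.uuid4().hex
```

The result is a string ID rather than a numeric one, so it cannot collide with numeric IDs handed out by the automatic ID allocator.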
 
If you are migrating your entities to a new Entity Group (by changing their parent) but are attempting to keep the same IDs, you must first allocate those IDs, to prevent Cloud Datastore from assigning one of your manual numeric IDs to another entity.