migrating from ms to hr: Re-parenting, root entities and performance

Kenneth

Aug 5, 2011, 10:51:35 AM
to google-a...@googlegroups.com
There's a lot of good information bubbling out about the pitfalls of moving from ms (Master/Slave) to hr (High Replication), especially the problems with keys having your appid in them.  Apologies if this question has come up before.

Like most others I suspect, 99% of my datastore objects don't have parents. Since these are all in the same root entity group I'm now limited to 1-10 writes per second to all of these objects if I don't reparent, according to http://code.google.com/appengine/docs/python/datastore/hr/overview.html.

If my write rate does climb above 10/s I assume that I'm going to block?

What is the strategy here?  I can see making up something random as the parent, but then I would need that random thing to do a get_by_id, since that's what I do in most of my app: I pass only the id to the user rather than the whole key (which you are not supposed to pass to the user because it is a security issue if you're using namespaces).  Am I screwed?

This is of course leaving the whole issue of consistency aside; I'm ok with that side of things, more or less.

Thanks.

Joshua Smith

Aug 5, 2011, 11:04:39 AM
to google-a...@googlegroups.com
Either you're mis-reading the docs or I am.

Can you quote the text that leads you to that conclusion?

Kenneth

Aug 5, 2011, 11:51:38 AM
to google-a...@googlegroups.com
Under usage notes at the bottom:
"The High Replication code sample above writes to a single entity group per guestbook. This allows queries on a single guestbook to be strongly consistent, but also limits changes to the guestbook to 1 write per second (the supported limit for entity groups). Therefore, writing to a single entity group per guestbook is not ideal when high usage is expected. If your app is likely to encounter heavy write usage, consider using another means. For example, you can put recent posts in memcache with an expiration, and then display a mix of recent posts from memcache and posts retrieved from the datastore."
I assumed that the root-entity group isn't special; they seem to be going out of their way to not use parent=None. Is that just a grab for strong consistency?
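
The memcache suggestion in the quoted usage notes could look roughly like the sketch below. It is only an illustration under stated assumptions: `ExpiringCache` is a hypothetical in-process stand-in for memcache, and `record_post`, `posts_to_display`, and the post dictionaries are invented names, not part of the App Engine API.

```python
import time


class ExpiringCache:
    """Tiny in-process stand-in for memcache (hypothetical, for illustration)."""

    def __init__(self):
        self._items = {}

    def set(self, key, value, ttl, now=None):
        now = time.time() if now is None else now
        self._items[key] = (value, now + ttl)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._items.get(key)
        if entry is None or entry[1] <= now:
            return None  # missing or expired
        return entry[0]


def record_post(cache, post, ttl=60, now=None):
    """Remember a freshly written post so readers can see it right away.

    Note: this read-then-write is not atomic; a real cache shared by
    several instances would need a compare-and-set style update.
    """
    recent = cache.get("recent_posts", now=now) or []
    cache.set("recent_posts", recent + [post], ttl, now=now)


def posts_to_display(cache, datastore_posts, now=None):
    """Merge cached recent posts with (possibly stale) query results."""
    merged = {p["id"]: p for p in datastore_posts}
    for p in cache.get("recent_posts", now=now) or []:
        merged.setdefault(p["id"], p)  # skip posts the query already returned
    return sorted(merged.values(), key=lambda p: p["created"], reverse=True)
```

The cached copy only has to live long enough for the eventually consistent query to catch up, which is why the entry carries an expiration.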

Joshua Smith

Aug 5, 2011, 12:10:16 PM
to google-a...@googlegroups.com
There is no "root-entity group".  When parent=None, every entity is its own group.  So you are mis-reading the docs, based on an erroneous assumption.

"An entity created without a parent is a root entity. A root entity without any children is an entity group by itself."
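
To make the quoted rule concrete, here is a toy model in plain Python, not the App Engine API. It assumes a key can be pictured as a path of (kind, id) pairs and that the first pair identifies the entity group; `make_key` and `entity_group` are hypothetical helper names.

```python
def make_key(kind, id_, parent=None):
    """Model a key as a path of (kind, id) pairs from the root down."""
    return (parent or ()) + ((kind, id_),)


def entity_group(key):
    """In this model, the entity group is named by the root of the path."""
    return key[0]


# Two entities created without a parent: each is its own root, so each
# is an entity group by itself.
a = make_key("Greeting", 1)
b = make_key("Greeting", 2)

# Two entities created under the same parent share that parent's group.
book = make_key("Guestbook", "main")
c = make_key("Greeting", 3, parent=book)
d = make_key("Greeting", 4, parent=book)
```

In this picture a parentless key has a path of length one, so the kind and id alone are enough to rebuild it, which is why id-based lookups keep working without any reparenting.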

Yes, setting the parent is just a grab for stronger consistency, although it's not really that strong, since queries that don't specify an ancestor will still return only "eventually consistent" results.

The documentation of this is very bad, as is the documentation of transactions.

The pattern I've seen both in HR and in Transactions is:

1) Give an example which is easy to understand, but is a complete anti-pattern. Something you should NEVER do.

2) Put a footnote below the example, saying it's an anti-pattern.

These sections of documentation should be "taken out back and shot."

These are complicated problems, and we'd really like to see correct examples.

The documentation of transactions should create child entities for idempotence, and use task queues to deal with exceptions.
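
One possible reading of the "child entities for idempotence" idea is sketched below. This is an assumption about what such an example might look like, not something taken from the documentation: `Store`, `apply_credit`, and the `AppliedCredit` kind are hypothetical, and the in-memory dictionary stands in for a datastore in which the function body would run as a single transaction.

```python
class Store:
    """In-memory stand-in for a datastore; keys are tuples of (kind, id) pairs."""

    def __init__(self):
        self.entities = {}


def apply_credit(store, account_key, request_id, amount):
    """Credit an account at most once per request_id.

    The marker is stored as a child of the account, so both live in the
    same entity group and could be written together in one transaction.
    """
    marker_key = account_key + (("AppliedCredit", request_id),)
    if marker_key in store.entities:
        return False  # already applied, so a retry is a harmless no-op
    account = store.entities.setdefault(account_key, {"balance": 0})
    account["balance"] += amount
    store.entities[marker_key] = {"amount": amount}
    return True
```

Because a repeated call with the same request id changes nothing, whatever retries the operation after an exception (a task queue, for instance) can do so safely.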

The documentation of HR should include real-world examples where entity groups are really needed for transactional consistency, and should include an example of using memcache to provide apparent consistency atop the queries.

(Or, put more simply, have whoever wrote the sharded counter example write these examples. 'cause that person knows how to write great examples.)

-Joshua


--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/a4o19jM0nU4J.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.

Kenneth

Aug 6, 2011, 5:31:09 AM
to google-a...@googlegroups.com
Ok, that makes a ton more sense, thanks for the clarification!

Wendel

Aug 6, 2011, 12:45:20 PM
to google-a...@googlegroups.com
"The High Replication code sample above writes to a single entity group per guestbook. This allows queries on a single guestbook to be strongly consistent, but also limits changes to the guestbook to 1 write per second (the supported limit for entity groups)"

Does this limit of 1 write per second apply to a single entity item or to the entire table of entities of the same kind?

The documentation is unclear about this, but I assume it is limited to only a single entity record, otherwise it would be impossible to scale.

Robert Kluin

Aug 7, 2011, 1:38:34 PM
to google-a...@googlegroups.com

The docs clearly state that the write-rate limit applies to an *entity
group*. An entity group is a collection of entities you define (by
specifying a parent). By default, every entity is in its own group.

http://code.google.com/appengine/docs/python/datastore/entities.html#Entity_Groups_and_Ancestor_Paths
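
A small illustration of why the limit is per entity group rather than per kind. This is a toy model, not App Engine code: keys are pictured as tuples of (kind, id) pairs, and `writes_per_entity_group` is a hypothetical helper.

```python
from collections import Counter


def writes_per_entity_group(written_keys):
    """Count writes by entity group, i.e. by the root element of each key path."""
    return Counter(key[0] for key in written_keys)


# 100 writes to 100 parentless Greeting entities: 100 separate groups.
independent = [(("Greeting", i),) for i in range(100)]

# 100 writes to Greetings that all share one Guestbook parent: one group.
grouped = [(("Guestbook", "main"), ("Greeting", i)) for i in range(100)]

busiest_independent = max(writes_per_entity_group(independent).values())
busiest_grouped = max(writes_per_entity_group(grouped).values())
```

Both batches write 100 entities of the same kind, but only the second concentrates every write on a single entity group, which is the case the write-rate limit constrains.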



App Engine Group

Aug 5, 2011, 5:06:45 PM
to Google App Engine
Hi Kenneth,

There are a few good articles about writing scalable applications for
Google App Engine: http://code.google.com/appengine/articles/scaling/overview.html

An example that might help: http://code.google.com/appengine/articles/sharding_counters.html
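
For readers who have not seen it, the idea behind a sharded counter can be sketched in plain Python as below. This is a simplified in-memory illustration, not the code from the linked article; `ShardedCounter` and its methods are hypothetical names, and each list slot stands in for a separate parentless entity (and therefore a separate entity group).

```python
import random


class ShardedCounter:
    """Spread increments over several shards so no single one takes every write."""

    def __init__(self, num_shards=20):
        self.shards = [0] * num_shards

    def increment(self, rng=random):
        # Pick a shard at random; in a datastore this would be a write to
        # one of num_shards independent entities.
        index = rng.randrange(len(self.shards))
        self.shards[index] += 1

    def total(self):
        # Reading the counter means summing all shards.
        return sum(self.shards)
```

The trade-off is that writes get cheaper to scale while reads have to touch every shard, so a total that is read often is usually cached.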

- Wen



Ikai Lan (Google)

Aug 8, 2011, 11:54:34 AM
to google-a...@googlegroups.com
Hello,

Yes, this limit applies to entity groups.

1. Entity Kind != entity group
2. "No entity group parent" means an entity is its own entity group root

The pitfalls here revolve around consistency guarantees. Check out these slides I did that describe the differences in consistency guarantees:


--
Ikai Lan 
Developer Programs Engineer, Google App Engine


