My master/slave to high replication datastore migration experience


kuanyong

Jul 11, 2011, 5:14:42 PM
to google-a...@googlegroups.com
Hey guys,

I had a pretty good experience migrating a fairly complex Python app from the master/slave datastore to the high replication datastore. I wrote a blog post to document a few things I learned along the way. I hope some of you will find it useful.


Enjoy,
Kuan

Joshua Smith

Jul 12, 2011, 9:05:14 AM
to google-a...@googlegroups.com
You call that a "pretty good experience"?  I call that a nightmare.

Googlers would be wise to read that post and realize that many of the things he had to do are insanely complicated. About half of this stuff would not be necessary if Google figured out a way to keep the same app ID when porting to HR. I imagine that the crux of the problem is that there is only one big table, and so there would be name collisions between the HR and M/S versions, right? So how about cleaving the world into two big tables? Is that possible? Or how about adding a hidden boolean (_is_hr) attribute to all models, and rigorously mixing that boolean into all gets, puts, queries, etc., deep in the infrastructure?

The other half of the complexity is coming from his need to start using entity groups. I have about 15 apps running on GAE (comprising about 10K lines of Python code) and I have never, ever, parented an entity. When I read the docs a hundred years ago, it became clear to me that parenting entities must be something that is required for some kind of application that I'm not writing. Frankly, I've never completely grokked what an entity group maps to in my (30+ years of) programming experience. It's certainly not a parent/child relationship in the usual sense.

One thing that this post made clear to me is that I need to spend some significant time figuring out what the hell an entity group really is, and why they suddenly matter in HR.  Because right now, I don't get it. And Kuan certainly wouldn't have gone to all that trouble if he hadn't needed to.

-Joshua

p.s.: cool site, Kuan

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/cutzl4HHDO4J.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.

Greg

Jul 13, 2011, 9:41:57 PM
to Google App Engine
On Jul 13, 1:05 am, Joshua Smith <JoshuaESm...@charter.net> wrote:
> Frankly, I've never completely grokked what an entity group maps to in my (30+ years of) programming experience. It's certainly not a parent/child relationship in the usual sense.

If you figure it out, let me know! I have an app that needs to update
two distinct entity types at the same time. Because they are distinct,
I can't do both in a transaction. Even if it was only one entity type,
I don't want to lock all of them when writing to one, which I believe
is a consequence of making them a group.

A good article called "entity groups for dummies" seems warranted!

Waleed Abdulla

Jul 13, 2011, 10:17:03 PM
to google-a...@googlegroups.com
I agree with Joshua. That's a very bad experience to force us to go through to migrate. My migration is probably going to be much worse because I have a lot of data, which means that it'll take a very long time (days?) to copy the data across, and by that time the data will have changed on the original app. I cannot bring the app down for extended periods of time while I do this.

Not to mention how much of a hassle it is to change app IDs.

I posted an issue here, star it if you agree that this process should be automated:







Robert Kluin

Jul 13, 2011, 11:08:03 PM
to google-a...@googlegroups.com
Hey Greg and Joshua,
Entity groups really aren't that hard, once you play with them and
do some testing, I promise ;).

The first thing to note is that entity groups have effectively nothing
to do with the entity's kind. They have to do with the entity's key; or,
more precisely, its path. Entities in the same group have the same
path prefix and so are stored "together," and they can be operated on
transactionally. The relationship is parent-child in the same sense
that a file has a parent directory. You could think of them almost as
a sub-database.
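
To make the file-path analogy concrete, here is a toy model in plain Python. It is only a sketch and not the App Engine datastore API: keys are represented as tuples of (kind, name) pairs, and the helpers `entity_group` and `same_group` are hypothetical names made up for illustration.

```python
# Toy model of datastore key paths; this is NOT the App Engine API.
# A key is a tuple of (kind, name) pairs, from the root down to the entity.

def entity_group(key_path):
    """Return the root element of the path, which identifies the entity group."""
    return key_path[0]

def same_group(*key_paths):
    """True if every key shares the same root element."""
    return len({entity_group(p) for p in key_paths}) == 1

# Three different kinds under the same root: one entity group.
account = (("Account", "alice"),)
order = (("Account", "alice"), ("Order", "o-1"))
invoice = (("Account", "alice"), ("Order", "o-1"), ("Invoice", "i-9"))

# Same kind as `account`, but a different root: a different entity group.
other_account = (("Account", "bob"),)

print(same_group(account, order, invoice))  # True
print(same_group(account, other_account))   # False
```

The point of the sketch is that the kind never enters the comparison; only the root of the path does.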

  There are limits, however, to how frequently you can write to entities
in the same entity group (I think 2 or 3 per second is stated in the
docs). Also, because the entity group is stored in the key, it cannot
be changed once the entity has been saved.

Hopefully that is not more confusing. :)


Robert


vivpuri

Jul 14, 2011, 9:56:02 PM
to Google App Engine
+1 to @Waleed and @Joshua. Personally, I don't really have the resources
and time to deal with migration. Just switch my datastore to HR. I am
willing to foot the bill for the higher pricing, but cannot take
downtime and all the related issues. Besides that, looking at the
App Engine group, it feels like I need to make following the group a
full-time job. There are so many changes going on that it is really,
really hard to keep track.

Mike Lawrence

Aug 15, 2011, 7:57:50 PM
to google-a...@googlegroups.com
If you use slim3 for your datastore operations you can avoid all this parenting nonsense.
slim3 allows you to update two of the same kind in a single transaction!
Why should we be required to place entities that are not logically related into a parenting relationship just to get transaction support?

I have a game where a user domain object has a list of friends (users). 
Try modeling that as a parent relationship. It's self-referential. A user is not a parent of another user.
Then try adding a list of games domain objects,
and add a new friend to a new game in a single transaction.
With stand-alone domain objects it's simple.

Back to the original post....
I'm finding the latency on the master/slave datastore is too unpredictable for my game.
Hopefully migrating will eliminate some of the long latency and the dynamic instance startup problems I'm seeing.

Tim Hoffman

Aug 15, 2011, 10:08:49 PM
to google-a...@googlegroups.com
Hi Mike,

You said "slim3 allows you to update two of the same kind in a single transaction!"

How does slim3 achieve this, given that the underlying datastore doesn't "currently" support transactions across multiple entity groups?

It would seem this would be a slim3 transaction (I know nothing about slim3, by the way) and not a datastore operation,
which would suggest that the transaction isn't reliable.

But I would love to be enlightened on this topic.

Rgds

Tim 

Mike Lawrence

Aug 15, 2011, 10:40:24 PM
to google-a...@googlegroups.com, google-a...@googlegroups.com
TCP guarantees delivery even though it is built on IP,
which doesn't, so it's not too hard to believe you can build something more robust on top of a limited implementation.

It's open source, so you can view the implementation for yourself to verify that it's sound. I'd ask the slim3 user group.
They're friendly and very responsive.

Good luck,

Mike Lawrence


Tim Hoffman

Aug 16, 2011, 1:19:26 AM
to google-a...@googlegroups.com
OK, they are just creating roll-forward transactions across multiple entity groups, the same as in this article by Nick Johnson.


As to the statement "you can avoid all this parenting nonsense": that's what slim3 is doing under the hood for you anyway. So one way or another it has to happen.
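
For anyone unfamiliar with the pattern, a roll-forward transaction across two entity groups can be sketched roughly like this. It is a toy in-memory simulation in plain Python, not slim3's implementation and not the datastore API; `Group`, `debit`, `credit`, and `roll_forward` are hypothetical names, and each method is assumed to stand in for one single-group transaction.

```python
# Toy in-memory simulation of a roll-forward transfer between two entity
# groups; this is NOT slim3's code or the datastore API.

class Group:
    """Stands in for one entity group; each method models one local transaction."""

    def __init__(self, balance):
        self.balance = balance
        self.outgoing = {}    # transfer_id -> [amount, completed flag]
        self.applied = set()  # transfer ids already credited to this group

    def debit(self, transfer_id, amount):
        # Transaction 1 (source group): debit and record the intent together.
        if self.balance < amount:
            raise ValueError("insufficient funds")
        self.balance -= amount
        self.outgoing[transfer_id] = [amount, False]

    def credit(self, transfer_id, amount):
        # Transaction 2 (destination group): idempotent, so safe to retry.
        if transfer_id not in self.applied:
            self.applied.add(transfer_id)
            self.balance += amount


def roll_forward(source, dest):
    """Finish every recorded transfer that has not been marked complete."""
    for transfer_id, record in source.outgoing.items():
        amount, done = record
        if not done:
            dest.credit(transfer_id, amount)
            record[1] = True  # Transaction 3 (source group): mark complete.


a, b = Group(100), Group(0)
a.debit("t1", 30)    # suppose the process crashes right after this commit
roll_forward(a, b)   # a later retry completes the transfer
roll_forward(a, b)   # running it again changes nothing
print(a.balance, b.balance)  # 70 30
```

No single step touches more than one group, and the recorded intent plus the idempotent credit are what let a retry finish the job after a failure.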

Rgds

Tim