--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
Hey Casey,
One other thought: you could run over the data, bundling it up and dumping it to the Blobstore, then pull the blobs to Amazon. That might let you get a little more efficiency in the transfer. I've done it going in, so it should work just as well going out.
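A minimal sketch of the bundling step, in plain Python so it runs anywhere. It only shows how records might be grouped and compressed into blobs; writing the blobs to the Blobstore and pulling them to Amazon is not shown. The names `bundle` and `unbundle`, the JSON-lines format, and the batch size are all assumptions for illustration, not anything from the SDK.

```python
import gzip
import io
import json


def _compress(batch):
    """Serialize a list of dict records as gzip-compressed JSON lines."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        for record in batch:
            gz.write(json.dumps(record, sort_keys=True).encode("utf-8") + b"\n")
    return buf.getvalue()


def bundle(records, max_records=1000):
    """Yield compressed blobs holding at most max_records records each."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) >= max_records:
            yield _compress(batch)
            batch = []
    if batch:  # final, partially filled bundle
        yield _compress(batch)


def unbundle(blob):
    """Decode one blob back into its list of records (receiving side)."""
    with gzip.GzipFile(fileobj=io.BytesIO(blob), mode="rb") as gz:
        text = gz.read().decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]
```

Fewer, larger compressed objects usually mean fewer round trips than moving entities one by one, which is presumably where the efficiency gain would come from.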
Also note that auto-generated keys are neither sequential nor strictly increasing, so you could potentially lose data if you rely on key order alone. A solution I've used is to make a small adjustment to my models: I add an indexed 'batch' field that gets set on all new or updated entities. Do your main transfer using key order, then, when you're ready, grab everything with a batch value. Old data won't have a value for batch, so it won't be picked up in your final conversion. With a couple of iterations you should be able to minimize your downtime.
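To make the two-pass idea concrete, here is a rough in-memory simulation. Plain dicts stand in for the datastore and the destination, and `key_order_pass`, `batch_pass`, and `live_write` are hypothetical names, not SDK calls. It shows a write landing behind the key cursor during the main transfer and then being picked up by the batch pass.

```python
def key_order_pass(source, dest, page_size=2, on_page=None):
    """Copy entities in key order, resuming after the last key seen."""
    last_key = None
    while True:
        keys = sorted(k for k in source if last_key is None or k > last_key)
        keys = keys[:page_size]
        if not keys:
            break
        for k in keys:
            dest[k] = dict(source[k])
        last_key = keys[-1]
        if on_page:
            on_page(last_key)  # hook so the demo can simulate live writes


def batch_pass(source, dest, batch):
    """Copy every entity stamped with the given batch value."""
    copied = 0
    for k, entity in source.items():
        if entity.get("batch") == batch:
            dest[k] = dict(entity)
            copied += 1
    return copied


source = {10: {"v": "a"}, 20: {"v": "b"}, 30: {"v": "c"}, 40: {"v": "d"}}
dest = {}


def live_write(last_key):
    # Simulated traffic while the transfer runs: both writes fall behind
    # the cursor, so the key-order pass never sees them.
    if last_key == 20:
        source[5] = {"v": "new", "batch": 1}
        source[10] = {"v": "a2", "batch": 1}


key_order_pass(source, dest, on_page=live_write)
missed_new = 5 not in dest          # new entity was skipped
stale_copy = dest[10]["v"] == "a"   # update was skipped
caught = batch_pass(source, dest, batch=1)
```

After the batch pass, `dest` matches `source` again. Bumping the batch value and repeating the last step gives the "couple of iterations" mentioned above.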
You could also use the remote API to fetch data during the final transfer stage. That should let you have zero downtime.
Robert
Hi Daniel,
I strongly suspect you're going to need a different solution to transfer that much data out in a timely manner. The best solution depends on your write rates and update patterns.
Robert
File a ticket. I think this should be part of Google Takeout.
Hey Casey,
I think you can find some useful stuff in the SDK and maybe in the docs. I'm mobile now so I don't have links.
Robert
We are in the same situation. How’s everyone’s progress?
We’re planning to migrate to AWS, but we have quite a bit of data to move, like you guys. We’re trying to minimize service disruption and keep our site online while making the move. Here’s our scheme (would love to hear suggestions from you guys):
1) Upload a new app version that adds a Boolean value to every type of entity in Datastore. Call it “updated”. All calls to put() will set this Boolean to true and push the put() data to a GAE pull queue.
2) Use the Remote API to batch get all entities with updated=false. This will get any unmodified data from DS. Data that is modified after the fetch can be retrieved from the pull queue later.
3) Transform the data and push it to AWS.
4) From AWS, lease the data from the pull queue and fill up the database.
5) Modify DNS record to point to the new AWS site while keeping the GAE app alive until it receives no traffic.
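The data flow in steps 1 to 4 can be sketched roughly as follows. This is an in-memory simulation only: the dicts and the deque stand in for Datastore, the pull queue, and the AWS database, and `put`, `bulk_copy`, `drain_queue`, and `strip_flag` are hypothetical helpers, not Task Queue or Remote API calls. The bulk copy scans everything and checks the flag in code rather than running a filtered query.

```python
from collections import deque

datastore = {}        # stand-in for the GAE Datastore
pull_queue = deque()  # stand-in for a GAE pull queue
target_db = {}        # stand-in for the database on AWS


def put(key, props):
    """Step 1: every write sets 'updated' and is mirrored to the queue."""
    entity = dict(props, updated=True)
    datastore[key] = entity
    pull_queue.append((key, entity))


def strip_flag(entity):
    """Step 3 transform: drop the migration-only flag."""
    return {k: v for k, v in entity.items() if k != "updated"}


def bulk_copy(transform):
    """Steps 2-3: copy entities not written since the new version went live."""
    for key, entity in list(datastore.items()):
        if not entity.get("updated"):
            target_db[key] = transform(entity)


def drain_queue(transform, lease_size=100):
    """Step 4: lease queued writes in arrival order and apply them."""
    applied = 0
    while pull_queue:
        for _ in range(min(lease_size, len(pull_queue))):
            key, entity = pull_queue.popleft()
            target_db[key] = transform(entity)
            applied += 1
    return applied


# Pre-existing data, written before the new app version was uploaded.
datastore.update({"a": {"n": 1}, "b": {"n": 2}})
put("b", {"n": 3})      # modified before the bulk copy
bulk_copy(strip_flag)
put("c", {"n": 4})      # created after the bulk copy
applied = drain_queue(strip_flag)
```

One thing the sketch glosses over is ordering: it applies the bulk copy first and the queue second, so the queued version always wins. A real migration would need to be sure a late bulk write cannot overwrite a newer queued one.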
We are trying to take advantage of the GAE pull queue’s ability to be accessed from outside of GAE. Do you guys foresee any problems with this scheme? We’re busy coding this at the moment and would love to hear your input. Thank you.
Also, we are planning to use AWS Elastic Beanstalk to ease the Tomcat admin effort. Could anyone share their experience with this technology?
The only problem I see is that you won't be able to get existing data with 'updated=False', because the old data won't be indexed on that property. You'll need to just get everything and maybe skip the stuff you have already pushed. Otherwise it sounds like the idea might work.
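A toy illustration of that point, again in plain Python rather than the datastore API. `query_equals` mimics an index-backed equality filter, where an entity that was stored without the property has no index entry and so never matches. `full_scan` is the "get everything and skip what you already pushed" fallback. Both names are made up for this sketch.

```python
def query_equals(entities, prop, value):
    """Mimic an indexed equality filter: entities lacking prop never match."""
    return [k for k, e in sorted(entities.items())
            if prop in e and e[prop] == value]


def full_scan(entities, already_pushed):
    """Fallback: walk every key and skip the ones already transferred."""
    return [k for k in sorted(entities) if k not in already_pushed]


entities = {
    1: {"name": "old"},                    # stored before 'updated' existed
    2: {"name": "old2"},                   # stored before 'updated' existed
    3: {"name": "new", "updated": True},   # written by the new app version
}

matches = query_equals(entities, "updated", False)
to_push = full_scan(entities, already_pushed={3})
```

Here `matches` comes back empty even though entities 1 and 2 were never modified, while `to_push` finds both of them.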
> To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/FWTtjG5urQ0J.