Migration 'stuck'


Klaas Pieter Annema

Jan 19, 2012, 4:23:39 AM
to google-a...@googlegroups.com
I started an app migration 3 days ago and it's still running. Yesterday it seemed to be making progress: the status of the copy phase changed from 'copying' to 99.00% (approximately 0:30:00 remaining). The estimated time remaining slowly increased during the day and is now at 0:43:19.

I'm starting to doubt whether it's still working. I have about 40 GB of data in my datastore. The datastore statistics of the receiving application have been showing all data since the end of day 1. Is the migration supposed to take this long? How do I know if something went wrong?

- Klaas Pieter

Klaas Pieter Annema

Jan 21, 2012, 1:40:35 PM
to google-a...@googlegroups.com
The migration is still stuck. The status is now “99.00% (approximately 1:17:41 remaining)”. Nobody at Google is responding to the issue. I've already had to postpone one announced maintenance period. The next one is rapidly approaching and it looks like we're not going to make that either.

Does anyone have a suggestion?

- Klaas Pieter

Amy Unruh

Jan 21, 2012, 5:59:08 PM
to google-a...@googlegroups.com
Klaas,

If the migration still appears 'stuck', can you send me the app id?


Klaas Pieter Annema

Jan 22, 2012, 5:51:15 AM
to google-a...@googlegroups.com
I just reverted and restarted the migration; hopefully this time it'll copy correctly. The app id is enstoresecure. Can you see what went wrong with the previous migration? Will it stop the current one as well?

- Klaas Pieter

Klaas Pieter Annema

Jan 22, 2012, 1:56:11 PM
to google-a...@googlegroups.com
The new migration seems to be stuck as well, again at 99% with the estimated time remaining slowly rising.

- Klaas Pieter

Brandon Wirtz

Jan 22, 2012, 5:20:50 PM
to google-a...@googlegroups.com

There have been other threads on this. Until your migration has been stuck for 72 hours, don’t assume it died. Many people’s migrations have taken 120 hours.

Klaas Pieter Annema

Jan 22, 2012, 5:39:19 PM
to google-a...@googlegroups.com
My previous migration had been stuck for over 72 hours. I stopped it and started a new one. It went to 99% in half an hour and has been in its current state for over 12 hours.

- Klaas Pieter

Klaas Pieter Annema

Jan 24, 2012, 5:25:31 AM
to google-a...@googlegroups.com
It took a while, but I was finally able to finish the migration.

The problem was that some of my entities were over the limit on index entries per entity. These entities were created before App Engine enforced the limit and were grandfathered in under the old limits. As long as the keys of these entities don't change, they work correctly. Because migration changes the keys, the new application does not allow these entities to be put.

We had some old same property indexes that were not needed anymore. Removing these from index.yaml and running vacuum_indexes on both the old and new applications resolved the problem.
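
For anyone wondering how an entity ends up over a limit like that, here is a rough Python sketch of how index entries multiply when a composite index covers list properties. It assumes the properties in the index are distinct and that each list value contributes one factor; the entity, property names, and the 5000 figure used for comparison are illustrative assumptions, not values taken from the application discussed here.

```python
# Assumed per-entity cap on index entries, used only for comparison.
ASSUMED_LIMIT = 5000


def composite_index_entries(entity, index_properties):
    """Rough count of entries one entity adds to one composite index.

    Assumes every property named in the index is distinct. A list-valued
    property multiplies the count by its number of values; a single value
    multiplies it by 1.
    """
    count = 1
    for name in index_properties:
        value = entity[name]
        count *= len(value) if isinstance(value, (list, tuple)) else 1
    return count


# Hypothetical entity with two list properties and one scalar property.
entity = {
    "tags": ["tag-%d" % i for i in range(80)],
    "readers": ["user-%d" % i for i in range(70)],
    "created": 1327400731,
}

entries = composite_index_entries(entity, ["tags", "readers"])
over_limit = entries > ASSUMED_LIMIT
```

An index on one list property plus a scalar stays small, while an index that combines two list properties grows with the product of their lengths, which is why dropping unneeded indexes can bring an entity back under the cap.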

- Klaas Pieter


Robert Kluin

Jan 26, 2012, 2:46:52 AM
to google-a...@googlegroups.com
Yikes, you mean the 5000 indexes / entity limit? Those probably cost a bit to write.


Robert


Klaas Pieter Annema

Jan 27, 2012, 3:15:52 AM
to google-a...@googlegroups.com
Yes, I meant those, and yes, they probably cost a lot to write. Luckily we didn't need them anymore :)

- Klaas Pieter

Ikai Lan (Google)

Jan 27, 2012, 6:58:36 PM
to google-a...@googlegroups.com
Oh, bummer. Yeah, there's an RPC size limit, and as an entity gets bigger and bigger it creeps up on that limit until it's too big to update anymore.

There are a few more cases when migrations will appear to be stuck or run forever, but they tend to be symptoms of the same underlying issue:

- Imbalanced namespaces. The tool shards by namespace, so if you use 5000 namespaces but 99% of your entities are in 1 namespace, most of the entities will be serviced by a single worker instance. If you write new entities to this namespace faster than the single worker copies them to the new application, your migration will run forever.

- Writing entities faster than the migration tool can map over keys. This typically happens when you are serving hundreds or thousands of queries per second (we call this QPS) and new entities are being written at that rate, so the mapper cannot keep up. The mapper is responsible for sharding the entities into buckets so they can be copied in parallel.
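
To make the namespace imbalance concrete, here is a small Python sketch. It is a simplified model, not the migration tool's actual scheduler: it assumes exactly one worker per namespace and a fixed, made-up copy rate per worker, and the namespace names and entity counts are hypothetical.

```python
def copy_time_per_namespace(entities_per_namespace, copy_rate):
    """Seconds needed per namespace, assuming one worker per namespace
    and a fixed copy rate (entities per second) for every worker."""
    return {ns: n / copy_rate for ns, n in entities_per_namespace.items()}


# 4999 tiny namespaces plus one hot namespace holding 99% of all entities.
namespaces = {"tenant-%d" % i: 10 for i in range(4999)}
namespaces["main"] = 4999 * 10 * 99

COPY_RATE = 100.0  # assumed entities per second per worker

times = copy_time_per_namespace(namespaces, COPY_RATE)
slowest = max(times, key=times.get)

# Time if the same work were spread evenly over the same number of workers.
total_entities = sum(namespaces.values())
balanced_time = total_entities / (len(namespaces) * COPY_RATE)
```

In this model the job finishes only when the worker for the hot namespace finishes, so nearly all of the parallelism is wasted compared with an even spread.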

In most cases, a migration run while your app is still in read-write mode is still going; it's just taking a really long time, and the ETA doesn't accurately reflect that because it does not take into account the incoming stream of new entities. If you're seeing this effect during the initial copy, the best solution is to find a way to slow the writes down so that the migration tool far outpaces them on the map step. We've recently pushed an update that makes migrations run even faster, so hopefully you should be seeing these issues less frequently.
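
A back-of-the-envelope Python sketch of why the displayed ETA can be too optimistic. The rates and backlog below are invented for illustration, and the two formulas are a simple model of the effect described above, not the tool's actual ETA calculation.

```python
def naive_eta(remaining, copy_rate):
    """ETA in seconds that ignores incoming writes."""
    return remaining / copy_rate


def eta_with_writes(remaining, copy_rate, write_rate):
    """ETA in seconds when new entities keep arriving at write_rate.

    Returns infinity when writes arrive at least as fast as they are copied,
    because the backlog never shrinks in that case.
    """
    net_rate = copy_rate - write_rate
    if net_rate <= 0:
        return float("inf")
    return remaining / net_rate


REMAINING = 180000   # assumed entities still to copy
COPY_RATE = 100.0    # assumed entities copied per second
WRITE_RATE = 90.0    # assumed new entities written per second

shown = naive_eta(REMAINING, COPY_RATE)                     # 30 minutes
actual = eta_with_writes(REMAINING, COPY_RATE, WRITE_RATE)  # 5 hours
```

Slowing the writes widens the gap between the copy rate and the write rate, which is what shortens the real completion time in this model.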

--
Ikai Lan 
Developer Programs Engineer, Google App Engine