To be honest, I had not yet thought about it much beyond having the initial idea.
Because the dev server simulates all the GAE services (datastore, urlfetch, memcache, images, etc.), I doubt there is anything useful to learn from how they perform there that can be applied to the production server. Anything that requires an RPC is going to behave far too differently on the dev server. I think all we could really compare is how pure Python code runs.
Your index example is very likely explained by the simulated datastore not having the same performance characteristics as the production datastore; the underlying storage system is completely different. If you are not currently using the SQLite backend on the dev server, give that a try. In my experience it performs much better than the default one.
For my tests I was just going to run some CPU-intensive and memory-I/O-intensive operations and see how they compare. Will this give me any insight into how to better optimize for the production server? Probably not. But, as with getting GAE running on the Pi in the first place, I don't have any lofty goals with this; I am just curious.
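A rough sketch of the kind of pure Python test I mean is below. Nothing here is settled: the function names (cpu_bound, memory_bound, best_time) and the workload sizes are made up for illustration, and it deliberately touches no GAE service, so the same file could in principle be run on the Pi, the dev server, and production.

```python
import timeit


def cpu_bound(n=20000):
    # Pure arithmetic loop: no RPCs and no service stubs involved.
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


def memory_bound(n=20000):
    # Build a list and make a reversed copy to exercise allocation
    # and memory traffic rather than arithmetic.
    data = list(range(n))
    copied = data[::-1]
    return len(copied)


def best_time(func, repeats=3, number=5):
    # Seconds per call, taking the fastest of several runs since the
    # minimum is the figure least disturbed by other activity.
    timer = timeit.Timer(func)
    return min(timer.repeat(repeat=repeats, number=number)) / number


if __name__ == "__main__":
    for func in (cpu_bound, memory_bound):
        print("%s: %.6f s per call" % (func.__name__, best_time(func)))
```

Comparing the per-call numbers between machines would only say something about interpreter and hardware speed, which is about all I expect to get out of this.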
- Bryce