<SNIP>
> In the meanwhile, if the good folks at Google are about to release a
> much better solution, I'd appreciate a heads up so I can devote my
> efforts to building my app again instead of building infrastructure.
>
They probably are:
http://groups.google.com/group/google-appengine/browse_thread/thread/18d246b30e267da4/dc2a10eb6749339a?q=export+datastore&lnk=ol&
>
> Thanks,
> Aral
> >
>
--
Barry
Well, the restore part of that script was to take the XML, parse it
(with ElementTree in my case), and then just pass the values to the
constructor of the datastore objects.
With that method I don't see any real slowness, although I haven't
tested with millions of records.
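
A minimal sketch of that parse-and-construct idea, not the actual script.
The XML layout, the Greeting class, and the MODELS registry are all
assumptions made up for illustration; Greeting is a plain class standing
in for a real datastore model so the example runs anywhere.

```python
import xml.etree.ElementTree as ET

# Stand-in for a datastore model class; in a real app this would be a
# datastore model. The name and fields are hypothetical.
class Greeting(object):
    def __init__(self, author=None, content=None):
        self.author = author
        self.content = content

# Hypothetical registry mapping XML tag names to model classes.
MODELS = {"Greeting": Greeting}

def restore(xml_text):
    """Parse a backup document and build one object per child element."""
    root = ET.fromstring(xml_text)
    restored = []
    for node in root:
        model_class = MODELS[node.tag]
        # Each child element becomes a keyword argument to the constructor.
        kwargs = dict((child.tag, child.text) for child in node)
        restored.append(model_class(**kwargs))
    return restored

SAMPLE = """<backup>
  <Greeting><author>aral</author><content>hello</content></Greeting>
  <Greeting><author>barry</author><content>hi</content></Greeting>
</backup>"""

entities = restore(SAMPLE)
```

With real models you would also save each object after constructing it;
that step is left out here because the stand-in class has no datastore
behind it.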
> And, as I mentioned above, I want to be
> able to restore that data not just to the deployment instance but to
> (a) the local SDK and, (b) to a separate App Engine instance to be
> used as a staging environment.
>
Why would you need a staging environment? The way versions are handled
in App Engine, you could deploy both on the same account. You have that
built in, with the admin console and setting the correct version to be
deployed.
>> where it gets
>> complicated is with relationships and "foreign keys" because as far as
>> I know there is no way to reproduce the exact same key.
>
> You don't have to reproduce the exact same key. I use the actual key
> as the key name when generating the new keys. Any entities with the
> original key are removed prior to the creation of the new key.
>
Then it's not a backup, just a duplicate. Be careful to find a way to
check whether the DB has already been restored on that instance;
otherwise you risk ending up with a DB that holds each piece of
information twice.
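
One way to avoid the duplicates, sketched under assumptions: if every
restored entity is stored under a key name derived from its original key
(as you describe), then running the restore a second time overwrites the
same entries instead of adding new ones. The dict below is only an
in-memory stand-in for the datastore, and the "restored_" prefix and the
sample keys are made up.

```python
# In-memory stand-in for the datastore, keyed by key name.
datastore = {}

def restore_entity(original_key, properties):
    """Store an entity under a key name derived from its original key.

    Returns True if the entity was newly created, False if it replaced
    an entry that an earlier restore had already written.
    """
    key_name = "restored_" + original_key  # hypothetical naming scheme
    created = key_name not in datastore
    datastore[key_name] = dict(properties)
    return created

# Hypothetical backup data: (original key, properties) pairs.
backup = [("agxh1", {"title": "first"}), ("agxh2", {"title": "second"})]

# Running the restore twice still leaves exactly one copy of each entity.
for _ in range(2):
    for original_key, props in backup:
        restore_entity(original_key, props)
```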
> The way I am handling references is to make sure that I observe the
> source order of the model classes (I tried using inspect to do this
> but it relies on imp and os.readlink -- monkeypatching these worked on
> the local SDK but not on the deployment environment so I ended up
> simply reading in the source file myself and running a regex on it.)
Why do you need that? If you are going to use the db, then import it and
use it; if you want, clone the file inside the backup. Since it's all
App Engine, you could duplicate the module somewhere in the backup data
and then use the __import__ function
http://docs.python.org/lib/built-in-funcs.html. After all, the model is
part of the backup, isn't it?
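
A rough sketch of loading a models module by name with __import__. The
module name backup_models and the Author/Book classes are hypothetical,
and the temp directory only stands in for "wherever the backup keeps its
copy of the module" so the example runs locally; on App Engine itself
the file would have to be deployed with the app, since the filesystem
there is read-only.

```python
import os
import sys
import tempfile

# Hypothetical models module as it might be stored alongside a backup.
MODEL_SOURCE = (
    "class Author(object):\n"
    "    pass\n"
    "\n"
    "class Book(object):\n"
    "    pass\n"
)

backup_dir = tempfile.mkdtemp()
with open(os.path.join(backup_dir, "backup_models.py"), "w") as handle:
    handle.write(MODEL_SOURCE)

# Make the backup directory importable, then load the module by name.
sys.path.insert(0, backup_dir)
models = __import__("backup_models")

# Look the classes up by name, as a restore script driven by the backup
# data might do.
Author = getattr(models, "Author")
Book = getattr(models, "Book")
```

Because the module is imported rather than read as text, the classes come
back in a usable form and there is no need to parse the source file.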
If you are storing to Python code, which means you are doing some sort
of code generation, then you are creating very big files of
constructor calls. Are you sure the slowness isn't due to this
approach? Depending on the code, you could be eating up a lot of memory
on both sides of the transaction.
> Since the reference properties require a model to be defined before it
> is referenced, this guarantees that the referred to entities are
> created before the entities that reference them. This is working fine
> currently.
>
Yes, this is one of the limitations of App Engine's db, which is an
advantage here.
> As soon as I fix the max redirection issue (raising the
> network.http.redirection-limit in FireFox didn't work so I'm going to
> try the META refresh approach instead), I should have a working proof
> of concept. Once I have that stable, I'll start work on making it a
> generic Django solution that you can pop into any existing app.
>
I see no reason why that would be happening. Are you doing a page call
for each object?