
Moving class used in pickle


Jeffrey Barish

May 15, 2007, 7:23:29 PM
to pytho...@python.org
I have a class derived from string that is used in a pickle. In the new
version of my program, I moved the module containing the definition of the
class. Now the unpickle fails because it doesn't find the module. I was
thinking that I could make the unpickle work by putting a copy of the
module in the original location and then redefine the class by sticking a
__setstate__ in the class thusly:

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.__class__ = NewClassName

My plan was to specify the new location of the module in NewClassName.
However, when I do this I get the message "'class' object layout differs
from 'class'". I don't think that they do as the new module is a copy of
the old one. I suspect that I am not allowed to make the class assignment
because my class is derived from string. What is the best way to update
the pickle? The only thought I have is to read all the data with the old
class module, store the data in some nonpickle format, and then, with
another program, read the nonpickle-format file and rewrite the pickle with
the class module in the new location.
--
Jeffrey Barish

Jean-Paul Calderone

May 15, 2007, 8:32:53 PM
to jeff_...@earthlink.net, pytho...@python.org

This is one of the reasons pickle isn't very suitable for persistence of
data over a long period of time: no schema (and so no schema upgrades), and
few tools for otherwise updating old data.

For your simple case, you can just make sure the module is still available
at the old name. This will allow pickle to find it when loading the objects
and automatically update the data to point at the new name when the object
is re-pickled. This doesn't give you any straightforward way to know when
the "upgrade" has finished (but maybe you can keep track of that in your
application code), and you need to keep the alias until all objects have been
updated.

For example, if you had a/b.py and you renamed it to a/c.py, then you can do
one of two things: in a/__init__.py, add a 'from a import c as b'. Now a.b
is the same object as a.c, and if you have a class defined in a/c.py and you
access it via a.b, it will still "prefer" a.c (i.e., its __module__ will be
'a.c', not 'a.b'). Because pickle imports the module by its dotted name, the
alias also has to be registered as sys.modules['a.b'] for 'import a.b' to
succeed. Alternatively, you can keep a b.py and import the relevant
class(es) from c.py into it - 'from a.c import ClassA'.
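
The aliasing idea can be tried out without touching any files. The following is only a sketch in Python 3, not code from the original program: the module names demo_old_location and demo_new_location, the class Label, and the helper make_module are all hypothetical, and the modules are built in memory with types.ModuleType instead of being real files.

```python
import pickle
import sys
import types


def make_module(name):
    """Create and register a throwaway module that defines a str subclass."""
    module = types.ModuleType(name)
    # type() gives the class a plain __qualname__ ("Label"), so pickle can
    # look it up as an attribute of the module.
    module.Label = type("Label", (str,), {"__module__": name})
    sys.modules[name] = module
    return module


# 1. The class lives at its original location and an instance gets pickled.
old = make_module("demo_old_location")
old_pickle = pickle.dumps(old.Label("hello"))

# 2. The module is "moved": the old name disappears, the new one appears.
del sys.modules["demo_old_location"]
new = make_module("demo_new_location")

try:
    pickle.loads(old_pickle)
    load_failed = False
except ImportError:  # ModuleNotFoundError is a subclass of ImportError
    load_failed = True

# 3. Alias the old name to the new module so pickle can resolve it again.
sys.modules["demo_old_location"] = new
restored = pickle.loads(old_pickle)

# 4. Re-pickling records the class under its current __module__.
new_pickle = pickle.dumps(restored)
```

After step 4 the new pickle refers only to the new module name, so the alias can be dropped once every stored object has been loaded and written back.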

You could also build a much more complex system in the hopes of being able to
do real "schema" upgrades of pickled data, but ultimately this is likely a
doomed endeavour (see <http://divmod.org:81/websvn/wsvn/Quotient/trunk/atop/versioning.py?op=file&rev=0&sc=0>).

Hope this helps,

Jean-Paul

Berthold Höllmann

May 21, 2007, 7:24:55 PM
Jeffrey Barish <jeff_...@earthlink.net> writes:

You can fiddle with the file class used for reading the pickled file, i.e.
a "read()" method that replaces each instance of "foo.myclass" with
"greatnewmodule.mynewclass" could bring you back in the game.

Porting some applications of mine from 32 to 64 bit, I discovered that my
Numeric int arrays really had to be int32. So I opened the (binary)
pickled files and replaced the occurrences of "'l'" with "'i'". Later we
replaced Numeric with numpy. I used the editor approach again to replace
the "Numeric" string with "numpy" or "numpy.oldnumeric". A colleague of mine
wrote a small reader class implementing the approach, working on the
input stream.
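
A rough Python 3 sketch of the search-and-replace idea follows. It is an illustration under stated assumptions, not the reader class mentioned above: the module names demo_foo and demo_greatnewmodule and the helper make_module are hypothetical, and the pickle is written with protocol 0, where a class reference is stored as newline-terminated text. In protocol 4 and later the names are length-prefixed, so replacing them with strings of a different length would corrupt the stream.

```python
import pickle
import sys
import types


def make_module(name, class_name):
    """Create and register a throwaway module holding one str subclass."""
    module = types.ModuleType(name)
    setattr(module, class_name, type(class_name, (str,), {"__module__": name}))
    sys.modules[name] = module
    return module


# Pickle an instance while the class still lives at its old location.
old = make_module("demo_foo", "MyClass")
data = pickle.dumps(old.MyClass("payload"), protocol=0)

# Simulate the move: old module gone, renamed class in a new module.
del sys.modules["demo_foo"]
new = make_module("demo_greatnewmodule", "MyNewClass")

# Protocol 0 writes a class reference as: c<module>\n<name>\n
old_ref = b"cdemo_foo\nMyClass\n"
new_ref = b"cdemo_greatnewmodule\nMyNewClass\n"

# Rewrite the reference in the byte stream, then load as usual.
patched = data.replace(old_ref, new_ref)
obj = pickle.loads(patched)
```

Matching the whole reference, including the leading opcode and the newlines, lowers the chance of rewriting user data that merely happens to contain the module name.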

Regards
Berthold
--

Peter Otten

May 21, 2007, 9:42:33 PM
Jeffrey Barish wrote:

You could override Unpickler.find_class():

import pickle
from cStringIO import StringIO

class AliasUnpickler(pickle.Unpickler):
    def __init__(self, aliases, *args, **kw):
        pickle.Unpickler.__init__(self, *args, **kw)
        self.aliases = aliases
    def find_class(self, module, name):
        module, name = self.aliases.get((module, name), (module, name))
        return pickle.Unpickler.find_class(self, module, name)

def loads(aliases, str):
    file = StringIO(str)
    return AliasUnpickler(aliases, file).load()

if __name__ == "__main__":
    import before, after
    data = before.A()
    print data.__class__, data
    dump = pickle.dumps(data)
    data = loads({("before", "A"): ("after", "B")}, dump)
    print data.__class__, data

In the example the aliases dictionary maps (module, classname) pairs to
(module, classname) pairs. Of course this only works when the class layout
wasn't changed.
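
For readers on Python 3, the same find_class() override might look roughly like the sketch below. It is an adaptation, not a tested port of the code above: the modules demo_before and demo_after are hypothetical in-memory stand-ins created by the helper make_module, and io.BytesIO takes the place of cStringIO.

```python
import io
import pickle
import sys
import types


class AliasUnpickler(pickle.Unpickler):
    """Unpickler that redirects (module, name) pairs through an alias table."""

    def __init__(self, file, aliases):
        super().__init__(file)
        self.aliases = aliases

    def find_class(self, module, name):
        # Fall back to the original pair when no alias is registered.
        module, name = self.aliases.get((module, name), (module, name))
        return super().find_class(module, name)


def loads(data, aliases):
    """Unpickle a bytes object, applying the alias table to class lookups."""
    return AliasUnpickler(io.BytesIO(data), aliases).load()


def make_module(name, class_name):
    """Create and register a throwaway module holding one str subclass."""
    module = types.ModuleType(name)
    setattr(module, class_name, type(class_name, (str,), {"__module__": name}))
    sys.modules[name] = module
    return module


# Pickle an instance of before.A, then make the old module unavailable.
before = make_module("demo_before", "A")
dump = pickle.dumps(before.A("some text"))
del sys.modules["demo_before"]

# The class now lives elsewhere under a new name.
after = make_module("demo_after", "B")
data = loads(dump, {("demo_before", "A"): ("demo_after", "B")})
```

Unlike aliasing the module itself, this keeps the mapping local to one unpickler and leaves sys.modules untouched in a real program.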

Peter

Gabriel Genellina

May 22, 2007, 12:40:49 AM
to pytho...@python.org
On Mon, 21 May 2007 16:24:55 -0300, Berthold Höllmann
<bh...@despammed.com> wrote:

> Jeffrey Barish <jeff_...@earthlink.net> writes:
>
>> I have a class derived from string that is used in a pickle. In the new
>> version of my program, I moved the module containing the definition of
>> the class. Now the unpickle fails because it doesn't find the module.
>> I
>

> You can fiddle with the file class used reading the pickled file. I.e.
> the "read()" method could replace each instance of "foo.myclass" by
> "greatnewmodule.mynewclass" could bring you back in the game.

There is a hook in the pickle module (load_global or find_global) that you
can override instead, and it's exactly for this usage, see:
http://docs.python.org/lib/pickle-sub.html

--
Gabriel Genellina
