Pickled TrialHandler Backwards Compatibility

125 views
Skip to first unread message

Erik Kastman

unread,
Mar 27, 2012, 12:59:43 PM3/27/12
to psycho...@googlegroups.com
Hi all,

Looks like newer versions of psychopy are balking trying to load old pickled TrialHandler(.psydats). I've just posted a new issue (with proper github auto-links to the earlier commits involved), but thought the mailing list might be a better place for an in-depth discussion, since I have no clue how to fix it. Any ideas?

I'm in the middle of collecting pilot data, and it's hard to analyze everything in the same script when I can only look at half the data at a time. Because it was piloting, I updated mid-data-collection, too. Lesson learned! :)

Thanks,
Erik

Here's the issue text:

----------------------------------------
When Jon updated TrialHandler to inherit from _BaseTrialHandler (somewhere around 34a14ad154, end of January) it broke the unpickling/loading, which was then fixed by Rebecca Sharman in 5b7efa7f54. 

However, that fix isn't backward compatible, so any TrialHandlers that were pickled with the older version (I think 1.73.02) can only be unpickled by that version and earlier. Here's the failing output when trying to unpickle files made with the earlier version using current versions (1.73.05, HEAD):

TypeError: ('object.__new__(X): X is not a type object (classobj)', <function _reconstructor at 0x1ef870>, (<class psychopy.data.TrialHandler at 0x6780308>, <type 'object'>, None))

Rebecca's fix was to remove object inheritance from _BaseTrialHandler in data.py: 

-class _BaseTrialHandler(object):
+class _BaseTrialHandler:

I don't understand python's inheritance well enough to know what's going when. My guess is that the pickled TrialHandler wasn't saved inheriting from anything, and doesn't think it needs to inherit from anything, but the newer version of TrialHandler thinks it should and is failing when it's not given a _BaseTrialHandler. But that's completely a guess.

In any case, we should be able to load older saved data using newer versions, especially since this was just a patch release. Can someone explain what's going on? And give me guidance on fixing it? I thought about using a try: except TypeError, but I don't know how to fix it without re-writing the method definition above. 

Jonathan Peirce

unread,
Mar 30, 2012, 11:09:15 AM3/30/12
to psycho...@googlegroups.com
eesh, that's my fault. Sorry.

I kept seeing that other peoples' python code often subclasses 'object' and started doing it for top-level classes of psychopy, assuming that this was good practice for some reason. That stopped us being able to open older files and reverting it made it all work again, but I hadn't realised that now the files from the /intermediate/ period (1.73.00-1.73.03) wouldn't open.

To fix this properly I'm guessing we'll need a custom unpickling class (which is going to be ugly I'll bet!)
    http://docs.python.org/library/pickle.html

Jon
-- 
Jonathan Peirce
Nottingham Visual Neuroscience

http://www.peirce.org.uk

This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please send it back to me, and immediately delete it. Please do not use, copy or disclose the information contained in this message or in any attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham.

This message has been checked for viruses but the contents of an attachment may still contain software viruses which could damage your computer system: you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation.

Erik Kastman

unread,
Apr 4, 2012, 5:38:41 PM4/4/12
to psycho...@googlegroups.com
Hey all,

I just finished reading the docs, and I have _no_ idea where to start on building an unpickler complex enough to do what we want. I'm writing a stack overflow question, but I don't know if it will get any hits. On the bright side, this will make a good piece of literature / how-to if we ever do figure it out. I'm a little leary of official documentation that says "If this sounds like a hack, you’re right. Refer to the source code to make this work."

Has anyone had a chance to think about this at all? Since the problem is so limited, is there a better way to combine the few datasets that were collected from this intermediate period than writing a grand unpickler? Maybe we could serialize them using a different format with weaker references, like yaml, and load the yaml up correctly? 

Erik

Erik Kastman

unread,
Apr 9, 2012, 2:57:52 PM4/9/12
to psycho...@googlegroups.com
When Jon added _BaseTrialHandler(object) he was doing a good thing by converting TrialHandler from a python "old style" class to a "new style" one… but that broke loading old Handlers that had been pickled as old-style classes. 

New-style classes are recommended, and I've written some code (https://github.com/psychopy/psychopy/issues/125#issuecomment-5003318) that will load both new and old style classes, but it's a little klugey (it stubs an old-style trial handler and then magically switches the class on the fly if an old-style class is loaded, then initializes a new-style object and transfers over all the attributes from the loaded old-style class. 

There are still some problems with this: 

1) The loader can't exist in misc.py, since it's magic requires data to be imported in order to do the magic class switching, and that introduces a circular dependency. 

2) Stubbing out the older class is a pretty ugly hack.

3) I'm not sure that just copying all the variables in dir() is a complete initialization (although I think from the docs that that's essentially one of the strategies that pickle itself uses) and I'm not positive that this won't introduce bugs for people.

4) Anyone loading their own handlers using Pickle.load() and not using the provided loader will have to edit their code. This isn't a huge problem, but it could be an inconvenience for people for no good reason besides "future compatibility."

Is it worth it to use the recommended new-styel classes without a specific reason, given the existing history? Anyone have a feel for that?

Erik

Jonathan Peirce

unread,
Apr 11, 2012, 7:35:48 AM4/11/12
to psycho...@googlegroups.com
Sorry I wasn't much help in providing a fix here, but your solution looks good. Although the bug won't affect very many people I think it's worth making the switch to new-style classes and using your fix to allow the importing of old data.

So I'd suggest we bring your fix in for this next version.

[One advantage of new-style classes is that they report their type() as being the class they are an instance of, whereas old-style classes merely report their type as 'instance' (which I never found very informative).]

Jon

On 09/04/2012 19:57, Erik Kastman wrote:
When Jon added _BaseTrialHandler(object) he was doing a good thing by converting TrialHandler from a python "old style" class to a "new style" one� but that broke loading old Handlers that had been pickled as old-style classes.�

New-style classes are recommended, and I've written some code (https://github.com/psychopy/psychopy/issues/125#issuecomment-5003318) that will load both new and old style classes, but it's a little klugey (it stubs an old-style trial handler and then magically switches the class on the fly if an old-style class is loaded, then initializes a new-style object and transfers over all the attributes from the loaded old-style class.�

There are still some problems with this:�

1) The loader can't exist in misc.py, since it's magic requires data to be imported in order to do the magic class switching, and that introduces a circular dependency.�

2) Stubbing out the older class is a pretty ugly hack.

3) I'm not sure that just copying all the variables in dir() is a complete initialization (although I think from the docs that that's essentially one of the strategies that pickle itself uses) and I'm not positive that this won't introduce bugs for people.

4)�Anyone loading their own handlers using Pickle.load() and not using the provided loader will have to edit their code. This isn't a huge problem, but it could be an inconvenience for people for no good reason besides "future compatibility."

Is it worth it to use the recommended new-styel classes without a specific reason, given the existing history? Anyone have a feel for that?

Erik

On Apr 4, 2012, at 5:38 PM, Erik Kastman wrote:

Hey all,

I just finished reading the docs, and I have _no_ idea where to start on building an unpickler complex enough to do what we want. I'm writing a stack overflow question, but I don't know if it will get any hits. On the bright side, this will make a good piece of literature / how-to if we ever do figure it out. I'm a little leary of official documentation that says "If this sounds like a hack, you�re right. Refer to the source code to make this work."

Has anyone had a chance to think about this at all? Since the problem is so limited, is there a better way to combine the few datasets that were collected from this intermediate period than writing a grand unpickler? Maybe we could serialize them using a different format with weaker references, like yaml, and load the yaml up correctly?�

Erik


On Mar 30, 2012, at 11:09 AM, Jonathan Peirce wrote:

eesh, that's my fault. Sorry.

I kept seeing that other peoples' python code often subclasses 'object' and started doing it for top-level classes of psychopy, assuming that this was good practice for some reason. That stopped us being able to open older files and reverting it made it all work again, but I hadn't realised that now the files from the /intermediate/ period (1.73.00-1.73.03) wouldn't open.

To fix this properly I'm guessing we'll need a custom unpickling class (which is going to be ugly I'll bet!)


Jon

On 27/03/2012 17:59, Erik Kastman wrote:
Hi all,

Looks like newer versions of psychopy are balking trying to load old pickled TrialHandler(.psydats). I've just posted a new issue (with proper github auto-links to the earlier commits involved), but thought the mailing list might be a better place for an in-depth discussion, since I have no clue how to fix it. Any ideas?

I'm in the middle of collecting pilot data, and it's hard to analyze everything in the same script when I can only look at half the data at a time. Because it was piloting, I updated mid-data-collection, too. Lesson learned! :)

Thanks,
Erik

Here's the issue text:

----------------------------------------
When Jon updated TrialHandler to inherit from _BaseTrialHandler (somewhere around�34a14ad154, end of January) it broke the unpickling/loading, which was then fixed by Rebecca Sharman in 5b7efa7f54.�

However, that fix isn't backward compatible, so any TrialHandlers that were pickled with the older version (I think 1.73.02) can only be unpickled by that version and earlier. Here's the failing output when trying to unpickle files made with the earlier version using current versions (1.73.05, HEAD):

TypeError: ('object.__new__(X): X is not a type object (classobj)', <function _reconstructor at 0x1ef870>, (<class psychopy.data.TrialHandler at 0x6780308>, <type 'object'>, None))

Rebecca's fix was to remove object inheritance from _BaseTrialHandler in data.py:�

-class _BaseTrialHandler(object):
+class _BaseTrialHandler:

I don't understand python's inheritance well enough to know what's going when. My guess is that the pickled TrialHandler wasn't saved inheriting from anything, and doesn't think it needs to inherit from anything, but the newer version of TrialHandler thinks it should and is failing when it's not given a _BaseTrialHandler. But that's completely a guess.

In any case, we should be able to load older saved data using newer versions, especially since this was just a patch release. Can someone explain what's going on? And give me guidance on fixing it? I thought about using a try: except TypeError, but I don't know how to fix it without re-writing the method definition above.�


-- 
Jonathan Peirce
Nottingham Visual Neuroscience

http://www.peirce.org.uk

This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please send it back to me, and immediately delete it. Please do not use, copy or disclose the information contained in this message or in any attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham.

This message has been checked for viruses but the contents of an attachment may still contain software viruses which could damage your computer system: you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation.

Konstantin Sering

unread,
Apr 11, 2012, 11:03:31 AM4/11/12
to psycho...@googlegroups.com
I think it is worth switching and sticking to new-style classes, too.

I've read a bit yesterday through this old-style new-style classes
stuff and it seems that you need new-style classes if you want to do
some useful (maybe a bit more fancy) python-hacking.

Cheers, Tino


Am 11. April 2012 13:35 schrieb Jonathan Peirce
<jonatha...@nottingham.ac.uk>:


> Sorry I wasn't much help in providing a fix here, but your solution looks
> good. Although the bug won't affect very many people I think it's worth
> making the switch to new-style classes and using your fix to allow the
> importing of old data.
>
> So I'd suggest we bring your fix in for this next version.
>
> [One advantage of new-style classes is that they report their type() as
> being the class they are an instance of, whereas old-style classes merely
> report their type as 'instance' (which I never found very informative).]
>
> Jon
>
>
> On 09/04/2012 19:57, Erik Kastman wrote:
>
> When Jon added _BaseTrialHandler(object) he was doing a good thing by
> converting TrialHandler from a python "old style" class to a "new style"

> one… but that broke loading old Handlers that had been pickled as old-style
> classes.
>


> New-style classes are recommended, and I've written some code
> (https://github.com/psychopy/psychopy/issues/125#issuecomment-5003318) that
> will load both new and old style classes, but it's a little klugey (it stubs
> an old-style trial handler and then magically switches the class on the fly
> if an old-style class is loaded, then initializes a new-style object and
> transfers over all the attributes from the loaded old-style class.
>

> There are still some problems with this:
>

> 1) The loader can't exist in misc.py, since it's magic requires data to be
> imported in order to do the magic class switching, and that introduces a
> circular dependency.
>

> 2) Stubbing out the older class is a pretty ugly hack.
>
> 3) I'm not sure that just copying all the variables in dir() is a complete
> initialization (although I think from the docs that that's essentially one
> of the strategies that pickle itself uses) and I'm not positive that this
> won't introduce bugs for people.
>

> 4) Anyone loading their own handlers using Pickle.load() and not using the


> provided loader will have to edit their code. This isn't a huge problem, but
> it could be an inconvenience for people for no good reason besides "future
> compatibility."
>
> Is it worth it to use the recommended new-styel classes without a specific
> reason, given the existing history? Anyone have a feel for that?
>
> Erik
>
>
> On Apr 4, 2012, at 5:38 PM, Erik Kastman wrote:
>
> Hey all,
>
> I just finished reading the docs, and I have _no_ idea where to start on
> building an unpickler complex enough to do what we want. I'm writing a stack
> overflow question, but I don't know if it will get any hits. On the bright
> side, this will make a good piece of literature / how-to if we ever do
> figure it out. I'm a little leary of official documentation that says "If

> this sounds like a hack, you’re right. Refer to the source code to make this


> work."
>
> Has anyone had a chance to think about this at all? Since the problem is so
> limited, is there a better way to combine the few datasets that were
> collected from this intermediate period than writing a grand unpickler?
> Maybe we could serialize them using a different format with weaker
> references, like yaml, and load the yaml up correctly?
>

> Erik
>
>
> On Mar 30, 2012, at 11:09 AM, Jonathan Peirce wrote:
>
> eesh, that's my fault. Sorry.
>
> I kept seeing that other peoples' python code often subclasses 'object' and
> started doing it for top-level classes of psychopy, assuming that this was
> good practice for some reason. That stopped us being able to open older
> files and reverting it made it all work again, but I hadn't realised that
> now the files from the /intermediate/ period (1.73.00-1.73.03) wouldn't
> open.
>
> To fix this properly I'm guessing we'll need a custom unpickling class
> (which is going to be ugly I'll bet!)

>     http://docs.python.org/library/pickle.html
>
> Jon
>
> On 27/03/2012 17:59, Erik Kastman wrote:
>
> Hi all,
>
> Looks like newer versions of psychopy are balking trying to load old pickled
> TrialHandler(.psydats). I've just posted a new issue (with proper github
> auto-links to the earlier commits involved), but thought the mailing list
> might be a better place for an in-depth discussion, since I have no clue how
> to fix it. Any ideas?
>
> I'm in the middle of collecting pilot data, and it's hard to analyze
> everything in the same script when I can only look at half the data at a
> time. Because it was piloting, I updated mid-data-collection, too. Lesson
> learned! :)
>
> Thanks,
> Erik
>
> Here's the issue text:
>
> https://github.com/psychopy/psychopy/issues/125
> ----------------------------------------
> When Jon updated TrialHandler to inherit from _BaseTrialHandler (somewhere

> around 34a14ad154, end of January) it broke the unpickling/loading, which


> was then fixed by Rebecca Sharman in 5b7efa7f54.
>

> However, that fix isn't backward compatible, so any TrialHandlers that were
> pickled with the older version (I think 1.73.02) can only be unpickled by
> that version and earlier. Here's the failing output when trying to unpickle
> files made with the earlier version using current versions (1.73.05, HEAD):
>
> TypeError: ('object.__new__(X): X is not a type object (classobj)',
> <function _reconstructor at 0x1ef870>, (<class psychopy.data.TrialHandler at
> 0x6780308>, <type 'object'>, None))
>
>
> Rebecca's fix was to remove object inheritance from _BaseTrialHandler in
> data.py:
>

> -class _BaseTrialHandler(object):
> +class _BaseTrialHandler:
>
> I don't understand python's inheritance well enough to know what's going
> when. My guess is that the pickled TrialHandler wasn't saved inheriting from
> anything, and doesn't think it needs to inherit from anything, but the newer
> version of TrialHandler thinks it should and is failing when it's not given
> a _BaseTrialHandler. But that's completely a guess.
>
> In any case, we should be able to load older saved data using newer
> versions, especially since this was just a patch release. Can someone
> explain what's going on? And give me guidance on fixing it? I thought about
> using a try: except TypeError, but I don't know how to fix it without
> re-writing the method definition above.
>
>

Jeremy Gray

unread,
Apr 11, 2012, 11:08:56 AM4/11/12
to psycho...@googlegroups.com
first of all, kudos to Erik for working out a way forward!

I expect that switching to new style classes now will ease the
eventual transition to python 3 (whenever that time comes).

--Jeremy

Erik Kastman

unread,
Apr 11, 2012, 11:32:12 AM4/11/12
to psycho...@googlegroups.com
Thanks all; I'm glad everyone's on the same page with this - that was my impression but I didn't want to make trouble for anyone!

I'll think a little bit more about the best way to add the loader now that we want to go ahead with it, as I don't think it's important enough to deserve it's own "legacy.py" file orphaned off as it is. We will definitely want to thoroughly test this and make sure that transferring everything in the dict is sufficient for everyone's use, and add a stubbed StairHandler as well. I'll get to it!

Erik

Jeremy Gray

unread,
Apr 11, 2012, 1:27:45 PM4/11/12
to psycho...@googlegroups.com
just an idea: before putting more real work into it, you might do some
market research to get a sense of how widely useful that extra work
would be. one informal way to do this is just to wait for people to
ask questions via the list, asking for help. another way is to post
something saying "I had this issue, and solved it by doing X, Y, and
Z." until someone asks for help, going with a "legacy.py" file might
not be a bad way to go. its easy, there if someone needs it, and out
of the way if not.

but sometimes putting real work into something, regardless of market
research, is not a bad way to go either. I often learn a lot about
psychopy (and python more generally) by trying to implement some
feature. Steve Jobs disdained market research. I think Henry Ford
(inventor of the automobile) said something like "If I asked people
what they wanted, they would have said a faster horse."

--Jeremy

Jonathan Peirce

unread,
Apr 11, 2012, 1:45:35 PM4/11/12
to psycho...@googlegroups.com

On 11/04/2012 18:27, Jeremy Gray wrote:
> just an idea: before putting more real work into it, you might do some
> market research to get a sense of how widely useful that extra work
> would be. one informal way to do this is just to wait for people to
> ask questions via the list, asking for help. another way is to post
> something saying "I had this issue, and solved it by doing X, Y, and
> Z." until someone asks for help, going with a "legacy.py" file might
> not be a bad way to go. its easy, there if someone needs it, and out
> of the way if not.

How about compatibility.py rather than legacy.py? I'll bet we'll need to
add additional functions to handle compatibility sooner or later. In
fact, I have some ideas in my head about some functions to provide
warnings about compatibility based on version number. That would be most
sensible in such a package.


> but sometimes putting real work into something, regardless of market
> research, is not a bad way to go either. I often learn a lot about
> psychopy (and python more generally) by trying to implement some
> feature. Steve Jobs disdained market research. I think Henry Ford
> (inventor of the automobile) said something like "If I asked people
> what they wanted, they would have said a faster horse."

Actually I agree, and I really like the Ford quote. Asking people what
they want seems 'right' but has rarely been fruitful.

Incidentally, I think Karl Benz's family might want to correct you on
the invention of the automobile thing. ;-)

all the best
Jon

>>>>> one� but that broke loading old Handlers that had been pickled as old-style

>>>>> this sounds like a hack, you�re right. Refer to the source code to make this

http://www.peirce.org.uk/

Jeremy Gray

unread,
Apr 11, 2012, 2:08:44 PM4/11/12
to psycho...@googlegroups.com
uh-oh, Benz too? http://quoteinvestigator.com/2011/07/28/ford-faster-horse/

>>>>>> one… but that broke loading old Handlers that had been pickled as

>>>>>> this sounds like a hack, you’re right. Refer to the source code to

Erik Kastman

unread,
Apr 11, 2012, 5:40:45 PM4/11/12
to psycho...@googlegroups.com
I've been thinking about compatibility too, and would love to hear your thoughts. Putting the loader into compatibility.py sounds like a decent starting point.

In my experience people get a lot louder about market research _after_ you've broken their code. :) But I like the suggestion about posting to the users list with a more general description of the problem and seeing if anyone remarks on it.

Also, I just want to make sure that I'm understanding things completely: since in this case pickle loads a binary file, there's no reason to use codecs.open(), right? Text encodings only deal with text-encoded files?

Erik

>>>>>> one… but that broke loading old Handlers that had been pickled as old-style

>>>>>> this sounds like a hack, you’re right. Refer to the source code to make this

Jeremy Gray

unread,
Apr 11, 2012, 6:10:21 PM4/11/12
to psycho...@googlegroups.com
pickle is not binary. probably good to have a test file that includes
some utf-8 characters. I rather suspect that you'll want codec.open(
).

Erik Kastman

unread,
Apr 11, 2012, 7:48:18 PM4/11/12
to psycho...@googlegroups.com
I thought the mode was 'rb' in the pickle doc examples, and that's what I've been using to successfully test so far, but I think you're right because pickle is, at its core, just instructions for running a small virtual machine.

And it would be safer to use codecs regardless since there's
no harm even if it was binary. Thanks!

Jeremy Gray

unread,
Apr 11, 2012, 8:21:12 PM4/11/12
to psycho...@googlegroups.com
> I thought the mode was 'rb' in the pickle doc examples, and that's what I've been using to successfully test so far,

in my limited understanding, the 'b' in 'rb' is there because of some
quirk in windows, something like an ambiguity about file types, and so
it can freak out. to work around this quirk, people toss in the 'b' to
make things more portable / cross-platform, even if its not actually
needed outside of MS-world. I think 'b' should be interpreted as
"might need to be handled as if it were binary" rather than "is
binary".

from http://docs.python.org/library/pickle.html
"By default, the pickle data format uses a printable ASCII
representation. This is slightly more voluminous than a binary
representation. The big advantage of using printable ASCII (and of
some other characteristics of pickle‘s representation) is that for
debugging or recovery purposes it is possible for a human to read the
pickled file with a standard text editor."

Jonathan Peirce

unread,
Apr 12, 2012, 5:43:28 AM4/12/12
to psycho...@googlegroups.com
Correct the 'b' means 'handle as if it were binary'.
The issue is mostly to do with end-of-line characters. For python on
windows loading/saving of text files includes a step to convert LF
characters to or from CRLF. But if the file is something like an image,
that transformation will break the file. So 'b' simply indicates that
this file should be treated like a binary file and should be
loaded/saved untouched.

On 12/04/2012 01:21, Jeremy Gray wrote:
>> I thought the mode was 'rb' in the pickle doc examples, and that's what I've been using to successfully test so far,
> in my limited understanding, the 'b' in 'rb' is there because of some
> quirk in windows, something like an ambiguity about file types, and so
> it can freak out. to work around this quirk, people toss in the 'b' to
> make things more portable / cross-platform, even if its not actually
> needed outside of MS-world. I think 'b' should be interpreted as
> "might need to be handled as if it were binary" rather than "is
> binary".
>
> from http://docs.python.org/library/pickle.html
> "By default, the pickle data format uses a printable ASCII
> representation. This is slightly more voluminous than a binary
> representation. The big advantage of using printable ASCII (and of

> some other characteristics of pickle�s representation) is that for

>>>>>>>>>> one� but that broke loading old Handlers that had been pickled as old-style

>>>>>>>>>> this sounds like a hack, you�re right. Refer to the source code to make this

Reply all
Reply to author
Forward
0 new messages