collections module

Sam

Sep 10, 2011, 9:19:31 PM
to jsonpickle
I'm trying to jsonpickle an object that contains many things,
including a collections.defaultdict(list) and a collections.deque().

It seems like jsonpickle can't properly handle either of these out of
the box. Is that correct? What's my best course of action?

I'm using Python 2.6, and jsonpickle 0.4 with simplejson.

Thanks

David Aguilar

Sep 10, 2011, 11:05:20 PM
to jsonp...@googlegroups.com, jsonpickle

Interesting. I've never tried JP with these objects. Can you fork the repo on GitHub and perhaps commit a (failure-expecting) test case? Hopefully we can figure out a simple solution.

Until then, you may be able to register a serialization handler for those classes.

I'm leaving for Europe for two weeks, so I won't be able to dig in for a little bit, but having the test case would be very helpful. Feel free to fork and experiment.

https://github.com/jsonpickle/jsonpickle

thanks,
--
David

John Paulett

Sep 11, 2011, 3:02:22 PM
to jsonp...@googlegroups.com, sams...@gmail.com
Sam,

The relevant code will likely be in jsonpickle.pickler's flatten() and
_flatten_obj_instance()
https://github.com/jsonpickle/jsonpickle/blob/master/jsonpickle/pickler.py#L72

defaultdict is a subclass of dict, so the check for defaultdict
probably needs to happen before we handle normal dicts.
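
A minimal sketch of that ordering issue, assuming Python 3. The flatten() here is a hypothetical stand-in, not jsonpickle's actual code, and the "default_factory" and "items" keys are made up for illustration (only the "py/object" tag appears in this thread):

```python
import collections


def flatten(obj):
    """Hypothetical flattener; NOT jsonpickle's real flatten()."""
    # defaultdict must be tested first: isinstance(obj, dict) is also
    # True for a defaultdict, so the plain-dict branch would swallow it.
    if isinstance(obj, collections.defaultdict):
        factory = obj.default_factory
        return {
            "py/object": "collections.defaultdict",
            # Illustrative keys, not jsonpickle's wire format.
            "default_factory": None if factory is None else factory.__name__,
            "items": {k: flatten(v) for k, v in obj.items()},
        }
    if isinstance(obj, dict):
        return {k: flatten(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [flatten(v) for v in obj]
    return obj


d = collections.defaultdict(list)
d["foo"].append("bar")
flat = flatten({"outer": d})
```

If the two isinstance checks were swapped, the defaultdict would be flattened as an ordinary dict and its default_factory would be lost.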

The stdlib pickle module handles defaultdict perfectly fine, so it
would be nice to include this in jsonpickle.
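
For reference, a short sketch of the pickle behaviour mentioned above, using only the standard library. The variable names are illustrative; the deque is included because it is the other type from the original question:

```python
import collections
import pickle

original = collections.defaultdict(list)
original["foo"].append("bar")
queue = collections.deque([8, 2, "noon"])

# Round-trip both objects through pickle.
restored = pickle.loads(pickle.dumps(original))
restored_queue = pickle.loads(pickle.dumps(queue))

# Appending to a missing key only works if default_factory survived.
restored["new"].append(1)
```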

John

Sam

Sep 26, 2011, 2:02:21 PM
to jsonpickle
I wanted to thank you both for the responses. Unfortunately I have a
lot of projects on my plate, so I don't have time to play with the
source code for jsonpickle.

For now, I've moved to yaml. It works great with the collections
module out of the box, but it is slow!

If you do release a new version that works with collections I'll
definitely try it out. It would definitely be nice to use something
faster. It takes 33 seconds to deserialize the yaml (admittedly it's a
very large structure).

Thanks
Sam

David Aguilar

Sep 26, 2011, 3:07:29 PM
to Sam, jsonp...@googlegroups.com
Do you happen to have an example of what these objects look like?
Maybe a quick snippet of Python to show me what you expect to see?

I'm just getting back to the States now. I'll try to look into it
later this week.
--
    David

Sam

Sep 26, 2011, 11:44:47 PM
to jsonpickle
Hi David,

Here's a quick program that shows the failure pretty well.

It takes the same structure, and encodes it and decodes it using:
* Pickle
* Yaml
* JsonPickle

It then asserts that they are the same as the original item.

The item is essentially this: {'mydeque': deque([8, 2, 'noon']),
'foo': defaultdict(<type 'list'>, {'car': [3, 5, 'twelve'], 'foo':
['bar']})}

jsonpickle encodes it as: {"mydeque": null, "foo": {"py/object":
"collections.defaultdict", "car": [3, 5, "twelve"], "foo": ["bar"]}}

and decodes it as:

{'mydeque': None, 'foo': defaultdict(None, {'car': [3, 5, 'twelve'],
'foo': ['bar']})}

while pickle and yaml give back the exact original.
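
One caveat when checking this with a plain ==: dict equality ignores default_factory, so the defaultdict(None, ...) above still compares equal to the original, and the comparison fails only because of the deque. A sketch of a stricter check, assuming Python 3 (same_structure is a hypothetical helper name, not part of any library):

```python
import collections


def same_structure(a, b):
    """Compare values, container types and default_factory recursively."""
    if type(a) is not type(b):
        return False
    if isinstance(a, collections.defaultdict):
        if a.default_factory is not b.default_factory:
            return False
    if isinstance(a, dict):
        return a.keys() == b.keys() and all(
            same_structure(a[k], b[k]) for k in a
        )
    if isinstance(a, (list, tuple, collections.deque)):
        return len(a) == len(b) and all(
            same_structure(x, y) for x, y in zip(a, b)
        )
    return a == b


original = collections.defaultdict(list, {"car": [3, 5, "twelve"], "foo": ["bar"]})
decoded = collections.defaultdict(None, {"car": [3, 5, "twelve"], "foo": ["bar"]})

plain_equal = original == decoded                  # ignores default_factory
strict_equal = same_structure(original, decoded)   # compares it as well
```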

Writing this program also made me wonder why you use the unique
nomenclature of encode and decode.

It seems like before the 1.0 release it would be best to copy some
existing standard and use something like dump/dumps or load/loads. :)

I've also put the program below on pastebin: http://pastebin.com/j182A1hs

Thanks for the help!
-Sam



from __future__ import print_function

import collections
import pickle

import jsonpickle
import yaml


def dumpme(item):
    """Round-trip item through yaml, pickle and jsonpickle."""
    print("item is:", item)

    # yaml.Loader rebuilds arbitrary Python objects; trusted input only.
    y = yaml.dump(item)
    print("yaml is: ", y)
    assert item == yaml.load(y, Loader=yaml.Loader)

    p = pickle.dumps(item)
    print("pickle is: ", p)
    assert item == pickle.loads(p)

    j = jsonpickle.encode(item)
    print("jsonpickle:", j)
    decoded = jsonpickle.decode(j)
    print("decoded is:", decoded)
    assert item == decoded


def main():
    x = collections.defaultdict(list)
    x['foo'].append('bar')
    x['car'] = [3, 5, 'twelve']
    y = dict(foo=x,
             mydeque=collections.deque((8, 2, 'noon')))
    dumpme(y)


if __name__ == '__main__':
    main()


Sam

Oct 21, 2011, 10:07:02 PM
to jsonpickle
Any progress on this? I'd still really like to see if I can
successfully use JsonPickle instead of YAML (and see if JsonPickle is
faster).

Thanks
Sam

David Aguilar

Oct 25, 2011, 2:29:24 AM
to Sam, jsonpickle
I pushed some changes to my github fork's "collections" branch.
They add support for collections.defaultdict. Your sample code is in
a test case now so it'll always work.

We're going to need to look at the rest of the classes in collections
and find the best way to support them in a generic way, if possible.

RE: Performance -- I've never done a micro-optimization pass over the
code, but I'm sure we could shave some msecs here and there. Of
course, finalizing the JSON format and doing it in C would be the
fastest, but no one's needed that kind of speed yet. A streaming API
would be interesting.

Let us know how it goes. Code reviews (and patches) are highly appreciated :-)

cheers,
--
    David
