Non string dict keys

145 views
Skip to first unread message

Eoghan Murray

unread,
Jul 25, 2009, 6:20:25 PM7/25/09
to jsonpickle
The following is currently broken:
>>> jsonpickle.decode(jsonpickle.encode({(1, 2):3}))
{u'(1, 2)': 3}
The tuple isn't restored.

Here is a naive fix:

--- a/jsonpickle/pickler.py
+++ b/jsonpickle/pickler.py
@@ -164,7 +164,7 @@ class Pickler(object):
if util.is_function(v):
continue
if type(k) not in types.StringTypes:
- k = repr(k)
+ k = repr(self.flatten(k))
self._namestack.append(k)
data[k] = self.flatten(v)
self._namestack.pop()

--- a/jsonpickle/unpickler.py
+++ b/jsonpickle/unpickler.py
@@ -113,6 +113,13 @@ class Unpickler(object):
data = {}
for k, v in obj.iteritems():
self._namestack.append(k)
+ try:
+ evaled = eval(k)
+ if util.is_dictionary(evaled):
+ k = self.restore(evaled)
+ except:
+ # Can't eval, probably a normal string
+ pass
data[k] = self.restore(v)
self._namestack.pop()
return self._pop(data)

Also, I'm not sure if it handles integer dict keys at the moment?

Thanks!

Eoghan

David Aguilar

unread,
Jul 26, 2009, 2:45:51 AM7/26/09
to jsonp...@googlegroups.com, Eoghan Murray

Hi,

On Sat, Jul 25, 2009 at 03:20:25PM -0700, Eoghan Murray wrote:
>
> The following is currently broken:
> >>> jsonpickle.decode(jsonpickle.encode({(1, 2):3}))
> {u'(1, 2)': 3}
> The tuple isn't restored.

Although it'd be nice to support this in theory,
there is a limit. I think we have a test where we
specifically expect tuples to become unicode reprs of the
original tuples.


> Here is a naive fix:
>
> --- a/jsonpickle/pickler.py
> +++ b/jsonpickle/pickler.py
> @@ -164,7 +164,7 @@ class Pickler(object):
> if util.is_function(v):
> continue
> if type(k) not in types.StringTypes:
> - k = repr(k)
> + k = repr(self.flatten(k))
> self._namestack.append(k)
> data[k] = self.flatten(v)
> self._namestack.pop()


Okay, that's not too bad... but...


> --- a/jsonpickle/unpickler.py
> +++ b/jsonpickle/unpickler.py
> @@ -113,6 +113,13 @@ class Unpickler(object):
> data = {}
> for k, v in obj.iteritems():
> self._namestack.append(k)
> + try:
> + evaled = eval(k)
> + if util.is_dictionary(evaled):
> + k = self.restore(evaled)
> + except:
> + # Can't eval, probably a normal string
> + pass

This seems a little heavy-handed.

What if the key is called 'k', 'v', or even worse, 'data'?

The eval() finds them in locals() and uses the local variables
when we actually wanted a genuine key called 'k'. Hmm.

You'd probably need to guard against this by passing in an empty
dict for eval's locals and globals option.

That reminds me.. we have a couple of eval calls in jsonpickle
that need the same protection.


Also, calling eval() and setting up the exception frame for
every single item could be expensive. We recently added
some small tweaks to help make firepython faster and I
wouldn't want to make everything that much slower.

Maybe if that behavior was guarded behind a flag, maybe called
allow_nonstring_keys or something, similar to how we have a
special unpicklable flag.


> data[k] = self.restore(v)
> self._namestack.pop()
> return self._pop(data)
>
> Also, I'm not sure if it handles integer dict keys at the moment?

Correct, we don't handle that either at the moment.
simplejson has the same behavior; the int becomes a string.

I'm trying to think of how we could do a better job.

There are lots of different strategies.

A few off the top of my head:

- adding a nonstring_keys=False flag that can be enabled so
that the expensive eval stuff is enabled

- keeping a separate 'py/dictkeys' map at the top-level
that maps namespace keys to objects which are then replaced
as the final step in the unpickling.

The namespace names are the ones that jsonpickle uses for
tracking object references. e.g. /foo/bar is equivalent to
the 'bar' member of the foo instance below.

class foo(object):
def __init__(self):
self.bar = {(1,2,3): True}
self.baz = {(1,2,3): False}

jsonpickle.encode({'foo': foo()})


I can imagine keeping around a replacement dict for something
like this:

{
'foo': {
'py/obj': 'foo.Foo',
'bar': {'SHA-1': True},
'baz': {'SHA-1': False}
},

'py/dictkeys': {
'SHA-1': {
'dicts': ['/foo/bar', '/foo/baz']
'value': {'py/tuple': [1,2,3]},
}
}

That might be less expensive since it won't incur the eval()
for every single variable.

The checksum/SHA-1 could be computed from the repr/str of the
non-string key, or perhaps some other way
(id() or hash() perhaps.. not sure.. this needs more thought).

That said, it's kinda ugly :-/
but that would probably be flexible enough to allow arbitrary
objects as keys in a dictionary.

It'd be nice if the structure was simpler.

As complex as jsonpickle might seem to be, I actually do
think of jsonpickle as a fundamentally simple lib and
do like to keep it rather simple.

Hmm.. I'll have to think about this one.
The fact that you pointed out the:

>>> jsonpickle.encode({1: True})

...problem makes me think that maybe we should try and
support it at some point.


*****
We will want to wait for the current code to get released
first, though. We just had a big round of new features so if
we did try to do this we'd want to schedule it for the next
release at the earliest.
*****


I've written several objects that use arbitrary objects as
their dictionary keys. e.g.

self.foo = {some_object: True}

Being able to serialize those would be a big win.

The thing that we'll have to be careful about is making sure
that we preserve object identity.

Right now jsonpickle is really good about that.

If you refer to the same object in multiple places, e.g.

self.foo = some_object
self.bar = some_object
self.baz = {'some-key': some_object}

jsonpickle is actually very careful about that and preserves
the original references. So instead of getting 3 copies of
some_object, you get 1, just like you'd expect.

That's really important, so we'll want any new features to
also have this behavior.


Oh.. another reason why blindly calling eval() is probably
bad: what if someone genuinely has a string key that evals to
a number, e.g. '12.3'. calling eval will turn that into a
float when it was originally a string.

That'd be bad, so let's think of a better way.
Let's completely ignore my nonstring_keys
suggestion above. That's not going to fly.

We put a lot of breadcrumbs in the json stream for jsonpickle
to operate on. If we're talking about adding more breadcrumbs
and doing some extra processing, then that'd be fine. I'd
even be okay doing that without adding an extra flag to enable
the behavior since it's very explicit.

Doing the eval thing by default seems wrought with troubles,
though.

Anyways, let me know what you think.
Do the tests cases pass with your small tweak, btw?

--

David


Eoghan Murray

unread,
Jul 26, 2009, 4:49:31 AM7/26/09
to jsonpickle, David Aguilar
Hi David,

How about a semaphore in the json dictionary key + a string encoding
of the key...
This also works for integer keys.
[The following code is importing the module level function `encode`
from within pickler.py and unpickler.py .., I'm not sure how you'd
want to rearrange the code to sort that out]

--- a/jsonpickle/tags.py
+++ b/jsonpickle/tags.py
@@ -12,6 +12,7 @@ REPR = 'py/repr'
REF = 'py/ref'
TUPLE = 'py/tuple'
SET = 'py/set'
+DICT_KEY = 'py/dict_key'

# All reserved tag names
RESERVED = set([OBJECT, TYPE, REPR, REF, TUPLE, SET])

--- a/jsonpickle/pickler.py
+++ b/jsonpickle/pickler.py
@@ -164,7 +164,8 @@ class Pickler(object):
if util.is_function(v):
continue
if type(k) not in types.StringTypes:
- k = repr(k)
+ from jsonpickle import encode
+ k = tags.DICT_KEY + " " + encode(k);
self._namestack.append(k)
data[k] = self.flatten(v)
self._namestack.pop()

--- a/jsonpickle/unpickler.py
+++ b/jsonpickle/unpickler.py
@@ -113,6 +113,9 @@ class Unpickler(object):
data = {}
for k, v in obj.iteritems():
self._namestack.append(k)
+ if k.startswith(tags.DICT_KEY):
+ from jsonpickle import decode
+ k = decode(k[len(tags.DICT_KEY)+1:])
data[k] = self.restore(v)
self._namestack.pop()
return self._pop(data)

--- a/jsonpickle/tests/jsonpickle_test.py
+++ b/jsonpickle/tests/jsonpickle_test.py
@@ -350,37 +350,28 @@ class JSONPickleTestCase(unittest.TestCase):

def test_tuple_dict_keys(self):
"""Test that we handle dictionaries with tuples as keys.
- We do not model this presently, so ensure that we at
- least convert those tuples to repr strings.
-
- TODO: handle dictionaries with non-stringy keys.
"""
- pickled = jsonpickle.encode({(1, 2): 3,
- (4, 5): { (7, 8): 9 }})
+ tuple_dict = {(1, 2): 3, (4, 5): { (7, 8): 9 }}
+ pickled = jsonpickle.encode(tuple_dict)
unpickled = jsonpickle.decode(pickled)
- subdict = unpickled['(4, 5)']
-
- self.assertEqual(unpickled['(1, 2)'], 3)
- self.assertEqual(subdict['(7, 8)'], 9)
+ self.assertEqual(unpickled, tuple_dict)

def test_datetime_dict_keys(self):
"""Test that we handle datetime objects as keys.
- We do not model this presently, so ensure that we at
- least convert those tuples into repr strings.
-
"""
- pickled = jsonpickle.encode({datetime.datetime(2008, 12, 31):
True})
+ datetime_dict = {datetime.datetime(2008, 12, 31): True}
+ pickled = jsonpickle.encode(datetime_dict)
unpickled = jsonpickle.decode(pickled)
- self.assertTrue(unpickled['datetime.datetime(2008, 12, 31, 0,
0)'])
+ self.assertEqual(unpickled, datetime_dict)

def test_object_dict_keys(self):
"""Test that we handle random objects as keys.

"""
- pickled = jsonpickle.encode({Thing('random'): True})
+ object_dict = {Thing('random'): True}
+ pickled = jsonpickle.encode(object_dict)
unpickled = jsonpickle.decode(pickled)
- self.assertEqual(unpickled,
- {u'jsonpickle.tests.classes.Thing
("random")': True})
+ self.assertEqual(unpickled, object_dict)

def test_load_backend(self):
"""Test that we can call jsonpickle.load_backend()


With the modified test cases the only thing that fails is
test_object_dict_keys.
I assume this is because the objects have different identities?

Eoghan

David Aguilar

unread,
Feb 26, 2010, 2:18:56 PM2/26/10
to Eoghan Murray, jsonpickle
On Sun, Jul 26, 2009 at 01:49:31AM -0700, Eoghan Murray wrote:
> Hi David,
>
> How about a semaphore in the json dictionary key + a string encoding
> of the key...
> This also works for integer keys.
> [The following code is importing the module level function `encode`
> from within pickler.py and unpickler.py .., I'm not sure how you'd
> want to rearrange the code to sort that out]

Hi Eoghan

Sorry for taking so long to get back to this.

FoxyKeys reported issue #5 on github:
http://github.com/jsonpickle/jsonpickle/issues#issue/5

Eoghan, I merged your branch and refactored things so that the
encode() / decode() imports are safe.

The code was tweaked very slightly.
Can you verify if it's doing the trick for you?

The code is in the "dictkeys" branch on:
http://github.com/jsonpickle/jsonpickle


Also, slightly-related, I have another branch called "make_refs"
which solves issue #1:
http://github.com/jsonpickle/jsonpickle/issues#issue/1

That one was easy to do but the most glaring problem with it is
that it has no tests. We can hold off merging that branch until
some tests have been written. The "dictkeys" branch seems like
it's good to go but I would like to hear feedback on it.


(ya, I still haven't had time to do the big rework to make
referencing work across list elements, etc., primarily
due to time)

--
David

Eoghan Murray

unread,
Mar 6, 2010, 5:29:56 AM3/6/10
to David Aguilar, jsonpickle
On 26 February 2010 19:18, David Aguilar <dav...@gmail.com> wrote:
Eoghan, I merged your branch and refactored things so that the
encode() / decode() imports are safe.


Great, thanks - I wasn't sure how to import them correctly
 
The code was tweaked very slightly.
Can you verify if it's doing the trick for you?

The code is in the "dictkeys" branch on:
http://github.com/jsonpickle/jsonpickle

 
 The "dictkeys" branch seems like

it's good to go but I would like to hear feedback on it.


Yes, it is working fine for me after I did a string replace from 'py/dict_key ' to 'py/object:'
I've switched over to this branch for my projects.

 

Also, slightly-related, I have another branch called "make_refs"
which solves issue #1:
http://github.com/jsonpickle/jsonpickle/issues#issue/1
 
That one was easy to do but the most glaring problem with it is
that it has no tests.  We can hold off merging that branch until
some tests have been written.
(ya, I still haven't had time to do the big rework to make
 referencing work across list elements, etc., primarily
 due to time)
I sympathise :)

You might have noticed 3 tests in http://github.com/eoghanmurray/jsonpickle/blob/master/tests/jsonpickle_test.py
  test_refs_keys_values
  test_refs_in_list
  test_refs_recursive
That are currently failing

Until the ref stuff is done, maybe it would be better to make go with the make_refs branch (with no-referencing the default) ?

Cheers!

Eoghan


David Aguilar

unread,
Mar 7, 2010, 10:14:56 PM3/7/10
to Eoghan Murray, jsonpickle
On Sat, Mar 6, 2010 at 2:29 AM, Eoghan Murray <eoghan...@gmail.com> wrote:
> On 26 February 2010 19:18, David Aguilar <dav...@gmail.com> wrote:
>>
>> Eoghan, I merged your branch and refactored things so that the
>> encode() / decode() imports are safe.
>> [...]

>> The code is in the "dictkeys" branch on:
>> http://github.com/jsonpickle/jsonpickle
>>
>>  The "dictkeys" branch seems like
>> it's good to go but I would like to hear feedback on it.
>
> Yes, it is working fine for me after I did a string replace from
> 'py/dict_key ' to 'py/object:'
> I've switched over to this branch for my projects.

That's great. Unless I hear any objections this'll be cooking in master soon.


>> Also, slightly-related, I have another branch called "make_refs"
>> which solves issue #1:
>> http://github.com/jsonpickle/jsonpickle/issues#issue/1
>>
>> That one was easy to do but the most glaring problem with it is
>> that it has no tests.  We can hold off merging that branch until
>> some tests have been written.
>>
>> (ya, I still haven't had time to do the big rework to make
>>  referencing work across list elements, etc., primarily
>>  due to time)
>
> I sympathise :)
>
> You might have noticed 3 tests in
> http://github.com/eoghanmurray/jsonpickle/blob/master/tests/jsonpickle_test.py
>   test_refs_keys_values
>   test_refs_in_list
>   test_refs_recursive
> That are currently failing
>
> Until the ref stuff is done, maybe it would be better to make go with the
> make_refs branch (with no-referencing the default) ?
>
> Cheers!
>
> Eoghan

That's not a bad idea.

If we did this then it means a new release because it changes the behavior
of the API (refs by default).

This also mean that recursive object graphs won't work
by default since it can't recognize that it's already seen an object.
This is very subtle so the documentation will need to be thorough.

Thoughts?

--
David

Ronnie Maor

unread,
Jul 4, 2010, 2:32:32 AM7/4/10
to David Aguilar, jsonp...@googlegroups.com
I'm trying to use jsonpickle to replace pickle, and have the same
issue - in my case there are dictionaries with integer keys
Don't have a problem using the dictkeys branch for now, but just
wondering if there's anything holding it back from being merged into
master.

thanks
Ronnie

On Mar 8, 6:14 am, David Aguilar <dav...@gmail.com> wrote:


> On Sat, Mar 6, 2010 at 2:29 AM, Eoghan Murray <eoghanomur...@gmail.com> wrote:
> > On 26 February 2010 19:18, David Aguilar <dav...@gmail.com> wrote:
>
> >> Eoghan, I merged your branch and refactored things so that the
> >> encode() / decode() imports are safe.
> >> [...]
> >> The code is in the "dictkeys" branch on:
> >>http://github.com/jsonpickle/jsonpickle
>
> >>  The "dictkeys" branch seems like
> >> it's good to go but I would like to hear feedback on it.
>
> > Yes, it is working fine for me after I did a string replace from
> > 'py/dict_key ' to 'py/object:'
> > I've switched over to this branch for my projects.
>
> That's great.  Unless I hear any objections this'll be cooking in master soon.
>
>
>
>
>
> >> Also, slightly-related, I have another branch called "make_refs"
> >> which solves issue #1:
> >>http://github.com/jsonpickle/jsonpickle/issues#issue/1
>
> >> That one was easy to do but the most glaring problem with it is
> >> that it has no tests.  We can hold off merging that branch until
> >> some tests have been written.
>
> >> (ya, I still haven't had time to do the big rework to make
> >>  referencing work across list elements, etc., primarily
> >>  due to time)
>
> > I sympathise :)
>
> > You might have noticed 3 tests in

> >http://github.com/eoghanmurray/jsonpickle/blob/master/tests/jsonpickl...

David Aguilar

unread,
Jul 6, 2010, 10:41:01 PM7/6/10
to Ronnie Maor, jsonp...@googlegroups.com
On Sat, Jul 03, 2010 at 11:32:32PM -0700, Ronnie Maor wrote:
> I'm trying to use jsonpickle to replace pickle, and have the same
> issue - in my case there are dictionaries with integer keys
> Don't have a problem using the dictkeys branch for now, but just
> wondering if there's anything holding it back from being merged into
> master.
>
> thanks
> Ronnie

I don't think there's anything in particular holding it back.
I've merged the branch into my repo (davvid/jsonpickle) for now.
If JP agrees then hopefully we'll see it in
jsonpickle/jsonpickle soon, too.

I guess I should probably merge the make_refs branch, too, since
that allows users to disable jsonpickle's reference tracking.

--
David

John Paulett

unread,
Jul 7, 2010, 11:01:22 PM7/7/10
to jsonp...@googlegroups.com
I merged the more recent changes on master into make_refs
(http://github.com/johnpaulett/jsonpickle/tree/make_refs).
Unfortunately it looks like there are 4 regressions, most of which
seem to be related to unpickable=False. Nothing is jumping out at me
at the moment, so if anyone notices what the issue is please let me
know. Otherwise, I will dive in tomorrow to try to resolve these
issues.

$ tests/runtests.py
...
======================================================================
ERROR: test_dictsubclass_notunpickable (jsonpickle_test.PicklingTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jpaulett/projects/jsonpickle/tests/jsonpickle_test.py",
line 294, in test_dictsubclass_notunpickable
self.assertEqual(1, flattened['key1'])
TypeError: 'bool' object is unsubscriptable

======================================================================
ERROR: test_repr_not_unpickable (jsonpickle_test.PicklingTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jpaulett/projects/jsonpickle/tests/jsonpickle_test.py",
line 326, in test_repr_not_unpickable
self.assertFalse(tags.REPR in flattened)
TypeError: argument of type 'bool' is not iterable

======================================================================
FAIL: test_encode_notunpicklable (jsonpickle_test.JSONPickleTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jpaulett/projects/jsonpickle/tests/jsonpickle_test.py",
line 432, in test_encode_notunpicklable
self.assertEqual('{"name": "A name", "child": null}', pickled)
AssertionError: '{"name": "A name", "child": null}' != 'true'

======================================================================
FAIL: Doctest: jsonpickle
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.6/doctest.py", line 2152, in runTest
raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for jsonpickle
File "tests/../jsonpickle/__init__.py", line 8, in jsonpickle

----------------------------------------------------------------------
File "tests/../jsonpickle/__init__.py", line 54, in jsonpickle
Failed example:
print oneway
Expected:
{"name": "A String", "child": null}
Got:
true

John

> --
> You received this message because you are subscribed to the Google Groups "jsonpickle" group.
> To post to this group, send email to jsonp...@googlegroups.com.
> To unsubscribe from this group, send email to jsonpickle+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/jsonpickle?hl=en.
>
>

David Aguilar

unread,
Jul 8, 2010, 3:40:25 AM7/8/10
to jsonp...@googlegroups.com
On Wed, Jul 07, 2010 at 10:01:22PM -0500, John Paulett wrote:
> I merged the more recent changes on master into make_refs
> (http://github.com/johnpaulett/jsonpickle/tree/make_refs).
> Unfortunately it looks like there are 4 regressions, most of which
> seem to be related to unpickable=False. Nothing is jumping out at me
> at the moment, so if anyone notices what the issue is please let me
> know. Otherwise, I will dive in tomorrow to try to resolve these
> issues.
>
> $ tests/runtests.py

Maybe we're in the clear?

$ env PYTHONPATH="$PWD" tests/runtests.py
[...]
----------------------------------------------------------------------
Ran 108 tests in 0.168s

OK


Maybe this failure below is because we're picking up the
installed python-jsonpickle package?

$ tests/runtests.py
[...]
----------------------------------------------------------------------
Ran 108 tests in 0.165s

FAILED (failures=2, errors=1)

$ dpkg-query --status python-jsonpickle
Package: python-jsonpickle
Status: install ok installed
Priority: optional
Section: python
Installed-Size: 112
Maintainer: John Paulett <jo...@paulett.org>
Architecture: all
Source: jsonpickle
Version: 0.3.1-1


> [...]
> ERROR: test_dictsubclass_notunpickable (jsonpickle_test.PicklingTestCase)
> ERROR: test_repr_not_unpickable (jsonpickle_test.PicklingTestCase)
> FAIL: test_encode_notunpicklable (jsonpickle_test.JSONPickleTestCase)
> FAIL: Doctest: jsonpickle
> [...]

--

David


Reply all
Reply to author
Forward
0 new messages