jsonpickle and unicode


Antonin Hildebrand

Dec 13, 2008, 4:05:45 AM
to jsonpickle
Hello again,

I'm not a Python pro, but I'm afraid jsonpickle is not unicode aware.

I'm using r38, and when I try to jsonpickle a structure which contains
unicode strings (it's a huge struct; I believe there are some
unicode strings in it) I get:
in _flatten_dict_obj
self._namestack.append(str(k))
UnicodeEncodeError: 'ascii' codec can't encode characters in position
0-2: ordinal not in range(128)

I don't understand the internals well, but using str() or similar
functions which produce ASCII strings is simply wrong. jsonpickle
should work internally with unicode strings, and the return value
of jsonpickle.encode should also be a unicode string (now it returns
plain ASCII byte strings).

It's up to the user to encode the resulting unicode JSON with the
encoding of her choice. For example, I am using UTF-8, so I simply call
jsonpickle.encode(...).encode('utf-8') to get a correctly encoded JSON
string.
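
A minimal sketch of the pattern described above, written for Python 3 (where str is already unicode text) and using the standard-library json module as a stand-in for jsonpickle. The sample dictionary is made up for illustration. It shows the ASCII codec failing on a non-ASCII key, and a serializer that returns text which the caller then encodes as UTF-8:

```python
import json

# Hypothetical data with a non-ASCII key, standing in for the real structure.
data = {"jm\u00e9no": "Anton\u00edn"}

# Forcing a key through the ASCII codec fails, much like str(k) did on Python 2.
try:
    "jm\u00e9no".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Keep everything as text, and let the caller pick the byte encoding.
text = json.dumps(data, ensure_ascii=False)  # unicode text
payload = text.encode("utf-8")               # bytes in the caller's chosen encoding

# Decoding with the same encoding gives the original structure back.
roundtrip = json.loads(payload.decode("utf-8"))
```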

regards,
Antonin

David Aguilar

Dec 13, 2008, 1:07:27 PM
to antonin.h...@gmail.com, jsonp...@googlegroups.com
On Sat, Dec 13, 2008 at 1:05 AM, Antonin Hildebrand
<antonin.h...@gmail.com> wrote:
>
> Hello again,
>
> I'm not a Python pro, but I'm afraid jsonpickle is not unicode aware.
>
> I'm using r38 and when trying to jsonpickle structure which contains
> unicode strings (it's a huge struct, I believe there are some
> unicodes) I get:
> in _flatten_dict_obj
> self._namestack.append(str(k))
> UnicodeEncodeError: 'ascii' codec can't encode characters in position
> 0-2: ordinal not in range(128)
>


You are correct. We need to fix this.
Do you have a simple testcase that shows how to reproduce this?
It should be straightforward to fix.
Send it my way and I can take a look.

Thanks,
-David








--
David

Antonin Hildebrand

Dec 13, 2008, 4:19:17 PM
to jsonpickle
Hi David,
I don't have one right now. I encountered this problem when
jsonpickling locals() on all stack frames during an exception in my
webapp, so it was quite a complex scenario. And because I have no decent
Python debugger, I've left it up in the air :-(


David Aguilar

Dec 13, 2008, 6:26:50 PM
to jsonp...@googlegroups.com
On Sat, Dec 13, 2008 at 1:19 PM, Antonin Hildebrand
<antonin.h...@gmail.com> wrote:
>
> Hi David,
> I don't have it right now. I've encountered this problem when
> jsonpickling locals() on all stack frames during exception in my
> webapp. So it was quite complex scenario. And because I have no decent
> python debugger, I've left in the air :-(


I added a test case that I know could trigger it:
dictionaries with unicode keys, or objects with unicode attributes.
Try using the latest svn and let me know if you see it again.
--
David

David Aguilar

Dec 14, 2008, 8:09:31 AM
to Antonin Hildebrand, jsonp...@googlegroups.com
On Sun, Dec 14, 2008 at 3:01 AM, Antonin Hildebrand
<ant...@hildebrand.cz> wrote:
> Hi David,
> thank you for fast response.
>
> I'm subscribed to google group but it is rejecting my email responses.
> I'll investigate it further.
>
> I've tested r40 with my case, and it still doesn't work:
> File "/Users/woid/code/hed/herodes/firepython/jsonpickle/__init__.py",
> line 511, in _mkref
> self._objs[objid] = '/' + '/'.join(self._namestack)
> TypeError: sequence item 13: expected string or Unicode, tuple found
>
> It seems something wrong got into _namestack
>
> regards, Antonin
>


I believe we should handle this correctly now.
Judging by the stacktrace, it looks like you're passing
in a dictionary that has tuples as its keys.

JSON (and thus jsonpickle) does not support that at the moment.
I've fixed jsonpickle so that it will properly handle this condition.
It is a lossy operation, though, since when you unpickle your
object the tuple keys will have been converted to unicode
strings. In the future we might be able to invent a convention
for embedding complex types in dictionary keys, but no
such convention exists in jsonpickle at the moment.
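
A rough sketch of why this conversion is lossy, in Python 3 with the standard-library json module rather than jsonpickle's own code. The helper flatten_keys is hypothetical, and using repr() for the key conversion is an assumption made for illustration:

```python
import json


def flatten_keys(obj):
    """Return a copy of obj with every non-string dict key replaced by its repr()."""
    if isinstance(obj, dict):
        return {
            (k if isinstance(k, str) else repr(k)): flatten_keys(v)
            for k, v in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        return [flatten_keys(item) for item in obj]
    return obj


# A dictionary with a tuple key, which JSON objects cannot represent directly.
original = {(1, 2): "point", "name": "origin"}

encoded = json.dumps(flatten_keys(original))
decoded = json.loads(encoded)

# The tuple key comes back as the string "(1, 2)", not as the tuple (1, 2).
```

After the round trip the key is a string, so a lookup with the original tuple no longer finds the value.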

That said, the latest svn head should at least be able
to process your objects without triggering a backtrace.
Let me know if it works out for you.
-David




> On Sun, Dec 14, 2008 at 12:27 AM, David Aguilar <dav...@gmail.com> wrote:
>> Sorry, I forgot to cc: you -- I'm not sure if you're subscribed to the list.
>> read on below.
>> enjoy,
>> -David
>> --
>> David
>>
>



--
David

Antonin Hildebrand

Dec 14, 2008, 8:27:51 AM
to David Aguilar, jsonp...@googlegroups.com
Hi David,

Thank you. I think feeding jsonpickle these complex structures is a
good test of jsonpickle's robustness. I don't need the output to be
unpicklable; I just need to present the data structure to the user at
some decent level of detail (for FirePython).

I gave r41 a try and got the next error:

/Users/woid/code/hed/herodes/firepython/jsonpickle/__init__.py in
_flatten_dict_obj(self=<firepython.jsonpickle.Pickler object at
0x7346ad0>, obj={<type 'datetime.datetime'>: 7, <class
'google.appengine.api.datastore_types.Category'>: 1, <class
'google.appengine.api.datastore_types.Link'>: 2, <class
'google.appengine.api.datastore_types.Email'>: 8, <class
'google.appengine.api.datastore_types.GeoPt'>: 9, <class
'google.appengine.api.datastore_types.IM'>: 10, <class
'google.appengine.api.datastore_types.PhoneNumber'>: 11, <class
'google.appengine.api.datastore_types.PostalAddress'>: 12, <class
'google.appengine.api.datastore_types.Rating'>: 13, <class
'google.appengine.api.datastore_types.Text'>: 15, ...}, data={u"<class
'google.appengine.api.datastore_types.Blob'>": 14, u"<class
'google.appengine.api.datastore_types.Category'>": 1, u"<class
'google.appengine.api.datastore_types.Email'>": 8, u"<class
'google.appengine.api.datastore_types.Link'>": 2, u"<class
'google.appengine.api.datastore_types.PhoneNumber'>": 11, u"<class
'google.appengine.api.datastore_types.PostalAddress'>": 12, u"<class
'google.appengine.api.datastore_types.Rating'>": 13, u"<class
'google.appengine.api.datastore_types.Text'>": 15})
611 continue
612 if type(k) not in types.StringTypes:
613 k = unicode(k)
614 self._namestack.append(k)
615 data[k] = self.flatten(v)
k = <class 'google.appengine.api.datastore_types.IM'>, builtin unicode
= <type 'unicode'>

<type 'exceptions.TypeError'>: unbound method __unicode__() must be
called with IM instance as first argument (got nothing instead)
args = ('unbound method __unicode__() must be called with IM
instance as first argument (got nothing instead)',)
message = 'unbound method __unicode__() must be called with IM
instance as first argument (got nothing instead)'

David Aguilar

Dec 14, 2008, 3:49:05 PM
to ant...@hildebrand.cz, jsonp...@googlegroups.com
On Sun, Dec 14, 2008 at 5:27 AM, Antonin Hildebrand
<antonin.h...@gmail.com> wrote:
> Hi David,
>
This is an easy one.
Instead of unicode(k) I should have used repr(k).
It's committed in svn and another couple of tests were added.
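
A small Python 3 sketch of the idea, not the actual jsonpickle code. Going by the traceback above, unicode(k) on Python 2 ended up calling the class's own __unicode__ with no instance. Python 3 has no unicode(), so the sketch imitates that failure by calling __str__ directly on the class; the IM class and key_to_text helper here are hypothetical:

```python
class IM(object):
    """Hypothetical stand-in for a class that defines its own text conversion."""

    def __init__(self, address):
        self.address = address

    def __str__(self):
        return self.address


def key_to_text(key):
    """Turn an arbitrary dictionary key into text; non-string keys go through repr()."""
    if isinstance(key, str):
        return key
    return repr(key)


# Calling the instance-level conversion on the class itself has no instance to bind to.
try:
    IM.__str__()
    unbound_failed = False
except TypeError:
    unbound_failed = True

# repr() works for the class object and for an instance alike.
class_key = key_to_text(IM)
instance_key = key_to_text(IM("user@example.com"))
```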
--
David

Antonin Hildebrand

Dec 14, 2008, 4:37:06 PM
to David Aguilar, jsonp...@googlegroups.com
Great! It finally works for me. Thanks, David.

Unfortunately, in my case of complex structures, pickling takes more
than 10 seconds and uses the whole CPU.

I'm attaching a patch that lets one limit the depth of traversal.
Depth==10 works well for me.

regards,
Antonin
max_depth_parameter.diff

David Aguilar

Dec 14, 2008, 6:26:45 PM
to ant...@hildebrand.cz, jsonp...@googlegroups.com
On Sun, Dec 14, 2008 at 1:37 PM, Antonin Hildebrand
<antonin.h...@gmail.com> wrote:
> Great! finally works for me. Thanks, David.
>
> Unfortunately in my case of complex structures pickling takes more
> than 10 seconds and whole CPU.
>
> I'm attaching patch, when one can limit depth of traversal. Depth==10
> works good for me.
>
> regards,
> Antonin

I slightly modified it so that max_depth=0 is handled.
Unlimited depth is specified by passing in either a negative number or None.

max_depth=0 does make sense logically so I made it consistent
even though max_depth=0 doesn't make much sense from a usage POV.
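
A minimal sketch of these max_depth semantics in Python 3. It is not the committed patch: the flatten function below is hypothetical, and replacing containers beyond the limit with their repr() is an assumption, since the thread does not say what the pickler emits at the cutoff:

```python
def flatten(obj, max_depth=None, _depth=0):
    """Recursively copy obj, replacing containers at or beyond max_depth with repr()."""
    # Plain values are never cut off; only containers count toward depth.
    if not isinstance(obj, (dict, list, tuple)):
        return obj
    # None or any negative number means unlimited depth.
    unlimited = max_depth is None or max_depth < 0
    if not unlimited and _depth >= max_depth:
        return repr(obj)
    if isinstance(obj, dict):
        return {str(k): flatten(v, max_depth, _depth + 1) for k, v in obj.items()}
    return [flatten(v, max_depth, _depth + 1) for v in obj]


nested = {"a": {"b": {"c": 1}}}

top_only = flatten(nested, max_depth=0)   # the whole structure collapses to one string
one_level = flatten(nested, max_depth=1)  # only the outermost dict is traversed
no_limit = flatten(nested, max_depth=-1)  # same as max_depth=None
```

With max_depth=0 nothing is traversed at all, which is consistent but rarely useful, matching the remark above.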

It's in the latest svn trunk. It seems like we are somewhat due for a
release now that the unicode fixes and the django fixes from
last week have had time to settle in.

Thanks for the patch,
--
David

John Paulett

Dec 14, 2008, 11:34:59 PM
to jsonp...@googlegroups.com, ant...@hildebrand.cz
Great work on this David and Antonin!

We have a ton of changes on the trunk that we should get out to the
public. Unless there are objections or we find a critical bug, let's
plan on a release later this week.
