How to Handle JSON with escaped Unicode characters using python json module?

Sravan via StackOverflow

Aug 14, 2012, 12:38:51 PM
to google-appengin...@googlegroups.com

EDIT: The error doesn't appear at the interactive prompt, only in the Google App Engine code execution environment. I have the following JSON:

>>>dat = r"""{"name":"Something", "data":"For youth \n\nBe a hero! Donate blood!\n\u091c\u092f \u0939\u093f\u0902\u0926! \u0935\u0928\u094d\u0926\u0947 \u092e\u093e\u0924\u0930\u092e\u094d"}"""

It contains Unicode-escaped characters. I want to parse it, so I did:

>>>import json
>>>jsDat = json.loads(dat)

Then the following works:

>>>name = jsDat.get('name')
>>>name = name.encode('ascii') # needed because the json module returns unicode strings
>>>print name
Something

But for the field that contains Unicode data, that is "data", an error is raised:

>>>data = jsDat.get('data')
UnicodeEncodeError: 'ascii' codec can't encode characters in position 366-367: ordinal not in range(128)

How should I parse the data?



Please DO NOT REPLY directly to this email but go to StackOverflow:
http://stackoverflow.com/questions/11956503/how-to-handle-json-with-escaped-unicode-characters-using-python-json-module

ernie via StackOverflow

Aug 14, 2012, 12:38:53 PM
to google-appengin...@googlegroups.com

You can't encode a unicode string to ASCII if it contains characters outside the ASCII character set. If you want to force the conversion, and lose data, you can do this:

data = jsDat.get('data')
data = data.encode('ascii', 'ignore')

See the documentation for str.encode for more details about the 'ignore' error handler.
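To make the trade-off concrete, here is a minimal sketch comparing a few error handlers. It uses a shortened version of the payload from the question (the shortening is my assumption, for brevity) and is written so it should run on either Python 2.7 or Python 3:

```python
import json

# Shortened stand-in for the question's payload (assumption, for brevity).
dat = r'{"name": "Something", "data": "Donate blood!\n\u091c\u092f \u0939\u093f\u0902\u0926!"}'
data = json.loads(dat)["data"]

# Strict ASCII encoding fails as soon as a non-ASCII character appears.
try:
    data.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

ignored = data.encode("ascii", "ignore")             # silently drops non-ASCII characters
replaced = data.encode("ascii", "replace")           # substitutes '?' for each one
escaped = data.encode("ascii", "backslashreplace")   # keeps them as \uXXXX escapes
```

Only 'backslashreplace' keeps enough information to recover the original text; 'ignore' and 'replace' both discard the Devanagari characters.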

As an aside, I'm not sure why you're trying to encode to ASCII - the JSON module seems to handle that raw string just fine?
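If the goal is just to get bytes out without losing anything, encoding to UTF-8 instead of ASCII is probably what you want. A small sketch, again with a shortened payload that I made up from the question's data:

```python
import json

# Shortened stand-in for the question's payload (assumption, for brevity).
dat = r'{"name": "Something", "data": "\u091c\u092f \u0939\u093f\u0902\u0926!"}'
js_dat = json.loads(dat)

# json.loads has already turned the \uXXXX escapes into real characters.
data = js_dat["data"]

# UTF-8 can represent every Unicode character, so this never loses data.
utf8_bytes = data.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")
```

Here the eight characters of text become 20 bytes, because each Devanagari character takes three bytes in UTF-8, and decoding gives back exactly the original string.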



http://stackoverflow.com/questions/11956503/how-to-handle-json-with-escaped-unicode-characters-using-python-json-module/11956688#11956688

Greg via StackOverflow

Aug 14, 2012, 4:03:58 PM
to google-appengin...@googlegroups.com

The error is coming from your 'print' line, and only because you're trying to print to a 'terminal' that doesn't understand the encoding. Doing anything else with the JSON object shouldn't produce errors.
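A rough way to see this without App Engine: the sketch below parses the JSON without trouble and only fails when the text is written to a stream whose encoding is ASCII. The in-memory stream and the write_text helper are hypothetical stand-ins for the 'terminal', not anything from the App Engine API:

```python
import io
import json

# Shortened stand-in for the question's payload (assumption, for brevity).
dat = r'{"data": "\u091c\u092f \u0939\u093f\u0902\u0926!"}'
data = json.loads(dat)["data"]  # parsing itself raises no error


def write_text(text, encoding):
    """Hypothetical helper: write text to an in-memory stream using the given encoding."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


# An ASCII-only output stream behaves like the terminal in the question.
try:
    write_text(data, "ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# The same text goes through a UTF-8 stream without complaint.
utf8_output = write_text(data, "utf-8")
```

So the data is fine after json.loads; what matters is the encoding used at the point of output.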



http://stackoverflow.com/questions/11956503/how-to-handle-json-with-escaped-unicode-characters-using-python-json-module/11959864#11959864