JSON serialization in RequestHandler.write

Russ Weeks

Jul 9, 2012, 7:31:02 PM
to python-...@googlegroups.com
Hi, folks,

I love the simplicity of passing a dict into RequestHandler.write and having JSON come out the other end.  But for me, sometimes the default JSON encoder is a little too strict.  When I serialize JSON using json.dumps from the standard library, I can specify a default encoder to, e.g., serialize a datetime or a Mongo ObjectId as a string.  This is useful to me because I don't need JSON to be a robust data transfer format; I just need to produce something that the browser can render.
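For example, outside of Tornado I can do something like this (just a sketch; lossy_default is an illustrative name, not an existing helper):

    import datetime
    import json

    def lossy_default(obj):
        # Called by json.dumps for anything it can't encode natively,
        # e.g. datetimes or Mongo ObjectIds; just fall back to a string.
        if isinstance(obj, datetime.datetime):
            return obj.isoformat()
        return str(obj)

    body = json.dumps({"created": datetime.datetime.utcnow()}, default=lossy_default)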

Can anybody suggest a minimally-hacky way for me to get this "lossy encoding" functionality into escape.json_encode?  I'd like to continue using RequestHandler.write, for consistency and so I don't have to remember to explicitly set the Content-Type.

Thanks,
-Russ

Jorge Puente Sarrín

Jul 9, 2012, 7:34:49 PM
to python-...@googlegroups.com
Hi Russ,

I assume you're using MongoDB; you can use json_util from PyMongo:

self.write(json.dumps(content, default=json_util.default))
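Spelled out a little more (a sketch only; the handler and document contents here are placeholders, and json_util lives in the bson package that ships with PyMongo):

    import datetime
    import json

    import tornado.web
    from bson import json_util
    from bson.objectid import ObjectId

    class ThingHandler(tornado.web.RequestHandler):
        def get(self):
            doc = {"_id": ObjectId(), "when": datetime.datetime.utcnow()}
            # json_util.default knows how to serialize ObjectId, datetime, etc.
            self.set_header("Content-Type", "application/json")
            self.write(json.dumps(doc, default=json_util.default))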

Regards.


Jorge Puente Sarrín

Jul 9, 2012, 7:35:54 PM
to python-...@googlegroups.com

Don't forget the header:


self.set_header("Content-Type", "application/json")




Russ Weeks

Jul 9, 2012, 8:01:30 PM
to python-...@googlegroups.com
Hi, Jorge,

Thanks, but that's the problem - I emit a JSON response from a bunch of different request handlers and I know I _will_ forget at some point.  But the implementation of RequestHandler.write seems to be pretty simple, so I guess my best bet is to redefine .write in my base RequestHandler subclass and customize the json encoding behaviour there (invoking json.dumps, as you say).  Probably beats monkey-patching escape.json_encode or whatever.

Thanks again,
-Russ

Jorge Puente Sarrín

Jul 10, 2012, 12:28:33 AM
to python-...@googlegroups.com
Ohh, sorry Russ,

You're right. I also write a subclass and invoke json.dumps.

Maybe a good idea would be to override the 'write' method, adding a content-type parameter with a default value of "html".
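Something like this, perhaps (just a sketch of the idea, untested; the default=str fallback is my own illustration):

    import json

    import tornado.web

    class BaseHandler(tornado.web.RequestHandler):
        def write(self, chunk, content_type="html"):
            if content_type == "json":
                # Encode with our own settings instead of escape.json_encode.
                chunk = json.dumps(chunk, default=str)
                self.set_header("Content-Type", "application/json; charset=UTF-8")
            super(BaseHandler, self).write(chunk)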

HENG

Jul 10, 2012, 12:54:50 AM
to python-...@googlegroups.com
If you read the source code of Tornado, you will find this:

    def write(self, chunk):
        """Writes the given chunk to the output buffer.

        To write the output to the network, use the flush() method below.

        If the given chunk is a dictionary, we write it as JSON and set
        the Content-Type of the response to be application/json.
        (if you want to send JSON as a different Content-Type, call
        set_header *after* calling write()).

        Note that lists are not converted to JSON because of a potential
        cross-site security vulnerability.  All JSON output should be
        wrapped in a dictionary.  More details at
        http://haacked.com/archive/2008/11/20/anatomy-of-a-subtle-json-vulnerability.aspx
        """
        if self._finished:
            raise RuntimeError("Cannot write() after finish().  May be caused "
                               "by using async operations without the "
                               "@asynchronous decorator.")
        if isinstance(chunk, dict):
            chunk = escape.json_encode(chunk)
            self.set_header("Content-Type", "application/json; charset=UTF-8")
        chunk = utf8(chunk)
        self._write_buffer.append(chunk)


If the chunk is a dict, Tornado will write it as JSON.

It is awesome, isn't it?
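So with a stock RequestHandler this is all you need (a minimal sketch; PingHandler is just an illustrative name):

    import tornado.web

    class PingHandler(tornado.web.RequestHandler):
        def get(self):
            # The dict is encoded with escape.json_encode and the Content-Type
            # header is set to application/json automatically.
            self.write({"status": "ok"})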



Lorenzo Bolla

Jul 10, 2012, 4:51:26 AM
to python-...@googlegroups.com
I had a similar problem recently: the standard "json" encoder is really slow and I wanted to use a faster one (ujson).

I thought it would be nice to add functionality to Tornado to "configure" which modules to use instead of the standard ones. Something similar to the "configure" method of AsyncHTTPClient (http://www.tornadoweb.org/documentation/httpclient.html?highlight=configure#tornado.httpclient.AsyncHTTPClient.configure), but at the "module" level.
It would work like this: tornado.configure('json', your_fancy_json_library).
If there is interest, I could submit a pull request for such a feature.

As a quick workaround, I added a "write_json" method to my base handler instead of overwriting the standard "write" method. "write_json" accepts a dictionary and calls "self.write(ujson.dumps(d))".
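Roughly this (a sketch):

    import ujson

    import tornado.web

    class BaseHandler(tornado.web.RequestHandler):
        def write_json(self, obj):
            # Bypass the stdlib json module that RequestHandler.write would use.
            self.set_header("Content-Type", "application/json; charset=UTF-8")
            self.write(ujson.dumps(obj))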

L.

Russ Weeks

Jul 10, 2012, 2:25:53 PM
to python-...@googlegroups.com
Hi, Ed,

I went with a mixin that overrides RequestHandler.write, re-implementing the json serialization stuff.  I'm not 100% happy with it, but it'll do for now.  I like Lorenzo's idea of defining a module configuration API... OTOH I recognize that what I'm trying to do - make JSON encoding less reliable - is not something that a good framework should encourage :)
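The mixin is roughly the following (a sketch; JSONWriteMixin and the default=str fallback are just illustrative):

    import json

    import tornado.web

    class JSONWriteMixin(object):
        """Overrides write() to use a lossier JSON encoding for dicts."""

        def write(self, chunk):
            if isinstance(chunk, dict):
                # Anything json can't handle natively just becomes a string.
                chunk = json.dumps(chunk, default=str)
                self.set_header("Content-Type", "application/json; charset=UTF-8")
            super(JSONWriteMixin, self).write(chunk)

    class BaseHandler(JSONWriteMixin, tornado.web.RequestHandler):
        pass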

-Russ

On Tue, Jul 10, 2012 at 6:40 AM, ESV <ed.vi...@gmail.com> wrote:
Russ,

I'd love to see what you end up with.  Unfortunately, monkey-patching tornado.escape._json_encode to be an extended json.JSONEncoder (to get, e.g., serialization of datetimes and MongoDB ObjectIDs) is exactly what I settled on in this situation.
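For reference, the extended encoder can be as small as this (a sketch; the ObjectId branch assumes PyMongo is installed):

    import datetime
    import json

    from bson.objectid import ObjectId

    class LossyJSONEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o, datetime.datetime):
                return o.isoformat()
            if isinstance(o, ObjectId):
                return str(o)
            return super(LossyJSONEncoder, self).default(o)

    # usage: json.dumps(doc, cls=LossyJSONEncoder)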

Cheers,
Ed


Lorenzo Bolla

Jul 11, 2012, 3:39:24 AM
to python-...@googlegroups.com
This is what I came up with:

Basically, the default functionality of Tornado is unchanged, but users can do this:

In [1]: import tornado.escape                                                      
                                                                                   
In [2]: tornado.escape.json_encode({'a': 1})                                       
Out[2]: '{"a": 1}'                                                                 
                                                                                   
In [3]: import ujson                                                               
                                                                                   
In [4]: tornado.escape.JSON.configure(ujson.loads, ujson.dumps)                    
                                                                                   
In [5]: tornado.escape.json_encode({'a': 1})                                       
Out[5]: '{"a":1}'                                                        
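The change behind this is roughly the following (a sketch of the idea only, not the actual patch):

    # sketch of the hook inside tornado/escape.py
    import json

    class JSON(object):
        """Holds the loads/dumps callables used by json_encode/json_decode."""
        loads = staticmethod(json.loads)
        dumps = staticmethod(json.dumps)

        @classmethod
        def configure(cls, loads, dumps):
            cls.loads = staticmethod(loads)
            cls.dumps = staticmethod(dumps)

    def json_encode(value):
        return JSON.dumps(value)

    def json_decode(value):
        return JSON.loads(value)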

If people like it, I can issue a pull request.

L.
On Tue, Jul 10, 2012 at 8:21 PM, Jorge Puente Sarrín <puente...@gmail.com> wrote:
Lorenzo,

How would you extend ujson to support a default parameter in the dumps function, and an object_hook in the loads function?
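One workaround, if all you need is a lossy default-style fallback (a sketch; it doesn't cover object_hook, and jsonable is just an illustrative helper):

    import datetime

    import ujson

    def jsonable(value, default=str):
        # Recursively replace values ujson can't encode (datetimes, ObjectIds, ...)
        # with default(value), mimicking json.dumps(..., default=...).
        if isinstance(value, dict):
            return dict((k, jsonable(v, default)) for k, v in value.items())
        if isinstance(value, (list, tuple)):
            return [jsonable(v, default) for v in value]
        if value is None or isinstance(value, (str, int, float, bool)):
            return value
        return default(value)

    body = ujson.dumps(jsonable({"when": datetime.datetime.utcnow()}))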



Ben Darnell

Jul 11, 2012, 5:48:36 AM
to python-...@googlegroups.com
Global configuration like this is generally a red flag - it works OK
if it's just swapping in a faster implementation of the same
interface, but if there are any semantic differences then it
introduces too many possibilities for subtle bugs. The main purpose
of tornado.escape.json_encode is to have a json function that's always
present (even on python 2.5) - if tornado started after python 2.6 the
function probably wouldn't exist, and we'd just import json directly
where needed. Applications with more sophisticated requirements from
their json library should just use whatever they want directly. That
just leaves the two places where Tornado references json_encode on
your behalf, which are easily remedied in your app's BaseHandler:

class BaseHandler(RequestHandler):
    def write(self, chunk=None):
        if isinstance(chunk, dict):
            self.set_header("Content-Type", "application/json; charset=UTF-8")
            chunk = myjson.encode(chunk)
        super(BaseHandler, self).write(chunk)

    def render(self, template_name, **kwargs):
        kwargs.setdefault('json_encode', myjson.encode)
        super(BaseHandler, self).render(template_name, **kwargs)

Also note that since the first message in this thread was about
encoding datetimes, customizing json encoding at the handler level
rather than globally allows you to encode datetimes in a locale-aware
way.

-Ben

Russ Weeks

Jul 11, 2012, 4:40:29 PM
to python-...@googlegroups.com
Hi, Ben,

I agree with everything you've said below, I just want to address this point:

Also note that since the first message in this thread was about
encoding datetimes, customizing json encoding at the handler level
rather than globally allows you to encode datetimes in a locale-aware
way.

I achieve this by passing in a partially-applied function as the 'default' handler for json.dumps, like so:

        if isinstance(chunk, dict):
            chunk = json.dumps(chunk, default=lambda x: lossy_encode(x, self.locale))
            self.set_header("Content-Type", "application/json; charset=UTF-8")

Where "lossy_encode" has logic like,
    if isinstance(o, datetime.datetime):
        return locale.format_date(o)
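The rest of lossy_encode is along the same lines; for example (just a sketch, and the ObjectId branch assumes PyMongo):

    import datetime

    from bson.objectid import ObjectId

    def lossy_encode(o, locale):
        if isinstance(o, datetime.datetime):
            return locale.format_date(o)
        if isinstance(o, ObjectId):
            return str(o)
        # Anything else json.dumps chokes on just becomes its string form.
        return str(o)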

-Russ