|json vs simplejson||Luke Plant||6/11/12 2:51 PM|
We've switched internally from json to simplejson. Our 1.5 release notes
You can safely change any use of django.utils.simplejson to json
I just found a very big difference between json and simplejson
i.e. simplejson returns bytestrings if the string is ASCII (it returns
unicode objects otherwise), while json returns unicode objects always.
This was a very unfortunate design decision on the part of simplejson
(json is definitely correct here), and it is a very big change in
semantics. It has already led to one very difficult-to-debug error for me.
So, this is a shout out to other people to watch out for this, and a
call for ideas on what we can do to mitigate the impact of this. It's
likely to crop up in all kinds of horrible places, deep in libraries
that you can't do much about. In my case I was loading config, including
passwords, from a config file in JSON, and the password was now
exploding inside smtplib because it was a unicode object.
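
One possible mitigation, sketched here only as an illustration, is to normalise string types once, right where the config is loaded, so nothing downstream depends on which type the JSON library happened to return. The helper name `encode_text` and the `smtp_password` key are hypothetical. The sketch assumes the consuming API wants byte strings, as Python 2's smtplib did in this case; on Python 3, json.loads always returns str and code would usually keep text as-is.

```python
import json

TEXT_TYPE = type(u"")  # unicode on Python 2, str on Python 3


def encode_text(value, encoding="utf-8"):
    """Return a copy of decoded JSON data with every text string encoded to bytes.

    Hypothetical helper: not part of json, simplejson, or Django.
    """
    if isinstance(value, TEXT_TYPE):
        return value.encode(encoding)
    if isinstance(value, dict):
        return dict(
            (encode_text(k, encoding), encode_text(v, encoding))
            for k, v in value.items()
        )
    if isinstance(value, list):
        return [encode_text(item, encoding) for item in value]
    # Numbers, booleans, None, and existing byte strings pass through unchanged.
    return value


config = encode_text(json.loads('{"smtp_password": "s3cret", "port": 25}'))
```

After this step every string in `config` has the same type, whichever library or code path decoded it.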
Variables won't, constants aren't.
Luke Plant || http://lukeplant.me.uk/
|Re: json vs simplejson||Alex Ogier||6/11/12 10:14 PM|
On Mon, Jun 11, 2012 at 5:51 PM, Luke Plant <L.Pla...@cantab.net> wrote:
This seemed strange to me because the standard library json shipping
with python 2.7.3 is in fact simplejson 2.0.9, so I did some digging.
It turns out that if the C extensions have been compiled and you pass
a str instance to loads(), then you get that behavior in both
versions. This isn't documented anywhere, but here's the offending
If the C extensions aren't enabled, or you pass a unicode string to
loads(), then you get the "proper" behavior as documented. I'm not
sure how you are triggering this optimized, iffy behavior in
django.utils.simplejson though, without also triggering it in the
standard library. Did you ever install simplejson with 'pip install
simplejson' such that Django picked it up? Can you try running 'from
django.utils import simplejson; print simplejson.__version__'?
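
A rough way to check which implementation is in use and how it behaves is sketched below. It assumes the module exposes `decoder.c_scanstring`, which is set to None when the C extension is missing (this is how the stdlib json module is laid out; treat it as an assumption for any other module you pass in). The function name `describe_json_module` is made up for this example.

```python
import json


def describe_json_module(module):
    """Summarise a json-like module: version, C speedups, and result type.

    Hypothetical diagnostic helper; relies on the module having a
    `decoder.c_scanstring` attribute that is None without the C extension.
    """
    decoder = getattr(module, "decoder", None)
    c_speedups = getattr(decoder, "c_scanstring", None) is not None
    return {
        "name": module.__name__,
        "version": getattr(module, "__version__", None),
        "c_speedups": c_speedups,
        # Type returned when decoding a plain ASCII string literal.
        "ascii_result_type": type(module.loads('"abc"')).__name__,
    }


report = describe_json_module(json)
print(report)
```

Running the same function on `simplejson` (if installed) and on `json` makes it easy to compare the two side by side on a given machine.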
|Re: json vs simplejson||Vinay Sajip||6/12/12 2:58 AM|
On Jun 11, 10:51 pm, Luke Plant <L.Plant...@cantab.net> wrote:
Do you mean the other way around?
Right. And on Python 3, the json module (correctly) doesn't accept
byte-strings at all.
This is one place where there are limitations in the 2.x stdlib -
other places include cStringIO and cookies. For example, if you pass a
Unicode object to a cStringIO.StringIO, it doesn't complain, but does
the wrong thing:
>>> from cStringIO import StringIO; StringIO(u'abc').getvalue()
Fun and games ...
I'm not sure there's any easy way out, other than comprehensive
|Re: json vs simplejson||Luke Plant||6/12/12 3:53 AM|
On 12/06/12 06:14, Alex Ogier wrote:
Thanks for digging into that.
(BTW, in reply to Vinay, yes I meant "from simplejson to json", not the
other way around).
I've found the same difference of behaviour on both a production machine
where I'm running my app (CentOS machine, using a virtualenv, Python
2.7.3), and locally on my dev machine which is currently running Debian,
using the Debian Python 2.7.2 packages.
In both cases, json is always returning unicode objects, which implies I
don't have the C extensions for the json module according to your
analysis. I don't know enough about how this is supposed to work to
It also implies I'm probably not the only one affected by this, if it's
happened on two quite different machines. Looking at this discussion:
it seems that lots of people don't have the C extension for json
(reporting json 10x slower than simplejson).
|Re: json vs simplejson||Luke Plant||6/12/12 4:19 AM|
On 12/06/12 10:58, Vinay Sajip wrote:
There is another issue I found.
Django's DateTimeAwareJSONEncoder now subclasses json.JSONEncoder
instead of simplejson.JSONEncoder. The two are not perfectly compatible.
simplejson.dumps() passes the keyword argument 'namedtuple_as_object' to
the JSON encoder class that you pass in, but json.JSONEncoder doesn't
accept that argument, resulting in a TypeError.
So any library that uses Django's JSONEncoder subclasses, but uses
simplejson.dumps() (either via 'import simplejson' or 'import
django.utils.simplejson') will break. I found this already with
I think we at least need a bigger section in the release notes about this.
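
The failure mode can be reproduced without simplejson or Django. The sketch below only imitates a caller that hands a simplejson-specific option to an encoder class built on json.JSONEncoder. `StdlibBasedEncoder` and `construct_like_newer_simplejson` are hypothetical names, not Django or simplejson APIs.

```python
import json


class StdlibBasedEncoder(json.JSONEncoder):
    """Hypothetical stand-in for an encoder that subclasses json.JSONEncoder."""

    def default(self, o):
        # Fall back to the string form for otherwise unserialisable objects.
        return str(o)


def construct_like_newer_simplejson(encoder_cls):
    """Imitate a caller that passes a simplejson-only option to the encoder."""
    try:
        encoder_cls(namedtuple_as_object=True)
    except TypeError as exc:
        return "TypeError: %s" % exc
    return "ok"


result = construct_like_newer_simplejson(StdlibBasedEncoder)
print(result)
```

The stdlib encoder rejects the unknown keyword in its constructor, so the error surfaces before any data is encoded.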
|Re: json vs simplejson||Alex Ogier||6/12/12 5:19 AM|
On Jun 12, 2012 6:54 AM, "Luke Plant" <L.Pla...@cantab.net> wrote:
I'm not sure why no one is getting speedups from simplejson, but I can
tell you that on python 2.6+ django.utils.simplejson.loads should be
an alias for json.loads:
>>> import json
>>> from django.utils import simplejson
>>> json.loads == simplejson.loads
|Re: json vs simplejson||Alex Ogier||6/12/12 5:28 AM|
On Tue, Jun 12, 2012 at 7:19 AM, Luke Plant <L.Pla...@cantab.net> wrote:
Wait, 'import simplejson' works? Then that explains your problems. You
are using a library you installed yourself that has C extensions,
instead of the system json. If you switch to a system without
simplejson installed, then you should see the "proper" behavior from
django.utils.simplejson.loads(). If your program depends on some
optimized behavior of the C parser such as returning str instances
when it finds ASCII, it is bugged already on systems without
simplejson. If Django depends on optimized behavior, then it is a bug,
and a ticket should be filed.
|Re: json vs simplejson||Luke Plant||6/12/12 5:49 AM|
On 12/06/12 13:28, Alex Ogier wrote:
I agree my existing program had a bug. I had simplejson installed
because a dependency pulled it in (which means it can be difficult to
get rid of).
The thing I was flagging up was that the release notes say "You can
safely change any use of django.utils.simplejson to json." I'm just
saying the two differences I've found probably warrant at least some
The second issue is difficult to argue as a bug in my program or
dependencies. Django has moved from providing a JSONEncoder object
that supported a certain keyword argument to one that doesn't. We could
'fix' it to some extent:
    def __init__(self, *args, **kwargs):
        super(DjangoJSONEncoder, self).__init__(*args, **kwargs)
But like that, it would create more problems if the json module ever
gained that keyword argument in the future.
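
For illustration, a self-contained version of that kind of workaround might look like the sketch below. It assumes the intent is to discard the simplejson-only keyword before calling the stdlib constructor. `TolerantJSONEncoder` is a hypothetical name, not Django's actual class.

```python
import json


class TolerantJSONEncoder(json.JSONEncoder):
    """Hypothetical encoder that ignores one simplejson-only option."""

    def __init__(self, *args, **kwargs):
        # Assumption: silently dropping the option is acceptable here.
        # If json.JSONEncoder ever gains this keyword, the pop would hide it.
        kwargs.pop("namedtuple_as_object", None)
        super(TolerantJSONEncoder, self).__init__(*args, **kwargs)


# json.dumps forwards unknown keyword arguments to the encoder class.
encoded = json.dumps({"a": 1}, cls=TolerantJSONEncoder, namedtuple_as_object=True)
print(encoded)
```

The trade-off is the one noted above: the option is swallowed without warning, so callers cannot tell whether it was honoured.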
|Re: json vs simplejson||Alex Ogier||6/12/12 6:14 AM|
On Tue, Jun 12, 2012 at 8:49 AM, Luke Plant <L.Pla...@cantab.net> wrote:
Like loads(), json.JSONEncoder is just an alias for
simplejson.JSONEncoder, and we need to support versions of simplejson
down to 1.9 which is what python 2.6 ships with. This
'namedtuple_as_object' thing seems to only appear as of simplejson
2.2, which means that depending on it is a bug that appears on any
system without a recent version of simplejson (for example, the
version that was bundled with Django doesn't support it). Depending on
this kwarg is a bug in Django, and should be fixed.
It's clear that people have begun to depend on the quirky ways in
which simplejson diverged from its earlier codebase. I found the place
where that unicode "proper behavior" was fixed, so apparently in
Python's stdlib they undid the C optimizations at some point. So I was
incorrect earlier, and the C speedups work "properly" with Python
Basically, anyone who depended on features of simplejson added after
1.9, or its wonky optimizations, already had arguably broken code in
that it only worked when simplejson is installed. I'm torn as to
whether we should add a note about these subtle problems when
switching to json, recommend that people switch to simplejson instead,
or undeprecate django.utils.simplejson as a necessary wart (we can
still stop vendoring simplejson though).
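
If projects do end up choosing their own implementation, one common pattern is an explicit import fallback, sketched below. The alias `json_impl` is a name chosen for this example. Code written this way should stick to features both modules share, since simplejson may or may not be installed.

```python
try:
    # Prefer the third-party package when it is installed.
    import simplejson as json_impl
except ImportError:
    # Otherwise fall back to the standard library module.
    import json as json_impl

# Only use the API common to both: dumps() and loads() with default options.
round_tripped = json_impl.loads(json_impl.dumps({"a": [1, 2]}))
print(json_impl.__name__, round_tripped)
```

This keeps the choice of library in one place, but it does not remove the behavioural differences discussed in this thread.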