Faster JSON library

Lars Nilsson

Oct 6, 2011, 2:03:43 PM
to clo...@googlegroups.com
The clojure.contrib.base64 discussion has inspired me (sorry!) to
write this: I would very much like to see a faster JSON parser in
contrib. clj-json can beat clojure.data.json by a factor of up to 140
when reading/parsing and about 5 when creating a JSON string.

clojure.data.json reading:

(dotimes [_ 5] (time (read-json (slurp "foo.json"))))
"Elapsed time: 105137.039484 msecs"
"Elapsed time: 109517.590644 msecs"
"Elapsed time: 114940.018075 msecs"
"Elapsed time: 107612.194846 msecs"
"Elapsed time: 104434.230607 msecs"
nil

clj-json reading:

(dotimes [_ 5] (time (parse-string (slurp "foo.json") true)))
"Elapsed time: 851.541746 msecs"
"Elapsed time: 716.894466 msecs"
"Elapsed time: 713.257132 msecs"
"Elapsed time: 710.379671 msecs"
"Elapsed time: 709.358592 msecs"
nil

clojure.data.json create string:

(def foo (read-json (slurp "foo.json")))
(dotimes [_ 5] (time (json-str foo)))
"Elapsed time: 1546.511918 msecs"
"Elapsed time: 1533.056017 msecs"
"Elapsed time: 1534.136322 msecs"
"Elapsed time: 1537.893503 msecs"
"Elapsed time: 1555.343765 msecs"
nil

clj-json create string:

(def foo (parse-string (slurp "foo.json")))
(dotimes [_ 5] (time (generate-string foo)))
"Elapsed time: 375.415311 msecs"
"Elapsed time: 298.440444 msecs"
"Elapsed time: 272.829368 msecs"
"Elapsed time: 271.800466 msecs"
"Elapsed time: 273.67808 msecs"
nil

The JSON file is about 217KB, with vectors containing a couple of
thousand JSON objects whose nested vectors and objects go 2 to 6
levels deep.
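
The timings above include the slurp call and JIT warm-up in every measured run. A minimal sketch of a harness that keeps file I/O out of the timed region and runs a few untimed warm-up iterations first; the name `bench` and its parameters are hypothetical, not part of either library:

```clojure
(defn bench
  "Calls (f input) `warmup` times untimed, then `runs` times timed.
  Returns a vector of elapsed times in milliseconds.
  Hypothetical helper for illustration only."
  [f input warmup runs]
  ;; Untimed iterations give the JIT a chance to compile the hot paths.
  (dotimes [_ warmup] (f input))
  (vec
   (repeatedly runs
               (fn []
                 (let [start (System/nanoTime)]
                   (f input)
                   (/ (- (System/nanoTime) start) 1e6))))))

;; Intended use, assuming the libraries from the post are on the classpath:
;; (def text (slurp "foo.json"))            ; read the file once, untimed
;; (bench read-json text 2 5)               ; clojure.data.json
;; (bench #(parse-string % true) text 2 5)  ; clj-json
```

This does not change the overall picture reported above; it only separates parsing cost from file reading.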

Granted, clj-json uses a (presumably heavily optimized) Java library
as the workhorse, while clojure.data.json is pure Clojure. However, I
feel the speed penalty is too big a price to pay in this case. Now,
I can use clj-json for my own parsing needs. However, something like
clutch (couchdb library) that uses c.d.json behind the scenes may be
paying a price in performance that I cannot easily overcome without
hacking around inside it in order to swap JSON implementation, rather
than tweaking my own code (although, in this case it may be limited to
just the JSON string creation).

Perhaps there are benefits (of which I'm not aware) to c.d.json that
are not available in clj-json, but I'd be hard-pressed to come up with
a scenario where I wouldn't pick the significant speed boost of
clj-json.

Lars Nilsson

Tal Liron

Oct 6, 2011, 8:07:53 PM
to clo...@googlegroups.com
An excellent JVM library to use as base is Jackson:

http://jackson.codehaus.org/

It would be wonderful to see a Clojure-friendly version of it: having it create Clojure-specific structures from JSON, and also recognizing Clojure deftypes for serialization. The streaming API is friendly enough that I can see it being relatively easy.

David Nolen

Oct 6, 2011, 8:16:11 PM
to clo...@googlegroups.com
clj-json uses Jackson and so does https://github.com/dakrone/cheshire


--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Phil Hagelberg

Oct 6, 2011, 8:16:29 PM
to clo...@googlegroups.com
On Thu, Oct 6, 2011 at 5:07 PM, Tal Liron <tal....@gmail.com> wrote:
> An excellent JVM library to use as base is Jackson:
>
> http://jackson.codehaus.org/
>
> It would be wonderful to see a Clojure-friendly version of it

Both clj-json and cheshire (https://github.com/dakrone/cheshire) are
actually already based on Jackson.

> I can use clj-json for my own parsing needs. However, something like
> clutch (couchdb library) that uses c.d.json behind the scenes may be
> paying a price in performance

Seems a bit silly if someone's choosing a slower implementation
just because it's in contrib. I suspect the author of Cheshire may
have plenty of good reasons for not wanting to put his library in
contrib, including the fact that its deps might not line up with
contrib policy, wanting to accept patches without making contributors
mail in paperwork, and wanting to use a better bug tracker than Jira.

-Phil

Tal Liron

Oct 6, 2011, 8:20:56 PM
to clo...@googlegroups.com
Cheshire looks great, thanks for the tip!

I wonder, then, what's the OP's problem? I think it's good to have a lightweight, 100% Clojure version of JSON in contrib. A lighter weight is often a higher priority than performance. I think both approaches have their place.

In the Java world, too, there's the option of using the slower, simpler reference implementation of JSON.

Lars Nilsson

Oct 6, 2011, 8:41:32 PM
to clo...@googlegroups.com
As I mentioned in my previous email, my problem isn't really picking a
JSON implementation for my own needs, but rather when I use a library
that uses a slower implementation. If I wanted to use clutch for
couchdb access and didn't pay too much attention as to what leiningen
pulls in, I wouldn't know from looking at its API that c.d.json is
used, only if I checked what was stuffed in lib, or started looking at
its source. I could go ahead and use clj-json all day long within my
own code, but whenever I call a clutch function that involves reading
or writing JSON I would not have any say in the matter (unless I dig
into its code and make changes, rather than work on my own..) I cloned
clutch in github with the intent of playing around with swapping in
clj-json to see what difference it makes in practice, but it's not
what I'd like to work on at the moment, ideally.

Lars Nilsson

Dave Sann

Oct 6, 2011, 9:30:16 PM
to clo...@googlegroups.com
In my opinion, the situation is not clear cut:
  I might want a slower but more portable library if porting clutch to clojurescript. 
  (I read that someone has this working...)
  I might just want a lib that works if moving to .net in the short term but optimise with a faster library later. 
  Or, I might want a fast JVM specific library

JSON parsing and writing has a relatively simple API/interface, so different implementations of the "same API" are not unexpected.

So I have two thoughts:

1. Assuming a standard API, how can you practically choose between different implementations that trade off different characteristics depending on your need? For example: performance vs portability, or performance on certain problem types vs others.

2. How many libraries might have a standard API with different implementations? (Is it worth expending time to address this?)

In general, this is a potentially tricky question in respect of dependency management.

Cheers

Dave

Meikel Brandmeyer (kotarak)

Oct 7, 2011, 1:55:48 AM
to clo...@googlegroups.com
Hi,

slf4j comes to mind. Have a standard API which is provided by the different libraries. If you were targeting clojurescript you'd specify the portable library. For a server application running on the JVM you'd specify a fast Jackson-based implementation. This leaves the choice to the user of the library. What she specifies in her project dependencies is used.

Welcome to the paradox of choice.

Sincerely
Meikel
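
The slf4j-style arrangement described above could be sketched roughly as follows: library code calls a small facade, and the application decides which backend sits behind it. Everything here (`JsonBackend`, `ToyBackend`, `*backend*`, `to-json`) is a hypothetical name for illustration, not an existing API, and the toy encoder handles only a small subset of JSON (map keys are assumed to be strings or keywords).

```clojure
(require '[clojure.string :as str])

(defprotocol JsonBackend
  "Hypothetical facade that a library such as clutch could code against."
  (generate [backend data] "Encode data as a JSON string."))

(defn- toy-json
  "Toy pure-Clojure encoder covering nil, strings, keywords, maps,
  sequential collections, numbers and booleans. Not a complete JSON writer."
  [x]
  (cond
    (nil? x)        "null"
    (string? x)     (pr-str x)
    (keyword? x)    (pr-str (name x))
    (map? x)        (str "{"
                         (str/join "," (for [[k v] x]
                                         (str (toy-json k) ":" (toy-json v))))
                         "}")
    (sequential? x) (str "[" (str/join "," (map toy-json x)) "]")
    :else           (str x)))

(defrecord ToyBackend []
  JsonBackend
  (generate [_ data] (toy-json data)))

;; The portable backend is the default; an application on the JVM could
;; rebind this to a backend that wraps a Jackson-based library.
(def ^:dynamic *backend* (->ToyBackend))

(defn to-json
  "What library code would call, without knowing the implementation."
  [data]
  (generate *backend* data))

(to-json {:a [1 2 nil]}) ; => "{\"a\":[1,2,null]}"
```

A dynamic var is only one way to make the choice; selecting the backend from the project's declared dependencies, as slf4j does, would need a discovery step that this sketch leaves out.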

Chas Emerick

Oct 7, 2011, 3:43:35 AM
to clo...@googlegroups.com
Clutch was mentioned a couple of times, so I figured I'd chime in. :-)

As for why clutch uses c.c.json — I don't think there's any particular reason.  Tunde chose it before I got involved, but I'm sure I probably would have done the same thing, mostly because JSON en/decoding speed isn't top-most on the performance priority list when you're IO-bound.  Also, all things being equal, it's reasonable to use the 'blessed' library (an interesting topic/concept in and of itself).  In any case, we'll probably be looking around at other options as we eliminate our last usages of classic contrib. (FYI, Clutch is 1.3-compatible even though it uses classic contrib.)

Dave: I'm not sure if you were referring to me, but I have been working on supporting ClojureScript views in Clutch (which are working nicely now; blog post coming shortly).  *Porting* Clutch to ClojureScript doesn't make a lot of sense to me, but that could easily just be me. (It's mostly interop, so it's not really a porting task, and I'm not too keen on clients touching my CouchDB instances directly in any case.)

Cheers,

- Chas

Dave Sann

Oct 7, 2011, 3:55:43 AM
to clo...@googlegroups.com
There was no particular reason to mention clutch. It was just the example that seemed to be in the discussion. 
Dave

Chas Emerick

Oct 7, 2011, 4:15:15 AM
to clo...@googlegroups.com
Sure, I wasn't attempting to be defensive or whatever. Just thought the perspective might be worthwhile.

- Chas

kovas boguta

Oct 7, 2011, 1:58:03 PM
to clo...@googlegroups.com
My 2 cents:

1. JSON transformation is of fundamental importance to many Clojure
applications.
2. Having the "standard" solution be blown away by a factor of 140x
for the sake of purity is not pragmatic.

If the user experience with contrib is to use it, realize it's not
ready for prime time, and then go rummaging around through github for
better solutions made by people who've previously realized the same
thing, that is a fail. And makes one less likely to look to contrib
for default solutions to common problems.

Lars Nilsson

Oct 7, 2011, 2:10:57 PM
to clo...@googlegroups.com
Trying to be a little bit constructive here, in case I come across as
complaining, I took the source for c.d.json and put it into a
leiningen project, enabled warn on reflection, and found that several
cases of (... (let [c (char i)] ... (= c \x) ...) result in Clojure
deciding it needs to perform reflection in order to call equals in the
comparison with a fixed character. I'm not really sure what the proper
solution for this is, but I changed the "let" to (let [c
(Character/valueOf (char i))] ...) and the time for my 217KB JSON file
dropped from 107 seconds to 2 seconds, or only a little more than
twice as slow as clj-json (which clocked in a little under one second
for my file).

Lars Nilsson
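
For readers who want to see the shape of the change, here is a minimal sketch of the two forms described in the post above. The function names are hypothetical and are not taken from the c.d.json source; the reflection warning itself is as reported in this thread for Clojure 1.3, and newer Clojure versions may compile the first form without it.

```clojure
;; Ask the compiler to report calls it cannot resolve statically.
(set! *warn-on-reflection* true)

(defn quote-code-unboxed?
  "Form described in the post: c is a primitive char, and comparing it
  with a character literal reportedly triggered reflection on 1.3."
  [i]
  (let [c (char i)]
    (= c \")))

(defn quote-code-boxed?
  "Workaround from the post: box the char first, so the comparison is
  between two Character objects."
  [i]
  (let [c (Character/valueOf (char i))]
    (= c \")))

;; Both forms agree on the result; only the compiled call differs.
(quote-code-unboxed? 34) ; => true
(quote-code-boxed? 34)   ; => true
(quote-code-boxed? 97)   ; => false
```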

Stuart Halloway

Oct 7, 2011, 2:48:10 PM
to clo...@googlegroups.com
Thanks. I am going to take a look at this now.

Stu

Stuart Halloway
Clojure/core
http://clojure.com

Sean Corfield

Oct 7, 2011, 3:04:00 PM
to clo...@googlegroups.com
That would be http://dev.clojure.org/jira/browse/DJSON-1 which I
opened at the end of July...

Tal Liron

Oct 7, 2011, 3:08:27 PM
to clo...@googlegroups.com

As long as we're fixing c.d.json... it would be nice to add support for encoding sequences and maps.

(I know, I should open a bug....)

Sean Corfield

Oct 7, 2011, 3:12:56 PM
to clo...@googlegroups.com
On Fri, Oct 7, 2011 at 12:08 PM, Tal Liron <tal....@gmail.com> wrote:
> (I know, I should open a bug....)

http://dev.clojure.org/jira/browse/DJSON :)
--
Sean A Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/
World Singles, LLC. -- http://worldsingles.com/
Railo Technologies, Inc. -- http://www.getrailo.com/

"Perfection is the enemy of the good."
-- Gustave Flaubert, French realist novelist (1821-1880)

Lars Nilsson

Oct 7, 2011, 3:32:48 PM
to clo...@googlegroups.com
I get the following, trying to follow that link.

Login Required
You are not logged in.
You cannot view this URL as a guest. You must log in or sign up for an account.
If you think this message is wrong, please consult your administrators
about getting the necessary permissions.

Lars Nilsson

Stuart Halloway

Oct 7, 2011, 4:20:24 PM
to clo...@googlegroups.com
> Trying to be a little bit constructive here, in case I come across as
> complaining, I took the source for c.d.json and put it into a
> leiningen project, enabled warn on reflection, and found that several
> cases of (... (let [c (char i)] ... (= c \x) ...) result in Clojure
> deciding it needs to perform reflection in order to call equals in the
> comparison with a fixed character. I'm not really sure what the proper
> solution for this is, but I changed the "let" to (let [c
> (Character/valueOf (char i))] ...) and the time for my 217KB JSON file
> dropped from 107 seconds to 2 seconds, or only a little more than
> twice as slow as clj-json (which clocked in a little under one second
> for my file).
>
> Lars Nilsson

This reflection warning can be fixed with an enhancement on the
Clojure side, which I have just pushed to master [1].

I would like to create 1.4 alpha 1 with the code changes that have
gone in today. It would be super-great if anybody has time to build
your own project against master and let us know if you see any issues.

Thanks,
Stu

[1] https://github.com/clojure/clojure/commit/405d24dd49d649c01b7881f1394fc90924c54ef0

Kevin Downey

Oct 7, 2011, 5:36:24 PM
to clo...@googlegroups.com
seems like that could be added to Intrinsics.java

--
And what is good, Phaedrus,
And what is not good—
Need we ask anyone to tell us these things?

Lars Nilsson

Oct 10, 2011, 11:19:18 AM
to clo...@googlegroups.com

Cloning and building 1.4.0-master-SNAPSHOT resulted in much better
performance in loading data into couchdb. When using c.d.json it takes
about 98 seconds for loading and storing 100 JSON documents (ranging
in size from some tens of KB, to 5MB), while clj-json takes about 86
seconds.

Comparing just the load time c.d.json takes 27 seconds and clj-json
takes 12 seconds. Overall, the 100 JSON files are about 114MB. So to
keep things in perspective, a single 217KB file took over 100 seconds
before this change, now 114MB takes 27 seconds.

Lars Nilsson
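
As a rough reading of the figures in the post above, and assuming "load time" refers to the JSON parsing portion of the run, the arithmetic works out as below. The helper names are made up for this note.

```clojure
(defn mb-per-sec
  "Throughput in MB/s for `mb` megabytes handled in `secs` seconds."
  [mb secs]
  (/ mb (double secs)))

(defn non-parse-secs
  "Seconds of the total run not accounted for by parsing."
  [total-secs parse-secs]
  (- total-secs parse-secs))

(mb-per-sec 114 27)    ; c.d.json: roughly 4.2 MB/s
(mb-per-sec 114 12)    ; clj-json: 9.5 MB/s
(non-parse-secs 98 27) ; => 71 (c.d.json run)
(non-parse-secs 86 12) ; => 74 (clj-json run)
```

On these numbers, most of each run is spent outside JSON parsing, which fits the earlier remark in the thread that this kind of workload is largely IO-bound.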

Lars Nilsson

Oct 10, 2011, 11:36:23 AM
to clo...@googlegroups.com

Ugh, I need to keep things straight. It was obviously some iterations
earlier for the single file, compared with once for each file
afterward. Still, the difference is huge.

Lars Nilsson

Stuart Sierra

Oct 10, 2011, 12:58:49 PM
to clo...@googlegroups.com
Patch welcome... ;)
-S

Stuart Sierra

Oct 10, 2011, 1:00:39 PM
to clo...@googlegroups.com
I think I got the permissions fixed...
-S

Sean Corfield

Oct 11, 2011, 11:47:56 AM
to clo...@googlegroups.com
Nice! This will let me remove some workarounds for reflection warnings
I was getting and probably give me better performance too :)

I'll try to run tests against 1.4.0-master-SNAPSHOT today (that'll
have this change, right?).

Sean

Sean Corfield

Oct 11, 2011, 2:17:54 PM
to clo...@googlegroups.com
On Tue, Oct 11, 2011 at 8:47 AM, Sean Corfield <seanco...@gmail.com> wrote:
> I'll try to run tests against 1.4.0-master-SNAPSHOT today (that'll
> have this change, right?).

I get an NPE from the 1.4 compiler on Congomongo. Details reported on
clojure-dev.
