[ANN] megajson: High Performance JSON Parser Generation


Ben Johnson

unread,
Nov 8, 2013, 7:27:30 PM11/8/13
to golan...@googlegroups.com
Hi everyone-

I've hit performance bottlenecks with JSON encoding on two projects (sky & goraft) so I wrote an alternative called megajson. I noticed that a lot of CPU time went to the "reflect" package to determine the appropriate types to serialize so megajson takes the approach of generating parsers using the "go/ast" package at compile time.

It's still in an alpha stage but I thought I'd post it to get some feedback from people on my approach.

https://github.com/benbjohnson/megajson

Using the test data in encoding/json, megajson encodes and decodes about 2x faster. Obviously, YMMV.

Special thanks to Shopify for sponsoring the work on it.


Ben Johnson


Robert Melton

unread,
Nov 8, 2013, 7:33:49 PM11/8/13
to Ben Johnson, golang-nuts
Ben--

Thanks!

On Fri, Nov 8, 2013 at 7:27 PM, Ben Johnson <b...@skylandlabs.com> wrote:
> It's still in an alpha stage but I thought I'd post it to get some feedback
> from people around my approach.
>
> https://github.com/benbjohnson/megajson
>
> Using the test data in encoding/json, megajson encodes and decodes about 2x
> faster. Obviously, YMMV.

This is very timely; we were just considering starting work on a very
similar package. I look forward to taking it for a spin.

--
Robert Melton

Ben Johnson

unread,
Nov 8, 2013, 7:50:25 PM11/8/13
to Robert Melton, golang-nuts
Robert-

Let me know how it works for you. Test coverage is about 80% but I'm going to bump that up soon. It's only about 1200 LOC so it shouldn't be too hard to grok. Let me know if you have any questions.

Bug reports and pull requests are always welcome. :)


Ben

Dave Cheney

unread,
Nov 8, 2013, 8:11:20 PM11/8/13
to Ben Johnson, Robert Melton, golang-nuts
Hi Ben,

With your permission I'd like to add megajson to my autobench harness [1]

Here is some raw data comparing megajson running on 1.1.2 vs 1.2rc3

#megajson
benchmark old ns/op new ns/op delta
BenchmarkCodeEncoder 20810857 15047966 -27.69%
BenchmarkCodeDecoder 75351857 54290820 -27.95%

benchmark old MB/s new MB/s speedup
BenchmarkCodeEncoder 93.24 128.95 1.38x
BenchmarkCodeDecoder 25.75 35.74 1.39x

[1] https://github.com/davecheney/autobench/tree/megajson
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Kevin Gillette

unread,
Nov 9, 2013, 2:45:40 AM11/9/13
to golan...@googlegroups.com
Some notes from your README:

Performance - The reflection library is slow and cannot be optimized by the compiler at compile time.

That's not quite true: a sufficiently advanced reflect-aware Go compiler could optimize a lot about the expression `reflect.TypeOf(int(0)).String()`, for example. Further, while the types being reflected upon are often not known at compile time, the operations on them are known, and could be optimized at compile time.

Supported Types

What about `type Int int`? It's unclear if that would be supported, and that kind of thing is done frequently enough to be a concern.
  • Pointers to structs which have been megajsonified.
  • Arrays of pointers to structs which have been megajsonified.
I presume that where you say arrays you instead (or also) mean slices. Why only pointers to structs? An application that uses a large slice of struct values will easily lose any performance gain megajson offers by needing to construct a corresponding slice of pointers to the original struct elements.

I think a critical improvement would be encoding/json compatibility. For example, instead of (or in addition to) providing a NewMyStructEncoder(writer).Encode(val) for each type, why not generate MarshalJSON/UnmarshalJSON methods on MyStruct? Programmers already using encoding/json could then run the megajson code generator and, without any further code changes of their own, get a transparent performance boost. It would also mean the application programmer wouldn't need to jump through any hoops to embed or contain a MyStruct within some other type.

Eventually, you should strive for full encoding/json -> megajson compatibility of supported types, so that anything encoding/json can handle, megajson can handle too; otherwise the gap in supported types will make megajson less appealing in the long term.

Ben Johnson

unread,
Nov 11, 2013, 9:43:18 AM11/11/13
to Dave Cheney, Robert Melton, golang-nuts
hey Dave-

That'd be great if you could add it to autobench. It'll be interesting to see how the generated-code approach and encoding/json's reflection-based approach improve over time.

Thanks!


Ben

Dave Cheney

unread,
Nov 11, 2013, 9:53:29 AM11/11/13
to Ben Johnson, Robert Melton, golang-nuts
Done. If you add an encoding/json benchmark with the same input, autobench will pick it up automatically.


Alberto García Hierro

unread,
Nov 11, 2013, 9:05:29 PM11/11/13
to golan...@googlegroups.com
If you really want high performance, I'd suggest avoiding interfaces and, in general, function calls like the plague, since they are quite expensive in Go (compared to C). We've implemented basically the same thing for our internal web framework (to be released some day) and we're almost 4x faster than encoding/json without much optimization effort. I'm sure we could make it even faster.

// This is a small benchmark comparing the performance of these JSON encoding
// methods. JSONDirect uses WriteJSON(), JSONSerialize uses
// gnd.la/mux/serialize (which adds some overhead because it also sets the
// Content-Length and Content-Encoding headers and thus must encode into an
// intermediate buffer first), while JSON uses json.Marshal(). All three
// benchmarks write the result to ioutil.Discard.
//
// BenchmarkJSONDirect 1000000 1248 ns/op 117.73 MB/s 
// BenchmarkJSONSerialize 1000000 1587 ns/op 92.62 MB/s
// BenchmarkJSON 500000 4583 ns/op 32.07 MB/s

b...@skylandlabs.com

unread,
Nov 12, 2013, 9:05:53 AM11/12/13
to golan...@googlegroups.com
Kevin-

Thanks for the awesome feedback. A couple responses to your notes:

1. I got the impression that reflection was slow because CPU profiles on a couple of projects have always shown significant time in the reflect package, with those calls stemming from the encoding/json package. That's actually why I decided to write megajson. Which version of Go added the reflection optimizations?

2. The limited supported types are mainly to get megajson out the door. I'm aiming for full compatibility with encoding/json and I don't think that's too far off but I was mainly trying to time box myself.

3. Yes, I meant slices of pointers. Old habits die hard. :)

4. I love the idea of MarshalJSON/UnmarshalJSON generation. Great idea. Thanks!


Ben

b...@skylandlabs.com

unread,
Nov 12, 2013, 9:09:07 AM11/12/13
to golan...@googlegroups.com
Alberto-

I'll try removing some of the interfaces and inlining more myself to see if I can squeeze out some additional performance. Thanks for the tip. I'd be interested to see the parser generator you guys wrote when it's released.


Ben

Kevin Gillette

unread,
Nov 12, 2013, 12:21:32 PM11/12/13
to golan...@googlegroups.com
On Tuesday, November 12, 2013 7:05:53 AM UTC-7, b...@skylandlabs.com wrote:
1. I got the impression that reflection was slow because CPU profiles on a couple of projects have always shown significant time in the reflect package, with those calls stemming from the encoding/json package. That's actually why I decided to write megajson. Which version of Go added the reflection optimizations?

I was specifically critiquing your use of the term "cannot" in: "cannot be optimized by the compiler at compile time." "isn't" would be more accurate.

j...@colchiscapital.com

unread,
Jan 22, 2015, 1:57:45 PM1/22/15
to golan...@googlegroups.com
Alberto, sorry to necro this thread, but: do you still plan to release this optimized JSON library?


Ugorji Nwoke

unread,
Jan 22, 2015, 2:15:32 PM1/22/15
to golan...@googlegroups.com, j...@colchiscapital.com
Just FYI:

The package github.com/ugorji/go/codec is an optimized JSON library that gives better performance than encoding/json without code generation, and blows the socks off encoding/json when code generation is employed. It is also not restricted in the struct field types it supports (all types are supported, whereas megajson only supports a subset).

See:
http://ugorji.net/blog/go-codec-json (benefits over encoding/json)
http://github.com/ugorji/go (source code)

Hope this helps.

Jon Cooper

unread,
Feb 13, 2015, 5:08:20 PM2/13/15
to Ugorji Nwoke, golan...@googlegroups.com
Thanks, Ugorji. This library is indeed awesome!