marshal/unmarshal of large JSON files


gmuth...@gmail.com

Nov 7, 2016, 1:26:49 PM
to golang-nuts
We are using Go for one of our projects, and our APIs use JSON as the payload format. We have a large JSON file (12MB) that needs to be unmarshalled. We just used the standard json encode/decode and it bloats up the memory. I ran pprof and realized it was due to reflect.unsafe_NewArray, which needs a lot of allocation. Is this the standard way to implement APIs with large JSON files in Go, or should I not use encode/decode for this purpose?
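For context, a minimal sketch of the pattern being described (the real code is in the attachment later in the thread; the Entry type, field names, and file name here are placeholders, not the actual schema):

```go
package main

import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"log"
)

// Entry is a hypothetical record type standing in for the real payload schema.
type Entry struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

func main() {
	// Buffer the entire ~12MB payload in memory ...
	data, err := ioutil.ReadFile("test.json")
	if err != nil {
		log.Fatal(err)
	}

	// ... then unmarshal it into one big slice in a single call. The backing
	// arrays allocated here are what show up under reflect.unsafe_NewArray
	// in the pprof output.
	var entries []Entry
	if err := json.Unmarshal(data, &entries); err != nil {
		log.Fatal(err)
	}
	fmt.Println("decoded", len(entries), "entries")
}
```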

Ugorji Nwoke

Nov 7, 2016, 4:02:14 PM
to golang-nuts, gmuth...@gmail.com
Show some code, and we may be able to advise you better.

Gokul Muthuswamy

Nov 7, 2016, 5:49:45 PM
to Ugorji Nwoke, golang-nuts
I have created a mock; the code is attached as jsoncodegolang.txt, along with a test file (test.json) to simulate our data entries. In the real-world scenario it's around a 12MB pull on every GET operation.


jsoncodegolang.txt
test.json

Ugorji Nwoke

Nov 7, 2016, 7:19:00 PM
to golang-nuts, ugo...@gmail.com, gmuth...@gmail.com
With golang's encoding/json package, the typical flow is:
1. You read the 12MB into memory.
2. encoding/json will first scan the full text to ensure it is well-formed JSON (before populating the structure).
3. At a later point, GC will reclaim that memory.
4. This means that if each GET request is doing this 12MB decode, you will start getting memory pressure.

I think this is more of a design problem. You can help with #1, #3 and #4 above: do not read the 12MB into memory up front, but instead open the file and use json.NewDecoder(f).Decode(...) so the decoder reads from it directly. However, to tackle #2, you may need to examine a different decoder. Try the package github.com/ugorji/go/codec ( https://godoc.org/github.com/ugorji/go/codec ).
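A rough sketch of that change, assuming the payload is a large JSON array of records (the Entry type, field names, and file name are placeholders): hand the open *os.File to json.NewDecoder, then walk the top-level array with Token()/More() so only one element is decoded at a time instead of building the whole 12MB slice.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
)

// Entry is a hypothetical record type standing in for the real payload schema.
type Entry struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

func main() {
	f, err := os.Open("test.json")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// The decoder reads from the file incrementally, so the full 12MB is
	// never buffered in one []byte up front.
	dec := json.NewDecoder(f)

	// Consume the opening '[' of the top-level array.
	if _, err := dec.Token(); err != nil {
		log.Fatal(err)
	}

	// Decode one element at a time; only a single Entry (plus the decoder's
	// internal buffer) is live per iteration, which greatly reduces the peak
	// heap and the amount of garbage the GC has to reclaim per request.
	count := 0
	for dec.More() {
		var e Entry
		if err := dec.Decode(&e); err != nil {
			log.Fatal(err)
		}
		count++
	}

	// Consume the closing ']'.
	if _, err := dec.Token(); err != nil {
		log.Fatal(err)
	}
	fmt.Println("decoded", count, "entries")
}
```

With this shape, each element is scanned and validated individually rather than the entire document in one pass; swapping encoding/json for a streaming-oriented decoder such as github.com/ugorji/go/codec is a further step if the per-value validation cost of the standard library still dominates.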