
Decoding large, nested json and inserting into a database


Cortez

May 17, 2013, 5:48:26 PM
Hi,

I am trying to figure out a way to parse nested JSON and insert various extracted elements into an SQLite database. The JSON strings are typically quite large, and rather than parse them entirely into memory first as Lisp objects (alists or CLOS objects), I would like to do this in a streaming fashion, as the JSON is being read.

I've looked at YASON, but it doesn't seem to have a streaming decoder. CL-JSON looks like the best candidate, but there don't seem to be any examples relating to what I would like to do. The JSON I am dealing with has the following general schema:

{
  "foo": [
    {
      "a": "lorem ipsum...",
      "b": { "aa": 6, "bb": 4 },
      "c": [1, 2, 3, 4]
    },
    // many more objects of this form...
  ],
  "bar": 7,
  "baz": 8
}

So I have objects which contain arrays of sub-objects, which may in turn contain further sub-objects or arrays. I looked at CL-JSON's json-bind, but it doesn't seem particularly suited to this task. Presumably I have to define my own custom decoder. It would be nice if anyone has any decent examples of how to implement such a decoder, or of any simpler way to achieve this task.

Many thanks,
Chris

Boris Smilga

May 18, 2013, 5:26:18 PM
I'm not in a position to give you advice on the database part, which
is very application-specific, but your JSON parsing can be done
quite straightforwardly with custom decoders in CL-JSON:

(in-package json)

;; Capture the standard decoder so it can be used for ordinary values.
(defvar *default-decoder*
  (current-decoder))

;; Called once per element of the big array; put the real processing here.
(defun dispose-of-big-data (obj)
  (format *trace-output* "Disposing of ~S~%" obj))

;; Decoder whose array handlers hand each element on instead of accumulating.
(defparameter *big-data-decoder*
  (custom-decoder
   :beginning-of-array (constantly nil)
   :array-member #'dispose-of-big-data
   :end-of-array (constantly '#:big-data-array)
   :internal-decoder *default-decoder*))

(defun decode-big-data-container (json)
  (let ((kh *object-key-handler*))
    (with-shadowed-custom-vars
      (set-custom-vars
       :object-key (lambda (key)
                     ;; Switch to *BIG-DATA-DECODER* only for the value of "foo".
                     (setf *internal-decoder*
                           (if (equal key "foo")
                               *big-data-decoder*
                               *default-decoder*))
                     (funcall kh key)))
      (decode-json-from-source json))))

(decode-big-data-container
 "{
    \"foo\": [
      {
        \"a\": \"lorem ipsum...\",
        \"b\": { \"aa\": 6, \"bb\": 4 },
        \"c\": [1, 2, 3, 4]
      },
      {
        \"a\": \"the quick brown fox...\",
        \"b\": { \"zz\": 66, \"yy\": 44 },
        \"c\": [1, 1, 2, 3, 5]
      }
    ],
    \"bar\": 7,
    \"baz\": 8
  }")

This evaluates to ((:FOO . #:BIG-DATA-ARRAY) (:BAR . 7) (:BAZ . 8))
and prints out the trace for the big data elements:

Disposing of ((:A . "lorem ipsum...") (:B (:AA . 6) (:BB . 4))
(:C 1 2 3 4))
Disposing of ((:A . "the quick brown fox...") (:B (:ZZ . 66) (:YY . 44))
(:C 1 1 2 3 5))

You can see that *BIG-DATA-DECODER* installs its own array handlers.
They replace the standard array handlers which accumulate the array
elements; so, *BIG-DATA-DECODER* doesn't do any accumulation.
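
As a rough sketch of what the array-member handler might do with each element before it reaches a database: the element arrives as an alist, so its fields can be pulled out with ASSOC. The names ELEMENT-ROW, STORE-ROW and STORE-ELEMENT below are hypothetical and not part of CL-JSON, and STORE-ROW only prints where a real handler would presumably run an INSERT. The row layout (A as text, B as an alist, C joined into a string) is an assumption for illustration.

```lisp
;; ELEMENT is one decoded member of the "foo" array, for example
;; ((:A . "lorem ipsum...") (:B (:AA . 6) (:BB . 4)) (:C 1 2 3 4)).

(defun element-row (element)
  "Flatten one decoded element into a list (A B-ALIST C-TEXT)."
  (flet ((field (key) (cdr (assoc key element))))
    (list (field :a)                              ; string under "a"
          (field :b)                              ; sub-object, still an alist
          (format nil "~{~A~^,~}" (field :c)))))  ; array joined with commas

(defun store-row (row)
  "Stand-in for a database insert (assumption): just print the row."
  (format *trace-output* "Would insert ~S~%" row))

(defun store-element (element)
  "Candidate :array-member handler: flatten the element, then store it."
  (store-row (element-row element)))
```

Under these assumptions, #'STORE-ELEMENT could be passed as the :array-member argument in place of #'DISPOSE-OF-BIG-DATA when the decoder is created, so that each element is stored as soon as it has been decoded.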

Hope this helps.

Yours,
– B. Sm.

Cortez

May 19, 2013, 1:19:10 AM
That's a very helpful reply, thank you Boris! Indeed I came to the conclusion that a CL-JSON custom decoder would be the right solution. Your example is close to what I had envisaged - thank you again.

Best,
Chris