Decoding JSON object sequences (MIME type application/json-seq)?


mathew murphy

May 11, 2016, 5:19:57 PM
to golang-nuts
I was surprised to discover that Go's json.Decoder won't decode JSON sequences, as per RFC 7464. Has anyone worked out a good way to handle them?

Nate Brennand

May 11, 2016, 5:24:52 PM
to mathew murphy, golang-nuts

Do the Token and More methods on json.Decoder not fit your use case?
There’s an example in the std lib docs: https://golang.org/pkg/encoding/json/#example_Decoder_Decode_stream.



mathew

May 11, 2016, 6:12:12 PM
to Nate Brennand, golang-nuts
No, the Token() method returns an error: "invalid character '\x1e' looking for beginning of value"
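
For anyone who wants to reproduce this, here is a minimal sketch. The helper name decodeFirst and the sample record are made up for illustration; the input is one RFC 7464 record (RS, JSON text, LF) fed to a plain json.Decoder, which should fail with an error along the lines of the one quoted above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// decodeFirst (hypothetical helper) tries to decode a single JSON value
// from s using an unmodified json.Decoder.
func decodeFirst(s string) (interface{}, error) {
	var v interface{}
	err := json.NewDecoder(strings.NewReader(s)).Decode(&v)
	return v, err
}

func main() {
	// One json-seq record: RS (0x1e), a JSON text, then LF.
	_, err := decodeFirst("\x1e{\"a\":1}\n")
	fmt.Println("with RS:", err)

	// The same JSON text without the leading RS decodes normally.
	v, err := decodeFirst("{\"a\":1}\n")
	fmt.Println("without RS:", v, err)
}
```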


mathew

Matt Harden

May 12, 2016, 2:14:04 PM
to mathew, Nate Brennand, golang-nuts
For this kind of use case, it would be very nice if the decoder made the following promise: If given a bufio.Reader in the constructor (with at least XX size buffer), when the decoder has just finished successfully decoding a top-level JSON value, it will leave the reader positioned just after the last character of that value. This should be doable using UnreadByte and/or UnreadRune and, if necessary, Peek. I don't think Peek will be necessary.

If the above promise were made, then the OP's request would be as simple as: create bufio.Reader and json.Decoder, loop until EOF (read RS character; unmarshal JSON value with decoder; read LF character).

If the Go standard library maintainers would find this acceptable, I can work on a pull request to do this.

mathew

May 12, 2016, 5:05:01 PM
to Matt Harden, Nate Brennand, golang-nuts
That sounds good to me. Then any arbitrary encapsulation can be dealt with. 

(JSON embedded in XML, for example. You might think nobody would do something that awful, but then again, vCal...)


mathew

C Banning

May 13, 2016, 6:37:44 AM
to golang-nuts
https://godoc.org/github.com/clbanning/mxj#NewMapJsonReader handles JSON decoding from io.Reader.  It's specifically for decoding to map[string]interface{}; so you'd have to hack it a bit if you're decoding to a struct.

Paul Hankin

May 14, 2016, 12:28:19 AM
to golang-nuts, me...@pobox.com, natebr...@gmail.com
On Friday, 13 May 2016 03:14:04 UTC+9, Matt Harden wrote:
> For this kind of use case, it would be very nice if the decoder made the following promise: If given a bufio.Reader in the constructor (with at least XX size buffer), when the decoder has just finished successfully decoding a top-level JSON value, it will leave the reader positioned just after the last character of that value. This should be doable using UnreadByte and/or UnreadRune and, if necessary, Peek. I don't think Peek will be necessary.
>
> If the above promise were made, then the OP's request would be as simple as: create bufio.Reader and json.Decoder, loop until EOF (read RS character; unmarshal JSON value with decoder; read LF character).

For this specific use case (assuming a streaming decoder is needed; otherwise you can just split the byte slice on the RS bytes), one can wrap the underlying reader in one that stops reading when it encounters an RS, with a method to step over the RS so that the next chunk in the text sequence can be decoded.

The calling code would look like:

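rsr := newRSReader(r)
for {
  err := rsr.ReadRS()
  if err == io.EOF {
    break
  }
  if err != nil {
    return err
  }
  // Parse all JSON from this chunk; rsr.Read() reports EOF at the next RS.
  dec := json.NewDecoder(rsr)
  for dec.More() {
    var v interface{}
    if err := dec.Decode(&v); err != nil {
      return err
    }
  }
}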

The idea being that ReadRS() reads the RS from the underlying reader r -- returning an error if there's no RS or if there's any other error. Then rsr.Read() acts like an io.Reader that reads up to the next RS and then reports EOF.

This avoids adding subtle requirements to the stdlib implementation, although it does add a performance cost.

-- 
Paul

Matt Harden

May 17, 2016, 3:21:57 PM
to Paul Hankin, golang-nuts, me...@pobox.com, natebr...@gmail.com
Paul, I started with that idea, but it's surprisingly difficult (at least for me) to write a Reader that stops at a certain delimiter. ReadSlice and friends on bufio.Reader don't seem to be designed for implementing Read methods. They all want to read until the delimiter is found, which means we might be reading farther into the stream than the JSON decoder really needs at that point. That means you have to hold on to those extra bytes in yet another buffer. It would be nice to have a DelimitedReader in io or ioutil, or maybe a ReadUpTo(delimiter byte, p []byte) (int, error) method on bufio.Reader.

I'm also not sure what json.Decoder will do once it encounters EOF, even if the EOF is beyond the current json value being decoded. If the decoder becomes unusable at that point, then one would have to create a new json.Decoder to process each record, which generates needless garbage.

Jakob Borg

May 18, 2016, 1:37:24 AM
to Matt Harden, golang-nuts
It seems to me, from just quickly glancing at the json-seq spec, that
you could simply replace the record separator bytes with newlines and
have our normal json.Decoder be happy. That replacement is trivial in
a wrapper io.Reader.

//jb

Matt Harden

May 19, 2016, 12:28:46 PM
to Jakob Borg, golang-nuts

The point of the separators is to enable resynchronization in cases where JSON records are truncated or corrupted. Just replacing the RS with whitespace would defeat that.
