Ignoring UTF-8 BOM when decoding JSON


Mark Richman

Sep 23, 2016, 7:37:56 AM
to golang-nuts
I have a JSON file which begins with the UTF-8 byte order mark (BOM) 0xEF 0xBB 0xBF.

This causes Decode() to fail with a SyntaxError:

SyntaxError invalid character 'ï' looking for beginning of value - Offset: 1


Is there any way to tell Decode() to ignore the BOM, or do I have to peek at the first 3 bytes and skip them somehow?


Thanks,

Mark



Jesse McNelis

Sep 23, 2016, 8:56:53 AM
to Mark Richman, golang-nuts
On Fri, Sep 23, 2016 at 9:37 PM, Mark Richman <markar...@gmail.com> wrote:
>
> Is there any way to tell Decode() to ignore the BOM, or do I have to peek at
> the first 3 bytes and skip them somehow?
>

What you need is an io.Reader that skips the BOM.
Luckily someone wrote a package for that.
https://github.com/spkg/bom

gary.wi...@victoriaplumb.com

Sep 23, 2016, 10:13:40 AM
to golang-nuts
This looks like a compounded error.

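First, JSON should never have a BOM encoded within it. Second, it seems the Go JSON decoder doesn't account for a BOM that has been mistakenly included. Both points are covered in the JSON RFC, which says implementations MUST NOT add a BOM and that parsers MAY ignore one rather than treating it as an error: https://tools.ietf.org/html/rfc7159#section-8.1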

Mark Richman

Sep 23, 2016, 10:21:13 AM
to golang-nuts, gary.wi...@victoriaplumb.com
This works great, thanks! https://github.com/spkg/bom 

Agreed, JSON should not have a BOM. However, there is still software out there, especially on Windows, that insists on writing out JSON with a BOM. So I have to account for it, standard or not.

gary.wi...@victoriaplumb.com

Sep 23, 2016, 10:36:14 AM
to golang-nuts, gary.wi...@victoriaplumb.com
Yeah, the JSON decoder should handle it. Maybe post a bug report? https://github.com/golang/go/issues

Ian Davis

Sep 23, 2016, 10:44:06 AM
to golan...@googlegroups.com

On Fri, Sep 23, 2016, at 03:35 PM, gary.wi...@victoriaplumb.com wrote:
> Yeah, the JSON decoder should handle it. Maybe post a bug report? https://github.com/golang/go/issues

This has been raised before: https://github.com/golang/go/issues/12254

The answer is to use a reader that strips the BOM.

Ian
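
If the whole file is already in memory as a byte slice, the same stripping can be sketched without a reader by trimming the prefix before calling json.Unmarshal. This is only an illustration; `unmarshalIgnoringBOM` is a hypothetical helper name, not something provided by encoding/json.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// unmarshalIgnoringBOM is a hypothetical helper (not part of encoding/json).
// It drops a leading UTF-8 BOM, if any, and then unmarshals the remainder.
func unmarshalIgnoringBOM(data []byte, v interface{}) error {
	// TrimPrefix returns data unchanged when the BOM is absent.
	data = bytes.TrimPrefix(data, []byte("\xEF\xBB\xBF"))
	return json.Unmarshal(data, v)
}

func main() {
	// Sample input: a BOM followed by a small JSON object.
	data := []byte("\xEF\xBB\xBF" + `{"count": 3}`)

	var v struct {
		Count int `json:"count"`
	}
	if err := unmarshalIgnoringBOM(data, &v); err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Println(v.Count)
}
```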