Json.RawMessage and double encoding

6,480 views
Skip to first unread message

bsr

unread,
Oct 2, 2013, 2:25:06 PM10/2/13
to golan...@googlegroups.com
Hello,


I keep a struct field as Json.RawMessage, expecting it can be used  as HTTP json response. 

In my mind, Json.RawMessage, indicates to Marshal/unmarshal that it is a json string and no further encoding/decoding necessary. But, as in the example, if it is part of a struct, and encode the parent struct, it changes the representation.

I also tried making the field Json pointer to Json.RawMessage, but I got the same result locally.

Thanks for the help.
bsr

Output:
---
b %s {"Name":"World","Id":2}
b2 %s {"Name":"Hello","Id":1,"Json":"eyJOYW1lIjoiV29ybGQiLCJJZCI6Mn0="}
d.Json %s "eyJOYW1lIjoiV29ybGQiLCJJZCI6Mn0="

Code:
--

package main
import (
"encoding/json"
"fmt"
)
type Data struct {
Name string
Id   int
Json json.RawMessage
}
func main() {
tmp := struct {
Name string
Id   int
}{"World", 2}
b, err := json.Marshal(tmp)
if err != nil {
fmt.Println("Error %s", err.Error())
}
fmt.Println("b %s", string(b))
test := Data{"Hello", 1, b}
b2, err := json.Marshal(test)
if err != nil {
fmt.Println("Error %s", err.Error())
}
fmt.Println("b2 %s", string(b2))
var d Data
err = json.Unmarshal(b2, &d)
if err != nil {
fmt.Println("Error %s", err.Error())
}
fmt.Println("d.Json %s", string(d.Json))
}

bsr

unread,
Oct 2, 2013, 3:46:19 PM10/2/13
to golan...@googlegroups.com
It was indeed an issue with not using a pointer receiver.

Json *json.RawMessage

Also, I had to use pointer for test below.

b2, err := json.Marshal(test)

aro...@gmail.com

unread,
Oct 2, 2013, 4:41:14 PM10/2/13
to golan...@googlegroups.com
This has come up a few times.  Seems like a bug to me: https://code.google.com/p/go/issues/detail?id=6528

There's no reason that I can think of that marshalling a json.RawMessage requires a pointer receiver.

Kyle Lemons

unread,
Oct 2, 2013, 5:29:37 PM10/2/13
to Augusto Roman, golang-nuts
We can't change it until go 2.


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Dan Kortschak

unread,
Oct 2, 2013, 6:43:10 PM10/2/13
to Kyle Lemons, Augusto Roman, golang-nuts
On Wed, 2013-10-02 at 14:29 -0700, Kyle Lemons wrote:
> We can't change it until go 2.

Unless it is considered as a bug as opposed to unfortunate or working as
intended.

Kyle Lemons

unread,
Oct 2, 2013, 6:50:44 PM10/2/13
to Dan Kortschak, Augusto Roman, golang-nuts
I don't really have any say in whether it's considered a bug, but I don't think it is.  I think it'd be even more surprising if it worked for encoding but not decoding, and decoding requires a pointer receiver.

Dan Kortschak

unread,
Oct 2, 2013, 7:00:16 PM10/2/13
to Kyle Lemons, Augusto Roman, golang-nuts
Yes, I think that's the reason it is like that, though I don't think
it's really surprising; decoding requires mutability of the receiver,
encoding doesn't.

Ugorji Nwoke

unread,
Oct 2, 2013, 8:19:22 PM10/2/13
to golan...@googlegroups.com, Kyle Lemons, Augusto Roman
I think it may have been an oversight. 

Some other types in the standard library that implement these use Value receiver for encode/marshal, and pointer receiver for decode/unmarshal e.g. 
http://tip.golang.org/pkg/time/ See Time. (M|Unm)arshal(JSON|Text|Binary) methods
http://tip.golang.org/pkg/net/ See IP. (M|Unm)arshal(Text) methods 

Also, since json.RawMessage is basically a []byte, it's idiomatic to pass it by value. 

However, it's now locked in till Go 2 at least. The best we may be able to do now is inform users in the API docs and examples.

Dan Kortschak

unread,
Oct 2, 2013, 8:31:19 PM10/2/13
to Ugorji Nwoke, golan...@googlegroups.com, Kyle Lemons, Augusto Roman
On Wed, 2013-10-02 at 17:19 -0700, Ugorji Nwoke wrote:
> I think it may have been an oversight.

Yes.

> Some other types in the standard library that implement these use
> Value receiver for encode/marshal, and pointer receiver for
> decode/unmarshal e.g.
> - http://tip.golang.org/pkg/time/ See Time. (M|Unm)arshal(JSON|Text|Binary) methods
> - http://tip.golang.org/pkg/net/ See IP. (M|Unm)arshal(Text) methods

Thanks, I was sure there were examples, but I couldn't think of them.


Matt Harden

unread,
Oct 4, 2013, 9:33:08 AM10/4/13
to golan...@googlegroups.com
To me this seems like a bug and therefore not subject to the Go 1 guarantee.

Ian Lance Taylor

unread,
Oct 4, 2013, 9:38:04 AM10/4/13
to Matt Harden, golang-nuts
On Fri, Oct 4, 2013 at 6:33 AM, Matt Harden <matt....@gmail.com> wrote:
> To me this seems like a bug and therefore not subject to the Go 1 guarantee.

In order to maintain the spirit of the Go 1 guarantee, we have to use
the "it's a bug" escape hatch very cautiously. We can only use it in
cases where no sane program would rely on the broken behaviour.
That's a subjective test, so we have to be really careful about using
our best judgment.

I haven't thought about the details on this particular issue. The
question to ask is: what sorts of working programs would start to
break if we made this change?

Ian

aro...@gmail.com

unread,
Oct 4, 2013, 11:50:13 PM10/4/13
to golan...@googlegroups.com, Matt Harden
I've been thinking about this issue on and off for the last several days.  I haven't come up with a single example where it would make sense to have RawMessage be purposefully byte-encoded by intentionally using a non-pointer receiver.

I'd be interested in a mega-compile of all known go libs to see if changing this breaks anything, or even detecting how many usages of the non-pointer receiver usage there are.  I would suppose that there are few, and they are all bugs.  Unfortunately, I won't be able to do such an analysis myself for a while.

- Augusto

Kevin Gillette

unread,
Oct 5, 2013, 9:57:41 PM10/5/13
to golan...@googlegroups.com, Matt Harden
On Friday, October 4, 2013 7:38:04 AM UTC-6, Ian Lance Taylor wrote:
We can only use it in cases where no sane program would rely on the broken behaviour.

Would any sane program rely on the marshaling of `json.RawMessage` to produce base64 encoded output stuffed in a JSON string, even when marshaling `*json.RawMessage` results in the documented behavior (passing through assumed-valid JSON syntax)?

I can't imagine any programmer found the documentation for RawMessage, saw that its straightforward (idiomatic) use unexpectedly produced the wrong output, yet decided that the proper solution was to change their expectation of the output as well as their program's behavior.

Kyle Lemons

unread,
Oct 7, 2013, 5:19:33 PM10/7/13
to Augusto Roman, golang-nuts, Matt Harden
Unfortunately, there are many, many other ways in which this could cause a program to break or change behavior.  You're effectively changing the method sets of any type that embeds a json.RawMessage, for instance.  It's concievable that someone would embed a RawMessage and use a pointer for unmarshaling, so that the raw JSON is stuck in the RawMessage field, and then they parse it into other fields from there while allowing the JSON encoder to not realize that it is a RawMessage when marshaling a value of it.  I think you're also making the code `x := (json.RawMessage).MarshalJSON` fail to compile, and making `y := foo.MarshalJSON` potentially cause foo to escape.

While I do agree that it's unfortunate and that we should change it eventually, I think doing so before go2 might be playing the "bug" card cavalierly.  It seems unlikely that making this change will fix enough existing, buggy programs to justify it.


--

Steven Blenkinsop

unread,
Oct 11, 2013, 1:28:02 PM10/11/13
to Kyle Lemons, Augusto Roman, golang-nuts, Matt Harden
On Monday, October 7, 2013, Kyle Lemons wrote:
Unfortunately, there are many, many other ways in which this could cause a program to break or change behavior.  You're effectively changing the method sets of any type that embeds a json.RawMessage, for instance.  It's concievable that someone would embed a RawMessage and use a pointer for unmarshaling, so that the raw JSON is stuck in the RawMessage field, and then they parse it into other fields from there while allowing the JSON encoder to not realize that it is a RawMessage when marshaling a value of it.

They'd have to be marshalling it into a base 64 representation of the json encoding of the data, which would be nonsense. There's no reason you'd base 64 encode something that was already text. Unless I'm not catching your meaning...
 
I think you're also making the code `x := (json.RawMessage).MarshalJSON` fail to compile, and making `y := foo.MarshalJSON` potentially cause foo to escape.

The opposite. The declaration of x wouldn't compile currently, but would with the change, and that of y could cause foo to escape now but wouldn't with the change.

Kyle Lemons

unread,
Oct 11, 2013, 1:42:32 PM10/11/13
to Steven Blenkinsop, Augusto Roman, golang-nuts, Matt Harden
On Fri, Oct 11, 2013 at 10:28 AM, Steven Blenkinsop <stev...@gmail.com> wrote:
On Monday, October 7, 2013, Kyle Lemons wrote:
Unfortunately, there are many, many other ways in which this could cause a program to break or change behavior.  You're effectively changing the method sets of any type that embeds a json.RawMessage, for instance.  It's concievable that someone would embed a RawMessage and use a pointer for unmarshaling, so that the raw JSON is stuck in the RawMessage field, and then they parse it into other fields from there while allowing the JSON encoder to not realize that it is a RawMessage when marshaling a value of it.

They'd have to be marshalling it into a base 64 representation of the json encoding of the data, which would be nonsense. There's no reason you'd base 64 encode something that was already text. Unless I'm not catching your meaning...

It is admittedly contrived, but it works and is a perfectly legal use of method sets:
 
 
I think you're also making the code `x := (json.RawMessage).MarshalJSON` fail to compile, and making `y := foo.MarshalJSON` potentially cause foo to escape.

The opposite. The declaration of x wouldn't compile currently, but would with the change, and that of y could cause foo to escape now but wouldn't with the change.

Right, I got myself switched around.  `x := (*json.RawMessage).MarshalJSON` will compile now, but fail in the future.

Steven Blenkinsop

unread,
Oct 11, 2013, 5:41:20 PM10/11/13
to Kyle Lemons, Augusto Roman, golang-nuts, Matt Harden


On Friday, October 11, 2013, Kyle Lemons wrote:
On Fri, Oct 11, 2013 at 10:28 AM, Steven Blenkinsop <stev...@gmail.com> wrote:

They'd have to be marshalling it into a base 64 representation of the json encoding of the data, which would be nonsense. There's no reason you'd base 64 encode something that was already text. Unless I'm not catching your meaning...

It is admittedly contrived, but it works and is a perfectly legal use of method sets:

Sure, it works. Seems like a really roundabout way to avoid implementing json.Unmarshaler. It seems that the person who would actually write this would be the same person who would hardcode a divide by zero in order to crash his program in certain cases. The circumstances in which this would be workable are fairly narrow anyways, namely where you are encoding and decoding close enough together that you even think to use the same structure, and where you aren't subsequently decoding the output from within Go. But I understand that this would be an iffy change considering the compatibility guarantee.

The opposite. The declaration of x wouldn't compile currently, but would with the change, and that of y could cause foo to escape now but wouldn't with the change.

Right, I got myself switched around.  `x := (*json.RawMessage).MarshalJSON` will compile now, but fail in the future.

From the spec:
"""
Consider a struct type T with two methods, Mv, whose receiver is of type T, and Mp, whose receiver is of type *T.

type T struct {
a int
}
func (tv  T) Mv(a int) int         { return 0 }  // value receiver
func (tp *T) Mp(f float32) float32 { return 1 }  // pointer receiver
<snip>
For a method with a value receiver, one can derive a function with an explicit pointer receiver, so `(*T).Mv` yields a function value representing Mv with signature `func(tv *T, a int) int`.
"""

Kevin Gillette

unread,
Oct 12, 2013, 12:15:29 AM10/12/13
to golan...@googlegroups.com, Steven Blenkinsop, Augusto Roman, Matt Harden
On Friday, October 11, 2013 11:42:32 AM UTC-6, Kyle Lemons wrote:
Right, I got myself switched around.  `x := (*json.RawMessage).MarshalJSON` will compile now, but fail in the future.

That should still work. Consider http://play.golang.org/p/9zZE_ziMnn

aro...@gmail.com

unread,
Mar 4, 2014, 4:51:48 AM3/4/14
to golan...@googlegroups.com, Augusto Roman, Matt Harden
Even well aware of this, I was reluctant to duplicate json.RawMessage locally and just got bit by it again in a more subtle manner:

This means that behavior changes surprisingly and undeniably incorrectly when encoding a value alone vs in a map, even when trying really hard to provide pointers.  For example, a struct that contains a json.RawMessage may suddenly fail to encode correctly.  Worse, using map[*]json.RawMessage guaranteed to encode incorrectly.

If we decide not to fix this bug (issue 6528), we should deprecate RawMessage and put a big warning message on it.

- Augusto
Reply all
Reply to author
Forward
0 new messages