Is overwriting contents of pointer field the expected behavior of json.Unmarshal?

571 views
Skip to first unread message

Mateusz Wójcik

unread,
Aug 22, 2022, 7:01:01 PM8/22/22
to golang-nuts
The test code below demonstrates a json.Unmarshal behavior that I find unexpected. The code unmarshals arrays of objects with pointers fields. All unmarshaled objects end up sharing the same pointer value, effectively overwriting the contents of the pointer.
package unmarshall

import (
    "encoding/json"
    "fmt"
    "testing"

    "github.com/stretchr/testify/assert"
)

func TestUnmarshal(t *testing.T) {
    type Elem struct {
        V int
        P *int
    }

    elems := []Elem(nil)
    newElems := []Elem(nil)

    json.Unmarshal([]byte(`[{"V":10, "P":1}]`), &newElems)
    elems = append(elems, newElems...)

    json.Unmarshal([]byte(`[{"V":20, "P":2}]`), &newElems)
    elems = append(elems, newElems...)

    assert.Equal(t, 10, elems[0].V)
    assert.Equal(t, 1, *elems[0].P)

    assert.Equal(t, 20, elems[1].V)
    assert.Equal(t, 2, *elems[1].P)

    assert.NotEqual(t, elems[0].P, elems[1].P)
}
I find the behavior unexpected because the documentation of json.Unmarshal states:
// To unmarshal a JSON array into a slice, Unmarshal resets the slice length
// to zero and then appends each element to the slice.
However, reseting the slice length to zero and appending elements to the slice results in different behavior:
func TestAppend(t *testing.T) {
   type Elem struct {
      V int
      P *int
   }

   elems := []Elem(nil)
   newElems := []Elem(nil)

   one, two := 1, 2

   newElems = append(newElems[:0], Elem{V: 10, P: &one})
   elems = append(elems, newElems...)

   newElems = append(newElems[:0], Elem{V: 20, P: &two})
   elems = append(elems, newElems...)

   assert.Equal(t, 10, elems[0].V)
   assert.Equal(t, 1, *elems[0].P)

   assert.Equal(t, 20, elems[1].V)
   assert.Equal(t, 2, *elems[1].P)

   assert.NotEqual(t, elems[0].P, elems[1].P)
}
Are my expectations of the above behavior incorrect?

Tamás Gulácsi

unread,
Aug 23, 2022, 1:52:29 AM8/23/22
to golang-nuts
"To unmarshal a JSON array into a slice, Unmarshal resets the slice length to zero and then appends each element to the slice."

And you gave "&newElems" to Unmarshal twice. So it zeroed the length, and appended the second time,
effectively overwriting the first (and only) element.
This is intended behaviour.

The problem is not this, but your use of newElems - although it's length is zeroed, the first element is still there, so Unmarshal uses it, and fills P,
instead of allocating a new *int.

If you zero out the element, it will allocate a new pointer, as you want:

Slawomir Pryczek

unread,
Aug 23, 2022, 3:30:01 AM8/23/22
to golang-nuts
From unmarshal doc:

1. To unmarshal JSON into a pointer, Unmarshal first handles the case of the JSON being the JSON literal null. In that case, Unmarshal sets the pointer to nil. Otherwise, Unmarshal unmarshals the JSON into the value pointed at by the pointerIf the pointer is nil, Unmarshal allocates a new value for it to point to.

2. To unmarshal a JSON array into a slice, Unmarshal resets the slice length to zero and then appends each element to the slice.

The problem with it is that it's really hard to describe what it really does without looking at code. When the slice size is reset to zero it means that only the length of this slice will be set to 0. The underlying data is still there in memory and the capacity is still X. Now during slice expansion, the data is not reinitialized to 0 if that slice is still able to fit another element (number of elements < current capacity) because the pointer was initialized, previously.

So during first unmarshal a new element is created, with V=0, P=nil. Unmarshal will set the pointer to some place in memory (1) so it can store int value in there. After first Unmarshal you have a slice newElems with length 1 and capacity 4 with V=10 and P=0xSOMETHING. The pointer is there so the pointed data will be overwritten for element [0], as described by documentation in section (1)

There's a problem with section (2) because that's not exactly what happens during append. There was a proposition to re-initialize the slice element before each element is "appended" by unmarshal (so it'd work like "real" append does and be in sync with the doc) but it was rejected (and that's very good as it broken some code). Real behaviour of unmarshal can maybe be described better by saying that it allocates slice elements with zeroinit and overwrites slice elements when allocation isn't needed.

 

Mateusz Wójcik

unread,
Aug 23, 2022, 3:51:42 AM8/23/22
to golang-nuts
I'm aware of why the JSON unmarshal overwrites contents of the pointer, as I followed the code. I'm also aware of how to pass the slice argument "correctly".

There's a problem with section (2) because that's not exactly what happens during append. There was a proposition to re-initialize the slice element before each element is "appended" by unmarshal (so it'd work like "real" append does and be in sync with the doc) but it was rejected (and that's very good as it broken some code). Real behaviour of unmarshal can maybe be described better by saying that it allocates slice elements with zeroinit and overwrites slice elements when allocation isn't needed.

Yes, that's exactly what I'm aiming at. If the current behavior is intended, the documentation doesn't convey it correctly.
Reply all
Reply to author
Forward
0 new messages