A problem of the design of gob

David DENG

Feb 20, 2013, 5:17:00 AM
to golan...@googlegroups.com
In gob, if a field contains the zero value, it is treated as having NO value. That causes some problems:

1) You have to create a new struct, or reset the existing one to its zero value, before you decode into it. For example, the following code doesn't produce what you would expect:
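
The snippet posted at this point did not survive in this copy of the thread. A minimal sketch of the behavior being described, assuming a hypothetical `Sample` type and helper `decodeInto` (neither comes from the original post):

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Sample is a hypothetical type used only for illustration.
type Sample struct {
	A int
	B string
}

// decodeInto gob-encodes src and decodes the bytes into dst.
func decodeInto(src Sample, dst *Sample) error {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(src); err != nil {
		return err
	}
	return gob.NewDecoder(&buf).Decode(dst)
}

func main() {
	dst := Sample{A: 7, B: "stale"} // the receiver already holds data
	if err := decodeInto(Sample{A: 0, B: "fresh"}, &dst); err != nil {
		panic(err)
	}
	// A was zero on the sending side, so it was never transmitted
	// and the receiver keeps its old value.
	fmt.Printf("%+v\n", dst) // {A:7 B:fresh}
}
```

One might expect `A` to come back as 0 here, but the decoder only writes the fields that are present in the stream.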


Instead you have to do it in this way:


or in this way
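
The two snippets for these alternatives are also missing from this copy. A sketch of what they presumably looked like, again with a hypothetical `Sample` type and hypothetical helpers: `decodeFresh` declares a new variable, while `decodeReuse` resets an existing one to its zero value first.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Sample is a hypothetical type used only for illustration.
type Sample struct {
	A int
	B string
}

// encode returns the gob encoding of v.
func encode(v Sample) []byte {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(v); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// decodeFresh decodes into a newly declared, zero-valued variable.
func decodeFresh(data []byte) (Sample, error) {
	var v Sample
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&v)
	return v, err
}

// decodeReuse resets an existing variable to its zero value, then decodes.
func decodeReuse(data []byte, v *Sample) error {
	*v = Sample{}
	return gob.NewDecoder(bytes.NewReader(data)).Decode(v)
}

func main() {
	data := encode(Sample{A: 0, B: "fresh"})

	fresh, err := decodeFresh(data)
	if err != nil {
		panic(err)
	}

	reused := Sample{A: 7, B: "stale"}
	if err := decodeReuse(data, &reused); err != nil {
		panic(err)
	}

	// Both receivers now agree with what was sent.
	fmt.Printf("%+v %+v\n", fresh, reused)
}
```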


2) For pointers it is even worse: starting from a zero value does not work either.
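
The pointer example is missing from this copy as well. A sketch of the case that the replies discuss, assuming a hypothetical `Reading` type with an `*int` field: a pointer to a non-zero value comes back allocated, while a pointer to zero comes back nil, even though the receiver starts out as a clean zero value.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Reading is a hypothetical type used only for illustration.
type Reading struct {
	Label string
	Count *int
}

// roundTrip gob-encodes in and decodes it into a fresh, zero-valued Reading.
func roundTrip(in Reading) (Reading, error) {
	var buf bytes.Buffer
	var out Reading
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return out, err
	}
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	zero, five := 0, 5
	for _, p := range []*int{&five, &zero} {
		out, err := roundTrip(Reading{Label: "r", Count: p})
		if err != nil {
			panic(err)
		}
		// Prints the sent value and whether the received pointer is non-nil.
		fmt.Println(*p, out.Count != nil)
	}
}
```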


David

Nate Finch

Feb 20, 2013, 8:30:47 AM
to golan...@googlegroups.com
This looks like mostly a problem of lack of documentation.  If it were documented, the behavior would be clear.  

Zero values for types are not written on encode or decode, so any receiving type should be set to the zero value (or you should be aware that the contents of the type will serve as a default if the value encoded was the zero value).

For receiving into types that are pointers or contain pointers, the pointers must be non-nil. Pointers will not be created for you.

Jan Mercl

Feb 20, 2013, 9:13:29 AM
to David DENG, golang-nuts
On Wed, Feb 20, 2013 at 11:17 AM, David DENG <david...@gmail.com> wrote:
> In gob, if a field contains zero value, it is considered as NO value. That
> causes some problems:

I don't think so (wrt "NO value"). The rationale is, I think: No need
to transport the zero values as they are the defaults. Then: I think
no one (of the developers) expected, nor meant to support, decoding
into non-zero values. I cannot come up with a sane use case for that
either. Modulo the pointers, as Nate said already, but that again seems
to me like a rational design choice: Filling values of e.g. a struct
is always OK (safe). OTOH, carelessly growing a tree structure is an
easy attack vector.

-j

Kamil Kisiel

Feb 20, 2013, 1:04:05 PM
to golan...@googlegroups.com
This is similar to how encoding/json or encoding/xml work as well. The decoder doesn't touch fields that are not in the data stream.
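
A small sketch of that decoding behavior with encoding/json, assuming a hypothetical `Sample` type and helper `mergeJSON`: a key that is absent from the input leaves the corresponding field as it was.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Sample is a hypothetical type used only for illustration.
type Sample struct {
	A int
	B string
}

// mergeJSON unmarshals data on top of base and returns the result.
func mergeJSON(base Sample, data string) (Sample, error) {
	err := json.Unmarshal([]byte(data), &base)
	return base, err
}

func main() {
	// The input has no "A" key, so A keeps the value it had before.
	out, err := mergeJSON(Sample{A: 7, B: "stale"}, `{"B":"fresh"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out) // {A:7 B:fresh}
}
```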
The fact the field is not sent at all is intentional and documented in the encoding/gob documentation. In the paragraph about struct encodings:

"If a field has the zero value for its type, it is omitted from the transmission."

Dan Kortschak

Feb 20, 2013, 2:37:47 PM
to Nate Finch, golan...@googlegroups.com
But in the last example the pointer is allocated when the value is non-zero. This is a case where gob should probably consider the field to be non-zero, since the pointer is not nil, even though the value it points to is zero.

A pointer to a numerical value can be used to indicate whether the value is absent, as opposed to being known to be zero. I use this idiom regularly when I'm not working with float values and so don't have NaN to make that statement.

gob could transmit zero values when they are pointed to by a pointer. This would allow the explicit transfer of zeros in this case.

Dan Kortschak

Feb 20, 2013, 3:00:30 PM
to golan...@googlegroups.com
I should clarify this. The result of new(int) is not a zero value. So a struct value, struct{Foo *int}{new(int)}, should send the value 0 for that field. This would indicate to the decoder that if the receiving struct has a pointer field, the pointer should point to an allocated value.

I can't see how this would break existing code, though it would add slight overhead in the very few cases where people probably wanted the data to be transferred.

The alternative in these cases is to implement gob.GobDecoder and gob.GobEncoder, which seems a pain for this.

David DENG

Feb 20, 2013, 7:26:24 PM
to golan...@googlegroups.com, David DENG
Using the zero value to mean no-value or default-value is OK for pointers (themselves), slices and maps. But for numbers this is quite often not true.

For example, in each of the following fields zero is an ordinary value, no more special than any non-zero value:

struct {
    Temperature  float32
    Height       int 
    NumOfFriends int
    HasError     bool
}

I think the design conflates whether to transmit a value with whether to set a default value. If I have to clear the struct before decoding, it is actually not the most efficient approach.

David

David DENG

Feb 20, 2013, 7:57:11 PM
to golan...@googlegroups.com
I can understand it if the aim is to reduce the number of encoded bytes, but for many numeric and boolean types the zero value is just like any other value. Besides the examples in my other post, for a field such as IndexInASlice, -1 is a more commonly used special value, meaning the index is invalid.

I think the decoder can detect the numeric fields from the type information in the encoded stream. If a field is a number or boolean, the receiver has that field, and there is no value in the encoded entry, the receiver's field should be set to zero.

For pointers, I give another example here:
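
The example posted here is missing from this copy. A sketch of the situation described in the next paragraph, assuming a hypothetical `Point` type with pointer coordinates and a hypothetical `normalize` helper that the receiver would have to write itself:

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Point is a hypothetical type used only for illustration.
type Point struct {
	X, Y *int
}

// roundTrip gob-encodes in and decodes it into a fresh Point.
func roundTrip(in Point) (Point, error) {
	var buf bytes.Buffer
	var out Point
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return out, err
	}
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

// normalize allocates any coordinate the decoder left nil, treating it as 0.
func normalize(p *Point) {
	if p.X == nil {
		p.X = new(int)
	}
	if p.Y == nil {
		p.Y = new(int)
	}
}

func main() {
	x, y := 3, 0
	out, err := roundTrip(Point{X: &x, Y: &y})
	if err != nil {
		panic(err)
	}
	// X was allocated by the decoder; Y was not, because 0 was never sent.
	fmt.Println(out.X != nil, out.Y != nil)

	normalize(&out)
	fmt.Println(*out.X, *out.Y)
}
```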


X and Y are coordinates. Why do I have to create Y myself (because it points to zero) but not X (just because it points to a non-zero value)? A coordinate being zero is very common and not special at all!

David

Dan Kortschak

Feb 20, 2013, 8:05:50 PM
to David DENG, golan...@googlegroups.com
If this is seen as a problem worth fixing I think attacking it at the
receiving end is the wrong approach. For non-pointer values, people
should just start with a properly conditioned receiving variable. For
pointer values though, the receiver in some (many?) cases cannot know
what the properly conditioned state is, so zero values that are
explicitly pointed to in the encoded variable should be sent as the zero
value for the pointed-to type. This way the decoder would know to set
the field to zero, allocating if needed in the case of a pointer field.

This can all be worked around by defining a gob.GobEncoder/GobDecoder for the
field type when this is important, hence the caveat at the beginning.
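
A rough sketch of that workaround, using the `GobEncoder` and `GobDecoder` interfaces from encoding/gob. The `OptInt` type, its `Some` and `Get` helpers, and the text-based byte format are all assumptions made for illustration, not something proposed in the thread:

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"strconv"
)

// OptInt is a hypothetical optional integer: a nil p means "absent".
type OptInt struct{ p *int }

// Some returns an OptInt holding v, including v == 0.
func Some(v int) OptInt { return OptInt{p: &v} }

// Get reports the held value and whether one is present.
func (o OptInt) Get() (int, bool) {
	if o.p == nil {
		return 0, false
	}
	return *o.p, true
}

// GobEncode implements gob.GobEncoder; an absent value encodes as no bytes.
func (o OptInt) GobEncode() ([]byte, error) {
	if o.p == nil {
		return nil, nil
	}
	return []byte(strconv.Itoa(*o.p)), nil
}

// GobDecode implements gob.GobDecoder.
func (o *OptInt) GobDecode(data []byte) error {
	if len(data) == 0 {
		o.p = nil
		return nil
	}
	v, err := strconv.Atoi(string(data))
	if err != nil {
		return err
	}
	o.p = &v
	return nil
}

// Point is a hypothetical struct with optional coordinates.
type Point struct{ X, Y OptInt }

// roundTrip gob-encodes in and decodes it into a fresh Point.
func roundTrip(in Point) (Point, error) {
	var buf bytes.Buffer
	var out Point
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return out, err
	}
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	out, err := roundTrip(Point{X: Some(3), Y: Some(0)})
	if err != nil {
		panic(err)
	}
	y, ok := out.Y.Get()
	fmt.Println(y, ok) // an explicit zero is reported as present
}
```

Because an `OptInt` holding a pointer to 0 is not itself a zero value, the explicit zero survives the round trip, while a field that was never set still decodes as absent.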

David DENG

Feb 20, 2013, 9:08:37 PM
to golan...@googlegroups.com, David DENG
I didn't check the source code, but the document says: "Thus when we send our first type T, the gob encoder sends a description of T". So I suppose the receiver has enough information.

David

David DENG

Feb 20, 2013, 9:19:07 PM
to golan...@googlegroups.com
package "encoding/json" is different:
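
The snippet posted here is missing from this copy. A sketch of the difference being pointed out, assuming a hypothetical `Sample` type and helper `viaJSON`: json.Marshal writes zero-valued fields to the output, so they overwrite whatever the receiver held.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Sample is a hypothetical type used only for illustration.
type Sample struct {
	A int
	B string
}

// viaJSON marshals src, unmarshals the result into dst,
// and returns the JSON text that was produced.
func viaJSON(src Sample, dst *Sample) (string, error) {
	data, err := json.Marshal(src)
	if err != nil {
		return "", err
	}
	return string(data), json.Unmarshal(data, dst)
}

func main() {
	dst := Sample{A: 7, B: "stale"}
	text, err := viaJSON(Sample{}, &dst)
	if err != nil {
		panic(err)
	}
	fmt.Println(text) // {"A":0,"B":""}
	fmt.Printf("%+v\n", dst)
}
```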


Zero integers and empty strings are correctly encoded and decoded.

This design is more natural for me.

David

Dan Kortschak

Feb 20, 2013, 9:27:30 PM
to David DENG, golan...@googlegroups.com
Yes, it has enough information about the type, but not about values.

This should illustrate:

type A struct {
X *int
Y int
}

type B struct {
X int
Y int
}

Send A and the receiver knows that the type has an X and a Y, both of
which take values that are int, but it doesn't know (or care) that one is a
pointer and one is a concrete value.

Send (illegal syntax to simplify) A{&1, 2} and you can receive in B as
B{1, 2}. If you send A{&0, 2} or A{nil, 2} the value received in B will
be B{0, 2}. This is not interesting, but in the case of receiving into
an A, you get both as A{nil, 2}, because nil is the zero value and is
not sent, and &0 dereferences to the zero value of int, which is not
sent either. So how can we differentiate these?

A{nil, 2} // nil is the zero value, so send {Y:2}. Same as now.

A{&0, 2} // &0 is not the zero value for *int, so send {X:0, Y:2}

then when the decoder sees a gob struct of {X:0, Y:2} it will

{X:0, Y:2} -> A{&0, 2} which keeps the correct state for the *int field,

or

{X:0, Y:2} -> B{0, 2} which is no change from the current situation.

I can't see a way you can distinguish an absent pointer field that is zero
from an absent pointer field that is nil unless you send some
information. This does that.

As far as the non-pointer fields go, they are purely a convenience
thing; just properly condition your receiving variable prior to decoding
into it.
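
The current behavior described above can be checked with a short sketch. The types `A` and `B` are the ones from this message; the `transcode` helper is an assumption added for illustration.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

type A struct {
	X *int
	Y int
}

type B struct {
	X int
	Y int
}

// transcode gob-encodes src and decodes the bytes into dst (a pointer).
func transcode(src, dst interface{}) error {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(src); err != nil {
		return err
	}
	return gob.NewDecoder(&buf).Decode(dst)
}

func main() {
	one, zero := 1, 0

	// An A can be received as a B: the indirection is not part of the wire type.
	var b B
	if err := transcode(A{X: &one, Y: 2}, &b); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", b) // {X:1 Y:2}

	// A pointer to zero and a nil pointer both arrive as a nil X.
	var fromZero, fromNil A
	if err := transcode(A{X: &zero, Y: 2}, &fromZero); err != nil {
		panic(err)
	}
	if err := transcode(A{X: nil, Y: 2}, &fromNil); err != nil {
		panic(err)
	}
	fmt.Println(fromZero.X == nil, fromNil.X == nil)
}
```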

On Wed, 2013-02-20 at 18:08 -0800, David DENG wrote:
> I didn't check the source code but in the document
> <http://golang.org/doc/articles/gobs_of_data.html> says: "Thus when we
> send our first type T, the gob encoder sends a description of T". So, I

David DENG

Feb 20, 2013, 9:35:04 PM
to golan...@googlegroups.com, David DENG
This makes sense. Whatever I said, you just clarified what I expected.

David

Nigel Tao

Feb 20, 2013, 11:10:29 PM
to David DENG, golang-nuts
On Thu, Feb 21, 2013 at 11:26 AM, David DENG <david...@gmail.com> wrote:
> If I have to clear the struct before decoding, it is actually
> not the most efficient approach.

Clearing the struct before decoding is both idiomatic and cheap.

Gob's design is gob's design. Its semantics are not something that is
easily changed, given both the Go 1 backwards compatibility constraint
and the fact that there exists gob-encoded data on disk.

Dan Kortschak

Feb 20, 2013, 11:27:23 PM
to Nigel Tao, golang-nuts
I agree with this. But I'm still wondering about the pointer issue. As
it stands it is not possible to signal that a pointer to a zero value is
different to a nil pointer without the field in question implementing
gob.GobEncoder.

Nigel Tao

Feb 21, 2013, 1:17:27 AM
to Dan Kortschak, golang-nuts
On Thu, Feb 21, 2013 at 3:27 PM, Dan Kortschak
<dan.ko...@adelaide.edu.au> wrote:
> I agree with this. But I'm still wondering about the pointer issue. As
> it stands it is not possible to signal that a pointer to a zero value is
> different to a nil pointer without the field in question implementing
> gob.GobEncoder.

It is what it is, and the inability to distinguish nil from
pointer-to-zero is a natural consequence of two deliberate design
decisions. First, indirection in the Go types does not matter; a *int
field holds the same information as an int field. Second, zero-valued
fields are omitted from the wire format.

Dan Kortschak

Feb 21, 2013, 1:37:34 AM
to Nigel Tao, golang-nuts
The point that I make though is that a *int is not the same as an int.
If they were, we wouldn't bother having *ints.

From the source it's clear that the indirection is done without saving
that pointer history. If the fact that indirection had been necessary to
reach a zero value were recorded, then a zero value could be sent, giving an
indication at the receiving end that the value should be allocated.

I think it's an unfortunate confluence of features. It's not
unresolvable at the client end though, so it can be lived with.