Default Values vs Missing Values

Yoav H

Mar 26, 2016, 2:43:00 PM
to Protocol Buffers
Hi,

I wanted to ask about the decision to populate fields with default values even when they do not appear in the encoded message.
If I want to send a "patch" message that updates only the provided fields, how can I do that with protobuf (without adding an IsXXXSet flag for every field)?

Why not add another type that represents a default value?
The semantics would then be: if the field is missing, it is null; if the field is present but carries this "missing value" type, it gets the default value.

Thanks,
Yoav. 

Ilia Mirkin

Mar 26, 2016, 2:47:08 PM
to Yoav H, Protocol Buffers
Use proto2, which has the has_* checks per field. (Using get_* you
still get the default value, of course.) It's extremely unfortunate
that this functionality was removed in proto3; I see that as making
proto3 unattractive for all but the simplest uses of protos. I know in
almost every protobuf use-case I've had, the presence accessors were
imperative to proper operation.
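
The presence check described above can be pictured with a small stand-in. This is only a sketch in plain Python, not generated protobuf code: the class `OptionalInt32` and its method names are hypothetical and merely mimic the has/get pair that proto2 generates for an optional field.

```python
class OptionalInt32:
    """Hypothetical stand-in for a proto2 optional int32 field with a has-bit."""

    def __init__(self, default=0):
        self._default = default
        self._value = None  # None means "never set"

    def has(self):
        # Mirrors has_field(): true only after an explicit set().
        return self._value is not None

    def get(self):
        # Mirrors the getter: falls back to the default when unset.
        return self._default if self._value is None else self._value

    def set(self, value):
        self._value = value

    def clear(self):
        self._value = None


unset = OptionalInt32()
zero = OptionalInt32()
zero.set(0)

print(unset.get(), unset.has())  # 0 False
print(zero.get(), zero.has())    # 0 True
```

Both objects return 0 from the getter; only the presence check tells them apart, which is the distinction a "patch" message relies on.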

Tim Kientzle

Mar 26, 2016, 3:21:39 PM
to Yoav H, Protocol Buffers
As Ilia pointed out, proto2 still exists, is still supported, and can be used for
cases where you require these particular semantics.

For proto3, you might look at google.protobuf.FieldMask, which is a new
standard message (one of the "well-known types") specifically designed
to store a set of field names. You might be able to achieve what you want by
providing a FieldMask with your data listing the specific fields to
be updated.
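
As a rough illustration of the idea, the sketch below applies a patch under a list of field paths, using plain dictionaries in place of messages. It does not use the protobuf library: `apply_patch` and the sample data are hypothetical, and the `paths` list only plays the role of a FieldMask's `paths` field.

```python
import copy


def apply_patch(target, patch, paths):
    """Return a copy of `target` with only the fields named in `paths` updated.

    `paths` stands in for FieldMask.paths; dotted paths address nested fields.
    """
    result = copy.deepcopy(target)
    for path in paths:
        src, dst = patch, result
        *parents, leaf = path.split(".")
        for name in parents:
            src = src.get(name, {})
            dst = dst.setdefault(name, {})
        if leaf in src:
            dst[leaf] = copy.deepcopy(src[leaf])
        else:
            # Assumption of this sketch: a masked field that is missing
            # from the patch is cleared in the result.
            dst.pop(leaf, None)
    return result


stored = {"name": "alice", "age": 30, "address": {"city": "Oslo", "zip": "0150"}}
patch = {"age": 0, "address": {"city": "Bergen"}}

updated = apply_patch(stored, patch, ["age", "address.city"])
print(updated)
# {'name': 'alice', 'age': 0, 'address': {'city': 'Bergen', 'zip': '0150'}}
```

Because the mask names `age` explicitly, the zero value is applied as a real update, while `name` and `address.zip` are left alone because they are not listed.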

Tim


Yoav H

Mar 26, 2016, 4:00:04 PM
to Protocol Buffers, joe.dai...@gmail.com
Thanks all,

Do you know where I can find the proto2 encoding guide?
The proto site has only the proto3 encoding described.

Ilia Mirkin

Mar 26, 2016, 4:08:23 PM
to Yoav H, Protocol Buffers
Encoding is identical... just the API is different. In proto2, you
have (in C++) FooMessage->has_field() which will tell you whether a
field was present in the encoded version (or has been set prior if
you're building a new message). The Java API has something rather
similar... hasField() I think?
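
To make the "same encoding, different API" point concrete, here is a hand-rolled sketch of how a single int32 field at number 1 reaches the wire. It is plain Python rather than the protobuf library; the function names are hypothetical, negative values are not handled, and `None` is used to stand for "never set".

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int32_field(field_number, value):
    """Tag (field number + wire type 0) followed by the varint value."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)


def serialize_explicit(value):
    """proto2-style optional field: written whenever it was set (None = unset)."""
    return b"" if value is None else encode_int32_field(1, value)


def serialize_implicit(value):
    """proto3-style scalar field: written only when it differs from zero."""
    return b"" if value == 0 else encode_int32_field(1, value)


print(serialize_explicit(150).hex())   # 089601
print(serialize_implicit(150).hex())   # 089601
print(serialize_explicit(0).hex())     # 0800
print(serialize_implicit(0).hex())     # (empty)
print(serialize_explicit(None).hex())  # (empty)
```

The bytes for a given tag and value are the same in both styles; what differs is whether a zero that was explicitly set is written at all.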

Yoav H

Mar 29, 2016, 2:02:11 AM
to Protocol Buffers, joe.dai...@gmail.com, imi...@alum.mit.edu
How do they handle collections (repeated, non-packed) in this case?
The absence of the tag is not conclusive.
Actually, even packed collections (and strings, and binary data) suffer from that, since you are "expected" not to include a packed collection with zero bytes.

Ilia Mirkin

Mar 29, 2016, 10:16:16 AM
to Yoav H, Protocol Buffers
You can't distinguish an empty repeated from one that's not there at
all. If you need that, you'll need a manual presence field.
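
A small sketch of why that is: an empty repeated field contributes no bytes in either the non-packed or the packed form. The helpers below are hypothetical, written in plain Python without the protobuf library, and handle only non-negative values.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_repeated_int32(field_number, values):
    """Non-packed repeated field: one tag/value pair per element."""
    tag = encode_varint((field_number << 3) | 0)
    return b"".join(tag + encode_varint(v) for v in values)


def encode_packed_int32(field_number, values):
    """Packed repeated field: one length-delimited record, omitted when empty."""
    if not values:
        return b""
    payload = b"".join(encode_varint(v) for v in values)
    tag = encode_varint((field_number << 3) | 2)
    return tag + encode_varint(len(payload)) + payload


print(encode_repeated_int32(4, []))                    # b''
print(encode_packed_int32(4, []))                      # b''
print(encode_packed_int32(4, [3, 270, 86942]).hex())   # 2206038e029ea705
```

Since "empty" and "never set" both produce zero bytes, a receiver has nothing to distinguish them by, hence the need for a separate presence field.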

Zellyn

Mar 30, 2016, 10:10:54 AM
to Protocol Buffers
protos are also gaining "Well-known Types", some of which are "boxed" (Message) versions of the primitive types: https://developers.google.com/protocol-buffers/docs/reference/google.protobuf

I believe the actual docs on well-known types are currently Google-internal :-(

Zellyn

Teddy Zhang

May 17, 2016, 10:53:02 PM
to Protocol Buffers, joe.dai...@gmail.com, imi...@alum.mit.edu
I'm really not happy to see that proto3 removed the ability in generated code to check whether a field exists or not.

For a message like this:
message Test1 {
  required int32 a = 1;
}
If field a is present, the encoded message will have a field with id 1 and its value. If the field is not set, the encoded message will not have field id 1.
In proto2, the generated code provides a has method to check whether the field exists or not.
In proto3, there is no such thing. During deserialization, if the field does not exist, the default value is set. So you can't tell whether the field does not exist or has the default value. That doesn't match the underlying encoding anymore.

This is a breaking change and will potentially impact a lot of people. Basically we're losing nullable support.
For our project, we depend heavily on that. There are workarounds (add a Boolean field) but they are ugly. I think that will stop us from moving from proto2 to proto3 (we may need to find alternatives).

Can we add the functionality back?
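
The mismatch between the encoding and the proto3 accessors can be sketched from the decoding side. The parser below is a hypothetical, minimal one in plain Python (no protobuf library) that understands only varint fields; it is meant to show what information the bytes carry, not how any real runtime is implemented.

```python
def decode_varint(data, pos):
    """Decode one varint starting at `pos`; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_varint_fields(data):
    """Parse a message made only of varint fields into {field_number: value}."""
    fields = {}
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        if tag & 0x07 != 0:
            raise ValueError("this sketch only handles wire type 0 (varint)")
        value, pos = decode_varint(data, pos)
        fields[tag >> 3] = value
    return fields


absent = parse_varint_fields(b"")                  # field 1 not on the wire
explicit_zero = parse_varint_fields(b"\x08\x00")   # field 1 present, value 0

# proto2-style view: presence is whether the tag was seen.
print(1 in absent, 1 in explicit_zero)             # False True
# proto3-style view: only the value is exposed, with 0 as the fallback.
print(absent.get(1, 0), explicit_zero.get(1, 0))   # 0 0
```

The wire data still records whether the tag appeared; the proto3-style view simply does not surface it.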

Feng Xiao

May 18, 2016, 2:32:03 PM
to Teddy Zhang, Protocol Buffers, Yoav H, Ilia Mirkin
There are two workarounds to get back the field presence info in proto3.
1. Use a wrapper message, such as google.protobuf.Int32Value. In proto3, message fields still have has-bits.
2. Use a oneof. For example:
message Test1 {
  oneof a_oneof {
    int32 a = 1;
  }
}
then you can check test.getAOneofCase().
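
For a feel of what the two workarounds put on the wire, the following sketch encodes the same value as a plain int32 field and as a wrapper message in the style of google.protobuf.Int32Value (a message whose field 1 holds the value). The helper names are hypothetical, the code is plain Python without the protobuf library, and only non-negative values are handled.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_plain_int32(field_number, value):
    """A set int32 field with presence (proto2 optional or oneof member)."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_wrapped_int32(field_number, value):
    """Wrapper message field: length-delimited, inner field 1 holds the value."""
    # The inner field is a proto3 scalar, so a zero value is left out.
    inner = b"" if value == 0 else encode_plain_int32(1, value)
    tag = encode_varint((field_number << 3) | 2)
    return tag + encode_varint(len(inner)) + inner


print(encode_plain_int32(1, 150).hex())    # 089601
print(encode_wrapped_int32(1, 150).hex())  # 0a03089601
print(encode_plain_int32(1, 0).hex())      # 0800
print(encode_wrapped_int32(1, 0).hex())    # 0a00
```

In this sketch the wrapper adds a tag and a length byte and uses a different wire type, while a oneof member is written like an ordinary field, so a set zero still appears on the wire.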



> Can we add the functionality back?
It's very unlikely to happen as proto3 features are already finalized and implemented in many languages.

Teddy Zhang

May 18, 2016, 3:30:06 PM
to Protocol Buffers, losti...@gmail.com, joe.dai...@gmail.com, imi...@alum.mit.edu


On Wednesday, May 18, 2016 at 11:32:03 AM UTC-7, Feng Xiao wrote:


> There are two workarounds to get back the field presence info in proto3.
> 1. Use a wrapper message, such as google.protobuf.Int32Value. In proto3, message fields still have has-bits.

A wrapper field consumes more space. Also, the wire format is not compatible when moving from proto2 to proto3 (given that the schema needs to change).

> 2. Use a oneof. For example:
> message Test1 {
>   oneof a_oneof {
>     int32 a = 1;
>   }
> }
> then you can check test.getAOneofCase().

Same issue as above.

>> Can we add the functionality back?
> It's very unlikely to happen as proto3 features are already finalized and implemented in many languages.

Is it possible to add an option on the message to control this?
I know proto3 is probably in its last beta and trying to avoid big changes. However, removing support for this creates a lot of pain in a big system that already leverages this feature, and may drive many people away.

Teddy Zhang

May 26, 2016, 5:26:48 PM
to Protocol Buffers, losti...@gmail.com, joe.dai...@gmail.com, imi...@alum.mit.edu
I've created an issue for this: