protobuf3 deterministic binary serialization


Brandon Philips

Oct 23, 2015, 4:53:32 PM
to Protocol Buffers, Vincent Batts
Hello-

The Open Container Initiative is investigating the use of protobuf for filesystem and container metadata. Part of the goal is to make this metadata signable. However, we learned that protobuf binary serialization is not deterministic between implementations: https://groups.google.com/a/opencontainers.org/d/msg/dev/xo4SQ92aWJ8/ad1-xm9qCAAJ

Is there any hope of that being tackled in protobuf3, or of there at least being a mode to say "I really want the deterministic serialization"?

It isn't a huge blocker for us, but it does mean the same spec serialized by a Python tool and a Java tool MAY be different, which is a bit annoying.

Thank You,

Brandon

Feng Xiao

Oct 23, 2015, 5:26:06 PM
to Brandon Philips, Protocol Buffers, Vincent Batts
On Wed, Oct 21, 2015 at 10:31 AM, Brandon Philips <brandon...@coreos.com> wrote:
> Hello-
>
> The Open Container Initiative is investigating the use of protobuf for filesystem and container metadata. Part of the goal is to make this metadata signable. However, we learned that protobuf binary serialization is not deterministic between implementations: https://groups.google.com/a/opencontainers.org/d/msg/dev/xo4SQ92aWJ8/ad1-xm9qCAAJ
>
> Is there any hope of that being tackled in protobuf3

Unlikely.

> or there being at least a mode to say "I really want the deterministic serialization"?

Maybe. There have been requests to add such serialization support, though we haven't put much effort into it because it's a low priority compared to the many other things we are working on.

> It isn't a huge blocker for us but it does mean the same spec serialized in a python tool and java tool MAY be different which is a bit annoying.

The nondeterminism comes from unknown fields and from a new feature, protobuf maps. If you can guarantee there are no such fields in your proto, the protobuf library will always serialize the other fields ordered by field number and thus should output the same bytes. This is what most people rely on when they need to fingerprint protobuf serialized data or use it as keys.
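
To make the field-number ordering point concrete, here is a minimal sketch in plain Python. It does not use the protobuf library; it hand-encodes two wire types (varint and length-delimited) for a hypothetical message with field 1 as a string and field 2 as an integer. The helper names and the message layout are assumptions made for illustration only.

```python
import hashlib


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # set the continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_field(number: int, value) -> bytes:
    """Encode one field: ints as wire type 0, str/bytes as wire type 2."""
    if isinstance(value, int):
        return encode_varint((number << 3) | 0) + encode_varint(value)
    data = value.encode("utf-8") if isinstance(value, str) else value
    return encode_varint((number << 3) | 2) + encode_varint(len(data)) + data


def serialize(fields: dict) -> bytes:
    """Serialize {field_number: value}, always emitting in field-number order."""
    return b"".join(encode_field(n, fields[n]) for n in sorted(fields))


# Hypothetical message: field 1 = name (string), field 2 = size (varint).
a = serialize({1: "rootfs", 2: 150})
b = serialize({2: 150, 1: "rootfs"})  # same content, built in a different order

fingerprint_a = hashlib.sha256(a).hexdigest()
fingerprint_b = hashlib.sha256(b).hexdigest()
```

Because both calls emit fields sorted by field number, `a` and `b` are byte-identical and so are their SHA-256 fingerprints. This only holds under the conditions described above: no unknown fields and no map fields.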
 


Walter Schulze

Oct 24, 2015, 8:48:38 AM
to Protocol Buffers, brandon...@coreos.com, vba...@gmail.com
Go also sorts the map keys.
I heard that was because that is what the C++ implementation does.

So do all implementations sort their fields by field number?
Which languages does "all" include?

Feng Xiao

Oct 26, 2015, 1:20:04 PM
to Walter Schulze, Protocol Buffers, brandon...@coreos.com, vba...@gmail.com
On Sat, Oct 24, 2015 at 5:48 AM, Walter Schulze <awalter...@gmail.com> wrote:
> Go also sorts the map keys.
> I heard it was, because that is what the C++ implementation is doing.

That's not what C++ does. Go is probably sorting keys for a different reason.

> So do all implementations sort their fields by field number?

No. As far as I am aware, only Go does sorting.
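
As a rough illustration of why map key order matters for byte-level equality, the sketch below hand-encodes a hypothetical `map<string, int32>` field (field number 3) in plain Python, without the protobuf library. On the wire, each map entry is a nested message with the key as field 1 and the value as field 2. The function names and the `sort_keys` switch are assumptions for this example, not an API of any protobuf implementation.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_len_delimited(number: int, payload: bytes) -> bytes:
    """Encode a length-delimited field (wire type 2)."""
    return encode_varint((number << 3) | 2) + encode_varint(len(payload)) + payload


def encode_map_entry(key: str, value: int) -> bytes:
    """Encode one map entry as a nested message: key = field 1, value = field 2."""
    return (encode_len_delimited(1, key.encode("utf-8"))
            + encode_varint((2 << 3) | 0) + encode_varint(value))


def encode_map(number: int, entries: dict, sort_keys: bool) -> bytes:
    """Encode every entry of a map field, optionally in sorted key order."""
    keys = sorted(entries) if sort_keys else list(entries)
    return b"".join(
        encode_len_delimited(number, encode_map_entry(k, entries[k])) for k in keys
    )


# Same logical map, built in a different insertion order.
labels_x = {"arch": 1, "os": 2}
labels_y = {"os": 2, "arch": 1}

unsorted_differs = (encode_map(3, labels_x, sort_keys=False)
                    != encode_map(3, labels_y, sort_keys=False))
sorted_matches = (encode_map(3, labels_x, sort_keys=True)
                  == encode_map(3, labels_y, sort_keys=True))
```

Both encodings decode to the same map, but only the sorted variant yields identical bytes for `labels_x` and `labels_y`. An implementation that iterates its map in hash or insertion order can therefore produce different bytes for equal messages.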

Paweł Szczur

Feb 4, 2018, 6:16:55 AM
to Protocol Buffers
Hi,

What did you settle on?
I googled and found: http://www.idpf.org/epub/30/spec/epub30-ocf.html#sec-container-metainf-signatures.xml
Am I correct that you're using XML?

Cheers,

Paweł