Repeated Fields Encoding

Timothy Parez

Feb 1, 2011, 6:01:21 AM
to Protocol Buffers
Hello,

Considering the following proto file:

message FileDescriptor
{
    required string Filename = 1;
    optional int64 Size = 2 [default = 0];
}

message FileList
{
    repeated FileDescriptor Files = 1;
}

If you create something like this (I'm duplicating the data because
that makes it easier to spot in a hex editor):

files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });

and then serialize it using the Protobuf.Serializer, I expected it to
generate something like:

Tag for the Files field of FileList -> Id 1, WireType 2 => 0x0A
Length of the payload (all the bytes for all the files that follow)

But instead I found that everything is simply repeated, once per file:

0A 0B 0A 07 41 41 41 41 41 41 41 10 64
0A 0B 0A 07 41 41 41 41 41 41 41 10 64
0A 0B 0A 07 41 41 41 41 41 41 41 10 64
0A 0B 0A 07 41 41 41 41 41 41 41 10 64
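
To make the dump above easier to read, here is a minimal sketch of a wire-format reader in Python. It is not part of any protobuf library; `read_varint` and `parse_fields` are hypothetical helper names, and only wire types 0 (varint) and 2 (length-delimited) are handled, since those are the only ones that appear in these bytes.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_fields(buf):
    """Split a message into (field_number, wire_type, value) tuples."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:            # varint value
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:          # length-delimited value
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


# The four identical 13-byte records from the hex dump.
record = "0A 0B 0A 07 41 41 41 41 41 41 41 10 64"
dump = bytes.fromhex(" ".join([record] * 4))

file_list = parse_fields(dump)
for number, wire_type, payload in file_list:
    # Each entry is field 1 (Files), wire type 2, with an 11-byte payload
    # that is itself an encoded FileDescriptor.
    print(number, wire_type, len(payload), parse_fields(payload))
```

Read this way, each 13-byte record is one element of Files: a 0x0A key (field 1, wire type 2), a length of 0x0B, and then the nested FileDescriptor (Filename as field 1, Size as field 2 with the varint 0x64 = 100). The per-element cost is the key plus the length prefix.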

I'm wondering, is this an implementation detail (and allowed by the
Protocol Buffers specification), or a requirement of the Google
Protocol Buffers specification?

It does seem to add quite a bit of overhead. Imagine the FileList has
other properties: would they be repeated for every instance of
FileDescriptor?

Or am I missing something?


Marc Gravell

Feb 1, 2011, 12:57:29 PM
to Timothy Parez, Protocol Buffers
I think this also came to me directly and I answered earlier, but this is the expected layout of "repeated" data: each item in the list is written to the data stream separately, with its own tag and length prefix.

Marc

Kenton Varda

Feb 1, 2011, 1:58:22 PM
to Timothy Parez, Protocol Buffers
The encoding is documented in detail here:


The short answer is, yes, repeated fields are literally encoded as repeated individual values, unless you use "packed" encoding.
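
As a rough illustration of the difference, the Python sketch below hand-encodes a hypothetical `repeated int32 Sizes = 1;` field both ways. The field and the helper names (`encode_varint`, `encode_key`, `encode_unpacked`, `encode_packed`) are assumptions made for this example and are not taken from the proto file in this thread. Note that packed encoding only applies to repeated fields of primitive numeric types, so a repeated message field such as Files is always written one element at a time.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_key(field_number, wire_type):
    """A field key is (field_number << 3) | wire_type, written as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_unpacked(field_number, values):
    """Unpacked: one key (wire type 0) plus one varint per element."""
    return b"".join(
        encode_key(field_number, 0) + encode_varint(v) for v in values
    )


def encode_packed(field_number, values):
    """Packed: one key (wire type 2), one length, then all varints."""
    payload = b"".join(encode_varint(v) for v in values)
    return encode_key(field_number, 2) + encode_varint(len(payload)) + payload


sizes = [100, 100, 100, 100]
unpacked = encode_unpacked(1, sizes)
packed = encode_packed(1, sizes)

print(unpacked.hex())  # 0864086408640864
print(packed.hex())    # 0a0464646464
```

In the unpacked form the key 0x08 is repeated for every element (8 bytes total here), while the packed form pays for the key and length once (6 bytes total). Either way, other fields of the enclosing message are written once and are not repeated per element.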
