Encoding; sfixed32; sfixed64; groups

Marc Gravell

Jul 20, 2008, 3:45:27 AM
to Protocol Buffers
The encoding document doesn't mention the encoding rules for sfixed32/
sfixed64 - does this use the same logic as sint32/sint64? Are these
used in reality? Maybe I'm missing the point of them...

Groups: exactly /how/ deprecated are these? Are they "don't encode
with them, but you might want to know how to decode for compatibility
with old payloads", or are they "you'll never see them; don't support
them"...?

Marc

Torbjörn Gyllebring

Jul 20, 2008, 10:48:33 AM
to Marc Gravell, Protocol Buffers

I've interpreted "fixed32/64" as "no encoding at all, just x bits". Basically, the serializer sends 32 bits, and the deserializer takes them and interprets them as whatever the target type happens to be, e.g. a 32-bit int, uint, or float.

Marc Gravell

Jul 20, 2008, 1:19:29 PM
to Protocol Buffers
> I've interpreted "fixed32/64" as "no encoding at all just x bits".

Oh, absolutely. But my question was about sfixed32 / sfixed64 - the
"s" being important; i.e. does this do the same bitwise pre-processing
[shift/xor] as an sint32/sint64 would, but then encode in a fixed
space?

Marc

Torbjörn Gyllebring

Jul 20, 2008, 1:26:30 PM
to Protocol Buffers
I would guess no based on the simple fact that if you look at the
native types in the Language Guide you'll find that

fixed32 == uint32
sfixed32 == int32

both for C++, which suggests that the name basically just indicates what
the *generator* should output and is not actually part of the encoding.

Alek Storm

Jul 20, 2008, 1:48:42 PM
to Protocol Buffers
On Jul 20, 2:45 am, Marc Gravell <marc.grav...@gmail.com> wrote:
> Groups: exactly /how/ deprecated are these? Are they "don't encode
> with them, but you might want to know how to decode for compatibility
> with old payloads", or are they "you'll never see them; don't support
> them"...?

I'm not sure if you're also asking *how* groups are encoded, but if
you are, it's just like nested messages.

Marc Gravell

Jul 20, 2008, 2:39:51 PM
to Protocol Buffers
> fixed32 == uint32
> sfixed32 == int32

Ahh! Right. Sorry, my bad - I was thinking of fixed32 as the regular
32-bit int32 representation, in the same way that int32 is the
varint... but mapping fixed32 to uint32 and sfixed32 to int32 makes
sense now. Kinda wish they'd called them ufixed32 etc, though!

Cheers,

Marc

Kenton Varda

Jul 20, 2008, 3:45:27 PM
to Marc Gravell, Protocol Buffers
Yikes, there's a lot of misinformation in this thread.

fixed32 and fixed64 are unsigned integers encoded directly as 4 or 8 bytes on the wire, in little-endian order.  sfixed32 and sfixed64 are the same, but signed -- negative numbers just use two's complement.  ZigZag encoding is explicitly designed for use with varint-encoded values, so it doesn't make sense to use it here.  Sorry about the naming; it's the result of gradual evolution.  Proto1 only had fixed32 and fixed64, which were unsigned from the start.  The "sfixed" variants were added in proto2.
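
As a rough illustration of the paragraph above, here is a minimal Python sketch using only the standard `struct` module. The helper names (`encode_fixed32`, `encode_sfixed32`, `zigzag32`, `encode_varint`, and so on) are hypothetical and not taken from any protobuf library; the ZigZag and varint helpers are included only to contrast sint32 with sfixed32.

```python
import struct


def encode_fixed32(value: int) -> bytes:
    # Unsigned 32-bit integer, little-endian, always 4 bytes.
    return struct.pack("<I", value)


def encode_sfixed32(value: int) -> bytes:
    # Signed 32-bit integer, little-endian two's complement, always 4 bytes.
    return struct.pack("<i", value)


def encode_fixed64(value: int) -> bytes:
    # Unsigned 64-bit integer, little-endian, always 8 bytes.
    return struct.pack("<Q", value)


def encode_sfixed64(value: int) -> bytes:
    # Signed 64-bit integer, little-endian two's complement, always 8 bytes.
    return struct.pack("<q", value)


def zigzag32(value: int) -> int:
    # ZigZag mapping applied to sint32 before varint encoding:
    # 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def encode_varint(value: int) -> bytes:
    # Base-128 varint: 7 payload bits per byte, high bit set on all but the last.
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


# sfixed32 keeps -1 as four 0xff bytes; no ZigZag step is involved.
print(encode_sfixed32(-1).hex())              # ffffffff
# sint32 ZigZags -1 to 1 first, then varint-encodes it into a single byte.
print(encode_varint(zigzag32(-1)).hex())      # 01
```

The fixed-width forms always cost 4 or 8 bytes regardless of the value, which is why a mapping designed to keep small magnitudes short buys nothing there.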

Groups are *not* encoded like nested messages.  Nested messages are encoded as a length followed by the message contents (the "length-delimited" wire type).  Groups are encoded as a start-group tag, followed by the message contents, followed by an end-group tag.
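
To make the difference visible on the wire, the following is a small Python sketch under a few assumptions: a tag is the varint of `(field_number << 3) | wire_type`, with wire types 2 (length-delimited), 3 (start group) and 4 (end group). The function names and the field numbers used are made up for illustration.

```python
WIRE_VARINT = 0
WIRE_LENGTH_DELIMITED = 2
WIRE_START_GROUP = 3
WIRE_END_GROUP = 4


def encode_varint(value: int) -> bytes:
    # Base-128 varint: 7 payload bits per byte, high bit set on all but the last.
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    # A tag packs the field number and the wire type into one varint.
    return encode_varint((field_number << 3) | wire_type)


def encode_nested_message(field_number: int, payload: bytes) -> bytes:
    # Nested message: tag, then the payload length, then the payload itself.
    return (encode_tag(field_number, WIRE_LENGTH_DELIMITED)
            + encode_varint(len(payload))
            + payload)


def encode_group(field_number: int, payload: bytes) -> bytes:
    # Group: start-group tag, the payload, then an end-group tag. No length.
    return (encode_tag(field_number, WIRE_START_GROUP)
            + payload
            + encode_tag(field_number, WIRE_END_GROUP))


# Inner contents: hypothetical field 1 (varint) holding the value 150.
inner = encode_tag(1, WIRE_VARINT) + encode_varint(150)

print(encode_nested_message(3, inner).hex())  # 1a03089601
print(encode_group(3, inner).hex())           # 1b0896011c
```

Note that the group form never needs to know the payload size up front, whereas the nested-message form must compute it before writing the first payload byte.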

Groups are still used by many protocols inside Google, so all implementations of protobufs that we use internally had to support them.  It's possible that some of these protocols will be exposed publicly at some point (though I don't know of any specific examples), which would mean that external implementations need to support them too.  However, any *new* protocols should definitely not use them.  It's probably safe to not implement them for now, although you should at least recognize the wire type and be able to ignore groups seen on the wire, for forwards-compatibility.
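
For the "recognize the wire type and ignore it" case, a skipping routine might look roughly like the Python sketch below. It assumes the usual wire types (0 varint, 1 64-bit, 2 length-delimited, 3 start group, 4 end group, 5 32-bit); `read_varint` and `skip_field` are hypothetical names, and there is no bounds checking, so treat it as an outline rather than production code.

```python
def read_varint(buf: bytes, pos: int):
    # Decode a base-128 varint starting at pos; return (value, new_pos).
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def skip_field(buf: bytes, pos: int, wire_type: int, field_number: int) -> int:
    # pos points just past the field's tag; return the position after the field.
    if wire_type == 0:                      # varint
        _, pos = read_varint(buf, pos)
        return pos
    if wire_type == 1:                      # 64-bit fixed
        return pos + 8
    if wire_type == 2:                      # length-delimited
        length, pos = read_varint(buf, pos)
        return pos + length
    if wire_type == 5:                      # 32-bit fixed
        return pos + 4
    if wire_type == 3:                      # start group
        # Skip inner fields one by one until the matching end-group tag.
        while True:
            key, pos = read_varint(buf, pos)
            inner_field, inner_type = key >> 3, key & 7
            if inner_type == 4:
                if inner_field != field_number:
                    raise ValueError("mismatched end-group tag")
                return pos
            pos = skip_field(buf, pos, inner_type, inner_field)
    raise ValueError(f"unexpected wire type {wire_type}")


# Group in field 3 containing field 1 = 150, followed by field 2 = 1.
data = bytes.fromhex("1b0896011c1001")
key, pos = read_varint(data, 0)
pos = skip_field(data, pos, key & 7, key >> 3)
print(pos)  # 5, the offset of the tag for field 2
```

Because a group carries no length, the only way to skip it is to walk its contents; nested groups fall out naturally from the recursion.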

Marc Gravell

Jul 21, 2008, 12:33:25 AM
to Protocol Buffers
Thanks for the info; sorry about my confusion re sfixed32 etc; it was
my fault for seeing the "s" and thinking "ZigZag"... I am clear now.

The funny thing is that the "group" encoding might then be just the
thing needed to implement a firehose (forwards only) data stream from a
data-source like IQueryable<T> / IEnumerable<T> - since the problem
with length-prefix is that it gets very tricky with nested objects. I
doubt I'll get around to this at any point soon, but I'd love a little
more detail on what this would look like on the wire - if only so that
I can correctly jump the grouped data (at the moment I just throw a
NotSupportedException, breaking the deserialize completely). Maybe
I'll stick my nose into the java code to find out (I'm not feeling
brave enough to read the C++)...

Marc

Jon Skeet

Jul 21, 2008, 2:01:17 AM
to Protocol Buffers
On Jul 21, 5:33 am, Marc Gravell <marc.grav...@gmail.com> wrote:
> Maybe
> I'll stick my nose into the java code to find out (I'm not feeling
> brave enough to read the C++)...

I suspect I've ported most of that functionality by now, so if you're
more comfortable reading C# you could look at the code on Git...
admittedly that's not the "tried and tested" code that the Java and
C++ are :)

Jon