int16/uint16 and int8/uint8


mmas...@gmail.com

Jul 15, 2008, 10:07:10 PM
to Protocol Buffers
Have these data types been intentionally left out because of the data
alignment issues they would cause if added?
I am wondering why.
Certain stats are kept as uint16 and uint8 in resource-constrained
systems, and it would help
to have these types instead of promoting them to uint32.
Just a thought.

Alek Storm

Jul 16, 2008, 1:01:27 AM
to Protocol Buffers
All number types except fixed, double, and float are stored as
varints, meaning as long as your numbers stay in about the int8 or
int16 range, they are only 1 or 2 bytes, even though it says int32.
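
For readers who want to see the varint rule in action, here is a minimal sketch of base-128 encoding for non-negative values. The helper name `encode_varint` is made up for illustration and is not part of any protobuf library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low 7 bits first."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)         # last byte: continuation bit clear
            return bytes(out)


for n in (1, 127, 128, 300):
    encoded = encode_varint(n)
    print(n, encoded.hex(), len(encoded))
```

Each byte carries 7 payload bits, so the encoded size depends on the value rather than on the declared field type.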

James Bruce

Jul 16, 2008, 3:10:41 AM
to Alek Storm, mmas...@gmail.com, Protocol Buffers

Nevertheless, it would be nice to have them for the in-memory version.
My use cases have variables that would almost never exceed a single
digit, so it would be nice to keep those small (and sorted to minimize
waste from alignment). The nice thing about all those types being
varint on the wire is that you can always safely promote a type and
keep compatibility, which also means that you don't have to go
overboard with future-proofing (i.e. making everything 64 bit "just in
case").
- Jim
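
The in-memory point about small fields and alignment can be sketched with Python's `ctypes`, which lays structs out using the platform's native C rules. The three struct types and their field names are hypothetical; the exact sizes of the first two depend on the platform ABI.

```python
import ctypes


class Unsorted(ctypes.Structure):
    # Small fields interleaved with a 32-bit field: padding is likely.
    _fields_ = [
        ("a", ctypes.c_uint8),
        ("b", ctypes.c_uint32),
        ("c", ctypes.c_uint8),
        ("d", ctypes.c_uint16),
    ]


class Sorted(ctypes.Structure):
    # Same fields, ordered from widest to narrowest to reduce padding.
    _fields_ = [
        ("b", ctypes.c_uint32),
        ("d", ctypes.c_uint16),
        ("a", ctypes.c_uint8),
        ("c", ctypes.c_uint8),
    ]


class Promoted(ctypes.Structure):
    # Every field promoted to 32 bits, as happens with uint32-only schemas.
    _fields_ = [(name, ctypes.c_uint32) for name in ("a", "b", "c", "d")]


for struct_type in (Unsorted, Sorted, Promoted):
    print(struct_type.__name__, ctypes.sizeof(struct_type))
```

On common platforms the sorted layout is the smallest of the three and the promoted one the largest.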

Marc Gravell

Jul 16, 2008, 3:29:11 AM
to Protocol Buffers
> as long as your numbers stay in about the int8 or
> int16 range, they are only 1 or 2 bytes

Well, for data defined as [.proto] uint32 and sint32 then yes; but for
[.proto] int32, negative numbers will be larger, even in the [regular]
int8/int16 range; not a problem when talking about [regular] uint8 /
uint16, though.
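
A rough sketch of the size difference, assuming the behavior described in the protobuf encoding documentation: a negative int32 is sign-extended to 64 bits before varint encoding, while sint32 is ZigZag-mapped first. The function names here are illustrative only.

```python
def varint_len(unsigned_value: int) -> int:
    """Number of bytes a non-negative integer needs as a base-128 varint."""
    length = 1
    while unsigned_value > 0x7F:
        unsigned_value >>= 7
        length += 1
    return length


def int32_wire_len(n: int) -> int:
    # Assumption: negative int32 values are sign-extended to 64 bits,
    # so the varint sees a very large unsigned number.
    return varint_len(n & 0xFFFFFFFFFFFFFFFF)


def zigzag32(n: int) -> int:
    # ZigZag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def sint32_wire_len(n: int) -> int:
    return varint_len(zigzag32(n))


for n in (100, -1, -128):
    print(n, "int32:", int32_wire_len(n), "sint32:", sint32_wire_len(n))
```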

> The nice thing about all those types being
> varint on the wire is that you can always safely promote a type and
> keep compatibility.

Again, watch for negatives if changing the .proto from int32 to int64, and
watch for changing between (for example) sint32 and int64. But sint32
=> sint64 and uint32 => uint64 seem wire-compatible.

Marc
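
To illustrate why switching between sint32 and int64 is risky, here is a small sketch that works on the integer payload only (the varint byte layer is the same for all of these types, so it is omitted). The helper names are hypothetical.

```python
def zigzag_encode(n: int) -> int:
    """ZigZag-map a signed value (within 64-bit range) to a non-negative payload."""
    return (n << 1) ^ (n >> 63)


def zigzag_decode(payload: int) -> int:
    """Inverse of zigzag_encode: how a sint32/sint64 reader interprets the payload."""
    return (payload >> 1) ^ -(payload & 1)


def as_int64(payload: int) -> int:
    """How an int64 reader interprets the payload: plain two's complement."""
    return payload - (1 << 64) if payload >= (1 << 63) else payload


written = -5
payload = zigzag_encode(written)                 # what a sint32 writer emits
print("read as sint64:", zigzag_decode(payload))  # same value back
print("read as int64: ", as_int64(payload))       # a different, positive value
```

The sint32 to sint64 change round-trips because both sides agree on the ZigZag mapping; the sint32 to int64 change does not.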

Marc Gravell

Jul 16, 2008, 3:34:53 AM
to Protocol Buffers
> they are only 1 or 2 bytes
At the extremes, an int16 is surely 3 bytes? Anything that uses the 2
most significant bits is going to take us to 3 bytes under base-128?

Marc

Kenton Varda

Jul 16, 2008, 1:38:02 PM
to mmas...@gmail.com, Protocol Buffers
As it stands, the protobuf C++ library is probably not well-suited to conditions where you would want to use 16-bit numbers to save memory.  Perhaps someone should write a version of protobufs designed for extremely-memory-constrained systems.

Alek Storm

Jul 16, 2008, 2:16:45 PM
to Protocol Buffers
On Jul 16, 2:10 am, "James Bruce" <james.br...@gmail.com> wrote:
> Nevertheless, it would be nice to have them for the in-memory version.
> My use cases have variables that would almost never exceed a single
> digit, so it would be nice to keep those small (and sorted to minimize
> waste from alignment). The nice thing about all those types being
> varint on the wire is that you can always safely promote a type and
> keep compatibility, which also means that you don't have to go
> overboard with future-proofing (i.e. making everything 64 bit "just in
> case").

Hm. You could run an incoming message through the deserializer,
convert that data to int8/int16's, store it along with the others,
repeat. That way, only the incoming message has int32-wide fields.
But that only works if you're storing a lot of messages with a few
fields each. Here we have yet another use case for a streaming
deserializer - int32's could get squeezed into int8/int16's on the
fly, instead of waiting for the whole message to be parsed.
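
A minimal sketch of the narrowing idea, using Python's `array` module as a stand-in for compact storage. It assumes the values have already been parsed out of a message as ordinary integers; `narrow_to_int8` is a made-up helper, not a protobuf API.

```python
from array import array

INT8_MIN, INT8_MAX = -128, 127


def narrow_to_int8(values):
    """Copy parsed integers into a 1-byte-per-element array, rejecting overflow."""
    out = array("b")  # typecode "b": signed char, one byte per element
    for v in values:
        if not INT8_MIN <= v <= INT8_MAX:
            raise OverflowError(f"{v} does not fit in int8")
        out.append(v)
    return out


parsed_fields = [3, 7, -2, 120]  # pretend these came out of a parsed message
compact = narrow_to_int8(parsed_fields)
print(compact.itemsize * len(compact), "bytes of element storage")
```

The range check matters: a value that was legal as an int32 on the wire may not fit the narrower in-memory type.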
From my calculations, numbers <= 127 will take one byte, numbers <=
16,383 (127+127*128) will take two, numbers <= 2,097,151
(127+127*128+127*128^2) will take three. The unsigned 8-bit and 16-bit
ranges go up to 255 and 65,535, respectively. So my mistake:
16-bit values from 16,384 to 65,535 will take three bytes.
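
The thresholds above can be checked with a short sketch. Each varint byte carries 7 payload bits, so n bytes hold values up to 2^(7n) - 1. The function name `varint_size` is illustrative.

```python
def varint_size(value: int) -> int:
    """Bytes needed to store a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    # ceil(bit_length / 7), with a minimum of one byte for zero
    return max(1, (value.bit_length() + 6) // 7)


for nbytes in (1, 2, 3):
    print(nbytes, "byte(s) hold values up to", (1 << (7 * nbytes)) - 1)

for value in (127, 128, 255, 16383, 16384, 65535):
    print(value, "->", varint_size(value), "byte(s)")
```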

Gregory P. Smith

Jul 16, 2008, 11:34:36 PM
to Kenton Varda, mmas...@gmail.com, Protocol Buffers
It could be more useful once support for packed repeated fields is added, since I could imagine people actually wanting to use large repeated fields of small numbers.

-gps
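
A rough size comparison for a repeated field of small numbers, assuming the packed layout as it was later documented for the wire format: one key with wire type 2, a length, then the varints back to back. All function names are hypothetical.

```python
def encode_varint(value: int) -> bytes:
    """Base-128 varint for a non-negative integer."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_unpacked(field_number: int, values) -> bytes:
    # One key (wire type 0 = varint) in front of every element.
    key = encode_varint((field_number << 3) | 0)
    return b"".join(key + encode_varint(v) for v in values)


def encode_packed(field_number: int, values) -> bytes:
    # One key (wire type 2 = length-delimited), one length, then all elements.
    payload = b"".join(encode_varint(v) for v in values)
    key = encode_varint((field_number << 3) | 2)
    return key + encode_varint(len(payload)) + payload


small_numbers = [7] * 100
print("unpacked:", len(encode_unpacked(1, small_numbers)), "bytes")
print("packed:  ", len(encode_packed(1, small_numbers)), "bytes")
```

With single-byte values, the per-element key roughly doubles the unpacked size, which is where packing helps most.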