Binary protocol Wire Type 7, proposal to use for padding


James F. Bellinger

Oct 17, 2025, 11:32:33 AM
to Protocol Buffers
Hello,

We use Protocol Buffers for a portion of our settings on one of our products, in particular those that the device rewrites.

One issue we encountered: we are in an embedded setting with limited RAM and write to flash memory in 256-byte pages, so we did not want structures to straddle page boundaries. Allowing that would complicate writing and require more buffering.

So, we extended the binary protocol by making Wire Type 7 (Field Number 0) -- effectively just the byte 0x07 -- padding. All the parser needs to do is ignore it and skip to the next byte.
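
To illustrate the reader side, here is a minimal sketch in Python (not our firmware code). The names `read_varint` and `parse_fields` are made up for this example, and only wire types 0 and 2 are handled; the one relevant step is the check for the byte 0x07 where a tag would start.

```python
PAD_BYTE = 0x07  # tag byte for field number 0, wire type 7, treated as padding


def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_fields(buf):
    """Yield (field_number, wire_type, value), ignoring pad bytes.

    Sketch only: handles wire type 0 (varint) and 2 (length-delimited).
    """
    pos = 0
    while pos < len(buf):
        if buf[pos] == PAD_BYTE:
            pos += 1  # padding: skip this byte and look at the next one
            continue
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value = bytes(buf[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, wire_type, value
```

Padding is only recognized where a tag is expected, so a 0x07 byte inside a varint value or a length-delimited payload is left alone.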

Having done this, once we recognize that our next part of the message will overlap a page boundary, we pad with 0x07 to that boundary, write the page to flash memory, reset our write pointer, and continue writing.
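
The write side can be sketched roughly as follows, again in Python for illustration only. `PagedWriter` and the `flush_page` callback are hypothetical names standing in for the flash driver, and each `chunk` is assumed to be one complete encoded field no larger than a page.

```python
PAGE_SIZE = 256  # flash page size in bytes
PAD_BYTE = 0x07  # padding byte (field number 0, wire type 7)


class PagedWriter:
    """Buffers one page and pads it when the next chunk would straddle a page."""

    def __init__(self, flush_page):
        self.flush_page = flush_page  # hypothetical callback: writes one full page
        self.page = bytearray()

    def write(self, chunk):
        if len(chunk) > PAGE_SIZE:
            raise ValueError("chunk larger than one page")
        if len(self.page) + len(chunk) > PAGE_SIZE:
            self._pad_and_flush()  # chunk would cross the boundary
        self.page += chunk
        if len(self.page) == PAGE_SIZE:
            self._pad_and_flush()  # page is exactly full, no padding needed

    def _pad_and_flush(self):
        self.page += bytes([PAD_BYTE]) * (PAGE_SIZE - len(self.page))
        self.flush_page(bytes(self.page))
        self.page = bytearray()  # reset the write pointer

    def close(self):
        """Pad and flush whatever is left in the last partial page."""
        if self.page:
            self._pad_and_flush()
```

Only one page of buffering is needed, and every page handed to `flush_page` begins at a field boundary.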

As you can see, with this small extension, it's now possible to trivially stream Protocol Buffers data *while maintaining alignment*.

I think this would also have uses in a PC context. For a large binary field holding a picture or audio, the writer could optionally pad so that the payload starts on an 8 or 16 byte boundary, and then a memcpy out of the stream could be much faster. Similarly, because the data is already aligned, a game could use the contained art resources directly from the Protocol Buffer memory, without any copy at all.
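
As a rough sketch of that idea (Python, illustrative only): the writer works out how long the tag and length prefix will be, then emits enough 0x07 bytes first so that the payload itself lands on the boundary. `append_aligned_bytes_field` is a hypothetical helper, and alignment here is relative to the start of the stream, so the buffer holding the stream would itself need to be aligned in memory.

```python
PAD_BYTE = 0x07  # padding byte (field number 0, wire type 7)


def encode_varint(value):
    """Canonical base-128 varint encoding of a non-negative integer."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def append_aligned_bytes_field(stream, field_number, payload, alignment=16):
    """Append a length-delimited field so its payload starts on an aligned offset.

    `stream` is a bytearray and is modified in place.
    Returns the offset at which the payload begins.
    """
    header = encode_varint((field_number << 3) | 2) + encode_varint(len(payload))
    payload_start = len(stream) + len(header)
    pad = -payload_start % alignment  # bytes needed to reach the next boundary
    stream += bytes([PAD_BYTE]) * pad
    stream += header
    stream += payload
    return len(stream) - len(payload)
```

The padding goes before the tag rather than between the length and the payload, so the field itself stays an ordinary length-delimited field.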

(As an aside, one possibility would be to include a pad length in the Field Number, to tell the parser how many bytes to skip. We did not do this, because it invites a writer to skip over those bytes without initializing them, leaking arbitrary data into the stream. Also, most padding is very short, so the extra complication when reading would be unlikely to gain any speed.)

Please consider this. It's been very useful to us, and I think it would be useful to everyone else too.

Thank you!

God bless.

James Bellinger
Dimension Engineering

Samuel Benzaquen

Oct 17, 2025, 4:10:50 PM
to James F. Bellinger, Protocol Buffers
Thank you for the proposal.

In the past, we explored techniques to add padding for alignment (e.g., to allow aligned vectorized instructions on the payload) and as tombstones (e.g., removing a field from a buffer without having to rewrite the whole buffer).
However, changing the wire format is difficult because it is a backwards-incompatible change. There are very old parsers out there that will never be updated, and they won't know what to do with a new kind of tag.

On the other hand, you can artificially pad the payloads in a few ways:
 - Varint values do not have to be in their canonical encoding. For example, though normally the value 1 is encoded as the single byte `0x01`, you could add extra zero bits on top and encode it as the two bytes `0x81 0x00`. Note that tags are limited to 32 bits and value varints to 64 bits.
 - Take advantage of last-one-wins semantics and send duplicate values for non-repeated fields.
 - Designate one "dummy" field (e.g., `bool padding = 15;`) on messages that you can add to the payload. You can also combine this with the ideas above.
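
A minimal sketch of combining the first and third ideas, in Python for illustration. The helper names `encode_varint_padded` and `dummy_padding` are hypothetical, the field is assumed to be `bool padding = 15;` (tag byte `0x78`), and it assumes the receiving parser accepts non-canonical varints as described above. A single byte of padding cannot be produced this way, since a tag plus a value needs at least two bytes.

```python
PADDING_TAG = (15 << 3) | 0  # 0x78: field 15, wire type 0 (assumed dummy bool field)
MAX_VARINT_BYTES = 10        # a 64-bit value varint takes at most 10 bytes


def encode_varint_padded(value, width):
    """Encode value as a varint of exactly `width` bytes (non-canonical if longer
    than needed), by emitting extra zero groups with the continuation bit set."""
    if not 0 <= value < 1 << 64:
        raise ValueError("value out of range for a 64-bit varint")
    if not 1 <= width <= MAX_VARINT_BYTES:
        raise ValueError("width must be between 1 and 10")
    if value >> (7 * width):
        raise ValueError("value does not fit in the requested width")
    out = bytearray()
    for i in range(width):
        byte = (value >> (7 * i)) & 0x7F
        if i < width - 1:
            byte |= 0x80  # continuation bit on every byte except the last
        out.append(byte)
    return bytes(out)


def dummy_padding(n):
    """Return n bytes of padding made of `padding = false` fields (n == 0 or n >= 2)."""
    if n == 1:
        raise ValueError("cannot emit exactly one byte of padding")
    out = bytearray()
    remaining = n
    while remaining:
        chunk = min(remaining, 1 + MAX_VARINT_BYTES)
        if remaining - chunk == 1:
            chunk -= 1  # never leave a single byte over
        out.append(PADDING_TAG)
        out += encode_varint_padded(0, chunk - 1)
        remaining -= chunk
    return bytes(out)
```

For example, `encode_varint_padded(1, 2)` yields the bytes `0x81 0x00` mentioned above. Runs longer than 11 bytes repeat the field, relying on last-one-wins semantics.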

Hope these can be useful to you.

To view this discussion visit https://groups.google.com/d/msgid/protobuf/6db5d4a0-bce8-42e9-8396-f354a5276d58n%40googlegroups.com.