Hello,
I'd like to find out whether any prior work has been done on customizing protobuf compilation to support message encoding/decoding against a legacy wire format.
Put another way, I'm interested in:
1. specifying an existing protocol using protobuf's .proto file syntax, and
2. reusing protobuf's .proto file parsing and code generation infrastructure, while
3. replacing protobuf's default encoding algorithm with one that conforms to an existing format.
This discussion from 2013 is the closest thing I've found to a similar question on this mailing list. Unfortunately it doesn't go into much detail:
Some context will probably be of use. The existing wire format in question is that of Bitcoin's peer-to-peer network protocol. These messages and their binary representations are defined in this document:
Note that protocol buffers were considered for use during Bitcoin's initial development, but rejected on concerns around complexity and security:
Whether or not those concerns were well-founded, Bitcoin's resulting wire format works well today, and for this reason, changing it is not considered to be an option.
The impetus for this question, then, is that a growing number of implementations of the Bitcoin protocol are under development today, and in order to participate in the peer-to-peer network, each must faithfully re-implement the handling of this custom wire format. Typically this work is done through a combination of studying the documentation above and carefully transcribing code from the Bitcoin Core reference implementation. This creates a significant barrier to entry as well as a potential source of bugs that can threaten network stability.
To avoid this tedious and error-prone work, there is a desire to codify the message formats in such a way that language-specific bindings may be generated rather than hand-coded.
The encoding algorithm and code generation for each specific language would of course have to be custom developed, but the idea is to do so within an otherwise widely-used framework such as protocol buffers, so that as little as possible has to be re-invented.
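To make the gap between the two encodings concrete, here is a minimal Python sketch comparing how each one serializes a variable-length unsigned integer: protobuf's base-128 varint versus the length-prefix integer used in Bitcoin's protocol (often called CompactSize). The function names are my own and purely illustrative; this is not code from either project.

```python
import struct


def protobuf_varint(n: int) -> bytes:
    """Encode an unsigned integer as a protobuf base-128 varint."""
    if n < 0:
        raise ValueError("only unsigned values are handled in this sketch")
    out = bytearray()
    while True:
        low7 = n & 0x7F          # take the lowest 7 bits
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(low7)         # last byte: continuation bit clear
            return bytes(out)


def bitcoin_compact_size(n: int) -> bytes:
    """Encode an unsigned integer as a Bitcoin variable-length integer."""
    if n < 0 or n > 0xFFFFFFFFFFFFFFFF:
        raise ValueError("value does not fit in 64 bits")
    if n < 0xFD:
        return struct.pack("<B", n)            # single byte, no prefix
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)  # 0xfd + uint16, little-endian
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)  # 0xfe + uint32, little-endian
    return b"\xff" + struct.pack("<Q", n)      # 0xff + uint64, little-endian


if __name__ == "__main__":
    for value in (1, 252, 253, 300, 70000):
        print(value, protobuf_varint(value).hex(), bitcoin_compact_size(value).hex())
```

The same logical value produces different bytes under the two schemes, which is why a generated encoder would need to emit something other than protobuf's stock serialization routines.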
I have not yet looked deeply at the extension points within protocol buffers to assess the feasibility of this idea. I have seen that protoc supports plugins [1], but I don't know whether anyone has gone so far with them as to replace fundamental assumptions about the wire format. I have also noticed "Custom Options" [2], which may help in expressing particular quirks or nuances of the existing protocol within .proto files.
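As a rough illustration of what I have in mind for such annotations, the Python sketch below describes Bitcoin's 24-byte message header as a list of fields, each tagged with a legacy encoding, and runs a generic encoder over that list. The plain Python list is only a stand-in for the descriptors and custom options a real protoc plugin would receive, and the annotation names (`uint32_le`, `fixed_ascii`, `fixed_bytes`) are hypothetical. The header layout itself (magic, NUL-padded 12-byte command, payload length, first four bytes of a double SHA-256 of the payload) follows the protocol documentation.

```python
import hashlib
import struct

# Hypothetical per-field annotations: (field name, legacy encoding, fixed size).
# In a real setup these would come from custom options in a .proto file.
HEADER_FIELDS = [
    ("magic", "uint32_le", None),
    ("command", "fixed_ascii", 12),
    ("length", "uint32_le", None),
    ("checksum", "fixed_bytes", 4),
]


def encode_field(kind, size, value):
    """Encode one value according to its (hypothetical) legacy annotation."""
    if kind == "uint32_le":
        return struct.pack("<I", value)
    if kind == "fixed_ascii":
        raw = value.encode("ascii")
        if len(raw) > size:
            raise ValueError("string too long for fixed-width field")
        return raw.ljust(size, b"\x00")  # pad with NUL bytes to the fixed width
    if kind == "fixed_bytes":
        if len(value) != size:
            raise ValueError("byte string has the wrong length")
        return bytes(value)
    raise ValueError("unknown encoding annotation: " + kind)


def encode_message(fields, values):
    """Walk the annotated field list in order and concatenate the encodings."""
    return b"".join(encode_field(kind, size, values[name])
                    for name, kind, size in fields)


def bitcoin_header(command, payload, magic=0xD9B4BEF9):
    """Build a message header; the default magic is the main network value."""
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return encode_message(HEADER_FIELDS, {
        "magic": magic,
        "command": command,
        "length": len(payload),
        "checksum": checksum,
    })


if __name__ == "__main__":
    # "verack" carries no payload, so the header is the entire message.
    print(bitcoin_header("verack", b"").hex())
```

The point of the sketch is only that the field list, not the encoder, carries the protocol-specific knowledge; whether protobuf's descriptors and options can carry that information cleanly is exactly what I am unsure about.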
At this point, I'd simply like to see whether anyone has been down this road before, and whether there are reasons for dismissing the idea completely before digging in too much further.
- Chris
P.S. Please note that in posting this question I am in no way presuming to represent the Bitcoin Core development team.