New protobuf feature proposal: Generated classes for streaming / visitors


Kenton Varda

Feb 1, 2011, 1:45:00 PM
to Protocol Buffers, Pherl Liu, Jason Hsueh, Steven Knight
Hello open source protobuf users,

Background

Probably the biggest deficiency in the open source protocol buffers libraries today is a lack of built-in support for handling streams of messages.  True, it's not too hard for users to support it manually, by prefixing each message with its size as described here:

  http://code.google.com/apis/protocolbuffers/docs/techniques.html#stre...

However, this is awkward, and typically requires users to reach into the low-level CodedInputStream/CodedOutputStream classes and do a lot of work manually.  Furthermore, many users want to handle streams of heterogeneous message types.  We tell them to wrap their messages in an outer type using the "union" pattern:

  http://code.google.com/apis/protocolbuffers/docs/techniques.html#union

But this is kind of ugly and has unnecessary overhead.
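
For reference, the manual size-prefix technique mentioned above can be sketched in a few lines. This is a library-free illustration only: the payload strings stand in for serialized messages, and the helper names (AppendVarint, ReadVarint, WriteDelimited, ReadAllDelimited) are invented for this sketch and are not protobuf APIs. Real code would more likely go through CodedOutputStream::WriteVarint32 and CodedInputStream::ReadVarint32 with PushLimit/PopLimit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Appends `value` to `out` as a base-128 varint, the encoding protobuf
// uses for sizes.
void AppendVarint(uint64_t value, std::string* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Reads a varint starting at *pos and advances *pos past it. Returns false
// if the data is truncated or the varint is longer than 10 bytes.
bool ReadVarint(const std::string& data, size_t* pos, uint64_t* value) {
  uint64_t result = 0;
  for (int shift = 0; shift < 64 && *pos < data.size(); shift += 7) {
    const uint8_t byte = static_cast<uint8_t>(data[(*pos)++]);
    result |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {
      *value = result;
      return true;
    }
  }
  return false;
}

// Appends one record: its size as a varint, then the bytes themselves.
void WriteDelimited(const std::string& payload, std::string* stream) {
  assert(stream != nullptr);
  AppendVarint(payload.size(), stream);
  stream->append(payload);
}

// Splits a stream produced by WriteDelimited back into records.
// Returns false if the stream ends in the middle of a record.
bool ReadAllDelimited(const std::string& stream,
                      std::vector<std::string>* records) {
  size_t pos = 0;
  while (pos < stream.size()) {
    uint64_t size = 0;
    if (!ReadVarint(stream, &pos, &size)) return false;
    if (size > stream.size() - pos) return false;
    records->push_back(stream.substr(pos, static_cast<size_t>(size)));
    pos += static_cast<size_t>(size);
  }
  return true;
}
```

Even in this reduced form the caller has to manage varints and record boundaries by hand, which is the awkwardness described above.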

These problems never really came up in our internal usage, because inside Google we have an RPC system and other utility code which builds on top of protocol buffers and provides appropriate abstraction. While we'd like to open source this code, a lot of it is large, somewhat messy, and highly interdependent with unrelated parts of our environment, and no one has had the time to rewrite it all cleanly (as we did with protocol buffers itself).

Proposed solution:  Generated Visitors

I've been wanting to fix this for some time now, but didn't really have a good idea how.  CodedInputStream is annoyingly low-level, but I couldn't think of a much better interface for reading a stream of messages off the wire.

A couple weeks ago, though, I realized that I had been failing to consider how new kinds of code generation could help this problem.  I was trying to think of solutions that would go into the protobuf base library, not solutions that were generated by the protocol compiler.

So then it became pretty clear:  A protobuf message definition can also be interpreted as a definition for a streaming protocol.  Each field in the message is a kind of item in the stream.

  // A stream of Foo and Bar messages, and also strings.
  message MyStream {
    option generate_visitors = true;  // enables generation of streaming classes
    repeated Foo foo = 1;
    repeated Bar bar = 2;
    repeated string baz = 3;
  }

All we need to do is generate code appropriate for treating MyStream as a stream, rather than one big message.

My approach is to generate two interfaces, each with two provided implementations.  The interfaces are "Visitor" and "Guide".  MyStream::Visitor looks like this:

  class MyStream::Visitor {
   public:
    virtual ~Visitor();

    virtual void VisitFoo(const Foo& foo);
    virtual void VisitBar(const Bar& bar);
    virtual void VisitBaz(const std::string& baz);
  };

The Visitor class has two standard implementations:  "Writer" and "Filler".  MyStream::Writer writes the visited fields to a CodedOutputStream, using the same wire format as would be used to encode MyStream as one big message.  MyStream::Filler fills in a MyStream message object with the visited values.
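
To make the Writer/Filler distinction concrete, here is a hand-written sketch of what the two might do, reduced to the single string field `baz = 3`. Everything in it is an assumption for illustration: the classes are stand-ins written by hand rather than output of the patch, MyStreamData substitutes for the generated MyStream class, and the Writer appends to a std::string instead of a CodedOutputStream. The only part taken from protobuf itself is the wire format of a length-delimited field (tag byte, varint length, bytes).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hand-written stand-in for the generated visitor, reduced to `baz = 3`.
class Visitor {
 public:
  virtual ~Visitor() {}
  virtual void VisitBaz(const std::string& baz) = 0;
};

// Stand-in for the MyStream message class.
struct MyStreamData {
  std::vector<std::string> baz;
};

// "Filler": stores each visited value in an in-memory object.
class Filler : public Visitor {
 public:
  explicit Filler(MyStreamData* message) : message_(message) {
    assert(message != nullptr);
  }
  void VisitBaz(const std::string& baz) override {
    message_->baz.push_back(baz);
  }

 private:
  MyStreamData* message_;
};

// "Writer": appends each visited value in protobuf wire format, exactly as
// the field would appear inside one big serialized message.
class Writer : public Visitor {
 public:
  explicit Writer(std::string* output) : output_(output) {
    assert(output != nullptr);
  }
  void VisitBaz(const std::string& baz) override {
    // Tag: field number 3, wire type 2 (length-delimited) => 0x1A.
    output_->push_back(static_cast<char>((3 << 3) | 2));
    // Length as a varint.
    uint64_t size = baz.size();
    while (size >= 0x80) {
      output_->push_back(static_cast<char>((size & 0x7F) | 0x80));
      size >>= 7;
    }
    output_->push_back(static_cast<char>(size));
    // The bytes themselves.
    output_->append(baz);
  }

 private:
  std::string* output_;
};
```

Because both classes implement the same interface, whatever drives the visitor does not need to know whether the items end up on the wire or in an object.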

Meanwhile, Guides are objects that drive Visitors.

  class MyStream::Guide {
   public:
    virtual ~Guide();

    // Call the methods of the visitor on the Guide's data.
    virtual void Accept(MyStream::Visitor* visitor) = 0;

    // Just fill in a message object directly rather than use a visitor.
    virtual void Fill(MyStream* message) = 0;
  };

The two standard implementations of Guide are "Reader" and "Walker".  MyStream::Reader reads items from a CodedInputStream and passes them to the visitor.  MyStream::Walker walks over a MyStream message object and passes all the fields to the visitor.

To handle a stream of messages, simply attach a Reader to your own Visitor implementation.  Your visitor's methods will then be called as each item is parsed, kind of like "SAX" XML parsing, but type-safe.
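
A rough sketch of that usage follows. The types are hand-written stand-ins, not generated code: Foo, Bar and MyStreamData are plain structs, the visitor methods are assumed to default to no-ops, and a Walker-style guide is used to drive the visitor because it needs no wire parsing. A Reader would call the same visitor methods, only with items parsed from a CodedInputStream. LoggingVisitor is a hypothetical application class.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the message types.
struct Foo { int id; };
struct Bar { std::string name; };
struct MyStreamData {
  std::vector<Foo> foo;
  std::vector<Bar> bar;
  std::vector<std::string> baz;
};

// Stand-in for MyStream::Visitor. Methods default to no-ops here so that a
// subclass overrides only the items it cares about.
class Visitor {
 public:
  virtual ~Visitor() {}
  virtual void VisitFoo(const Foo&) {}
  virtual void VisitBar(const Bar&) {}
  virtual void VisitBaz(const std::string&) {}
};

// Walker-style guide: passes every field of an in-memory object to a visitor.
class Walker {
 public:
  explicit Walker(const MyStreamData* message) : message_(message) {
    assert(message != nullptr);
  }
  void Accept(Visitor* visitor) const {
    for (const Foo& foo : message_->foo) visitor->VisitFoo(foo);
    for (const Bar& bar : message_->bar) visitor->VisitBar(bar);
    for (const std::string& baz : message_->baz) visitor->VisitBaz(baz);
  }

 private:
  const MyStreamData* message_;
};

// Application-defined visitor: reacts to each item as it is delivered,
// with the static type of the item known at the call site.
class LoggingVisitor : public Visitor {
 public:
  void VisitFoo(const Foo& foo) override {
    log.push_back("foo " + std::to_string(foo.id));
  }
  void VisitBaz(const std::string& baz) override {
    log.push_back("baz " + baz);
  }
  std::vector<std::string> log;
};
```

Note that LoggingVisitor never sees Bar items, since it does not override VisitBar.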

Nonblocking I/O

The "Reader" type declared above is based on blocking I/O, but many users would prefer a non-blocking approach.  I'm less sure how to handle this, but my thought was that we could provide a utility class like:

  class NonblockingHelper {
   public:
    template <typename MessageType>
    NonblockingHelper(typename MessageType::Visitor* visitor);

    // Push data into the buffer.  If the data completes any fields,
    // they will be passed to the underlying visitor.  Any left-over data
    // is remembered for the next call.
    void PushData(void* data, int size);
  };

With this, you can use whatever non-blocking I/O mechanism you want, and just have to push the data into the NonblockingHelper, which will take care of calling the Visitor as necessary.
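
The buffering half of such a helper might look roughly like the sketch below. It is not part of the patch and makes several simplifying assumptions: the class name PushParser is invented, it handles only length-delimited fields (wire type 2, which covers messages and strings), it reports each complete field to a callback standing in for the generated Visitor, and it takes `const void*` rather than the `void*` shown in the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

class PushParser {
 public:
  using Callback =
      std::function<void(int field_number, const std::string& payload)>;

  explicit PushParser(Callback callback) : callback_(std::move(callback)) {}

  // Appends data to the buffer and dispatches every field it completes.
  // Left-over bytes are kept for the next call. Returns false on a field
  // with an unsupported wire type; the stream is unusable after that.
  bool PushData(const void* data, int size) {
    assert(size >= 0);
    buffer_.append(static_cast<const char*>(data), static_cast<size_t>(size));
    size_t pos = 0;  // end of the last fully dispatched field
    while (true) {
      size_t cursor = pos;
      uint64_t tag = 0;
      uint64_t length = 0;
      if (!ReadVarint(&cursor, &tag)) break;  // tag not complete yet
      if ((tag & 7) != 2) {                   // not length-delimited
        buffer_.erase(0, pos);
        return false;
      }
      if (!ReadVarint(&cursor, &length)) break;     // length not complete yet
      if (length > buffer_.size() - cursor) break;  // payload not complete yet
      const size_t n = static_cast<size_t>(length);
      callback_(static_cast<int>(tag >> 3), buffer_.substr(cursor, n));
      pos = cursor + n;
    }
    buffer_.erase(0, pos);
    return true;
  }

 private:
  // Reads a varint from the buffer. Returns false if it is incomplete.
  // (An over-long varint is also treated as incomplete in this sketch.)
  bool ReadVarint(size_t* pos, uint64_t* value) const {
    uint64_t result = 0;
    for (int shift = 0; shift < 64 && *pos < buffer_.size(); shift += 7) {
      const uint8_t byte = static_cast<uint8_t>(buffer_[(*pos)++]);
      result |= static_cast<uint64_t>(byte & 0x7F) << shift;
      if ((byte & 0x80) == 0) {
        *value = result;
        return true;
      }
    }
    return false;
  }

  Callback callback_;
  std::string buffer_;
};
```

The caller can feed it whatever chunks its event loop delivers; a field split across two reads is dispatched once the second chunk arrives.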

C++ implementation

I've written up a patch implementing this for C++ (not yet including the nonblocking part):

  http://codereview.appspot.com/4077052

Feedback

What do you think?

I know I'm excited to use this in some of my own side projects (which is why I spent my weekend working on it), but before adding this to the official implementation we should make sure it is broadly useful.

Jason Hsueh

Feb 1, 2011, 6:17:40 PM
to Kenton Varda, Protocol Buffers, Pherl Liu, Steven Knight
Conceptually this sounds great, the big question to me is whether this should be implemented as an option in the compiler or as a separate plugin. I haven't taken a thorough look at the patch, but I'd guess it adds a decent amount to the core code generator. I have a preference for the plugin approach, but of course I'm primarily an internal protobuf user, so I'm willing to be convinced otherwise :-) Would using a plugin, possibly even shipped with the standard implementation, make this feature too inconvenient to use? Or is there enough demand for this that it warrants implementing as an option?

Regarding the proposed interfaces: I can imagine some applications where the const refs passed to the visitor methods may be too restrictive - the user may instead want to take ownership of the object. e.g., suppose the stream is a series of requests, and each of the visitor handlers needs to start some asynchronous work. It would be good to hear if users have use cases that don't quite fit into this model (or at least if the existing use cases will work).

Kenton Varda

Feb 2, 2011, 12:30:15 AM
to Jason Hsueh, Protocol Buffers, Pherl Liu, Steven Knight
On Tue, Feb 1, 2011 at 3:17 PM, Jason Hsueh <jas...@google.com> wrote:
> Conceptually this sounds great, the big question to me is whether this should be implemented as an option in the compiler or as a separate plugin. I haven't taken a thorough look at the patch, but I'd guess it adds a decent amount to the core code generator. I have a preference for the plugin approach, but of course I'm primarily an internal protobuf user, so I'm willing to be convinced otherwise :-) Would using a plugin, possibly even shipped with the standard implementation, make this feature too inconvenient to use? Or is there enough demand for this that it warrants implementing as an option?

First of all, note that this feature is off by default.  You have to turn it on with the generate_visitors message-level option.  The only new code added to the base library is a couple templates in WireFormatLite, which are of course never instantiated if you don't generate visitor code.

There are a few reasons I prefer to make this part of the base code generator:

- If you look at the patch, you'll see that the code generation for the two Guide classes actually shares a lot with the code generation for MergeFromCodedStream and SerializeWithCachedSizes.  To make this a plugin, either we'd have to expose parts of the C++ code generator internals publicly (eww) or we'd have to reproduce a lot of code (also eww).

- The Reader and Writer classes directly use WireFormatLite, which is a private interface.

- It seems clear that this feature is widely desired by open source users.  We're not talking about a niche use case here.
 
> Regarding the proposed interfaces: I can imagine some applications where the const refs passed to the visitor methods may be too restrictive - the user may instead want to take ownership of the object. e.g., suppose the stream is a series of requests, and each of the visitor handlers needs to start some asynchronous work. It would be good to hear if users have use cases that don't quite fit into this model (or at least if the existing use cases will work).

Interesting point.  In the Reader case, it's creating new objects, so in theory it ought to be able to hand off ownership to the Visitor it calls.  But, the Walker is walking an existing object and thus clearly cannot give up ownership.  It seems clear that some use cases need const references, which means that the only way we could support ownership passing is by adding another parallel set of methods.  I suppose they could have default implementations that delegate to the const reference versions, in which case only people who wanted to optimize for them would need to override them.  But I'd like to see that this is really desired first -- it's easy enough to add later.
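
The "parallel set of methods with delegating defaults" idea could look something like the following sketch. None of this is in the patch: the method name VisitOwnedFoo is hypothetical, Foo is a plain struct standing in for a generated message, and std::unique_ptr (C++11) is used here simply as a convenient way to express an ownership hand-off.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Foo { std::string payload; };  // stand-in for a generated message

class Visitor {
 public:
  virtual ~Visitor() {}

  // Const-reference form: the only form a Walker could call, since it does
  // not own the objects it walks.
  virtual void VisitFoo(const Foo& foo) = 0;

  // Ownership-passing form (hypothetical): a Reader, which creates each
  // object itself, could call this instead. The default just delegates, so
  // only visitors that want to keep the object need to override it.
  virtual void VisitOwnedFoo(std::unique_ptr<Foo> foo) {
    assert(foo != nullptr);
    VisitFoo(*foo);
  }
};

// Keeps every Foo it receives, for example to hand off to asynchronous work.
class QueueingVisitor : public Visitor {
 public:
  void VisitFoo(const Foo& foo) override {
    queue.push_back(std::unique_ptr<Foo>(new Foo(foo)));  // has to copy
  }
  void VisitOwnedFoo(std::unique_ptr<Foo> foo) override {
    queue.push_back(std::move(foo));  // takes the object as-is, no copy
  }
  std::vector<std::unique_ptr<Foo>> queue;
};
```

A visitor that overrides only VisitFoo still works with either kind of guide, which is what would make it safe to add the second set of methods later.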

Also note that my code currently doesn't reuse message objects, but improving it to do so would be straightforward.  A Reader could allocate one object of each sub-message type for reuse.  But, it seems like that wouldn't play well with ownership-passing.

fpmc

Feb 2, 2011, 2:02:04 AM
to Protocol Buffers
Your proposal has one VisitXXX function for each repeated type. How
does it handle a message with two repeated fields of the same type?

Henner Zeller

Feb 2, 2011, 1:13:22 PM
to fpmc, Protocol Buffers
On Tue, Feb 1, 2011 at 23:02, fpmc <fp...@google.com> wrote:
> Your proposal has one VisitXXX function for each repeated type.  How
> does it handle a message with two repeated fields of the same type?

I guess the naming is confusing in the example. The Visit is per
field-name; but since the type is named the same as the field in this
example, it is confusing.


Thingfish

Feb 2, 2011, 3:55:41 PM
to Protocol Buffers
Just want to add my vote for this feature to be added to the base
compiler. I've implemented similar multiplexing patterns over and over
again, and would love for the compiler to free me from writing and
maintaining that code.


Kenton Varda

Feb 4, 2011, 6:06:04 PM
to Henner Zeller, fpmc, Protocol Buffers
On Wed, Feb 2, 2011 at 10:13 AM, Henner Zeller <henner...@googlemail.com> wrote:
> I guess the naming is confusing in the example. The Visit is per
> field-name; but since the type is named the same as the field in this
> example, it is confusing.

Yes, sorry.  Better example:

  message MyStream {
    option generate_visitors = true;
    repeated Foo bar = 1;
    repeated Foo baz = 2;
  }

creates:

  class MyStream::Visitor {
   public:
    virtual ~Visitor();

    virtual void VisitBar(const Foo& value);
    virtual void VisitBaz(const Foo& value);
  };

Frank Chu

Feb 5, 2011, 11:40:01 AM
to Kenton Varda, Henner Zeller, Protocol Buffers
Can the naming be

visit_bar()
visit_baz()

then?  It's good to have some consistency.

Frank

Kenton Varda

Feb 5, 2011, 1:41:30 PM
to Frank Chu, Henner Zeller, Protocol Buffers
Unfortunately, the Google C++ Style Guide prescribes inconsistency.  Only simple inline methods can use lowercase-with-underscores naming; everything else is supposed to use capitalized camelcase.

Jason Hsueh

Feb 7, 2011, 5:42:47 PM
to Kenton Varda, Protocol Buffers, Pherl Liu, Steven Knight
I'm starting to look at the patch (meant to start end of last week but got caught up in other stuff)

On Tue, Feb 1, 2011 at 9:30 PM, Kenton Varda <ken...@google.com> wrote:

> Interesting point.  In the Reader case, it's creating new objects, so in theory it ought to be able to hand off ownership to the Visitor it calls.  But, the Walker is walking an existing object and thus clearly cannot give up ownership.  It seems clear that some use cases need const references, which means that the only way we could support ownership passing is by adding another parallel set of methods.  I suppose they could have default implementations that delegate to the const reference versions, in which case only people who wanted to optimize for them would need to override them.  But I'd like to see that this is really desired first -- it's easy enough to add later.

Yeah, there's definitely a need for the const ref versions. It sounds like nobody is clamoring for mutable access/ownership-passing so let's proceed as is.


> Also note that my code currently doesn't reuse message objects, but improving it to do so would be straightforward.  A Reader could allocate one object of each sub-message type for reuse.  But, it seems like that wouldn't play well with ownership-passing.

 Perhaps instead of ownership-passing the methods could provide mutable access so people could Swap() etc. It would defeat the optimization, but at least be less messy. Anyway, all of this can be revisited later should the need arise.

Evan Jones

Feb 8, 2011, 8:47:50 AM
to Kenton Varda, Protocol Buffers, Pherl Liu, Jason Hsueh, Steven Knight
I read this proposal somewhat carefully, and thought about it for a
couple days. I think something like this might solve the problem that
many people have with streams of messages. However, I was wondering a
couple things about the design:


* It seems to me that this will solve the problem for people who know
statically at compile time what types they need to handle from a
stream, so they can define the "stream type" appropriately. Will users
find themselves running into the case where they need to handle
"generic" messages, and end up needing to "roll their own" stream
support anyway?

I ask this question because I built my own RPC system on top of
protocol buffers, and in this domain it is useful to be able to pass
"unknown" messages around, typically as unparsed byte strings. Hence,
this streams proposal wouldn't be useful to me, so I'm just wondering:
am I an anomaly here, or could it be that many applications will find
themselves needing to handle "any" protocol buffer message in their
streams?


> The Visitor class has two standard implementations: "Writer" and
> "Filler". MyStream::Writer writes the visited fields to a
> CodedOutputStream, using the same wire format as would be used to
> encode MyStream as one big message.

Imagine I wanted a different protocol. Eg. I want something that
checksums each message, or maybe compresses them, etc. Will I need to
subclass MessageType::Visitor for each stream that I want to encode?
Or will I need to change the code generator? Maybe this is an unusual
enough need that the design doesn't need to be flexible enough to
handle this, but it is worth thinking about a little, since features
like being able to detect broken streams and "resume" in the middle
are useful.

Thanks!

Evan

--
http://evanjones.ca/

Kenton Varda

Feb 8, 2011, 1:34:57 PM
to Evan Jones, Protocol Buffers, Pherl Liu, Jason Hsueh, Steven Knight
On Tue, Feb 8, 2011 at 5:47 AM, Evan Jones <ev...@mit.edu> wrote:
> I read this proposal somewhat carefully, and thought about it for a couple days.

Thanks for the feedback!
 
> * It seems to me that this will solve the problem for people who know statically at compile time what types they need to handle from a stream, so they can define the "stream type" appropriately. Will users find themselves running into the case where they need to handle "generic" messages, and end up needing to "roll their own" stream support anyway?
>
> I ask this question because I built my own RPC system on top of protocol buffers, and in this domain it is useful to be able to pass "unknown" messages around, typically as unparsed byte strings. Hence, this streams proposal wouldn't be useful to me, so I'm just wondering: am I an anomaly here, or could it be that many applications will find themselves needing to handle "any" protocol buffer message in their streams?

In fact, a large part of my motivation for writing this was so that I can use it in my own RPC implementation, Captain Proto.  Here's the Captain Proto protocol, which already works in this "streaming" fashion:


I handle user messages by passing them as "bytes", embedded in my own outer message.  This is somewhat inefficient currently, as it will require an extra copy of all those bytes.  However, it seems likely that future improvements to protocol buffers will allow "bytes" fields to share memory with the original buffer, which will eliminate this concern.


> > The Visitor class has two standard implementations:  "Writer" and "Filler".  MyStream::Writer writes the visited fields to a CodedOutputStream, using the same wire format as would be used to encode MyStream as one big message.
>
> Imagine I wanted a different protocol. Eg. I want something that checksums each message, or maybe compresses them, etc. Will I need to subclass MessageType::Visitor for each stream that I want to encode? Or will I need to change the code generator?

To do these things generically, we'd need to introduce some sort of equivalent of Reflection for streams.  This certainly seems like it could be a useful addition to the family, but I wanted to get the basic functionality out there first and then see if this is needed.

Note that I expect people will generally only "stream" their top-level message.  Although the proposal allows for streaming sub-messages as well, I expect that people will normally want to parse them into message objects which are handled whole.  So, you only have to manually implement the top-level stream, and then you can invoke some reflective algorithm from there.
 
> features like being able to detect broken streams and "resume" in the middle are useful.

I'm not sure how this relates.  This seems like it should be handled at a lower layer, like in the InputStream -- if the connection is lost, it can re-establish and resume, without the parser ever knowing what happened.

Evan Jones

Feb 8, 2011, 2:23:56 PM
to Kenton Varda, Protocol Buffers, Pherl Liu, Jason Hsueh, Steven Knight
On Feb 8, 2011, at 13:34 , Kenton Varda wrote:
> I handle user messages by passing them as "bytes", embedded in my
> own outer message.

This is what I do as well, as does protobuf-socket-rpc:

http://code.google.com/p/protobuf-socket-rpc/source/browse/trunk/proto/rpc.proto


I guess I was thinking that if you already have to do some sort of
"lookup" of the message type that is stored in that byte blob, then
maybe you don't need the streaming extension. For example, you could
just build a library that produces a sequence of byte strings, which
the "user" of the library can then parse appropriately.

I see how you are using it though: it is a friendly wrapper around
this simple "sequence of byte strings" model, that automatically
parses that byte string using the tag and "schema message." This might
be useful for some people.

> This is somewhat inefficient currently, as it will require an extra
> copy of all those bytes. However, it seems likely that future
> improvements to protocol buffers will allow "bytes" fields to share
> memory with the original buffer, which will eliminate this concern.

Ah cool. I was considering changing my protocol to be two messages:
the first one is the "descriptor" (eg. your CallRequest message), then
the second would be the "body" of the request, which I would then
parse based on the type passed in the CallRequest.


> Note that I expect people will generally only "stream" their top-
> level message. Although the proposal allows for streaming sub-
> messages as well, I expect that people will normally want to parse
> them into message objects which are handled whole. So, you only
> have to manually implement the top-level stream, and then you can
> invoke some reflective algorithm from there.

Right, but my concern is that I might want to use this streaming API
to write messages into files. In this case, I might have a file
containing the FooStream and another file containing the BarStream.
I'll have to implement both these ::Writer interfaces, or hack the
code generator to generate it for me. Although now that I think about
this, the implementation of these two APIs will be relatively trivial...


>> features like being able to detect broken streams and "resume" in
>> the middle are useful.
> I'm not sure how this relates. This seems like it should be handled
> at a lower layer, like in the InputStream -- if the connection is
> lost, it can re-establish and resume, without the parser ever
> knowing what happened.

Sorry, just an example of why you might want a different protocol. If
I've streamed 10e9 messages to disk, I don't want this stream to break
if there is some weird corruption in the middle, so I want some
protocol that can "resume" from corruption.

Evan

--
http://evanjones.ca/

Kenton Varda

Feb 8, 2011, 5:39:12 PM
to Evan Jones, Protocol Buffers, Pherl Liu, Jason Hsueh, Steven Knight
On Tue, Feb 8, 2011 at 11:23 AM, Evan Jones <ev...@mit.edu> wrote:
> Sorry, just an example of why you might want a different protocol. If I've streamed 10e9 messages to disk, I don't want this stream to break if there is some weird corruption in the middle, so I want some protocol that can "resume" from corruption.

Ah, yes.  This isn't an appropriate protocol for enormous files.  It's more targeted at network protocols.

Although, you might be able to build a decent seekable file protocol on top of it, by choosing a random string to use as a sync point, then writing that string every now and then...

  message FileStream {
    repeated string sync_point = 1;

    repeated Foo foo = 2;
    repeated Bar bar = 3;
    ...
  }

When writing, after every few messages, write a copy of sync_point.  Then, you can seek to an arbitrary position in the file by looking for a nearby copy of the sync point byte sequence, and starting to parse immediately after that.  The sync point just needs to be a 128-bit (or so) cryptographically random sequence, chosen differently for each file, so that there's no chance that the bytes will appear in the file by accident.
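
The seeking step can be sketched as a plain byte search. The function names below are invented for illustration, the "file" is an in-memory string, and std::random_device is used only for brevity: whether it is cryptographically strong is implementation-defined, so a real implementation would probably want an OS-level random source.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Returns a 16-byte (128-bit) marker to be chosen once per file.
// Assumption: std::random_device is good enough for this sketch.
std::string MakeSyncPoint() {
  std::random_device device;
  std::string marker;
  while (marker.size() < 16) {
    const unsigned int word = device();
    for (int i = 0; i < 4; ++i) {
      marker.push_back(static_cast<char>((word >> (8 * i)) & 0xFF));
    }
  }
  return marker;
}

// Returns the offset just past the first copy of `marker` found at or after
// `start`, or std::string::npos if there is none. Because the marker is the
// complete value of a sync_point field, that offset is a field boundary and
// parsing can resume there.
size_t SeekPastSyncPoint(const std::string& file, size_t start,
                         const std::string& marker) {
  assert(!marker.empty());
  const size_t found = file.find(marker, start);
  if (found == std::string::npos) return std::string::npos;
  return found + marker.size();
}
```

A reader recovering from corruption would call SeekPastSyncPoint from the point of failure and lose only the records between that point and the next marker.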

Jeffrey Damick

Apr 2, 2011, 6:53:11 PM
to Protocol Buffers
This may be a naive question, but wouldn't the format in text_format
be a prime example of another "protocol"? It seems that if you are able
to reuse the visitor to generate the text format, then it would be easily
extendable by others for JSON or the latest encoding of the week. I
look forward to seeing it pushed into the tree.

thanks
-jeff

Kenton Varda

Apr 3, 2011, 3:28:47 AM
to Jeffrey Damick, Protocol Buffers
On Sat, Apr 2, 2011 at 3:53 PM, Jeffrey Damick <jeffre...@gmail.com> wrote:
> This may be a naive question, but wouldn't the format in text_format
> be a prime example of another "protocol"? It seems that if you are able
> to reuse the visitor to generate the text format, then it would be easily
> extendable by others for JSON or the latest encoding of the week. I
> look forward to seeing it pushed into the tree.

TextFormat is already implemented purely in terms of public interfaces -- namely, the reflection interface.  Thus it is already possible to write, say, a JSON encoder/decoder for protobufs, and indeed several people have done this.

The current visitor proposal (which I haven't had time to work on in a while, but will get back to eventually...) does not provide any new way to implement TextFormat, because all visitor classes are type-specific.  In other words, to implement TextFormat via visitors you would need to write an implementation for every single type, rather than one implementation that covers all types.  This could perhaps be solved by inventing some sort of generic visitor adapter, but I haven't done any such thing in my patch, since reflection already solves most of the same problems.
 


Jeffrey Damick

Apr 3, 2011, 9:28:31 PM
to Kenton Varda, Protocol Buffers
It just seems like a lot of machinery has to be repeated across encoders/decoders to walk the messages and fields, versus a more event-driven style like your visitor Writer/Filler, which would abstract some of that, but it comes down to a matter of taste, I suppose.  I'm definitely in favor of the generic visitor adapter.

thanks,
-jeff