How to read continuous stream of messages from TCP


waynix

Feb 27, 2012, 5:27:17 PM
to Protocol Buffers
Hello All;

I am new to protobuf and did only a limited search for my question, so
my apologies if this has already been discussed.

I naively thought that ParseFromFileDescriptor/ParseFromIstream would
block on a TCP socket and return when a valid message is received.
I read some old posts from 2010 and realized it's not that easy because
messages are not self-delimiting. The suggestion from Jason
Hsueh was as follows:
"
One approach to writing multiple messages to the same stream is to use
a length-delimited format: write the size of the message, then serialize
the message itself. On the receiver side, you would set up a
FileInputStream, and wrap a CodedInputStream around that. You can read
the size of the messages from the stream and then use PushLimit and
PopLimit to control how much data is read.
"

My questions are:

1. Is this still the way to do it? Seems quite cumbersome (to lazy
me ;-). Is there a wrapper built in to do this?
2. If I understand Jason's suggestion right, the length is really not
part of the message, and the sender has to explicitly set it, instead
of having protobuf encode it in. Which means that handing a generic
third-party sender my .proto file would not be sufficient. Plus, how
would they know the length before encoding the message proper? Would
filling it in after the fact change the length again? Or am I totally
missing it?

3. A related question: in general, do I have to manage reading of the
socket, or for that matter any istream, and spoon-feed the protobuf
parser until it says OK, that's a whole message?

Thanks a lot.

Evan Jones

Mar 6, 2012, 6:08:26 PM
to waynix, Protocol Buffers
On Feb 27, 2012, at 17:27 , waynix wrote:
> 1. Is this still the way to do it? Seems quite cumbersome (to lazy me ;-). Is there a wrapper built in to do this?

Yes. Sadly there is no wrapper included in the library.


> 2. If I understand Jason's suggestion right, the length is really not
> part of the message, and the sender has to explicitly set it, instead
> of having protobuf encode it in. Which means that handing a generic
> third-party sender my .proto file would not be sufficient. Plus, how
> would they know the length before encoding the message proper? Would
> filling it in after the fact change the length again? Or am I totally
> missing it?

As long as both sides encode the length in the same way, just having the right .proto will do the trick.
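
To make the point about both sides agreeing on the length encoding concrete, here is a minimal sketch in Python. It does not use the protobuf library: the payloads are placeholder byte strings standing in for serialized messages, and the 4-byte big-endian prefix and the names write_frame, read_frame and read_all_frames are assumptions made for this example only.

```python
import io
import struct

# Assumed framing for this sketch: 4-byte unsigned big-endian length, then payload.
LENGTH_PREFIX = struct.Struct(">I")


def write_frame(stream, payload: bytes) -> None:
    """Write the payload size first, then the payload itself."""
    stream.write(LENGTH_PREFIX.pack(len(payload)))
    stream.write(payload)


def read_frame(stream):
    """Return the next payload, or None at a clean end of stream."""
    header = stream.read(LENGTH_PREFIX.size)
    if not header:
        return None
    if len(header) < LENGTH_PREFIX.size:
        raise EOFError("stream ended inside a length prefix")
    (size,) = LENGTH_PREFIX.unpack(header)
    payload = stream.read(size)
    if len(payload) < size:
        raise EOFError("stream ended inside a message")
    return payload


def read_all_frames(data: bytes):
    """Split a buffer of back-to-back frames into a list of payloads."""
    stream = io.BytesIO(data)
    frames = []
    frame = read_frame(stream)
    while frame is not None:
        frames.append(frame)
        frame = read_frame(stream)
    return frames
```

Because the prefix sits outside the payload, the sender serializes the message first and measures it afterwards; writing the prefix does not change the message bytes, so the length never has to be revised.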


> 3. A related question: in general, do I have to manage reading of the
> socket, or for that matter any istream, and spoon-feed the protobuf
> parser until it says OK, that's a whole message?

Basically yes. There is a sketch of some example code here:

https://groups.google.com/forum/?fromgroups#!searchin/protobuf/sequence/protobuf/pLwqN4jTVvY/60PBaEadW5IJ
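
Independently of the linked post, the "manage the socket yourself" part can be sketched as follows. This is a rough Python illustration, not the code from that link: it assumes a 4-byte big-endian length prefix, and recv_exact, recv_message and send_message are hypothetical helper names. The bytes returned by recv_message are what you would then hand to the protobuf parser.

```python
import socket
import struct

# Assumed framing for this sketch: 4-byte unsigned big-endian length, then payload.
LENGTH_PREFIX = struct.Struct(">I")


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Block until exactly `count` bytes have arrived, or raise on early close."""
    chunks = []
    remaining = count
    while remaining > 0:
        # recv may return fewer bytes than requested, so keep asking.
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> bytes:
    """Read one length prefix, then exactly that many payload bytes."""
    (size,) = LENGTH_PREFIX.unpack(recv_exact(sock, LENGTH_PREFIX.size))
    return recv_exact(sock, size)


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send the length prefix and payload in one call."""
    sock.sendall(LENGTH_PREFIX.pack(len(payload)) + payload)
```

TCP delivers a byte stream with no message boundaries, so a single recv can return half a prefix or several messages at once; looping until the requested count is reached is what turns the stream back into whole messages.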


Good luck,

Evan

--
http://evanjones.ca/

waynix

Mar 8, 2012, 2:30:47 AM
to Protocol Buffers
Thank you so much Evan for your response.

While I look at your old posts and dig into coded streams (as I said,
I'm new to protobuf), it seems to me this approach is "external" to
protobuf, in the sense that I have to tell the third party what my
delimiter is (it's a length, as in your sample code) or I have to give
them a wrapper library on top of protobuf to read that much data
before invoking protobuf proper. Granted, it's no big deal coding-wise,
but it's an extra deliverable. I can't simply hand my .proto
file to the customer, refer them to Google's protobuf site and wash my
hands of it.

Since this is so common an issue and the suggested solution is almost
de facto standard, (saw this after my initial post:
http://code.google.com/apis/protocolbuffers/docs/techniques.html), it
begs the question of why not build it into protobuf proper. So that
ParseFromIstream would block until reading that much data as indicated
in the delimiter and decode and return.

I see StartGroup and EndGroup being deprecated (Poof! there goes 25%
of precious wire types ;-)). Makes you wonder if they were originally
intended as delimiters, which would not be a bad idea. You could
have a StartGroup as an optional field in front of each of your
top-level messages, defined as some metadata or header
about the ensuing message; the simplest could be a length. The end
group could be some CRC for unreliable media such as serial
transmission.
I know CRC may sound overreaching, but a default startGroup type being
a length should be simple and generic enough and would solve a big
problem. For more elaborate and custom Start/End definition, you could
have some custom callback mechanism to interpret them.

Thanks again.

Evan Jones

Mar 8, 2012, 7:27:08 AM
to waynix, Protocol Buffers
On Mar 8, 2012, at 2:30 , waynix wrote:
> Since this is so common an issue and the suggested solution is almost
> de facto standard, (saw this after my initial post:
> http://code.google.com/apis/protocolbuffers/docs/techniques.html), it
> begs the question of why not build it into protobuf proper.

Yeah, I would agree that something simple probably should have been included. The reasoning here is that this allows people to use protocol buffers with whatever other systems they might already be using (e.g. HTTP, databases, files, RPC protocols, whatever), without being tied to a specific implementation. Compare the protocol buffer API to Thrift, for example, where the message serialization/deserialization is tied pretty tightly to the RPC system. There were proposals to possibly add a "protocol buffer utils" API, or a "streaming" API, but neither of those went anywhere. The closest thing is writeDelimitedTo / mergeDelimitedFrom in the Java API:

http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/MessageLite.html#writeDelimitedTo(java.io.OutputStream)
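
For readers on other languages: writeDelimitedTo prefixes each message with its size encoded as a base-128 varint. Below is a hedged Python sketch of that framing, written against plain byte strings rather than the protobuf library. The function names are invented for this example, and the payloads stand in for serialized messages.

```python
import io


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream):
    """Read one varint; return None if the stream is already at its end."""
    result = 0
    shift = 0
    while True:
        raw = stream.read(1)
        if not raw:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a varint")
        byte = raw[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_delimited(stream, payload: bytes) -> None:
    """Write a varint size followed by the payload."""
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_delimited(stream):
    """Return the next payload, or None at a clean end of stream."""
    size = read_varint(stream)
    if size is None:
        return None
    payload = stream.read(size)
    if len(payload) < size:
        raise EOFError("stream ended inside a message")
    return payload
```

Compared with a fixed-width prefix, the varint costs a single byte for messages shorter than 128 bytes, at the price of a slightly more involved reader.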

Evan

--
http://evanjones.ca/

sa...@cumulusnetworks.com

May 6, 2019, 11:27:37 PM
to Protocol Buffers
Hi,


This problem can also be solved in the following way.

1. By defining a new message type.

message NewMessage {
    A1 a1 = 1;
    A2 a2 = 2;
    ...
    A32 a32 = 32;
}

In proto3, all fields are optional by default,

so there is no need to encode everything. The problem is that we cannot define more than 32 messages because of the 5-bit field value.

But this problem can also be solved by nesting one message within another.

message NewMessage {
    Messages1To32 group1 = 1;    // wraps messages 1-32
    Messages33To64 group2 = 2;   // wraps messages 33-64
    ...
}

This way we need not worry about the length encoding. We can solve this problem using protobuf alone.
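
For reference, here is a rough sketch of how an embedded message field appears on the wire under the standard protobuf encoding: a varint key holding the field number and wire type, then a varint length, then the sub-message bytes. It is hand-written Python with made-up helper names (field_key, wrap_as_field, split_fields), not the protobuf library, and the payloads are placeholder bytes rather than real serialized messages.

```python
WIRE_TYPE_LENGTH_DELIMITED = 2  # wire type used for embedded messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int):
    """Decode a varint starting at `pos`; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def field_key(field_number: int, wire_type: int) -> bytes:
    """The key is (field_number << 3) | wire_type, itself encoded as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def wrap_as_field(field_number: int, payload: bytes) -> bytes:
    """Encode `payload` as one embedded-message field of a wrapper message."""
    key = field_key(field_number, WIRE_TYPE_LENGTH_DELIMITED)
    return key + encode_varint(len(payload)) + payload


def split_fields(data: bytes):
    """Walk a buffer that holds only length-delimited fields."""
    fields = []
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        if key & 0x7 != WIRE_TYPE_LENGTH_DELIMITED:
            raise ValueError("this sketch only handles length-delimited fields")
        size, pos = decode_varint(data, pos)
        fields.append((key >> 3, data[pos:pos + size]))
        pos += size
    return fields
```

In this encoding the key is a varint, so field numbers 1 to 15 fit in a one-byte key and a field number such as 33 takes two bytes. Each embedded message also carries its own length on the wire, so a wrapper field is in effect a tag followed by a length-delimited message.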

Thanks
Satheesh

Hao Nguyen

May 9, 2019, 11:30:31 AM
to sa...@cumulusnetworks.com, Protocol Buffers
Have you tried using https://grpc.io/? Streaming is supported there: https://grpc.io/docs/guides/concepts/
