Re: [protobuf] How to know what type of message it is?


Feng Xiao

Mar 21, 2013, 6:56:10 PM
to kramer65, Protocol Buffers
On Thu, Mar 21, 2013 at 3:48 PM, kramer65 <kra...@gmail.com> wrote:
> Hello people,
>
> I'm just starting out with protobuf but I hope someone can enlighten me a bit.
>
> Let's say that I've got a couple of message types defined. Now when a new message comes in (via zeromq) I don't know what type of message it is until I deserialize it, but to deserialize it I need to know what type of message it is (catch 22!). I'm totally lost here. Could anybody please explain this to me?

The common practice is to always send the same message type over the wire, so the receiver always knows which type to parse.

> All tips are welcome!


kramer65

Mar 21, 2013, 7:14:23 PM
to prot...@googlegroups.com, kramer65
Alright, in that case I totally need to change my setup.

I'm working on a project in which we are trying to fly a large kite using built-in steering. For this we've set up a network connection with the kite to send and receive messages. For example: we measure altitude, direction, speed, etc. and we need to send that info back down to earth. All this happens in separate messages, because the measurements are collected asynchronously.

Does this mean that we need to set up a separate socket and connection in zeromq for every type of message we want to send down to earth?

I also wanted to make a logger process to which all messages are sent so that it can write them to the central log file. That would be impossible, because the collected messages can be any of the different (.proto) types.

Is there no option to send an identifier with the message so that the type can be recognized? Or would a better solution be to create one universal message with all types set to optional (which would create a lot more overhead I guess)?



On Thursday, March 21, 2013 at 23:56:10 UTC+1, Feng Xiao wrote:

Ilia Mirkin

Mar 21, 2013, 7:24:32 PM
to kramer65, prot...@googlegroups.com
I don't know about zeromq, but rabbitmq allows you to have multiple
endpoints. If zeromq supports that as well, you can use a different
message type for each endpoint. If not, you can create a container
message like

message TheOneAndOnlyMessage {
  required string type = 1;
  optional bytes value = 2;
}

And then the bytes in that message could be decoded again. Of course
that means that you end up doing two copies of the data, which is
less-than-desirable. But you can avoid the performance penalty by
doing the encoding manually (i.e. working directly with a coded
input/output stream, getting the type name, and then giving the stream
to a message parse function).
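
A minimal Python sketch of the manual approach described above, assuming the envelope keeps the layout of TheOneAndOnlyMessage (field 1 is the type name, field 2 is the payload, both length-delimited). The helper names and the type name "kite.Altitude" are hypothetical, and real code would normally use the protobuf library's coded stream classes instead of hand-rolled varints. The point it illustrates is that the type name can be read first and the payload handed on as a view, without a second copy.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(buf, pos: int):
    """Return (value, next_pos) for the varint starting at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# Tag = (field_number << 3) | wire_type; wire type 2 is length-delimited.
TYPE_TAG = (1 << 3) | 2   # field 1: type name
VALUE_TAG = (2 << 3) | 2  # field 2: serialized inner message


def encode_envelope(type_name: str, payload: bytes) -> bytes:
    """Serialize the envelope by hand in protobuf wire format."""
    name = type_name.encode("utf-8")
    return (bytes([TYPE_TAG]) + encode_varint(len(name)) + name
            + bytes([VALUE_TAG]) + encode_varint(len(payload)) + payload)


def decode_envelope(data: bytes):
    """Return (type_name, payload_view); the payload is not copied."""
    view = memoryview(data)
    type_name = None
    payload = memoryview(b"")
    pos = 0
    while pos < len(view):
        tag, pos = decode_varint(view, pos)
        if tag & 0x07 != 2:
            raise ValueError("this sketch only handles length-delimited fields")
        length, pos = decode_varint(view, pos)
        field = view[pos:pos + length]
        pos += length
        if tag == TYPE_TAG:
            type_name = bytes(field).decode("utf-8")
        elif tag == VALUE_TAG:
            payload = field  # a slice of the original buffer
    if type_name is None:
        raise ValueError("missing required field: type")
    return type_name, payload


# The payload here is an arbitrary already-serialized inner message.
wire = encode_envelope("kite.Altitude", b"\x08\x96\x01")
name, payload = decode_envelope(wire)
```

The receiver would then look up the parser that matches `name` and give it `payload`.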

I suppose another option is to make TheOneAndOnlyMessage contain each
of the possible messages directly, but that seems like a maintenance
pain.

Feng Xiao

Mar 21, 2013, 7:37:30 PM
to Ilia Mirkin, kramer65, Protocol Buffers
On Thu, Mar 21, 2013 at 4:24 PM, Ilia Mirkin <imi...@alum.mit.edu> wrote:
> I don't know about zeromq, but rabbitmq allows you to have multiple
> endpoints. If that is the case, you can have diff types for diff
> endpoints. If not, you can create a container message like
>
> message TheOneAndOnlyMessage {
>   required string type = 1;
>   optional bytes value = 2;
> }
>
> And then the bytes in that message could be decoded again. Of course
> that means that you end up doing two copies of the data, which is
> less-than-desirable. But you can avoid the performance penalty by
> doing the encoding manually (i.e. working directly with a coded
> input/output stream, getting the type name, and then giving the stream
> to a message parse function).
>
> I suppose another option is to make TheOneAndOnlyMessage contain each
> of the possible messages directly, but that seems like a maintenance
> pain.

That is the common practice, and it is used widely in many, many places.
You will have some messages that look like the ones below:
message MessageA {
}
message MessageB {
}
message TheOneAndOnlyMessage {
  enum MessageType {
    TYPE_UNKNOWN = 0;
    TYPE_A = 1;
    TYPE_B = 2;
  }
  optional MessageType type = 1;
  optional MessageA a = 2;
  optional MessageB b = 3;
  ...
}
I don't see the maintenance pain compared to other methods, but you can enlighten me.
It's used so often that we are actually planning to create a new syntax and implementation to make this pattern more efficient.
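
A rough Python sketch of how a receiver might dispatch on the wrapper's type field. The dataclasses below are plain stand-ins that only mirror the shape of the .proto definitions above; they are not generated protobuf classes, and the fields altitude_m and speed_mps are hypothetical, since MessageA and MessageB are empty in the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class MessageType(IntEnum):
    TYPE_UNKNOWN = 0
    TYPE_A = 1
    TYPE_B = 2


@dataclass
class MessageA:
    altitude_m: float = 0.0  # hypothetical field, for illustration only


@dataclass
class MessageB:
    speed_mps: float = 0.0  # hypothetical field, for illustration only


@dataclass
class TheOneAndOnlyMessage:
    type: MessageType = MessageType.TYPE_UNKNOWN
    a: Optional[MessageA] = None
    b: Optional[MessageB] = None


def handle(msg: TheOneAndOnlyMessage) -> str:
    """Pick the populated sub-message based on the type field."""
    if msg.type == MessageType.TYPE_A and msg.a is not None:
        return f"altitude {msg.a.altitude_m} m"
    if msg.type == MessageType.TYPE_B and msg.b is not None:
        return f"speed {msg.b.speed_mps} m/s"
    return "unknown or malformed message"


result = handle(TheOneAndOnlyMessage(MessageType.TYPE_A, a=MessageA(120.5)))
```

Because every sender uses the same wrapper, a central logger can parse any incoming message without knowing its inner type in advance.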

moofish

Mar 23, 2013, 1:16:41 PM
to prot...@googlegroups.com
Quick follow-up question: we use a very similar pattern, but to avoid the copy we put the size and the type of the second message in a "header" packet. This way we don't copy the bytes twice and we avoid the coded input stream. We also use this in a streaming situation, so we can 'sync' on a unique key in the header message, and we use fixed-size fields in the header so you always know the size of the header. Yes, this means we eat up 15 bytes of header for every message, but that is a very small percentage compared to our data.

While this has been working for us, are there bad practices and pitfalls we don't see yet?

Thanks,

-M

Feng Xiao

Mar 23, 2013, 2:23:46 PM
to moofish, Protocol Buffers

Your approach is also a very common one. When sending Protobuf messages, a fixed-size header with the length info is the easiest way to delimit multiple messages in a stream. Whether to put the type info into the header is a design choice. I think both should work well.
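
A minimal Python sketch of fixed-size header framing along these lines. The layout (a 4-byte sync marker, a 2-byte type id and a 4-byte payload length, 10 bytes in total) and the MAGIC value are assumptions made for illustration, not the 15-byte header described earlier in the thread. It also assumes a stream whose read(n) returns n bytes unless the data ends, as a buffered file or io.BytesIO does.

```python
import struct

MAGIC = 0x4B495445  # hypothetical sync marker
# "<" means little-endian with no padding: 4 + 2 + 4 = 10 bytes, always.
HEADER = struct.Struct("<IHI")  # magic, type id, payload length


def frame(type_id: int, payload: bytes) -> bytes:
    """Prefix a serialized message with the fixed-size header."""
    return HEADER.pack(MAGIC, type_id, len(payload)) + payload


def read_frames(stream):
    """Yield (type_id, payload) pairs from a binary file-like object."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of stream
        if len(header) < HEADER.size:
            raise EOFError("truncated header")
        magic, type_id, length = HEADER.unpack(header)
        if magic != MAGIC:
            raise ValueError("lost sync: unexpected marker")
        payload = stream.read(length)
        if len(payload) < length:
            raise EOFError("truncated payload")
        yield type_id, payload
```

Each payload can then be parsed with the message class that matches its type id. A raw socket may return fewer bytes than requested, so real code would loop until the full header and payload have arrived.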
