invalid tag (zero)

Dominik Steenken

Oct 13, 2008, 1:09:13 PM
to Protocol Buffers
Hi everyone,

We are currently implementing a client/server system, with the server
implemented in C++ and the client in Java. During our last rounds
of tests, we encountered a problem with sending
(not so) long messages. On the (receiving) Java side, we get the
following exception:
Exception in augnet.client.aim.connection.Receiver, Parse error: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
at com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:52)
at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:67)
at com.google.protobuf.FieldSet.mergeFrom(FieldSet.java:397)
at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:248)
at com.google.protobuf.GeneratedMessage$Builder.mergeFrom(GeneratedMessage.java:1)
at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:227)
at com.google.protobuf.FieldSet.mergeFieldFrom(FieldSet.java:482)
at com.google.protobuf.FieldSet.mergeFrom(FieldSet.java:402)
at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:248)
at com.google.protobuf.GeneratedMessage$Builder.mergeFrom(GeneratedMessage.java:1)
at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:227)
at com.google.protobuf.FieldSet.mergeFieldFrom(FieldSet.java:482)
at com.google.protobuf.FieldSet.mergeFrom(FieldSet.java:402)
at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:248)
at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:240)
at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:298)
at augnet.client.aim.messages.MessageProtos$AugNetMessage.parseFrom(MessageProtos.java:6289)
at augnet.client.aim.connection.Receiver.run(Receiver.java:47)

while the (sending) C++ side encounters no errors. When we scale down
the message, no error occurs. Is this a bug in protobuf, or are we
doing something wrong?

Best regards,
Dominik

Kenton Varda

Oct 13, 2008, 5:26:49 PM
to Dominik Steenken, Protocol Buffers
Are you sure that the data you are sending to the parser is exactly the same data that was generated by the serializer?  Remember that protocol buffers are not self-delimiting, so you need to make sure that you limit the input to the exact number of bytes that were produced when serializing.

If the data is exactly the same, then this is a bug.  If you can create a small program or pair of programs that demonstrate the problem, I would be happy to debug it.

Dominik Steenken

Oct 14, 2008, 7:01:01 AM
to ken...@google.com, prot...@googlegroups.com
Yes, we are pretty sure that we do not modify the data prior to putting
it on the wire. We have discovered a new fact, however, that will
hopefully shed some light on this bizarre error. The error seems to
occur the instant the protobuf message is fragmented in the transport
layer, i.e. when it is larger than a single TCP frame. Any thoughts on
that? Has someone encountered this error before?

Dominik Steenken

Oct 14, 2008, 7:21:58 AM
to Kenton Varda, Protocol Buffers
Never mind, we found the error. Apparently, the Java read() function
reads only up to one TCP frame before returning. We replaced it with
readFully() and it works fine.
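
For anyone landing on this thread later, here is a minimal sketch of the difference, not our actual Receiver code. InputStream.read(byte[]) may return after filling only part of the buffer, while DataInputStream.readFully() keeps reading until the buffer is full or throws EOFException. The chunked() helper and the 1460-byte chunk size are assumptions made up for this example to imitate a socket that delivers data in pieces. Presumably the untouched tail of a partially filled buffer stays zero, which would explain the "invalid tag (zero)" message.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ShortReadDemo {
    /** Hypothetical stand-in for a socket stream: hands out at most maxChunk bytes per read(). */
    static InputStream chunked(byte[] data, final int maxChunk) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, maxChunk));
            }
        };
    }

    /** A single read() call: may return fewer bytes than the buffer holds. */
    static int readOnce(InputStream in, byte[] buf) throws IOException {
        return in.read(buf, 0, buf.length);
    }

    /** readFully(): loops internally until buf is full, or throws EOFException. */
    static void readAll(InputStream in, byte[] buf) throws IOException {
        new DataInputStream(in).readFully(buf);
    }

    public static void main(String[] args) throws IOException {
        byte[] message = new byte[3000];
        byte[] buf = new byte[message.length];

        int n = readOnce(chunked(message, 1460), buf);
        System.out.println("read() returned " + n + " of " + buf.length + " bytes");

        readAll(chunked(message, 1460), buf);
        System.out.println("readFully() filled all " + buf.length + " bytes");
    }
}
```

Note that readFully() only helps when the receiver already knows how many bytes to expect.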

mcdowella

Oct 14, 2008, 7:23:36 AM
to Protocol Buffers
This is beginning to sound like a well-known TCP gotcha. Like most
stream protocols, there is nothing in the TCP protocol that marks the
boundaries between sender write() calls; TCP sees the connection as a
contiguous stream of bytes. If the protobuf message is fragmented into
multiple TCP frames by the sender, read() calls on the receiver will
typically be satisfied by the first TCP frame received. What is more,
because the TCP implementation of write() can also stuff more than one
write() call into an outgoing frame, the boundary between TCP frames
is not easy to predict - you might get a split inside a small write()
call because there was a little bit of space left over in a TCP frame
from a previous call, but not enough for all of even a very small
write(). It is therefore perfectly legal for you to call write() on
{1,2} and for the receiver to get {1} as the response of a first call
to read() and {2} as the response to the second call to read().

If this causes a problem (as it typically does) it is up to you to
e.g. prefix each chunk of data sent with a length count.
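
A minimal sketch of such a length prefix in Java, under assumptions that are mine and not from the original poster's code: the class and method names are hypothetical, and the prefix is a 4-byte big-endian integer as written by DataOutputStream.writeInt(). The payload would be the serialized message bytes, and a C++ sender would have to write the same 4-byte big-endian length in front of each message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class LengthPrefixedFraming {
    /** Writes one frame: a 4-byte big-endian length followed by the payload bytes. */
    static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length);
        data.write(payload);
        data.flush();
    }

    /** Reads exactly one frame, however the bytes were split up in transit. */
    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();          // blocks until all 4 prefix bytes arrive
        if (length < 0) {
            throw new IOException("negative frame length: " + length);
        }
        byte[] payload = new byte[length];
        data.readFully(payload);              // blocks until the whole payload arrives
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // Two frames written back to back into one byte stream.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeFrame(wire, new byte[] {1, 2});
        writeFrame(wire, new byte[] {3});

        // The reader recovers the original boundaries from the length prefixes.
        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println("first frame:  " + readFrame(in).length + " bytes");
        System.out.println("second frame: " + readFrame(in).length + " bytes");
    }
}
```

On the receiving side, the byte array returned by readFrame() is what would be handed to the generated parseFrom(byte[]) method. A real receiver would probably also want an upper bound on the length before allocating the buffer.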

A Google search on "TCP Message Boundary" finds the following from "C#
Network Programming":

Because TCP does not preserve data message boundaries, you must
compensate for that in your network programs. There are two ways to
handle this:

* Create a protocol that requires a one-for-one response to each data
message sent from the host
* Design a data message marker system to distinguish data message
boundaries within the data stream

(I'm not sure what their first option is, unless they mean that you
denote the end of a data message by closing the TCP connection used to
transfer it).

Dominik Steenken

Oct 14, 2008, 7:47:02 AM
to mcdowella, Protocol Buffers
Yep, as I wrote in my previous message, that was it.
Thanks for the feedback :)

Darren Davie

Oct 1, 2014, 2:38:23 AM
to prot...@googlegroups.com, mcdo...@mcdowella.demon.co.uk
Thanks, mcdowella, that was a good, detailed explanation of how to resolve the issue.

Regards
Darren