[grpc-java] InvalidProtocolBufferException: Protocol message contained an invalid tag (zero) error


Anthony Corbacho

Sep 16, 2018, 5:38:26 PM
to grpc.io
Hello,
I am new to gRPC and so far I like it very much.

I am using a bidirectional stream and from time to time I get an exception like this one:

io.grpc.StatusRuntimeException: CANCELLED: Failed to read message.
    at io.grpc.Status.asRuntimeException(Status.java:526)
    at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:418)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:41)
    at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:663)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:41)
    at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:392)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:443)
    at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:525)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:446)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:510)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: io.grpc.StatusRuntimeException: INTERNAL: Invalid protobuf byte sequence
    at io.grpc.Status.asRuntimeException(Status.java:517)
    at io.grpc.protobuf.lite.ProtoLiteUtils$2.parse(ProtoLiteUtils.java:168)
    at io.grpc.protobuf.lite.ProtoLiteUtils$2.parse(ProtoLiteUtils.java:82)
    at io.grpc.MethodDescriptor.parseResponse(MethodDescriptor.java:265)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:498)
    ... 5 more
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
    at com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:105)
    at com.google.protobuf.CodedInputStream$ArrayDecoder.readTag(CodedInputStream.java:646)
    at com.zepl.notebook.service.grpc.NotebookResponse.<init>(NotebookResponse.java:46)
    at com.zepl.notebook.service.grpc.NotebookResponse.<init>(NotebookResponse.java:13)
    at com.zepl.notebook.service.grpc.NotebookResponse$1.parsePartialFrom(NotebookResponse.java:2851)
    at com.zepl.notebook.service.grpc.NotebookResponse$1.parsePartialFrom(NotebookResponse.java:2846)
    at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:91)
    at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
    at io.grpc.protobuf.lite.ProtoLiteUtils$2.parseFrom(ProtoLiteUtils.java:173)
    at io.grpc.protobuf.lite.ProtoLiteUtils$2.parse(ProtoLiteUtils.java:165)
    ... 8 more

I enabled netty debug logs, and I see this line: [id: 0xfc5978c0, L:/100.119.42.167:39090 - R:--/--] OUTBOUND RST_STREAM: streamId=1539 errorCode=8.
I don't really get what is wrong; I get this error once in a while and I am stuck.
I am calling the observers from different threads on the server side. Do I need to synchronize the method that calls the observers?
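For example, something like this simplified sketch (not my actual code; the class and method names are just placeholders), where every worker thread goes through the same lock before calling onNext():

import com.zepl.notebook.service.grpc.NotebookResponse;
import io.grpc.stub.StreamObserver;

// Placeholder sketch: several worker threads share the bidi response observer,
// and every send goes through the same lock so onNext() is never called concurrently.
class ResponseSender {
    private final Object lock = new Object();
    private final StreamObserver<NotebookResponse> responseObserver;

    ResponseSender(StreamObserver<NotebookResponse> responseObserver) {
        this.responseObserver = responseObserver;
    }

    void sendFromWorkerThread(NotebookResponse response) {
        synchronized (lock) {
            responseObserver.onNext(response);
        }
    }
}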

Thank you for your help and time.

Carl Mastrangelo

Sep 17, 2018, 1:38:20 PM
to grpc.io
You should look for a netty debug log frame that's for DATA, not RST_STREAM. That should show you the corrupted message.

There are also some hooks into the core gRPC library that (while more complicated) will let you examine the message bytes. By using a custom Marshaller, you can peek at the bytes and then delegate the remaining message to the protobuf Marshaller. You can see how to wire up a Marshaller by looking in the generated code for the MethodDescriptor.
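
A rough sketch of that idea (not the exact generated code; NotebookResponse is the message type from the stack trace above, and the Base64 dump is only a placeholder for whatever inspection you want to do):

import com.zepl.notebook.service.grpc.NotebookResponse;
import io.grpc.MethodDescriptor;
import io.grpc.protobuf.ProtoUtils;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Base64;

// Wraps the regular protobuf marshaller so the raw bytes of each inbound message
// can be dumped before protobuf tries to parse them.
final class InspectingMarshaller implements MethodDescriptor.Marshaller<NotebookResponse> {
    private final MethodDescriptor.Marshaller<NotebookResponse> delegate =
        ProtoUtils.marshaller(NotebookResponse.getDefaultInstance());

    @Override
    public InputStream stream(NotebookResponse value) {
        return delegate.stream(value);  // outbound path is untouched
    }

    @Override
    public NotebookResponse parse(InputStream stream) {
        try {
            byte[] bytes = readAll(stream);
            System.err.println("inbound message (" + bytes.length + " bytes): "
                + Base64.getEncoder().encodeToString(bytes));
            return delegate.parse(new ByteArrayInputStream(bytes));  // hand a fresh copy to protobuf
        } catch (IOException e) {
            throw new RuntimeException("failed to read message bytes", e);
        }
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        for (int n = in.read(buf); n != -1; n = in.read(buf)) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}

The MethodDescriptor from the generated *Grpc class can then be copied with this marshaller swapped in (MethodDescriptor has a toBuilder(...) overload that accepts new marshallers) and used when building the stub or the ServerServiceDefinition.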

Anthony Corbacho

Sep 17, 2018, 2:39:47 PM
to grpc.io
Hi Carl,

Thanks for the fast answer.
How can I enable `netty debug log frame that's for DATA`?

thanks~.

Carl Mastrangelo

Sep 17, 2018, 4:52:02 PM
to grpc.io

Anthony Corbacho

Sep 17, 2018, 11:21:30 PM
to grpc.io
Hi,

This is strange: I have enabled the same log, but I only see RST_STREAM.
Do I need to do something else?

# GRPC debugging
log4j.logger.io.grpc.netty.NettyServerHandler=ALL
log4j.logger.io.grpc.netty.NettyClientHandler=ALL

Carl Mastrangelo

Sep 18, 2018, 7:58:16 PM
to grpc.io
Are both your client and server written using Java?   Also, are you using TLS or plaintext?   

Anthony Corbacho

Sep 18, 2018, 8:51:55 PM
to grpc.io
Yes, both are written in Java and I am using plaintext.

Carl Mastrangelo

Sep 19, 2018, 4:09:59 PM
to grpc.io
And you are sure there is no proxy in between the client and server? One thing that can cause this is a data integrity issue caused by networking hardware. To protect against this, we always suggest using TLS (SSL).


I know this is more effort, but it would help indicate if there is errant networking hardware. Can you try using TLS on both the client and server side (using the certs from our HelloWorldTls example), and see if the problem persists?
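
Switching both sides to TLS would look roughly like this (a sketch only; the certificate file names are placeholders for the files that ship with the HelloWorldTls example, and `service` stands for your existing bidi service):

import io.grpc.BindableService;
import io.grpc.ManagedChannel;
import io.grpc.Server;
import io.grpc.netty.GrpcSslContexts;
import io.grpc.netty.NettyChannelBuilder;
import io.grpc.netty.NettyServerBuilder;

import java.io.File;
import java.io.IOException;

// Sketch of moving both sides from plaintext to TLS; the file names are placeholders
// for the certs from the HelloWorldTls example, and 39090 is the port from the log above.
public final class TlsSetup {

    static Server startTlsServer(BindableService service) throws IOException {
        return NettyServerBuilder.forPort(39090)
            .sslContext(GrpcSslContexts
                .forServer(new File("server.crt"), new File("server.key"))  // server cert chain + private key
                .build())
            .addService(service)
            .build()
            .start();
    }

    static ManagedChannel buildTlsChannel(String host) throws IOException {
        // NettyChannelBuilder negotiates TLS by default; the sslContext supplies the trust config.
        return NettyChannelBuilder.forAddress(host, 39090)
            .sslContext(GrpcSslContexts
                .forClient()
                .trustManager(new File("ca.crt"))  // CA that signed the server cert
                .build())
            .build();
    }
}

If the CANCELLED/INTERNAL errors stop with this setup, that points strongly at something corrupting bytes on the plaintext path.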

Ilya Pavlenko

Feb 25, 2021, 2:53:25 AM
to grpc.io
Hi guys. We have recently had the same experience. The problem occurred only under heavy load, and sometimes it took up to two hours to catch it. As you recommended, enabling TLS resolved the problem.
Could you give any advice for further investigation? And why does enabling SSL resolve the issue so reliably?

Thanks.
On Wednesday, September 19, 2018 at 11:09:59 PM UTC+3, not...@google.com wrote:

zda...@google.com

Mar 3, 2021, 5:18:36 PM
to grpc.io
I believe there's a data integrity issue, either from a proxy or from hardware, that using TLS can avoid, as notcarl@ said. Unless it can be reliably reproduced, there is no way to investigate how this problem occurs with plaintext.