How to encode nested Python Protobuf


chri...@gmail.com

Aug 25, 2013, 12:47:39 AM
to prot...@googlegroups.com
Resubmitted. I haven't seen my post in a couple of days, so I'm not sure
what happened.

This is also posted on Stack Overflow:
http://stackoverflow.com/questions/10957786/protocol-buffers-python-unicode-decode-error?rq=1
I will update whichever site doesn't get the answer posted.

******

I've been stumped on this for a while and am pulling out what is left of my hair.

I can send non-nested Protobufs from Python to Java and from Java to Python over WebSockets without issue. My problem is sending a nested message over a WebSocket. I believe my issue is on the Python encoding side.

Your guidance is appreciated.

.proto file

message Response {
  // Reflect back to caller
  required string service_name = 1;

  // Reflect back to caller
  required string method_name = 2;

  // Who is responding
  required string client_id = 3;

  // Status Code
  required StatusCd status_cd = 4;

  // RPC response proto
  optional bytes response_proto = 5;

  // Was callback invoked
  optional bool callback = 6 [default = false];

  // Error, if any
  optional string error = 7;
  //optional string response_desc = 6;
}

message HeartbeatResult {
    required string service = 1;
    required string timestamp = 2;
    required float status_cd = 3;
    required string status_summary = 4;
}

A HeartbeatResult is supposed to be sent in the response_proto field of the Response Protobuf. I am able to do this from Java to Java, but Python to Java is not working.

I've included two variations of the Python code, neither of which works.

def GetHeartbeat(self):
    print "GetHeartbeat called"
    import time
    ts = time.time()
    import datetime
    st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
    heartbeatResult = rpc_pb2.HeartbeatResult()
    heartbeatResult.service = "ALERT_SERVICE"
    heartbeatResult.timestamp = st
    heartbeatResult.status_cd = rpc_pb2.OK
    heartbeatResult.status_summary = "OK"

    response = rpc_pb2.Response()
    response.service_name = ""
    response.method_name = "SendHeartbeatResult"
    response.client_id = "ALERT_SERVICE"
    response.status_cd = rpc_pb2.OK
    response.response_proto = str(heartbeatResult).encode('utf-8')

    self.sendMessage(response.SerializeToString())
    print "GetHeartbeat finished"

def GetHeartbeat2(self):
    print "GetHeartbeat called"
    import time
    ts = time.time()
    import datetime
    st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
    heartbeatResult = rpc_pb2.HeartbeatResult()
    heartbeatResult.service = "ALERT_SERVICE"
    heartbeatResult.timestamp = st
    heartbeatResult.status_cd = rpc_pb2.OK
    heartbeatResult.status_summary = "OK"

    response = rpc_pb2.Response()
    response.service_name = ""
    response.method_name = "SendHeartbeatResult"
    response.client_id = "ALERT_SERVICE"
    response.status_cd = rpc_pb2.OK
    response.response_proto = heartbeatResult.SerializeToString()
    self.sendMessage(response.SerializeToString())
    print "GetHeartbeat finished"

The errors on the Java server side are:

(GetHeartbeat) Protocol message end-group tag did not match expected tag

and (GetHeartbeat2):
Message: [org.java_websocket.exceptions.InvalidDataException: java.nio.charset.MalformedInputException: Input length = 1
    at org.java_websocket.util.Charsetfunctions.stringUtf8(Charsetfunctions.java:80)
    at org.java_websocket.WebSocketImpl.deliverMessage(WebSocketImpl.java:561)
    at org.java_websocket.WebSocketImpl.decodeFrames(WebSocketImpl.java:328)
    at org.java_websocket.WebSocketImpl.decode(WebSocketImpl.java:149)
    at org.java_websocket.server.WebSocketServer$WebSocketWorker.run(WebSocketServer.java:593)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:798)
    at org.java_websocket.util.Charsetfunctions.stringUtf8(Charsetfunctions.java:77)

Ilia Mirkin

Aug 26, 2013, 4:13:39 PM
to chri...@gmail.com, prot...@googlegroups.com
On Sun, Aug 25, 2013 at 12:47 AM, <chri...@gmail.com> wrote:
> heartbeatResult = rpc_pb2.HeartbeatResult()
> heartbeatResult.service = "ALERT_SERVICE"
> heartbeatResult.timestamp = st
> heartbeatResult.status_cd = rpc_pb2.OK
> heartbeatResult.status_summary = "OK"
>
> response = rpc_pb2.Response()
> response.service_name = ""
> response.method_name = "SendHeartbeatResult"
> response.client_id = "ALERT_SERVICE"
> response.status_cd = rpc_pb2.OK
> response.response_proto = str(heartbeatResult).encode('utf-8')

I'll admit to not being _entirely_ familiar with the python API, but
shouldn't this be

response.response_proto = heartbeatResult.SerializeToString()

I would semi-assume that str(heartbeatResult) produces a text-encoded
version, but perhaps not.

When in doubt, dump the raw protobuf data bytes received and see
what's going on.

-ilia
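
To illustrate the distinction Ilia raises, here is a minimal Python 3 sketch (the thread's own code is Python 2) that hand-encodes a HeartbeatResult in the protobuf wire format and compares it with an approximation of the text form that str() returns for a message. The helper names and field values are made up for illustration; real code would call SerializeToString() on the generated class instead of encoding by hand.

```python
import struct

def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)

def encode_string_field(field_number, text):
    """Length-delimited field (wire type 2): tag, length, UTF-8 payload."""
    payload = text.encode("utf-8")
    tag = encode_varint((field_number << 3) | 2)
    return tag + encode_varint(len(payload)) + payload

def encode_float_field(field_number, value):
    """32-bit field (wire type 5): tag, then a little-endian IEEE 754 float."""
    return encode_varint((field_number << 3) | 5) + struct.pack("<f", value)

# Hand-built equivalent of a HeartbeatResult; the values are assumptions.
wire_bytes = (
    encode_string_field(1, "ALERT_SERVICE")
    + encode_string_field(2, "2013-08-25 00:47:39")
    + encode_float_field(3, 1.0)
    + encode_string_field(4, "OK")
)

# Approximation of the human-readable text form of the same message.
text_form = (
    'service: "ALERT_SERVICE"\n'
    'timestamp: "2013-08-25 00:47:39"\n'
    'status_cd: 1.0\n'
    'status_summary: "OK"\n'
)

print(wire_bytes.hex())
print(text_form.encode("utf-8").hex())
```

Read as wire format, the first byte of the text form (`s`, 0x73) is a tag for field 14 with wire type 3 (start group), which may be why the Java parser reported a group-tag mismatch for GetHeartbeat.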

Christopher Head

Aug 26, 2013, 8:04:28 PM
to chri...@gmail.com, prot...@googlegroups.com
On Sat, 24 Aug 2013 21:47:39 -0700 (PDT)
chri...@gmail.com wrote:

> response = rpc_pb2.Response()
> response.service_name = ""
> response.method_name = "SendHeartbeatResult"
> response.client_id = "ALERT_SERVICE"
> response.status_cd = rpc_pb2.OK
> response.response_proto = str(heartbeatResult).encode('utf-8')

I agree with Ilia here. Whatever is going on, UTF-8 is not something
that should ever be applied to an encoded protobuf message. An encoded
protobuf message is a sequence of bytes, not characters.

Chris
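
A small Python 3 sketch of the point above, assuming a float field set to 1.0 (the actual value of rpc_pb2.OK is not given in the thread): the bytes protobuf writes for such a field are not valid UTF-8, so any layer that tries to decode the message as text can fail. The function name is hypothetical.

```python
import struct

# Little-endian IEEE 754 encoding of 1.0, the layout protobuf uses for `float`.
float_bytes = struct.pack("<f", 1.0)   # b"\x00\x00\x80\x3f"

def is_valid_utf8(data):
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(is_valid_utf8(b"ALERT_SERVICE"))  # True: plain ASCII is valid UTF-8
print(is_valid_utf8(float_bytes))       # False: 0x80 cannot start a character
```

A message made only of short ASCII strings and small integers can happen to be valid UTF-8, which might explain why the non-nested messages appeared to work.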

chri...@gmail.com

Aug 26, 2013, 8:32:23 PM
to prot...@googlegroups.com, chri...@gmail.com
Thanks for taking the time to read/reply.

GetHeartbeat2 is my original version, which I also included; it is the
way I had thought it should probably work.
This code serializes the inner message to a string and then serializes
the outer message as well.


    response.response_proto = heartbeatResult.SerializeToString()
    self.sendMessage(response.SerializeToString())


GetHeartbeat2 generates this error on the server (I included more of the stack trace this time):

Message: [org.java_websocket.exceptions.InvalidDataException: java.nio.charset.MalformedInputException: Input length = 1
    at org.java_websocket.util.Charsetfunctions.stringUtf8(Charsetfunctions.java:80)
    at org.java_websocket.WebSocketImpl.deliverMessage(WebSocketImpl.java:561)
    at org.java_websocket.WebSocketImpl.decodeFrames(WebSocketImpl.java:328)
    at org.java_websocket.WebSocketImpl.decode(WebSocketImpl.java:149)
    at org.java_websocket.server.WebSocketServer$WebSocketWorker.run(WebSocketServer.java:593)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:798)
    at org.java_websocket.util.Charsetfunctions.stringUtf8(Charsetfunctions.java:77)


Ilia Mirkin

Aug 27, 2013, 1:16:02 AM
to chri...@gmail.com, prot...@googlegroups.com
I would assume that the error is in the sendMessage / receiving logic.
Try dumping out what you're sending and what you're receiving and make
sure they match up.
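
One way to do that comparison, sketched in Python 3 with a hypothetical helper: capture the bytes just before sending and just after receiving, then find the first position where they differ.

```python
def first_mismatch(sent, received):
    """Return the index of the first differing byte, or None if identical."""
    for index, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return index
    if len(sent) != len(received):
        # One side is a prefix of the other; they diverge where it ends.
        return min(len(sent), len(received))
    return None

# Made-up sample data: the receiver saw 0x80 replaced by '?' (0x3f).
sent = bytes([0x0A, 0x02, 0x4F, 0x4B, 0x80])
received = bytes([0x0A, 0x02, 0x4F, 0x4B, 0x3F])

print(sent.hex(), received.hex())
print(first_mismatch(sent, received))
```

If the dumps differ only where bytes are 0x80 or above, a text decoding step somewhere in the path is a likely suspect.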

Christopher Head

Aug 27, 2013, 3:12:30 PM
to chri...@gmail.com, prot...@googlegroups.com
This stack trace suggests you’re still involving UTF-8 somewhere, this
time on the Java side (java.nio.charset, stringUtf8 method name, etc.).
Again, an encoded Protobuf message is a BYTE STRING. You should not be
doing UTF-8 things, or any other text-related things, to it. I have no
idea how Websockets work, but if they are not capable of transporting
byte strings directly (if they only carry text, for example), then you
will have to do further encoding on your Protobuf message—Base64 might
be suitable here.

Chris
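
A minimal Python 3 sketch of the Base64 idea, using made-up sample bytes in place of a real serialized message: the encoded form contains only ASCII characters, so a text-only channel can carry it, and decoding recovers the original bytes exactly.

```python
import base64

# Sample binary payload containing a byte (0x80) that is not valid UTF-8.
binary_payload = bytes([0x1D, 0x00, 0x00, 0x80, 0x3F])

# Encode to an ASCII-only representation that a text channel can carry.
encoded = base64.b64encode(binary_payload)

# The receiver reverses the step to recover the exact original bytes.
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == binary_payload)
```

The cost is size: Base64 output is about one third larger than its input.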

chri...@gmail.com

Sep 25, 2013, 1:24:22 PM
to prot...@googlegroups.com, chri...@gmail.com
Chris,
Your suggestion of trying base64 worked on the nested
protobuf. Thanks to you and Llia again for taking the time to
share your thoughts.

Here is the solution that ended up working with the websocket
server that I'm using.



def GetHeartbeat(self):
    print "GetHeartbeat called"
    heartbeatResult = rpc_pb2.HeartbeatResult()
    heartbeatResult.service = "ALERT_SERVICE"
    heartbeatResult.timestamp = self.getTimestamp()
    heartbeatResult.status_cd = rpc_pb2.OK
    heartbeatResult.status_summary = "OK"

    response = rpc_pb2.Response()
    response.service_name = ""
    response.method_name = "SendHeartbeatResult"
    response.client_id = "ALERT_SERVICE"
    response.status_cd = rpc_pb2.OK
    response.response_proto = base64.b64encode(heartbeatResult.SerializeToString())
    self.sendMessage(response.SerializeToString())
    print "GetHeartbeat finished"

Ilia Mirkin

Sep 25, 2013, 1:27:52 PM
to chri...@gmail.com, prot...@googlegroups.com
I believe Chris's suggestion was to base64-encode the full message,
not the subproto, or you'll run into other problems down the line,
since it sounds like your transport can't handle arbitrary byte
sequences.

IOW, something like

response.response_proto = heartbeatResult.SerializeToString()
self.sendMessage(base64.b64encode(response.SerializeToString()))

Or even better, stick the b64encode into sendMessage itself (and
matching b64decode in the receive logic). But best still would be to
figure out why your transport doesn't seem to be handling arbitrary
byte sequences.

-ilia
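
A rough Python 3 sketch of putting the Base64 step at the transport boundary, so that message-building code only ever deals with raw serialized bytes. The class name, `send_text` callable and `on_message` hook are hypothetical stand-ins, not part of any WebSocket library.

```python
import base64

class Base64Transport:
    """Hypothetical wrapper that applies Base64 only at the transport edge."""

    def __init__(self, send_text):
        # `send_text` is any callable that accepts a str, e.g. a socket send.
        self._send_text = send_text

    def send_message(self, payload):
        """Encode raw bytes as ASCII text and hand them to the transport."""
        self._send_text(base64.b64encode(payload).decode("ascii"))

    @staticmethod
    def on_message(text):
        """Decode incoming text back into the original raw bytes."""
        return base64.b64decode(text)

# Usage with an in-memory list standing in for the network.
wire = []
transport = Base64Transport(wire.append)
serialized = bytes([0x0A, 0x02, 0x4F, 0x4B, 0x1D, 0x00, 0x00, 0x80, 0x3F])
transport.send_message(serialized)
print(Base64Transport.on_message(wire[0]) == serialized)
```

With this layout every message, nested or not, takes the same path, and the receiver never has to guess which parts were encoded.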

chri...@gmail.com

Sep 25, 2013, 3:51:14 PM
to prot...@googlegroups.com, chri...@gmail.com, imi...@alum.mit.edu
Ilia (sorry for misspelling your name previously),

I'll keep your suggestion in mind if I start to have issues when adding
other nested messages.

I did play around with wrapping the entire message in base64, but the
websocket server complains upon receiving the message.
I receive an exception along the lines of "Protocol message tag had invalid wire type".
In any case, wrapping only the nested object seems to be functioning
properly and is allowing me to move forward. I'll update the thread if anything changes.

Cheers!

chri...@gmail.com

Sep 25, 2013, 6:28:01 PM
to prot...@googlegroups.com, chri...@gmail.com, imi...@alum.mit.edu
Last update from me for a while hopefully :)

Ilia, I did look more closely at your suggestion of wrapping the entire message in base64.
It became apparent that I was only decoding base64 in my nested message handler, which
should have been obvious.

I added some conditional decoding logic that checks whether each incoming message is base64-encoded.
I think this change leads to cleaner code overall within the project.
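
The actual check is not shown in the thread; one possible shape for it, as a Python 3 sketch with a hypothetical function name, is to attempt a strict Base64 decode and fall back to the raw bytes when that fails.

```python
import base64
import binascii

def maybe_b64decode(data):
    """Return the decoded bytes if `data` is strict Base64, else `data` itself."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        return data

raw = bytes([0x0A, 0x02, 0x4F, 0x4B, 0x1D, 0x00, 0x00, 0x80, 0x3F])

print(maybe_b64decode(base64.b64encode(raw)) == raw)  # encoded input is decoded
print(maybe_b64decode(raw) == raw)                    # raw input passes through
```

Sniffing like this is a heuristic: a raw payload that happens to consist only of Base64 characters would be decoded by mistake, so always encoding, or adding an explicit marker, is the more robust choice.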

Take care.