[De]serialization of messages to Java strings

Will Morton

Nov 24, 2009, 6:16:35 PM
to Protocol Buffers
Hello all;

I need to serialize a protobuf message to a string so that it can be
passed outside my program. The code below fails, I'm guessing due to
UTF-8 encoding issues:

byte[] arr = msg.toByteArray();
String str = new String(arr);
// ... pass str around ...
MsgType msg2 = MsgType.parseFrom(str.getBytes()); // <-- throws InvalidProtocolBufferException

So, reading the API, I thought I should use ByteStrings, with their
handy UTF8 encoding methods, but this doesn't work either:

ByteString bs = msg.toByteString();
String str = bs.toStringUtf8();
// ... pass str around ...
ByteString bs2 = ByteString.copyFromUtf8(str);
MsgType msg2 = MsgType.parseFrom(bs2); // <-- Still throws exception

What am I doing wrong? What's the best way to do Java string
serialization of protobuf messages?

Thanks in advance,

Will

Kenton Varda

Nov 24, 2009, 7:01:29 PM
to Will Morton, Protocol Buffers
Strings contain text, not arbitrary bytes.  Encoded protocol buffers are arbitrary bytes, not text.  So, they aren't compatible.  You would need to do something like base-64 encode the data in order to put it in a String.
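
A minimal sketch of the base-64 approach, assuming Java 8 or later, where java.util.Base64 is part of the standard library (it did not exist when this thread was written; a third-party codec would have been needed then). The class and method names are hypothetical, and the three-byte array stands in for the result of msg.toByteArray().

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64RoundTrip {
    // Encode arbitrary bytes as an ASCII-only String that is safe to pass around.
    static String toText(byte[] wire) {
        return Base64.getEncoder().encodeToString(wire);
    }

    // Decode the String back into exactly the original bytes.
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Stand-in for msg.toByteArray(): field 1, varint value 150.
        byte[] wire = {0x08, (byte) 0x96, 0x01};

        String text = toText(wire);
        byte[] back = fromText(text);

        System.out.println(text);                      // prints "CJYB"
        System.out.println(Arrays.equals(wire, back)); // prints "true"
        // With a real message: MsgType.parseFrom(back)
    }
}
```

The encoded string is about a third larger than the raw bytes, which is the cost of restricting the output to printable ASCII.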


Will Morton

Nov 24, 2009, 7:14:12 PM
to Adam Vartanian, Protocol Buffers
2009/11/25 Adam Vartanian <flo...@google.com>:
>> What am I doing wrong?  What's the best way to do java string
>> serialization of protobuf messages?
>
> If you absolutely have to pass things around as a String, you're going
> to need to do so in some kind of encoding that supports arbitrary
> data.  For example, you could encode it in Base64.
>

Great, thanks guys... I was wondering if protobuf had a more efficient
string-safe encoding, but I'll just base64 it.

Cheers!

Will

Kenton Varda

Nov 24, 2009, 7:35:05 PM
to maca...@well.com, Adam Vartanian, Protocol Buffers
You can use TextFormat but it is probably *less* efficient than base64.


Adam Vartanian

Nov 24, 2009, 7:02:21 PM
to Will Morton, Protocol Buffers
> What am I doing wrong?  What's the best way to do java string
> serialization of protobuf messages?

The native wire format of protocol buffers is just a sequence of
bytes, so it can contain values that are invalid UTF-8 (or any
encoding that has invalid byte sequences). Trying to pack that into a
String, which holds Unicode character data, isn't going to work well;
Strings are welcome to mangle the bytes however they want as long as
the same characters are represented. If you want to pass a serialized
protocol buffer to something else, you should generally use a
ByteString, byte[], or ByteBuffer.
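
The mangling described above can be reproduced with the standard library alone. This is a sketch, not the original poster's code: the class and method names are hypothetical, the charset is passed explicitly rather than relying on the platform default, and the three-byte array stands in for a serialized message.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LossyStringDemo {
    // Decode bytes as UTF-8 text, then encode the text back to bytes.
    // Malformed input is silently replaced with U+FFFD during decoding.
    static byte[] throughUtf8String(byte[] wire) {
        String str = new String(wire, StandardCharsets.UTF_8);
        return str.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for msg.toByteArray(): field 1, varint value 150.
        // 0x96 on its own is a UTF-8 continuation byte, so it is malformed here.
        byte[] wire = {0x08, (byte) 0x96, 0x01};
        byte[] back = throughUtf8String(wire);

        System.out.println(Arrays.toString(wire));     // prints "[8, -106, 1]"
        System.out.println(Arrays.toString(back));     // prints "[8, -17, -65, -67, 1]"
        System.out.println(Arrays.equals(wire, back)); // prints "false"
    }
}
```

The single byte 0x96 comes back as the three bytes EF BF BD (the UTF-8 encoding of U+FFFD), so the parser sees different data from what was serialized.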

If you absolutely have to pass things around as a String, you're going
to need to do so in some kind of encoding that supports arbitrary
data. For example, you could encode it in Base64.

- Adam