Trouble sending String representation of GPB from Java to C++

Jerry

Oct 6, 2011, 12:49:04 PM
to Protocol Buffers
A simple (hopefully) question. I am sending a protocol buffer message
from Windows over EMS to a C++ process that is running on Linux. The
text payload is generated as follows on my Windows system:

person.toByteString().toStringUtf8().

When the C++ side attempts to reanimate my person with the following
code:

google::protobuf::TextFormat::ParseFromString(dataString, &person);

The following error is produced:

"Invalid control characters encountered in text"

Any help would be greatly appreciated.

Also, for reasons that are beside the point, I am unable to send a
byte[] between processes at this time.

Thanks
-Jerry

Jason Hsueh

Oct 6, 2011, 1:03:24 PM
to Jerry, Protocol Buffers
On Thu, Oct 6, 2011 at 9:49 AM, Jerry <gerald...@gmail.com> wrote:
> Simple ( hopefully ) question. I am sending a protocol buffer message
> from windows over EMS to a C++ process that is running on linux. The
> text payload is generated as follows on my windows system:
>
> person.toByteString().toStringUtf8().

Serialized protos are binary data and are generally not valid UTF-8, so you should never handle them as Java String objects. Decoding them as text corrupts the data, making it unparsable on the other side.
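
To make the corruption concrete, here is a minimal sketch that uses only the Java standard library. The three bytes are a hand-written stand-in for a serialized message (field 1 holding the varint 150), not output from the Person class in this thread, and the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8RoundTrip {
    // Decode arbitrary bytes as UTF-8 text, then encode that text back to bytes.
    // This mirrors what happens when binary data is carried in a String payload.
    static byte[] roundTrip(byte[] wire) {
        String asText = new String(wire, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical wire bytes: tag 0x08 (field 1, varint), value 150 (0x96 0x01).
        byte[] wire = {0x08, (byte) 0x96, 0x01};
        byte[] back = roundTrip(wire);

        System.out.println("original:   " + Arrays.toString(wire));
        System.out.println("round trip: " + Arrays.toString(back));
        System.out.println("identical:  " + Arrays.equals(wire, back));
    }
}
```

The byte 0x96 is not a valid UTF-8 sequence on its own, so the decoder substitutes the replacement character U+FFFD, and the original byte cannot be recovered when the String is encoded again.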
 

> When the c++ side attempts to reanimate my person with the following
> code:
>
> google::protobuf::TextFormat::ParseFromString(dataString, &person);

You used the binary format above, but are using the text format here. You should be using Message::ParseFromString() or a similar variant; TextFormat should be paired with the Java TextFormat class.
 


Jerry

Oct 6, 2011, 2:17:29 PM
to Protocol Buffers
Thanks for the quick response, Jason. Just to be clear, are you saying
that I will not be able to send a String representation from Java to
C++ and expect it to work?

In terms of the sending code:
-------
String text = person.toByteString().toStringUtf8();
// or toString(charsetName)
TextMessage tm = session.createTextMessage();
tm.setText(text);
// send
-------

1) Once received, will the text payload NOT be usable to create a
Person on the C++ side using Message::ParseFromString()? Or is it just
that the above code is incorrect? Could I use a different charset?

2) However, the above code would work to send GPBs from Java to Java
as String representations over JMS, for example, correct?



Jason Hsueh

Oct 6, 2011, 2:28:22 PM
to Jerry, Protocol Buffers
On Thu, Oct 6, 2011 at 11:17 AM, Jerry <gerald...@gmail.com> wrote:
> Thanks for the quick response Jason. Just to be clear, are you saying
> that I will not be able to send a String representation from Java to
> C++ and expect it to work.

AFAIK, String coerces text into UTF (I am not a Java guy) so no, you cannot do this. You'd need to use the raw byte array from the ByteString. The String may be an invalid protobuf encoding, meaning it will fail for any protobuf decoder. Java to Java will not work either - the UTF conversion is lossy so the other side cannot recover the original protobuf encoding.
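
For cases where only a text payload can be sent but the binary encoding is still wanted, one option this thread does not discuss is to Base64-encode the raw bytes. The sketch below is an assumption rather than anything the posters tried: it uses the standard java.util.Base64 class, the same hand-written three-byte stand-in for a serialized message, and made-up class and method names.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64Payload {
    // Turn raw bytes into ASCII-only text that is safe to place in a text message.
    static String toText(byte[] wire) {
        return Base64.getEncoder().encodeToString(wire);
    }

    // Recover the exact original bytes from the Base64 text.
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Hypothetical wire bytes standing in for person.toByteArray().
        byte[] wire = {0x08, (byte) 0x96, 0x01};

        String text = toText(wire);
        byte[] back = fromText(text);

        System.out.println("text payload: " + text);
        System.out.println("identical:    " + Arrays.equals(wire, back));
    }
}
```

With this approach the receiver would have to Base64-decode the payload with a decoder of its own choosing before handing the bytes to the binary parser, so it only helps if both ends can be changed.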

Jerry

Oct 6, 2011, 3:09:30 PM
to Protocol Buffers
Just to follow up, on the off chance that someone as stupid as myself
might face a similar issue in the future: as you implied in your first
response, using the GPB text format on the producing side (which
amounts to calling toString()) and the C++ TextFormat on the consuming
side does indeed work. The code below writes and reads the
corresponding GPB text representation, respectively.

1) To illustrate with a code snippet (Java producer)

TextMessage tm = session.createTextMessage(); //JMS
tm.setText(person.toString()); //person is a GPB

2) On the consuming side (in C++)

google::protobuf::TextFormat::ParseFromString(dataString, &person);

Where "dataString" is the TextMessage payload, this successfully creates
a C++ person. The simplest thing should have been tried first. Thanks
