Problem with accent

Simon

Mar 23, 2012, 9:07:02 AM
to Protocol Buffers
Hi guys,

I have an annoying problem with accented characters.
I build my proto object with no problem, but when I read it in the
browser using the .toString function, I get \303\240 instead of "à",
\303\250 instead of "è", etc.

So I'm wondering where the problem could be.
Eclipse encodes the files in UTF-8, and so does Maven.

I just don't know where to look :/

Thanks !

Evan Jones

Mar 26, 2012, 8:33:18 PM
to Simon, Protocol Buffers
On Mar 23, 2012, at 9:07, Simon wrote:
> I have an annoying problem with accented characters.
> I build my proto object with no problem, but when I read it in the
> browser using the .toString function, I get \303\240 instead of "à",
> \303\250 instead of "è", etc.

What do you mean by "I want to read it in the browser using the .toString function"? Is this Java, C++, or something else? What does your message definition look like?

By default, protocol buffers encode strings as UTF-8. These bytes look like correctly encoded UTF-8, so the "sending" side is doing the right thing, but the code that reads them is not decoding them correctly:

à = U+00E0

Its UTF-8 encoding, escaped in hexadecimal, is: "\xc3\xa0"
Escaped in octal, it is: "\303\240"
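
If you want to check those escapes yourself, here is a rough sketch in Python (picked only because the thread doesn't say which language is in use). The helpers octal_escape and unescape_octal are hypothetical names made up for this illustration, not part of the protobuf API. The first renders bytes as octal escapes; the second turns the escapes back into bytes and decodes them as UTF-8, assuming every escape has exactly three octal digits.

```python
import re


def octal_escape(data: bytes) -> str:
    """Render every byte as a three-digit octal escape (hypothetical helper)."""
    return "".join(f"\\{b:03o}" for b in data)


def unescape_octal(text: str) -> str:
    """Turn \\NNN octal escapes back into bytes, then decode as UTF-8.

    Hypothetical helper; assumes each escape is exactly three octal digits
    with a value no larger than 0o377 (one byte).
    """
    raw = re.sub(
        rb"\\([0-3][0-7]{2})",
        lambda m: bytes([int(m.group(1), 8)]),
        text.encode("utf-8"),
    )
    return raw.decode("utf-8")


encoded = "à".encode("utf-8")          # U+00E0 as UTF-8 bytes
print(encoded.hex())                   # c3a0
print(octal_escape(encoded))           # \303\240
print(unescape_octal(r"\303\240 \303\250"))  # à è
```

The point is only that \303\240 is the two-byte UTF-8 form of "à" written out byte by byte, so nothing has been lost; the bytes just need to be interpreted as UTF-8 on the reading side.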


So you need to decode the bytes as UTF-8 to get the correct characters. Hope this helps,

Evan

--
http://evanjones.ca/
