Issue 541 in protobuf: Double decode in google.protobuf.text_format._CUnescape

prot...@googlecode.com

Aug 6, 2013, 6:53:40 AM
to prot...@googlegroups.com
Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 541 by matt.k...@undue.org: Double decode in
google.protobuf.text_format._CUnescape
http://code.google.com/p/protobuf/issues/detail?id=541

What steps will reproduce the problem?

>>> print google.protobuf.text_format._CUnescape('\\x5c')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 691, in _CUnescape
    return result.decode('string_escape')
ValueError: Trailing \ in string

What is the expected output? What do you see instead?

The expected output is a single backslash. _CUnescape works if the input
is instead given in octal:

>>> print google.protobuf.text_format._CUnescape('\\134')
\

When the input is given in hex the escaped backslash is unescaped _twice_,
once in the re.sub() and once in the str.decode().
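
To make the double decode concrete, here is a minimal sketch of the
two-pass pattern described above. It is not the library's code: the regex
and the name naive_cunescape are assumptions for illustration, and the
'unicode_escape' codec is used as a Python 3 stand-in for Python 2's
'string_escape', which behaves the same way for the inputs shown.

```python
import re

# Assumed stand-in for the pre-fix pattern: '\x' plus one or two hex digits.
_HEX = re.compile(r'\\x([0-9a-fA-F]{1,2})')


def naive_cunescape(text):
    """Unescape in two passes, reproducing the double decode."""
    # Pass 1: expand hex escapes into the characters they name.
    first = _HEX.sub(lambda m: chr(int(m.group(1), 16)), text)
    # Pass 2: decode every remaining escape. A backslash produced by
    # pass 1 is now treated as the start of a new escape sequence.
    return first.encode('latin-1').decode('unicode_escape')


# Octal input is only touched by pass 2, so it decodes correctly.
assert naive_cunescape('\\134') == '\\'

# Hex input is decoded twice: '\x5c' followed by 'n' becomes a newline
# instead of a backslash followed by 'n'.
assert naive_cunescape('\\x5cn') == '\n'

# A lone '\x5c' leaves a trailing backslash for pass 2, which raises.
try:
    naive_cunescape('\\x5c')
except ValueError as exc:
    print('ValueError:', exc)
```

The second assertion is the quieter failure mode: no exception is raised,
the result is simply wrong.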

I'm not using the trunk HEAD but I can see that the issue is still present.

prot...@googlecode.com

Aug 6, 2013, 2:27:18 PM
to prot...@googlegroups.com
Updates:
Status: Invalid

Comment #1 on issue 541 by xiaof...@google.com: Double decode in
google.protobuf.text_format._CUnescape
http://code.google.com/p/protobuf/issues/detail?id=541

I think this is fixed in trunk. The current implementation of re.sub() will
only do the unescaping when there is an odd number of backslashes. In the
case you give, re.sub() will do nothing.

prot...@googlecode.com

Aug 7, 2013, 3:32:56 AM
to prot...@googlegroups.com

Comment #2 on issue 541 by matt.k...@undue.org: Double decode in
google.protobuf.text_format._CUnescape
http://code.google.com/p/protobuf/issues/detail?id=541

Don't be confused by the input escaping: there is only one leading
backslash in '\\x5c'.
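
A quick way to check this is to count the characters Python actually
stores. The variable names below are only illustrative.

```python
# A normal string literal: '\\' is one escaped backslash.
one_backslash = '\\x5c'
# A raw string literal: both backslashes are kept as written.
two_backslashes = r'\\x5c'

assert len(one_backslash) == 4           # backslash, 'x', '5', 'c'
assert one_backslash.count('\\') == 1

assert len(two_backslashes) == 5         # two backslashes, 'x', '5', 'c'
assert two_backslashes.count('\\') == 2
```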

I've just checked the trunk and I can see that it is fixed, but for a
different reason -- the regex only matches single-digit hex sequences
(e.g. \x5), thus avoiding the double unescaping.
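
For anyone stuck on an older release, the approach can be sketched roughly
as follows. This is an illustration based on the description in this
thread, not a copy of the trunk code: the pattern, the function names, and
the use of 'unicode_escape' (a Python 3 stand-in for 'string_escape') are
all assumptions. The idea is that the regex pass never decodes anything.
It only pads single-digit hex escapes to two digits, and only when the run
of backslashes before the 'x' is odd, so that all real decoding happens
exactly once.

```python
import re

# Assumed pattern: a run of backslashes, 'x', exactly one hex digit,
# and no second hex digit after it.
_SINGLE_HEX = re.compile(r'(\\+)x([0-9a-fA-F])(?![0-9a-fA-F])')


def pad_single_digit_hex(text):
    """Rewrite '\\x5' as '\\x05' without decoding anything."""
    def replace(match):
        # An odd run of backslashes means the last one starts the escape.
        if len(match.group(1)) % 2 == 1:
            return match.group(1) + 'x0' + match.group(2)
        # An even run is escaped backslashes followed by a literal 'x'.
        return match.group(0)
    return _SINGLE_HEX.sub(replace, text)


def cunescape(text):
    """Decode escapes in a single pass after padding short hex escapes."""
    padded = pad_single_digit_hex(text)
    return padded.encode('latin-1').decode('unicode_escape')


assert cunescape('\\x5c') == '\\'       # two-digit hex: one backslash
assert cunescape('\\x5cn') == '\\n'     # backslash then 'n', not a newline
assert cunescape('\\x5') == '\x05'      # single digit is padded, then decoded
assert cunescape('\\\\x5') == '\\x5'    # escaped backslash, literal 'x5'
```

Note that 'unicode_escape' also accepts escapes such as \u and \N that
'string_escape' does not, so this sketch is only equivalent for inputs
limited to octal, hex, and the common single-character escapes.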

My problem now is that I don't have a fixed protobuf library available in
my distro.

prot...@googlecode.com

Aug 7, 2013, 1:54:07 PM
to prot...@googlegroups.com

Comment #3 on issue 541 by xiaof...@google.com: Double decode in
google.protobuf.text_format._CUnescape
http://code.google.com/p/protobuf/issues/detail?id=541

You are right. I misread '\\x5c' as having two backslashes.
The fixed code is already in the 2.5.0 release. Maybe it won't be long
before it's available in your distro.