
A bug for raw string literals in Py3k?


Yingjie Lan

Oct 31, 2010, 6:54:10 AM
to python list
Hi,

I tried this in the IDLE (version 3.1.2) shell:

>>> r'\'
SyntaxError: EOL while scanning string literal

But according to the py3k docs
(http://docs.python.org/release/3.0.1/whatsnew/3.0.html):

All backslashes in raw string literals are interpreted literally.

So I suppose this is a bug?

Yingjie



Martin v. Loewis

Oct 31, 2010, 7:41:47 AM
to Yingjie Lan
> So I suppose this is a bug?

It's not, see

http://docs.python.org/py3k/reference/lexical_analysis.html#literals

# Specifically, a raw string cannot end in a single backslash
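For anyone who needs a string that does end in a backslash, here is a minimal sketch of some common workarounds. The path `C:\temp` and the variable names are made up for illustration.

```python
# A raw string literal cannot end in an odd number of backslashes,
# so the trailing backslash has to come from somewhere else.
single = '\\'                  # ordinary literal: exactly one backslash

path_a = r'C:\temp' + '\\'     # raw part plus an escaped backslash
path_b = r'C:\temp' '\\'       # adjacent literals are joined at compile time
path_c = r'C:\temp\ '[:-1]     # raw literal with a throwaway trailing space

print(len(single))                   # 1
print(path_a == path_b == path_c)    # True
print(path_a.endswith(single))       # True
```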

Regards,
Martin

Yingjie Lan

Oct 31, 2010, 8:23:36 AM
to pytho...@python.org

Thanks! That looks weird to me ... doesn't this contradict:

All backslashes in raw string literals are interpreted literally.

(see http://docs.python.org/release/3.0.1/whatsnew/3.0.html)

Best,

Yingjie


John Machin

Oct 31, 2010, 5:41:15 PM

All backslashes in syntactically-correct raw string literals are
interpreted literally.

Ben Finney

Oct 31, 2010, 6:41:29 PM
John Machin <sjma...@lexicon.net> writes:

> All backslashes in syntactically-correct raw string literals are
> interpreted literally.

That's a good way of putting it.

Yingjie, in case it's not clear: Python can only know what you've
written if it's syntactically correct. The backslash preceding the final
quote means that the code contains bad syntax, not a raw string literal.

Since there's no raw string literal, “backslashes in raw string
literals” doesn't apply.
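A small sketch of this point, assuming a hypothetical helper named `compiles`: the text `r'\'` is rejected before any string object exists, while `r'\\'` is accepted and keeps both backslashes. The source texts are built with `chr(92)` so that the example itself stays easy to read.

```python
import ast

def compiles(source):
    """Return True if `source` is a syntactically valid expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

BACKSLASH = chr(92)

one = "r'" + BACKSLASH + "'"        # the source text  r'\'
two = "r'" + BACKSLASH * 2 + "'"    # the source text  r'\\'

print(compiles(one))                # False: the literal is never closed
print(compiles(two))                # True
print(len(ast.literal_eval(two)))   # 2: both backslashes are kept
```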

--
\ “If sharing a thing in no way diminishes it, it is not rightly |
`\ owned if it is not shared.” —Augustine of Hippo (354–430 CE) |
_o__) |
Ben Finney

Yingjie Lan

Oct 31, 2010, 11:30:22 PM
to pytho...@python.org
> > > All backslashes in raw string literals are interpreted literally.
> > > (see http://docs.python.org/release/3.0.1/whatsnew/3.0.html):
> >
> > All backslashes in syntactically-correct raw string literals are
> > interpreted literally.
>
> That's a good way of putting it.

Syntactic correctness obviously depends on the syntax specification.
Cancelling the special meaning of ALL backslashes in a raw string literal
makes a lot of sense to me. Currently, the behavior of backslashes
in a raw string literal is rather complicated, I think.
In fact, a backslash can still escape a quote in a raw string,
and on the other hand, it also remains in the string. I'm
wondering what kind of use case there is to justify such behavior?
Surely my experience is way too limited to make a solid judgement;
I hope others can shed light on this issue.
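To make the behavior described above concrete, here is a short sketch (the variable names are only for illustration): the backslash stops the following quote from ending the literal, yet it also stays in the resulting string.

```python
# Source characters: r ' \ ' '
# The backslash keeps the second quote from closing the literal,
# but the backslash itself is still part of the value.
escaped_quote = r'\''
print(len(escaped_quote))        # 2
print(escaped_quote == "\\'")    # True: a backslash, then a single quote

# A doubled backslash is likewise kept as two characters.
doubled = r'\\'
print(len(doubled))              # 2
```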

Yingjie



MRAB

Oct 31, 2010, 11:51:57 PM
to pytho...@python.org
It has been discussed briefly here: http://bugs.python.org/issue1271

According to msg56377, the behaviour is "optimal" for regular
expressions. Well, I use regular expressions a lot, and I still think
it's a nuisance!
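As a guess at the kind of pattern that argument has in mind (this is an illustration, not something taken from the issue itself): in a regular expression `\'` matches a plain quote, so a raw pattern can mention both quote types without switching delimiters.

```python
import re

# The raw literal keeps the backslash, and the regex engine then
# treats \' as an escaped (literal) single quote.
any_quote = r'[\'"]'

text = 'say "hi" and \'bye\''
print(re.findall(any_quote, text))   # ['"', '"', "'", "'"]
```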

Yingjie Lan

Nov 1, 2010, 12:44:47 AM
to pytho...@python.org
> According to msg56377, the behaviour is "optimal" for regular
> expressions. Well, I use regular expressions a lot, and I
> still think it's a nuisance!

Thanks for bringing that up.

Using an otherwise 'dead' backslash to escape quotes
in raw strings seems like the black magic of
necromancy to me. :)

To include quotes in a string, there are a couple of
known choices: If you need single quotes in the string,
start the literal with a double quote, and vice versa.
In case you need both, you can use a long string:

>>> r''''ab\c"'''

Note that when the last character is also a quote, we can
use the other type of quote three times to delimit the
long string. Of course, there are still some corner cases:

1. when we need three consecutive single quotes AND three
consecutive double quotes in the string.

2. When the last is a single quote, and we also need
three consecutive double-quotes in the string,
or the other way around.

Then we can abandon the raw string literal, or use
concatenation of string literals wisely to get it done.
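A rough sketch of the long-string example and of the concatenation approach for the two corner cases; the sample texts are invented for illustration.

```python
# The long-string example: both quote types inside one raw literal.
both = r''''ab\c"'''
print(both)        # 'ab\c"
print(len(both))   # 6

# Corner case 1: the text needs ''' and """ at the same time.
# Each piece is delimited by the quote type it does not contain;
# adjacent literals are joined at compile time.
tricky = r"""uses ''' and \n, """ r'''then """ as well'''
print(tricky)

# Corner case 2: the text contains """ and must end in a single quote.
ends_with_quote = r'''has """ inside, ends with ''' "'"
print(ends_with_quote)
```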

But overall, I would still vote against the necromancy.

Yingjie


