[Python-Dev] Raw strings ending with a backslash


Steven D'Aprano

May 28, 2022, 5:32:21 AM
to pytho...@python.org
Now that we have a new parser for CPython, can we fix the old gotcha
that raw strings cannot end in a backslash?

It's an FAQ and has come up again on the bug tracker.

https://docs.python.org/3/faq/design.html#id26

https://github.com/python/cpython/issues/93314
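
For readers unfamiliar with the gotcha, here is a minimal sketch of it. The `compiles` helper and the example path are illustrative assumptions, not taken from the FAQ or the issue; the two workarounds follow the ones suggested in the design FAQ.

```python
def compiles(source: str) -> bool:
    """Return True if `source` is a valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# The source text is built with ordinary (non-raw) strings, so "\\" is one backslash.
odd_backslash = "r'C:\\'"      # the literal r'C:\' : the \' does not close the string
even_backslash = "r'C:\\\\'"   # the literal r'C:\\' : valid, and keeps BOTH backslashes

print(compiles(odd_backslash))
print(compiles(even_backslash))

# Two workarounds in the style of the design FAQ, applied to a made-up path.
via_concat = r"C:\dir" "\\"      # adjacent literals are concatenated
via_slice = r"C:\dir\ "[:-1]     # add a trailing space, then slice it off
print(via_concat == via_slice)
```

An even number of trailing backslashes is accepted, but every backslash stays in the resulting string, so that does not produce a path ending in a single backslash.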



--
Steve
_______________________________________________
Python-Dev mailing list -- pytho...@python.org
To unsubscribe send an email to python-d...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/pytho...@python.org/message/A437MSH3QO6CG2JIZHDTEDVUZZ2TCRYI/
Code of Conduct: http://python.org/psf/codeofconduct/

Serhiy Storchaka

May 28, 2022, 7:20:33 AM
to pytho...@python.org
28.05.22 12:22, Steven D'Aprano wrote:
> Now that we have a new parser for CPython, can we fix the old gotcha
> that raw strings cannot end in a backslash?
>
> It's an FAQ and has come up again on the bug tracker.
>
> https://docs.python.org/3/faq/design.html#id26
>
> https://github.com/python/cpython/issues/93314

I do not think that we can allow this, and it is not related to the parser.

A few years ago I experimented with such a change:
https://github.com/python/cpython/pull/15217

You can see that it breaks even some stdlib code, and it will definitely
break many third-party packages and examples. Technically we can do
this, but the benefit is small in comparison with the cost.
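
As a rough illustration of the kind of code that depends on the current rule (a hypothetical pattern, not one taken from the PR): inside a raw string, a backslash followed by a quote keeps both characters and does not end the literal.

```python
import re

# In a raw string, \' keeps the backslash AND the quote, and does not end the literal.
two_chars = r'\''
print(len(two_chars))

# Hypothetical pattern: a character class matching either kind of quote.
# The regex engine reads \' as an escaped quote, so the class is {', "}.
quote_class = re.compile(r'[\'"]')
found = quote_class.findall("say \"hi\" or 'bye'")
print(found)
```

If the backslash no longer protected the quote, a literal like the pattern above would presumably end right after `[\`, and the remainder of the line would stop parsing.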


Damian Shaw

May 28, 2022, 7:58:53 AM
to Python-Dev
That PR seems to make \' and \" not special in general, right?

I think this is a more limited proposal, to only change the behavior when \ is at the end of a string, so the only behavior difference would be never receiving the error "SyntaxError: EOL while scanning string literal".

In which case there should be no backwards compatibility issue.

Damian

Barney Gale

May 28, 2022, 8:10:51 AM
to Damian Shaw, Python-Dev
Personally I'd expect these two lines to do the same thing, whatever that thing is:

path = 'C:\'
path = ('C:\')

Barney

Serhiy Storchaka

May 28, 2022, 8:25:04 AM
to pytho...@python.org
28.05.22 14:57, Damian Shaw wrote:
> That PR seems to make \' and \" not special in general right?
>
> I think this is a more limited proposal, to only change the behavior
> when \ is at the end of a string, so the only behavior difference would
> never receiving the error "SyntaxError: EOL while scanning string literal"
>
> In which case there should be no backwards compatibility issue.

How do you know that it is at the end of a string?


Eric V. Smith

May 28, 2022, 10:23:04 AM
to pytho...@python.org
On 5/28/2022 7:57 AM, Damian Shaw wrote:
> That PR seems to make \' and \" not special in general right?
>
> I think this is a more limited proposal, to only change the behavior when \ is at the end of a string, so the only behavior difference would never receiving the error "SyntaxError: EOL while scanning string literal"

How would you know where the end of a string is? I think this is one of those things that's easy to look at for a human and figure out the intent, but not so easy for the lexer, without some heuristics and backtracking. If the trailing single quote is removed below, it changes from "backslash in the middle of a string" to "backslash at the end of a string, followed by an arbitrary expression."

r'\' + "foo"'
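
A small sketch of how the current lexer treats that line, using the standard `tokenize` and `ast` modules. The helper names are made up for illustration.

```python
import ast
import io
import tokenize


def string_tokens(source: str) -> list[str]:
    """Return the text of every STRING token in `source`."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.STRING
    ]


def is_valid_expression(source: str) -> bool:
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Source text written with ordinary strings, so "\\" is a single backslash.
with_trailing_quote = "r'\\' + \"foo\"'"    # r'\' + "foo"'
without_trailing_quote = "r'\\' + \"foo\""  # r'\' + "foo"

# With the trailing quote the whole line is ONE raw string literal.
print(string_tokens(with_trailing_quote))
print(ast.literal_eval(with_trailing_quote))

# Without it, the literal opened by r' never closes under today's rules.
print(is_valid_expression(without_trailing_quote))
```

Under the proposed rule, the second form would have to become a one-character string concatenated with "foo", which means the lexer could not decide what `\'` means until it had seen the rest of the line.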

Eric

Damian Shaw

May 28, 2022, 11:05:11 AM
to Python-Dev
My understanding was that this was part of the question being asked: is it possible to know that with the new PEG parser?

MRAB

May 28, 2022, 11:11:10 AM
to pytho...@python.org
On 2022-05-28 13:17, Serhiy Storchaka wrote:
> 28.05.22 14:57, Damian Shaw wrote:
>> That PR seems to make \' and \" not special in general right?
>>
>> I think this is a more limited proposal, to only change the behavior
>> when \ is at the end of a string, so the only behavior difference would
>> never receiving the error "SyntaxError: EOL while scanning string literal"
>>
>> In which case there should be no backwards compatibility issue.
>
> How do you know that it is at the end of a string?
>
It would also affect triple-quoted strings.

Here's an idea: prefix rr ("really raw") that would treat all
backslashes literally.

Serhiy Storchaka

May 28, 2022, 2:23:49 PM
to pytho...@python.org
28.05.22 18:03, Damian Shaw wrote:
> My understanding was that was part of the question being asked, is it
> possible to know what with the new PEG parser?

You first need to define what the end of a string is. And I think that is
not relevant to the grammar parser.


MRAB

May 28, 2022, 3:12:16 PM
to pytho...@python.org
On 2022-05-28 16:03, MRAB wrote:
> On 2022-05-28 13:17, Serhiy Storchaka wrote:
>> 28.05.22 14:57, Damian Shaw wrote:
>>> That PR seems to make \' and \" not special in general right?
>>>
>>> I think this is a more limited proposal, to only change the behavior
>>> when \ is at the end of a string, so the only behavior difference would
>>> never receiving the error "SyntaxError: EOL while scanning string literal"
>>>
>>> In which case there should be no backwards compatibility issue.
>>
>> How do you know that it is at the end of a string?
>>
> It would also affect triple-quoted strings.
>
> Here's an idea: prefix rr ("really raw") that would treat all
> backslashes literally.

Here's something I've just realised.

Names in Python are case-sensitive, yet the string prefixes are
case-/insensitive/.

Why?

Chris Angelico

May 28, 2022, 3:18:38 PM
to pytho...@python.org
On Sun, 29 May 2022 at 05:05, MRAB <pyt...@mrabarnett.plus.com> wrote:
>
> On 2022-05-28 16:03, MRAB wrote:
> > On 2022-05-28 13:17, Serhiy Storchaka wrote:
> >> 28.05.22 14:57, Damian Shaw wrote:
> >>> That PR seems to make \' and \" not special in general right?
> >>>
> >>> I think this is a more limited proposal, to only change the behavior
> >>> when \ is at the end of a string, so the only behavior difference would
> >>> never receiving the error "SyntaxError: EOL while scanning string literal"
> >>>
> >>> In which case there should be no backwards compatibility issue.
> >>
> >> How do you know that it is at the end of a string?
> >>
> > It would also affect triple-quoted strings.
> >
> > Here's an idea: prefix rr ("really raw") that would treat all
> > backslashes literally.
>
> Here's something I've just realised.
>
> Names in Python are case-sensitive, yet the string prefixes are
> case-/insensitive/.
>
> Why?

Technically they're not, but there are aliases. Kinda like
threading.currentThread().

ChrisA

Guido van Rossum

May 28, 2022, 3:51:43 PM
to MRAB, pytho...@python.org


On Sat, May 28, 2022 at 12:11 MRAB wrote:
> Names in Python are case-sensitive, yet the string prefixes are
> case-/insensitive/.
>
> Why?

IIRC we copied this from C for numeric suffixes (0l and 0L are the same; also hex digits and presumably 0XA == 0xa) and then copied that for string prefixes without thinking about it much. I guess it’s too late to change.
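
A quick sketch of that case-insensitivity as it stands in Python 3. The variable and helper names are illustrative only; note that the `0l`/`0L` long suffix existed in Python 2 and is no longer accepted.

```python
# Numeric literals: base markers, exponent and imaginary suffixes ignore case.
numeric_pairs = [(0XA, 0xa), (0B101, 0b101), (0O17, 0o17), (1E3, 1e3), (1J, 1j)]

# String prefixes: the upper-case spelling yields the same value as the lower-case one.
string_pairs = [
    (R"\n", r"\n"),
    (B"abc", b"abc"),
    (RB"\n", rb"\n"),
    (bR"\n", Br"\n"),
    (F"{1 + 1}", f"{1 + 1}"),
    (U"x", u"x"),
]

all_equal = all(upper == lower for upper, lower in numeric_pairs + string_pairs)
print(all_equal)


def is_valid_expression(source: str) -> bool:
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# The C-style long suffix is a syntax error in Python 3.
print(is_valid_expression("0L"))
```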

—Guido
--
--Guido (mobile)

Gregory P. Smith

May 28, 2022, 4:21:34 PM
to gu...@python.org, pytho...@python.org
Given that 99.99% of code uses lowercase string prefixes, we could change it; it would just take a longer deprecation cycle. You'd probably want a few releases where the uppercase prefixes become an error in files without a `from __future__ import case_sensitive_quote_prefixes`, rather than jumping straight from a parse-time DeprecationWarning to repurposing the uppercase prefixes to have a new meaning. The inertia behind doing that over the course of 5+ years is high, which implies that we'd need a compelling reason to orchestrate it. None has sprung up.

-gps



Jonathan Goble

May 28, 2022, 7:41:00 PM
to Python Dev
Trying again after I was mysteriously moderated. Thanks Ethan for fixing that.

On Sat, May 28, 2022 at 6:59 PM Jonathan Goble <jcgo...@gmail.com> wrote:
On Sat, May 28, 2022, 4:25 PM Gregory P. Smith <gr...@krypto.org> wrote:
> On Sat, May 28, 2022 at 12:55 PM Guido van Rossum <gu...@python.org> wrote:
>> On Sat, May 28, 2022 at 12:11 MRAB wrote:
>>> Names in Python are case-sensitive, yet the string prefixes are
>>> case-/insensitive/.
>>>
>>> Why?
>>
>> IIRC we copied this from C for numeric suffixes (0l and 0L are the same; also hex digits and presumably 0XA == 0xa) and then copied that for string prefixes without thinking about it much. I guess it’s too late to change.
>
> Given that 99.99% of code uses lower case string prefixes we could change it, it'd just take a longer deprecation cycle - you'd probably want a few releases where the upper case prefixes become an error in files without a `from __future__ import case_sensitive_quote_prefixes` rather than jumping straight from parse time DeprecationWarning to repurposing the uppercase to have a new meaning. The inertia behind doing that over the course of 5+ years is high. Implying that we'd need a compelling reason to orchestrate it. None has sprung up.

There already is a semantic meaning in one case, though not in Python proper. Some syntax highlighters, including the one used in VSCode, treat r and R differently: the former is syntax highlighted as a regex and the latter is syntax highlighted as a generic string. I have seen project-specific style guides advising to use r/R accordingly.

So there is meaningful use of capital R in real-world code, and any future change to require lowercase would need to at least consider the impact on that use case. 

Steven D'Aprano

May 30, 2022, 5:51:28 AM
to pytho...@python.org
Thank you to everyone who responded. It is now clear to me that this
genuinely is a feature, not a bug or limitation of the parser or lexer,
and that there is code relying on that behaviour, including in the
stdlib, so we shouldn't change it even if we could.


--
Steve