[Python-ideas] Verbatim names (allowing keywords as names)


Steven D'Aprano

May 15, 2018, 8:43:01 PM
to python...@python.org
Inspired by Alex Brault's post:

https://mail.python.org/pipermail/python-ideas/2018-May/050750.html

I'd like to suggest we copy C#'s idea of verbatim identifiers, but using
a backslash rather than @ sign:

    \name

would allow "name" to be used as an identifier, even if it clashes with
a keyword.

It would *not* allow the use of characters that aren't valid in
identifiers, e.g. this is out:

    \na!me  # still not legal

See usage #1 here:

https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/verbatim


If "verbatim name" is too long, we could call them "raw names", by
analogy with raw strings.

I believe that \ is currently illegal in any Python expression, except
inside strings and at the very end of the line, so this ought to be
syntactically unambiguous.
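
As a quick check of that claim against current Python, the sketch below compiles a few snippets and reports which ones parse. The helper name `parses` is an assumption made for this illustration; only the built-in `compile()` is used.

```python
def parses(source):
    """Return True if current Python accepts `source` as a module."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A backslash inside a string literal is fine today.
print(parses(r"s = '\n'"))
# A backslash at the very end of a line is a line continuation.
print(parses("x = 1 + \\\n    2"))
# A backslash directly before a name is rejected today.
print(parses(r"\name = 1"))
print(parses(r"x = something.\name"))
```

If the last two cases are errors today, no existing valid program can contain them, so giving them a meaning should not change how any current code is read.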

We should still include a (mild?) recommendation against using keywords
unless necessary, and a (strong?) preference for the trailing underscore
convention. But I think this doesn't look too bad:

    of = 'output.txt'
    \if = 'input.txt'
    with open(\if, 'r'):
        with open(of, 'w'):
            of.write(\if.read())

maybe even nicer than if_.

Some examples:

    result = \except + 1

    result = something.\except

    result = \except.\finally


--
Steve
_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Tim Delaney

May 15, 2018, 8:54:52 PM
to Python-Ideas
On Wed, 16 May 2018 at 10:42, Steven D'Aprano <st...@pearwood.info> wrote:
> Inspired by Alex Brault's post:
>
> https://mail.python.org/pipermail/python-ideas/2018-May/050750.html
>
> I'd like to suggest we copy C#'s idea of verbatim identifiers, but using
> a backslash rather than @ sign:
>
>     \name

Personally, I prefer $name as originally suggested in that thread. It has the advantage of precedent from other languages as a variable indicator (even as far back as BASIC for me) and is more visible (which IMO is a positive, but others may see as a negative).

In either case, I'm +1 for some way of indicating a verbatim identifier.

> But I think this doesn't look too bad:
>
>     of = 'output.txt'
>     \if = 'input.txt'
>     with open(\if, 'r'):
>         with open(of, 'w'):
>             of.write(\if.read())

    of = 'output.txt'
    $if = 'input.txt'
    with open($if, 'r'):
        with open(of, 'w'):
            of.write($if.read())

> maybe even nicer than if_.
>
> Some examples:
>
>     result = \except + 1

    result = $except + 1

>     result = something.\except

    result = something.$except

>     result = \except.\finally

    result = $except.$finally

    @\return
    def func():
        pass

    @$return
    def func():
        pass

Tim Delaney

Guido van Rossum

May 15, 2018, 9:10:29 PM
to Steven D'Aprano, Python-Ideas
I like it. I much prefer \ to $ since in most languages that use $ that I know of (Perl, shell) there's a world of difference between $foo and foo whenever they occur (basically they never mean the same thing), whereas at least in shell, \foo means the same thing as foo *unless* foo would otherwise have a special meaning.

I also recall that in some Fortran dialect I once used, $ was treated as the 27th letter of the alphabet, but not in the language standard. See e.g. https://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/Dollar-Signs.html. Apparently it has a similar role in Java (https://stackoverflow.com/questions/7484210/what-is-the-meaning-of-in-a-variable-name).
--
--Guido van Rossum (python.org/~guido)

Franklin? Lee

May 15, 2018, 9:54:40 PM
to Guido van Rossum, Python-Ideas
I assume there can't be space between the backslash and the name, to
prevent ambiguity like in the following:

# Is this `foo = not [1]` or `foo = \not[1]`?
foo = (\
not[1])
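
For reference, current Python already gives the first reading a meaning, because a backslash followed by a newline is a line continuation. A small sketch (the variable names are only for illustration):

```python
# Backslash-newline joins the two physical lines, so this is `foo = (not[1])`.
source = "foo = (\\\nnot[1])\n"

namespace = {}
exec(source, namespace)

# `not [1]` is False because a non-empty list is truthy.
print(namespace["foo"])
```

So any rule for verbatim names would presumably need the backslash to be followed directly by the name in order to leave this existing behaviour alone.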


A sampling of \ in other languages, for consideration:
- Haskell: A lambda. E.g. `\x -> x+1`
- TeX: A command. E.g. `\bold{Some text.}`.
- Matlab: Matrix division operator.
- PHP: Namespace delimiter. Uh.

While LaTeX users have some intersection with Python users (Scipy!), I
think there are enough differences in the languages that this one more
won't hurt.

Carl Smith

May 15, 2018, 11:04:30 PM
to Python-Ideas
 
On Tue, May 15, 2018 at 8:41 PM, Steven D'Aprano <st...@pearwood.info> wrote:
> Inspired by Alex Brault's post:
>
> https://mail.python.org/pipermail/python-ideas/2018-May/050750.html
>
> I'd like to suggest we copy C#'s idea of verbatim identifiers, but using
> a backslash rather than @ sign:
>
>     \name
>
> would allow "name" to be used as an identifier, even if it clashes with
> a keyword.

I strongly disagree, but can't seem to get anyone to bite.

We want to be able to introduce a keyword that was formerly a name, still
allow it to be used as a name, and still allow code that uses it as a keyword
to interoperate with code that uses it as a name, without changing the
language or implementation too much.

Ideally, Python would still not allow the keyword to be used as a name and a
keyword in the same file.

The lexer could class the tokens as keynames, and the parser could use the
context of the first instance of each keyname to determine if it's a name or
keyword for the rest of that file. Projects that used the word as a name would
only be prevented from also using it as a keyword in the same file.

It's really then a question of whether users could elegantly and naturally
reference a name in another module without introducing the name to the
current module's namespace.

We only reference external names (as syntactic names) in import statements,
as properties after the dot operator, and as keyword arguments.

If code that used the word as a keyword was still allowed to use the word as
a name after the dot operator and as a keyword argument *in an invocation*,
it would only change the language in a subtle way.

If we could reference keynames in import statements, but not import the name,
so basically allow `from keyname import keyname as name`, but not allow
`import keyname`, we could still easily import things that used the keyname
as a name. This wouldn't change the language too dramatically either. 

Maybe I'm just being dumb, but it seems like three subtle changes to the
language would allow for everything we want to have, with only minor limitations
on the rare occasion that you want to use the new keyword with a library that is
also using the same keyword as a name.

I promise not to push this idea again, but would really appreciate someone taking
a couple of minutes to explain why it's not worth responding to. I'm not offended,
but would like to know what I'm doing wrong.

Thanks.

Terry Reedy

May 15, 2018, 11:23:43 PM
to python...@python.org
On 5/15/2018 8:41 PM, Steven D'Aprano wrote:
> Inspired by Alex Brault's post:
>
> https://mail.python.org/pipermail/python-ideas/2018-May/050750.html
>
> I'd like to suggest we copy C#'s idea of verbatim identifiers, but using
> a backslash rather than @ sign:

Not quite as heavy.

> \name
>
> would allow "name" to be used as an identifier, even if it clashes with
> a keyword.
>
> It would *not* allow the use of characters that aren't valid in
> identifiers, e.g. this is out: \na!me # still not legal
>
> See usage #1 here:
>
> https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/verbatim

> If "verbatim name" is too long, we could call them "raw names", by
> analogy with raw strings.


> I believe that \ is currently illegal in any Python expression, except
> inside strings and at the very end of the line, so this ought to be
> syntactically unambiguous.
>
> We should still include a (mild?) recommendation against using keywords
> unless necessary, and a (strong?) preference for the trailing underscore
> convention. But I think this doesn't look too bad:

I think it is just ugly enough to discourage wild use.

> of = 'output.txt'
> \if = 'input.txt'
> with open(\if, 'r'):
> with open(of, 'w'):
> of.write(\if.read())
>
> maybe even nicer than if_.
>
> Some examples:
>
> result = \except + 1
>
> result = something.\except
>
> result = \except.\finally

I believe avoiding tagging raw names as keywords could be done by
adjusting the re for keywords and that addition of '\' could be done by
re.sub. (The details should be in the doc.)


--
Terry Jan Reedy

Tim Peters

May 15, 2018, 11:46:58 PM
to Terry Reedy, Python-Ideas
[Terry Reedy]
> ...
> I believe avoiding tagging raw names as keywords could be done by adjusting
> the re for keywords

Yup - it should just require adding a negative lookbehind assertion; e.g.,

>>> import re
>>> keypat = r"(?<!\\)\b(if|while|for)\b"
>>> re.search(keypat, r"yup! while")
<_sre.SRE_Match object; span=(5, 10), match='while'>
>>> re.search(keypat, r"nope! \while") # None
>>>

The "(?<!\\)" part means "if what follows me matched, pretend it
didn't match if the character before it is a backslash - provided
there _is_ a character before it".
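
The same idea extends to the full keyword list. The following is a minimal sketch, assuming a highlighter that scans one line at a time; `KEYWORD_RE` and `find_keywords` are names made up for this example, and it makes no attempt to skip strings or comments.

```python
import keyword
import re

# Match any current keyword as a whole word, unless a backslash precedes it.
KEYWORD_RE = re.compile(
    r"(?<!\\)\b(?:" + "|".join(map(re.escape, keyword.kwlist)) + r")\b"
)


def find_keywords(line):
    """Return the keywords in `line` that are not escaped with a backslash."""
    return [match.group(0) for match in KEYWORD_RE.finditer(line)]


print(find_keywords(r"while \if: pass"))  # ['while', 'pass']
print(find_keywords(r"\while"))           # []
```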

Ethan Furman

May 16, 2018, 12:01:20 AM
to python...@python.org
On 05/15/2018 08:03 PM, Carl Smith wrote:

> On Tue, May 15, 2018 at 8:41 PM, Steven D'Aprano wrote:

>> I'd like to suggest we copy C#'s idea of verbatim identifiers, but using
>> a backslash rather than @ sign:
>>
>> \name
>>
>> would allow "name" to be used as an identifier, even if it clashes with
>> a keyword.
>
> I strongly disagree, but can't seem to get anyone to bite.

Sometimes it's like that, and yes it is frustrating.

> We want to be able to introduce a keyword that was formerly a name, still
> allow it to be used as a name, still allow code that uses it as a keyword to
> interoperate with code that uses it as a name, without changing the language
> or implementation too much.
>
> Ideally, Python would still not allow the keyword to be used as a name and a
> keyword in the same file.

For me at least, this is a deal breaker. My libraries tend to be single-file packages, so all my code is in one place
-- only being able to use the keyname as one or the other in my multi-thousand line file does me no good.

--
~Ethan~

Paul Moore

May 16, 2018, 4:14:55 AM
to Steven D'Aprano, Python-Ideas
On 16 May 2018 at 01:41, Steven D'Aprano <st...@pearwood.info> wrote:
> Inspired by Alex Brault's post:
>
> https://mail.python.org/pipermail/python-ideas/2018-May/050750.html
>
> I'd like to suggest we copy C#'s idea of verbatim identifiers, but using
> a backslash rather than @ sign:
>
> \name
>
> would allow "name" to be used as an identifier, even if it clashes with
> a keyword.

I'm missing something. How is that different from using a trailing
underscore (like if_ or while_) at the moment? I understand that foo
and \foo are the same name, whereas foo and foo_ are different, but
how would that help? Can you give a worked example of how this would
help if we wanted to introduce a new keyword? For example, if we
intended to make "where" a keyword, what would numpy and its users
need to do to continue using `numpy.where`?

Paul

Eric V. Smith

May 16, 2018, 4:49:05 AM
to Paul Moore, Steven D'Aprano, Python-Ideas
On 5/16/18 4:13 AM, Paul Moore wrote:
> On 16 May 2018 at 01:41, Steven D'Aprano <st...@pearwood.info> wrote:
>> Inspired by Alex Brault's post:
>>
>> https://mail.python.org/pipermail/python-ideas/2018-May/050750.html
>>
>> I'd like to suggest we copy C#'s idea of verbatim identifiers, but using
>> a backslash rather than @ sign:
>>
>> \name
>>
>> would allow "name" to be used as an identifier, even if it clashes with
>> a keyword.
>
> I'm missing something. How is that different from using a trailing
> underscore (like if_ or while_) at the moment? I understand that foo
> and \foo are the same name, whereas foo and foo_ are different, but
> how would that help?

Presumably things that know their name (like def'd functions and
classes, off the top of my head) would be able to figure out their
keyword-like name. Although exactly how that would work is debatable.

>>> def \if(): pass
...

Is \if.__name__ equal to r"\if", or "if"? Probably just "if".

Not that it really matters, but I can see code generators adding
backslashes everywhere. For example, in dataclasses I'd probably
generate __init__ for this:

    \for=float
    @dataclass
    class C:
        \if: int
        only: \for

as:

    def __init__(self, \if:\int, \only:\for):

That is, I'd add a backslash in front of every identifier, instead of
trying to figure out if I need to or not. I think a lot of code
generators (such as attrs) would need to be modified. Not a
show-stopper, but something to think about.
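
To make the trade-off concrete, here is a rough sketch of the two strategies a code generator could take if the proposed syntax existed. Everything here is hypothetical: `verbatim` and `init_signature` are invented for this example and are not part of dataclasses or attrs, and the output is only text, since no current Python would compile it.

```python
import keyword


def verbatim(name, always=False):
    """Spell `name` in the proposed syntax, escaping it when needed."""
    if not name.isidentifier():
        raise ValueError(f"not a valid identifier: {name!r}")
    if always or keyword.iskeyword(name):
        return "\\" + name
    return name


def init_signature(fields, always=False):
    """Build the text of an __init__ header from (name, type name) pairs."""
    params = ", ".join(
        f"{verbatim(name, always)}: {verbatim(type_name, always)}"
        for name, type_name in fields
    )
    return f"def __init__(self, {params}):"


fields = [("if", "int"), ("only", "for")]
print(init_signature(fields))               # escape only current keywords
print(init_signature(fields, always=True))  # escape every identifier
```

Escaping only current keywords keeps the generated text readable, but the text would need regenerating whenever a new keyword appears; escaping everything avoids that at the cost of noise.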

> Can you give a worked example of how this would
> help if we wanted to introduce a new keyword? For example, if we
> intended to make "where" a keyword, what would numpy and its users
> need to do to continue using `numpy.where`?

I think they'd have to change to `numpy.\where` when `where` became a
keyword.

Another thought: I'm sure f-strings would have fun with this. This code
is currently illegal:

>>> f'{\if}'
File "<stdin>", line 1
SyntaxError: f-string expression part cannot include a backslash

That would be a bear to fix, and would require all code that looks at
f-strings, even if only to ignore them, to change. Currently you can
just say "any string that has an f in front of it can be lexed (as a
unit) the same way 'r', 'u', and 'b' strings are". But this would break
that, and mean that instead of a simple tokenizer to find the end of an
f-string, you'd probably need a full expression parser. Again, just
something to think about.

Eric

Eric V. Smith

May 16, 2018, 4:57:35 AM
to Paul Moore, Steven D'Aprano, Python-Ideas
On 5/16/18 4:47 AM, Eric V. Smith wrote:
> On 5/16/18 4:13 AM, Paul Moore wrote:

>> Can you give a worked example of how this would
>> help if we wanted to introduce a new keyword? For example, if we
>> intended to make "where" a keyword, what would numpy and its users
>> need to do to continue using `numpy.where`?
>
> I think they'd have to change to `numpy.\where` when `where` became a
> keyword.

To be clear: this would apply to any code that uses numpy.where, not
just the code that defines it.

The only way to bullet-proof your code so that it would never need any
modifications in the future would be to put a backslash in front of
every identifier. Or maybe just all-lowercase identifiers, since we're
unlikely to make a keyword with uppercase chars in it.

And since no one in their right mind would do that, there's still the
risk of your code breaking in the future. But at least there would be a
way of fixing it in a way that would work both with old versions of
python where the identifier isn't a keyword, and for versions where it
is. That is, once "old versions" include ones that support verbatim names.

Paul Moore

May 16, 2018, 5:04:44 AM
to Eric V. Smith, Python-Ideas
On 16 May 2018 at 09:56, Eric V. Smith <er...@trueblade.com> wrote:
> On 5/16/18 4:47 AM, Eric V. Smith wrote:
>>
>> On 5/16/18 4:13 AM, Paul Moore wrote:
>
>
>>> Can you give a worked example of how this would
>>> help if we wanted to introduce a new keyword? For example, if we
>>> intended to make "where" a keyword, what would numpy and its users
>>> need to do to continue using `numpy.where`?
>>
>>
>> I think they'd have to change to `numpy.\where` when `where` became a
>> keyword.
>
>
> To be clear: this would apply to any code that uses numpy.where, not just
> the code that defines it.
>
> The only way to bullet-proof your code so that it would never need any
> modifications in the future would be to put a backslash in front of every
> identifier. Or maybe just all-lowercase identifiers, since we're unlikely to
> make a keyword with uppercase chars in it.
>
> And since no one in their right mind would do that, there's still the risk
> of your code breaking in the future. But at least there would be a way of
> fixing it in a way that would work both with old versions of python where
> the identifier isn't a keyword, and for versions where it is. That is, once
> "old versions" include ones that support verbatim names.

That's about what I thought - thanks.
Paul

Stephan Houben

May 16, 2018, 5:13:29 AM
to Paul Moore, Eric V. Smith, Python-Ideas
Hi all,

One problem already alluded to with the \identifier syntax is that it only works
if the old Python version is sufficiently recent to understand \.

What about using parentheses to allow a keyword to be used as an identifier:

    (where)(x, y)

This would be combined with allowing keywords in the following unambiguous locations:
1. After dot ("numpy.where")
2. After def and class ("def where")
3. After "as".


This should make it possible to write code which works in a hypothetical future Python
version where "where" is a keyword, and which also works with current Python versions.
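
The parenthesized spelling is already accepted by current Python and parses to the same tree as the plain call, which is what would make it usable on old versions. A quick check with the standard `ast` module (nothing here depends on the proposal itself):

```python
import ast

# Redundant parentheses leave no trace in the abstract syntax tree.
plain = ast.dump(ast.parse("where(x, y)"))
wrapped = ast.dump(ast.parse("(where)(x, y)"))

print(plain == wrapped)
```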

Stephan

Antoine Pitrou

May 16, 2018, 6:00:38 AM
to python...@python.org
On Wed, 16 May 2018 09:13:52 +0100
Paul Moore <p.f....@gmail.com> wrote:
> On 16 May 2018 at 01:41, Steven D'Aprano <st...@pearwood.info> wrote:
> > Inspired by Alex Brault's post:
> >
> > https://mail.python.org/pipermail/python-ideas/2018-May/050750.html
> >
> > I'd like to suggest we copy C#'s idea of verbatim identifiers, but using
> > a backslash rather than @ sign:
> >
> > \name
> >
> > would allow "name" to be used as an identifier, even if it clashes with
> > a keyword.
>
> I'm missing something. How is that different from using a trailing
> underscore (like if_ or while_) at the moment? I understand that foo
> and \foo are the same name, whereas foo and foo_ are different, but
> how would that help?

I think it could help in cases like namedtuple, where names can be
part of a data description (e.g. coming from a database) and then used
for attribute access. I do not find it extremely pretty, but I like it
much better still than the "allowing keywords as names" proposal.
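
For context, this is how namedtuple handles such field names today. The column names below are made up for the example; `rename=True` is the existing standard-library option.

```python
from collections import namedtuple

columns = ["id", "class", "name"]  # hypothetical column names from a database

# By default a keyword is rejected as a field name.
try:
    namedtuple("Row", columns)
except ValueError as exc:
    print("rejected:", exc)

# With rename=True the offending field is replaced by a positional name.
Row = namedtuple("Row", columns, rename=True)
print(Row._fields)  # ('id', '_1', 'name')
```

The renamed field loses its link to the original column name, which is the gap a verbatim spelling would aim to close.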

It also has the nice side-effect that it doesn't make it easier to add
new keywords, since the common spelling (e.g. `np.where`) would still
become a syntax error and therefore break compatibility with existing
code.

Regards

Antoine.

Wolfgang Maier

May 16, 2018, 8:14:49 AM
to python...@python.org
On 16.05.2018 02:41, Steven D'Aprano wrote:
>
> Some examples:
>
> result = \except + 1
>
> result = something.\except
>
> result = \except.\finally
>

Maybe that could get combined with Guido's original suggestion by making
the \ optional after a .?

Example:

    class A ():
        \global = 'Hello'
        def __init__(self):
            self.except = 0

        def \finally(self):
            return 'bye'

    print(A.global)
    a = A()
    a.except += 1
    print(a.finally())

or with a module, in my_module.py:

\except = 0

elsewhere:

import my_module
print(my_module.except)

or

from my_module import \except
print(\except)

Best,
Wolfgang

Andrés Delfino

May 16, 2018, 9:06:39 AM
to python...@python.org
IMHO, it would be much easier to learn and understand if keywords can only be used by escaping them, instead of depending where they occur.

Todd

May 16, 2018, 9:48:00 AM
to python-ideas
I think your idea would work okay if everyone followed good programming practices.  But when you have files that are tens of thousands of lines of ugly code written by dozens of non-programmers over a dozen years it sounds like a recipe for a nightmare.

For example someone you never met that left your group ten years ago could have made "True" be "np.bool_(1)" on a whim that makes your code break later in very hard-to-debug ways.

To put it simply, I think it encourages people to take convenient shortcuts with implications they don't understand.

Carl Smith

May 16, 2018, 10:27:43 AM
to python-ideas
Thanks for the reply Todd.

If `True` was redefined somewhere else, it would still be `True` for you. You could do `from oldlib import True as true` and have `true` equal `np.bool_(1)`. You could reference `oldlib.True` or do `oldlib.function(True=x)` to interact with the name in the old library.

None of this would actually apply to `True`, as it's a reserved word in all versions. The proposal only applies to new keywords that are used as names in other libraries.

Again, thanks for taking the time.

-- Carl Smith

Carl Smith

May 16, 2018, 10:55:49 AM
to python-ideas
One problem with my proposal is that assignments to properties (`name.keyword = something`) and regular assignments (including class and def statements) inside the body of a class that subclasses an externally defined class would all need to be allowed, so that inherited names can be reassigned and inherited methods can be overridden.

As there is no way to know from static analysis whether the code is (legally) overriding something or (illegally) creating a name that is also a keyword in that file, doing so would need to be handled by a runtime exception, something like `NameError: cannot create names that are keywords in the same context`.

Runtime errors still seem preferable to making keywords legally names in the same file (especially if we have to escape the names).


-- Carl Smith

Todd

May 16, 2018, 11:23:52 AM
to python-ideas
On Wed, May 16, 2018 at 10:26 AM, Carl Smith <carl....@gmail.com> wrote:
> Thanks for the reply Todd.
>
> If `True` was redefined somewhere else, it would still be `True` for you. You could do `from oldlib import True as true` and have `true` equal `np.bool_(1)`. You could reference `oldlib.True` or do `oldlib.function(True=x)` to interact with the name in the old library.


Not if you need to make changes in the same tens of thousands of lines file. 
 
> None of this would actually apply to `True`, as it's a reserved word in all versions. The proposal only applies to new keywords that are used as names in other libraries.

No it isn't.  It was added in Python 2.3 and only became a keyword in Python 3.  Prior to that lots of other packages defined their own "True" (or "true" or "TRUE", etc), which is why it wasn't made a keyword for such a long time. 
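
The current state of affairs is easy to confirm on any Python 3 interpreter. A minimal check, using only the standard `keyword` module and the built-in `compile()`:

```python
import keyword

# In Python 3, True is a real keyword...
print(keyword.iskeyword("True"))

# ...so trying to rebind it fails at compile time, before any code runs.
rebinding_rejected = False
try:
    compile("True = 0", "<example>", "exec")
except SyntaxError as exc:
    rebinding_rejected = True
    print("SyntaxError:", exc.msg)
```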

But this is just an example of the sort of problem that can come up with your approach.  The overall issue is that python has no way of knowing if the keyword is being used for legitimate backwards-compatibility purposes or someone intentionally overrode after it was made a keyword because they somehow thought it was a good idea.  That is why being explicit about overriding the keyword is so important.

Niki Spahiev

May 16, 2018, 11:42:45 AM
to python...@python.org
On 16.05.2018 16:05, Andrés Delfino wrote:
> IMHO, it would be much easier to learn and understand if keywords can only
> be used by escaping them, instead of depending where they occur.

There can be 2 escape characters '\' and '.'

Niki

Carl Smith

May 16, 2018, 2:18:35 PM
to python-ideas
> Not if you need to make changes in the same tens of thousands of lines file.

But what has that got to do with the syntax of the new code? The old code is
what it is.

I did think after I replied that `True` wasn't actually reserved until more recently, but
the point still stands: You would be able to reference the name *as defined* in an
external library, and yeah, it could refer to anything, but that's kinda the point. We
have to assume the library does something sane with the name. We can't preempt
an employee sabotaging `True`.

As a more realistic example (if not for Python), say `until` became a keyword, then
you could end up with lines like this:

    from oldlib import until as upto

    dance(until="am")

    event.until = time(9, 30)

> The overall issue is that python has no way of knowing if the keyword is being used for legitimate
> backwards-compatibility purposes or someone intentionally overrode after it was made a keyword
> because they somehow thought it was a good idea.

I only said that Python does not know *until runtime*, and I was wrong when I described that as a
problem. A runtime NameError actually makes perfect sense. Assigning to `self.until` or assigning
to `until` inside a subclass should not be a syntax error. A NameError would be correct.

It's worth mentioning that the cost of checking only applies to cases where the name in question is also
a keyword, so almost never.


-- Carl Smith

Carl Smith

May 16, 2018, 2:21:39 PM
to python-ideas
​> There can be 2 escape characters '\' and '.'

That's clever, but then we have to put a slash in front of names in imports, assignments and keyword arguments, but not properties.

-- Carl Smith

Todd

May 16, 2018, 3:42:05 PM
to python-ideas
On Wed, May 16, 2018 at 2:17 PM, Carl Smith <carl....@gmail.com> wrote:
>> Not if you need to make changes in the same tens of thousands of lines file.
>
> But what has that got to do with the syntax of the new code? The old code is
> what it is.


Again, because you end up with hard-to-debug issues through no fault of your own.
 
> I did think after I replied that `True` wasn't actually reserved until more recently, but
> the point still stands: You would be able to reference the name *as defined* in an
> external library, and yeah, it could refer to anything, but that's kinda the point. We
> have to assume the library does something sane with the name. We can't preempt
> an employee sabotaging `True`.


We can and do preempt someone sabotaging keywords by not letting anyone override them.  That is the whole point of using reserved keywords.  Some languages allow you to change important words, some don't.  Guido made a conscious decision to make certain words keywords, and to not let anyone change them, I believe to avoid the sorts of issues I have brought up.  You are talking about removing one of the most important and long-standing protections the language has in place.  That is not a small change.

Carl Smith

May 16, 2018, 5:10:57 PM
to python-ideas
> We can and do preempt someone sabotaging keywords by not letting anyone override them.
> That is the whole point of using reserved keywords. Some languages allow you to change
> important words, some don't.  Guido made a conscious decision to make certain words keywords,
> and to not let anyone change them, I believe to avoid the sorts of issues I have brought up. 
> You are talking about removing one of the most important and long-standing protections the
> language has in place.  That is not a small change.

Making names keywords requires that keywords also be names. If Guido is open to introducing
keywords that are currently names, it cannot be lost on him that some code will use names that
are now keywords.

If your position is that Guido shouldn't introduce keywords that are currently used as names at all,
fair enough; that'd be my first choice too. But, if we are going to do it, I have a strong preference for
a specific approach.


-- Carl Smith

Greg Ewing

May 16, 2018, 6:59:42 PM
to python-ideas
Todd wrote:
> The overall issue is that python has no way of knowing
> if the keyword is being used for legitimate backwards-compatibility
> purposes or someone intentionally overrode after it was made a keyword
> because they somehow thought it was a good idea. That is why being
> explicit about overriding the keyword is so important.

The trouble with explicitly overriding keywords is that it
still requires old code to be changed whenever a new keyword
is added, which as far as I can see almost completely defeats
the purpose. If e.g. you need to change all uses of given
to \given in order for your code to keep working in
Python 3.x for some x, you might just as well change it
to given_ or some other already-legal name.

The only remotely legitimate use I can think of is for
calling APIs that come from a different language, but the
same thing applies there -- names in the Python binding can
always be modified somehow to make them legal.

As far as I can see, any mechanism allowing keywords to be
used as names has to be completely transparent to existing
code, otherwise there's no point in it.

--
Greg

Steven D'Aprano

May 16, 2018, 8:23:00 PM
to python...@python.org
On Thu, May 17, 2018 at 10:58:34AM +1200, Greg Ewing wrote:

> The trouble with explicitly overriding keywords is that it
> still requires old code to be changed whenever a new keyword
> is added, which as far as I can see almost completely defeats
> the purpose. If e.g. you need to change all uses of given
> to \given in order for your code to keep working in
> Python 3.x for some x, you might just as well change it
> to given_ or some other already-legal name.

Well, maybe. Certainly using name_ is a possible solution, and it is one
which has worked for over a quarter century.

We can argue about whether \name or name_ looks nicer, but \name
has one advantage: the key used in the namespace is actually "name".
That's important: see below.


> The only remotely legitimate use I can think of is for
> calling APIs that come from a different language, but the
> same thing applies there -- names in the Python binding can
> always be modified somehow to make them legal.

Of course they can be modified. But having to do so is a pain.

With the status quo, when dealing with external data which may include
names which are keywords, we have to:

- add an underscore when we read keywords from external data
- add an underscore when used as obj.kw literals
- add an underscore when used as getattr("kw") literals
- conditionally remove trailing underscore when writing to external APIs

With verbatim names, that changes to:

- do nothing special when we read keywords from external data
- add a backslash when used as obj.kw literals
- do nothing special when used as getattr("kw") literals
- do nothing special when writing to external APIs


I think that overall this pushes it from a mere matter of visual
preference \kw versus kw_ to a significant win for verbatim names.
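
The point about the namespace key can be seen with today's Python: objects can already hold attributes whose names are keywords, and it is only the literal `obj.kw` spelling that the compiler rejects. A small sketch, with made-up sample data:

```python
from types import SimpleNamespace

row = {"class": "A", "from": "Sydney", "name": "x"}  # hypothetical external data

# Keyword-named keys are accepted when passed via ** unpacking.
obj = SimpleNamespace(**row)

# String-based access works unchanged for keyword names.
print(getattr(obj, "class"))
print(sorted(vars(obj)))

# Literal attribute access works only for the non-keyword name;
# writing obj.class here would be a SyntaxError.
print(obj.name)
```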

Let's say you're reading from a CSV file, creating an object from each
row, and processing it:

    # untested
    import csv
    import keyword
    from types import SimpleNamespace

    reader = csv.reader(infile)
    header = next(reader)
    header = [name + "_" if name in keyword.kwlist else name for name in header]
    for row in reader:
        obj = SimpleNamespace(**dict(zip(header, row)))
        process(obj)


The consumer of these objects, process(), has to reverse the
transformation:

    def process(obj):
        for name, value in vars(obj).items():
            if name.endswith("_") and name[:-1] in keyword.kwlist:
                name = name[:-1]
            write_to_external_API(name, value)


Verbatim names lets us skip both of these boilerplate steps.
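
For anyone who wants to run the status-quo round trip, here is a self-contained sketch of it. The in-memory CSV data and the helper names `mangle` and `unmangle` are assumptions made for this example, and a plain dict stands in for the external API.

```python
import csv
import io
import keyword
from types import SimpleNamespace


def mangle(name):
    """Add a trailing underscore to names that clash with a keyword."""
    return name + "_" if keyword.iskeyword(name) else name


def unmangle(name):
    """Undo mangle() so the external system sees the original name."""
    if name.endswith("_") and keyword.iskeyword(name[:-1]):
        return name[:-1]
    return name


infile = io.StringIO("id,class,from\n1,A,Sydney\n")  # hypothetical input
reader = csv.reader(infile)
header = [mangle(name) for name in next(reader)]
objects = [SimpleNamespace(**dict(zip(header, row))) for row in reader]

# Inside Python the mangled spelling has to be used...
print(objects[0].class_)

# ...and it has to be reversed again on the way out.
written = {unmangle(name): value for name, value in vars(objects[0]).items()}
print(written)
```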

An interesting case is when you are using the keywords as hard-coded
names for attribute access. In the status quo, we write:

    obj.name_
    getattr(obj, "name_")

In the first line, if you neglect the _ the compiler will complain and
you get a syntax error. In the second line, if you neglect the _ you'll
get no warning, only a runtime failure.

With verbatim names, we can write:

    obj.\name
    getattr(obj, "name")  # Don't escape it here!

In this case, the failure modes are similar:

- if you forget the backslash in the first line, you get a
  SyntaxError at compile time, so there's no change here.

- if you wrongly include the backslash in the second line,
  there are two cases:

  * if the next character matches a string escape, say \n
    or \t, you'll get no error but a runtime failure;

    (but linters could warn about that)

  * if it doesn't match, say \k, you'll now get a warning
    and eventually a failure as we deprecate silently
    ignoring backslashes.
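
The first of those two cases can be reproduced in current Python, since it only involves ordinary string escapes. A short sketch, with an example object invented for the purpose:

```python
from types import SimpleNamespace

obj = SimpleNamespace(name="spam")

# Inside a normal string literal, \n is a newline, so this is not "name".
wrong = "\name"
print(wrong == "\n" + "ame")
print(len(wrong))

# The mistake only shows up at runtime, as a failed attribute lookup.
lookup_failed = False
try:
    getattr(obj, wrong)
except AttributeError:
    lookup_failed = True
print(lookup_failed)
```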


--
Steve

Greg Ewing

May 17, 2018, 2:04:41 AM
to python...@python.org
Steven D'Aprano wrote:
> Let's say you're reading from a CSV file, creating an object from each
> row, and processing it:

Okay, I can see it could be useful for situations like that.

But this is still a completely different use case from the
one that started this discussion, which was making it less
painful to add new keywords to the language. The backslash
idea doesn't help at all with that.

--
Greg

Steven D'Aprano

unread,
May 17, 2018, 8:52:55 AM5/17/18
to python...@python.org
On Thu, May 17, 2018 at 06:03:32PM +1200, Greg Ewing wrote:
> Steven D'Aprano wrote:
> >Let's say you're reading from a CSV file, creating an object from each
> >row, and processing it:
>
> Okay, I can see it could be useful for situations like that.
>
> But this is still a completely different use case from the
> one that started this discussion, which was making it less
> painful to add new keywords to the language. The backslash
> idea doesn't help at all with that.

Doesn't help *at all*?

Sure it does.

It's Python 3.8, and I learn that in 4.0 "spam" is going to become a
keyword. I simply take my code and change all the references from spam to
\spam, and I've future-proofed the code for 4.0 while still keeping
compatibility with 3.8 and 3.9.

(But not 3.7 of course. But we can't have everything.)

If my code is a library which others will use, the benefit is even
bigger. (For a purely internal project, I could just replace spam with
spam_ and be done with it.) But for a library, I already have public
documentation saying that my API is the function spam(), and I don't
want to have to change the public API. As far as my library's users are
concerned, nothing has changed.



--
Steve

Alexandre Brault

unread,
May 17, 2018, 9:04:22 AM5/17/18
to python...@python.org
On 2018-05-17 2:03 AM, Greg Ewing wrote:
> Steven D'Aprano wrote:
>> Let's say you're reading from a CSV file, creating an object from
>> each row, and processing it:
>
> Okay, I can see it could be useful for situations like that.
>
> But this is still a completely different use case from the
> one that started this discussion, which was making it less
> painful to add new keywords to the language. The backslash
> idea doesn't help at all with that.
>
I don't think there will be a solution that makes it less painful to add
keywords to the language, nor that finding one should be something we
aim for. What this proposal accomplishes is help interoperation with
other languages that have different keywords, simplify code generators
by letting them blindly escape names and avoid mangling/demangling
keywords, and as a distant third, an easy out if the language adds a
keyword you use as a name.

Alex

Carl Smith

unread,
May 17, 2018, 9:55:42 AM5/17/18
to python-ideas
> The trouble with explicitly overriding keywords is that it still requires old code to
> be changed whenever a new keyword is added...

Nope. That's the exact thing my proposal avoids. I'm not sure it also does everything
everyone else wants it to do, so it may be a bad idea, but you would not have to change
old code, ever.

For the purpose of this discussion, let's say that if any code implicitly enables a new feature
(by simply using it), all the code in that file is 'new code' in the context of the specific feature
(a bit like how `yield` works). If `until` was a new keyword, any file that used it as a keyword
would be new code. Any other files are 'old code'.

New code could still import an object named `until` from old code. It just could not import it
as `until`. So `import until as upto` is fine, but `import until` is a NameError *in new code*.
In old code, `until` is still just a name.

We would also allow `name.until` and `dance(until="am")` in new code, so that we can still
reference names in old code (without them becoming names in new code).

If `self.until = something` appeared anywhere in new code, or any local assignment to `until`
(including class and def statements) appeared inside a subclass definition within new code,
that would need to be checked for a runtime NameError (not very often, but sometimes).

In practice, many libraries would alias names that became keywords, so new code could use
the new name without restrictions, but old code would carry on working with the old name.

TLDR: The syntax and semantics of old code would remain totally unchanged.


-- Carl Smith

Chris Barker via Python-ideas

unread,
May 17, 2018, 1:11:55 PM5/17/18
to Carl Smith, python-ideas
On Wed, May 16, 2018 at 2:09 PM, Carl Smith <carl....@gmail.com> wrote:
> If your position is that Guido shouldn't introduce keywords that are currently used as names at all,

Exactly -- which is why I'm wondering why no one (that I've seen -- long thread) is presenting the backwards option:

Any new keywords introduced will be non-legal as regular names.

\new_key_word

for instance.

Makes me think that it may have been good to have ALL keywords somehow non-legal as user-defined names -- maybe ugly syntax, but it would make a clear distinction.

how ugly would this be?

\for i in range(n):
    \while \True:
        ...

pretty ugly :-(

But maybe not so much if only a handful of new ones....

Or is there another currently illegal character that could be used that would be less ugly?

I'm actually confused as to what the point is to the \ prefix idea for names:

* It would still require people to change their code when a new keyword was introduced

* It would be no easier / harder than adding a conventional legal character -- trailing underscore, or ???

* but now the changed code would no longer run on older versions of python.

I guess it comes down to why you'd want to call out:

"this is a name that is almost like a keyword"

Seems like a meh, meh, lose proposal to me.

OK, I see one advantage -- one could have code that already has BOTH word and word_ names in it. So when word becomes a keyword, a tool that automatically added an underscore would break the code, whereas if it automatically added a currently illegal character, it wouldn't shadow anything.

But a sufficiently smart tool could get around that, too.
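
As a rough illustration of such a tool, the sketch below uses the standard tokenize module to rename one identifier while steering clear of names already in use. The function name rename_identifier and the strategy of appending underscores until the name is free are assumptions made for this example, not something proposed in the thread.

```python
import io
import tokenize

def rename_identifier(source, old):
    """Rename every NAME token equal to `old`, avoiding names already used."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    used = {tok.string for tok in tokens if tok.type == tokenize.NAME}

    # Keep appending underscores until the candidate shadows nothing.
    new = old + "_"
    while new in used:
        new += "_"

    renamed = [
        (tok.type, new if (tok.type == tokenize.NAME and tok.string == old) else tok.string)
        for tok in tokens
    ]
    # Two-element tuples make untokenize() regenerate the spacing itself.
    return new, tokenize.untokenize(renamed)

# Hypothetical code that already uses both spam and spam_.
source = "spam = 1\nspam_ = 2\ntotal = spam + spam_\n"
new_name, rewritten = rename_identifier(source, "spam")
```

The rewrite works purely at the token level: it does not understand scopes, it changes the original spacing, and it cannot see names that only appear inside strings, such as getattr(obj, "spam").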

-CHB


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris....@noaa.gov

Stephan Houben

unread,
May 17, 2018, 1:54:42 PM5/17/18
to Chris Barker, python-ideas
Fortunately we have Unicode bold characters nowadays

𝐢𝐟 if 𝐢𝐧 in:
    𝐫𝐞𝐭𝐮𝐫𝐧 return

Look ma! No syntactic ambiguity!

Stephan

Stephan Houben

unread,
May 17, 2018, 2:02:50 PM5/17/18
to Chris Barker, python-ideas
OK, that was a silly joke, but
did you realize that the following already WORKS TODAY:

>>> class Foo:
...  pass
>>> f = Foo()
>>> f.__dict__["if"] = 42
>>> f.𝐢𝐟
42

That's right, I can access a member whose name is a keyword
by simply using Unicode bold.

This works because Python normalizes Unicode font variants in identifiers,
but apparently only AFTER keywords have been recognized.

Stephan
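
A scripted version of the same observation, as a sketch: the escape sequences below spell "if" in MATHEMATICAL BOLD SMALL letters, and the class name Foo follows the session above. The behaviour depends on how CPython tokenizes and then normalizes identifiers, so it is best read as a demonstration of the current implementation rather than as a recommendation.

```python
import keyword
import unicodedata

class Foo:
    pass

bold_if = "\U0001d422\U0001d41f"  # "if" in MATHEMATICAL BOLD SMALL letters

# The bold spelling is not in the keyword list, but NFKC folds it to "if".
is_keyword = keyword.iskeyword(bold_if)
normalized = unicodedata.normalize("NFKC", bold_if)

f = Foo()
setattr(f, "if", 42)

# Evaluate the expression f.<bold if>; the attribute name is normalized
# to "if" after the tokenizer has already decided it is not a keyword.
result = eval("f." + bold_if, {"f": f})
```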

Neil Girdhar

unread,
May 17, 2018, 2:41:33 PM5/17/18
to python-ideas
My preference is to do nothing.  If you end up making "where" a keyword in Python 3.8, numpy will probably:
* rename their where function to "where_" in 3.8
* add a where_ alias in Python < 3.8.

And then people will have to fix their code in 3.8 anyway.  Only instead of learning a new verbatim syntax, they will just add the familiar underscore.

One thing that should ideally be done is to improve the SyntaxError processing to special case use of keywords in places that identifiers are used.  Instead of:

In [1]: for if in range(10):
  File "<ipython-input-1-a51677fa6668>", line 1
    for if in range(10):
         ^
SyntaxError: invalid syntax

How about

In [1]: for if in range(10):
  File "<ipython-input-1-a51677fa6668>", line 1
    for if in range(10):
         ^
SyntaxError: "if" was used where a variable belongs, but "if" is a keyword.  Consider using "if_" instead.

Similarly,

In [2]: int.if
  File "<ipython-input-2-72291900e846>", line 1
    int.if
         ^
SyntaxError: "if" was used where an attribute name belongs.  Did you mean "if_"?

SyntaxError doesn't need to quickly generate its error strings.  So there is only an upside to having clear error messages.
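
Something in this spirit can be prototyped outside the interpreter. The sketch below is an illustration under assumptions, not how CPython builds its messages: the helper name keyword_hint is made up, and it relies on SyntaxError.offset pointing at (or just past) the offending word, which is not guaranteed across Python versions.

```python
import keyword
import re

def keyword_hint(source):
    """Return a friendlier message if `source` misuses a keyword as a name."""
    try:
        compile(source, "<input>", "exec")
    except SyntaxError as err:
        if err.text is None or err.offset is None:
            return None
        column = err.offset - 1  # SyntaxError.offset is 1-based
        # Look for an identifier-like word at the reported error column.
        for match in re.finditer(r"[A-Za-z_]\w*", err.text):
            at_word = match.start() <= column <= match.end()
            if at_word and keyword.iskeyword(match.group()):
                word = match.group()
                return ('"%s" is a keyword and cannot be used as a name here. '
                        'Consider "%s_" instead.' % (word, word))
    return None
```

Calling keyword_hint("for if in range(10): pass") is meant to produce a message along the lines of the one proposed above, while valid code yields None. A real implementation inside the parser would know which grammar rule failed and would not need to guess from the column.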

On Thursday, May 17, 2018 at 1:54:42 PM UTC-4, Stephan Houben wrote:
> Fortunately we have Unicode bold characters nowadays
>
> 𝐢𝐟 if 𝐢𝐧 in:
>     𝐫𝐞𝐭𝐮𝐫𝐧 return
>
> Look ma! No syntactic ambiguity!

That's hilarious :) 

Eric V. Smith

unread,
May 17, 2018, 5:39:40 PM5/17/18
to Python-Ideas
[Resending due to Google Groups getting involved and giving me an error]

On 5/17/2018 2:41 PM, Neil Girdhar wrote:
> My preference is to do nothing.  If you end up making "where" a keyword
> in Python 3.8, numpy will probably:
> * rename their where function to "where_" in 3.8
> * add a where_ alias in Python < 3.8.
>
> And then people will have to fix their code in 3.8 anyway.  Only instead
> of learning a new verbatim syntax, they will just add the familiar
> underscore.

I'm not saying this applies to numpy, but one bonus of using \where
would be that existing 3.7 pickles would work in 3.8 (or I think so,
it's all obviously underspecified at this point). With renaming, pickles
would break.

Eric

Carl Smith

unread,
May 17, 2018, 5:40:05 PM5/17/18
to python-ideas
> My preference is to do nothing.  If you end up making "where" a keyword in Python 3.8, numpy will probably:
> * rename their where function to "where_" in 3.8
> * add a where_ alias in Python < 3.8.

This assumes that every library that defines something named `where` (and every library that references it)
is maintained. If the parser just starts treating `where` as a keyword, working code will not parse. That
may be acceptable (I don't know what the policy is), but it makes introducing keywords more costly than
an approach that allows old code to continue working.

-- Carl Smith

Rob Cliffe via Python-ideas

unread,
May 17, 2018, 5:41:06 PM5/17/18
to python...@python.org

On 16/05/2018 10:12, Stephan Houben wrote:
> Hi all,
>
> One problem already alluded to with the \identifier syntax is that it
> only works
> if the old Python version is sufficiently recent to understand \.
>
> What about using parentheses to allow a keyword to be used as an
> identifier:
> (where)(x, y)
>
>

I believe this is the first proposal that allows future-proofing of new
code while preserving complete backward compatibility.  As far as I
know,    ( keyword )    is never legal syntax.
Of course, putting brackets round every occurrence of every identifier
that you think might become a keyword in the next century is a bit
of a chore.  There is no perfect solution.
Best wishes
Rob Cliffe

Carl Smith

unread,
May 17, 2018, 5:55:30 PM5/17/18
to python-ideas
> I believe this is the first proposal that allows future-proofing of new code while preserving
> complete backward compatibility.

My proposal removes the need to future proof anything, and only requires
subtle changes to the syntax (nothing visually different). It also preserves
perfect backwards compatibility. Just saying :)



-- Carl Smith

Neil Girdhar

unread,
May 17, 2018, 6:07:28 PM5/17/18
to python-ideas


On Thursday, May 17, 2018 at 5:40:05 PM UTC-4, Carl Smith wrote:
>> My preference is to do nothing.  If you end up making "where" a keyword in Python 3.8, numpy will probably:
>> * rename their where function to "where_" in 3.8
>> * add a where_ alias in Python < 3.8.
>
> This assumes that every library that defines something named `where` (and every library that references it)
> is maintained. If the parser just starts treating `where` as a keyword, working code will not parse. That
> may be acceptable (I don't know what the policy is), but it makes introducing keywords more costly than
> an approach that allows old code to continue working.

Fair enough.

If that solution is taken, I want the linters to complain about verbatim uses as a stopgap measure that ultimately should be removed.  I don't think interfaces should be naming functions and arguments "if", "while", etc.  I think PEP 8 should discourage its use as well, except when it's necessary to continue to use an unmaintained interface.

Neil Girdhar

unread,
May 17, 2018, 6:11:14 PM5/17/18
to python-ideas


On Thursday, May 17, 2018 at 5:55:30 PM UTC-4, Carl Smith wrote:
>> I believe this is the first proposal that allows future-proofing of new code while preserving
>> complete backward compatibility.
>
> My proposal removes the need to future proof anything, and only requires
> subtle changes to the syntax (nothing visually different). It also preserves
> perfect backwards compatibility. Just saying :)

Maybe I misunderstood, but it seems like your solution places a small burden on new code that uses "given" or "where" or whatever in the form of a special import or statement enabling it.  I love that we're instead making it easy to keep old code working while protecting Python's beautiful future with no special imports or statements to use the core language.   

MRAB

unread,
May 17, 2018, 7:19:30 PM5/17/18
to python...@python.org
On 2018-05-17 22:38, Rob Cliffe via Python-ideas wrote:
>
>
> On 16/05/2018 10:12, Stephan Houben wrote:
>> Hi all,
>>
>> One problem already alluded to with the \identifier syntax is that it
>> only works
>> if the old Python version is sufficiently recent to understand \.
>>
>> What about using parentheses to allow a keyword to be used as an
>> identifier:
>> (where)(x, y)
>>
>>
> I believe this is the first proposal that allows future-proofing of new
> code while preserving complete backward compatibility.  As far as I
> know,    ( keyword )    is never legal syntax.

Apart from (False), (True) and (None), there's also (yield).
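
This is easy to check mechanically. The sketch below (the helper name parenthesized_ok is made up for illustration) wraps each entry of keyword.kwlist in parentheses inside a function body, so that yield is permitted, and records which ones still compile.

```python
import keyword

def parenthesized_ok(word):
    """Return True if `(word)` compiles as an expression inside a function."""
    source = "def _probe():\n    _ = (" + word + ")\n"
    try:
        compile(source, "<probe>", "exec")
    except SyntaxError:
        return False
    return True

# Keywords for which "( keyword )" is already legal syntax today.
legal = sorted(word for word in keyword.kwlist if parenthesized_ok(word))
```

On a current CPython the resulting list should be the four names mentioned above, which means a parenthesized-keyword escape would have to special-case exactly those.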

> Of course, putting brackets round every occurrence of every identifier
> that you think might become a keyword in the next century is a bit
> of a chore.  There is no perfect solution.
>

Steven D'Aprano

unread,
May 17, 2018, 9:15:24 PM5/17/18
to python...@python.org
On Thu, May 17, 2018 at 10:54:25PM +0100, Carl Smith wrote:

> My proposal removes the need to future proof anything, and only requires
> subtle changes to the syntax (nothing visually different). It also preserves
> perfect backwards compatibility. Just saying :)

I must admit, I don't understand your proposal. Can you summarise it
again?


--
Steve

Steven D'Aprano

unread,
May 17, 2018, 9:35:33 PM5/17/18
to python...@python.org
On Thu, May 17, 2018 at 11:41:33AM -0700, Neil Girdhar wrote:
> My preference is to do nothing. If you end up making "where" a keyword in
> Python 3.8, numpy will probably:

This is only incidentally about "where". I'm hoping that the "where" (or
"given") proposal is rejected. It is so verbose and redundantly
repetitious in the common case that rather than being an improvement
over the status quo, it is worse. But that's by-the-by.


> * rename their where function to "where_" in 3.8
> * add a where_ alias in Python < 3.8.
>
> And then people will have to fix their code in 3.8 anyway. Only instead of
> learning a new verbatim syntax, they will just add the familiar underscore.

You should be thinking forward two or three versions from now, when
\name is the familiar syntax and name_ looks like you started to write
an identifier using the underscore_words_convention but got distracted
halfway through.

Remember that (if approved) verbatim names will not be "that new syntax"
for long. We don't still talk about "that new fangled list comprehension
syntax" or "that new yield keyword". That was the problem with the "old
versus new style classes" terminology: at the point that "new-style
classes" had been around for six releases, approaching a decade, they
weren't new any more.


> One thing that should ideally be done is to improve the SyntaxError
> processing to special case use of keywords in places that identifiers are
> used.

This is worth doing regardless of whether or not we get verbatim names
or some other alternative. You ought to raise it on the bug tracker.



--
Steve

Neil Girdhar

unread,
May 17, 2018, 11:02:37 PM5/17/18
to python...@googlegroups.com, python...@python.org
On Thu, May 17, 2018 at 9:35 PM Steven D'Aprano <st...@pearwood.info> wrote:
> On Thu, May 17, 2018 at 11:41:33AM -0700, Neil Girdhar wrote:
>> My preference is to do nothing.  If you end up making "where" a keyword in
>> Python 3.8, numpy will probably:
>
> This is only incidentally about "where". I'm hoping that the "where" (or
> "given") proposal is rejected. It is so verbose and redundantly
> repetitious in the common case that rather than being an improvement
> over the status quo, it is worse. But that's by-the-by.
>
>> * rename their where function to "where_" in 3.8
>> * add a where_ alias in Python < 3.8.
>>
>> And then people will have to fix their code in 3.8 anyway.  Only instead of
>> learning a new verbatim syntax, they will just add the familiar underscore.
>
> You should be thinking forward two or three versions from now, when
> \name is the familiar syntax and name_ looks like you started to write
> an identifier using the underscore_words_convention but got distracted
> halfway through.
>
> Remember that (if approved) verbatim names will not be "that new syntax"
> for long. We don't still talk about "that new fangled list comprehension
> syntax" or "that new yield keyword". That was the problem with the "old
> versus new style classes" terminology: at the point that "new-style
> classes" had been around for six releases, approaching a decade, they
> weren't new any more.

I can get behind the main benefit of backslash, which is keeping code working with old libraries.

However, the difference between the backslash syntax and comprehensions and generator functions is that comprehensions and generator functions make the language more expressive.  The backslash is no more expressive than trailing underscore.  It's no more succinct, and no more clear.  Adding it to the language constitutes more that the user needs to learn, which makes Python slightly less accessible.

I don't like multiple ways of doing the same thing.  There are probably already billions of lines of code that use trailing underscore to avoid collisions.  If the backslash syntax is added, then there will be a natural push towards the "right way" to avoid collisions.  If that's backslashes, then projects are typically going to have code written with backslashes and code written with underscore.  When you go access a variable named "where", you'll wonder: was it called "\where" or "where_"?

Maybe the "\where" is pretty enough that it's worth it like you say.  Maybe a function like:

def f(\in=0, out=1):

is prettier than

def f(in_=0, out=1):

but I'm already so used to the current way of doing things, my aesthetic is that it's not worth the variability.

For that reason, I'd like to make a more modest proposal to *only* add verbatim versions of keywords as necessary, e.g., "\where" or "\given".  That way, there will be no temptation to use that syntax in any other place.  If a new version of Python comes out with a new keyword, say "abc", then all of the old Python versions can get a minor revision that knows about "\abc".  This would ensure that the backslash syntax is only used to avoid collisions with new keywords.

When 3.7 hits end-of-life, the "\given" (or whatever) can be deprecated.


>> One thing that should ideally be done is to improve the SyntaxError
>> processing to special case use of keywords in places that identifiers are
>> used.
>
> This is worth doing regardless of whether or not we get verbatim names
> or some other alternative. You ought to raise it on the bug tracker.
>
> --
> Steve


Alexandre Brault

unread,
May 18, 2018, 2:42:39 AM5/18/18
to python...@python.org
On 2018-05-17 11:02 PM, Neil Girdhar wrote:

> For that reason, I'd like to make a more modest proposal to *only* add verbatim versions of keywords as necessary, e.g., "\where" or "\given".  That way, there will be no temptation to use that syntax in any other place.  If a new version of Python comes out with a new keyword, say "abc", then all of the old Python versions can get a minor revision that knows about "\abc".  This would ensure that the backslash syntax is only used to avoid collisions with new keywords.
>
> When 3.7 hits end-of-life, the "\given" (or whatever) can be deprecated.

-1. This would add an extra maintenance and mental ("which keywords are allowed as verbatim and which not") cost to the feature while limiting its utility to the one use case it's only incidentally addressing. PEP8 can warn people not to use verbatim names frivolously in handwritten code.

Greg Ewing

unread,
May 18, 2018, 2:47:05 AM5/18/18
to python-ideas
Carl Smith wrote:
> I wrote:
>> The trouble with explicitly overriding keywords is that it still requires old code to
>> be changed whenever a new keyword is added...
>
> For the purpose of this discussion, let's say that if any code implicitly
> enables a new feature (by simply using it), all the code in that file is 'new
> code' in the context of the specific feature (a bit like how `yield` works).
> If `until` was a new keyword, any file that used it as a keyword would be new
> code. Any other files are 'old code'.

Okay, that works because it *doesn't* require old code to explicitly
say "I'm using this word the old way". My comment was about the idea
of having to use a backslash to escape keywords used as names, or
similar schemes.

> We would also allow `name.until` and `dance(until="am")` in new code, so that
> we can still reference names in old code (without them becoming names in new
> code).

Actually it would be fine if new code had to say "name.\until" etc.

The only problem I can see is that it would probably be near-impossible
to implement using the current parser generator. It might be doable
by keeping multiple versions of the grammar -- try to parse using the
most recent grammar, if that doesn't work, try the next most recent,
etc. But that would be pretty horrible, and it would require keeping
old cruft in the implementation forever, which we don't like doing.

--
Greg

Steven D'Aprano

unread,
May 18, 2018, 2:48:34 AM5/18/18
to python...@python.org
On Thu, May 17, 2018 at 11:02:23PM -0400, Neil Girdhar wrote:

> However, the difference between the backslash syntax and comprehensions and
> generator functions is that comprehensions and generator functions make the
> language more expressive. The backslash is no more expressive than
> trailing underscore. It's no more succinct, and no more clear. Adding it
> to the language constitutes more that the user needs to learn, which makes
> Python slightly less accessible.

On the contrary: it removes a pain point when dealing with external
libraries. No longer will we have to *transform* the name on both input
and output. Instead, we only need to *escape* the name when written as a
literal.


> I don't like multiple ways of doing the same thing.

Ah, like when people want to use "class" as an identifier, and since
they can't, they write:

klass cls Class

and maybe even occasionally class_ :-)

Or they use a synonym:

kind, clade, type (with or without trailing underscore).

I've seen *every one of those choices* in real code. Except "clade", I
just added that one now.

Remind me again what the "one (obvious) way to do it" is?


> There is already
> probably billions of lines of code that use trailing underscore to avoid
> collisions.

Indeed, and if this proposal is accepted, that will remain legal, and if
people want to write class_ instead of \class or klass, or if_ instead
of \if or infile, they are permitted to do so.

You can even have your own in-house style rules mandating whatever style
you prefer.


> If the backslash syntax is added, then there will be a natural
> push towards the "right way" to avoid collisions. If that's backslashes,
> then projects are typically going to have code written with backslashes and
> code written with underscore. When you go access a variable named "where",
> you'll wonder: was it called "\where" or "where_"?

Yes? Why is this more of a problem than what we have now? Is it called
(in context of PEP 572) "where" or "given"? In general, is it called:

where, place, location, loc, locus, position, pos, x, xy,
locality, locale, coordinates, coord

or some other synonym?

In order to successfully use a library's API, one needs to actually
know what that API *is*. That means you need to know the name of things.
Adding verbatim names won't change that.


> Maybe the "\where" is pretty enough that it's worth it like you say. Maybe
> a function like:
>
> def f(\in=0, out=1):
>
> is prettier than
>
> def f(in_=0, out=1):
>
> but I'm already so used to the current way of doing things, my aesthetic is
> that it's not worth the variability.

Being able to use "in" as an identifier as in that example is not the
driving motivation for adding this feature. The driving motivation is to
remove a pain point when dealing with external APIs that use keywords as
regular identifiers, and to make it simpler to future-proof code when a
new keyword is due to be introduced.

Nobody is going to recommend that folks rush to deprecate their name_
APIs and replace them with \name. I'm sure most library maintainers
will have better things to do. in_ will stay in_ for most existing code.
It is only new code that doesn't have to care about 3.7 or older that
can even consider this.


> For that reason, I'd like to make a more modest proposal to *only* add
> verbatim versions of keywords as necessary,

Because "special cases are special enough to break the rules, complicate
the documentation and the implementation, and give rise to a thousand
Stackoverflow posts asking why we can escape some keywords but not
others".



> e.g., "\where" or "\given".
> That way, there will be no temptation to use that syntax in any other
> place.

Just because you have no use-case for using "except", say, as an
identifier doesn't mean nobody has. You are not arbiter of which
keywords are acceptable to use verbatim and which ones are forbidden.


> When 3.7 hits end-of-life, the "\given" (or whatever) can be deprecated.

Having a white list of "Permitted keywords you may escape" is horrible
enough without baking in a policy of continued code churn by removing
them from the whitelist every few releases.

Greg Ewing

unread,
May 18, 2018, 4:42:47 AM5/18/18
to python...@python.org
Steven D'Aprano wrote:
> It's Python 3.8, and I learn that in 4.0 "spam" is going to become a
> keyword. I simply take my code and change all the references from spam to
> \spam, and I've future-proofed the code for 4.0 while still keeping
> compatibility with 3.8 and 3.9.

Okay, maybe it helps a little bit, but not very much. There
will still be a lot of reluctance to add new keywords, because
of the disruption it will cause to existing code.

If we've learned nothing else from the Python 3 changeover,
it's that many people work in an environment where it's
extremely difficult to update working code.

--
Greg

Stephan Houben

unread,
May 18, 2018, 5:18:10 AM5/18/18
to Greg Ewing, Python-Ideas
2018-05-18 8:05 GMT+02:00 Greg Ewing <greg....@canterbury.ac.nz>:
> Steven D'Aprano wrote:
>> It's Python 3.8, and I learn that in 4.0 "spam" is going to become a keyword. I simply take my code and change all the references from spam to \spam, and I've future-proofed the code for 4.0 while still keeping compatibility with 3.8 and 3.9.
>
> Okay, maybe it helps a little bit, but not very much. There
> will still be a lot of reluctance to add new keywords, because
> of the disruption it will cause to existing code.

And the alternative is to replace all occurrences of
spam with 𝐬𝐩𝐚𝐦 , which has the same effect and also is backward-compatible with
3.x for x < 8.

So there is already a kind of solution available, albeit an ugly one.
 
Stephan

Steven D'Aprano

unread,
May 18, 2018, 7:36:46 AM5/18/18
to python...@python.org
On Fri, May 18, 2018 at 06:05:05PM +1200, Greg Ewing wrote:
> Steven D'Aprano wrote:
> >It's Python 3.8, and I learn that in 4.0 "spam" is going to become a
> >keyword. I simply take my code and change all the references from spam to
> >\spam, and I've future-proofed the code for 4.0 while still keeping
> >compatibility with 3.8 and 3.9.
>
> Okay, maybe it helps a little bit, but not very much. There
> will still be a lot of reluctance to add new keywords, because
> of the disruption it will cause to existing code.

That's okay, in fact there *ought* to be reluctance to add new keywords.
The aim of the exercise is not to add dozens of new keywords to the
language, just to make it easier to deal with the situation when we do.


--
Steve

Steven D'Aprano

unread,
May 18, 2018, 7:39:00 AM5/18/18
to python...@python.org
On Fri, May 18, 2018 at 11:17:13AM +0200, Stephan Houben wrote:

> And the alternative is to replace all occurrences of
> spam with 𝐬𝐩𝐚𝐦 , which has the same effect and also is
> backward-compatible with 3.x for x < 8.
>
> So there is already a kind of solution available, albeit an ugly one.

You are kidding, I hope.

If that works at all, I don't think it's something we want to guarantee
will work. And for what it's worth, what I see is eight empty boxes
(missing glyph symbols).


--
Steve

Neil Girdhar

unread,
May 18, 2018, 8:31:50 AM5/18/18
to python...@googlegroups.com, python...@python.org
On Fri, May 18, 2018 at 2:48 AM Steven D'Aprano <st...@pearwood.info> wrote:
> On Thu, May 17, 2018 at 11:02:23PM -0400, Neil Girdhar wrote:
>
>> However, the difference between the backslash syntax and comprehensions and
>> generator functions is that comprehensions and generator functions make the
>> language more expressive.  The backslash is no more expressive than
>> trailing underscore.  It's no more succinct, and no more clear.  Adding it
>> to the language constitutes more that the user needs to learn, which makes
>> Python slightly less accessible.
>
> On the contrary: it removes a pain point when dealing with external
> libraries. No longer will we have to *transform* the name on both input
> and output. Instead, we only need to *escape* the name when written as a
> literal.
>
>> I don't like multiple ways of doing the same thing.
>
> Ah, like when people want to use "class" as an identifier, and since
> they can't, they write:
>
>     klass cls Class
>
> and maybe even occasionally class_ :-)
>
> Or they use a synonym:
>
>     kind, clade, type (with or without trailing underscore).
>
> I've seen *every one of those choices* in real code. Except "clade", I
> just added that one now.
>
> Remind me again what the "one (obvious) way to do it" is?

All of your arguments would have applied to a keyword escaping proposal had it been proposed before "given" was even considered.  The only reason we're even considering escaping is to keep code that uses "given" as an identifier working.  That's why I prefer the most modest solution of only being able to escape given.  After all, there wasn't any big need to escape other keywords last year.

>> When 3.7 hits end-of-life, the "\given" (or whatever) can be deprecated.
>
> Having a white list of "Permitted keywords you may escape" is horrible
> enough without baking in a policy of continued code churn by removing
> them from the whitelist every few releases.
>
> --
> Steve

Clint Hepner

unread,
May 18, 2018, 8:51:48 AM5/18/18
to Steven D'Aprano, python...@python.org

> On 2018 May 18 , at 7:37 a, Steven D'Aprano <st...@pearwood.info> wrote:
>
> On Fri, May 18, 2018 at 11:17:13AM +0200, Stephan Houben wrote:
>
>> And the alternative is to replace all occurrences of
>> spam with 𝐬𝐩𝐚𝐦 , which has the same effect and also is
>> backward-compatible with 3.x for x < 8.
>>
>> So there is already a kind of solution available, albeit an ugly one.
>
> You are kidding, I hope.
>
> If that works at all, I don't think it's something we want to guarantee
> will work. And for what it's worth, what I see is eight empty boxes
> (missing glyph symbols).

It could be worse. At least 𝐬𝐩𝐚𝐦 ("\U0001d42c\U0001d429\U0001d41a\U0001d426")
is visually distinct with the right font support. You could (ab?)use the Unicode
Tags block (https://en.wikipedia.org/wiki/Tags_(Unicode_block))
and use something like 'spam\U000e0069\U000e0064' (spam<i><d>). (Luckily, that's not
even valid Python, as the tag characters aren't valid for identifiers.)
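
A quick way to check both claims with nothing but the standard library is sketched below. The variable names are made up for illustration; `str.isidentifier` and `unicodedata.category` are real calls.

```python
import unicodedata

# Mathematical bold "spam", written with escapes so it survives any font.
bold_spam = "\U0001d42c\U0001d429\U0001d41a\U0001d426"   # 𝐬𝐩𝐚𝐦

# "spam" followed by TAG LATIN SMALL LETTER I and TAG LATIN SMALL LETTER D.
tagged_spam = "spam\U000e0069\U000e0064"

bold_ok = bold_spam.isidentifier()       # bold letters are identifier characters
tagged_ok = tagged_spam.isidentifier()   # tag characters are not
tag_category = unicodedata.category("\U000e0069")

print(bold_ok, tagged_ok, tag_category)
```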

--
Clint

Stephan Houben

May 18, 2018, 9:11:01 AM5/18/18
to Steven D'Aprano, Python-Ideas
2018-05-18 13:37 GMT+02:00 Steven D'Aprano <st...@pearwood.info>:
On Fri, May 18, 2018 at 11:17:13AM +0200, Stephan Houben wrote:

> And the alternative is to replace all occurrences of
> spam with 𝐬𝐩𝐚𝐦 , which has the same effect and also is
> backward-compatible with 3.x for x < 8.
>
> So there is already a kind of solution available, albeit an ugly one.

You are kidding, I hope.


I am not kidding; I am merely defending the status quo.
I demonstrate how the intended behavior can be achieved using features
available in current Python versions.

The approach has at least the following two technical advantages.
1. It requires no change to Python
2. It provides backwards compatibility all the way back to 3.0.

The spelling is arguably ugly, but this should be weighed against
the, IMHO, extremely rare use of this feature.
 

If that works at all, I don't think it's something we want to guarantee
will work.

It is guaranteed to work by PEP-3131:
https://www.python.org/dev/peps/pep-3131

"All identifiers are converted into the normal form NFKC while parsing; comparison of identifiers is based on NFKC."

NFKC normalization means spam must be considered the same identifier as 𝐬𝐩𝐚𝐦 .

Note that the choice for NFKC normalization was apparently explicitly discussed and decided upon at the time.
Since the difference between NFC and NFKC is exactly that identifiers like spam and  𝐬𝐩𝐚𝐦 are different
under the former and identical under the latter, I take it this is all quite intentional.
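
The NFC versus NFKC difference can be checked directly with `unicodedata.normalize`. This is only a small sketch; the variable names are illustrative.

```python
import unicodedata

# Mathematical bold "spam", written with escapes so it survives any font.
bold_spam = "\U0001d42c\U0001d429\U0001d41a\U0001d426"   # 𝐬𝐩𝐚𝐦

nfc = unicodedata.normalize("NFC", bold_spam)     # canonical composition only
nfkc = unicodedata.normalize("NFKC", bold_spam)   # also folds compatibility forms

print(nfc == bold_spam)   # NFC leaves the bold letters alone
print(nfkc == "spam")     # NFKC maps them to plain ASCII letters
```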

 
And for what it's worth, what I see is eight empty boxes
(missing glyph symbols).


I am afraid that mostly shows that your mailer has a bug in handling non-BMP Unicode
characters; you should be seeing FOUR missing glyph symbols.
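
The four-versus-eight discrepancy is consistent with a program counting UTF-16 code units rather than code points, although that is only a guess about what the mailer does. A small sketch of the arithmetic:

```python
# Mathematical bold "spam"; every character lies outside the BMP.
bold_spam = "\U0001d42c\U0001d429\U0001d41a\U0001d426"   # 𝐬𝐩𝐚𝐦

code_points = len(bold_spam)
# Each non-BMP character needs a surrogate pair, i.e. two 16-bit units.
utf16_units = len(bold_spam.encode("utf-16-le")) // 2
all_non_bmp = all(ord(ch) > 0xFFFF for ch in bold_spam)

print(code_points, utf16_units, all_non_bmp)
```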

Stephan

Steven D'Aprano

May 18, 2018, 9:39:00 AM5/18/18
to python...@python.org
On Fri, May 18, 2018 at 03:09:57PM +0200, Stephan Houben wrote:
> 2018-05-18 13:37 GMT+02:00 Steven D'Aprano <st...@pearwood.info>:
>
> > On Fri, May 18, 2018 at 11:17:13AM +0200, Stephan Houben wrote:
> >
> > > And the alternative is to replace all occurrences of
> > > spam with 𝐬𝐩𝐚𝐦 , which has the same effect and also is
> > > backward-compatible with 3.x for x < 8.
> > >
> > > So there is already a kind of solution available, albeit an ugly one.
> >
> > You are kidding, I hope.
> >
>
>
> I am not kidding;

Earlier you described this suggestion as "a silly joke".

https://mail.python.org/pipermail/python-ideas/2018-May/050861.html

I think you were right then.


> I am merely defending the status quo.
> I demonstrate how the intended behavior can be achieved using features
> available in current Python versions.

Aside from the fact that font, editor and keyboard support for such
non-BMP Unicode characters is very spotty, it isn't the intended
behaviour.

As you point out, the intended behaviour is that obj.𝐢𝐟 and
obj.if ought to be identical. Since the latter is a syntax error, so
should be the former.


> It is guaranteed to work by PEP-3131:
> https://www.python.org/dev/peps/pep-3131
>
> "All identifiers are converted into the normal form NFKC while parsing;
> comparison of identifiers is based on NFKC."
>
> NFKC normalization means spam must be considered the same identifier as
> 𝐬𝐩𝐚𝐦 .


It's not the NFKC normalization that I'm questioning. It's the fact that
it is done too late to catch the use of a keyword.

Stephan Houben

May 18, 2018, 10:35:00 AM5/18/18
to Steven D'Aprano, Python-Ideas
2018-05-18 15:37 GMT+02:00 Steven D'Aprano <st...@pearwood.info>:

Earlier you described this suggestion as "a silly joke".

https://mail.python.org/pipermail/python-ideas/2018-May/050861.html


The joke proposal was to write all keywords in Python using bold font variation,
as a reaction to a similar joke proposal to precede all keywords in Python with  \.

In contrast this isn't even a proposal, it is merely a description of
an existing feature.

Practically speaking, suppose "spam" becomes a keyword in 3.8, and I
have a module which I want to make compatible with 3.8 AND I want
to preserve the API for pre-3.8 versions, then I will first update my module
to use some alternative spelling spam_ throughout, and then, in a single place,
write:

𝐬𝐩𝐚𝐦 = spam_  # exploit NFKC normalization to set identifier "spam" for backward compatibility

Even if this single line shows up as mojibake in somebody's editor, it shouldn't inconvenience them too much.
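
As a sketch of what that one line does, the snippet below runs it through `exec` so the non-ASCII spelling can be written with escapes. It assumes the NFKC normalization of identifiers described in PEP 3131; the value `'eggs'` and the variable names are made up for illustration.

```python
# Mathematical bold "spam", written with escapes so it survives any editor.
bold_spam = "\U0001d42c\U0001d429\U0001d41a\U0001d426"   # 𝐬𝐩𝐚𝐦

# The module body: the new spelling first, then the compatibility alias.
source = f"spam_ = 'eggs'\n{bold_spam} = spam_\n"

namespace = {}
exec(source, namespace)

# The parser normalizes the bold spelling, so the name that gets bound
# is the plain ASCII "spam".
alias_created = "spam" in namespace
print(alias_created, namespace.get("spam"))
```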

 
I think you were right then.


> I am merely defending the status quo.
> I demonstrate how the intended behavior can be achieved using features
> available in current Python versions.

Aside from the fact that font, editor and keyboard support for such
non-BMP Unicode characters is very spotty, it isn't the intended
behaviour.

I am not sure from what you conclude that.

There seem to be three design possibilities here:
1.  𝐢𝐟 is an alternative spelling for the keyword if
2.  𝐢𝐟 is an identifier
3.  𝐢𝐟 is an error

I am pretty sure option 1 (non-ASCII spelling of keywords) was not intended
(doc says about keywords: "They must be spelled exactly as written here:")

So it is either 2 or 3.  Option 3 would only make sense if we conclude that it is
a bad idea to have an identifier with the same name as a keyword.
Whereas this whole thread so far has been about introducing such a feature.

So that leaves 2, which happens to be the implemented behavior.
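
Option 2 can be observed in CPython with a sketch like the following. It relies on the behavior described in this thread, which is an implementation detail under discussion rather than a documented guarantee; the class name `Obj` and the value `'input.txt'` are illustrative.

```python
import keyword

bold_if = "\U0001d422\U0001d41f"   # 𝐢𝐟 in mathematical bold

# The bold spelling is tokenized as an ordinary name, then normalized to "if".
source = (
    "class Obj:\n"
    "    pass\n"
    "obj = Obj()\n"
    f"obj.{bold_if} = 'input.txt'\n"
)
namespace = {}
exec(source, namespace)
value = getattr(namespace["obj"], "if")

# The plain ASCII spelling is still rejected as a keyword.
try:
    compile("obj.if = 1", "<demo>", "exec")
    plain_if_rejected = False
except SyntaxError:
    plain_if_rejected = True

print(value, plain_if_rejected, keyword.iskeyword(bold_if))
```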

As an aside:
A general observation of PEP-3131 and Unicode identifiers in Python:
from the PEP it becomes clear that there have been several proposals
of making it more restricted (e.g. requiring source code to be already in
NFKC normal form, which would make 𝐢𝐟 illegal, disallowing confusables,
etc.)

Ultimately this has been rejected and the result is that we have a rather liberal
definition of Unicode identifiers in Python. I feel that 𝐢𝐟  being a valid
identifier fits into that pattern, just as various confusable spellings of if
would be legal identifiers. In theory this could lead to all kinds of
sneaky attacks where code appears to do one thing but does another,
but it just doesn't seem so big an issue in practice.


As you point out, the intended behaviour is that obj.𝐢𝐟 and
obj.if ought to be identical. Since the latter is a syntax error, so
should be the former.

NFKC normalization is restricted to identifiers.
Keywords "must be spelled exactly as written here."
 


> It is guaranteed to work by PEP-3131:
> https://www.python.org/dev/peps/pep-3131
>
> "All identifiers are converted into the normal form NFKC while parsing;
> comparison of identifiers is based on NFKC."
>
> NFKC normalization means spam must be considered the same identifier as
> 𝐬𝐩𝐚𝐦 .


It's not the NFKC normalization that I'm questioning. It's the fact that
it is done too late to catch the use of a keyword.


See above.

Stephan

Steven D'Aprano

May 18, 2018, 10:49:30 AM5/18/18
to python...@python.org
On Fri, May 18, 2018 at 08:31:36AM -0400, Neil Girdhar wrote:
[...]
> > > I don't like multiple ways of doing the same thing.
> >
> > Ah, like when people want to use "class" as an identifier, and since
> > they can't, they write:
> >
> > klass cls Class
[...]
> > Remind me again what the "one (obvious) way to do it" is?
> >
>
> In most cases: cls
>
> https://www.python.org/dev/peps/pep-0008/#function-and-method-arguments

The immediate next sentence goes on to say:

    If a function argument's name clashes with a reserved keyword,
    it is generally better to append a single trailing underscore
    rather than use an abbreviation or spelling corruption. Thus
    class_ is better than clss. (Perhaps better is to avoid such
    clashes by using a synonym.)

So PEP 8 already torpedoes your preference for a single way to spell
words that clash with keywords:

- use a synonym;

- or a trailing underscore

- except for the first argument of class methods, where we
use the misspelling (or abbreviation) "cls".

If you object to the proposed verbatim names on the basis of disliking
multiple ways to spell identifiers that clash with keywords, that ship
sailed long ago. There have always been multiple ways.

But I like to think that verbatim names could become the one OBVIOUS way
to do it in the future.


[...]
> All of your arguments would have applied to a keyword escaping proposal had
> it been proposed before "given" was even considered.

Every new idea has to be thought of for the first time. Otherwise it
would have been in the language from day zero and it wouldn't be a new
idea. If it wasn't "given", it could have been for "async" and "await",
if not for them it could have been for "with", if not for "with" it
might have been "yield", and so on.

There had to be a first time for any idea. I would have
suggested this twenty years ago if I thought of it twenty years ago, but
I didn't, so I didn't. Dismissing the idea because I didn't suggest it
earlier when other keywords were suggested is illogical.


> The only reason we're
> even considering escaping is to keep code that uses "given" as
> an identifier working.

That might be the only reason YOU are considering it, but it definitely
isn't the only reason for me. And probably not for others.

In fact, since I strongly dislike the "given" syntax, and really want
that idea to be rejected, anything that increases its chances is a
negative to me.

Nevertheless, identifiers clashing with keywords isn't something brand
new that only occurs thanks to "given". It has been a pain point
forever. A small one, true, but still a nuisance.

Verbatim names have the potential to cut that nuisance value even
further. But not if we short-sightedly limit it to a single case.


> That's why I prefer the most modest solution of
> only being able to escape given.

Limiting a general method of mitigating the keyword clashing problem to
a single keyword is as sensible as limiting the pathlib library to only
work with a single hard-coded pathname.

Richard Damon

May 18, 2018, 11:21:00 AM5/18/18
to python...@python.org
On 5/18/18 10:31 AM, Stephan Houben wrote:
>
> NFKC normalization is restricted to identifiers.
> Keywords "must be spelled exactly as written here."
>  
>
>
>
> > It is guaranteed to work by PEP-3131:
> > https://www.python.org/dev/peps/pep-3131
> >
> > "All identifiers are converted into the normal form NFKC while
> > parsing; comparison of identifiers is based on NFKC."
> >
> > NFKC normalization means spam must be considered the same
> > identifier as 𝐬𝐩𝐚𝐦 .
>
>
> It's not the NFKC normalization that I'm questioning. It's the fact
> that it is done too late to catch the use of a keyword.
>
>
> See above.

I would think that the rule that normalization is restricted to
identifiers says that it needs to happen AFTER keyword identification
(otherwise it would have applied to the keyword).  To follow the rules
and flag identifiers that normalize to keywords, either you need to
normalize early and tag text that had been changed by normalization so
keywords could flag errors (but late enough that you don't normalize
inside strings and such), or you need to normalize late (as is done) but
then add a check to see if the text became the same as a keyword.
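
The second variant (normalize late, then check) might look roughly like the sketch below. The function `normalizes_to_keyword` is hypothetical and is not how CPython is implemented; it only illustrates the extra test that would be needed.

```python
import keyword
import unicodedata


def normalizes_to_keyword(raw_name):
    """Return True if raw_name is not a keyword as written, but becomes
    one after NFKC normalization (hypothetical check, for illustration)."""
    normalized = unicodedata.normalize("NFKC", raw_name)
    # A name that is already spelled as a keyword is the tokenizer's
    # business; only flag names that *turn into* a keyword.
    return normalized != raw_name and keyword.iskeyword(normalized)


bold_if = "\U0001d422\U0001d41f"                         # 𝐢𝐟
bold_spam = "\U0001d42c\U0001d429\U0001d41a\U0001d426"   # 𝐬𝐩𝐚𝐦

print(normalizes_to_keyword(bold_if))     # would be flagged
print(normalizes_to_keyword(bold_spam))   # normalizes to "spam", not a keyword
print(normalizes_to_keyword("spam"))      # unchanged by normalization
```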

Seems a shame to go to extra work to flag as an error something that
really could only have been done intentionally, removing a 'feature' to
help with backwards compatibility.

--
Richard Damon

Neil Girdhar

May 18, 2018, 2:54:28 PM5/18/18
to python...@googlegroups.com, python...@python.org
Right.  I think it's pretty clear that there is one way to avoid naming conflict and keep the same name: use a trailing underscore except when it's the first argument to a class method.

But I like to think that verbatim names could become the one OBVIOUS way
to do it in the future.

That's what I don't want. 

Carl Smith

May 18, 2018, 10:24:34 PM5/18/18
to python-ideas
I was asked earlier to summarise the proposal I've been advocating for, but
have already gone over the central points a few times. I'll try and find time to
write a clear explanation soon.

-- Carl Smith
