Message from discussion Invalid identifier claimed to be valid by docs (methinks)

Received: by 10.68.212.168 with SMTP id nl8mr2371439pbc.5.1348454601520;
        Sun, 23 Sep 2012 19:43:21 -0700 (PDT)
Path: t10ni14059611pbh.0!nntp.google.com!npeer03.iad.highwinds-media.com!news.highwinds-media.com!feed-me.highwinds-media.com!nx01.iad01.newshosting.com!newshosting.com!news2.euro.net!newsfeed.xs4all.nl!newsfeed6.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail
Return-Path: <python-python-l...@m.gmane.org>
X-Original-To: python-l...@python.org
Delivered-To: python-l...@mail.python.org
X-Injected-Via-Gmane: http://gmane.org/
To: python-l...@python.org
From: Terry Reedy <tjre...@udel.edu>
Subject: Re: Invalid identifier claimed to be valid by docs (methinks)
Date: Sun, 23 Sep 2012 22:42:35 -0400
References: <CAN1F8qX54CQQ2roDNtofJCx_=QWY--1sspFGCmLZ7DEtH-a1Ug@mail.gmail.com>
	<CALwzidmPHn479FcjsfM6xGby1wE-GOFChS_q+yh3CU2t8Ua...@mail.gmail.com>
Mime-Version: 1.0
X-Gmane-NNTP-Posting-Host: pool-173-75-251-66.phlapa.fios.verizon.net
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
	rv:15.0) Gecko/20120824 Thunderbird/15.0
In-Reply-To: <CALwzidmPHn479FcjsfM6xGby1wE-GOFChS_q+yh3CU2t8Ua...@mail.gmail.com>
X-BeenThere: python-l...@python.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: General discussion list for the Python programming language
	<python-list.python.org>
List-Unsubscribe: <http://mail.python.org/mailman/options/python-list>,
	<mailto:python-list-requ...@python.org?subject=unsubscribe>
List-Archive: <http://mail.python.org/pipermail/python-list/>
List-Post: <mailto:python-l...@python.org>
List-Help: <mailto:python-list-requ...@python.org?subject=help>
List-Subscribe: <http://mail.python.org/mailman/listinfo/python-list>,
	<mailto:python-list-requ...@python.org?subject=subscribe>
Newsgroups: comp.lang.python
Message-ID: <mailman.1177.1348454579.27098.python-l...@python.org>
Lines: 68
NNTP-Posting-Host: 2001:888:2000:d::a6
X-Trace: 1348454579 news.xs4all.nl 6847 [2001:888:2000:d::a6]:49152
X-Complaints-To: ab...@xs4all.nl
X-Received-Bytes: 5001
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable

On 9/23/2012 6:57 PM, Ian Kelly wrote:
> On Sun, Sep 23, 2012 at 4:24 PM, Joshua Landau
> <joshua.landau...@gmail.com> wrote:
>> The docs describe identifiers to have this grammar:
>>
>> identifier   ::=  xid_start xid_continue*
>> id_start     ::=  <all characters in general categories Lu, Ll, Lt, Lm, Lo,
>> Nl, the underscore, and characters with the Other_ID_Start property>
>> id_continue  ::=  <all characters in id_start, plus characters in the
>> categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
>> xid_start    ::=  <all characters in id_start whose NFKC normalization is in
>> "id_start xid_continue*">

xid_start is a subset of id_start.

>> xid_continue ::=  <all characters in id_continue whose NFKC normalization is
>> in "id_continue*">

xid_continue is a subset of id_continue.

>> So I would assume that
>>      exec("a{} = None".format(char))
>> would be valid if
>>     unicodedata.normalize("NFKC", char) == "1"

Read the definition of xid_continue more carefully. The un-normalized
character must itself also be in id_continue.

>> as
>>     exec("a1 = None")
>> is valid.
>>
>> BUT "a¹ = None" is not valid*.

 >>> import unicodedata as ud
 >>> ud.category("\u00b9")
'No'

Category No is *not* in id_continue, and therefore not in xid_continue.
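
To make the two conditions concrete, here is a minimal sketch that reports
both for a given character. The helper name describe() is my own, not
anything from the docs, and str.isidentifier() is used as a stand-in for
the tokenizer's check:

```python
import unicodedata

def describe(char):
    """Return (general category, NFKC form, whether 'x' + char is an identifier)."""
    return (
        unicodedata.category(char),
        unicodedata.normalize("NFKC", char),
        ("x" + char).isidentifier(),
    )

sup_one = describe("\u00b9")  # SUPERSCRIPT ONE
sup_i = describe("\u2071")    # SUPERSCRIPT LATIN SMALL LETTER I

print(sup_one)
print(sup_i)
```

Superscript one normalizes to "1" but is itself in category No, so it
fails the id_continue test. Superscript i is a letter in its own right,
so it passes both.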

> exec("x\u00b9 = None")  # U+00B9 is superscript 1
>
> On the other hand, this does work:
>
> exec("x\u2071 = None")  # U+2071 is superscript i
>
> So it seems to be only an issue with superscript and subscript digits.
>   Looks like a compiler bug to me.

The problem, if there were one, would be in the tokenizer that finds
identifiers. However,

 >>> exec("x\u00b9 = None")
...
     x¹ = None
       ^
SyntaxError: invalid character in identifier

this is correct.
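
For anyone who wants to reproduce both cases without a traceback, a small
sketch follows. The helper name compiles() is hypothetical, and the exact
wording of the SyntaxError message differs between Python versions, so the
code only checks that the error is raised:

```python
def compiles(source):
    """Return True if source compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

rejected = compiles("x\u00b9 = None")  # superscript 1, category No
accepted = compiles("x\u2071 = None")  # superscript i, a letter

# Identifiers are NFKC-normalized while parsing, so the accepted
# assignment binds the plain ASCII name "xi".
namespace = {}
exec("x\u2071 = None", namespace)
bound_names = [name for name in namespace if not name.startswith("__")]

print(rejected, accepted, bound_names)
```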

-- 
Terry Jan Reedy