I am trying to understand something about how the 'in' operator works (as in
the following expression):
    if 'aa' in x:
        do_something()
When trying to implement 'in' support on a class, it appears that if
__contains__ doesn't exist,
'in' falls back to calling __getitem__.
However, strange things happen to the name passed to __getitem__ in the
following example (and in fact, in all variants I have tried, the name/
key passed to __getitem__ is always the integer 0).
For instance:
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class xx(object):
...     def __getitem__(self,name):
...         raise KeyError(name)
...
>>> aa = xx()
>>> aa['kk']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __getitem__
KeyError: 'kk'
>>> 'kk' in aa
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __getitem__
KeyError: 0
I am running on Ubuntu, and this happens with 2.5.4 as well. I must say
I am surprised and
am at a loss as to what is actually going on.
Can anyone enlighten me (or should I go and read some C code ;-) )?
Rgds
Tim
> However, strange things happen to the name passed to __getitem__ in the
> following example (and in fact, in all variants I have tried, the name/
> key passed to __getitem__ is always the integer 0).
I think it's scanning the container as a sequence and not as a mapping,
hence the access by index.
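A small sketch that makes the index access visible. The class name Probe and its attributes are made up for illustration; it only records what __getitem__ is handed and raises IndexError past the end, the way a sequence would:

```python
class Probe(object):
    """Hypothetical helper: records every key passed to __getitem__."""

    def __init__(self, items):
        self.items = list(items)
        self.seen = []

    def __getitem__(self, index):
        self.seen.append(index)
        return self.items[index]  # a list raises IndexError past the end


p = Probe(['aa', 'bb'])
found = 'kk' in p   # no __contains__, so 'in' falls back to __getitem__
# found is False and p.seen is [0, 1, 2]: the object was scanned by index
```

With a __getitem__ that raises KeyError instead, the scan stops at the very first index, which would explain why the key is always 0.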
That's definitely what I think is happening.
I tried the following:
>>> class yy(object):
...     def __getitem__(self,name):
...         raise KeyError(name)
...     def __contains__(self,name):
...         raise KeyError(name)
...
>>> aa = yy()
>>> 'll' in aa
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 5, in __contains__
KeyError: 'll'
>>> [i in aa]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'i' is not defined
>>> [i for i in aa]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 3, in __getitem__
KeyError: 0
>>>
Which suggests to me that there must be some sort of order of precedence
between __contains__ and __getitem__,
and that the 'for' statement must change the order in some manner.
Thanks for the reply
T
Whoops, I just realized I posted the wrong link...
http://docs.python.org/reference/expressions.html#in
Jeff
> >>> [i for i in aa]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "<stdin>", line 3, in __getitem__
> KeyError: 0
>
>
> Which suggests to me there must be some sort of order of precedence
> between __contains__ and __getitem__
> and 'for' statement must change the order in some manner.
The 'in' part of for statements has nothing to do with the 'in'
comparison operator. For loops first look for .__iter__ (which must
return an iterator with a .next method, spelled .__next__ in Python 3)
and fall back to .__getitem__.
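A rough illustration of that lookup order for iteration. Both class names below are invented for the example:

```python
class Both(object):
    """Hypothetical class defining both __iter__ and __getitem__."""

    def __iter__(self):
        return iter(['from __iter__'])

    def __getitem__(self, index):
        if index >= 2:
            raise IndexError(index)
        return 'from __getitem__'


class OnlyGetitem(object):
    """Hypothetical class with __getitem__ only."""

    def __getitem__(self, index):
        if index >= 2:
            raise IndexError(index)
        return index * 10


with_iter = [i for i in Both()]            # ['from __iter__']
without_iter = [i for i in OnlyGetitem()]  # [0, 10]
```

When __iter__ is present, __getitem__ is never consulted by the loop. Without it, the loop asks for index 0, 1, 2, ... until IndexError is raised.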
"in" is both a keyword and an operator, with entirely different semantics.
As a keyword, it's used inside a for statement, where the expression
(after the in) must evaluate to an iterable object.
As an operator, it's used inside an arbitrary expression. It is this
case which is described in the docs:
> For user-defined classes which define the __contains__() method, x in y
> is true if and only if y.__contains__(x) is true.
>
> For user-defined classes which do not define __contains__() and do
> define __getitem__(), x in y is true if and only if there is a
> non-negative integer index i such that x == y[i], and all lower integer
> indices do not raise IndexError exception. (If any other exception is
> raised, it is as if in raised that exception).
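Read literally, the second paragraph amounts to something like the function below. This is only a pure-Python paraphrase of the quoted wording, not the interpreter's actual C code, and the names contains_via_getitem and Raises are made up here:

```python
def contains_via_getitem(y, x):
    """Sketch of the documented fallback: try y[0], y[1], ... in order."""
    i = 0
    while True:
        try:
            item = y[i]
        except IndexError:
            return False      # ran off the end without a match
        if item == x:
            return True
        i += 1


class Raises(object):
    """Same shape as the class in the original post."""

    def __getitem__(self, name):
        raise KeyError(name)


try:
    contains_via_getitem(Raises(), 'kk')
    caught = None
except KeyError as e:
    caught = e.args[0]        # the index that was tried, not 'kk'
```

Only IndexError ends the search quietly. The KeyError from the very first lookup, y[0], propagates, which matches the "KeyError: 0" in the original traceback.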