"/a" is not "/a" ?

Emanuele D'Arrigo

Mar 6, 2009, 2:17:07 PM

to

Hi everybody,

while testing a module today I stumbled on something that I can work
around but I don't quite understand.

>>> a = "a"
>>> b = "a"
>>> a == b
True
>>> a is b
True

>>> c = "/a"
>>> d = "/a"
>>> c == d
True # all good so far
>>> c is d
False # eeeeek!

Why do c and d point to two different objects with identical string
content rather than to the same object?

Manu

Joshua Kugler

Mar 6, 2009, 2:44:45 PM

to pytho...@python.org

Emanuele D'Arrigo wrote:

>>>> c = "/a"
>>>> d = "/a"
>>>> c == d
> True # all good so far
>>>> c is d
> False # eeeeek!
>
> Why c and d point to two different objects with an identical string
> content rather than the same object?

Because you instantiated two different objects.
http://docs.python.org/reference/datamodel.html#objects-values-and-types
should get you started on Python and objects.

j

Gary Herron

Mar 6, 2009, 2:46:14 PM

to pytho...@python.org

Emanuele D'Arrigo wrote:
> Hi everybody,
>
> while testing a module today I stumbled on something that I can work
> around but I don't quite understand.
>

*Do NOT use "is" to compare immutable types.* **Ever! **

It is an implementation choice (usually driven by efficiency considerations) to choose when two strings with the same value are stored in memory once or twice. In order for Python to recognize when a newly created string has the same value as an already existing string, and so use the already existing value, it would need to search *every* existing string whenever a new string is created. Clearly that's not going to be efficient. However, the C implementation of Python does a limited version of such a thing -- at least with strings of length 1.

Gary Herron

> >>> a = "a"
> >>> b = "a"
> >>> a == b
> True
> >>> a is b
> True
>
> >>> c = "/a"
> >>> d = "/a"
> >>> c == d
> True # all good so far
> >>> c is d
> False # eeeeek!
>
> Why c and d point to two different objects with an identical string
> content rather than the same object?
>
> Manu


Robert Kern

Mar 6, 2009, 2:58:36 PM

to pytho...@python.org

On 2009-03-06 13:46, Gary Herron wrote:

> Emanuele D'Arrigo wrote:
>> Hi everybody,
>>
>> while testing a module today I stumbled on something that I can work
>> around but I don't quite understand.
>

> *Do NOT use "is" to compare immutable types.* **Ever! **

Well, "foo is None" is actually recommended practice....

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

John Nagle

Mar 6, 2009, 3:37:48 PM

to

Gary Herron wrote:
> Emanuele D'Arrigo wrote:
>> Hi everybody,
>>
>> while testing a module today I stumbled on something that I can work
>> around but I don't quite understand.
>
> *Do NOT use "is" to compare immutable types.* **Ever! **

Then it should be a detected error to do so.

John Nagle

Gary Herron

Mar 6, 2009, 3:23:18 PM

to pytho...@python.org

Robert Kern wrote:

> On 2009-03-06 13:46, Gary Herron wrote:
>> Emanuele D'Arrigo wrote:

>>> Hi everybody,
>>>
>>> while testing a module today I stumbled on something that I can work
>>> around but I don't quite understand.
>>

>> *Do NOT use "is" to compare immutable types.* **Ever! **
>

> Well, "foo is None" is actually recommended practice....
>

But since newbies are always falling into this trap, it is still a good
rule to say:

Newbies: Never use "is" to compare immutable types.

and then later point out, for those who have absorbed the first rule:

Experts: Singleton immutable types *may* be compared with "is",
although normal equality with == works just as well.

Gary Herron

Robert Kern

Mar 6, 2009, 3:33:28 PM

to pytho...@python.org

On 2009-03-06 14:23, Gary Herron wrote:
> Robert Kern wrote:
>> On 2009-03-06 13:46, Gary Herron wrote:
>>> Emanuele D'Arrigo wrote:

>>>> Hi everybody,
>>>>
>>>> while testing a module today I stumbled on something that I can work
>>>> around but I don't quite understand.
>>>

>>> *Do NOT use "is" to compare immutable types.* **Ever! **
>>
>> Well, "foo is None" is actually recommended practice....
>>
>
> But since newbies are always falling into this trap, it is still a good
> rule to say:
>
> Newbies: Never use "is" to compare immutable types.
>
> and then later point out, for those who have absorbed the first rule:
>
> Experts: Singleton immutable types *may* be compared with "is",
> although normal equality with == works just as well.

That's not really true. If my object overrides __eq__ in a funny way, "is None"
is much safer.

Use "is" when you really need to compare by object identity and not value.

Steven D'Aprano

Mar 6, 2009, 3:37:51 PM

to

Gary Herron wrote:

> Emanuele D'Arrigo wrote:
>> Hi everybody,
>>
>> while testing a module today I stumbled on something that I can work
>> around but I don't quite understand.
>>
>
> *Do NOT use "is" to compare immutable types.* **Ever! **

Huh? How am I supposed to compare immutable types for identity then? Your
bizarre instruction would prohibit:

if something is None

which is the recommended way to compare to None, which is immutable. The
standard library has *many* identity tests to None.

I would say, *always* use "is" to compare any type whenever you intend to
compare by *identity* instead of equality. That's what it's for. If you use
it to test for equality, you're doing it wrong. But in the very rare cases
where you care about identity (and you almost never do), "is" is the
correct tool to use.

> It is an implementation choice (usually driven by efficiency
> considerations) to choose when two strings with the same value are stored
> in memory once or twice. In order for Python to recognize when a newly
> created string has the same value as an already existing string, and so
> use the already existing value, it would need to search *every* existing
> string whenever a new string is created.

Not at all. It's quite easy, and efficient. Here's a pure Python string
constructor that caches strings.

class CachedString(str):
    _cache = {}
    def __new__(cls, value):
        s = cls._cache.setdefault(value, value)
        return s

Python even includes a built-in function to do this: intern(), although I
believe it has been moved to sys.intern() in Python 3.0.
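
As a small illustration of explicit interning, here is a sketch for Python 3, where the function is available as sys.intern(). The strings are built at run time so that compile-time sharing of constants plays no part; the variable names are made up for the example.

```python
import sys

# Build two equal strings at run time. Whether they start out as one object
# or two is an implementation detail, so nothing here depends on it.
parts = ["/", "a"]
c = "".join(parts)
d = "".join(parts)

# Interning maps equal strings to a single shared object.
c_interned = sys.intern(c)
d_interned = sys.intern(d)

print(c == d)                    # equality holds with or without interning
print(c_interned is d_interned)  # identity holds for the interned copies
```

The point is that identity is only something to rely on when you have arranged for it yourself, as sys.intern() does here.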

> Clearly that's not going to be efficient.

Only if you do it the inefficient way.

> However, the C implementation of Python does a limited version
> of such a thing -- at least with strings of length 1.

No, that's not right. The identity test fails for some strings of length
one.

>>> a = '\n'
>>> b = '\n'
>>> len(a) == len(b) == 1
True
>>> a is b
False

Clearly, Python doesn't intern all strings of length one. What Python
actually interns are strings that look like, or could be, identifiers:

>>> a = 'heresareallylongstringthatisjustmade' \
... 'upofalphanumericcharacterssuitableforidentifiers123_'
>>>
>>> b = 'heresareallylongstringthatisjustmade' \
... 'upofalphanumericcharacterssuitableforidentifiers123_'
>>> a is b
True

It also does a similar thing for small integers, currently something
like -5 through to 256 I believe, although this is an implementation
detail subject to change.

--
Steven

Steven D'Aprano

Mar 6, 2009, 3:48:29 PM

to

Emanuele D'Arrigo wrote:

> Hi everybody,
>
> while testing a module today I stumbled on something that I can work
> around but I don't quite understand.

Why do you have to work around it?

What are you trying to do that requires that two strings should occupy the
same memory location rather than merely being equal?

> Why c and d point to two different objects with an identical string
> content rather than the same object?

Why shouldn't they?

--
Steven

Gary Herron

Mar 6, 2009, 3:54:40 PM

to pytho...@python.org

Robert Kern wrote:
> On 2009-03-06 14:23, Gary Herron wrote:
>> Robert Kern wrote:

>>> On 2009-03-06 13:46, Gary Herron wrote:
>>>> Emanuele D'Arrigo wrote:

>>>>> Hi everybody,
>>>>>
>>>>> while testing a module today I stumbled on something that I can work
>>>>> around but I don't quite understand.
>>>>

>>>> *Do NOT use "is" to compare immutable types.* **Ever! **
>>>

>>> Well, "foo is None" is actually recommended practice....
>>>
>>
>> But since newbies are always falling into this trap, it is still a good
>> rule to say:
>>
>> Newbies: Never use "is" to compare immutable types.
>>
>> and then later point out, for those who have absorbed the first rule:
>>
>> Experts: Singleton immutable types *may* be compared with "is",
>> although normal equality with == works just as well.
>
> That's not really true. If my object overrides __eq__ in a funny way,
> "is None" is much safer.
>
> Use "is" when you really need to compare by object identity and not
> value.

But that definition is the *source* of the trouble. It is *completely*
meaningless to newbies. Until one has experience in programming in
general and experience in Python in particular, the difference between
"object identity" and "value" is a mystery.

So in order to lead newbies away from this *very* common trap they often
fall into, it is still a valid rule to say

Newbies: Never use "is" to compare immutable types.

or even better

Newbies: Never use "is" to compare anything.

This will help them avoid traps, and won't hurt their use of the
language. If they get to a point that they need to contemplate using
"is", then almost by definition, they are not a newbie anymore, and the
rule is still valid.

Gary Herron

Mar 6, 2009, 4:08:41 PM

to pytho...@python.org

Steven D'Aprano wrote:
> Gary Herron wrote:
>
>
>> Emanuele D'Arrigo wrote:
>>
>>> Hi everybody,
>>>
>>> while testing a module today I stumbled on something that I can work
>>> around but I don't quite understand.
>>>
>>>
>> *Do NOT use "is" to compare immutable types.* **Ever! **
>>
>
> Huh? How am I supposed to compare immutable types for identity then? Your
> bizarre instruction would prohibit:
>
> if something is None
>

Just use:

if something == None

It does *exactly* the same thing.

But... I'm not (repeat NOT) saying *you* should do it this way.

I am saying that since newbies continually trip over incorrect uses of
"is", they should be warned against using "is" in any situation until
they understand the subtle nature of "is".

If they use a couple of "something==None" instead of "something is None"
in their code while learning Python, it won't hurt, and they can change
their style when they understand the difference. And meanwhile they
will skip traps newbies fall into when they don't understand these
things yet.

Gary Herron

Steven D'Aprano

Mar 6, 2009, 4:12:39 PM

to

Gary Herron wrote:

> Robert Kern wrote:
...

>> Use "is" when you really need to compare by object identity and not
>> value.
>
> But that definition is the *source* of the trouble. It is *completely*
> meaningless to newbies. Until one has experience in programming in
> general and experience in Python in particular, the difference between
> "object identity" and "value" is a mystery.

Then teach them the difference, rather than give them bogus advice.

> So in order to lead newbies away from this *very* common trap they often
> fall into, it is still a valid rule to say
>
> Newbies: Never use "is" to compare immutable types.

Look in the standard library, and you will see dozens of cases of
first-quality code breaking your "valid" rule.

Your rule is not valid. A better rule might be:

Never use "is" to compare equality.

Or even:

Never use "is" unless you know the difference between identity and equality.

Or even:

Only use "is" on Tuesdays.

At least that last rule is occasionally right (in the same way a stopped
clock is right twice a day), while your rule is *always* wrong. It is never
correct to avoid using "is" when you need to compare for identity.

> of even better
>
> Newbies: Never use "is" to compare anything.

Worse and worse! Now you're actively teaching newbies to write buggy code!

--
Steven

Steven D'Aprano

Mar 6, 2009, 4:22:23 PM

to

Gary Herron wrote:

>> Huh? How am I supposed to compare immutable types for identity then? Your
>> bizarre instruction would prohibit:
>>
>> if something is None
>>
>
> Just use:
>
> if something == None
>
> It does *exactly* the same thing.

Wrong.

"something is None" is a pointer comparison. It's blindingly fast, and it
will only return True if something is the same object as None. Any other
object *must* return False.

"something == None" calls something.__eq__(None), which is a method of
arbitrary complexity, which may cause arbitrary side-effects. It can have
false positives, where objects with unexpected __eq__ methods may return
True, which is almost certainly not the intention of the function author
and therefore a bug.
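
A minimal sketch of such a false positive, assuming a deliberately odd class (AlwaysEqual is a hypothetical name, not something from the standard library):

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


thing = AlwaysEqual()

# "==" dispatches to AlwaysEqual.__eq__, which answers True for any operand.
equal_to_none = (thing == None)  # noqa: E711

# "is" compares object identity and cannot be overridden.
is_none = (thing is None)

print(equal_to_none)  # True
print(is_none)        # False
```

A function that checks its argument with "== None" would treat this object as if no argument had been passed at all.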

[...]

> If they use a couple "something==None" instead of "something is None"
> in their code while learning Python, it won't hurt,

Apart from the subtle bugs they introduce into their code.

> and they can change
> their style when they understand the difference. And meanwhile they
> will skip traps newbies fall into when they don't understand these
> things yet.

How about teaching them the right reasons for using "is" instead of giving
them false information by telling them they should never use it?

--
Steven

Emanuele D'Arrigo

Mar 6, 2009, 4:31:02 PM

to

Thank you everybody for the contributions and sorry if I reawoke the
recurring "is vs ==" issue. I -think- I understand how Python's
object model works, but clearly I'm still missing something. Let me
reiterate my original example without the distracting aspect of the
"==" comparisons and the four variables:

>>> a = "a"
>>> b = "a"

>>> a is b
True

>>> a = "/a" <- same as above, except the forward slashes!
>>> b = "/a" <- same as above, except the forward slashes!
>>> a is b
False

So, it appears that in the first case a and b are names for the same
string object, while in the second case they refer to two separate
objects. Why? What's so special about the forward slash that causes the
two "/a" strings to create two separate objects? Is this an
implementation-specific issue?

Manu

Emanuele D'Arrigo

Mar 6, 2009, 4:42:57 PM

to

On 6 Mar, 19:46, Gary Herron <gher...@islandtraining.com> wrote:
> It is an implementation choice (usually driven by efficiency considerations) to choose when two strings with the same value are stored in memory once or twice. In order for Python to recognize when a newly created string has the same value as an already existing string, and so use the already existing value, it would need to search *every* existing string whenever a new string is created. Clearly that's not going to be efficient. However, the C implementation of Python does a limited version of such a thing -- at least with strings of length 1.

Gary, thanks for your reply: your explanation does pretty much answer
my question. One thing I can add however is that it really seems that
non-alphanumeric characters such as the forward slash make the
difference, not just the number of characters. I.e.

>>> a = "aaaaaaaaaaaaaaaaaaaaaaaaaaa"
>>> b = "aaaaaaaaaaaaaaaaaaaaaaaaaaa"
>>> a is b
True
>>> a = "/aaaaaaaaaaaaaaaaaaaaaaaaaaa"
>>> b = "/aaaaaaaaaaaaaaaaaaaaaaaaaaa"
>>> a is b
False

I just find it peculiar more than a nuisance, but I'll go to the
blackboard and write 100 times "never compare the identities of two
immutables". Thank you all!

Manu

Christian Heimes

Mar 6, 2009, 4:51:36 PM

to pytho...@python.org

Emanuele D'Arrigo wrote:
> So, it appears that in the first case a and b are names to the same
> string object, while in the second case they are to two separate
> objects. Why? What's so special about the forward slash that cause the
> two "/a" strings to create two separate objects? Is this an
> implementation-specific issue?

Python special cases certain objects like str with one element or small
ints from -5 to +256 for performance reasons. It's version and
implementation specific and may change in the future. Do NOT rely on it!

Christian

Gary Herron

Mar 6, 2009, 5:38:34 PM

to pytho...@python.org

Nonsense. Show me "newbie" level code that's buggy with "==" but
correct with "is".

However, I do like your restatement of the rule this way:

Never use "is" unless you know the difference between identity and
equality.

That warns newbies away from the usual pitfall, and (perhaps) won't
offend those who seem to forget what "newbie" means.

Gary Herron


"Martin v. Löwis"

Mar 6, 2009, 5:46:14 PM

to Emanuele D'Arrigo

> So, it appears that in the first case a and b are names to the same
> string object, while in the second case they are to two separate
> objects. Why?

This question is ambiguous:
a) Why does the Python interpreter behave this way?
(i.e. what specific algorithm produces this result?)
or
b) Why was the interpreter written to behave this way?
(i.e. what is the rationale for that algorithm?)

For a), the answer is in Objects/codeobject.c:

/* Intern selected string constants */
for (i = PyTuple_Size(consts); --i >= 0; ) {
    PyObject *v = PyTuple_GetItem(consts, i);
    if (!PyString_Check(v))
        continue;
    if (!all_name_chars((unsigned char *)PyString_AS_STRING(v)))
        continue;
    PyString_InternInPlace(&PyTuple_GET_ITEM(consts, i));
}

So it interns all strings which only consist of name
characters.

For b), the rationale is that such string literals
in source code are often used to denote names, e.g.
for getattr() calls and the like. As all names are interned,
name-like strings get interned also.
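
The rule described under a) can be approximated in pure Python. This is only a sketch: looks_like_name() is a hypothetical helper, and the character set (ASCII letters, digits and underscore) is an assumption about what the C function all_name_chars() accepts, not a copy of it.

```python
import string

# Assumed approximation of the characters the C check treats as "name characters".
NAME_CHARS = set(string.ascii_letters + string.digits + "_")


def looks_like_name(s):
    """Return True if every character of s is a name character."""
    return all(ch in NAME_CHARS for ch in s)


for literal in ["a", "/a", "a" * 27, "/" + "a" * 27]:
    # The literals containing a slash fail the check, matching the behaviour
    # reported earlier in the thread.
    print(repr(literal), looks_like_name(literal))
```

Under this reading, the length of the string is irrelevant; one character outside the set is enough for the constant to be skipped.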

> What's so special about the forward slash that cause the
> two "/a" strings to create two separate objects?

See above.

> Is this an implementation-specific issue?

Yes, see above.

Martin

Gary Herron

Mar 6, 2009, 6:09:23 PM

to pytho...@python.org

Unless you are *trying* to discern something about the implementation
and its attempt at efficiencies. Here are several more interesting examples:

>>> 101 is 100+1
True
>>> 1001 is 1000+1
False

>>> 10*'a' is 5*'aa'
True
>>> 100*'a' is 50*'aa'
False

Gary Herron


sk...@pobox.com

Mar 6, 2009, 5:08:48 PM

to Robert Kern, pytho...@python.org

Gary> *Do NOT use "is" to compare immutable types.* **Ever! **

The obvious followup question is then, "when is it ok to use 'is'?"

Robert> Well, "foo is None" is actually recommended practice....

Indeed. It does have some (generally small) performance ramifications as
well. Two trivial one-line examples:

% python -m timeit -s 'x = None' 'x is None'
10000000 loops, best of 3: 0.065 usec per loop
% python -m timeit -s 'x = None' 'x == None'
10000000 loops, best of 3: 0.121 usec per loop
% python -m timeit -s 'x = object(); y = object()' 'x == y'
10000000 loops, best of 3: 0.154 usec per loop
% python -m timeit -s 'x = object(); y = object()' 'x is y'
10000000 loops, best of 3: 0.0646 usec per loop

I imagine the distinction grows if you implement a class with __eq__ or
__cmp__ methods, but that would make the examples greater than one line
long. Of course, the more complex the objects you are comparing the
stronger the recommendation against using 'is' to compare two objects.

Skip

Gabriel Genellina

Mar 6, 2009, 6:49:50 PM

to pytho...@python.org

En Fri, 06 Mar 2009 19:31:02 -0200, Emanuele D'Arrigo <man...@gmail.com>
escribió:

>>>> a = "a"
>>>> b = "a"
>>>> a is b
> True
>
>>>> a = "/a" <- same as above, except the forward slashes!
>>>> b = "/a" <- same as above, except the forward slashes!
>>>> a is b
> False
>
> So, it appears that in the first case a and b are names to the same
> string object, while in the second case they are to two separate
> objects. Why? What's so special about the forward slash that cause the
> two "/a" strings to create two separate objects? Is this an
> implementation-specific issue?

With all the answers you got, I hope you now understand that you put the
question backwards: it's not "why aren't a and b the very same object in
the second case?" but "why are they the same object in the first case?".

Two separate expressions, involving two separate literals, don't *have* to
evaluate as the same object. Only because strings are immutable the
interpreter *may* choose to re-use the same string. But Python would still
be Python even if all those strings were separate objects (although it
would perform a lot slower!)

--
Gabriel Genellina

Carl Banks

Mar 6, 2009, 7:05:53 PM

to

On Mar 6, 12:23 pm, Gary Herron <gher...@islandtraining.com> wrote:
> Robert Kern wrote:
> > On 2009-03-06 13:46, Gary Herron wrote:
> >> Emanuele D'Arrigo wrote:
> >>> Hi everybody,
>
> >>> while testing a module today I stumbled on something that I can work
> >>> around but I don't quite understand.
>
> >> *Do NOT use "is" to compare immutable types.* **Ever! **
>
> > Well, "foo is None" is actually recommended practice....
>
> But since newbies are always falling into this trap, it is still a good
> rule to say:
>
> Newbies: Never use "is" to compare immutable types.

No it isn't, it's asinine advice that's not even a simplified truth,
it's just a lie.

Newbies who don't understand the difference between "==" and "is"
should not be using "is", for any object, immutable or mutable, aside
from None (which, whether you like it or not, is idiomatic Python).

Everyone who's learned the difference between equality and same
identity, including experts, should be using "is" only to test if some
object is the same object they created themselves, or is an object
guaranteed by a library or the language to never change, irrespective
of whether the object is mutable or not.

At no point on the learning curve is the distinction of when to use
"is" or not ever mutability.

Carl Banks

Ben Finney

Mar 6, 2009, 7:07:12 PM

to

"Emanuele D'Arrigo" <man...@gmail.com> writes:

> I just find it peculiar more than a nuisance, but I'll go to the
> blackboard and write 100 times "never compare the identities of two
> immutables". Thank you all!

That's the wrong lesson to learn from this.

The right lesson to learn is, “Equality comparison is not the same
operation as identity comparison. Use the right tool for the situation
at hand.”

--
\ “Facts are meaningless. You could use facts to prove anything |
`\ that's even remotely true!” —Homer, _The Simpsons_ |
_o__) |
Ben Finney

Paul Rubin

Mar 6, 2009, 7:26:46 PM

to

Gary Herron <ghe...@islandtraining.com> writes:
> Experts: Singleton immutable types *may* be compared with "is",

That is absolutely wrong:

>>> a = 2^100
>>> b = 2^100

>>> a == b
True
>>> a is b

False

Paul Rubin

Mar 6, 2009, 7:29:46 PM

to

Steven D'Aprano <st...@pearwood.info> writes:
> It is never
> correct to avoid using "is" when you need to compare for identity.

When is it ever necessary to compare for identity?

Steven D'Aprano

Mar 6, 2009, 7:45:46 PM

to

sk...@pobox.com wrote:

> Of course, the more complex the objects you are comparing the
> stronger the recommendation agaist using 'is' to compare two objects.

Why is there so much voodoo advice about "is"? Is object identity really
such a scary concept that people are frightened of it?

Mutable versus immutable is irrelevant. The complexity of the object is
irrelevant. The phase of the moon is irrelevant. The *only* relevant factor
is the programmer's intention:

If you want to test whether two objects have the same value (equality), the
correct way to do it is with "==".

If you want to test whether two objects are actually one and the same
object, that is, they exist at the same memory location (identity), the
correct way to do it is with "is".

If you find it difficult to think of a reason for testing for identity,
you're right, there aren't many. Since it's rare to care about identity, it
should be rare to use "is". But in the few times you do care about
identity, the correct solution is to use "is" no matter what sort of object
it happens to be. It really is that simple.
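
To make the two intentions concrete, here is a short sketch with mutable lists (the names are arbitrary):

```python
first = [1, 2, 3]
second = [1, 2, 3]   # a separate list that happens to have the same contents
alias = first        # a second name for the very same list object

print(first == second)  # True: same value
print(first is second)  # False: two distinct objects
print(first is alias)   # True: one object, two names

# Because alias and first are the same object, a change through one name
# is visible through the other; second is unaffected.
alias.append(4)
print(first)   # [1, 2, 3, 4]
print(second)  # [1, 2, 3]
```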

--
Steven

Robert Kern

Mar 6, 2009, 7:53:09 PM

to pytho...@python.org

Caches of arbitrary objects.

When checking if an object (which may have an arbitrarily perverse __eq__) is
None.

Or a specifically constructed sentinel value.

Checking for cycles in a data structure that defines __eq__.
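
The sentinel case might look like the following sketch. The names _MISSING and lookup() are hypothetical; the idea is only that a private object() can never be confused with a legitimate value, including None.

```python
_MISSING = object()  # hypothetical sentinel; no other object is identical to it


def lookup(mapping, key, default=_MISSING):
    """Return mapping[key], or default if given, else raise KeyError."""
    value = mapping.get(key, _MISSING)
    if value is _MISSING:           # identity test: the key really is absent
        if default is _MISSING:     # identity test: the caller passed no default
            raise KeyError(key)
        return default
    return value


settings = {"timeout": None}
print(lookup(settings, "timeout"))     # None is a real stored value here
print(lookup(settings, "retries", 3))  # falls back to the supplied default
```

An equality test would not do here, since a stored value with an unusual __eq__ could compare equal to the sentinel.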

Steven D'Aprano

Mar 6, 2009, 7:57:35 PM

to

Gary Herron wrote:

>>> Newbies: Never use "is" to compare anything.
>>
>> Worse and worse! Now you're actively teaching newbies to write buggy
>> code!
>
> Nonsense. Show me "newbie" level code that's buggy with "==" but
> correct with "is".

What's "newbie" level code? What does that even *mean*? There's no sandbox
for newbies to play in -- their code runs in the same environment as code
written by experts. Newbies can import modules written by experts and use
their code: any object, no matter how complex, might find itself imported
and used by a newbie. Sometimes code written by newbies might even find its
way into code used by experts.

But regardless of that, the point is, what's your motivation in giving
advice to newbies? Do you want them to learn correct coding techniques, or
to learn voodoo programming and superstition? If you want them to learn
correct coding, then teach them the difference between identity and
equality. If you want them to believe superstitions, then continue telling
them never to use "is" without qualifications.

--
Steven

sk...@pobox.com

Mar 6, 2009, 7:57:43 PM

to Steven D'Aprano, pytho...@python.org

Steven> Mutable versus immutable is irrelevant. The complexity of the
Steven> object is irrelevant. The phase of the moon is irrelevant. The
Steven> *only* relevant factor is the programmer's intention:

Which for a new user not familiar with the differing concepts of "is" and
"==" can lead to mistakes.

Steven> If you find it difficult to think of a reason for testing for
Steven> identity, you're right, there aren't many. Since it's rare to
Steven> care about identity, it should be rare to use "is". But in the
Steven> few times you do care about identity, the correct solution is to
Steven> use "is" no matter what sort of object it happens to be. It
Steven> really is that simple.

Right. Again though, when newcomers conflate the concepts they can deceive
themselves into thinking "is" is just a faster "==".

Skip

Steven D'Aprano

Mar 7, 2009, 2:34:17 AM

to

sk...@pobox.com wrote:

> Steven> Mutable versus immutable is irrelevant. The complexity of the
> Steven> object is irrelevant. The phase of the moon is irrelevant. The
> Steven> *only* relevant factor is the programmer's intention:
>
> Which for a new user not familiar with the differing concepts of "is" and
> "==" can lead to mistakes.

Right. And for newbies unfamiliar with Python, they might mistakenly think
that ^ is the exponentiation operator rather than **.

So what do we do? Do we teach them what ^ actually is, or give them bad
advice "Never call ^" and then watch them needlessly write their own
exponentiation function?

Do we then defend this terrible advice by claiming that nobody needs
exponentiation? Or that only experts need it? Or that it's necessary to
tell newbies not to use ^ because they're only newbies and can't deal with
the truth?

No, of course not. But that's what some of us are doing with regard to "is".
Because *some* newbies are ignorant and use "is" for equality testing, we
patronisingly decide that *all* newbies can't cope with learning what "is"
really is for, give them bad advice, and thus ensure that they stay
ignorant longer.

> Steven> If you find it difficult to think of a reason for testing for
> Steven> identity, you're right, there aren't many. Since it's rare to
> Steven> care about identity, it should be rare to use "is". But in the
> Steven> few times you do care about identity, the correct solution is to
> Steven> use "is" no matter what sort of object it happens to be. It
> Steven> really is that simple.
>
> Right. Again though, when newcomers conflate the concepts they can
> deceive themselves into thinking "is" is just a faster "==".

Then teach them the difference, don't teach them superstition.

--
Steven

alex23

Mar 7, 2009, 2:38:50 AM

to

On Mar 7, 10:57 am, s...@pobox.com wrote:
> Right. Again though, when newcomers conflate the concepts they can deceive
> themselves into thinking "is" is just a faster "==".

But _you_ only _just_ stated "It does have some (generally small)
performance ramifications as well" and provided timing examples to show
it. Without qualification.

And you're wondering why these mythical newcomers might be confused...

Steven D'Aprano

Mar 7, 2009, 2:53:49 AM

to

Paul Rubin wrote:

Is that a trick question? The obvious answer is, any time you need to.

The standard library has dozens of tests like:

something is None
something is not None

Various standard modules include comparisons like:

if frame is self.stopframe:
if value is not __UNDEF__:
if b is self.exit:
if domain is Absent:
if base is object:
if other is NotImplemented:
if type(a) is type(b):

although that last one is probably better written using isinstance() or
issubclass(). I have no doubt that there are many more examples.

Comparing by identity is useful for interning objects, for testing that
singletons actually are singletons, for comparing functions with the same
name, for avoiding infinite loops while traversing circular data
structures, and for non-destructively testing whether two mutable objects
are the same object or different objects.
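
As one illustration of the circular-structure case, here is a sketch that walks nested lists and uses "is" to notice when it re-enters a list that is already on the current path. The function name contains_cycle() is made up for this example.

```python
def contains_cycle(node, _path=None):
    """Return True if a nested list structure refers back to itself."""
    if _path is None:
        _path = []
    if not isinstance(node, list):
        return False
    # Identity test: is this exact list already being visited further up?
    if any(node is seen for seen in _path):
        return True
    _path.append(node)
    try:
        return any(contains_cycle(child, _path) for child in node)
    finally:
        _path.pop()


plain = [1, [2, 3]]
shared = [1]
diamond = [shared, shared]   # the same list twice, but no cycle
loop = [1]
loop.append(loop)            # the list now contains itself

print(contains_cycle(plain))    # False
print(contains_cycle(diamond))  # False
print(contains_cycle(loop))     # True
```

Writing the check as "node in _path" would fall back on "==" for list elements, which compares contents and can itself recurse into the structure being examined.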

--
Steven

Paul Rubin

Mar 7, 2009, 2:57:10 AM

to

alex23 <wuw...@gmail.com> writes:
> But _you_ only _just_ stated "It does have some (generally small)
> performance ramifications as
> well" and provided timing examples to show it. Without qualification.

The performance difference can be large if the objects are (for
example) long lists.

Albert Hopkins

Mar 7, 2009, 3:07:39 AM

to pytho...@python.org

I would think (not having looked) that the implementation of == would
first check for identity (for performance reasons)... but then that led
me to ask: can an object be identical but not equal to itself?

Albert Hopkins

Mar 7, 2009, 3:11:02 AM

to pytho...@python.org

On Sat, 2009-03-07 at 03:07 -0500, Albert Hopkins wrote:
> On Fri, 2009-03-06 at 23:57 -0800, Paul Rubin wrote:

> I would think (not having looked) that the implementation of == would
> first check for identity (for performance reasons)... but then that lead
> me to ask: can an object be identical but not equal to itself?

... answered my own question

class Foo:
    def __eq__(self, b):
        return False

>>> x = Foo()
>>> x is x
--> True

>>> x == x
--> False

Steven D'Aprano

Mar 7, 2009, 3:19:38 AM

to

Albert Hopkins wrote:

> I would think (not having looked) that the implementation of == would
> first check for identity (for performance reasons)...

For some types, it may. I believe that string equality testing first tests
whether the two strings are the same string, then tests if they have the
same hash, and only then do a character-by-character comparison. Or so I've
been told.

> can an object be identical but not equal to itself?

Yes. Floating point NANs are required to compare unequal to all floats,
including themselves. It's part of the IEEE standard.

Python doesn't assume that == must mean equality. If, for some bizarre
reason you want to define == to mean something completely different, then
you can define x == x to return anything you like for your class.
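
The NaN case is easy to check for yourself; a minimal sketch:

```python
nan = float("nan")

print(nan is nan)    # True: it is one object
print(nan == nan)    # False: a NaN compares unequal even to itself
print(nan != nan)    # True

# List containment tests identity before equality, so the same NaN object
# is still found in a list that holds it.
print(nan in [nan])  # True
```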

--
Steven

Marc 'BlackJack' Rintsch

Mar 7, 2009, 3:50:33 AM3/7/09

to

What should this example show? And where's the singleton here? BTW:

In [367]: a = 2 ^ 100

In [368]: b = 2 ^ 100

In [369]: a == b
Out[369]: True

In [370]: a is b
Out[370]: True

Ciao,
Marc 'BlackJack' Rintsch

Paul Rubin

Mar 7, 2009, 4:01:43 AM3/7/09

to

Marc 'BlackJack' Rintsch <bj_...@gmx.net> writes:
> What should this example show? And where's the singleton here? BTW:

I misunderstood at first what you meant by "singleton". Sorry.

Lie Ryan

Mar 7, 2009, 5:18:52 AM3/7/09

to

Steven D'Aprano wrote:
> Albert Hopkins wrote:
>
>> I would think (not having looked) that the implementation of == would
>> first check for identity (for performance reasons)...
>
> For some types, it may. I believe that string equality testing first tests
> whether the two strings are the same string, then tests if they have the
> same hash, and only then do a character-by-character comparison. Or so I've
> been told.
>
>> can an object be identical but not equal to itself?
>
> Yes. Floating point NANs are required to compare unequal to all floats,
> including themselves. It's part of the IEEE standard.

btw, has anybody noticed that the subject line "/a" is not "/a"
actually evaluates to False?

>>> "/a" is not "/a"
False
>>> a = "/a"
>>> b = "/a"
>>> a is not b
True
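Part of the joke is that `is not` is a single comparison operator, not `is` applied to `not "/a"`. A small sketch with the standard ast module shows how the parser reads it; the names a and b are placeholders and are never evaluated. Why the two literals on one line turn out to be the same object is a separate matter, presumably because the compiler stores equal constants from one compilation unit only once, which is an implementation detail.

```python
import ast

# Parse without evaluating; a and b are just placeholder names.
tree = ast.parse("a is not b", mode="eval")
compare = tree.body

print(type(compare).__name__)          # Compare
print(type(compare.ops[0]).__name__)   # IsNot

# With explicit parentheses it becomes `is` applied to a `not` expression.
other = ast.parse("a is (not b)", mode="eval").body
print(type(other.ops[0]).__name__)          # Is
print(type(other.comparators[0]).__name__)  # UnaryOp
```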

Mel

Mar 7, 2009, 8:36:38 AM3/7/09

to

Emanuele D'Arrigo wrote:
> Gary, thanks for your reply: your explanation does pretty much answer
> my question. One thing I can add however is that it really seems that
> non-alphanumeric characters such as the forward slash make the
> difference, not just the number of characters. I.e.

(Actually, we had this thread last week.) It's a question of strings that
might be Python names. Every line of code that looks up a name in a
namespace (e.g. global symbols, instance attributes, class attributes, etc.)
needs a string containing the name. This optimization keeps Python data
space from filling up with names. The same thing happens with small,
common, integers.
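A small sketch of the same mechanism made explicit, assuming Python 3, where the function lives in sys.intern (in Python 2 it was the builtin intern). The strings are built at run time so that the compiler cannot merge the literals; whether the un-interned objects are shared is left to the implementation.

```python
import sys

# Build the strings at run time so the compiler cannot merge the literals.
c = "".join(["/", "a"])
d = "".join(["/", "a"])

print(c == d)   # True: same value
print(c is d)   # implementation-dependent; typically False in CPython

# Explicit interning hands back one canonical object per distinct value.
ci = sys.intern(c)
di = sys.intern(d)
print(ci is di)  # True
```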
[ ... ]

> I just find it peculiar more than a nuisance, but I'll go to the
> blackboard and write 100 times "never compare the identities of two
> immutables". Thank you all!

The rule is to know the operations, and use the ones that do what you want
to do. `is` and `==` don't do the same thing. Never did, never will.

<sarcasm>
Python 2.5.2 (r252:60911, Oct 5 2008, 19:24:49)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a=3
>>> b=0
>>> a+b == a-b
True
>>> b=1
>>> a+b == a-b
False

At this point, someone says 'Eek'. But why? `+` and `-` never were the
same operation, regardless of a coincidence or two.
</sarcasm>

Mel.

Mel

Mar 7, 2009, 8:43:54 AM3/7/09

to

wrote:

Ho-hum. MUD game.

def broadcast(sender, message):
    for p in all_players:
        if p is not sender:
            p.tell(message)  # don't send a message to oneself
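
A self-contained sketch of the same idea, for anyone who wants to run it. The Player class, its name-based __eq__ and the inbox list are assumptions made up for this illustration, and the player list is passed in as a parameter instead of being a global.

```python
class Player:
    """Hypothetical player type; equality is by display name only."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def __eq__(self, other):
        return isinstance(other, Player) and self.name == other.name

    def tell(self, message):
        self.inbox.append(message)


def broadcast(all_players, sender, message):
    for p in all_players:
        if p is not sender:      # identity: skip only the sending object itself
            p.tell(message)


alice = Player("guest")
bob = Player("guest")            # equal by ==, but a different object
broadcast([alice, bob], alice, "hello")
print(alice.inbox, bob.inbox)    # [] ['hello']
```

With `p != sender` in place of `p is not sender`, bob would have been skipped as well, because the two players compare equal.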

Mel.

Christian Heimes

Mar 7, 2009, 9:14:16 AM3/7/09

to pytho...@python.org

Steven D'Aprano wrote:
> Yes. Floating point NANs are required to compare unequal to all floats,
> including themselves. It's part of the IEEE standard.

As far as I remember that's not correct. It's just the way C has
interpreted the standard and Python inherited the behavior. But you may
prove me wrong on that.

Mark, you are the expert on IEEE 754.

Christian

Emanuele D'Arrigo

Mar 7, 2009, 11:32:21 AM3/7/09

to

On Mar 6, 10:46 pm, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
> For b), the rationale is that such string literals
> in source code are often used to denote names, e.g.
> for getattr() calls and the like. As all names are interned,
> name-like strings get interned also.

Thank you Martin, and all others who have responded, I have a much
better picture of the whole issue now. Much appreciated.

Manu

Robert Kern

Mar 7, 2009, 6:41:15 PM3/7/09

to pytho...@python.org

On 2009-03-07 02:11, Albert Hopkins wrote:
> On Sat, 2009-03-07 at 03:07 -0500, Albert Hopkins wrote:
>> On Fri, 2009-03-06 at 23:57 -0800, Paul Rubin wrote:

>> I would think (not having looked) that the implementation of == would
>> first check for identity (for performance reasons)... but then that led
>> me to ask: can an object be identical but not equal to itself?
>
> ... answered my own question
>
>
> class Foo:
>     def __eq__(self, b):
>         return False
>
>>>> x = Foo()
>>>> x is x
> --> True
>
>>>> x == x
> --> False

And for a practical, real world example:

In [1]: inf = 1e200 * 1e200

In [2]: inf
Out[2]: inf

In [3]: nan = inf / inf

In [4]: nan
Out[4]: nan

In [5]: nan is nan
Out[5]: True

In [6]: nan == nan
Out[6]: False

Robert Kern

Mar 7, 2009, 6:44:45 PM3/7/09

to pytho...@python.org

On 2009-03-07 08:14, Christian Heimes wrote:

> Steven D'Aprano wrote:
>> Yes. Floating point NANs are required to compare unequal to all floats,
>> including themselves. It's part of the IEEE standard.
>
> As far as I remember that's not correct. It's just the way C has
> interpreted the standard and Python inherited the behavior. But you may
> prove me wrong on that.
>
> Mark, you are the expert on IEEE 754.

Steven is correct. The standard defines how boolean comparisons like ==, !=, <,
etc. should behave in the presence of NaNs. Table 4 on page 9, to be precise.

Lie Ryan

Mar 8, 2009, 8:32:43 AM3/8/09

to

Since a player in a MUD game would always have a unique username, I'd
rather compare with that. It doesn't rely on internals. There are
very, very rare cases where 'is' is really, really needed.

Lie Ryan

Mar 8, 2009, 8:35:17 AM3/8/09

to

Robert Kern wrote:
> On 2009-03-07 08:14, Christian Heimes wrote:
>> Steven D'Aprano wrote:
>>> Yes. Floating point NANs are required to compare unequal to all floats,
>>> including themselves. It's part of the IEEE standard.
>>
>> As far as I remember that's not correct. It's just the way C has
>> interpreted the standard and Python inherited the behavior. But you may
>> prove me wrong on that.
>>
>> Mark, you are the expert on IEEE 754.
>
> Steven is correct. The standard defines how boolean comparisons like ==,
> !=, <, etc. should behave in the presence of NaNs. Table 4 on page 9, to
> be precise.
>

The rationale behind the standard is that NaN can be returned by
many distinct operations, so one NaN may not be equal to another NaN.

Mark Dickinson

Mar 8, 2009, 1:39:56 PM3/8/09

to

On Mar 7, 2:14 pm, Christian Heimes <li...@cheimes.de> wrote:
> Steven D'Aprano wrote:
> > Yes. Floating point NANs are required to compare unequal to all floats,
> > including themselves. It's part of the IEEE standard.
>
> As far as I remember that's not correct. It's just the way C has
> interpreted the standard and Python inherited the behavior. But you may
> prove me wrong on that.

Steven's statement sounds about right to me: IEEE 754
(the current 2008 version of the standard, which supersedes
the original 1985 version that I think Robert Kern is
referring to) says that every NaN compares *unordered*
to anything else (including itself). A compliant language is
required to supply twenty-two(!) comparison operations, including
a 'compareQuietEqual' operation with compareQuietEqual(NaN, x)
being False, and also a 'compareSignalingEqual' operation, such
that compareSignalingEqual(NaN, x) causes an 'invalid operation
exception'. See sections 5.6.1 and 5.11 of the standard for
details.

Throughout the above, 'NaN' means quiet NaN. A comparison
involving a signaling NaN should always cause an invalid operation
exception. I don't think Python really supports signaling NaNs
in any meaningful way.

I wonder what happens if you create an sNaN using
struct.unpack(suitable_byte_string) and then try
to do arithmetic on it in Python...

Mark

Carl Banks

Mar 9, 2009, 11:47:56 AM3/9/09

to

Well, by that criterion you can dismiss almost anything.

Of course you can assign unique ids to most objects and perform your
identity tests that way. The point is that sometimes you do need to
test for the identity of the object, not merely the equivalent
semantic value.

If, faced with this problem (and I'm guessing you haven't faced it
much) your approach is always to define a unique id, so that you can
avoid ever having to use the "is" operator, be my guest. As for me, I
do program in the sort of areas where identity testing is common, and
I don't care to define ids just to test for identity alone, so for me
"is" is useful.

Carl Banks

Robert Kern

Mar 9, 2009, 1:02:52 PM3/9/09

to pytho...@python.org

On 2009-03-08 12:39, Mark Dickinson wrote:
> On Mar 7, 2:14 pm, Christian Heimes<li...@cheimes.de> wrote:
>> Steven D'Aprano wrote:
>>> Yes. Floating point NANs are required to compare unequal to all floats,
>>> including themselves. It's part of the IEEE standard.
>> As far as I remember that's not correct. It's just the way C has
>> interpreted the standard and Python inherited the behavior. But you may
>> prove me wrong on that.
>
> Steven's statement sounds about right to me: IEEE 754
> (the current 2008 version of the standard, which supersedes
> the original 1985 version that I think Robert Kern is
> referring to)

Yes.

Steve Holden

Mar 14, 2009, 10:10:32 AM3/14/09

to pytho...@python.org

Gary Herron wrote:
> Robert Kern wrote:
>> On 2009-03-06 14:23, Gary Herron wrote:
>>> Robert Kern wrote:
>>>> On 2009-03-06 13:46, Gary Herron wrote:
>>>>> Emanuele D'Arrigo wrote:
>>>>>> Hi everybody,
>>>>>>
>>>>>> while testing a module today I stumbled on something that I can work
>>>>>> around but I don't quite understand.
>>>>>
>>>>> *Do NOT use "is" to compare immutable types.* **Ever! **
>>>>
>>>> Well, "foo is None" is actually recommended practice....
>>>>
>>>
>>> But since newbies are always falling into this trap, it is still a good
>>> rule to say:
>>>
>>> Newbies: Never use "is" to compare immutable types.
>>>
>>> and then later point out, for those who have absorbed the first rule:
>>>
>>> Experts: Singleton immutable types *may* be compared with "is",
>>> although normal equality with == works just as well.
>>
>> That's not really true. If my object overrides __eq__ in a funny way,
>> "is None" is much safer.
>>
>> Use "is" when you really need to compare by object identity and not
>> value.
>
> But that definition is the *source* of the trouble. It is *completely*
> meaningless to newbies. Until one has experience in programming in
> general and experience in Python in particular, the difference between
> "object identity" and "value" is a mystery.
> So in order to lead newbies away from this *very* common trap they often
> fall into, it is still a valid rule to say
>
> Newbies: Never use "is" to compare immutable types.
>
I think this is addressing the wrong problem. I'd prefer to say

Newbies: never assume that the interpreter keeps just one copy of any
value. Just because a == b that doesn't mean that a is b. *Sometimes* it
will be, but it isn't usually guaranteed.
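A minimal sketch of that rule, using lists because for mutable objects the language does guarantee that two separate list displays produce two separate objects. The variable names are arbitrary.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a second list display always creates a new list
c = a           # a second name for the same object

print(a == b, a is b)   # True False
print(a == c, a is c)   # True True

c.append(4)
print(a)                # [1, 2, 3, 4]: a and c are one object
print(b)                # [1, 2, 3]: b is unaffected
```

For immutable values such as strings and small integers the interpreter is free to share or not share objects, which is why the `is` result there can look arbitrary.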

> or even better
>
> Newbies: Never use "is" to compare anything.
>
> This will help them avoid traps, and won't hurt their use of the
> language. If they get to a point that they need to contemplate using
> "is", then almost by definition, they are not a newbie anymore, and the
> rule is still valid.

Personally I believe newbies should be allowed the freedom to shoot
themselves in the foot occasionally, and will happily explain the issues
that arise when they do so. It's all good learning.

I think using "is" to compare mutable objects is a difficult topic to
explain, and I think your division of objects into mutable and immutable
types is unhelpful and not to-the-point.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
Want to know? Come to PyCon - soon! http://us.pycon.org/

Steve Holden

Mar 14, 2009, 10:12:09 AM3/14/09

to pytho...@python.org

Paul Rubin wrote:
> Steven D'Aprano <st...@pearwood.info> writes:
>> It is never
>> correct to avoid using "is" when you need to compare for identity.
>
> When is it ever necessary to compare for identity?

For example when providing a unique "sentinel" value as a function
argument. The parameter must be tested for identity with the sentinel.
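A sketch of the sentinel idiom, assuming Python 3. The names _MISSING, lookup and settings are invented for this example and do not come from any library.

```python
_MISSING = object()   # hypothetical sentinel; no other object is identical to it


def lookup(mapping, key, default=_MISSING):
    """Return mapping[key]; raise KeyError unless a default was passed."""
    if key in mapping:
        return mapping[key]
    if default is _MISSING:      # identity test, so None stays a valid default
        raise KeyError(key)
    return default


settings = {"colour": None}
print(lookup(settings, "colour"))       # None (stored value)
print(lookup(settings, "size", None))   # None (caller's default)
print(lookup(settings, "size", 0))      # 0
```

An equality test would be the wrong tool here, since a caller could pass an object whose __eq__ claims equality with anything.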

Steve Holden

Mar 14, 2009, 10:20:32 AM3/14/09

to pytho...@python.org

Well, the obvious "identity" is id(p), but then

a is b

is entirely equivalent to

id(a) == id(b)