
Too many 'self' in Python. That's a big flaw in this language.


hide...@gmail.com

Jun 27, 2007, 7:02:34 AM
Hi,
I'm currently using Python. I find that an instance variable must be
qualified with self,
for example:
class a:
    def __init__(self):
        self.aa=10
    def bb(self):
        print self.aa # See: in C++, I could use plain aa to change that variable

That's a big inconvenience in coding, especially when you have a lot of
variables.
If your method needs 10 variables, you have to type "self" 10 times,
and that also makes your variable names longer.

From my point of view, I think this only helps the Python interpreter decide
where to look.
Does anyone know how to make the interpreter search the instance
namespace first?
Or any way to make the programmer's life easier?

Marc 'BlackJack' Rintsch

Jun 27, 2007, 7:41:02 AM
In <1182942154....@e9g2000prf.googlegroups.com>,
hide...@gmail.com wrote:

Use a shorter name than `self` or an editor with auto completion. Whether a
name in a function or method is local or global is decided at compile
time, not at run time. So at least every assignment to an instance
attribute must have the ``self.`` in front, or the compiler sees the name
as local to the method. Changing this would slow down the interpreter
because every name would have to be looked up in the instance dict every
time to decide if it's an attribute or a local name.
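
To illustrate the compile-time decision, here is a minimal sketch (written in
Python 3 syntax, unlike the Python 2 snippets elsewhere in this thread; the
class `A` and its method names are made up for the example). An assignment to
a bare name makes it a local variable of the method, which shows up in the
function's code object:

```python
class A:
    def __init__(self):
        self.aa = 10

    def rebind_local(self):
        # A bare assignment creates a local name; the attribute is untouched.
        aa = 99
        return aa

    def rebind_attribute(self):
        # The explicit prefix makes this an attribute store on the instance.
        self.aa = 99
        return self.aa


obj = A()
obj.rebind_local()
print(obj.aa)                                    # still 10
print(A.rebind_local.__code__.co_varnames)       # ('self', 'aa')
print(A.rebind_attribute.__code__.co_varnames)   # ('self',)
obj.rebind_attribute()
print(obj.aa)                                    # now 99
```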

Another drawback of your proposed "magic" is that attributes can be
assigned, deleted or delegated dynamically at run time. So your bare `aa`
name could change meaning from instance attribute to local name or vice
versa over time.
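
A small sketch of that dynamism (Python 3 syntax; the class `A` and the
attribute names are only illustrative). The set of attributes on an instance
is not fixed when the method is compiled, so a bare `aa` could not be resolved
ahead of time:

```python
class A:
    def __init__(self):
        self.aa = 10


obj = A()
had_aa = hasattr(obj, "aa")        # attribute exists after __init__

del obj.aa                         # attributes can be deleted at run time
has_aa_after_del = hasattr(obj, "aa")

name = "b" + "b"                   # the name itself can be computed at run time
setattr(obj, name, 20)
print(had_aa, has_aa_after_del, obj.bb)
```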

You must have very compelling reasons to ask for changes that spare you
some keystrokes by the way. Pythonistas usually don't like sacrificing
readability for fewer characters. Most source code will be written once
but must be read and understood a couple of times, so it's more important
to have clear than short code. With `self` in place you always know which
names are local and which are attributes.

Ciao,
Marc 'BlackJack' Rintsch

Neil Cerutti

Jun 27, 2007, 8:01:34 AM
On 2007-06-27, hide...@gmail.com <hide...@gmail.com> wrote:
> HI
> I'm currently using Python. I find that a instance variable must
> confined with self,
> for example:
> class a:
> def __init__(self):
> self.aa=10
> def bb(self):
> print self.aa # See .if in c++,I could use aa to change that
> variable
>
> That's a big inconvenience in coding ,especially when you have
> lot of variable If you method need 10 variables ,you have to
> type "self" for 10 times and that also makes your variable
> longer.

I recommend the discussion of this issue in the Python FAQ.

http://www.python.org/doc/faq/general/#why-must-self-be-used-explicitly-in-method-definitions-and-calls

> From My point,I think this only help python interpreter to
> deside where to look for. Is there anyone know's how to make
> the interpreter find instance name space first? Or any way to
> make programmer's life easier?

Try thinking of "self." as a notation that provides vital
information to you, the programmer.

--
Neil Cerutti

faulkner

Jun 27, 2007, 8:11:34 AM

Roy Smith

Jun 27, 2007, 8:58:59 AM
faulkner <faulk...@gmail.com> wrote:
> http://www.voidspace.org.uk/python/weblog/arch_d7_2006_12_16.shtml#e584

I looked at the "Selfless Python" idea described there, and I think it's a
REALLY bad idea. It's a clever hack, but not something I would ever want
to see used in production code. Sure, it saves a little typing, but it
invokes a lot of magic and the result is essentially a new language and
everybody who uses this code in the future will have to scratch their heads
and figure out what you did.

Programs get written once. They get read and maintained forever, by
generations of programmers who haven't even been hired by your company yet.
Doing some clever magic to save a few keystrokes for the original
programmer at the cost of sowing confusion for everybody else in the future
is a bad tradeoff.

Roy Smith

Jun 27, 2007, 9:03:33 AM
Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
> Use a shorter name than `self` or an editor with auto completion.

Of the two, I'd strongly vote for the auto completion (assuming you feel
the need to "solve" this problem at all). The name "self" is so ingrained
in most Python programmers' minds that it's almost a keyword. Changing it
to "this" or "s" or "me" will just make your program a little harder for
other people to understand.

Changing it to "this" would be particularly perverse since it's not even
any less typing. In fact, on a standard keyboard, it's harder to type
since it involves moving off the home row more :-)

Sion Arrowsmith

Jun 27, 2007, 9:50:03 AM

And it provides even more vital information to *other* programmers
dealing with your code ("other" including "you in six months time").
I've just confused the socks off a cow-orker by writing in a C++
method kill(SIGTERM); -- confusion which would have been avoided if
I'd used an explicit this->kill(SIGTERM); . But amongst C++'s many
flaws, such disambiguation is frowned on as non-idiomatic. Explicit
self *is a good thing*.

--
\S -- si...@chiark.greenend.org.uk -- http://www.chaos.org.uk/~sion/
"Frankly I have no feelings towards penguins one way or the other"
-- Arthur C. Clarke
her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump

Jorgen Bodde

Jun 27, 2007, 10:01:18 AM
to pytho...@python.org
I had the same feeling when I started. Coming from a C++ background, I
forgot about self a lot, creating local variables where I meant to
assign to an instance attribute, or calling methods that could not be
found because I forgot 'self'.

Now I am 'kinda' used to it, as every language has some drawbacks
(you can't please everyone). But what about something in between, like only
using the dot (.) for a shorter notation?

self.some_var = True

Could become:

.some_var = True

Which basically shows about the same thing, but you leave 'self' out
of the syntax. Of course it should not be allowed to break a line
between the dot and the name, else Python would never know what to
do;

my_class()
.my_var = True

Should not be parsed the same as;

my_class().my_var = True

Just a suggestion. I am pretty happy with self, but I could settle for
a shorter version if possible.

- Jorgen


A.T.Hofkamp

Jun 27, 2007, 10:37:50 AM
On 2007-06-27, hide...@gmail.com <hide...@gmail.com> wrote:
> HI
> I'm currently using Python. I find that a instance variable must
> confined with self,
> for example:
> class a:
> def __init__(self):
> self.aa=10
> def bb(self):
> print self.aa # See .if in c++,I could use aa to change that
> variable

C++ is a much more static language (for example, you cannot create new fields
in your class at run time), so it can decide in advance what you mean.

In other words, it is a cost you pay for the increased flexibility. You may not
be using that flexibility, but it is there, and people use it.

> That's a big inconvenience in coding ,especially when you have lot of
> variable

I switched from C++ to Python several years ago, and was surprised about
having to explicitly write 'self' each time. However, I never considered it "a
big inconvenience".
As the years went by I have come to like the explicit notation in Python.


> Or any way to make programmer's life easier?

Others have already pointed out that leaving out 'self' does more harm than good.
I think they are right. In the past I also thought that Python was badly
designed, and so far, in the end it has always turned out that I was in error.
[off-topic:
I think that again now with the default implementation of the object.__eq__ and
object.__hash__ methods. I believe these methods should not exist until the
programmer explicitly defines them with a suitable notion of equivalence.

Anybody have a good argument against that? :-)
]


Another suggestion may be to look at your code again, and check whether all
self's are really needed. In other words, can you improve your code by reducing
use of instance variables?
In Python, the "a=b" statement is extremely cheap, because you don't copy data.
Exploit that feature.

An alternative may be to copy a self variable into a local variable and use
the local variable in the method. Another option may be to collect results in a
local variable first and then assign it to a self.X variable.
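
Both suggestions can be sketched as follows (Python 3 syntax; the
`Accumulator` class and its attribute names are hypothetical, chosen only to
illustrate the idea). The local name is just another reference to the same
object, so nothing is copied:

```python
class Accumulator:
    def __init__(self, values):
        self.values = values
        self.total = 0

    def sum_squares(self):
        values = self.values          # cheap: binds a local name, no copy
        total = 0                     # collect the result in a local first
        for v in values:
            total += v * v
        self.total = total            # one attribute store at the end
        return total

    def local_is_same_object(self):
        values = self.values
        return values is self.values


acc = Accumulator([1, 2, 3])
print(acc.sum_squares())              # 14
print(acc.local_is_same_object())     # True
```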

If you really have a lot of variables, are you sure that they should all be
separate (flat) variables in one class, i.e. would it be possible to merge some
of them together in another object and have more structure in the variables?
(classes are much cheaper in Python than in C++ w.r.t. amount of code)


Sincerely,
Albert

Bjoern Schliessmann

Jun 27, 2007, 11:29:40 AM
hide...@gmail.com wrote:

> I'm currently using Python.

How long have you been using Python?

> I find that a instance variable
> must confined with self, for example:
> class a:
> def __init__(self):
> self.aa=10
> def bb(self):
> print self.aa #
> See .if in c++,I could use aa to change that variable

Mh, strange, I personally like to use "this.a" in C++, to make clear
I use an instance variable.

> That's a big inconvenience in coding ,especially when you have lot
> of variable

NACK, see above.

> If you method need 10 variables ,you have to type "self" for 10
> times and that also makes your variable longer.

Explicit is better than implicit.



> From My point,I think this only help python interpreter to deside
> where to look for.

IMHO, it's also a great hint for the programmer. With others' C++
code, I'm often confused about what kinds of variables (global, instance,
static, ...) they access, and it's also badly commented. If C++ forced
the programmer to write "this.var", the code would be
understandable with far fewer comments.

Regards,


Björn

--
BOFH excuse #13:

we're waiting for [the phone company] to fix that line

Alex Martelli

Jun 27, 2007, 11:39:57 AM
A.T.Hofkamp <h...@se-162.se.wtb.tue.nl> wrote:

> I think that again now with the default implementation of the
> object.__eq__ and object.__hash__ methods. I believe these methods should
> not exist until the programmer explicitly defines them with a suitable
> notion of equivalence.
>
> Anybody have a good argument against that? :-)

It's very common and practical (though not ideologically pure!) to want
each instance of a class to "stand for itself", be equal only to itself:
this lets me place instances in a set, etc, without fuss.

I don't want, in order to get that often-useful behavior, to have to
code a lot of boilerplate such as
    def __hash__(self): return hash(id(self))
and the like -- so, I like the fact that object does it for me. I'd
have no objection if there were two "variants" of object (object itself
and politically_correct_object), inheriting from each other either way
'round, one of which kept the current practical approach while the other
made __hash__ and comparisons abstract.
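
For readers who want to see the default behavior described here, a minimal
sketch (Python 3 syntax; `Token` is a made-up class with no methods of its
own). Each instance is equal only to itself and hashable out of the box, so it
can go straight into a set:

```python
class Token:
    pass


a = Token()
b = Token()

seen = {a, b, a}          # the repeated object collapses, distinct ones stay
print(len(seen))          # 2
print(a == a, a == b)     # True False
print(a in seen)          # True
```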

In Python 3000, ordering comparisons will not exist by default (sigh, a
modest loss of practicality on the altar of purity -- ah well, saw it
coming, ever since complex numbers lost ordering comparisons), but
equality and hashing should remain just like now (yay!).


Alex

John Roth

Jun 27, 2007, 4:16:05 PM


Guido has already said that this will not change in Python 3.0. See PEP
3099.

John Roth

Bruno Desthuilliers

Jun 28, 2007, 1:00:23 AM
Jorgen Bodde a écrit :

> I had the same feeling when I started, coming from a C++ background, I
> forgot about self a lot, creating local copies of what should be an
> assign to a class instance, or methods that could not be found because
> I forgot 'self' .
>
> Now I am 'kinda' used to it, as every language has some draw backs
> (you can't please all). But, what about something in between like only
> using the dot (.) for a shorter notation?
>
> self.some_var = True
>
> Could become:
>
> .some_var = True
>
> Which basically shows about the same thing, but you leave 'self' out
> of the syntax. Ofcourse it should not be allowed to break a line
> between the dot and the keywords, else Python would never know what to
> do;
>
> my_class()
> .my_var = True
>
> Should not be parsed the same as;
>
> my_class().my_var = True
>
> Just a suggestion. I am pretty happy with self, but I could settle for
> a shorter version if possible.
>
What is nice with the required, explicit reference to the instance -
which BTW and so far is not required to be *named* 'self' - is that it
avoids the need for distinct rules (and different implementations) for
functions and methods. The different 'method' types are just very thin
wrappers around function objects, which in turn allows using 'ordinary'
functions (defined outside a class) as methods - IOW, dynamically
extending classes (and instances) with plain functions. Uniformity can also
have very practical virtues...
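
A short sketch of that uniformity (Python 3 syntax; `Greeter`, `greet` and
`shout` are names invented for the example). A plain function defined outside
any class becomes a method simply by being attached to the class, and
`types.MethodType` binds one to a single instance:

```python
import types


class Greeter:
    def __init__(self, name):
        self.name = name


def greet(this):                      # the first parameter need not be named self
    return "Hello, " + this.name


def shout(self):
    return self.name.upper() + "!"


Greeter.greet = greet                 # extend the class with a plain function
g = Greeter("world")
print(g.greet())                      # Hello, world

g.shout = types.MethodType(shout, g)  # extend just this one instance
print(g.shout())                      # WORLD!
print(hasattr(Greeter("x"), "shout")) # False
```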

Aahz

Jun 27, 2007, 5:13:50 PM
In article <1i0cy8z.z94q5d1dxgexxN%al...@mac.com>,

Alex Martelli <al...@mac.com> wrote:
>
>In Python 3000, ordering comparisons will not exist by default (sigh, a
>modest loss of practicality on the altar of purity -- ah well, saw it
>coming, ever since complex numbers lost ordering comparisons), but
>equality and hashing should remain just like now (yay!).

While emotionally I agree with you, in practice I have come to agree
with the POV that allowing default ordering comparisons between disjoint
types causes subtle bugs that are more difficult to fix than the small
amount of boilerplate needed to force comparisons when desired.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha

John Nagle

Jun 27, 2007, 5:54:36 PM
Bruno Desthuilliers wrote:
> Jorgen Bodde a écrit :

>
>> But, what about something in between like only
>> using the dot (.) for a shorter notation?

How about "Mavis Beacon Teaches Typing"?

John Nagle

Andy Freeman

Jun 27, 2007, 6:10:35 PM
On Jun 27, 2:54 pm, John Nagle <n...@animats.com> wrote:
> >> But, what about something in between like only
> >> using the dot (.) for a shorter notation?
>
> How about "Mavis Beacon Teaches Typing"?

How about no "wouldn't it be better" suggestions until at least three
months after the suggester has written at least 1000 lines of working
code?

Erik Max Francis

Jun 27, 2007, 9:51:19 PM
Aahz wrote:

> In article <1i0cy8z.z94q5d1dxgexxN%al...@mac.com>,
> Alex Martelli <al...@mac.com> wrote:
>> In Python 3000, ordering comparisons will not exist by default (sigh, a
>> modest loss of practicality on the altar of purity -- ah well, saw it
>> coming, ever since complex numbers lost ordering comparisons), but
>> equality and hashing should remain just like now (yay!).
>
> While emotionally I agree with you, in practice I have come to agree
> with the POV that allowing default ordering comparisons between disjoint
> types causes subtle bugs that are more difficult to fix than the small
> amount of boilerplate needed to force comparisons when desired.

I agree. It makes more sense to have to specify an ordering rather than
assume an arbitrary one that may or may not have any relation to what
you're interested in.

I always did think that the inability to compare complex numbers was a
bit of a wart -- not because it's not mathematically correct, since it
is, but rather since everything else was comparable, even between
distinct types -- but I think the more sensible approach is to allow
equality by default (which defaults to identity), and only support
comparisons when the user defines what it is he wants them to mean.
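
As a sketch of where this ended up, the following uses Python 3 syntax and
Python 3 behavior (not the Python 2 that was current in this thread); `Plain`
and `ordering_is_defined` are names made up for the example. Equality still
defaults to identity, while ordering raises TypeError unless it has been
defined:

```python
class Plain:
    pass


def ordering_is_defined(x, y):
    """Return True if `x < y` can be evaluated without a TypeError."""
    try:
        x < y
    except TypeError:
        return False
    return True


a, b = Plain(), Plain()
print(a == a, a == b)                      # True False
print(ordering_is_defined(a, b))           # False
print(ordering_is_defined(1, "one"))       # False
print(ordering_is_defined(1 + 2j, 3 + 4j)) # False
print(ordering_is_defined(1, 2.5))         # True
```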

--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM, Y!M erikmaxfrancis
Water which is too pure has no fish.
-- Ts'ai Ken T'an

Alex Martelli

Jun 28, 2007, 2:01:59 AM
Bjoern Schliessmann <usenet-mail-03...@spamgourmet.com>
wrote:
...

> Mh, strange, I personally like to use "this.a" in C++, to make clear
> I use an instance variable.

That would be nice, unfortunately your C++ compiler will refuse that,
and force you to use this->a instead;-).

Many programming shops use naming conventions instead, such as my_a or
a_ (trailing underscore for member-variables) -- I've even seen the
convention this_a which IMHO is silly (at that point you might as well
use this->a and avoid the 'convention'!-).

Anyway, I essentially agree with you (except for the C++ bit: since this
is a pointer, it needs ->). However, full disclosure, Smalltalk/XP
superstar Kent Beck disagrees -- in his good book "Test-Driven Development:
By Example", in the chapter where he gives the Python example, he DOES
whine about the need to explicitly say self (the one bad bit in the
book:-).

For the curious: the explicit-self idea is essentially taken from
Modula-3, a sadly now forgotten language which still had an impact on
the history of programming.


Alex

Bjoern Schliessmann

Jun 28, 2007, 8:38:23 AM
Alex Martelli wrote:
> Bjoern Schliessmann <usenet-mail-03...@spamgourmet.com>
> wrote:

>> Mh, strange, I personally like to use "this.a" in C++, to make
>> clear I use an instance variable.

> That would be nice, unfortunately your C++ compiler will refuse
> that, and force you to use this->a instead;-).

Sure, thanks. Before I last used C++ I was forced to use Java --
where I would write "this.<member>". ;)



> Many programming shops use naming conventions instead, such as
> my_a or a_ (trailing underscore for member-variables) -- I've even
> seen the convention this_a which IMHO is silly (at that point you
> might as well use this->a and avoid the 'convention'!-).

ACK.

> For the curious: the explicit-self idea is essentially taken from
> Modula-3, a sadly now forgotten language which still had an impact
> on the history of programming.

Mh, I'm going to read some about this one.

Regards,


Björn

--
BOFH excuse #4:

static from nylon underwear

Lou Pecora

Jun 28, 2007, 8:48:10 AM
In article <mailman.118.11829528...@python.org>,
"Jorgen Bodde" <jorgen....@gmail.com> wrote:

> I had the same feeling when I started, coming from a C++ background, I
> forgot about self a lot, creating local copies of what should be an
> assign to a class instance, or methods that could not be found because
> I forgot 'self' .
>
> Now I am 'kinda' used to it, as every language has some draw backs
> (you can't please all). But, what about something in between like only
> using the dot (.) for a shorter notation?
>
> self.some_var = True
>
> Could become:
>
> .some_var = True
>
> Which basically shows about the same thing, but you leave 'self' out
> of the syntax. Ofcourse it should not be allowed to break a line
> between the dot and the keywords, else Python would never know what to
> do;
>
> my_class()
> .my_var = True
>
> Should not be parsed the same as;
>
> my_class().my_var = True
>
> Just a suggestion. I am pretty happy with self, but I could settle for
> a shorter version if possible.
>
> - Jorgen

Hmmm... I like this idea. Would you put a dot in the argument of a
class method?

def afcn(.,x,y):
# stuff here

??

I still like it. self remains a wart on python for me after 5 years of
use despite a deep love of the language and developers' community.

--
-- Lou Pecora

When I was a kid my parents moved a lot, but I always found them.
(R.Dangerfield)

A.T.Hofkamp

Jun 28, 2007, 9:19:53 AM
On 2007-06-27, Alex Martelli <al...@mac.com> wrote:
> A.T.Hofkamp <h...@se-162.se.wtb.tue.nl> wrote:
>
>> I think that again now with the default implementation of the
>> object.__eq__ and object.__hash__ methods. I believe these methods should
>> not exist until the programmer explicitly defines them with a suitable
>> notion of equivalence.
>>
>> Anybody have a good argument against that? :-)
>
> It's very common and practical (though not ideologically pure!) to want
> each instance of a class to "stand for itself", be equal only to itself:
> this lets me place instances in a set, etc, without fuss.

Convenience is the big counter argument, and I have thought about that.
I concluded that the convenience advantage is not big enough, and the problem
seems to be what "itself" exactly means.

In object oriented programming, objects are representations of values, and the
system shouldn't care about how many instances there are of some value, just
like numbers in math. Every instance with a certain value is the same as every
other instance with the same value.

You can also see this in the singleton concept. The fact that it is a pattern
implies that it is special, something not delivered by default in object
oriented programming.

This object-oriented notion of "itself" is not what Python delivers.

Python 2.4.4 (#1, Dec 15 2006, 13:51:44)
[GCC 3.4.4 20050721 (Red Hat 3.4.4-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class Car(object):
...     def __init__(self, number):
...         self.number = number
...     def __repr__(self):
...         return "Car(%r)" % self.number
...

>>> 12345 == 12345
True
>>> Car(123) == Car(123)
False

So in Python, the default equivalence notion for numbers is based on values,
and the default equivalence notion for objects assumes singleton objects which
is weird from an object oriented point of view.

Therefore, I concluded that we are better off without a default __eq__ .

The default existence of __hash__ gives other nasty surprises:

>>> class Car2(object):
...     def __init__(self, number):
...         self.number = number
...     def __repr__(self):
...         return "Car2(%r)" % self.number
...     def __eq__(self, other):
...         return self.number == other.number
...

Above I have fixed Car to use value equivalence (albeit not very robustly).
Now if I throw these objects naively in a set:

>>> a = Car2(123)
>>> b = Car2(123)
>>> a == b
True
>>> set([a,b])
set([Car2(123), Car2(123)])

I get a set with two equal cars, something that, as my math teacher once told
me, never happens with a set.

Of course, I should have defined an appropriate __hash__ method together with
the __eq__ method. Unfortunately, not every Python programmer has always had
enough coffee to think about that when he is programming a class. Even worse, I
may get a class such as the above from somebody else and decide that I need a
set of such objects, something the original designer never intended.
The problem is then that something like "set([Car2(123), Car2(124)])" does the
right thing for the wrong reason without telling me.

Without a default __hash__ I'd get at least an error that I cannot put Car2
objects in a set. In that setup, I can still construct a broken set, but I'd
have to write a broken __hash__ function explicitly rather than implicitly
inheriting it from object.
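
For comparison, a sketch in Python 3 syntax and semantics (not the Python
2.4.4 session shown in this post): there, a class that defines __eq__ without
defining __hash__ has its __hash__ set to None, so putting instances in a set
raises an error much like the one asked for here. `Car3` and `can_hash` are
names invented for the example.

```python
class Car2:
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Car2):
            return NotImplemented
        return self.number == other.number


class Car3(Car2):
    # Defining __hash__ consistently with __eq__ makes instances usable in sets.
    def __hash__(self):
        return hash(self.number)


def can_hash(obj):
    """Return True if hash(obj) works, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(Car2.__hash__ is None)            # True
print(can_hash(Car2(123)))              # False
print(len({Car3(123), Car3(123)}))      # 1
print(len({Car3(123), Car3(124)}))      # 2
```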


> I don't want, in order to get that often-useful behavior, to have to
> code a lot of boilerplate such as
> def __hash__(self): return hash(id(self))
> and the like -- so, I like the fact that object does it for me. I'd

I understand that you'd like to have less typing to do. I'd like that too if
only it would work without major accidents by simple omission such as
demonstrated in the set example.


Another question can be whether your coding style would be correct here.

Since you apparently want to have singleton objects (since that is what you get
and you are happy with them), shouldn't you be using "is" rather than "=="?
Then you get the equivalence notion you want, you don't need __eq__, and you
write explicitly that you have singleton objects.

In the same way, sets have very little value for singleton objects, you may as
well use lists instead of sets since duplicate **values** are not filtered.
For lists, you don't need __hash__ either.

The only exception would be to filter multiple inclusions of the same object
(that is what sets are doing by default). I don't know whether that would be
really important for singleton objects **in general**.
(ie wouldn't it be better to explicitly write a __hash__ based on identity for
those cases?)

> have no objection if there were two "variants" of object (object itself
> and politically_correct_object), inheriting from each other either way
> 'round, one of which kept the current practical approach while the other
> made __hash__ and comparisons abstract.

Or you define your own base object class "class Myobject(object)" and add a
default __eq__ and __hash__ method. This at least gives an explicit definition
of the equivalence notion for your application.
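
Such a base class might look like the sketch below (Python 3 syntax;
`IdentityObject` and `Session` are hypothetical names, and the hash body is
the one-liner quoted earlier in this post). It only spells out explicitly what
object already provides by default:

```python
class IdentityObject:
    """Base class that states the identity-based equivalence explicitly."""

    def __eq__(self, other):
        return self is other

    def __hash__(self):
        return hash(id(self))


class Session(IdentityObject):
    def __init__(self, user):
        self.user = user


s1 = Session("ann")
s2 = Session("ann")
print(s1 == s1, s1 == s2)      # True False
print(len({s1, s2, s1}))       # 2
```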


> In Python 3000, ordering comparisons will not exist by default (sigh, a
> modest loss of practicality on the altar of purity -- ah well, saw it
> coming, ever since complex numbers lost ordering comparisons), but
> equality and hashing should remain just like now (yay!).

I didn't try that, but it seems like a good decision. Ordering based on
identity may change with each invocation of the program!


Albert

Alan Isaac

Jun 28, 2007, 9:52:13 AM
A.T.Hofkamp wrote:

> >>> a = Car2(123)
> >>> b = Car2(123)
> >>> a == b
> True
> >>> set([a,b])
> set([Car2(123), Car2(123)])
>
> I get a set with two equal cars, something that never happens with a set
> my math teacher once told me.


Then your math teacher misspoke.
You have two different cars in the set,
just as expected. Use `is`.
http://docs.python.org/ref/comparisons.html

This is good behavior.

Cheers,
Alan Isaac

A.T.Hofkamp

Jun 28, 2007, 10:38:56 AM

Hmm, maybe numbers in sets are broken then?

>>> a = 12345
>>> b = 12345
>>> a == b
True
>>> a is b
False
>>> set([a,b])
set([12345])


Numbers and my Car2 objects behave the same w.r.t. '==' and 'is', yet I get a
set with 1 number, and a set with 2 cars.
Something is wrong here imho.

The point I intended to make was that having a default __hash__ method on
objects gives weird results that not everybody may be aware of.
In addition, to get useful behavior of objects in sets one should override
__hash__ anyway, so what is the point of having a default object.__hash__ ?

The "one should override __hash__ anyway" argument is being discussed in my
previous post.


Albert

Roy Smith

Jun 28, 2007, 10:56:59 AM
In article <slrnf87db...@se-162.se.wtb.tue.nl>,
"A.T.Hofkamp" <h...@se-162.se.wtb.tue.nl> wrote:

> In object oriented programming, objects are representations of values, and the
> system shouldn't care about how many instances there are of some value, just
> like numbers in math. Every instance with a certain value is the same as every
> other instance with the same value.

Whether two things are equal depends on the context. Is one $10 note equal
to another? It depends.

If the context is a bank teller making change, then yes, they are equal.
What's more, there are lots of sets of smaller notes which would be equally
fungible.

If the context is a district attorney showing a specific $10 note to a jury
as evidence in a drug buy-and-bust case, they're not. It's got to be
exactly that note, as proven by a recorded serial number.

In object oriented programming, objects are representations of the real
world. In one case, the $10 note represents some monetary value. In
another, it represents a piece of physical evidence in a criminal trial.
Without knowing the context of how the objects are going to be used, it's
really not possible to know how __eq__() should be defined.

Let me give you a more realistic example. I've been doing a lot of network
programming lately. We've got a class to represent an IP address, and a
class to represent an address-port pair (a "sockaddr"). Should you be able
to compare an address to a sockaddr? Does 192.168.10.1 == 192.168.10.1:0?
You tell me. This is really just the "does 1 == (1 + 0j)" question in
disguise. There are reasonable arguments to be made on both sides, but there
is no one true answer. It depends on what you're doing.

John Nagle

Jun 28, 2007, 4:12:03 PM
Alex Martelli wrote:
> Bjoern Schliessmann <usenet-mail-03...@spamgourmet.com>
> wrote:
> ...
>
>>Mh, strange, I personally like to use "this.a" in C++, to make clear
>>I use an instance variable.
>
>
> That would be nice, unfortunately your C++ compiler will refuse that,
> and force you to use this->a instead;-).

Yes, as Stroustrup admits, "this" should have been a reference.
Early versions of C++ didn't have references.

One side effect of that mistake was the "delete(this)" idiom,
which does not play well with inheritance. But that's a digression here.

John Nagle

Steve Holden

Jun 28, 2007, 9:10:03 PM
to pytho...@python.org

Hmm, I suspect you'll like this even less:

>>> set((1.0, 1, 1+0j))
set([1.0])

Just the same there are sound reasons for it, so I'd prefer to see you
using "counterintuitive" or "difficult to fathom" rather than "broken"
and "wrong".

Such language implies you have thought about this more deeply than the
developers (which I frankly doubt) and that they made an inappropriate
decision (which is less unlikely, but which in the case you mention I
also rather doubt).

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
--------------- Asciimercial ------------------
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
----------- Thank You for Reading -------------

Gabriel Genellina

Jun 28, 2007, 11:47:16 PM
to pytho...@python.org
En Thu, 28 Jun 2007 11:38:56 -0300, A.T.Hofkamp <h...@se-162.se.wtb.tue.nl>
escribió:

> The point I intended to make was that having a default __hash__ method on
> objects give weird results that not everybody may be aware of.
> In addition, to get useful behavior of objects in sets one should
> override
> __hash__ anyway, so what is the point of having a default
> object.__hash__ ?

__hash__ and equality tests are used by the dictionary implementation, and
the default implementation is OK for immutable objects. I like the fact
that I can use almost anything as dictionary keys without much coding.
This must always be true: (a==b) => (hash(a)==hash(b)), and the
documentation for __hash__ and __cmp__ warns about the requirements (but
__eq__ and the other rich-comparison methods are lacking the warning).
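
The invariant can be checked on the numeric example posted earlier in this
thread; a minimal sketch in Python 3 syntax (where the set prints as {1.0}
rather than set([1.0])):

```python
values = (1.0, 1, 1 + 0j)

# Equal values must hash equally, whatever their type.
all_equal = all(v == 1 for v in values)
same_hash = all(hash(v) == hash(1) for v in values)
print(all_equal, same_hash)    # True True

# That is why the three collapse to a single set element.
print(set(values))             # {1.0}
```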

--
Gabriel Genellina

mma...@gmx.net

Jun 29, 2007, 12:21:11 AM
On Fri, 29 Jun 2007 00:47:16 -0300
"Gabriel Genellina" <gags...@yahoo.com.ar> wrote:
> __hash__ and equality tests are used by the dictionary
> implementation, and the default implementation is OK for immutable
> objects.

That is probably why inf == inf yields True.
In this unique case, I do not like the default implementation.

Martin

A.T.Hofkamp

Jun 29, 2007, 7:17:07 AM
On 2007-06-29, Steve Holden <st...@holdenweb.com> wrote:
> Just the same there are sound reasons for it, so I'd prefer to see you
> using "counterintuitive" or "difficult to fathom" rather than "broken"
> and "wrong".

You are quite correct; in the heat of typing an answer, my wording was too
strong. I am sorry.


Albert

A.T.Hofkamp

Jun 29, 2007, 8:04:16 AM
On 2007-06-28, Roy Smith <r...@panix.com> wrote:
> In article <slrnf87db...@se-162.se.wtb.tue.nl>,
> "A.T.Hofkamp" <h...@se-162.se.wtb.tue.nl> wrote:
>
>> In object oriented programming, objects are representations of values, and the
>> system shouldn't care about how many instances there are of some value, just
>> like numbers in math. Every instance with a certain value is the same as every
>> other instance with the same value.
>
> Whether two things are equal depends on the context. Is one $10 note equal
> to another? It depends.
>
> If the context is a bank teller making change, then yes, they are equal.
> What's more, there are lots of sets of smaller notes which would be equally
> fungible.
>
> If the context is a district attorney showing a specific $10 note to a jury
> as evidence in a drug buy-and-bust case, they're not. It's got to be
> exactly that note, as proven by a recorded serial number.
>
> In object oriented programming, objects are representations of the real
> world. In one case, the $10 note represents some monetary value. In
> another, it represents a piece of physical evidence in a criminal trial.
> Without knowing the context of how the objects are going to be used, it's
> really not possible to know how __eq__() should be defined.

I can see your point, but I am not sure I agree. The problem is that OO uses
models tailored to an application, i.e., the model changes with each application.

In a bank teller application, one would probably not model the serial number,
just the notion of $10 notes would be enough, as in "Note(value)". The contents
of a cash register would then for example be a dictionary of Note() objects to
a count. You can merge two such dictionaries, where the 'value' data of the
Note objects would be the equivalence notion.
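The bank teller model could be sketched roughly as follows, in Python 3 syntax. The Note class follows the "Note(value)" idea above; using collections.Counter for the register contents and the sample amounts are assumptions made for the example.

```python
from collections import Counter


class Note:
    """Hypothetical bank-teller model: only the face value matters."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Note):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)

    def __repr__(self):
        return "Note(%r)" % (self.value,)


# Each register maps a Note to the number of such notes it holds.
register_a = Counter({Note(10): 3, Note(20): 1})
register_b = Counter({Note(10): 2, Note(5): 4})

# Adding Counters merges the counts of equal keys.
merged = register_a + register_b
assert merged[Note(10)] == 5
assert merged[Note(20)] == 1
assert merged[Note(5)] == 4
```

The merge works only because two Note(10) objects compare equal and hash equal; with the default methods every Note would be a separate key.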

In an evidence application one **would** record the serial number, since it is
a relevant distinguishing feature between notes, i.e., one would model
Note(value, serialnumber). In this application the combination of value and
serial number together defines equivalence.

However, also in this situation we use values of the model for equivalence. If
we have a database that relates evidence to storage location, and we would
like to know where a particular note was stored, we would compare Note objects
with each other based on the combination of value and serial number, not on
their id()'s.
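A sketch of the evidence model, again in Python 3 syntax. The class name EvidenceNote, the serial numbers, and the storage locations are invented for the illustration.

```python
class EvidenceNote:
    """Hypothetical evidence model: value and serial number identify a note."""

    def __init__(self, value, serial_number):
        self.value = value
        self.serial_number = serial_number

    def _key(self):
        return (self.value, self.serial_number)

    def __eq__(self, other):
        if not isinstance(other, EvidenceNote):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


# Made-up serial numbers and storage locations.
storage = {
    EvidenceNote(10, "AB123"): "locker 7",
    EvidenceNote(10, "CD456"): "locker 9",
}

# A freshly built object with the same value and serial number finds the
# entry, although its id() differs from that of the object used as the key.
wanted = EvidenceNote(10, "AB123")
assert storage[wanted] == "locker 7"
assert EvidenceNote(10, "AB123") != EvidenceNote(10, "CD456")
```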

> You tell me. This is really just the "does 1 == (1 + 0j)" question in
> disguise. There's reasonable arguments to be made on both sides, but there
> is no one true answer. It depends on what you're doing.

While we don't agree on how OO programming handles equality (and it may well be
that there are multiple interpretations possible), wouldn't your argument also
lead to the conclusion that it is better not to have a pre-defined __eq__
method?


Albert

Steve Holden

Jun 29, 2007, 8:30:46 AM
to pytho...@python.org
No problem, I do the same thing myself ...

A.T.Hofkamp

Jun 29, 2007, 9:15:24 AM
On 2007-06-29, Gabriel Genellina <gags...@yahoo.com.ar> wrote:
> En Thu, 28 Jun 2007 11:38:56 -0300, A.T.Hofkamp <h...@se-162.se.wtb.tue.nl>
> escribió:
>
>> The point I intended to make was that having a default __hash__ method on
>> objects give weird results that not everybody may be aware of.
>> In addition, to get useful behavior of objects in sets one should override
>> __hash__ anyway, so what is the point of having a default object.__hash__ ?
>
> __hash__ and equality tests are used by the dictionary implementation, and
> the default implementation is OK for immutable objects. I like the fact

I don't understand exactly how mutability relates to this.

The default __eq__ and __hash__ implementation for classes is ok if you never
have equivalent objects. In that case, == and 'is' are exactly the same
function in the sense that for each pair of arguments, they deliver the same
value.

This remains the case even if I mutate existing objects without creating
equivalent objects.

As soon as I create two equivalent instances (either by creating a duplicate at
a new address, or by mutating an existing one) the default __eq__ should be
redefined if you want these equivalent objects to announce themselves as
equivalent with the == operator.
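The behavior of the defaults can be checked with a short sketch (Python 3 syntax; the Car class here is a hypothetical stand-in that defines neither __eq__ nor __hash__).

```python
class Car:
    """Hypothetical class that relies on the inherited __eq__ and __hash__."""

    def __init__(self, number):
        self.number = number


a = Car(1)
b = Car(1)   # same attribute values, different object
c = a        # another name for the same object

# With the defaults, == gives the same answer as 'is' for every pair.
for x, y in [(a, b), (a, c), (b, c)]:
    assert (x == y) == (x is y)

# Mutating an object does not change that.
a.number = 2
assert a == c and a != b

# Two "equivalent" cars both end up in the set.
assert len({Car(1), Car(1)}) == 2
```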

> that I can use almost anything as dictionary keys without much coding.

Most data-types of Python have their own implementation of __eq__ and __hash__
to make this work. This is good, it makes the language easy to use. However for
home-brewed objects (derived from object) the default implementation of these
functions may easily cause unexpected behavior and we may be better off without
a default implementation for these functions. That would prevent use of such
objects in combination with == or in sets/dictionaries without an explicit
definition of the __eq__ and __hash__ functions, but that is not very bad,
since in many cases one would have to define the proper equivalence notion
anyway.

> This must always be true: (a==b) => (hash(a)==hash(b)), and the
> documentation for __hash__ and __cmp__ warns about the requisites (but
> __eq__ and the other rich-comparison methods are lacking the warning).

I don't know exactly what the current documentation says. One of the problems
is that not everybody is reading those docs. Instead they run a simple test
like "print set([Car(1),Car(2)])". That gives the correct result even if the
"(a==b) => (hash(a)==hash(b))" relation doesn't hold due to re-definition of
__eq__ but not __hash__ (the original designer never expected to use the class
in a set/dictionary, for example), and the conclusion is "it works". Then they
use the incorrect implementation for months until they discover that it doesn't
quite work as expected, followed by a long debugging session to find and
correct the problem.
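The failure mode described above can be reproduced with a sketch like the following. It is written for Python 3, which (unlike the Python 2 of this thread) makes a class unhashable when it defines __eq__ without __hash__, so the older behavior is imitated by explicitly reusing object.__hash__. The Car class is hypothetical.

```python
class Car:
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Car):
            return NotImplemented
        return self.number == other.number

    # Assumption: imitate the Python 2 default, where redefining __eq__
    # left the identity-based hash in place.
    __hash__ = object.__hash__


# The "simple test" looks fine: two different cars, two set members.
assert len(set([Car(1), Car(2)])) == 2

# But equal cars are not merged, because their identity-based hashes differ.
first, second = Car(1), Car(1)
assert first == second
assert hash(first) != hash(second)
assert len(set([first, second])) == 2
```

The first assertion is the one a quick experiment would run, and it passes; only the last one exposes the problem.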

Without default __eq__ and __hash__ implementations for objects, the program
would drop dead on the first experiment. While it may be inconvenient at that
moment (to get the first experiment working, one needs to put in more effort), I
think it would be preferable to having an incorrect implementation for months
without knowing it. In addition, a developer has to think explicitly about his
notion of equivalence.
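For what it is worth, Python 3 later moved partway in this direction: a class that defines __eq__ without defining __hash__ gets its __hash__ set to None, so the first experiment with a set fails immediately. Classes that define neither method still get the identity-based defaults. A minimal sketch with a hypothetical Car class:

```python
class Car:
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Car):
            return NotImplemented
        return self.number == other.number
    # No __hash__ here: Python 3 sets Car.__hash__ to None in this case.


assert Car.__hash__ is None

try:
    set([Car(1), Car(2)])
except TypeError:
    failed = True
else:
    failed = False

# The very first set experiment raises instead of silently "working".
assert failed
```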

Last but not least, in the current implementation, you cannot see whether there
is an __eq__ and/or __hash__ equivalence notion. Lack of an explicit definition
does not necessarily imply there is no such notion. Without a default object
implementation, this would also be uniquely defined.
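Introspection can at least tell whether a class supplies its own definition, although it cannot say whether a missing definition was a deliberate choice. The helper defines_own below is hypothetical, written for Python 3, and is not part of any library.

```python
def defines_own(cls, name):
    """Return True if cls or a base other than object defines `name`.

    Hypothetical helper, not part of any library.
    """
    return any(name in vars(klass)
               for klass in cls.__mro__ if klass is not object)


class Plain:
    pass


class Valued:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Valued):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)


assert not defines_own(Plain, "__eq__")
assert not defines_own(Plain, "__hash__")
assert defines_own(Valued, "__eq__")
assert defines_own(Valued, "__hash__")
```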


Albert

Alan Isaac

Jul 1, 2007, 10:23:10 PM
A.T.Hofkamp wrote:
> Hmm, maybe numbers in sets are broken then?
> >>> a = 12345
> >>> b = 12345
> >>> a == b
> True
> >>> a is b
> False
> >>> set([a,b])
> set([12345])
>
> Numbers and my Car2 objects behave the same w.r.t. '==' and 'is', yet I get a
> set with 1 number, and a set with 2 cars.
> Something is wrong here imho.
>
> The point I intended to make was that having a default __hash__ method on
> objects give weird results that not everybody may be aware of.
> In addition, to get useful behavior of objects in sets one should override
> __hash__ anyway, so what is the point of having a default object.__hash__ ?


The point is: let us have good default behavior.
Generally, two equal numbers are two conceptual
references to the same "thing". (Say, the Platonic
form of the number.) So it is good that the hash value
is determined by the number. Similarly for strings.
Two equal numbers or strings are **also** identical,
in the sense of having the same conceptual reference.
In contrast, two "equal" cars are generally not identical
in this sense. Of course you can make them so if you wish,
but it is odd. So *nothing* is wrong here, imo.
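A sketch of that contrast in Python 3 syntax. The values are built at run time so that compile-time sharing of constants plays no role, and the Car class is a hypothetical example that keeps the default behavior.

```python
# Build the values at run time so that no compile-time sharing of
# constants is involved.
a = int("12345")
b = int("12345")
s = "".join(["sp", "am"])
t = "".join(["s", "pam"])

# Equal numbers and equal strings have equal hashes ...
assert a == b and hash(a) == hash(b)
assert s == t and hash(s) == hash(t)

# ... so a set keeps only one of each.
assert len({a, b}) == 1
assert len({s, t}) == 1


class Car:
    """Hypothetical class using the default, identity-based behavior."""

    def __init__(self, number):
        self.number = number


# Two cars built from the same data stay distinct by default.
assert len({Car(1), Car(1)}) == 2
```

Whether `a is b` holds for two equal numbers is an implementation detail of the interpreter, so the sketch deliberately does not assert it.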

Btw:
>>> a = 12
>>> b = 12
>>> a == b
True
>>> a is b
True

Cheers,
Alan Isaac
