hash(el) does not work though el.__hash__ exists...

Florent Hivert

Mar 26, 2010, 4:29:58 AM
to cython...@googlegroups.com
Hi there,

I'm stuck with a problem of calling hash(el) on some object. Unfortunately,
right now I haven't been able to reduce the problem to a small code sample. In
every reduction I've tried the problem doesn't appear. Here is the description:

I have an object "el" which is an instance of an extension class. A method
__hash__ is implemented in a parent class. el.__hash__() works correctly, but
hash(el) answers
TypeError: unhashable type: 'sage.structure.list_clone.IncreasingList'
Any idea what could be the cause?

Cheers,

Florent


sage: from sage.structure.list_clone import IncreasingLists
sage: el = IncreasingLists()([1,2,3])
sage: el.__class__
<type 'sage.structure.list_clone.IncreasingList'>
sage: for cl in el.__class__.mro(): print cl
....:
<type 'sage.structure.list_clone.IncreasingList'>
<type 'sage.structure.list_clone.ListClone'>
<type 'sage.structure.list_clone.ElementClone'>
<type 'sage.structure.element.Element'>
<type 'sage.structure.sage_object.SageObject'>
<type 'object'>
sage: el.__class__.__hash__
<slot wrapper '__hash__' of 'sage.structure.list_clone.ElementClone' objects>
sage: el.__hash__()
-310718273
sage: hash(el)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
[...]
TypeError: unhashable type: 'sage.structure.list_clone.IncreasingList'

Florent Hivert

Mar 26, 2010, 1:12:22 PM
to cython...@googlegroups.com
Hi there,

> I'm stuck with a problem of calling hash(el) on some object. Unfortunately,
> right now I haven't been able to reduce the problem to a small code. In every
> reduction I've tried the problem doesn't appear. Here is the description:

I managed to reduce the problem! Consider the following Cython code:


cdef class HashDebug(object):
    def __hash__(self):
        return 42

cdef class HashNo(HashDebug):
    def __richcmp__(left, right, int op):
        return 1

cdef class HashYes(HashDebug):
    def __hash__(self):
        return HashDebug.__hash__(self)
    def __richcmp__(left, right, int op):
        return 1

Then

sage: from sage.structure.hash_debug import *
sage: elYes = HashYes()
sage: elYes.__hash__()
42
sage: hash(elYes)
42

sage: elNo = HashNo()
sage: elNo.__hash__()
42

But
sage: hash(elNo)

TypeError: unhashable type: 'sage.structure.hash_debug.HashNo'

Is there any reason for that? Am I doing something wrong? Thanks for the
help.

Cheers,

Florent

Lisandro Dalcin

Mar 29, 2010, 11:22:02 AM
to cython...@googlegroups.com

The answer to your question is in the CPython sources, in
Objects/typeobject.c, in the function inherit_slots():

if (type->tp_flags & base->tp_flags & Py_TPFLAGS_HAVE_RICHCOMPARE) {
    if (type->tp_compare == NULL &&
        type->tp_richcompare == NULL &&
        type->tp_hash == NULL)
    {
        type->tp_compare = base->tp_compare;
        type->tp_richcompare = base->tp_richcompare;
        type->tp_hash = base->tp_hash;

As you can see, the handling of __hash__ and __richcmp__ inheritance
is a bit special. IIUC, you override __richcmp__ and then __hash__ is
not inherited.
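
The rule in that excerpt can be modelled in a few lines of Python. This is only a toy sketch, not CPython's actual code: a "type" is represented as a plain dict of slot names, and the names GROUP and inherit_group are made up for illustration.

```python
# Toy model of the slot-inheritance rule quoted above (not CPython's code).
# A "type" is modelled as a dict mapping slot names to implementations or None.
GROUP = ("tp_compare", "tp_richcompare", "tp_hash")


def inherit_group(type_slots, base_slots):
    """Return type_slots after all-or-nothing inheritance of the GROUP slots."""
    result = dict(type_slots)
    # The group is copied from the base only if the subtype fills none of it.
    if all(result.get(name) is None for name in GROUP):
        for name in GROUP:
            result[name] = base_slots.get(name)
    return result


base = {"tp_compare": None, "tp_richcompare": None, "tp_hash": "HashDebug.__hash__"}

# Like HashNo: overrides comparison only, so tp_hash is not inherited.
hash_no = inherit_group({"tp_richcompare": "HashNo.__richcmp__"}, base)

# A subtype that defines nothing in the group inherits all three slots.
plain = inherit_group({}, base)
```

In this model hash_no ends up without a tp_hash entry, which corresponds to the "unhashable type" error reported for HashNo, while plain picks up the base hash.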

--
Lisandro Dalcin
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

Craig Citro

Mar 29, 2010, 2:37:11 PM
to cython...@googlegroups.com
> As you can see, the handling of __hash__ and __richcmp__ inheritance
> is a bit special. IIUC, you override __richcmp__ and then __hash__ is
> not inherited.
>

Lisandro's right on the money -- inheritance for __cmp__, __richcmp__,
and __hash__ is an all-or-nothing deal. This is mentioned in two
places in the Python docs:

http://docs.python.org/c-api/typeobj.html#tp_compare

http://docs.python.org/reference/datamodel.html#object.__hash__

I know there's some esoteric note about it in the Sage source code
somewhere, but I couldn't find it quickly.

-cc

Dag Sverre Seljebotn

Mar 29, 2010, 3:41:32 PM
to cython...@googlegroups.com
However, I think Cython could work around this if we wanted? I want to
(but not so much I'm volunteering).

Dag Sverre

Lisandro Dalcin

Mar 29, 2010, 3:54:55 PM
to cython...@googlegroups.com
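On 29 March 2010 16:41, Dag Sverre Seljebotn
<da...@student.matnat.uio.no> wrote:
> However, I think Cython could work around this if we wanted?

Why? What's wrong with Python's semantics?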

Stefan Behnel

Mar 29, 2010, 4:05:26 PM
to cython...@googlegroups.com
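Dag Sverre Seljebotn, 29.03.2010 21:41:
> However, I think Cython could work around this if we wanted? I want to
> (but not so much I'm volunteering).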

I'm not completely sure which behaviour is better. I see an argument for
automatically inheriting all base type methods in Cython's type hierarchy,
so if one type implements __richcmp__ and another type in the hierarchy
implements __hash__, I think the resulting type should have both methods
and be hashable.

The only case I see where this might get in the way is when you want to
explicitly make a type non-hashable, maybe only comparable. The current
behaviour allows this, but I wouldn't know how to do the same in Cython if
the methods are automatically inherited. I consider this a rare use case.

So I see arguments for both. ISTM that the context of the Cython language
speaks for automatic inheritance, but the first semantics is certainly much
easier to work around manually than the second.

Stefan

Lisandro Dalcin

Mar 29, 2010, 6:12:28 PM
to cython...@googlegroups.com
On 29 March 2010 17:05, Stefan Behnel <stef...@behnel.de> wrote:
>
> I'm not completely sure which behaviour is better. I see an argument for
> automatically inheriting all base type methods in Cython's type hierarchy,
> so if one type implements __richcmp__ and another type in the hierarchy
> implements __hash__, I think the resulting type should have both methods and
> be hashable.
>

And you are just opening the door for broken type objects... I know,
we are all adults... But I still do not see the point. If you are able
to compute a meaningful hash, you should be able to compute a
meaningful comparison (at least __eq__ or __ne__). Computing hash()
with some bytes of your object, and == with others, is nonsense,
right? You could end up having objects that compare equal but have
different hash values... That's the reason hash(7.0) is 7, right?
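
The hash(7.0) remark can be checked directly. A small sketch, assuming CPython; the dict name lookup is only illustrative:

```python
# Equal numbers hash equally, so they address the same dict entry.
assert 7 == 7.0
assert hash(7) == hash(7.0)

lookup = {7: "seven"}
value = lookup[7.0]  # finds the entry stored under the int key 7
```

If the two hashes differed, the lookup with 7.0 would probe a different place in the table and could miss the entry even though the keys compare equal.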

Florent Hivert

Mar 29, 2010, 8:24:14 PM
to cython...@googlegroups.com
Hi,

> The answer for your question is in CPython sources, at
> Objects/typeobject.c, in function inherit_slots()
>
> if (type->tp_flags & base->tp_flags & Py_TPFLAGS_HAVE_RICHCOMPARE) {
>     if (type->tp_compare == NULL &&
>         type->tp_richcompare == NULL &&
>         type->tp_hash == NULL)
>     {
>         type->tp_compare = base->tp_compare;
>         type->tp_richcompare = base->tp_richcompare;
>         type->tp_hash = base->tp_hash;
>
> As you can see, the handling of __hash__ and __richcmp__ inheritance
> is a bit special. IIUC, you override __richcmp__ and then __hash__ is
> not inherited.

Thanks a lot for this pointer! It looks very strange to me! Is there a
rationale for this? Is it documented somewhere?

Cheers,

Florent

Lisandro Dalcin

Mar 29, 2010, 9:05:05 PM
to cython-users

Craig already pointed out the relevant links... Quoting
http://docs.python.org/reference/datamodel.html#object.__hash__

"""
The only required property is that objects which compare equal have
the same hash value.
"""

So you should see this behavior as a safety measure; inheritance could
easily break that property (which is required for hash tables, in other
words dict, set, and any other hash-table-based types).
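
To make the failure mode concrete, here is a sketch of what happens when that property is violated. The class Mismatched and its fields key and tag are hypothetical, chosen so that __eq__ and __hash__ look at different data:

```python
class Mismatched:
    """Hypothetical class whose __eq__ and __hash__ use different fields."""

    def __init__(self, key, tag):
        self.key = key
        self.tag = tag

    def __eq__(self, other):
        return isinstance(other, Mismatched) and self.key == other.key

    def __hash__(self):
        # Deliberately inconsistent with __eq__: hashes a different field.
        return hash(self.tag)


a = Mismatched(key=1, tag=10)
b = Mismatched(key=1, tag=20)

# a == b, yet the set keeps both, because their hashes differ.
members = {a, b}
```

Here a and b compare equal but land in different places in the hash table, so the set holds two "equal" elements and membership tests for b against a set containing only a fail.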

Dag Sverre Seljebotn

Mar 30, 2010, 2:25:17 AM
to cython...@googlegroups.com
Lisandro wrote:
> On 29 March 2010 16:41, Dag Sverre Seljebotn
> <da...@student.matnat.uio.no> wrote:
>> Craig Citro wrote:
>>>>
>>>> As you can see, the handling of __hash__ and __richcmp__ inheritance
>>>> is a bit special. IIUC, you override __richcmp__ and then __hash__ is
>>>> not inherited.
>>>>
>>>>
>>>
>>> Lisandro's right on the money -- inheritance for __cmp__, __richcmp__,
>>> and __hash__ is an all-or-nothing deal. This is mentioned in two
>>> places in the Python docs:
>>>
>>> http://docs.python.org/c-api/typeobj.html#tp_compare
>>>
>>> http://docs.python.org/reference/datamodel.html#object.__hash__
>>>
>>> I know there's some esoteric note about it in the Sage source code
>>> somewhere, but I couldn't find it quickly.
>>>
>>
>> However, I think Cython could work around this if we wanted?
>
> Why? What's wrong with Python's semantics?

My vote is almost always for emulating whatever pure Python does as best
as we can. cdef classes are different from Python classes, but the former
should emulate the latter as far as it is reasonable.

I thought that the Python language was different from Cython here -- that
the second link above is only guidelines and the __hash__ attribute is not
really a special case in the Python language? (except for the = None
business).

The problem I have with it is that it is a surprising special case -- one
more thing one "has to learn". How much time do you think Florent has
wasted on this issue by now?

Dag Sverre

Craig Citro

Mar 30, 2010, 11:57:03 AM
to cython...@googlegroups.com
> The problem I have with it is that it is a surprising special case -- one
> more thing one "has to learn". How much time do you think Florent has
> wasted on this issue by now
>

I strongly agree with this last part -- I only know about it because
I've run into it before, and I know Robert B has hit it before -- and
the comments in the Sage source are from a *third* person who ran into
it and got confused. I'm happy to match the Python semantics, both for
compatibility reasons, and because I think it would cause a different
kind of confusing bug. Namely, if you "accidentally" inherited a
__hash__ that didn't match the __cmp__ you wrote, I think you'd end up
with some pretty confusing and hard-to-track-down behavior. (Again,
it's the user shooting themselves in the foot -- it's one thing to
give them a gun, but another to blindfold them, too.)

So +1 on keeping Python semantics, and +10 on having better
documentation for this ... where should it go? FAQ? (I'm happy to add
it ...)

-cc

William Stein

Mar 30, 2010, 1:11:34 PM
to cython...@googlegroups.com

Is there any possibility of having Cython actually emit a big honking
warning, or even an error (?), in this situation? I'm going to forget
this issue too at some point, and I'm never going to read the docs or
FAQ :-).

-- William

Robert Bradshaw

Mar 30, 2010, 1:23:25 PM
to cython...@googlegroups.com
On Mar 29, 2010, at 11:25 PM, Dag Sverre Seljebotn wrote:

> Lisandro wrote:
>> On 29 March 2010 16:41, Dag Sverre Seljebotn
>> <da...@student.matnat.uio.no> wrote:
>>> Craig Citro wrote:
>>>>>
>>>>> As you can see, the handling of __hash__ and __richcmp__
>>>>> inheritance
>>>>> is a bit special. IIUC, you override __richcmp__ and then
>>>>> __hash__ is
>>>>> not inherited.
>>>>>
>>>>>
>>>>
>>>> Lisandro's right on the money -- inheritance for __cmp__,
>>>> __richcmp__,
>>>> and __hash__ is an all-or-nothing deal. This is mentioned in two
>>>> places in the Python docs:
>>>>
>>>> http://docs.python.org/c-api/typeobj.html#tp_compare
>>>>
>>>> http://docs.python.org/reference/datamodel.html#object.__hash__
>>>>
>>>> I know there's some esoteric note about it in the Sage source code
>>>> somewhere, but I couldn't find it quickly.
>>>>
>>>
>>> However, I think Cython could work around this if we wanted?
>>
>> Why? What's wrong with Python's semantics?

What do you mean by Python's semantics?

In [1]: class A(object):
   ...:     def __hash__(self):
   ...:         return id(self)
   ...:
   ...:

In [2]: class B(A):
   ...:     def __cmp__(self, other):
   ...:         return id(self) == id(other)
   ...:
   ...:

In [3]: hash(A())
Out[3]: 7694640

In [4]: hash(B())
Out[4]: 8017840

This is a question of whether to stick with the C API extension class
semantics, or Python instance class semantics.

> My vote is almost always for emulating whatever pure Python does as
> best
> as we can. cdef classes are different from Python classes, but the
> former
> should emulate the latter as far as it is reasonable.

I agree here. In general, our philosophy is that the user should not
have to know about the Python/C API level at all.

> I thought that the Python language was different from Cython here --
> that
> the second link above is only guidelines and the __hash__ attribute
> is not
> really a special case in the Python language? (except for the = None
> business).
>
> The problem I have with it is that it is a surprising special case
> -- one
> more thing one "has to learn". How much time do you think Florent has
> wasted on this issue by now

I have never seen anyone run into this issue and not be baffled or at
least very surprised (including myself the first time I saw it). We
already insulate the user from not being able to return -1 for cdef
classes like one can for ordinary classes.

Of course, some special-method behavior would be more
difficult (or expensive) to emulate.

- Robert

Lisandro Dalcin

Mar 30, 2010, 1:25:33 PM
to cython...@googlegroups.com
On 30 March 2010 03:25, Dag Sverre Seljebotn
<da...@student.matnat.uio.no> wrote:
> Lisandro wrote:
>> On 29 March 2010 16:41, Dag Sverre Seljebotn
>> <da...@student.matnat.uio.no> wrote:
>>> Craig Citro wrote:
>>>>>
>>>>> As you can see, the handling of __hash__ and __richcmp__ inheritance
>>>>> is a bit special. IIUC, you override __richcmp__ and then __hash__ is
>>>>> not inherited.
>>>>>
>>>>>
>>>>
>>>> Lisandro's right on the money -- inheritance for __cmp__, __richcmp__,
>>>> and __hash__ is an all-or-nothing deal. This is mentioned in two
>>>> places in the Python docs:
>>>>
>>>> http://docs.python.org/c-api/typeobj.html#tp_compare
>>>>
>>>> http://docs.python.org/reference/datamodel.html#object.__hash__
>>>>
>>>> I know there's some esoteric note about it in the Sage source code
>>>> somewhere, but I couldn't find it quickly.
>>>>
>>>
>>> However, I think Cython could work around this if we wanted?
>>
>> Why? What's wrong with Python's semantics?
>
> My vote is almost always for emulating whatever pure Python does as best
> as we can.

Python 2 or Python 3?

> cdef classes are different from Python classes, but the former
> should emulate the latter as far as it is reasonable.
>

They do, at least in Py 3, regarding this hash/richcmp business. Run
the code below in Py2 and Py3

class A(object):
    def __hash__(self):
        return 2
    def __eq__(self, other):
        return True

class B(A):
    def __eq__(self, other):
        return True

a = A()
b = B()
assert a == b
print( hash(a) )
print( hash(b) )
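
For readers without both interpreters at hand, here is a sketch of what the snippet above does under Python 3, with the failing call wrapped so the script runs to completion. The flag b_hashable is an illustrative name, not part of the original message:

```python
class A(object):
    def __hash__(self):
        return 2

    def __eq__(self, other):
        return True


class B(A):
    # Defining __eq__ without __hash__ makes Python 3 set B.__hash__ to None.
    def __eq__(self, other):
        return True


a = A()
b = B()
assert a == b

try:
    hash(b)
    b_hashable = True
except TypeError:
    # Python 3 reports "unhashable type: 'B'" here.
    b_hashable = False
```

Under Python 3, hash(a) returns 2 while hash(b) raises TypeError, which is the same outcome as the cdef class case from the start of this thread.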

Florent Hivert

Mar 30, 2010, 6:35:25 AM
to cython...@googlegroups.com
Hi,

> My vote is almost always for emulating whatever pure Python does as best
> as we can. cdef classes are different from Python classes, but the former
> should emulate the latter as far as it is reasonable.
>
> I thought that the Python language was different from Cython here -- that
> the second link above is only guidelines and the __hash__ attribute is not
> really a special case in the Python language? (except for the = None
> business).
>
> The problem I have with it is that it is a surprising special case -- one
> more thing one "has to learn". How much time do you think Florent has
> wasted on this issue by now

Something like two hours bisecting a bunch of Python/Cython files and class
hierarchies... But I must confess I'm not very good at it...

I completely agree that this is very surprising. I now understand the
rationale for it, though I personally think it's a bad design. Moreover, it's
clear that Cython should be as compatible as possible with Python. So if
you don't want to change this behavior, shouldn't it be possible to have Cython
raise a warning?

Cheers,

Florent


Lisandro Dalcin

Mar 30, 2010, 1:36:43 PM
to cython...@googlegroups.com
On 30 March 2010 14:23, Robert Bradshaw <robe...@math.washington.edu> wrote:
> On Mar 29, 2010, at 11:25 PM, Dag Sverre Seljebotn wrote:
>
>> Lisandro wrote:
>>>
>>>>
>>>> However, I think Cython could work around this if we wanted?
>>>
>>> Why? What's wrong with Python's semantics?
>
> What do you mean by Python's semantics?
>
> In [1]: class A(object):
>   ...:     def __hash__(self):
>   ...:         return id(self)
>   ...:
>   ...:
>
> In [2]: class B(A):
>   ...:     def __cmp__(self, other):
>   ...:         return id(self) == id(other)
>   ...:
>   ...:
>
> In [3]: hash(A())
> Out[3]: 7694640
>
> In [4]: hash(B())
> Out[4]: 8017840
>

That's an IPython session, so I guess you have no chance to try Python
3, right? ;-)

> This is a question of whether to stick with the C API extension class
> semantics, or Python instance class semantics.
>

"Python instance class semantics" (in Python 2) are essentially
broken, and this was fixed in Python 3.

>> My vote is almost always for emulating whatever pure Python does as best
>> as we can. cdef classes are different from Python classes, but the former
>> should emulate the latter as far as it is reasonable.
>
> I agree here. In general, or philosophy is that the user should not have to
> know about the Python/C API level at all.
>

IMHO, this is not a C API issue, but instead an
oversight/bug/whatever-you-want-to-call-it in CPython 2.x for
Python-implemented classes. If we ever do something, it should be to
warn/error the user about this when running Cython on their source
code.

>
> I have never seen anyone run into this issue and not be baffled or at least
> very surprised (including myself the first time I saw it).
>

Yes, we learn new things every day, at any age, no matter how much
experience you have, etc. etc...

>
> We already
> insulate the user from not being able to return -1 for cdef classes like one
> can for ordinary classes.
>

Well, that's related to a different issue.

Lisandro Dalcin

Mar 30, 2010, 1:42:44 PM
to cython...@googlegroups.com
On 30 March 2010 07:35, Florent Hivert <florent...@univ-rouen.fr> wrote:
>
> I completely agree that this is very surprising,
>

Me too, though more or less the same level of surprise you experience
when you compare x==x for a floating point value, and you get false.

>
> but I now understand the rationale for this though I personally think it's a bad design.
>

Why do you think this is a bad design? I personally agree with the
behavior of Python 3, though I would very much prefer Python to
generate a warning...

> Moreover, It's
> clear that Cython should be at most as possible compatible with Python. So if
> you don't want to change this behavior,

Still, I think it is not only a matter of compatibility, but of the right thing.

> shouldn't be possible to have Cython
> raise a warning ?
>

Definitely +1 on that.

Robert Bradshaw

Mar 30, 2010, 1:54:48 PM
to cython...@googlegroups.com
On Mar 30, 2010, at 10:36 AM, Lisandro Dalcin wrote:

> On 30 March 2010 14:23, Robert Bradshaw
> <robe...@math.washington.edu> wrote:
>> On Mar 29, 2010, at 11:25 PM, Dag Sverre Seljebotn wrote:
>>
>>> Lisandro wrote:
>>>>
>>>>>
>>>>> However, I think Cython could work around this if we wanted?
>>>>
>>>> Why? What's wrong with Python's semantics?
>>
>> What do you mean by Python's semantics?
>>
>> In [1]: class A(object):
>>    ...:     def __hash__(self):
>>    ...:         return id(self)
>>    ...:
>>    ...:
>>
>> In [2]: class B(A):
>>    ...:     def __cmp__(self, other):
>>    ...:         return id(self) == id(other)
>>    ...:
>>    ...:
>>
>> In [3]: hash(A())
>> Out[3]: 7694640
>>
>> In [4]: hash(B())
>> Out[4]: 8017840
>>
>
> That's an IPython session, so I guess you have no chance to try Python
> 3, right? ;-)

Well, when I talk about "Python" I'm always referring to the one that
ships with Sage ;)

>> This is a question of whether to stick with the C API extension class
>> semantics, or Python instance class semantics.
>
> "Python instance class semantics" (in Python 2) are essentially
> broken, and this was fixed in Python 3.

Would you have said that Python instance class semantics were
essentially broken, and this was fixed in the C API?

>>> My vote is almost always for emulating whatever pure Python does
>>> as best
>>> as we can. cdef classes are different from Python classes, but the
>>> former
>>> should emulate the latter as far as it is reasonable.
>>
>> I agree here. In general, or philosophy is that the user should not
>> have to
>> know about the Python/C API level at all.
>
> IMHO, this is not a C API issue, but instead an
> oversight/bug/whatever-you-want to call in in CPython 2.x for
> Python-implemented classes. If we ever do something, it should be to
> warn/error the user about this when running Cython in their source
> code.
>
>> I have never seen anyone run into this issue and not be baffled or
>> at least
>> very surprised (including myself the first time I saw it).
>>
>
> Yes, we learn new things every day, at any age, no matter how much
> experience you have, etc. etc...

It's not a matter of learning, it's a question of violating the
principle of least surprise. I had not realized that this was changed
in Py3--that is a good point and should give this behavior some
"precedent."

No matter what we do, +1 for a big warning.

- Robert

Stefan Behnel

Mar 31, 2010, 3:27:30 AM
to cython...@googlegroups.com
Robert Bradshaw, 30.03.2010 19:54:

> I had not realized that this was changed in
> Py3--that is a good point and should give this behavior some "precedent."

Yes, in this case, following Python semantics means that we must follow Py3.


> No matter what we do, +1 for a big warning.

Same from here.

The exact same problem came up on the lxml mailing list a couple of days
ago. One of the classes in lxml.objectify had implemented its own
__richcmp__ and thus suddenly failed to be hashable. It's really surprising
that changing __richcmp__ can break a standard feature like hashing, which
works out-of-the-box for basically everything in Python.

Note, however, that overriding __richcmp__ and not __hash__ is really a bug
in this case, as it renders hashing meaningless and fragile. A warning
would have prevented this.

But then, how will code be able to deliberately break hashability without
getting such a warning from Cython? Could we allow something like this:

    cdef class MyType(SuperType, unhashable=True):
        ...

? And, would that also generate different code or just silence the warning?

Stefan

Lisandro Dalcin

Mar 31, 2010, 9:20:20 AM
to cython...@googlegroups.com
On 31 March 2010 04:27, Stefan Behnel <stef...@behnel.de> wrote:
>
> But then, how will code be able to deliberately break hashability without
> getting such a warning from Cython? Could we allow something like this:
>
>    cdef class MyType(SuperType, unhashable=True):
>        ...

cdef class MyType(SuperType, unhashable=True):
    def __richcmp__(self, other): ...
    __hash__ = None

cdef class MyOtherType(SuperType):
    def __richcmp__(self, other): ...
    __hash__ = SuperType.__hash__

>
> ? And, would that also generate different code or just silence the warning?
>

The first one should just silence the warning (that would be enough,
right? CPython already blocks slot inheritance...).

The second one should generate the obvious C code.

BTW, this would give you maximum compatibility with Python (3) code.
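
For comparison, the two idioms already work as written in plain Python 3 classes. A sketch, with SuperType given a placeholder hash of 42 and __eq__ standing in for __richcmp__; the class bodies are illustrative only:

```python
from collections.abc import Hashable


class SuperType:
    def __hash__(self):
        return 42  # placeholder value for illustration


class MyType(SuperType):
    def __eq__(self, other):
        return isinstance(other, MyType)

    __hash__ = None  # explicitly unhashable


class MyOtherType(SuperType):
    def __eq__(self, other):
        return isinstance(other, MyOtherType)

    __hash__ = SuperType.__hash__  # explicitly keep the inherited hash


kept = hash(MyOtherType())
unhashable = not isinstance(MyType(), Hashable)
```

Spelling out __hash__ in both cases documents the intent either way, which is what a Cython warning would push users toward.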

Dag Sverre Seljebotn

Mar 31, 2010, 12:13:45 PM
to cython...@googlegroups.com
Lisandro Dalcin wrote:
> On 31 March 2010 04:27, Stefan Behnel <stef...@behnel.de> wrote:
>
>> But then, how will code be able to deliberately break hashability without
>> getting such a warning from Cython? Could we allow something like this:
>>
>>    cdef class MyType(SuperType, unhashable=True):
>>        ...
>>
Isn't this the same as the "__hash__ = None" introduced in Python (2.6 I
think?)

Dag Sverre

Stefan Behnel

Mar 31, 2010, 12:27:58 PM
to cython...@googlegroups.com, Dag Sverre Seljebotn
Dag Sverre Seljebotn, 31.03.2010 18:13:
> Isn't this the same as the "__hash__ = None" introduced in Python (2.6 I
> think?)

Ah, right, I had forgotten that that existed. That could then be used to
silence the warning, it's certainly simple enough.

Stefan
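
The warning discussed in this thread could look roughly like the following check. This is a hypothetical sketch built on Python's ast module and is not how Cython implements anything: it only parses plain Python source (not .pyx files), and the function name classes_missing_hash and the SAMPLE source are made up for illustration. It flags classes that define __eq__ or __richcmp__ without also defining or assigning __hash__, so an explicit "__hash__ = None" silences it.

```python
import ast

SAMPLE = '''
class A:
    def __hash__(self):
        return 1

class B(A):
    def __eq__(self, other):
        return True

class C(A):
    def __eq__(self, other):
        return True
    __hash__ = None
'''


def classes_missing_hash(source):
    """Return names of classes that define a comparison but no __hash__."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        defined = set()
        for stmt in node.body:
            if isinstance(stmt, ast.FunctionDef):
                defined.add(stmt.name)
            elif isinstance(stmt, ast.Assign):
                # Plain name targets such as "__hash__ = None".
                defined.update(
                    t.id for t in stmt.targets if isinstance(t, ast.Name)
                )
        if defined & {"__eq__", "__richcmp__"} and "__hash__" not in defined:
            flagged.append(node.name)
    return flagged


flagged = classes_missing_hash(SAMPLE)
```

On the sample, only B is flagged: A defines no comparison, and C opts out of hashing explicitly.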
