doctests for hashes

14 views
Skip to first unread message

Dan Drake

unread,
Jul 25, 2010, 9:33:20 PM7/25/10
to sage-...@googlegroups.com
Hello,

Ticket #9590 fixes a problem with hashing; it looks like the hash value
in the doctest is 32- or 64-bit specific, and of course it fails on
systems that don't match. The solution there is to change a doctest like

sage: hash(foo)
574575757575

into

sage: hash(foo) == hash(foo)
True

That avoids using the particular value, which we don't really care about
anyway, but I'm not certain this is a good solution, since the new
doctest really only tests that the hash function actually returns a
value.

I don't know much about the hashing functions, so maybe the new doctest
is fine. There are some other fixes proposed on that ticket; should I
merge the current fix, or should we use 32-bit and 64-bit tags to use
different values?

Dan

--
--- Dan Drake
----- http://mathsci.kaist.ac.kr/~drake
-------

signature.asc

Carl Witty

unread,
Jul 25, 2010, 10:11:22 PM7/25/10
to sage-...@googlegroups.com

Hmm... looks like the current state of affairs is a mess. Looking
through the 'def __hash__' grep hits in sage/rings, there are quite a
few of each of the following:

1) no doctest at all
2) provide both 32-bit and 64-bit doctests
3) define your hash function to produce a 32-bit output that's the
same on 32-bit and 64-bit systems; doctest an instance of that output
4) doctest hash value equality without ever showing a doctest output

plus one instance where the hash output is marked "# random".

So whatever you do with this particular patch, it won't make things
much worse :)

As for the desired state of affairs: I have a slight preference for
providing both 32-bit and 64-bit doctest outputs, because it increases
our chance of noticing if something changes unexpectedly. But I could
also make a good case for only testing hash equality, because it
slightly reduces the effort involved in changing hash functions,
internal representations, etc. :)

Carl

Dan Drake

unread,
Jul 25, 2010, 11:11:15 PM7/25/10
to sage-...@googlegroups.com
On Sun, 25 Jul 2010 at 07:11PM -0700, Carl Witty wrote:
> Hmm... looks like the current state of affairs is a mess. Looking
> through the 'def __hash__' grep hits in sage/rings, there are quite a
> few of each of the following:
>
> 1) no doctest at all
> 2) provide both 32-bit and 64-bit doctests
> 3) define your hash function to produce a 32-bit output that's the
> same on 32-bit and 64-bit systems; doctest an instance of that output
> 4) doctest hash value equality without ever showing a doctest output
>
> plus one instance where the hash output is marked "# random".
>
> So whatever you do with this particular patch, it won't make things
> much worse :)

Yeah! That's what I like to hear. :)

> As for the desired state of affairs: I have a slight preference for
> providing both 32-bit and 64-bit doctest outputs, because it increases
> our chance of noticing if something changes unexpectedly. But I could
> also make a good case for only testing hash equality, because it
> slightly reduces the effort involved in changing hash functions,
> internal representations, etc. :)

If the hash values are supposed to be 32- or 64-bit integers, perhaps
testing that would be useful; something like

sage: hash(foo) > 0 and is_integer(hash(foo))
True
sage: hash(foo) < 2^sys_bits()
True

where sys_bits() is a function that we could add that returns "32" or
"64", depending on your system. (Maybe such a function is already in
Sage.) Or we could just do two tests:

sage: hash(foo) < 2^32 # 32-bit
True

and so on. The above setup ignores the particular value and instead
insures that it has the necessary properties, which I think is what we
really want. Thoughts?

signature.asc

Carl Witty

unread,
Jul 25, 2010, 11:24:55 PM7/25/10
to sage-...@googlegroups.com

1) We want more properties than that; a constant hash (that returned
17 for all elements of a given ring) would not be so useful. (Of
course, very few (possibly none?) of the existing doctests test for
non-constant hashes.)

2) The Python code in hash() that calls your __hash__ method already
enforces your properties (hashes must be integers, and if they're not
in the correct range, Python will re-hash them into a small enough
integer). So just having hash(foo) not return an error is enough to
check those properties.

Carl

Reply all
Reply to author
Forward
0 new messages