Hashtables and composite types

195 views
Skip to first unread message

Avik Sengupta

unread,
Apr 9, 2012, 6:07:38 PM4/9/12
to juli...@googlegroups.com
I am slightly confused by the functionality of the hash table, and would appreciate some pointers. 

I have a composite type, for which I have written a custom hash function. I would expect that storing objects of that type in a Hashtable would store and return values based on the hash values of the key. (Note that my hash function below is purely for illustrative purposes) . 

julia> type A                        
         x::Integer
         y::Integer
       end

julia> hash(a::A) = (a.x+1)*(a.y+1)

julia> a1 = A(1,2)
A(1,2)

julia> a2 = A(2,1)
A(2,1)

julia> hash(a1) == hash(a2)
true

julia> h = HashTable{A,Int}()
HashTable()

julia> assign(h, 1, a1)
{A(1,2)=>1,}

julia> get(h, a1, nothing)         
1

julia> get(h, a2, nothing)  #expect to return 1!!

julia> 


PS. What are IdTables and WeakKeyHashTable? 

Stefan Karpinski

unread,
Apr 9, 2012, 6:30:08 PM4/9/12
to juli...@googlegroups.com
The problem here is that you need to override both the hash and isequal functions: the hash table works by picking a bucket using hash(x), but then checks for equality with whatever's in the bucket using isequal(x,y). In this case a1 and a2 have the same hash bucket, but they aren't isequal, so they are considered to simply be a hash collision.

All of this is poorly documented at this point. I think that we probably want to remove IdTable from public sight, rename HashTable to Hash (using the name HashFunction for the far less commonly mentioned function that can be used to implement a hash table), and rename WeakKeyHashTable to WeakHash or maybe the slightly more verbose WeakKeyHash. To answer your questions though, IdTable is a hash implementation that's just for Int keys and WeakKeyHashTable is a hash implementation where the keys are weak references to objects, meaning that a value won't be prevented from garbage collection just because it has been used as a key in the hash table.

Avik Sengupta

unread,
Apr 9, 2012, 6:34:33 PM4/9/12
to juli...@googlegroups.com
Ah, 'isequal', thats the bit I've missed. Thanks a lot Stefan, this all makes sense now. 

Jeff Bezanson

unread,
Apr 9, 2012, 6:44:20 PM4/9/12
to juli...@googlegroups.com
Nope, IdTable is a special hash table where the keys are always object
identities --- it is for associating things with specific objects. For
example you could use it to detect cycles in data structures.

To me, the noun "hash" refers to a computed hash value, as in "compute
a hash of this block of data". The "table" part is what makes it a
data structure. I'd rather call it Table, Map or Dict than Hash.

Stefan Karpinski

unread,
Apr 9, 2012, 7:05:10 PM4/9/12
to juli...@googlegroups.com
On Mon, Apr 9, 2012 at 6:44 PM, Jeff Bezanson <jeff.b...@gmail.com> wrote:
Nope, IdTable is a special hash table where the keys are always object
identities --- it is for associating things with specific objects. For
example you could use it to detect cycles in data structures.

I did not know that. Should probably be renamed to ObjectIdDict.
 
To me, the noun "hash" refers to a computed hash value, as in "compute
a hash of this block of data". The "table" part is what makes it a
data structure. I'd rather call it Table, Map or Dict than Hash.

Let's go with Dict then. Map will likely the name of a type that works like the map function but is lazy. Table, as we've discussed before, is too overloaded.
Reply all
Reply to author
Forward
0 new messages