Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

hashCode for 4 ints?

10 views
Skip to first unread message

Jeff

unread,
Jun 20, 2005, 7:39:39 PM6/20/05
to
Am I misusing HashMap or just need to fix the .hashCode() override for the
key object?

The class I use as a key for a HashMap is unique based on a combination of 4
ints. Normally, about 5000 instances are created and persist over time.
The maximum possible created is probably max around 40,000.

Since .hashCode() returns an int, how do I create a unique identifier for
the 128 bits contained in the 4 ints? Or should I swap to another data
structure?

Thanks!

// Here's a simplification of my class that illustrates my problem.

public class IntHashExample {
int A, B, C, D;

IntHashExample( int A , int B , int C , int D ) {
this.A = A;
this.B = B;
this.C = C;
this.D = D;
}

public boolean equals( Object arg0 ) {

IntHashExample other = ( IntHashExample ) arg0;

if ( ( A == other.A ) && ( B == other.B ) && ( C == other.C ) && ( D
== other.D ) )
return true;
else
return false;
}

// how do I create a hashcode to lookup the unique combination of the 4
ints? The following obviously wouldn't work,
// but hopefully indicates my need
public int hashCode() {

long highest = ( long ) A << 96;

long high = ( long ) B << 64;

long low = ( long ) C << 32;

long lowest = ( long ) D;

int code = ( int ) ( highest | high | low | lowest );

return code;
}
}

--
Jeff


YoungHwan

unread,
Jun 20, 2005, 8:08:26 PM6/20/05
to
what about this??

public int hashCode() {
int result = 17;
result = 37 * result + highest;
result = 37 * result + high;
result = 37 * result + low;
result = 37 * result + lowest;
return result;
}

>From ' Effective Java: Programming Language Guide


Jeff 작성:

Lasse Reichstein Nielsen

unread,
Jun 20, 2005, 8:12:16 PM6/20/05
to
"Jeff" <jeff...@bellatlantic.net> writes:

> The class I use as a key for a HashMap is unique based on a combination of 4
> ints.

Ok, that gives you (2^32)^4 different possible values (or 2^96 times
more than the possible values of hashCode()).

> Normally, about 5000 instances are created and persist over time.
> The maximum possible created is probably max around 40,000.

Is there any system to these, or are they distributed randomly in
int^4-space?

> Since .hashCode() returns an int, how do I create a unique identifier for
> the 128 bits contained in the 4 ints?

Obviously, you can't map 2^128 bits of information uniquely into 2^32
bits, unless you know something about the four integers that you
haven't told us yet.

> Or should I swap to another data structure?

No need. Hash values doesn't need to be unique. For performance
reasons, it's best that they don't collide too often, but for 40000
instances, that's unlikely to happen.

> public boolean equals( Object arg0 ) {
>
> IntHashExample other = ( IntHashExample ) arg0;
>
> if ( ( A == other.A ) && ( B == other.B ) && ( C == other.C ) && ( D
> == other.D ) )
> return true;
> else
> return false;

Just do:
return (A==other.A) && ... && (D == other.D);

>
> // how do I create a hashcode to lookup the unique combination of the 4
> ints? The following obviously wouldn't work,

No simple method will guarantee uniqueness.

Maybe it is possible to make a simple method that works for a typical
choice of the 40000 instances, but as long as it's fairly good, a few
collisions won't matter much.

> // but hopefully indicates my need
> public int hashCode() {

How about just:
return ((((((A * 31) + B) * 31) + C) * 31) + D);

/L
--
Lasse Reichstein Nielsen - l...@hotpop.com
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'

Jeff

unread,
Jun 21, 2005, 6:26:00 AM6/21/05
to
Am I misusing HashMap or just need to fix the .hashCode() override for the
key object?

The class I use as a key for a HashMap is unique based on a combination of 4
ints. Normally, about 5000 instances are created and persist over time.


The maximum possible created is probably max around 40,000.

Since .hashCode() returns an int, how do I create a unique identifier for
the 128 bits contained in the 4 ints? Or should I swap to another data
structure?

Thanks!

// Here's a simplification of my class that illustrates my problem.

public class IntHashExample {
int A, B, C, D;

IntHashExample( int A , int B , int C , int D ) {
this.A = A;
this.B = B;
this.C = C;
this.D = D;
}

public boolean equals( Object arg0 ) {

IntHashExample other = ( IntHashExample ) arg0;

if ( ( A == other.A ) && ( B == other.B ) && ( C == other.C ) && ( D
== other.D ) )
return true;
else
return false;
}

// how do I create a hashcode to lookup the unique combination of the 4


ints? The following obviously wouldn't work,

// but hopefully indicates my need
public int hashCode() {

long highest = ( long ) A << 96;

Patricia Shanahan

unread,
Jun 21, 2005, 8:18:25 AM6/21/05
to
Jeff wrote:
...

> Since .hashCode() returns an int, how do I create a unique identifier for
> the 128 bits contained in the 4 ints? Or should I swap to another data
> structure?
...

Why do you need the hash code to be a unique identifier?

Patricia

Thomas Weidenfeller

unread,
Jun 21, 2005, 8:22:37 AM6/21/05
to
Jeff wrote:
> The class I use as a key for a HashMap is unique based on a combination of 4
> ints. Normally, about 5000 instances are created and persist over time.
> The maximum possible created is probably max around 40,000.
>
> Since .hashCode() returns an int, how do I create a unique identifier for
> the 128 bits contained in the 4 ints? Or should I swap to another data
> structure?

You don't. Hashes don't have to be unique. It is much better if they
are, but they can't under all conditions. That's why you also need to
have equals(). The rule is that hashCode() must return the same hash for
objects which are equal (and of course of the object is the same), but
is allowed to return the same hash also for unequal objects.


An extremely inefficient hash, but still an allowed implementation of
hashCode() would be:

public int hashCode() { return 0; }

NO, DON'T DO IT - this is just an example.

So what is a good hash? Entire computer science books can be filled with
research about this. I would suggest you start with a simple one, and
measure if it is ok for your task. Or, you could try stuff like the
following:

Warning:

- This was a quick hack, and I didn't verify if I got the algorithm even
remotely right.

- I have also no clue if the algorithm is suitable for your numbers

- It burns a lot more CPU cycles than a simper hash (like just xoring
all integers), and might not be worth the trouble at all


static final int FNV1_INIT = 0x811c9dc5;
static final int FNV1_PRIME = 0x01000193;

protected int hashFnvInt(int v, int hash) {
for(int i = 0; i < 4; i++) {

// equivalent to hash *= FNV1_PRIME
// might generate a lot of VM instructions
// and not worth the saving of the
// multiplication
hash += (hash << 1)
+ (hash << 4)
+ (hash << 7)
+ (hash << 8)
+ (hash << 24);
hash ^= v & 0xFF;
v >>= 8;
}
return hash;
}

public int hashCode() {
return hashFnvInt(D,
hashFnvInt(C,
hashFnvInt(B,
hashFnvInt(A, FNV1_INIT))));
}


Googel for FNV (Fowler, Noll, Vo) for details.

/Thomas


--
The comp.lang.java.gui FAQ:
ftp://ftp.cs.uu.nl/pub/NEWS.ANSWERS/computer-lang/java/gui/faq

huddler

unread,
Jun 21, 2005, 10:06:14 AM6/21/05
to

<Warning: Java Newbie>

If the hash is used as the key in a HashMap, then non-unique hashes
would create a side effect to put() where an existing Object in the
HashMap is replaced with the new Object. He might not want that to
happen.

Eric Sosman

unread,
Jun 21, 2005, 10:30:08 AM6/21/05
to

Jeff wrote:
> Am I misusing HashMap or just need to fix the .hashCode() override for the
> key object?

Others have given advice about how to hash, but I haven't
seen anyone point out a different problem in your code:

> public boolean equals( Object arg0 ) {
>
> IntHashExample other = ( IntHashExample ) arg0;

This violates the equals() contract. If `arg0' is
a String or a JPanel or anything except an IntHashExample,
this method will throw a ClassCastException -- but equals()
is not supposed to do that; it should just return false.
Also, if `arg0' is null `other' will also be null, and
you'll throw a NullPointerException as soon as you try to
access the non-existent object it doesn't reference.

Suggested fix:

public boolean equals(Object arg0) {
if (! (arg0 instanceof IntHashExample) )
return false;
// the remainder as before ...

--
Eric....@sun.com

Patricia Shanahan

unread,
Jun 21, 2005, 11:36:15 AM6/21/05
to

The "key" is the object itself, not its hash. Replacement happens if,
and only if, the two objects are equal.

The hash is used for performance purposes, to reduce the number of
objects that HashMap has to look at when testing for equality.

Here's an example program, using the unrecommented, very poor
performance, but functionally correct trivial hashCode method:

import java.util.HashMap;
import java.util.Map;

public class TrivialHash {
String name;

public TrivialHash(String name) {
this.name = name;
}

public String toString() {
return name;
}

public boolean equals(Object o) {
return (o instanceof TrivialHash && name.equals(((TrivialHash)
o).name));
}

public int hashCode() {
return 0;
}

public static void main(String[] args) {
Map map = new HashMap();
map.put(new TrivialHash("A"), "X");
map.put(new TrivialHash("B"), "Y");
System.out.println(new TrivialHash("A").hashCode());
System.out.println(new TrivialHash("B").hashCode());
System.out.println(map.get(new TrivialHash("A")));
System.out.println(map.get(new TrivialHash("B")));
}
}

This program prints:

0
0
X
Y

demonstrating that equal hash codes do not prevent two key values from
being in the HashMap at the same time.

Patricia

huddler

unread,
Jun 21, 2005, 1:07:46 PM6/21/05
to
Well, as a Java newbie, I'm in a constant state of confusion. In the
1.4.2 doc for the HashMap put() method, it says:

<quote>
public Object put(Object key,
Object value)

Associates the specified value with the specified key in this map. If
the map previously contained a mapping for this key, the old value is
replaced.
</quote>

So could I trouble you to explain (or point to an explanation) of why
your example and the doc SEEM to contradict each other?

http://java.sun.com/j2se/1.4.2/docs/api/java/util/HashMap.html#put(java.lang.Object,%20java.lang.Object)

I await humiliation.

Patricia Shanahan

unread,
Jun 21, 2005, 4:25:27 PM6/21/05
to

I aim for enlightenment, rather than humiliation. Of course, I may miss
sometimes.

I think you may be misinterpreting the word "key" in the HashMap
documentation. It is also possible that the OP shares your confusion.

Look at the argument list. The key is the object, not its hash code.

In my example program, a TrivialHash object is equal to other
TrivialHash objects with the same name, and nothing else. The first call
uses as key a TrivialHash object with name "A". The second call uses a
TrivialHash object with name "B", which is unequal. As far as HashMap is
concerned, when I do the second map.put call, the map does not contain a
mapping for the key, so it does not need to replace an old value.

HashMap, like most hash tables, assumes that equal keys have equal hash
codes, but that some unequal keys may share the same hash code. That is
all HashMap is allowed to assume, because of the hashCode definition in
the Object javadocs: "If two objects are equal according to the
equals(Object) method, then calling the hashCode method on each of the
two objects must produce the same integer result."

Essentially, a hash table splits the keys into subsets, called buckets,
using the hash code. If two keys are equal, they go on the same bucket,
so a lookup only has to examine the keys in one bucket. However, two
keys in the same bucket, even if they have exactly the same hash, may
not be equal. The only way HashMap can be sure two Java objects are
equal is to get a true result from o1.equals(o2).

The behavior you seem to be expecting, using the hash code as key and
ignoring the object's equals method, does match a real but specialized
technique. A "perfect hash" ensures unequal keys have unequal hash
codes. It shifts some complexity from the lookup process to the
construction of the hash. It may be used, for example, for short lists
of keywords that do not change. Do a search for "perfect hash" if you
want to learn more.

The Java hashCode function is NOT required to be a perfect hash, and
HashMap does not assume it is one.

Patricia

Wibble

unread,
Jun 21, 2005, 9:28:50 PM6/21/05
to
Nopers, instanceof isnt strong enough since arg0 may subclass
IntHashExample and override hashCode or equals. use

if (arg0==null || arg0.getClass()==this.getClass()) return(false);

huddler

unread,
Jun 21, 2005, 10:54:09 PM6/21/05
to
Cripes, I hate when I'm this stupid in public.

Thanks for the discussion, Patricia.

John C. Bollinger

unread,
Jun 21, 2005, 11:00:48 PM6/21/05
to
Wibble wrote:
> Eric Sosman wrote:
>> Jeff wrote:

[...]

>>> public boolean equals( Object arg0 ) {
>>>
>>> IntHashExample other = ( IntHashExample ) arg0;

>>
>> This violates the equals() contract.

[...]

>> Suggested fix:
>>
>> public boolean equals(Object arg0) {
>> if (! (arg0 instanceof IntHashExample) )
>> return false;
>> // the remainder as before ...
>>
> Nopers, instanceof isnt strong enough since arg0 may subclass
> IntHashExample and override hashCode or equals. use
>
> if (arg0==null || arg0.getClass()==this.getClass()) return(false);

It depends on your point of view, I suppose. instanceof works fine for
the equals() method described, but it does provide an extra avenue by
which a subclass can be broken (overriding equals() or hashCode()). The
class presented is only broken if such a subclass exists. Such breakage
could also be prevented by making the class final (which renders your
solution equivalent to Eric's) or by making just equals() and hashCode()
final.

--
John Bollinger
jobo...@indiana.edu

George Cherry

unread,
Jun 22, 2005, 5:18:38 PM6/22/05
to

"Patricia Shanahan" <pa...@acm.org> wrote in message
news:PtWte.6979$hK3....@newsread3.news.pas.earthlink.net...

Excellent example, Patricia. In the spirit of your
example, here is another one:

/*
* This program illustrates that if equals() is appropriately
* defined, then even a mediocre hashCode() will not prevent a
* hash collection from working correctly (although a mediocre
* hashCode() will defeat the performance purpose of a hash
* collection.
*/

import java.util.HashSet;
import java.util.Set;

public class SillyHashCodeTester {
private String name;
private int id;

public SillyHashCodeTester(String name, int id) {
this.name = name;
this.id = id;
}

public String toString() {
return name + ", " + id;
}

public boolean equals(Object o) {
return (

o instanceof SillyHashCodeTester
&& this.name.equals( ((SillyHashCodeTester)o).name )
&& this.id == ((SillyHashCodeTester)o).id
);
}

//A: Define a silly hashCode():
public int hashCode() {
return 42;
}

public static void main(String[] args) {

Set<SillyHashCodeTester> hashSet;
hashSet = new HashSet<SillyHashCodeTester>();

System.out.println( "Set size: " + hashSet.size() );
System.out.println("Add an object to the set.");
hashSet.add( new SillyHashCodeTester("abc", 100) );
System.out.println( "Set size: " + hashSet.size() + "\n" );

System.out.println("Add a non-duplicate object to the set.");
hashSet.add( new SillyHashCodeTester("abc", 200) );
System.out.println( "Set size: " + hashSet.size() + "\n" );

System.out.println("Try to add a duplicate object to the set.");
hashSet.add( new SillyHashCodeTester("abc", 100) );
System.out.println( "Set size: " + hashSet.size() + "\n" );

System.out.println("Try to add a duplicate object to the set.");
hashSet.add( new SillyHashCodeTester("abc", 200) );
System.out.println( "Set size: " + hashSet.size() + "\n" );

System.out.println("Add a non-duplicate object to the set.");
hashSet.add( new SillyHashCodeTester("xyz", 100) );
System.out.println( "Set size: " + hashSet.size() + "\n" );
}
}

/*
This program should print:

Set size: 0
Add an object to the set.
Set size: 1

Add a non-duplicate object to the set.
Set size: 2

Try to add a duplicate object to the set.
Set size: 2

Try to add a duplicate object to the set.
Set size: 2

Add a non-duplicate object to the set.
Set size: 3
*/

George


George Cherry

unread,
Jun 22, 2005, 5:32:07 PM6/22/05
to

"Patricia Shanahan" <pa...@acm.org> wrote in message
news:XI_te.7035$hK3....@newsread3.news.pas.earthlink.net...

Yes, part of the contract of a hashCode() is

1. Equal objects have equal hash codes.
2. Unequal objects MAY have the same hash code.
(Of course, IDEALLY they don't have the same hash code.)

> That is
> all HashMap is allowed to assume, because of the hashCode definition in
> the Object javadocs: "If two objects are equal according to the
> equals(Object) method, then calling the hashCode method on each of the
> two objects must produce the same integer result."
>
> Essentially, a hash table splits the keys into subsets, called buckets,
> using the hash code. If two keys are equal, they go on the same bucket,
> so a lookup only has to examine the keys in one bucket.

The bucket is often the head of a linked list, I think.

> However, two
> keys in the same bucket, even if they have exactly the same hash, may
> not be equal. The only way HashMap can be sure two Java objects are
> equal is to get a true result from o1.equals(o2).

Yes, o1.equals(o2) implies o1.hashCode() == o2.hashCode()
BUT o1.hashCode() == o2.hashCode() does not imply o1.equals(o2)

> The behavior you seem to be expecting, using the hash code as key and
> ignoring the object's equals method, does match a real but specialized
> technique. A "perfect hash"

In this world there is no perfection, no permanence, and no self. -- Buddha

> ensures unequal keys have unequal hash
> codes. It shifts some complexity from the lookup process to the
> construction of the hash. It may be used, for example, for short lists
> of keywords that do not change. Do a search for "perfect hash" if you
> want to learn more.
>
> The Java hashCode function is NOT required to be a perfect hash, and
> HashMap does not assume it is one.
>
> Patricia

Great follow-up to your excellent example.

George


George Cherry

unread,
Jun 22, 2005, 5:33:22 PM6/22/05
to

"huddler" <hud...@earthlink.net> wrote in message
news:1119408849.6...@o13g2000cwo.googlegroups.com...

> Cripes, I hate when I'm this stupid in public.
>
> Thanks for the discussion, Patricia.

I beg to differ with you, huddler.
You are NOT stupid--the stupid
persist in their ignorance.

George


Roedy Green

unread,
Jun 28, 2005, 7:44:24 AM6/28/05
to
On Tue, 21 Jun 2005 12:18:25 GMT, Patricia Shanahan <pa...@acm.org>
wrote or quoted :

>> Since .hashCode() returns an int, how do I create a unique identifier for
>> the 128 bits contained in the 4 ints? Or should I swap to another data
>> structure?

see http://mindprod.com/jgloss/hashcode.html. Just xor them together.

--
Bush crime family lost/embezzled $3 trillion from Pentagon.
Complicit Bush-friendly media keeps mum. Rumsfeld confesses on video.
http://www.infowars.com/articles/us/mckinney_grills_rumsfeld.htm

Canadian Mind Products, Roedy Green.
See http://mindprod.com/iraq.html photos of Bush's war crimes

Thomas Weidenfeller

unread,
Jun 28, 2005, 9:34:12 AM6/28/05
to
Roedy Green wrote:
> see http://mindprod.com/jgloss/hashcode.html. Just xor them together.

Just XORing the integers can give very bad hash-codes. In the not to
unlikely case where all the involved integers are rather small
positive(!) numbers the hash is not well spread at all over the entire
possible range.

Small positive integers have many of their higher bits set to zero.
XORing zeros gives zeros again. A hash calculated from small positive
integers vi XORing can never exceed

2^(floor(log2(max(i1, i2, ..., in))) + 1) - 1

But even for positive integers which are not so small any more you end
up with bad hashes. E.g. if your largest integer is 10000, you would
still never get a hash code which is larger than 2^14 -1 = 16283.

/Thomas

http://www.uni-giessen.de/faq/archiv/computer-lang.java.gui.faq/

0 new messages