What is the hashcode used for?

tojo2000

Aug 6, 2008, 7:16:33 PM
So in a previous thread we discovered that we could check the identity
of an object by doing a quick $variable.GetHashCode() to see if two
array variables were really pointing at the same underlying object.

My question is this: what is the hashcode used for really? Is it
similar to the hashcode in Java?

____________________________________
tojo2000 at the dot com of the same name
http://tojo2000.com/tasteofpowershell

Richard Mueller [MVP]

Aug 6, 2008, 9:07:21 PM

"tojo2000" <tojo...@gmail.com> wrote in message
news:811ea5d0-e938-44a8...@a8g2000prf.googlegroups.com...

A hash is an efficient function that takes an input of any length and
calculates a pseudo-random, fixed-length output. The purpose is to serve as a
compact representation of the input. The idea is that any change to the
input, no matter how small, should result in a completely different hash
value. If the two objects are identical, the hashes must match. If the
objects are not identical, the odds of the hashes matching should be so
small that we can ignore the possibility. If different inputs have the same
hash, it is called a collision. It is impossible to prevent collisions,
since the universe of possible inputs is much larger than the universe of
possible hashes. However, given one input and its hash, it should be
infeasible to find another input that results in the same hash.

I cannot find documentation on the algorithm used by GetHashCode, but it
appears to be a 4 byte (32-bit) value. That makes the odds of a collision
awfully high compared to standard algorithms. 40-bit hash algorithms are
generally considered out of date for security purposes (at least 128-bit is
recommended and 256-bit is becoming common). Although you could argue that
the purpose here is not security, I'm still a bit surprised.
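
For what it's worth, the .NET documentation describes GetHashCode as intended for hash-based collections such as hash tables rather than for security, which would explain the small size. A minimal sketch of that use, assuming nothing beyond built-in types (the variable names are made up for illustration):

```powershell
# Two equal strings built in different ways.
$a = "abc"
$b = "ab" + "c"

# GetHashCode() returns a 32-bit integer.
$a.GetHashCode().GetType().FullName      # System.Int32

# Equal values give equal hash codes within one session...
$a.GetHashCode() -eq $b.GetHashCode()    # True

# ...which is what lets a hashtable locate a key quickly.
$table = @{}
$table[$a] = "found"
$table[$b]                               # found
```

The actual hash value can differ between runtimes or sessions, so it is best not to store or compare it across runs.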

--
Richard Mueller
MVP Directory Services
Hilltop Lab - http://www.rlmueller.net
--


tojo2000

Aug 6, 2008, 9:53:49 PM
On Aug 6, 6:07 pm, "Richard Mueller [MVP]" <rlmueller-
nos...@ameritech.nospam.net> wrote:
> "tojo2000" <tojo2...@gmail.com> wrote in message

Is there any definitive way to really find the identity of the object
being pointed to by a variable? The issue that someone brought up was
originally that he wanted to copy a two-dimensional array, but he
discovered that copying the array resulted in changes to one variable
being reflected in the copy. We discovered that changing the length
of the array would change the underlying object to a unique object,
but it seems a bit kludgy to have to run a recursive function on every
copied array, even though it's not an issue that has come up a lot for
me so far.

/\/\o\/\/ [MVP]

Aug 7, 2008, 4:26:12 AM

> The issue that someone brought up was
> originally that he wanted to copy a two-dimensional array, but he
> discovered that copying the array resulted in changes to one variable
> being reflected in the copy. We discovered that changing the length
> of the array would change the underlying object to a unique object

.NET objects can be of value or reference type (insert good ref link here)

what you might also have seen here is another issue on top of that:

an array does not have an Add method (i.e. it is not resizable), so if you do a +=
(add an element) PowerShell will make a copy of the original array under the
hood to mimic an on-the-fly resize of the array.
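
a quick sketch of that copy-on-+= behaviour, using [object]::ReferenceEquals to check whether two variables still hold the same array (the variable names are only for illustration):

```powershell
$c = 1, 2, 3
$d = $c                                # $d refers to the same array as $c

[object]::ReferenceEquals($c, $d)      # True

$c += 100                              # PowerShell builds a new array for $c

[object]::ReferenceEquals($c, $d)      # False
$c.Length                              # 4
$d.Length                              # 3
```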

note that some objects have a copy() or clone() method you might be able to
use.

for the other way around [ref]

Greetings /\/\o\/\/

PS I'm sure Bruce has a better explanation in his book, PowerShell in Action

tojo2000

Aug 7, 2008, 5:31:36 AM
On Aug 7, 1:26 am, /\/\o\/\/ [MVP] <o...@discussions.microsoft.com>
wrote:

Here is the original code that Bruce had that started the last thread:

PS C:\Documents and Settings\Bruce> $c=1,2,3
PS C:\Documents and Settings\Bruce> $c
1
2
3
PS C:\Documents and Settings\Bruce> $d = $c
PS C:\Documents and Settings\Bruce> $d
1
2
3
PS C:\Documents and Settings\Bruce> $c[0]=99
PS C:\Documents and Settings\Bruce> $d
99 #<<< Hang on! I updated $c, not $d. Is $d a reference to the same object as $c?
2
3
PS C:\Documents and Settings\Bruce> $c+=100
PS C:\Documents and Settings\Bruce> $d #if $d is a reference then 100 should appear on the end of this too... nup.
99
2
3
PS C:\Documents and Settings\Bruce> $c
99
2
3
100

<snip>

#not like this...
$x=(1,2),(3,4)
$y = $x.Clone()
$x.GetHashCode(),$y.GetHashCode() # <-- different
$x[0].gethashcode(),$y[0].gethashcode() # <-- the same...

So it seems that when you copy one array to another, both variables
point to the same object. Eventually I came up with this function to
disentangle the two arrays:

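function New-Hashcode($var) {
    # Only arrays need to be detached; anything else is returned as-is.
    if ($var.GetType().Name -eq 'Object[]') {
        # += forces PowerShell to build a new array object...
        $var += 0
        # ...then slice off the dummy element that was just appended.
        $var = $var[0..($var.Length - 2)]
        # Recurse so that nested arrays are detached as well.
        foreach ($index in (0..($var.Length - 1))) {
            $var[$index] = New-Hashcode $var[$index]
        }
    }
    return $var
}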

But this seems more like a kludgy workaround for a weird bug than
anything else. I'm hoping that by understanding the way the
underlying code is assigning and manipulating the objects it will all
fall into place.
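
For comparison, here is a rough sketch of the same idea built on Clone() instead of the += trick. Copy-NestedArray and $Source are hypothetical names, and it only handles arrays nested inside arrays, not other reference types:

```powershell
# Hypothetical helper: copies an array and, recursively, any arrays inside it.
function Copy-NestedArray {
    param([object[]] $Source)

    # Clone() gives a new outer array whose elements are still shared.
    $copy = $Source.Clone()

    for ($i = 0; $i -lt $copy.Length; $i++) {
        if ($copy[$i] -is [array]) {
            # Replace each shared inner array with its own copy.
            $copy[$i] = Copy-NestedArray $copy[$i]
        }
    }

    # The leading comma keeps PowerShell from unrolling the array on output.
    , $copy
}

$x = (1, 2), (3, 4)
$y = Copy-NestedArray $x
$x[0][0] = 99
$y[0][0]                                   # 1
[object]::ReferenceEquals($x[0], $y[0])    # False
```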

Alex K. Angelopoulos at

Aug 7, 2008, 8:42:35 AM
FYI, GetHashCode is documented under the class which it is inherited from,
System.Object:

http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx

I was punting on the earlier answer I made in that thread - a hash
comparison really IS a lame way to check identity. I just couldn't remember
where this can be done in .NET. Anyway, I believe the appropriate way is to
use ReferenceEquals, a static System.Object method:

http://msdn.microsoft.com/en-us/library/system.object.referenceequals.aspx

something like this:

[object]::ReferenceEquals($c,$d)

That should give a definitive identity test.
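
As a small sketch, applied to the nested-array Clone() case from earlier in the thread (same $x and $y as in that snippet), it shows directly that the outer arrays are distinct while the inner ones are still shared:

```powershell
$x = (1, 2), (3, 4)
$y = $x.Clone()                            # shallow copy: new outer array only

[object]::ReferenceEquals($x, $y)          # False: different outer arrays
[object]::ReferenceEquals($x[0], $y[0])    # True: same inner array

# Because the inner array is shared, a change through $x shows up in $y.
$x[0][0] = 99
$y[0][0]                                   # 99
```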

I'm not sure how Java does its hashing, but if you read the Remarks on the
first page I referenced, you'll notice that the hashcode is based on
instance properties, which suggests that distinct objects with the same
properties will share a hash. This is possible to demonstrate with the
following code:

PS> $fu = "abc";$bar = "abc"
PS> $fu,$bar | %{$_.GetHashCode()}
1099313834
1099313834
PS> [object]::ReferenceEquals($fu,$bar)
False


I also found one other oddity in PowerShell, which makes perfect sense after
a bit of thought. If you make several items equal to the same value in one
statement, the variables point to the same object. Thus the following:


PS> $fu = "abc"; $bar = "abc"
PS> $barney = $fred = "abc"
PS> [object]::ReferenceEquals($fu,$bar)
False
PS> [object]::ReferenceEquals($barney,$fred)
True

That final "True" is very cool. : )

