On 4/4/2012 4:37 PM, Mark wrote:
> Hello
>
> I'm trying to find a way create a hash value over multiple IP addresses (I
> keep a list of IP addresses and its number is variable. The simplest method
> I've found so far is (obviously not the most effective and collision free):
>
> unsigned long hash_ipaddr(struct in_addr *addr)
> {
> unsigned long res;
>
> if (addr == NULL)
> return 0UL;
>
> res = (((addr->s_addr>> 24)& 0xff) * 256) + \
> (((addr->s_addr>> 16)& 0xff) * 256) + \
> (((addr->s_addr>> 8)& 0xff)* 256) + \
> (addr->s_addr& 0xff);
(Idle question: Why the backslashes?)
> return res;
Briefly, this calculation takes four chunks of the address
as numbers between 0..255 and adds them to produce a sum 0..1020.
So: You start with 32 bits of information and squash them down to
a smidgen less than 10 bits -- with a moderate central tendencey,
too. Ponder, young one: Wise this is?
> }
>
> And then sum up hashes for every IP and store it.
Not sure what you mean by this. A given IP produces one and
only one hash value with this scheme (or any other sane scheme),
so what do you mean by "sum up the hashes?"
> Can somebody suggest some better approach for my task?
You haven't explained what your task is, so I'll have to guess.
If the objective is to use a hash table keyed on IP addresses, there
are two likely possibilities:
- The number of buckets in the table is a prime number (or at
the very least an odd number): Just take `*addr % N', and the
outcome is very likely to be better, especially if N > 1020.
The remainder after division by a prime is likely to break up
simple patterns.
- The number of buckets is a power of two. Now `*addr % N' would
just throw away the high-order bits -- which might not be Bad,
but could easily be Less Than Good if there are patterns in the
input keys. One possibility is to use `f(*addr) % N', where f()
is a suitable "scrambling" function: For example, implement a
two-line linear congruential pseudo-random number generator, seed
it with `*addr', and take its output as f().
For further ideas, see TAOCP volume III.
--
Eric Sosman
eso...@ieee-dot-org.invalid