# sparsehash/vector of integer numbers

292 views

### daniel

Nov 5, 2011, 9:08:06 AM11/5/11
Dear All,

I would like to use sparsehash to map arrays of integer numbers into
unique identifiers. All examples I've seen about the usage of
sparsehash involve strings rather than integer numbers, though.
Please, is there any sample code involving arrays of integers?

In particular, I know beforehand 1) the size of the arrays, as well as
2) the maximum value that each element of the array will reach.

All numbers involved are unsigned.

Are there standard hash functions for this use case?

Would it be better to use arrays rather than vectors to index the hash
map? In other words, if vec2index is a hashmap, is it better to do

vector<uint>v(num_elem);
v.assign(vec, vec + num_elem);
*num=vec2index[ v ];

or something inspired by

uint v[num_elem];
v = ...;
*num=vec2index[ v ];

?

Thanks!

Daniel

### Donovan Hide

Nov 5, 2011, 9:16:28 AM11/5/11
Hi,

I'm doing something similar with sparsetable:

typedef sparsetable<string,48> index_t;

Works well when you know the range of unsigned integer keys. The
integer is effectively the hash.

Cheers,
Donovan.

> --
>
>

### daniel

Nov 5, 2011, 11:54:48 AM11/5/11
Hi,

I don't understand your idea. Are you suggesting to use a sparsetable
as an index to a sparsehash?

What do you mean by

typedef sparsetable<string,48> index_t;

?

> Works well when you know the range of unsigned integer keys. The integer is effectively the hash.

My keys are tuples of unsigned integers, not unsigned integers. Which
integer should I use as hash?

Thanks,

Daniel

On Nov 5, 11:16 am, Donovan Hide <donovanh...@gmail.com> wrote:
> Hi,
>
> I'm doing something similar with sparsetable:
>
>
> typedef sparsetable<string,48> index_t;
>
> Works well when you know the range of unsigned integer keys. The
> integer is effectively the hash.
>
> Cheers,
> Donovan.
>

### Donovan Hide

Nov 5, 2011, 3:59:20 PM11/5/11
Hi,
sorry was in a bit of a rush when I sent that reply. Sparsetable is
the data structure that sits behind the dense and sparse hash table
implementations. It's useful as a memory saver when you only have
unsigned integer keys. The best choice depends on what your
constraints are, such as fast lookup by unsigned integer key, memory
efficiency or value uniqueness. One possible solution is to disregard
the value uniqueness and duplicate values for individual keys, ie:

sparsetable<unique_id_class,48> table(1024);
unique_id_class unique1("foo");
unique_id_class unique2("bar");
table.set(12,unique1);
table.set(1000,unique1);
table.set(99,unique2);

If you wanted to maintain uniqueness you could keep pointers in the
sparsetable instead and have an unordered_set to store the actual
values, or you could build a custom hash function for your
tuples/arrays:

http://stackoverflow.com/questions/1250599/how-to-unordered-settupleint-int

Depends a lot on what your use case is, as usual!

Cheers,
Donovan.

### Craig Silverstein

Nov 5, 2011, 7:21:14 PM11/5/11
} Please, is there any sample code involving arrays of integers?

You can do something like
sparse_hash_map<int, whatever> foo;
foo[5] = whatever;
foo[15] = whatever;

But I agree a sparsetable is more natural for this app.

craig

### daniel

Nov 6, 2011, 11:06:31 AM11/6/11
Dear Donovan,

Thanks for the reply. I still have some questions, though.

I want to map vectors of unsigned integers into unsigned integers.

In other words, I want to index a hash map (or a similar structure) by
a vector of unsigned integers (here, I'm referring to the mathematical
notion of vector, not C++ notion).

The vector of unsigned integers might be represented by a vector of
ints, array of ints, tuple of ints. I don't know the best choice. I
just know two things, in running time,

a) the length of the vector

b) the maximum value that each element of the vector might have

My constrains are mostly in terms of time, not space!

I do need to maintain uniqueness of the vectors that index the map...

Where would

> unique_id_class unique1("foo");

fit into my solution?

Thanks again!

Best regards,

Daniel

On Nov 5, 5:59 pm, Donovan Hide <donovanh...@gmail.com> wrote:
> Hi,
> sorry was in a bit of a rush when I sent that reply. Sparsetable is
> the data structure that sits behind the dense and sparse hash table
> implementations. It's useful as a memory saver when you only have
> unsigned integer keys. The best choice depends on what your
> constraints are, such as fast lookup by unsigned integer key, memory
> efficiency or value uniqueness. One possible solution is to disregard
> the value uniqueness and duplicate values for individual keys, ie:
>
> sparsetable<unique_id_class,48> table(1024);
> unique_id_class unique1("foo");
> unique_id_class unique2("bar");
> table.set(12,unique1);
> table.set(1000,unique1);
> table.set(99,unique2);
>
> If you wanted to maintain uniqueness you could keep pointers in the
> sparsetable instead and have an unordered_set to store the actual
> values, or you could build a custom hash function for your
> tuples/arrays:
>
> http://stackoverflow.com/questions/1250599/how-to-unordered-settuplei...
>
> Depends a lot on what your use case is, as usual!
>
> Cheers,
> Donovan.
>

### Donovan Hide

Nov 6, 2011, 12:16:44 PM11/6/11
Hi Daniel,

always enjoy a challenge! The ambiguous nature of the term vector is a
wonderful source of misunderstanding :)

To put a custom data structure in an unordered type map requires the
definition of both an equality and a hash function for that structure.
Equality functions are easy but hash functions are an endless source
of benchmarking fun. I've put a simple example here:

https://gist.github.com/1343184

which uses the suggested hashing from here:

While it always fun to write your own code, there are some libraries
which cover this arena:

Hope that helps!

Donovan.

### daniel

Nov 6, 2011, 8:06:06 PM11/6/11
Dear Donovan,

Best regards,

Daniel

On Nov 6, 3:16 pm, Donovan Hide <donovanh...@gmail.com> wrote:
> Hi Daniel,
>
> always enjoy a challenge! The ambiguous nature of the term vector is a
> wonderful source of misunderstanding :)
>
> To put a custom data structure in an unordered type map requires the
> definition of both an equality and a hash function for that structure.
> Equality functions are easy but hash functions are an endless source
> of benchmarking fun. I've put a simple example here:
>
> https://gist.github.com/1343184
>
> which uses the suggested hashing from here:
>
>
> While it always fun to write your own code, there are some libraries
> which cover this arena:
>
> http://www.boost.org/doc/libs/1_47_0/libs/numeric/ublas/doc/index.htmhttp://eigen.tuxfamily.org/
>
> Hope that helps!
>
> Donovan.

### daniel

Nov 6, 2011, 8:08:15 PM11/6/11
Just one more question, maybe a very basic one,

* if, while using hashmap, I try to insert two elements into the
hashmap whose keys map into the same hash, the class will
automatically take care of the collision, right?

Thanks!

Best regards,

Daniel

### Donovan Hide

Nov 6, 2011, 8:11:50 PM11/6/11
Hi Daniel,

yes, the equality function will compare the two values with the same
hash and take the appropriate action. This action varies between
hashmap implementations and is another source of endless fun :)