
Hashing Vertices


Mathias Heyer

Sep 10, 2000, 9:07:27 AM
Hello!

I need to turn "vertexed" primitives into indexed ones. So I wrote a
function "IndexForVertex" that searches all vertices already in the
vertex buffer. If it finds an already known vertex that matches both the
vertex coordinates and the texture coordinates, it returns that vertex's
index; otherwise it appends the new vertex to the end of the buffer and
returns the new index. As you can see, this is very brute-force and becomes
ever slower the more vertices are in the vertex buffer. So I thought about
using some sort of hashing to find an already known vertex fast.

But how do you hash vertices whose keys consist of 5 floating point values?
Is there a known good hashing function for this? I thought about dividing
the coordinates by a certain value, so that many vertices would "snap" to
one position and share the same hash address. Then you could convert the
coordinates to integers and use them as indices into a 3-dimensional array...
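
Something like this is what I had in mind (SNAP here is just a guessed
value; the resulting integer cell coordinates would then be used as the key):

#include <cmath>

// Snap each coordinate to a grid so that nearby vertices end up with the
// same integer cell coordinates. SNAP is just a guessed value.
const float SNAP = 0.001f;

struct CellKey { int x, y, z, u, v; };

CellKey MakeCellKey(float x, float y, float z, float u, float v)
{
    CellKey k;
    k.x = (int)std::floor(x / SNAP);
    k.y = (int)std::floor(y / SNAP);
    k.z = (int)std::floor(z / SNAP);
    k.u = (int)std::floor(u / SNAP);
    k.v = (int)std::floor(v / SNAP);
    return k;
}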

Any ideas?

Mathias


yooyo

Sep 10, 2000, 8:12:49 PM

It doesn't matter... it is just a precalc! Only one thing is important:
after the precalc, the mesh data are ready for compiled vertex arrays.
But if the driver doesn't support compiled vertex arrays, you should sort
the faces into strips and fans, and draw the mesh by calling glDrawElements
several times to draw the strips, fans and the remaining unoptimised triangles!
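
For illustration, the per-run draw calls might look roughly like this (the
IndexRun bookkeeping is made up for the example, and it assumes the vertex
arrays have already been set up with glVertexPointer and friends):

#include <GL/gl.h>

// One index run per primitive batch (strip, fan, or leftover triangles).
struct IndexRun {
    GLenum        mode;     // GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN or GL_TRIANGLES
    const GLuint *indices;
    GLsizei       count;
};

// Issue one glDrawElements call per run.
void DrawMesh(const IndexRun *runs, int numRuns)
{
    for (int i = 0; i < numRuns; ++i)
        glDrawElements(runs[i].mode, runs[i].count, GL_UNSIGNED_INT, runs[i].indices);
}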

yooyo/eastgate

gordon

Sep 10, 2000, 10:11:41 PM

It's a damn shame that your texture coordinates are floats and not ints, as
hashing the texture coordinates to narrow the field and then applying your
current approach would give you a serious speed kick. Assuming 256x256
textures, that instantly, in O(1), reduces your possible matches to 1/65536
of what they were, so unless you've got a HELL of a lot of vertices this
should sort you out.
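
Even with float UVs you can approximate that narrowing by snapping them to
texel coordinates first; a hypothetical sketch, assuming 256x256 textures
and u,v in [0,1):

int TexelBucket(float u, float v)
{
    // Snap the float UVs to integer texel coordinates on a 256x256 texture.
    int ui = (int)(u * 256.0f) & 255;
    int vi = (int)(v * 256.0f) & 255;
    return vi * 256 + ui;               // one bucket per texel, 0..65535
}

Keeping one list of candidate vertex indices per bucket means the existing
linear search only ever runs over the vertices that share a texel.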


s...@cs.stanford.edu

Sep 10, 2000, 10:17:54 PM

Yes, hashing is a fairly good approach. Basically, you first want to
quantize the vertex coordinates into buckets, making the buckets large
enough to account for any floating point error or uncertainty in the
points:

int xbucket = (int) floor(vert[i].x / bucketsize);
int ybucket = (int) floor(vert[i].y / bucketsize);
int zbucket = (int) floor(vert[i].z / bucketsize);

Next, you compute a hash value from these buckets. One function that
seems to work reasonably well is the following:

int hashvalue = MAGIC1*xbucket + MAGIC2*ybucket + MAGIC3*zbucket;
hashvalue %= hashtable_size;
if (hashvalue < 0)
    hashvalue += hashtable_size;

There are two options for choosing MAGIC1, MAGIC2, MAGIC3, and
hashtable_size. Option one is to make MAGIC1, MAGIC2, and MAGIC3 large
primes (greater than the largest expected values of {x,y,z}bucket), and
hashtable_size can be anything less than the minimum of these three.
The other option is to make hashtable_size prime, in which case the
multipliers should be less than hashtable_size and relatively prime to
each other.

Finally, you just insert the vertices in a (regular, 1-dimensional) hash
table, using the above function for the keys. All the regular design
decisions for hash tables (choice of table size, chaining vs. open
hashing, etc.) apply.
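
Putting the pieces together, a minimal sketch with chained buckets might
look like this (the MAGIC constants, table size and bucketsize are
placeholder choices, not recommendations):

#include <cmath>
#include <vector>

const int   MAGIC1 = 100003, MAGIC2 = 100019, MAGIC3 = 100043;  // large primes
const int   hashtable_size = 99991;        // anything below the smallest MAGIC
const float bucketsize = 1e-4f;

struct Vert { float x, y, z; };

int HashVert(const Vert &v)
{
    int xbucket = (int)std::floor(v.x / bucketsize);
    int ybucket = (int)std::floor(v.y / bucketsize);
    int zbucket = (int)std::floor(v.z / bucketsize);

    int hashvalue = MAGIC1*xbucket + MAGIC2*ybucket + MAGIC3*zbucket;
    hashvalue %= hashtable_size;
    if (hashvalue < 0)
        hashvalue += hashtable_size;
    return hashvalue;
}

// One chain of vertex indices per hash slot; duplicates are resolved by
// comparing against the (few) vertices already in the chain.
std::vector<std::vector<int> > table(hashtable_size);
std::vector<Vert> verts;

int FindOrAdd(const Vert &v)
{
    std::vector<int> &chain = table[HashVert(v)];
    for (size_t i = 0; i < chain.size(); ++i) {
        const Vert &w = verts[chain[i]];
        if (w.x == v.x && w.y == v.y && w.z == v.z)   // or an epsilon test
            return chain[i];
    }
    verts.push_back(v);
    chain.push_back((int)verts.size() - 1);
    return (int)verts.size() - 1;
}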

- Szymon Rusinkiewicz
s...@cs.stanford.edu

Pierre Terdiman

Sep 10, 2000, 10:33:34 PM
Errr........... and why do you use hashing in the first place? If you don't
want to play with magic values all over the place, you can just sort the
primitives and read them back in order to find redundant vertices and create
the indices.
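
A sketch of that sort-and-scan pass (the Vertex layout and the
exact-equality test are assumptions):

#include <algorithm>
#include <vector>

struct Vertex { float x, y, z, u, v; };

struct Tagged {
    Vertex v;
    int    original;   // position in the unindexed stream
};

static bool LessVertex(const Tagged &a, const Tagged &b)
{
    if (a.v.x != b.v.x) return a.v.x < b.v.x;
    if (a.v.y != b.v.y) return a.v.y < b.v.y;
    if (a.v.z != b.v.z) return a.v.z < b.v.z;
    if (a.v.u != b.v.u) return a.v.u < b.v.u;
    return a.v.v < b.v.v;
}

static bool SameVertex(const Vertex &a, const Vertex &b)
{
    return a.x == b.x && a.y == b.y && a.z == b.z && a.u == b.u && a.v == b.v;
}

// Sort copies of the vertices so duplicates end up adjacent, then collapse
// them in one pass while recording each original vertex's new index.
void BuildIndices(const std::vector<Vertex> &in,
                  std::vector<Vertex> &unique, std::vector<int> &indices)
{
    std::vector<Tagged> tmp(in.size());
    for (size_t i = 0; i < in.size(); ++i) { tmp[i].v = in[i]; tmp[i].original = (int)i; }
    std::sort(tmp.begin(), tmp.end(), LessVertex);

    unique.clear();
    indices.resize(in.size());
    for (size_t i = 0; i < tmp.size(); ++i) {
        if (i == 0 || !SameVertex(tmp[i].v, tmp[i-1].v))
            unique.push_back(tmp[i].v);
        indices[tmp[i].original] = (int)unique.size() - 1;
    }
}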

Pierre

Mathias Heyer <Mathia...@in.stud.tu-ilmenau.de> wrote in message
news:8pg1bs$n99$1...@piggy.rz.tu-ilmenau.de...

s...@cs.stanford.edu

Sep 11, 2000, 2:14:32 AM
In comp.graphics.algorithms Pierre Terdiman <p.ter...@wanadoo.fr> wrote:
> Errr........... and why do you use hashing in the first place? If you don't
> want to play with magic values all over the place, you can just sort the
> primitives and read them back in order to find redundant vertices and create
> the indices.

The typical reason one might use hashing is if the vertices that are
supposed to be "the same" really are at slightly different, though
nearby, positions. This might be because of various errors (e.g.
floating-point roundoff) creeping in earlier in the pipeline. Another
benefit of the hash-table scheme relative to sorting is that it has a
complexity of O(n), rather than O(n log n). For very large amounts of
data, this might make a difference...

- Szymon Rusinkiewicz
s...@cs.stanford.edu

Robert Schneiders

Sep 11, 2000, 3:37:26 AM

s...@cs.stanford.edu wrote:

If your code is C++, you can use the std::map standard template library hash
table. All you need to do is define a "<" relation; if necessary, one that
neglects small perturbations.
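
Roughly what that might look like (the GRID constant is an assumption;
snapping to a grid first keeps the "<" a valid strict weak ordering, which a
comparison that directly ignores small differences would not be):

#include <cmath>
#include <map>

struct VertexKey {
    long x, y, z, u, v;   // grid cell coordinates

    VertexKey(float px, float py, float pz, float pu, float pv)
    {
        const float GRID = 1e-4f;   // assumed tolerance
        x = (long)std::floor(px / GRID);
        y = (long)std::floor(py / GRID);
        z = (long)std::floor(pz / GRID);
        u = (long)std::floor(pu / GRID);
        v = (long)std::floor(pv / GRID);
    }

    // The "<" relation std::map needs: lexicographic order on the cells.
    bool operator<(const VertexKey &o) const
    {
        if (x != o.x) return x < o.x;
        if (y != o.y) return y < o.y;
        if (z != o.z) return z < o.z;
        if (u != o.u) return u < o.u;
        return v < o.v;
    }
};

// key -> index of the vertex already stored in the output buffer
typedef std::map<VertexKey, int> VertexIndexMap;

Lookups and insertions are then O(log n) per vertex.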

Best regards,

Robert


Robert Schneiders
MAGMA GmbH
Aachen, Germany


Pierre Terdiman

Sep 11, 2000, 11:04:48 AM
Uh-oh....

1) Then we're speaking about welding nearby vertices, not really about
transforming a list of vertices into an indexed one. That is not the same
problem, and indeed in the first case a hash is good. Nonetheless I think
the original poster only wanted to deal with data organisation, and with
how to please, say, a DrawIndexedPrimitive call.

2) ...sure, but it's been ages since I last used an O(n log n) sort. Linear
sorts are common these days.

Anyway.

<s...@cs.stanford.edu> wrote in message
news:8pht88$b2d$1...@nntp.Stanford.EDU...

Johan Hanson

Sep 11, 2000, 1:31:45 PM
Mathias Heyer wrote:
>
> I need to turn "vertexed" Primitives into indexed ones.

[...]


> So I thought about using some sort of hashing to find an already known vertex fast.
> But how do you hash vertices whose keys consist of 5 floating point values?

First, I would suggest that you find out the range that the values will ever
have, and compute constants that scale each value so that its range fits
within a 32-bit integer before converting to integer. Then XOR them together
(the '^' operator in C). There you have an integer hash sum. You might get
better results if you fit the values within smaller numbers of bits and
concatenate them instead.

Then take the modulus (the remainder of division, the '%' operator) with the
size of your hash table to get the index.
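
As a rough sketch (the value ranges and scale factors are assumptions; in
practice you would derive them from the model's actual bounding box):

// Assumed ranges: positions in [-1000, 1000], texture coords in [0, 1].
// Each value is scaled so its range fills most of a 32-bit integer.
unsigned int HashVertex(float x, float y, float z, float u, float v,
                        unsigned int table_size)
{
    unsigned int hx = (unsigned int)((x + 1000.0f) * 2000000.0f);
    unsigned int hy = (unsigned int)((y + 1000.0f) * 2000000.0f);
    unsigned int hz = (unsigned int)((z + 1000.0f) * 2000000.0f);
    unsigned int hu = (unsigned int)(u * 4000000000.0f);
    unsigned int hv = (unsigned int)(v * 4000000000.0f);

    unsigned int hash = hx ^ hy ^ hz ^ hu ^ hv;   // XOR the integers together
    return hash % table_size;                     // '%' with the table size gives the index
}

One weakness of the plain XOR is that components with equal scaled values
cancel each other out (e.g. a vertex with x == y), which is one reason the
concatenate-smaller-bit-fields variant can distribute better.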

Greetings
--
/ Johan
-- jo...@tiq.com -- http://www.obsession.se/johan/ --

Gino van den Bergen

Sep 12, 2000, 3:21:40 AM

Robert Schneiders wrote:

>
>
> If your code is C++, you can use the std::map standard template library hash
> table.
> All you need to do is to define a "<" relation, if necessary, one that neglects
> small perturbations.
>

std::map is not a hash map but a balanced binary search tree (a red-black
tree). There is a hash map in the STL (std::hash_map), which requires the
definition of a hash function for non-primitive types. std::hash_map should
be more lightweight and, if a proper hash function is used, faster than
std::map, although I haven't found significant evidence of this in my
applications. That is to say, std::map performs quite well.


--
Gino van den Bergen
Not a Number
www.blender.nl


Konstantin Baumann

Sep 12, 2000, 3:30:33 AM

std::hash_* is not part of standard C++. It is an extension introduced by SGI. Some compilers ship this implementation, especially GNU's g++, but std::hash_* is not available in VC++ by default, for example.

But there is an STL port (http://www.stlport.org/), based on SGI's STL implementation, that can be used with VC++ (and many other compilers, too).

Kosta

Erik Max Francis

Sep 12, 2000, 3:48:11 AM
Gino van den Bergen wrote:

> There is a hash map in the STL (std::hash_map), ...

Those concerned about portability should note that hash_map is a part of
the "STL" (the name of the conglomeration of template utilities that
have been floating around for some time), but didn't make it into the
C++ Standard along with some other "STL" template classes; usually when
people say STL they mean not the pre-Standard, amorphously defined "STL"
but rather the template support classes associated with Standard C++.

That being said, though, you can get very portable implementations of
hash_map from a variety of places.

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, US / 37 20 N 121 53 W / ICQ16063900 / &tSftDotIotE
/ \ One sword keeps the other in the sheath.
\__/ George Herbert
Kepler's laws / http://www.alcyone.com/max/physics/kepler/
A proof of Kepler's laws.

Hans-Bernhard Broeker

Sep 12, 2000, 8:33:43 AM
In comp.graphics.algorithms s...@cs.stanford.edu wrote:
[...]

> The typical reason one might use hashing is if the vertices that are
> supposed to be "the same" really are at slightly different, though
> nearby, positions.

But if I'm not terribly mistaken, this will never work reliably.
Hashing, by its very nature, will sometimes fail to yield the same
hash value for two very nearby points. That's because the hash
function implicitly has to divide the 5-parameter input space into
static 'cells', whose walls will always be at the same positions. Now,
if one of the vertices falls very close to one such wall, another copy
of it may easily fall onto the other side of it, ending up in the
neighboring cell. The more 'hashy' your hash function, the more walls
you get (because the set of points yielding the same hash value will
fall apart into many small, unconnected areas), i.e. the more probable
the problematic case becomes.

The hash function will also make it quite hard to check out all the
neighboring cells, which you would have to do to avoid that problem.

At the base of this is that you simply cannot 'sort' points from a
more than 1-dimensional space without causing distorted sorting (some
originally very close pairs of points ending up much farther away from
each other than some very far-spaced pairs).

I think it's really better to use a dynamically built hierarchical
datastructure over the full 5-dimensional input parameter space, here.
Ideally (and if you can spare the time), it'd be an incrementally
built 5-D Voronoi 'diagram', described by a BSP tree in 5 dimensions.

--
Hans-Bernhard Broeker (bro...@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Christer Ericson

Sep 12, 2000, 1:36:03 PM
On 12 Sep 2000 12:33:43 GMT, Hans-Bernhard Broeker
<bro...@physik.rwth-aachen.de> wrote:

>In comp.graphics.algorithms s...@cs.stanford.edu wrote:
>[...]
>
>> The typical reason one might use hashing is if the vertices that are
>> supposed to be "the same" really are at slightly different, though
>> nearby, positions.
>
>But if I'm not terribly mistaken, this will never work reliably.
>Hashing, by its very nature, will sometimes fail to yield the same
>hash value for two very nearby points.

>[...]


>The hash function will also make it quite hard to check out all the
>neighboring cells, which you would have to do to avoid that problem.

Your first point is valid, but all you have to do to fix this is
to grid your data points into cells of some size larger than the
epsilon value for which you consider them to be equal. Then, you
hash on the *cell* coordinates, not the actual data value.

Checking for unique data points (within the given epsilon) is
now quite easy. You just have to test against both the directly
mapped cell and its neighbours, the neighbours being easily
located as you have the cell coordinates for them.

E.g. for a 2D search, assuming your point (p,q) maps to cell
(x,y) under the mapping f(p,q)->(x,y), you only need to
examine the unique cells f(p±epsilon, q±epsilon). At most
this will be 4 cells; typically, though, just one cell.
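
A 2D sketch of this, for concreteness (CELLSIZE, EPSILON and the container
choice are placeholders; a real implementation would hash the cell
coordinates into a fixed-size table instead of using std::map):

#include <cmath>
#include <map>
#include <utility>
#include <vector>

const float CELLSIZE = 0.01f;    // larger than the weld epsilon
const float EPSILON  = 0.001f;

struct Point { float p, q; };

typedef std::pair<int, int> Cell;
std::map<Cell, std::vector<int> > grid;   // cell -> indices into 'points'
std::vector<Point> points;

Cell CellOf(float p, float q)
{
    return Cell((int)std::floor(p / CELLSIZE), (int)std::floor(q / CELLSIZE));
}

// Returns the index of an existing point within EPSILON of (p,q), or adds it.
// The epsilon box around (p,q) touches at most 2x2 = 4 cells, typically 1.
int FindOrAddPoint(float p, float q)
{
    Cell lo = CellOf(p - EPSILON, q - EPSILON);
    Cell hi = CellOf(p + EPSILON, q + EPSILON);

    for (int y = lo.second; y <= hi.second; ++y)
        for (int x = lo.first; x <= hi.first; ++x) {
            std::map<Cell, std::vector<int> >::iterator it = grid.find(Cell(x, y));
            if (it == grid.end()) continue;
            for (size_t i = 0; i < it->second.size(); ++i) {
                const Point &pt = points[it->second[i]];
                if (std::fabs(pt.p - p) <= EPSILON && std::fabs(pt.q - q) <= EPSILON)
                    return it->second[i];
            }
        }

    Point pt = { p, q };
    points.push_back(pt);
    grid[CellOf(p, q)].push_back((int)points.size() - 1);
    return (int)points.size() - 1;
}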


>At the base of this is that you simply cannot 'sort' points from a
>more than 1-dimensional space without causing distorted sorting (some
>originally very close pairs of points ending up much farther away from
>each other than some very far-spaced pairs).

This is irrelevant, as the hash approach I outlined above
perfectly solves the problem (modulo the quality of your
cell-to-hashbin hash function).


>I think it's really better to use a dynamically built hierarchical
>datastructure over the full 5-dimensional input parameter space, here.
>Ideally (and if you can spare the time), it'd be an incrementally
>built 5-D Voronoi 'diagram', described by a BSP tree in 5 dimensions.

I disagree. This will be much slower and much more cumbersome to
both implement and debug, compared to the hash approach.

You simply cannot beat the hash table approach for this problem.


Christer Ericson
SCEA, Santa Monica

rich...@omnieng.co.uk

Sep 19, 2000, 3:00:00 AM
I used the hashing method when processing 3D models. The vertices had an
(x,y,z) position, a vertex normal, and texture (u,v) coords. With 8 floats
per vertex you would have a problem sorting these into any sensible order.

Hashing was very useful when processing 3D models with smoothing groups,
as it returns new vertices when the vertex normal has changed, even if
the position is the same.

It was also useful when subdividing a large 3D world model into blocks
for faster culling. For each block you can find the minimum list of
vertices required for drawing its triangles and store the block as
an indexed face list.

So I would definitely recommend the hashing method.
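
A hypothetical sketch of such a key: the normal and UVs are folded into the
comparison and the hash, so a position re-used with a different normal (a
smoothing-group boundary) automatically becomes a new vertex:

#include <cstring>

// All eight floats that distinguish one vertex from another.
struct FullVertex {
    float x, y, z;      // position
    float nx, ny, nz;   // vertex normal
    float u, v;         // texture coordinates
};

// Two vertices only merge when every attribute matches bit for bit.
bool SameFullVertex(const FullVertex &a, const FullVertex &b)
{
    return std::memcmp(&a, &b, sizeof(FullVertex)) == 0;
}

// Simple hash over all eight values, treating each float as raw bits.
unsigned int HashFullVertex(const FullVertex &v, unsigned int table_size)
{
    const float *f = &v.x;       // the eight floats are laid out contiguously
    unsigned int h = 0;
    for (int i = 0; i < 8; ++i) {
        unsigned int bits;
        std::memcpy(&bits, &f[i], sizeof(bits));
        h = h * 31u + bits;
    }
    return h % table_size;
}

This matches exact bit patterns, which is fine when the duplicates come
straight from the exporter; if the input can contain tiny numerical
differences, the cell-quantisation tricks discussed earlier in the thread
apply.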


