Inverse Perfect Hashing

3 views
Skip to first unread message

Han de Bruijn

unread,
Mar 15, 2005, 5:13:20 AM3/15/05
to
This was meant to be a question. But meanwhile, it has turned into an
answer. Anyway, I've decided to share it with this newsgroup.

Imagine a rectangular region which is placed quite arbitrarily inside
an image and has to be filled with pixels. A picture says more than a
thousand words:

Like this:

X
X X X
X X X
X X X
X X X
X

Or this:

X
X X X
X X X X X
X X X X X X X
X X X X X X X X X
X X X X X X X X X
X X X X X X X
X X X X X
X X X
X

Or like this:
X X
X X X X X
X X X X X
X X X X X
X X X X X
X X

Or this:
X X X X X X X X X
X X X X X X X X X X X X X X X X X X X
X X X X X X X X X X X X X X X X X X X
X X X X X X X X X X X X X X X X X X X
X X X X X X X X X

Description of the problem
--------------------------
How to obtain a list of all pixels which are inside the rectangle,
where each pixel occurs only _once_ in the list. And yes, I want my
algorithm to be as fast as possible.

Solution of the problem
-----------------------
The rectangle is transformed into a rectangle which coincides with
a cartesian coordinate system at the left and bottom edges. This is
a rotation. Each pixel is transformed into the rotated coordinates,
giving (x,y) floating point values. When rounding these f.p. values
into integer values, memory locations may be obtained in principle.

Such a method is commonly called _hashing_.

It can be demonstrated (make a drawing) that at most two collisions
can happen when rounding these (x,y) coordinates. The big trick is
now that all lengths may be multiplied with a factor sqrt(2) or so.
In this case, it can be shown that all collisions have disappeared.
With other words: we have a _perfect_ hash then.

Therefore what we need is a table which is sqrt(2) times sqrt(2) =
twice as big as the minimal size needed for storing all pixel values
"without gaps". The advantage of the overhead by a factor 2 is that
no searches are needed whatsoever.

A piece of (Delphi/Pascal) code to illustrate the idea:

const
safety : double = 1.414215;
var
p,q : vector { end-points of line }
d : double { half linewidth }
begin
thick := Round(d*safety);
L := sqrt(sqr(q.x-p.x)+sqr(q.y-p.y));
long := Round(L*safety);
c := (q.x-p.x)/L; s := (q.y-p.y)/L;
{ c(osine) and s(ine) of rotation }

{ Now for each pixel: }
r.x := c*(z.x-p.x) + s*(z.y-p.y);
r.y := -s*(z.x-p.x) + c*(z.y-p.y);

inside := (0 < r.x) and (r.x < L)
and (-d < r.y) and (r.y < +d);
if not inside then Exit;

g.x := Round(safety*r.x); g.y := Round(safety*r.y);
place := (g.y + thick) * long + g.x; { Perfect hash }

Such a "Perfect Hash" can always be inverted. This is evident, because
if there is a one-to-one mapping of a pixel onto an adress then such a
mapping can always be inverted to a mapping of an adress onto a pixel.

The procedure is as follows. Traverse all adresses in our twice-too-big
table; and apply the inverse of all transformations in reverse order.
Inverse rounding in the first place, corresponding to plus or minus 1/2
with both coordinates, resulting in a little square with four vertices.
Then dividing all coordinates by sqrt(2) instead of a multiply. Rotate
over minus the angle, perform a translation backwards. The end-result
of our floating point procedure is the little square with yes or no a
pixel in it. If the pixel is within that little square, then it can be
found by rounding the coordinates. However, the reverse is not true !
And the latter fact enables us to find all pixels which correspond to
the filled adresses in the inverse problem.

But look what we have obtained now ! This actually is a recursion-free
Flood-Fill procedure of arbitrary rectangles in an image. The overhead
measured in computation time is at most twice the theoretical minimum,
that is: iff each pixel could be found immediately, which, in general,
is not the case.

Han de Bruijn

Reply all
Reply to author
Forward
0 new messages