
Fortran to find nearest point from set in 3-D


David.P...@csiro.au

Apr 15, 2006, 7:16:11 PM
I'm looking for Fortran source code if possible.

Given set A with roughly 10,000 (x,y,z) triples and set B with roughly
1,000,000 (x,y,z) triples.

For each point in set A find the nearest point from set B.

What's the quickest algorithm?

-------------------------------

A very slow algorithm would be to check all 1,000,000 triples in B for
every point in A.

A slightly better algorithm would be to sort B by (x) in advance. If I
were programming it from scratch, that's what I'd do.

A slightly better algorithm than that would be to allocate every point
in B to a 3-D cubic grid. Find the grid node closest to each point from
A and search that and the surrounding 26 grid cubes for the closest.

I seem to remember that there's an even better algorithm than that,
using a hierarchy of 3-D grids.
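
For concreteness, the brute-force baseline can be sketched as follows
(shown in C for brevity; the Fortran version is just a pair of nested DO
loops, and all names here are illustrative):

```c
#include <float.h>
#include <stddef.h>

/* Brute-force nearest neighbour: for one query point a, scan all nb
   points of B (stored as nb consecutive (x,y,z) triples).
   O(|A|*|B|) overall -- the "very slow" baseline above. */
size_t nearest_brute(const double a[3], const double *b, size_t nb)
{
    size_t best = 0;
    double best_d2 = DBL_MAX;
    for (size_t i = 0; i < nb; i++) {
        double dx = a[0] - b[3*i + 0];
        double dy = a[1] - b[3*i + 1];
        double dz = a[2] - b[3*i + 2];
        double d2 = dx*dx + dy*dy + dz*dz;   /* squared distance: no sqrt */
        if (d2 < best_d2) { best_d2 = d2; best = i; }
    }
    return best;
}
```

Comparing squared distances is safe because x**2 is monotone for
non-negative x, so the square root is never needed.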

Rob McDonald

Apr 15, 2006, 10:09:50 PM
To search a 1-D list, the best you can do is use a sorted list and a
binary search. This is efficiently implemented (especially when you
need to add/remove points) using a binary tree.

Similarly, you need an extension of a binary tree to 3-D. So, you may
want to use an octree (2^3 = 8 children per node), or another
generalization of a tree, such as a k-d tree, an R-tree, an R+ tree,
a k-d-B tree, etc.

I don't have any Fortran source lying around, but that should give you
some keywords to google (n-dimensional binary tree or
multidimensional binary tree, as well as the structures above). At a
minimum, you'll find a pile of technical papers describing various
algorithms; this is a classic problem of computer science.

Rob

Ronald Bruck

Apr 15, 2006, 10:51:13 PM
In article <1145142971.0...@t31g2000cwb.googlegroups.com>,
<David.P...@csiro.au> wrote:

I've not thought about the problem before (which is strange--it's an
obvious proximal-point problem), but it does occur to me that once
you've done one point in A, and are about to do another, MOST of B can
be eliminated at a glance. Roughly, you're looking near the surface of
the sphere with radius R + eps, where R was the last proximal distance
found and eps is the distance between two points of A. "At a glance"
requires more computational effort than glancing, of course, but
presumably you've already computed the distances from the grid cubes to
the old point, and most of the cubes which were eliminated at the last
point are already eliminated for the current point.

And if it's good enough to do this in the sup (max) norm instead of the
L^2 norm... :o)

It would help tremendously if you could be sure that B or A was a
discrete approximation to a convex set... Do you know anything more
about the sets?

--
Ron Bruck

David.P...@csiro.au

Apr 16, 2006, 10:18:29 PM
octree I know about, but k-d tree etc. are new to me. By the way, I
will have to do this 'proximal-point' problem hundreds of times with
different sets of points B.

> MOST of B can be eliminated at a glance. ... "At a glance" requires more computational effort than glancing

I figure that, too.

> presumably you've already computed the distances from the grid cubes to
> the old point ... discrete approximation to a convex set

Huh?

> Do you know anything more about the sets?

Yes. Set A (the small one) comes from the triangular discretisation on
a set of surfaces and Set B (the big one) comes from the corners of
a tetrahedral discretisation of 3-D space. Unfortunately all the
connectivity has been lost in the form that I have access to.

This means that the points in B are roughly equidistant; I may
even be able to give an 'a priori' maximum distance apart. If it helps,
call it 'D' for distance. I can also put hard upper and lower bounds on
each of the x, y and z of the triples (x,y,z) in B.

> And if it's good enough to do this in the sup (max) norm instead of the
> L^2 norm

L^2 won't take that much more time provided I remember not to take
square roots :-)

Rob McDonald

Apr 16, 2006, 11:30:38 PM

David.P...@csiro.au wrote:
> octree I know about, but k-d tree etc. are new to me. By the way, I
> will have to do this 'proximal-point' problem hundreds of times with
> different sets of points B.
>
> Yes. Set A (the small one) comes from the triangular discretisation on
> a set of surfaces and Set B (the big one) comes from the corners of
> a tetrahedral discretisation of 3-D space. Unfortunately all the
> connectivity has been lost in the form that I have access to.
>
> This means that the points in B are roughly equidistant; I may
> even be able to give an 'a priori' maximum distance apart. If it helps,
> call it 'D' for distance. I can also put hard upper and lower bounds on
> each of the x, y and z of the triples (x,y,z) in B.
>

k-d tree is basically a binary tree where you rotate through each of
the dimensions on each level down the tree. So, the top level splits x
in half, the next level y, then z, then x again. A quick search should
lead to a lot of papers that discuss exactly the algorithms you need.
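
That axis-cycling construction can be sketched as follows (a rough C
illustration under assumed names; a production version would use an
index array and a linear-time median select rather than sorting the
points at every level):

```c
#include <stdlib.h>

/* One node of a 3-d tree: split on axis = depth % 3, with the median
   point stored at the node and subtrees on either side.  A sketch only;
   not reentrant (qsort comparator uses a global axis). */
typedef struct kdnode {
    double pt[3];
    struct kdnode *left, *right;
} kdnode;

static int cmp_axis;   /* axis currently used by the qsort comparator */
static int cmp_pts(const void *a, const void *b)
{
    double da = ((const double *)a)[cmp_axis];
    double db = ((const double *)b)[cmp_axis];
    return (da > db) - (da < db);
}

/* Build recursively from pts[0..n-1] (each 3 doubles), cycling x,y,z:
   the top level splits x in half, the next level y, then z, then x. */
kdnode *kd_build(double *pts, size_t n, int depth)
{
    if (n == 0) return NULL;
    cmp_axis = depth % 3;
    qsort(pts, n, 3 * sizeof(double), cmp_pts);
    size_t m = n / 2;                       /* median along current axis */
    kdnode *node = malloc(sizeof *node);
    for (int k = 0; k < 3; k++) node->pt[k] = pts[3*m + k];
    node->left  = kd_build(pts, m, depth + 1);
    node->right = kd_build(pts + 3*(m + 1), n - m - 1, depth + 1);
    return node;
}
```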

If you had the connectivity information, then you could be clever about
subsequent points. For each connected point in the small set, you
would have a good starting point. Without that information, you aren't
going to do any better than an octree or k-d tree binary search.

Even for a large problem, an octree approach should be really fast.
For a binary tree, each doubling of the number of points only requires
one additional bisection. An octree will scale similarly.

The bounds on B aren't much help beyond giving you a start on the bins
for your tree structure. If the points are approximately uniformly
distributed, then you could make a good guess at the required sizes of
the bins and do it all with some hackish statically allocated arrays.

Do you expect the closest points in A and B to be nearly coincident?
Might they be exactly coincident? If you could count on the pairs of
closest points being very nearly identical, there might be some clever
things you could do such as calculate a checksum; then you're searching
for an exact match in a single 'dimension'.

Also, if you expect the closest match to be coincident points, you can
break the problem down using bounding boxes. For example, find the
bounding box of set A. Expand the bounds by a factor of the expected
point spacing. Then, reduce set B to the points falling within the
box.

This can be carried further. Split the A box in half, grab the
corresponding A and B sets. Parallelize the problem to two processors.

This essentially mimics the octree approach, but on a somewhat ad-hoc
basis, made easier by expecting to find nearly coincident points....
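
A minimal sketch of that bounding-box pre-filter (C, with made-up
function and parameter names):

```c
#include <stddef.h>

/* Keep only the points of B inside the bounding box of A, expanded by
   margin (e.g. a multiple of the expected point spacing D).  Points are
   flat arrays of (x,y,z) triples; indices of survivors go to out[].
   Returns the number of points kept. */
size_t filter_by_box(const double *a, size_t na,
                     const double *b, size_t nb,
                     double margin, size_t *out)
{
    double lo[3], hi[3];
    for (int k = 0; k < 3; k++) { lo[k] = a[k]; hi[k] = a[k]; }
    for (size_t i = 1; i < na; i++)          /* bounding box of A */
        for (int k = 0; k < 3; k++) {
            double v = a[3*i + k];
            if (v < lo[k]) lo[k] = v;
            if (v > hi[k]) hi[k] = v;
        }
    for (int k = 0; k < 3; k++) { lo[k] -= margin; hi[k] += margin; }

    size_t kept = 0;                          /* scan B against the box */
    for (size_t i = 0; i < nb; i++) {
        int inside = 1;
        for (int k = 0; k < 3; k++) {
            double v = b[3*i + k];
            if (v < lo[k] || v > hi[k]) inside = 0;
        }
        if (inside) out[kept++] = i;
    }
    return kept;
}
```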

Good luck,

Rob

Martin Blume

Apr 17, 2006, 11:09:48 AM
<David.P...@csiro.au> schrieb
> I'm looking for Fortran source code if possible.
>
> Given set A with roughly 10,000 (x,y,z) triples
> and set B with roughly 1,000,000 (x,y,z) triples.
>
> For each point in set A find the nearest point
> from set B.
>
> What's the quickest algorithm?
>
Chapter 26 "Range Searching" and
chapter 28 "Closest-Point Problems" of the book
"Algorithms in C++" by Robert Sedgewick might help
you with the theory of kD trees and Voronoi diagrams.
Although the "source code" (rather, code snippets)
is in C++, it should be portable to FORTRAN.

HTH
Martin

beli...@aol.com

Apr 17, 2006, 4:40:32 PM
David.P...@csiro.au wrote:
> I'm looking for Fortran source code if possible.
>
> Given set A with roughly 10,000 (x,y,z) triples and set B with roughly
> 1,000,000 (x,y,z) triples.
>
> For each point in set A find the nearest point from set B.
>
> What's the quickest algorithm?
>
> -------------------------------
>
> A very slow algorithm would be to check all 1,000,000 triples in B for
> every point in A.

Suppose v1 and v2 are both 3-d vectors of REALs. I wonder if Fortran
compilers, given the expression

if (sum((v1-v2)**2) > c)

are "smart" enough to recognize that if (v1(1)-v2(1))**2 > c,
computations involving v1(2:3) and v2(2:3) can be skipped.

I also wonder if coding this logic explicitly would result in a slower
program than just doing the comparison as above. I may check later when
I have time.
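
The explicit early-exit logic in question might look like this (a C
sketch with a hypothetical function name; whether it actually beats the
plain whole-vector expression is exactly what would need benchmarking):

```c
/* Per-component early exit for a squared-distance threshold test:
   accumulate (v1(k)-v2(k))**2 and bail out as soon as the partial sum
   exceeds c, skipping the remaining components.  Returns 1 if the
   squared distance exceeds c, 0 otherwise. */
int farther_than(const double v1[3], const double v2[3], double c)
{
    double d2 = 0.0;
    for (int k = 0; k < 3; k++) {
        double d = v1[k] - v2[k];
        d2 += d * d;
        if (d2 > c) return 1;   /* remaining components can be skipped */
    }
    return 0;
}
```

The branch may well cost more than the two multiply-adds it saves on
modern hardware, which is why measuring both versions is worthwhile.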

Btw a similar question was asked recently in comp.lang.fortran -- one
can Google "kd tree" there.

(comp.lang.fortran added to follow-ups)

Gib Bogle

Apr 17, 2006, 6:16:34 PM
beli...@aol.com wrote:

I've attacked this problem in the past in the following way (a 2D
problem in my case, though I'll describe the 3D version here).
First divide the region into cubes, given, say, by C(i,j,k), i=1,..,Nx,
j=1,..,Ny, k=1,..,Nz, where C(i,j,k) spans the region
Xmin + (i-1)*delta < x <= Xmin + i*delta
Ymin + (j-1)*delta < y <= Ymin + j*delta
Zmin + (k-1)*delta < z <= Zmin + k*delta
and delta, Nx, Ny, Nz are suitably chosen.

Then for each point in B determine the cube it falls in, i.e. the values
of (i,j,k). This is very fast. Create a list of B points for each
cube, i.e. for each triple (i,j,k).

Then when looking for the nearest point in set B to a given point in set
A that is in the cube (ia,ja,ka), you just need to examine the B points
in the cube (ia,ja,ka) and the adjacent cubes (27 altogether, in
general). If delta is badly chosen, or if your points are very
irregularly distributed, you might not find any B points in these cubes,
and then you'll need to look at the next layer of surrounding cubes,
etc. - the method starts to lose its shine a bit if this happens. I
guess delta should be chosen so that every cube contains at least one B
point.

David.P...@csiro.au

Apr 19, 2006, 2:19:14 AM
> Btw a similar question was asked recently in comp.lang.fortran -- one
> can Google "kd tree" there.

Thanks from David Paterson. I've come to the conclusion that a "kd
tree" is excellent if I want to find the approximate nearest neighbour
(which is good enough for me) but not if I want the exact nearest
neighbour.

Have you seen a Fortran implementation of a "kd tree"?

David.P...@csiro.au

Apr 19, 2006, 2:27:43 AM
> Such as a k-d tree, an R tree, R+ tree, k-b-d tree, etc.

I've read references on all of those now. If I can keep everything in
memory (as seems very likely), then a k-d tree looks good if I only
need the approximate nearest neighbour but looks very messy if I were
to want the exact nearest neighbour.

> The bounds on B aren't much help beyond giving you a start on the bins
> for your tree structure. If the points are approximately uniformly
> distributed, then you could make a good guess at the required sizes of
> the bins and do it all with some hackish statically allocated arrays.

Yes.

> Do you expect the closest points in A and B to be nearly coincident?

No closer than random.

> "Algorithms in C++" by Robert Sedgewick

Ta, I'll follow that up.

Rob McDonald

Apr 19, 2006, 8:03:39 AM

David.P...@csiro.au wrote:
> > Such as a k-d tree, an R tree, R+ tree, k-b-d tree, etc.
>
> I've read references on all of those now. If I can keep everything in
> memory (as seems very likely), then a k-d tree looks good if I only
> need approximate nearest neighbour but looks very messy if I was to
> want the exact nearest neighbour.
>

Quick question, does this tool absolutely have to be done in Fortran?

http://gts.sourceforge.net

is a library for working with triangulated surfaces. It is written in
pure C, but is object oriented, with lots of crazy tricks with function
pointers etc.

The author of GTS is a stickler about using the right data structure
and the right algorithm for everything. He has a nearest neighbor
routine, and it is based on an octree.

His routines are set up to assume some connectivity between the points,
but I don't really see why you couldn't modify them to work with just
raw points. If it doesn't absolutely _have_ to be in Fortran, this
would be a big head start. Possibly you could even whip up something
in C based on GTS and write out the neighbor data to be read into your
Fortran program.

Good luck,

Rob

Gordon Sande

Apr 19, 2006, 8:25:58 AM
On 2006-04-19 03:19:14 -0300, David.P...@csiro.au said:

>> Btw a similar question was asked recently in comp.lang.fortran -- one
>> can Google "kd tree" there.
>
> Thanks from David Paterson. I've come to the conclusion that a "kd
> tree" is excellent if I want to find the approximate nearest neighbour
> (which is good enough for me) but not if I want the exact nerarest
> neighbour.

Having used kd-trees for a couple of distinct applications I would say
that they do (exact) nearest neighbour nicely, and if you just want
approximate nearest neighbour then there are some other variants for
that. Ask google and it will tell you about some folks with a package
called ANN.

> Have you seen a Fortran implementation of a "kd tree"?

By the time you wade through the overhead of someone else's code
you will find that rolling your own is about as simple. Reading the
other stuff helps for the first time.

One of the folks (Watson) who does this lives just down the street from
you in Perth. (That's a joke, as Perth is at the other end of Oz and
about a 6 hour flight away. Like asking someone from New York if they
know your friend in Los Angeles because both cities are in the USA.)
His variant goes by the name natural neighbours.

b52b

Apr 19, 2006, 12:59:52 PM
Hello,

Take a look at:
ftp://lyapunov.ucsd.edu/pub/nonlinear/
There is Matt Kennel's implementation of a kd-tree in Fortran 95.
Unfortunately you can't open this address in the web browser. Use a
dedicated ftp client and anonymous login.

Regards,
Jamie

b52b

Apr 19, 2006, 1:04:47 PM

Actually my firefox is broken. It seems to work with other browsers.

Jamie


Brooks Moses

Apr 19, 2006, 3:15:41 PM
David.P...@csiro.au wrote:
>> Btw a similar question was asked recently in comp.lang.fortran -- one
>> can Google "kd tree" there.
>
> Thanks from David Paterson. I've come to the conclusion that a "kd
> tree" is excellent if I want to find the approximate nearest neighbour
> (which is good enough for me) but not if I want the exact nearest
> neighbour.

Huh. My impression was that a kd-tree was used to locate a (small) set
of candidate nearest-neighbor points, which one would then test to find
the exact nearest neighbor.

- Brooks


--
The "bmoses-nospam" address is valid; no unmunging needed.

Gordon Sande

Apr 19, 2006, 3:35:20 PM
On 2006-04-19 16:15:41 -0300, Brooks Moses
<bmoses...@cits1.stanford.edu> said:

> David.P...@csiro.au wrote:
>>> Btw a similar question was asked recently in comp.lang.fortran -- one
>>
>> can Google "kd tree" there.
>>
>> Thanks from David Paterson. I've come to the conclusion that a "kd
>> tree" is excellent if I want to find the approximate nearest neighbour
>> (which is good enough for me) but not if I want the exact nearest
>> neighbour.
>
> Huh. My impression was that a kd-tree was used to locate a (small) set
> of candidate nearest-neighbor points, which one would then test to find
> the exact nearest neighbor.
>
> - Brooks

I would have said that the kd-tree offers up an expanding collection of
highly plausible neighbourhoods, with contents, that have to be inspected
more closely because it does not know exactly where the contents are inside
those neighbourhoods. The kd neighbourhoods may not match the precise form
of the metric you are using: its neighbourhoods are rectilinear, while
metrics like L_2 have spherical neighbourhoods.

There are enough variations on kd-trees that it depends on which version
one is looking at. Are the queries for the best match or for a range? Are
the kd-tree's buckets taken all the way down to single points, or do they
stop early? It shows that the basic scheme is both powerful and versatile.


Christer Ericson

Apr 20, 2006, 1:25:33 AM
In article <1145427554.7...@g10g2000cwb.googlegroups.com>,
David.P...@csiro.au says...

> Thanks from David Paterson. I've come to the conclusion that a "kd
> tree" is excellent if I want to find the approximate nearest neighbour
> (which is good enough for me) but not if I want the exact nerarest
> neighbour.

Not so. The k-d tree is the canonical data structure for exact
nearest neighbor queries (in low dimensions I should probably
add). The original reference is:

Friedman, Jerome, Jon Bentley, and Raphael Finkel. "An Algorithm for
Finding Best Matches in Logarithmic Expected Time." ACM Transactions
on Mathematical Software, vol. 3, no. 3, pp. 209-226, September 1977.

You can find the technical report version of this paper here:

ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/482/CS-TR-75-482.pdf

Their description leaves a bit to be desired, but the bottom line
is that, given a k-d tree implementation, the nearest neighbor
query can be implemented in about 10-15 lines of C/C++ code
(and probably not much more in Fortran).

I give an almost complete implementation in Section 7.3.7 of my
book; to be turned into an exact nearest neighbor query the code
just needs to be completed by shrinking the query sphere as
points are encountered (i.e. 1-2 lines of code).
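
A sketch of such a query (in C, under an assumed node layout and with
illustrative names, not the book's code): descend into the child on the
query's side of the split first, then visit the other child only if the
current best sphere crosses the splitting plane.

```c
#include <float.h>
#include <stddef.h>

/* A node of a 3-d tree (assumed layout): point, split axis, children. */
typedef struct node {
    double pt[3];
    int axis;                  /* 0, 1 or 2: the splitting coordinate */
    struct node *left, *right;
} node;

/* Exact nearest neighbour with a shrinking query sphere.  Call with
   *best = NULL and *best_d2 = DBL_MAX; on return *best is the nearest
   node and *best_d2 its squared distance to q. */
static void nn_query(const node *t, const double q[3],
                     const node **best, double *best_d2)
{
    if (!t) return;
    double d2 = 0.0;
    for (int k = 0; k < 3; k++) {
        double d = q[k] - t->pt[k];
        d2 += d * d;
    }
    if (d2 < *best_d2) { *best_d2 = d2; *best = t; }  /* shrink sphere */

    double diff = q[t->axis] - t->pt[t->axis];
    const node *near = diff <= 0.0 ? t->left  : t->right;
    const node *far  = diff <= 0.0 ? t->right : t->left;
    nn_query(near, q, best, best_d2);
    if (diff * diff < *best_d2)      /* sphere crosses the split plane */
        nn_query(far, q, best, best_d2);
}
```

The pruning test `diff * diff < *best_d2` is the whole trick: the far
subtree is skipped whenever the shrunken query sphere cannot reach it.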

--
Christer Ericson
http://realtimecollisiondetection.net/
