Best way to find a row based on my location

56 views
Skip to first unread message

Jim

unread,
Feb 11, 2025, 11:50:45 AMFeb 11
to SpatiaLite Users
I have a 7 million row table that contains a geometry of a POINT.  The table also has a spatial index build on the geometry column.

I want to find the row nearest to my location and have come up with this select form:

SELECT *
FROM my_table
WHERE ST_Distance(geometry, MakePoint(mylon, mylat) <= 0.013
ORDER BY ST_Distance(geometry, MakePoint(-122.000, 47.000)) ASC
LIMIT 1;

Is there a more robust and efficient version of this SELECT taking into account the spatial_index on geometry?

Jim

unread,
Feb 12, 2025, 1:13:18 PMFeb 12
to SpatiaLite Users
I got it:

 SELECT * FROM streets, (SELECT MakePoint(-122.3655289, 47.7635415, 4326) AS mypoint)
  WHERE ST_Distance(geometry, mypoint) <= 0.013
  AND streets.ROWID IN ( SELECT ROWID FROM SpatialIndex WHERE f_table_name = 'streets' AND      search_frame = mypoint)

The seletion time went from 35 seconds for the original search to 0.300 seconds for this one.

a.fu...@lqt.it

unread,
Feb 12, 2025, 2:17:52 PMFeb 12
to spatiali...@googlegroups.com
On Tue, 11 Feb 2025 08:50:45 -0800 (PST), Jim wrote:
> I have a 7 million row table that contains a geometry of a POINT. 
> The table also has a spatial index build on the geometry column.
> I want to find the row nearest to my location
>

Hi Jim,

this is one of the most classic problems and has been covered
in depth in the academic literature.

It is generally known as the nearest neighbors problem; here
you can find a valid summary to begin to delve deeper into
the topic:

https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
https://en.wikipedia.org/wiki/Nearest_neighbor_search

In a nutshell:
- it's a decidedly complex problem
- many different algorithms have been proposed
- unfortunately none of them are simple
- and above all they all adapt very poorly to
a simple SQL implementation

The biggest obstacle for any SQL approach is that in
principle it's never possible to set a reasonable search
radius to narrow the search.

Just a few easy examples to understand better:
- if you are looking for the nearest pharmacy and you are
in some city, a search radius of a few hundred meters
should be reasonable.
- but if you are in a rural area the search radius will
have to be expanded to several km
- if, however, you find yourself in the middle of the
forests in a mountain range you'll have to widen the
search radius to many tens of km
- finally, if you are on a boat in the middle of the
ocean the search radius should be extended to several
thousand km
- there are no general rules; it all depends on what
you're looking for, where you are, and the statistical
dispersion of the sample.

Note: apparently you use geographic coordinates of longitude
and latitude. This introduces many additional complications
that we'll intentionally ignore to avoid drowning in
excessive complexity.
The problem is already complex enough in itself.


> ... and have come up with this select form:
>
> SELECT *
> FROM my_table
> WHERE ST_Distance(geometry, MakePoint(mylon, mylat) <= 0.013
> ORDER BY ST_Distance(geometry, MakePoint(-122.000, 47.000)) ASC
> LIMIT 1;
>

Your query suffers from a fairly obvious flaw.
In the WHERE clause you are imposing a completely arbitrary
distance radius, and as we just saw it could be a very severe
limit.

SELECT p.id, p.name, Min(ST_Distance(p.geom, x.mypoint, 1)) AS dist
FROM points AS p,
(SELECT MakePoint(-122.3655289, 47.7635415, 4326) AS mypoint) AS x;

A modified Query like the previous one, however, is really
robust; it will always find the closest point in any situation
because it does not assume any arbitrary distance radius.

Note: ST_Distance(g1, g2, 1) will return a distance expressed
in meters precisely calculated on the Earth's ellipsoid, so you
can ignore all the problems related to converting angular
distances to metric distances.
It works perfectly both neat the equator and near the poles.

However, it's clear that this is a very slow query when many
millions of points have to be examined.
It cannot be a practical solution.

... the rest of my analysis continues below your next post...

see you later,
Sandro



a.fu...@lqt.it

unread,
Feb 12, 2025, 3:15:32 PMFeb 12
to spatiali...@googlegroups.com
On Wed, 12 Feb 2025 10:13:18 -0800 (PST), Jim wrote:
> I got it:
>
>  SELECT * FROM streets, (SELECT
> MakePoint(-122.3655289, 47.7635415, 4326) AS mypoint)
> WHERE ST_Distance(geometry, mypoint) <= 0.013
> AND streets.ROWID IN ( SELECT ROWID FROM SpatialIndex
> WHERE f_table_name = 'streets' AND search_frame = mypoint)
>
> The seletion time went from 35 seconds for the original search
> to 0.300 seconds for this one.
>

Hi Jim,

I'm sorry to disappoint you, but this Query is also wrong.
You are on the right track because now you are using the
Spatial Index, but you are not doing it right.

Recall: the Spatial Index is simply a mechanism that very
quickly finds all the points that fall within a given rectangle.
The smaller the rectangle, the quicker the search.
You can set the size and position of the search rectangle by
assigning the value of "search_frame", .which can be any
Geometry because only its MBR (aka BBLOX) will be taken
into consideration.

In your Query you are setting "search_frame = mypoint"; but
mypoint is a POINT, that is, it has an infinitesimal dimension.
You were very lucky that the search in the Spatial Index found
a point exactly coinciding with mypoint, but obviously it's a
pure coincidence.

To actually use the Spatial Index you have to expand the
Geometry you pass to "search_frame", so that it becomes
a real rectangle with a certain extension.
The simplest way to do this is to use the BuildCircleMBR()
function, as e.g.:

SELECT p.id, p.name, Min(ST_Distance(p.geom, x.mypoint, 1)) AS dist
FROM points AS p,
(SELECT MakePoint(-122.3655289, 47.7635415, 4326) AS mypoint) AS x
WHERE p.ROWID IN (
SELECT ROWID FROM SpatialIndex
WHERE f_table_name = 'points'
AND search_frame =
BuildCircleMbr(ST_X(mypoint), ST_Y(mypoint), 0.013));

Note that this paves the way for a heuristic solution that
might be perfectly adequate, an iterative type solution.

step #1: start by setting a reasonably small search radius
on BuildCircleMBR().

step #2: if you are lucky and find yourself in an area where
there are numerous highly dense points you'll
immediately have found the solution you are
looking for.
STOP

step #3: however, it could also be that you are in a less
dense area (open countryside, forest, mountain
and alike), and the Query will not be able to
identify any point within the "search_frame"

step #4: at this point it's quite obvious that we need
to further expand the search radius; for example
by doubling it.
loop again and jump to step#1

after a few iterations you'll surely end up finding at
least one point that falls within "search_frame", even
if you are in the middle of the ocean.

the powers of 2 grow very quickly, so at most yo'll have
to complete no more than a dozen loops even in the most
desperate cases.
and note well: searches that fail will take a practically
infinitesimal amount of time, so the impact on overall
performance should be negligible.

bye Sandro

Jim

unread,
Feb 13, 2025, 10:45:50 AMFeb 13
to SpatiaLite Users
Sando,
Thank you for the detailed explanation. At 84, the  details of what you as describing have long since  left my brain and I tend to copy old code that seemed to work.

My actual problem is to find the address of the house that I am adjacent to on the curent road. It is for display in my navigation system.  Of course I could just ask Google or such, but I have a database of street address build from data from OpenStreets and wish to do my own lookup.  Budget consta9nts and all that and since I rarely leave the local area I can limit the data to Washington state.

Having said that, I am now off to try your select statement..

Jim

Jim

unread,
Feb 14, 2025, 11:35:40 AMFeb 14
to SpatiaLite Users
To close out the item.  Your solution, while a bit slower (0.1 sec),  was far more precise and gives me very accurate results.  Thanks again Sandro.
Reply all
Reply to author
Forward
0 new messages