[postgis-users] PostGIS KNN best practices

52 views
Skip to first unread message

Stephen V. Mather

unread,
May 15, 2012, 12:37:18 PM5/15/12
to postgi...@postgis.refractions.net

Hi All,

                Pretty excited by the new operators <-> and <#>, but a bit confused as to how to use them in a query.  The two examples from P. Ramsey back in late 2011 ( http://blog.opengeo.org/2011/09/28/indexed-nearest-neighbour-search-in-postgis/ ) included doing a KNN on a single point to a cloud of points, i.e.

 

SELECT name, gid

FROM geonames

ORDER BY geom <-> st_setsrid(st_makepoint(-90,40),4326)

LIMIT 10;

 

or doing KNN on non-point different geometries, where the first neighbor by <-> or <#> might not be truly the first i.e.

 

with index_query as (
  select
    st_distance(geom, 'SRID=3005;POINT(1011102 450541)') as distance,
    parcel_id, address
  from parcels
  order by geom <#> 'SRID=3005;POINT(1011102 450541)' limit 100
)
select * from index_query order by distance limit 10;

 

So, how would one grab the first nearest neighbor for all points in a dataset?  This is how I used to do it:

 

CREATE TABLE n2180_560_height AS

SELECT x, y, height FROM

(SELECT DISTINCT ON(veg.gid) veg.gid as gid, ground.gid as gid_ground, veg.x as x, veg.y as y, ground.z as z, veg.z - ground.z as height, veg.the_geom as geometry, veg.class as class

FROM (SELECT * FROM n2180_560 WHERE class = 5) As veg, (SELECT * FROM n2180_560 WHERE class = 2) As ground

WHERE veg.class = 5 AND veg.gid <> ground.gid AND ST_DWithin(veg.the_geom, ground.the_geom, 10)

ORDER BY veg.gid, ST_Distance(veg.the_geom,ground.the_geom)) AS vegpoints;

 

ST_DWithin prevents a full cross join, but is a sloppy way to do this, as it requires a priori knowledge of the end cases.  I’m hoping there’s a subquery or some such magic that would allow me to use the distance operator to a similar end… .

 

Thanks,

Best,

Steve

 

http://www.clemetparks.com/images/esig/cmp-ms-90x122.pngStephen Mather
Geographic Information Systems (GIS) Manager
(216) 635-3243

s...@clevelandmetroparks.com
clevelandmetroparks.com

 

 

 

 

image001.png

Alexandre Neto

unread,
May 16, 2012, 7:35:06 AM5/16/12
to PostGIS Users Discussion
I have been around that question to.


You have to do it in two steps, like is explained in the operator page. One faster step to reduce the candidates (by using <-> or <#>) and second one to get the real distances with ST_Distance.

The problem in finding the KNN for each row in a table is the fact that the gist index <-> operator only works if one of the geometries is constant. The workaround would be to create a SQL function to apply to each of the rows using table.the_geom as a parameter.

Something like this:

----
CREATE OR REPLACE FUNCTION _enn2 (geometry) RETURNS double precision AS $$

WITH index_query as
(SELECT ST_Distance($1,f.the_geom) as dist
FROM "grelha5m" As f
ORDER BY $1 <#> g1.the_geom limit 1000)
SELECT dist
FROM index_query
ORDER BY dist;

$$ LANGUAGE SQL;
---

and I call it like this: 

---
Select c.gid as gid, _enn2(c.the_geom) as enn
From cosn1 as c
Order by c.gid
---

In this case the function returned the smallest distance, but you can choose any other column.

Hope it helps

Alexandre Neto

_______________________________________________
postgis-users mailing list
postgi...@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


image001.png

Stephen V. Mather

unread,
May 16, 2012, 8:56:40 AM5/16/12
to PostGIS Users Discussion

Ah, I had hopes pinned on the idea that I just wasn’t smart enough to figure it out, but it’s an inherent limitation.  I will be using your function though—that is a clean way to encapsulate the functionality.

image001.png

Stephen V. Mather

unread,
May 17, 2012, 10:42:45 AM5/17/12
to PostGIS Users Discussion, mather....@gmail.com

Cool,

                That approach is about twice as fast as my previous approach for comparing points to lines, e.g.:

 

 

CREATE OR REPLACE FUNCTION angle_to_street (geometry) RETURNS double precision AS $$

 

WITH index_query as

                (SELECT ST_Distance($1,road.geom) as dist,

                                degrees(ST_Azimuth($1, ST_ClosestPoint($1, road.geom))) as azimuth

                FROM street_centerlines As road

                ORDER BY $1 <#> road.geom limit 5)

SELECT azimuth

                FROM index_query

                ORDER BY dist

LIMIT 1;

 

$$ LANGUAGE SQL;

 

DROP TABLE IF EXISTS address_points_rot;

CREATE TABLE address_points_rot AS

                SELECT addr.*, angle_to_street(addr.geom)

                FROM

                                address_points addr;

 

 

 

http://www.clemetparks.com/images/esig/cmp-ms-90x122.pngStephen Mather
Geographic Information Systems (GIS) Manager
(216) 635-3243

s...@clevelandmetroparks.com
clevelandmetroparks.com

 

 

 

 


Sent: Wednesday, May 16, 2012 7:35 AM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] PostGIS KNN best practices

 

I have been around that question to.

image001.png

Stephen V. Mather

unread,
May 17, 2012, 11:21:23 AM5/17/12
to PostGIS Users Discussion

And the case with points, e.g. the distance between the nearest 3D points is of course much simpler:

 

CREATE OR REPLACE FUNCTION _lidar_distance (geometry, geometry) RETURNS double precision AS $$

 

SELECT ST_3DDistance($1, $2) as height

                ORDER BY $1 <-> $2

limit 1

;

 

$$ LANGUAGE SQL;

 

DROP TABLE IF EXISTS lidar_distance;

CREATE TABLE lidar_distance AS

                SELECT veg.*, _lidar_distance(veg.geom, ground.geom)

                FROM

                                (SELECT * FROM n2180_560 WHERE class = 5) As veg,

                                (SELECT * FROM n2180_560 WHERE class = 2) As ground;

image002.png
Reply all
Reply to author
Forward
0 new messages