VirtualKNN and max_distance - feature request

56 views
Skip to first unread message

Totò Fiandaca

unread,
May 30, 2021, 2:06:02 PM5/30/21
to SpatiaLite Users
I studied and used the fabulous VirtualKNN present in spatialite [0] for
calculations of minimum distances between points and lines on large datasets (millions of
lines) and I noticed that the classic approach of using spatial indexes is - for my goal -
enormously more performing than the use of VirtualKNN (2 sec against
100 seconds); this is also due to the fact that the max_distance option does not exist in the
virtualKNN,  and it calculates the distances on all features.

these are the two queries compared, same input data and same output:
- geo-table `route` linestring 32632 with 116674 features
- geo-table `schools` point 32632 with 5529 features

VirtualKNN:

CREATE TABLE t_schuole_100mKNN AS
SELECT d.chiave_uni, d.fid as id_route, d.distance as distance,
     ST_shortestline (d.geometry, s.geometry) as geom
FROM
(SELECT a.fid as fid, a.distance as distance, zz.chiave_uni as
key_uni, zz.geometry
    FROM knn as a
    JOIN
    schools as zz
    WHERE f_table_name = 'route'
    AND f_geometry_column = 'geometry'
    AND ref_geometry = zz.geometry
    AND max_items = 1) d,
    route s
WHERE ST_Distance (s.geometry, d.geometry) <= 100 and id_route = s.ogc_fid
ORDER BY d.chiave_uni;
SELECT
RecoverGeometryColumn ('t_schuole_100mKNN', 'geom', 32632, 'LINESTRING', 'XY');

output 1737 lines
takes about 100 seconds

classic spatialIndex use

CREATE TABLE t_schuole_100mMBR AS
SELECT a.ogc_fid as id_route, Min (ST_Distance (a.geometry, zz.geometry)) AS
distance,
    zz.uni_key as uni_key, st_shortestline (a.geometry, zz.geometry)
as geom
FROM route as a, schools as zz
WHERE ST_Distance (a.geometry, zz.geometry) <= 100 AND a.ogc_fid IN (
        SELECT rowid
        FROM SpatialIndex
        WHERE f_table_name = 'route'
           AND search_frame = BuildCircleMbr (ST_X (zz.geometry),
ST_Y (zz.geometry), 100))
GROUP by zz.chiave_uni;
SELECT
RecoverGeometryColumn ('t_schuole_100mMBR', 'geom', 32632, 'LINESTRING', 'XY');

output 1737 lines
takes about 2 seconds

Perhaps the queries are not written optimally, but in my opinion a parameter
max_distance in virtualKNN would help a lot

thank you

a.fu...@lqt.it

unread,
Jun 2, 2021, 4:34:46 AM6/2/21
to spatiali...@googlegroups.com
the discussion about this topic will continue on this post:

https://groups.google.com/g/spatialite-users/c/m0SF7IzVc_Y

bye Sandro
Reply all
Reply to author
Forward
0 new messages