Speed of knn2

25 views
Skip to first unread message

ckgoo...@gmail.com

unread,
Nov 2, 2024, 6:30:13 AMNov 2
to spatiali...@googlegroups.com

Hi,

I am using the knn2 algorithm as below and I am finding it is taking around 0.5 seconds to return.  I think this is due to the nature of my data and I’d like to reduce the time.

My data is 100 polygons of the world seas so many are very large.  And their MBR’s will overlap quite a bit.

SELECT pk_uid, name

FROM  knn2 AS k

  JOIN spt_IHOSeas AS s  ON k.FID  = s.pk_uid

WHERE f_table_name = 'spt_IHOSeas' AND ref_geometry = MakePoint ( -1,50 ) AND radius = 0.01 AND max_items = 1;

 

I need to return the name of the nearest polygon to my point.  Unfortunately sometimes my user is in the sea according to OSM data but my IHO sourced polygons can be around 100m out.  So I cannot do a intersection or contains check.

Changing the radius doesn’t make a difference it seems.

Can anyone give advice on how I could improve speed, either of this query, or by using a different strategy to get the nearest polygon to a given position?

Best, Chris

 

a.fu...@lqt.it

unread,
Nov 4, 2024, 4:04:44 AMNov 4
to spatiali...@googlegroups.com
On Sat, 2 Nov 2024 10:07:05 -0000, ckgoo...@gmail.com wrote:
> Hi,
>
> I am using the knn2 algorithm as below and I am finding it is taking
> around 0.5 seconds to return. I think this is due to the nature of my
> data and I'd like to reduce the time.
>

Hi Chris,

the KNN2 algorithm is notoriously very slow for long/lat geographic
coordinates.
furthermore, KNN2 works reasonably well only for POINT-type geometries,
while it will inevitably be slow in the case of POLYGONs, especially
if these are particularly complex.


> My data is 100 polygons of the world seas so many are very large. And
> their MBR's will overlap quite a bit.
>

obviously this prevents you from using the SpatialIndex effectively,
and this generates further slowness.
furthermore, it's likely that your polygons representing the seas
are particularly complex with many thousands of vertices; that is,
they require many CPU cycles to calculate any geometric operation.

the reason is simple: the SpatialIndex works in degrees, but the
minimum distance must be calculated in meters, and this forces
to carry out a lot of complex calculations.
Simply put: you're trying to solve a particularly thorny and
downright heavy problem that requires a lot of calculations.


> I need to return the name of the nearest polygon to my point.
> Unfortunately sometimes my user is in the sea according to OSM data
> but my IHO sourced polygons can be around 100m out. So I cannot do a
> intersection or contains check.
>
> Changing the radius doesn't make a difference it seems.
>
> Can anyone give advice on how I could improve speed, either of this
> query, or by using a different strategy to get the nearest polygon to
> a given position?
>

personally I would be tempted to adopt a decidedly imaginative and
creative solution.

after all, you don't need to find out which is the closest sea
by analyzing the entire polygon; you simply need to find out
which is the closest shore.
this being the case we could proceed as follows.

step A) we'll use ST_LinesFromRings() so to get a collection
of LINESTRINGs representing the outer and inner borders
of each polygon

step B) then we'll use ElementaryGeometries so to dissolve
any MULTILINESTRING into simple LINESTRINGs

step C) and finally we'll call ST_Subdivide() so to cut
each LINESTRING in small parts with very few vertices
(ideally just few tenths)

Note: we'll obviously take any possible precaution so
to fully preserve the PK value identifying the initial
"sea" from which we obtained all the small pieces of
the edge.

Concluding: in this way instead of a few very complex
polygons we'll have a lot of small, very simple
linestrings.
and so the SpatialIndex will finally be a truly
effective selective filter.

bye Sandro

ckgoo...@gmail.com

unread,
Nov 4, 2024, 4:16:02 PMNov 4
to spatiali...@googlegroups.com
Hi Sandro,
Thanks for your reply. You are right. What I wanted to do was simply too computationally expensive.

So I took a different approach which seems to have worked okay. I inflated all my sea polygons by a small amount so they are now slightly larger than the bounds of my OSM sea positions which means I am now able to utilise the usual spatial index and do a ST_Contains test after. The speed of this is fine.

It does mean when I move from one sea to another, like from the South Pacific to the North Pacific, there is a thin strip where the polygons overlap and I would get two polygons returned. But I limit the result to one and hey, when you are in the middle of the ocean like that, a kilometre out isn't going to matter.

Best,
Chris
--
You received this message because you are subscribed to the Google Groups "SpatiaLite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spatialite-use...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/spatialite-users/9e01c94cb51d35e0cb7e6b9a996ce46c%40lqt.it.

Reply all
Reply to author
Forward
0 new messages