Hi list,
a few considerations about the current state of the KNN module.
A) over the last weeks and months several users reported that it is
practically impossible to use KNN successfully on Windows from environments
such as Java, Python, C# and the like.
typical symptoms: everything runs smoothly in spatialite_gui or the
spatialite CLI, but there is no way to get the same SQL queries working
through language connectors. any attempt either fails with obscure errors
or directly crashes the application.
the technical causes of this behavior are absolutely clear:
the current KNN implementation critically depends on a very special callback
API of libsqlite3 (sqlite3_rtree_query_info), which is intended to explore
all the branches of the Rtree supporting any SpatialIndex.
unfortunately this very special API is not among the "reflected"
APIs directly available to dynamically loaded modules (such as
mod_spatialite), so the module has to be linked directly against
libsqlite3.dll
the net effect of all this is that a robust, stable and reliable
configuration is ensured only when both the main application
and the SpatiaLite extension use the very same libsqlite3.dll
(as happens in C/C++ apps such as SpatiaLite GUI and CLI).
but when using complex language frameworks such as Java, Python, C# etc.
it's very likely to end up with _TWO_ different, conflicting
copies of libsqlite3.dll (one referenced by the language and the other
by KNN), possibly of different versions or built with different
compilers and/or different settings, and this will surely be a fatal,
toxic combination.
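
as a first diagnostic step, here is a minimal Python sketch that reports
which SQLite library the language binding itself is running on. it only
covers one side of the problem: the version that mod_spatialite was linked
against has to be found out separately (e.g. from its build notes), and the
helper name binding_sqlite_version is purely illustrative.

```python
import sqlite3


def binding_sqlite_version() -> str:
    """Return the version of the SQLite library used by the Python binding."""
    conn = sqlite3.connect(":memory:")
    try:
        # sqlite_version() is answered by the library the connection runs on
        (version,) = conn.execute("SELECT sqlite_version()").fetchone()
    finally:
        conn.close()
    return version


if __name__ == "__main__":
    print("SQLite used by the Python binding:", binding_sqlite_version())
```

if this version differs from the one the extension was built against, the
two-copies situation described above is a likely suspect.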
the ideal solution would be to build the SpatiaLite extension
directly on top of the same libraries used at run time
by the language framework, but this seems far too complex
and demanding an approach for average users and developers:
a completely unrealistic expectation.
B) there is a pending proposal from Toto' Fiandaca suggesting the
introduction of a MaxDistance radius parameter in KNN.
short rationale: the current implementation is rather slow,
and an alternative approach based on MaxDistance
will certainly deliver noticeably better performance in the
vast majority of cases.
evaluating A) and B) together leads to a rather obvious conclusion:
a complete refactoring of KNN is required for the next version,
based on the following design assumptions:
1) getting rid of the problematic sqlite3_rtree_query_info API for good
2) adopting a radically different approach internally based on
conventional SpatialIndex queries.
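
to make the idea in 2) concrete, here is a minimal Python sketch of a
radius-bounded nearest-neighbour search: a cheap bounding-box filter first,
then an exact distance check on the few surviving candidates. this is not
the actual SpatiaLite implementation; the table pts, the function
knn_within and the sample data are hypothetical, and a plain table stands
in for the SpatialIndex so that the example runs on any sqlite3 build.

```python
import math
import sqlite3


def knn_within(conn, x, y, max_distance, k):
    """Return up to k (id, distance) pairs within max_distance of (x, y)."""
    # step 1: bounding-box filter (the role a SpatialIndex query would play)
    rows = conn.execute(
        "SELECT id, x, y FROM pts "
        "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?",
        (x - max_distance, x + max_distance,
         y - max_distance, y + max_distance),
    ).fetchall()
    # step 2: exact distance, dropping candidates in the corners of the box
    hits = []
    for fid, px, py in rows:
        d = math.hypot(px - x, py - y)
        if d <= max_distance:
            hits.append((fid, d))
    # step 3: nearest first, at most k results
    hits.sort(key=lambda h: h[1])
    return hits[:k]


def demo_connection():
    """Build an in-memory table with a few made-up points."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pts (id INTEGER PRIMARY KEY, x REAL, y REAL)")
    conn.executemany(
        "INSERT INTO pts VALUES (?, ?, ?)",
        [(1, 0.0, 0.0), (2, 1.0, 0.0), (3, 0.0, 2.0),
         (4, 2.0, 2.0), (5, 50.0, 50.0)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(knn_within(conn, 0.0, 0.0, 2.5, 2))
```

point 4 falls inside the search box but outside the circle, which is why
the exact distance check in step 2 is still needed after the index filter.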
PROS:
===================================
- faster execution (at least assuming a reasonably small MaxDistance
radius)
- robust stability ensured on all configurations, including
Java, Python, C# etc language bindings
CONS:
===================================
just one: the current approach, based on full exploration
of all branches of the Rtree, makes no preliminary
assumption at all about MaxDistance. it is an approach that will
surely identify the nearest features even when their spatial
distribution is very scattered and irregular.
this will no longer be possible with the proposed refactored
implementation, because MaxDistance will then become a mandatory
argument.
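
a tiny Python illustration of this drawback, using made-up points and a
hypothetical helper knn_within that simply scans a list (no index at all):
with a scattered distribution, a MaxDistance that is too small returns
fewer than the requested number of neighbours.

```python
import math


def knn_within(points, x, y, max_distance, k):
    """Return the ids of up to k points within max_distance of (x, y)."""
    hits = [(math.hypot(px - x, py - y), fid) for fid, px, py in points]
    hits = [h for h in hits if h[0] <= max_distance]
    hits.sort()  # nearest first
    return [fid for _, fid in hits[:k]]


# very scattered, irregular distribution: (id, x, y)
scattered = [(1, 0.5, 0.0), (2, 40.0, 0.0), (3, 0.0, 300.0)]

if __name__ == "__main__":
    # asking for 3 neighbours with a small radius finds only one of them
    print(knn_within(scattered, 0.0, 0.0, 10.0, 3))
    # a much larger radius is needed to reach all three
    print(knn_within(scattered, 0.0, 0.0, 1000.0, 3))
```

the caller therefore has to know something about the data in order to pick
a sensible radius, or retry with a larger one when too few rows come back.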
what are your thoughts about all this?
bye Sandro