On Tue, 19 Aug 2025 21:20:21 +0100,
ckgoo...@gmail.com wrote:
> Hi all, the query below takes around 15 seconds to run. When I
> comment
> out the ST_Distance function call it completes in 1.5seconds.
>
Hi Chris,
Let's start with the first detail (which is perhaps not the most
relevant).
I notice that nested inside ST_Distance() there are 2 calls to
ST_Transform();
I imagine you need them because your geometries are Lat/Long (4326) but
you
need a distance expressed in meters and not in degrees.
I would like to point out that ST_Transform() is a rather expensive
function,
not only because it involves complex geodetic calculations, but above
all
because behind the scenes it triggers several accesses to the DB to
retrieve
the geodetic parameters for the required transformation.
Placing one or more calls to ST_Transform() inside other Spatial
functions
is not the best way to achieve maximum speed.
In this case you had an alternative solution available, probably less
expensive.
ST_Distance() in its 3-argument form (g1, g2, use_ellipsoid) is meant
to directly get a length in meters when using Lat/Long Geometries, and
should be faster because it doesn't require any additional DB access.
please see:
https://www.gaia-gis.it/gaia-sins/spatialite-sql-5.1.0.html#p13
It would at least be worth doing some tests to objectively measure the
difference between the two approaches.
> So I would think the ST_Distance would be calculated only on the 77
> results.
>
Yours is a very dangerous assumption, which could be false.
There's no guarantee that SQLite will execute the query in the order
you
expect; at runtime, it may decide to go in a completely different
direction.
One of SQLite's strengths, which often makes it so surprisingly fast,
is its
Query Planner module. please see:
https://www.sqlite.org/queryplanner.html
Simply put, SQLite takes the liberty of rewriting your query before
executing
it, possibly modifying it heavily when it deems appropriate.
Very often, the Query Planner is able to find the most efficient and
fastest
data access strategy. But there are also cases where it ends up making
the
worst choice, leading to a very slow query.
Generally, the Query Planner attempts to select the indexes that appear
most promising.
This, however, doesn't apply to Spatial Indexes, because for SQLite an
R*Tree
isn't an index at all but a table unrelated to any other table, and
therefore tends to receive the lowest priority, especially when dealing
with a complex query, or worse, a query that contains subqueries.
Which is exactly your case.
There's a way to check in advance which access strategy the Query
Planner
will effectively use at runtime:
EXPLAIN QUERY PLAN <your sql query>;
Will return a resultset listing all operations sorted by their
priority.
It's a SQL tool that many overlook or ignore, but it's absolutely
essential
for getting high-performance queries and to avoid reaching completely
unfounded conclusions.
In this specific case, I'll bet you two cents that you'll eventually
discover
that that query is executed in a very different way than you imagined,
and
that ST_Distance() is not the main cause of slowness.
bye Sandro