SpatialIndex documentation misleading

119 views
Skip to first unread message

mj10777

unread,
Dec 10, 2013, 1:48:47 AM12/10/13
to spatiali...@googlegroups.com
Sandro,

I ran into a problem that has been puzzling me for a while.
I had one Database where the following query returned correct results:

SELECT CastToXY(ST_Transform(coord_geometry,4326)) FROM geodb_geometry WHERE ROWID IN (SELECT ROWID FROM Spatialindex WHERE f_table_name ='geodb_geometry' AND search_frame = ST_Transform(BuildMBR(13.287419,52.632703,13.562078,52.380477,4326),3068))

on another Database this command brought no results:

SELECT CastToXY(ST_Transform(soldner_geometry,4326)) FROM berlin_plz_geometry WHERE ROWID IN (SELECT ROWID FROM Spatialindex WHERE f_table_name ='berlin_plz_geometry' AND search_frame = ST_Transform(BuildMBR(13.287419,52.632703,13.562078,52.380477,4326),3068))

This query brought the desired results:

SELECT CastToXY(ST_Transform(soldner_geometry,4326)) FROM berlin_plz_geometry WHERE ST_Intersects(soldner_geometry,ST_Transform(BuildMBR(13.287419,52.632703,13.562078,52.380477,4326),3068)) = 1

At this point I began to wonder whether I made some mistake during the creation of the Database, but since this table has a primary key defined I believed that a 'VACUUM' problem could be ruled out.

I then look for and found a more descriptive documentation about 'SpatialIndex' and found:
https://www.gaia-gis.it/fossil/libspatialite/wiki?name=SpatialIndex

Here I found the line:

you could eventually explicitly reference some f_geometry_column too; but this is strictly required only when the same table defines more Geometries, and is redundant in most ordinary cases.

While staring at this statement for a while, an old light bulb began to glow in the back of my head, 'yes, the difference between the 2 tables is that the first has 1 geometry and the second has 4.'

So I tried:

SELECT CastToXY(ST_Transform(soldner_geometry,4326)) FROM berlin_plz_geometry WHERE ROWID IN (SELECT ROWID FROM Spatialindex WHERE f_table_name ='berlin_plz_geometry' AND f_geometry_column ='soldner_geometry'  AND search_frame = ST_Transform(BuildMBR(13.287419,52.632703,13.562078,52.380477,4326),3068))

which brought the desired results.

For this reason I would suggest changing the above text to something like:


Should you have more than 1 Geometry in a table, you must explicitly reference that Geometry column, otherwise no results will be shown.

---

The original query used looked like this:

SELECT CastToXY(ST_Transform(soldner_geometry,4326)) FROM berlin_plz_geometry WHERE ST_Intersects(soldner_geometry,ST_Transform(BuildMBR(13.287419,52.632703,13.562078,52.380477,4326),3068)) = 1 AND ROWID IN (SELECT ROWID FROM Spatialindex WHERE f_table_name ='berlin_plz_geometry' AND search_frame = ST_Transform(BuildMBR(13.287419,52.632703,13.562078,52.380477,4326),3068))

it seems to me that the ST_Intersects portion is not needed, since without it the same results are returned and is faster.
So I am assuming that when the support for Spatialindex was built in, the removal of this portion was simply forgotten.
I would to remove that portion if my conclusion is correct.

Mark Johnson, Berlin _Germany

a.fu...@lqt.it

unread,
Dec 10, 2013, 4:42:09 AM12/10/13
to spatiali...@googlegroups.com
Hi Mark,

just few general order clarifications:

> At this point I began to wonder whether I made some mistake during
> the
> creation of the Database, but since this table has a primary key
> defined I believed that a 'VACUUM' problem could be ruled out.
>

a) explicitly declaring a generic Primary Key by itself should never
be assumed to be an absolute protection against the "Vacuum
problem".
Only Primary Keys based on a single INTEGER column are absolutely
immune from changing ROWID values during a VACUUM (because in this
case the ROWID simply is an alias name for the Primary Key column).
For any other kind of Primary Key (i.e. based on multiple columns
or based on a single TEXT column) the ROWID may eventually change
(or not) during a VACUUM operation depending on the libsqlite3
version in use.

b) there is a specific SQL function specifically intended to directly
check if a SpatialIndex become corrupted: CheckSpatialIndex()
and there is a second SQL function supporting immediate repair:
RecoverSpatialIndex()


>> you could eventually explicitly reference some f_geometry_column
>> too; but this is strictly required only when the same table defines
>> more Geometries, and is redundant in most ordinary cases.
>
> For this reason I would suggest changing the above text to something
> like:
>
>> Should you have more than 1 Geometry in a table, you must explicitly
>> reference that Geometry column, otherwise no results will be shown.
>

accepted: seems to be a very reasonable integration.


> it seems to me that the ST_INTERSECTS portion is not needed, since
> without it the same results are returned and is faster.
> So I am assuming that when the support for Spatialindex was built in,
> the removal of this portion was simply forgotten.
> I would to remove that portion if my conclusion is correct.
>

sorry, but your conclusion is not correct; it could eventually be
acceptable only on very special or particular cases.
but it's not valid in a most general way.

there is a very strong reason suggesting to still continue including
the ST_Intersects() member even when a Spatial Index is effectively
supporting the SQL query:
- the Spatial Index (R*Tree) simply applies a fast but very coarse
comparison uniquely based on MBRs (aka BBOXes)
- rather obviously there are many possible cases in which the MBRs
of two geometries do intersect, but the geometries themselves
doesn't intersect at all.

please see the attached figures:

case "A":
- the "search_frame" (red) intersects the green polygon MBR, so
the Spatial Index will return this polygon
- the "search_frame" do actually intersect the green polygon
itself: so our query should definitely include this polygon
in the final resultset

case "B":
- in this second case too the "search_frame" (red) intersects
the green polygon MBR and the Spatial Index will return this
polygon
- but now the "search frame" doesn't intersect at all the
green polygon: so our query should definitely discard this
polygon from the final resultset.
and this exactly is the reason explaining why we still need
to include a ST_Intersects() member in order to get a very
precise selection.

bye Sandro
intersects-A.png
intersects-B.png

mj10777

unread,
Dec 10, 2013, 5:43:36 AM12/10/13
to spatiali...@googlegroups.com
Thank you for the clarification, I understand the reason now.

Mark Johnson

bye Sandro
Reply all
Reply to author
Forward
0 new messages