Trick to quickly retrieve extent on a spatial table with a RTree

105 views
Skip to first unread message

Even Rouault

unread,
Jun 4, 2018, 3:32:45 PM6/4/18
to SpatiaLite Users
Hi,

I've today investigated a performance issue when opening a huge GeoPackage database (80 GB !)
and getting the extent when it is not stocked in the gpkg_contents table [1]

I tried at first to use a SELECT MIN(minx), MIN(miny), MAX(maxx), MAX(maxy) FROM {rtreeTableName}
but this apparently involves a full table scan, which defeats the purpose. This is even true on recent versions of SQLite

But if you do SELECT 1 FROM {rTreeTableName} WHERE minx < {value} LIMIT 1, this is super fast.
So by implementing a dichotomic search, you can easily determine the minimum/maximum of minx, minx, maxx, maxy

That's what I implemented in GDAL in
https://github.com/OSGeo/gdal/commit/f3276e81274936027803295b32187a59db234eca#diff-a674cfc4ad8bc2d939e8044e5258c577R2394

Could be applied to a Spatialite table of course.

Even


[1] https://issues.qgis.org/issues/18402

--
Spatialys - Geospatial professional services
http://www.spatialys.com

Even Rouault

unread,
Jun 5, 2018, 1:33:29 PM6/5/18
to spatiali...@googlegroups.com
On lundi 4 juin 2018 21:32:41 CEST Even Rouault wrote:

Pepijn Van Eeckhoudt mentionned me an alternative:
"""if your comfortable reading the shadow tables directly, is to read the root
node of the rtree and extract the bounding box from the blob. The root node
always has node id 1, but to be 100% sure you could query the parent shadow
table for the node id that has not parent id. Then grab the corresponding node
blob and get min/max x/y from the entries. See the comments at the top of
rtree.c https://www.sqlite.org/cgi/src/artifact/cb6d4bd43c118354 for
details."""

a.fu...@lqt.it

unread,
Jun 6, 2018, 6:18:40 AM6/6/18
to spatiali...@googlegroups.com
Hi Even,

a smarter (and surely safer) method allowing for fast retrieval
of the Full Extent BBOX of any SQLite's R*Tree is by implementing
a custom query based on the top of sqlite3_rtree_query_callback()
(available since version 3.8.5 released on 2014-06-04).

I'll attach a sample C source file containing a skeletal
implementation easily adaptable to many common situations.

Note: the BBOX retrieved from an R*Tree is always intrinsically
affected by some rounding approximation, because SQLite internally
stores all R*Tree's coordinates as SINGLE PRECISION floating points,
after applying an appropriate "stretching" of all BBOXes so to avoid
any possible truncation.
So attempting to guess the BBOX of some Spatial Table by querying
its companion R*Tree is surely a fast operation, but it's not really
accurate.
(not to mention the case when the companion R*Tree doesn't exist).

bye Sandro
rtree_bbox.zip

Even Rouault

unread,
Jun 6, 2018, 6:42:23 AM6/6/18
to spatiali...@googlegroups.com, a.fu...@lqt.it
Hi Sandro,

Great brainstorming session :-) Your solution wins the elegance contest.

Regarding rounding, I see they mention "bounding boxes might be slightly
larger than specified, but will never be any smaller", which is what we are
generally interested in for a layer extent ("Zoom to layer" type of
functionality)

Thanks,

Even
Reply all
Reply to author
Forward
0 new messages