Hi, I have a query which takes a very long time to run so would like to know if I'm approaching my problem correctly.
The general problem is that I have polygons in my Spatialite database representing countries which is derived from OSM data.
These polygons include the territorial waters of the country and I want to prune these back to the actual coastline data.
So I plan to do an intersection with coastline data derived from OSM data.
Number of country polygons is 981.
Number of coastline data polygons is 783,507
This coastline data I presume will include a huge polygon covering the continents of Africa, Europe and Asia plus the other continents and then hundreds of thousands of islands.
As an example of what I'm looking to achieve, I have a single country polygon for Spain which includes its territorial waters. When I intersect this with the coastline data I get 138 polygons because of all the islands within the territorial waters of Spain but which I now know belong to Spain which I didn't know before.
My query looks like this:
INSERT INTO spt_PureLandPolygons (elementType, elementID, polygon, point)
SELECT
spt.elementType, --elementType = 3 indicates a OSM relation
spt.elementID, --The OSM relation ID
ST_Intersection( spt.polygon, c.lpolygon ),
spt.point --A label position which probably needs moving elsewhere
FROM
spt_Polygons AS spt
JOIN
cdb.spt_LandPolygons AS c
ON ST_Intersects( spt.polygon, c.lpolygon) --Ensures polygons overlap before finding intersection
JOIN
tbl_Tags AS t
ON spt.elementID = t.elementID
WHERE
spt.elementType = 3 --Relations only
AND ( t.key='admin_level' AND t.value='2' ) --Admin level 2 relations only
AND spt.elementID=62149;
The last AND clause limits the query to the United Kingdom and it takes several minutes to create 18,000 or so polygons. If I removed this to cover the whole world then the query is still running after 12 hours. No surprise.
I believe the ST_INTERSECTS function will inherently use any spatial indexes available. Is that correct?
I did write a version of this query which added a specific spatial index check but there was no change in performance.
So I'm after advice on whether this is a sensible approach and what I can do to reduce the runtime?
Is it simply that INTERSECTS and INTERSECTION are costly functions.
Or what I have done is doomed to failure and I should do it all a different way?
Best, Chris