What is the efficient way to dissolve polygons using Spatialite 3.0

1,921 views
Skip to first unread message

Alexandre Neto

unread,
Oct 9, 2012, 7:06:07 AM10/9/12
to spatiali...@googlegroups.com

I have polygon table called "tableA" with 72000 records. It has a geometry column called "geometry", and a text column called "field1". I want to aggregate\dissolve all polygons with the same "field1" value.

I have a Spatial Index in Geometry and an index in field1.

I tried this:

Select f.field1 as field1, st_union(f.geometry) as geometry
From tableA as f
Group by field1;

and its taking too long, I had to cancel it after being processing for 1 hour. Using Arcgis it toke me 5 minutes, so I must be doing something wrong.

So, is there a better way to preform this operation using spatialite? Is the Spatial Index being used this way?


The question is also in gis.stackexchange if anyone care to answer there.


Thank You very much.


Alexandre Neto

Noli Sicad

unread,
Oct 9, 2012, 7:18:47 AM10/9/12
to spatiali...@googlegroups.com
> gis.stackexchange<http://gis.stackexchange.com/questions/35213/what-is-the-efficient-way-to-dissolve-polygons-using-spatialite-3-0>if
> anyone care to answer there.

Read this recipe (i.e. steps) how to do union in spatialite

http://www.gaia-gis.it/gaia-sins/spatialite-cookbook/html/union.html

Noli

Alexandre Neto

unread,
Oct 9, 2012, 7:49:30 AM10/9/12
to spatiali...@googlegroups.com
Thank for the recipe.

Comparing with the steps presented in the Recip, it seems that my query is ok. Now, Is it expected that a 72000 polygon table take more than 1h to aggregate?

It toke 1m32s in arcgis, and even less in Postgis.

Alexandre Neto

Scro

unread,
Oct 10, 2012, 5:26:21 PM10/10/12
to spatiali...@googlegroups.com
I can confirm this issue, and it relates to how many geometries are given to ST_Union in aggregate form.  I did a test like this:

SELECT ST_Union(t1.geom) agg_geom
FROM (
  SELECT geom
  FROM quadrangles
  LIMIT 10) t1;

...and increased the LIMIT each time.

10 polygons: 0.01 seconds
100 polygons: 0.07
200 polygons: 0.17
300 polygons: 0.35
400 polygons: 0.57
500 polygons: Crashed spatiaLite GUI

Hopefully someone has some insight on aggregating large(ish) quantities of polygons.

Thanks
-tom

sandro furieri

unread,
Oct 12, 2012, 7:02:24 AM10/12/12
to spatiali...@googlegroups.com
Some interesting details:
I've used a medium-sized set of about 1 thousand polygons
(Italy Local Council boundaries from ISTAT).

SELECT ST_Union(Geometry)
FROM com2011
WHERE cod_reg = 1;
--------------
completed in about 1 min and 20 secs

SELECT ST_UnaryUnion(ST_Collect(Geometry))
FROM com2011
WHERE cod_reg = 1;
--------------
completed in just 20 secs

so it seems definitively confirmed that using the indirect approach
ST_UnaryUnion(ST_Collect(...)) is noticeably faster than directly
using ST_Union() as an aggregate function.
I'm still investigating in further depth

bye Sandro


a.fu...@lqt.it

unread,
Oct 12, 2012, 3:48:23 PM10/12/12
to spatiali...@googlegroups.com
All right, solved. BTW an interesting story ;-)

very quick explanation: ST_Union is based on GEOS, and GEOS in turn
supports several related methods.

a) GEOSGeometry *GEOSUnion(const GEOSGeometry* g1,
const GEOSGeometry* g2);
this is the "basic" method, accepting two input geometries
and then returning the dissolved geometry.
corresponds to the SQL non-aggregate function: ST_Union(a, b)

b) GEOSGeometry *GEOSUnaryUnion(const GEOSGeometry* g1);
this one is the "advanced" method, very recently introduced and
absolutely requiring a very recent GEOS version (>= 3.3.0)
a single input geometry is accepted, this corresponding to some
(may be huge) collection of many individual geometries.
it's much more efficient than the previous one, when really
complex geometries have to be processed.

c) GEOSGeometry *GEOSUnionCascaded(const GEOSGeometry* g1);
more or less like UnaryUnion, but accepting *only* collections
of Polygons/MultiPolygons.
supported by earlier GEOS versions, and nowadays officially
*deprecated* (UnaryUnion being a more sophisticated replacement).

Do you think all this is rather confusing ? I agree with you :-D

the SpatiaLite's own implementation for ST_Union (aggregate function)
was stupidly based on GEOSUnion(a, b); it's now absolutely clear that
this simplistic approach cannot absolutely offer a decent performance
level while processing many and many polygons at once.

final solution (to be adopted on 4.0.0):
----------------------------------------
case a) if some recent GEOS version (>= 3.3.0) is available, then
ST_Union() [aggregate] will be based on GEOSUnaryUnion()

case b) when no recent GEOS version (<= 3.2.0) is available then
we'll have two further sub-cases:
case b.1) we simply have to merge many Polygons/MultiPolygons:
then GEOSUnionCascaded will be used.
case b.2) in any other case (Points, Linestrings, mixed-type)
then the "old&slow" approach based on GEOSUnion will be
still adopted, lacking any better alternative.

Please note: this one exactly is the same approach adopted by PostGIS.


interim solution (for 3.0.0 users)
----------------------------------
never use ST_Union() [aggregate function]; it's *really* very slow,
and its performances tend to degradate very quickly when an huge
number of Polygons is involved in the process.

use instead ST_UnaryUnion(ST_Collect(...))

bye Sandro


--
Il messaggio e' stato analizzato alla ricerca di virus o
contenuti pericolosi da MailScanner, ed e'
risultato non infetto.

Alexandre Neto

unread,
Oct 14, 2012, 5:22:25 PM10/14/12
to spatiali...@googlegroups.com
Thank you Sandro!

If you don't mind, I will post your explanation in GIS.Stackexchange. Or, if you prefer, you can post it yourself.


Best Regards,

Alexandre Neto

a.fu...@lqt.it

unread,
Oct 15, 2012, 4:03:17 AM10/15/12
to spatiali...@googlegroups.com
> If you don't mind, I will post your explanation in GIS.Stackexchange.
>

feel absolutely free ;-)

bye sandro

Damien

unread,
Sep 2, 2014, 4:23:10 PM9/2/14
to spatiali...@googlegroups.com
Hello,

I take the occasion of this thread because I am facing a weird problem directly related to what was said above.

I am using GAUL data and there are some polygons related to the same administrative unit that I would like to dissolve.
I used this query:
CREATE TABLE GAUL_1 AS
SELECT
STATUS
,
DISP_AREA
,
ADM2_CODE
,
ADM2_NAME
,
STR2_YEAR
,
EXP2_YEAR
,
ADM1_CODE
,
ADM1_NAME
,
ADM0_CODE
,
ADM0_NAME
,
ST_unaryunion
(ST_collect(Geometry)) AS Geometry
FROM GAUL1990_2
GROUP BY ADM2_CODE
;


But when I want to recover the geometry column (XY, Multipolygon) I get an error: "Geometry column doesn't satisfies required constraints". However, the geometry information should be compliant with the multipolygons:
count    GeometryType           SRID       Dimension
411      MULTIPOLYGON      4326   XY
576       POLYGON                4326         XY


Can anyone give me a clue to be able to recover this geometry? For the record, I also tried with inserting the data into an empty table with a clean geometry, the effect was the same.

Thank you for your help!

Damien


Jukka Rahkonen

unread,
Sep 2, 2014, 4:43:52 PM9/2/14
to spatiali...@googlegroups.com, Damien
Hi,

Polygons can be presented as multipolygons as you said but Spatialite
does not do it automatically. Push the geometries from ST_unaryunion
through CastToMultiPolygon
http://www.gaia-gis.it/gaia-sins/spatialite-sql-latest.html#cast and
they will all be MultiPolygons and you can recover the geometry column.

-Jukka Rahkonen-

a.fu...@lqt.it

unread,
Sep 2, 2014, 4:45:24 PM9/2/14
to spatiali...@googlegroups.com
On Tue, 2 Sep 2014 13:23:10 -0700 (PDT), Damien wrote:
> But when I want to recover the geometry column (XY, Multipolygon) I
> get an error: "Geometry column doesn't satisfies required
> constraints". However, the geometry information should be compliant
> with the multipolygons:
>
> count GeometryType SRID Dimension
> 411 MULTIPOLYGON 4326 XY
> 576 POLYGON 4326 XY
>

Hi Damien,

please note: Polygon and MultiPolygon are completely different
Geometry classes in the standard OGC model.
and consequently any attempt to insert a Polygon into a Table
declared of the MultiPolygon type will always be considered
a blatant violation of Geometry constraints.

Anyway nothing in the standard OGC model forbids that a
MultiSomething type will actually contain just a single
elementary item.
So you are absolutely free to promote all your Polygons
to be formally represented as MultiPolygons, thus
re-establishing the expected full consistency.

example:
--------
UPDATE my_table SET geom = CastToMultiPolygon(geom)
WHERE ST_GeometryType(geom) = 'POLYGON';

after executing this query your table will contain
all MultiPolygons, and RecoverGeometryColumn() will
finally work as you are expecting.

bye Sandro

Damien

unread,
Sep 3, 2014, 3:16:52 AM9/3/14
to spatiali...@googlegroups.com
Thank You Jukka and Sandro!

I naively thought that all Multi-something geometries also allowed the something. That's good to know.
Reply all
Reply to author
Forward
0 new messages