On Wed, 19 Jul 2017 02:43:05 -0700 (PDT), Alessandro Donati wrote:
> Hi Sandro,
>
> thanks for your reply
>
> what you say makes perfectly sense to me. An obstacle is the huge size
> of the original shapefile (several GB) which makes importing the whole
> dataset into the topology a really slow process. That's why I had to
> clip it.
>
> I've noticed that TopoGeo_FromGeoTable draws 25% of my CPU but not so
> much RAM (~300MB).
>
Alessandro,
I see that you are using an Intel i5 CPU, having 4 logical cores.
both SQLite and SpatiaLite are single-threaded, so a 25% workload
on 4 logical cores simply means that you are already squeezing every
possible bit of computational power out of a single core.
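As a rough sketch of the arithmetic (Python, purely illustrative; the
helper name is made up for this example): a process that keeps exactly
one core busy can show at most 100 / N percent overall load on a machine
with N logical cores.

```python
def single_thread_ceiling(logical_cores):
    """Highest overall CPU load (in percent) that one fully busy thread can show."""
    if logical_cores < 1:
        raise ValueError("logical_cores must be at least 1")
    return 100.0 / logical_cores


# 4 logical cores, as on the i5 discussed above
print(single_thread_ceiling(4))  # 25.0
```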
> The process of importing a single (clipped) dataset is very slow.
>
Topology == boring slowness (but ultra-high quality)
this is true in general, and the current implementation
of librttopo (the topology engine) could hardly be judged a
"fast performer".
> Some test data:
>
> Clipped shapefile dataset
> Size: ~7MB
> Polygons: 207
> Nodes (total number of polygon vertexes): 441586
> CPU Time: user 1233.016304 sys 11.887276 on Intel Core i5-3320M 2.6
> GHz, 8GB RAM (yet 32 bit process), SSD hard disk, Windows 7
>
> based on your experience, are these performances reasonable given the
> dataset complexity or should I investigate more about this slowness?
>
your figures aren't at all exceptional.
during the preliminary Topology testing we frequently encountered
really huge datasets (in the "many GB / millions of features" range)
requiring a full week to be loaded into a Topology :-D
> BTW, all components (sqlite, spatialite, librttopo, etc) are built in
> release mode using original makefile.vc settings
>
> Is there an alternative workflow to what I have described (in terms of
> SELECT TopoGeo_xxx) to speed up the process, at expense of more RAM
> perhaps?
>
yes, you can usefully try using the more recent import interfaces
based on librttopo 1.1:
TopoGeo_FromGeoTableNoFaceExt()
and
TopoGeo_Polygonize()
they implement a two-stage import process:
1. first all Nodes and Edges are imported, completely ignoring Faces.
2. the Faces are then built in a separate final step.
caveat: these new functions are noticeably faster but they usually
require an impressive amount of RAM.
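
A minimal sketch of the two-stage workflow, written in Python only to
keep the two statements together. Several things here are assumptions to
be checked against the SQL functions reference of your SpatiaLite build:
the argument order (topology name, db-prefix, table, geometry column,
dustbin table, dustbin view), the 'main' db-prefix, and every topology
and table name, all of which are hypothetical. The topology itself is
assumed to exist already.

```python
def two_stage_import(topology, table, column, dustbin_table, dustbin_view):
    """Return (sql, params) pairs for the two import stages, in execution order."""
    # Stage 1: import Nodes and Edges only, ignoring Faces.
    # Assumed argument order: topology, db-prefix, table, column,
    # dustbin table, dustbin view (verify for your version).
    stage1 = (
        "SELECT TopoGeo_FromGeoTableNoFaceExt(?, 'main', ?, ?, ?, ?)",
        (topology, table, column, dustbin_table, dustbin_view),
    )
    # Stage 2: build all Faces in one separate final step.
    stage2 = ("SELECT TopoGeo_Polygonize(?)", (topology,))
    return [stage1, stage2]


# Hypothetical names, for illustration only.
for sql, params in two_stage_import("topo", "parcels", "geom",
                                    "parcels_dustbin", "parcels_dustbin_view"):
    print(sql, params)
```

Each pair could then be passed to Connection.execute(sql, params) of the
sqlite3 module, on a connection where the SpatiaLite extension has been
loaded.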
I suppose that using 64 bit software is a practical requirement,
because a 32 bit process is limited to a 2GB address space, which
on Windows platforms usually shrinks to about 1.5GB or (quite easily,
if your memory is highly fragmented) to a mere 0.8GB
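
If the import happens to be driven from Python, a quick way to check
whether the hosting process is 32 or 64 bit is to look at the pointer
size. This is only a small sketch: it reports on the Python interpreter
itself, not on any other executable.

```python
import struct


def process_bits():
    """Pointer width of the running interpreter, in bits."""
    return struct.calcsize("P") * 8


print(f"{process_bits()} bit process")
```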
bye Sandro