Not being an expert on indexes, I would say that building the indexes later is not a problem
- this could be a task that is done together with 'RecoverGeometryColumn'
-- after the basic import has been completed (see the sketch below)
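As a minimal sketch, the order of events would then be (the table name, geometry column, srid and geometry type here are only assumptions, not taken from your setup):
-- 1) bulk import has been completed, now register the geometry column
SELECT RecoverGeometryColumn('berlin_admin_geometries','geometry',3068,'MULTILINESTRING','XY');
-- 2) only now build the Spatial Index and any plain indexes
SELECT CreateSpatialIndex('berlin_admin_geometries','geometry');
CREATE INDEX IF NOT EXISTS idx_admin_name ON berlin_admin_geometries(name);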
What should not be done is to deactivate constraints
- I often use the following horror combination of:
PRIMARY KEY (id_admin,admin_level,name,valid_since,valid_until,changed_type)
-- here the constraints must be enforced while importing to ensure that the data is sane, as in the sketch below
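As a sketch, such a table definition could look like this (the column types and defaults are assumptions; only the column names and the PRIMARY KEY are taken from the samples in this mail):
CREATE TABLE IF NOT EXISTS berlin_admin
(
 id_admin INTEGER NOT NULL,
 name TEXT NOT NULL,
 notes TEXT DEFAULT '',
 admin_level INTEGER NOT NULL,
 valid_since TEXT NOT NULL,
 valid_until TEXT NOT NULL,
 changed_type INTEGER NOT NULL,
 id_belongs_to INTEGER NOT NULL,
 belongs_to TEXT DEFAULT '',
 PRIMARY KEY (id_admin,admin_level,name,valid_since,valid_until,changed_type)
);
- with this in place, any INSERT that would break the uniqueness fails immediately, instead of leaving insane data behind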
---
So it would seem to me you are doing this correctly
- so the question remains about the '.journal' file
In the past I have received a 'database disk image is malformed' error when
- on a long task, I become curious and impatient, and copy the file elsewhere to have a sneak 'preview'
-- without copying the '.journal' file, this will, in most cases, fail
--> sometimes it will work, but often not
The '.journal' is a ROLLBACK file
- so if something goes wrong while INSERTing or UPDATEing
-- sqlite3 will use this file to remove any data inserted into the database before a COMMIT has been completed
--> the crash may have occurred during the writing into the database (i.e. before the COMMIT was completed)
---> thus causing the 'malformed' error
So when a sqlite3 connection is started
- which is what any 'spatialite' connection also does
-- sqlite3 will look for a '.journal' file and try to repair the database
--> based on the information contained in the '.journal'
So in such a scenario
- where a '.journal' file exists
-- you may notice that after opening the database with spatialite_gui
--> the '.journal' file will disappear, after the offending data has been removed from the database (ROLLBACK)
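The same recovery can also be triggered from the command line; any statement will do, and 'PRAGMA integrity_check;' has the advantage of also reporting whether the file is still 'malformed' afterwards:
sqlite3 20160117.berlin_street_geometries.db "PRAGMA integrity_check;"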
So executing the following sql-script
- which lists a cleaned-up version of the tables/views (without most of the internal administration tables/views)
-- sqlite3 20160117.berlin_street_geometries.db < list_tables.sql
SELECT name
FROM sqlite_master
WHERE
(
  (name NOT LIKE '%_f_%') AND
  (name NOT LIKE '%ref_sys%') AND
  (name NOT LIKE '%_history') AND
  (name NOT LIKE '%_log') AND
  (name NOT LIKE '%_insert') AND
  (name NOT LIKE '%_update') AND
  (name NOT LIKE '%_columns%') AND
  (name NOT LIKE '%_view_%') AND
  (name NOT LIKE '%_layers%') AND
  (name NOT LIKE '%_temp%') AND
  (name NOT LIKE '%_ins_%') AND
  (name NOT LIKE '%_upd_%') AND
  (name NOT LIKE 'sqlite_%') AND
  (name NOT LIKE 'vw_%') AND
  (name NOT LIKE 'gg%') AND
  (name NOT LIKE 'gi%') AND
  (name NOT LIKE 'tm%') AND
  (name NOT LIKE 'SpatialIndex') AND
  (name NOT LIKE 'ElementaryGeometries') AND
  (name NOT LIKE 'KNN') AND
  (name NOT LIKE 'idx_%')
);
This may be enough to force any needed ROLLBACK
- this is the last possibility I can think of to resolve such a situation
-- and, of course, this will only work for data INSERTed within a TRANSACTION (i.e. BEGIN / COMMIT)
--> which you are doing
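For completeness, the pattern in question is purely this (a sketch):
BEGIN;
-- ... the grouped INSERT statements ...
COMMIT;
- should anything crash between BEGIN and COMMIT, the '.journal' contains everything needed for the ROLLBACK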
--
The only other way to (possibly) speed up the bulk INSERTing of data in batch mode
- is to group your INSERT commands
-------
-- 1811042191 - Pariser Platz [1811042191,Pariser Platz]
-------
INSERT INTO berlin_admin
(id_admin, name, notes, admin_level, valid_since, valid_until, changed_type, id_belongs_to, belongs_to)
VALUES
(1811042191,'Viereck','',11,'1732-01-01','1733-12-31',1,1811042191,'Pariser Platz'),
(1811042191,'Quarree','',11,'1734-01-01','1814-09-14',1,1811042191,'Pariser Platz'),
(1811042191,'Pariser Platz','Bekanntmachung vom 1814-09-16',11,'1814-09-15','3000-01-01',1,1811042191,'Pariser Platz'),
(1811042191,'Pariser Platz','',11,'1732-01-01','3000-01-01',2,1811042191,'Pariser Platz'),
(1811042191,'Pariser Platz','',11,'1732-01-01','3000-01-01',3,1811042191,'Pariser Platz'),
(1811042191,'Pariser Platz','1-6',11,'1814-09-15','3000-01-01',4,1811042191,'Pariser Platz');
This was created with the following sql
- which is used in a php-script to dump out the database
-- each admin_level (11=street) in its own file
--> changed_type=1 means the name of the (here) street has changed; 4=the houses have changed (the street has been extended or shortened)
---> so a new geometry must be created for each changed situation
So each record must be unique
- thus the 'PRIMARY KEY (id_admin,admin_level,valid_since,valid_until,changed_type,id_belongs_to)'
The record is dumped out with:
SELECT
  '-------'||char(10)||
  '-- '||id_admin||' - '||name||' ['||id_belongs_to||','||belongs_to||']'||char(10)||
  '-------'||char(10)||
  'INSERT INTO berlin_admin'||char(10)||
  ' (id_admin, name, notes, admin_level, valid_since, valid_until, changed_type, id_belongs_to, belongs_to)
VALUES'||
  group_concat((
    SELECT
      char(10)||' ('||id_admin||','''||name||''','''||notes||''','||admin_level||','''||valid_since||''','''||valid_until||''','||changed_type||','||id_belongs_to||','''||belongs_to||''')'
  ))||';' AS sql_insert
FROM
(
  SELECT * FROM berlin_admin
  WHERE ((id_admin IN (1811042191)) AND (admin_level IN (11)))
  ORDER BY changed_type,valid_since
);
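Run against the database and redirected to a file, this could look like (the file names here are hypothetical; in practice this runs inside the php-script mentioned above):
sqlite3 20160117.berlin_admin.db < dump_admin_level_11.sql > berlin_admin_level_11.sql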
In this way, when manually editing the data
- it is easier to ensure the uniqueness
-- but it also speeds up the creation of the database with a bulk INSERTing of the data
Hope this helps
Mark