I once wrote a parser for IBM's "portable heapdump" (PHD) format, and
when I feed it a really large .phd file (>1 GB) and let it write its
output to CSV files, it finishes in under an hour on my machine.
Then I thought it might be faster (or at least more convenient for the
next steps of analysis) to write the data into a database instead. I read
a tutorial on the web <www.magicsplat.com/articles/tdbc.html> that showed
me how to open a database, create tables, and use prepared statements
for repeating tasks (such as inserting rows into tables).
I picked sqlite3 because I don't need concurrent access or user-management.
But when I ran my parser to write into that DB, it was so slow that after
an hour it had only produced a few kilobytes of database and journal
files, and the script hadn't reached the first of the 206 progress marks
for that input file. After I aborted it, the DB actually did contain some
consistent data, so it was successfully writing rows to the tables, just
way too slowly.
I suppose I've done something very foolish, or maybe "insert" is just
sqlite3's weak spot performance-wise, or maybe there are settings
("pragma ...") that I'd have to apply to the newly created DB before
inserts become fast, or maybe the sqlite3 tdbc driver is just a
"proof-of-concept" pure-Tcl driver ...
Any hints for an sqlite newbie?
PS: I've worked with Oracle and Informix databases from C++ and Java in
the past, so I'm not a general newbie with DB stuff, just with tdbc and
sqlite.