Is it possible to pre-allocate space to a DBM::Deep database file?


SonOfTheLama

Jun 9, 2009, 6:12:24 PM
to DBM-Deep
I am using DBM::Deep 1.0013 with ActiveState Perl on Win32. I wrote a
program to process ~800 MB text files containing 10 to 100 million
lines of roughly 60 bytes each. I store portions of the text in a
multilevel array, which I use to compare the data from one line
against the data from every other line in a given file. Processing
this much data this way is going to take a while, but I'd like to
optimize the process as best I can.

Initially the program reads in the source file and DBM::Deep stores
the relevant data in the database file. I can see the file growing
incrementally as it receives the data, and this phase takes hours to
complete. I know from past experience that growing a file
incrementally can be costly from an I/O, file-system, and OS
standpoint, so I'm wondering: would it be possible to pre-allocate a
large amount of space to the database file before populating it with
data, so it does not have to grow incrementally?

Any advice would be most appreciated.

Thank you,

Rob Kinyon

Jun 9, 2009, 7:26:39 PM
to DBM-...@googlegroups.com
You most certainly can pad the file with \0. That could possibly help
and, if you want, I'll gladly accept a patch for doing that.
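The padding Rob describes can be sketched generically. This is a minimal, OS-level sketch in Python (the thread's environment is Perl, where the same seek-and-write idea applies via `seek` and `print`); it only illustrates the technique of extending a file with NUL bytes in one operation, not a DBM::Deep patch, and whether DBM::Deep 1.0013 tolerates trailing NULs is Rob's claim above, not something this code verifies:

```python
import os

def preallocate(path, size):
    """Pad a file out to `size` bytes with NULs so later writes
    do not have to extend it one block at a time."""
    with open(path, "ab"):           # create the file if it is missing
        pass
    with open(path, "r+b") as fh:
        fh.seek(0, os.SEEK_END)
        if fh.tell() < size:
            fh.seek(size - 1)        # seek past the current end of file
            fh.write(b"\0")          # a single write extends the file to `size`
```

Seeking past EOF and writing one byte lets the filesystem allocate the whole run at once instead of growing the file incrementally with every insert.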

As for what you're doing ... Have you thought about hashing the
various components you're comparing and building a hash of arrays
instead of a multi-level array? DBM::Deep's handling of arrays is
extremely poor: it treats arrays as hashes with numerical keys.
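Rob's hash-of-arrays idea can be sketched generically. This Python sketch (the field being compared and all names here are hypothetical, not from the thread) groups lines under a digest of the component of interest, so each line is compared only against the lines in its own bucket rather than against every other line in a multi-level array:

```python
import hashlib
from collections import defaultdict

def bucket_key(line):
    # Hypothetical choice: compare on the first whitespace-separated field.
    field = line.split()[0]
    return hashlib.md5(field.encode()).hexdigest()

# A hash of arrays: digest of the compared component -> list of lines.
buckets = defaultdict(list)
for line in ["alpha 1", "beta 2", "alpha 3"]:
    buckets[bucket_key(line)].append(line)

# Each comparison now only touches one bucket, and hash lookups map
# naturally onto DBM::Deep's strength (hashes) rather than its weakness
# (arrays emulated as hashes with numerical keys).
```

The same structure in Perl would be a hash of array refs keyed by, e.g., `md5_hex($field)` from Digest::MD5.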

Rob
--
Thanks,
Rob Kinyon