Lots of data. That seems to call for a database rather than for a Tcl script.
But maybe you can ease the load for Tcl: if the Big one is a text file,
and the lines that interest you have a recognizable pattern, pipe it
through sed, grep, or awk (or Perl, as you suggested). Here are some
shots in the dark:
When you process lines in Tcl, it may help to replace several regsub
calls with a single regexp.
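For example, a minimal sketch (the field layout and variable names are
only assumptions for illustration):
# one regexp with capture groups pulls all the fields out of a line in a
# single pass, instead of running several regsubs over the same line
set line "1999-06-30 ERROR 042 disk full"
if {[regexp {^(\S+)\s+(\S+)\s+(\d+)\s+(.*)$} $line -> date level code msg]} {
    puts "$date / $level / $code / $msg"
}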
When calculating with expr, put curly braces around the expression:
expr {$foo + $bar * $baz} ;# instead of
expr $foo + $bar * $baz
Incrementing counters is best done with
incr me ;# instead of
set me [expr $me+1]
If you can extract a representative test sample from the Big file, do
that and time various program alternatives to find out what's fastest.
Also, by turning parts of the processing on and off, identify the parts
that eat most of the time; they will profit most from optimization.
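A minimal sketch using Tcl's built-in time command (the loop bodies are
only placeholders for your real processing):
proc variantA {} {
    for {set i 0} {$i < 10000} {incr i} { set x [expr {$i * 2}] }
}
proc variantB {} {
    for {set i 0} {$i < 10000} {incr i} { set x [expr $i * 2] }
}
# time reports the average microseconds per iteration over 10 runs
puts "braced expr:   [time variantA 10]"
puts "unbraced expr: [time variantB 10]"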
Post non-confidential code examples here on c.l.t for more detailed help
(never guaranteed, but often forthcoming).
--
Schoene Gruesse/best regards, Richard Suchenwirth -- tel. +49-7531-86 2703
RC DT2, Siemens Electrocom GmbH, Buecklestr. 1-5, D-78467 Konstanz, Germany
My opinions were not necessarily, or will not necessarily be, mine.
Gregory,
Look at the discussion of Tcl performance at
http://purl.org/thecliff/tcl/wiki/TclPerformance
It is hard to tell why your code seems to be slow without seeing it. My
guess would be that you are using the default buffer size when reading
the 90Mb file. See the section on "slurping up data files" towards the
bottom of the Tcl Performance page. If this is the primary issue, it
could make a 50x speed difference in your code.
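A minimal sketch of that "slurping" approach (the file name is only a
placeholder):
# read the whole file in one gulp instead of many small buffered reads
set f [open bigfile.dat r]
set data [read $f [file size bigfile.dat]]
close $f
# then split into lines in memory and process them
foreach line [split $data \n] {
    # ... process $line here ...
}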
Good luck,
Bob
--
Bob Techentin techenti...@mayo.edu
Mayo Foundation (507) 284-2702
Rochester MN, 55905 USA http://www.mayo.edu/sppdg/sppdg_home_page.html
My gut reaction is this program is taking waaaayyyyy too long to
execute. I would think an HP9K could process a 90mb file in a matter of
a few tens of minutes, max. For example, I can set 800,000 array
elements to a static value in under a minute on a Pentium II class
machine. I can read a 35mb file in less than two minutes. I find it hard
to believe that actually processing that data could take almost seven
hours. Well, ok, I can believe it, but I also believe that can be cut
down by an order of magnitude.
My guess is there is a bottleneck in how you read the data in and/or
write it out. Look for a long thread on I/O performance in the past
couple of weeks. The nutshell summary is that it's significantly faster
to read with the actual size of the data (e.g. [read $fileid [file size
$file]]) than to default to a relatively small buffer size.
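On the write side, a minimal sketch of the same idea (the file name,
loop, and buffer size are only assumptions for illustration):
# enlarge the output channel's buffer so data goes out in big chunks
# instead of the default (typically 4 KB)
set out [open result.dat w]
fconfigure $out -buffersize 65536
for {set i 0} {$i < 100000} {incr i} {
    puts $out "line $i"
}
close $out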
There may also be some subtle optimizations that could make a huge
difference. For example, putting curly braces around expressions,
limiting the use of eval, and making sure you aren't causing a lot of
list->string conversions (or vice versa), that sort of thing. But
without seeing the code, it's hard to say.
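To illustrate the list->string point, a minimal sketch (variable and
value names are just placeholders): keep a value in one representation
and convert once at the end.
set words {}
foreach field {alpha beta gamma} {
    lappend words $field      ;# lappend keeps the value a list
}
puts [join $words ,]          ;# convert to a string once, at the end
# mixing string ops (append) and list ops (lindex) on the same variable
# makes Tcl re-convert the value back and forth on every access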
--
Bryan Oakley mailto:oak...@channelpoint.com
ChannelPoint, Inc. http://purl.oclc.org/net/oakley
Education is full of oversimplified lies which can be
refined into the truth later.
Greetings!
Volker