"R.Wieser" <add...@not.available> wrote
|
| I'm using InStr() and mid() in a loop on large data strings. What I've
| noticed is that the larger the datastring, the slower the process seems to
| become. The time difference between results being returned for a few
KByte
| and even 5 MByte of data is rather noticable (and I've got a 25 MByte
| file...).
|
For concatenation or adding to a string:
Memory allocation is the cost. When you add even one
character to a 25 MB string, a new ~25 MB buffer has to be
allocated and the entire old string copied into it.
According to Matthew Curland, VB6 has a string cache to
work from, up to a point. After that point it gets slow.
Presumably WScript and other programs also set aside a minor
cache, but I don't know about that.
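A minimal sketch of the effect (the loop count is arbitrary):

    ' Each & allocates a fresh buffer and copies all of s
    ' into it, so the total copying grows quadratically.
    Dim i, s
    s = ""
    For i = 1 To 100000
        s = s & "x"
    Next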
People end up doing things like writing string builder classes
to get around the problem. The fastest solution I know of in
VB is to allocate a string bigger than might be needed, then
point an array at it so that it can be treated as an array for
reading. Using the Mid *statement* then allows writing to
the string pretty much as fast as CopyMemory.
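A rough VB6 sketch of that technique (the buffer size and
names are mine):

    ' VB6 only: preallocate once, then write in place with the
    ' Mid statement instead of concatenating.
    Dim buf As String, pos As Long, piece As String
    buf = Space$(1000000)        ' bigger than we expect to need
    pos = 1
    piece = "something"
    Mid$(buf, pos, Len(piece)) = piece
    pos = pos + Len(piece)
    ' ...repeat for each piece, then trim to the used length:
    buf = Left$(buf, pos - 1)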
VBS doesn't have the Mid statement, only the function.
But it is extremely fast to do Join. So whenever I need to
build a string I'll do it that way:
    Dim A(10000), i, s
    For i = 1 To 10000
        s = "something" & i & " something else"
        A(i) = s
    Next
    s = Join(A, "")   ' A(0) stays empty; harmless with "" separator
For InStr:
If you see a lag just reading a string or doing InStr
then I don't know the cause, assuming that you're not
counting the time to allocate the string. VBS is just
generally slow. But I do find that case-insensitive searching
is much slower. So much so that if you have a lot of
searching to do, it's worth allocating a second string
s2 = UCase(s) and then searching that with a binary
compare: InStr(pt1, s2, "SOMETHING", 0)
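As a sketch, assuming s already holds the text:

    Dim s2, p
    s2 = UCase(s)                      ' pay the conversion once
    p = InStr(1, s2, "SOMETHING", 0)   ' 0 = binary compare
    Do While p > 0
        ' p is the same character position in the original s
        p = InStr(p + 1, s2, "SOMETHING", 0)
    Loop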
I just tried loading a 100 MB text file and looking for a
string near the end. It took .14 seconds. I'd expect a few
ms in VB, but VBS does a lot of clunky wrapping. Everything
has to be converted to/from variants. Direct memory access
is wrapped in layers. I just assume anything big will be
slow. On the other hand, if you need to deal with 25 MB
files often enough for it to matter, then maybe something
else is wrong. No *typical* text file should need to be
that big.
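That kind of check is easy to time with Timer (the path and
search string here are made up):

    Dim fso, txt, t0, p
    Set fso = CreateObject("Scripting.FileSystemObject")
    txt = fso.OpenTextFile("C:\big.txt", 1).ReadAll   ' 1 = ForReading
    t0 = Timer
    p = InStr(1, txt, "needle near the end", 0)
    WScript.Echo "found at " & p & " after " & (Timer - t0) & " sec."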
I have found a big difference with my server logs. I parse
server logs to extract all entries with a particular
date, so that I can create one file for each day. That's thousands
of InStr calls for each day. With a 5 MB file I can extract
and write files for 5 days in about 1-2 seconds. With a 300
MB server log I have to leave it and walk away; probably
more than a minute for each day. When the logs get big I use
InStrRev, but it's still very slow. It's possible WScript is
not keeping that string in RAM. I don't know.
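The shape of that extraction, roughly. The file names and
date string here are invented, and it collects matching
lines into an array for a single Join at the end, per the
advice above:

    Dim fso, txt, p, q, parts, n
    Set fso = CreateObject("Scripting.FileSystemObject")
    txt = fso.OpenTextFile("server.log", 1).ReadAll
    ReDim parts(1023)
    n = 0
    p = InStr(1, txt, "12/Mar/2019", 0)
    Do While p > 0
        q = InStr(p, txt, vbCrLf)          ' end of this line
        If q = 0 Then q = Len(txt) + 1
        If n > UBound(parts) Then ReDim Preserve parts(UBound(parts) * 2 + 1)
        parts(n) = Mid(txt, p, q - p)
        n = n + 1
        p = InStr(q + 1, txt, "12/Mar/2019", 0)
    Loop
    If n > 0 Then
        ReDim Preserve parts(n - 1)
        fso.OpenTextFile("2019-03-12.log", 2, True).Write Join(parts, vbCrLf)
    End If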