rm tmp.tmp
foreach ($file in dir *exitlc.cfg)
{
echo $file.name
mv $file tmp.tmp
foreach ($line in cat tmp.tmp)
{
if ($line -match "^#.+version")
{
$line = $line -replace "version[ \t]+[^ \t]+", "version 1.03"
}
if ($line -match "(.+DS60_[^ \t]+[ \t]+[^ \t]+[ \t]+)")
{
$line = $matches[1] + "25 25"
}
$line | add-content -encoding ascii $file
}
rm tmp.tmp
}
The script works just fine, but it takes between 1.5 and 2.0 seconds to
execute for each file. Each of these files is simple ASCII text of
about 300 lines. At first blush this seems like a very long time to
process such small files. I'd appreciate comments on whether I could
do better, and on why this script takes so long.
This is using PS 2.0. My OS is Windows XP SP3. The CPU is 2.6 GHz, and RAM
is 1 GB. The disk is a Promise RAID 1 array of two 120 GB drives.
Thanks,
Al
1)
if ($line -match "^#.+version")
{
$line = $line -replace "version[ \t]+[^ \t]+", "version 1.03"
}
Convert this to a single command that uses only -replace, something like:
$line = $line -replace '(?<pre>^#.+)(?<v>version[ \t]+[^ \t]+)','${pre}version 1.03'
2) Maybe .+? instead of .+ would be quicker.
3) I wouldn't add each line one by one; I would save everything at once.
If you publish the files somewhere, I can check how quick it is on my
machine.
stej
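As a rough illustration of point 1, the sketch below applies the single -replace to a couple of made-up lines (the real .cfg contents are not shown in this thread, so the sample data is an assumption). It relies on the fact that -replace returns its input unchanged when the pattern does not match, which is why the separate -match test can be dropped.

```powershell
# Hypothetical sample lines; the actual file contents are not part of the thread.
$lines = @(
    '# config version 1.02',
    'some other line'
)

# Pattern from the suggestion above, without the unused "v" group.
$pattern = '(?<pre>^#.+)version[ \t]+[^ \t]+'

$result = foreach ($line in $lines) {
    # Non-matching lines pass through untouched, so no -match guard is needed.
    $line -replace $pattern, '${pre}version 1.03'
}

$result
```

Note that the replacement string is single-quoted so that PowerShell passes ${pre} to the regex engine instead of expanding it as a variable.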
. {
foreach ($line in cat tmp.tmp)
{
...
$line
}
} | Set-Content $file
==
Thanks,
Roman Kuzmin
http://code.google.com/p/farnet/
PowerShell and .NET support in Far Manager
Having a quick peek at your code, I would write it a little bit differently and
use RegEx replace instead: http://msdn.microsoft.com/en-us/library/system.text.regularexpression...
Martin
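Instead of using add-content on each line, I accumulate each modified
line for the current file in a buffer variable:
$buf = $buf + $line + "`n"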
After the file has been processed, I dump the entire buffer into the file:
set-content -encoding ascii -value $buf $file
I also added some instrumentation to measure per-file times. The code
now processes one file per 70 ms, as opposed to one file per 1500 ms, an
improvement by more than a factor of 20. Very impressive.
I imagine the problem with add-content is that it does a complete open,
append, and close cycle on the file for each line, and all of that extra
directory overhead is what's using up all the time.
Thanks for the advice.
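For readers who want to try the batching idea, here is a minimal sketch of one way to do it. It is not the exact script used above: the temp file name and its three lines are invented for the example, and the lines are collected from the loop's output rather than concatenated into $buf.

```powershell
# Hypothetical sample file standing in for one of the *exitlc.cfg files.
$file = Join-Path ([IO.Path]::GetTempPath()) 'sample_exitlc.cfg'
Set-Content -Encoding ascii $file @('# sample version 1.02', 'X DS60_A 7 10 10', 'plain line')

# Read the whole file first so it can be overwritten safely afterwards.
$lines = @(Get-Content $file)

$out = foreach ($line in $lines) {
    $line = $line -replace '(?<pre>^#.+)version[ \t]+[^ \t]+', '${pre}version 1.03'
    if ($line -match '(.+DS60_[^ \t]+[ \t]+[^ \t]+[ \t]+)') {
        $line = $matches[1] + '25 25'
    }
    $line   # emitted by the loop and collected in $out
}

# One open/write/close for the whole file instead of one per line.
Set-Content -Encoding ascii $file $out

$written = @(Get-Content $file)
Remove-Item $file
$written
```

Because the file is read completely before it is rewritten, the tmp.tmp rename step from the original script is not needed in this variant.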
You might try the StringBuilder class as well. Every time you join
strings, a new string is created, which is expensive in CPU and memory.
$buff = new-object text.stringbuilder 200 # 200 is the initial capacity; use the total expected length (default is 16)
[void]$buff.AppendLine('first line')
[void]$buff.AppendLine('second line')
$buff.ToString() # returns "first line<new line>second line<new line>"
In your case it would be [void]$buff.AppendLine($line)
stej
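A small sketch of how the StringBuilder variant might look inside the per-line loop. The three input lines and the capacity of 2000 are placeholders, not values from the original script.

```powershell
# Hypothetical stand-in for the lines of one file.
$lines = @('first', 'second', 'third')

# Initial capacity is only a hint; the builder grows as needed.
$buff = New-Object System.Text.StringBuilder 2000

foreach ($line in $lines) {
    # AppendLine returns the builder itself; [void] keeps it out of the output.
    [void]$buff.AppendLine($line)
}

$text = $buff.ToString()
$text
```

One difference worth knowing: AppendLine terminates each line with [Environment]::NewLine (CR LF on Windows), whereas the $buf approach above appends only "`n".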
My measurements show that
$buf = $buf + ($line + "`n")
will execute much faster if a large number of iterations are involved.
- Larry
# append long string
measure-command { 1..5000 | % { $buff = new-object text.stringbuilder 10000; 1..100 |% { $buff.AppendLine(('a'*100)) } } }
40.8 sec
measure-command { 1..5000 | % { $buff = ''; 1..100 |% { $buff += ('a'*100) +"`n" } } }
43.1 sec
# append very short string
measure-command { 1..50 | % { $buff = new-object text.stringbuilder 10000; 1..10000 |% { $buff.AppendLine('a') } } }
36.1 sec
measure-command { 1..50 | % { $buff = ''; 1..10000 |% { $buff += 'a' +"`n" } } }
43.5 sec
It depends very much on the length of the appended string.
$s = $s + "xxx" + "yyy"
just makes it slower still by an unexpectedly large factor.
Consider:
$s = ""; Measure-Command {1..10 | % {$s = $s + "xyzzy" + "`n"} } | fl Ticks
Ticks : 68271
6,827 ticks per append
$s = ""; Measure-Command {1..100 | % {$s = $s + "xyzzy" + "`n"} } | fl Ticks
Ticks : 547485
5,475 ticks per append
$s = ""; Measure-Command {1..1000 | % {$s = $s + "xyzzy" + "`n"} } | fl Ticks
Ticks : 4095285
4,095 ticks per append
$s = ""; Measure-Command {1..10000 | % {$s = $s + "xyzzy" + "`n"} } | fl Ticks
Ticks : 181310792
18,131 ticks per append
$s = ""; Measure-Command {1..20000 | % {$s = $s + "xyzzy" + "`n"} } | fl Ticks
Ticks : 1032937725
51,647 ticks per append
So it does not scale well at all for the largest iteration counts.
Interestingly, look what this code variation measures as:
$s = ""; Measure-Command {1..10 | % {$s = $s + "xyzzy`n"} } | fl Ticks
Ticks : 62555
6,256 ticks per append
$s = ""; Measure-Command {1..100 | % {$s = $s + "xyzzy`n"} } | fl Ticks
Ticks : 588960
5,890 ticks per append
$s = ""; Measure-Command {1..1000 | % {$s = $s + "xyzzy`n"} } | fl Ticks
Ticks : 3829056
3,829 ticks per append
$s = ""; Measure-Command {1..10000 | % {$s = $s + "xyzzy`n"} } | fl Ticks
Ticks : 109793102
10,979 ticks per append (compare to 18,131 ticks per append)
$s = ""; Measure-Command {1..20000 | % {$s = $s + "xyzzy`n"} } | fl Ticks
Ticks : 571517650
28,576 ticks per append (compare to 51,647 ticks per append)
So, not much optimization going on there!
- Larry
> $s = ""; Measure-Command {1..20000 | % {$s = $s + "xyzzy" + "`n"} } | select -exp ticks; $s.length
216039306
120000
> $s = ""; Measure-Command {1..(20000/4*2) | % {$s = $s + "xyzzy" + "`n" + "xyzzy" + "`n"} } | select -exp Ticks; $s.length
193721120
120000
> $s = ""; Measure-Command {1..(20000/6*2) | % {$s = $s + "xyzzy" + "`n" + "xyzzy" + "`n" + "xyzzy" + "`n" } } | select -exp Ticks; $s.length
205190128
120006
> $s = ""; Measure-Command {1..(20000/8*2) | % {$s = $s + "xyzzy" + "`n" + "xyzzy" + "`n" + "xyzzy" + "`n" + "xyzzy" + "`n" } } | select -exp Ticks; $s.length
181509387
120000
IMHO it doesn't matter how many "+" operators there are. I think all the
magic could be caused by the garbage collector. With every + a
new string is allocated and the old one is discarded. That's quite memory
expensive, and when too many objects are allocated, the GC runs.
I mentioned StringBuilder because of this
(it's equivalent to the last command):
> $buff = new-object text.stringbuilder; Measure-Command { 1..(20000/8*2) | % {
$buff.append("xyzzy"); $buff.append("`n");
$buff.append("xyzzy"); $buff.append("`n");
$buff.append("xyzzy"); $buff.append("`n");
$buff.append("xyzzy"); $buff.append("`n") } } | select -exp Ticks;
$buff.length
9064070
120000
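Another way to avoid the repeated allocation described above, which nobody in this thread proposed, is to collect the pieces in an array and join them once at the end. The sketch below builds the same 120000-character string as the tests above; it is an alternative technique offered for comparison, not a measurement, so no timing is claimed.

```powershell
# Collect 20000 pieces as loop output instead of growing a string.
$pieces = foreach ($i in 1..20000) { 'xyzzy' }

# A single -join allocates the final string once; add the trailing newline
# so the result matches the "xyzzy`n" repeated 20000 times used above.
$s = ($pieces -join "`n") + "`n"

$s.Length
```

Wrapping this in Measure-Command on your own machine would show how it compares with the + and StringBuilder variants.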
A degradation at an almost exponential rate is not something you want to allow
Murphy to have at his disposal! (or maybe better to say it will prove Murphy
correct at the worst of times)
- Larry
$s = ""; Measure-Command {1..20000 | % {$s = $s + "xyzzy" + "`n"} } | select -exp ticks; $s.length
1349420647
120000
$s = ""; Measure-Command {1..20000 | % {$s = $s + ("xyzzy" + "`n")} } | select -exp ticks; $s.length
727340851
120000
$s = ""; Measure-Command {1..20000 | % {$s = $s + "xyzzy`n"} } | select -exp ticks; $s.length
732108064
120000
Do you get this large difference in runtime?
- Larry
> $s = ""; Measure-Command {1..20000 | % {$s = $s + "xyzzy" + "`n"} } | select -exp ticks; $s.length
258436122
120000
> $s = ""; Measure-Command {1..20000 | % {$s = $s + ("xyzzy" + "`n")} } | select -exp ticks; $s.length
107454640
120000
> $s = ""; Measure-Command {1..20000 | % {$s = $s + "xyzzy`n"} } | select -exp ticks; $s.length
135194417
120000
I have no explanation for why there is such a huge difference.
stej
This one is interesting enough that I may open an issue about it at
https://connect.microsoft.com/powershell
- Larry
https://connect.microsoft.com/PowerShell/feedback/ViewFeedback.aspx?FeedbackID=513075
- Larry