I am trying to use the sed command with special arguments and it fails.
1- I want to add 'TITI' before a '{' and add 'TOTO' after the
'{', so I write
exec sed -e '/{/aTITI' -e '/{/iTOTO' inFile > outFile
TCL error is: missing close-brace: possible unbalanced brace in
comment.
To solve it I use \ to 'protect' the curly brace
exec sed -e '\/{/aTITI' -e '/\{/iTOTO' inFile > outFile
and TCL error is sed: -e expression #1, char 1: Unknown command: `''
2- I also want to replace
//BLABLA by //BLOBLO
exec sed -e "s//BLABLA///BLOBLO" inFile > outFile
TCL error: sed: -e expression #1, char 0: No previous regular
expression
To solve it I use \ to 'protect' the /,
but I still get the same error.
Can you help me?
Thanks,
fab
First, I will mention that this can all be done in pure Tcl without needing
to call out to sed. But assuming you have a need or desire to do so, the issue
is that you are quoting incorrectly. The single quotes ' are a SHELL
quoting mechanism, not a Tcl one. You are writing Tcl, so you
need to use Tcl quoting mechanisms, and there are only two of them:
double quotes "", which group characters into a word while substitution
still occurs, and braces {}, in which no substitution occurs.
The first error is that Tcl does not see the ' as a quoting mechanism,
so the { is assumed to start a grouping. Switching to double quotes
will fix this without the need to escape the brace.
The second error is that the ' is being passed to sed as part of the
argument itself, and sed does not like it. Switching to double
quotes will fix this as well.
exec sed -e "/{/aTITI" -e "/{/iTOTO" inFile > outFile
should work better.
Beware: the single quote is not a metacharacter in Tcl. You want:
exec sed -e "/\{/aTITI" -e "/\{/iTOTO"
> To solve it I use \ to 'protect' the curly brace
> exec sed -e '\/{/aTITI' -e '/\{/iTOTO' inFile > outFile
> and TCL error is sed: -e expression #1, char 1: Unknown command: `''
First, I think the first expression started with '/\{' and not '\/{'
(otherwise you'd have received the same punishment as before).
Second, this shows you that the single quotes are passed to the
underlying program (sed) as normal characters, as explained above.
Of course usually you type all this in one of the sh or csh families,
where single quotes play a different role ;-)
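To make the quoting point concrete, here is a minimal sketch that needs no sed at all. The helper showArg is hypothetical (it is not a Tcl built-in); it simply hands back the word exactly as Tcl passed it in, which is the same word exec would hand to sed.

```tcl
# Hypothetical helper: returns its single argument exactly as Tcl received it.
proc showArg {word} {
    return $word
}

# Single quotes are ordinary characters in Tcl, so they stay part of the word.
set single [showArg '/x/aTITI']

# Double quotes and braces both group; neither appears in the resulting word.
set double [showArg "/x/aTITI"]
set braced [showArg {/x/aTITI}]

# Inside double quotes, \{ yields a literal { and keeps the braces balanced
# if this line later ends up inside a braced proc body.
set pattern "/\{/aTITI"

puts $single
puts $double
puts $braced
puts $pattern
```

The first puts shows the quotes still attached to the word, which is what sed complained about.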
> 2- I also want to replace
> //BLABLA by //BLOBLO
> exec sed -e "s//BLABLA///BLOBLO" inFile > outFile
> TCL error: sed: -e expression #1, char 0: No previous regular
> expression
After visiting sh's and (a bit of) Tcl's Quoting Hells, welcome to
sed's :-)
Though you can escape slashes this way:
exec sed -e "s/\\/\\/BLABLA/\\/\\/BLOBLO/"
or (only Tcl-quoting differs):
exec sed -e {s/\/\/BLABLA/\/\/BLOBLO/}
sed has a much nicer tool: you can use any character instead of "/" as
the s-command arg separator. So choose one that doesn't collide with
your own regexp and you're saved:
exec sed -e s@//BLABLA@//BLOBLO@
-Alex
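As a small check of the "only Tcl-quoting differs" remark, the sketch below compares the two spellings of the escaped sed script as plain Tcl strings, without running sed. It also shows, as an assumption about what the pure-Tcl route might look like for this one replacement, that string map needs no escaping of slashes at all. The sample line is made up for illustration.

```tcl
# The same sed script written with double quotes and with braces.
set viaQuotes "s/\\/\\/BLABLA/\\/\\/BLOBLO/"
set viaBraces {s/\/\/BLABLA/\/\/BLOBLO/}

# Both spellings produce the identical string that sed would receive.
puts [string equal $viaQuotes $viaBraces]

# Pure-Tcl alternative for this replacement: "/" is not special to string map.
set line "//BLABLA some comment"
set fixed [string map {//BLABLA //BLOBLO} $line]
puts $fixed
```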
Alex and Bruce have already said all there is to say. For completeness,
I'll mention that
<URL: http://phaseit.net/claird/comp.lang.tcl/fmm.html#sed >
covers the same material.
Thanks for all your answers.
Someone suggested that the sed command could be replaced by Tcl code. I
think that exec sed is more efficient than developing a Tcl routine for
this. What is your opinion about it in terms of run time?
Fab
It depends on the size of the data.
For a small file the fork/exec overhead of sed loses the battle.
For a big one the unbeatable speed of that old predator will dwarf
Tcl's best efforts.
A secondary concern may be (take your pick):
- portability (though you can have sed on Windows, but you may need
to bundle it)
- proper handling of i18n (Tcl wins)
- expressivity of the regexp engine (sed only has simple ones)
- geekness (a big decision tree in sed is a piece of art).
-Alex
It depends on how many substitutions are to be made, and how large the
file is.
Executing a no-op sed:
% time {exec sed -e {s/foo/bar/g} < /dev/null}
1823 microseconds per iteration
This will be the overhead of starting the external program.
Processing time will be added.
Executing a no-op regsub:
% time {regsub -all foo {} bar -}
11 microseconds per iteration
This suggests that the TCL solution will be faster for some amount of
data and number of substitutions. You will have to profile a typical
real-world example to check what is faster on your machine.
R'
I think what takes time is not the substitution but the parsing of
the whole file. For this operation sed seems faster, but it strongly
depends on the size of the file.
Fab
Just for completeness, I'll show how I'd do this in Tcl:
set in [open infile r]
set out [open outfile w]
while {[gets $in line] != -1} {
    regsub {//BLABLA} $line {//BLOBLO} newline
    set hasBrace [expr {[string first \{ $line] != -1}]
    if {$hasBrace} {puts $out TITI}
    puts $out $newline
    if {$hasBrace} {puts $out TOTO}
}
close $in
close $out
--
Glenn Jackman
Write a wise saying and your name will live forever. -- Anonymous
Don't think, measure :-)
proc tclsubst {file} {
    set fd [open $file r]
    set txt [read $fd]
    close $fd
    # substitute \{ by TITI\{TOTO and //BLABLA by //BLOBLO
    return [string map [list \{ TITI\{TOTO //BLABLA //BLOBLO] $txt]
}
proc sedsubst {file} {
    return [exec -keepnewline sed -e "s,\{,TITI\{TOTO,g" -e "s,//BLABLA,//BLOBLO,g" $file]
}
On my machine, with a text file of alternating lines
{ foobar baz }
//BLABLA by //BLOBLO
(i.e. each line triggers one conversion), the Tcl version is
faster up to 100kB file size. With larger files, sed is faster.
100kB
% time {sedsubst x2} 3
16216 microseconds per iteration
% time {tclsubst x2} 3
14223 microseconds per iteration
2.5MB
% time {sedsubst xxx} 3
270010 microseconds per iteration
% time {tclsubst xxx} 3
346909 microseconds per iteration
10MB
% time {sedsubst x4} 3
1056490 microseconds per iteration
% time {tclsubst x4} 3
1374480 microseconds per iteration
Interestingly enough, with a file where no line matches,
sed is way faster than TCL:
10MB
% time {sedsubst x5} 3
655128 microseconds per iteration
% time {tclsubst x5} 3
1603964 microseconds per iteration
YMMV.
R'
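For anyone who wants to repeat this kind of measurement, here is a rough sketch of how such alternating test data could be generated and timed in memory. The proc name makeSample and the pair count are made up for illustration; the files x2, xxx, x4 and x5 used above were not posted, so this is only an approximation of that setup.

```tcl
# Hypothetical generator: returns $pairs pairs of the two alternating lines.
proc makeSample {pairs} {
    set lines {}
    for {set i 0} {$i < $pairs} {incr i} {
        lappend lines "\{ foobar baz \}" "//BLABLA by //BLOBLO"
    }
    # Join with newlines and terminate the last line as well.
    return [join $lines \n]\n
}

# Same substitution map as in tclsubst.
set map [list \{ TITI\{TOTO //BLABLA //BLOBLO]
set txt [makeSample 1000]

puts "input size: [string length $txt] characters"
puts [time {string map $map $txt} 10]
```

This times only the substitution itself; reading the file and the fork/exec cost of sed would have to be measured separately.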
Thanks Ralf, your demonstration leaves no doubt.
Fab
No doubt in which direction, Tcl or sed?
For me, a difference of less than a second for a 10MB file points directly
to the Tcl solution.
It is immediately portable to any Tcl platform, with no extra dependencies
(what happens when PATH gets changed?).
It is probably easier to maintain as well: only one language to learn.
Just my 2c.
Martyn
Martyn,
in the end you have the best argument: having a single language in the code
improves robustness and portability; the run time is secondary.
Fab