On 9/22/2012 11:07 AM, er1ch wrote:
> On Fri, 21 Sep 2012 16:34:35 +0000, Ed Morton wrote:
>
>> Ed Morton <morto...@gmail.com> wrote:
>>
> [...]
>>>> I know that the tolower("FoO") function converts each character in
>>>> "FoO" to its lowercase equivalent "foo".
>>>>
>>>> What I'm trying to do is to apply this function to regexes, i.e. I
>>>> want to replace every string in <Html Brackets> into <lowercase html
>>>> characters>. This can be done with a simple sed command. Thus, I
<snip>
>> It feels to me like this is a fairly common question (in various
>> flavours) so here's a more general function showing how to update
>> segments of a string as identified by an RE and an operation:
>
> I really am grateful for your replies, they are more a tutorial than a
> "simple help".

Darn, now I'm embarrassed into writing it better :-). Below is how, given a bit
of thought, I'd really implement a function to modify strings matching an RE
based on a specified operation:

$ cat file
here <abc> are 0003 <DeFgHi> string <KLMNO> and 002 numeric examples
$
$ cat modre.awk
function modre(string,regexp,op,delta,    head,tail,old,new)
{
    # head collects the text already processed; tail is what is left to scan.
    tail = string
    while ( match(tail,regexp) ) {
        old = substr(tail,RSTART,RLENGTH)
        if      (op == "tolower") { new = tolower(old) }
        else if (op == "toupper") { new = toupper(old) }
        else if (op == "length")  { new = length(old) }
        else if (op == "int")     { new = int(old) }
        else if (op == "exp")     { new = exp(old) }
        else if (op == "+")       { new = old + delta }
        else if (op == "-")       { new = old - delta }
        else if (op == "*")       { new = old * delta }
        else if (op == "/")       { new = old / delta }
        else {
            printf "ERROR: modre() invalid op \"%s\".\n", op | "cat>&2"
            exit 1
        }
        # Keep the text before the match, append the modified match,
        # then carry on scanning after the match.
        head = head substr(tail,1,RSTART-1) new
        tail = substr(tail,RSTART+RLENGTH)
    }
    return head tail
}

{
    # string change examples:
    print 1, modre( $0, "<[^>]+>", "tolower" )
    print 2, modre( $0, "<[^>]+>", "toupper" )
    print 3, modre( $0, "<[^>]+>", "length" )

    # numeric change examples:
    print 4, modre( $0, "[[:digit:]]+", "int" )
    print 5, modre( $0, "[[:digit:]]+", "+", 159 )
    print 6, modre( $0, "[[:digit:]]+", "*", 5 )
}
$ awk -f modre.awk file
1 here <abc> are 0003 <defghi> string <klmno> and 002 numeric examples
2 here <ABC> are 0003 <DEFGHI> string <KLMNO> and 002 numeric examples
3 here 5 are 0003 8 string 7 and 002 numeric examples
4 here <abc> are 3 <DeFgHi> string <KLMNO> and 2 numeric examples
5 here <abc> are 162 <DeFgHi> string <KLMNO> and 161 numeric examples
6 here <abc> are 15 <DeFgHi> string <KLMNO> and 10 numeric examples
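
One caveat with the loop in modre(): the REs used above always match at least one character, but an RE that can match the empty string (e.g. "X*") makes match() succeed with RLENGTH set to 0, so tail never gets shorter and the while loop would not end. Below is a rough sketch of one way to guard against that. The names lower_matches and lowerre are made up for this illustration and it only does the "tolower" operation, so treat it as a sketch under those assumptions rather than a drop-in replacement:

```shell
# lower_matches RE: hypothetical wrapper that lowercases every match of RE on stdin.
lower_matches() {
    awk -v re="$1" '
    # lowerre() is an illustrative cut-down of modre() with an empty-match guard.
    function lowerre(string, regexp,    head, tail) {
        tail = string
        while ( tail != "" && match(tail, regexp) ) {
            if (RLENGTH == 0) {
                # Empty match: copy text up to and including the character at
                # RSTART unchanged, so that tail is guaranteed to shrink.
                head = head substr(tail, 1, RSTART)
                tail = substr(tail, RSTART + 1)
                continue
            }
            head = head substr(tail, 1, RSTART - 1) tolower(substr(tail, RSTART, RLENGTH))
            tail = substr(tail, RSTART + RLENGTH)
        }
        return head tail
    }
    { print lowerre($0, re) }'
}

printf '%s\n' 'aXXb' | lower_matches 'X*'
```

With the input above the RE first matches the empty string in front of the "a", so the guard copies that character through and the following "XX" is then lowercased as usual.
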
modre() no longer supports assignment, since you'd simply use gsub() for that; it
only supports cases where you need to perform some operation on the matched text.
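
For comparison, here is a minimal sketch of the plain assignment case done with gsub() alone. The wrapper name assign_tags and the replacement text "<tag>" are invented for this example. Note that "&" in a gsub() replacement stands for the matched text and that backslashes in a -v assignment are processed by awk, so this sketch assumes a replacement string containing neither:

```shell
# assign_tags REPL: hypothetical wrapper that replaces every <...> segment
# on stdin with the fixed string REPL, using gsub() only.
assign_tags() {
    awk -v repl="$1" '{ gsub(/<[^>]+>/, repl); print }'
}

printf '%s\n' 'here <abc> are 0003 <DeFgHi> string <KLMNO> and 002 numeric examples' |
assign_tags '<tag>'
```

Run on the sample line from "file", that should print "here <tag> are 0003 <tag> string <tag> and 002 numeric examples", i.e. no per-match operation is involved, which is why modre() leaves that case to gsub().
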
Regards,
Ed.