On 11/9/2011 3:46 PM, Peng Yu wrote:
<snip>
> Suppose I have a string of the format [0-9]+[a-zA-Z]. For example,
> 100M20K28U10K is one such string.
>
> I want to extract the number before, say, an English letter, for example,
> 'K' or 'U' in this example. What is the best way to do this in awk?
I don't know if it's the BEST way as there are alternatives, but one way is:
$ cat file
100M20K28U10K
$ awk -v let="K" 'match($0,"[[:digit:]]+" let) {
print substr($0,RSTART,RLENGTH-1)}' file
20
$ awk -v let="U" 'match($0,"[[:digit:]]+" let) {
print substr($0,RSTART,RLENGTH-1)}' file
28
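To see why RLENGTH-1 is the right length, here is a small sketch that prints what match() sets for the sample line. The helper name show_match is made up for illustration, and it feeds the string through printf rather than reading "file":

```shell
# show_match LETTER STRING: hypothetical helper, not part of awk.
# Prints RSTART, RLENGTH and the extracted number for the first match.
show_match() {
    printf '%s\n' "$2" |
    awk -v let="$1" 'match($0, "[[:digit:]]+" let) {
        # RSTART is the 1-based position where the match begins.
        # RLENGTH is the match length INCLUDING the trailing letter,
        # so RLENGTH-1 covers the digits alone.
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH-1)
    }'
}

show_match K 100M20K28U10K
```

For the sample string this prints "5 3 20": the match "20K" starts at character 5 and is 3 characters long.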
There's also:
$ awk -v let="K" '{ sub(let ".*",""); sub(/.*[[:alpha:]]/,"") }1' file
20
$ awk -v let="U" '{ sub(let ".*",""); sub(/.*[[:alpha:]]/,"") }1' file
28
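If it's not obvious what the two sub() calls do, this sketch prints the record after each one. The helper name show_subs and the "after ..." labels are only for illustration:

```shell
# show_subs LETTER STRING: hypothetical helper that traces the two sub() steps.
show_subs() {
    printf '%s\n' "$2" |
    awk -v let="$1" '{
        # Step 1: delete from the first occurrence of the letter to end of line.
        sub(let ".*", "")
        print "after 1st sub: " $0
        # Step 2: delete everything up to and including the last remaining letter,
        # which leaves only the digits that preceded the chosen letter.
        sub(/.*[[:alpha:]]/, "")
        print "after 2nd sub: " $0
    }'
}

show_subs K 100M20K28U10K
```

With let="K" the record goes from 100M20K28U10K to 100M20 and then to 20.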
Of course, those only get the number before the first occurrence of the given letter.
Maybe you want all of them? Then one way would be:
$ awk -v let="K" '{ n=split($0,a,let); for (i=1; i<n; i++) {
sub(/.*[[:alpha:]]/,"",a[i]); print a[i] } }' file
20
10
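Another possibility, as a sketch only, is to keep using match() and loop over the line, chopping off everything up to the end of each match. The helper name all_before and the scratch variable s are my own choices here:

```shell
# all_before LETTER STRING: hypothetical helper that prints every number
# immediately followed by LETTER, one per line.
all_before() {
    printf '%s\n' "$2" |
    awk -v let="$1" '{
        s = $0                                  # work on a copy of the record
        while (match(s, "[[:digit:]]+" let)) {
            print substr(s, RSTART, RLENGTH-1)  # the digits, without the letter
            s = substr(s, RSTART + RLENGTH)     # continue after this match
        }
    }'
}

all_before K 100M20K28U10K
```

That prints 20 and 10 for the sample line, same as the split() version. One difference: the match() loop prints nothing for an occurrence of the letter that has no digits directly in front of it, whereas the split() version would print an empty line there.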
It all depends what you want...
Regards,
Ed.