A strange issue when using awk to substitute a field in specific line.


Hongyi Zhao

Oct 10, 2012, 9:35:12 AM
Hi all,

I have a file named for_test that contains the following three lines, as
shown by cat:

$ cat for_test
set base_path /home/werner/Desktop/test_awk_sub/apt-mirrorfdsfdsfds
clean http://mirror.bjtu.edu.cn/debian
clean http://mirror.bjtu.edu.cn/debian-multimedia

Now I want to replace the first line of this file with the following
content:

set base_path /path/to/for_test's/current_dir/apt-mirror

Specifically, I want the third field of the first line to be the
directory containing this file, followed by "apt-mirror".

To do this, I use the following script, located in the same directory as
the for_test file:

$ cat sub_test.awk
baseDirForScriptSelf="$(cd "$(dirname "$0")"; pwd)"
awk -v a=$baseDirForScriptSelf '{if($1=="set" && $2=="base_path") sub(/^.*$/,a"/apt-mirror",$3);print}' ./for_test 1<>./for_test

But after I run sub_test.awk, the for_test file becomes the following:

$ cat for_test
set base_path /home/werner/Desktop/test_awk_sub/apt-mirror
clean http://mirror.bjtu.edu.cn/debian
clean http://mirror.bjtu.edu.cn/debian-multimedia
ltimedia

As you can see, for_test now has four lines. The last line, "ltimedia",
appears after I run the command:

$ ./sub_test.awk

I can't figure out why this happens. Could you please give me some hints
on this strange behavior?

Regards
--
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.

Ben Bacarisse

Oct 10, 2012, 11:35:12 AM
Hongyi Zhao <hongy...@gmail.com> writes:

> Hi all,
>
> I've a file named for_test which has the following three lines in it just
> as the output given by the cat command:
>
> $ cat for_test
> set base_path /home/werner/Desktop/test_awk_sub/apt-mirrorfdsfdsfds
> clean http://mirror.bjtu.edu.cn/debian
> clean http://mirror.bjtu.edu.cn/debian-multimedia
>
> Now, I want to substitute the first line in this file with the following
> content:
>
> set base_path /path/to/for_test's/current_dir/apt-mirror
>
> Specifically, I want the third field of the first line to be the
> directory containing this file, followed by "apt-mirror".
>
> To do this, I use the following script, located in the same directory as
> the for_test file:
>
> $ cat sub_test.awk
> baseDirForScriptSelf="$(cd "$(dirname "$0")"; pwd)"
> awk -v a=$baseDirForScriptSelf '{if($1=="set" && $2=="base_path") sub(/^.*$/,a"/apt-mirror",$3);print}' ./for_test 1<>./for_test

The 1<>./for_test syntax does not remove the problem of writing to the
file you are reading. You might be able to use this method if you
could process the file fully before generating any output and then
rewind the IO file, but I don't think awk can do that.

The simplest solution is to use a temporary file for the output.
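
A minimal sketch of the temporary-file approach, assuming mktemp is available (it is widespread but not specified by POSIX). The names demo_dir, target and tmp are made up for illustration; only the contents of for_test come from the original post:

```shell
# Hypothetical scratch directory standing in for the script's directory.
demo_dir=$(mktemp -d) || exit 1
target=$demo_dir/for_test
printf '%s\n' \
    'set base_path /home/werner/Desktop/test_awk_sub/apt-mirrorfdsfdsfds' \
    'clean http://mirror.bjtu.edu.cn/debian' \
    'clean http://mirror.bjtu.edu.cn/debian-multimedia' > "$target"

# Write the result to a temporary file in the same directory, then
# replace the original only if awk succeeded.
tmp=$(mktemp "$demo_dir/for_test.XXXXXX") || exit 1
if awk -v a="$demo_dir" '
    $1 == "set" && $2 == "base_path" { $3 = a "/apt-mirror" }
    { print }
' "$target" > "$tmp"
then
    mv "$tmp" "$target"
else
    rm -f "$tmp"
fi
cat "$target"
```

Note that mv replaces the file, so the result takes the temporary file's permissions and any hard links to the old file are broken; copying back with cat "$tmp" > "$target" keeps the original inode instead.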

Your awk looks a little odd to me. Substituting the whole field with
sub() just to print the record can be done by printing the desired
string directly:

{ if ($1=="set" && $2=="base_path") print $1, $2, a "/apt-mirror"; else print }

but you could also remove the if altogether:

$1=="set" && $2=="base_path" {print $1, $2, a "/apt-mirror"; next} {print}

<snip>
--
Ben.

Ed Morton

Oct 10, 2012, 12:27:26 PM
Hongyi Zhao <hongy...@gmail.com> wrote:

> Hi all,
>
> I've a file named for_test which has the following three lines in it just
> as the output given by the cat command:
>
> $ cat for_test
> set base_path /home/werner/Desktop/test_awk_sub/apt-mirrorfdsfdsfds
> clean http://mirror.bjtu.edu.cn/debian
> clean http://mirror.bjtu.edu.cn/debian-multimedia
>
> Now, I want to substitute the first line in this file with the following
> content:
>
> set base_path /path/to/for_test's/current_dir/apt-mirror
>
> Specifically, I want the third field of the first line to be the
> directory containing this file, followed by "apt-mirror".

Using awk for that probably isn't really necessary when you could do this:

printf 'set base_path %s/%s\n' "$PWD" "apt-mirror"
tail -n +2 for_test

If you still really feel a burning desire to use awk for it, though, then
that'd be:

awk -v pwd="$PWD" 'NR==1{$0=$1" "$2" "pwd"/apt-mirror"}1' for_test

Regards,

Ed.

Posted using www.webuse.net

Ed Morton

Oct 10, 2012, 12:37:09 PM
At a glance:

1) You aren't quoting your variables.
2) You're using sub() in awk to assign a whole field instead of simply "=".
3) You're directing your output to the same file you're reading from.
4) You're using an "if(condition)" within the action part of the awk body
instead of simply using the condition part.

FWIW I suspect "3" is the cause of the specific problem you mention but
you've been posting here FAR too long to still be making any of the above
rookie mistakes.
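
Point 1 can be seen by counting the arguments the shell actually passes. This is only an illustrative sketch; the directory name with a space in it is hypothetical:

```shell
# Hypothetical directory name containing a space.
dir='/tmp/my dir'

# Unquoted: the shell splits the expansion into two words,
# so awk would receive "-v", "a=/tmp/my" and "dir".
set -- -v a=$dir
unquoted_count=$#

# Quoted: the value stays a single argument.
set -- -v a="$dir"
quoted_count=$#

echo "unquoted: $unquoted_count arguments, quoted: $quoted_count arguments"
```

With the default IFS this reports 3 arguments for the unquoted form and 2 for the quoted one.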

Ed.

Geoff Clare

Oct 11, 2012, 8:33:17 AM
Hongyi Zhao wrote:

> $ cat sub_test.awk
> baseDirForScriptSelf="$(cd "$(dirname "$0")"; pwd)"
> awk -v a=$baseDirForScriptSelf '{if($1=="set" && $2=="base_path") sub(/^.*$/,a"/apt-mirror",$3);print}' ./for_test 1<>./for_test
>
> But, after I run the sub_test.awk, I found that the for_test file become
> the following one:
>
> $ cat for_test
> set base_path /home/werner/Desktop/test_awk_sub/apt-mirror
> clean http://mirror.bjtu.edu.cn/debian
> clean http://mirror.bjtu.edu.cn/debian-multimedia
> ltimedia
>
> As you can see, for_test now has four lines. The last line, "ltimedia",
> appears after I run the command:
>
> $ ./sub_test.awk
>
> I can't figure out why this happens. Could you please give me some hints
> on this strange behavior?

Because you are using 1<>file to redirect standard output, the shell
does not truncate the file before executing awk. When awk writes to
the file it is overwriting existing data. Your transformation
shortens the data, and therefore there is some of the old data
left at the end when awk has finished writing the new data.
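
In the posted case the new first line is 9 bytes shorter (the "fdsfdsfds" is gone), so the last 9 bytes of the old file, "ltimedia" plus its newline, are left over. The same effect can be reproduced on a small scale; this is a sketch with a made-up scratch file, not the original data:

```shell
# Hypothetical scratch file for the demonstration.
demo_file=$(mktemp) || exit 1
printf 'aaaaaaaaaa\nbbbb\n' > "$demo_file"    # 16 bytes

# 1<> opens the file read-write without truncating it, so these 3 bytes
# overwrite only the first 3 bytes of the old contents.
printf 'xx\n' 1<> "$demo_file"

cat "$demo_file"
# xx
# aaaaaaa
# bbbb
```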

The 1<>file trick is neat, but it should only really be used when
you are doing a one-to-one transformation. It's also only safe if
the transformation is idempotent (i.e. repeating the change does
not alter the result). This is so that if awk is killed part way
through, you can just restart it from the beginning.

An example of a safe usage is:

$ cat for_test
abcabc
$ awk '{ gsub(/b/, "B"); print }' for_test 1<>for_test
$ cat for_test
aBcaBc
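
Both conditions can be checked roughly on a sample before relying on 1<>. The helper below is hypothetical (safe_for_rw_redirect is not a standard tool) and assumes mktemp and cmp are available. Equal total length is only a necessary condition, so treat a pass as a sanity check rather than a guarantee:

```shell
# Hypothetical helper: succeeds only if the awk program "$1" keeps the
# byte length of file "$2" unchanged and is idempotent on it.
safe_for_rw_redirect() {
    prog=$1 file=$2
    once=$(mktemp) && twice=$(mktemp) || return 2
    awk "$prog" "$file" > "$once"     # first pass
    awk "$prog" "$once" > "$twice"    # second pass over the result
    status=0
    # Same total length; $(( )) strips any padding that wc -c prints.
    [ $(( $(wc -c < "$file") )) -eq $(( $(wc -c < "$once") )) ] || status=1
    # Idempotent: the second pass changes nothing.
    cmp -s "$once" "$twice" || status=1
    rm -f "$once" "$twice"
    return "$status"
}

sample=$(mktemp) || exit 1
printf 'abcabc\n' > "$sample"
safe_for_rw_redirect '{ gsub(/b/, "B"); print }' "$sample" && echo "ok for 1<>"
safe_for_rw_redirect '{ gsub(/b/, ""); print }' "$sample" || echo "not safe for 1<>"
```

The first program passes because it replaces each byte with exactly one byte; the second fails because deleting characters shortens the output.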

--
Geoff Clare <net...@gclare.org.uk>
