why does referencing NF or different fields change $0 after recompilation?

Ed Morton

Sep 16, 2015, 11:15:09 PM
This GNU awk script is intended to replace commas with semi-colons in a CSV file
that could contain commas in the quoted fields and could have blank fields:

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file

Can anyone explain why the first call to awk below replaces the first comma with
two semi-colons while the second and third (which are only different from the
first in that they mention NF somewhere in the action block) replace it with
one, which is the desired result?

$ cat file
"A","B","C"

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
"A";;"B";"C"

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{NF;$1=$1}1' file
"A";"B";"C"

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1;NF}1' file
"A";"B";"C"

Mentioning some other variables has no effect:

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{OFS;$1=$1}1' file
"A";;"B";"C"

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$0;$1=$1}1' file
"A";;"B";"C"

while just mentioning $1/2/3 can change where the double-semi-colon appears:

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1;$1=$1}1' file
"A";;"B";"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$2;$1=$1}1' file
"A";"B";;"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$3;$1=$1}1' file
"A";"B";"C"

and (presumably related) we get a different $0 depending on which field we
assign to itself:

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
"A";;"B";"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$2=$2}1' file
"A";"B";;"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$3=$3}1' file
"A";"B";"C"

I'm using

$ awk --version
GNU Awk 4.1.3, API: 1.1 (GNU MPFR 3.1.3, GNU MP 6.0.0)

in bash on cygwin.

Regards,

Ed.

Hermann Peifer

Sep 17, 2015, 2:18:18 AM
On 2015-09-17 5:15, Ed Morton wrote:
> This GNU awk script is intended to replace commas with semi-colons in a
> CSV file that could contain commas in the quoted fields and could have
> blank fields:
>
> $ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
>
> Can anyone explain why the first call to awk below replaces the first
> comma with two semi-colons while the second and third (which are only
> different from the first in that they mention NF somewhere in the
> action block) replace it with one, which is the desired result?
>
> $ cat file
> "A","B","C"
>
> $ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
> "A";;"B";"C"

# Same here, using gawk/master from git on Mac OS X
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
"A";;"B";"C"

> $ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{NF;$1=$1}1' file
> "A";"B";"C"
>

# Using $0=$0 instead of $1=$1 reparses the record but leaves the commas in place
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$0=$0}1' file
"A","B","C"

The manual [1] makes two statements about this, and in all these years I have
never understood the difference in meaning between them:

> $1 = $1 # force record to be reconstituted

and

> Any assignment to $0 causes the record to be reparsed
> into fields using the current value of FS.

It looks to me as if your use of NF acts as some kind of "assignment to $0",
so that the code behaves as if you had used $0=$0.
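
To illustrate the difference between those two manual statements with plain
field splitting (no FPAT involved), here is a minimal sketch. The function
names rebuild and resplit are made up for this example, and it assumes only a
POSIX awk with a comma as FS:

```shell
# rebuild: $1=$1 forces $0 to be reconstituted from its fields, joined with OFS.
rebuild() { awk -F, -v OFS=';' '{ $1 = $1 } 1'; }

# resplit: $0=$0 only re-splits the record into fields; the text of $0 is untouched.
resplit() { awk -F, -v OFS=';' '{ $0 = $0 } 1'; }

printf 'a,b,c\n' | rebuild   # prints a;b;c
printf 'a,b,c\n' | resplit   # prints a,b,c
```

So the first statement is about rewriting $0 from the fields, while the second
is about recomputing the fields from $0.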

Hermann


[1] https://www.gnu.org/software/gawk/manual/html_node/Changing-Fields.html


Josef Frank

Sep 17, 2015, 6:55:53 AM
On 17.09.2015 12:49:58 Ed Morton wrote:
>
> $ cat file
> "A","B","C"
>
> $ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
> "A";;"B";"C"
>

Reminds me of a bug (in gawk 4.0.0) mentioned at the top of:

http://git.savannah.gnu.org/cgit/gawk.git/tree/test/pty1.awk

> $ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{NF;$1=$1}1' file
> "A";"B";"C"

The "{NF;...}" was exactly the workaround suggested there in connection
with the use of FPAT.
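
For anyone who needs the conversion without depending on FPAT behaviour at
all, one alternative (not the approach from the original post) is to walk each
record character by character and track whether the current position is inside
double quotes. A minimal sketch, assuming quoted fields never span lines; the
function name is made up for illustration:

```shell
# Hypothetical helper: reads CSV on stdin, writes the converted text to stdout.
csv_commas_to_semicolons() {
    awk '{
        out = ""
        in_quotes = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\"") {
                in_quotes = !in_quotes    # entering or leaving a quoted field
            } else if (c == "," && !in_quotes) {
                c = ";"                   # separator comma, outside quotes
            }
            out = out c
        }
        print out
    }'
}

printf '%s\n' '"A","B","C"' | csv_commas_to_semicolons   # prints "A";"B";"C"
```

Because it never assigns to a field, this sketch does not trigger any record
rebuild, and commas inside quoted fields as well as empty fields pass through
unchanged.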



jf1

Ed Morton

Sep 17, 2015, 9:06:14 AM
Thanks Hermann and Josef. Since nobody has told me yet that I'm doing
something wrong, this sounds likely to be a bug, so I've emailed
bug-...@gnu.org about it.

Ed.