On 2015-09-17 5:15, Ed Morton wrote:
> This GNU awk script is intended to replace commas with semi-colons in a
> CSV file that could contain commas in the quoted fields and could have
> blank fields:
>
> $ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
>
> Can anyone explain why the first call to awk below replaces the first
> comma with two semi-colons while the second and third (which differ
> from the first only in that they mention NF somewhere in the
> action block) replace it with one, which is the desired result?
>
> $ cat file
> "A","B","C"
>
> $ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
> "A";;"B";"C"
# Same here, using gawk/master from git on Mac OS X
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
"A";;"B";"C"
> $ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{NF;$1=$1}1' file
> "A";"B";"C"
>
# Using $0=$0 instead of $1=$1 avoids the doubled semi-colon, but leaves the commas in place
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$0=$0}1' file
"A","B","C"
The manual [1] makes two statements on this topic, and in all these
years I have never understood the difference in meaning between them:
> $1 = $1 # force record to be reconstituted
and
> Any assignment to $0 causes the record to be reparsed
> into fields using the current value of FS.
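For reference, a small experiment with plain FS (no FPAT involved) shows what each of the two statements does in practice. The input line and the variable names are made up for illustration; any POSIX awk should behave the same way here:

```shell
# Rebuild: assigning to a field makes awk rejoin all fields with OFS.
rebuilt=$(printf 'a,b,c\n' | awk -F, -v OFS=';' '{ $1 = $1 } 1')

# Reparse: assigning to $0 only re-splits the text into fields;
# the record text itself, separators included, is left untouched.
reparsed=$(printf 'a,b,c\n' | awk -F, -v OFS=';' '{ $0 = $0 } 1')

printf '%s\n%s\n' "$rebuilt" "$reparsed"
```

This prints a;b;c for the first call and a,b,c for the second, which matches the "A","B","C" output of the $0=$0 run above.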
It looks to me as if your reference to NF acts like some kind of
"assignment to $0", so that the record is fully reparsed into fields
before $1=$1 rebuilds it.
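If the goal is only the separator swap, one way to avoid depending on how FPAT splits the record is to skip field splitting and walk the line character by character, tracking whether you are inside quotes. This is just a sketch: the function name commas_to_semicolons is made up, and it assumes that no quoted field spans more than one line.

```shell
# Hypothetical helper: turn field-separating commas into semi-colons,
# leaving commas inside double-quoted fields alone.
commas_to_semicolons() {
    awk '{
        out = ""; in_quotes = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\"") {
                in_quotes = !in_quotes      # entering or leaving a quoted field
            } else if (c == "," && !in_quotes) {
                c = ";"                     # a separator, not data
            }
            out = out c
        }
        print out
    }' "$@"
}

printf '%s\n' '"A","B","C"' '"A","B,x",,"C"' | commas_to_semicolons
```

For these two lines it prints "A";"B";"C" and "A";"B,x";;"C", so blank fields and quoted commas both survive. A doubled quote inside a quoted field toggles the state twice and so does no harm.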
Hermann
[1] https://www.gnu.org/software/gawk/manual/html_node/Changing-Fields.html