Thank you, Ed. I'd also like to contribute something I think is useful.
This is my final version. It does not handle newlines embedded in quoted
fields, but it dequotes and enquotes double quotes properly. Even poorly
formatted CSV gets "healed" into correct output. I've also added an
"alwaysquote" option to my enquote function, which can be set to 1, e.g.
csvenquote($i, OFS, 1)
because some people like every field to be quoted in CSV files; see
https://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules
"Any field may be quoted (that is, enclosed within double-quote characters)."
"1997","Ford","E350"
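To make the effect of the third argument concrete, here is a small stand-alone sketch. The shell wrapper name enquote_demo and the sample values are hypothetical, chosen only for illustration. The awk function is a copy of the enquote logic, written with index() so that the separator is matched literally. It should run with any POSIX awk.

```sh
# Hypothetical wrapper: enquote every input line as one CSV field.
# $1 is passed through as the alwaysquote flag (0 or 1).
enquote_demo() {
    awk -v always="$1" '
    function csvenquote(str, separator, alwaysquote) {
        if (index(str, separator) || index(str, "\"")) {
            gsub(/"/, "\"\"", str)      # double every embedded quote
            str = "\"" str "\""
        } else if (alwaysquote) {
            str = "\"" str "\""         # quote even when not required
        }
        return str
    }
    { print csvenquote($0, ",", always) }'
}

# alwaysquote = 0: only fields that need quoting are quoted
printf '%s\n' 'Ford' 'E350, van' 'say "hi"' | enquote_demo 0
# Ford
# "E350, van"
# "say ""hi"""

# alwaysquote = 1: every field is quoted
printf '%s\n' 'Ford' 'E350, van' 'say "hi"' | enquote_demo 1
# "Ford"
# "E350, van"
# "say ""hi"""
```

With the flag set, only the plain field changes; fields that contain the separator or a quote are quoted either way.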
The debug result files csvdequote.txt and csvenquote.csv show correct values.
Content of csv test file "csvin3comma.csv":
Field1,Field2,Field3,Field4
One,,Two,Three
What,,"Is,",It
What,,Is,It
What,,"""Yes,No""",It
What,,Is,It
"""What""",,Is,It
"""What""","""yes, no",Is,It
20,,aaa20,No
20,"""",aaa20,Yes
21,"""Hello""",abc21,"""Sir"""
23,,xyz00,","
24,"""Yes!""",pdq24,9
28,,89,3
25,"""Yes,No""","26,pd",q24
25,,"26,pd","q""Sir""24"
Now my awk file awkCSV_comma2.awk, for use with
gawk -f awkCSV_comma2.awk csvin3comma.csv
# awkCSV_comma2.awk
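# Strip the enclosing double quotes from a quoted field and
# collapse each doubled quote ("") into a single quote.
function csvdequote( str ) {
    if (str ~ /^\".*\"$/) {
        sub(/^\"/, "", str)
        sub(/\"$/, "", str)
        gsub(/\"\"/, "\"", str)
    }
    return ( str )
}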
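# Quote a field if it contains the separator or a double quote,
# or unconditionally when alwaysquote is true.
function csvenquote( str, separator, alwaysquote ) {
    # index() matches the separator literally; "str ~ separator" would
    # treat a separator such as "|" or "." as a regular expression.
    if (index(str, separator) || index(str, "\"")) {
        gsub(/\"/, "\"\"", str)
        str = "\"" str "\""
    }
    else if (alwaysquote) {
        str = "\"" str "\""
    }
    return ( str )
}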
# ARGV[1]: csv file
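BEGIN {
    FS = 0   # Disable FS, needed for Thompson AWK
    OFS = ","
    # FPAT is a gawk extension: a field is either a run of non-separator
    # characters or a quoted string in which quotes appear doubled.
    FPAT = "([^" OFS "]*)|(\"([^\"]|\"\")*\")"
}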
{
    for (i=1; i<=NF; i++) {
        $i = csvdequote($i)
        # Debug:
        print "dequoted: Rec. " NR " Field " i ": " $i
        printf("%s", $i) >"csvdequote.txt"
        if (i < NF)
            printf("%s", "\t") >"csvdequote.txt"
        else
            printf("%s", "\n") >"csvdequote.txt"
    }
    print ""
    # Do something with data (change contents as usual for $1, $0 etc.)
    for (i=1; i<=NF; i++) {
        $i = csvenquote($i, OFS, 0)
        # Debug:
        print "enquoted: Rec. " NR " Field " i ": " $i
        printf("%s", $i) >"csvenquote.csv"
        if (i < NF)
            printf("%s", OFS) >"csvenquote.csv"
        else
            printf("%s", "\n") >"csvenquote.csv"
    }
    print ""
}
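
As a quick sanity check of the dequote/enquote pair, the following sketch runs a few single fields taken from the test file through both functions and prints the original, the dequoted value, and the re-enquoted value. The wrapper name roundtrip_demo is hypothetical. The function bodies are copies of the logic above with the separator test written as index(), and only POSIX awk features are used, so FPAT splitting is not exercised here.

```sh
# Hypothetical wrapper: treat each input line as one raw CSV field,
# dequote it, enquote it again, and show all three forms.
roundtrip_demo() {
    awk '
    function csvdequote(str) {
        if (str ~ /^".*"$/) {
            sub(/^"/, "", str)          # drop opening quote
            sub(/"$/, "", str)          # drop closing quote
            gsub(/""/, "\"", str)       # "" -> "
        }
        return str
    }
    function csvenquote(str, separator, alwaysquote) {
        if (index(str, separator) || index(str, "\"")) {
            gsub(/"/, "\"\"", str)
            str = "\"" str "\""
        } else if (alwaysquote) {
            str = "\"" str "\""
        }
        return str
    }
    {
        plain = csvdequote($0)
        printf "%s -> %s -> %s\n", $0, plain, csvenquote(plain, ",", 0)
    }'
}

printf '%s\n' '"""Yes,No"""' '"q""Sir""24"' '","' 'abc21' | roundtrip_demo
# """Yes,No""" -> "Yes,No" -> """Yes,No"""
# "q""Sir""24" -> q"Sir"24 -> "q""Sir""24"
# "," -> , -> ","
# abc21 -> abc21 -> abc21
```

For these well-formed fields the first and last columns are identical, which is the behaviour the debug files are meant to confirm on the full test file.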