Sure, but it depends on further requirements that haven't been
mentioned yet.
Is the data always a period of at most 7 days?
How do you determine the start (or end) of the period when the data
samples lack the first (or first few) or the last (or last few)
entries in a row? (Is the date range known in advance? Does it always
start on a certain weekday?)
Could all the data in some cases have just 2 or 3 columns?
With the new requirements you mentioned you will probably need arrays
(though not necessarily gawk's non-standard multi-dimensional arrays).
The problem is that you need the boundary dates, which (depending on
your data) might not be available until the file is completely read.
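For what it's worth, standard awk emulates a two-dimensional array with a comma-separated subscript, which it joins into a single string key using SUBSEP. A minimal sketch; the function name subsep_demo and the three-column sample data are made up for illustration and are not taken from your data:

```shell
# Hypothetical demo function; the three-column sample data is made up.
subsep_demo() {
    printf '%s\n' 'a 1 10' 'a 2 20' 'b 2 5' |
    awk '
    { val[$1,$2] = $3 }          # the subscript is the string $1 SUBSEP $2
    END {
        # "in" tests for an element without creating it
        if (("a",2) in val)    print "a,2 present:", val["a",2]
        if (!(("b",1) in val)) print "b,1 missing"
    }'
}
subsep_demo
```

The (i,j) in array test is handy if you need to tell a missing entry from one that is present but empty or zero.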
Here's a sample program that assumes a header line in the data file
and that sums are needed both across each line and for each column.
For other requirements the script may be simplified. The script uses
only standard awk features.
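awk '
NR==1 { next }                      # skip the header line
!min || $2<min { min=$2 }           # smallest date seen so far
!max || $2>max { max=$2 }           # largest date seen so far
!($1 in keyset) { keyset[$1] ; key[++nk]=$1 }   # keep keys in input order
{ val[$1,$2]=$3 }                   # value per (key, date)
END {
    # the d++ loops assume the dates are integers running from min to max
    for (k=1; k<=nk; k++) {
        printf "%s", key[k]
        s=0
        for (d=min; d<=max; d++) {
            v=val[key[k],d]
            s+=v                    # row sum
            sum[d]+=v               # column sum
            printf "\t%s", v        # never use the data itself as the format
        }
        printf "\tsum=%s\n", s
    }
    printf "Sum"
    for (d=min; d<=max; d++)
        printf "\t%s", sum[d]
    printf "\n"
}
'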
Output:
101  63910  26722  10320  49265  61710  65428  64092  sum=341447
111                           1                       sum=1
114                           6      2             1  sum=9
Sum  63910  26722  10320  49272  61712  65428  64093
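If it turns out that the date range is known in advance and only the per-day totals are needed, the script shrinks considerably. A sketch under those assumptions; the function name weekly_sums, the variable names first and last, the sample data, and the use of integer day numbers are all made up for illustration:

```shell
# Hypothetical helper: per-day totals for a date range known in advance.
# Assumes a header line and three columns: key, integer day number, value.
weekly_sums() {
    awk -v first="$1" -v last="$2" '
    NR == 1 { next }                                # skip the header line
    $2 >= first && $2 <= last { sum[$2] += $3 }     # column sum per day
    END {
        printf "Sum"
        for (d = first; d <= last; d++)
            printf "\t%s", sum[d] + 0               # +0 prints 0 for days without data
        printf "\n"
    }'
}

# Made-up sample data, just to show the call.
printf '%s\n' 'id day count' '101 1 5' '111 1 2' '101 3 4' | weekly_sums 1 3
```

With the range passed in, nothing has to wait for the end of the file except the final print, and no per-key storage is needed.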
Janis