group id price1 price2
-------------------------------------------------------------
a 1 20 10
b 1 21 15
c 1 30 26
d 1 19 12
a 2 0.5 0.6
b 2 0.1 0.9
c 2 0.3 1.1
d 2 0.4 0.8
a 3 : :
b 3 : :
c 3 : :
d 3 : :
I would like to have output as:
id a b c d
1 20 10 21 15 30 26 19 12
2 0.5 0.6 0.1 0.9 0.3 1.1 0.4 0.8
3 : : : : : : : :
How could I use awk to do so?
Thanks!!!!
That has been asked and answered before; google this group and you'll find
a couple proposals. I recall that I've provided even a brief solution for
handling sparse matrices.
Janis
Sorry, I was wrong assuming and implying it was here; actually it was posted
in comp.unix.shell...
awk ' BEGIN { FS = "\t" }
{ for (col=1; col<=NF; col++) if ($col) matrix[NR,col] = $col }
END { for (col=1; col<=NF; col++)
for (row=1; row<=NR; row++)
printf("%s%s",
((row,col) in matrix) ? matrix[row,col] : 0,
(row<NR) ? FS : RS)
}
'
For your demand you'd have to restrict the collection of data to NR>2 as in
(untested)...
NR>2 { for (col=1; col<=NF; col++) if ($col) matrix[NR,col] = $col }
to skip the headers.
>
> Janis
Forget what I wrote. I answered your Subject: without looking at your data.
I see that you don't want a transposed matrix, as the subject suggests.
Sorry for the noise.
as you didn't really specify what to do with namespace or
ordering and you didn't post a sample of what you already tried
maybe this would be either a convenient solution or a starter
or at least an idea about what to do not ?-)
--------
awk '
BEGIN{print "id","a","b","c","d"}
NR<3{next}
{
n=$2
$1=$2=""
w[n]=w[n] OFS $0
}
END{ for(n in w) print n,w[n] }
' yerfile
--------
Something like this, untested:
awk '
NR<3 { next }
{ id=$2; grp=$1
ids[id]; grps[grp]
for (p=3; p<=NF; p++) {
prices[p]
price[id,grp,p] = $p
}
}
END {
printf "id"
for (grp in grps) {
printf "\t%s",grp
}
print ""
for (id in ids) {
printf "%s",id
for (grp in grps) {
for (p in prices) {
printf "\t%s",price[id,grp,p]
}
}
print ""
}
}' file
Figuring out the output formatting left as an exercise....
Ed.
This one produces the following output:
group a b c d a b c d a b c d
id 1 1 1 1 2 2 2 2 3 3 3 3
price1 20 21 30 19 0.5 0.1 0.3 0.4 : : : :
price2 10 15 26 12 0.6 0.9 1.1 0.8 : : : :
Here it is:
NR == 1 {
rows = NF
}
NF == rows {
width = 0
for (i = 1; i <= rows; i++)
width = ((width >= length($i)) * width + (width < length($i)) * length($i));
for (i = 1; i <= rows; i++)
row[i] = sprintf("%s%" ((NR > 1) * 2) "s%" width "s", row[i], "", $i);
}
END {
for (i = 1; i <= rows; i++)
print(row[i]);
}
All lines not having the same format as the first line are ignored.
Have fun deciphering it.
Thank you all for the information.
I have try the other file as:
AB01 1 1 2
AB05 1 3 4
AB10 1 5 6
AB15 1 7 8
AB01 2 9 10
AB05 2 11 12
AB10 2 13 14
AB15 2 15 16
AB01 3 17 18
AB05 3 19 20
AB10 3 21 22
AB15 3 23 24
I try to use the following:
NR<1 { next }
{
id=$2; m=$1
$1=$2=""
ms[m]=m
w[id]=w[id] OFS $0
}
END{
printf "id "
for (m in ms) printf "%s ",m
print ""
for(id in w) print id,w[id]
}
The output is like:
id AB10 AB01 AB05 AB15
1 1 2 3 4 5 6 7 8
2 9 10 11 12 13 14 15 16
3 17 18 19 20 21 22 23 24
Somehow the order of m changed. How could I have the order like "AB01
AB05 AB10 AB15"?
Thanks!!!
You could make use of gawk's WHINY_USERS feature:
$ WHINY_USERS= gawk -f yourscript.awk yourdata
id AB01 AB05 AB10 AB15
1 1 2 3 4 5 6 7 8
2 9 10 11 12 13 14 15 16
3 17 18 19 20 21 22 23 24
$
Hermann
PS
I am not quite sure when you expect this rule to be executed:
NR<1 { next }
Thanks!!!
I remove the "NR<1 { next }" from the script.
One more question. Could I have the output for id and w[id] with space
delimiter?
How could I modify this part "for(id in w) print id,w[id]"?
Thanks!!
I'm not sure if I understand your question. Can you post expected output?
BTW, this is actually not needed: m=$1; ms[m]=m
You could simplify it to: ms[$1]
Hermann
Thanks!!
I would expect the output as:
id AB01 AB05 AB10 AB15
1 1 2 3 4 5 6 7 8
2 9 10 11 12 13 14 15 16
3 17 18 19 20 21 22 23 24
It is the same as before but with space delimiter.
$ cat yourscript.awk
{
ms[$1]
id=$2
w[id]=w[id] sep[id] $3 OFS $4
sep[id]=OFS
}
END{
printf "id"
for (m in ms) printf " %s",m
print ""
for(id in w) print id,w[id]
}
$ WHINY_USERS= awk -f yourscript.awk yourdata
id AB01 AB05 AB10 AB15
1 1 2 3 4 5 6 7 8
2 9 10 11 12 13 14 15 16
3 17 18 19 20 21 22 23 24
Hermann