transpose row and column

T. C.

unread,

Nov 4, 2010, 4:22:41 PM11/4/10

to

Hi,
I have a data file as:

group id price1 price2
-------------------------------------------------------------
a 1 20 10
b 1 21 15
c 1 30 26
d 1 19 12
a 2 0.5 0.6
b 2 0.1 0.9
c 2 0.3 1.1
d 2 0.4 0.8
a 3 : :
b 3 : :
c 3 : :
d 3 : :

I would like to have output as:

id a b c d
1 20 10 21 15 30 26 19 12
2 0.5 0.6 0.1 0.9 0.3 1.1 0.4 0.8
3 : : : : : : : :

How could I use awk to do so?

Thanks!!!!

Janis Papanagnou

unread,

Nov 4, 2010, 4:38:17 PM11/4/10

to

That has been asked and answered before; google this group and you'll find
a couple proposals. I recall that I've provided even a brief solution for
handling sparse matrices.

Janis

Janis Papanagnou

unread,

Nov 4, 2010, 4:47:07 PM11/4/10

to

Sorry, I was wrong assuming and implying it was here; actually it was posted
in comp.unix.shell...

awk ' BEGIN { FS = "\t" }

{ for (col=1; col<=NF; col++) if ($col) matrix[NR,col] = $col }

END { for (col=1; col<=NF; col++)
for (row=1; row<=NR; row++)
printf("%s%s",
((row,col) in matrix) ? matrix[row,col] : 0,
(row<NR) ? FS : RS)
}
'

For your demand you'd have to restrict the collection of data to NR>2 as in
(untested)...

NR>2 { for (col=1; col<=NF; col++) if ($col) matrix[NR,col] = $col }

to skip the headers.

>
> Janis

Janis Papanagnou

unread,

Nov 4, 2010, 4:50:06 PM11/4/10

to

On 04/11/10 21:47, Janis Papanagnou wrote:
> On 04/11/10 21:38, Janis Papanagnou wrote:
>> On 04/11/10 21:22, T. C. wrote:
>>> Hi,
>>> I have a data file as:
>>>

>>> [snip]

Forget what I wrote. I answered your Subject: without looking at your data.
I see that you don't want a transposed matrix, as the subject suggests.

Sorry for the noise.

Loki Harfagr

unread,

Nov 5, 2010, 7:38:25 AM11/5/10

to

Thu, 04 Nov 2010 13:22:41 -0700, T. C. did cat :

as you didn't really specify what to do with namespace or
ordering and you didn't post a sample of what you already tried
maybe this would be either a convenient solution or a starter
or at least an idea about what to do not ?-)
--------
awk '
BEGIN{print "id","a","b","c","d"}
NR<3{next}
{
n=$2
$1=$2=""
w[n]=w[n] OFS $0
}
END{ for(n in w) print n,w[n] }
' yerfile
--------

Ed Morton

unread,

Nov 5, 2010, 8:41:54 AM11/5/10

to

Something like this, untested:

awk '
NR<3 { next }
{ id=$2; grp=$1
ids[id]; grps[grp]
for (p=3; p<=NF; p++) {
prices[p]
price[id,grp,p] = $p
}
}
END {
printf "id"
for (grp in grps) {
printf "\t%s",grp
}
print ""
for (id in ids) {
printf "%s",id
for (grp in grps) {
for (p in prices) {
printf "\t%s",price[id,grp,p]
}
}
print ""
}
}' file

Figuring out the output formatting left as an exercise....

Ed.

Dominic Fandrey

unread,

Nov 9, 2010, 10:03:10 AM11/9/10

to

On 04/11/2010 21:22, T. C. wrote:
> Hi,
> I have a data file as:
>
> group id price1 price2
> -------------------------------------------------------------
> a 1 20 10
> b 1 21 15
> c 1 30 26
> d 1 19 12

> ...

>
> I would like to have output as:
>
> id a b c d
> 1 20 10 21 15 30 26 19 12
> 2 0.5 0.6 0.1 0.9 0.3 1.1 0.4 0.8
> 3 : : : : : : : :

This one produces the following output:
group a b c d a b c d a b c d
id 1 1 1 1 2 2 2 2 3 3 3 3
price1 20 21 30 19 0.5 0.1 0.3 0.4 : : : :
price2 10 15 26 12 0.6 0.9 1.1 0.8 : : : :

Here it is:
NR == 1 {
rows = NF
}
NF == rows {
width = 0
for (i = 1; i <= rows; i++)
width = ((width >= length($i)) * width + (width < length($i)) * length($i));
for (i = 1; i <= rows; i++)
row[i] = sprintf("%s%" ((NR > 1) * 2) "s%" width "s", row[i], "", $i);
}
END {
for (i = 1; i <= rows; i++)
print(row[i]);
}

All lines not having the same format as the first line are ignored.
Have fun deciphering it.

T. C.

unread,

Nov 9, 2010, 9:09:41 PM11/9/10

to

Hi,

Thank you all for the information.
I have try the other file as:

AB01 1 1 2
AB05 1 3 4
AB10 1 5 6
AB15 1 7 8
AB01 2 9 10
AB05 2 11 12
AB10 2 13 14
AB15 2 15 16
AB01 3 17 18
AB05 3 19 20
AB10 3 21 22
AB15 3 23 24

I try to use the following:
NR<1 { next }
{
id=$2; m=$1
$1=$2=""
ms[m]=m
w[id]=w[id] OFS $0
}

END{
printf "id "
for (m in ms) printf "%s ",m
print ""
for(id in w) print id,w[id]
}

The output is like:
id AB10 AB01 AB05 AB15
1 1 2 3 4 5 6 7 8
2 9 10 11 12 13 14 15 16
3 17 18 19 20 21 22 23 24

Somehow the order of m changed. How could I have the order like "AB01
AB05 AB10 AB15"?
Thanks!!!

Hermann Peifer

unread,

Nov 10, 2010, 3:26:08 AM11/10/10

to

You could make use of gawk's WHINY_USERS feature:

$ WHINY_USERS= gawk -f yourscript.awk yourdata
id AB01 AB05 AB10 AB15

1 1 2 3 4 5 6 7 8
2 9 10 11 12 13 14 15 16
3 17 18 19 20 21 22 23 24

$

Hermann

PS
I am not quite sure when you expect this rule to be executed:
NR<1 { next }

T. C.

unread,

Nov 10, 2010, 9:58:01 AM11/10/10

to

Thanks!!!
I remove the "NR<1 { next }" from the script.

One more question. Could I have the output for id and w[id] with space
delimiter?
How could I modify this part "for(id in w) print id,w[id]"?

Thanks!!

Hermann Peifer

unread,

Nov 11, 2010, 2:35:26 AM11/11/10

to

I'm not sure if I understand your question. Can you post expected output?

BTW, this is actually not needed: m=$1; ms[m]=m
You could simplify it to: ms[$1]

Hermann

T. C.

unread,

Nov 11, 2010, 11:50:16 AM11/11/10

to

Thanks!!
I would expect the output as:

id AB01 AB05 AB10 AB15
1 1 2 3 4 5 6 7 8
2 9 10 11 12 13 14 15 16
3 17 18 19 20 21 22 23 24

It is the same as before but with space delimiter.

Hermann Peifer

unread,

Nov 12, 2010, 2:42:58 AM11/12/10

to

On 11/11/2010 17:50, T. C. wrote:
> I would expect the output as:
> id AB01 AB05 AB10 AB15
> 1 1 2 3 4 5 6 7 8
> 2 9 10 11 12 13 14 15 16
> 3 17 18 19 20 21 22 23 24
>

$ cat yourscript.awk
{
ms[$1]
id=$2
w[id]=w[id] sep[id] $3 OFS $4
sep[id]=OFS
}

END{
printf "id"
for (m in ms) printf " %s",m
print ""
for(id in w) print id,w[id]
}

$ WHINY_USERS= awk -f yourscript.awk yourdata

id AB01 AB05 AB10 AB15
1 1 2 3 4 5 6 7 8
2 9 10 11 12 13 14 15 16
3 17 18 19 20 21 22 23 24

Hermann