[9fans] rows to cols?

Peter A. Cejchan

Nov 13, 2009, 2:42:54 AM

Hi, folks!

Is there an easy way to transpose the text so that rows become
columns, and vice versa? Delimiter is space. Perhaps in AWK?

Thanks,

=============================
Petr A. Cejchan
<c...@gli.cas.cz, tya...@gmail.com>
http://home.gli.cas.cz/cej/www/
http://www.facebook.com/cejchan
work: +420-233 087 237
home/SMS: +420-720 121 721
ICQ: 583000501
=============================

Richard Miller

Nov 13, 2009, 3:46:51 AM

> Is there an easy way to transpose the text so that rows become
> columns, and vice versa? Delimiter is space.

If you know in advance the number of rows and columns, it's easy:

term% cat t
one two three four
five six seven eight
nine ten eleven twelve
term% tr -s ' ' '\xA' <t | pr -t -3 -l4 | tr -s ' ' ' '
one five nine
two six ten
three seven eleven
four eight twelve

dav...@mac.com

Nov 13, 2009, 6:57:11 PM

Wow.
Excellent use of tools.

The smallest arbitrary-columns answer I could come up with was:
awk '{if(m < NF)m=NF;for(i=1;i<=NF;i++)r[NR, i]=$i}
END {for(i=1;i<=m;i++){for(j=1;j<=NR;j++)printf "%s ", r[j,i];print ""}}' t
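
For anyone reading the one-liner above, here is a spelled-out sketch of the same idea, wrapped in a shell function. The function name transpose and the variable names are illustrative, not from the post, and it assumes a POSIX sh and awk rather than rc. Unlike the one-liner it joins fields with single spaces and leaves no trailing space; ragged rows are padded with empty fields.

```shell
# transpose: read space-delimited rows on stdin, write the columns as rows.
# (Hypothetical helper name; assumes POSIX sh and awk.)
transpose() {
	awk '
	{
		if (NF > maxnf) maxnf = NF       # widest row seen so far
		for (i = 1; i <= NF; i++)
			cell[NR, i] = $i             # remember every field by (row, column)
	}
	END {
		for (i = 1; i <= maxnf; i++) {   # one output line per input column
			line = cell[1, i]
			for (j = 2; j <= NR; j++)
				line = line " " cell[j, i]
			print line
		}
	}'
}

printf '%s\n' 'one two three four' 'five six seven eight' \
	'nine ten eleven twelve' | transpose
```

Like the original, this holds the whole table in memory, so it suits inputs that fit comfortably in an awk array.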

I'm sure there's an insane sed solution out there somewhere for very
small numbers of rows and columns.

D

Lyndon Nerenberg VE6BBM/VE7TFX

Nov 13, 2009, 11:46:07 PM

> Is there an easy way to transpose the text so that rows become
> columns, and vice versa? Delimiter is space. Perhaps in AWK?

If Richard's trick won't work, grab contrib/lyndon/transpose.c.

It's dog slow (actually, avl(2) is), but it's effectively
unbounded for the input dataset size.

--lyndon

P.S. Never underestimate the power of C.

Richard Miller

Nov 14, 2009, 4:47:09 AM

> Wow.
> Excellent use of tools.

It's the sort of thing I used to give as an exercise to students.

> The smallest arbitrary-columns answer I could come up with was:
> awk '{if(m < NF)m=NF;for(i=1;i<=NF;i++)r[NR, i]=$i}
> END {for(i=1;i<=m;i++){for(j=1;j<=NR;j++)printf "%s ", r[j,i];print ""}}' t

Explicit looping looks so strenuous.

To make the tr|pr method more general, you can count columns first with
sed 1q | wc -w
or if you like
awk '{print NF;exit}'

Counting rows is left as an exercise to the reader ...

dav...@mac.com

Nov 14, 2009, 6:30:50 AM

> It's the sort of thing I used to give as an exercise to students.

Wish I'd been in your class.

> Explicit looping looks so strenuous.

I know: I kept thinking "map ... join": too much perl.

> To make the tr|pr method more general, you can count columns first
> with

But that's multi-pass:-).

You could of course, use one pass of wc to count the words and lines,
then divide words by lines to get cols:-).
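
The single-pass wc idea can be sketched roughly like this. The function name dims is made up for illustration, and it assumes a POSIX sh and a rectangular, non-empty table, since dividing words by lines only gives the column count when every row has the same number of fields.

```shell
# dims: print "rows cols" for a rectangular, space-delimited table on stdin.
# (Hypothetical helper name; assumes POSIX sh and wc.)
dims() {
	# wc reports lines before words, whatever order the options are given in.
	set -- $(wc -lw)
	[ "$1" -gt 0 ] || return 1           # avoid dividing by zero on empty input
	echo "$1 $(( $2 / $1 ))"
}

printf '%s\n' 'one two three four' 'five six seven eight' \
	'nine ten eleven twelve' | dims
```

For the sample file this prints "3 4". In the tr|pr pipeline earlier in the thread, the row count is what goes after the dash (-3) and the column count is the page length (-l4).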

There are too many mad genius coders on this list.
Next: count the number of angels dancing on the head of a pin using an
acid function.

Can we talk about plan9 now, please?

D

erik quanstrom

Nov 14, 2009, 10:14:32 AM

> It's dog slow (actually, avl(2) is), but it's effectively
> unbounded for the input dataset size.

i haven't found avl to be slow, so i was interested in
this. after stripping out the tmp file and the
unnecessary runes, prof tells me this for a
2000x10000 array. (normal runtime ~20s)

minooka; prof /mnt/term/usr/quanstro/8.out prof.116938
   %     Time      Calls  Name
50.0  115.833  486724015  _insertavl
12.5   28.961  466724015  cmp
11.9   27.586          1  main
11.4   26.465  466724015  balance
 3.9    9.080  168888891  Bgetc
 3.9    9.069  168888890  Bputc
[...]
 1.5    3.376   20000000  findsuccessor

okay, you're measuring that building an avl tree
takes n log(n)/log(2). if it were not, i'd be worried! note
also that the ratio of time(_insertavl)/time(findsuccessor)
= log(n)/log(2). findsuccessor is the meat of walkavl.
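
As a rough check of that ratio against the prof numbers above, a small awk calculation (a sketch only; the function name avl_ratio is made up, and n is taken to be the 2000x10000 element count) compares log(n)/log(2) with the ratio of _insertavl to findsuccessor in both calls and time:

```shell
# avl_ratio: compare log2(n) with the _insertavl/findsuccessor ratios
# taken from the prof listing. (Hypothetical helper name.)
avl_ratio() {
	awk 'BEGIN {
		n = 2000 * 10000                 # elements in the 2000x10000 array
		printf "log2(n) = %.2f\n", log(n) / log(2)
		printf "calls   = %.2f\n", 486724015 / 20000000
		printf "time    = %.2f\n", 115.833 / 3.376
	}'
}

avl_ratio
```

log2(n) and the call-count ratio both come out near 24.3; the ratio of the Time columns is somewhat higher, about 34.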

in http://iwp9.org/papers/upasexp.pdf i talked about
a similar issue with hash tables. if all you do is build
a fast-access structure, then they can be really slow.

- erik

Lyndon Nerenberg VE6BBM/VE7TFX

Nov 14, 2009, 6:27:16 PM

> i haven't found avl to be slow, so i was interested in
> this.

It was slow in relation to other methods available. That code wasn't
written to be fast. It came out of a long ago Sunday afternoon
discussion I had with someone about data structures, from which we
ended up cobbling together a few different versions of transpose to
get some timings. That was the only version that seems to have
survived, so that's the one you got ;-)