[tip] Never write for(i=1;i<=n;i++)... again?

Jim Hart

Jan 25, 2009, 11:37:55 AM

I've written this kind of thing

    n = split(something,arr,/re/)
    for(i=1;i<=n;i++) {
        print arr[i]
    }

so often, it's tedious. I like this better:

    n = split(something,arr,/re/)
    while(n--) {
        print arr[i++]
    }

Easier to type. And, in cases where front-to-back or back-to-front
doesn't matter, it's even simpler:

# copy a number indexed array, assuming n contains the number of
# elements

while(n--) arr2[n] = arr1[n]

And, yes,

for(i in arr1) arr2[i] = arr1[i]

works, too. But, some loops don't involve arrays. :-)

Not earth-shattering, to be sure, but when one writes a lot of AWK...

(Braces included in the first example, for completeness, in case there's
more than one line of code in the block.)

Ed Morton

Jan 25, 2009, 8:20:21 PM

On Jan 25, 10:37 am, Jim Hart <nore...@fairpoint.net> wrote:
> I've written this kind of thing
>
> n = split(something,arr,/re/)
> for(i=1;i<=n;i++) {
> print arr[i]
> }
>
> so often, it's tedious. I like this better:
>
> n = split(something,arr,/re/)
> while(n--) {
> print arr[i++]
> }

ITYM
print arr[++i]

Kinda dangerous though. If someone introduces a

for (i=1;i<=whatever;i++)

loop above the split() you'll end up trying to print the array
starting at index "whatever+1". Also, what if you need to use n again
later? It just doesn't seem worthwhile to me to sacrifice the clarity
and relative security of a "normal" loop for the brevity.
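
If the shorter loop is still wanted, one way to sidestep both problems is to wrap it in a function. What follows is only a sketch: the function name print_all, the sample string and the earlier loop are made up for illustration. It relies on the usual awk idiom of declaring extra parameters as locals, so i starts out empty on every call, and on scalars being passed by value, so the caller's n is untouched.

```sh
# Hypothetical demo: print_all and the sample data are illustrative only.
demo_local_counter() {
  awk '
  # i is an extra parameter, which makes it local to the function and
  # resets it to empty (numerically 0) on every call.
  function print_all(arr, n,    i) {
      while (n--)
          print arr[++i]
  }
  BEGIN {
      for (i = 1; i <= 5; i++)      # an earlier loop leaves the global i at 6
          total += i
      n = split("a,b,c", arr, /,/)
      print_all(arr, n)             # n is passed by value, so it is still 3 here
      print "n is still " n
  }'
}
demo_local_counter
```

The cost is an extra function definition, so this only pays off when the loop is reused in several places.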

Ed.

Grant

Jan 25, 2009, 9:43:23 PM

On Sun, 25 Jan 2009 11:37:55 -0500, Jim Hart <nor...@fairpoint.net> wrote:

>
>
>I've written this kind of thing
>
> n = split(something,arr,/re/)
> for(i=1;i<=n;i++) {
> print arr[i]
> }
>
>so often, it's tedious. I like this better:
>
> n = split(something,arr,/re/)
> while(n--) {
> print arr[i++]
> }

Assumes i has not been used before -- use with care ;)

Grant.
--
http://bugsplatter.id.au

r.p....@gmail.com

Jan 26, 2009, 5:42:36 PM

My response might not be on point, since you seem to have general
looping in mind, not just array index looping. But here is another
possibility with arrays:

split($0,arr,myregexp)
for (i=1; i in arr; i++) print arr[i]

The reason this is easier is that you don't have to introduce n (which
might require some thought about whether it's already in use, though
there is nothing wrong with naming discipline, e.g., narr, or n_arr),
and you don't have to shift for the <=. It's also easier to read, and
you get to print your values in order.
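
One property worth keeping in mind with this form: the loop ends at the first missing index, so it suits arrays indexed 1..n without gaps, which is what split() produces. A minimal sketch with made-up sample data (the delete is there only to create a gap):

```sh
# Hypothetical demo: the sample string and the deleted index are illustrative.
demo_in_loop() {
  awk '
  BEGIN {
      split("a,b,c,d", arr, /,/)
      delete arr[3]                 # punch a hole in the sequence
      for (i = 1; i in arr; i++)    # the "in" test does not create arr[i]
          print i, arr[i]
      print "stopped at " i
  }'
}
demo_in_loop
```

Here the loop prints elements 1 and 2 and stops at index 3, never reaching element 4.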

Janis Papanagnou

Jan 27, 2009, 4:26:05 AM

Keep in mind that the 'in' test is much less performant than the '<=' comparison...

$ time awk 'BEGIN{for(i=1;i<=1000000;i++)arr[i]=i}'
real 0m2,79s
user 0m2,42s
sys 0m0,14s

$ time awk 'BEGIN{for(i=1;i<=1000000;i++)arr[i]=i
for (i=1; i<=1000000; i++) x=i}'
real 0m3,24s
user 0m2,91s
sys 0m0,14s

$ time awk 'BEGIN{for(i=1;i<=1000000;i++)arr[i]=i
for (i=1; i in arr; i++) x=i}'
real 0m4,42s
user 0m4,02s
sys 0m0,22s

That's a net ~1.63s for the 'in' operator and ~0.45s for the '<=' operator.

Janis

r.p....@gmail.com

Jan 27, 2009, 12:24:37 PM

On Jan 27, 4:26 am, Janis Papanagnou <janis_papanag...@hotmail.com>
wrote:

> Janis

That is a very good point. Interesting that I never found that to be
a bottleneck, though.

Dave Gibson

Jan 27, 2009, 1:52:44 PM

Jim Hart <nor...@fairpoint.net> wrote:
>
>
> I've written this kind of thing
>
> n = split(something,arr,/re/)

[snip: accessing array elements 1..n]

> Easier to type. And, in cases where front-to-back or back-to-front
> doesn't matter, it's even simpler:
>
> # copy a number indexed array, assuming n contains the number of
> # elements
>
> while(n--) arr2[n] = arr1[n]

Because n is decremented before the loop body runs, that would
address elements n-1 down to 0: the highest-indexed element is never
copied, and spurious 0-indexed elements are created in both arrays.

This works:

n = split(something,arr1,/re/) + 1

while(--n) arr2[n] = arr1[n]
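
A quick way to check the difference is to print the indices each form visits. This is only a sketch, with n fixed at 3 as a stand-in for the value split() would return:

```sh
# Hypothetical demo: n = 3 stands in for the element count from split().
demo_decrement_order() {
  awk '
  BEGIN {
      n = 3
      printf "while (n--):"         # test the old value, then decrement
      while (n--) printf " %d", n
      printf "\n"

      n = 3 + 1
      printf "while (--n):"         # decrement first, then test
      while (--n) printf " %d", n
      printf "\n"
  }'
}
demo_decrement_order
```

The first line of output lists 2 1 0 and the second 3 2 1, so only the pre-decrement form started from n + 1 touches exactly the indices 1..n.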

Janis Papanagnou

Jan 27, 2009, 9:08:04 PM

r.p....@gmail.com wrote:
>
> That is a very good point. Interesting that I never found that to be
> a bottleneck, though.

Maybe because you usually don't operate on really large datasets?
Or maybe other constructs you choose to use require yet more time?
You may have never nested a less performant construct inside a loop?
...many possibilities.

Janis

Hermann Peifer

Feb 2, 2009, 1:18:14 PM

...and this works also:

while(n--) arr2[n] = arr1[n]

Hermann

Dave Gibson

Feb 2, 2009, 4:40:04 PM

That would still make an assignment when n == 0, creating the 0-indexed
elements (in both arrays) which the pre-decrement form avoids.

Run the following with -v pre=1 and -v pre=0 to see the effect:

awk -v pre=1 '
BEGIN {
    n = split("a b c d", arr1) + 1

    printf "All elements:\n"
    for (i in arr1)
        printf "arr1[%d] = \"%s\"\n", i, arr1[i]

    if (pre)
        while (--n) arr2[n] = arr1[n]
    else
        while (n--) arr2[n] = arr1[n]

    printf "\nAll elements of both arrays after copy:\n"
    for (i in arr1)
        printf "arr1[%d] = \"%s\"\tarr2[%d] = \"%s\"\n", i, arr1[i], i, arr2[i]

    exit
}'