I've written this kind of thing
n = split(something,arr,/re/)
for(i=1;i<=n;i++) {
print arr[i]
}
so often, it's tedious. I like this better:
n = split(something,arr,/re/)
while(n--) {
print arr[i++]
}
Easier to type. And, in cases where front-to-back or back-to-front
doesn't matter, it's even simpler:
# copy a number indexed array, assuming n contains the number of
# elements
while(n--) arr2[n] = arr1[n]
And, yes,
for(i in arr1) arr2[i] = arr1[i]
works, too. But, some loops don't involve arrays. :-)
Not earth-shattering, to be sure, but when one writes a lot of AWK...
(Braces included in the first example, for completeness, in case there's
more than one line of code in the block.)
ITYM
print arr[++i]
Kinda dangerous though. If someone introduces a
for (i=1;i<=whatever;i++)
loop above the split() you'll end up trying to print the array
starting at index "whatever+1". Also, what if you need to use n again
later? It just doesn't seem worthwhile to me to sacrifice the clarity
and relative security of a "normal" loop for the brevity.
Ed.
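To make the two concerns above concrete, here is a minimal sketch. The function name demo, the reset variable, and the sample data are made up for illustration and are not from the posts above. It runs the while(n--) idiom once with a counter left over from an earlier loop and once with the counter reset, and prints n afterwards to show that the count has been consumed.

```shell
# Hypothetical demo: demo, reset and the sample string are illustrative names/data.
demo() {    # $1: 1 resets i before the loop, 0 leaves it stale
    awk -v reset="$1" 'BEGIN {
        for (i = 1; i <= 3; i++) total += i   # unrelated earlier loop; leaves i == 4
        n = split("a b c", arr)
        if (reset) i = 0
        while (n--) printf "[%s]", arr[++i]   # with stale i this reads arr[5..7]
        printf " n=%d\n", n                   # n is no longer the element count
    }'
}
demo 0   # prints: [][][] n=-1
demo 1   # prints: [a][b][c] n=-1
```

With the stale counter the loop silently prints three empty elements rather than failing, which is what makes the shortcut risky.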
>
>
>I've written this kind of thing
>
> n = split(something,arr,/re/)
> for(i=1;i<=n;i++) {
> print arr[i]
> }
>
>so often, it's tedious. I like this better:
>
> n = split(something,arr,/re/)
> while(n--) {
> print arr[i++]
> }
Assumes i has not been used before -- use with care ;)
Grant.
--
http://bugsplatter.id.au
My response might not be on point, since you seem to have general
looping in mind, not just array index looping. But here is another
possibility with arrays:
split($0,arr,myregexp)
for (i=1; i in arr; i++) print arr[i]
The reason this is easier is that you don't have to introduce n (which
might require some thought about whether it's already in use, though
there is nothing wrong with naming discipline, e.g., narr, or n_arr),
and you don't have to shift for the <=. It's also easier to read, and
you get to print your values in order.
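A small sketch of that loop, with one caveat that follows from how it works: the test "i in arr" ends the loop at the first missing index, so it only walks a contiguous 1..n range. The function name in_loop and the sample data are assumptions for illustration.

```shell
# Hypothetical demo: in_loop and the sample string are illustrative.
in_loop() {
    awk 'BEGIN {
        split("a b c d", arr)
        for (i = 1; i in arr; i++) printf "[%s]", arr[i]   # no n needed
        print ""
        delete arr[3]                                      # punch a hole at index 3
        for (i = 1; i in arr; i++) printf "[%s]", arr[i]   # stops at the gap
        print ""
    }'
}
in_loop
# prints:
# [a][b][c][d]
# [a][b]
```

For arrays filled by split() this is not a concern, since split() always produces indices 1 through n with no gaps.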
Keep in mind that it's much less performant...
$ time awk 'BEGIN{for(i=1;i<=1000000;i++)arr[i]=i}'
real 0m2,79s
user 0m2,42s
sys 0m0,14s
$ time awk 'BEGIN{for(i=1;i<=1000000;i++)arr[i]=i
for (i=1; i<=1000000; i++) x=i}'
real 0m3,24s
user 0m2,91s
sys 0m0,14s
$ time awk 'BEGIN{for(i=1;i<=1000000;i++)arr[i]=i
for (i=1; i in arr; i++) x=i}'
real 0m4,42s
user 0m4,02s
sys 0m0,22s
That's a net ~1.63s for the 'in' operator and ~0.45s for the '<=' operator.
Janis
That is a very good point. Interesting that I never found that to be
a bottleneck, though.
[snip: accessing array elements 1..n]
> Easier to type. And, in cases where front-to-back or back-to-front
> doesn't matter, it's even simpler:
>
> # copy a number indexed array, assuming n contains the number of
> # elements
>
> while(n--) arr2[n] = arr1[n]
Because n is decremented after the test but before the loop body
runs, that would address the elements from n-1 down to 0, so the
highest-indexed element is not copied and spurious 0-indexed
elements are created in both arrays.
This works:
n = split(something,arr1,/re/) + 1
while(--n) arr2[n] = arr1[n]
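As a rough illustration of the difference, the sketch below records which indices each form actually touches for a four-element array. The function name visited, the mode variable, and the sample data are assumptions for illustration only.

```shell
# Hypothetical demo: visited, mode and seen are illustrative names.
visited() {   # $1: "post" = while(n--), "pre" = n+1 then while(--n)
    awk -v mode="$1" 'BEGIN {
        n = split("a b c d", arr1)
        if (mode == "pre") {
            n++
            while (--n) { arr2[n] = arr1[n]; seen = seen " " n }
        } else {
            while (n--) { arr2[n] = arr1[n]; seen = seen " " n }
        }
        print "visited:" seen
        print "0 in arr2: " ((0 in arr2) ? "yes" : "no")
    }'
}
visited post   # prints: visited: 3 2 1 0   then   0 in arr2: yes
visited pre    # prints: visited: 4 3 2 1   then   0 in arr2: no
```

The post-decrement form skips index 4 and creates index 0, while the pre-decrement form with n+1 covers exactly 4 down to 1.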
Maybe because you usually don't operate on really large datasets?
Or maybe other constructs you choose to use require yet more time?
You may have never nested a less performant construct inside a loop?
...many possibilities.
Janis
...and this works also:
while(n--) arr2[n] = arr1[n]
Hermann
That would still make an assignment when n == 0, creating the 0-indexed
elements (in both arrays) which the pre-decrement form avoids.
Run the following with -v pre=1 and -v pre=0 to see the effect:
awk -v pre=1 '
BEGIN {
    n = split("a b c d", arr1) + 1
    printf "All elements:\n"
    for (i in arr1)
        printf "arr1[%d] = \"%s\"\n", i, arr1[i]
    if (pre)
        while (--n) arr2[n] = arr1[n]
    else
        while (n--) arr2[n] = arr1[n]
    printf "\nAll elements of both arrays after copy:\n"
    for (i in arr1)
        printf "arr1[%d] = \"%s\"\tarr2[%d] = \"%s\"\n", i, arr1[i], i, arr2[i]
    exit
}'