safely iterating through potentially empty arrays in bash

James Leifer

Feb 27, 2004, 8:44:43 AM

Hello,

I always thought that the correct way to iterate through the elements
of arrays in bash is via

for f in "${foo[@]}" ; ...

but this doesn't seem to work correctly when set -u is on.

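My version of bash is the standard debian woody one:

GNU bash, version 2.05a.0(1)-release (i386-pc-linux-gnu)
Copyright 2001 Free Software Foundation, Inc.

The following script isolates the problem I'm having: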

#!/bin/bash

# don't treat undefined variables as empty
set -u

# initialise foo to the empty array
declare -a foo=( )

# should say 0
echo "The number of elements in foo is ${#foo[@]}."

# should iterate 0 times but it causes an error instead
for f in "${foo[@]}"; do
    echo "$f"
done

echo "done"

I expected the following output (indenting for clarity):

    The number of elements in foo is 0.
    done

but instead I got:

    The number of elements in foo is 0.
    /tmp/artest: foo[@]: unbound variable

What would be the correct construct? I know I can always separately
test the size of the array and only enter the loop if it is non-zero,
but that is rather ugly and shouldn't be necessary. I must be missing
something obvious.
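
For reference, the size-test workaround mentioned above might look like the sketch below. It is only an illustration of the guard, and it assumes that ${#foo[@]} is safe under set -u for a declared-but-empty array, as the script above suggests. The counter "seen" is added purely for the example.

```bash
#!/bin/bash
set -u

declare -a foo=( )
seen=0   # illustration only: counts loop iterations

# Expand "${foo[@]}" only once we know there is at least one element.
if [ "${#foo[@]}" -gt 0 ]; then
    for f in "${foo[@]}"; do
        echo "$f"
        seen=$((seen + 1))
    done
fi

echo "done"
```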

Regards,
--
James

Chris F.A. Johnson

Feb 27, 2004, 2:06:05 PM

On Fri, 27 Feb 2004 at 13:44 GMT, James Leifer wrote:
> Hello,
>
> I always thought that the correct way to iterate through the elements
> of arrays in bash is via
>
> for f in "${foo[@]}" ; ...
>
> but this doesn't seem to work correctly when set -u is on.
>
> My version of bash is the standard debian woody one:
>
> GNU bash, version 2.05a.0(1)-release (i386-pc-linux-gnu)
> Copyright 2001 Free Software Foundation, Inc.
>
> The following script isolates the problem I'm having:
>
> #!/bin/bash
>
> # don't treat undefined variables as empty
> set -u

man bash:

set [--abefhkmnptuvxBCHP] [-o option] [arg ...]
........
-u Treat unset variables as an error when performing
   parameter expansion. If expansion is attempted on an unset
   variable, the shell prints an error message, and, if not
   interactive, exits with a non-zero status.

>
> # initialise foo to the empty array
> declare -a foo=( )
>
> # should say 0
> echo "The number of elements in foo is ${#foo[@]}."
>
> # should iterate 0 times but it causes an error instead
> for f in "${foo[@]}"; do
> echo "$f"
> done
>
> echo "done"
>
> I expected the following output (indenting for clarity)
>
> The number of elements in foo is 0.
> done
>
> but instead I got
>
> The number of elements in foo is 0.
> /tmp/artest: foo[@]: unbound variable
>
> What would be the correct construct? I know I can always separately
> test the size of the array and only enter the loop if non 0, but that
> is rather ugly and shouldn't be necessary. I must be missing
> something obvious.

Just remove "set -u".

--
Chris F.A. Johnson http://cfaj.freeshell.org/shell
===================================================================
My code (if any) in this post is copyright 2004, Chris F.A. Johnson
and may be copied under the terms of the GNU General Public License

Stephane CHAZELAS

Feb 27, 2004, 2:59:08 PM

2004-02-27, 19:06(+00), Chris F.A. Johnson:
[...]

> man bash:
>
> set [--abefhkmnptuvxBCHP] [-o option] [arg ...]
> ........
> -u Treat unset variables as an error when performing
> parameter expansion. If expansion is attempted on an unset
> variable, the shell prints an error message, and, if not
> interactive, exits with a non-zero status.

[...]

But note that with ksh or bash, an array with index 0 not set is
an unset variable.

bash is not consistent:

$ bash -uc 'a[1]=FOO; echo $a'

~$ bash -uc 'a[1]=FOO; echo ${a-a is not set}'
a is not set

$ ksh -uc 'a[1]=FOO; echo $a'
ksh: a: parameter not set

~$ bash -uc 'a[1]=FOO; echo "${a[0]}"'
bash: line 1: a[0]: unbound variable

($a is supposed to be a shortcut for ${a[0]}!)
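
The individual cases can be probed without tripping set -u, because the ${name+word} form does not raise the unbound-variable error. A small sketch, with the result variables r0, r1 and rall named only for illustration:

```bash
#!/bin/bash
set -u

a[1]=FOO

# ${name+word} expands to word if name is set and to nothing otherwise;
# it does not trigger the "unbound variable" error under set -u.
r0=${a[0]+set}      # element 0 was never assigned
r1=${a[1]+set}      # element 1 was assigned
rall=${a[@]+set}    # the array has at least one element

echo "a[0]: ${r0:-unset}"
echo "a[1]: ${r1:-unset}"
echo "a[@]: ${rall:-unset}"
echo "count: ${#a[@]}"
```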

zsh doesn't have problems because in zsh, array and scalar
are two distinct types of variable (and anyway, zsh arrays are
not hashes as in bash or ksh).

In any case, to avoid the error with "set -u", with ksh or bash
you need:

for f in ${a[@]+"${a[@]}"}; do ...

because in those shells empty or unset arrays are the same
thing.
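
Applied to the script from the original post, the idiom can be sketched as follows. The counters "count" and "n" are added only to make the behaviour visible. Later bash releases (4.4 onward, going by its change notes) no longer raise the error for a plain "${foo[@]}" on an empty array, but the guarded form should also work on older versions.

```bash
#!/bin/bash
set -u

declare -a foo=( )

# When foo has no elements, ${foo[@]+...} expands to nothing at all,
# so the inner "${foo[@]}" is never evaluated.
count=0
for f in ${foo[@]+"${foo[@]}"}; do
    count=$((count + 1))
done
echo "empty array: $count iterations"

# With elements present, the inner quoted expansion keeps each element whole,
# including elements that contain spaces.
foo=("one two" three)
n=0
for f in ${foo[@]+"${foo[@]}"}; do
    echo "element: $f"
    n=$((n + 1))
done
echo "done"
```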

It's OK with zsh:

~$ zsh -cu 'a=(); echo "${a[@]}"'

~$ zsh -cu 'echo "${a[@]}"'
zsh: a[@]: parameter not set

--
Stéphane ["Stephane.Chazelas" at "free.fr"]

William Park

Feb 27, 2004, 3:19:42 PM

Stephane CHAZELAS <this.a...@is.invalid> wrote:
> 2004-02-27, 19:06(+00), Chris F.A. Johnson:
> [...]
> > man bash:
> >
> > set [--abefhkmnptuvxBCHP] [-o option] [arg ...]
> > ........
> > -u Treat unset variables as an error when performing
> > parameter expansion. If expansion is attempted on an unset
> > variable, the shell prints an error message, and, if not
> > interactive, exits with a non-zero status.
> [...]
>
> But note that with ksh or bash, an array with index 0 not set is
> an unset variable.
>
> bash is not consistent:
>
> $ bash -uc 'a[1]=FOO; echo $a'
>
> ~$ bash -uc 'a[1]=FOO; echo ${a-a is not set}'
> a is not set
>
> $ ksh -uc 'a[1]=FOO; echo $a'
> ksh: a: parameter not set
>
> ~$ bash -uc 'a[1]=FOO; echo "${a[0]}"'
> bash: line 1: a[0]: unbound variable
>
> ($a is supposed to be a shortcut for ${a[0]}!)

You are missing nuance. $a is asking is 'a' defined; answer is yes,
because at least one item (a[1]) is defined. ${a[0]} is asking is
'a[0]' defined; answer is no.

>
> zsh doesn't have problems because in zsh, array and scalar
> are two distinct types of variable (and anyway, zsh arrays are
> not hashes as in bash or ksh).

In Bash, array variable names are hashed, like any other shell
variables. But, array itself (ie. data in array) is circular linked
list.

--
William Park, Open Geometry Consulting, <openge...@yahoo.ca>
Linux solution for data management and processing.

Stephane CHAZELAS

Feb 27, 2004, 4:17:39 PM

2004-02-27, 20:19(+00), William Park:
[...]

> You are missing nuance. $a is asking is 'a' defined; answer is yes,
> because at least one item (a[1]) is defined. ${a[0]} is asking is
> 'a[0]' defined; answer is no.

Nuances are not consistent.

"set -u" is meant to make the shell exit when an unset variable is
accessed. ${a-replace} is supposed to expand to "replace" if $a is unset.

The nuances don't match in the case of a[1]=2.

>> zsh doesn't have problems because in zsh, array and scalar
>> are two distinct types of variable (and anyway, zsh arrays are
>> not hashes as in bash or ksh).
>
> In Bash, array variable names are hashed, like any other shell
> variables. But, array itself (ie. data in array) is circular linked
> list

I meant that bash (and ksh) arrays are not arrays as in every
other language (C, Java, perl, zsh...) (and I'm not speaking of
the interpreter's internal implementation), but hashes with keys
restricted to the integer type.

The positional parameters are an array in bash, not bash arrays.
That may be why they are not mapped to a bash array, while they
are in zsh (argv).

in zsh:

a[100]=1

creates an array of size 100 (indices start at 1 in zsh like the
positional parameters).

In bash or ksh:

a[100]=1

creates a hash (or associative array if you prefer) of size "1",
with the value 1 associated to the key 100.
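
The bash side of this can be checked with a short sketch. The second assignment (a[3]=x) and the variable "vals" are additions for illustration, not part of the example above.

```bash
#!/bin/bash
set -u

unset a
a[100]=1
a[3]=x

# Two assignments, two elements: the gaps between indices do not count.
echo "elements: ${#a[@]}"

# Iterating over the values visits only the elements that exist,
# in increasing index order.
vals=""
for v in "${a[@]}"; do
    vals="$vals$v "
done
echo "values: $vals"
```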

Note that zsh also has associative arrays, but keys are not
restricted to the integer type and the array type is different from
the hash type:

$ typeset -A h
$ typeset -a a
$ h[foo]=bar a[12]=baz
$ echo ${#h}
1
$ echo ${#a}
12

Note that it is very tricky in bash to know the keys of its
(associative) arrays, you need something like:

get_keys() {
    local IFS=" "
    REPLY=
    local fopt=
    [[ "$-" = *f* ]] || fopt="set +f"
    set -f
    eval "set -- $(declare -p -- "$1" 2> /dev/null)"
    [[ "$2" = -*a* ]] || eval "$fopt; return 1"
    eval 'REPLY=${'"$#"'#*"=("}'
    eval "set -- ${REPLY%')'}"
    set -- "${@%%']'*}"
    REPLY=${*#'['}
    eval "$fopt"
}

$ a[4]=4
$ a[12]=12
$ get_keys a
$ echo "$REPLY"
4 12

In ksh93:

$ a[4]=4
$ a[12]=12
$ echo "${!a[@]}"
4 12
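
Bash releases from 3.0 onward accept the same ${!name[@]} expansion, so a helper like get_keys should not be needed there. A short sketch, assuming such a version (the variable "keys" is only for illustration):

```bash
#!/bin/bash
set -u

unset a
a[4]=4
a[12]=12

# "${!a[*]}" expands to the indices that are actually set,
# joined by the first character of IFS (a space by default).
keys="${!a[*]}"
echo "$keys"

# "${!a[@]}" yields the same indices as separate words.
for k in "${!a[@]}"; do
    echo "a[$k]=${a[$k]}"
done
```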

In zsh:

$ typeset -A h
$ h[foo]=12
$ h[bar]=4
$ echo ${(k)h}
foo bar

William Park

Feb 27, 2004, 6:48:09 PM

Stephane CHAZELAS <this.a...@is.invalid> wrote:
> > In Bash, array variable names are hashed, like any other shell
> > variables. But, array itself (ie. data in array) is circular linked
> > list
>
> I meant that bash (and ksh) arrays are not arrays as in every
> other language (C, Java, perl, zsh...) (and I'm not speaking of
> the interpreter's internal implementation), but hashes with keys
> restricted to the integer type.
>
> The positional parameters are an array in bash, not bash arrays.

Yes. But, only for $0-9. For $11 and up, it's linear linked list (ie.
one-way).

Hmm, thanks for the pointer. This operator is, indeed, missing in Bash.
I'll add it to my patch,
http://home.eol.ca/~parkw/index.html#bash
Meanwhile, just do
array var
:-)

Stephane CHAZELAS

Feb 28, 2004, 4:53:17 AM

2004-02-27, 23:48(+00), William Park:

> Stephane CHAZELAS <this.a...@is.invalid> wrote:
>> > In Bash, array variable names are hashed, like any other shell
>> > variables. But, array itself (ie. data in array) is circular linked
>> > list
>>
>> I meant that bash (and ksh) arrays are not arrays as in every
>> other language (C, Java, perl, zsh...) (and I'm not speaking of
>> the interpreter's internal implementation), but hashes with keys
>> restricted to the integer type.
>>
>> The positional parameters are an array in bash, not bash arrays.
>
> Yes. But, only for $0-9. For $11 and up, it's linear linked list (ie.
> one-way).

I was not speaking of bash implementation, but of bash language.
Positional parameters are an array because you have contiguous
indices, an array length ($#)... And it's normal they are since
they are built from the argv[] array. ksh and bash "arrays" are of
different shape, they can't be mapped to a C array, contrary to
zsh ones. I was just pointing out that oddity and I'm still
wondering why it (the language!) was designed that way.

William Park

Feb 28, 2004, 1:10:45 PM

Stephane CHAZELAS <this.a...@is.invalid> wrote:
> 2004-02-27, 23:48(+00), William Park:
> > Stephane CHAZELAS <this.a...@is.invalid> wrote:
> >> > In Bash, array variable names are hashed, like any other shell
> >> > variables. But, array itself (ie. data in array) is circular linked
> >> > list
> >>
> >> I meant that bash (and ksh) arrays are not arrays as in every
> >> other language (C, Java, perl, zsh...) (and I'm not speaking of
> >> the interpreter's internal implementation), but hashes with keys
> >> restricted to the integer type.
> >>
> >> The positional parameters are an array in bash, not bash arrays.
> >
> > Yes. But, only for $0-9. For $11 and up, it's linear linked list (ie.
> > one-way).
>
> I was not speaking of bash implementation, but of bash language.
> Positional parameters are an array because you have contiguous
> indices, an array length ($#)... And it's normal they are since
> they are built from the argv[] array.

Perhaps, the implementors left the possibility of inserting (ie.
increasing) positional parameter. Like, opposite of 'shift'.

> ksh and bash "arrays" are of
> different shape, they can't be mapped to a C array, contrary to
> zsh ones. I was just pointing out that oddity and I'm still
> wondering why it (the language!) was designed that way.

Interesting... You mean
var[1000]='abc'
in Zsh will create 1000 elements, last one hold 'abc'?

Stephane CHAZELAS

Feb 28, 2004, 7:15:08 PM

2004-02-28, 18:10(+00), William Park:
[...]

> Interesting... You mean
> var[1000]='abc'
> in Zsh will create 1000 elements, last one hold 'abc'?

Yes, unless you declare "var" as an associative array (typeset
-A).

zsh arrays and hashes are similar to perl ones.

James Leifer

Mar 1, 2004, 3:20:00 AM

"Chris F.A. Johnson" <c.fa.j...@rogers.com> writes:

> set [--abefhkmnptuvxBCHP] [-o option] [arg ...]
> ........
> -u Treat unset variables as an error when performing
> parameter expansion. If expansion is attempted on an unset
> variable, the shell prints an error message, and, if not
> interactive, exits with a non-zero status.

snip

> Just remove "set -u".

Hi Chris,

Thanks for your reply. I must say that I come to exactly the opposite
conclusion. I want bash to detect all attempts to read from undefined
variables, which is why I use set -u. The case in question consists
of an *empty* array, not an undefined array. Clearly bash agrees
because in my example given in the original post, the line

echo "The number of elements in foo is ${#foo[@]}."

works without error. If the foo array weren't defined, this line
would generate an error:

set -u
unset foo
echo ${#foo[@]}

====>

bash: foo: unbound variable

So it seems that there's a bug in that bash incorrectly emits an error
when doing "${foo[@]}" rather than generating an empty list of
strings.
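
One way to tell the two situations apart explicitly, whatever a given bash version does with "${foo[@]}", is to ask declare -p whether the name exists at all. The sketch below assumes that declare -p returns a non-zero status for an unknown name. The helper name is_declared is made up for this example.

```bash
#!/bin/bash
set -u

# is_declared NAME: succeed if a variable called NAME exists.
# (hypothetical helper; it relies only on the exit status of "declare -p")
is_declared() {
    declare -p "$1" > /dev/null 2>&1
}

declare -a foo=( )
if is_declared foo; then
    echo "foo: declared, ${#foo[@]} elements"
fi

unset foo
if ! is_declared foo; then
    echo "foo: not declared"
fi
```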

Comments?

-James