
Why does this print $0 instead of raising an error


Marc de Bourget

Oct 3, 2016, 4:49:33 PM
BEGIN {
    # e.g. OPTION_VALUES["--key1"] = 1
    # Test: deliberately uninitialized:
    while ((getline < ARGV[1]) > 0) {
        print $OPTION_VALUES["--key1"]
    }
    close(ARGV[1])
}

"print $OPTION_VALUES["--key1"]"
is with the code above the same as "print $0".
I would have expected at least a warning message.
Is this a bug or a feature?
It seems to be a feature, because gawk, mawk and tawk all
behave the same way, even though OPTION_VALUES["--key1"] was never set to 0.

Kaz Kylheku

Oct 3, 2016, 5:12:54 PM
On 2016-10-03, Marc de Bourget <marcde...@gmail.com> wrote:
> BEGIN {
> #e.g. OPTION_VALUES["--key1"] = 1
> #Test: deliberately uninitialized:
> while ((getline < ARGV[1]) > 0) {
> print $OPTION_VALUES["--key1"]
> }
> close(ARGV[1])
> }
>
> "print $OPTION_VALUES["--key1"]"
> is with the code above the same as "print $0".
> I would have expected at least a warning message.
> Is this a bug or a feature?

$ is an operator in Awk, not a sigil.

Does $(NF - 1) ring a bell?

In the POSIX standard, this grammar production appears:

lvalue : NAME
| NAME '[' expr_list ']'
| '$' expr ;

You can see that $ plays the role of a unary operator, whose
operand is an expr.
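
A small sketch of the point above, assuming any POSIX awk is installed as "awk" (the input line and the variable name i are made up for illustration). It shows $ being applied to arbitrary expressions rather than only to literal numbers:

```shell
# $ takes any expression as its operand: arithmetic, a variable, or both.
out=$(echo 'a b c d' | awk '{
    i = 2
    # NF is 4 here, so $(NF - 1) is field 3, $i is field 2, $(i + 1) is field 3
    print $(NF - 1), $i, $(i + 1)
}')
echo "$out"    # prints: c b c
```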

The POSIX description of Awk also contains a sentence (just one)
which actually calls it an "operator":

"In the context of the '$' operator, '|' shall behave as if it had a
lower precedence than '$'."

> It seems to be a feature because gawk, mawk, tawk
> behave the same although OPTION_VALUES["--key1"] is not 0.

That's how Awk works: if a number is required somewhere and an
expression is given which refers to something which doesn't exist, or
which refers to an empty string, or a string that contains non-numeric
junk, then, silently, the value zero emerges. (Moreover, there is
usually the side effect that the nonexistent thing pops into
existence.)
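
The following sketch illustrates that paragraph, again assuming a POSIX awk installed as "awk". The names never_set, empty, junk, arr and created are hypothetical and exist only for this demonstration. Each of the four field references ends up as $0, and the array element exists afterwards even though nothing was ever assigned to it:

```shell
out=$(echo 'alpha beta' | awk '{
    empty = ""
    junk = "abc"
    print $never_set     # never assigned: numeric value 0, so this is $0
    print $empty         # empty string: numeric value 0, so this is $0
    print $junk          # non-numeric string: numeric value 0, so this is $0
    print $arr["k"]      # nonexistent element: 0 again, and arr["k"] is created
    created = ("k" in arr)
    print created        # 1: the mere reference brought the element into existence
}')
echo "$out"
```

The first four output lines are "alpha beta" and the last one is "1".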

If you reject this design as bad, of course you want all such situations
to be diagnosed. Many "well-golfed" Awk one-liners will then trigger
diagnostics.

If you accept this language design, how would you articulate a
rationale for generating a warning for $whatever[index], yet
not generate a warning for uses like counter++ (where counter is never
previously mentioned at all, let alone initialized to zero)?

There are legitimate programs which evaluate $EXPR where EXPR
may refer to a blank, junk or nonexistent array element or variable.

Awk (or an implementation thereof) could benefit from a mode analogous
to the shell's "set -u" which traps *all* accesses to anything
undefined, plus another option which catches all non-numeric junk being
used in an arithmetic context.

pk

Oct 3, 2016, 5:21:10 PM
1) If a is 5, then doing $a is like doing $5. But if a is 0 or empty, doing
$a is like doing $0.

2) if OPTION_VALUES is an array, OPTION_VALUES["--key1"] is the element of
OPTION_VALUES whose key is "--key1", no matter how many times you
reference it. "--key1" (with quotes) is a string and never changes.

Put the two points above together and draw the conclusions.
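
For anyone who wants to see the two points side by side, here is a minimal sketch, assuming a POSIX awk installed as "awk" and a made-up input line. The only difference between the two runs is whether the array element has been assigned:

```shell
# Element set to 2: $OPTION_VALUES["--key1"] is $2.
set_case=$(echo 'one two three' |
    awk '{ OPTION_VALUES["--key1"] = 2; print $OPTION_VALUES["--key1"] }')

# Element never set: its numeric value is 0, so the same expression is $0.
unset_case=$(echo 'one two three' |
    awk '{ print $OPTION_VALUES["--key1"] }')

echo "$set_case"      # prints: two
echo "$unset_case"    # prints: one two three
```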

Marc de Bourget

Oct 3, 2016, 5:30:12 PM
Thank you very much Kaz for the explanations.
I still find this $ behaviour somewhat strange:
This behaviour is the cause of many errors which are difficult to detect.
Everything seems to work properly at first glance but actually it does not.
BTW, I never use "counter++" without initializing "counter = 0" before.

Ed Morton

Oct 3, 2016, 7:00:50 PM
On 10/3/2016 3:49 PM, Marc de Bourget wrote:
> BEGIN {
> #e.g. OPTION_VALUES["--key1"] = 1
> #Test: deliberately uninitialized:
> while ((getline < ARGV[1]) > 0) {
> print $OPTION_VALUES["--key1"]
> }
> close(ARGV[1])
> }
>
> "print $OPTION_VALUES["--key1"]"
> is with the code above the same as "print $0".
> I would have expected at least a warning message.
> Is this a bug or a feature?

It's a feature.

> It seems to be a feature because gawk, mawk, tawk
> behave the same although OPTION_VALUES["--key1"] is not 0.

Remember in awk you don't have to declare or initialize variables, and an
uninitialized variable (or array element) is of type numeric-string and has the
value zero-or-null. So in your code you have an array OPTION_VALUES with an
element indexed by "--key1" that has never been populated and so is a numeric
string with the value zero-or-null.

You're then applying the `$` operator to the result and `$` treats any variable
it's applied to as numeric so you're applying `$` to the number `0`, i.e. `$0`.
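
If the silent fallback to `$0` is not wanted, one possible guard (a sketch only, not something proposed in this thread) is to test for the element with the `in` operator before using it, since `in` does not create the element. The shell function name field_or_fail, its argument convention and the error text are all hypothetical; a POSIX awk installed as "awk" is assumed:

```shell
# field_or_fail is a hypothetical helper used only for this sketch.
# Its argument is the field number to store in OPTION_VALUES["--key1"];
# an empty argument leaves the element unset.
field_or_fail() {
    awk -v n="$1" '
    BEGIN { if (n != "") OPTION_VALUES["--key1"] = n }
    {
        # "in" tests for existence without creating the element
        if (!("--key1" in OPTION_VALUES)) {
            print "error: --key1 is not set"
            exit 1
        }
        print $OPTION_VALUES["--key1"]
    }'
}

echo 'one two three' | field_or_fail 3     # prints: three
echo 'one two three' | field_or_fail "" || echo "awk exited with a non-zero status"
```

The second call prints the error message instead of the whole line, and awk exits with status 1.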

This ability to use uninitialized variables is a great strength and weakness of
awk - it's incredibly useful most of the time but then that one time you
mis-spell a variable or something:

foobar = 7
print fobar

it's a PITA to debug (just spent an hour doing that this morning :-( ). Overall
it's "a good thing".

Ed.


Hermann Peifer

Oct 4, 2016, 1:53:28 AM
On 2016-10-04 1:00, Ed Morton wrote:
>
> This ability to use uninitialized variables is a great strength and
> weakness of awk - it's incredibly useful most of the time but then that
> one time you mis-spell a variable or something:
>
> foobar = 7
> print fobar
>
> it's a PITA to debug (just spent an hour doing that this morning :-( ).
> Overall it's "a good thing".
>

What I usually do for debugging is:
$ gawk --lint 'BEGIN{foobar = 7; print fobar}'
awk: cmd. line:1: warning: reference to uninitialized variable `fobar'

There is also the gawk debugger, which is quite convenient, in more
complex cases.

Hermann

Marc de Bourget

Oct 4, 2016, 2:10:44 AM
Thank you Hermann, good hint. With my test file from the top of the thread, this produces:
>gawk --lint -f test.awk file.txt
gawk: test.awk:5: warning: reference to uninitialized element `OPTION_VALUES["--key1"]'

Ed Morton

Oct 4, 2016, 10:45:01 AM
On 10/4/2016 12:53 AM, Hermann Peifer wrote:
> On 2016-10-04 1:00, Ed Morton wrote:
>>
>> This ability to use uninitialized variables is a great strength and
>> weakness of awk - it's incredibly useful most of the time but then that
>> one time you mis-spell a variable or something:
>>
>> foobar = 7
>> print fobar
>>
>> it's a PITA to debug (just spent an hour doing that this morning :-( ).
>> Overall it's "a good thing".
>>
>
> What I usually do for debugging is:
> $ gawk --lint 'BEGIN{foobar = 7; print fobar}'
> awk: cmd. line:1: warning: reference to uninitialized variable `fobar'

Thanks but the code I was debugging wasn't that simple.

Ed.