(GAWK) fatal: attempt to use scalar 'x' as an array


Kenny McCormack

Aug 12, 2021, 6:54:14 AM
Observe:

$ gawk4 '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i] }'
Nothing in the array
gawk4: cmd. line:1: fatal: attempt to use scalar `x' as an array
$

This happens because I hit ^D (eof) as the first input to this program.

I can "fix" this issue by either:
1) Entering at least one valid line of input before hitting EOF.
or
2) Reversing the order of the two clauses in the END section.
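
A minimal sketch of fix 2, for readers who want to try it. The `AWK` variable and the fallback to plain `awk` when `gawk` is not installed are assumptions of this sketch, not part of the original post. Input is taken from /dev/null to stand in for hitting ^D immediately.

```shell
# Assumption: prefer gawk if it is installed, otherwise use the system awk.
AWK=$(command -v gawk || command -v awk)

# Same program as above, but the for-in loop now runs before length(x),
# so the first runtime use of x in END is an array context.
out=$("$AWK" '{ x[$1] = $0 }
END {
    for (i in x) print i, x[i]
    if (!length(x)) print "Nothing in the array"
}' < /dev/null)

printf '%s\n' "$out"
```

With empty input this should print only "Nothing in the array", with no fatal error.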

Now, of course, it is clear what is going on here - and why the "fixes"
work. But what surprises me is that it (the error) happens at all. My
understanding had been that the issue (i.e., dark corner in the GAWK
language) of whether something is an array or a scalar is resolved entirely
at compile time. That basically the context of a variable's first
occurrence determined its type (i.e., array or not). So:
1) Note that the first occurrence of x is in the assignment, so you'd
think it would get array'ized there.
2) I find it strange that the runtime execution path (whether or not
the assignment part ever executes) ends up determining the type of 'x'.

I find this interesting. It would be nice if this dark corner could be
removed from (fixed in) the language.

Note, incidentally, that TAWK is better here (as it [almost] always is).
This dark corner does not exist in TAWK. A variable can have multiple
types in the course of a program.

--
To most Christians, the Bible is like a software license. Nobody
actually reads it. They just scroll to the bottom and click "I agree."

- author unknown -

Janis Papanagnou

Aug 12, 2021, 8:17:14 AM
On 12.08.2021 12:54, Kenny McCormack wrote:
> Observe:
>
> $ gawk4 '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i] }'
> Nothing in the array
> gawk4: cmd. line:1: fatal: attempt to use scalar `x' as an array
> $
>
> This happens because I hit ^D (eof) as the first input to this program.
>
> I can "fix" this issue by either:
> 1) Entering at least one valid line of input before hitting EOF.
> or
> 2) Reversing the order of the two clauses in the END section.

or
3) Force x to become an array.


I recall a similar question of mine a couple of months ago (you were
amongst the first responders, BTW), and my try on a solution was code
like the one in the BEGIN clause added to your test case:

gawk '
BEGIN { x["dummy"] ; delete x["dummy"] }
{ x[$1] = $0 }
END { if (!length(x)) print "Nothing in the array"
      for (i in x) print i,x[i]
}
'

>
> Now, of course, it is clear what is going on here - and why the "fixes"
> work. But what surprises me is that it (the error) happens at all. My
> understanding had been that the issue (i.e., dark corner in the GAWK
> language) of whether something is an array or a scalar is resolved entirely
> at compile time.

I don't think so. I think it's a pure runtime issue.

Janis

> [...]


Janis Papanagnou

Aug 12, 2021, 9:52:24 AM
On 12.08.2021 14:17, Janis Papanagnou wrote:
>
> or
> 3) Force x to become an array.

I just recalled that it was, I think, Ed who suggested a less bulky form.

BEGIN { delete x[""] }

works, and it seems

BEGIN { delete x }

works as well (at least in my GNU Awk context).
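
As a sketch of how the shortest form fits into the original test case (the `AWK` variable and the fallback to plain `awk` are assumptions added here for illustration):

```shell
# Assumption: prefer gawk if it is installed, otherwise use the system awk.
AWK=$(command -v gawk || command -v awk)

out=$("$AWK" '
BEGIN { delete x }    # first runtime reference to x is an array context
{ x[$1] = $0 }
END {
    if (!length(x)) print "Nothing in the array"
    for (i in x) print i, x[i]
}' < /dev/null)

printf '%s\n' "$out"
```

The `delete x` has to sit in BEGIN rather than at the top of END; in END it would also throw away whatever the main rule had stored.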

>
> gawk '
> BEGIN { x["dummy"] ; delete x["dummy"] }
> { x[$1] = $0 }
> END { if (!length(x)) print "Nothing in the array"
> for (i in x) print i,x[i]
> }
> '

Janis

Kenny McCormack

Aug 12, 2021, 1:27:01 PM
In article <sf392n$gnd$1...@news-1.m-online.net>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>On 12.08.2021 14:17, Janis Papanagnou wrote:
>>
>> or
>> 3) Force x to become an array.
>
>I just recall that, I think it was Ed who suggested a less bulky form.
>
> BEGIN { delete x[""] }
>
>works, and it seems
>
> BEGIN { delete x }
>
>works as well (at least in my GNU Awk context).

So, the point is, it really does just boil down to: You have to ensure
that, whatever execution path your program takes, the first runtime
reference to the variable is unequivocally an array context.
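
To illustrate the execution-path dependence, here is a sketch that feeds the unmodified program one line of input, so that the assignment in the main rule is the first runtime reference to x. The `AWK` and `prog` shell variables and the sample line "a 1" are made up for this example.

```shell
# Assumption: prefer gawk if it is installed, otherwise use the system awk.
AWK=$(command -v gawk || command -v awk)

# The original program, unchanged: length(x) comes before the for-in loop.
prog='{ x[$1] = $0 }
END {
    if (!length(x)) print "Nothing in the array"
    for (i in x) print i, x[i]
}'

# One record of input: x[$1] = $0 runs first and makes x an array.
out=$(printf 'a 1\n' | "$AWK" "$prog")

printf '%s\n' "$out"
```

This should print "a a 1". Running the same `prog` with input from /dev/null is the case that, per this thread, is fatal in gawk 4 and gawk 5.1.0.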

It strikes me that it might be a good thing for GAWK to have a "declare"
statement - that would allow you to state up front that something is an
array. Bash has this now, and it is actually quite useful.

--
A racist, a Nazi, and a Klansman walk into a bar...

Bartender says, "What will it be, Mr. Trump?"

J Naman

Aug 12, 2021, 2:17:06 PM
Kenny McCormack: Would you PLEASE stop inserting political content into our conversations about the awk language! I don't care what ideology people espouse, I participate in this group to NOT have politics and ideology intrude on my conversations. Take your flames to Twitter ...

Janis Papanagnou

Aug 12, 2021, 8:48:17 PM
On 12.08.2021 20:17, J Naman wrote:
> On Thursday, 12 August 2021 at 13:27:01 UTC-4, Kenny McCormack wrote:
>> [...]
> Kenny McCormack: Would you PLEASE stop inserting political content into our conversations about the awk language! I don't care what ideology people espouse, I participate in this group to NOT have politics and ideology intrude on my conversations. Take your flames to Twitter ...

J Naman, please comply with the Usenet posting standards yourself - here:
line length - before you even try to ask others to change their posting
habits.

This is Usenet, not a web forum or a Google forum. If you'd been using
a Real Newsreader you'd certainly be better aware of that fact. Then
you'd also see that the text you were addressing was part of a randomly
generated signature, not part of the topical post/conversation. (A real
newsreader would make that quite obvious, BTW; signatures are displayed
differently and are not inserted in quoted replies, for example.)

Since we're at it, Laurent, you too, get informed about Usenet and try
complying with Netiquette; post context. Get informed!

Other Google users flooding Usenet newsgroups should check the Usenet
Netiquette as well before continuing to post. It's sad that even long
time regulars here that switched to the Google interface forgot about
where they are and about the Netiquette.

I'm too old to really care, but if all those Googlies (Google Usenet-
newbies) start to try enforcing their own rules while not complying with
long-standing Usenet Netiquette it's time to speak up.

(Google users will know how to use the [Google] search engine to find
the Netiquette and information about Usenet newsgroups, so I abstain
from doing their homework and providing the link.)

Janis

Janis Papanagnou

Aug 12, 2021, 9:03:12 PM
On 12.08.2021 19:27, Kenny McCormack wrote:
> In article <sf392n$gnd$1...@news-1.m-online.net>,
> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>> On 12.08.2021 14:17, Janis Papanagnou wrote:
>>>
>>> or
>>> 3) Force x to become an array.
>>
>> I just recall that, I think it was Ed who suggested a less bulky form.
>>
>> BEGIN { delete x[""] }
>>
>> works, and it seems
>>
>> BEGIN { delete x }
>>
>> works as well (at least in my GNU Awk context).
>
> So, the point is, it really does just boil down to: You have to ensure
> that, whatever execution path your program takes, the first runtime
> reference to the variable is an unequivocally array context.
>
> It strikes me that it might be a good thing for GAWK to have a "declare"
> statement - that would allow you to state up front that something is an
> array. Bash has this now, and it is actually quite useful.

Well, part of the beauty of Awk is its terseness, here the fact that
you don't need declarations. Of course the feature could be optional,
but then you'd have to introduce another keyword, something language
designers usually want to avoid. Declarations are especially useful
where a lot of data structuring features are present. GNU Awk started
to enter that path already by the support of multi-dimensional arrays,
so maybe, depending on any further plans to introduce yet more data
structuring features, a 'declare' might eventually be the consequence.
For the current primitive vs. compound data type dichotomy it's likely
just overkill, especially since there's a code pattern to address that
issue.

(It's a bit different in shells, with Bash's declare or Ksh's typeset;
typeset, for example, is a much more powerful concept than a simple
array/string/numeric declaration.)

Janis

Andrew Schorr

Aug 12, 2021, 9:34:22 PM
On Thursday, August 12, 2021 at 6:54:14 AM UTC-4, Kenny McCormack wrote:
> Observe:
>
> $ gawk4 '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i] }'
> Nothing in the array
> gawk4: cmd. line:1: fatal: attempt to use scalar `x' as an array
> $

As others have pointed out, adding 'delete x' essentially declares it as an array.
That being said, there seems to be a patch in the development tree that fixes this issue.
In gawk 5.1.0:

bash-4.2$ ./gawk '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i]; print typeof(x) }' < /dev/null
Nothing in the array
gawk: cmd. line:1: fatal: attempt to use scalar `x' as an array

In the master branch:

bash-4.2$ ./gawk '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i]; print typeof(x) }' < /dev/null
Nothing in the array
array

So this problem may eventually go away. But it is safest to say 'delete x' to avoid ambiguity.

Regards,
Andy



Kenny McCormack

Aug 12, 2021, 10:47:41 PM
In article <d3603a3b-96fb-441c...@googlegroups.com>,
Andrew Schorr <asc...@telemetry-investments.com> wrote:
...
>In the master branch:
>
>bash-4.2$ ./gawk '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the
>array";for (i in x) print i,x[i]; print typeof(x) }' < /dev/null
>Nothing in the array
>array

This is good. I am glad to see that it is being worked on.

I think we can all agree that while it is not a big deal in the grand
scheme of things, and we all know by now how to work around it, it is in the
category of "surprising" and it would be better if it didn't happen.

>So this problem may eventually go away. But it is safest to say 'delete x' to
>avoid ambiguity.

For me, it was easiest to just reverse the order of the two clauses in the
END section (*). The idea of putting a BEGIN clause in (which my program
does not currently have) and to put the obscure incantation of "delete x"
there seems odd. Not necessarily bad, but odd.

(*) It was actually pretty much just by accident that I coded it in that
order originally. It could just as easily have been coded the other way
from the start.

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/FreeCollege

Aharon Robbins

Aug 13, 2021, 3:51:43 PM
In article <sf33g8$f3s$1...@news-1.m-online.net>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>> Now, of course, it is clear what is going on here - and why the "fixes"
>> work. But what surprises me is that it (the error) happens at all. My
>> understanding had been that the issue (i.e., dark corner in the GAWK
>> language) of whether something is an array or a scalar is resolved entirely
>> at compile time.
>
>I don't think so. I think it's a pure runtime issue.

This is correct, it is a pure runtime issue.
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com

Aharon Robbins

Aug 13, 2021, 3:56:08 PM
>In the master branch:
>
>bash-4.2$ ./gawk '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in
>the array";for (i in x) print i,x[i]; print typeof(x) }' < /dev/null
>Nothing in the array
>array
>
>So this problem may eventually go away. But it is safest to say 'delete
>x' to avoid ambiguity.

Using 'delete x', or some other way to force x to be an array is the
most portable thing to do.

The fix in the development branch is that `length(x)' on a never
assigned value no longer forces that value to be a scalar, but leaves
it as undefined, and returns 0, which is correct both for undefined
scalars and undefined arrays.

This will be included in the next release.