
GAWK: When can we set ARGC=1 ?


Kenny McCormack

Feb 4, 2021, 8:57:34 PM
I sometimes use the technique of passing in data via the command line in
the place normally reserved for filename args. Yes, I realize this is
non-standard, and that there are other ways to do it. I'm not interested
in any arguments or suggestions about those alternatives. The technique is
something like:

$ someCommand | gawk 'BEGIN { ARGC = 1 }
/something/ { for (i in ARGV) print i,ARGV[i] }' 'string 1' 'string 2' ...

The trick here is that you explicitly set ARGC to 1, so that your strings
don't get interpreted as filenames. Written as above, it all works fine.
As long as you "kill" ARGV via setting ARGC in the BEGIN clause, it works
as expected.
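
A self-contained version of the working form, as a minimal sketch: `printf` is a hypothetical stand-in for someCommand, and plain `awk` is used on the assumption that any POSIX awk (gawk included) treats a BEGIN-time ARGC assignment the same way. The loop walks ARGV by index rather than with `for (i in ARGV)` so that the order is fixed and ARGV[0] is skipped.

```shell
# Hypothetical stand-in for "someCommand": two lines arrive on stdin.
demo_argc_begin() {
    printf 'something here\nother line\n' |
    awk 'BEGIN { ARGC = 1 }   # hide the operands from the file-reading loop
    /something/ { for (i = 1; i in ARGV; i++) print i, ARGV[i] }' 'string 1' 'string 2'
}
demo_argc_begin
```

ARGV keeps its elements after ARGC is lowered, which is why the strings are still readable from the main rule.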

Now, just for fun, I was playing around with some alternatives, and found
that neither of the following variations works (by "not work", I mean
that it tries to interpret "string 1" as a filename, which of course fails
and causes a fatal error abort from the program).

1) $ someCommand | gawk -v ARGC=1 '
/something/ { for (i in ARGV) print i,ARGV[i] }' 'string 1' 'string 2' ...

2) $ someCommand | gawk '
/something/ { for (i in ARGV) print i,ARGV[i] }' ARGC=1 'string 1' 'string 2' ...

I'm curious as to why neither of these works. To my mind, it seems they should.

(Particularly, the first one; I can sort of get why the second one might
not work)

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/Pedantic

Janis Papanagnou

Feb 5, 2021, 7:16:56 AM
On 05.02.2021 02:57, Kenny McCormack wrote:
> I sometimes use the technique of passing in data via the command line in
> the place normally reserved for filename args. Yes, I realize this is
> non-standard, and that there are other ways to do it. I'm not interested
> in any arguments or suggestions about those alternatives. The technique is
> something like:
>
> $ someCommand | gawk 'BEGIN { ARGC = 1 }
> /something/ { for (i in ARGV) print i,ARGV[i] }' 'string 1' 'string 2' ...
>
> The trick here is that you explicitly set ARGC to 1, so that your strings
> don't get interpreted as filenames. Written as above, it all works fine.
> As long as you "kill" ARGV via setting ARGC in the BEGIN clause, it works
> as expected.
>
> Now, just for fun, I was playing around with some alternatives, and found
> that neither of the following variations work (and by "not work", I mean
> that it tries to interpret "string 1" as a filename, which of course fails
> and causes a fatal error abort from the program.
>
> 1) $ someCommand | gawk -v ARGC=1 '
> /something/ { for (i in ARGV) print i,ARGV[i] }' 'string 1' 'string 2' ...
>
> 2) $ someCommand | gawk '
> /something/ { for (i in ARGV) print i,ARGV[i] }' ARGC=1 'string 1' 'string 2' ...
>
> I'm curious as to why neither of these work. To my mind, it seems they should.
>
> (Particularly, the first one; I can sort of get why the second one might
> not work)

I wouldn't expect anything here. While the three variants seem to do
the same thing, the assignment takes effect at a different point in
time in each of them. Because of that I get different error messages
for the two failing cases. So it depends on when the file is opened
and when awk determines whether file operands are present or not.
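
The timing difference can be seen with an ordinary variable instead of ARGC. This is only a sketch: the variable `x` and its values are made up for illustration, and `-` is given as an explicit operand for standard input so that the operand assignment has a "file" to precede.

```shell
# -v assignments are applied before BEGIN runs; an assignment operand is
# applied only when awk reaches it while walking the operand list.
show_assignment_timing() {
    printf 'one record\n' |
    awk -v x=from_v '
        BEGIN { print "BEGIN:", x }   # sees the -v value
        { print "main:", x }          # sees the operand value
    ' x=from_operand -
}
show_assignment_timing
```

So a `-v` assignment, a BEGIN assignment and an operand assignment really do happen at three different moments, and how ARGC reacts at each of them is up to the implementation.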

Has the GNU Awk manual nothing to say about the processing order?
Does POSIX specify anything about it?

Janis

Geoff Clare

Feb 5, 2021, 8:41:05 AM
There are some clarifications about ARGC and ARGV planned for the
next revision of POSIX. See:

https://austingroupbugs.net/view.php?id=974#c3231

One of the things the new description says is "It is unspecified
whether alterations to ARGC can be made using the -v option."

However, for the second command, unless I missed something I think it
is (will be) required to work.

--
Geoff Clare <net...@gclare.org.uk>

Aharon Robbins

Feb 8, 2021, 12:50:41 AM
In article <26uveh-...@ID-313840.user.individual.net>,
Geoff Clare <net...@gclare.org.uk> wrote:
>> Has the GNU Awk manual nothing to say about the processing order?
>> Does POSIX specify anything about it?
>
>There are some clarifications about ARGC and ARGV planned for the
>next revision of POSIX. See:
>
>https://austingroupbugs.net/view.php?id=974#c3231
>
>One of the things the new description says is "It is unspecified
>whether alterations to ARGC can be made using the -v option."
>
>However, for the second command, unless I missed something I think it
>is (will be) required to work.

Thanks for this link and info. I will be reviewing what it says
and if necessary I will fix gawk to take this into account.

Brian Kernighan's awk "correctly" handles the case where ARGC=1 appears
in place of a filename. Mawk and gawk don't. I haven't yet tried
any other awks.

Arnold
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com

Aharon Robbins

Feb 8, 2021, 5:27:58 AM
In article <rvqjff$1vpp$1...@gioia.aioe.org>,
Interestingly enough, this used to work. It broke at gawk 4.2.0 with
the addition of MPFR. Below is a patch that will eventually make
its way into the Git repo.

Arnold
-----------------------------
diff --git a/io.c b/io.c
index c1007423..08ea3c16 100644
--- a/io.c
+++ b/io.c
@@ -520,6 +520,9 @@ nextfile(IOBUF **curfile, bool skipping)

 			return ++i;	/* run beginfile block */
 		}
+
+		// could have had ARGC=xx on command line. sigh.
+		argc = get_number_si(ARGC_node->var_value);
 	}
 
 	if (files == false) {
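
What the patch changes can be modelled in a few lines of awk. This is a toy model under stated assumptions, not gawk's actual code: `simulate_nextfile`, `argc_var`, `limit` and the `mode` switch are all hypothetical names, and the only thing modelled is a loop bound that is either copied once before the loop or re-read from ARGC after each operand.

```shell
# $1 is "cached" (bound copied once) or "reread" (bound refreshed from
# ARGC after every operand, as the added line in the patch does).
simulate_nextfile() {
    awk -v mode="$1" 'BEGIN {
        arg[1] = "ARGC=1"; arg[2] = "string 1"; arg[3] = "string 2"
        argc_var = 4        # models the awk-level ARGC variable
        limit = argc_var    # models the private copy used as loop bound
        for (i = 1; i < limit; i++) {
            if (arg[i] ~ /^ARGC=/) {
                argc_var = substr(arg[i], 6) + 0   # operand assignment
                if (mode == "reread") limit = argc_var
                continue
            }
            print "would open: " arg[i]
        }
    }'
}
simulate_nextfile cached
simulate_nextfile reread
```

In "cached" mode the model still reports both strings as files to open; in "reread" mode the loop stops right after the assignment and prints nothing.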