Possible to prompt for input in GAWK?

Kenny McCormack

Feb 22, 2015, 9:11:26 AM
I sometimes write GAWK programs that take input from the terminal - that
is, use non-redirected stdin, and use the normal AWK automatic input loop.

So, it might look like:

% gawk '{print "The sum is:",$1+$2}'
5 12
The sum is: 17
5 7
The sum is: 12
^C
%

The problem is that there is no prompt for the user input and it is thus
hard to tell what I typed and what the computer responded with.

Now, my understanding is that in the latest versions of GAWK, there is some
kind of mechanism for addressing this - some sort of method for "re-wiring"
GAWK's basic I/O mechanisms. I think it is the same sort of underlying
method that is used to implement the "in-place" editing feature.

Am I right about this? Can someone knowledgeable comment? Thanks.

BTW, TAWK has had this since forever; when it is reading from the terminal,
it prompts, like this:

C:>awkw '{print $1+$2}'
tawk input? 5 56
61
tawk input?
C:>

Note: Not interested in workarounds that involve things like "getline".
Need it to work "natively".

--
For instance, Standard C says that nearly all extensions to C are prohibited. How
silly! GCC implements many extensions, some of which were later adopted as part of
the standard. If you want these constructs to give an error message as
“required” by the standard, you must specify ‘--pedantic’, which was
implemented only so that we can say “GCC is a 100% implementation of the
standard”, not because there is any reason to actually use it.

Ed Morton

Feb 22, 2015, 9:57:37 AM
That seems completely useless when you can trivially write:

$ awk 'function prompt(){printf "awk input? "} BEGIN{prompt()} {print $1+$2; prompt()}'
awk input? 5 56
61
awk input?

Why would you want that as part of the tool/language?

Ed.

Kenny McCormack

Feb 22, 2015, 10:03:03 AM
In article <mccqnn$c8u$1...@dont-email.me>,
...
>Why would you want that as part of the tool/language?

Heh heh. For the same reason you wanted "strptime" added to the core
language. Because to you, it was an essential feature.

Also, to be really useful, it needs to sense that stdin is a terminal (the
equivalent of using "isatty()" in C) and only prompt then.
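
For reference, the isatty() test mentioned here might look like the following minimal C sketch. The helper name maybe_prompt and its parameters are made up for illustration; nothing in it is part of gawk or TAWK.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * maybe_prompt (hypothetical helper): write a prompt to `out` only when
 * the input descriptor `in_fd` refers to a terminal.
 * Returns 1 if a prompt was written, 0 otherwise.
 */
static int maybe_prompt(int in_fd, FILE *out, const char *prompt)
{
    assert(out != NULL && prompt != NULL);
    if (!isatty(in_fd))
        return 0;           /* pipe, file, or invalid fd: stay silent */
    fputs(prompt, out);
    fflush(out);            /* show the prompt before blocking on input */
    return 1;
}
```

Called as maybe_prompt(STDIN_FILENO, stdout, "awk input? ") before each read, this stays quiet whenever input is redirected from a file or a pipe.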

Like TAWK does...

Besides, your "solution" is ugly...

P.S. I'm not asking for anything to be "added". I just want to know if the
existing technology allows it (I think it does) and if so, how to use it.

--
Just for a change of pace, this sig is *not* an obscure reference to
comp.lang.c...

Ed Morton

Feb 22, 2015, 10:21:15 AM
I wanted strptime() added because it's gritty code to write, with multiple
function calls and tmp variables, and it's very common for people to NEED that
functionality, not just want it. It's also a standard function in other
languages, it exists in libraries, and it would provide symmetry with the other
gawk time function, strftime(). IMHO that's a long way from wanting an
"awk input? " prompt when reading from stdin.

> Also, to be really useful, it needs to sense that stdin is a terminal (the
> equivalent of using "isatty()" in C) and only prompt then.
>
> Like TAWK does...
>
> Besides, your "solution" is ugly...

Agreed.
>
> P.S. I'm not asking for anything to be "added". I just want to know if the
> existing technology allows it (I think it does) and if so, how to use it.
>

OK, good luck.

Ed.

Mike Sanders

Feb 27, 2015, 4:33:46 PM
Here's native Kenny:

:: save as whatever.cmd or whatever.bat
:: [.cmd gives more options/faster execution]

@echo off

:: syntax: set /p variable=[promptstring]
:: http://ss64.com/nt/set.html

set /p X=input value for X:
set /p Y=input value Y:
echo %X% %Y% | gawk "{print \"sum:\", $1+$2}"

:: best unset'em...

set X=
set Y=

:: eof

--
Mike Sanders
www: http://freebsd.hypermart.net
gpg: 0xD94D4C13

Mike Sanders

Feb 27, 2015, 4:40:07 PM
Mike Sanders <mi...@taco-shack.cow> wrote:

> :: save as whatever.cmd or whatever.bat
> :: [.cmd gives more options/faster execution]
>
> @echo off
>
> :: syntax: set /p variable=[promptstring]
> :: http://ss64.com/nt/set.html
>
> set /p X=input value for X:
> set /p Y=input value Y:
> echo %X% %Y% | gawk "{print \"sum:\", $1+$2}"
>
> :: best unset'em...
>
> set X=
> set Y=
>
> :: eof

Please excuse the typos (dyslexic fingers today);
the example still works fine, however =)

Corrected version follows...

:: save as whatever.cmd or whatever.bat
:: [.cmd gives more options/faster execution]

@echo off

:: syntax: set /p variable=[promptstring]
:: http://ss64.com/nt/set.html

set /p X=input value X:

Andrew Schorr

Feb 28, 2015, 10:17:58 AM
On Sunday, February 22, 2015 at 9:11:26 AM UTC-5, Kenny McCormack wrote:
> Now, my understanding is that in the latest versions of GAWK, there is some
> kind of mechanism for addressing this - some sort of method for "re-wiring"
> GAWK's basic I/O mechanisms.

Yes, such a mechanism exists. You can register an input parser that will replace the normal function used to read a record of input. But I think it might be rather painful to do this, since I don't see a simple way to stack this functionality on top of the existing code to read a record. In other words, for the minor gain of printing a prompt, I think you will need to implement all the hairy logic of reading and parsing input.

The "readdir" and "readfile" extensions in the gawk distribution both register input parsers, so those can serve as examples. The gawk-xml extension provides a more complicated example.

This feature is documented in the manual under "Input Parsers". The basic challenge is to implement a get_record function. This is tricky for the general case where RS could be a regular expression, etc.
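
To give a sense of what a get_record replacement has to do, here is a rough C sketch that handles only the easy case: a single-character record separator read from a file descriptor. The function read_record and its signature are hypothetical and are not the gawk API; a real input parser would also have to cope with a regular-expression RS, buffering across reads, and so on.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * read_record (hypothetical): read bytes from fd into buf until the
 * separator `rs` or end of input is seen. The separator is not stored
 * and buf is always NUL-terminated.
 * Returns the record length, or -1 at end of input with nothing read
 * (or on a read error). Bytes that do not fit in cap-1 are dropped.
 */
static ssize_t read_record(int fd, char rs, char *buf, size_t cap)
{
    size_t len = 0;
    char c;
    ssize_t n;

    assert(buf != NULL && cap > 0);
    while ((n = read(fd, &c, 1)) == 1) {   /* one byte at a time: simple but slow */
        if (c == rs) {
            buf[len] = '\0';
            return (ssize_t) len;
        }
        if (len + 1 < cap)
            buf[len++] = c;
    }
    buf[len] = '\0';
    if (n < 0 || len == 0)
        return -1;
    return (ssize_t) len;   /* final record without a trailing separator */
}
```

Even this toy version needs decisions about truncation and about a last record with no terminator, which hints at why the general case is described as hairy.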

> I think it is the same sort of underlying
> method that is used to implement the "in-place" editing feature.

The in-place editing feature does not register an input parser or output wrapper. It just uses some API function calls to redirect stdout to a temporary file which it renames to the source file at the end.
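
The general technique described here (write the new contents to a temporary file, then rename it over the original) can be sketched in plain C as follows. This only illustrates the idea and is not the code of gawk's inplace extension; the function name replace_file and the temporary-file naming are assumptions.

```c
#define _POSIX_C_SOURCE 200809L   /* for mkstemp() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * replace_file (hypothetical): replace the contents of `path` with `data`
 * by writing a temporary file next to it and renaming that file over the
 * original. Returns 0 on success, -1 on failure.
 * Note: the new file gets mkstemp()'s permissions, not the original's.
 */
static int replace_file(const char *path, const char *data)
{
    char tmp[4096];
    size_t len;
    int fd;

    assert(path != NULL && data != NULL);
    /* Same directory as the target, so rename() stays on one filesystem. */
    if (snprintf(tmp, sizeof tmp, "%s.XXXXXX", path) >= (int) sizeof tmp)
        return -1;
    fd = mkstemp(tmp);              /* creates and opens a unique temp file */
    if (fd < 0)
        return -1;

    len = strlen(data);
    if (write(fd, data, len) != (ssize_t) len) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0 || rename(tmp, path) != 0) {
        unlink(tmp);                /* do not leave the temp file behind */
        return -1;
    }
    return 0;
}
```

Because the rename happens only after the temporary file has been written and closed, a failure part-way through leaves the original file untouched.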

Regards,
Andy

Kenny McCormack

Feb 28, 2015, 11:31:05 AM
In article <de6125db-f1e3-4ebf...@googlegroups.com>,
Andrew Schorr <asc...@telemetry-investments.com> wrote:
>On Sunday, February 22, 2015 at 9:11:26 AM UTC-5, Kenny McCormack wrote:
>> Now, my understanding is that in the latest versions of GAWK, there is some
>> kind of mechanism for addressing this - some sort of method for "re-wiring"
>> GAWK's basic I/O mechanisms.
>
>Yes, such a mechanism exists. You can register an input parser that will replace
>the normal function used to read a record of input.

Curiously enough, I had just now gotten around to doing some research on
this problem, so I was probably reaching these same conclusions at the
exact same time as you were composing and posting your response.

>But I think it might be
>rather painful to do this, since I don't see a simple way to stack this
>functionality on top of the existing code to read a record. In other words, for
>the minor gain of printing a prompt, I think you will need to implement all the
>hairy logic of reading and parsing input.

Interesting. However, in looking through this, I found this line in io.c:

iop->public.read_func = ( ssize_t(*)() ) read;

And I suspect (although I have not gotten around to testing it yet) that if
I just point that at a wrapper function (e.g., 'my_read' instead of 'read'),
I will get what I want.

Note that since I have already made a lot of source code level tweaks to
the various renditions of GAWK on my systems, I am not averse to making
source code level changes and recompiling. To me, (i.e., for my own use)
there is no particular difference between tweaking the source code vs.
doing stuff in extension libs.

>The "readdir" and "readfile" extensions in the gawk distribution both register
>input parsers, so those can serve as examples.

Yes, I've looked at those. It's pretty complicated stuff; I haven't fully
digested them yet.

>This feature is documented in the manual under "Input Parsers". The basic
>challenge is to implement a get_record function. This is tricky for the general
>case where RS could be a regular expression, etc.

As noted above, I am hopeful that I will not have to go down this road.

>> I think it is the same sort of underlying
>> method that is used to implement the "in-place" editing feature.
>
>The in-place editing feature does not register an input parser or output wrapper.
>It just uses some API function calls to redirect stdout to a temporary file which
>it renames to the source file at the end.

Correct. I did finally get around to digesting how inplace works, and can
see now that it does not involve the Input Parser mechanism.

Thanks for your reply!

--
"We should always be disposed to believe that which appears to us to be
white is really black, if the hierarchy of the church so decides."

- Saint Ignatius Loyola (1491-1556) Founder of the Jesuit Order -

ptren...@gmail.com

Feb 28, 2015, 1:37:48 PM
Here's a function I sometimes use. Note that the "Testing" stuff at the end
is not part of the function; I just added it as a usage example. Also, the
hash-bang is for igawk because the function is intended to be included in a
program that needs to use it.

$ cat Scripts/gawk/getanswer.gawk
#!/usr/bin/igawk -f
##############################################################################
#
# Interactive function to solicit user input
#
function getanswer(\
question, # Question to be asked\
Valid_Answers, # Valid answer values:\
#
# Note: If "Valid_Answers" is NOT an array, it may be either:
#
# 1) Null (or omitted) if any response (including null) is acceptable.
#
# 2) A string containing single character answers ([:alpha:] only). If the
# the default answer value is null (or not passed), it will be ASSUMED
# to be the first (left most) upper case letter in the Valid_Answers string.
#
# Note: All answers will be returned in lower case. If you need case-sensitive
# answers, pass Valid_Answers as null, or, if your answer set is known,
# pass the acceptable answers as elements of a Valid_Answers array.
#
default_answer, # Default answer (Null or omitted if there is no default,\
# # or if it is implied as specified in (2), above.)
Parsed_question, # Array in which the parsed question information should be
# stored. If omitted (or null), the parsed data will not be
# saved. (If a question may be asked more than once, saving the
# parsing results trades storage for execution time. This
# array is indexed by 'question'.) If the 'default_answer'
# value is not null, then that value will be used in place of
# the default_answer value in the Parsed_question array.
#
# Local variables:
#
na, # Number of valid answers\
i, j, k, l, # Temporary value holders\
v, a, b, step, # Interval value, lower_bound, upper_bound, and step value holders\
part, # Components of n1:n2:step; also holds the split Valid_Answers string\
Valid, # Valid answers array\
answer) # Temporary answer holder
{
    # Make sure that Valid is an empty array
    Valid[""]=""
    delete Valid
    # Do we have saved parsing results for this question?
    # Note: the fourth argument must be an array, an unused variable, or
    # omitted; passing a scalar aborts execution at the "in" test below.
    i=0
    if (question in Parsed_question) {
        # gawk cannot assign whole arrays, so copy element by element.
        for (k in Parsed_question[question]["Valid"])
            Valid[k]=Parsed_question[question]["Valid"][k]
        if (!default_answer) default_answer=Parsed_question[question]["default_answer"]
        i=1
    }
    if (i==0) { # Nope. Parse this question now.
        # If we have an array of valid answers, populate "Valid" from it.
        if (isarray(Valid_Answers)) {
            # Note that the INDICES of "Valid" are the VALUES of "Valid_Answers"
            # This is done so that the "in" operator may be used to determine
            # the validity of any constrained input.
            for (i in Valid_Answers) {
                # As an unused side effect, the number of duplicated Valid_Answers is found.
                ++Valid[Valid_Answers[i]]
            }
            na=length(Valid)
        }
        else {
            # Assume Valid_Answers is a string as described above.
            na=split(Valid_Answers, part, "")
            for (i=1;i<=na;++i) {
                if (part[i]==toupper(part[i]) && !default_answer) {
                    default_answer=tolower(part[i])
                }
                ++Valid[tolower(part[i])]
            }
        }
        # Save the results. If Parsed_question was not passed, it is a local
        # variable and these values are discarded on exit.
        # Create the "Valid" subarray first so that it exists even when empty.
        Parsed_question[question]["Valid"][""]=""
        delete Parsed_question[question]["Valid"][""]
        for (k in Valid)
            Parsed_question[question]["Valid"][k]=Valid[k]
        Parsed_question[question]["default_answer"]=default_answer
    }
    while (!answer) {
        printf("%s ", question)
        if (default_answer) {
            printf("(Default: %s) ", default_answer)
        }
        if ((getline answer) <= 0) {
            # End of input or read error: give up rather than loop forever.
            break
        }
        # Is any response valid? Use this one
        if (length(Valid)==0) {
            break
        }
        # Is this a null response? Use the default, if any.
        if (!answer && default_answer) {
            answer=default_answer
            break
        }
        if (answer in Valid) {
            break
        }
        printf("\"%s\" is not a valid answer.\n", answer)
        printf("Valid responses are:")
        for (i in Valid) {
            printf(" \"%s\"", i)
        }
        printf("\n")
        answer=""
    }
    return answer
}
BEGIN {
# Testing
# debug="/dev/stderr"
while (answer!="y") {
answer=getanswer("Should the case of the input string be ignored?","Yn","",Parsed)
if (answer!="y") print "Answer=" answer ": Looping"
}
answer=""
while (answer!="y") {
answer=getanswer("Answer \"y\" to exit this loop:","","Maybe", Parsed)
print "answer = \""answer"\""
}
}




Andrew Schorr

Feb 28, 2015, 2:56:29 PM
On Saturday, February 28, 2015 at 11:31:05 AM UTC-5, Kenny McCormack wrote:
> Interesting. However, in looking through this, I found this line in io.c:
>
> iop->public.read_func = ( ssize_t(*)() ) read;
>
> And I suspect (although I have not gotten around to testing it yet) that if
> I just point that at a wrapper function (e.g., 'my_read' instead of 'read'),
> I will get what I want.

You are correct. I forgot that a terminal in cooked mode should return a single line of input for each read call. So yes, I think this should probably do what you want.

You can accomplish this using a shared library that registers an input parser and sets read_func instead of get_record.

If you plan to hack on the source, then there are many ways of accomplishing your goal. But the officially supported way is to use a shared library extension that registers an input parser.

Regards,
Andy

Kenny McCormack

Feb 28, 2015, 7:04:05 PM
In article <b29c22e8-c2fe-4ff9...@googlegroups.com>,
Agreed that it is cleaner (although much more verbose, as we see below) to
do it via an extension lib. Below find my effort. It works, but note that
there are no examples in the distro of hooking "read_func" (they all are
examples of hooking "get_record") - so I had to make it up as I went along...

--- Cut Here ---
/*
* promptme.c - Provide an input parser for GAWK
* Compile command:
gcc -shared -I.. -W -Wall -Werror -fPIC -o promptme.so promptme.c
*/

#include <stdio.h>
#include <stddef.h>
#include <string.h>
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <alloca.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <signal.h>
#include <stdarg.h>

#include "gawkapi.h"

static const gawk_api_t *api; /* for convenience macros to work */
static awk_ext_id_t *ext_id;
static const char *ext_version = "promptme extension: version 1.0";
static awk_bool_t init_promptme(void);
static awk_bool_t (*init_func)(void) = init_promptme;

int plugin_is_GPL_compatible;

/* promptme_read --- print a prompt when reading from a terminal, then read */
static ssize_t promptme_read(int fd, void *buf, size_t nbytes)
{
    if (isatty(fd)) {
        printf("GAWK input: ");
        fflush(stdout);
    }
    return read(fd, buf, nbytes);
}

/* promptme_can_take_file --- return true if we want the file */
static awk_bool_t
promptme_can_take_file(const awk_input_buf_t *iobuf)
{
    (void) iobuf;
    return awk_true;
}

/* promptme_take_control_of --- set up input parser. */
static awk_bool_t
promptme_take_control_of(awk_input_buf_t *iobuf)
{
    iobuf->read_func = promptme_read;
    return awk_true;
}

static awk_input_parser_t promptme_parser = {
    "promptme",
    promptme_can_take_file,
    promptme_take_control_of,
    NULL
};

/* init_promptme --- set things up */

static awk_bool_t
init_promptme(void)
{
    register_input_parser(&promptme_parser);
    return awk_true;
}

static awk_ext_func_t func_table[] = {
{ NULL, NULL, 0 }
};

/* define the dl_load function using the boilerplate macro */

dl_load_func(func_table, promptme, "")

--- Cut Here ---

--
Religion is regarded by the common people as true,
by the wise as foolish,
and by the rulers as useful.

(Seneca the Younger, 65 AD)

Kenny McCormack

Feb 28, 2015, 7:07:47 PM
In article <mctl1j$jsv$1...@news.xmission.com>,
Kenny McCormack <gaz...@shell.xmission.com> wrote:
...
>--- Cut Here ---

I forgot to include the sample run:

$ gawk4 -l ./promptme '{print $1+$2}'
GAWK input: 3 19
22
GAWK input: 12 12
24
GAWK input: ^C (or ^D)
$ cat /tmp/gliz
3 15
$ gawk4 -l ./promptme '{print $1+$2}' /tmp/gliz
18
$

--
"I have a simple philosophy. Fill what's empty. Empty what's full. And
scratch where it itches."

Alice Roosevelt Longworth

Andrew Schorr

Mar 1, 2015, 10:01:54 AM
On Saturday, February 28, 2015 at 7:04:05 PM UTC-5, Kenny McCormack wrote:
> static ssize_t promptme_read(int fd, void *buf, size_t nbytes)
> {
> if (isatty(fd)) { printf("GAWK input: "); fflush(stdout); }
> return read(fd,buf,nbytes);
> }
>
> /* promptme_can_take_file --- return true if we want the file */
> static awk_bool_t
> promptme_can_take_file(const awk_input_buf_t *iobuf)
> {
> (void) iobuf;
> return awk_true;
> }

FYI, it will be much more efficient to do the "isatty" test once in promptme_can_take_file rather than calling it each time in promptme_read. If it's not a tty, just return false and don't take control.

Regards,
Andy

Kenny McCormack

Mar 1, 2015, 11:07:43 AM
In article <67b58a20-db6b-416b...@googlegroups.com>,
Aha! Yes, thanks. I assumed there was something missing in my
understanding of the can_take and take_control functions.

So, I changed it to:

promptme_can_take_file(const awk_input_buf_t *iobuf)
{
return iobuf->fd != INVALID_HANDLE && isatty(iobuf->fd);
}

(and removed the isatty() check from promptme_read())
I've tested this and it does indeed do the right thing in this case:

% cat /tmp/testfile
12 19
23 198
2 12
% gawk4 -l ./promptme '{print $1+$2}' - /tmp/testfile
GAWK input: 2 12
14
GAWK input: 5 19
24
GAWK input:
31
221
14
%

Also, because there aren't any examples in the distro of doing this sort of
thing (hooking read_func instead of get_record), I'd like to suggest that
this (prompting the user, a la TAWK, when reading from the terminal) be
included therein as such an example.

Feel free to use any and all code I've posted - or just re-implement it
from scratch - either way, it'd be good if there was such an example in the
distro.

--
Faced with the choice between changing one's mind and proving that there is
no need to do so, almost everyone gets busy on the proof.

- John Kenneth Galbraith -