
proposed new gawk time function strptime()


Ed Morton

Aug 28, 2014, 11:55:39 AM
I was just writing a script in a forum to convert a timestamp to a different form:

$ cat file
27/Aug/2014:23:58

$ awk -f tst.awk file
Wed, Aug 27, 2014 11:58:00 PM

$ cat tst.awk
{
    split($0,t,/[\/:]/)
    mthNr = (match("JanFebMarAprMayJunJulAugSepOctNovDec",t[2])+2)/3
    secs = mktime(t[3]" "mthNr" "t[1]" "t[4]" "t[5]" 0")
    print strftime("%c", secs)
}

when someone else posted a Perl script that uses a built-in function named strptime(). It takes a string as the first argument and, as the second, a description of that string's contents in terms of time specifiers, and it returns the corresponding number of seconds since the epoch. If that function existed in awk, the whole of the above code could be written as:

{ print strftime("%c", strptime($0,"%d/%b/%Y:%H:%M")) }

I know we want to avoid cluttering up the awk language, but time conversions are a VERY common problem. Having to write the split()+match() with arithmetic, or populate an array that maps month names to numbers, is pretty painful (I always have to look it up). strptime() seems much more like filling a glaring hole in the gawk time functions than adding on to them.
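
For anyone who has to look the match() arithmetic up every time, here is a minimal sketch of the same idea in C (the language of the C library routine being discussed). The helper name month_number() is made up for illustration and is not part of awk, gawk or libc. awk's match() returns a 1-based position p, so (p + 2) / 3 gives the month; with a 0-based offset the same value is offset / 3 + 1.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the thread): map a three-letter English
 * month abbreviation to 1..12, or return 0 if it is not recognised. */
int month_number(const char *abbr)
{
    static const char months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    const char *hit;
    size_t offset;

    assert(abbr != NULL);
    if (strlen(abbr) != 3)
        return 0;

    hit = strstr(months, abbr);
    if (hit == NULL)
        return 0;

    /* Month names start at offsets 0, 3, 6, ...; any other offset means the
     * match straddled two names (for example "anF") and is rejected. */
    offset = (size_t)(hit - months);
    return (offset % 3 == 0) ? (int)(offset / 3) + 1 : 0;
}
```

For "Aug" the offset is 21, so the result is 21 / 3 + 1 = 8, matching the (22 + 2) / 3 that the awk one-liner computes.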

Thoughts?

Ed.


Ed Morton

Aug 28, 2014, 12:00:10 PM
I was pointed to this as a reference:

http://man7.org/linux/man-pages/man3/strptime.3.html

Kenny McCormack

Aug 28, 2014, 12:45:56 PM
In article <13387e6a-c503-4faa...@googlegroups.com>,
Ed Morton <morto...@gmail.com> wrote:
...
>I know we want to avoid cluttering up the awk language but time
>conversions are a VERY common problem and having to write the
>split()+match() with arithmetic or populate an array that maps month names
>to numbers etc. is pretty painful (I always have to look it up) and that
>strptime() seems much more like it'd be filling a glaring hole in the gawk
>time functions rather than adding on to them.

Arnold has stated many times that if you can do it in AWK code or in an
extension library, it won't be put into the core. Both you and I may
disagree with this policy from time to time, but we have to live with it.

Here ya go!

$ gawk4 -l ./strptime.so '{print strftime("%c", strptime($0,"%d/%b/%Y:%H:%M"))}'
27/Aug/2014:23:58
Wed Aug 27 23:58:00 2014
$

And here's the code (note that it is sort of a pain having to go through this
every time you need to add a new function which is basically a pass-thru
into the C library - but so it goes. I was able to knock this together in
about 20 minutes, so it's not so bad...)

--- Cut Here ---
/*
* strptime.c - GAWK interface to strptime (like in Perl)
* Compile command:
gcc -shared -I.. -W -Wall -Werror -fPIC -o strptime.so strptime.c
*/

#define _XOPEN_SOURCE
#include <stdio.h>
#include <stddef.h>
#include <string.h>
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dlfcn.h>
#include <time.h>

#include "gawkapi.h"
#define STR str_value.str

static const gawk_api_t *api; /* for convenience macros to work */
static awk_ext_id_t *ext_id;
static const char *ext_version = "strptime extension: version 1.0";
static awk_bool_t (*init_func)(void) = NULL;

int plugin_is_GPL_compatible;

/* do_strptime */

static awk_value_t *
do_strptime(int nargs, awk_value_t *result)
{
    awk_value_t arg0, arg1;
    struct tm tm;

    if (nargs != 2) {
        lintwarn(ext_id, "strptime: called with wrong # arguments (%d): must be 2!", nargs);
        goto the_end;
    }
    if (!get_argument(0, AWK_STRING, &arg0)) {
        lintwarn(ext_id, "strptime: Fatal error retrieving first arg!");
        goto the_end;
    }
    if (!get_argument(1, AWK_STRING, &arg1)) {
        lintwarn(ext_id, "strptime: Fatal error retrieving second arg!");
        goto the_end;
    }

    /* strptime(3) sets only the fields named in the format, so start from
     * a zeroed struct and let mktime(3) decide whether DST applies. */
    memset(&tm, 0, sizeof(tm));
    tm.tm_isdst = -1;
    if (strptime(arg0.STR, arg1.STR, &tm) == NULL) {
        lintwarn(ext_id, "strptime: input does not match format!");
        goto the_end;
    }
    return make_number(mktime(&tm), result);

the_end:
    return make_const_string("<ERROR>", 7, result);
}

static awk_ext_func_t func_table[] = {
{ "strptime", do_strptime, 2 },
};

/* define the dl_load function using the boilerplate macro */

dl_load_func(func_table, strptime, "")

--- Cut Here ---
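
One detail worth knowing when calling strptime(3) from C: it sets only the struct tm fields named in the format and returns NULL when the input does not match, otherwise a pointer to the first character it did not consume. The following is a minimal standalone sketch of the same two library calls with those cases handled, assuming a POSIX system. The name parse_epoch() is hypothetical and is not part of the extension or of gawk.

```c
#define _XOPEN_SOURCE 700   /* exposes strptime() on glibc; must precede the includes */
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of the posted extension): parse s according
 * to fmt and return seconds since the epoch in the local time zone, or
 * (time_t)-1 if s does not match fmt completely. */
time_t parse_epoch(const char *s, const char *fmt)
{
    struct tm tm;
    const char *end;

    assert(s != NULL && fmt != NULL);

    /* Fields that fmt does not mention are left untouched by strptime(),
     * so give them known values first. */
    memset(&tm, 0, sizeof tm);
    tm.tm_isdst = -1;           /* let mktime() work out whether DST applies */

    end = strptime(s, fmt, &tm);
    if (end == NULL || *end != '\0')
        return (time_t)-1;      /* no match, or trailing characters */

    return mktime(&tm);
}
```

Calling parse_epoch("27/Aug/2014:23:58", "%d/%b/%Y:%H:%M") gives the local-time epoch value for that timestamp. Note that a format without %d leaves tm_mday at 0 here, which mktime() normalises to the last day of the previous month, so a real caller may want different defaults.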

--
"There's no chance that the iPhone is going to get any significant market share. No chance." - Steve Ballmer

Ed Morton

Aug 28, 2014, 1:08:06 PM
On Thursday, August 28, 2014 11:45:56 AM UTC-5, Kenny McCormack wrote:
> In article <13387e6a-c503-4faa...@googlegroups.com>,
> Ed Morton <morto...@gmail.com> wrote:
> ...
> >I know we want to avoid cluttering up the awk language but time
> >conversions are a VERY common problem and having to write the
> >split()+match() with arithmetic or populate an array that maps month names
> >to numbers etc. is pretty painful (I always have to look it up) and that
> >strptime() seems much more like it'd be filling a glaring hole in the gawk
> >time functions rather than adding on to them.
>
> Arnold has stated many times that if you can do it in AWK code or in an
> extension library, it won't be put into the core. Both you and I may
> disagree with this policy from time to time, but we have to live with it.
>
> Here ya go!

Thanks Kenny. I would never actually do that, though: since I never compile gawk myself (our tech guys provide it), it's more trouble than writing the 4 lines of awk I'm hoping not to have to write in future. Similarly, I wouldn't really expect anyone posting a question here to adopt that as the answer.

I'm hoping Arnold et al will see this for what it is - a missing piece of functionality rather than some added optional feature.

As the man page says, strptime() is the opposite of strftime() and we already have strftime() and symmetry is a core ingredient of "good" software.
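
To make the symmetry concrete, here is a minimal C sketch that parses a timestamp with strptime(3) and writes it back out with strftime(3). It assumes a POSIX system and the default "C" locale; reformat_time() is a made-up name used only for illustration.

```c
#define _XOPEN_SOURCE 700   /* exposes strptime() on glibc; must precede the includes */
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: re-express the timestamp `in` (laid out as in_fmt)
 * in the layout out_fmt. Returns the number of bytes written to buf
 * (excluding the terminating NUL), or 0 on failure. */
size_t reformat_time(const char *in, const char *in_fmt,
                     const char *out_fmt, char *buf, size_t size)
{
    struct tm tm;

    assert(in && in_fmt && out_fmt && buf);

    memset(&tm, 0, sizeof tm);          /* strptime() sets only the named fields */
    if (strptime(in, in_fmt, &tm) == NULL)
        return 0;                       /* input did not match in_fmt */

    return strftime(buf, size, out_fmt, &tm);
}
```

Passing the same format for input and output reproduces the original string, and passing "%Y-%m-%d %H:%M" as out_fmt turns "27/Aug/2014:23:58" into "2014-08-27 23:58".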

Ed.

Kenny McCormack

Aug 28, 2014, 2:10:02 PM
In article <f283017a-42ba-4393...@googlegroups.com>,
Ed Morton <morto...@gmail.com> wrote:
...
>Thanks Kenny. I would never actually do that, though, as it's more trouble (since
>I never actually compile gawk myself, our tech guys provide it) than writing the

Actually, you don't need to compile GAWK and, in fact, this has nothing to
do with compiling GAWK. All you need is the "gawkapi.h" file, and an
enterprising young fellow such as yourself should have no trouble finding
that.

Note that, AIUI, the earlier (original) extension interface was more
closely bound up in GAWK compilation, and you more or less did have to do a
self-compile of GAWK in order to have the necessary pieces to do an
extension library. But my understanding is that getting away from that -
and making it almost completely standalone (gawkapi.h being the only link)
was the (or one of the) primary goal(s) of the new API interface.

Anyway, it seems pretty clean to me.

>4 lines of awk I'm hoping to not have to write in future and similarly I wouldn't
>really expect anyone posting a question here to adopt that as the answer.

The fact of the matter is that that *is* the answer. As I stated in my
previous post, Arnold has made it clear that that is the direction in which
GAWK is moving and will continue to be moving.

>I'm hoping Arnold et al will see this for what it is - a missing piece of
>functionality rather than some added optional feature.

Heh heh. Everybody's pet product is "obviously" a missing functionality, not
some "optional" added feature. Welcome to the club!

--
Pensacola - the thinking man's drink.

Janis Papanagnou

Aug 28, 2014, 2:18:46 PM
On 28.08.2014 17:55, Ed Morton wrote:
> I was just writing a script in a forum to convert a timestamp to a
> different form: [...]
>
> when someone else posted a perl script that uses a built in function named
> strptime() that takes a string as the first argument and a description of
> the contents of that string in terms of time specifiers and returns the
> number of seconds. If that function existed in awk the whole of the above
> code would be written as:
>
> { print strftime("%c", strptime($0,"%d/%b/%Y:%H:%M")) }
>
> I know we want to avoid cluttering up the awk language but time conversions
> are a VERY common problem and having to write the split()+match() with
> arithmetic or populate an array that maps month names to numbers etc. is
> pretty painful (I always have to look it up) and that strptime() seems much
> more like it'd be filling a glaring hole in the gawk time functions rather
> than adding on to them.
>
> Thoughts?

The awk language is not "cluttered up" (WRT new supported syntax) by your
proposal; it's just a function. Mind that all the time related functions
are anyway gawk specific (or at least non-standard), so adding another
function in this subdomain should not be too severe a problem. The
only "problem" I see is that someone has to contribute it. And of course
that Arnold might have some reservations about it. :-)

Janis

>
> Ed.
>
>

Joe User

Aug 28, 2014, 2:32:44 PM
On Thu, 28 Aug 2014 10:08:06 -0700, Ed Morton wrote:

> As the man page says, strptime() is the opposite of strftime() and we
> already have strftime() and symmetry is a core ingredient of "good"
> software.

Use perl (from a google search result):




use DateTime::Format::Strptime qw( );

my $format = DateTime::Format::Strptime->new(
    pattern   => '%d/%b/%Y:%H:%M',
    time_zone => 'local',
    on_error  => 'croak',
);

my $dt = $fields->[1] ;
print "Date:[$dt]\n";
my $dateopen = $format->parse_datetime($dt);




Nine lines instead of four in awk.

BTW, if strptime() were to be added to gawk, scanf() should be added, too.

A general interface to any of thousands of glibc functions is just a lot
of programming away, with the gawk extension facility. Thanks to Kenny
McCormack for posting a complete example.

--
If the Catholic church couldn't stop Galileo, then
governments won't be able to stop things now.

-- Carlo de Benedetti of Olivetti on the
folly of trying to regulate information
technology.

Kenny McCormack

Aug 28, 2014, 2:36:14 PM
In article <c794db-...@c100.static-216-228-92-121.apid.com>,
Joe User <ax...@yahoo.com> wrote:
>On Thu, 28 Aug 2014 10:08:06 -0700, Ed Morton wrote:
>
>> As the man page says, strptime() is the opposite of strftime() and we
>> already have strftime() and symmetry is a core ingredient of "good"
>> software.
>
>Use perl (from a google search result):

Off topic

--
The motto of the GOP "base": You can't *be* a billionaire, but at least you
can vote like one.

Joe User

Aug 28, 2014, 2:37:51 PM
On Thu, 28 Aug 2014 20:18:46 +0200, Janis Papanagnou wrote:

> The awk language is not "cluttered up" (WRT new supported syntax) by
> your proposal; it's just a function. Mind that all the time related
> functions are anyway gawk specific (or at least non-standard), so adding
> another function in this subdomain should not be too severe a
> problem. The only "problem" I see is that someone has to contribute it.
> And of course that Arnold might have some reservations about it.

Really, there are two problems. The specification and the implementation.

It is fairly easy to change the code to add functions and language
features to an awk implementation.

The hard part is to get someone to think the changes through, to make the
language consistent and understandable, without a lot of exceptions. To
make it fast and flexible. To update documentation. To make sure the
changes are upward compatible. To get acceptance by the user community.
That is the hard part.

--
I cannot undertake to lay my finger on
that article of the Constitution which
grant[s] a right to Congress of expending,
on objects of benevolence, the money of
their constituents.

-- James Madison, 1794

Aharon Robbins

Aug 28, 2014, 3:27:19 PM
In article <ltnmc4$j3k$1...@news.xmission.com>,
Kenny McCormack <gaz...@shell.xmission.com> wrote:
>In article <13387e6a-c503-4faa...@googlegroups.com>,
>Arnold has stated many times that if you can do it in AWK code or in an
>extension library, it won't be put into the core.

Right.

>Both you and I may
>disagree with this policy from time to time, but we have to live with it.
>
>Here ya go!
>
>$ gawk4 -l ./strptime.so '{print strftime("%c", strptime($0,"%d/%b/%Y:%H:%M"))}'
>27/Aug/2014:23:58
>Wed Aug 27 23:58:00 2014
>$

Nice work Kenny. You might want to contribute it to the gawkextlib project
so that other people can benefit from it. How to do so is documented in
the manual.
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com
P.O. Box 354 Home Phone: +972 8 979-0381
Nof Ayalon
D.N. Shimshon 9978500 ISRAEL

Aharon Robbins

Aug 28, 2014, 3:29:37 PM
In article <ltnr9q$ld9$1...@news.xmission.com>,
Kenny McCormack <gaz...@shell.xmission.com> wrote:
>Actually, you don't need to compile GAWK and, in fact, this has nothing to
>do with compiling GAWK. All you need is the "gawkapi.h" file, and an
>enterprising young fellow such as yourself should have no trouble finding
>that.

Right.

>Note that, AIUI, the earlier (original) extension interface was more
>closely bound up in GAWK compilation, and you more or less did have to do a
>self-compile of GAWK in order to have the necessary pieces to do an
>extension library.

No. But if gawk's internal NODE structure (or other stuff) changed, you
would have to recompile your extension.

>But my understanding is that getting away from that -
>and making it almost completely standalone (gawkapi.h being the only link)
>was the (or one of the) primary goal(s) of the new API interface.

Right.

>Anyway, it seems pretty clean to me.

Thanks. We worked hard to get it to be so.

>>4 lines of awk I'm hoping to not have to write in future and similarly I wouldn't
>>really expect anyone posting a question here to adopt that as the answer.
>
>The fact of the matter is that that *is* the answer. As I stated in my
>previous post, Arnold has made it clear that that is the direction in which
>GAWK is moving and will continue to be moving.
>
>>I'm hoping Arnold et al will see this for what it is - a missing piece of
>>functionality rather than some added optional feature.
>
>Heh heh. Everybody's pet product is "obviously" a missing functionality, not
>some "optional" added feature. Welcome to the club!

Kenny's got this one right.

Arnold

Aharon Robbins

Aug 28, 2014, 3:31:46 PM
In article <vg94db-...@c100.static-216-228-92-121.apid.com>,
Joe User <ax...@yahoo.com> wrote:
>The hard part is to get someone to think the changes through, to make the
>language consistent and understandable, without a lot of exceptions. To
>make it fast and flexible. To update documentation. To make sure the
>changes are upward compatible. To get acceptance by the user community.
>That is the hard part.

At last, someone who seems to understand something of what being a
maintainer is all about. :-)

Janis Papanagnou

Aug 28, 2014, 5:14:32 PM
On 28.08.2014 20:37, Joe User wrote:
> On Thu, 28 Aug 2014 20:18:46 +0200, Janis Papanagnou wrote:
>
>> The awk language is not "cluttered up" (WRT new supported syntax) by
>> your proposal; it's just a function. Mind that all the time related
>> functions are anyway gawk specific (or at least non-standard), so adding
>> another function in this subdomain should not be too severe a
>> problem. The only "problem" I see is that someone has to contribute it.
>> And of course that Arnold might have some reservations about it.
>
> Really, there are two problems. The specification and the implementation.

Specification and implementation are two aspects of a change, not two
problems.

>
> It is fairly easy to change the code to add functions and language
> features to an awk implementation.

Added functions and changed language, OTOH, are two issues of different
severity and impact. (It's arguable whether those issues merge in awk.)

>
> The hard part is to get someone to think the changes through, to make the
> language consistent and understandable, without a lot of exceptions.

I agree, for a language change that conflicts with all sorts of other
constructs and whatnot. But we were not talking about [such] a language
change[*], we were talking about a self-contained function without any
side-effects[**]. So the issue's really not as muddy as you somewhat
vaguely suggest. Ed's proposal has a narrow, well-defined interface that
is based on existing format specifiers.

But what part of the proposal is it that you find inconsistent or hard
to understand?

> To
> make it fast and flexible. To update documentation. To make sure the
> changes are upward compatible. To get acceptance by the user community.
> That is the hard part.

Not very concrete, and many generalities. Let me pick that apart...

"fast and flexible" - yes, but in case the first (or any) version is
not perfect you only "pay" for it if you use it. What flexibility
for the specific function are you missing from the proposal?

"update documentation" - quite trivial with the proposed function,[***]
which even relies on already existing definitions where applicable.
BTW, a function needs to be documented whether implemented natively
or through the extension mechanism.

"upward compatible" - a name clash seems to be the only potential
incompatibility issue. No worse than with the existing strftime(), is
it?

"acceptance by the user community" - again, people who don't need that
function won't notice (modulo a name clash if you happened to have
named one of your own functions strptime()). And people who'd like
a function like that won't complain either.


That all said, and acknowledging that the set of native functions should
be kept to a minimum, a question: shouldn't all gawk specific functions
(or even all existing function sets[****]) be supported by an extension
interface then?

The reason I am asking is: typically you have modules for extensions
that cover a specific topic, and many functions collected in a module.
If I want time stamps and time/date manipulation I could bind them,
the existing ones and the proposed addition, with a single "include".

And since you have mentioned the difficulty-of-good-design questions:
think again about gawk's non-standard but native extensions. Have strftime
available natively but include strptime? Design-wise, to be consistent
(if at all), both might be better placed with the other time functions
in a loadable time/date module.

(But I noticed, sadly too late, that I have written more than the issue
is worth.)

Janis

[*] As previously said, predefined function names affect programs only
insofar as you cannot use those names for other purposes, but
that's it.

[**] Some native awk functions (unfortunately) have side-effects, and
some extensions proposed in the past also came with side-effects
similar to those of the builtins (e.g. match()).

[***] WRT complexity of documentation a quick look into the gawk manual
will be enlightening; strftime() for example requires only a few lines
of documentation, and I'd expect strptime() not to require substantially
more.

[****] E.g., why bother with standard numerical functions in the core
awk code if all I do is non-numeric data processing.

Anton Treuenfels

Aug 28, 2014, 6:12:37 PM

"Kenny McCormack" <gaz...@shell.xmission.com> wrote in message
news:ltnmc4$j3k$1...@news.xmission.com...

> /* do_strptime */
>
> static awk_value_t *
> do_strptime(int nargs, awk_value_t *result)
> {
> awk_value_t arg0,arg1;
> struct tm tm;
>
    if (nargs != 2)
        lintwarn(ext_id,"strptime: called with wrong # arguments (%d): must be 2!",nargs);
    else if (!get_argument(0, AWK_STRING, &arg0))
        lintwarn(ext_id,"strptime: Fatal error retrieving first arg!");
    else if (!get_argument(1, AWK_STRING, &arg1))
        lintwarn(ext_id,"strptime: Fatal error retrieving second arg!");
    else {
        strptime(arg0.STR,arg1.STR,&tm);
        return make_number(mktime(&tm), result);
    }

> return make_const_string("<ERROR>",7,result);
> }

Yes, the GOTOs are still lurking in the object code, but this way the
compiler hides them away from Impressionable Young Minds before they can Do
Harm (As Is Their Wont).

- Anton Treuenfels

Kenny McCormack

Aug 28, 2014, 6:19:00 PM
In article <ltnvqn$ele$1...@dont-email.me>,
Aharon Robbins <arn...@skeeve.com> wrote:
...
>>$ gawk4 -l ./strptime.so '{print strftime("%c", strptime($0,"%d/%b/%Y:%H:%M"))}'
>>27/Aug/2014:23:58
>>Wed Aug 27 23:58:00 2014
>>$
>
>Nice work Kenny.

Thanks.

>You might want to contribute it to the gawkextlib project
>so that other people can benefit from it. How to do so is documented in
>the manual.

Some comments in that vein:
1) I read the section (on gawkextlib), but it seemed a little abstract
to me. FWIW, I don't think I have "Autotools" installed on the
systems I'd be doing the work on - that represents another hurdle
to be hurdled. I'll probably figure it out eventually, but not at
the present time.

2) My survey of the landscape makes it seem to me that my strptime()
library would more likely fit into the "in the gawk distribution"
basket than in the "gawkextlib" basket. It seems to me that the
ones that are in the distribution are of the "short, sweet,
proof-of-concept" variety, while the ones in gawkextlib are complex
and involved and do non-trivial things in the lib itself. Given
that strptime() has just 2 lines of operative code (the call to
strptime(3) and the call to mktime(3)), it is pretty clear that it
is in the "proof-of-concept" and/or "Just a pass-through to the C
library" category.

3) FWIW, I have another lib, called call_any, that purports to be a
general-purpose, call-anything-in-the-C-library-from-GAWK tool.
That might be more of a candidate for gawkextlib.

Note, incidentally, that it was almost, but not quite, possible to
do strptime() in call_any. If it had been, it would not have been
necessary to write a new library.

--

First of all, I do not appreciate your playing stupid here at all.

- Thomas 'PointedEars' Lahn -

Kenny McCormack

Aug 28, 2014, 6:20:12 PM
In article <OOCdnfPqN5TONGLO...@earthlink.com>,
Anton Treuenfels <teamt...@yahoo.com> wrote:
...
>Yes, the GOTOs are still lurking in the object code, but this way the
>compiler hides them away from Impressionable Young Minds before they can Do
>Harm (As Is Their Wont).

I hate, with the passion of 1,000 suns, chained ifs and elses.

So, no thank you.

--
b w w g y g r b y w

Ed Morton

Aug 28, 2014, 9:22:38 PM
On 8/28/2014 2:31 PM, Aharon Robbins wrote:
> In article <vg94db-...@c100.static-216-228-92-121.apid.com>,
> Joe User <ax...@yahoo.com> wrote:
>> The hard part is to get someone to think the changes through, to make the
>> language consistent and understandable, without a lot of exceptions. To
>> make it fast and flexible. To update documentation. To make sure the
>> changes are upward compatible. To get acceptance by the user community.
>> That is the hard part.
>
> At last, someone who seems to understand something of what being a
> maintainer is all about. :-)
>

I think most of us understand the general issues. The general issues though
shouldn't stop us from having a reasonable discussion of a specific feature
request so you can then decide if the benefits of this specific feature outweigh
the effort of providing it.

In this case we have a suggestion to provide a function that would avoid a few
lines of fairly gritty hand written code for the very common problem of working
with time stamps in various formats.

The proposed function is well defined, exists in other languages and so is well
known, provides symmetry with an existing gawk function (strftime()), does not
impact any other part of the language, I would imagine is about as easy as it
can get to implement since it already exists in libraries, fits in like a
missing puzzle piece to the existing set of gawk time functions, and the only
impact to existing scripts would be if someone had their own function named
strptime() defined.

I'm not suggesting anything that would change the awk language, or take the code
in a different direction, or even require any effort to define. In short it
seems like it's a lot of pros and close to zero cons for very little effort.

So wrt this specific request - what do you think?

Ed.

Joe User

Aug 28, 2014, 9:25:59 PM
On Thu, 28 Aug 2014 19:31:46 +0000, Aharon Robbins wrote:

> In article <vg94db-...@c100.static-216-228-92-121.apid.com>, Joe
> User <ax...@yahoo.com> wrote:
>>The hard part is to get someone to think the changes through, to make
>>the language consistent and understandable, without a lot of exceptions.
>> To make it fast and flexible. To update documentation. To make sure
>>the changes are upward compatible. To get acceptance by the user
>>community. That is the hard part.
>
> At last, someone who seems to understand something of what being a
> maintainer is all about. :-)

I did it for years. It's always more fun to hack at the code than to go
to meetings about user requests and enhancements.

--
Trees are wonderful. They're naturally beautiful;
require little, if any attention. And they enable
life for all that is beautiful on Earth. In other
words, the antithesis of the Kardashians.

-- Craig Ferguson

Ed Morton

Aug 28, 2014, 9:34:16 PM
On 8/28/2014 8:25 PM, Joe User wrote:
> On Thu, 28 Aug 2014 19:31:46 +0000, Aharon Robbins wrote:
>
>> In article <vg94db-...@c100.static-216-228-92-121.apid.com>, Joe
>> User <ax...@yahoo.com> wrote:
>>> The hard part is to get someone to think the changes through, to make
>>> the language consistent and understandable, without a lot of exceptions.
>>> To make it fast and flexible. To update documentation. To make sure
>>> the changes are upward compatible. To get acceptance by the user
>>> community. That is the hard part.
>>
>> At last, someone who seems to understand something of what being a
>> maintainer is all about. :-)
>
> I did it for years. It's always more fun to hack at the code than to go
> to meetings about user requests and enhancements.
>

Ditto. I've provided and supported the Virtual Finite State Machines (VFSM)
behavioral modeling toolset for about the past 20 years. That's how I got into
awk - writing parsers for some parts of the model.

Ed.

Kenny McCormack

Aug 28, 2014, 10:23:58 PM
In article <ltnvv1$ele$2...@dont-email.me>,
Aharon Robbins <arn...@skeeve.com> wrote:
...
>>Note that, AIUI, the earlier (original) extension interface was more
>>closely bound up in GAWK compilation, and you more or less did have to do a
>>self-compile of GAWK in order to have the necessary pieces to do an
>>extension library.
>
>No. But if gawk's internal NODE structure (or other stuff) changed, you
>would have to recompile your extension.

FWIW, and I could certainly have been wrong about this (*), but the recipe
that I always used for compiling "old style" extension libs included the
incantation -DHAVE_CONFIG_H, which causes "awk.h" to #include "config.h",
so you need that file, and that file comes from running ./configure, so if
you haven't done a self-compile, you won't have that file.

Furthermore, "awk.h" tries to #include a whole bunch of other files from
the GAWK distribution, so that would at least have required unpacking the
source if not actually having compiled it (but see previous paragraph for
why you pretty much would have had to at least run ./configure, if not
actually built GAWK).

In any case, I remember trying to do it once (build a lib on a system w/o
having built GAWK first) and ran into a bunch of problems (missing files,
etc), so I just gave up (and made a mental note that "You can't do that!")

(*) I.e., it is possible that other recipes are possible, but this one came
from one of the help files, so it wasn't made up out of whole cloth, or
anything like that.

--
A liberal, a moderate, and a conservative walk into a bar...

Bartender says, "Hi, Mitt!"

Aharon Robbins

Aug 29, 2014, 4:25:31 AM
As always, trn asks me

Are you absolutely sure that you want to do this? [ny]

I keep answering "y", I guess I haven't learned.

In article <lto63o$io5$1...@news.m-online.net>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>"update documentation" - quite trivial with the proposed function,[***]
> which even relies on already existing definitions where applicable.
> BTW, a function needs to be documented whether implemented natively
> or through the extension mechanism.

>[***] WRT complexity of documentation a quick look into the gawk manual
>will be enlightening; strftime() for example requires only a few lines
>of documentation, and I'd expect strptime() not to require substantially
>more.

Gee - I see over 3 pages describing all the format options for strftime.
I certainly don't feel like writing another 3 or 4 pages for strptime.

>That all said, and acknowledging that the set of native functions shall
>be kept at a minimum, a question; shouldn't all gawk specific functions
>be supported by an extension
>interface then?

The answer here is yes they should, but it's too late. If I'd had the
extension mechanism 20 years ago, I would have done the time functions,
the bit manipulation functions and the i18n functions as loadable
modules. Now it's too late to extract them into separate modules.
And even if I were to do so and just always "preload" those extensions,
what would be the net gain?

I'll also point out that perl's strptime comes from a loadable module
and is not built-in to the core interpreter. It's just that perl has
a lot more standard loadable modules.

Kenny has already written and posted a loadable module. Anyone who wants
that can use it. Thanks Kenny.

Aharon Robbins

Aug 29, 2014, 4:32:52 AM
In article <lto9sk$tfk$1...@news.xmission.com>,
Kenny McCormack <gaz...@shell.xmission.com> wrote:
>>You might want to contribute it to the gawkextlib project
>>so that other people can benefit from it. How to do so is documented in
>>the manual.
>
>Some comments in that vein:
> 1) I read the section (on gawkextlib), but it seemed a little abstract
> to me. FWIW, I don't think I have "Autotools" installed on the
> systems I'd be doing the work on - that represents another hurdle
> to be hurdled. I'll probably figure it out eventually, but not at
> the present time.

It shouldn't be necessary - you can work with the gawkextlib guys and
ask them to handle the autotools stuff upon integrating your code into
their library.

You should probably add a copyright statement indicating that it's GPL 3
code, also. You likely should use your real name there.

> 2) My survey of the landscape makes it seem to me that my strptime()
> library would more likely fit into the "in the gawk distribution"
> basket than in the "gawkextlib" basket. It seems to me that the
> ones that are in the distribution are of the "short, sweet,
> proof-of-concept" variety, while the ones in gawkextlib are complex
> and involved and do non-trivial things in the lib itself. Given
> that strptime() has just 2 lines of operative code (the call to
> strptime(3) and the call to mktime(3)), it is pretty clear that it
> is in the "proof-of-concept" and/or "Just a pass-through to the C
> library" category.

Your observation as to what kind of extension code is in the gawk
distribution is mostly valid. However, the number of possible such
functions that fit these criteria is vast, and I very explicitly do not
want to keep adding large numbers of small extension functions to the
dist that I then have to maintain indefinitely into the future.

The main point of the sample extensions in the dist is to serve as
examples of how to write extensions (and test cases), and if possible
to also be useful. I think that what's there fulfills that purpose
adequately, and there's little to be gained *towards the end of showing
how to write extensions* by adding strptime to the dist.

The gawkextlib project currently serves as the collection and distribution
point for extensions - please open a discussion with them if you'd like
to see your code published further.

> 3) FWIW, I have another lib, called call_any, that purports to be a
> general-purpose, call-anything-in-the-C-library-from-GAWK tool.
> That might be more of a candidate for gawkextlib.

Ditto - please talk to them about it, this sounds quite useful.

Thanks,

Arnold

Anton Treuenfels

Aug 29, 2014, 7:13:58 PM

"Kenny McCormack" <gaz...@shell.xmission.com> wrote in message
news:lto9us$tfk$2...@news.xmission.com...
> In article <OOCdnfPqN5TONGLO...@earthlink.com>,
> Anton Treuenfels <teamt...@yahoo.com> wrote:
> ...
>>Yes, the GOTOs are still lurking in the object code, but this way the
>>compiler hides them away from Impressionable Young Minds before they can
>>Do
>>Harm (As Is Their Wont).
>
> I hate, with the passion of 1,000 suns, chained ifs and elses.

It is hard for me to imagine how even one enormous sphere of fusing hydrogen
could be said to possess a quality as anthropomorphic as "passion", let
alone a multitude of them. Intensity, perhaps, but not passion.

What could inspire such hatred of a common and useful idiom?

- Anton Treuenfels

Kenny McCormack

Aug 31, 2014, 1:52:59 PM
In article <ltpddr$ogh$1...@dont-email.me>,
Aharon Robbins <arn...@skeeve.com> wrote:
...
>>[***] WRT complexity of documentation a quick look into the gawk manual
>>will be enlightening; strftime() for example requires only a few lines
>>of documentation, and I'd expect strptime() not to require substantially
>>more.
>
>Gee - I see over 3 pages describing all the format options for strftime.
>I certainly don't feel like writing another 3 or 4 pages for strptime.

I'm guessing that Janis was looking at the "man page" rather than the
"Web site"/book. In "man gawk", it just refers you to the "ANSI C"
documentation of "strftime" for the gory details. On the web site, it
lists them all out for you.

Reverting to an earlier subject (That which we can refer to as "Mr. Ed's
dilemma"), I do see his point - that, as things stand now, it is always
going to be easier (for him, and for his "customers") to just "grit it out"
in AWK than to compile an extension and use that. We all know that this
all "just works" under Linux, but it gets dicey under any of the
proprietary OSes (e.g., Windows, Mac, and even some "custom" builds of
Linux). FWIW, I think that this sort of thing can only work in Windows
(where I think Mr. Ed does a lot of his work) if you have some
(quasi-) commercial entity behind it; I'm thinking here of ActiveState,
which manages both Perl and Tcl (and Expect in the bargain) for the Windows
crowd.

My point in all of this is that we have a ways to go in advancing GAWK to
the point of being an "ecosystem" like Perl & Tcl are. It is a good goal,
and I'm not saying anything against it, but we do have a ways to go - until
we will be able to make people like Mr. Ed comfortable using it.

--

There are many self-professed Christians who seem to think that because
they believe in Jesus' sacrifice they can reject Jesus' teachings about
how we should treat others. In this country, they show that they reject
Jesus' teachings by voting for Republicans.

Kees Nuyt

Sep 1, 2014, 8:09:43 AM
On Sun, 31 Aug 2014 17:52:59 +0000 (UTC),
gaz...@shell.xmission.com (Kenny McCormack) wrote:

>My point in all of this is that we have a ways to go in advancing GAWK to
>the point of being an "ecosystem" like Perl & Tcl are. It is a good goal,
>and I'm not saying anything against it, but we do have a ways to go - until
>we will be able to make people like Mr. Ed comfortable using it.

My 2 ct:

I like [g]awk because it doesn't have such a rich ecosystem.
It prevents me wasting time by trying to find my way in a myriad
of, partly overlapping, modules.

With [g]awk I know in advance I have to write the utility myself,
usually it's just 5-10 lines of extra code.

--
Kees Nuyt

Kenny McCormack

Sep 1, 2014, 9:03:39 AM
In article <c8o80ap37frdiceic...@dim53.demon.nl>,
Agreed as well.

Found on the net:

--- Cut Here ---
Ken Olsen once said:

"It is our belief, however, that serious professional users will run out of
things they can do with UNIX. They'll want a real system and will end up
doing VMS when they get to be serious about programming. With UNIX, if
you're looking for something, you can easily and quickly check that small
manual and find out that it's not there. With VMS, no matter what you look
for - it's literally a five-foot shelf of documentation - if you look long
enough it's there. That's the difference - the beauty of UNIX is it's
simple; and the beauty of VMS is that it's all there."
- DECWORLD Vol. 8 No. 5, 1984 -

(followed by the commentator's opinion)
What Ken failed to realize is that once the programmers hit the limitations
of what they could do with Unix; once they needed things like virtual
memory, shared libraries, ACLs, memory mapped I/O, etc, they didn't switch
to VMS, they hacked them into Unix and Unix-like operating systems.
--- Cut Here ---

Since this thread began, I've been thinking of this quote, substituting
"GAWK" for "Unix" and "Perl" for "VMS".

--
"We should always be disposed to believe that which appears to us to be
white is really black, if the hierarchy of the church so decides."

- Saint Ignatius Loyola (1491-1556) Founder of the Jesuit Order -

Aharon Robbins

Sep 1, 2014, 12:39:04 PM
In article <ltvndr$os4$1...@news.xmission.com>,
Kenny McCormack <gaz...@shell.xmission.com> wrote:
>I'm guessing that Janis was looking at the "man page" rather than the
>"Web site"/book. In "man gawk", it just refers you to the "ANSI C"
>documentation of "strftime" for the gory details. On the web site, it
>lists them all out for you.

"The documentation" for gawk includes both. The man page is not the primary
source of information, the manual is.

> We all know that this all "just works" under Linux, but it gets dicey
> under any of the proprietary OSes (e.g., Windows, Mac, and even some
> "custom" builds of Linux).

You're speaking out of ignorance. The extension mechanism works out
of the box on Mac OS X, Cygwin and with MinGW.

>FWIW, I think that this sort of thing can only work in Windows
>(where I think Mr. Ed does a lot of his work) if you have some
>(quasi-) commercial entity behind it; I'm thinking here of ActiveState,
>which manages both Perl and Tcl (and Expect in the bargain) for the Windows
>crowd.

Ed (and anyone else!) is welcome to throw money at me if he really can't
handle the basic steps to build and install gawk on Windows.

Ed Morton

Sep 1, 2014, 12:47:10 PM
On 9/1/2014 11:39 AM, Aharon Robbins wrote:
> In article <ltvndr$os4$1...@news.xmission.com>,
> Kenny McCormack <gaz...@shell.xmission.com> wrote:
<snip>
>> FWIW, I think that this sort of thing can only work in Windows
>> (where I think Mr. Ed does a lot of his work) if you have some
>> (quasi-) commercial entity behind it; I'm thinking here of ActiveState,
>> which manages both Perl and Tcl (and Expect in the bargain) for the Windows
>> crowd.
>
> Ed (and anyone else!) is welcome to throw money at me if he really can't
> handle the basic steps to build and install gawk on Windows.

Don't assume Kenny has the faintest idea what he's talking about. The only thing
I've ever done in Windows is write a small batch script to call cygwin bash to
run a UNIX tool and I'm sure if I wanted to build and install gawk on Windows
I'd have no trouble doing it.

You nailed Kenny's MO perfectly when you said upthread:

> You're speaking out of ignorance.

He just keeps posting and posting hoping to one day get "Savant" added to his title.

Ed.

Kenny McCormack

Sep 1, 2014, 2:16:10 PM
In article <lu27f8$rqo$3...@dont-email.me>,
Aharon Robbins <arn...@skeeve.com> wrote:
>In article <ltvndr$os4$1...@news.xmission.com>,
>Kenny McCormack <gaz...@shell.xmission.com> wrote:
>>I'm guessing that Janis was looking at the "man page" rather than the
>>"Web site"/book. In "man gawk", it just refers you to the "ANSI C"
>>documentation of "strftime" for the gory details. On the web site, it
>>lists them all out for you.
>
>"The documentation" for gawk includes both. The man page is not the primary
>source of information, the manual is.

Agreed.

I'm just assuming that Janis reached his conclusion based (only) on the man
page.

>> We all know that this all "just works" under Linux, but it gets dicey
>> under any of the proprietary OSes (e.g., Windows, Mac, and even some
>> "custom" builds of Linux).
>
>You're speaking out of ignorance.

A typical Usenet nonsensical response, for which you are already forgiven.
I thought you & I were past the point of throwing rocks at each other.

>The extension mechanism works out
>of the box on Mac OS X, Cygwin and with MinGW.

A couple of points re: that:
1) It doesn't "work out of the box" on Windows or Mac because those
OSes don't ship with a C compiler! That is my point. And those
OSes are aimed at users who, for the most part, wouldn't know a C
compiler from a tennis shoe.

Believe me, I know what I am talking about, in terms of what is
technically possible - given that I do most of my development on a
Mac. Once you've got the C compiler installed (on a Mac), it
works pretty much the same as on Linux. But that (for most users)
is a pretty big "Once you've got...". And even more so under Windows.
But I am more interested in discussing the "ecosystem" side
of it - what it will take to get wide acceptance from the
aforementioned "tennis shoe" type of user.

Under Windows, the situation is considerably more murky because
there's no single thing that can be called "The C compiler".
Luckily, on the Mac, there pretty much is just one single defined
entity called "The C compiler". I've no doubt that it works under
Cygwin and MinGW, but most people are going to be using something
(Visual something or other) from MS or (gasp!) C#. Just out of
curiosity, does it/will it work with C#?

2) I am glad to hear that it works with MinGW. I'm pretty sure it
always worked under Cygwin - since Cygwin is essentially Linux.
I remember when the networking stuff first came out - how happy I
was that it worked out-of-the-box under Cygwin.

But I don't think you're going to achieve "market share"
unless/until it either works with MS or you get some entity like
ActiveState to manage it for the users.

>>FWIW, I think that this sort of thing can only work in Windows
>>(where I think Mr. Ed does a lot of his work) if you have some
>>(quasi-) commercial entity behind it; I'm thinking here of ActiveState,
>>which manages both Perl and Tcl (and Expect in the bargain) for the Windows
>>crowd.
>
>Ed (and anyone else!) is welcome to throw money at me if he really can't
>handle the basic steps to build and install gawk on Windows.

His concern is more for his "customers" than for himself, apparently.

--
They say compassion is a virtue, but I don't have the time!

- David Byrne -

Andrew Schorr

Sep 1, 2014, 11:03:01 PM
Hi,

I don't want to get into the middle of a flame war. But I do want to point out that we were careful to make sure that the extension mechanism works across platforms. If you don't have a C compiler installed on the system, then it is true that you can't compile extensions. But you can't compile gawk either. If somebody is kind enough to provide a gawk binary, then they can also provide binaries of the extensions. So is the problem the lack of gawkextlib binaries for various platforms?

On a side note, I don't agree that gawk's historical lack of extensions was a virtue. The gawkextlib project was started in order to enable gawk to parse XML documents. This required a way to link in expat. You are of course free not to use any of the extensions, but I cannot see how it is not of benefit to some users to be able to handle XML documents in gawk. Similarly, various other extensions can add a great deal of power to the gawk language for those who desire these features.

Personally, I think CPAN is a good thing. I just happen to prefer the awk language to Perl, so I'd like to have some of these libraries available inside the gawk ecosystem. Nobody is required to use the extensions.

Regards,
Andy

Janis Papanagnou

Sep 4, 2014, 12:37:53 AM
On 31.08.2014 19:52, Kenny McCormack wrote:
> In article <ltpddr$ogh$1...@dont-email.me>,
> Aharon Robbins <arn...@skeeve.com> wrote:
> ...
>>> [***] WRT complexity of documentation a quick look into the gawk manual
>>> will be enlightening; strftime() for example requires only a few lines
>>> of documentation, and I'd expect strptime() not to require substantially
>>> more.
>>
>> Gee - I see over 3 pages describing all the format options for strftime.
>> I certainly don't feel like writing another 3 or 4 pages for strptime.

No, you would not duplicate the format specifications, but refer to them.
The point of a symmetric function set (strftime/strptime) is to reuse the
same format specifiers and not invent a different set. (The documentation
structure is already there, BTW.)

>
> I'm guessing that Janis was looking at the "man page" rather than the
> "Web site"/book. In "man gawk", it just refers you to the "ANSI C"
> documentation of "strftime" for the gory details. On the web site, it
> lists them all out for you.

No, I've looked at the Web-Version of the manual. For the function itself
there's only a few lines of description. As said; I don't think the format
specifiers should be duplicated.

We would then have...

mktime(datespec)
...

strftime([format [, timestamp [, utc-flag]]]) ### 1 ###
...

strptime(...) ### 2 ###
...

systime()
...

time format specifiers ### 3 ###
...


The description for #2# would be new and of similar size as [existing] #1#
(i.e. just a few lines), and #3# (and all the other function sections) are
existing anyway. No need to duplicate #3#.

Janis

Aharon Robbins

Sep 4, 2014, 2:51:27 AM
In article <lu8qb0$2ej$1...@news.m-online.net>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>>> Gee - I see over 3 pages describing all the format options for strftime.
>>> I certainly don't feel like writing another 3 or 4 pages for strptime.
>
>No, you would not duplicate the format specifications, but refer to them.
>The point of a symmetric function set (strftime/strptime) is to reuse the
>same format specifiers and not invent a different set.

Are you sure that they all mean the same thing in both functions?
I'm not.

It's a moot point. I'm not going to add strptime to the dist.

Janis Papanagnou

Sep 4, 2014, 4:20:39 AM
On 04.09.2014 08:51, Aharon Robbins wrote:
> In article <lu8qb0$2ej$1...@news.m-online.net>,
> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>>>> Gee - I see over 3 pages describing all the format options for strftime.
>>>> I certainly don't feel like writing another 3 or 4 pages for strptime.
>>
>> No, you would not duplicate the format specifications, but refer to them.
>> The point of a symmetric function set (strftime/strptime) is to reuse the
>> same format specifiers and not invent a different set.
>
> Are you sure that they all mean the same thing in both functions?

Short answer: yes.

Basically that seems to be the case; the respective same specifiers do
the equivalent things.

> I'm not.

Neither was I sure. So, despite not using those C-functions, before my
postings I inspected and compared the specifiers as described in their
respective manpage of my system. (Standards anyone? - It turned out
that the man page descriptions seem to have been to a great extent just
copied from the respective POSIX standard document. The inconsistencies
subsequently described are thus [partly] inherited.)

The problem is that those two functions in the C library made the same
mistake that you seem to have in mind, to document all the specifiers
twice. The sad consequence is that you indeed would have to carefully
not only compare the two C function documents, but - since gawk seems
to just use those C functions as they are, whatever they actually do -
also to test the [C-]functions before relying on them in gawk. Not sure
that had been done. Anyway. The double documentation reveals a lot of
unnecessary problems; e.g., listing, say, %A and %a in one sentence vs.
in two separate lines, listing the %E and %O extensions explicitly vs.
describing them as a class of specifiers, mentioning alternative names
for specifiers separately or all together, having the specifiers %n/%t
define specific separators in one case while in the other case they
seem to accept each other, and using different wording for the same
behaviour. All this does not provide great confidence in what those C
functions do; in critical cases, or if I were to implement them in my
software, I had better actually test them.

Would it be better to just copy the contents of the POSIX documents?
A valid approach given that only the underlying C functions are used.
But then you'd get that bulky duplication. I'm not sure what you were
specifically reluctant about; the writing - unnecessary due to the plain
copy -, or the bulkiness of text that document duplication comes with.

Anyway. (As said.)

Janis

> [...]

Ed Morton

Sep 4, 2014, 9:09:16 AM
On 9/4/2014 1:51 AM, Aharon Robbins wrote:
> In article <lu8qb0$2ej$1...@news.m-online.net>,
> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>>>> Gee - I see over 3 pages describing all the format options for strftime.
>>>> I certainly don't feel like writing another 3 or 4 pages for strptime.
>>
>> No, you would not duplicate the format specifications, but refer to them.
>> The point of a symmetric function set (strftime/strptime) is to reuse the
>> same format specifiers and not invent a different set.
>
> Are you sure that they all mean the same thing in both functions?
> I'm not.

Yes, that's the point of strptime() - to reverse-map the same format specifiers
as strftime().

>
> It's a moot point. I'm not going to add strptime to the dist.
>

I don't really understand the difference but would it be more palatable to have
it added with one of those "-i" arguments, e.g. "-i time", like you did for "-i
inplace"?

Ed.

Aharon Robbins

Sep 5, 2014, 1:38:24 AM
In article <lu97cm$ac$1...@news.m-online.net>,
I was under the impression that not all formats were the same, semantically,
so I didn't want to have to document the function fully.

But even given that they're the same, for other reasons that I've already
gone over, I don't see a reason to add this to the dist.

Thanks,

Arnold

Aharon Robbins

Sep 5, 2014, 1:42:02 AM
In article <lu9o9q$ss$1...@dont-email.me>,
Ed Morton <morto...@gmail.com> wrote:
>I don't really understand the difference but would it be more palatable to have
>it added with one of those "-i" arguments, e.g. "-i time", like you did for "-i
>inplace"?

No. `-i inplace' hardly brought about the huge increase in gawk use that people
here claimed it would.

Kenny posted his extension. Ed, you and anyone else who needs it can compile it
and use it with -l strptime (or @load "strptime") and you're set.

The gawk dist is NOT going to become the location for every small extension
that people might come up with. The gawkextlib project serves that purpose,
and maybe it will one day evolve into a CTAN or CPAN equivalent.

Manuel Collado

Sep 5, 2014, 3:32:15 AM
El 04/09/2014 15:09, Ed Morton escribió:
> On 9/4/2014 1:51 AM, Aharon Robbins wrote:
>> In article <lu8qb0$2ej$1...@news.m-online.net>, Janis Papanagnou
>> <janis_pa...@hotmail.com> wrote:
>>>>> Gee - I see over 3 pages describing all the format options
>>>>> for strftime. I certainly don't feel like writing another 3
>>>>> or 4 pages for strptime.
>>>
>>> No, you would not duplicate the format specifications, but refer
>>> to them. The point of a symmetric function set
>>> (strftime/strptime) is to reuse the same format specifiers and
>>> not invent a different set.
>>
>> Are you sure that they all mean the same thing in both functions?
>> I'm not.
>
> Yes, that's the point of strptime() - to reverse-map the same format
> specifiers as strftime().

Well, the manual must describe not just the intuitive meaning of each
format specifier, but the precise behavior of the function that processes
it. And this behavior is noticeably different in the two functions.

Excerpt from the strftime/strptime docs:

---- strftime
%a
Replaced by the locale's abbreviated weekday name. [ /tm_wday/]
%A
Replaced by the locale's full weekday name. [ /tm_wday/]
%b
Replaced by the locale's abbreviated month name. [ /tm_mon/]
%B
Replaced by the locale's full month name. [ /tm_mon/]

---- strptime
%a
The day of the week, using the locale's weekday names; either
the abbreviated or full name may be specified.
%A
Equivalent to %a.
%b
The month, using the locale's month names; either the
abbreviated or full name may be specified.
%B
Equivalent to %b.

Do you see the difference?
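
The difference can be checked with a minimal C sketch. It assumes the
default "C" locale and a POSIX libc such as glibc; weekday_from_name and
format_weekday are hypothetical helper names used only for illustration.
On output, %a and %A produce different strings; on input, either
specifier should accept both the abbreviated and the full name.

```c
#define _XOPEN_SOURCE 700   /* exposes strptime() in <time.h> on glibc */
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: parse a weekday name with the given format
 * ("%a" or "%A"). Returns tm_wday (0 = Sunday) or -1 if the whole
 * string was not consumed as a weekday name.
 */
int
weekday_from_name(const char *name, const char *format)
{
    struct tm tm;
    const char *end;

    memset(&tm, 0, sizeof(tm));
    end = strptime(name, format, &tm);
    if (end == NULL || *end != '\0')
        return -1;
    return tm.tm_wday;
}

/*
 * Hypothetical helper: format a weekday number with the given format.
 * Returns the number of characters written, as strftime() does.
 */
size_t
format_weekday(int wday, const char *format, char *buf, size_t size)
{
    struct tm tm;

    memset(&tm, 0, sizeof(tm));
    tm.tm_wday = wday;
    return strftime(buf, size, format, &tm);
}
```

With these helpers, weekday_from_name("Wednesday", "%a") and
weekday_from_name("Wed", "%A") should both yield 3, while
format_weekday(3, "%a", ...) and format_weekday(3, "%A", ...) write
"Wed" and "Wednesday" respectively.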

Regards.
--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado

Janis Papanagnou

Sep 5, 2014, 7:20:19 AM
Of course. The parse function is in more than one place "generous". It's
even "worse"; some specifiers (e.g. %g and %G) *exist* only in one version.
But such differences are natural. The point - as I see it (and so at least I
understood Arnold's question about "meaning") - is that (for example) %a/%A
are about the week name, %d/%D are about day of month, etc. The specifiers
all match. Processing naturally differs with respect to the function; a
print/format function will have to print the defined format, while a parse
function can be "tolerant", not only as in your above shown examples but
also whether for input, say, besides 2-digit values also 1-digit or N-digit
values are accepted. All this does not make it impossible to have those
specifiers documented consistently and in one place so that you can easily
see the commonalities (and differences). A table would probably be the
clearest form for that purpose.

Janis

>
> Regards.

Janis Papanagnou

Sep 5, 2014, 7:25:05 AM
On 05.09.2014 13:20, Janis Papanagnou wrote:
[...]
Darn! I typed without thinking.

> But such differences are natural. The point - as I see it (and so at least I
> understood Arnold's question about "meaning") - is that (for example) %a/%A
> are about the week name, %d/%D are about day of month, etc. [...]

%d is about day of month (in both functions)
%D is equivalent to %m/%d/%y (in both functions)

The point, just to emphasize, was that specifiers using the same character(s)
define the same semantical entity ("meaning") in both of those time functions.
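
As a rough illustration of the %D case, here is a minimal C sketch that
formats a date with strftime() and parses the result back with
strptime(), using "%D" both times. format_us_date and parse_us_date are
hypothetical helper names, and the sketch assumes a POSIX libc in the
default "C" locale.

```c
#define _XOPEN_SOURCE 700   /* exposes strptime() in <time.h> on glibc */
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: format a calendar date with "%D"
 * (documented as equivalent to "%m/%d/%y").
 * Returns the number of characters written.
 */
size_t
format_us_date(int year, int month, int day, char *buf, size_t size)
{
    struct tm tm;

    memset(&tm, 0, sizeof(tm));
    tm.tm_year = year - 1900;   /* struct tm counts years from 1900 */
    tm.tm_mon = month - 1;      /* and months from 0 */
    tm.tm_mday = day;
    return strftime(buf, size, "%D", &tm);
}

/*
 * Hypothetical helper: parse text with the same "%D" specifier.
 * Returns 0 on success, -1 if the text does not match.
 */
int
parse_us_date(const char *text, struct tm *out)
{
    memset(out, 0, sizeof(*out));
    return strptime(text, "%D", out) == NULL ? -1 : 0;
}
```

Formatting 27 August 2014 this way gives "08/27/14", and parsing that
string back should restore the same month, day and year fields.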

Janis

Kenny McCormack

Sep 5, 2014, 7:27:50 AM
In article <luc69h$pt6$1...@news.m-online.net>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
...
>A table would probably be the
>clearest form for that purpose.

So, you're volunteering - to write the documentation for the new strptime
extension? Great! The code is already written; all I need is for someone
smart and literate like yourself to write up the docu. Thanks!

--
Rich people pay Fox people to convince middle class people to blame poor people.

(John Fugelsang)

Janis Papanagnou

Sep 5, 2014, 8:12:31 AM
On 05.09.2014 13:27, Kenny McCormack wrote:
> In article <luc69h$pt6$1...@news.m-online.net>,
> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
> ...
>> A table would probably be the
>> clearest form for that purpose.
>
> So, you're volunteering - to write the documentation for the new strptime
> extension? Great! The code is already written; all I need is for someone
> smart and literate like yourself to write up the docu. Thanks!

And indeed I've already started yesterday! LOL. - But I wasn't volunteering,
as you are flippantly presuming; I just did it for my own interest to see what
that would look like, and whether it's feasible, and to back up my statements
before posting.

But what are you looking for? Docu for your strptime function? It's already
there[*] if you are concerned about the format specifiers. Or looking for
some strftime/strptime-unified version? Where should that docu go if Arnold
wants to keep it outside of the core awk?

Janis

[*] http://pubs.opengroup.org/onlinepubs/009695399/functions/strptime.html

Ed Morton

Sep 5, 2014, 8:12:44 AM
On 9/5/2014 12:42 AM, Aharon Robbins wrote:
> In article <lu9o9q$ss$1...@dont-email.me>,
> Ed Morton <morto...@gmail.com> wrote:
>> I don't really understand the difference but would it be more palatable to have
>> it added with one of those "-i" arguments, e.g. "-i time", like you did for "-i
>> inplace"?
>
> No. `-i inplace' hardly brought about the huge increase in gawk use that people
> here claimed it would.

The point of "-i inplace" was mainly to stop people using complicated sed
solutions instead of simple awk solutions just to get "inplace" editing. Word of
"-i inplace" is only now spreading and we're just now starting to see it show up
in forums as the suggested solution instead of "sed -i" so I think it's early to
consider whether or not gawk usage will increase because of it.

> Kenny posted his extension. Ed, you and anyone else who needs it can compile it
> and use it with -l strptime (or @load "strptime") and you're set.

No-one's going to do that as an alternative to writing the awk code as there's
no overall saving in effort.

> The gawk dist is NOT going to become the location for every small extension
> that people might come up with.

Understood and I already explained why strptime() isn't just some other small
extension and it's certainly not something I came up with.

Just think about how many times people are parsing log files with timestamps in
all sorts of formats and the hoops they have to jump through to map their
timestamp-du-jour to secs since the epoch, especially when they include a month
name.

strptime() is missing functionality from the time functions that impacts many
users by its absence. It is well-defined, well-documented, doesn't interfere
with any existing language constructs or functions and is already implemented
and best I can tell available to just be called. I don't understand your
reluctance to just make it available. I'll write the documentation and/or
implement it if either of those is the issue.

Ed.

Kenny McCormack

Sep 5, 2014, 9:56:07 AM
In article <luc9be$42c$1...@news.m-online.net>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>On 05.09.2014 13:27, Kenny McCormack wrote:
>> In article <luc69h$pt6$1...@news.m-online.net>,
>> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>> ...
>>> A table would probably be the
>>> clearest form for that purpose.
>>
>> So, you're volunteering - to write the documentation for the new strptime
>> extension? Great! The code is already written; all I need is for someone
>> smart and literate like yourself to write up the docu. Thanks!
>
>And indeed I've already started yesterday!
>I just did it for my own interest to see what
>that would look like, and whether it's feasible, and to back up my statements
>before posting.

Great! Let's say by Wednesday (9/10), COB.

Have your secretary call my secretary and make the arrangements.

Thanks again!

--
(This discussion group is about C, ...)

Wrong. It is only OCCASIONALLY a discussion group
about C; mostly, like most "discussion" groups, it is
off-topic Rorsharch [sic] revelations of the childhood
traumas of the participants...

Janis Papanagnou

Sep 5, 2014, 4:29:36 PM
On 05.09.2014 15:55, Kenny McCormack wrote:
> In article <luc9be$42c$1...@news.m-online.net>,
> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>> On 05.09.2014 13:27, Kenny McCormack wrote:
>>> In article <luc69h$pt6$1...@news.m-online.net>,
>>> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>>> ...
>>>> A table would probably be the
>>>> clearest form for that purpose.
>>>
>>> So, you're volunteering - to write the documentation for the new strptime
>>> extension? Great! The code is already written; all I need is for someone
>>> smart and literate like yourself to write up the docu. Thanks!
>>
>> And indeed I've already started yesterday!
>> I just did it for my own interest to see what
>> that would look like, and whether it's feasible, and to back up my statements
>> before posting.
>
> Great! Let's say by Wednesday (9/10), COB.
>
> Have your secretary call my secretary and make the arrangements.
>
> Thanks again!

You're welcome. Just note that such close delivery time requests will cost
an extra fee. Also note that your arbitrary removal of some text of my
posting, which changes the meaning of what was announced, will make it a bit
more expensive as well. But as you suggest, your secretary may contact my
legal representative to discuss the details of the contract.

You funny guy!

Janis

Jonathan Hankins

Nov 5, 2015, 4:28:53 PM
On Thursday, August 28, 2014 at 11:45:56 AM UTC-5, Kenny McCormack wrote:
> In article <13387e6a-c503-4faa...@googlegroups.com>,
> Ed Morton <morto...@gmail.com> wrote:
> ...
> >I know we want to avoid cluttering up the awk language but time
> >conversions are a VERY common problem and having to write the
> >split()+match() with arithmetic or populate an array that maps month names
> >to numbers etc. is pretty painful (I always have to look it up) and that
> >strptime() seems much more like it'd be filling a glaring hole in the gawk
> >time functions rather than adding on to them.
>
> Arnold has stated many times that if you can do it in AWK code or in an
> extension library, it won't be put into the core. Both you and I may
> disagree with this policy from time to time, but we have to live with it.
>
> Here ya go!
>
> $ gawk4 -l ./strptime.so '{print strftime("%c", strptime($0,"%d/%b/%Y:%H:%M"))}'
> 27/Aug/2014:23:58
> Wed Aug 27 23:58:00 2014
> $
>
> And here's the code (note that it is sort of a pain having go through this
> every time you need to add a new function which is basically a pass-thru
> into the C library - but so it goes. I was able to knock this together in
> about 20 minutes, so it's not so bad...)
>
> --- Cut Here ---
> /*
> * strptime.c - GAWK interface to strptime (like in Perl)
> * Compile command:
> gcc -shared -I.. -W -Wall -Werror -fPIC -o strptime.so strptime.c
> */
>
> #define _XOPEN_SOURCE
> #include <stdio.h>
> #include <stddef.h>
> #include <string.h>
> #include <assert.h>
> #include <errno.h>
> #include <stdlib.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <dlfcn.h>
> #include <time.h>
>
> #include "gawkapi.h"
> #define STR str_value.str
>
> static const gawk_api_t *api; /* for convenience macros to work */
> static awk_ext_id_t *ext_id;
> static const char *ext_version = "strptime extension: version 1.0";
> static awk_bool_t (*init_func)(void) = NULL;
>
> int plugin_is_GPL_compatible;
>
> /* do_strptime */
>
> static awk_value_t *
> do_strptime(int nargs, awk_value_t *result)
> {
> awk_value_t arg0,arg1;
> struct tm tm;
>
> if (nargs != 2) {
> lintwarn(ext_id,"strptime: called with wrong # arguments (%d): must be 2!",nargs);
> goto the_end;
> }
> if (!get_argument(0, AWK_STRING, &arg0)) {
> lintwarn(ext_id,"strptime: Fatal error retrieving first arg!");
> goto the_end;
> }
> if (!get_argument(1, AWK_STRING, &arg1)) {
> lintwarn(ext_id,"strptime: Fatal error retrieving second arg!");
> goto the_end;
> }
> strptime(arg0.STR,arg1.STR,&tm);
> return make_number(mktime(&tm), result);
>
> the_end:
> return make_const_string("<ERROR>",7,result);
> }
>
> static awk_ext_func_t func_table[] = {
> { "strptime", do_strptime, 2 },
> };
>
> /* define the dl_load function using the boilerplate macro */
>
> dl_load_func(func_table, strptime, "")
>
> --- Cut Here ---
>
> --
> "There's no chance that the iPhone is going to get any significant market share. No chance." - Steve Ballmer

Kenny,

On my system (Linux Mint 17.2) with GNU libc 2.19, I needed to initialize the struct tm, or I got semi-random results calling strptime. The man page strptime(3) mentions this in relation to glibc, at least on my system.

memset(&tm, 0, sizeof(struct tm));
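
For context, a minimal standalone sketch of that fix follows. It is not
the extension's actual updated code: parse_timestamp and
timestamp_to_epoch are hypothetical helper names, and the checks on
strptime()'s return value and the tm_isdst = -1 setting are assumptions
about what a careful caller might add on top of the memset().

```c
#define _XOPEN_SOURCE 700   /* exposes strptime() in <time.h> on glibc */
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: parse text according to format into *out.
 * Returns 0 on success, -1 if the text does not match the format.
 */
int
parse_timestamp(const char *text, const char *format, struct tm *out)
{
    /* Zero every field first: strptime() only stores the fields that
       the format mentions and leaves the others untouched. */
    memset(out, 0, sizeof(*out));

    /* strptime() returns NULL when the input cannot be parsed. */
    if (strptime(text, format, out) == NULL)
        return -1;

    /* Assumption: ask mktime() to decide for itself whether DST
       applies to this date. */
    out->tm_isdst = -1;
    return 0;
}

/*
 * Hypothetical helper: seconds since the epoch for a timestamp
 * interpreted in the local time zone, or (time_t)-1 on failure.
 */
time_t
timestamp_to_epoch(const char *text, const char *format)
{
    struct tm tm;

    if (parse_timestamp(text, format, &tm) != 0)
        return (time_t)-1;
    return mktime(&tm);
}
```

Called as timestamp_to_epoch("27/Aug/2014:23:58", "%d/%b/%Y:%H:%M"),
this gives a value that depends on the local time zone, so no fixed
number is shown here.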

Thanks for coding the extension -- I had started to myself, but never got far :-)

-Jonathan Hankins

Kenny McCormack

Nov 5, 2015, 4:46:09 PM
In article <b0f6b7bc-a8f6-48c4...@googlegroups.com>,
Jonathan Hankins <jonathan...@gmail.com> wrote:
...
>Kenny,
>
>On my system (Linux Mint 17.2) with GNU libc 2.19, I needed to initialize the
>struct tm, or I got semi-random results calling strptime. The man page
>strptime(3) mentions this in relation to glibc, at least on my system.
>
> memset(&tm, 0, sizeof(struct tm));
>
>Thanks for coding the extension -- I had started to myself, but never got far :-)
>
>-Jonathan Hankins

Yes. I had figured that out, and also how to do the DST stuff right, a
while back (September 1st, in fact).

You can pick up an updated version from:

http://shell.xmission.com:PORT/strptime.zip

where PORT is 65401.

--
The last time a Republican cared about you, you were a fetus.