understanding time format conversions

Bryan

May 31, 2012, 10:30:59 PM
[ Gnu awk 4.0.0 ]
[ Ubuntu 12.04 ]

I have a serious misunderstanding of strftime. The example below attempts a conversion that fails (giving 31Dec69). Some related time examples from comp.lang.awk (not shown) do work for me. I would like to convert "01Jan10" into some other format. I have seen some elaborate scripts that do this, which suggests I am going about it the wrong way.

-Bryan

#/bin/awk
BEGIN{
format_A = "%a %b %e %H:%M:%S %Z %Y";
format_B = "%e%b%y";
date_1="01Jan10";
date_2="2010 01 01 10 23 59";
print "try to reformat "date_2" : " strftime(format_B,date_2);
}

Ed Morton

May 31, 2012, 11:09:51 PM
strftime takes a number of secs since the epoch as its second arg, not a date
spec. Try this:

... strftime(format_B,mktime(date_2))

and be aware that these time functions are gawk-specific so invoking /bin/awk
may be misleading.
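
A minimal sketch of that two-step conversion, wrapped in a shell function so it can be run directly. The helper name reformat_datespec is made up for illustration, and LC_ALL=C and TZ=UTC are set only so the result does not depend on the local locale or time zone; mktime() and strftime() are the gawk functions discussed above.

```shell
# Hypothetical helper: reformat a "YYYY MM DD HH MM SS" datespec with gawk.
reformat_datespec() {
    LC_ALL=C TZ=UTC gawk -v spec="$1" -v fmt="$2" 'BEGIN {
        secs = mktime(spec)          # datespec string -> seconds since the epoch
        if (secs == -1) {            # mktime() returns -1 when it cannot parse spec
            print "bad datespec: " spec
            exit 1
        }
        print strftime(fmt, secs)    # seconds since the epoch -> formatted text
    }'
}

reformat_datespec "2010 01 01 10 23 59" "%d%b%y"
```

With the datespec from the original post this should print 01Jan10. A string that mktime() cannot parse, such as "01Jan10" itself, is reported as a bad datespec instead of being formatted.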

Ed.

Kenny McCormack

Jun 1, 2012, 3:24:26 AM
In article <jq9bpu$5n1$1...@dont-email.me>,
Ed Morton <morto...@gmail.com> wrote:
...
>> #/bin/awk
>> BEGIN{
>> format_A = "%a %b %e %H:%M:%S %Z %Y";
>> format_B = "%e%b%y";
>> date_1="01Jan10";
>> date_2="2010 01 01 10 23 59";
>> print "try to reformat "date_2" : " strftime(format_B,date_2);
>> }
>>
>
>strftime takes a number of secs since the epoch as its second arg, not a date
>spec. Try this:
>
> ... strftime(format_B,mktime(date_2))
>
>and be aware that these time functions are gawk-specific so invoking /bin/awk
>may be misleading.

I don't see any invocation of /bin/awk. Do you?

--
The motto of the GOP "base": You can't be a billionaire, but at least you
can vote like one.

Aharon Robbins

Jun 1, 2012, 6:04:22 AM
Hi Kenny. Trimming things down:

>>> #/bin/awk
>
>I don't see any invocation of /bin/awk. Do you?

So, yes...

What was probably intended is

#! /bin/awk -f
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com
P.O. Box 354 Home Phone: +972 8 979-0381
Nof Ayalon Cell Phone: +972 50 729-7545
D.N. Shimshon 99785 ISRAEL

Bryan

Jun 1, 2012, 12:42:00 PM
On Thursday, May 31, 2012 11:09:51 PM UTC-4, Ed Morton wrote:
> Try this:
>
> ... strftime(format_B,mktime(date_2))

This is a great idea, thanks. I can see how that works.

So, sorry, this isn't sinking in yet: is there a time function that will take an argument in format %d%m%y? That is, mktime takes YYYY MM DD HH MM SS [DST] and strftime takes a number of seconds since the epoch, so without changing dates by hand into YYYY MM DD HH MM SS [DST] I do not see how to go from %d%m%y to anything else. (I have a script with dates already hand-written in this format, e.g. some_event="01Jan10";)

> and be aware that these time functions are gawk-specific so invoking /bin/awk
> may be misleading.
> (others commented on the #! thing)

Yes, I appreciate you pointing this out. In truth I put it there as an afterthought when posting, i.e. it was not a copy/paste from the actual script; mea culpa. And my installation actually has a symlink from awk to gawk.

-Bryan

Ed Morton

Jun 1, 2012, 1:11:03 PM
No. Best you can do is automate the re-formatting of that date spec. Something
like this (untested):

{
    # assume $0 = 01Jan10
    dayNr   = substr($0,1,2)        # 01
    mthName = substr($0,3,3)        # Jan
    yearNr  = "20" substr($0,6,2)   # 2010

    match("JanFebMarAprMayJunJulAugSepOctNovDec",mthName)
    mthNr = sprintf("%02d",(RSTART+2)/3)

    secs = mktime( yearNr " " mthNr " " dayNr " 00 00 00")

    print strftime("%e%b%y",secs)
}

Regards,

Ed.



Luuk

Jun 1, 2012, 1:20:06 PM, to Aharon Robbins
On 01-06-2012 12:04, Aharon Robbins wrote:
> What was probably intended is
>
> #! /bin/awk -f

What was probably intended is

#! /bin/gawk -f


;)

Kenny McCormack

Jun 1, 2012, 1:44:28 PM
In article <95a2b2e6-8554-4e81...@googlegroups.com>,
Bryan <bryan...@gmail.com> wrote:
>On Thursday, May 31, 2012 11:09:51 PM UTC-4, Ed Morton wrote:
>> Try this:
>>
>> ... strftime(format_B,mktime(date_2))
>
>this is a great idea, thanks. I can see how that works,..
>
>... soooo... sorry, this isn't sinking in yet - is there a time function
>that will take an argument in format %d%m%y ? i.e. mktime takes YYYY MM
>DD HH MM SS [DST], strftime takes a number of secs since the epoch, so -
>without changing dates by hand into YYYY MM DD HH MM SS [DST] I do not
>see how to go from %d%m%y to anything else. (I have a script with dates
>already hand-written in this format, e.g. some_event="01Jan10";

The general idea is that GAWK only has the one function for making "epoch
time numbers" - and that is mktime(). So, you usually have to write the
code yourself to convert whatever time format you have into the format that
mktime() wants. In particular, you will almost certainly have to write code
to lookup the month names. And, yes, this has been done a thousand times,
by a thousand different programmers - seems a shame that it has to be
re-rolled each time.
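
For what it's worth, the month-name lookup can be kept fairly small. The sketch below uses an associative array rather than the match()/RSTART trick from earlier in the thread. The function name ddmonyy_to_datespec, the fixed DDMonYY layout, and the assumption that every two-digit year means 20xx are all illustrative choices, not anything built into gawk. It uses no time functions, so plain awk will do; the output is the string that mktime() expects.

```shell
# Hypothetical helper: turn DDMonYY (e.g. 01Jan10) into "YYYY MM DD 00 00 00".
ddmonyy_to_datespec() {
    awk -v d="$1" 'BEGIN {
        # Build a name -> number table: month["jan"] = 1 ... month["dec"] = 12
        n = split("jan feb mar apr may jun jul aug sep oct nov dec", names, " ")
        for (i = 1; i <= n; i++)
            month[names[i]] = i

        mon = tolower(substr(d, 3, 3))
        if (length(d) != 7 || !(mon in month)) {
            print "unrecognised date: " d
            exit 1
        }
        # Assumption: two-digit years all belong to 20xx. Digits are not validated.
        printf "20%s %02d %s 00 00 00\n", substr(d, 6, 2), month[mon], substr(d, 1, 2)
    }'
}

ddmonyy_to_datespec 01Jan10
```

For 01Jan10 this should print 2010 01 01 00 00 00, which can then be passed to mktime().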

On a more general note, I think what you're looking for is strptime(), which
is generally described as being the "inverse of strftime()". Unfortunately,
there doesn't exist any (direct, user-accessible) interface to it in GAWK.

I've often thought about cobbling something together to give access to
strptime(). I'd like to do it with my "call_any()" interface - not sure if
this will work or not, though. It could certainly be done, of course, using
an extension library.

--
One of the best lines I've heard lately:

Obama could cure cancer tomorrow, and the Republicans would be
complaining that he had ruined the pharmaceutical business.

(Heard on Stephanie Miller = but the sad thing is that there is an awful lot
of direct truth in it. We've constructed an economy in which eliminating
cancer would be a horrible disaster. There are many other such examples.)

Ed Morton

Jun 1, 2012, 4:17:55 PM
On 6/1/2012 12:44 PM, Kenny McCormack wrote:
> In article<95a2b2e6-8554-4e81...@googlegroups.com>,
> Bryan<bryan...@gmail.com> wrote:
>> On Thursday, May 31, 2012 11:09:51 PM UTC-4, Ed Morton wrote:
>>> Try this:
>>>
>>> ... strftime(format_B,mktime(date_2))
>>
>> this is a great idea, thanks. I can see how that works,..
>>
>> ... soooo... sorry, this isn't sinking in yet - is there a time function
>> that will take an argument in format %d%m%y ? i.e. mktime takes YYYY MM
>> DD HH MM SS [DST], strftime takes a number of secs since the epoch, so -
>> without changing dates by hand into YYYY MM DD HH MM SS [DST] I do not
>> see how to go from %d%m%y to anything else. (I have a script with dates
>> already hand-written in this format, e.g. some_event="01Jan10";
>
> The general idea is that GAWK only has the one function for making "epoch
> time numbers" - and that is mktime(). So, you usually have to write the
> code yourself to convert whatever time format you have into the format that
> mktime() wants. In particular, you will almost certainly have to write code
> to lookup the month names. And, yes, this has been done a thousand times,
> by a thousand different programmers - seems a shame that it has to be
> re-rolled each time.
>
> On a more general note, I think what you're looking for is strptime(), which
> is generally described as being the "inverse of strftime()". Unfortunately,
> there doesn't exist any (direct, user-accessible) interface to it in GAWK.

Even strptime() couldn't handle the OP's time format though, as it requires
non-alphanumeric characters between time specifiers.

> I've often thought about cobbling something together to give access to
> strptime(). I'd like to do it with my "call_any()" interface - not sure if
> this will work or not, though. It could certainly be done, of course, using
> an extension library.

That got me wondering how hard it'd be to write a simple, stripped down
strptime() that'd work for most applications. Here's what I came up with FWIW:

$ cat strptime.awk
function strptime(val, fmt,    fmtA,valA,i,n,specA,spec)
{
    n = split(val,valA,/[^[:alnum:]]+/)
    split(fmt,fmtA,/[^[:alpha:]]+/)

    for (i=1; i<=n; i++) {
        specA[fmtA[i+1]] = valA[i]
    }

    if (!specA["m"]) {
        match("janfebmaraprmayjunjulaugsepoctnovdec",tolower(specA["b"]))
        specA["m"] = (RSTART+2)/3
    }

    spec = sprintf("%4d %2d %2d %2d %2d %2d",\
        specA["Y"], specA["m"], specA["d"], specA["H"], specA["M"], specA["S"])

    return mktime(spec)
}

BEGIN{ FS="\t" }

{ print $1 " -> " strftime( "%D (%T)", strptime($1,$2) ) }


$ cat file
6 Dec 2001 12:33:45 %d %b %Y %H:%M:%S
6 dec 2001 %d %b %Y
06 12 2001 %d %m %Y


$ awk -f strptime.awk file
6 Dec 2001 12:33:45 -> 12/06/01 (12:33:45)
6 dec 2001 -> 12/06/01 (00:00:00)
06 12 2001 -> 12/06/01 (00:00:00)

All it supports is the %d, %b, %m, %Y, %H, %M, and %S specifiers. I do agree a
builtin strptime() would be a great addition to gawk's time functions.

Regards,

Ed.

Bryan

Jun 3, 2012, 11:28:28 AM
Just want to say thanks for another illuminating thread.

FWIW, a copy/paste of a testing script I used to understand things, based on Ed's script, is below. WARNING: it is not pretty to look at, but it seems to work OK.

-Bryan

#!/bin/gawk -f
BEGIN{
{
# ORIGINAL : assume $0 = 01Jan10
first_date_point="03Jun12"
second_date_point="03Jun12"
}
#--------------------------------------------------
# FIRST DATE
#--------------------------------------------------
{
# substr(s, a, b) returns b number of chars from string s,
# starting at position a. The parameter b is optional.
dayNr = substr(first_date_point,1,2) # 01
mthName = substr(first_date_point,3,3) # Jan
yearNr = "20" substr(first_date_point,6,2) # 2010

match("JanFebMarAprMayJunJulAugSepOctNovDec",mthName)
# The match function sets the built-in variable RSTART to the index.
# It also sets the built-in variable RLENGTH to the length in characters
# of the matched substring. If no match is found, RSTART is set to 0, and RLENGTH to -1.
mthNr = sprintf("%02d",(RSTART+2)/3)
foo = sprintf("%2d",(RSTART+2)/3)
# YYYY MM DD HH MM SS
first_date_in_secs = mktime( yearNr " " mthNr " " dayNr " 00 00 00")
}
#--------------------------------------------------
# REPORT STUFF
#--------------------------------------------------
{
print "date of some format : "strftime("%e%b%y", first_date_in_secs)
print dayNr" "mthName" "mthNr" "yearNr" " first_date_in_secs" "RSTART" "(RSTART+2)/3" "foo
print "seconds of " first_date_point " since epoch: " first_date_in_secs
print "today in seconds since epoch :"systime()
print "today in human readable form :"strftime("%d%b%y")
print "today for calculating diffs :"strftime("%Y %m %d %H %M %S")
print ( systime() - first_date_in_secs ) / (60 * 60 * 24 )
print "try some cals:"
print "today ...... test date"
print mktime(strftime("%Y %m %d %H %M %S"))" "mktime(strftime("%Y %m %d %H %M %S",first_date_in_secs))
print " days between " strftime("%d%b%y %H:%M:%S")" and "strftime("%d%b%y %H:%M:%S",first_date_in_secs)": "(mktime(strftime("%Y %m %d %H %M %S")) - mktime(strftime("%Y %m %d %H %M %S",first_date_in_secs)))/(60 * 60 * 24)
print " hours between " strftime("%d%b%y %H:%M:%S")" and "strftime("%d%b%y %H:%M:%S",first_date_in_secs)": "(mktime(strftime("%Y %m %d %H %M %S")) - mktime(strftime("%Y %m %d %H %M %S",first_date_in_secs)))/(60 * 60 )
print "seconds between " strftime("%d%b%y %H:%M:%S")" and "strftime("%d%b%y %H:%M:%S",first_date_in_secs)": "(mktime(strftime("%Y %m %d %H %M %S")) - mktime(strftime("%Y %m %d %H %M %S",first_date_in_secs)))
}
}
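
The difference calculations above boil down to subtracting two epoch values, so they can be sketched more compactly. In the version below the function name days_between is made up for illustration, both arguments are assumed to already be in mktime()'s "YYYY MM DD HH MM SS" form, and TZ=UTC is set so that daylight-saving shifts do not creep into the result.

```shell
# Hypothetical helper: days from datespec $1 to datespec $2, to two decimals.
days_between() {
    TZ=UTC gawk -v a="$1" -v b="$2" 'BEGIN {
        ta = mktime(a)               # both datespecs -> seconds since the epoch
        tb = mktime(b)
        if (ta == -1 || tb == -1) {  # -1 means mktime() could not parse the input
            print "bad datespec"
            exit 1
        }
        printf "%.2f\n", (tb - ta) / (60 * 60 * 24)
    }'
}

days_between "2010 01 01 00 00 00" "2012 06 03 12 00 00"
```

For the two dates shown this should print 884.50. Using systime() in place of one of the mktime() calls gives the distance from now.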

Geoff Clare

Jun 5, 2012, 9:18:21 AM
Ed Morton wrote:

> On 6/1/2012 12:44 PM, Kenny McCormack wrote:
>>
>> On a more general note, I think what you're looking for is strptime(), which
>> is generally described as being the "inverse of strftime()". Unfortunately,
>> there doesn't exist any (direct, user-accessible) interface to it in GAWK.
>
> Even strptime() couldn't handle the OP's time format though, as it requires
> non-alphanumeric characters between time specifiers.

The latest (2008) POSIX revision updated strptime() to take field
widths in the conversion specifiers, so non-alphanumeric field
separators are no longer required (or won't be once implementations
of the new spec strptime() become common).

--
Geoff Clare <net...@gclare.org.uk>

Bryan

Aug 5, 2016, 3:24:27 PM
On Friday, June 1, 2012 at 4:17:55 PM UTC-4, Ed Morton wrote:
[trimmed messages....]
Sorry if digging up old threads is frowned upon, but I've been working on this and I'm getting this output:

awk -f strptime_01.awk strptime_01.dat
6 Dec 2001 12:33:45 %d %b %Y %H:%M:%S -> 12/31/69 (18:59:59)
6 dec 2001 %d %b %Y -> 12/31/69 (18:59:59)
06 12 2001 %d %m %Y -> 12/31/69 (18:59:59)

... using gawk, nawk, awk 4.1.3. I wouldn't otherwise bring this up but some other things I'm working on also give 12/31/69 (or 31 Dec 1969, etc.) output - so I think a helpful pattern might open up for me...

Kees Nuyt

Aug 6, 2016, 10:16:35 AM
Is your input file TAB delimited ?

--
Kees Nuyt

Kenny McCormack

Aug 6, 2016, 11:14:42 AM
In article <d45aae42-e039-4157...@googlegroups.com>,
Bryan <bryan...@gmail.com> wrote:
>On Friday, June 1, 2012 at 4:17:55 PM UTC-4, Ed Morton wrote:
>[trimmed messages....]
>>
>> That got me wondering how hard it'd be to write a simple, stripped down
>> strptime() that'd work for most applications. Here's what I came up with
>> FWIW:
...
>sorry if digging up old threads is frowned upon, but I've been working on
>this and I'm getting output:
>
>awk -f strptime_01.awk strptime_01.dat
>6 Dec 2001 12:33:45 %d %b %Y %H:%M:%S -> 12/31/69 (18:59:59)
>6 dec 2001 %d %b %Y -> 12/31/69 (18:59:59)
>06 12 2001 %d %m %Y -> 12/31/69 (18:59:59)

First, the time that you are seeing (12/31/69 18:59:59) is what you get if
you pass -1 to strftime(). That's a commonly observed indicator that
something is wrong with whatever you are using to generate a time value.

Second, I have no idea what Ed's code is doing, nor do I care. It seems
unlikely for arbitrary user-written code to get something as tricky as time
conversion correct. There are a lot of funny corner cases...

Third, here's what I get when I run your file (after making sure that there
is a tab character between the two fields in the file):

$ gawk4 -l strptime '-F\t' '{ print $1 " -> " strftime( "%D (%T)",strptime($1,$2)) }' testfile
6 Dec 2001 12:33:45 -> 12/06/01 (12:33:45)
6 dec 2001 -> 12/06/01 (00:00:00)
06 12 2001 -> 12/06/01 (00:00:00)
$

You can download the strptime extension from:

http://shell.xmission.com:PORT/strptime.zip

(where "PORT" is 65401)

--
"There's no chance that the iPhone is going to get any significant market share. No chance." - Steve Ballmer

Ed Morton

Aug 6, 2016, 12:17:50 PM
I guarantee it's not since `$1` contains the whole line ;-).
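
A quick way to check, as a sketch: split a line on tabs only and look at what lands in each field. count_tab_fields is a made-up helper name, and the two printf lines are stand-ins for one line of the data file typed with spaces and with a real tab.

```shell
# Hypothetical helper: show how awk splits its input when FS is a tab.
count_tab_fields() {
    awk -F'\t' '{ printf "NF=%d  $1=[%s]  $2=[%s]\n", NF, $1, $2 }'
}

printf '6 dec 2001 %%d %%b %%Y\n'  | count_tab_fields   # spaces only
printf '6 dec 2001\t%%d %%b %%Y\n' | count_tab_fields   # real tab between the fields
```

The first call should report NF=1 with the whole line in $1 and an empty $2, so strptime($1,$2) receives an empty format string, which fits the 12/31/69 output above. The second should report NF=2 with the date and the format in separate fields.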

Ed.

Kees Nuyt

Aug 6, 2016, 6:16:51 PM
So do I. I just tried to make him think in the right direction
instead of spoon-feeding the whole issue.

--
Kees Nuyt

Bryan

Aug 6, 2016, 9:38:08 PM
thank you for the suggestions.

looks like I have to figure out how to compile awk with those strptime files ...

Kenny McCormack

Aug 6, 2016, 9:54:19 PM
In article <60e2d58b-4416-44cb...@googlegroups.com>,
Bryan <bryan...@gmail.com> wrote:
>thank you for the suggestions.
>
>looks like I have to figure out how to compile awk with those strptime files ...

In theory, you don't have to compile GAWK itself to make use of an
extension library. I.e., you just have to compile the lib and you can use
it with your existing GAWK executable.

However, you do need (at least) gawkapi.h, which comes as part of the main
GAWK tarball. Since I've always compiled GAWK from source for my use, it
has always "just worked" for me. I have often wondered whether it actually
would be possible to compile one of my libs on a system on which I hadn't
already built GAWK from source.

I know that in the first version of the extension library functionality, it
wasn't possible to do this, but I think one of the goals of the new
extension library system was to make it possible. I expect one of the more
knowledgeable GAWK developers to chime in at some point and educate me on
these matters.

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain in
compliance with said RFCs, the actual sig can be found at the following web address:
http://www.xmission.com/~gazelle/Sigs/RoyDeLoon