
Get min and max dates


DFS

Dec 7, 2016, 11:15:48 AM
dts= ['10-Mar-1998',
'20-Aug-1997',
'06-Sep-2009',
'23-Jan-2010',
'12-Feb-2010',
'05-Nov-2010',
'03-Sep-2009',
'07-Nov-2014',
'08-Mar-2013']

Of course, the naive:
min(dts) = '03-Sep-2009'
max(dts) = '23-Jan-2010'
is wrong.

Not wanting to use any date parsing libraries, I came up with:
========================================================================
m=[('Dec','12'),('Nov','11'),('Oct','10'),('Sep','09'),
('Aug','08'),('Jul','07'),('Jun','06'),('May','05'),
('Apr','04'),('Mar','03'),('Feb','02'),('Jan','01')]

#create new list with properly sortable date (YYYYMMDD)
dts2 = []
for d in dts:
    dts2.append((d[-4:]+dict(m)[d[3:6]]+d[:2],d))

print 'min: ' + min(dts2)[1]
print 'max: ' + max(dts2)[1]
========================================================================
$python getminmax.py
min: 20-Aug-1997
max: 07-Nov-2014

which is correct, but I sense a more pythonic way, or a one-liner list
comprehension, is in there somewhere.

Any ideas? Thanks

Steven D'Aprano

Dec 8, 2016, 12:17:04 AM
On Thursday 08 December 2016 03:15, DFS wrote:

> dts= ['10-Mar-1998',
> '20-Aug-1997',
> '06-Sep-2009',
> '23-Jan-2010',
> '12-Feb-2010',
> '05-Nov-2010',
> '03-Sep-2009',
> '07-Nov-2014',
> '08-Mar-2013']
>
> Of course, the naive:
> min(dates) = '03-Sep-2009'
> max(dates) = '23-Jan-2010'
> is wrong.
>
> Not wanting to use any date parsing libraries, I came up with:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

That's where you have gone wrong. By not using a date parsing library which
works and has been tested, you have to write your own dodgy and possibly buggy
parsing routine yourself.

Why reinvent the wheel?


> ========================================================================
> m=[('Dec','12'),('Nov','11'),('Oct','10'),('Sep','09'),
> ('Aug','08'),('Jul','07'),('Jun','06'),('May','05'),
> ('Apr','04'),('Mar','03'),('Feb','02'),('Jan','01')]
>
> #create new list with properly sortable date (YYYYMMDD)
> dts2 = []
> for d in dts:
>     dts2.append((d[-4:]+dict(m)[d[3:6]]+d[:2],d))


Why do you convert m into a dict each and every time through the loop? If you
have a million items, you convert m to a dict a million times, and then throw
away the dict a million times. And then you'll complain that Python is slow...

m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}

There you go, fixed that.

As for the rest, I haven't bothered to check your code. But here's a much
simpler, more Pythonic, way:

import time

def date_to_seconds(string):
    return time.mktime(time.strptime(string, '%d-%b-%Y'))


min(dts, key=date_to_seconds) # returns '20-Aug-1997'
max(dts, key=date_to_seconds) # returns '07-Nov-2014'


Works for at least Python 3.3 or better.

The best part of this is that you can now support any time format you like,
just by changing the format string '%d-%b-%Y'.
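
As a rough illustration of that flexibility, here is a minimal Python 3 sketch in which the format string becomes a parameter. The helper name make_date_key and the ISO-style sample list are made up for this example, and it assumes every date falls within the range the platform's mktime accepts.

```python
import time

def make_date_key(fmt):
    """Return a key function that parses strings using the given format."""
    def date_to_seconds(string):
        return time.mktime(time.strptime(string, fmt))
    return date_to_seconds

dts = ['10-Mar-1998', '20-Aug-1997', '07-Nov-2014']
iso_dts = ['1998-03-10', '1997-08-20', '2014-11-07']  # hypothetical sample data

# Same min/max logic, only the format string differs.
print(min(dts, key=make_date_key('%d-%b-%Y')))
print(max(iso_dts, key=make_date_key('%Y-%m-%d')))
```

Note that '%b' matches month abbreviations in the current locale, so the first call assumes an English or C locale.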



--
Steven
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." - Jon Ronson

Peter Heitzer

Dec 8, 2016, 4:23:58 AM
I'd use strptime from the time module.

Then you could write
dts2.append(strptime(d,'%d-%b-%Y'))
min and max then return a struct_time that can easily be converted back to
the original date format.
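
Spelled out in full, that idea might look like the following Python 3 sketch. The names FMT, earliest and latest are only illustrative, and it relies on struct_time values comparing field by field with the year first.

```python
from time import strptime, strftime

FMT = '%d-%b-%Y'

dts = ['10-Mar-1998', '20-Aug-1997', '06-Sep-2009',
       '23-Jan-2010', '12-Feb-2010', '05-Nov-2010',
       '03-Sep-2009', '07-Nov-2014', '08-Mar-2013']

# Parse every string into a struct_time; these compare like tuples
# (year, month, day, ...), so min() and max() order them chronologically.
dts2 = [strptime(d, FMT) for d in dts]

# Convert the winners back to the original text format.
earliest = strftime(FMT, min(dts2))
latest = strftime(FMT, max(dts2))
print(earliest, latest)
```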

--
Dipl.-Inform(FH) Peter Heitzer, peter....@rz.uni-regensburg.de

DFS

Dec 8, 2016, 8:47:31 AM
On 12/8/2016 12:16 AM, Steven D'Aprano wrote:
> On Thursday 08 December 2016 03:15, DFS wrote:
>
>> dts= ['10-Mar-1998',
>> '20-Aug-1997',
>> '06-Sep-2009',
>> '23-Jan-2010',
>> '12-Feb-2010',
>> '05-Nov-2010',
>> '03-Sep-2009',
>> '07-Nov-2014',
>> '08-Mar-2013']
>>
>> Of course, the naive:
>> min(dates) = '03-Sep-2009'
>> max(dates) = '23-Jan-2010'
>> is wrong.
>>
>> Not wanting to use any date parsing libraries, I came up with:
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> That's where you have gone wrong. By not using a date parsing library which
> works and has been tested, you have to write your own dodgy and possibly buggy
> parsing routine yourself.
>
> Why reinvent the wheel?


Because the "wheel" is a pain in the ass.

--------------------------------------------------------------
import time
dts=['10-Mar-1908','20-Aug-1937','06-Sep-1969','23-Jan-1952']
def date_to_seconds(string):
    return time.mktime(time.strptime(string, '%d-%b-%Y'))
print min(dts, key=date_to_seconds)
--------------------------------------------------------------

OverflowError: mktime argument out of range



>> ========================================================================
>> m=[('Dec','12'),('Nov','11'),('Oct','10'),('Sep','09'),
>> ('Aug','08'),('Jul','07'),('Jun','06'),('May','05'),
>> ('Apr','04'),('Mar','03'),('Feb','02'),('Jan','01')]
>>
>> #create new list with properly sortable date (YYYYMMDD)
>> dts2 = []
>> for d in dts:
>>     dts2.append((d[-4:]+dict(m)[d[3:6]]+d[:2],d))
>
>
> Why do you convert m into a dict each and every time through the loop? If you
> have a million items, you convert m to a dict a million times, and then throw
> away the dict a million times. And then you'll complain that Python is slow...


Python IS slow (for some things anyway).

But in this case, the list->dict conversion isn't; 1 million of them
takes just under 2 seconds.

---------------------------------------------------------------------------------
import sys,time,random

#lists
d= ['01','02','03','04','05','06','07','08','09','10']
d+=['11','12','13','14','15','16','17','18','19','20']
d+=['21','22','23','24','25','26','27','28']
m=['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']
y,dt=[],[]
for yr in range(1900,2001):y.append(yr)


loops=int(sys.argv[1])

#populate list of random dates in format 'dd-MMM-yyyy'
start=time.clock()
for i in range(loops):
    rd=random.choice(d)
    rm=random.choice(m)
    ry=random.choice(y)
    dt.append(rd+'-'+rm+'-'+str(ry))
print "Build list of %s random dates: %.2g sec" %(len(dt),time.clock()-start)
print


#create sortable 2nd list: convert list to dict
ml=[('Dec','12'),('Nov','11'),('Oct','10'),('Sep','09'),('Aug','08'),('Jul','07')]
ml+=[('Jun','06'),('May','05'),('Apr','04'),('Mar','03'),('Feb','02'),('Jan','01')]
start=time.clock()
dts=[]
for d in dt:
    dts.append((d[-4:]+dict(ml)[d[3:6]]+d[:2],d))
print "With list to dict conversion: %.2g sec" %(time.clock()-start)
print min(dts)[1], max(dts)[1]
print


#create sortable 2nd list: use dict
md={'Jan':'01','Feb':'02','Mar':'03','Apr':'04','May':'05','Jun':'06',
'Jul':'07','Aug':'08','Sep':'09','Oct':'10','Nov':'11','Dec':'12'}
start=time.clock()
dts=[]
for d in dt:
    dts.append((d[-4:]+md[d[3:6]]+d[:2],d))
print "No list to dict conversion: %.2g sec" %(time.clock()-start)
print min(dts)[1], max(dts)[1]
---------------------------------------------------------------------------------

$ python temp.py 1000000
Build list of 1000000 random dates: 3.9 sec

With list to dict conversion: 2.7 sec
01-Jan-1900 28-Dec-2000

No list to dict conversion: 0.93 sec
01-Jan-1900 28-Dec-2000
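
For anyone repeating this comparison on Python 3, where time.clock was removed in 3.8, a shorter version using timeit could look like the sketch below. The function names and the 100,000-item sample size are arbitrary choices for this example, and the timings will vary by machine.

```python
import random
import timeit

MONTH_PAIRS = [('Jan', '01'), ('Feb', '02'), ('Mar', '03'), ('Apr', '04'),
               ('May', '05'), ('Jun', '06'), ('Jul', '07'), ('Aug', '08'),
               ('Sep', '09'), ('Oct', '10'), ('Nov', '11'), ('Dec', '12')]
MONTH_DICT = dict(MONTH_PAIRS)  # built once, outside any loop

random.seed(0)  # fixed seed so repeated runs use the same dates
dates = ['%02d-%s-%d' % (random.randint(1, 28),
                         random.choice(MONTH_PAIRS)[0],
                         random.randint(1900, 2000))
         for _ in range(100000)]

def with_conversion():
    # Rebuilds the dict from the list of pairs for every single date.
    return [(d[-4:] + dict(MONTH_PAIRS)[d[3:6]] + d[:2], d) for d in dates]

def without_conversion():
    # Looks the month up in the dict that was built once.
    return [(d[-4:] + MONTH_DICT[d[3:6]] + d[:2], d) for d in dates]

for func in (with_conversion, without_conversion):
    seconds = timeit.timeit(func, number=1)
    print('%s: %.3f sec' % (func.__name__, seconds))
```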




> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
> 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
>
> There you go, fixed that.

Broke it: "TypeError: cannot concatenate 'str' and 'int' objects"

This is what I need:
m = {'Jan':'01','Feb':'02','Mar':'03','Apr':'04','May':'05','Jun':'06',
'Jul':'07','Aug':'08','Sep':'09','Oct':'10','Nov':'11','Dec':'12'}
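
If the integer mapping is kept instead, one possible way around the TypeError is to compare tuples rather than concatenated strings, which also gives the one-liner asked for in the first post. This is only a sketch in Python 3; the helper name ymd is made up here.

```python
m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
     'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}

dts = ['10-Mar-1998', '20-Aug-1997', '06-Sep-2009',
       '23-Jan-2010', '12-Feb-2010', '05-Nov-2010',
       '03-Sep-2009', '07-Nov-2014', '08-Mar-2013']

def ymd(d):
    """Turn 'dd-Mon-yyyy' into a sortable (year, month, day) tuple of ints."""
    return (int(d[-4:]), m[d[3:6]], int(d[:2]))

# No second list is needed: the key is computed per item and the
# original string is what min() and max() return.
print(min(dts, key=ymd))  # 20-Aug-1997
print(max(dts, key=ymd))  # 07-Nov-2014
```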



> As for the rest, I haven't bothered to check your code. But here's a much
> simpler, more Pythonic, way:
>
> def date_to_seconds(string):
>     return time.mktime(time.strptime(string, '%d-%b-%Y'))
>
>
> min(dts, key=date_to_seconds) # returns '20-Aug-1997'
> max(dts, key=date_to_seconds) # returns '07-Nov-2014'
>
>
> Works for at least Python 3.3 or better.
>
> The best part of this is that you can now support any time format you like,
> just by changing the format string '%d-%b-%Y'.


I like that flexibility, but mktime is apparently useless for dates
prior to 'the epoch'.

can...@gmail.com

Dec 8, 2016, 11:30:26 AM
On Thursday, 8 December 2016 14:47:31 UTC+1, DFS wrote:
> On 12/8/2016 12:16 AM, Steven D'Aprano wrote:
> > On Thursday 08 December 2016 03:15, DFS wrote:
> >
> >> dts= ['10-Mar-1998',
> >> '20-Aug-1997',
> >> '06-Sep-2009',
> >> '23-Jan-2010',
> >> '12-Feb-2010',
> >> '05-Nov-2010',
> >> '03-Sep-2009',
> >> '07-Nov-2014',
> >> '08-Mar-2013']
> >>
> >> Of course, the naive:
> >> min(dates) = '03-Sep-2009'
> >> max(dates) = '23-Jan-2010'
> >> is wrong.
> >>
> >> Not wanting to use any date parsing libraries, I came up with:
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > That's where you have gone wrong. By not using a date parsing library which
> > works and has been tested, you have to write your own dodgy and possibly buggy
> > parsing routine yourself.
> >
> > Why reinvent the wheel?
>
>
> Because the "wheel" is a pain in the ass.
>

Why? And why do you use this wording?

> --------------------------------------------------------------
> import time
> dts=['10-Mar-1908','20-Aug-1937','06-Sep-1969','23-Jan-1952']
> def date_to_seconds(string):
>     return time.mktime(time.strptime(string, '%d-%b-%Y'))
> print min(dts, key=date_to_seconds)
> --------------------------------------------------------------
>
> OverflowError: mktime argument out of range
>
>

(snip)

>
> I like that flexibility, but mktime is apparently useless for dates
> prior to 'the epoch'.

With a little more experience in Python programming you should have discovered that time.mktime is not even required to do your calculations.

Please remove time.mktime from the function date_to_seconds and rename the function to date_to_timestruct.
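
A minimal Python 3 sketch of that change, using the pre-epoch sample dates from the failing example, might be:

```python
import time

def date_to_timestruct(string):
    # No mktime call: the struct_time itself is the sort key, and it
    # compares year first, then month, then day.
    return time.strptime(string, '%d-%b-%Y')

dts = ['10-Mar-1908', '20-Aug-1937', '06-Sep-1969', '23-Jan-1952']

print(min(dts, key=date_to_timestruct))  # 10-Mar-1908
print(max(dts, key=date_to_timestruct))  # 06-Sep-1969
```

Because nothing is converted to seconds since the epoch, the platform's mktime range limit does not come into play.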

-- Paolo

Skip Montanaro

Dec 8, 2016, 1:01:38 PM
Datetime has greater range I believe. I can't paste from my work computer,
but try this:

min(datetime.datetime.strptime(s, "%d-%b-%Y") for s in dts)

You should get the 1908 date instead of the 1969 date.

In general, you should always use datetime instead of time for doing date
manipulation and date arithmetic. It's a *whole lot better*.
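
A slightly fuller sketch of the same idea in Python 3, using a key function so the original strings come back and adding one piece of date arithmetic. The names FMT, parse, oldest, newest and span are just for illustration.

```python
from datetime import datetime

FMT = "%d-%b-%Y"
dts = ['10-Mar-1908', '20-Aug-1937', '06-Sep-1969', '23-Jan-1952']

def parse(s):
    """Parse a 'dd-Mon-yyyy' string into a datetime object."""
    return datetime.strptime(s, FMT)

# key= keeps the original strings as the results.
oldest = min(dts, key=parse)
newest = max(dts, key=parse)

# Subtracting two datetimes yields a timedelta.
span = parse(newest) - parse(oldest)
print(oldest, newest, span.days)
```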

Skip

Cousin Stanley

Dec 8, 2016, 2:49:17 PM
DFS wrote:

> ....
> Not wanting to use any date parsing libraries,
> ....

If you happen to reconsider date parsing libraries
the strptime function from the datetime module
might be useful ....

#!/usr/bin/env python3

from datetime import datetime

dates = [ '10-Mar-1998' ,
'20-Aug-1997' ,
'06-Sep-2009' ,
'23-Jan-2010' ,
'12-Feb-2010' ,
'05-Nov-2010' ,
'03-Sep-2009' ,
'07-Nov-2014' ,
'08-Mar-2013' ]

dict_dates = { }

print( )

for this_date in dates :

    dt = datetime.strptime( this_date , '%d-%b-%Y' )

    sd = dt.strftime( '%Y-%m-%d' )

    print( ' {0} .... {1}'.format( this_date , sd ) )

    dict_dates[ sd ] = this_date


min_date = min( dict_dates.keys() )

max_date = max( dict_dates.keys() )

print( '\n {0} .... {1} min'.format( dict_dates[ min_date ] , min_date ) )

print( '\n {0} .... {1} max'.format( dict_dates[ max_date ] , max_date ) )

--
Stanley C. Kitching
Human Being
Phoenix, Arizona
