strptime performance


George Trojan

Nov 3, 2003, 6:24:58 PM
Is time.strptime() intrinsically slow and should be avoided whenever
possible? I have the following code in my application:

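import time

def string2time(s):
    # Converts a string in %y%m%d%H%M format to Unix time
    y = int(s[:2]) + 2000  # two-digit year, taken to mean 20xx
    m = int(s[2:4])
    d = int(s[4:6])
    H = int(s[6:8])
    M = int(s[8:10])
    # Seconds, weekday, day of year and the DST flag are all passed as 0
    return time.mktime((y, m, d, H, M, 0, 0, 0, 0))
    # return time.mktime(time.strptime(s[:10], '%y%m%d%H%M'))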

Output from pstats is:

ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
  2215    0.030    0.000    0.030    0.000  /awips/adapt/avnfps/CE/py/Avn.py:41(string2time)

When I use the commented-out line instead, the code is about 60 times
slower per call:

  1456    0.080    0.000    1.270    0.001  /awips/adapt/avnfps/CE/py/Avn.py:41(string2time)
...
  1456    0.020    0.000    0.770    0.001  /usr/local/python2.3/lib/python2.3/_strptime.py:396(compile)
  1456    0.360    0.000    0.720    0.000  /usr/local/python2.3/lib/python2.3/_strptime.py:374(pattern)

George

John Roth

Nov 3, 2003, 7:10:17 PM

"George Trojan" <george...@noaa.gov> wrote in message
news:bo6oaf$vfq$1...@news.nems.noaa.gov...

> Is time.strptime() intrinsically slow and should be avoided whenever
> possible? I have the following code in my application:

According to the "what's new in Python" for 2.3, the strptime
implementation was switched from a lightweight wrapper
around the frequently buggy and incompatible C library
to a portable pure Python implementation.

Yes, it's going to be a lot slower.

John Roth


Delaney, Timothy C (Timothy)

Nov 3, 2003, 9:03:32 PM
to pytho...@python.org
> From: John Roth [mailto:newsg...@jhrothjr.com]

Note also in 2.3.1 (and later) ...

"Caching in _strptime.py has been re-introduced. This leads to a large performance boost at the cost of not being thread-safe from locale changes while executing time.strptime()".

> Yes, it's going to be a lot slower.

Now that's an assumption you shouldn't be making until you've timed it.

Tim Delaney
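
One way to act on the suggestion to time it is the standard timeit module. The sketch below is written for current Python 3 rather than the 2.3 discussed in this thread. The names manual_string2time, strptime_string2time and compare are made up for illustration, and the manual version passes -1 for the DST flag (the original post passes 0) so that both functions return the same value in any time zone.

```python
import time
import timeit

FMT = "%y%m%d%H%M"


def manual_string2time(s):
    """Slice the fields out by hand, as in the original post."""
    fields = (int(s[:2]) + 2000, int(s[2:4]), int(s[4:6]),
              int(s[6:8]), int(s[8:10]))
    # Assumption: -1 lets mktime decide about DST; the original post passes 0.
    return time.mktime(fields + (0, 0, 0, -1))


def strptime_string2time(s):
    """Let the library parse the same ten characters."""
    return time.mktime(time.strptime(s[:10], FMT))


def compare(sample="0311031824", number=2000):
    """Return the total seconds spent in `number` calls of each variant."""
    return {
        "manual": timeit.timeit(lambda: manual_string2time(sample), number=number),
        "strptime": timeit.timeit(lambda: strptime_string2time(sample), number=number),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name:10s} {seconds:.4f} s")
```

The ratio between the two numbers will depend on the Python version and machine, so it is worth running on the target system rather than relying on figures from another release.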

John Roth

Nov 4, 2003, 6:28:57 AM

"Delaney, Timothy C (Timothy)" <tdel...@avaya.com> wrote in message
news:mailman.407.1067911...@python.org...

[John Roth]
Let me put it this way. If a pure Python implementation is even
close to a C language implementation, then the C language
implementation has to be truly awful (as in awe-inspiringly bad).
As you note above, caching was (re)introduced to get a
performance boost. Sometimes you can make a performance
statement without measurement with a high expectation of being
right.

John Roth

> Tim Delaney
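
The profile in the first post shows most of the time going to the compile and pattern steps in _strptime.py, which is the kind of work a cache avoids repeating. The following is only a toy illustration of that idea, not the actual _strptime code: the directive table, compile_format and parse are hypothetical names, and only the five directives used in this thread are supported.

```python
import re
from functools import lru_cache

# Hypothetical, heavily simplified directive table. The real module
# supports many more directives, including locale-dependent ones.
_DIRECTIVES = {
    "y": r"(?P<y>\d{2})",
    "m": r"(?P<m>\d{2})",
    "d": r"(?P<d>\d{2})",
    "H": r"(?P<H>\d{2})",
    "M": r"(?P<M>\d{2})",
}


@lru_cache(maxsize=None)
def compile_format(fmt):
    """Translate a format string into a compiled regex, once per format."""
    parts = []
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            # Unknown directives raise KeyError in this sketch.
            parts.append(_DIRECTIVES[fmt[i + 1]])
            i += 2
        else:
            parts.append(re.escape(fmt[i]))
            i += 1
    return re.compile("".join(parts))


def parse(s, fmt):
    """Return the parsed fields as a dict of ints."""
    match = compile_format(fmt).fullmatch(s)
    if match is None:
        raise ValueError(f"{s!r} does not match format {fmt!r}")
    return {name: int(value) for name, value in match.groupdict().items()}
```

Repeated calls with the same format reuse the compiled pattern. A cache keyed only on the format string also shows where the thread-safety caveat quoted above comes from: if the translation depended on the locale, a locale change would leave stale entries behind.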


P...@draigbrady.com

Nov 4, 2003, 6:32:04 AM

This duplication of logic looks like a trend?

For example, locale.format is very simplistic (in 2.2.2 at least).
It groups %s items as numbers:
    locale.format("%s",1234,1) -> '1,234'
and treats non-digit characters as part of the number:
    locale.format("%s\n",1234,1) -> '12,34\n'
This should be fixed, or changed to just take an
int and return a string, or better, to use the
glibc facility of the ' modifier, e.g. "%'d".
This applies to any decimal conversion (i,d,u,f,F,g,G).
You wouldn't have to use it directly; locale.format()
could just add a ' in the appropriate places.
Note that this is SUSv2, not just glibc.

Another example where glibc functionality should be used rather than
reimplemented in Python is the parsing of mo files.
Note that Solaris mo files are not currently supported anyway.

I know python has to be cross platform, but automatically setting
things up to use the system libraries where appropriate would
be a benefit to performance and functionality IMHO.

Pádraig.
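
For readers trying the grouping behaviour on a current interpreter: locale.format was deprecated in Python 3.7 and removed in 3.12, and the function to call is locale.format_string. The sketch below is a minimal illustration under that assumption; group_number is a made-up helper name, and the output depends entirely on the active locale.

```python
import locale


def group_number(value):
    """Format an integer using the active locale's digit grouping."""
    return locale.format_string("%d", value, grouping=True)


if __name__ == "__main__":
    # Adopt the user's environment locale; this can raise locale.Error
    # if the environment names a locale the system does not provide.
    locale.setlocale(locale.LC_ALL, "")
    print(group_number(1234567))
```

In the default "C" locale no separator is defined, so the digits come back ungrouped.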

John Roth

Nov 4, 2003, 7:44:36 AM

<P...@draigBrady.com> wrote in message news:3FA78E34...@draigBrady.com...

> John Roth wrote:
> > "George Trojan" <george...@noaa.gov> wrote in message
> > news:bo6oaf$vfq$1...@news.nems.noaa.gov...
> >
> >>Is time.strptime() intrinsically slow and should be avoided whenever
> >>possible? I have the following code in my application:
> >
> >
> > According to the "what's new in Python" for 2.3, the strptime
> > implementation was switched from a lightweight wrapper
> > around the frequently buggy and incompatible C library
> > to a portable pure Python implementation.
> >
> > Yes, it's going to be a lot slower.
>
> This duplication of logic looks like a trend?

It's only being done where the various implementations are,
as I mentioned in the original post, incomplete, inconsistent
and buggy. Python is, for better or worse, cross-platform,
and makes a reasonable attempt to run the same way
everywhere unless there is a clear and obvious
reason to do something different on a particular platform.

Another point to ponder here is that Python implementations
are favored, at least initially, because they are better defined
and more accessible to the entire user base than C implementations.
Also, they're one less thing for the PyPy project to reengineer
into Python. However, once it's demonstrated that it works
properly, I suspect the developers might be open to accepting
a C version, at least if it was supported by sufficient unit tests
to demonstrate it was the same as the pure Python version, and
it ran significantly faster.

> I know python has to be cross platform, but automatically setting
> things up to use the system libraries where appropriate would
> be a benefit to performance and functionality IMHO.

A serious question: How do you know where it's appropriate?
That's as much a political question as a technical one, and neither
the political nor the technical issue is easy to solve.

John Roth
>
> Pádraig.
>
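
The idea of accepting a faster implementation only once tests show it matches the pure Python one can be sketched as an equivalence check against a reference. This is an illustrative sketch for current Python 3, not anything from the CPython test suite: fast_fields, reference_fields, SAMPLES and mismatches are hypothetical names, with the hand-rolled slicing from the first post standing in for the "fast" version.

```python
import time

FMT = "%y%m%d%H%M"


def fast_fields(s):
    """Hand-rolled parser: the candidate fast implementation."""
    return (int(s[:2]) + 2000, int(s[2:4]), int(s[4:6]),
            int(s[6:8]), int(s[8:10]))


def reference_fields(s):
    """Library parser, treated here as the reference behaviour."""
    return tuple(time.strptime(s, FMT)[:5])


# Sample inputs, including a leap day and the ends of the 00-68 year range.
SAMPLES = ["0001010000", "0311031824", "0402291200", "6812312359"]


def mismatches(samples=SAMPLES):
    """Return the inputs on which the two implementations disagree."""
    return [s for s in samples if fast_fields(s) != reference_fields(s)]


if __name__ == "__main__":
    print(mismatches() or "implementations agree on all samples")
```

The samples deliberately stay within years 00 to 68. For 69 to 99 the %y directive maps to 19xx, so the hand-rolled version would diverge, which is exactly the kind of difference such a test is meant to surface.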

