Parsing ISO date/time strings - where did the parser go?

John Nagle

Sep 6, 2012, 3:27:19 PM
In Python 2.7:

I want to parse standard ISO date/time strings such as

2012-09-09T18:00:00-07:00

into Python "datetime" objects. The "datetime" object offers
an output method, datetimeobj.isoformat(), but not an input
parser. There ought to be

classmethod datetime.fromisoformat(s)

but there isn't. I'd like to avoid adding a dependency on
a third party module like "dateutil".

The "Working with time" section of the Python wiki is so
ancient it predates "datetime", and says so.

There's an iso8601 module on PyPi, but it's abandoned; it hasn't been
updated since 2007 and has many outstanding issues.

There are mentions of "xml.utils.iso8601.parse" in
various places, but the "xml" module that comes
with Python 2.7 doesn't have xml.utils.

http://www.seehuhn.de/pages/pdate
says:

"Unfortunately there is no easy way to parse full ISO 8601 dates using
the Python standard library."

It looks like this was taken out of "xml" at some point,
but not moved into "datetime".

John Nagle
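A note for readers on a later interpreter: Python 3.7 and newer do ship a datetime.fromisoformat() classmethod of the kind asked for above. It does not exist in Python 2.7, and it was introduced as the inverse of isoformat() rather than as a complete ISO 8601 parser, so the following is only a sketch that assumes Python 3.7+:

```python
from datetime import datetime, timedelta

# Assumes Python 3.7 or newer; datetime.fromisoformat() does not exist in 2.7.
stamp = "2012-09-09T18:00:00-07:00"
parsed = datetime.fromisoformat(stamp)

# The result is timezone-aware and carries the fixed -07:00 offset.
print(parsed, parsed.utcoffset() == timedelta(hours=-7))

# Formatting the parsed value gives back the original text.
print(parsed.isoformat() == stamp)
```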

Paul Rubin

Sep 6, 2012, 3:51:37 PM
John Nagle <na...@animats.com> writes:
> There's an iso8601 module on PyPi, but it's abandoned; it hasn't been
> updated since 2007 and has many outstanding issues.

Hmm, I have some code that uses ISO date/time strings and just checked
to see how I did it, and it looks like it uses iso8601-0.1.4-py2.6.egg .
I don't remember downloading that module (I must have done it and
forgotten). I'm not sure what its outstanding issues are, as it works
ok in the limited way I use it.

I agree that this functionality ought to be in the stdlib.

Thomas Jollans

Sep 6, 2012, 3:44:13 PM
to pytho...@python.org
On 09/06/2012 09:27 PM, John Nagle wrote:
> In Python 2.7:
>
> I want to parse standard ISO date/time strings such as
>
> 2012-09-09T18:00:00-07:00
>
> into Python "datetime" objects. The "datetime" object offers
> an output method, datetimeobj.isoformat(), but not an input
> parser. There ought to be
>
> classmethod datetime.fromisoformat(s)

http://docs.python.org/library/datetime.html#datetime.datetime.strptime

The ISO date/time format is dead simple and well-defined. strptime is
quite suitable.

Dave Angel

Sep 6, 2012, 3:54:15 PM
to John Nagle, pytho...@python.org
For working with datetime, see
http://docs.python.org/library/datetime.html#datetime.datetime

and look up datetime.strptime()

Likewise for generalized output, check out datetime.strftime().





--

DaveA

John Nagle

Sep 6, 2012, 4:34:20 PM
Yes, it should. There's no shortage of implementations.
PyPi has four. Each has some defect.

PyPi offers:

iso8601 0.1.4 Simple module to parse ISO 8601 dates
iso8601.py 0.1dev Parse utilities for iso8601 encoding.
iso8601plus 0.1.6 Simple module to parse ISO 8601 dates
zc.iso8601 0.2.0 ISO 8601 utility functions

Unlike CPAN, PyPi has no quality control.

Looking at the first one, it's in Google Code.

http://code.google.com/p/pyiso8601/source/browse/trunk/iso8601/iso8601.py

The first bug is at line 67. For a timestamp with a "Z"
at the end, the offset should always be zero, regardless of the default
timezone. See "http://en.wikipedia.org/wiki/ISO_8601".
The code uses the default time zone in that case, which is wrong.
So don't call that code with your local time zone as the default;
it will return bad times.

Looking at the second one, it's on github:

https://github.com/accellion/iso8601.py/blob/master/iso8601.py

Giant regular expressions! The code to handle the offset
is present, but it doesn't make the datetime object a
timezone-aware object. It returns a naive object in UTC.

The third one is at

https://github.com/jimklo/pyiso8601plus

This is a fork of the first one, because the first one is abandonware.
The bug in the first one, mentioned above, isn't fixed. However, if
a time zone is present, it does return an "aware" datetime object.

The fourth one is the Zope version. This brings in the pytz
module, which brings in the Olson database of named time zones and
their historical conversion data. None of that information is
used, or necessary, to parse ISO dates and times. Somebody
just wanted the pytz.FixedOffset() function, which does something
datetime already does.

(For all the people who keep saying "use strptime", that doesn't
handle time zone offsets at all.)

John Nagle
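The two defects described above (a trailing "Z" picking up the default zone, and offsets being parsed but a naive object returned) can be avoided in a few lines. The following is a minimal sketch, not code from any of the packages listed: parse_iso_with_offset is a hypothetical helper, it accepts only the YYYY-MM-DDTHH:MM:SS form followed by "Z" or a +hh:mm / -hh:mm offset, and it assumes Python 3.2+ for datetime.timezone (on 2.7 a small tzinfo subclass would be needed instead).

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical helper, not taken from any of the packages discussed above.
# Accepts only YYYY-MM-DDTHH:MM:SS followed by "Z" or +hh:mm / -hh:mm.
_ISO_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"   # date and time of day
    r"(Z|[+-]\d{2}:\d{2})"                     # mandatory zone designator
)

def parse_iso_with_offset(text):
    match = _ISO_RE.fullmatch(text)
    if match is None:
        raise ValueError("unsupported timestamp: %r" % (text,))
    base, zone = match.groups()
    # strptime handles the part it is good at: the date and time fields.
    naive = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    if zone == "Z":
        # "Z" always means a zero offset, whatever the caller's local zone is.
        offset = timedelta(0)
    else:
        sign = -1 if zone[0] == "-" else 1
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
    # Attach the offset so the result is an "aware" datetime.
    return naive.replace(tzinfo=timezone(offset))

a = parse_iso_with_offset("2012-09-09T18:00:00-07:00")
b = parse_iso_with_offset("2012-09-10T01:00:00Z")
print(a == b)  # compares the instants, not the wall-clock fields
```

Because both results are aware objects, the comparison works on the underlying instant; the two strings here name the same moment.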


Miki Tebeka

Sep 6, 2012, 7:27:49 PM
I'd look also into dateutil.parser.parse and feedparser._parse_date

Roy Smith

Sep 6, 2012, 7:34:22 PM
In article <k2atej$4rq$1...@dont-email.me>, John Nagle <na...@animats.com>
wrote:

> In Python 2.7:
>
> I want to parse standard ISO date/time strings such as
>
> 2012-09-09T18:00:00-07:00
>
> into Python "datetime" objects. The "datetime" object offers
> an output method, datetimeobj.isoformat(), but not an input
> parser. There ought to be
>
> classmethod datetime.fromisoformat(s)
>
> but there isn't. I'd like to avoid adding a dependency on
> a third party module like "dateutil".

I'm curious why? I really think dateutil is the way to go.

It's really amazing (and unfortunate) that datetime has isoformat(), but
no way to go in the other direction.

Roy Smith

Sep 6, 2012, 7:44:38 PM
In article <mailman.325.13469612...@python.org>,
Dave Angel <d...@davea.name> wrote:

> For working with datetime, see
> http://docs.python.org/library/datetime.html#datetime.datetime
>
> and look up datetime.strptime()

strptime has two problems.

One is that it's a pain to use (you have to look up all those
inscrutable %-thingies every time).

The second is that it doesn't always work. To correctly parse an
ISO-8601 string, you need '%z', which isn't supported on all platforms.

The third is that I never use methods I can't figure out how to
pronounce.

Terry Reedy

Sep 6, 2012, 10:12:35 PM
to pytho...@python.org
On 9/6/2012 3:44 PM, Thomas Jollans wrote:
> On 09/06/2012 09:27 PM, John Nagle wrote:
>> In Python 2.7:
>>
>> I want to parse standard ISO date/time strings such as
>>
>> 2012-09-09T18:00:00-07:00
>>
>> into Python "datetime" objects. The "datetime" object offers
>> an output method, datetimeobj.isoformat(), but not an input
>> parser. There ought to be
>>
>> classmethod datetime.fromisoformat(s)
>
> http://docs.python.org/library/datetime.html#datetime.datetime.strptime
>
> The ISO date/time format is dead simple and well-defined. strptime is
> quite suitable.

I do not see any example formats. An example for ISO might be a good one.

--
Terry Jan Reedy
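As an illustration of the kind of example requested here, a format string for the date and time part of the timestamp might look like the one below. This is a sketch only: ISO_NO_OFFSET_FORMAT is a name made up for this example, and the "-07:00" offset is deliberately left off, since offset handling in strptime is the sticking point raised elsewhere in this thread.

```python
from datetime import datetime

# Format for the date and time portion of an ISO 8601 timestamp.
# The name is hypothetical; the offset ("-07:00") is intentionally not covered.
ISO_NO_OFFSET_FORMAT = "%Y-%m-%dT%H:%M:%S"

naive = datetime.strptime("2012-09-09T18:00:00", ISO_NO_OFFSET_FORMAT)
print(naive)          # a naive datetime: no time zone information attached
print(naive.tzinfo)

# The same format string works in the other direction with strftime().
print(naive.strftime(ISO_NO_OFFSET_FORMAT))
```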

André Malo

Sep 8, 2012, 2:12:46 PM
* Roy Smith wrote:

> The third is that I never use methods I can't figure out how to
> pronounce.

here: strip'time

nd
--
Wrnog is always spelled wnorg on Usenet. If you don't spell
wrgno as wnrog, that is simply wrnog. On the other hand, you may
happily spell rihgt as rhigt, because a wnrog spelling of
rgiht is not regarded as wrogn. -- Hajo Pflüger in dnq

John Gleeson

Sep 8, 2012, 8:20:25 PM
to John Nagle, pytho...@python.org

On 2012-09-06, at 2:34 PM, John Nagle wrote:
> Yes, it should. There's no shortage of implementations.
> PyPi has four. Each has some defect.
>
> PyPi offers:
>
> iso8601 0.1.4 Simple module to parse ISO 8601 dates
> iso8601.py 0.1dev Parse utilities for iso8601 encoding.
> iso8601plus 0.1.6 Simple module to parse ISO 8601 dates
> zc.iso8601 0.2.0 ISO 8601 utility functions


Here are three more on PyPI you can try:

iso-8601 0.2.3 Flexible ISO 8601 parser...
PySO8601 0.1.7 PySO8601 aims to parse any ISO 8601 date...
isodate 0.4.8 An ISO 8601 date/time/duration parser and
formater

All three have been updated this year.

John Nagle

Sep 8, 2012, 11:51:03 PM
There's another one inside feedparser, and there used to be
one in the xml module.

Filed issue 15873: "datetime" cannot parse ISO 8601 dates and times
http://bugs.python.org/issue15873

This really should be handled in the standard library, instead of
everybody rolling their own, badly. Especially since in Python 3.x,
there's finally a useful "tzinfo" subclass for fixed time zone
offsets. That provides a way to directly represent ISO 8601 date/time
strings with offsets as "time zone aware" date time objects.

John Nagle
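The fixed-offset tzinfo subclass mentioned above is datetime.timezone (Python 3.2+). A short sketch of how it represents the example timestamp from the start of this thread as an aware object; the variable names are illustrative only:

```python
from datetime import datetime, timedelta, timezone

# datetime.timezone is the fixed-offset tzinfo subclass in Python 3 (3.2+).
minus_seven = timezone(timedelta(hours=-7))

aware = datetime(2012, 9, 9, 18, 0, 0, tzinfo=minus_seven)

# For an aware datetime, isoformat() includes the offset.
text = aware.isoformat()
print(text)

# Converting to UTC changes the wall-clock fields but not the instant.
as_utc = aware.astimezone(timezone.utc)
print(as_utc.isoformat(), as_utc == aware)
```

No named-zone database is involved; the offset is carried as a plain timedelta.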

Roy Smith

Sep 9, 2012, 6:15:38 AM
In article <k2h3n2$213$1...@dont-email.me>, John Nagle <na...@animats.com>
wrote:

> This really should be handled in the standard library, instead of
> everybody rolling their own, badly.

+1

Mark Lawrence

Sep 9, 2012, 7:20:44 AM
to pytho...@python.org
I'll second that given "There should be one-- and preferably only one
--obvious way to do it".

--
Cheers.

Mark Lawrence.

Roy Smith

Sep 9, 2012, 8:14:30 AM
In article <mailman.323.13469611...@python.org>,
Thomas Jollans <t...@jollybox.de> wrote:

> The ISO date/time format is dead simple and well-defined.

Well defined, perhaps. But nobody who has read the standard could call
it "dead simple". ISO-8601-2004(E) is 40 pages long.

Of course, the fact that it's complicated enough to generate 40 pages'
worth of standards document just argues that much more strongly for it
being in the standard lib (so there can be one canonical, well-tested,
way to do it).

Rhodri James

Sep 10, 2012, 5:46:19 PM
On Sun, 09 Sep 2012 13:14:30 +0100, Roy Smith <r...@panix.com> wrote:

> In article <mailman.323.13469611...@python.org>,
> Thomas Jollans <t...@jollybox.de> wrote:
>
>> The ISO date/time format is dead simple and well-defined.

> Well defined, perhaps. But nobody who has read the standard could call
> it "dead simple". ISO-8601-2004(E) is 40 pages long.

A short standard, then :-)

--
Rhodri James *-* Wildebeest Herder to the Masses

Chris Angelico

Sep 10, 2012, 6:51:02 PM
to pytho...@python.org
On Tue, Sep 11, 2012 at 7:46 AM, Rhodri James
<rho...@wildebst.demon.co.uk> wrote:
> On Sun, 09 Sep 2012 13:14:30 +0100, Roy Smith <r...@panix.com> wrote:
>
>> In article <mailman.323.13469611...@python.org>,
>> Thomas Jollans <t...@jollybox.de> wrote:
>>
>>> The ISO date/time format is dead simple and well-defined.
>
>
>> Well defined, perhaps. But nobody who has read the standard could call
>> it "dead simple". ISO-8601-2004(E) is 40 pages long.
>
>
> A short standard, then :-)

What is it that takes up forty pages? RFC 2822 describes a date/time
stamp in about two pages. In fact, the whole RFC describes the
Internet Message Format in not much more than 40 pages. Is
ISO-language just bloated?

*boggle*

ChrisA

Roy Smith

Sep 10, 2012, 9:12:09 PM

Ben Finney

Sep 11, 2012, 12:00:06 PM
Roy Smith <r...@panix.com> writes:

> In article <mailman.473.13473178...@python.org>,
> Chris Angelico <ros...@gmail.com> wrote:
> > What is it that takes up forty pages [for the ISO 8601
> > specification]? RFC 2822 describes a date/time stamp in about two
> > pages. In fact, the whole RFC describes the Internet Message Format
> > in not much more than 40 pages. Is ISO-language just bloated?
> >
> > *boggle*
>
> You can find a copy at http://dotat.at/tmp/ISO_8601-2004_E.pdf

In brief: ISO 8601 doesn't have the luxury of a single timestamp format.
It also must define its terms from a rather more fundamental starting
point than RFC 2822 can assume.

There's some bloat (5 of the 40 pages don't even show up in the table of
contents), but much of the content of the ISO 8601 standard is required,
to cover the ground intended in the level of detail intended.

Scope

This International Standard is applicable whenever representation of
dates in the Gregorian calendar, times in the 24-hour timekeeping
system, time intervals and recurring time intervals or of the
formats of these representations are included in information
interchange. It includes

* calendar dates expressed in terms of calendar year, calendar month
and calendar day of the month;

* ordinal dates expressed in terms of calendar year and calendar day
of the year;

* week dates expressed in terms of calendar year, calendar week number
and calendar day of the week;

* local time based upon the 24-hour timekeeping system;

* Coordinated Universal Time of day;

* local time and the difference from Coordinated Universal Time;

* combination of date and time of day;

* time intervals;

* recurring time intervals.

--
\ “First things first, but not necessarily in that order.” —The |
`\ Doctor, _Doctor Who_ |
_o__) |
Ben Finney
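To make the first three items of that scope list concrete, the same day can be written as a calendar date, an ordinal date, or a week date. A small sketch using only the standard library (the variable names are illustrative, and the week form is assembled by hand from isocalendar()):

```python
from datetime import date, datetime

day = date(2012, 9, 9)

# Calendar date: year, month, day of the month.
calendar_form = day.strftime("%Y-%m-%d")

# Ordinal date: year and day of the year.
ordinal_form = day.strftime("%Y-%j")

# Week date: ISO year, ISO week number, ISO weekday (Monday is 1).
iso_year, iso_week, iso_weekday = day.isocalendar()
week_form = "%04d-W%02d-%d" % (iso_year, iso_week, iso_weekday)

print(calendar_form, ordinal_form, week_form)

# strptime can read the ordinal form back into the same day.
round_trip = datetime.strptime(ordinal_form, "%Y-%j").date()
print(round_trip == day)
```

For 9 September 2012 the three forms come out as 2012-09-09, 2012-253 and 2012-W36-7, which is part of why a parser for "full" ISO 8601 is more work than a single format string.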

Pete Forman

Sep 12, 2012, 8:31:14 AM
John Nagle <na...@animats.com> writes:

> I want to parse standard ISO date/time strings such as
>
> 2012-09-09T18:00:00-07:00
>
> into Python "datetime" objects.

Consider whether RFC 3339 might be a more suitable format.

It is a subset of ISO 8601 extended format. Some of the restrictions are

Year must be 4 digits
Fraction separator is period, not comma
All components including time-offset are mandatory, except for time-secfrac
time-minute in time-offset is not optional, must use ±hh:mm or Z

Some latitude is allowed

T may be replaced by e.g. space

Extra feature

time-offset of -00:00 means UTC but the offset to local time is unknown

--
Pete Forman
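The restrictions listed above are narrow enough to fit in one regular expression. The following is a rough sketch under those stated rules only, not a complete RFC 3339 implementation: parse_rfc3339 is a hypothetical helper, it assumes Python 3.4+ (datetime.timezone and re fullmatch), it accepts only "T" or a space as the separator, and it does not handle leap seconds.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical sketch of the RFC 3339 restrictions listed above.
_RFC3339_RE = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"      # four-digit year, month, day
    r"[T ]"                         # "T", or a space as the allowed latitude
    r"(\d{2}):(\d{2}):(\d{2})"      # hour, minute and second are all mandatory
    r"(?:\.(\d+))?"                 # optional fraction, period only (no comma)
    r"(Z|[+-]\d{2}:\d{2})"          # mandatory offset: Z or +hh:mm / -hh:mm
)

def parse_rfc3339(text):
    """Return (aware_datetime, offset_known) for a timestamp string."""
    match = _RFC3339_RE.fullmatch(text)
    if match is None:
        raise ValueError("not an RFC 3339 timestamp: %r" % (text,))
    year, month, day, hour, minute, second, frac, zone = match.groups()
    # Keep at most microsecond precision from the fractional seconds.
    micro = int((frac or "0")[:6].ljust(6, "0"))
    if zone == "Z":
        offset = timedelta(0)
    else:
        sign = -1 if zone[0] == "-" else 1
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
    # "-00:00" marks a UTC time whose local offset is unknown.
    offset_known = zone != "-00:00"
    stamp = datetime(int(year), int(month), int(day),
                     int(hour), int(minute), int(second), micro,
                     tzinfo=timezone(offset))
    return stamp, offset_known

stamp, known = parse_rfc3339("2012-09-09 18:00:00.5-07:00")
print(stamp.isoformat(), known)
```

Returning the "offset known" flag separately is one possible design choice; a datetime object by itself cannot distinguish -00:00 from +00:00.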