Parsing ISO date/time strings - where did the parser go?

Parsing ISO date/time strings - where did the parser go? John Nagle 9/6/12 12:27 PM
In Python 2.7:

   I want to parse standard ISO date/time strings such as

        2012-09-09T18:00:00-07:00

into Python "datetime" objects.  The "datetime" object offers
an output method, datetimeobj.isoformat(), but not an input
parser.  There ought to be

        classmethod datetime.fromisoformat(s)

but there isn't.  I'd like to avoid adding a dependency on
a third party module like "dateutil".

The "Working with time" section of the Python wiki is so
ancient it predates "datetime", and says so.

There's an iso8601 module on PyPI, but it's abandoned; it hasn't been
updated since 2007 and has many outstanding issues.

There are mentions of "xml.utils.iso8601.parse" in
various places, but the "xml" module that comes
with Python 2.7 doesn't have xml.utils.

http://www.seehuhn.de/pages/pdate
says:

"Unfortunately there is no easy way to parse full ISO 8601 dates using
the Python standard library."

It looks like this was taken out of "xml" at some point,
but not moved into "datetime".

                                John Nagle
Re: Parsing ISO date/time strings - where did the parser go? Paul Rubin 9/6/12 12:51 PM
John Nagle <na...@animats.com> writes:
> There's an iso8601 module on PyPi, but it's abandoned; it hasn't been
> updated since 2007 and has many outstanding issues.

Hmm, I have some code that uses ISO date/time strings and just checked
to see how I did it, and it looks like it uses iso8601-0.1.4-py2.6.egg .
I don't remember downloading that module (I must have done it and
forgotten).  I'm not sure what its outstanding issues are, as it works
ok in the limited way I use it.

I agree that this functionality ought to be in the stdlib.
Re: Parsing ISO date/time strings - where did the parser go? Thomas Jollans 9/6/12 12:51 PM
On 09/06/2012 09:27 PM, John Nagle wrote:
> In Python 2.7:
>
>    I want to parse standard ISO date/time strings such as
>
>         2012-09-09T18:00:00-07:00
>
> into Python "datetime" objects.  The "datetime" object offers
> an output method , datetimeobj.isoformat(), but not an input
> parser.  There ought to be
>
>         classmethod datetime.fromisoformat(s)

http://docs.python.org/library/datetime.html#datetime.datetime.strptime

The ISO date/time format is dead simple and well-defined. strptime is
quite suitable.
Re: Parsing ISO date/time strings - where did the parser go? Dave Angel 9/6/12 12:55 PM
For working with datetime, see
    http://docs.python.org/library/datetime.html#datetime.datetime

and look up  datetime.strptime()

Likewise for generalized output, check out  datetime.strftime().





--

DaveA

Re: Parsing ISO date/time strings - where did the parser go? John Nagle 9/6/12 1:34 PM
   Yes, it should.  There's no shortage of implementations.
PyPI has four.  Each has some defect.

   PyPI offers:

        iso8601 0.1.4         Simple module to parse ISO 8601 dates
        iso8601.py 0.1dev     Parse utilities for iso8601 encoding.
        iso8601plus 0.1.6     Simple module to parse ISO 8601 dates
        zc.iso8601 0.2.0      ISO 8601 utility functions

Unlike CPAN, PyPI has no quality control.

Looking at the first one, it's in Google Code.

http://code.google.com/p/pyiso8601/source/browse/trunk/iso8601/iso8601.py

The first bug is at line 67.  For a timestamp with a "Z"
at the end, the offset should always be zero, regardless of the default
timezone.  See "http://en.wikipedia.org/wiki/ISO_8601".
The code uses the default time zone in that case, which is wrong.
So don't call that code with your local time zone as the default;
it will return bad times.
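
The rule described above can be sketched in a few lines.  This is only
an illustration, not code from any of the modules listed: the function
name offset_minutes and its default_minutes parameter are made up for
the example.  A "Z" suffix always yields zero, and the caller's default
is consulted only when the string carries no zone designator at all.

```python
def offset_minutes(suffix, default_minutes=None):
    """Map an ISO 8601 zone designator to minutes east of UTC.

    `suffix` is the trailing part of the timestamp: "Z", "+hh:mm",
    "-hh:mm", or "" when no designator is present (hypothetical helper).
    """
    if suffix == "Z":
        # "Z" means UTC, whatever default the caller supplied.
        return 0
    if not suffix:
        # Only a timestamp with no designator falls back to the default.
        return default_minutes
    sign = -1 if suffix[0] == "-" else 1
    hours, minutes = suffix[1:].split(":")
    return sign * (int(hours) * 60 + int(minutes))
```

With this shape, offset_minutes("Z", default_minutes=-420) is 0, while
offset_minutes("", default_minutes=-420) is -420.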

Looking at the second one, it's on github:

https://github.com/accellion/iso8601.py/blob/master/iso8601.py

Giant regular expressions!  The code to handle the offset
is present, but it doesn't make the datetime object a
timezone-aware object.  It returns a naive object in UTC.

The third one is at

https://github.com/jimklo/pyiso8601plus

This is a fork of the first one, because the first one is abandonware.
The bug in the first one, mentioned above, isn't fixed.  However, if
a time zone is present, it does return an "aware" datetime object.

The fourth one is the Zope version.  This brings in the pytz
module, which brings in the Olson database of named time zones and
their historical conversion data. None of that information is
used, or necessary, to parse ISO dates and times.  Somebody
just wanted the pytz.FixedOffset() function, which does something
datetime already does.

(For all the people who keep saying "use strptime", that doesn't
handle time zone offsets at all.)
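
Whether "%z" helps depends on the Python version.  The sketch below
assumes Python 3.2 or later, where strptime accepts "%z" for offsets
written without a colon (such as -0700); it is not a fix for 2.7.
The helper name parse_with_percent_z is invented for the example.  It
rewrites the ISO-style suffix into the colon-free form before handing
the string to strptime.

```python
import re
from datetime import datetime, timedelta

def parse_with_percent_z(text):
    """Parse 'YYYY-MM-DDTHH:MM:SS' plus 'Z' or '+hh:mm' / '-hh:mm'.

    Assumes Python 3.2+; the suffix is normalised to the '+hhmm' form
    so that the "%z" directive can consume it.
    """
    if text.endswith("Z"):
        text = text[:-1] + "+0000"
    else:
        # Drop the colon inside a trailing offset: -07:00 becomes -0700.
        text = re.sub(r"([+-]\d{2}):(\d{2})$", r"\1\2", text)
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%z")

stamp = parse_with_percent_z("2012-09-09T18:00:00-07:00")
assert stamp.utcoffset() == timedelta(hours=-7)
```

A string with no offset at all still raises ValueError here, since the
format insists on "%z".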

                                        John Nagle


Re: Parsing ISO date/time strings - where did the parser go? Miki Tebeka 9/6/12 4:27 PM
I'd look also into dateutil.parser.parse and feedparser._parse_date
Re: Parsing ISO date/time strings - where did the parser go? Roy Smith 9/6/12 4:34 PM
In article <k2atej$4rq$1...@dont-email.me>, John Nagle <na...@animats.com>
wrote:

> In Python 2.7:
>
>    I want to parse standard ISO date/time strings such as
>
>         2012-09-09T18:00:00-07:00
>
> into Python "datetime" objects.  The "datetime" object offers
> an output method , datetimeobj.isoformat(), but not an input
> parser.  There ought to be
>
>         classmethod datetime.fromisoformat(s)
>
> but there isn't.  I'd like to avoid adding a dependency on
> a third party module like "dateutil".

I'm curious why?  I really think dateutil is the way to go.

It's really amazing (and unfortunate) that datetime has isoformat(), but
no way to go in the other direction.
Re: Parsing ISO date/time strings - where did the parser go? Roy Smith 9/6/12 4:44 PM
In article <mailman.325.1346961282.27098.python-list@python.org>,
 Dave Angel <d...@davea.name> wrote:

> For working with datetime, see
>     http://docs.python.org/library/datetime.html#datetime.datetime
>
> and look up  datetime.strptime()

strptime has two problems.

One is that it's a pain to use (you have to look up all those
inscrutable %-thingies every time).

The second is that it doesn't always work.  To correctly parse an
ISO-8601 string, you need '%z', which isn't supported on all platforms.

The third is that I never use methods I can't figure out how to
pronounce.
Re: Parsing ISO date/time strings - where did the parser go? Terry Reedy 9/6/12 7:13 PM
On 9/6/2012 3:44 PM, Thomas Jollans wrote:
> On 09/06/2012 09:27 PM, John Nagle wrote:
>> In Python 2.7:
>>
>>     I want to parse standard ISO date/time strings such as
>>
>>         2012-09-09T18:00:00-07:00
>>
>> into Python "datetime" objects.  The "datetime" object offers
>> an output method , datetimeobj.isoformat(), but not an input
>> parser.  There ought to be
>>
>>         classmethod datetime.fromisoformat(s)
>
> http://docs.python.org/library/datetime.html#datetime.datetime.strptime
>
> The ISO date/time format is dead simple and well-defined. strptime is
> quite suitable.

I do not see any example formats. An example for ISO might be a good one.
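
One possible example, as a sketch: the format string below covers only
the date and time part of the ISO form.  The names ISO_NAIVE_FORMAT and
parse_naive_iso are invented for the illustration.  It also shows what
happens when the string carries an offset that the format does not
mention.

```python
from datetime import datetime

# Format for the date/time part only; it says nothing about the offset.
ISO_NAIVE_FORMAT = "%Y-%m-%dT%H:%M:%S"

def parse_naive_iso(text):
    """Return a naive datetime for strings like 2012-09-09T18:00:00."""
    return datetime.strptime(text, ISO_NAIVE_FORMAT)

stamp = parse_naive_iso("2012-09-09T18:00:00")
print(stamp.isoformat())  # 2012-09-09T18:00:00

try:
    parse_naive_iso("2012-09-09T18:00:00-07:00")
except ValueError:
    # The trailing "-07:00" is left over, so strptime refuses the string.
    print("offset not handled by this format")
```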

--
Terry Jan Reedy

Re: Parsing ISO date/time strings - where did the parser go? André Malo 9/8/12 11:12 AM
* Roy Smith wrote:

> The third is that I never use methods I can't figure out how to
> pronounce.

here: strip'time

nd
--
Wnrog is always spelled wrnog on Usenet. If you don't spell
wonrg wrogn, that is simply wrnog. By contrast, you may happily
spell rihgt as rgiht, because a wnorg spelling of
rhigt is not considered wrnog.       -- Hajo Pflüger in dnq
Re: Parsing ISO date/time strings - where did the parser go? John Gleeson 9/8/12 5:20 PM

On 2012-09-06, at 2:34 PM, John Nagle wrote:
>  Yes, it should.  There's no shortage of implementations.
> PyPi has four.  Each has some defect.
>
>   PyPi offers:
>
>         iso8601 0.1.4                 Simple module to parse ISO 8601 dates
>         iso8601.py 0.1dev         Parse utilities for iso8601 encoding.
>         iso8601plus 0.1.6         Simple module to parse ISO 8601 dates
>         zc.iso8601 0.2.0         ISO 8601 utility functions


Here are three more on PyPI you can try:

iso-8601 0.2.3     Flexible ISO 8601 parser...
PySO8601 0.1.7     PySO8601 aims to parse any ISO 8601 date...
isodate 0.4.8      An ISO 8601 date/time/duration parser and formatter

All three have been updated this year.
Re: Parsing ISO date/time strings - where did the parser go? John Nagle 9/8/12 8:50 PM
   There's another one inside feedparser, and there used to be
one in the xml module.

   Filed issue 15873: "datetime" cannot parse ISO 8601 dates and times
        http://bugs.python.org/issue15873

   This really should be handled in the standard library, instead of
everybody rolling their own, badly.  Especially since in Python 3.x,
there's finally a useful "tzinfo" subclass for fixed time zone
offsets.  That provides a way to directly represent ISO 8601 date/time
strings with offsets as "time zone aware" date time objects.
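
A rough sketch of what that could look like on Python 3, using
datetime.timezone for the fixed offset.  This is not the code attached
to the issue; parse_iso_aware is a made-up name, and the pattern accepts
only the single layout YYYY-MM-DDTHH:MM:SS followed by "Z" or an
"hh:mm" offset, which is a small slice of ISO 8601.

```python
import re
from datetime import datetime, timedelta, timezone

# One layout only: date, "T", time, then "Z" or a signed hh:mm offset.
_ISO_RE = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(Z|[+-]\d{2}:\d{2})$"
)

def parse_iso_aware(text):
    """Return a time-zone-aware datetime (hypothetical helper)."""
    match = _ISO_RE.match(text)
    if match is None:
        raise ValueError("unsupported timestamp: %r" % (text,))
    fields = [int(group) for group in match.groups()[:6]]
    suffix = match.group(7)
    if suffix == "Z":
        tz = timezone.utc
    else:
        sign = -1 if suffix[0] == "-" else 1
        delta = timedelta(hours=int(suffix[1:3]), minutes=int(suffix[4:6]))
        tz = timezone(sign * delta)
    return datetime(*fields, tzinfo=tz)

stamp = parse_iso_aware("2012-09-09T18:00:00-07:00")
print(stamp.isoformat())  # 2012-09-09T18:00:00-07:00
```

The result round-trips through isoformat(), which is the symmetry the
issue asks for.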

                                John Nagle
Re: Parsing ISO date/time strings - where did the parser go? Roy Smith 9/9/12 3:15 AM
In article <k2h3n2$213$1...@dont-email.me>, John Nagle <na...@animats.com>
wrote:

> This really should be handled in the standard library, instead of
> everybody rolling their own, badly.

+1
Re: Parsing ISO date/time strings - where did the parser go? Mark Lawrence 9/9/12 4:19 AM
I'll second that given "There should be one-- and preferably only one
--obvious way to do it".

--
Cheers.

Mark Lawrence.

Re: Parsing ISO date/time strings - where did the parser go? Roy Smith 9/9/12 5:14 AM
In article <mailman.323.1346961101.27098.python-list@python.org>,
 Thomas Jollans <t...@jollybox.de> wrote:

> The ISO date/time format is dead simple and well-defined.

Well defined, perhaps.  But nobody who has read the standard could call
it "dead simple".  ISO-8601-2004(E) is 40 pages long.

Of course, the fact that it's complicated enough to generate 40 pages
worth of standards document just argues that much more strongly for it
being in the standard lib (so there can be one canonical, well-tested
way to do it).
Re: Parsing ISO date/time strings - where did the parser go? Rhodri James 9/10/12 2:46 PM
On Sun, 09 Sep 2012 13:14:30 +0100, Roy Smith <r...@panix.com> wrote:

> In article <mailman.323.1346961101.27098.python-list@python.org>,
>  Thomas Jollans <t...@jollybox.de> wrote:
>
>> The ISO date/time format is dead simple and well-defined.

> Well defined, perhaps.  But nobody who has read the standard could call
> it "dead simple".  ISO-8601-2004(E) is 40 pages long.

A short standard, then :-)

--
Rhodri James *-* Wildebeest Herder to the Masses
Re: Parsing ISO date/time strings - where did the parser go? Chris Angelico 9/10/12 3:57 PM
On Tue, Sep 11, 2012 at 7:46 AM, Rhodri James
<rho...@wildebst.demon.co.uk> wrote:
> On Sun, 09 Sep 2012 13:14:30 +0100, Roy Smith <r...@panix.com> wrote:
>
>> In article <mailman.323.1346961101.27098.python-list@python.org>,
>>  Thomas Jollans <t...@jollybox.de> wrote:
>>
>>> The ISO date/time format is dead simple and well-defined.
>
>
>> Well defined, perhaps.  But nobody who has read the standard could call
>> it "dead simple".  ISO-8601-2004(E) is 40 pages long.
>
>
> A short standard, then :-)

What is it that takes up forty pages? RFC 2822 describes a date/time
stamp in about two pages. In fact, the whole RFC describes the
Internet Message Format in not much more than 40 pages. Is
ISO-language just bloated?

*boggle*

ChrisA
Re: Parsing ISO date/time strings - where did the parser go? Roy Smith 9/10/12 6:12 PM
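In article <mailman.473.1347317852.27098.python-list@python.org>,
 Chris Angelico <ros...@gmail.com> wrote:

> What is it that takes up forty pages? RFC 2822 describes a date/time
> stamp in about two pages. In fact, the whole RFC describes the
> Internet Message Format in not much more than 40 pages. Is
> ISO-language just bloated?
>
> *boggle*

You can find a copy at http://dotat.at/tmp/ISO_8601-2004_E.pdf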
Re: Parsing ISO date/time strings - where did the parser go? Ben Finney 9/11/12 9:00 AM
Roy Smith <r...@panix.com> writes:

> In article <mailman.473.1347317852.27098.python-list@python.org>,
>  Chris Angelico <ros...@gmail.com> wrote:
> > What is it that takes up forty pages [for the ISO 8601
> > specification]? RFC 2822 describes a date/time stamp in about two
> > pages. In fact, the whole RFC describes the Internet Message Format
> > in not much more than 40 pages. Is ISO-language just bloated?
> >
> > *boggle*
>
> You can find a copy at http://dotat.at/tmp/ISO_8601-2004_E.pdf

In brief: ISO 8601 doesn't have the luxury of a single timestamp format.
It also must define its terms from a rather more fundamental starting
point than RFC 2822 can assume.

There's some bloat (5 of the 40 pages don't even show up in the table of
contents), but much of the content of the ISO 8601 standard is required,
to cover the ground intended in the level of detail intended.

    Scope

    This International Standard is applicable whenever representation of
    dates in the Gregorian calendar, times in the 24-hour timekeeping
    system, time intervals and recurring time intervals or of the
    formats of these representations are included in information
    interchange. It includes

    * calendar dates expressed in terms of calendar year, calendar month
      and calendar day of the month;

    * ordinal dates expressed in terms of calendar year and calendar day
      of the year;

    * week dates expressed in terms of calendar year, calendar week number
      and calendar day of the week;

    * local time based upon the 24-hour timekeeping system;

    * Coordinated Universal Time of day;

    * local time and the difference from Coordinated Universal Time;

    * combination of date and time of day;

    * time intervals;

    * recurring time intervals.
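
To make the first three items concrete, here is a small sketch that
writes one day in each of the three date representations, using only the
standard datetime module.  The variable names are arbitrary, and the
week form is assembled by hand from isocalendar().

```python
from datetime import date

day = date(2012, 9, 9)

# Calendar date: year, month, day of the month.
calendar_form = day.strftime("%Y-%m-%d")

# Ordinal date: year and day of the year.
ordinal_form = day.strftime("%Y-%j")

# Week date: ISO year, ISO week number, day of the week (Monday is 1).
iso_year, iso_week, iso_weekday = day.isocalendar()
week_form = "%04d-W%02d-%d" % (iso_year, iso_week, iso_weekday)

print(calendar_form)  # 2012-09-09
print(ordinal_form)   # 2012-253
print(week_form)      # 2012-W36-7
```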

--
 \       “First things first, but not necessarily in that order.” —The |
  `\                                              Doctor, _Doctor Who_ |
_o__)                                                                  |
Ben Finney
Re: Parsing ISO date/time strings - where did the parser go? Pete Forman 9/12/12 5:31 AM
John Nagle <na...@animats.com> writes:

>    I want to parse standard ISO date/time strings such as
>
>         2012-09-09T18:00:00-07:00
>
> into Python "datetime" objects.

Consider whether RFC 3339 might be a more suitable format.

It is a subset of ISO 8601 extended format.  Some of the restrictions are

  Year must be 4 digits
  Fraction separator is period, not comma
  All components including time-offset are mandatory, except for time-secfrac
  time-minute in time-offset is not optional, must use ±hh:mm or Z

Some latitude is allowed

  T may be replaced by e.g. space

Extra feature

  time-offset of -00:00 means UTC but local time is unknown
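
As a rough sketch, the restrictions listed above can be written as a
single pattern.  The names RFC3339_RE and looks_like_rfc3339 are made up
for the example, and the check is purely about shape: it does not
range-check months, days or hours, and it allows only "T" or a space as
the separator, which is a simplification.

```python
import re

RFC3339_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}"        # date with a 4-digit year
    r"[T ]"                      # "T", or a space in its place
    r"\d{2}:\d{2}:\d{2}"         # hours, minutes, seconds all required
    r"(\.\d+)?"                  # optional fraction, period only
    r"(Z|[+-]\d{2}:\d{2})$"      # mandatory offset: Z or signed hh:mm
)

def looks_like_rfc3339(text):
    """True if `text` has the shape described above (hypothetical helper)."""
    return RFC3339_RE.match(text) is not None
```

Under this pattern "2012-09-09T18:00:00-07:00" passes, while the same
string without an offset, with a comma fraction, or with an hours-only
offset such as "-07" does not.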

--
Pete Forman