ISO 8601 date handling


david.ri...@enquora.com

Mar 30, 2017, 6:25:26 PM
to Cappuccino & Objective-J
CPDateFormatter is failing to parse ISO 8601-formatted date strings, as illustrated here:

Objective-C code results in a UTC testDate value:

    NSString* iso8601TestDate = @"2017-03-30T20:36:46.245Z";
    NSDateFormatter* dateFormatter = [[NSDateFormatter alloc] init];
    NSLocale* posix = [[NSLocale alloc] initWithLocaleIdentifier:@"en_US_POSIX"];
    [dateFormatter setLocale:posix];
    [dateFormatter setDateFormat:@"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"];
    NSDate* testDate = [dateFormatter dateFromString:iso8601TestDate];


Objective-J code results in nil for testDate:

    var iso8601TestDate = @"2017-03-30T20:36:46.245Z";
    var dateFormatter = [[CPDateFormatter alloc] init];
    var posix = [[CPLocale alloc] initWithLocaleIdentifier:@"en_US_POSIX"];
    [dateFormatter setLocale:posix];
    [dateFormatter setDateFormat:@"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"];
    var testDate = [dateFormatter dateFromString:iso8601TestDate];
    console.log("Test date", testDate);


Objective-J code results in a testDate value in local time zone:

    var iso8601TestDate = @"2017-03-30 20:36:46.245";
    var dateFormatter = [[CPDateFormatter alloc] init];
    var posix = [[CPLocale alloc] initWithLocaleIdentifier:@"en_US_POSIX"];
    [dateFormatter setLocale:posix];
    [dateFormatter setDateFormat:@"yyyy-MM-dd HH:mm:ss"];
    var testDate = [dateFormatter dateFromString:iso8601TestDate];
    console.log("Test date", testDate);


The source code for CPDateFormatter seems to be missing handlers for the 'T', 'SSS' and 'Z' tokens, but I'm still reading it.


We're using CouchDB and relatives for data persistence. They store data as JSON, so we need a standardized textual representation for dates.

We don't particularly need millisecond precision, nor do I trust the browser's clock that well, but JavaScript's Date.toISOString() does support it, so it seems we should too.


Can anyone confirm this problem or point out where I'm mistaken before I fix it?


d.r.

Keary Suska

Mar 30, 2017, 11:15:27 PM
to objec...@googlegroups.com

> On Mar 30, 2017, at 4:25 PM, david.ri...@enquora.com wrote:
>
> Objective-J code results in nil for testDate:
> var iso8601TestDate = @"2017-03-30T20:36:46.245Z";
> var dateFormatter = [[CPDateFormatter alloc] init];
> var posix = [[CPLocale alloc] initWithLocaleIdentifier:@"en_US_POSIX"];
> [dateFormatter setLocale:posix];
> [dateFormatter setDateFormat:@"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"];
> var testDate = [dateFormatter dateFromString:iso8601TestDate];
> console.log("Test date", testDate);
>
>
>
> Objective-J code results in a testDate value in local time zone:
> var iso8601TestDate = @"2017-03-30 20:36:46.245";
> var dateFormatter = [[CPDateFormatter alloc] init];
> var posix = [[CPLocale alloc] initWithLocaleIdentifier:@"en_US_POSIX"];
> [dateFormatter setLocale:posix];
> [dateFormatter setDateFormat:@"yyyy-MM-dd HH:mm:ss"];
> var testDate = [dateFormatter dateFromString:iso8601TestDate];
> console.log("Test date", testDate);
>
>
>
> The source code for CPDateFormatter seems to be missing handlers for the 'T', 'SSS' and 'Z' tokens, but I'm still reading it.

It does, but there appears to be a bug in the parser. There appear to be at least two things going wrong: 1) the parser is not handling text elements properly (AFAICT, it is simply ignoring the pattern as it occurs in the string and format, rather than parsing it); and 2) it doesn't recognize a single quote or period as a delimiter. If you change your date to:

    var iso8601TestDate = @"2017-03-30 'T'20:36:46:245'Z'";

and your format to:

    [dateFormatter setDateFormat:@"yyyy-MM-dd 'T'HH:mm:ss:SSS'Z'"];

It will work. Note the space before the ’T’. Without the space the day (I think) gets absorbed and although the string parses it does not give the correct date. I would file a bug report.
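For comparison, here is a minimal sketch of how a format pattern can be split so that quoted text such as 'T' and 'Z' becomes a literal to match rather than something to skip. It is plain JavaScript and is not taken from CPDateFormatter; the function name `tokenizeDatePattern` and the token shape are made up for illustration, and the quoting rules follow the usual Unicode-style convention (text between single quotes is literal, two single quotes in a row mean one quote character).

```javascript
// Hypothetical helper, not part of Cappuccino: split a date format pattern
// into "field" tokens (yyyy, MM, SSS, ...) and "literal" tokens (-, :, T, Z).
function tokenizeDatePattern(pattern) {
    const tokens = [];
    let i = 0;

    while (i < pattern.length) {
        const ch = pattern[i];

        if (ch === "'") {
            // Two quotes in a row outside quoted text stand for one literal quote.
            if (pattern[i + 1] === "'") {
                tokens.push({ type: "literal", value: "'" });
                i += 2;
                continue;
            }
            // Otherwise read up to the closing quote; '' inside is an escaped quote.
            let value = "";
            i++;
            while (i < pattern.length) {
                if (pattern[i] === "'") {
                    if (pattern[i + 1] === "'") {
                        value += "'";
                        i += 2;
                        continue;
                    }
                    break;
                }
                value += pattern[i++];
            }
            i++; // step over the closing quote
            tokens.push({ type: "literal", value });
        } else if (/[A-Za-z]/.test(ch)) {
            // A run of the same letter is one field, e.g. "yyyy" or "SSS".
            let j = i;
            while (j < pattern.length && pattern[j] === ch) j++;
            tokens.push({ type: "field", value: pattern.slice(i, j) });
            i = j;
        } else {
            // Punctuation and whitespace must match the input exactly.
            tokens.push({ type: "literal", value: ch });
            i++;
        }
    }
    return tokens;
}

const patternTokens = tokenizeDatePattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
console.log(patternTokens.map(t => t.type + ":" + t.value).join(" "));
```

With tokens like these, a parser can walk the input string and require each literal (including the period before SSS) to appear verbatim, which is the behaviour the unmodified ISO 8601 string needs.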

HTH,

Keary Suska
Esoteritech, Inc.


david.ri...@enquora.com

Mar 31, 2017, 12:31:07 AM
to Cappuccino & Objective-J, cappu...@esoteritech.com

> It does, but there appears to be a bug in the parser. There appear to be at least two things going wrong: 1) the parser is not handling text elements properly (AFAICT, it is simply ignoring the pattern as it occurs in the string and format, rather than parsing it); and 2) it doesn't recognize a single quote or period as a delimiter. If you change your date to:
>
> var iso8601TestDate = @"2017-03-30 'T'20:36:46:245'Z'";
>
> and your format to:
>
> [dateFormatter setDateFormat:@"yyyy-MM-dd 'T'HH:mm:ss:SSS'Z'"];
>
> It will work. Note the space before the 'T'. Without the space the day (I think) gets absorbed and although the string parses it does not give the correct date. I would file a bug report.

This code was an example case only. We can't very well alter the data storage format for something that must work across 3 different implementations of CouchDB and their associated ecosystems, plus external reporting tools!

Looking at this a bit more closely, the implementation of CPDateFormatter was by Alexander Ljungberg, whose code is generally very careful and methodical. He also wrote Ratatosk, which implements an automatic serialization of NSDate to and from ISO 8601 format for remote requests (which we use extensively). I see that a space is used there rather than the T token, and a numeric +/- offset is used to specify the UTC offset. Although Date.parseISO8601 is used for parsing, a custom string constructor is used for output rather than Date.toISOString (which does use the T and Z tokens). Alexander seems to have retired from Cappuccino, so it doesn't feel right to bother him with even recollections about his thinking on this.

I don't have enough experience with the ISO 8601 spec to understand which formats are valid, whether the standard is evolving, or other wrinkles. If there are indeed multiple valid string representations and it comes to altering the date persistence format, we have a better chance of using spaces and numeric +/- offsets.
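To make the two spellings concrete, here is a rough sketch in plain JavaScript of a parser that accepts either a space or T as the separator and either Z or a numeric offset as the zone. The function name `parseTimestamp` and the exact set of accepted inputs are assumptions for illustration only; this is neither Ratatosk's nor Cappuccino's code, and it covers only a small subset of what ISO 8601 allows.

```javascript
// Hypothetical helper: parse "YYYY-MM-DD HH:mm:ss[.SSS]" followed by Z or a
// +hhmm / +hh:mm offset. The date/time separator may be a space or "T".
// Returns a Date, or null if the text does not match. Field ranges are not
// validated (e.g. month 13 would roll over), so this is a sketch only.
function parseTimestamp(text) {
    const pattern =
        /^(\d{4})-(\d{2})-(\d{2})[T ](\d{2}):(\d{2}):(\d{2})(?:\.(\d{1,3}))?(Z|[+-]\d{2}:?\d{2})$/;
    const m = pattern.exec(text);
    if (m === null) return null;

    const [, year, month, day, hour, minute, second, fraction, zone] = m;

    // ".2" means 200 ms, so pad the fraction on the right to three digits.
    const millis = Number((fraction || "").padEnd(3, "0"));

    // Offset east of UTC in minutes; "Z" is the same as +00:00.
    let offsetMinutes = 0;
    if (zone !== "Z") {
        const digits = zone.slice(1).replace(":", "");
        const sign = zone[0] === "-" ? -1 : 1;
        offsetMinutes = sign * (Number(digits.slice(0, 2)) * 60 + Number(digits.slice(2, 4)));
    }

    // Interpret the fields as UTC, then subtract the offset to get the instant.
    const utcMs = Date.UTC(Number(year), Number(month) - 1, Number(day),
                           Number(hour), Number(minute), Number(second), millis)
                  - offsetMinutes * 60000;
    return new Date(utcMs);
}

const fromZulu = parseTimestamp("2017-03-30T20:36:46.245Z");
const fromOffset = parseTimestamp("2017-03-30 20:36:46.245+0000");
console.log(fromZulu.getTime() === fromOffset.getTime());
```

Both inputs name the same instant; only the textual form differs, which is why the choice mostly matters for storage, sorting and interoperability with other tools.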

I'll have a deeper look at the existing parser tomorrow and over the weekend and see if I can patch the problem - although I'd really rather see a formal parser grammar for something with so many options!

cheers,
d.r.


Alexander Ljungberg

Apr 3, 2017, 8:22:07 AM
to objec...@googlegroups.com

Hi guys,

It has been a long time since I worked on that code, but I think we never did work specifically on ISO 8601 parsing for CPDateFormatter. Rather, we focused on getting the main UI-based formatting to work.

Instead Cappuccino provides Date.parseISO8601 on the native Date object in JS. That’s in fact what Ratatosk uses.

There’s no reason for CPDateFormatter not to support textual tokens the same way Objective-C does. I just don’t think anybody worked on it.




Alexander

david.ri...@enquora.com

Apr 4, 2017, 6:36:39 PM
to Cappuccino & Objective-J


On Monday, April 3, 2017 at 6:22:07 AM UTC-6, Alexander Ljungberg wrote:

> Hi guys,
>
> It has been a long time since I worked on that code but I think we never did work specifically on 8601 parsing for CPDateFormatter, but rather we focused on getting the main UI based formatting to work.
>
> Instead Cappuccino provides Date.parseISO8601 on the native Date object in JS. That’s in fact what Ratatosk uses.
>
> There’s no reason for CPDateFormatter not to support textual tokens the same way Objective-C does. I just don’t think anybody worked on it.


This is our first foray into handling timestamps which must be timezone-aware, and the biggest challenge has been understanding the ISO 8601 spec (we use a JSON data persistence layer, and a naturally sortable string representation is essential). I wasn't aware that the spec allows multiple representations for UTC (a numeric offset such as +0000, or the Zulu token Z).
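A small sketch of what "naturally sortable" means in practice, in plain JavaScript. The sample timestamps are made up. The point is that strings produced by toISOString() all end in Z and have a fixed width (for years 0000 to 9999), so plain string order matches chronological order; once different offsets are mixed in one collection, that guarantee is lost.

```javascript
// Made-up sample instants, serialized with the fixed-width UTC ("Z") form.
const stamps = [
    new Date(Date.UTC(2017, 2, 31, 0, 31, 7, 0)),
    new Date(Date.UTC(2017, 2, 30, 20, 36, 46, 245)),
    new Date(Date.UTC(2017, 3, 3, 8, 22, 7, 0)),
].map(d => d.toISOString());

// Default sort compares strings; the second sort compares actual instants.
const lexical = [...stamps].sort();
const chronological = [...stamps].sort((a, b) => Date.parse(a) - Date.parse(b));
const sameOrder = lexical.every((s, i) => s === chronological[i]);
console.log(sameOrder);

// Mixing offsets breaks this: 22:00 at +02:00 is 20:00 UTC, which is earlier
// than 21:00Z, yet the "21:00" string sorts first.
const mixed = ["2017-03-30T22:00:00+02:00", "2017-03-30T21:00:00Z"];
const mixedLexicalFirst = [...mixed].sort()[0];
const mixedEarliest = [...mixed].sort((a, b) => Date.parse(a) - Date.parse(b))[0];
console.log(mixedLexicalFirst === mixedEarliest);
```

So whichever representation is chosen, storing every timestamp in the same zone and the same width is what keeps string sorting meaningful.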

I have worked around the immediate problem by dropping down to JavaScript's Date.parse() and toISOString() methods. JavaScript in the browsers I've checked uses the Zulu token, as do most external languages which we're likely to use for reporting, so this seems safe.
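Roughly, the workaround looks like the following in plain JavaScript. The wrapper names `dateFromISOString` and `isoStringFromDate` are hypothetical and exist only for this sketch; Date.parse() and Date.prototype.toISOString() are the standard calls.

```javascript
// Hypothetical wrappers around the standard Date API.
// Date.parse returns NaN for text it cannot interpret, so map that to null.
function dateFromISOString(text) {
    const ms = Date.parse(text);
    return Number.isNaN(ms) ? null : new Date(ms);
}

// toISOString always emits UTC with millisecond precision and a trailing Z.
function isoStringFromDate(date) {
    return date.toISOString();
}

const stored = "2017-03-30T20:36:46.245Z";
const parsed = dateFromISOString(stored);
const roundTripped = isoStringFromDate(parsed);
console.log(roundTripped === stored);
```

Because the stored string is already in the form toISOString() produces, it survives a round trip unchanged, which is convenient when the same documents are read by other tools.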

Ratatosk predates Alexander's own CPDateFormatter work and uses a custom implementation for ISO 8601 timestamp representations (+0000 to mark the UTC offset).
That implementation date (2009) seems an eternity ago now, Alexander, but I'd love to know the thinking behind using that representation rather than Z - if your memory of the specifics returns :-)

I am looking at the best way to ensure full parsing support in CPDateFormatter now and am finding few resources fully conversant with timestamp handling. Any background or experience with common (and uncommon) usage is welcome! In particular, I'm wondering if browser support was spotty circa 2009. Which raises the question: do we currently have a formal target for minimum browser support?