I also changed the titles from "C0X proposal" to "C201X proposal",
since we're coming up on the next decade, almost eight years since
I first wrote the proposals.
http://david.tribble.com/text/c0xlongtime.html
http://david.tribble.com/text/c0xcalendar.html
http://david.tribble.com/text/c0xtimezone.html
I hope to make a little more progress on the proof-of-concept
implementation source files, but I can't promise anything.
-drt
Here's a direct link to the diagram, for convenience:
http://david.tribble.com/text/c0xcalendartypes.png
-drt
Here is an email I received, which I am posting in this thread
for further discussion...
--------------------------------------------------------------------------------
> From: Deckers, Michael <michael...@ts.fujitsu.com>
> Sent: Sep 4, 2009 9:52 AM
Dear David:
Thank you for making your proposals:
[LTT] David R. Tribble: "ISO C 201X Proposal: Long Time Type".
Revision 2.1. 2009-08-22
[CDF] David R. Tribble: "ISO C 201X Proposal: Calendar Date
Functions".
Revision 2.3. 2009-08-25.
[TZF] David R. Tribble: "ISO C 201X Proposal: Timezone Functions".
Revision 1.3. 2009-08-19.
available online! I found that calendrical calculations are frequent
in system level programming, and the current deficiencies of C99 in
that area have led to many "fixes". Your proposals provide not just
fixes but a consistent scheme for dealing with datetimes in C; they
are systematic library extensions. (And they are very helpful already
now, and even if they are not accepted by the C language committee!)
In the following comments, I have tried to point out some areas
where I think the proposal could be enhanced to increase its
chances of being accepted for C1x. Please, do not take the
number of these comments as a sign of inferior quality of your
proposals. Quite the opposite is the case: it is because of the
high quality of the proposal that I could comment on all these
points. Most comments are minor, anyway.
Michael Deckers.
--------------------------------------------------------------------------------
[LTT section 1, diagram]
--------------------------------------------------------------------------------
Great! This certainly brings order into this whole issue; it also
serves as a good guideline for checking functional completeness.
Editorial: the arrow between struct calendar and typedef longtime_t
should be pointed at both ends (rather than only on the left),
due to mklongtime(). (This applies to all three copies of the
diagram.)
--------------------------------------------------------------------------------
[LTT section 5, description]
--------------------------------------------------------------------------------
This is a minor quibble about the wording in:
"The longtime_t type includes accumulated leap second insertions and
deletions."
My point is that the wording might not make it sufficiently clear
whether longtime_t values represent instants P as integral value
[a] floor( ( UTC(at P) - 2001-01-01 )/( 2^-29 s ) )
or as integral value
[b] floor( ( TAI(at P) - 2001-01-01 - 32 s )/( 2^-29 s ) )
For a physicist or an astronomer, time is TAI or TT; including
leap second insertions and deletions yields UTC, so she would
opt for [a].
For somebody else, time may be UTC to begin with; removing the
effect of leap seconds in UTC (rather than including it!) will
yield TAI; hence formula [b], considering that
TAI was 2001-01-01 + 32 s when UTC was 2001-01-01.
I assume that you mean [b]; perhaps it is possible to add
that the system time expressed with getlongtime() is an
approximation to TAI - 32 seconds during the call.
Also, when reference is made to "accumulated leap seconds"
it should be made clear that "leap seconds accumulated since
2001-01-01" are meant.
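To make the difference between [a] and [b] concrete, here is a small
sketch in C. The names ticks_a() and ticks_b() are made up for this
illustration and are not part of the proposal. It assumes whole seconds,
an instant on or after 2001-01-01, and that the caller already knows how
many leap seconds were inserted since 2001-01-01 (two by 2009-01-01,
at the end of 2005 and of 2008).

```c
#include <stdint.h>

/* One second is 2^29 ticks, the resolution quoted from the proposal. */
#define TICKS_PER_SEC (INT64_C(1) << 29)

/* [a]: ticks derived from the UTC label of the instant.
 * utc_secs is (UTC date-time label - 2001-01-01) in seconds, computed
 * with 86400-second days, so inserted leap seconds are not counted. */
static int64_t ticks_a(int64_t utc_secs)
{
    return utc_secs * TICKS_PER_SEC;
}

/* [b]: ticks derived from TAI - 32 s, i.e. a uniform count of elapsed
 * seconds. leaps_since_2001 is the number of leap seconds inserted
 * between 2001-01-01 and the instant (an input the caller must supply). */
static int64_t ticks_b(int64_t utc_secs, int leaps_since_2001)
{
    return (utc_secs + leaps_since_2001) * TICKS_PER_SEC;
}
```

For 2009-01-01T00:00:00 UTC, which is 2922 days (252460800 s) after
2001-01-01, ticks_b(252460800, 2) exceeds ticks_a(252460800) by exactly
2 * 2^29 ticks, i.e. the two readings differ by two seconds.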
--------------------------------------------------------------------------------
[LTT section 5, relation between longtime_t and struct calendar]
--------------------------------------------------------------------------------
This comment discusses the choice of resolution for longtime_t.
In the proposal, the representation of epoch values is done in
fixed point number fashion with constant resolution
- 2^-29 s for longtime_t values
- 10^-9 s for struct calendar values.
This implies that converting back and forth between longtime_t values
and struct calendar values cannot reproduce the exact time; due to
the possibility of millisecond time zone offsets, there is even a
chance that not only the subsecond components get affected through
rounding errors. In other words, with the current proposal, longtime_t
values cannot be used to faithfully represent struct calendar values
in a compact form (as "points on a timeline").
It is true that your proposal provides all the necessary operations
for struct calendar values (constructors, comparisons, addition of
times, normalization of member values, conversion between time zones),
without the explicit use of longtime_t values. longtime_t values
need only arise temporarily when one asks for the current system time.
Nevertheless, it would be nice to have a scalar representation
for the epochs represented by struct calendar values, without the
redundancies (eg, members cal_wday and cal_nmons), and easily
comparable. While your proposal fixes the representation of
longtime_t values, so that simple arithmetic on such values is
possible, this advantage cannot be exploited for epochs originally
given as struct calendar values because the conversion to
longtime_t values cannot be exact.
If the resolution of struct calendar values were made
equal to that of longtime_t values (2^-29 s), this would introduce
rounding into calendarformat() and calendarscanf(). I find this
undesirable since epoch values will increasingly be exchanged via
ISO 8601 notations (eg, in XML).
Enhancing the resolution of longtime_t values to 1 ns would
require 30 bit for the subsecond portion, leaving only 34 bit
for the count of seconds if longtime_t has 64 bit. A more
radical choice would allocate 128 bit to longtime_t, which
would at the same time extend the range of epochs representable
by longtime_t values well beyond the 4 digit year number limit.
It might also be possible to require round trip equality for
struct calendar values with a cal_nsec member divisible by 1000 or so.
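The loss of information can be seen with a few lines of C. This is only
a sketch: the helper names are invented here, only the subsecond part is
modelled, and both conversions round down (the proposal may round
differently).

```c
#include <stdint.h>

#define FRAC_BITS  29                      /* longtime_t resolution: 2^-29 s */
#define NS_PER_SEC INT64_C(1000000000)     /* struct calendar resolution: 1 ns */

/* Nanoseconds (0..999999999) to 2^-29 s ticks, rounding down. */
static int64_t ns_to_ticks(int64_t ns)
{
    return (ns << FRAC_BITS) / NS_PER_SEC;
}

/* 2^-29 s ticks (0..2^29-1) back to nanoseconds, rounding down. */
static int64_t ticks_to_ns(int64_t ticks)
{
    return (ticks * NS_PER_SEC) >> FRAC_BITS;
}

/* Returns 1 if a nanosecond count survives the round trip unchanged. */
static int ns_round_trips(int64_t ns)
{
    return ticks_to_ns(ns_to_ticks(ns)) == ns;
}
```

Since 2^29 = 536870912 is less than 10^9, several nanosecond values must
share one tick: 0 ns and 1 ns both map to tick 0, so 1 ns does not round
trip, while 500000000 ns (exactly 2^28 ticks) does.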
--------------------------------------------------------------------------------
[LTT section 6.1, getlongtime()]
--------------------------------------------------------------------------------
This is a suggestion to enhance the specification. Add the text:
"If there is a sequence point after the evaluation of a call
of getlongtime() and before the evaluation of a second call
of getlongtime(), and if both calls are successful, then the
second call returns a value not less than the first."
This still allows for (completely free-standing) implementations
that return the same value for all calls of getlongtime(), while
prohibiting paradoxical behaviour. A similar requirement is in Ada07.
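One way an implementation could meet such a requirement, sketched in C.
The type demo_longtime_t and the function monotonic_clamp() are
hypothetical stand-ins, not names from the proposal; the idea is simply
to never report a value below the one reported last.

```c
#include <stdint.h>

/* Stand-in for longtime_t; the real type is defined by the proposal. */
typedef int64_t demo_longtime_t;

/* Given the value reported by the previous successful call and a fresh
 * raw clock reading, return the value to report now. A reading that
 * went backwards (e.g. after a clock adjustment) is replaced by the
 * previously reported value, so results never decrease. */
static demo_longtime_t monotonic_clamp(demo_longtime_t last_reported,
                                       demo_longtime_t raw_now)
{
    return raw_now < last_reported ? last_reported : raw_now;
}
```

A clock that always returns the same value trivially satisfies this, as
noted above.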
--------------------------------------------------------------------------------
[LTT section 6.4 mklongtime()]
--------------------------------------------------------------------------------
TYPO in Examples:
"the current system into" |--> "the current system time into"
--------------------------------------------------------------------------------
[LTT section 6.5 setcalendartime()]
--------------------------------------------------------------------------------
This is a suggestion for an extension of the functionality of
setcalendartime().
Function mklongtime() gives a longtime_t value for a datetime given
with Gregorian calendar components. With the NULL value for argument
zone, and 0 for argument member date->cal_leapsec, this function
can be used when calendrical calculations are done for datetimes
in unspecified time zones, or for time scales for which the concepts
of time zones and leap seconds do not have any meaning (such as UT1
or GPS time).
The reverse operation, however, converting a longtime_t value
into its representation in the Gregorian calendar, is not so easily
available with the proposed function setcalendartime(): one cannot
prevent a number of leap seconds from being put into the .cal_leapsec
member of the result, rather than in the .cal_sec member. In order
to get rid of them, one would have to add them to the .cal_sec member,
and then normalize the result (eg, with an appropriate call of
mklongtime()).
Hence the suggestion to endow setcalendartime() with a possibility
to ignore leap seconds (in addition to ignoring time zones). I
even consider this to be in the "spirit of C": the most basic
variant of a function should be available. That basic mode is also
the one required in XML.
With your design, the suggested feature is easy to provide.
In fact, struct timezone values represent the differences between
system time, the time scale whose values are returned by getlongtime(),
and the civil time scales for time zones and other geographic regions,
possibly with yearly and other shifts whose values are represented in
the struct calendar values derived from longtime_t values with
setcalendartime().
The suggestion amounts to having a struct timezone value that
represents system time: when setcalendartime() is called for a
longtime_t value V, and with a pointer referencing that struct
timezone value, no shift is applied to V and no leap seconds are
removed from V; the resulting struct calendar value represents
the epoch
2001-01-01 + V/2^29 s
in Gregorian calendar notation.
By the way, the same struct timezone value could be used with
the function mklongtime() (instead of the method described above).
See also the next comment for the portability aspect of this
issue.
--------------------------------------------------------------------------------
[LTT Appendix B -- Leap Seconds]
--------------------------------------------------------------------------------
This is a suggestion to enhance portability by requiring
one common behavior concerning leap seconds in all implementations.
Your proposal allows implementations to either "ignore" leap seconds,
or to consider past leap seconds when converting between system time
and epochs represented by struct calendar values. As far as I can
see, one cannot influence this behaviour, so that these conversions,
and probably also calendardiff() etc, have implementation-dependent
semantics. I find this undesirable.
The suggestion is to require that every implementation is able
to work without considering any leap seconds. Again, I think this
is in the "spirit of C": the simple version is available, even
if a more complex version is also supported. One could also note
that XML has recently decided to no longer support leap seconds,
and COBOL has a compiler directive to turn leap second support
on and off. Finally, there is even a chance that leap seconds
in UTC are abolished so that leap seconds become a matter of
historical interest only.
Moreover, an implementation considering past leap seconds is
required to indicate the maximal epoch for which leap seconds
are known. The effect of leap second support on conversions
between longtime_t values and epochs represented by
struct calendar values, and on calendaradd() and calendardiff()
must be specified, in particular for epochs in the future.
--------------------------------------------------------------------------------
[CDF, section 5.2, note to member .cal_era]
--------------------------------------------------------------------------------
This is a suggestion to drop part of a note.
The text:
"There is no requirement that the supported values of this
member form a set of consecutive integers, ..."
should be dropped. I do not see any reason to allow for this;
it just would make the checks with members .ci_era_min and .ci_era_max
of struct calendarinfo fairly useless.
--------------------------------------------------------------------------------
[CDF, section 5.2, note to member .cal_year]
--------------------------------------------------------------------------------
This is a suggestion to explain the term, proleptic Gregorian
calendar.
The text:
"Note that the Gregorian calendar has no zero (0) year;
thus it is implementation-defined whether this member
should have the value 0 or -1 to represent the year BC 0001."
does not conform with international standards (eg, ISO 8601:2004).
Astronomers use year number 0000 regularly (even with the Julian
calendar), and even some (Maya) historians use negative year numbers
with the Gregorian calendar where the epoch -3113-08-11 is exactly
6000 Gregorian years before 2887-08-11 (hence a year with
number 0000 exists).
The text:
"That day is also a Monday, when using a strictly linear
(proleptic) representation of the Gregorian calendar rules
and ignoring the various historical calendar changes."
might suggest that less strict variants of the rules for the
Gregorian calendar, deviating from the definition in ISO 8601:2004,
are acceptable. This should not be the case.
If the formulae for the proleptic Gregorian calendar are not given
explicitly in the normative text, the definition of ISO 8601 should
be referenced.
--------------------------------------------------------------------------------
[CDF, section 5.3 struct calendarinfo]
--------------------------------------------------------------------------------
This is an editorial suggestion.
The wording
"The values shown are the minimum values required for the Gregorian
calendar, which might not necessarily apply to other calendar types."
should be modified so as not to require sophisticated interpretation
of "minimum values". If anything minimal, minimal ranges and minimal
upper limits are meant, but I am not sure in every case:
The values given for members
ci_year_min, ci_mon_min, ci_week_min, ci_mday_min, ci_yday_min,
ci_wday_min, ci_hour_min
are the maximal values required for the Gregorian calendar, which
might not necessarily apply to other calendar types.
The values given for members
ci_year_max, ci_mon_max, ci_week_max, ci_mday_max, ci_yday_max,
ci_wday_max, ci_hour_max, ci_min_max, ci_sec_max
are the minimal values required for the Gregorian calendar, which
might not necessarily apply to other calendar types.
(For ci_wday_min and ci_wday_max, I have assumed that you want to
allow for week numbering schemes where weeks start with days of the
week other than Monday. I am not sure that this was the intent.)
Also, the text
"Specifies the day of the week that is always present in the
first week of the year. If this value is indeterminate, the member
is set to -1."
is not very clear (every week has a Thursday).
Specifies, for the Gregorian calendar, the day of the week whose
first occurrence in a year belongs to week 1 of that year. If
there is no such day of the week for the week numbering scheme
of a calendar then the member is set to -1.
One probably should also require that ci_wday1 is between ci_wday_min
and ci_wday_max, inclusive.
--------------------------------------------------------------------------------
[CDF, section 5.3 struct calendarinfo]
--------------------------------------------------------------------------------
This is a suggestion to increase portability.
struct calendarinfo is an excellent idea to enhance portability.
For instance, applications can guarantee that a struct calendar
value with member .cal_mday exceeding the range [1..31] can be
successfully normalized (with mklongtime()). What is missing is
the semantics of the extended values: the normalized result
is unspecified.
The suggestion is to specify the behaviour for the maximal ranges
allowed by struct calendarinfo for the Gregorian calendar.
--------------------------------------------------------------------------------
[CDF, section 5.3, note to member .cal_era_max]
--------------------------------------------------------------------------------
This is a suggestion for a strict interpretation of the Gregorian
calendar.
I find the wording
"The Gregorian calendar has two eras, AD and BC
(also known as CE and BCE, respectively)."
misleading. BC (or BCE) is mainly used by historians with the
Julian calendar; use with the Gregorian calendar is rare (as are
events before 0001-01-01 that are datable with less than a day
of uncertainty). Some such uses I have seen were unintentional.
Anyway, an ISO 8601 epoch notation such as -3113-08-11 should
be representable with a struct calendar value (if the
implementation has .ci_year_min <= -3113), and this should
not be normalized by mklongtime() to another era and a positive
.cal_year member.
Note that regnal eras are used in Japan with a calendar whose
numeric form differs from the Gregorian calendar only by its
year numbers; in this case, normalization may well comprise
the change of era.
--------------------------------------------------------------------------------
[CDF, section 6.1 calendaradd(): normalization of calendar components]
--------------------------------------------------------------------------------
This is a suggestion to add some constraints on the operation of
"normalization of a struct calendar value by modifying
its members to fall within their normal ranges"
that occurs during the evaluation of a call of calendaradd()
(and also mklongtime()), so as to make the calendar arithmetic
reproducible (and equal to that required in XML).
To fix ideas, let me consider a normalization function
Nf: ( Y, M, D ) |----> ( Y', M', D' )
that maps some or all triples of integers ( Y, M, D ) to other
such triples ( Y', M', D' ), such that the image of Nf consists
of triples that are Gregorian year, month, and day numbers
of (midnight) epochs, and such that Nf(Nf) = Nf.
This is a special case of normalization: the result of
the normalization of the values of members cal_year, cal_mon,
and cal_mday only depends on the values of these members
before normalization, but not on other members of the
struct calendar value, nor on values in the environment
(like the current locale).
In general, such a normalization function Nf leads to a particular
choice of "calendar arithmetic", where adding J years, N months,
and T days to a midnight epoch with Gregorian year, month, and
day numbers (Y, M, D) is given by
Gregorian calendar( Nf( Y + J, M + N, D + T ) )
whenever ( Y + J, M + N, D + T ) is in the domain of Nf.
In fact, calendaradd() in your proposal is explicitly defined
in this way.
Conversely, any such "calendar arithmetic" satisfying some
mild conditions can be obtained in this way from a (unique)
normalization function.
In XML, two different normalizations of "componentized"
representations of epochs are used:
[a] lexical normalization takes "2009-03-00" to 2009-02-28
and "2009-02-29" to 2009-03-01,
[b] but the normalization performed when adding 1 month to
"2009-01-29" takes the intermediate result
2009-02-29 to 2009-02-28.
The first normalization applies in XML when subtracting 1 d
from 2009-03-01 and when adding 1 d to 2009-02-28.
There is general agreement on the calendar arithmetic involving
time units (that is, units d, h, min, s) -- after all, the set of
epochs can be viewed as a real affine space of dimension one whose
translation space is the one-dimensional vector space of times
(spanned by any one of these time units). The arithmetic has all
the nice properties of a group operation, like
[c] (epoch E + time T1) + time T2 = E + (T1 + T2).
The normalization function corresponding to this operation
is such that
[d] Gregorian calendar(normalize( Y, M, D))
= Gregorian calendar(Y, M, 01) + (D - 1) d
for integral Y, 1<= M <= 12, and any real D (this includes
any time values in units that can be converted to unit d).
Normalization [d] has applications outside of calendar arithmetic:
2009-03-00 is a convenient way to denote the last day of February
without looking at the year number, and January 0 is commonly used
in astronomy.
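Normalization [d] can be sketched in C as follows, restricted to
integral D. The struct and function names are chosen for this
illustration only. The day-count helpers follow a widely used
algorithm for the proleptic Gregorian calendar (commonly attributed to
Howard Hinnant); counting days from 1970-01-01 is an arbitrary choice
that cancels out.

```c
#include <stdint.h>

struct ymd { int64_t y; int m; int d; };

/* Days from 1970-01-01 to the proleptic Gregorian date y-m-d
 * (m in 1..12, d a valid day of that month). */
static int64_t days_from_civil(int64_t y, int m, int d)
{
    y -= m <= 2;                                       /* year starts in March */
    int64_t era = (y >= 0 ? y : y - 399) / 400;
    int64_t yoe = y - era * 400;                                  /* [0, 399] */
    int64_t doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1; /* [0, 365] */
    int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;       /* [0, 146096] */
    return era * 146097 + doe - 719468;
}

/* Inverse of days_from_civil(). */
static struct ymd civil_from_days(int64_t z)
{
    z += 719468;
    int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    int64_t doe = z - era * 146097;                            /* [0, 146096] */
    int64_t yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);        /* [0, 365] */
    int64_t mp  = (5 * doy + 2) / 153;                             /* [0, 11] */
    int d = (int)(doy - (153 * mp + 2) / 5 + 1);
    int m = (int)(mp < 10 ? mp + 3 : mp - 9);
    struct ymd r = { yoe + era * 400 + (m <= 2), m, d };
    return r;
}

/* [d]: the date of Y-M-01 plus (D - 1) days, for 1 <= M <= 12 and any
 * integer D (zero and negative values included). */
static struct ymd normalize_d(int64_t y, int m, int64_t d)
{
    return civil_from_days(days_from_civil(y, m, 1) + (d - 1));
}
```

With this, normalize_d(2009, 3, 0) gives 2009-02-28, normalize_d(2009,
2, 29) gives 2009-03-01, and "January 0" of 2009 becomes 2008-12-31.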
On the other hand, the set of times in calendrical units
month and year = 12 month can be taken as a free abelian group
(as it is done in XML), and this group operates in an intuitive
way on the subset of epochs whose Gregorian day in the month
number is between 01 and 28, inclusive. (These are the simple
cases of adding months to datetimes.)
The normalization function corresponding to this operation
is such that
[e] normalize( Y, M, D )
= ( Y + floor((M - 1)/12), 1 + (M - 1)mod 12, D)
for integral Y, M, D with 1 <= D <= 28.
However, among several reasonable extensions of that operation
to all epochs, none can be a group operation (they do not even
satisfy [c]).
(But the operation of the subgroup generated by 4800 months can
be extended as a group operation onto all epochs.)
My suggestion is to use the normalization
[f] Gregorian calendar(normalize( Y, M, D ))
= the infimum of
Gregorian calendar(Y + floor((M - 1)/12), 1 + (M - 1)mod 12, 01)
+ (D - 1) d
and
Gregorian calendar(Y + floor((M )/12), 1 + (M )mod 12, 01)
- 1 d
for integral Y, M, D, for 1 <= D <= 31,
for adding years and months to a date (in normalized form).
This is the normalization ("pinning") used in XML for that
purpose. (It can also be expressed mathematically using adjoints.)
More precisely, to provide arithmetic equal to that of XML,
the normalization [f] has to be applied after years and months are
added to the calendar components; whereupon time units are added
to the calendar components, and then normalization [d] is applied.
You have proposed the normalisation
[g] Gregorian calendar(normalize( Y, M, D))
= Gregorian calendar( Y + floor((M - 1)/12), 1 + (M - 1)mod 12, 01)
+ (D - 1) d
for all integral Y, M, D
for inclusion into C99 (if I remember correctly). This is an
extension of [d], and would provide a single normalization,
applicable to addition of both times and months. Compared with [f],
however, [g] has the perceived disadvantage that adding a month
would not be monotone: upon adding one month to 2009-01-31
and 2009-02-01, the intermediate results 2009-02-31 and 2009-03-01
would be normalized with [g] to 2009-03-03 and 2009-03-01,
while [f] yields 2009-02-28 and 2009-03-01.
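The contrast between [f] and [g] can be sketched in C. The names are
invented for this illustration, and the sketch only covers day numbers
D >= 1 (for [f], 1 <= D <= 31 as in the formula). For such D, taking the
infimum in [f] amounts to clamping D to the length of the month, which
is what normalize_f() does; normalize_g() instead rolls excess days
over into the following month(s).

```c
struct ymd { int y, m, d; };

static int is_leap(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

static int days_in_month(int y, int m)   /* m in 1..12 */
{
    static const int len[12] = {31,28,31,30,31,30,31,31,30,31,30,31};
    return (m == 2 && is_leap(y)) ? 29 : len[m - 1];
}

/* Month carry as in [e]: Y + floor((M-1)/12), 1 + (M-1) mod 12. */
static struct ymd carry_months(int y, int m, int d)
{
    int q = (m - 1) / 12, r = (m - 1) % 12;
    if (r < 0) { r += 12; q -= 1; }      /* floor semantics for M < 1 */
    struct ymd out = { y + q, r + 1, d };
    return out;
}

/* [f] "pinning": a day past the end of the month becomes the last day. */
static struct ymd normalize_f(int y, int m, int d)
{
    struct ymd t = carry_months(y, m, d);
    int last = days_in_month(t.y, t.m);
    if (t.d > last)
        t.d = last;
    return t;
}

/* [g] "roll over": excess days spill into the following month(s). */
static struct ymd normalize_g(int y, int m, int d)
{
    struct ymd t = carry_months(y, m, d);
    while (t.d > days_in_month(t.y, t.m)) {
        t.d -= days_in_month(t.y, t.m);
        t = carry_months(t.y, t.m + 1, t.d);
    }
    return t;
}
```

Adding one month to 2009-01-31 means normalizing (2009, 2, 31):
normalize_f() yields 2009-02-28 and normalize_g() yields 2009-03-03,
which lies after the result 2009-03-01 obtained for the later starting
date 2009-02-01.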
--------------------------------------------------------------------------------
[CDF, section 6.2 calendardiff()]
--------------------------------------------------------------------------------
This comment suggests replacing undefined behaviour in the
semantics of calendardiff() by unspecified result. Currently,
undefined behaviour occurs
- if the cal_type members have different values, and
- if one of the struct calendar members has an unnormalized value.
A robust program must avoid undefined behaviour in all cases, so it
always has to check the cal_type members, and has to normalize
the member values (eg, with mklongtime(), which also may fail),
even if the subsequent call of calendardiff() may fail for other
reasons.
It is also unclear to me when a struct calendar argument value with
value _CAL_YR_ERROR for member cal_year is considered to be
normalized, and how it can be normalized (if needed).
--------------------------------------------------------------------------------
[CDF, section 6 calendar arithmetic]
--------------------------------------------------------------------------------
This is a comment collecting some arguments in favor of distinguishing
types for durations from those for epochs.
In your proposal, you use struct calendar to represent both
- epoch values with time zone differential and with leap second
counts; and
- durations of time in terms of time units and calendrical units.
With the two uses, some members of struct calendar (such as .cal_mon)
are used with different ranges, some members are meaningful for only
one use (such as members .cal_wday and .cal_ndays), and most
operations are only meaningful with one of the two uses of
struct calendar.
These facts seem to indicate that the two uses merit two different
types.
Other programming languages with strong typing use different types
for the two uses:
- Ada has Standard.Duration (and Ada.Real_Time.Time_Span) for
durations and Ada.Calendar.Time (and Ada.Real_Time.Time) for epochs.
- SQL has INTERVAL types for durations
and TIMESTAMP (and TIMESTAMP WITH TIME ZONE) for epochs.
- XML has dayTimeDuration and yearMonthDuration for durations, and
dateTime for epochs.
There is, of course, a possibility to get rid of durations altogether:
function calendaradd() could be replaced by a normalization function
(or several, as described above), and function calendardiff(), by
subtracting corresponding typedef longtime_t values. For differences
in terms of months, however, a separate function would be needed.
Oracle SQL has shown that this method works.
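In C, the separation could look roughly like the following sketch. The
types demo_epoch and demo_duration and their functions are hypothetical
and much simpler than struct calendar; they only show how distinct
struct types let the compiler reject meaningless combinations.

```c
#include <stdint.h>

/* A point on the timeline, in ticks since some fixed origin. */
typedef struct { int64_t ticks; } demo_epoch;

/* A signed length of time, in the same tick unit. */
typedef struct { int64_t ticks; } demo_duration;

/* epoch + duration -> epoch */
static demo_epoch epoch_add(demo_epoch e, demo_duration d)
{
    demo_epoch r = { e.ticks + d.ticks };
    return r;
}

/* epoch - epoch -> duration */
static demo_duration epoch_diff(demo_epoch later, demo_epoch earlier)
{
    demo_duration r = { later.ticks - earlier.ticks };
    return r;
}
```

Because the two structs are different types, a call such as
epoch_add(e1, e2) with two epochs does not compile, whereas with a
single type for both uses the mistake would go unnoticed.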
--------------------------------------------------------------------------------
[TZF section 4.2, struct timezone]
--------------------------------------------------------------------------------
This is a suggestion to support a more independent model for time
zones.
struct timezone values are used to represent time zone differentials
in the epoch values represented by struct calendar values, via
the members .cal_zone and .cal_dsti. This is the only way to
represent time zone differentials in epoch representations; in other
words, in order to represent the timestamp 2009-09-04T05:13-04:30,
one has to construct a struct timezone object with inittimezone(),
in addition to the struct calendar object. This is awkward when
it comes to copying values, saving them in a file, etc.
There is also a slight chance that the semantics of a timestamp
changes: if the timestamp 2010-06-01T17:15+0500 is represented
with a struct calendar value whose member points to a struct
timezone object named "Asia/Karachi" then chances are that the
function timezoneoffset() will make that 2010-06-01T17:15+0600
when called next year.
The suggestion is to separate the two issues: rules for a civil time
scale for a particular geographic region, and the time zone
differential applicable to a specific epoch. struct calendar
values should not require deep copies, and struct timezone values
should only be needed to derive the value of such time zone
differentials. After that derivation, the epoch value should be
independent from any struct timezone value.
Precedent for epoch values with (and without) time zone
differentials exists in XML and SQL; they are scalar values
in both languages that do not need other objects or system
support for their correct interpretation.
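A rough C sketch of such a self-contained value follows. The struct
demo_stamp and its helpers are invented for illustration and are not
part of the proposal; the point is only that the differential is stored
as a plain number, so the value can be copied or written to a file
without reference to any time zone object.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* An instant plus the differential that applied when it was created. */
struct demo_stamp {
    int64_t utc_seconds;     /* the instant, counted in UTC seconds */
    int32_t offset_seconds;  /* local time minus UTC, fixed at creation */
};

/* Local wall-clock seconds, derived without consulting any zone rules. */
static int64_t demo_local_seconds(const struct demo_stamp *s)
{
    return s->utc_seconds + s->offset_seconds;
}

/* Write the differential in ISO 8601 form, e.g. "-04:30".
 * Returns the result of snprintf(). */
static int format_offset(int32_t offset_seconds, char *buf, size_t len)
{
    char sign = offset_seconds < 0 ? '-' : '+';
    long a = labs((long)offset_seconds);
    return snprintf(buf, len, "%c%02ld:%02ld", sign, a / 3600, (a % 3600) / 60);
}
```

For the timestamp 2009-09-04T05:13-04:30 the stored differential would
be -16200 seconds, which format_offset() renders as "-04:30" no matter
how the rules for the originating region change later.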
--------------------------------------------------------------------------------
[TZF section 4.2, struct timezone]
--------------------------------------------------------------------------------
This is a suggestion to extend struct timezone.
Currently, struct timezone values cannot represent civil time scales
where the offset of winter time from UTC has varied over time (UK,
Portugal).
The suggestion is to add a member int z_isdst to struct tz_z, so that
each element of member struct tz_z[] is tagged as describing a
summer time (value 1), or not (value 0 for winter time
and standard time).
Another suggestion is to add a member int tz_dsti_max to
struct timezone giving the maximal index of an element of
the struct tz_z[] member (or int tz_dsti_count giving the
count of elements of the struct tz_z[] member).
--------------------------------------------------------------------------------
[TZF section 4.2, struct timezone]
--------------------------------------------------------------------------------
TYPO for char z_name[N+1]
for N, choose another letter so as to avoid the clash with
member struct tz_z[N].
TYPO for long int tz_offset:
"If this member if equal" |---> "If this member is equal"
TYPO for int tz_z[i].z_dst:
"similar to the tm_dst member of the struct tm type."
|--> "similar to the tm_isdst member of the struct tm type."
--------------------------------------------------------------------------------
[TZF section 5.2, mktimezonename()]
--------------------------------------------------------------------------------
Question:
Can the zone argument be a null pointer, as it can be in other
contexts, to indicate the UTC time scale?
--------------------------------------------------------------------------------
Michael Deckers wrote:
> [LTT section 5, description]
> This is a minor quibble about the wording in:
> "The longtime_t type includes accumulated leap second insertions and
> deletions."
>
> My point is that the wording might not make it sufficiently clear
> whether longtime_t values represent instants P as integral value
> [a] floor( ( UTC(at P) - 2001-01-01 )/( 2^-29 s ) )
> or as integral value
> [b] floor( ( TAI(at P) - 2001-01-01 - 32 s )/( 2^-29 s ) )
>
> For a physicist or an astronomer, time is TAI or TT; including
> leap second insertions and deletions yields UTC, so she would
> opt for [a].
>
> For somebody else, time may be UTC to begin with; removing the
> effect of leap seconds in UTC (rather than including it!) will
> yield TAI; hence formula [b], considering that
> TAI was 2001-01-01 + 32 s when UTC was 2001-01-01.
>
> I assume that you mean [b]; perhaps it is possible to add
> that the system time expressed with getlongtime() is an
> approximation to TAI - 32 seconds during the call.
> Also, when reference is made to "accumulated leap seconds"
> it should be made clear that "leap seconds accumulated since
> 2001-01-01" are meant.
No, I meant [a], i.e., longtime_t values include all inserted or
deleted leap seconds. For positive values, this includes all
of the leap seconds accumulated since the epoch (2001-01-01);
for negative values, it includes all leap seconds since 1972;
for times prior to 1972, there are no inserted or deleted
leap seconds.
The longtime_t values parallel those of the NTP network time
protocol, but with a different number of bits and a different epoch.
NTP timestamps include inserted and deleted leap seconds,
a la UTC, and so do longtime_t values.
I wish we could completely ignore leap seconds (and I'm not
alone in this opinion), as it would simplify much of the time
handling. But there is a desire to be compatible with existing art
(NTP). It's also been pointed out several times that the
people/applications that need leap seconds need them badly,
so it's better to include them than not.
I've tried to make it very easy to remove leap seconds from
a given longtime_t value to convert it into a strict linear form
that is more amenable to date calculations, so that the
baggage of carrying around leap seconds is as painless
as possible.
-drt
Michael Deckers wrote:
> [LTT section 5, relation between longtime_t and struct calendar]
>
I made the longtime_t values binary-based at the
suggestion of Poul-Henning Kamp [*], one of the BSD kernel
developers. He convinced me that it's more convenient for
the kernel functions to encode system times in binary form
rather than decimal form.
Converting the binary-based form into a decimal form for
human consumption (i.e., calendar times) is a single division
or a multiply-and-shift operation.
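As a rough illustration of the multiply-and-shift step, here is a C
sketch that splits a 64-bit tick count into whole seconds and
nanoseconds. The function name and the use of int64_t as a stand-in for
longtime_t are assumptions made for this example, and the nanosecond
part is rounded down.

```c
#include <stdint.h>

#define FRAC_BITS     29
#define TICKS_PER_SEC (INT64_C(1) << FRAC_BITS)

/* Split ticks (units of 2^-29 s) into whole seconds and a nanosecond
 * remainder in 0..999999999. Seconds are rounded toward minus infinity
 * so that the remainder is never negative. */
static void split_ticks(int64_t ticks, int64_t *secs, int32_t *nsec)
{
    int64_t s = ticks / TICKS_PER_SEC;
    int64_t f = ticks % TICKS_PER_SEC;
    if (f < 0) {             /* C division truncates toward zero */
        f += TICKS_PER_SEC;
        s -= 1;
    }
    *secs = s;
    /* f < 2^29, so f * 10^9 fits comfortably in 64 bits. */
    *nsec = (int32_t)((f * INT64_C(1000000000)) >> FRAC_BITS);
}
```

For example, 3 * 2^29 + 2^28 ticks comes out as 3 s and 500000000 ns.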
The calendar form for fractional seconds (cal_nsec), on the
other hand, should be decimal-based, because it is designed
to be converted into human-readable form, which means
decimal subseconds.
So yes, there is a slight disconnect between calendar times
and longtime_t values. A given longtime_t value converts
to a unique calendar date (with a unique cal_nsec value);
but more than one calendar date (multiple cal_nsec values)
may convert to the same longtime_t value.
This does not strike me as a big problem, as the two types
are designed for different problem domains. I'd prefer to
solve the kernel time compatibility issue, which seems to be
a more important problem.
-drt
[*] See the discussion that triggered it at
http://newsgroups.derkeiler.com/Archive/Comp/comp.std.c/2008-01/msg00067.html
This sounds reasonable. It guarantees that values returned
from sequential calls to getlongtime() are monotonically
non-decreasing.
-drt
That's a misunderstanding. cal_sec will contain leap seconds
(i.e., have a value of 60) as appropriate, and cal_leapsecs
will always indicate the number of leap seconds that have been
inserted/deleted at that particular calendar date/time.
> Hence the suggestion to endow setcalendartime() with a possibility
> to ignore leap seconds (in addition to ignoring time zones). I
> even consider this to be in the "spirit of C": the most basic
> variant of a function should be available. That basic mode is also
> the one required in XML.
That would be a function of a non-standard calendar type, which
may be a calendric system that ignores leap seconds entirely.
However, the minimum mandated calendric system that
must be provided is the standard Gregorian calendar with
leap seconds. It is trivial to check for an inserted leap
second (cal.cal_sec >= 60) to throw it away, and no code
is needed at all to ignore a deleted leap second.
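As a rough sketch of what I mean by "trivial", assuming that throwing
the leap second away means folding it onto the last ordinary second
of the minute (that policy is only one possible choice). The struct
below is a stand-in with a single member, not the proposal's full
struct calendar, and drop_leap_second() is a made-up name.

```c
#include <assert.h>

/* Stand-in for the proposal's struct calendar; only the member
   needed for this illustration is shown. */
struct cal_sketch {
    int cal_sec;    /* 0..59 normally, 60 during an inserted leap second */
};

/* Fold an inserted leap second onto second 59.
   Returns 1 if a leap second was discarded, 0 otherwise. */
int drop_leap_second(struct cal_sketch *cal)
{
    assert(cal != 0);
    if (cal->cal_sec >= 60) {
        cal->cal_sec = 59;
        return 1;
    }
    return 0;
}
```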
Please provide a link to the XML standard that specifies
the behavior for leap seconds that you describe.
I don't see any simple way to make everyone happy, apart
from giving them all the necessary information they could
possibly need in order to deal with leap seconds. Those that
need them will have them, and those that don't need them
will have all the information they need in order to remove them.
If someone can make a convincing case for ignoring leap
seconds entirely while still fulfilling all of the design goals
mentioned in my proposal, then by all means do so.
Unix/POSIX has managed to do it for several decades now,
so it's a distinct possibility.
> With your design, the suggested feature is easy to provide.
> In fact, struct timezone values represent the differences between
> system time, the time scale whose values are returned by getlongtime(),
> and the civil time scales for time zones and other geographic regions,
> possibly with yearly and other shifts whose values are represented in
> the struct calendar values derived from longtime_t values with
> setcalendartime().
>
> The suggestion amounts to having a struct timezone value that
> represents system time: when setcalendartime() is called for a
> longtime_t value V, and with a pointer referencing that struct
> timezone value, no shift is applied to V and no leap seconds are
> removed from V; the resulting struct calendar value represents
> the epoch 2001-01-01 + V/2^29 s in Gregorian calendar notation.
I don't think I understand what you're driving at.
The required Gregorian calendric system uniquely maps
system times (longtime_t values) to calendar date/times
(struct calendar values) and back, preserving leap seconds
in both directions.
> By the way, the same struct timezone value could be used with
> the function mklongtime() (instead of the method described above).
> See also the next comment for the portability aspect of this
> issue.
-drt
I see that this is an oversight on my part (in spite of what I
wrote previously). The one required calendric system (proleptic
Gregorian, cal_type == _CAL_TYPE_GREGORIAN) should
always recognize leap seconds.
This is what I intended, but I see that this is probably not
what my text says. (This is probably due to the fact that
the 'longtime_t' and 'struct calendar' proposals began as
entirely separate proposals, and my merging of the two
was not perfect.)
The ability remains for implementations to provide other
non-standard calendric systems that ignore leap seconds
(while still encoding them properly within their longtime_t
values).
> The suggestion is to require that every implementation is able to
> work without considering any leap seconds. Again, I think this
> is in the "spirit of C": the simple version is available, even
> if a more complex version is also supported. One could also note
> that XML has recently decided to no longer support leap seconds,
> and COBOL has a compiler directive to turn leap second support
> on and off. Finally, there is even a chance that leap seconds
> in UTC are abolished so that leap seconds become a matter of
> historical interest only.
>
> Moreover, an implementation considering past leap seconds is
> required to indicate the maximal epoch for which leap seconds
> are known. The effect of leap second support on conversions
> between typedef longtime_t values and epochs represented by
> struct calendar values, and on calendaradd() and calendardiff()
> must be specified, in particular for epochs in the future.
It's more than me that needs convincing. I for one would
love to throw leap seconds out the window, but there are
others who think differently.
-drt
That's reasonable.
Note that implementations are only required to support the
common era (AD, CE) of the Gregorian calendar. This should
probably be made more explicit in the proposal.
-drt
It's not clear what the best approach is for dealing with
year numbers outside the era number specified by cal_era.
On the one hand, we could simply disallow invalid cal_year
values (specifically, negative values) for any given valid
cal_era value.
On the other hand, it would be convenient to let cal_year
take on unnormalized values respective to the cal_era value,
and let mklongtime() (and perhaps calendaradd()) normalize
the calendar object to the correct era.
In either case, it is probably desirable to allow only normalized
cal_year values for any given calendar date. Thus 1 BC can
only be properly represented by a normalized calendar object
having cal_era = _CAL_ERA_BC (or whatever) and
cal_year = +1.
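A minimal sketch of the second option (letting the library normalize
an out-of-range year into the correct era), assuming just two eras
and no year zero. The enum values and the names era_year and
normalize_era() are hypothetical; the proposal's actual era codes may
differ.

```c
#include <assert.h>

/* Hypothetical era codes, for illustration only. */
enum era_sketch { ERA_BC = 0, ERA_AD = 1 };

struct era_year {
    enum era_sketch era;
    int year;           /* >= 1 once normalized */
};

/* Normalize a year that falls outside its era.
   AD 0 becomes 1 BC, AD -1 becomes 2 BC; BC 0 becomes AD 1, etc. */
struct era_year normalize_era(struct era_year in)
{
    struct era_year out = in;

    assert(in.era == ERA_BC || in.era == ERA_AD);
    if (in.year < 1) {
        out.era = (in.era == ERA_AD) ? ERA_BC : ERA_AD;
        out.year = 1 - in.year;
    }
    return out;
}
```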
> The text:
> "That day is also a Monday, when using a strictly linear
> (proleptic) representation of the Gregorian calendar rules
> and ignoring the various historical calendar changes."
> might suggest that less strict variants of the rules for the
> Gregorian calendar, deviating from the definition in ISO 8601:2004,
> are acceptable. This should not be the case.
>
> If the formulae for the proleptic Gregorian calendar are not given
> explicitly in the normative text, the definition of ISO 8601 should
> be referenced.
If I'm reading Wikipedia correctly,
http://en.wikipedia.org/wiki/Proleptic_Gregorian_calendar
then "proleptic" means that calendar dates before 1582
are not historically accurate. Which is what I intended.
Actually, it's dates prior to 1582, or 1758, or 1917, or several
other years, depending on when the Gregorian calendar was
adopted in whichever part of the world you're in.
The simplest approach is to assume a strictly linear calendar
system without hiccups between any of the dates within the
supported range. I want the full range of 1601-01-01 to
2399-12-31 covered linearly, corresponding to the full
longtime_t range. I think my proposal actually specifies a
minimum range of 1900-01-01 to 2399-12-31, though.
-drt
Yes.
> Also, the text
> "Specifies the day of the week that is always present in the
> first week of the year. If this value is indeterminate, the member
> is set to -1."
> is not very clear (every week has a Thursday).
> Specifies, for the Gregorian calendar, the day of the week whose
> first occurrence in a year belongs to week 1 of that year. If
> there is no such day of the week for the week numbering scheme
> of a calendar then the member is set to -1.
> One probably should also require that ci_wday1 is between ci_wday_min
> and ci_wday_max, inclusive.
Yes.
-drt
I thought that the description of mklongtime() was pretty clear
about normalizing calendar members that fall outside their
"normal ranges". Perhaps extra verbiage is needed.
One thing that I'd like to support is the (common?) practice in
POSIX of taking the current time() value, which is the count
of seconds since 1970-01-01, and using that as the tm_sec
value in an otherwise zeroed-out tm structure and calling
mktime() to produce the full tm object value for the current
time. That same technique should be possible using the
proposed calendar struct:
    #include <time.h>

    void posix_time(struct calendar *cal)
    {
        time_t now;

        // Get the current POSIX time
        time(&now);

        // Initialize the calendar date
        initcalendar(cal, _CAL_NAME_GREGORIAN);

        // Convert the POSIX time to a normalized date
        cal->cal_year = 1970;
        cal->cal_sec = now;     // Assumes the value fits in a 32-bit int
        mklongtime(cal, NULL);  // cal is already a pointer
    }
This is similar to my example 'posix_date()' function in
the proposal.
-drt
Yes. Perhaps I should make that clearer. cal_year values
should be valid with respect to the current cal_era value of
the date object, and should be normalized as such.
> Note that regional eras are used in Japan with a calendar whose
> numeric form differs from the Gregorian calendar only by its
> year numbers; in this case, normalization may well comprise
> the change of era.
Yep.
-drt
I follow your arguments. I think the answer is that:
1. time members are normalized, rolling over into mdays;
2. year is normalized, to account for leap years next;
3. month is then normalized, so that [b] works;
4. mday is then normalized, along with any rollover days from (1),
rolling over into months and years, so that [a] works.
5. the remaining members (yday, week, wday, etc.) are then
normalized.
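The ordering above might be sketched roughly as follows. This is only
an illustration under simplifying assumptions: proleptic Gregorian
rules, no leap seconds, no eras, and step 5 (yday, week, wday) left
out. Steps 2 and 3 are folded together, since the month carry is what
settles the year. The struct and function names are made up and are
not the proposal's.

```c
#include <assert.h>

struct date_sketch {
    int year, mon, mday;    /* mon 1..12, mday 1..days in month */
    int hour, min, sec;
};

static int is_leap(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

static int days_in_month(int y, int m)
{
    static const int d[12] = {31,28,31,30,31,30,31,31,30,31,30,31};
    return (m == 2 && is_leap(y)) ? 29 : d[m - 1];
}

/* Reduce *v into [0, base) and return the (floored) carry. */
static int carry(int *v, int base)
{
    int q = *v / base;
    if (*v % base < 0)
        q--;
    *v -= q * base;
    return q;
}

void normalize_date(struct date_sketch *d)
{
    int m0;

    assert(d != 0);
    /* 1. time members, rolling over into mday */
    d->min  += carry(&d->sec, 60);
    d->hour += carry(&d->min, 60);
    d->mday += carry(&d->hour, 24);
    /* 2-3. month, rolling into year, so leap years are known next */
    m0 = d->mon - 1;
    d->year += carry(&m0, 12);
    d->mon = m0 + 1;
    /* 4. mday, rolling over into months and years */
    while (d->mday < 1) {
        if (--d->mon < 1) { d->mon = 12; d->year--; }
        d->mday += days_in_month(d->year, d->mon);
    }
    while (d->mday > days_in_month(d->year, d->mon)) {
        d->mday -= days_in_month(d->year, d->mon);
        if (++d->mon > 12) { d->mon = 1; d->year++; }
    }
}
```

With this ordering, 2009-02-29 normalizes to 2009-03-01, and a month
value of 14 with mday 0 in 2009 normalizes to 2010-01-31.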
This is obviously a complicated issue, but I think we can
come up with reasonably efficient rules for the mandated
default behavior.
-drt
Yes, "unspecified" is better than "undefined".
> It is also unclear to me when a struct calendar argument value with
> value _CAL_YR_ERROR for member cal_year is considered to be
> normalized, and how it can be normalized (if needed).
A calendar date object with a cal_year of _CAL_YR_ERROR is an
erroneous date value, and is not expected to contain any normalized
members.
Such a date value can only be normalized after setting its cal_year
to a valid value.
-drt
On 2009-09-11, David R Tribble wrote:
> Michael Deckers wrote:
> >
> > My point is that the wording might not make it sufficiently clear
> > whether longtime_t values represent instants P as integral value
> > [a] floor( ( UTC(at P) - 2001-01-01 )/( 2^-29 s ) )
> > or as integral value
> > [b] floor( ( TAI(at P) - 2001-01-01 - 32 s )/( 2^-29 s ) )
> > ...
> >
> > I assume that you mean [b]; perhaps it is possible to add
> > that the system time expressed with getlongtime() is an
> > approximation to TAI - 32 seconds during the call.
> > Also, when reference is made to "accumulated leap seconds"
> > it should be made clear that "leap seconds accumulated since
> > 2001-01-01" are meant.
> No, I meant [a], i.e., longtime_t values include all inserted or
> deleted leap seconds. For positive values, this includes all
> of the leap seconds accumulated since the epoch (2001-01-01);
> for negative values, it includes all leap seconds since 1972;
> for times prior to 1972, there are no included or deleted
> leap seconds.
>
> The longtime_t values parallel those of the NTP network time
> protocol, but with a different number of bits and a different epoch.
> NTP timestamps include inserted and deleted leap seconds,
> a la UTC, and so do longtime_t values.
I see. What I do not see is which longtime_t values can
distinguish the two different instants when UTC was
2005-12-31T23:59:60 (with 32 leap seconds), and
2006-01-01T00:00:00 (with 33 leap seconds).
NTP timestamps have the leap second bit to distinguish
them -- but longtime_t values don't.
> I wish we could completely ignore leap seconds (and I'm not
> alone in this opinion), as it would simplify much of the time
> handling. ........................................................
Yes, and I think that there is enough precedent that C could
follow (Posix and XML do not consider leap seconds, Ada and
Cobol make leap second support optional). I find it unreasonable
to require leap second support from every conforming
implementation, but I think it _is_ reasonable to
require that every conforming implementation supports
datetime calculations without regard to leap seconds.
> ......... But there is a desire to be compatible with existing art
> (NTP). It's also been pointed out several times that the
> people/applications that need leap seconds need them badly,
> so it's better to include them than not.
>
> I've tried to make it very easy to remove leap seconds from
> a given longtime_t value to convert it into a strict linear form
> that is more amenable to date calculations, so that the
> baggage of carrying around leap seconds is as painless
> as possible.
Yes, I agree that this is a worthy goal. The separate member
cal_leapsec of struct calendar supports this goal: it is easy to
ignore it or to set it to zero.
I suggest that it may be equally useful to make the leap second
bit a separate member of struct calendar, so that one does not
have to fight against seconds fields with values >= 60 if one
does not want them. Ada05 has both a leap second counter (TAI - UTC),
and a Boolean indicating whether UTC just inserts a leap second
([Annotated Ada Reference Manual, online at
http://www.adaic.org/standards/05aarm/AA-Final.pdf, section 9.6.1],
also for real time). I am just referring to other languages
in order to show that my suggestions are nothing new, but
well-known prior art.
Michael Deckers.
> Michael Deckers wrote:
>
> > [LTT section 5, relation between longtime_t and struct calendar]
> >
> > This comment discusses the choice of resolution for longtime_t.
> >
> > In the proposal, the representation of epoch values is done in
> > fixed point number fashion with constant resolution
> > - 2^-29 s for longtime_t values
> > - 10^-9 s for struct calendar values.
> > This implies that converting back and forth between longtime_t values
> > and struct calendar values cannot reproduce the exact time; due to
> > the possibility of millisecond time zone offsets, there is even a
> > chance that not only the subsecond components get affected through
> > rounding errors. In other words, with the current proposal, longtime_t
> > values cannot be used to faithfully represent struct calendar values
> > in a compact form (as "points on a timeline").
>
> I made the longtime_t values to be binary-based at the
> suggestion of Poul-Henning Kamp [*], one of the BSD kernel
> developers. He convinced me that it's more convenient for
> the kernel functions to encode system times in binary form
> rather than decimal form.
>
> [*] See the discussion that triggered it at
> http://newsgroups.derkeiler.com/Archive/Comp/comp.std.c/2008-01/msg00067.html
>
> Converting the binary-based form into a decimal form for
> human consumption (i.e., calendar times) is a single division
> or a multiply-and-shift operation.
>
> The calendar form for fractional seconds (cal_nsec), on the
> other hand, should be decimal-based, because it is designed
> to be converted into human-readable form, which means
> decimal subseconds.
>
> So yes, there is a slight disconnect between calendar times
> and longtime_t values. A given longtime_t value converts
> to a unique calendar date (with a unique cal_nsec value);
> but more than one calendar date (multiple cal_nsec values)
> may convert to the same longtime_t value.
>
> This does not strike me as a big problem, as the two types
> are designed for different problem domains. I'd prefer to
> solve the kernel time compatibility issue, which seems to be
> a more important problem.
I understand that the hardware time counters are binary,
but that does not mean that they count in units that are
binary submultiples of a second. IBM z systems, for instance,
count in binary submultiples of microseconds, not seconds.
Converting these readings into a typedef longtime_t value
involves rounding, even though the counter is binary.
But my point is not about binary or decimal representation;
my point is that the additional rounding involved in the
conversion between typedef longtime_t values and struct
calendar values is a useless complication.
typedef longtime_t should not be confined to epoch
values read from some hardware clock. It should
be conceived more generally as a scalar type that
represents arbitrary points on a time line. In fact,
in your proposal typedef longtime_t is necessary for
several operations that are difficult or impossible to
do only with struct calendar values: convert between
different calendar systems, store epoch values compactly
in a data base, easy comparison of epoch values.
Reading hardware clocks is just one operation among
several.
The conversion between typedef longtime_t values and
struct calendar values works in both directions, and
having the same resolution for both would be a useful
trait because, in this case, one could guarantee that
the conversion is one-to-one, even with time zone
offsets applied on either side. Reading the hardware
clock into a typedef longtime_t value has no inverse
operation (in C); and resolutions of hardware clocks
are sufficiently diverse so as to require rounding
in the readings obtained with getlongtime() in most
cases. I see no reason why another rounding is
mandated for the conversions between typedef longtime_t
values and struct calendar values.
Michael Deckers.
> This sounds reasonable. It guarantees that values returned
> from sequential calls to getlongtime() are monotonically
> increasing.
I made this proposal under the assumption that getlongtime()
delivered something like (TAI - 2001-01-01 - 32 s)/(2^-29 s).
But now that I know that it is supposed to deliver an
approximation of (UTC - 2001-01-01)/(2^-29 s), I find this
more problematic.
The system time scale derived from NTP typically is an
ad hoc monotonized version of UTC, where the reading during
a positive leap second stays nearly constant. But with
the proposed conversions between longtime_t values and
struct calendar values, one would need a reproducible method
where 2^29 different longtime_t values represent 2 seconds
worth of UTC timestamps.
Michael Deckers.
> Michael Deckers wrote:
> > [LTT section 6.5 setcalendartime()]
> > This is a suggestion for an extension of the functionality of
> > setcalendartime().
> >
> > Function mklongtime() gives a longtime_t value for a datetime given
> > with Gregorian calendar components. With the NULL value for argument
> > zone, and 0 for argument member date->cal_leapsec, this function
> > can be used when calendrical calculations are done for datetimes
> > in unspecified time zones, or for time scales for which the concepts
> > of time zones and leap seconds do not have any meaning (such as UT1
> > or GPS time).
> >
> > The reverse operation, however, converting a typedef longtime value
> > into its representation in the Gregorian calendar, is not so easily
> > available with the proposed function setcalendartime(): one cannot
> > prevent that a number of leapseconds is put into the .cal_leapsec
> > member of the result, rather than in the .cal_sec member. In order
> > to get rid of them, one had to add them to the .cal_sec member, and
> > then normalize the result (eg, with an appropriate call of mklongtime()).
> That's a misunderstanding. cal_sec will contain leap seconds
> (i.e., have a value of 60) as appropriate, and cal_leapsecs
> will always indicate the number of leap seconds that have been
> inserted/deleted at that particular calendar date/time.
Yes, I misunderstood your spec. I meant the following:
given a struct calendar value C, and some struct timezone object TZ,
consider the operation
[a] setcalendartime( &C, &TZ, mklongtime( &C, &TZ ) )
I would expect that the epoch value represented by C does not change
when [a] is applied. That is probably true if C was set with
setcalendartime() to begin with (this would set the cal_leapsec
member in some way). I assumed that this was no longer true with C
set to arbitrary calendar components (as read from a file or from
the net), and with the cal_leapsec member set to 0. But now I
understand that it is still true, just the cal_leapsec member
is modified. So please ignore my comment.
But what about the operation [a] when the cal_sec member of C is 60
(ie, during a positive leap second)?
As far as I understand, the mklongtime() call will normalize this
to a value in {0..59}. Will the call of setcalendartime() denormalize
this again? Or is C changed in this case? This may be an important
issue since denormalized values are not acceptable to calendaradd()
and calendardiff().
> However, the minimum mandated calendric system that
> must be provided is the standard Gregorian calendar with
> leap seconds. It is trivial to check for an inserted leap
> second (cal.cal_sec >= 60) to throw it away, and no code
> is needed at all to ignore a deleted leap second.
With the usual time zones, this is indeed trivial; but your spec
allows for time zone offsets that are integral multiples of 1 ms.
How are leap seconds represented in a time zone that is 33 seconds
ahead of UTC? This is another argument in favour of a separate
leap second bit in struct calendar.
> Please provide a link to the XML standard that specifies
> the behavior for leap seconds that you describe.
[http://www.w3.org/TR/2009/CR-xmlschema11-2-20090430/#d-t-values]
> I don't see any simple way to make everyone happy, apart
> from giving them all the necessary information they could
> possibly need in order to deal with leap seconds. ..........
Agreed. So why not look at several ways, each making one group happy?
There are those that do not want to use and see leap seconds;
those that want to recognize leap seconds when they occur (eg, NTP);
and some that even want to keep the whole leap second history
(eg, via zic). For the last group, typedef longtime_t
values must represent TAI, for the first, UTC is needed.
I see no use in ignoring this difference.
> ................................................. Those that
> need them will have them, and those that don't need them
> will have all the information they need in order to remove them.
>
> If someone can make a convincing case for ignoring leap
> seconds entirely while still fulfilling all of the design goals
> mentioned in my proposal, then by all means do so.
> Unix/POSIX has managed to do it for several decades now,
> so it's a distinct possibility.
I think you are neglecting the implementors: they also belong
to one of these groups. Why should a C processor for embedded
systems deal with the intricacies of leap seconds? Requiring
_every_ conforming implementation of C to cater for all three
groups is not useful, in my opinion.
Michael Deckers.
> It's not clear what the best approach is for dealing with
> year numbers outside the era number specified by cal_era.
>
> On the one hand, we could simply disallow invalid cal_year
> values (specifically, negative values) for any given valid
> cal_era value.
>
> On the other hand, it would be convenient to let cal_year
> take on unnormalized values respective to the cal_era value,
> and let mklongtime() (and perhaps calendaradd()) normalize
> the calendar object to the correct era.
In [ISO 8601:2004], available online at
[www.phys.uu.nl/~vgent/calendar/downloads/iso_8601_2004.pdf], the
Gregorian calendar is defined so as to allow arbitrary
integral year numbers, positive and negative, including 0000.
With this calendar, cal_year is the only member that
cannot have an unnormalized value! My suggestion just was
to make support of this calendar mandatory.
> If I'm reading Wikipedia correctly,
> http://en.wikipedia.org/wiki/Proleptic_Gregorian_calendar
> then "proleptic" means that calendar dates before 1582
> are not historically accurate. Which is what I intended.
Yes, in the sense that the Gregorian calendar was promulgated
on 1582-03-06, but can be used to accurately designate any
day in the past or future. The American Heritage defines
prolepsis as
"1. The anachronistic representation of something as
existing before its proper or historical time,
as in the precolonial United States.
...."
Michael Deckers.
David R Tribble wrote:
>> No, I meant [a], i.e., longtime_t values include all inserted or
>> deleted leap seconds. For positive values, this includes all
>> of the leap seconds accumulated since the epoch (2001-01-01);
>> for negative values, it includes all leap seconds since 1972;
>> for times prior to 1972, there are no included or deleted
>> leap seconds.
>>
>> The longtime_t values parallel those of the NTP network time
>> protocol, but with a different number of bits and a different epoch.
>> NTP timestamps include inserted and deleted leap seconds,
>> a la UTC, and so do longtime_t values.
>
Michael Deckers wrote:
> I see. What I do not see is which longtime_t values can
> distinguish the two different instants when UTC was
> 2005-12-31T23:59:60 (with 32 leap seconds), and
> 2006-01-01T00:00:00 (with 33 leap seconds).
> NTP timestamps have the leap second bit to distinguish
> them -- but longtime_t values don't.
Those two times differ by one second, so their corresponding
longtime_t values differ by one second's worth of ticks, i.e., the
later time will be 2^29 greater than the earlier time.
Likewise, passing these values to longtimeleapsecs() will return
33 for both, as they both include the same (33rd) leap second.
Passing a value representing one tick before the earlier
time (i.e., the point in time prior to the insertion of the 33rd leap
second) will return 32.
The intent is that this function makes it very easy to remove
accumulated leap seconds from any longtime_t value, in
order to convert them into a simplified linear time representation
that does not have to deal with leap seconds at all.
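A minimal sketch of that removal, assuming 2^29 ticks per second and
assuming the caller already knows how many leap seconds have been
inserted between the epoch and the given time (for example, derived
from longtimeleapsecs(); exactly how that count is obtained is an
assumption here). The names TICKS_PER_SEC and strip_leap_seconds()
are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one second is 2^29 ticks. */
#define TICKS_PER_SEC (INT64_C(1) << 29)

/* Hypothetical helper: convert a tick count that includes inserted
   leap seconds into a strictly linear count (86400 s per day) by
   subtracting one second's worth of ticks per leap second. */
int64_t strip_leap_seconds(int64_t t, int leaps_since_epoch)
{
    assert(leaps_since_epoch >= 0);
    return t - (int64_t)leaps_since_epoch * TICKS_PER_SEC;
}
```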
As I've said before, I would just as soon ignore leap seconds,
but more than one person in comp.std.c has made it clear
that applications that need leap seconds really need them
badly. All other applications can simply remove them, either
using longtimeleapsecs(), or by ignoring cal_sec values
greater than 59 when converted to calendar date objects.
The cal_leapsecs member is also present for this need.
If enough people can be convinced that including support
for leap seconds is a bad idea, then we can remove it from
the proposal. But I don't see enough evidence of that yet.
> I suggest that it may be equally useful to make the leap second
> bit a separate member of struct calendar, so that one does not
> have to fight against seconds fields with values >= 60 if one
> does not want them. Ada05 has both a leap second counter (TAI - UTC),
> and a Boolean indicating whether UTC just inserts a leap second
> ([Annotated Ada Reference Manual, online at
> http://www.adaic.org/standards/05aarm/AA-Final.pdf, section 9.6.1],
> also for real time). I am just referring to other languages
> in order to show that my suggestions are nothing new, but
> well-known prior art.
And hopefully we can incorporate the best ideas from prior
art to come up with the best scheme that makes (practically)
everyone happy.
-drt
David R Tribble wrote:
>> I made the longtime_t values to be binary-based at the
>> suggestion of Poul-Henning Kamp [*], one of the BSD kernel
>> developers. He convinced me that it's more convenient for
>> the kernel functions to encode system times in binary form
>> rather than decimal form.
>>
>> Converting the binary-based form into a decimal form for
>> human consumption (i.e., calendar times) is a single division
>> or a multiply-and-shift operation.
>>
>> The calendar form for fractional seconds (cal_nsec), on the
>> other hand, should be decimal-based, because it is designed
>> to be converted into human-readable form, which means
>> decimal subseconds.
>>
>> So yes, there is a slight disconnect between calendar times
>> and longtime_t values. A given longtime_t value converts
>> to a unique calendar date (with a unique cal_nsec value);
>> but more than one calendar date (multiple cal_nsec values)
>> may convert to the same longtime_t value.
>>
>> This does not strike me as a big problem, as the two types
>> are designed for different problem domains. I'd prefer to
>> solve the kernel time compatibility issue, which seems to be
>> a more important problem.
>
Michael Deckers wrote:
> I understand that the hardware time counters are binary,
> but that does not mean that they count in units that are
> binary submultiples of a second. IBM z systems, for instance,
> count in binary submultiples of microseconds, not seconds.
> Converting these readings into a typedef longtime_t value
> involves rounding, even though the counter is binary.
No, it only involves truncation. Conversion from a binary-based
form into a decimal-based form would require more than simple
bit shifting/packing. Which was Poul-Henning Kamp's point.
> But my point is not about binary or decimal representation;
> my point is that the additional rounding involved in the
> conversion between typedef longtime_t values and struct
> calendar values is a useless complication.
>
> typedef longtime_t should not be confined to epoch
> values read from some hardware clock. It should
> be conceived more generally as a scalar type that
> represents arbitrary points on a time line.
It is just such a scalar type. But it is restricted to being
encoded in a form that is convenient for reading from a
hardware clock (which is pure binary), even while applications
are free to use the type in more general ways.
> In fact,
> in your proposal typedef longtime_t is necessary for
> several operations that are difficult or impossible to
> do only with struct calendar values: convert between
> different calendar systems, store epoch values compactly
> in a data base, easy comparison of epoch values.
Converting between calendar objects of differing calendric
systems may or may not involve converting to intermediate
longtime_t values. That's up to the implementer of the various
calendric types.
Bear in mind that there are perfectly valid calendar dates that
cannot be represented as valid longtime_t values at all. Note
that while my proposal does require the default Gregorian calendar
system to be able to represent all valid longtime values, it allows
the default calendar implementation to support dates that exceed
the range of representable longtime values.
> The conversion between typedef longtime_t values and
> struct calendar values works in both directions, and
> having the same resolution for both would be a useful
> trait because, in this case, one could guarantee that
> the conversion is one-to-one, even with time zone
> offsets applied on either side.
That would require either that longtime_t be a decimal-based
encoding, which is undesirable as I've said; or it would
require that the cal_nsec member be binary-based and only
accurate to approx 2 nsec, which is also undesirable.
I think it's safe to use the escape hatch that they cover two
different problem domains and leave it at that.
-drt
> Michael Deckers wrote:
> > I see. What I do not see is which longtime_t values can
> > distinguish the two different instants when UTC was
> > 2005-12-31T23:59:60 (with 32 leap seconds), and
> > 2006-01-01T00:00:00 (with 33 leap seconds).
> > NTP timestamps have the leap second bit to distinguish
> > them -- but longtime_t values don't.
>
> Those two times differ by one second, so their corresponding
> longtime_t values differ by one second's worth of ticks, i.e., the
> later time will be 2^29 greater than the earlier time.
OK, this means that you do mean [b] after all: longtime_t values
are intended to represent
[b] floor( ( TAI - 2001-01-01 - 32 s )/( 2^-29 s ) ).
I probably should have stated that I meant the difference in both
formulae to be taken as in the POSIX formula for "seconds after the
epoch", without any leap second magic, so that the difference
2006-01-01T00:01:00 - 2005-12-31T23:59:00
is always 120 s, regardless of the time scale (TAI or UTC or TDB or..).
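For reference, the POSIX "Seconds Since the Epoch" formula can be
written out as below; it treats every day as exactly 86400 s, which
is why the difference above comes out as 120 s. The function name
posix_seconds() and its parameter list are just for this sketch; the
parameters follow the struct tm conventions (tm_year is years since
1900, tm_yday is days since January 1).

```c
#include <assert.h>

/* POSIX "Seconds Since the Epoch" expression, with no leap seconds.
   Valid for years from 1970 onward (tm_year >= 70). */
long long posix_seconds(int tm_year, int tm_yday,
                        int tm_hour, int tm_min, int tm_sec)
{
    assert(tm_year >= 70);
    return tm_sec + tm_min * 60LL + tm_hour * 3600LL
         + tm_yday * 86400LL
         + (tm_year - 70) * 31536000LL
         + ((tm_year - 69) / 4) * 86400LL
         - ((tm_year - 1) / 100) * 86400LL
         + ((tm_year + 299) / 400) * 86400LL;
}
```

Evaluating it for 2006-01-01T00:01:00 (tm_year 106, tm_yday 0) and
2005-12-31T23:59:00 (tm_year 105, tm_yday 364) gives values that
differ by exactly 120.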
> Likewise, passing these values to longtimeleapsecs() will return
> 33 for both, as they both include the same (33rd) leap second.
> Passing a value representing one tick before the earlier
> time (i.e., the point in time prior to the insertion of the 33rd leap
> second) will return 32.
That's the question: when exactly does TAI - UTC increase by 1 s
upon insertion of a positive leap second. The leap second mentioned
above goes
from 2005-12-31T23:59:60 UTC until 2006-01-01T00:00:00 UTC and
from 2006-01-01T00:00:32 TAI until 2006-01-01T00:00:33 TAI.
In Ada05, the increase in TAI - UTC is taken to occur at the
beginning of the leap second, as you propose. But the time stamps
in Ada05 for the leap second above go
from 2005-12-31T23:59:59 until 2006-01-01T00:00:00
(Ada05 does not use out of range seconds fields), so that the
discontinuity in the calendar components is indeed at the beginning
of the leap second. With the UTC time stamps using overflowing
seconds fields, the discontinuity might be taken to occur at the
end of the leap second. However this is decided, it should be
documented because the official definition of UTC (["Recommendation
ITU-R TF.460-6 Standard-frequency and time-signal emissions".
2002 Geneva]) is silent on this point.
> As I've said before, I would just as soon ignore leap seconds,
> but more than one person in comp.std.c has made it clear
> that applications that need leap seconds really need them
> badly. All other applications can simply remove them, either
> using longtimeleapsecs(), or by ignoring cal_sec values
> greater than 59 when converted to calendar date objects.
> The cal_leapsecs member is also present for this need.
A proposal to cease the use of leap seconds in UTC has been under discussion
since at least 2000 ([http://www.ucolick.org/~sla/leapsecs/]).
However, ITU standardization seems to be even slower than
programming language standardization by ISO, so that leap seconds
will probably still be with us when standard C1x is issued. And then,
dealing systematically with time scales that are non-injective functions
of TAI (such as UTC at positive leap seconds, and time zone times upon
their switch to winter time) is a nice and challenging design problem!
Michael Deckers.
> Michael Deckers wrote:
> > I understand that the hardware time counters are binary,
> > but that does not mean that they count in units that are
> > binary submultiples of a second. IBM z systems, for instance,
> > count in binary submultiples of microseconds, not seconds.
> > Converting these readings into a typedef longtime_t value
> > involves rounding, even though the counter is binary.
> No, it only involves truncation. Conversion from a binary-based
> form into a decimal-based form would require more than simple
> bit shifting/packing. Which was Poul-Henning Kamp's point.
No disagreement: I consider truncation as rounding towards zero;
and the kind of rounding is certainly not prescribed by the
language (except that it has to be monotone if the results
of successive calls of getlongtime() are to be non-decreasing).
To be sure, conversion of an IBM z system TODR counter value to a
longtime_t value requires multiplication by 2^19/5^6, which
is not just a shift.
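A factor of 2^19/5^6 is what one gets for a counter that ticks in
units of 2^-4 microsecond when the target tick is 2^-29 s, since
2^-4 * 10^-6 * 2^29 = 2^25/10^6 = 2^19/5^6. That counter unit is
inferred here from the factor quoted above, not taken from hardware
documentation, and the function name is hypothetical. A sketch of
the truncating conversion in plain 64-bit arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a count of 2^-4 microsecond units (an assumed counter unit)
   into ticks of 2^-29 s, truncating toward zero.
   Computes floor(count * 2^19 / 5^6) without forming count * 2^19 directly. */
static uint64_t counter_to_longtime_ticks(uint64_t count)
{
    const uint64_t num = UINT64_C(1) << 19;  /* 2^19 */
    const uint64_t den = 15625;              /* 5^6  */
    uint64_t q = count / den;                /* whole multiples of 5^6 */
    uint64_t r = count % den;                /* remainder, below 5^6   */

    assert(q <= UINT64_MAX / num);           /* q * num must not overflow */
    return q * num + (r * num) / den;        /* exact floor of count*num/den */
}
```

One second is 16000000 such counter units and converts to exactly
2^29 ticks, but a single counter unit converts to 33 ticks (the
exact ratio is about 33.55), which is the rounding under discussion.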
> Converting between calendar objects of differing calendric
> systems may or may not involve converting to intermediate
> longtime_t values. That's up to the implementer of the various
> calendric types.
You are right: converting between calendars does not
require conversion to longtime_t values because it can
be done with convertcalendar(). But how can I find out
which of two struct calendar values is earlier unless
I convert both into longtime_t values? calendardiff()
does not work for different calendars.
> Bear in mind that there are perfectly valid calendar dates that
> cannot be represented as valid longtime_t values at all. Note that
> while my proposal does require the default Gregorian calendar
> system to be able to represent all valid longtime_t values, it allows
> the default calendar implementation to support dates that exceed
> the range of representable longtime_t values.
Well, yes, struct calendar could represent a larger range of epochs
than longtime_t, but an implementation need not provide for that.
And your proposal also allows longtime_t values to be 128 bit long,
so that the range of represented epochs could be much greater than
those of struct calendar. In any case, a portable program cannot
make any assumption on which range is larger.
> > The conversion between typedef longtime_t values and
> > struct calendar values works in both directions, and
> > having the same resolution for both would be a useful
> > trait because, in this case, one could guarantee that
> > the conversion is one-to-one, even with time zone
> > offsets applied on either side.
> That would require either that longtime_t be a decimal-based
> encoding, which is undesirable as I've said; or it would
> require that the cal_nsec member be binary-based and only
> accurate to approx 2 nsec, which is also undesirable.
longtime_t values could be binary counts of 1 ns ticks or
of 2 ns ticks, for example. What is undesirable about that?
With your proposal, even making mklongtime() a left inverse to
setcalendartime(), and both functions monotone non-decreasing
is not wholly trivial.
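To make the round-trip issue concrete, here is a small sketch that
converts only the sub-second part between 2^-29 s ticks and
nanoseconds. The helper names are hypothetical; they are not the
proposal's setcalendartime() or mklongtime(), and the rounding
choices are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TICKS_PER_SEC (INT64_C(1) << 29)      /* 2^-29 s ticks   */
#define NSEC_PER_SEC  INT64_C(1000000000)     /* nanosecond units */

/* Sub-second ticks -> nanoseconds, truncating. */
static int64_t ticks_to_nsec(int64_t ticks)
{
    assert(ticks >= 0 && ticks < TICKS_PER_SEC);
    return ticks * NSEC_PER_SEC / TICKS_PER_SEC;
}

/* Nanoseconds -> sub-second ticks, truncating. */
static int64_t nsec_to_ticks_floor(int64_t nsec)
{
    assert(nsec >= 0 && nsec < NSEC_PER_SEC);
    return nsec * TICKS_PER_SEC / NSEC_PER_SEC;
}

/* Nanoseconds -> sub-second ticks, rounding up.
   The result can equal TICKS_PER_SEC, i.e. carry into the next second. */
static int64_t nsec_to_ticks_ceil(int64_t nsec)
{
    assert(nsec >= 0 && nsec < NSEC_PER_SEC);
    return (nsec * TICKS_PER_SEC + NSEC_PER_SEC - 1) / NSEC_PER_SEC;
}
```

If both directions truncate, a tick count of 1 becomes 1 ns and
then comes back as 0 ticks, so the round trip is not the identity.
Truncating one way and rounding up the other does restore every
tick count, because a tick is longer than a nanosecond, but then
999999999 ns rounds up to a full 2^29 ticks and has to be carried
into the seconds.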
> I think it's safe to use the escape hatch that they [longtime_t
> and struct calendar] cover two different problem domains and
> leave it at that.
They model two different abstraction levels alright: epochs
(points on a time line), with just an affine structure, and
epoch notations using particular calendars (with additional
operations such as normalization and addition of months).
With the different resolutions for both, comparing two epoch
values may depend on whether you compare longtime_t values
or struct calendar values (if possible), and I cannot easily
find the exact midpoint of an epoch interval given with
struct calendar values (even if it is exactly representable
as a struct calendar value). All this is certainly no hardship,
but a small nuisance nevertheless, and I do not see what this
inconvenience is buying me.
Michael Deckers.
Michael Deckers wrote:
> You are right: converting between calendars does not
> require conversion to longtime_t values because it can
> be done with convertcalendar(). But how can I find out
> which of two struct calendar values is earlier unless
> I convert both into longtime_t values? calendardiff()
> does not work for different calendars.
In that case, the proper approach would be to convert
one of the calendar date values into the calendric type
of the other; or to convert both calendar dates to a third
calendric type (e.g., standard Gregorian). Then the
comparison can be done for two dates within the same
calendric type.
Bear in mind that converting an arbitrary calendar date
into a longtime_t value might be impossible. Consider, for
example, a system that supports the Roman calendar;
dates beyond 1582 (or 1752, whatever) are probably not
supported in that calendar. But longtime_t dates before
1601-01-01 (proleptic Gregorian) are invalid. So there is
no valid conversion between that calendric type and
longtime_t values (i.e., mklongtime() always fails for that
calendric type). Yet it is expected that any given Roman
date can be converted into a (standard proleptic) Gregorian
date (i.e., convertcalendar() should not fail if the Roman
calendric type is supported fully).
Another way of looking at it is to say that you are comparing
calendar dates (not linear system times), so restrict the
problem to the domain of calendric computations.
-drt
Michael Deckers wrote:
> Well, yes, struct calendar could represent a larger range of epochs
> than longtime_t, but an implementation need not provide for that.
> And your proposal also allows longtime_t values to be 128 bit long,
> so that the range of represented epochs could be much greater than
> those of struct calendar. In any case, a portable program cannot
> make any assumption on which range is larger.
A portable program can only make the assumptions given in
the proposals, i.e., that longtime_t values span AD 1601-01-01
to AD 2399-12-31, and that likewise calendar dates span the
dates AD 1601-01-01 to 2399-12-31 (at a minimum).[1]
There is more latitude provided for calendar dates, though,
since a program can examine the calendarinfo struct to
retrieve the full range of supported dates on the system.
(I would expect most implementations to support proleptic
Gregorian dates as early as 0001-01-01, since it's easy
to do.)
Any assumptions beyond that render a program unportable.
And while it's possible for longtime_t to be implemented
as a 128-bit integer type, it's desirable for it to be limited
to a 64-bit type (for maximum portability). You're not
really gaining anything (for most applications) by making
it wider.
[1] The required minimum range for calendar dates has been
updated in revision 2.4 (2009-09-27) of the calendar proposal,
in order to parallel the range of longtime_t.
-drt
David R Tribble wrote:
>> That would require either that longtime_t be a decimal-based
>> encoding, which is undesirable as I've said; or it would
>> require that the cal_nsec member be binary-based and only
>> accurate to approx 2 nsec, which is also undesirable.
>
Michael Deckers wrote:
> longtime_t values could be binary counts of 1 ns ticks or
> of 2 ns ticks, for example. What is undesirable about that?
My requirements for the longtime_t type are:
1. Fit within a standard integer datatype;
2. Span dates of several hundred years before and
after the near-present time;
3. Have the smallest subsecond resolution possible without
compromising (2).
4. Be efficiently retrievable from most computer hardware.
To meet requirement (2), I want longtime_t values to span
at least +/-400 years (two cycles of the Gregorian calendar).
Requirement (1) limits us to a type no wider than the largest
standard integer datatype. The two together therefore pretty
much force us to use a 64-bit binary integer for the type,
which can give us a resolution no smaller than about 2 ns
and a range not wider than about +/-580 years.
Increasing the resolution to 1 ns halves the range to about
+/-292 years, which is not wide enough for requirement (2).
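The range figures can be checked with a few lines of arithmetic.
This is only a sketch: it assumes a signed 64-bit count centered on
the epoch and a mean Gregorian year of 365.2425 days, and the
function name is made up for the example.

```c
#include <assert.h>

/* Mean Gregorian year: 365.2425 days of 86400 s each. */
#define SECONDS_PER_GREGORIAN_YEAR 31556952.0

/* Approximate +/- range, in years, of a signed 64-bit tick count
   for a given tick length in seconds. */
static double range_years(double tick_seconds)
{
    const double half_span_ticks = 9223372036854775808.0;  /* 2^63 */

    assert(tick_seconds > 0.0);
    return half_span_ticks * tick_seconds / SECONDS_PER_GREGORIAN_YEAR;
}
```

Under those assumptions a 1 ns tick gives roughly +/-292 years, a
2 ns tick roughly +/-584 years, and a tick of exactly 2^-29 s
(about 1.86 ns) roughly +/-544 years.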
Also, as I've already mentioned, people smarter than me
prefer an all-binary encoding to meet requirement (4).
They say that a binary format is more efficient to deal with
at the hardware and system kernel level, and I agree with their
reasons. You're going to have to convince those people, not
just me, that a decimal encoding is better for them in the long
run.
-drt
> My requirements for the longtime_t type are:
> 1. Fit within a standard integer datatype;
> 2. Span dates of several hundred years before and
> after the near-present time;
> 3. Have the smallest subsecond resolution possible without
> compromising (2).
> 4. Be efficiently retrievable from most computer hardware.
>
> To meet requirement (2), I want longtime_t values to span
> at least +/-400 years (two cycles of the Gregorian calendar).
> Requirement (1) limits us to a type no wider than the largest
> standard integer datatype. The two together therefore pretty
> much force us to use a 64-bit binary integer for the type,
> which can give us a resolution no smaller than about 2 ns
> and a range not wider than about +/-580 years.
>
> Increasing the resolution to 1 ns halves the range to about
> +/-292 years, which is not wide enough for requirement (2).
>
> Also, as I've already mentioned, people smarter than me
> prefer an all-binary encoding to meet requirement (4).
> They say that a binary format is more efficient to deal with
> at the hardware and system kernel level, and I agree with their
> reasons. You're going to have to convince those people, not
> just me, that a decimal encoding is better for them in the long
> run.
So an all-binary count of 2 ns ticks should be fine with
these people! I don't want to dwell too much on this particular
point -- my general concern is: how well does a proposed feature
fit with other standards and practices.
The resolution of timestamps in NTP is 2^-32 s, but in PTP it
is 1 ns, and in XML and standard SQL it is also a decimal
submultiple of one second (1 s times a negative integral
power of ten). If the resolution of longtime_t is to be fixed
by the language (rather than made implementation-defined as
e.g., in Ada), then the language design issues are:
which kinds of timestamps are expected to be exchanged
with C programs, and
to which degree is an exact representation of such
a timestamp as longtime_t value desirable.
I do not pretend to know the answers; I just note
that the proposed choice of different resolutions for
longtime_t and struct calendar values makes conversion
between the two forms inexact in general, and possibly
even non-monotone. And I fail to see the benefit -- or
at least, I do not consider it a benefit if, on some
specific hardware, a multiplication can be replaced by
a shift instruction in the function getlongtime().
If a clock is synched with NTP then there will be
several multiplications for getlongtime() at any rate.
Anyway, thanks for your instructive replies!
For personal reasons, I will be offline for
quite some time.
Michael Deckers.
Michael Deckers wrote:
> So an all-binary count of 2 ns ticks should be fine with
> these people!
No, it would not. See below.
> I don't want to dwell too much on this particular
> point -- my general concern is: how well does a proposed feature
> fit with other standards and practices.
>
> The resolution of timestamps in NTP is 2^-32 s, but in PTP it
> is 1 ns, and in XML and standard SQL it is also a decimal
> submultiple of one second (1 s times a negative integral
> power of ten). If the resolution of longtime_t is to be fixed
> by the language (rather than made implementation-defined as
> e.g., in Ada), then the language design issues are:
> which kinds of timestamps are expected to be exchanged
> with C programs, and
> to which degree is an exact representation of such
> a timestamp as longtime_t value desirable.
I've extracted the pertinent email discussion between myself
and Poul-Henning Kamp from my archives of 2005. You can
read for yourself the arguments behind choosing a tick
resolution of 2**-29 sec (instead of 2 ns), at:
http://david.tribble.com/text/c0xlongtime_email.txt
-drt
| From: Bill Seymour
| Date: Thu, 27 Aug 2009 09:49:47 -0500
| Subject: Your Civil Time Proposals to WG14
|
| [...]
| The only problem that jumped out at me is that, in your struct
| timezone, tz_offset has the wrong sign. The ISO 8601 offset is the
| amount of time added to UTC to get the local time, which is positive
| in the Eastern Hemisphere. That's also what's done in the Olson
| database, and it's consistent with both the sign of your own z_dst and
| the usual understanding...positive means setting the wall clock ahead,
| thus a given local time represents an earlier UTC. It seems to me
| that the principle of least surprise suggests that the signs of the
| UTC offset and DST offset should have the same meaning. That's also
| the state of significant prior art. Does that make sense?
As per Bill's suggestion, I modified the 'timezone' struct so that
member 'tz_offset' is the offset East of UTC, rather than West.
This brings it in line with the ISO 8601 and RFC 2822 formats.
(An example of the latter is visible in the 'Date:' header of Bill's
email above.)
http://david.tribble.com/text/c0xtimezone.html (1.4)
-drt