Re: Force UTC times in the StationXML schema


Chad Trabant

Oct 26, 2016, 7:53:50 PM
to fdsn-w...@fdsn.org

Hi Lion and WG2,

In principle I agree with the addition of a UTC time zone indicator.

The practical issue is when such a change is introduced. If you are proposing that we only include this at the next incompatible version, then it makes sense as is. Alternatively, if we would like to adopt this in the next backward-compatible version iteration, then the suggestion by Joachim to make the Z optional is the way to go. I think we should be prepared to do both. At this point it is unclear what kind of next version of StationXML is going to happen or even when it is going to happen.

In addition to the Potential Impact stated in the proposal (that producers would need simple modifications) and the impacts Joachim raised, I would add that there are implications for web service clients and subsequent readers of StationXML. I think there are a lot more clients and readers than implementations/producers, which are, quite frankly, the easy part.

Chad

> On Oct 26, 2016, at 9:32 AM, Lion Krischer <lion.k...@gmail.com> wrote:
>
> Hi Joachim,
>
>
> this is not an actual concern as the proposal would only affect the next
> StationXML version. The current version (1.0) will not change and files
> that are valid against it will naturally always remain valid.
>
> If and once a new StationXML + schema version is introduced all
> StationXML files will have to explicitly state that they adhere to the
> new schema by changing their stationxml namespace. As files and
> implementations have to be updated for this to happen in any case, the
> additional `Z` is only a minor change.
>
> Until that point the additional `Z` could be considered a recommendation
> and it is of course also valid against the current version of the schema.
>
> In general with data formats I think that it is always a good thing to
> limit choices and possibilities wherever possible.
>
>
> Cheers!
>
> Lion
>
> On 26/10/16 16:16, Joachim Saul wrote:
>> Lion Krischer wrote on 10/26/2016 12:39 PM:
>>> I propose a small change to be included in the next StationXML schema
>>> version that would force all datetimes to be explicitly marked as being
>>> in UTC.
>>>
>>> Technical details can be found here:
>>> https://github.com/FDSN/StationXML/pull/12
>>>
>>> The proposed change forces all datetimes in a StationXML file to end
>>> with `Z`. This explicitly marks them as being in UTC which (according to
>>> the SEED standard) must be given in any case.
>>
>> Forcing the time to be UTC is a very good idea.
>>
>> My concern would be existing web fdsnws implementations as well as
>> existing StationXML documents. Requiring a match with ".*Z" would render
>> invalid most of what is now perfectly valid StationXML. Webservices
>> could probably be updated relatively quickly to support a new time
>> format. I am more worried about existing XML documents that suddenly
>> would become invalid and which parsers might refuse to read. Not all
>> StationXML is created by webservices on the fly.
>>
>> A pattern that also matches existing time strings *without* time zone
>> would be better. This could be something like "[0-9-]+T[0-9.:Z]+" rather
>> than just ".*Z". Note that this pattern does not enforce a time string
>> ending with "Z" but only enforces a limited set of characters, which
>> does not include "+", "-" or space after the "T".
>>
>> Assuming that time strings containing time zones are currently rarely
>> used in StationXML anyway (if at all), the impact of this pattern will
>> be minimal.
>>
>> The documentation would need a sentence clarifying that time strings
>> *shall* (not *must*) end with a "Z".
>>
>> Cheers
>> Joachim
>>
>>
>> ----------------------
>> FDSN Working Group II (http://www.fdsn.org/message-center/topic/fdsn-wg2-data/)
>>
>> Sent from the FDSN Message Center (http://www.fdsn.org/message-center/)
>> Update subscription preferences at http://www.fdsn.org/account/profile/
>>
>

Lion Krischer

Oct 26, 2016, 10:38:54 PM
to fdsn-w...@fdsn.org
Dear all,


I propose a small change to be included in the next StationXML schema
version that would force all datetimes to be explicitly marked as being
in UTC.

Technical details can be found here:
https://github.com/FDSN/StationXML/pull/12

The proposed change forces all datetimes in a StationXML file to end
with `Z`. This explicitly marks them as being in UTC which (according to
the SEED standard) must be given in any case.

# Motivation

It is currently possible to have non-UTC datetimes in a StationXML file
(e.g. `2013-01-01T00:00:00+07:00`). While this is not allowed according
to the SEED standard the schema does not prevent it and any XML file
with this would still validate just fine.

The forced `Z` at the end explicitly marks a time as being UTC and as a
consequence also forbids setting any time zone.

While the schema alone will never be enough to define the semantics of
StationXML it should be sufficient to define its syntax. This change is
a small step in that direction.

StationXML would additionally gain a more consistent datetime
representation.

# Potential Impact

Little. Existing implementations would have to be changed to add a `Z`
at the end and remove time zones (if any). Considering implementations
would have to at least up the version number if they update to a new
StationXML version this is really no big problem.

# Other Possibilities

It would also be possible (with a more complex regular expression) to
not have the `Z` at the end and still disallow timezones. I personally
prefer the current approach as it is more explicit.
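
The difference between the two options can be sketched in Python. This is only an illustration under stated assumptions: `.*Z` stands in for the proposed restriction, and `NO_ZONE` is a hypothetical pattern written for this example, not taken from the pull request. Both are checked with `fullmatch` because XSD pattern facets apply to the whole value.

```python
import re

# Assumed stand-in for the proposed restriction: the value must end in "Z".
REQUIRE_Z = re.compile(r".*Z")

# Hypothetical alternative (not from the pull request): neither "Z" nor a
# UTC offset is allowed, so the value ends after the optional fraction.
NO_ZONE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?")


def matches(pattern, value):
    """Return True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


samples = [
    "2013-01-01T00:00:00Z",       # explicit UTC
    "2013-01-01T00:00:00",        # no time zone at all
    "2013-01-01T00:00:00+07:00",  # non-UTC offset
]

for value in samples:
    print(value, matches(REQUIRE_Z, value), matches(NO_ZONE, value))
```

Only the first sample passes `REQUIRE_Z` and only the second passes `NO_ZONE`; the offset form is rejected by both.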


All the best,

Lion

Fabian Euchner

Oct 26, 2016, 11:25:37 PM
to fdsn-w...@fdsn.org
Hi all,

I fully agree with Lion. Omitting the time zone information (although
permitted in the xs:dateTime type) may lead XML processors to use the local
time zone set in the locale of the processing machine. This has the potential
to introduce subtle errors that are hard to detect.
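
A small Python sketch of this effect, using only the standard library rather than an XML processor (so it illustrates the general pitfall, not the behavior of any particular StationXML reader). The format strings are assumptions chosen for this example.

```python
from datetime import datetime

FMT_NAIVE = "%Y-%m-%dT%H:%M:%S"
FMT_AWARE = "%Y-%m-%dT%H:%M:%S%z"  # %z accepts a literal "Z" on Python 3.7+

naive = datetime.strptime("2013-01-01T00:00:00", FMT_NAIVE)
aware = datetime.strptime("2013-01-01T00:00:00Z", FMT_AWARE)

# Without a zone indicator the parsed value has no tzinfo. Calling
# naive.timestamp() would then interpret it in the local time zone of the
# machine running the code, so the result differs between machines.
print(naive.tzinfo)       # None

# With the trailing "Z" the instant is unambiguous everywhere.
print(aware.utcoffset())
print(aware.timestamp())
```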

Regards,
Fabian


--
-----------------------------------------------------------------------------
Fabian Euchner phone +41 44 633 7178
Institute of Geophysics fax +41 44 633 1065
ETH Zurich, NO F5 e-mail fab...@sed.ethz.ch
Sonneggstrasse 5 orcid.org/0000-0001-6340-7439
8092 Zurich (Switzerland)
-----------------------------------------------------------------------------
QuakeML http://quakeml.org QuakePy http://quakepy.org
CSEP http://www.cseptesting.org/centers/eth
-----------------------------------------------------------------------------

Chad Trabant

Oct 26, 2016, 11:42:00 PM
to fdsn-w...@fdsn.org

> In any case and independent of this particular proposal there has to be
> some path to update StationXML and move forward. Forcing the time zone
> indicator is about the smallest change I can think of so it might be a
> good test bed to see what happens when a new StationXML version is
> introduced.

Versioning to provide a path forward was built into StationXML from the beginning. How the versions will progress and what they mean is documented in the comments at the top of the schema, pasted here:

Versioning for FDSN StationXML:

The 'version' attribute of the schema definition identifies the version of the schema. This
version is not enforced when validating documents.

The required 'schemaVersion' attribute of the root element identifies the version of the schema
that the document is compatible with. Validation only requires that a value is present but
not that it matches the schema used for validation.

The targetNamespace of the document identifies the major version of the schema and document,
version 1.x of the schema uses a target namespace of "http://www.fdsn.org/xml/station/1".
All minor versions of a major version will be backwards compatible with previous minor releases. For
example, all 1.x schemas are backwards compatible with and will validate documents for 1.0.
Major changes to the schema that would break backwards compatibility will increment the major
version number, e.g. 2.0, and the namespace, e.g. "http://www.fdsn.org/xml/station/2".

This combination of attributes and targetNamespaces allows the schema and documents to be
versioned and allows the schema to be updated with backward compatible changes (e.g. 1.2)
and still validate documents created for previous major versions of the schema (e.g. 1.0).
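
As a rough illustration of how a reader could use these two pieces of information, here is a minimal Python sketch. The function names and the one-line sample document are hypothetical; only the root element name, the `schemaVersion` attribute and the namespace come from the schema description above.

```python
import xml.etree.ElementTree as ET

# Minimal sample document, made up for this example.
DOC = (
    '<FDSNStationXML xmlns="http://www.fdsn.org/xml/station/1" '
    'schemaVersion="1.0"/>'
)


def describe_version(xml_text):
    """Return (namespace, major version, schemaVersion) of the root element."""
    root = ET.fromstring(xml_text)
    # ElementTree renders namespaced tags as "{namespace}localname".
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    major = namespace.rstrip("/").rsplit("/", 1)[-1]
    return namespace, major, root.get("schemaVersion")


def same_major_version(xml_text, schema_namespace):
    """True if the document's namespace matches the schema's target namespace."""
    return describe_version(xml_text)[0] == schema_namespace


print(describe_version(DOC))
print(same_major_version(DOC, "http://www.fdsn.org/xml/station/1"))
print(same_major_version(DOC, "http://www.fdsn.org/xml/station/2"))
```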

Jeremy Fee

Oct 27, 2016, 12:59:36 AM
to fdsn-w...@fdsn.org
Is the concern that the times aren't in UTC, or that times are being
specified without _any_ timezone? When a non-UTC timezone is specified,
it's at least an explicit time that can easily be converted to UTC.

The QuakeML spec deals with similar issues by saying times without an
explicit timezone should be interpreted as UTC (contrary to regular XML
dateTime). This requires a little parser hoop-jumping but is maybe easier
than introducing a new major version.
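
A minimal sketch of that kind of hoop-jumping in Python, assuming the rule "no zone means UTC" and converting explicit offsets to UTC. The helper name `to_utc` is made up for this example and is not part of any QuakeML or StationXML library.

```python
from datetime import datetime, timezone


def to_utc(text):
    """Parse an ISO 8601 datetime string and return an aware UTC datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so rewrite it as an explicit offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumed rule: a value without a zone is interpreted as UTC.
        return parsed.replace(tzinfo=timezone.utc)
    # An explicit offset identifies the instant, so it can be converted.
    return parsed.astimezone(timezone.utc)


for value in (
    "2013-01-01T00:00:00",
    "2013-01-01T00:00:00Z",
    "2013-01-01T00:00:00+07:00",
):
    print(value, "->", to_utc(value).isoformat())
```

The first two inputs map to the same instant, while the `+07:00` value becomes 17:00 UTC on the previous day.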


Thanks,

Jeremy



Joachim Saul

Oct 27, 2016, 2:15:57 AM
to fdsn-w...@fdsn.org
Lion Krischer wrote on 10/26/2016 12:39 PM:
> I propose a small change to be included in the next StationXML schema
> version that would force all datetimes to be explicitly marked as being
> in UTC.
>
> Technical details can be found here:
> https://github.com/FDSN/StationXML/pull/12
>
> The proposed change forces all datetimes in a StationXML file to end
> with `Z`. This explicitly marks them as being in UTC which (according to
> the SEED standard) must be given in any case.

Forcing the time to be UTC is a very good idea.

Lion Krischer

Oct 27, 2016, 4:31:44 AM
to fdsn-w...@fdsn.org
Hi Joachim,


this is not an actual concern as the proposal would only affect the next
StationXML version. The current version (1.0) will not change and files
that are valid against it will naturally always remain valid.

If and once a new StationXML + schema version is introduced all
StationXML files will have to explicitly state that they adhere to the
new schema by changing their stationxml namespace. As files and
implementations have to be updated for this to happen in any case, the
additional `Z` is only a minor change.

Until that point the additional `Z` could be considered a recommendation
and it is of course also valid against the current version of the schema.

In general with data formats I think that it is always a good thing to
limit choices and possibilities wherever possible.


Cheers!

Lion



Lion Krischer

Oct 27, 2016, 6:49:25 AM
to fdsn-w...@fdsn.org
Hi Chad and others,


as soon as the addition of the UTC time zone indicator is forced one
effectively creates a new version of StationXML and this really should
be reflected in an incremented version number. Otherwise the schema
version number becomes meaningless.

Until that point it can happily be treated as a recommendation as the
`Z` is also fully valid with the current schema. If it breaks readers it
is a bug and should be fixed.

But I guess this is pretty much what you are talking about just from a
different perspective.

In any case and independent of this particular proposal there has to be
some path to update StationXML and move forward. Forcing the time zone
indicator is about the smallest change I can think of so it might be a
good test bed to see what happens when a new StationXML version is
introduced.


All the best,

Lion



Lion Krischer

Oct 27, 2016, 10:06:37 PM
to fdsn-w...@fdsn.org
Hi,


> This combination of attributes and targetNamespaces allows the schema
> and documents to be
> versioned and allows the schema to be updated with backward compatible
> changes (e.g. 1.2)
> and still validate documents created for previous major versions of the
> schema (e.g. 1.0).


Thanks for the clarification - I did not know this. If I read this
correctly, then this is essentially semantic versioning as used for
software. This does not fully carry over to data schemas as it only
allows the addition of new and optional pieces of data or the weakening
of some constraints for minor revisions; anything else would invalidate
some existing data.

Neither my proposed change nor Joachim's less restrictive version would
be fully backwards compatible changes and thus would require a new major
version.

For such a minuscule change this is not worth it so I guess this should
be postponed until enough changes have accumulated to justify a new
major version. Until that point it could just be changed to a comment
that recommends adding the Z time zone identifier to all datetimes.


A side note and something to maybe discuss at a future meeting: These
versioning semantics somewhat freeze StationXML and make it almost
impossible to incrementally improve it. Here is a blog post illustrating
an idea of semantic versioning for schemas:
http://snowplowanalytics.com/blog/2014/05/13/introducing-schemaver-for-semantic-versioning-of-schemas/
This would allow at least some flexibility but I'm not sure if this is
desired by the community.


> Is the concern that the times aren't in UTC, or that times are being
> specified without _any_ timezone? When a non-UTC timezone is specified,
> it's at least an explicit time that can easily be converted to UTC.
>
> The Quakeml spec deals with similar issues by saying times without an
> explicit timezone should be interpreted as UTC (contrary to regular xml
> dateTime). This requires a little parser hoop-jumping but is maybe
> easier than introducing a new major version.


A bit of both. The main motivation was that it would be a very simple
and obvious change that makes an implicit assumption about the time zone
explicit and potentially prevents some hard-to-discover bugs in software.


Cheers!

Lion



Joachim Saul

Oct 28, 2016, 2:27:08 AM
to fdsn-w...@fdsn.org
Hello Lion,

Lion Krischer wrote on 10/27/2016 12:07 PM:
> Neither my proposed change nor Joachim's less restrictive version would
> be fully backwards compatible changes and thus would require a new major
> version.

The question is how backwards compatible it really needs to be - not
just theoretically but in practice.

If a currently allowed time representation like
"2009-10-11T12:13:14.1567+02:00" shall remain valid throughout
StationXML 1.* then you are right and the proposed changes have to be
postponed until a possible future version 2.

But do we really expect that StationXML documents using non-UTC time
strings exist somewhere? Are you aware of any? It will likely turn out
that no currently existing, relevant software actually produces such
strings. In that case I would still be in favor of introducing a non-UTC
blocking pattern into the schema as soon as possible. I would think of
it as a clarification rather than a new feature.

> For such a minuscule change this is not worth it so I guess this should
> be postponed until enough changes have accumulated to justify a new
> major version.

When do you expect that new major version to be released?

Another aspect to keep in mind is that the time string format is also
relevant for several interfaces incl. all fdsnws's. If the *Z format is
made mandatory in StationXML it would be inconsistent to not make it
mandatory in the interfaces, too. I'd therefore prefer to be less
restrictive and make the trailing "Z" optional in the schema (but
recommended in the documentation). Instead define the time string format
as precisely as it was done in the FDSN Web Service Specifications 1.1b
page 6 *and* enforce this via a corresponding pattern in the xsd.
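
To make the effect of the less restrictive pattern concrete, here is a small Python sketch that applies the character-class pattern quoted earlier in this thread. It is an illustration only, using `re.fullmatch` to mimic the whole-value matching of XSD pattern facets; the helper name `is_accepted` is made up for this example.

```python
import re

# Pattern quoted earlier in this thread: a trailing "Z" is allowed but not
# required, and "+", "-" or a space cannot appear after the "T".
OPTIONAL_Z = re.compile(r"[0-9-]+T[0-9.:Z]+")


def is_accepted(value):
    """Return True if the whole value matches the pattern."""
    return OPTIONAL_Z.fullmatch(value) is not None


for value in (
    "2009-10-11T12:13:14.1567",        # no zone: still accepted
    "2009-10-11T12:13:14.1567Z",       # recommended form
    "2009-10-11T12:13:14.1567+02:00",  # offset: rejected
    "2009-10-11T12:13:14-05:00",       # offset: rejected
):
    print(f"{value!r}: {is_accepted(value)}")
```

On its own the pattern only limits the character set, so it would also accept malformed strings such as `1T2`. As a restriction on `xs:dateTime` in the schema, the lexical rules of the base type would still apply in addition to the pattern.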

Cheers
Joachim
