Thanks,
Jeremy
On Monday, August 3, 2015, Philip Crotwell <crot...@seis.sc.edu> wrote:
>
> One issue I have with the existing StationXML is that it can be very large
> and with much repeated information, particularly for responses. For
> example, it is very common to have 3 components of motion recorded at a
> station on a single "sensor" for which the responses of the 3 channels are
> identical, but the information is repeated for each channel. Moreover, the
> response is very often the nominal response of that model of sensor, and so
> is repeated for all stations with that model sensor. Similar issues of
> course for the DataLogger.
>
> One thing that might help with this is to make use of externally defined
> responses, such as the Nominal Response Library, and link to it from within
> the Sensor and DataLogger elements, and make this an alternative way of
> expressing the response. This could be done by adding a ResponseLink
> element to the EquipmentType with a URL to an external response, probably
> along with a starting stage and ending stage. Not only does this convey the
> same information as currently, in much less space, but it also conveys
> more information, in that it says that the network operator is just using
> the nominal response. And so if there are errors found in the nominal
> response and it is updated, end users have the option of getting an update
> and being confident that the new nominal response is more correct than the
> previous one.
>
> Also, because it is useful to have fully self-contained StationXML, it
> might be useful to create a separate element parallel to Network, that
> could contain these same responses as a "named response" where the name
> would be the same as the URL used in the ResponseLink element. That way if
> I make a request for StationXML for 99 stations that all have STS-2
> sensors, I only need to get the STS-2 response once, and all the channels
> link to it.
>
> Lastly, because a URL is very small, this information could be returned
> always as part of a channel level request, and so a client that already has
> the response associated with that URL does not need to ask for the response
> level for that channel.
>
> I will create a pull request for this, but thought some discussion might
> be good first.
>
> Philip
>
> ----------------------
> FDSN Working Group II (
> http://www.fdsn.org/message-center/topic/fdsn-wg2-data/)
>
> Sent via IRIS Message Center (http://www.fdsn.org/message-center/)
> Update subscription preferences at http://www.fdsn.org/account/profile/
>
>
In other words, turn (at least) Response into a "top-level" element.
That would allow fully self-contained XML, while still referencing
Response from Sensor, DataLogger et al. Note that EquipmentType already
has a resourceID attribute to facilitate this.
It might actually make sense to turn all EquipmentTypes into top-level
elements.
On a side note, similar considerations led to making for instance Pick a
top-level element in QuakeML. Within an event there are often several to
many Origins (e.g., evolving in time), each computed from strongly
overlapping sets of picks. It would have been very inefficient to
duplicate the very redundant sets of picks again and again as child
elements in each new origin for a given event. Just think of the
bandwidth required for that in a real-time messaging system! Instead, we
decided to make Pick, Amplitude etc. top-level elements that are only
/referenced/ by an Origin through its publicID. This makes the Origins
smaller, reduces redundancy and -- of course -- allows Pick elements to
be sent independently of an Origin element, which is important, too.
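
As a rough illustration of the reference-by-publicID idea described above, here is a minimal Python sketch. The class and field names (Pick, Origin, public_id, pick_ids) and the identifier strings are made up for this example and are not taken from the QuakeML schema; the point is only that each Origin stores pick identifiers rather than copies of the picks.

```python
from dataclasses import dataclass, field


@dataclass
class Pick:
    public_id: str   # hypothetical identifier format
    station: str
    phase: str


@dataclass
class Origin:
    public_id: str
    # Only pick identifiers are stored here, not copies of the picks.
    pick_ids: list = field(default_factory=list)


def resolve_picks(origin, picks_by_id):
    """Look up the top-level Pick objects referenced by an Origin."""
    return [picks_by_id[pid] for pid in origin.pick_ids]


# Picks live once, at the top level, indexed by their identifier.
picks = [
    Pick("pick/1", "STA1", "P"),
    Pick("pick/2", "STA2", "P"),
    Pick("pick/3", "STA3", "P"),
]
picks_by_id = {p.public_id: p for p in picks}

# Two successive origins for the same event share most of their picks.
first = Origin("origin/1", ["pick/1", "pick/2"])
second = Origin("origin/2", ["pick/1", "pick/2", "pick/3"])
```

Both origins resolve to the very same Pick objects, so adding a new Origin costs only a handful of identifiers, and a Pick can be sent on its own before any Origin exists.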
Regards,
Joachim
> > One thing that might help with this is to make use of externally defined responses, such as the Nominal
> > Response Library, and link to it from within the Sensor and DataLogger elements, and make this an
> > alternative way of expressing the response.
[...]
> > Also, because it is useful to have fully self contained StationXML, it
> > might be useful to create a separate element parallel to Network
> Joachim Saul wrote:
>
> In other words, turn (at least) Response into a "top-level" element.
> That would allow to have fully self contained XML, while referencing
> Response from Sensor, DataLogger et al
You would need two top-level elements:
- sensor (with model, manufacturer, vendor, serial number, etc., and a reference to a response) -> this would avoid repeating identical information for 3 streams; each stream would just reference the sensor by publicID
- response (with the response to be considered) -> if a network operator provides just a standard response per sensor model, all sensors of model x may reference a single instance of response. If individual sensors were individually measured/calibrated, then every sensor may point to its own response (however only once, without repetition for each stream and each deployment of the same sensor [with the same serial number/publicID] at a different station).
Such a refactoring would also allow describing a sensor even if it is not actually deployed. This is exactly one of the changes I had in mind when proposing a larger refactoring of StationXML towards 2.0, representing information as properties of natural entities rather than for a single application case (in this case: "I need the response of stream MYSTATION.HHZ at date x"). Such structural changes are not backward compatible (and should not be...); thus, they should not be done incrementally, but in a (hopefully rare) large review.
In this context, InventoryXML has a nice feature to be considered also for StationXML:
A sensor refers to its default response (either individual or type-specific). However, it can also have multiple calibration periods as children, each with its own timespan and a reference to an individual instance of response. These override the default response.
Thus, if you have a stream pointing to a sensor, and the sensor has calibration periods, you do time matching in order to get the calibrated response; if it does not have any, you simply use the sensor's default response.
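
The time matching described above could look roughly like the following Python sketch. It is not the InventoryXML schema: the names (Sensor, CalibrationPeriod, response_id_at) and the identifier strings are invented for illustration, and treating a calibration period as a half-open interval [start, end) is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class CalibrationPeriod:
    start: datetime
    end: Optional[datetime]  # None means "still valid" (assumption)
    response_id: str


@dataclass
class Sensor:
    public_id: str
    default_response_id: str
    calibrations: List[CalibrationPeriod] = field(default_factory=list)


def response_id_at(sensor, when):
    """Return the calibrated response covering `when`, else the default."""
    for cal in sensor.calibrations:
        # Half-open interval: start is included, end is excluded.
        if cal.start <= when and (cal.end is None or when < cal.end):
            return cal.response_id
    return sensor.default_response_id


sensor = Sensor(
    public_id="sensor/1234",
    default_response_id="response/nominal",
    calibrations=[
        CalibrationPeriod(datetime(2014, 1, 1), datetime(2015, 1, 1),
                          "response/calibrated-2014"),
    ],
)
```

A request for a time inside the 2014 calibration period yields the calibrated response; any other time falls back to the sensor's default.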
Best regards,
Philipp (Kästli)
Here is a first cut at allowing stationXML to refer to responses instead of
embedding them directly in a channel.
https://github.com/FDSN/StationXML/pull/6
thanks
Philip
Rearranging the response details into re-usable entities within a document is certainly worth discussing. We should keep in mind that it would be pretty disruptive to existing reading software, more disruptive than any other changes proposed so far. I'm not saying it wouldn't be worth it.
On the other hand, referring to and relying on external definitions for a complete document sounds like a bad idea to me. There is a significant added burden to any reader of such information to get a complete response, especially if the external information is in another format like RESP. Beyond this reader burden there are issues of longevity and versioning. If the external document goes away (there is no guarantee that NRL entries are permanent) then the StationXML now refers to a dead link and is incomplete. If the external document changes then the complete, re-constituted document changes and it may not be obvious to any given reader or user.
Chad
> On Aug 6, 2015, at 2:47 PM, Philip Crotwell <crot...@seis.sc.edu> wrote:
>
> Hi
>
> Here is a first cut at allowing stationXML to refer to responses instead of embedding them directly in a channel.
> https://github.com/FDSN/StationXML/pull/6
>
> thanks
> Philip
I guess there are two ways of using StationXML, as a "message" and as a
"document", and they have different needs and goals. In a messaging use,
the xml exists only for a short time as it is transmitted from server to
client and it is perfectly fine to convey some information directly and
refer to related information via links; think web page with linked images.
For a document use, you want long-term storage, and external links are
"bad" in the sense that you can't rely on them existing in the future.
Think PDF with embedded images.
It sounds like you are viewing stationxml primarily as a long lived
document, but I question whether that is really the only, or even most
common use. And the argument that the response must live in the same file
as the other net/station/channel metadata is not that far from the argument
that the metadata and the waveform data must live in the same file (a la
full seed).
I am not saying that a large, all encompassing document is a wrong use of
stationxml, and my proposal does not remove the existing "stages in
response in channel" style. But stationxml as lots of small transient
messages between client and server is also a valid use, and in some ways
may be more common. In my particular use case, SOD would greatly benefit
from not getting a response for channel A when it is exactly the same as
the response for channel B. If the ID is the same, I get it from the local
cache.
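
To make the caching idea concrete, here is a small Python sketch. It is not how SOD is implemented; ResponseCache, fake_fetch, the channel codes and the response identifier are all hypothetical, and the fetch function merely stands in for a request to a server.

```python
class ResponseCache:
    """Fetch each response at most once, keyed by its identifier."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: response_id -> response
        self._store = {}
        self.fetch_count = 0     # number of actual fetches performed

    def get(self, response_id):
        if response_id not in self._store:
            self._store[response_id] = self._fetch(response_id)
            self.fetch_count += 1
        return self._store[response_id]


def fake_fetch(response_id):
    # Stand-in for a network request; returns a placeholder payload.
    return {"id": response_id, "stages": []}


cache = ResponseCache(fake_fetch)

# Three channels of one sensor all link to the same response identifier.
channel_links = {
    "XX.STA1..HHZ": "response/sts2-nominal",
    "XX.STA1..HHN": "response/sts2-nominal",
    "XX.STA1..HHE": "response/sts2-nominal",
}
responses = {chan: cache.get(link) for chan, link in channel_links.items()}
```

With identical identifiers on all three channels, only the first lookup triggers a fetch; the other two are served from the local cache.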
That said, it may be that this type of short term messaging may be
better/easier to accommodate via something like StationJSON. In that case
then the JSON needs to be able to carry all the same information, and of
course actually be implemented as part of the servers.
Maybe that is a fair way to divide the use, xml as deep, long term, self
contained files, and JSON as small, short term, linked messages?
Philip
On Fri, Aug 7, 2015 at 4:57 PM, Chad Trabant <ch...@iris.washington.edu>
wrote:
>
> Hi Philip, Joachim & Philipp,
>
> Rearranging the response details into re-usable entities within a document
> is certainly worth discussing. We should keep in mind that it would be
> pretty disruptive to existing reading software, more disruptive than any
> other changes proposed so far. I'm not saying it wouldn't be worth it.
>
> On the other hand, referring to and relying on external definitions for a
> complete document sounds like a bad idea to me. There is a significant
> added burden to any reader of such information to get a complete response,
> especially if the external information is in another format like RESP.
> Beyond this reader burden there are issues of longevity and versioning. If
> the external document goes away (there is no guarantee that NRL entries are
> permanent) then the StationXML now refers to a dead link and is
> incomplete. If the external document changes then the complete,
> re-constituted document changes and it may not be obvious to any given
> reader or user.
>
> Chad
>
> On Aug 6, 2015, at 2:47 PM, Philip Crotwell <crot...@seis.sc.edu> wrote:
>
> Hi
>
> Here is a first cut at allowing stationXML to refer to responses instead
> of embedding them directly in a channel.
> https://github.com/FDSN/StationXML/pull/6
>
> thanks
> Philip
>
> On Thu, Aug 6, 2015 at 12:16 PM, Philipp Kästli <kae...@sed.ethz.ch> wrote:
> [...]
There is one other use case I forgot to mention, but I think is important
and was part of my original motivation. Consider the case of a network
operator uploading metadata to a datacenter. I really hope that some day
soon we can dispense with dataless seed and use stationxml for this. In the
case of uploading metadata, I really think there is an advantage for the
network operator to be able to say "the response of this channel is the
nominal response for this configuration of this sensor and that logger". It
makes the submission easier, gives the datacenter more information, and
potentially allows for future updates to a nominal response to be
incorporated.
The way I have uploaded metadata in the past, for example, is to
download the NRL response from IRIS, have PDCC create dataless SEED that
incorporates that response but without any information that connects the
response back to the NRL, and then send the very same (hopefully) response
back to IRIS. It seems kind of weird to download nominal responses from IRIS
just to send them right back, especially with the very real potential of me
and my fat fingers screwing something up in the transition.
Thoughts?
Philip
On Fri, Aug 7, 2015 at 5:47 PM, Philip Crotwell <crot...@seis.sc.edu>
When developing QuakeML on the one hand, and the SeisComP software on the other, around nine years ago, we had very similar discussions. Data needed to be represented as (XML) documents containing huge earthquake catalogs in some application cases (e.g., a reference catalog for a hazard assessment), but as very small snippets of information in others (e.g., one piece of software informing another of the appearance of a new Pick). Furthermore, you need some way to store the information in a database, for which you might not want to use XML strings.
Therefore, what IMHO we are looking for may not be just an XML format. In fact, as in the case of QuakeML, we are more generically looking for a data model. XML is just one of many ways to serialize the data, and in many use cases it may not be the most efficient. For instance in SeisComP, we initially used XML for messages containing e.g. picks and amplitudes, resulting in 90% of the CPU time being spent on (de)serializing XML. Therefore binary messages are now used. I am certainly not in favor of using binary formats for station metadata exchange in an FDSN context. But as you say, depending on the use case, it may well be JSON (or YAML or some future format). The representation / serialization format should not really matter; different representations are converted easily as long as they use a standard *data* *model*. I propose to standardize the model itself (by way of a UML diagram), and an XML representation (as a well-defined format for high-level data exchange).
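
A toy Python sketch of "one data model, several serializations" might look like this. The SensorRecord class and the element and key names (publicID, Model, ResponseRef) are invented for the example and are not part of StationXML or any proposed schema; only the Python standard library is used.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class SensorRecord:
    """The data model: independent of any serialization format."""
    public_id: str
    model: str
    response_id: str


def to_json(sensor):
    return json.dumps(asdict(sensor), sort_keys=True)


def from_json(text):
    return SensorRecord(**json.loads(text))


def to_xml(sensor):
    # Element and attribute names are illustrative only.
    elem = ET.Element("Sensor", {"publicID": sensor.public_id})
    ET.SubElement(elem, "Model").text = sensor.model
    ET.SubElement(elem, "ResponseRef").text = sensor.response_id
    return ET.tostring(elem, encoding="unicode")


def from_xml(text):
    elem = ET.fromstring(text)
    return SensorRecord(
        public_id=elem.get("publicID"),
        model=elem.findtext("Model"),
        response_id=elem.findtext("ResponseRef"),
    )


sensor = SensorRecord("sensor/1234", "STS-2", "response/sts2-nominal")
```

Because both formats map to and from the same model object, converting JSON to XML is just from_json followed by to_xml, and neither format needs to know about the other.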
Regards
Joachim
One more thought: a reference is cheap, and so can be returned in a
"channel" level request. If the client asks for "response" level, they get
the stages, possibly referred to within the same file. This is for the sake
of existing software and longevity, I think.
On the other hand, all this is meaningless unless datacenters are actually
willing to coalesce identical (often nominal) responses and there is a URL
to retrieve an individual response.
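
For what coalescing identical responses could mean in practice, here is one possible Python sketch. Keying responses by a hash of their content is my own assumption, not something proposed in this thread, and the channel codes and gain values are placeholders rather than real instrument data.

```python
import hashlib
import json


def response_key(response):
    """Derive a stable key from the response content."""
    # Sorted keys and fixed separators give a canonical text form.
    canonical = json.dumps(response, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def coalesce(channel_responses):
    """Split {channel: response} into named responses plus per-channel links."""
    named = {}
    links = {}
    for channel, response in channel_responses.items():
        key = response_key(response)
        named.setdefault(key, response)  # keep the first copy only
        links[channel] = key
    return named, links


# Placeholder responses: two channels share one, a third differs.
resp_a = {"stages": [{"gain": 1500.0, "units": "M/S"}]}
channel_responses = {
    "XX.STA1..HHZ": resp_a,
    "XX.STA1..HHN": dict(resp_a),
    "XX.STA2..HHZ": {"stages": [{"gain": 750.0, "units": "M/S"}]},
}
named, links = coalesce(channel_responses)
```

Here three channels collapse to two stored responses. A real datacenter would have to decide what counts as "identical" (for example, how to compare floating-point values), which this sketch glosses over.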
The upload use case may still be worth pursuing.
Philip
On Tue, Aug 11, 2015 at 6:32 AM, Joachim Saul <sa...@gfz-potsdam.de> wrote:
> Hi Chad,
>
> Chad Trabant wrote on 08/07/2015 10:58 PM:
> > On the other hand, referring to and relying on external definitions for
> > a complete document sounds like a bad idea to me.
>
> It is a bad idea to refer to external data *if* they are not guaranteed
> to be persistent or if they are not in a standard format.
>
> > There is a
> > significant added burden to any reader of such information to get a
> > complete response, especially if the external information is in another
> > format like RESP. Beyond this reader burden there are issues of
> > longevity and versioning. If the external document goes away (there is
> > no guarantee that NRL entries are permanent) then the StationXML now
> > refers to a dead link and is incomplete. If the external document
> > changes then the complete, re-constituted document changes and it may
> > not be obvious to any given reader or user.
>
> I agree. On the other hand, "external" may refer to a Response that is
> not part of a Channel, hence from the Channel point of view it is
> "external" even though it resides within the same, self-contained XML
> document. Where self-containedness is a requirement it can always be
> achieved.
>
> One could think of persistent identifiers (e.g., DOIs) for responses,
> too. In fact, given the decreasing costs for assigning persistent
> identifiers, it might become feasible to use them for addressing
> specific responses as part of the NRL. Probably not within the very near
> future, but I have the impression that developments will move in that
> direction.
>
> Cheers
> Joachim
Chad Trabant wrote on 08/07/2015 10:58 PM:
> On the other hand, referring to and relying on external definitions for
> a complete document sounds like a bad idea to me.
It is a bad idea to refer to external data *if* they are not guaranteed
to be persistent or if they are not in a standard format.
> There is a
> significant added burden to any reader of such information to get a
> complete response, especially if the external information is in another
> format like RESP. Beyond this reader burden there are issues of
> longevity and versioning. If the external document goes away (there is
> no guarantee that NRL entries are permanent) then the StationXML now
> refers to a dead link and is incomplete. If the external document
> changes then the complete, re-constituted document changes and it may
> not be obvious to any given reader or user.
I agree. On the other hand, "external" may refer to a Response that is