Addition of HTML5-like "pattern" regex attribute to OpenSearch Parameter Extension


Ian Truslove

Jul 9, 2012, 2:57:17 PM
to opens...@googlegroups.com
Hi all,

We're working on some geospatial standardization efforts as part of the ESIP group, trying to come up with some community standards for how to describe valid values for query parameters (e.g. enumerated types).  It was suggested that we could add a "pattern" attribute to the parameter element specified in the Parameter extension.  It would be much like the pattern attribute of html5's input element, and used to specify valid values for the parameters.  

Whilst we could put this attribute into an element in a custom namespace, we wanted to check whether this was reasonable fodder for pulling into the "parent" standard – that surely being a preferable solution to us.  Hope to hear what your thoughts are; we are certainly willing to help with legwork.

-Ian.

DeWitt Clinton

Jul 17, 2012, 10:26:31 AM
to opens...@googlegroups.com, ian.tr...@gmail.com
Hi Ian,

Sorry for the delay in responding!  Offhand, I can't think of a reason why 'pattern' wouldn't work, and I especially like that it reuses the html5 attribute's definition.

Two considerations I can think of with respect to the namespace (default vs custom) would be 1) the relatively low adoption in the wild of the Parameter extension (given the low adoption of POST for search requests, presumably) and 2) the challenge of defining the behavior of 'pattern' succinctly enough to be implementable without overwhelming the reader.  And perhaps a third: how convinced are you on your side that regular expressions are expressive enough for your use cases?

But other than that, it isn't mechanically difficult to add to the Parameter spec itself.  Want to propose some draft text here?

Thanks for reaching out!

-DeWitt



Lynnes, Christopher S. (GSFC-6102)

Jul 18, 2012, 9:37:37 AM
to ian.tr...@gmail.com, opens...@googlegroups.com
Ian,
In addition to DeWitt's points, there is another disadvantage that came up during our ESIP Discovery hackathon. If search engine A adds a parameter "DayOfWeek", with valids "(Mo|Tu|We|Th|Fr|Sa|Su)", and engine B adds "Weekday", with valids "(Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day", we would like clients to be able to recognize the equivalences, present a single set of choices to the users for the 2 engines, and then turn around and send the appropriate values to the servers. We can "patch up" the parameter name differences with semantics by saying that A:DayOfWeek is the sameAs B:Weekday. But it's much harder, and may even be impossible for some regexes, to patch up the patterns in the same way. In contrast, a clearly enumerated set of valids would allow us to assert equivalences in the valid values as well as the parameter name.

However, this argument does not apply for non-enumerated kinds of fields like numbers, dates, etc.
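
A minimal Python sketch of what explicitly enumerated valids would allow, using the DayOfWeek/Weekday example above. The table layout, the canonical day names, and the function name are all hypothetical; this assumes someone has already asserted, by hand, which parameters and values are equivalent.

```python
# Hypothetical shared vocabulary that a client presents to the user.
CANONICAL_DAYS = ["monday", "tuesday", "wednesday", "thursday",
                  "friday", "saturday", "sunday"]

# Hand-asserted equivalences: each engine's parameter name and its
# enumerated valids, listed in the same order as CANONICAL_DAYS.
ENGINES = {
    "A": {"parameter": "DayOfWeek",
          "valids": ["Mo", "Tu", "We", "Th", "Fr", "Sa", "Su"]},
    "B": {"parameter": "Weekday",
          "valids": ["Monday", "Tuesday", "Wednesday", "Thursday",
                     "Friday", "Saturday", "Sunday"]},
}


def query_fragment(engine, canonical_day):
    """Translate one user-facing choice into the name=value pair an engine expects."""
    spec = ENGINES[engine]
    index = CANONICAL_DAYS.index(canonical_day)
    return f"{spec['parameter']}={spec['valids'][index]}"
```

With enumerations the translation is a plain table lookup; with arbitrary regexes there is no equivalent table to look things up in.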
--
Dr. Christopher Lynnes, NASA/GSFC, Code 610.2, 301-614-5185

Ian Truslove

Jul 18, 2012, 2:11:07 PM
to Lynnes, Christopher S. (GSFC-6102), opens...@googlegroups.com
Hi Chris,

I put some discussion on the ESIP wiki last night and needed to reformat
it for email before sending it to this list - I'll work in a response to
your points too.

It occurred to me that using regexes helps with imperatively describing
what valid values are, but goes nowhere to describe the semantics of the
substitutions. Having regexes would allow service clients to perform
client-side validation of template parameters, but that left me with the
question "so what"?

Knowing the precise input format would let us make more customized UIs.
Taking a concrete example, if the template parameter was ...?datum={datum}
and the pattern was (NAD27|NAD83|WGS84) then our UI could replace a text
input with a dropdown which only allows valid values. If the template
parameter was ...?prominentColor={color}, and the pattern for "color" was
a more standard regex like \#[0-9a-fA-F]{6}, the options for the UI are
limited, e.g. to just setting the pattern attribute on the input element.
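
A rough Python sketch of that client-side decision, assuming the pattern arrives as a plain string. The function names are hypothetical, and Python's `re` syntax is only an approximation of the JavaScript-style regexes that the HTML5 pattern attribute uses.

```python
import re

# A single literal alternative: no regex metacharacters allowed.
_LITERAL = re.compile(r"[A-Za-z0-9_ -]+")


def enumeration_from_pattern(pattern):
    """Return the literal values if `pattern` is a pure enumeration, else None."""
    body = pattern
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    values = body.split("|")
    if all(_LITERAL.fullmatch(v) for v in values):
        return values
    return None


def choose_widget(pattern):
    """Pick a dropdown for pure enumerations, otherwise a pattern-checked text input."""
    values = enumeration_from_pattern(pattern)
    if values is not None:
        return ("dropdown", values)
    return ("text", pattern)


def is_valid(pattern, value):
    # HTML5 matches the pattern against the whole value;
    # re.fullmatch approximates that anchoring.
    return re.fullmatch(pattern, value) is not None
```

Here `choose_widget("(NAD27|NAD83|WGS84)")` yields a dropdown with three entries, while the color regex falls through to a text input that can only be validated.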

If we knew more about the semantics of the template substitutions, we
could still build our clients to be rich, but we might know a little more
about why. There's an OpenSearch mechanism to namespace the template
parameters
(http://www.opensearch.org/Specifications/OpenSearch/1.1#Parameter_names),
and if I read the specs correctly it would allow one to do something like:
<Url xmlns:geoDataApp="http://www.example.org/ns/apps/GeoPortal/1.0"
     template="http://www.example.org/services/geo_search?q={searchTerms}&amp;datum={geoDataApp:datum}" />
Then {datum} has a semantic basis. The service provider could publish
human-readable documentation at the namespace URI about the meaning of the
template fields, and regex patterns that limit valid inputs. To make
client applications super-responsive to the API, another service could be
exposed to present valid enumerations or regexes - but how that's done
remains within the purview of the service provider - and not an ESIP or
OpenSearch standard (though recommendations for how to document and expose
such a service may well be useful).
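
To illustrate how a client might recover that semantic basis, here is a small Python sketch that resolves each template parameter to a (namespace URI, local name) pair. The function name and the prefix dictionary are hypothetical; the default namespace URI is the OpenSearch 1.1 one, and the sketch assumes the prefix-to-URI map has already been read from the xmlns declarations.

```python
import re

# Matches {name}, {name?}, {prefix:name} and {prefix:name?}.
_PARAM = re.compile(r"\{(?:([A-Za-z_][\w.-]*):)?([A-Za-z_][\w.-]*)(\?)?\}")

# Unqualified parameter names are implicitly in the OpenSearch 1.1 namespace.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def qualified_parameters(template, prefixes):
    """Return (namespace_uri, local_name, optional) for each template parameter."""
    result = []
    for prefix, name, optional in _PARAM.findall(template):
        uri = prefixes[prefix] if prefix else OPENSEARCH_NS
        result.append((uri, name, optional == "?"))
    return result
```

A client could then key its widgets, documentation lookups, or validation rules on the (URI, name) pair rather than on the bare parameter name.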

A second example:
<Url xmlns:geoDataApp="http://www.example.org/ns/apps/GeoPortal/1.0"
     template="http://www.example.org/services/geo_search?q={searchTerms}&amp;backgroundColor={geoDataApp:color}" />

If the namespace's intent for geoDataApp:color is that it is a valid
color, not only could the UI configure itself to limit valid inputs to the
regex above, the UI could also present a color picker widget, an
eyedropper tool, whatever - by giving the substitution variable a meaning
and having an understanding about how those meanings are communicated.

I think this approach is codified in the OpenSearch spec given a couple of
references:
http://www.opensearch.org/Specifications/OpenSearch/1.1#Parameter_names
indicates that "In the case of unqualified parameter names, the local
parameter name is implicitly associated with the OpenSearch 1.1
namespace", and a small number of parameters such as searchTerms are given
a semantic context in the spec
(http://www.opensearch.org/Specifications/OpenSearch/1.1#The_.22searchTerms.22_parameter).
So, if I'm right in thinking that we can use namespaces
and the like to solve our problem of needing a way to validate inputs, I
think any standardization work in DCP-5 is moot - it boils down to
recommendations to follow existing standards, and best practices.

Chris, I think your example can only be solved with much richer semantics
than the regexes can possibly provide, and I don't know a fully automated
way of representing all those relationships. One additional question for
your example: how can an application know that A's "DayOfWeek" is at all
compatible with B's "Weekday" in the context of a federated search? I
think at some point a human developer has to read some documentation and
determine that A's "DayOfWeek" parameter and B's "Weekday" parameter are
compatible, can be mapped to one another, and are in fact valid to map
given the context in which they're used.

I'm really interested to hear feedback. I'm wary of paper-engineering this
and coming up with a less-than-pragmatic approach (and I think regex
patterns, just as Brian suggested, are pragmatic), but at the same time I
think that the problem may have already been solved in the OpenSearch spec.


-Ian.





Lynnes, Christopher S. (GSFC-6102)

Jul 18, 2012, 10:53:07 PM
to Ian Truslove, opens...@googlegroups.com
On Jul 18, 2012, at 1:11 PM, Ian Truslove wrote:

> Chris, I think your example can only be solved with much richer semantics
> than the regexes can possibly provide,

We would need something that explicitly identified valid enumerations. In theory, one might encode that in regex form, but it would have to be with the understanding that this regex was limited to pure enumerations, with no wildcarding. I don't know if that provides any benefit.

> and I don't know a fully automated
> way of representing all those relationships. One additional question for
> your example: how can an application know that A's "DayOfWeek" is at all
> compatible with B's "Weekday" in the context of a federated search? I
> think at some point a human developer has to read some documentation and

The point here is that it is the search engine developer, say, of engine B, who reads the documentation for search engine A and then asserts, using e.g. OWL, that "my B:Weekday is the sameAs A's DayOfWeek", and then does the same mapping for the valids. Now the vast masses of client developers out there :-) can write code to interpret the OWL automatically to discover the equivalence.

> determine that A's "DayOfWeek" parameter and B's "Weekday" parameter are
> compatible, can be mapped to one another, and are in fact valid to map
> given the context in which they're used.
