Fwd: JSON-LD Telecon Minutes for 2013-07-02

Bob Morris

Jul 3, 2013, 5:56:27 PM

to tdwg...@googlegroups.com

The message forwarded below does not directly address a problem
that I've seen arise in TDWG discussions, but it shows what I dare call a
quantum mechanical RDF entanglement between two useful but delicate
(some say controversial) RDF constructions, namely blank nodes and
RDF lists. The lesson here may be that the frequently
advocated owl:sameAs is not necessarily the lifesaver it is believed to
be. Furthermore, the thread it is part of may be saying that a
particular use case that the Open Annotation (OA) ontology Community
Draft [1] has for JSON-LD [2] may have no solution in JSON-LD.
(Roughly, the question is whether you can give a URI to the head of a list.)

In what I regard as the order of importance to current trends in TDWG, the lessons are:

1. owl:sameAs is not a panacea. For anything. Useful, yes. Miracle cure, no.
2. choosing any two of the below in concert may lead to problems the
evaluation of which is quite delicate:
a. ordered lists
b. blank nodes
c. owl:sameAs
d. JSON-LD
3. JSON-LD is now on a fast track for W3C Recommendation status

[1] http://www.openannotation.org/spec/core/20130208/index.html
[2] https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld/index.html

p.s. I expect that the Annotation Interest Group will propose an OA
workshop of some sort for the Florence meeting.

Robert A. Morris

Emeritus Professor of Computer Science
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390

IT Staff
Filtered Push Project
Harvard University Herbaria
Harvard University

email: morri...@gmail.com
web: http://efg.cs.umb.edu/
web: http://wiki.filteredpush.org
http://www.cs.umb.edu/~ram
===
The content of this communication is made entirely on my
own behalf and in no way should be deemed to express
official positions of The University of Massachusetts at Boston or
Harvard University.

---------- Forwarded message ----------
From: David Booth <da...@dbooth.org>
Date: Wed, Jul 3, 2013 at 2:40 PM
Subject: Re: JSON-LD Telecon Minutes for 2013-07-02
To: Robert Sanderson <azar...@gmail.com>
Cc: Pat Hayes <pha...@ihmc.us>, Linked JSON
<public-li...@w3.org>, RDF WG <public...@w3.org>,
public-openannotation <public-ope...@w3.org>

Hi Rob,

The owl:sameAs solution does have the right semantics, and it has the
benefit of using a standard term. But I'm afraid there may be a
downside as well, and I'm copying Pat to get his take on it. Normally
when you have:

<http://example/foo> owl:sameAs _:b1 .

in a graph, the blank node can be completely eliminated from the graph
and replaced by <http://example/foo>, because the semantics of a blank
node merely indicates the *existence* of a resource, but the
owl:sameAs assertion gives a concrete identity <http://example/foo> to
that resource. But in your case, you want to *avoid* having that
blank node eliminated. Thus, there could be some risk that smart
software that attempts to eliminate unnecessary nodes and assertions
(such as by making the graph "lean")
https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-mt/index.html#dfn-lean
may eliminate the blank node triple that the Turtle serializer would
need for serializing back to the original list syntax.

In other words, if the original graph said:

...
_:b1 a rdf:List .
_:b1 rdf:first :s1 .
...

and you used owl:sameAs as above, then by owl:sameAs entailment we would have:

...
_:b1 a rdf:List .
<http://example/foo> a rdf:List .
_:b1 rdf:first :s1 .
<http://example/foo> rdf:first :s1 .
...

and if that were made lean then it would become:

...
<http://example/foo> a rdf:List .
<http://example/foo> rdf:first :s1 .
...

which would not serialize back to the original Turtle list ( :s1 ... ).
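
The effect described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular RDF engine works: triples are plain (subject, predicate, object) string tuples, the helper names `substitute_same_as` and `list_heads` are hypothetical, the `rdf:rest rdf:nil` triple is added here only to close the elided list, and the substitution is a simplification of owl:sameAs entailment followed by leaning.

```python
# Hypothetical illustration: triples are (subject, predicate, object) string tuples.
# Blank nodes are written "_:b1"; every other term is treated as an IRI.
OWL_SAME_AS = "owl:sameAs"


def is_blank(term):
    """Return True for blank node labels such as '_:b1'."""
    return term.startswith("_:")


def substitute_same_as(triples):
    """Replace each blank node that is owl:sameAs an IRI with that IRI."""
    replacement = {}
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            if is_blank(o) and not is_blank(s):
                replacement[o] = s
            elif is_blank(s) and not is_blank(o):
                replacement[s] = o
    result = set()
    for s, p, o in triples:
        s2 = replacement.get(s, s)
        o2 = replacement.get(o, o)
        if p == OWL_SAME_AS and s2 == o2:
            continue  # drop the now-trivial "x owl:sameAs x" triple
        result.add((s2, p, o2))
    return result


def list_heads(triples):
    """Subjects that carry an rdf:first triple, i.e. list cells."""
    return {s for s, p, o in triples if p == "rdf:first"}


original = {
    ("<http://example/foo>", "owl:sameAs", "_:b1"),
    ("_:b1", "rdf:type", "rdf:List"),
    ("_:b1", "rdf:first", ":s1"),
    ("_:b1", "rdf:rest", "rdf:nil"),  # assumption: closes the elided list
}
rewritten = substitute_same_as(original)
```

After the substitution the only list cell is named by the IRI, so a serializer that emits the `( ... )` shorthand only for blank-node cells would have nothing left to abbreviate.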

David

On 07/03/2013 11:15 AM, Robert Sanderson wrote:
>
>
> Dear all,
>
> TL;DR version: I think that owl:sameAs is a great solution for the
> predicate.
>
> Thank you for the discussion!
>
> The primary use case for lists with identity (and other properties,
> potentially) in Open Annotation is to have an ordered workflow for
> selecting the correct part of a document. For example, EPub documents
> are just zip files with HTML and other resources packed inside them, so
> it would be beneficial to reuse the methods for selecting the correct
> segment of a resource on the web with the resources inside the EPub, but
> first the file within the zip must be selected.
>
> Thus we would want:
>
> <target1> a oa:SpecificResource ;
> oa:hasSelector <list1> ;
> oa:hasSource <epub1> .
>
> <list1> a oa:List, rdf:List ;
> rdf:isList ( <FileSelector> <TextSelector> ) .
> # Or something similar here
>
> <FileSelector> a idpf:EpubFileSelector ;
> rdf:value "/chapter1.html" .
>
> <TextSelector> a oa:TextQuoteSelector ;
> oa:prefix "bit before the segment" ;
> oa:exact "The text of the annotated segment" ;
> oa:suffix "bit after the segment" .
>
>
> The relevant part of the specification is:
> http://www.openannotation.org/spec/core/multiplicity.html#List
> (and you'll see the long red editor's note!)
>
> I think that Pat's suggestion of owl:sameAs is very appropriate. It
> works in the different syntaxes and has the semantics that the resources
> are the same -- in the case above the blank node that has first of
> <FileSelector> and the resource <list1>.
>
> The other options discussed were rdf:value, which is extremely fuzzy and
> in JSON-LD context you couldn't assert that it always had a list as its
> object if it was also used with a literal. In which case it would result
> in multiple rdf:value predicates, each with one of the list items as
> object. That led to discussing a new predicate, such as listItems,
> listValue, isList, or similar. This would have the implication that the
> blank node and the main identified resource were different resources, as
> compared to the proposal of owl:sameAs which would mean they were the
> same resource.
>
> Rob
>
> On Wed, Jul 3, 2013 at 12:30 AM, Pat Hayes <pha...@ihmc.us> wrote:
>
>
> On Jul 2, 2013, at 11:38 PM, David Booth wrote:
>
> > On 07/03/2013 12:07 AM, Pat Hayes wrote:
> >>
> >> On Jul 2, 2013, at 12:40 PM, Manu Sporny wrote:
> >>
> >>> Thanks to Niklas for scribing. The minutes from this week's telecon
> >>> are now available.
> >>>
> >>> http://json-ld.org/minutes/2013-07-02/
> >>>
> >>> Full text of the discussion follows including a link to the audio
> >>> transcript:
> >>>
> >>> -------------------------------------------------------------------
> >>>
> >>>
> > JSON-LD Community Group Telecon Minutes for 2013-07-02
> >>>
> >>> Agenda:
> >>>
> http://lists.w3.org/Archives/Public/public-linked-json/2013Jul/0000.html
> >>>
> >>>
> > Topics:
> >>> 1. Assigning Properties to Lists 2. GSoC update 3. JSON-LD / RDF
> >>> Alignment 4. Lists in the JSON and RDF data models 5. Default
> >>> interpretation of JSON arrays Resolutions: 1. Create an issue in
> >>> the RDF WG to formalize a way to express lists that need to be
> >>> identified with a URL and annotated using properties.
> >>
> >> If I understand this correctly, this can be done in RDF already. For
> >> example, the list [ x:a, x:b, 27 ] identified by the URI ex:thisList
> >> and possessing the property x:prop with value x:value is described by
> >> this RDF:
> >>
> >> ex:thisList rdf:type rdf:List .
> >> ex:thisList rdf:first x:a .
> >> ex:thisList rdf:rest _:1 .
> >> _:1 rdf:first x:b .
> >> _:1 rdf:rest _:2 .
> >> _:2 rdf:first "27"^^xsd:number .
> >> _:2 rdf:rest rdf:nil .
> >> ex:thisList x:prop x:value .
> >
> > If I have understood the issue properly, the reason
> > for raising this issue in the RDF working group is that this is not
> > necessarily an advisable usage pattern for the RDF list
> > vocabulary, because such a list cannot be serialized using Turtle's
> > list syntax: (x:a x:b 27).
>
> Yes, you are right, and I confess I had never noticed this
> limitation of Turtle previously. OK, let me change the RDF to the
> following, keeping the list bnodes but using owl:sameAs. (You can of
> course use some other property indicating equality if y'all prefer.):
>
> ex:thisList rdf:type rdf:List .
> ex:thisList x:prop x:value .
> ex:thisList owl:sameAs _:3 .
> _:3 rdf:first x:a .
> _:3 rdf:rest _:1 .
> _:1 rdf:rest _:2 .
> _:2 rdf:first "27"^^xsd:number .
> _:2 rdf:rest rdf:nil .
>
> Or, in Turtle:
>
> ex:thisList rdf:type rdf:List ;
> x:prop x:value ;
> owl:sameAs ( x:a x:b 27 ) .
>
> and you could probably omit the first triple, or even introduce your
> own category of JSON-lists and say it is one of those, instead, if
> that would help with triggering appropriate translations into other
> formats (or to distinguish these from eg RDF lists used to encode
> OWL syntax.)
>
> > It falls into a similar category as other uncommon uses of the
> > RDF List vocabulary:...
>
> ...no, it doesn't. See remark below.
>
> Pat
>
> > other uncommon uses of the RDF List vocabulary:
> > http://www.w3.org/TR/rdf-schema/#ch_collectionvocab
> > [[
> > Note: RDFS does not require that there be only one first element
> > of a list-like structure, or even that a list-like structure have a
> > first element.
> > ]]
> >
> > While not prohibited by RDF, such uncommon uses of the RDF list
> > vocabulary are certainly seen by some as being somewhat anti-social.
> > Thus, the question is whether such uses should be *encouraged*.
> >
> > David
> >
> >>
> >> Pat
> >>
> >>> Chair: Manu Sporny Scribe: Niklas Lindström Present: Niklas
> >>> Lindström, Robert Sanderson, Markus Lanthaler, Manu Sporny, David
> >>> Booth, David I. Lehn, Vikash Agrawal Audio:
> >>> http://json-ld.org/minutes/2013-07-02/audio.ogg
> >>>
> >>> Niklas Lindström is scribing.
> >>>
> >>> Topic: Assigning Properties to Lists
> >>>
> >>> Markus Lanthaler: https://github.com/json-ld/json-ld.org/issues/75
> >>> Robert Sanderson: we'd very much like to give rdf:Lists identity,
> >>> so that they can be referenced from multiple graphs. Also to
> >>> describe them with other properties ... in openannotation, we need
> >>> lists to define a selector which determines which part is
> >>> annotated ... for instance, which piece of a text is annotated,
> >>> with "before" and "after" also recorded (most clients work like
> >>> that) ... Furthermore, IDPF has agreed to use openannotation for
> >>> all EPub books ... EPubs, being zip files with a bunch of files ...
> >>> To define a selector here (take the EPub, select a file, then a
> >>> part in there) ... So we don't want to reproduce every single
> >>> selector mechanism. Thus, an ordered list of two selectors would
> >>> be needed. ... We thus need to identify lists, so that we can
> >>> reuse these selectors in multiple statements. ... I.e. a person
> >>> wants to disagree with a specific annotation, or place being
> >>> annotated. ... Furthermore, we have the order of multiple targets,
> >>> e.g. "the first passage on page three, is derived from the second
> >>> passage on page five" ... Not as essential, since it's not really
> >>> machine actionable ... Another project using lists is Shared
> >>> Canvas ... We'd very much like to use JSON-LD there too, for
> >>> selecting pages, using a list of pages and so forth ... For this,
> >>> we took the "list items" approach; the list doesn't need to be
> >>> referenced directly. Markus Lanthaler: robert, do you have the link
> >>> of an example at hand? ... But it might be nice to have this
> >>> standardized, so people don't reinvent list items all the time. ...
> >>> at the mailing list and also the OA community meeting in Europe, we
> >>> agreed that we don't want to change the model to accommodate
> >>> different syntaxes ... We want to recommend JSON-LD Manu Sporny:
> >>> what's the timeline for these needs / when would the WG close
> >>> Robert Sanderson: at the moment, the CG is in an implementation
> >>> phase. We need to discuss with Ivan, but we hope to move from CG to
> >>> WG next year Manu Sporny: we're very close to CR in JSON-LD. If
> >>> we'd add this feature in, it would put us back for many months.
> >>> Could we add this for JSON-LD 1.1? ... If we think we can put the
> >>> feature in, I think we can easily convince implementers to add it.
> >>> If we add it to the test suite, other implementers would add it.
> >>> ... So for practical purposes, we aim for it to be added within a
> >>> year or so. Robert Sanderson: Yes, that approach could work for
> >>> us. Given that you're much further ahead. It's not our preferred
> >>> option, since for implementations, it might be unpredictable. ...
> >>> Also, changing this for OA now is much easier than when in a WG ...
> >>> I don't believe anyone has implemented it yet, but IDPF needs this
> >>> to be implementable Manu Sporny: so we may put it in JSON-LD 1.1
> >>> Niklas Lindström: First thing, as far as I know, Turtle doesn't
> >>> support this syntax either. Given that you have a shorthand in
> >>> Turtle.... actually, none of the formats in RDF/XML and Turtle
> >>> support this sort of list syntax. [scribe assist by Manu Sporny]
> >>> Markus Lanthaler: niklasl, AFAICT they currently set rdf:rest to a
> >>> Turtle list Niklas Lindström: Have you discussed that as well? Am
> >>> I missing something? [scribe assist by Manu Sporny] Robert
> >>> Sanderson: No, I don't think you missed anything. [scribe assist
> >>> by Manu Sporny] Robert Sanderson: The identity is easier in
> >>> RDF/XML - you have the property for the URI. [scribe assist by Manu
> >>> Sporny] Robert Sanderson: We did consider the other
> >>> serializations, it's not a ubiquitous feature, but it would be nice
> >>> to have in JSON-LD. [scribe assist by Manu Sporny] Niklas
> >>> Lindström: Right, the main argument when we had the issue, even
> >>> though it's in the Primer that says there is nothing preventing
> >>> lists from being described, multiple start properties, etc. None of
> >>> the core syntaxes allow it, it's not intended to be used like that.
> >>> [scribe assist by Manu Sporny] Niklas Lindström: They're supposed
> >>> to be used as syntactic constructs.... model-wise, they're not
> >>> really a part of RDF.
>
> That is not correct. Collections were intended to be an integral
> part of RDF. They were used by OWL as a syntactic device for
> encoding OWL syntax in RDF, making them unavailable inside OWL, but
> that is an OWL/RDF issue. (IMO, with hindsight, this was a serious
> mistake in designing the OWL/RDF layering. But I was there at the
> time and didn't see the danger myself, so mea culpa.)
>
> >>> [scribe assist by Manu Sporny] Niklas
> >>> Lindström: If this is supported in JSON-LD, it would be a lot
> >>> easier to deviate from the recommended usage pattern.... also
> >>> making it harder for a future RDF spec, who wants to add lists as a
> >>> native part of the model [scribe assist by Manu Sporny] Niklas
> >>> Lindström: You can still use rdf:first / rdf:rest explicitly
> >>> today. [scribe assist by Manu Sporny] Robert Sanderson: I agree.
> >>> The notion of order in a graph is always problematic. Not the
> >>> common method to have a resource that is a list and has identity.
> >>> [scribe assist by Manu Sporny] Robert Sanderson: Maybe RDF
> >>> Concepts 1.1 should discuss it. [scribe assist by Manu Sporny]
> >>> David Booth: Yeah, RDF WG should consider this. I agree with
> >>> Niklas. It doesn't fit w/ the usual list pattern. Important to
> >>> consider implications. [scribe assist by Manu Sporny] ... Here's an
> >>> example:
> >>> http://www.openannotation.org/spec/core/multiplicity.html#List
> >>> Robert Sanderson: That's it exactly, thanks Niklas! Manu Sporny:
> >>> any other thoughts on this? Markus Lanthaler: it would make it hard
> >>> to expect compaction to behave as predicted ... also, compaction
> >>> might be more complex Manu Sporny: Yes. We wanted to stay away
> >>> from it since it might be a mine field in general. ... that said,
> >>> there might be a case for this. Niklas Lindström: Agree with
> >>> Manu's point - there might be something new that's interesting
> >>> here. I don't think we should do it w/o discussing implications.
> >>> Algorithmic complexity for JSON-LD API and implementations. It
> >>> might be almost as problematic as bnodes as predicates. It's
> >>> possible to do this in raw RDF. It seems highly obvious that you
> >>> can add ID in other properties. On the other hand you... [scribe
> >>> assist by Manu Sporny] Manu Sporny: ...can do it w/ literals.
> >>> Niklas Lindström: This borders on the syntactical collapse.
> >>> [scribe assist by Manu Sporny] Markus Lanthaler: syntactically
> >>> having a property carrying the actual list is nearly
> >>> indistinguishable as the requested form (using "@list" as key)
> >>> Robert Sanderson: I agree. The easiest solution for everyone
> >>> would be to have a "listItem" as a property. ... and for the RDF
> >>> WG, it might be good to define a dedicated predicate for it.
> >>> rdf:value is explicitly fuzzy, so you can't always expect a list.
> >>> David Booth: Robert, would it be feasible to just wrap the list in
> >>> another object, and attach the additional info to the wrapper
> >>> object? (I apologize that I have not fully grokked the problem, so
> >>> this suggestion may not be helpful.) ... It would be easier to sell
> >>> changing the model if there was another predicate for this. Manu
> >>> Sporny: so a specific vocabulary for lists would be beneficial in
> >>> general, working in all syntaxes ... would that address this issue?
> >>> If we quickly create a list vocabulary? Robert Sanderson: I think
> >>> so. Not preferable during the discussions we had, but the syntactic
> >>> arguments may sway this position. ... A single, interoperable
> >>> solution is preferable. Manu Sporny: anyone objects to open issue
> >>> 75, to continue this discussion? Niklas Lindström: I think we
> >>> should try to have this as an RDF issue - it really would not come
> >>> up if lists were core to the RDF model. It's a sore spot in RDF
> >>> Concepts. I think we should push it over to the RDF WG immediately.
> >>> It's arbitrary if we or OA try to push something forward, it won't
> >>> solve the real problem.... not in rdf schema vocab. [scribe assist
> >>> by Manu Sporny] Robert Sanderson: +1 to Niklas
> >>>
> >>> PROPOSAL: Create an issue in the RDF WG to formalize a way to
> >>> express lists that need to be identified with a URL and annotated
> >>> using properties.
> >>>
> >>> Manu Sporny: +1 David Booth: +1 Robert Sanderson: +1 Niklas
> >>> Lindström: +1 could be something like rdf:listValue David I. Lehn:
> >>> +1 Markus Lanthaler: +1
> >>>
> >>> RESOLUTION: Create an issue in the RDF WG to formalize a way to
> >>> express lists that need to be identified with a URL and annotated
> >>> using properties.
> >>>
> >>> Topic: GSoC update
> >>>
> >>> Vikash Agrawal: what's broken in the playground? Manu Sporny: a
> >>> bit weird ui paradigm when clicking on expanded form; headings for
> >>> JSON-LD Context stay, but the input box disappears. Markus
> >>> Lanthaler: http://www.markus-lanthaler.com/jsonld/playground/
> >>> Markus Lanthaler: the headers stay but the inputs disappear.
> >>> Previously headers were toggled off if input areas weren't
> >>> applicable Manu Sporny: play around a bit. I think the old way is
> >>> better. There may be something even better, but right now, the
> >>> problem is that something not used is still shown. Vikash Agrawal:
> >>> this is bug 50 ... by this week, this should be done. Next week is
> >>> a creator app. Markus Lanthaler: could we discuss these things on
> >>> the mailing list or the issue tracker? Manu Sporny: email danbri
> >>> and gregg regarding a schema.org JSON-LD context Markus Lanthaler:
> >>> vikash, here's Sandro's schema.org context:
> >>> http://www.w3.org/People/Sandro/schema-org-context.jsonld Markus
> >>> Lanthaler: for the creator app, have a look at:
> >>> http://schema-creator.org/
> >>>
> >>> Topic: JSON-LD / RDF Alignment
> >>>
> >>> Manu Sporny:
> >>> http://lists.w3.org/Archives/Public/public-rdf-wg/2013Jun/0233.html
> >>>
> >>>
> > Manu Sporny: I went into the spec and tried to integrate what we
> >>> have consensus on. ... see the email link above for a list of
> >>> things. ... everything should be there except for skolemization
> >>> David Booth: I just found it, but I think it looks great (just
> >>> some minor things) Manu Sporny: would it address the LC comment?
> >>> David Booth: It might. It's in the right direction. Manu Sporny:
> >>>
> http://json-ld.org/spec/ED/json-ld/20130630/diff-20130411.html#data-model
> >>>
> >>>
> > Manu Sporny: next, Peter's changes. Appendix A was changed to
> >>> flat out say that JSON-LD uses an extended RDF model. ... we just
> >>> say "Data Model", and that it's an extension of the RDF data
> >>> model. Markus Lanthaler:
> >>> http://lists.w3.org/Archives/Public/public-rdf-wg/2013Jul/0010.html
> >>>
> >>>
> > ... we need to have a response from Peter on this.
> >>> David Booth: I'd expect it to be, to the extent that I can channel
> >>> Peter. David Booth: Every node is an IRI , a blank node , a
> >>> JSON-LD value , or a list . David Booth: restricting the literal
> >>> space to JSON-LD values is a restriction rather than an extension
> >>> to the RDF model. Robert Sanderson: Sorry, have to attend another
> >>> call now, though would like to have stayed for the rest of the
> >>> conversation. Thanks everyone for the discussion re lists. ... and
> >>> I don't think that lists need to be mentioned there; they are just
> >>> sugar. Markus Lanthaler: "A JSON-LD value is a string, a number,
> >>> true or false, a typed value, or a language-tagged string." Markus
> >>> Lanthaler: thanks for joining robert Manu Sporny: on top, we
> >>> extend the value space to json true and false, numbers and
> >>> strings. David Booth: A JSON-LD value is a string , a number , true
> >>> or false , a typed value , or a language-tagged string . David
> >>> Booth: it wasn't clear that those lined up with the corresponding
> >>> RDF value space. Manu and David agree that the JSON number value
> >>> space is more general. Manu Sporny: different lexical spaces for
> >>> booleans in xsd and json
> >>>
> >>> Topic: Lists in the JSON and RDF data models
> >>>
> >>> David Booth: What about lists, aren't they the same as expressed
> >>> in RDF? Manu Sporny: not convinced that they are.. ... we need to
> >>> translate it to something in the data model. In RDF, it translates
> >>> to the list properties. There is nothing in RDF concepts to point
> >>> to. ... many just assume that it's basically part of the data
> >>> model, but it's formally not David Booth: why not point to rdf
> >>> schema? Manu Sporny: not part of the rdf data model. Niklas
> >>> Lindström: Yeah, just a comment. Could we correlate this RDF
> >>> Concepts problem w/ the suggestion wrt. list values. [scribe assist
> >>> by Manu Sporny] David Booth: RDF lists: David Booth:
> >>> http://www.w3.org/TR/rdf-schema/#ch_list Niklas Lindström:
> >>> Clearly, lists are under-specified. [scribe assist by Manu Sporny]
> >>> Niklas Lindström: Maybe we should expand RDF Concepts that is
> >>> present in the 2004 Primer and the Syntax that I scanned
> >>> previously. [scribe assist by Manu Sporny] Manu Sporny: but does
> >>> rdf schema extend the rdf data model? David Booth: no, just a
> >>> convention which is using the rdf data model Markus Lanthaler:
> >>> but's still just a vocabulary. In JSON-LD, we use [a keyword and]
> >>> an array ... it's like a node type [just as literals] Manu Sporny:
> >>> the JSON-LD data model does not talk about rdf:first and rdf:rest
> >>> David Booth: I don't think any test cases needs to be changed by
> >>> the way this is described. So it's just a question of how this
> >>> concept is being described. At present, it's described as a
> >>> difference. Manu Sporny: True. We only change how you think about
> >>> the data model. Manu Sporny: if we make an argument about the
> >>> difference between native JSON literals and RDF literals, we need
> >>> to explain the difference of expressing lists as well. David Booth:
> >>> I don't see the benefit as a difference, from an RDF perspective.
> >>> Niklas Lindström: I think I can answer re: benefit of having
> >>> different model wrt. JSON lists and RDF lists. In JSON, there are
> >>> arrays, those arrays represent repeated statements in RDF. [scribe
> >>> assist by Manu Sporny] Niklas Lindström: RDF people understands
> >>> that intuitively. We mention @set because people that don't
> >>> understand RDF, but do understand mathematical sets.... ordered
> >>> list is more popular than sets in programming. [scribe assist by
> >>> Manu Sporny] Niklas Lindström: We need a way to explain lists in
> >>> JSON-LD, in the same way that we explain sets, and other things.
> >>> Not in a way that introduces rdf:first and rdf:rest. [scribe assist
> >>> by Manu Sporny] David Booth: Bottom line: I do not see a need to
> >>> call out lists as being a difference from the RDF model, but I'm
> >>> okay with it being mentioned, in part because I'd like to push RDF
> >>> to have native lists. Markus Lanthaler: manu, did you see
> >>> http://lists.w3.org/Archives/Public/public-rdf-wg/2013Jul/0010.html
> >>>
> >>>
> > already?
> >>>
> >>> Topic: Default interpretation of JSON arrays
> >>>
> >>> David Booth: it seems strange to have @set (unordered) as the
> >>> default ... in regular json, the default is ordered Markus
> >>> Lanthaler: We discussed this quite a bit in the beginning, the
> >>> rationale was that the RDF that was generated would be unmanageable
> >>> - lots of blank nodes, lots of rdf:first/rdf:rest, you couldn't
> >>> work w/ the RDF anymore. [scribe assist by Manu Sporny] Markus
> >>> Lanthaler: we discussed it quite a bit in the beginning. The
> >>> rationale we came up with is that the generated RDF would be very
> >>> gruesome, using rdf lists for everything. ... hundreds of blank
> >>> nodes for everything. Niklas Lindström: Yeah, I agree. That's the
> >>> rationale. While it's true that arrays in JSON are ordered in their
> >>> nature, in all the JSON-LD examples, they are commonly only sets.
> >>> There is no real order. JSON-LD is intended to be used w/ RDF
> >>> properties, there are only a handful of common RDF properties -
> >>> author, contributorList, propertyChainAction, where the order is
> >>> semantic, it means something. [scribe assist by Manu Sporny] Niklas
> >>> Lindström: In every other case, it's just a bundle of things. I
> >>> think that's the better case - explicitly say order doesn't mean
> >>> anything. The same thinking has obscured lots of things wrt. XML.
> >>> You can rely on the order of the elements, not sure if you should.
> >>> It's better to say that "you can't rely on the order", unless
> >>> someone says so explicitly. [scribe assist by Manu Sporny] David
> >>> Booth: As a programmer, I'd use the exact opposite rationale.
> >>> [scribe assist by Manu Sporny] David Booth: So if the default were
> >>> changed to being ordered, then the examples would have to be
> >>> changed to add @set? Markus Lanthaler:
> >>> https://github.com/json-ld/json-ld.org/issues/12 Niklas Lindström:
> >>> We discussed whether we should do it in the @context, we could
> >>> define @set to be the default. [scribe assist by Manu Sporny]
> >>> Niklas Lindström: I agree w/ David that as a programmer, you think
> >>> like that. Unless you think otherwise. [scribe assist by Manu
> >>> Sporny] David Booth: There is also minimal changes going from JSON
> >>> to JSON-LD. [scribe assist by Manu Sporny] Niklas Lindström:
> >>> Datasets on the Web, you never know if the order is intentional or
> >>> not. It's better to assume that it's not ordered. [scribe assist by
> >>> Manu Sporny] Markus Lanthaler: JSON-LD can already serialize the
> >>> same data in so many ways already - remote contexts, you can't
> >>> really interpret the data anymore by just looking at it. Maybe
> >>> doing it in a processor flag, but not in the context. [scribe
> >>> assist by Manu Sporny] Niklas Lindström: I'd like to be able to do
> >>> this in the context. "@container": "@set" would be useful to me.
> >>> [scribe assist by Manu Sporny] David Booth: Can we have a global
> >>> way to indicate @set ? Niklas Lindström: Yeah, but I could wait
> >>> for this feature. [scribe assist by Manu Sporny] David Booth: I'm
> >>> worried about the element of surprise. It reverses the common
> >>> expectation. Manu Sporny: It has not come up as a real issue from
> >>> anywhere though. Markus Lanthaler: Is there a use case for this?
> >>> [scribe assist by Manu Sporny] Markus Lanthaler: In the majority
> >>> of instances, the order is irrelevant David Booth: yes, quite
> >>> possible Manu Sporny: a change could also backfire at this stage
> >>> ... we could potentially have a JSON-LD 1.1, for e.g. this. David
> >>> Booth: I think the best solution would be a simple global way to
> >>> specify @set, and users get used to always doing that. Niklas
> >>> Lindström: I think that it can't fly from my point of view - given
> >>> that for every case where I've seen order having meaning, it's
> >>> always been a very specific technical reason. Implicitly ordered
> >>> things as properties on the object. In every specific scenario
> >>> where order is used.... [scribe missed] [scribe assist by Manu
> >>> Sporny] Niklas Lindström: check out schema.org - only a handful
> >>> where the meaning is explicitly ordered:
> >>> http://www.w3.org/People/Sandro/schema-org-context.jsonld Niklas
> >>> Lindström: I might be open that it should be ordered, but not by
> >>> default. [scribe assist by Manu Sporny]
> >>>
> >>> -- manu
> >>>
> >>> -- Manu Sporny (skype: msporny, twitter: manusporny, G+: +Manu
> >>> Sporny) Founder/CEO - Digital Bazaar, Inc. blog: Meritora - Web
> >>> payments commercial launch http://blog.meritora.com/launch/
> >>>
> >>>
> >>>
> >>
> >> ------------------------------------------------------------
> >> IHMC (850)434 8903 or (650)494 3973
> >> 40 South Alcaniz St. (850)202 4416 office
> >> Pensacola (850)202 4440 fax
> >> FL 32502 (850)291 0667 mobile
> >> phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
> >>
> >
>
> ------------------------------------------------------------
> IHMC (850)434 8903 or (650)494 3973
> 40 South Alcaniz St. (850)202 4416 office
> Pensacola (850)202 4440 fax
> FL 32502 (850)291 0667 mobile
> phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes

Hilmar Lapp

Jul 3, 2013, 6:26:17 PM

to tdwg...@googlegroups.com

On Jul 3, 2013, at 5:56 PM, Bob Morris wrote:

> I expect that the Annotation Interest Group will propose an OA workshop of some sort for the Florence meeting.

That'd be great!

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
===========================================================

Steve Baskauf

Jul 4, 2013, 2:18:48 PM

to tdwg...@googlegroups.com

In tracking the conversation for the last few months while lurking on the OA email list, I was interested to see how RDF lists were being built into the OA specification. I had been under the impression that the various list-related features of RDF were not widely used, if not effectively deprecated. Apparently that is not true.

Anyway, in a post subsequent to the one Bob forwarded, this response was given (abridged):
--------

> downside as well, and I'm copying Pat to get his take on it. Normally when you have:
> 
>  <http://example/foo> owl:sameAs _:b1 .
> 
> in a graph, the blank node can be completely eliminated from the graph and replaced by <http://example/foo>

Well, that is a *logically valid* consequence on the RDF as far as the RDF (plus in this case a bit of OWL) semantics is concerned. It is also logically valid to replace the URI by the blank node throughout, for that matter, or to do all kinds of other things to the RDF, such as just omitting half of the triples that describe the list in question. But (as the RDF 1.1 semantics document is at pains to point out), just because it is logically valid does not mean that it is required to be done or even that it is, in all cases, permitted to be done. Semantic extensions of RDF can impose syntactic conditions which require RDF graphs to be treated in special ways, and flag errors if they find violations of these syntactic conditions. OWL itself (well, OWL-DL) imposes all kinds of such conditions on its RDF encodings, for example. 

...

>  Thus, there could be some risk that smart software that attempts to eliminate unnecessary nodes and assertions (such as by making the graph "lean")
> https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-mt/index.html#dfn-lean
> may eliminate the blank node triple that the Turtle serializer would need for serializing back to the original list syntax.

It is always the case that some RDF software can ruin almost any extension of RDF, while still being inferentially valid according to the pure RDF graph semantics. For example, it can legally, as far as basic RDF entailment is concerned, omit parts of any list. If I give you a complete description of a three-element list and you erase the part of it that describes the second element (as I did inadvertently in my last email) then you have not done anything RDF-invalid; your partial description is still entailed by the complete description. But of course you have now screwed up any software which is expecting a complete list description. There is no way to *guarantee* that almost any use of RDF over and above a simple conjunction of triple-facts is going to be preserved by all semantically correct RDF operations. If you want to impose extra requirements, then publish them and explain to builders of engines how to preserve them and what kinds of errors to post when they find that things are not right according to your published standards.

----
Two statements in there struck me as noteworthy: "Semantic extensions of RDF can impose syntactic conditions which require RDF graphs to be treated in special ways..." and "you have now screwed up any software which is expecting a complete list description. There is no way to *guarantee* that almost any use of RDF over and above a simple conjunction of triple-facts is going to be preserved by all semantically correct RDF operations. If you want to impose extra requirements, then publish them and explain to builders of engines how to preserve them...".

I have to say that I am somewhat troubled that using OA (which I expect people in the TDWG community will do) means going down a road that requires application builders to follow special sets of rules to avoid "breaking" the RDF. I can see how a developer of a specialty application, say one for annotating Twitter feeds, might not have an issue with "imposing syntactic conditions which require RDF graphs to be treated in special ways". But in our community, where people are likely to merge triples from diverse sources and do some significant owl:sameAs reasoning, it seems that we will to some extent be limited to "a simple conjunction of triple-facts". If one provider of triples assumes a laundry list of special ways that its triples have to be treated, while another assumes that lots of owl:sameAs processing will be done on its triples, we are putting a larger processing burden on the people who want to do the merging.
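To make the quoted point about incomplete lists concrete, here is a minimal sketch in plain Python (no RDF library). Triples are modelled as tuples of strings; the `ex:` items, the `_:c1` style cell labels, and the `read_list` helper are all made up for illustration. Dropping one triple leaves a subgraph that the complete graph still entails, but a consumer that expects a well-formed rdf:first/rdf:rest chain can no longer read the list.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"


def read_list(triples, head):
    """Walk an rdf:first/rdf:rest chain from head; fail on an incomplete cell."""
    index = {(s, p): o for s, p, o in triples}
    items, node = [], head
    while node != NIL:
        if (node, FIRST) not in index or (node, REST) not in index:
            raise ValueError(f"incomplete list cell at {node}")
        items.append(index[(node, FIRST)])
        node = index[(node, REST)]
    return items


# A complete three-element list built from blank-node cells (hypothetical data).
complete = [
    ("_:c1", FIRST, "ex:a"), ("_:c1", REST, "_:c2"),
    ("_:c2", FIRST, "ex:b"), ("_:c2", REST, "_:c3"),
    ("_:c3", FIRST, "ex:c"), ("_:c3", REST, NIL),
]

# Erase the triple describing the second element. This is still a subgraph
# of `complete`, so nothing RDF-invalid has happened.
partial = [t for t in complete if t != ("_:c2", FIRST, "ex:b")]
```

Calling `read_list(complete, "_:c1")` returns the three items in order, while `read_list(partial, "_:c1")` raises `ValueError`. Whether a real consumer fails loudly like this or silently returns a shorter list depends on the software.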

I have heard and absorbed Bob's frequent warnings about owl:sameAs, but given the cacophony about URIs and how we shouldn't require people to stick to one particular HTTP-proxied URI, it is hard for me to see how we can escape doing some significant owl:sameAs processing if we want to merge data from diverse sources. This is particularly true in cases such as those that BiSciCol is dealing with, where specimens, images, samples, and subsamples will be passed around to a variety of stakeholders who will probably be minting their own identifiers and (hopefully) linking them to the identifiers of others using owl:sameAs (I am assuming here that the resources being linked actually are the same thing and not just related things). So if people do this and also use OA, we are heading into exactly the situation Bob warns us about: combining ordered lists, owl:sameAs, and probably some blank nodes as well. Given my lack of experience with creating applications that actually use RDF, it is difficult for me to anticipate what this is going to mean for developers. My impression was that RDF list-related features were not favored because of the limitations they impose on conducting SPARQL queries on unstructured triple stores. But it sounds like that is exactly what the use of OA is going to require. I'm not really into OA enough to say that with confidence, though.
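As a rough sketch of what "owl:sameAs processing" during a merge can look like, the following plain-Python function groups identifiers linked by owl:sameAs with a small union-find and rewrites every triple to use one representative per group. The identifiers, the `ex:` properties, and the rule of picking the lexicographically smallest identifier as representative are assumptions made for the example, not anything prescribed by OWL or by BiSciCol.

```python
def canonicalize(triples, same_as="owl:sameAs"):
    """Rewrite triples so identifiers linked by owl:sameAs share one name."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Assumption: the smallest identifier becomes the representative.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    for s, p, o in triples:
        if p == same_as:
            union(s, o)
    # Drop the sameAs links themselves and rename everything else.
    return {(find(s), p, find(o)) for s, p, o in triples if p != same_as}


# Hypothetical data from two providers describing the same specimen.
triples = [
    ("ex:specimen1", "ex:collectedBy", "ex:alice"),
    ("urn:other:spec-9", "ex:hasImage", "ex:img7"),
    ("ex:specimen1", "owl:sameAs", "urn:other:spec-9"),
]
merged = canonicalize(triples)
```

After the call, both facts hang off `ex:specimen1`. The same rewriting would replace a blank node with a URI if the two were linked by owl:sameAs, which is the kind of substitution the forwarded discussion worries about for list heads.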

Steve

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
PMB 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 322-4942
If you fax, please phone or email so that I will know to look for it.
http://bioimages.vanderbilt.edu

Bob Morris

Jul 4, 2013, 4:24:36 PM

to tdwg...@googlegroups.com

Here's my only slightly informed intuition about where the stumbling
blocks will appear in what I suspect is the first use case most TDWG
users of RDF are striving toward, namely data integration. I guess
that most people are either thinking in terms of, or even plan to
implement, data integration by integrating one or another form of RDF
serialization, not by integrating the graphs (in the mathematical
sense, not the pictures comprising ovals and rectangles and lines
connecting them). This is pretty natural for a few (unrelated?)
reasons, not the least of which are: (a) lists of triples look kind of
simple, and the (correct) integration by appending one list to another
looks particularly simple, although the appended list by itself
reveals very little about the graph to human perusal; (b) a lot of
people's first exposure to RDF serialization is to RDF/XML (all the
more so for those fond of HTTP Content Negotiation, which in practice
is limited (?) to RDF/XML). In some cases RDF/XML is not fully
expressive of the graph-theoretic language defining RDF. See [1],
which addresses some of this issue and remarks that it is not
restricted to RDF/XML but is probably shared with other
serializations. My expression of the problem is this:

Gather(D,S)-->Integrate(D,S)-->Deserialize(D,S)-->Query(D)
will not be guaranteed to give the same result as
Gather(D,S)-->Deserialize(D,S)-->Integrate(D)-->Query(D)

where D denotes the set(s) of data and S the serialization language.
In the case of RDF/XML, Integrate(D,S) probably is some kind of XSLT.
In the triple store, Integrate(D) (which, note, is independent of the
serialization) is probably eagerly imagined to be the addition of
owl:sameAs assertions.
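One well-known way the two orders can disagree (not necessarily the
case [1] has in mind) is blank node labels, which are scoped to the
document they appear in. The sketch below uses a toy N-Triples-like
format and plain Python; the `deserialize` helper, the `ex:name`
property, and the two documents are assumptions for illustration only.

```python
def deserialize(text, doc_id):
    """Parse toy 's p o .' lines; scope blank node labels to one document."""
    triples = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4 or parts[3] != ".":
            continue  # skip anything that is not a simple triple line
        s, p, o = (
            f"_:{doc_id}_{term[2:]}" if term.startswith("_:") else term
            for term in parts[:3]
        )
        triples.add((s, p, o))
    return triples


def subjects(graph):
    """A stand-in for Query(D): which subjects does the graph describe?"""
    return {s for s, _, _ in graph}


# Two providers happen to use the same blank node label for different things.
doc_a = '_:b1 ex:name "Quercus" .\n'
doc_b = '_:b1 ex:name "Pinus" .\n'

# Integrate the serializations first (append the text), then deserialize.
text_first = deserialize(doc_a + doc_b, "merged")

# Deserialize each document first, then integrate the graphs (set union).
graph_first = deserialize(doc_a, "a") | deserialize(doc_b, "b")
```

Here `subjects(text_first)` contains one node carrying both names,
while `subjects(graph_first)` contains two distinct nodes, so the same
query gives different answers depending on the order of the steps.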

Does any of this matter? I'm fairly certain the answer depends on the
use case. For example, my guesses are: (a) rarely for data discovery;
(b) rarely for applications that depend on data deduplication (even
though there are other challenges in that case); (c) not so rarely
for descriptive data such as might be described by morphological or
molecular characters. Also, some issues are probably mitigated by
named graphs and their support in SPARQL 1.1, which recently became a
set of W3C Recommendations [2], some of whose documents address some
of the problems---and more---head on.

[1] http://www.w3.org/2009/12/rdf-ws/papers/ws10
[2] http://www.w3.org/TR/2013/REC-sparql11-overview-20130321/

Bob


Steve Baskauf

Jul 4, 2013, 5:49:59 PM

to tdwg...@googlegroups.com

It is interesting to hear your take on the theoretical future of data integration. I have been assuming that very little data integration would be done by integrating XML-serialized RDF. John Deck told me that BiSciCol is already playing with millions of triples; it's hard for me to believe that so many triples would be serialized and shipped around in XML form, given what I understand of TDWG's experience with XML as an inefficient means of data transfer. I have assumed that the transfer of triples would be in the form of some kind of compressed archive format. I have assumed that the primary use of RDF/XML transferred as a result of dereferencing a URI would be to check for particular records that are known to have been updated or to access the definition of some new term that was "discovered". I suppose that some RDF to be integrated would result from "discovery and scraping" via the web, but it seems more likely that triples would be spewed directly into a store via conversion from a non-RDF database.

With regard to content negotiation, my understanding was that XML is the default serialization, but not the only one a provider may offer. In other words, if a client dereferences a URI for which RDF is available via HTTP, it should be able to assume that if it asks for content type application/rdf+xml, it will get it. If it asks for text/turtle, it might get it if the data provider chooses to make it available. I checked http://www.w3.org/TR/rdf-concepts/#section-xml-serialization which just says "RDF has a recommended XML serialization form" but doesn't say that XML is THE recommended form. Perhaps I'm remembering R10 of http://bioimages.vanderbilt.edu/pages/guid-applicability-final-2011-01.pdf which is more specific.
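For what it's worth, here is a minimal sketch of how a client might express that preference using Python's standard urllib. The example URI, the `q=0.5` weight, and the helper names `build_rdf_request` and `fetch` are assumptions for illustration; whether a given provider honors the request for Turtle is entirely up to that provider.

```python
import urllib.request


def build_rdf_request(uri, preferred="text/turtle"):
    """Ask for the preferred type, falling back to RDF/XML at lower weight."""
    accept = f"{preferred}, application/rdf+xml;q=0.5"
    return urllib.request.Request(uri, headers={"Accept": accept})


def fetch(request):
    """Dereference the URI; return the media type the server chose and the body."""
    with urllib.request.urlopen(request) as response:
        return response.headers.get_content_type(), response.read()


# Hypothetical URI; nothing is sent over the network until fetch() is called.
request = build_rdf_request("http://example.org/resource/1")
```

A client would then check the media type returned by `fetch(request)` and pick a parser accordingly, rather than assuming it got the serialization it listed first.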

Steve

Steve Baskauf

Jul 6, 2013, 9:10:20 AM

to tdwg...@googlegroups.com

After sending this email and some reflection on its contents, I thought it might be better to give some references to explain my frequent use of "I have assumed...". My thinking on this subject has been primarily influenced by my reading about the VoID vocabulary and Sindice's notes on Publishing Web Data. I would consider VoID to be a well-known vocabulary, although http://www.w3.org/TR/void/ notes that it is not expected to become a W3C Recommendation. See also http://vocab.deri.ie/void for more on VoID. Sindice's notes are at http://sindice.com/developers/publishing . Both of these sites refer to content providers making available RDF dataset dumps. I see from http://www.w3.org/TR/void/#dumps that clients should expect the dumps to be "in one of the usual RDF serializations (RDF/XML, N-Triples, Turtle)" which I had forgotten about. So I guess a big dataset might get shipped as XML, although it wouldn't need to be if something like N-Triples were more efficient. I guess what I had keyed into in my mind was what section 3.3 says about the dump possibly being compressed using GZip or other compression algorithms. I don't really know how well this kind of transfer works since I've never tried consuming an RDF dump.
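Since I have not consumed a real dump, the following is only a sketch of the idea, assuming a gzip-compressed N-Triples file (one triple per line). It uses Python's standard gzip module to stream the lines without unpacking the whole file first; the sample data and the `count_triples` helper are made up for the example.

```python
import gzip
import io


def count_triples(fileobj):
    """Stream a gzip-compressed N-Triples dump and count its triple lines."""
    count = 0
    with gzip.open(fileobj, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line and not line.startswith("#"):  # skip blanks and comments
                count += 1
    return count


# A tiny in-memory stand-in for a downloaded dump (hypothetical data).
sample = (
    b'<http://example.org/a> <http://example.org/p> "x" .\n'
    b"# a comment line\n"
    b'<http://example.org/b> <http://example.org/p> "y" .\n'
)
dump = io.BytesIO(gzip.compress(sample))
total = count_triples(dump)
```

A line-oriented format like N-Triples lends itself to this kind of streaming, whereas an RDF/XML dump would generally need an XML parser over the whole document.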

As far as checking for updated records, Sindice suggests using RSS as a means to let people know when particular documents have been updated, so that consumers can avoid re-absorbing an entire dump when only a few records have changed. I have tried creating such a file (http://bioimages.vanderbilt.edu/rdf/images.rss ) and I think it's scalable to tens of thousands of records. I don't know about millions. At least theoretically it would allow consumers to dereference only the tiny subset of URIs known to have changed since the last time the provider's site was scraped.
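A consumer-side sketch of that idea, using only the Python standard library: parse the feed, and keep the links of items published after the last harvest. It assumes an RSS 2.0 layout with `link` and `pubDate` elements on each item; the sample feed, its URIs and dates, and the `changed_links` helper are invented for the example and are not taken from the file above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def changed_links(rss_text, last_harvest):
    """Return links of RSS items whose pubDate is later than last_harvest."""
    root = ET.fromstring(rss_text)
    changed = []
    for item in root.iter("item"):
        link = item.findtext("link")
        published = item.findtext("pubDate")
        if link and published and parsedate_to_datetime(published) > last_harvest:
            changed.append(link.strip())
    return changed


# Hypothetical feed with one old and one recently updated record.
feed = """<rss version="2.0"><channel>
  <title>Example updates</title>
  <item><link>http://example.org/rec/1.rdf</link>
        <pubDate>Mon, 01 Jul 2013 12:00:00 +0000</pubDate></item>
  <item><link>http://example.org/rec/2.rdf</link>
        <pubDate>Fri, 05 Jul 2013 08:30:00 +0000</pubDate></item>
</channel></rss>"""

last_harvest = datetime(2013, 7, 3, tzinfo=timezone.utc)
to_fetch = changed_links(feed, last_harvest)
```

Only the URIs in `to_fetch` would then need to be dereferenced again, instead of reloading the whole dump.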

It would be interesting to get feedback on this from people who have real experience with exposing and consuming vast numbers of triples. Maybe DBpedia?

Steve
