question about sioc / foaf usage


Nathan

Nov 29, 2009, 10:21:37 AM
to pedant...@googlegroups.com
Hi All,

(keeping it short) - Using the site of John Breslin's as an example;
particularly the following:

http://johnbreslin.com/blog/index.php?sioc_type=site#weblog

to my eye the <foaf:Document rdf:about=""> description should be:
<foaf:Document
rdf:about="http://johnbreslin.com/blog/index.php?sioc_type=site#weblog">

or am I missing something as per usual?!

ps: this isn't from the standpoint of John's made a mistake, but rather
I'm looking at examples in the wild of sioc usage to ensure I do it
correctly.

regards,

nathan

Hogan, Aidan

Nov 30, 2009, 9:52:51 AM
to pedant...@googlegroups.com
Hi Nathan,

rdf:about="" is a simple shortcut to refer to the current document -- or
more accurately, the in-scope base URI. In this case -- and after
removing the hash fragment -- rdf:about="" represents [1].

As such, using rdf:about="" is a common RDF/XML (not just SIOC) shortcut
for talking about the document itself. Ideally, John should specify a
baseURI, but that's another topic [2].

Cheers,
Aidan

[1] http://johnbreslin.com/blog/index.php?sioc_type=site
[2] http://pedantic-web.org/fops.html#base
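
The resolution rule described above can be sketched in a few lines of Python. This is only an illustration, assuming no xml:base is declared so that the retrieval URI acts as the base; the function name `resolve_rdf_about` is made up for the example.

```python
from urllib.parse import urldefrag, urljoin


def resolve_rdf_about(about, base_uri):
    """Resolve an rdf:about value against the in-scope base URI."""
    if about == "":
        # An empty reference means "the base URI itself", minus any fragment.
        return urldefrag(base_uri).url
    # Any other relative reference is resolved against the base as usual.
    return urljoin(base_uri, about)


base = "http://johnbreslin.com/blog/index.php?sioc_type=site#weblog"
print(resolve_rdf_about("", base))         # the document, i.e. [1]
print(resolve_rdf_about("#weblog", base))  # the weblog described in it
```

Under these assumptions rdf:about="" names the document at [1], while rdf:about="#weblog" would name the thing Nathan originally expected.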

Nathan

Nov 30, 2009, 10:00:22 AM
to pedant...@googlegroups.com
Cheers Aidan,

that explains it, assuming it's perfectly okay to not use the shorthand
and be a bit more verbose w/ about. ( Purely for the benefit of parsing
rdf when called through a proxy / when you don't know the uri of the
current document. )

regards & thanks again,

Nathan
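
The proxy concern can be made concrete with a rough Python sketch. It assumes a simplified reader that only honours an xml:base attribute on the root element (real RDF/XML parsers also handle nested xml:base); `document_subjects`, the template, and the proxy URL are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag, urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

TEMPLATE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"{base_attr}>
  <foaf:Document rdf:about=""/>
</rdf:RDF>"""

WITHOUT_BASE = TEMPLATE.format(base_attr="")
WITH_BASE = TEMPLATE.format(
    base_attr=' xml:base="http://johnbreslin.com/blog/index.php?sioc_type=site"'
)


def document_subjects(rdfxml, retrieval_uri):
    """Return the absolute URI behind every rdf:about in the document."""
    root = ET.fromstring(rdfxml)
    # Simplification: only a root-level xml:base is considered.
    base = root.get(f"{{{XML_NS}}}base", retrieval_uri)
    subjects = []
    for node in root.iter():
        about = node.get(f"{{{RDF_NS}}}about")
        if about is None:
            continue
        if about == "":
            subjects.append(urldefrag(base).url)
        else:
            subjects.append(urljoin(base, about))
    return subjects


proxy_uri = "http://proxy.example/fetch?id=1"  # hypothetical proxy address
print(document_subjects(WITHOUT_BASE, proxy_uri))
print(document_subjects(WITH_BASE, proxy_uri))
```

Without a declared base the subject follows whatever URI the file was fetched from; with xml:base (or a fully written-out rdf:about) it stays fixed.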

Kingsley Idehen

Nov 30, 2009, 10:27:18 AM
to pedant...@googlegroups.com
Hogan, Aidan wrote:
> Hi Nathan,
>
> rdf:about="" is a simple shortcut to refer to the current document -- or
> more accurately, the in-scope base URI. In this case -- and after
> removing the hash fragment -- rdf:about="" represents [1].
>
And how do I describe [1] ? For instance, since [1] will 200 OK on an
HTTP GET, how do I refer to it in a Linked Data oriented triple?
Basically, [1] is an RDF island host i.e, a document that contains RDF
triples expressed in RDFa. Thus, how would one describe resource you
refer to at [1]? Is a Data Container not an Item of Data worthy of
structured description (metadata), de-referencable via its own
unambiguous HTTP URI?

Kingsley
--


Regards,

Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com




Nathan

Nov 30, 2009, 11:06:29 AM
to pedant...@googlegroups.com, kid...@openlinksw.com
Kingsley Idehen wrote:
> Hogan, Aidan wrote:
>> Hi Nathan,
>>
>> rdf:about="" is a simple shortcut to refer to the current document -- or
>> more accurately, the in-scope base URI. In this case -- and after
>> removing the hash fragment -- rdf:about="" represents [1].
>>
> And how do I describe [1] ? For instance, since [1] will 200 OK on an
> HTTP GET, how do I refer to it in a Linked Data oriented triple?
> Basically, [1] is an RDF island host i.e, a document that contains RDF
> triples expressed in RDFa. Thus, how would one describe resource you
> refer to at [1]? Is a Data Container not an Item of Data worthy of
> structured description (metadata), de-referencable via its own
> unambiguous HTTP URI?
>

This ties in with something I'm working on (surprise!) and some guidance
may help

I have:
- a Post on an xhtml page
- the sioc:Post data in rdf
- (and in the future) the content as RDFa

I was planning to use http://somedomain.com/item/123 as both the URL for
the post, and the uri for the rdf data, using content negotiation to
deliver html or rdf.

then i considered the following in rdf:
<http://somedomain.com/item/123>
a sioc:Post ;
sioc:link <http://somedomain.com/item/123> .

which may make sense when it's RDFa(?) but for now.. doesn't really?
further, I'm very keen to avoid extensions (.html, .rdf, .n3 etc) as
want to use content negotiation + the following just seems awful to me:
<http://somedomain.com/item/123.rdf>
a sioc:Post ;
sioc:link <http://somedomain.com/item/123.html> .

thoughts?

regards,

nathan
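
The content-negotiation plan in the message above could look roughly like this on the server side. It is a simplified Python sketch, not a full implementation of HTTP Accept handling (it ignores media-range specificity and parameters other than q); `parse_accept`, `negotiate`, and the list of available types are assumptions for illustration.

```python
AVAILABLE = ["text/html", "application/rdf+xml"]  # server default comes first


def parse_accept(header):
    """Return (media_range, q) pairs from an Accept header value."""
    preferences = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        quality = 1.0  # q defaults to 1 when omitted
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        preferences.append((media_range, quality))
    return preferences


def negotiate(header, available=AVAILABLE):
    """Pick the available media type with the highest matching q value."""
    preferences = parse_accept(header)
    best, best_q = None, 0.0
    for media_type in available:
        wildcard = media_type.split("/")[0] + "/*"
        matches = [q for rng, q in preferences
                   if rng in (media_type, wildcard, "*/*")]
        quality = max(matches, default=0.0)
        if quality > best_q:
            best, best_q = media_type, quality
    # Fall back to the server default when nothing matches.
    return best or available[0]


print(negotiate("application/rdf+xml"))
print(negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
```

A single URI such as http://somedomain.com/item/123 can then serve HTML to browsers and RDF to agents that ask for it, without extension-based URLs.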

Nathan

Nov 30, 2009, 11:19:03 AM
to pedant...@googlegroups.com, kid...@openlinksw.com
please do add in sioc:about to the equation..
sioc:about <http://somedomain.com/item/123>

Hogan, Aidan

Nov 30, 2009, 12:07:24 PM
to pedant...@googlegroups.com
Hi Kingsley,

> > rdf:about="" is a simple shortcut to refer to the current document -- or
> > more accurately, the in-scope base URI. In this case -- and after
> > removing the hash fragment -- rdf:about="" represents [1].
> >
> And how do I describe [1] ? For instance, since [1] will 200 OK on an
> HTTP GET, how do I refer to it in a Linked Data oriented triple?
> Basically, [1] is an RDF island host i.e, a document that contains RDF
> triples expressed in RDFa. Thus, how would one describe resource you
> refer to at [1]? Is a Data Container not an Item of Data worthy of
> structured description (metadata), de-referencable via its own
> unambiguous HTTP URI?

Well, yes... this is a slightly tricky philosophical question, but using
[1] as the information-resource URI to represent the document returned
is perfectly okay according to linked data principles:

1. Use URIs as names for things [yep]
2. Use HTTP URIs so that people can look up those names. [yep]
3. When someone looks up a URI, provide useful information, using the
standards (RDF, SPARQL) [yep]
4. Include links to other URIs so that they can discover more things.
[not directly applicable]

Why not use URI [1] to represent the document at [1] in Linked Data?

Kingsley Idehen

Nov 30, 2009, 12:47:42 PM
to pedant...@googlegroups.com
<http://johnbreslin.com/blog/index.php?sioc_type=site> is an
Address/Location (URL) of a Data Container. Thus, the 200 OK treatment
by HTTP. This matter isn't philosophical, far from it. The question is
simply this: is a Document a Data Item or Not? If it is, then it should
have an Identifier that enables its observer (entity describing it) to
associate or discern its characteristics; basically, we should be able
to describe it as we would any other Data Item (or Object).

I do understand:
<http://johnbreslin.com/blog/index.php?sioc_type=site#this> as a simple
mechanism for enabling unambiguous description of its referent (what it
refers to or identifies) which makes an implicit association with
<http://johnbreslin.com/blog/index.php?sioc_type=site>; which holds the
default data representation of referent description.

Richard Cyganiak

Nov 30, 2009, 1:46:16 PM
to pedant...@googlegroups.com
On 30 Nov 2009, at 18:47, Kingsley Idehen wrote:
> <http://johnbreslin.com/blog/index.php?sioc_type=site> is an Address/
> Location (URL) of a Data Container.

The term “Data Container” does not exist in any relevant
specification. Terms from specifications include “information
resource” and “RDF document”. Colloquial terms in common use include
“web document” or “web page”.

> Thus, the 200 OK treatment by HTTP. This matter isn't philosophical,
> far from it. The question is simply this: is a Document a Data Item
> or Not?

The term “Data Item” does not exist in any relevant specification. I'm
not sure what you mean by that term. Googling for “data item” tells me
that it is defined as “1. A named component of a data element; usually
the smallest component. 2. A subunit of descriptive information or
value classified under a data element. For example the data element
"military personnel grade" contains data items such as sergeant,
captain, and colonel.”

Using terms that are defined in Web specifications helps clarity in
communication.

Thank you,
Richard

Kingsley Idehen

Nov 30, 2009, 1:53:33 PM
to pedant...@googlegroups.com
Richard Cyganiak wrote:
> On 30 Nov 2009, at 18:47, Kingsley Idehen wrote:
>> <http://johnbreslin.com/blog/index.php?sioc_type=site> is an
>> Address/Location (URL) of a Data Container.
>
> The term “Data Container” does not exist in any relevant
> specification. Terms from specifications include “information
> resource” and “RDF document”. Colloquial terms in common use include
> “web document” or “web page”.
Richard,

I am using these terms deliberately to reinforce my contempt for:
resource and information resource.

Colloquialism is as subjective as anything else in our real world of
comprehension.
>
>> Thus, the 200 OK treatment by HTTP. This matter isn't philosophical,
>> far from it. The question is simply this: is a Document a Data Item
>> or Not?
>
> The term “Data Item” does not exist in any relevant specification. I'm
> not sure what you mean by that term. Googling for “data item” tells me
> that it is defined as “1. A named component of a data element; usually
> the smallest component. 2. A subunit of descriptive information or
> value classified under a data element. For example the data element
> "military personnel grade" contains data items such as sergeant,
> captain, and colonel.”
>
> Using terms that are defined in Web specifications helps clarity in
> communication.
No I won't, especially as I have a lot of contempt for the specs.

In short, anything that decides to re-write a technical continuum
deserves the contempt it attracts.

Do you seriously believe there wasn't a realm of distributed objects,
identity, data model etc.. before the World Wide Web?

Kingsley
>
> Thank you,
> Richard
>
>
>
>> If it is, then it should have an Identifier that enables its observer
>> (entity describing it) to associate or discern its characteristics;
>> basically, we should be able to describe it as we would any other
>> Data Item (or Object).
>>
>> I do understand:
>> <http://johnbreslin.com/blog/index.php?sioc_type=site#this> as a
>> simple mechanism for enabling unambiguous description of its referent
>> (what it refers to or identifies) which makes an implicit association
>> with <http://johnbreslin.com/blog/index.php?sioc_type=site>; which

Kingsley Idehen

Nov 30, 2009, 1:55:27 PM
to pedant...@googlegroups.com
Richard Cyganiak wrote:
> On 30 Nov 2009, at 18:47, Kingsley Idehen wrote:
>> <http://johnbreslin.com/blog/index.php?sioc_type=site> is an
>> Address/Location (URL) of a Data Container.
>
> The term “Data Container” does not exist in any relevant
> specification. Terms from specifications include “information
> resource” and “RDF document”. Colloquial terms in common use include
> “web document” or “web page”.
>
>> Thus, the 200 OK treatment by HTTP. This matter isn't philosophical,
>> far from it. The question is simply this: is a Document a Data Item
>> or Not?
>
> The term “Data Item” does not exist in any relevant specification. I'm
> not sure what you mean by that term. Googling for “data item” tells me
> that it is defined as “1. A named component of a data element; usually
> the smallest component. 2. A subunit of descriptive information or
> value classified under a data element. For example the data element
> "military personnel grade" contains data items such as sergeant,
> captain, and colonel.”
>
> Using terms that are defined in Web specifications helps clarity in
> communication.

For the sake of others.

How do you describe an information resource via an RDF graph that is
supposed to play well with Linked Data principles?

Kingsley
>
> Thank you,
> Richard
>
>
>
>> If it is, then it should have an Identifier that enables its observer
>> (entity describing it) to associate or discern its characteristics;
>> basically, we should be able to describe it as we would any other
>> Data Item (or Object).
>>
>> I do understand:
>> <http://johnbreslin.com/blog/index.php?sioc_type=site#this> as a
>> simple mechanism for enabling unambiguous description of its referent
>> (what it refers to or identifies) which makes an implicit association
>> with <http://johnbreslin.com/blog/index.php?sioc_type=site>; which

Hogan, Aidan

Nov 30, 2009, 3:52:23 PM
to pedant...@googlegroups.com
Hi Kingsley,

> For the sake of others.
>
> How do you describe an information resource via an RDF graph that is
> supposed to play well with Linked Data principles?

If I understand the intent of your question, you are asking how an
information resource should be identified -- i.e., what's a suitable
URI? To clarify first: what's wrong with -- e.g. -- simply [1]? For me,
this fits well with [2]. How does it not play well with Linked Data
principles? Referring back to earlier:

> using [1] as the information-resource URI to represent the document
> returned is perfectly okay according to linked data principles:
>
> 1. Use URIs as names for things [yep]
> 2. Use HTTP URIs so that people can look up those names. [yep]
> 3. When someone looks up a URI, provide useful information, using
> the standards (RDF, SPARQL) [yep]
> 4. Include links to other URIs so that they can discover more things.
> [not directly applicable]

[2] http://www.w3.org/TR/webarch/#id-resources

Peter Ansell

Nov 30, 2009, 4:04:12 PM
to pedant...@googlegroups.com, publi...@w3.org
2009/12/1 Hogan, Aidan <aidan...@deri.org>:
My impression of the entire debacle is that it is designed to make
sure that every document has at least two identifiers so that
reasoning systems do not have to distinguish between details about the
delivery of the document, and details contained in the document. Some
rdf harvesting engines want to be able to say <URL>
<retrievedWithhttpStatusCode> "200", for example, and the flow on
effect is that you now apparently can't use the documents URL for any
other purpose because the extra httpStatusCode triple may get added
into the RDF store without a different graph URI. If the statements
are merged in a single graph, there is no way to separate it after
that point because reasoning engines, in this case description logics,
weren't designed with this multiplicity in mind. Interestingly,
everyone is okay with adding <URL> <retrievedWithhttpStatusCode>
"303", because that particular magic value is judged to be immaterial
to the nature of the URL.

That is just my impression of the underlying cause for this entire
debacle without any of the philosophical details about the nature of
the document etc., that always pop up.

Cheers,

Peter
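
The merging problem described above can be shown with plain Python sets. This is an illustrative sketch only: the crawler graph name and the status-code property are hypothetical, and a real store would use its own named-graph mechanism.

```python
DOC = "http://johnbreslin.com/blog/index.php?sioc_type=site"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_DOCUMENT = "http://xmlns.com/foaf/0.1/Document"

# Hypothetical names used by an imaginary crawler.
CRAWL_GRAPH = "http://crawler.example/graphs/fetch-log"
STATUS = "http://crawler.example/vocab#retrievedWithHttpStatusCode"

published = {(DOC, RDF_TYPE, FOAF_DOCUMENT)}  # what the document says
observed = {(DOC, STATUS, "200")}             # what the crawler saw

# Plain triples: once merged, nothing records where a statement came from.
merged = published | observed

# Quads: the fourth element names the graph each statement belongs to.
quads = {(s, p, o, DOC) for s, p, o in published}
quads |= {(s, p, o, CRAWL_GRAPH) for s, p, o in observed}


def triples_in(graph_name, quads):
    """Recover the triples that were asserted in one named graph."""
    return {(s, p, o) for s, p, o, g in quads if g == graph_name}


print(len(merged))
print(triples_in(CRAWL_GRAPH, quads))
```

With a graph name attached, delivery details about the document stay separable from the statements it contains, even though both use the same subject URI.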

Kingsley Idehen

Nov 30, 2009, 5:01:34 PM
to Peter Ansell, pedant...@googlegroups.com, publi...@w3.org
Peter,

My real gripe comes down to the fact that there seems to be an unwritten
rule re. Documents i.e., they aren't material data objects (entities,
data items, resources) re. RDF. Proof of this rule is demonstrated by
the plethora of RDF files that don't assert any relationship between the
RDF file (Data Container) and its structured content (Data Items).

In addition, re. the HTTP system that drives the Web, when you issue an
HTTP GET against a resource (i.e. a file; I don't buy the Information
Resource moniker one bit), a server issues a 200 OK to indicate its
ability to serve a User Agent the resource it requested. Naturally, this
isn't how a Data Identifier works, since Identifiers are independent of
location, values, and structure (these are very old Identity principles
from way before the Web). Instead, you have a 303 if the Identifier looks
like a normal resource URL, or you leverage the Fragment Identifier
component of the URL by taking the remainder of the URL as the address of
the document containing the description of the HTTP URI's referent.

Thus, as I've stated before (elsewhere), in my world view, all data
objects are equal i.e., if something is worth describing (e.g. a
Document or Data Container or File), it deserves an Identifier, and in
the context of HTTP based data networks -what Linked Data is about - it
means: a Generic HTTP scheme URI.

I assume you've noticed the dearth of RDF examples that include
descriptions of RDF files that are distinct, but connected, to the file
contents.


> Cheers,
>
> Peter

Kingsley Idehen

Nov 30, 2009, 5:37:20 PM
to Ian Davis, Peter Ansell, pedant...@googlegroups.com, publi...@w3.org
Ian Davis wrote:
>> I assume you've noticed the dearth of RDF examples that include descriptions
>> of RDF files that are distinct, but connected, to the file contents.
>>
>
> People have been doing that for years using foaf:primaryTopic. See
> example at http://xmlns.com/foaf/spec/#term_PersonalProfileDocument
> and substitute URIs for the nodeIDs
>
> Ian
>
>
Ian,

Dearth:
noun [in sing. ]
a scarcity or lack of something : there is a dearth of evidence. See
note at lack .

I never said: non existent. A majority of RDF files don't express the
aforementioned relationship.

If you look up Linked Data from spaces associated with myself or OpenLink,
you will see use of the aforementioned property re. the missing relation.
Also, you may find that a few people added the missing triple to their
RDF files after nudges from me.

I hope I've made things clearer?

Peter Ansell

Nov 30, 2009, 7:02:54 PM
to Ian Davis, Kingsley Idehen, pedant...@googlegroups.com, publi...@w3.org
2009/12/1 Ian Davis <li...@iandavis.com>:
> On Mon, Nov 30, 2009 at 10:37 PM, Kingsley Idehen
> <kid...@openlinksw.com> wrote:
>
>>
>> If you lookup Linked Data from spaces associated with myself of OpenLink you
>> will see use the aforementioned property re. missing relation. Also, you may
>> also find out that few people added the missing triple to their RDF files
>> after nudges from me.
>>
>> I hope I've made things clearer?
>
> I've read this thread and I don't understand the fuss. Some people
> aren't linking the document to the data it contains so we should
> encourage them to. Don't know why that is characterised as a debacle.
>

The necessary declaration of "document" as distinct, and yet necessary
for the definition of "data", and the necessity of different URI's for
these two concepts, are fundamental sticking points for many people.

If the HTTP web no longer existed (or the internet connection was
temporarily down), the discussion about document versus data would be
moot. Simple RDF Triple database queries, that do not rely on HTTP
communication, have no necessary need to refer to the
Document/Artifact. Only "data" would exist in the RDF triples (unless
you deliberately blur the division using the notion of foaf:Document
via foaf:primaryTopic for instance). Hence the debacle with saying
that Document is a necessary element to understand and use RDF data
linked together using resolvable HTTP URI's when to many it is just an
artifact that doesn't influence, and shouldn't need to semantically
interfere with, the data/information content that is actually being
referenced.

In the long term, I see it as introducing a permanent link from a
semantic RDF (or other similar format) universe to the current
document segregated web that wouldn't be there if everyone shared
their RDF information through some other system, and for example only
used the URI verbatim to do queries on some global hashtable/index
somewhere where there was no concept of document at the native RDF
level. The definition of Linked Data doesn't specifically say that
HTTP URI's have to be resolved using HTTP GET requests over TCP port
80 using DNS for an intermediate host name lookup as necessary, so why
should it require the notion of documents to be necessary containers
for data pretty much just because that is how HTTP GET semantics work.

I characterise it as a debacle because it has been a recurring
discussion for many years and shows that the semantic community
hasn't quite cleaned up its architecture/philosophy enough for it to
be clear to people who are trying to understand it and utilise it
without delving into philosophical debates.

Cheers,

Peter

Ian Davis

Nov 30, 2009, 5:44:25 PM
to Kingsley Idehen, Peter Ansell, pedant...@googlegroups.com, publi...@w3.org
On Mon, Nov 30, 2009 at 10:37 PM, Kingsley Idehen
<kid...@openlinksw.com> wrote:

>
> If you lookup Linked Data from spaces associated with myself of OpenLink you
> will see use the aforementioned property re. missing relation. Also, you may
> also find out that few people added the missing triple to their RDF files
> after nudges from me.
>
> I hope I've made things clearer?

I've read this thread and I don't understand the fuss. Some people
aren't linking the document to the data it contains so we should
encourage them to. Don't know why that is characterised as a debacle.

Ian

Ian Davis

Nov 30, 2009, 5:28:14 PM
to Kingsley Idehen, Peter Ansell, pedant...@googlegroups.com, publi...@w3.org
>
> I assume you've noticed the dearth of RDF examples that include descriptions
> of RDF files that are distinct, but connected, to the file contents.

Ian Davis

Nov 30, 2009, 8:37:06 PM
to Peter Ansell, Kingsley Idehen, pedant...@googlegroups.com, publi...@w3.org
On Tue, Dec 1, 2009 at 12:02 AM, Peter Ansell <ansell...@gmail.com> wrote:
> The necessary declaration of "document" as distinct, and yet necessary
> for the definition of "data", and the necessity of different URI's for
> these two concepts, are fundamental sticking points for many people.

Who is getting stuck on this point? Documents have URIs, as do the
things documents might contain data about.

> If the HTTP web no longer existed (or the internet connection was
> temporarily down), the discussion about document versus data would be
> moot. Simple RDF Triple database queries, that do not rely on HTTP
> communication, have no necessary need to refer to the
> Document/Artifact. Only "data" would exist in the RDF triples (unless
> you deliberately blur the division using the notion of foaf:Document
> via foaf:primaryTopic for instance). Hence the debacle with saying
> that Document is a necessary element to understand and use RDF data
> linked together using resolvable HTTP URI's when to many it is just an
> artifact that doesn't influence, and shouldn't need to semantically
> interfere with, the data/information content that is actually being
> referenced.

I disagree. Documents aren't HTTP artefacts: they exist happily on
disks, printouts and in books. You can identify the medium (the data
container in Kingsley's words) separately from the things it is
describing (the data items). In fact it is usually necessary to do so,
and intuitive for most people who can distinguish the publisher of a
book from the protagonist it describes.

>
> In the long term, I see it as introducing a permanent link from a
> semantic RDF (or other similar format) universe to the current
> document segregated web that wouldn't be there if everyone shared
> their RDF information through some other system, and for example only
> used the URI verbatim to do queries on some global hashtable/index
> somewhere where there was no concept of document at the native RDF
> level. The definition of Linked Data doesn't specifically say that
> HTTP URI's have to be resolved using HTTP GET requests over TCP port
> 80 using DNS for an intermediate host name lookup as necessary, so why
> should it require the notion of documents to be necessary containers
> for data pretty much just because that is how HTTP GET semantics work.
>
> I characterise it as a debacle because it has been a recurring
> discussion for many years and shows that the semantic community
> hasn't quite cleaned up its architecture/philosophy enough for it to
> be clear to people who are trying to understand it and utilise it
> without delving into philosophical debates.

It seems pretty clear to me and many others in my experience,
certainly not a debacle.

>
> Cheers,
>
> Peter
>

Ian

Nathan

Dec 1, 2009, 12:38:18 PM
to Kingsley Idehen, Linked Data community, pedant...@googlegroups.com, SIOC-Dev
Hi All,

To follow on a conversation I'm having with Kingsley at the minute, and
to make it public, I'm also cc'ing in public-lod, pedantic-web and the
sioc user list, as it is to do with all 3. Please do give feedback and
correct me where I'm wrong. Especially if you can inline comment where
something is wrong in my understanding.

Kingsley Idehen wrote:
> Nathan wrote:
>> so do / should the Post, HTML Document and RDF Document all have
>> different Identifiers?
> If you want to make a statement (create a record) describing anything
> you need an Identifier for the subject of your description. If you want
> said description (a graph pictorial) to be fully explorable using HTTP
> (what Linked Data is about) then you shouldn't use the URL (Address of a
> Resource) as its Identifier. An HTTP GET against a URL has specific
> consequences distinct from an HTTP GET against a Generic HTTP scheme URI
> (a genuine Identifier/Name that Identifies an Object/Resource/Data
> Item/Entity).
>
> Rather than do the whole 303 and hash URI dance (counter productive
> since it dances around the issue of Data Identity), see if this document
> of Data Object Identity clarifies things for you re. Identifiers.
>
> Links:
>
> 1.
> http://www.cs.cmu.edu/afs/cs.cmu.edu/user/clamen/OODBMS/Manifesto/htManifesto/node4.html
>

okay.. here's the set-up; I have:

* a "Post" which is a <sioc:Post>
* a HTML Document which contains (among other things) a human readable
representation of the <sioc:Post> at an URL
* a RDF Document which contains a graph pictorial of the <sioc:Post>
which is published at an URL

to describe or reference the <sioc:Post> I have to give it a URI:
<http://example.lod/uri/post-123>

to describe or reference the HTML Document I have to give it a URI:
<http://example.lod/uri/html-document-123>
in addition the HTML document has an URL
<http://example.lod/documents/html-document-123.html>

to describe or reference the RDF Document I have to give it a URI:
<http://example.lod/uri/rdf-graph-123>
in addition the RDF document has an URL
<http://example.lod/documents/rdf-document-123.rdf>


now, I'm assuming the RDF Document will need to be self describing (also
contain a graph pictorial about itself, as well as the <sioc:Post> -
here's a very simplified version of the triples it'd contain.

<http://example.lod/uri/rdf-graph-123> <rdf:type> <foaf:Document> ;
<dc:title> "SIOC Post profile for post-123"@en ;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .

<http://example.lod/uri/post-123> <rdf:type> <sioc:Post> .

Q1: is <foaf:primaryTopic> correct here?

to say that the <sioc:Post> is contained by this graph we'd add the triple:
<http://example.lod/uri/post-123>
<sioc:link> <http://example.lod/uri/rdf-graph-123> .

then we need to say where the rdf graph can be found (provide its URL):
<http://example.lod/uri/rdf-graph-123>
<??????> <http://example.lod/documents/rdf-document-123.rdf> .

Q2: which ontology does one use for <??????> in the above triple?

then we need to say that the HTML document is a document, that contains
a human readable version of the <sioc:Post> (amongst other things)

<http://example.lod/uri/html-document-123>
<rdf:type> <foaf:Document> ;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .

Q3: is the HTML Document a <sioc:Container>, which is a container of the
<sioc:Post>?
<http://example.lod/uri/html-document-123>
<rdf:type> <foaf:Document> , <sioc:Container> ;
<foaf:primaryTopic> <http://example.lod/uri/post-123> ;
<sioc:container_of> <http://example.lod/uri/post-123> .

Q4: should we also say the description of the HTML Document is also
contained by this graph?
<http://example.lod/uri/post-123>
<sioc:link> <http://example.lod/uri/rdf-graph-123> .

Q5: how do we specify the URL of the HTML Document?
<http://example.lod/uri/html-document-123>
<?????> <http://example.lod/documents/html-document-123.html> .

I think that's enough for now; all feedback welcome!

regards

nathan
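
For reference, the simplified self-describing triples from the message above can be written out with full property URIs. The following Python sketch serialises them as N-Triples; the helper names (`iri`, `literal`, `to_ntriples`) are invented for the example, and the namespace URIs are the usual published ones for RDF, FOAF, Dublin Core elements, and SIOC.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"
SIOC = "http://rdfs.org/sioc/ns#"


def iri(value):
    """Format an absolute IRI as an N-Triples term."""
    return f"<{value}>"


def literal(text, lang):
    """Format a language-tagged string, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def to_ntriples(triples):
    """Serialise already-formatted terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


doc = "http://example.lod/uri/rdf-graph-123"
post = "http://example.lod/uri/post-123"

triples = [
    (iri(doc), iri(RDF + "type"), iri(FOAF + "Document")),
    (iri(doc), iri(DC + "title"), literal("SIOC Post profile for post-123", "en")),
    (iri(doc), iri(FOAF + "primaryTopic"), iri(post)),
    (iri(post), iri(RDF + "type"), iri(SIOC + "Post")),
]

print(to_ntriples(triples))
```

The document and the post keep separate subjects, and foaf:primaryTopic is the only statement connecting them.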

Ed Summers

Dec 1, 2009, 3:54:22 PM
to pedant...@googlegroups.com
On Nov 30, 8:37 pm, Ian Davis <li...@iandavis.com> wrote:
> Who is getting stuck on this point? Documents have URIs, as do the
> things documents might contain data about.

Perhaps this is just an odd library-land corner case, but I got stuck
when working on Linked Data views for Chronicling America [1]. I
wanted to mint URIs for newspaper pages [2]. At first it seemed to me
that a page of a newspaper was a Document or Information Resource,
since:

"""
The distinguishing characteristic of these resources is that all of
their essential characteristics can be conveyed in a message.
"""

The representation for a Chronicling America page resource is pretty
rich, and allows you to view the textual details of a page up close.
It didn't seem to me that there were any details of the resource that
would be lost in the HTML representation, other than perhaps olfactory
or tactile information. But I felt like I was missing the point of
what a Document is in the context of the web.

At the same time I wanted to assert when the newspaper page was
originally published using dcterms:issued [3]

"""
Date of formal issuance (e.g., publication) of the resource.
"""

So assuming a URI chronam:1234 for the newspaper Page information
resource I'd assert:

chronam:1234 dcterms:issued
"1903-10-04"^^<http://www.w3.org/2001/XMLSchema#date> .

But then a question arose about whether this assertion was saying the
Newspaper Page resource was published in 1903, or if (more strangely)
the HTML document was published in 1903. This led me to think I really
needed to have two resources, one Information Resource for the HTML
document representation of the Newspaper Page, and a Real World Object
resource for the abstract notion of the Newspaper Page as it exists in
the world [4].

Xiaoshu Wang has argued [5] that perhaps what is needed instead is
more precise vocabulary that takes Web Architecture into account. So
in my example I'd have a new property for distinguishing the issue
date of a representation versus the issue date of the resource.

chronam:1234 ex:representationIssued
"2009-12-01"^^<http://www.w3.org/2001/XMLSchema#date> .

In his own words:

"""
... the so-called URI identity issue is unwarranted. The URI's
ambiguity, if there is one, is caused by our ambiguous wording, which
can be simply clarified by using more refined ontological terms.
"""

I find it hard to argue with Xiaoshu's position. I also find it
increasingly difficult aligning the notions of Representation and
Resource from REST with the notions of Information Resource and
Document and Real World Object from the W3C. But I've chalked that up
to not really understanding all the issues and being a newbie to the
area. Any advice for keeping them straight would be appreciated.

I ended up minting two URIs [2,6] and toeing what I thought was the
line. But the experience left me feeling like I was a bit daft, or
missing some key insight. Perhaps the httpbis [7] effort will bring
some clarity?

//Ed

[1] http://chroniclingamerica.loc.gov
[2] http://chroniclingamerica.loc.gov/lccn/sn85066387/1903-10-04/ed-1/seq-45/
[3] http://dublincore.org/documents/dcmi-terms/#terms-issued
[4] http://www.w3.org/TR/cooluris/#semweb
[5] http://dfdf.inesc-id.pt/tr/web-arch#sec4-1
[6] http://chroniclingamerica.loc.gov/lccn/sn85066387/1903-10-04/ed-1/seq-45#page
[7] http://www.ietf.org/dyn/wg/charter/httpbis-charter.html
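
The two-URI outcome described in the message above can be illustrated with a small Python sketch. It assumes the hash URI from [6] names the newspaper page itself and that the 1903 date is attached only to that URI; the `issued_dates` helper is hypothetical.

```python
from urllib.parse import urldefrag

DCTERMS_ISSUED = "http://purl.org/dc/terms/issued"
PAGE = "http://chroniclingamerica.loc.gov/lccn/sn85066387/1903-10-04/ed-1/seq-45#page"

# A client dereferencing the hash URI requests it with the fragment removed,
# so the document it receives has a URI distinct from the page's URI.
DOCUMENT = urldefrag(PAGE).url

# The publication date is asserted about the page, not about the document.
statements = {(PAGE, DCTERMS_ISSUED, "1903-10-04")}


def issued_dates(subject):
    """List dcterms:issued values asserted for one subject."""
    return sorted(o for s, p, o in statements
                  if s == subject and p == DCTERMS_ISSUED)


print(issued_dates(PAGE))
print(issued_dates(DOCUMENT))
```

Because the two subjects differ, the 1903 date can no longer be read as a claim about the HTML representation.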

Kingsley Idehen

Dec 1, 2009, 5:16:58 PM
to nat...@webr3.org, Linked Data community, pedant...@googlegroups.com, SIOC-Dev
Assumption: your Identifiers are slash terminated (i.e. Slash style of
Generic HTTP URI).
>
> now, I'm assuming the RDF Document will need to be self describing (also
> contain a graph pictorial about itself, as well as the <sioc:Post> -
> here's a very simplified version of the triples it'd contain.
>
So the RDF data container (resource) is:

<http://example.lod/documents/rdf-document-123.rdf>, right?

> <http://example.lod/uri/rdf-graph-123> <rdf:type> <foaf:Document> ;
> <dc:title> "SIOC Post profile for post-123"@en ;
> <foaf:primaryTopic> <http://example.lod/uri/post-123> .
>
> <http://example.lod/uri/post-123> <rdf:type> <sioc:Post> .
>
> Q1: is <foaf:primaryTopic> correct here?
>
Yep.
> to say that the <sioc:Post> is contained by this graph we'd add the triple:
> <http://example.lod/uri/post-123>
> <sioc:link> <http://example.lod/uri/rdf-graph-123> .
>
Redundant, but not necessarily incorrect. You can make redundant
statements :-)
> then we need to say where the rdf graph can be found (provide it's URL):
> <http://example.lod/uri/rdf-graph-123>
> <??????> <http://example.lod/documents/rdf-document-123.rdf> .
>

<http://example.lod/documents/rdf-document-123.rdf> is a data set container so you identify it properly as in: <http://example.lod/documents/rdf-document-123.rdf#this>, via a simple URL to Generic HTTP URI hack, with Linked Data de-referencing in mind re. exploration of the description of this Thing/Object/Entity/Data Item. Note: a little change-up as I've added a new Identifier but taken the cheap # route via fragment identifier.

This also means you could have stated the following at the top:

<http://example.lod/documents/rdf-document-123.rdf#this> <rdf:type> <foaf:Document> ;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .

<http://example.lod/uri/post-123> <rdf:type> <sioc:Post>;
<dc:title> "SIOC Post profile for post-123"@en.

OR even the following, assuming you'd already assigned these URIs and discovered that <http://example.lod/uri/rdf-graph-123> is basically the same as <http://example.lod/documents/rdf-document-123.rdf#this> i.e., RDF data set containers (documents or information resources):

<http://example.lod/documents/rdf-document-123.rdf#this> <rdf:type> <foaf:Document> ;
<owl:sameAs> <http://example.lod/uri/rdf-graph-123>;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .

<http://example.lod/uri/post-123> <rdf:type> <sioc:Post>;
<dc:title> "SIOC Post profile for post-123"@en.




> Q2: which ontology does one use for <??????> in the above triple?
>
None.
> then we need to say that the HTML document is a document, that contains
> a human readable version of the <sioc:Post> (amongst other things)
>
> <http://example.lod/uri/html-document-123>
> <rdf:type> <foaf:Document> ;
> <foaf:primaryTopic> <http://example.lod/uri/post-123> .
>
> Q3: is the HTML Document a <sioc:Container>, which is a container of the
> <sioc:Post>?
> <http://example.lod/uri/html-document-123>
> <rdf:type> <foaf:Document> , <sioc:Container> ;
> <foaf:primaryTopic> <http://example.lod/uri/post-123> ;
> <sioc:container_of> <http://example.lod/uri/post-123> .
>
Yes, esp. as <sioc:Post> <rdfs:subClassOf> <sioc:Item> .

Note same applies to the RDF data container as in:

<http://example.lod/uri/rdf-graph-123> <rdf:type> <foaf:Document> , <sioc:Container> ;
<foaf:primaryTopic> <http://example.lod/uri/post-123> ;
<sioc:container_of> <http://example.lod/uri/post-123> .

OR
<http://example.lod/uri/rdf-graph-123> <rdf:type> <foaf:Document> , <sioc:Container> ;
<owl:sameAs> <http://example.lod/documents/rdf-document-123.rdf#this>;
<foaf:primaryTopic> <http://example.lod/uri/post-123> ;
<sioc:container_of> <http://example.lod/uri/post-123> .



> Q4: should we also say the description of the HTML Document is also
> contained by this graph?
> <http://example.lod/uri/post-123>
> <sioc:link> <http://example.lod/uri/rdf-graph-123> .
>

<http://example.lod/uri/rdf-graph-123> <sioc:link> <http://example.lod/uri/html-document-123>.
or even:
<http://example.lod/uri/rdf-graph-123> <foaf:topic> <http://example.lod/uri/html-document-123>.


> Q5: how do we specify the URL of the HTML Document?
> <http://example.lod/uri/html-document-123>
> <?????> <http://example.lod/documents/html-document-123.html> .
>
Remember the earlier statement re. the RDF document (resource):


<http://example.lod/documents/rdf-document-123.rdf#this> <rdf:type> <foaf:Document> ;
<owl:sameAs> <http://example.lod/uri/rdf-graph-123>;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .


Re. HTML resource description same thing applies re. association with the sioc:Post:

<http://example.lod/documents/html-document-123.html#this> <rdf:type> <foaf:Document>;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .

<http://example.lod/uri/post-123> <rdf:type> <sioc:Post>;
<dc:title> "SIOC Post profile for post-123"@en.

OR

<http://example.lod/documents/html-document-123.html#this> <rdf:type> <foaf:Document> ;
<owl:sameAs> <http://example.lod/uri/html-document-123>;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .


<http://example.lod/uri/post-123> <rdf:type> <sioc:Post>;
<dc:title> "SIOC Post profile for post-123"@en.


> I think that's enough for now; all feedback welcome!
>
> regards
>
> nathan
>
>
Barring any typos or cut&paste snafus, I've hopefully answered your questions.
Ultimately, the file (information resource, document, data container)
has its own set of attributes e.g. format (dcterms:format), actual file
name (not title of the content), creation date etc. These are distinct from
the description of its content (hence the use of foaf:primaryTopic as the
conduit to the content description graph).

Link:

1.
http://linkeddata.uriburner.com/about/html/http://news.cnet.com/8301-13577_3-10407056-36.html?tag=newsEditorsPicksArea.0
- example of a Linked Data graph that describes a document (information
resource) in a manner distinct from its content (see the data exposed by
foaf:primaryTopic).

Nathan

Dec 1, 2009, 5:37:57 PM12/1/09
to Kingsley Idehen, Linked Data community, pedant...@googlegroups.com, SIOC-Dev
perfect, thanks kingsley :)

only q (which i still don't follow) is that afaik I *need* to specify in
rdf where one can find the HTML document, no point describing something
people can't find... noted that in your own rdf you use:
<resource>
<http://www.openlinksw.com/schema/attribution#isDescribedUsing> <url>

i essentially need the equiv for anything;

<http://example.lod/uri/html-document-123>
<canBeFound> <here/URL> .
or
<has_link> <here/URL> .

the thing I'm describing can be found at web address, ie show the human
this version etc etc (if you follow)

regards,

nathan

natlu2809

Dec 1, 2009, 5:40:18 PM12/1/09
to pedant...@googlegroups.com, nat...@webr3.org, Linked Data community, SIOC-Dev
Maybe I'm not understanding the dichotomy here:

  • A URI represents a thing, or is an address for a thing
  • Different things have different URIs
  • Different URIs represent different things - the POST, to html doc/serialisation, the rdf doc/serialisation
  • URIs are a front for code that generates things
but
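  • A URI can represent the same thing in different serialisations depending on which agent/device/lens you look at it with
but
  • a different URI can represent the same thing as another URI - http://example.lod/doc.html can be the same thing as http://example.lod/resource/doc when requested by a html agent ?

The identity however is maintained by the "fingerprint" of the object graphs, and the URI is just an image of that fingerprint at some point in time/location ?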

Nathan

Dec 1, 2009, 5:43:24 PM12/1/09
to Kingsley Idehen, Linked Data community, pedant...@googlegroups.com, SIOC-Dev
perhaps foaf:page ? like dbpedia uses?

Kingsley Idehen

Dec 1, 2009, 6:30:12 PM12/1/09
to nat...@webr3.org, Linked Data community, pedant...@googlegroups.com, SIOC-Dev
Nathan,

<http://www.openlinksw.com/schema/attribution#isDescribedUsing> takes URIs of ontologies/schemas/vocabs as values (i.e. object slot in triple statements we make).

For a simple outbound link, <sioc:links_to> would be fine.

Note though, my examples above imply that URLs shouldn't be in triples: use #this to make them URIs, and then for visual cues you can use an icon to capture the URL (for link-out purposes, which is where #this is a cheap solution) while the URI enables Linked Data traversal. If you look at our pages we use "owl:sameAs" to similar effect (note the link-out icons that typically match the content type, thereby enabling users to exploit the effect of URLs, since apps know what to do with the data retrieved from URLs).
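
As a rough illustration of the #this convention (a minimal sketch only; the helper names mint_identifier and dereference_target are made up for this example and are not part of any library), the identifier is the document URL plus a fragment, and an HTTP client drops the fragment again before fetching:

```python
from urllib.parse import urldefrag


def mint_identifier(document_url: str, fragment: str = "this") -> str:
    """Name the document as a thing by appending a fragment to its URL."""
    base, _ = urldefrag(document_url)  # drop any fragment already present
    return f"{base}#{fragment}"


def dereference_target(uri: str) -> str:
    """Return the URL an HTTP client actually requests: the URI minus its fragment."""
    return urldefrag(uri).url


doc_url = "http://example.lod/documents/rdf-document-123.rdf"
doc_uri = mint_identifier(doc_url)

print(doc_uri)                      # the identifier used in triples
print(dereference_target(doc_uri))  # the address that gets fetched
```

So the URI goes into the triples, while the URL it resolves to stays available for link-out.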

Kingsley Idehen

Dec 1, 2009, 6:32:04 PM12/1/09
to pedant...@googlegroups.com, Linked Data community, SIOC-Dev
Nathan wrote:
> [SNIP]
>> perfect, thanks kingsley :)
>>
>> only q (which i still don't follow) is that afaik I *need* to specify in
>> rdf where one can find the HTML document, no point describing something
>> people can't find... noted that in you're own rdf you use:
>> <resource>
>> <http://www.openlinksw.com/schema/attribution#isDescribedUsing> <url>
>>
>> i essentially need the equiv for anything;
>>
>> <http://example.lod/uri/html-document-123>
>> <canBeFound> <here/URL> .
>> or
>> <has_link> <here/URL> .
>>
>> the thing I'm describing can be found at web address, ie show the human
>> this version etc etc (if you follow)
>>
>
> perhaps foaf:page ? like dbpedia uses?
>
>
No problem, but note my comment re. use of icons as visual cue if you
are making an HTML based browser page. Either way, foaf:page is fine.

Nathan

Dec 1, 2009, 6:40:02 PM12/1/09
to kid...@openlinksw.com, pedant...@googlegroups.com, Linked Data community
Kingsley Idehen wrote:
> Nathan wrote:
>> [SNIP]

i think it's safe to say I grok this all now (lod); armed with
everything i need, and full comprehension to do a months work in the
next 4 days!

kingsley, sincerely, thank you for everything - you've been invaluable
in this process and looking forward to helping spread the word and help
people understand + implement wherever possible over the coming months
(and on).

many many thanks to all who've helped out, by no means am i undermining
the help of you all by specifically mentioning kingsley, just he's
committed a vast amount of hours from both himself and the team at
openlink to getting me through this.

many regards,

nathan

Richard Cyganiak

Dec 2, 2009, 6:25:38 AM12/2/09
to pedant...@googlegroups.com
Hi Ed,

A newspaper page (even an abstract one, that is, a Manifestation
rather than Item in the FRBR sense) is not the same as a web page.

You have a web page whose topic is a newspaper page.

The newspaper page perhaps is an information resource as well,
according to the AWWW definition of information resource, but knowing
that doesn't actually change anything; a web page describing an
information resource is no different from a web page describing, let's
say, a person, from the web architecture POV.

About Xiaoshu's view, his way of looking at the world is an
alternative possibility to the one that's commonly used in LOD -- I
will call that one the “LOD view”, but you might call it “Richard's
view” if you prefer. So, both the LOD view and Xiaoshu's view are
reasonable, but *combining* both leads to all sorts of issues, so you
should pick one and stick to it. The difference, explained on the
example of FRBR, is that Xiaoshu would have a single identifier, and
different properties to distinguish the “facets” of the resource:

<http://example.com/book123>
frbr:workIssued "1872";
frbr:expressionIssued "1878";
frbr:manifestationIssued "2006";
frbr:itemIssued "2009";
.

The LOD modelling would be:

<http://example.com/book123#work> dc:issued "1872" .
<http://example.com/book123#expression> dc:issued "1878" .
<http://example.com/book123#manifestation> dc:issued "2006" .
<http://example.com/book123#item> dc:issued "2009" .

plus FRBR properties for relating the four resources (frbr:realizes
etc). In the LOD view, having different resources is indicated because
the things have different identity (i.e., were created at different
times).

Personally I agree with Xiaoshu that the definition of Information
Resource from AWWW is not particularly helpful. Personally, these days
I try to talk about “web document” and “thing described in a web
document.”

Best,
Richard

Kingsley Idehen

Dec 2, 2009, 7:07:49 AM12/2/09
to natlu2809, pedant...@googlegroups.com, nat...@webr3.org, Linked Data community, SIOC-Dev
natlu2809 wrote:
> Maybe I'm not understanding the dichotomy here:
>
> * A URI represents a thing, or is an address for a thing
>
URI Identifies a Thing. URIs basically have Referents (the things they
Identify).
A URL is a Resource Location/Address.
>
> * Different things have different URIs
>
Yes, as is the case in real life. Everything of importance to you has an
Identifier, otherwise you would not be able to describe or recognize it as
distinct from other things.
>
> * Different URIs represent different things - the POST, to html
> doc/serialisation, the rdf doc/serialisation
> * URIs are a front for code that generates things
>
I would say a powerful abstraction, especially when looking at Generic
HTTP scheme URIs. For instance, each component of said URIs affects the
Data Representation that manifests when you issue an HTTP GET. This is
kind of like a composite (compound / concatenated) key in an RDBMS:
change a component and all associated data changes, and said changes
imply different data representations to the construction or breakage of
data relations. You basically get two things in one: Identity
(Reference)/Access (Address) duality, with Generic HTTP URIs.

Now here is the problem (as I've seen and experienced it), there is a
tendency to conflate a Generic URI with a Generic HTTP URI, the former
includes schemes like URN while the latter doesn't. Even worse, there is
a tendency to simply never mention URLs, and thereby conflate this
Location / Address oriented Identifier with a Generic HTTP URI which
simply makes everything confusing and inconsistent.
>
> *
>
>
> but
>
> * A URI can represent the same thing in different serialisations
> depending on which agent/device/lense you look at it with
>
A Generic HTTP URI is a conduit to a myriad of associated data
representations (remember its duality).
>
> but
>
> * a different URI can represent the same thing as another URI -
> http://example.lod/doc.html can be the same thing as
> http://example.lod/resource/doc when requested by a html agent ?
>
>
You can have different Identifiers for the same thing irrespective of
URI scheme. The Generic HTTP URI simply adds resolvability (data access)
to the mix courtesy of the HTTP scheme.
> The identity however is maintained by the "fingerprint" of the object
> graphs, and the URI is just an image of that fingerprint at some point
> in time/location ?
I think Identity is managed by the beholder of things, the one that
deems them important enough to be described, mentioned, talked about, or
referenced :-)

Kingsley

nat lu

Dec 2, 2009, 10:04:31 AM12/2/09
to Kingsley Idehen, pedant...@googlegroups.com, nat...@webr3.org, Linked Data community, SIOC-Dev
[snip]


The identity however is maintained by the "fingerprint" of the object graphs, and the URI is just an image of that fingerprint at some point in time/location ?
I think Identity is managed by the beholder of things, the one that deems them important enough to be described, mentioned, talked about, or referenced :-)


I should have said what I was thinking in my head and not what my fingers were thinking: "The identity however is defined by the fingerprint of the object graphs, varying perhaps in time". If I have today a graph [a->b->c] identified by [http://example.lod/myThing] and tomorrow I change it to [a->b->c->d] or maybe [a->b->d], the address is the same, the access path is the same, it identifies the same thing, but the qualities of that thing have varied: i.e., it is the same, but different. That difference may or may not be important or have consequences for the consumer of that thing.

And unless I provide a versioning URI it's not going to be possible to provide for recognising, or "replaying", an identity (or isolating the change in identity) of a thing at some previous time - the address might for instance start as [http://example.lod/v1/myThing] and then become [http://example.lod/v2/myThing] and so on ? But in this case the address has changed, and the internal access path might have, but they're still the same thing (I note it may perhaps also be proxied by an agnostic [http://example.lod/myThing]). I suppose a canonical LoD-GUID and the version chain would need to be qualities of each version ?

If the Semantic Web is the second coming of the Internet, then there is going to be a lot of explaining to do :-) Think I'm going to need a fundamental allegory or two....

Apologies for beating this to death.

Nathan

Dec 2, 2009, 10:22:59 AM12/2/09
to natl...@gmail.com, Kingsley Idehen, pedant...@googlegroups.com, Linked Data community, SIOC-Dev
<http://example.com/thing>
<http://example.com/thing#v1>
<http://example.com/thing#v2>
<http://example.com/thing#v3>
<http://example.com/thing#latest>

then when you dereference the uri to get info you always hit the same
graph since you remove the fragment to dereference.

and to handle the versions you can use triples like..

<http://example.com/thing#v3>
<sioc:earlier_version>
<http://example.com/thing#v1> ,
<http://example.com/thing#v2> ;
<sioc:previous_version>
<http://example.com/thing#v2> ;
<sioc:latest_version>
<http://example.com/thing#latest> .


<http://example.com/thing>
<owl:sameAs>
<http://example.com/thing#latest> .

thus you can always describe a single version of a resource, the latest
version, and so on.
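
A small sketch of how those version triples could be generated (an illustration only: version_triples is a hypothetical helper, the property names are the ones used above, and triples are plain Python tuples rather than a real RDF library). It also checks that every subject dereferences to the same document once the fragment is removed:

```python
from urllib.parse import urldefrag


def version_triples(base, versions):
    """Build (subject, predicate, object) tuples for an ordered list of versions."""
    latest = f"{base}#latest"
    triples = []
    for i, name in enumerate(versions):
        subject = f"{base}#{name}"
        # every older version is an earlier_version of this one
        for older in versions[:i]:
            triples.append((subject, "sioc:earlier_version", f"{base}#{older}"))
        # only the immediate predecessor is the previous_version
        if i > 0:
            triples.append((subject, "sioc:previous_version", f"{base}#{versions[i - 1]}"))
        triples.append((subject, "sioc:latest_version", latest))
    triples.append((base, "owl:sameAs", latest))
    return triples


base = "http://example.com/thing"
triples = version_triples(base, ["v1", "v2", "v3"])

# All subjects share one document, because the fragment is dropped on dereference.
documents = {urldefrag(s).url for s, _, _ in triples}

for t in triples:
    print(t)
print(documents)
```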

<completely ducking out of the time-travel convo, even if it is related>

regards!

Kingsley Idehen

Dec 2, 2009, 10:28:38 AM12/2/09
to natl...@gmail.com, pedant...@googlegroups.com, nat...@webr3.org, Linked Data community, SIOC-Dev
nat lu wrote:
>
> [snip]
>
>> The identity however is maintained by the "fingerprint" of the
>> object graphs, and the URI is just an image of that fingerprint
>> at some point in time/location ?
> I think Identity is managed by the beholder of things, the one
> that deems them important enough to be described, mentioned,
> talked about, or referenced :-)
>
>
>
> I should have said what I was thinking in my head and not what my
> fingers were thinking : "The identity however is defined by the
> fingerprint of the object graphs, varying perhaps in time". If I have
> today a graph [a->b->c] identified by [http://example.lod/myThing]
> and tomorrow I change it to [a->-b->c->d] or maybe [a->b->d], the
> address is the same, the access path is the same, it identifies the
> same thing, but the qualities of that thing have varied : ie, it is
> the same, but different. That difference may or may not be important
> or have consequences for the consumer of that thing.
Naturally :-)
>
> And unless I provide a versioning URI its not going to be possible to
> provide for recognising, or "replaying" an identity (or isolating the
> change in identity) of a thing, at some previous time - the address
> for instance start as [http://example.lod/v1/myThing] and then become
> [http://example.lod/v2/myThing] and so on ? But in this case the
> address has changed, and the internal access path might have, but
> they're still the same thing (I note it may perhaps also proxied by an
> agnostic [http://example.lod/myThing]. I suppose a canonical LoD-GUID
> and the version chain would need to be qualities of each version ?
Thorny issue here, and it is application specific. By this I mean you're
in the application domain re. the above, where the application's purpose is
something like a "Time Machine" for deltas associated with data bound to a
given URI.
>
> If the Semantic Web is the second coming of the Internet, then there
> is going to be a lot of explaining to do :-) Think I'm going to need a
> fundamental allegory or two....
>
> Apologies for beating this to death.
Those who seek to archive the Web (or the broader Internet) are the ones
that would typically deliver such functionality, as part of their
archival services (imho).

Kingsley

Kingsley Idehen

Dec 2, 2009, 10:34:06 AM12/2/09
to pedant...@googlegroups.com, natl...@gmail.com, Linked Data community, SIOC-Dev
Yes, this is all fine, but it falls into the bucket of how you or your
application have decided to version data etc. :-)

> thus you can always describe a single version of a resource, the latest
> version, and so on.
>
A Delta-V vocabulary for RDF would enable this sort of thing to be done in
a uniform manner re. interoperability etc. But it's still application
(versioning) specific orchestration that is also loosely connected to the
Provenance space etc.
> <completely ducking out of the time-travel convo, even if it is related>
>
Time-travel via Dataset Deltas is a service that someone (or entity) may
decide to offer; basically, a Linked Data driven Time Machine :-)

Kingsley
> regards!

Peter Ansell

Dec 2, 2009, 7:48:49 PM12/2/09
to pedant...@googlegroups.com
2009/12/2 Richard Cyganiak <ric...@cyganiak.de>:
> Hi Ed,
>
> A newspaper page (even an abstract one, that is, a Manifestation rather than
> Item in the FRBR sense) is not the same as a web page.
>
> You have a web page whose topic is a newspaper page.
>
> The newspaper page perhaps is an information resource as well, according to
> the AWWW definition of information resource, but knowing that doesn't
> actually change anything; a web page describing an information resource is
> no different from a web page describing, let's say, a person, from the web
> architecture POV.
>
> About Xiaoshu's view, his way of looking at the world is an alternative
> possibility to the one that's commonly used in LOD -- I will call that one
> the “LOD view”, but you might call it “Richard's view” if you prefer. So,
> both the LOD view and Xiaoshu's view are reasonable, but *combining* both
> leads to all sorts of issues, so you should pick one and stick to it. The
> difference, explained on the example of FRBR, is that Xiaoshu would have a
> single identifier, and different properties to distinguish the “facets” of
> the resource:
>
> <http://example.com/book123>
>    frbr:workIssued "1872";
>    frbr:expressionIssued "1878";
>    frbr:manifestationIssued "2006";
>    frbr:itemIssued "2009";
>    .

I was told once that the reason that LOD and Semantic Web in general,
hadn't chosen this path was that current reasoning techniques were
unable to cope with it due to the range and domain restrictions on the
predicates inferring types on the URI that together might be confusing
to a reasoner that assumes one URI/Resource will have a set of classes
that are not disjoint, as opposed to a more liberal reasoning strategy
where the effects of the classes would be confined to the faceted
statements. In this example, the possible range of "frbr:Expression"
on frbr:expressionIssued, would have the class implication confined to
that statement instead of being broadcast to have effects on all other
statements that had that URI as the subject or object. Because of
these issues, LOD would therefore be better to have explicit single
typing, rather than implicit multiple typing. Is that an accurate
reason for the different choice?

In the real world, people are able to distinguish between implicit
types based on properties. For example, they create sentences like
"book 123 had a work issued in 1872, and had an expression issued in
1878", and people pick up the implicit typing on the single "book 123"
object for each part of the sentence, rather than creating new objects
in their heads just for the sentence to make sense. LOD would
encourage all of the typing parts of the sentence to be pushed from
the predicate URI into the subject URI, partially so that generic
vocabularies can be reused (without the usual pitfalls of generic
vocabularies). In LOD the sentence that someone would say might be
"the work aspect of the book 123 was issued in 1872 and the expression
aspect of the book 123 was issued in 1878, etc.", ie, making it
explicit that it is a new object being referred to. If AI can never
hope to pick up on implicit typing, then we may be stuck with the
separate preissued URI's for each facet as in the following example.

> The LOD modelling would be:
>
> <http://example.com/book123#work> dc:issued "1872" .
> <http://example.com/book123#expression> dc:issued "1878" .
> <http://example.com/book123#manifestation> dc:issued "2006" .
> <http://example.com/book123#item> dc:issued "2009" .

This modelling, unlike the previous modelling, would also need
rdf:type statements for each resource and interlinks to describe the
relationships between the URI's, to make queries work the way they are
intended.

<http://example.com/book123#work> rdf:type frbr:Work .
<http://example.com/book123#expression> rdf:type frbr:Expression .
<http://example.com/book123#manifestation> rdf:type frbr:Manifestation .
<http://example.com/book123#item> rdf:type frbr:Item .
<http://example.com/book123#work> frbr:hasExpression
<http://example.com/book123#expression> .
<http://example.com/book123#work> frbr:hasManifestation
<http://example.com/book123#manifestation> .
<http://example.com/book123#work> frbr:hasItem
<http://example.com/book123#item> .

I personally do not think it is valuable enough to the web to have new
URI's created for every part of everything, when we could just as
easily distinguish between them based on specialised vocabularies that
inherit some behaviour from the generic vocabularies where possible,
but provide a specialised view that doesn't contradict with other
predicates as the generic dc vocabulary will inevitably do if you
tried to use it in the other system. If you asked for dc:issued in
Xiaoshu's view and the frbr:*issued predicates derived from dc:issued,
then you would legitimately get the issue date of each facet in the
results set, as it was unclear which issued date you were asking for.
You could still get specific frbr:*issued dates though, without
looking for an rdf:type statement as you need to if you go with the
LOD view and rely on overall object typing to give context to generic
vocabularies.
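
To make the lookup difference concrete, here is a minimal sketch with triples held as plain Python tuples (the two helper functions are invented for this illustration, and no real reasoner or RDF store is involved). In the single-URI modelling the facet is named by the predicate, while in the LOD modelling the generic dc:issued only becomes specific after a join on rdf:type:

```python
BOOK = "http://example.com/book123"

# Single-identifier modelling: one subject, facet-specific predicates.
single_uri = [
    (BOOK, "frbr:workIssued", "1872"),
    (BOOK, "frbr:expressionIssued", "1878"),
]

# LOD modelling: one subject per facet, generic predicate plus rdf:type.
lod = [
    (BOOK + "#work", "rdf:type", "frbr:Work"),
    (BOOK + "#work", "dc:issued", "1872"),
    (BOOK + "#expression", "rdf:type", "frbr:Expression"),
    (BOOK + "#expression", "dc:issued", "1878"),
]


def issued_by_predicate(triples, subject, predicate):
    """Single-URI view: the predicate alone says which facet is meant."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def issued_by_type(triples, cls):
    """LOD view: find resources of the given class, then read dc:issued."""
    typed = {s for s, p, o in triples if p == "rdf:type" and o == cls}
    return [o for s, p, o in triples if p == "dc:issued" and s in typed]


print(issued_by_predicate(single_uri, BOOK, "frbr:expressionIssued"))
print(issued_by_type(lod, "frbr:Expression"))
```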

The idea of creating a new URI for each facet of an item, if the only
reason was that it enables users to utilise generic vocabularies,
seems to be overkill to me when the alternative only creates a single
new property URI when a new facet is defined. If you have a million
books whose work, expression, manifestation and resolved "item/web
page/web document" all have to be described, the LOD system will make
use of at least 4 million and 1 URI's, and at least 11 million RDF
statements, the alternative system uses 1 million and 4 URI's, and
only 4 million RDF statements... And if a particular facet of the book
wasn't previously described, it requires the creation of another
million URI's, and 2+N million RDF statements, whereas the alternative
only requires 1 new URI and 1 million RDF statements. Given that the
information content is equivalent to humans, it is ironic that the
information explosion is only predicated on a current lack of
artificial intelligence.
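
The arithmetic behind those figures can be spelled out as follows. This is only a sketch of the counting, under the assumption that each LOD facet carries one dc:issued and one rdf:type statement and that the work links to each of the other three facets, as in the triples listed earlier; class and linking-property URIs are left out, which is why the LOD figures are lower bounds:

```python
def lod_counts(books, facets=4):
    """Minimum URIs and statements when every facet gets its own URI."""
    links = facets - 1                          # work -> expression/manifestation/item
    uris = books * facets + 1                   # facet URIs plus dc:issued
    statements = books * (2 * facets + links)   # dc:issued + rdf:type per facet, plus links
    return uris, statements


def single_uri_counts(books, facets=4):
    """URIs and statements when one URI carries facet-specific properties."""
    uris = books + facets                       # one URI per book plus one property per facet
    statements = books * facets                 # one *Issued statement per facet
    return uris, statements


books = 1_000_000
print(lod_counts(books))
print(single_uri_counts(books))
```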

How would someone easily indicate that they owned a copy of the book
using the LOD conventions? Would the only way be for them to check the
particular publication date, along with the version number that the
printers gave the book, and rely on inferencing to derive the link to
the actual work using this information? After all, there can be
copyfixes done between printings without altering the overall book
identifier... ;)

It reminds me of a Yes Minister skit where Jim Hacker gets confused
when someone asks him to respond with his personal "hat" instead of
his ministerial "hat". I am surprised that foaf and sioc haven't bumped
into this issue until recently with the "role" discussion, as it will
inevitably force the simple foaf:Person class to become a lot more
complicated if it follows the LOD practice, as no one is just a
foaf:Person, they always have particular properties that specialise
them into classes such as "Prime Minister of the UK" or "Researcher"
etc., and people refer to them using those classes, so that for one
someone can keep their personal life separate from their professional
life and still be able to represent it for interest's sake on the
Semantic Web.

Ironically, perhaps, the LOD view is semantically compatible with
reasoners based on Xiaoshu's view, but any instances utilising
Xiaoshu's view will destroy the results for current reasoners relying
on world-level class inferences at the resource level in the LOD view.
If reasoners were more advanced, both could live together, as the LOD
view which cautiously creates a new identifier for an item every time
a new facet is discovered, wouldn't conflict with the statement level
faceting in reasoners following Xiaoshu's view.

The following set of non-generic predicated statements is quite
understandable to a human... What are the theory level bottlenecks
preventing computers from understanding it fully? Is it a fault with
RDF theory that resources are presumed to not be schizophrenic by
nature as they are in human languages?

Ie, the layer specific knowledge is given without relying on the URI
changing to reflect this.

<URI> <http:resolvedDate> "2009-11-26" .
<URI> <http:statusCode> "200" .
<URI> <xhtml:title> "Work about semantic conflicts with implicit
typing by Researcher B" .
<URI> <frbr:issued> "2009-11-22" .
<URI> <frbr:issuedTitle> "Work about semantic conflicts with implicit typing" .
<URI> <frbr:issuingAuthor> "Researcher B" .

No one has been able to give me a satisfactory answer about 303 by the
way, and why the following unique inconsistency is allowed to occur in
the LOD semantics.

<URI> <http:resolvedDate> "2009-11-26" .
<URI> <http:statusCode> "303" .
<URI> <frbr:issued> "2009-11-22" .
<URI> <frbr:issuedTitle> "Work about semantic conflicts with implicit typing" .
<URI> <frbr:issuingAuthor> "Researcher B" .
<URI> <foaf:page> <URI2> .
<URI2> <http:resolvedDate> "2009-11-26" .
<URI2> <http:statusCode> "200" .
<URI2> <xhtml:title> "Work about semantic conflicts with implicit
typing by Researcher B" .

Why is it that the <URI> <http:statusCode> "303" triple is valid (or
specifically not valid) but it is conversely invalid or valid if it is
"200" or any other number, if the response indicates something that
can't be transmitted across the wire in a response, like thoughts,
actions, and matter? If RDF is serious about keeping track of
provenance metadata it has to be able to answer this question
consistently.

Cheers,

Peter

Richard Cyganiak

Dec 3, 2009, 5:37:39 AM12/3/09
to pedant...@googlegroups.com
Peter,

This is not a philosophy mailing list. Please let's get back to
talking about actual deployed web data, or at least about best
practices for deploying web data. If you want to talk about
alternative approaches of using AI on the Web, there are good lists
dedicated to such topics.

Thanks,
Richard

Ed Summers

Dec 3, 2009, 4:16:40 PM12/3/09
to pedant...@googlegroups.com
On Wed, Dec 2, 2009 at 6:25 AM, Richard Cyganiak <ric...@cyganiak.de> wrote:
> A newspaper page (even an abstract one, that is, a Manifestation rather than
> Item in the FRBR sense) is not the same as a web page.
>
> You have a web page whose topic is a newspaper page.
>
> The newspaper page perhaps is an information resource as well, according to
> the AWWW definition of information resource, but knowing that doesn't
> actually change anything; a web page describing an information resource is
> no different from a web page describing, let's say, a person, from the web
> architecture POV.

Thanks for this clarification Richard. I think I get into trouble when
trying to think purely in terms of REST, where there are:

- a Resource (a newspaper page)
- an Identifier for the Resource (a URL)
- and a Representation of the Resource (an HTML document)

URIs identify Resources not Documents right? I realize this is really
a LOD oriented list, and not a suitable place for a discussion of
REST. But I would be curious to know if I'm misinterpreting REST and
the HTTP and URI RFCs.

I am at least being pedantic, that ought to count for something right? :-)

//Ed

Nathan

Dec 3, 2009, 4:55:42 PM12/3/09
to pedant...@googlegroups.com, ed.su...@gmail.com
Hi Ed,

Takes a bit of getting used to - and hope I don't step on toes here but
REST is a huge part of linked data, so very on topic.

Here's something which helped me "get it"; forget about URIs and URLs
for now;

1: give everything an id.

"real newspaper page" id:
newspaper-page-21

"html version of newspaper page" id:
html-page-34


2: turn the id into an HTTP URI

"real newspaper page" id:
http://example.org/ids/newspaper-page-21

"html version of newspaper page" id:
http://example.org/ids/html-page-34

^^ these are your resources, fixed ids for the things.


3: describe the things / resources

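@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix dcmitype: <http://purl.org/dc/dcmitype/> .
@prefix sioc: <http://rdfs.org/sioc/ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://example.org/ids/newspaper-page-21>
    dc:title "Daily Chronicle, Issue 13, page 2" ;
    rdf:type dcmitype:PhysicalObject, dcmitype:Text ;
    dcterms:medium "Newspaper" ;
    dcterms:hasFormat <http://example.org/ids/html-page-34> .

<http://example.org/ids/html-page-34>
    dc:title "HTML Version of Daily Chronicle, Issue 13, page 2" ;
    rdf:type sioc:Post, dcmitype:Text ;
    dcterms:format "text/html" ;
    dcterms:isFormatOf <http://example.org/ids/newspaper-page-21> ;
    foaf:isPrimaryTopicOf <http://example.org/pages/issue13_page2.html> .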


then stick your newspaper page back in a box, and your html document on
the site at http://example.org/pages/issue13_page2.html

The point, I guess, is that the second I gave my HTML pages a resource id in
the form of a URI, and then had a URL for them too (which is just the
location of where they currently are), it all made sense.

to drive it home, when the HTML document is on your home pc, attached to
an email, or published on the web - it's still the same thing; just as
the newspaper page is always the same thing too.
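
A minimal Python sketch of the id/location split described above, assuming
the example.org URIs from step 2. The `locations` table, the `locate` helper
and the file:// path are hypothetical, made up purely for illustration:

```python
# Stable identifiers (URIs) for the two things; these never change.
NEWSPAPER_PAGE = "http://example.org/ids/newspaper-page-21"
HTML_PAGE = "http://example.org/ids/html-page-34"

# Current locations of the HTML document. Hypothetical table: the thing
# can move or be copied, but its id stays fixed.
locations = {
    HTML_PAGE: ["http://example.org/pages/issue13_page2.html"],
}


def locate(resource_id):
    """Return the known locations for an id (empty if it has none)."""
    return list(locations.get(resource_id, []))


# Copying the HTML document to a home PC adds a location, not a new id.
locations[HTML_PAGE].append("file:///home/nathan/issue13_page2.html")

print(locate(HTML_PAGE))       # two locations, still one id
print(locate(NEWSPAPER_PAGE))  # [] (the paper page sits in a box, not online)
```

The id is the key and the URLs are just values that may change over time,
which is the same split the triples above express.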

REST is a different topic, and IMHO not worth thinking about till the
URI/URL thing is nailed. (+ I'm outta time, too much to do today!)

many regards and hope it helps,

nathan

natlu2809

Dec 4, 2009, 3:42:12 AM12/4/09
to pedant...@googlegroups.com
So
  1. There's the physical newspaper page, in a pile of newspaper pages somewhere, or remembered in your head
  2. There's a human-viewable HTML page served at http://example.org/pages/issue13_page2.html
  3. There's a bunch of addressable metadata describing the physical newspaper page (1), with a link to the metadata about the HTML page (4), in some triple repository
  4. There's a bunch of addressable metadata about the HTML page (2), with a link to the actual HTML page (2) and the metadata about the newspaper page (3), in some repository

Richard Cyganiak

Dec 4, 2009, 2:25:40 PM12/4/09
to pedant...@googlegroups.com
Hi Ed,

On 3 Dec 2009, at 21:16, Ed Summers wrote:

> On Wed, Dec 2, 2009 at 6:25 AM, Richard Cyganiak
> <ric...@cyganiak.de> wrote:
>> A newspaper page (even an abstract one, that is, a Manifestation
>> rather than
>> Item in the FRBR sense) is not the same as a web page.
>>
>> You have a web page whose topic is a newspaper page.
>>
>> The newspaper page perhaps is an information resource as well,
>> according to
>> the AWWW definition of information resource, but knowing that doesn't
>> actually change anything; a web page describing an information
>> resource is
>> no different from a web page describing, let's say, a person, from
>> the web
>> architecture POV.
>
> Thanks for this clarification Richard. I think I get into trouble when
> trying to think in purely in terms of REST, where there are:
>
> - a Resource (a newspaper page)
> - an Identifier for the Resource (a URL)
> - and a Representation of the Resource (an HTML document)

I would avoid the term "document" when talking about representations.
Representations are those ephemeral things that go over the wire. A
representation is a "byte stream with a media type (and possibly
other metadata)".

When I use the term "HTML document", I mean a resource, identified by
a URI, that has (only) HTML representations.

> URIs identify Resources not Documents right?

The way I use the term "document", documents are a kind of resource,
and therefore URIs can also identify documents.

By the way: This discussion nicely shows the problem with the term
"document" in the context of web architecture. What is a document? The
thing that's being sent over the wire, which is transient? Or the
thing named by the URI, which is permanent and can change over time?
The TAG tried to fix that ambiguity by using precise technical terms:
"representation" for the transient stream-of-bytes, and "information
resource" for the thing named by the URI. But the term "information
resource" has its problems, and hasn't really helped the discussion at
all. So personally I'm still using "web document" or "web page", in
the sense of "thing named by a URI that has one or more
representations."

Coming back to your list above, I'd say we have:

- a Resource (newspaper page)
- an Identifier for the Resource (a URI)
- a Web Document describing the Resource (web page)
- an Identifier for that Web Document (another URI)
- Representations of the Web Document (perhaps in HTML and RDF)

I'm using a few rules that I think should be considered axioms of web
architecture:

First, if something exists independently from the Web, then it cannot
be a Web Document. (hence two resources, one for the newspaper page
and one for the web page)

Second, only Web Documents can have representations (hence the need to
describe the newspaper page in a web page, rather than directly
providing representations of the newspaper page).

I understand these rules as axioms, that is, they should be followed
because they make the system work best, not because they somehow
follow from the nature of the world (they don't).
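
One way to picture the list and the two axioms above is the following
Python sketch. The class names, the `representations_of` helper and the
placeholder bodies are hypothetical; this is an illustration of the
distinction under those assumptions, not an API from any specification:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Representation:
    """Transient bytes plus a media type; this is what goes over the wire."""
    media_type: str
    body: bytes


@dataclass
class Resource:
    """Anything named by a URI, e.g. a newspaper page."""
    uri: str


@dataclass
class WebDocument(Resource):
    """A resource that lives on the Web and describes some other resource."""
    describes: Optional[Resource] = None
    representations: list = field(default_factory=list)


def representations_of(resource):
    """Second axiom: only Web Documents can have representations."""
    if not isinstance(resource, WebDocument):
        raise TypeError(f"{resource.uri} is not a Web Document")
    return resource.representations


# First axiom: the newspaper page exists independently of the Web, so it is
# a plain Resource; the page describing it is a separate Web Document.
page = Resource("http://example.org/ids/newspaper-page-21")
doc = WebDocument(
    "http://example.org/pages/issue13_page2.html",
    describes=page,
    representations=[
        Representation("text/html", b"<html>...</html>"),
        Representation("application/rdf+xml", b"<rdf:RDF>...</rdf:RDF>"),
    ],
)

print([r.media_type for r in representations_of(doc)])
```

Asking for `representations_of(page)` raises a `TypeError` in this sketch,
which is the second axiom expressed as a runtime check.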

I make sense of the REST worldview like this: In typical REST, all the
URIs *always* identify web documents. The REST folks might claim that
they identify other things, like users or items for sale or places on
the earth, but actually they just identify a document that is *about*
that thing. The thing itself doesn't have an identifier. This is
perfectly fine for building certain kinds of systems, so the REST guys
actually get away with pretending that the URI identifies the thing.
But this doesn't allow you to do certain things, like using domain-
independent vocabularies for metadata and coreference, and you get
into deep trouble if you want to use this for describing *web pages*
rather than *newspaper pages*.

Best,
Richard


> I realize this is really
> a LOD oriented list, and not a suitable place for a discussion of
> REST. But I would be curious to know if I'm misinterpreting REST and
> the HTTP and URI RFCs.
>
> I am at least being pedantic, that ought to count for something
> right? :-)
>
> //Ed



--
Linked Data Technologist • Linked Data Research Centre
Digital Enterprise Research Institute (DERI), NUI Galway, Ireland
http://linkeddata.deri.ie/
skype:richard.cyganiak
tel:+353-91-49-5711

Kingsley Idehen

Dec 4, 2009, 3:42:13 PM12/4/09
to pedant...@googlegroups.com
Richard,

1. Object -- Referent of an Identifier
2. Resource -- what is currently called Information Resource
3. Representation -- as you've described.

Net result, the following trinity:

Object--isReferentOf-->URI
|--->hasRepresentationAccessedFromAddress(URL)

A generic HTTP URI cleverly provides a conduit to the Representation of its
Referent, courtesy of its Identity (Reference)/Access (Address) duality.
Hence the trinity. If I had time to do the ASCII art, you would have a
triangle comprising: Referent (top), Identifier (bottom left), and
Representation (bottom right).
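
A small Python sketch of that identity/address duality, assuming the hash
URI pattern only (303-redirect URIs work differently). It reuses the URI
from the start of this thread; the variable names are illustrative:

```python
from urllib.parse import urldefrag

# The identifier names the referent (here, the weblog itself).
identifier = "http://johnbreslin.com/blog/index.php?sioc_type=site#weblog"

# Dropping the fragment yields the address from which a representation
# can actually be fetched over HTTP.
address, fragment = urldefrag(identifier)

print(address)   # http://johnbreslin.com/blog/index.php?sioc_type=site
print(fragment)  # weblog
```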

The "Resource" instead of "Object" problem came from the IETF [1]; they
assumed Web of Documents URLs too. Resources are physical artifacts.
Objects have existed in computing forever. Ditto Referents, ditto Data
Representations. The trinity I mention above is old; I wouldn't be
sending this email if it didn't exist :-)

Links:

1.
http://old.nabble.com/Review-of-new-HTTPbis-text-for-303-See-Other-to24035004.html

Kingsley


--


Kingsley Idehen

Dec 4, 2009, 4:02:52 PM12/4/09
to pedant...@googlegroups.com
Kingsley Idehen wrote:
> Richard Cyganiak wrote:
>
>
> The "Resource" instead of "Object" problem came from IETF[1], they
> assumed Web of Documents URLs too. Resources are physical artifacts.
> Objects have existed in computing, forever. Ditto Referents, Ditto
> Data Representations. The trinity I mention above is old, wouldn't be
> sending this email if it didn't exist :-)
>
> Links:
>
> 1.
> http://old.nabble.com/Review-of-new-HTTPbis-text-for-303-See-Other-to24035004.html
>

Correct link: http://lists.w3.org/Archives/Public/www-tag/2009Aug/0000.html


Kingsley

Kingsley Idehen

Dec 4, 2009, 5:15:15 PM12/4/09
to pedant...@googlegroups.com
All,

Some additional commentary that helps reaffirm that "resources" are
physical artifacts in any given medium; treating them as synonyms
for "Objects" is therefore very problematic.

"
The 'mailto' URI scheme is used to identify resources that are reached
using Internet mail. In its simplest form, a 'mailto' URI contains an
Internet mail address. For interactions that require message headers or
message bodies to be specified, the 'mailto' URI scheme also allows
setting mail header fields and the message body. **This document defines
the format of Uniform Resource Identifiers(URI) to identify resources
that are reached using Internet mail**. It adds better internationalization
and compatibility with IRIs (RFC 3987) to the previous syntax of 'mailto'
URIs. If approved, this Standards Track Internet Draft will obsolete
IETF RFC 2368. "


Note the use of "reached": you only reach physical artifacts. You can
even reach my body (medium permitting), but I doubt you can reach my
"soul" :-)

If we use URLs for Resources and URIs for Objects, most confusion will
dissipate. Then it should be clear why you only use URIs when
creating 3-tuple records based on the RDF data model.

natlu2809

Dec 5, 2009, 4:57:09 AM12/5/09
to pedant...@googlegroups.com
ping - test can anyone see this ?

Kingsley Idehen

Dec 5, 2009, 9:41:08 AM12/5/09
to pedant...@googlegroups.com
One thing I left out above. The Resource (Document or Data Container) is
what holds the Structured Data. Thus, we have the data access
interaction sequence:
User Agents (data consumers) -->Dereference-->Object ID (Generic HTTP
URI)--->whichProvidesAccessTo--->Resource (exposed by URL aspect )
-->whichTransmitsDataItemsUsingNegotiatedRepresentation.
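
The last step of that sequence, choosing a negotiated representation, can
be sketched in Python roughly as follows. This is a deliberately simplified
stand-in for HTTP content negotiation: the `negotiate` function and the
`available` table are hypothetical, and q-values and wildcards are ignored:

```python
def negotiate(accept_header, available):
    """Pick the first media type in the Accept header that is available.

    Simplified: real HTTP content negotiation also weighs q-values
    and wildcards such as */*.
    """
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in available:
            return media_type
    return None


# Representations the resource can transmit (placeholder bodies).
available = {
    "text/html": b"<html>...</html>",
    "application/rdf+xml": b"<rdf:RDF>...</rdf:RDF>",
}

chosen = negotiate("application/rdf+xml, text/html;q=0.5", available)
print(chosen)  # application/rdf+xml
```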

Kingsley

Ed Summers

Dec 15, 2009, 1:50:16 AM12/15/09
to pedant...@googlegroups.com
Hi Richard,

A belated thanks for your email helping me understand how REST and
LinkedData fit together in your view. I've been holding off responding
for a bit, hoping that some of my questions would dissipate.

On Fri, Dec 4, 2009 at 2:25 PM, Richard Cyganiak
<richard....@deri.org> wrote:
> I'm using a few rules that I think should be considered axioms of web
> architecture:
>
> First, if something exists independently from the Web, then it cannot be a
> Web Document. (hence two resources, one for the newspaper page and one for
> the web page)
>
> Second, only Web Documents can have representations (hence the need to
> describe the newspaper page in a web page, rather than directly providing
> representations of the newspaper page).
>
> I understand these rules as axioms, that is, they should be followed because
> they make the system work best, not because they somehow follow from the
> nature of the world (they don't).

These are short and sweet. The first axiom nicely summarizes the
problem I have had trouble answering. A naive question: do you
consider static HTML files on my web server to be dependent on the
web? If the web went away tonight I'd still have my server with my
HTML files tomorrow morning. I could ssh in and edit them, and email
them to other people just fine. I realize I'm treading into a
philosophical quagmire, so feel free to ignore my question. But
perhaps I'm misunderstanding something about what you mean by
'dependent'?

> I make sense of the REST worldview like this: In typical REST, all the URIs
> *always* identify web documents. The REST folks might claim that they
> identify other things, like users or items for sale or places on the earth,
> but actually they just identify a document that is *about* that thing. The
> thing itself doesn't have an identifier. This is perfectly fine for building
> certain kinds of systems, so the REST guys actually get away with pretending
> that the URI identifies the thing. But this doesn't allow you to do certain
> things, like using domain-independent vocabularies for metadata and
> coreference, and you get into deep trouble if you want to use this for
> describing *web pages* rather than *newspaper pages*.

Yes, that is very helpful. I imagine most RESTafarians would say that
URIs identify Resources, not Documents, though, right? I wonder, have you
considered whether or not a tagging system like delicious solves
co-reference? For example there is a URI for a topic:

http://delicious.com/tag/linkeddata

Which groups together documents on the topic of Linked Data:

- http://www.jenitennison.com/blog/node/135
- http://decentralyze.com/2009/09/11/linked-government-data/
- http://linkeddata.org/
- etc

And you can look in the HTML representation and discover alternate
representations of this resource:

- http://feeds.delicious.com/v2/json/tag/linkeddata
- http://feeds.delicious.com/v2/rss/tag/linkeddata
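
As a rough sketch of that discovery step, the following Python uses the
standard library's HTML parser to collect `<link rel="alternate">` entries.
The sample markup is an assumption made up for illustration (it is not the
actual delicious page source), and `AlternateLinkFinder` is a hypothetical
helper name:

```python
from html.parser import HTMLParser


class AlternateLinkFinder(HTMLParser):
    """Collect (media type, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if "alternate" in rel and attributes.get("href"):
            self.alternates.append((attributes.get("type"), attributes["href"]))


# Assumed markup, for illustration only.
sample_html = """
<html><head>
<link rel="alternate" type="application/rss+xml"
      href="http://feeds.delicious.com/v2/rss/tag/linkeddata">
<link rel="stylesheet" href="/style.css">
</head><body></body></html>
"""

finder = AlternateLinkFinder()
finder.feed(sample_html)
print(finder.alternates)
```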

It was my understanding that timbl's original casting of the 4 rules
of Linked Data [1] encompassed this sort of system. But nowadays it
seems to be generally thought that these sorts of systems are not
Linked Data, because they don't follow httpRange-14 or use an RDF
serialization. I've heard that HyperData is a term that might be more
appropriate for these sorts of systems?

//Ed

[1] http://web.archive.org/web/20080417235331/http://www.w3.org/DesignIssues/LinkedData.html