question about sioc / foaf usage

Nathan

Nov 29, 2009, 10:21:37 AM
to pedant...@googlegroups.com
Hi All,

(keeping it short) - Using John Breslin's site as an example,
particularly the following:

http://johnbreslin.com/blog/index.php?sioc_type=site#weblog

to my eye the <foaf:Document rdf:about=""> description should be:
<foaf:Document
rdf:about="http://johnbreslin.com/blog/index.php?sioc_type=site#weblog">

or am I missing something as per usual?!

ps: this isn't to suggest John has made a mistake; rather, I'm looking
at examples in the wild of sioc usage to ensure I do it correctly.

regards,

nathan

Hogan, Aidan

Nov 30, 2009, 9:52:51 AM
to pedant...@googlegroups.com
Hi Nathan,

rdf:about="" is a simple shortcut to refer to the current document -- or
more accurately, the in-scope base URI. In this case -- and after
removing the hash fragment -- rdf:about="" represents [1].

As such, using rdf:about="" is a common RDF/XML (not just SIOC) shortcut
for talking about the document itself. Ideally, John should specify a
baseURI, but that's another topic [2].

Cheers,
Aidan

[1] http://johnbreslin.com/blog/index.php?sioc_type=site
[2] http://pedantic-web.org/fops.html#base
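
As a rough illustration of the resolution rule described above, the following Python sketch derives the URI that rdf:about="" would denote when no xml:base is declared: the retrieval URI with its fragment removed. The function name is made up for this example, and it mirrors only the fragment-stripping step, not a full RDF/XML parser.

```python
from urllib.parse import urldefrag


def resolve_empty_about(retrieval_uri: str) -> str:
    """Return the URI denoted by rdf:about="" for a document fetched from
    retrieval_uri, assuming no xml:base overrides the base URI."""
    base, _fragment = urldefrag(retrieval_uri)  # drop "#weblog" and the like
    return base


requested = "http://johnbreslin.com/blog/index.php?sioc_type=site#weblog"
document_uri = resolve_empty_about(requested)
print(document_uri)
```

If the document did declare an xml:base, that value would take the place of the retrieval URI here.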

Nathan

Nov 30, 2009, 10:00:22 AM
to pedant...@googlegroups.com
Cheers Aidan,

that explains it, assuming it's perfectly okay to not use the shorthand
and be a bit more verbose with rdf:about. (Purely for the benefit of parsing
RDF when called through a proxy, or when you don't know the URI of the
current document.)

regards & thanks again,

Nathan
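
The proxy concern can be sketched in Python with the standard library's URI resolution. The proxy URL below is hypothetical; the point is only that a relative reference follows whatever base the parser believes it has, while an absolute URI does not.

```python
from urllib.parse import urljoin

ORIGIN = "http://johnbreslin.com/blog/index.php?sioc_type=site"
PROXY = "http://proxy.example/fetch?id=42"  # hypothetical proxy address

relative = "#weblog"
absolute = "http://johnbreslin.com/blog/index.php?sioc_type=site#weblog"

# A relative reference is resolved against the base the parser was given.
via_origin = urljoin(ORIGIN, relative)
via_proxy = urljoin(PROXY, relative)

# An absolute URI ignores the base entirely.
absolute_via_proxy = urljoin(PROXY, absolute)

print(via_origin)
print(via_proxy)
print(absolute_via_proxy)
```

Declaring an explicit base URI in the document would be another way to get the same stability without spelling out every rdf:about in full.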

Kingsley Idehen

Nov 30, 2009, 10:27:18 AM
to pedant...@googlegroups.com
Hogan, Aidan wrote:
> Hi Nathan,
>
> rdf:about="" is a simple shortcut to refer to the current document -- or
> more accurately, the in-scope base URI. In this case -- and after
> removing the hash fragment -- rdf:about="" represents [1].
>
And how do I describe [1] ? For instance, since [1] will 200 OK on an
HTTP GET, how do I refer to it in a Linked Data oriented triple?
Basically, [1] is an RDF island host i.e, a document that contains RDF
triples expressed in RDFa. Thus, how would one describe the resource you
refer to at [1]? Is a Data Container not an Item of Data worthy of
structured description (metadata), de-referencable via its own
unambiguous HTTP URI?

Kingsley
--


Regards,

Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com




Nathan

Nov 30, 2009, 11:06:29 AM
to pedant...@googlegroups.com, kid...@openlinksw.com
Kingsley Idehen wrote:
> Hogan, Aidan wrote:
>> Hi Nathan,
>>
>> rdf:about="" is a simple shortcut to refer to the current document -- or
>> more accurately, the in-scope base URI. In this case -- and after
>> removing the hash fragment -- rdf:about="" represents [1].
>>
> And how do I describe [1] ? For instance, since [1] will 200 OK on an
> HTTP GET, how do I refer to it in a Linked Data oriented triple?
> Basically, [1] is an RDF island host i.e, a document that contains RDF
> triples expressed in RDFa. Thus, how would one describe resource you
> refer to at [1]? Is a Data Container not an Item of Data worthy of
> structured description (metadata), de-referencable via its own
> unambiguous HTTP URI?
>

This ties in with something I'm working on (surprise!) and some guidance
may help.

I have:
- a Post on an xhtml page
- the sioc:Post data in rdf
- (and in the future) the content as RDFa

I was planning to use http://somedomain.com/item/123 as both the URL for
the post, and the uri for the rdf data, using content negotiation to
deliver html or rdf.

Then I considered the following in rdf:
<http://somedomain.com/item/123>
a sioc:Post
sioc:link <http://somedomain.com/item/123>

which may make sense when it's RDFa(?) but for now doesn't really?
Further, I'm very keen to avoid extensions (.html, .rdf, .n3 etc.) as I
want to use content negotiation, and the following just seems awful to me:
<http://somedomain.com/item/123.rdf>
a sioc:Post
sioc:link <http://somedomain.com/item/123.html>

thoughts?

regards,

nathan
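
A minimal sketch of the content negotiation described above, assuming a server that offers exactly two representations of http://somedomain.com/item/123. The function name and the simplified Accept parsing (q-values only, no wildcards such as */*) are assumptions for illustration, not the behaviour of any particular framework.

```python
OFFERED = ("text/html", "application/rdf+xml")


def negotiate(accept: str, offered=OFFERED) -> str:
    """Pick the offered media type with the highest q-value in the Accept
    header; fall back to the first offered type if nothing matches."""
    best, best_q = offered[0], -1.0
    for part in accept.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if media in offered and q > best_q:
            best, best_q = media, q
    return best


print(negotiate("application/rdf+xml"))
print(negotiate("text/html, application/rdf+xml;q=0.8"))
```

Note that this only selects a representation; it does not settle whether the post itself and the documents about it should share one URI, which is the question the rest of the thread turns on.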

Nathan

Nov 30, 2009, 11:19:03 AM
to pedant...@googlegroups.com, kid...@openlinksw.com
Please do add sioc:about to the equation:
sioc:about <http://somedomain.com/item/123>

Hogan, Aidan

Nov 30, 2009, 12:07:24 PM
to pedant...@googlegroups.com
Hi Kingsley,

> > rdf:about="" is a simple shortcut to refer to the current document -- or
> > more accurately, the in-scope base URI. In this case -- and after
> > removing the hash fragment -- rdf:about="" represents [1].
> >
> And how do I describe [1] ? For instance, since [1] will 200 OK on an
> HTTP GET, how do I refer to it in a Linked Data oriented triple?
> Basically, [1] is an RDF island host i.e, a document that contains RDF
> triples expressed in RDFa. Thus, how would one describe resource you
> refer to at [1]? Is a Data Container not an Item of Data worthy of
> structured description (metadata), de-referencable via its own
> unambiguous HTTP URI?

Well, yes... this is a slightly tricky philosophical question, but using
[1] as the information-resource URI to represent the document returned
is perfectly okay according to linked data principles:

1. Use URIs as names for things [yep]
2. Use HTTP URIs so that people can look up those names. [yep]
3. When someone looks up a URI, provide useful information, using the
standards (RDF, SPARQL) [yep]
4. Include links to other URIs so that they can discover more things.
[not directly applicable]

Why not use URI [1] to represent the document at [1] in Linked Data?

Kingsley Idehen

Nov 30, 2009, 12:47:42 PM
to pedant...@googlegroups.com
<http://johnbreslin.com/blog/index.php?sioc_type=site> is an
Address/Location (URL) of a Data Container. Thus, the 200 OK treatment
by HTTP. This matter isn't philosophical, far from it. The question is
simply this: is a Document a Data Item or Not? If it is, then it should
have an Identifier that enables its observer (entity describing it) to
associate or discern its characteristics; basically, we should be able
to describe it as we would any other Data Item (or Object).

I do understand:
<http://johnbreslin.com/blog/index.php?sioc_type=site#this> as a simple
mechanism for enabling unambiguous description of its referent (what it
refers to or identifies) which makes an implicit association with
<http://johnbreslin.com/blog/index.php?sioc_type=site>; which holds the
default data representation of referent description.

Richard Cyganiak

Nov 30, 2009, 1:46:16 PM
to pedant...@googlegroups.com
On 30 Nov 2009, at 18:47, Kingsley Idehen wrote:
> <http://johnbreslin.com/blog/index.php?sioc_type=site> is an Address/
> Location (URL) of a Data Container.

The term “Data Container” does not exist in any relevant
specification. Terms from specifications include “information
resource” and “RDF document”. Colloquial terms in common use include
“web document” or “web page”.

> Thus, the 200 OK treatment by HTTP. This matter isn't philosophical,
> far from it. The question is simply this: is a Document a Data Item
> or Not?

The term “Data Item” does not exist in any relevant specification. I'm
not sure what you mean by that term. Googling for “data item” tells me
that it is defined as “1. A named component of a data element; usually
the smallest component. 2. A subunit of descriptive information or
value classified under a data element. For example the data element
"military personnel grade" contains data items such as sergeant,
captain, and colonel.”

Using terms that are defined in Web specifications helps clarity in
communication.

Thank you,
Richard

Kingsley Idehen

Nov 30, 2009, 1:53:33 PM
to pedant...@googlegroups.com
Richard Cyganiak wrote:
> On 30 Nov 2009, at 18:47, Kingsley Idehen wrote:
>> <http://johnbreslin.com/blog/index.php?sioc_type=site> is an
>> Address/Location (URL) of a Data Container.
>
> The term “Data Container” does not exist in any relevant
> specification. Terms from specifications include “information
> resource” and “RDF document”. Colloquial terms in common use include
> “web document” or “web page”.
Richard,

I am using these terms deliberately to reinforce my contempt for:
resource and information resource.

Colloquialism is as subjective as anything else in our real world of
comprehension.
>
>> Thus, the 200 OK treatment by HTTP. This matter isn't philosophical,
>> far from it. The question is simply this: is a Document a Data Item
>> or Not?
>
> The term “Data Item” does not exist in any relevant specification. I'm
> not sure what you mean by that term. Googling for “data item” tells me
> that it is defined as “1. A named component of a data element; usually
> the smallest component. 2. A subunit of descriptive information or
> value classified under a data element. For example the data element
> "military personnel grade" contains data items such as sergeant,
> captain, and colonel.”
>
> Using terms that are defined in Web specifications helps clarity in
> communication.
No I won't, especially as I have a lot of contempt for the specs.

In short, anything that decides to re-write a technical continuum
deserves the contempt it attracts.

Do you seriously believe there wasn't a realm of distributed objects,
identity, data model etc.. before the World Wide Web?

Kingsley
>
> Thank you,
> Richard
>
>
>
>> If it is, then it should have an Identifier that enables its observer
>> (entity describing it) to associate or discern its characteristics;
>> basically, we should be able to describe it as we would any other
>> Data Item (or Object).
>>
>> I do understand:
>> <http://johnbreslin.com/blog/index.php?sioc_type=site#this> as a
>> simple mechanism for enabling unambiguous description of its referent
>> (what it refers to or identifies) which makes an implicit association
>> with <http://johnbreslin.com/blog/index.php?sioc_type=site>; which

Kingsley Idehen

Nov 30, 2009, 1:55:27 PM
to pedant...@googlegroups.com
Richard Cyganiak wrote:
> On 30 Nov 2009, at 18:47, Kingsley Idehen wrote:
>> <http://johnbreslin.com/blog/index.php?sioc_type=site> is an
>> Address/Location (URL) of a Data Container.
>
> The term “Data Container” does not exist in any relevant
> specification. Terms from specifications include “information
> resource” and “RDF document”. Colloquial terms in common use include
> “web document” or “web page”.
>
>> Thus, the 200 OK treatment by HTTP. This matter isn't philosophical,
>> far from it. The question is simply this: is a Document a Data Item
>> or Not?
>
> The term “Data Item” does not exist in any relevant specification. I'm
> not sure what you mean by that term. Googling for “data item” tells me
> that it is defined as “1. A named component of a data element; usually
> the smallest component. 2. A subunit of descriptive information or
> value classified under a data element. For example the data element
> "military personnel grade" contains data items such as sergeant,
> captain, and colonel.”
>
> Using terms that are defined in Web specifications helps clarity in
> communication.

For the sake of others.

How do you describe an information resource via an RDF graph that is
supposed to play well with Linked Data principles?

Kingsley
>
> Thank you,
> Richard
>
>
>
>> If it is, then it should have an Identifier that enables its observer
>> (entity describing it) to associate or discern its characteristics;
>> basically, we should be able to describe it as we would any other
>> Data Item (or Object).
>>
>> I do understand:
>> <http://johnbreslin.com/blog/index.php?sioc_type=site#this> as a
>> simple mechanism for enabling unambiguous description of its referent
>> (what it refers to or identifies) which makes an implicit association
>> with <http://johnbreslin.com/blog/index.php?sioc_type=site>; which

Hogan, Aidan

Nov 30, 2009, 3:52:23 PM
to pedant...@googlegroups.com
Hi Kingsley,

> For the sake of others.
>
> How do you describe an information resource via an RDF graph that is
> supposed to play well with Linked Data principles?

If I understand the intent of your question, you are asking how an
information resource should be identified -- i.e., what's a suitable
URI? To clarify first: what's wrong with -- e.g. -- simply [1]? For me,
this fits well with [2]. How does it not play well with Linked Data
principles? Referring back to earlier:

> using [1] as the information-resource URI to represent the document
> returned is perfectly okay according to linked data principles:
>
> 1. Use URIs as names for things [yep]
> 2. Use HTTP URIs so that people can look up those names. [yep]
> 3. When someone looks up a URI, provide useful information, using
> the standards (RDF, SPARQL) [yep]
> 4. Include links to other URIs so that they can discover more things.
> [not directly applicable]

[2] http://www.w3.org/TR/webarch/#id-resources

Peter Ansell

Nov 30, 2009, 4:04:12 PM
to pedant...@googlegroups.com, publi...@w3.org
2009/12/1 Hogan, Aidan <aidan...@deri.org>:
My impression of the entire debacle is that it is designed to make
sure that every document has at least two identifiers so that
reasoning systems do not have to distinguish between details about the
delivery of the document, and details contained in the document. Some
rdf harvesting engines want to be able to say <URL>
<retrievedWithhttpStatusCode> "200", for example, and the flow-on
effect is that you now apparently can't use the document's URL for any
other purpose because the extra httpStatusCode triple may get added
into the RDF store without a different graph URI. If the statements
are merged in a single graph, there is no way to separate it after
that point because reasoning engines, in this case description logics,
weren't designed with this multiplicity in mind. Interestingly,
everyone is okay with adding <URL> <retrievedWithhttpStatusCode>
"303", because that particular magic value is judged to be immaterial
to the nature of the URL.

That is just my impression of the underlying cause for this entire
debacle without any of the philosophical details about the nature of
the document etc., that always pop up.
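
The merging problem described above can be sketched in Python with plain tuples. Everything here is illustrative: the URL, the property names and the graph names are assumptions, and a real store would use proper named-graph support rather than a fourth tuple element.

```python
URL = "http://example.org/doc"            # hypothetical document URL
CONTENT_GRAPH = URL                        # statements found in the document
RETRIEVAL_GRAPH = URL + "#retrieval"       # hypothetical graph for fetch metadata

# Each statement carries the graph it belongs to as a fourth element.
quads = {
    (URL, "foaf:primaryTopic", URL + "#thing", CONTENT_GRAPH),
    (URL, "ex:retrievedWithHttpStatusCode", "200", RETRIEVAL_GRAPH),
}


def triples_in(graph, quads):
    """Return only the triples asserted in the given graph."""
    return {(s, p, o) for (s, p, o, g) in quads if g == graph}


# Dropping the graph name merges everything; provenance is then gone.
merged = {(s, p, o) for (s, p, o, g) in quads}

print(triples_in(CONTENT_GRAPH, quads))
print(len(merged))
```

With the graph name kept, the delivery detail and the document's own statements about the same URL stay separable; once merged, they do not.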

Cheers,

Peter

Kingsley Idehen

Nov 30, 2009, 5:01:34 PM
to Peter Ansell, pedant...@googlegroups.com, publi...@w3.org
Peter,

My real gripe comes down to the fact that there seems to be an unwritten
rule re. Documents i.e., they aren't material data objects (entities,
data items, resources) re. RDF. Proof of this rule is demonstrated by
the plethora of RDF files that don't assert any relationship between the
RDF file (Data Container) and its structured content (Data Items).

In addition, re. the HTTP system that drives the Web, when you issue an
HTTP GET against a resource (i.e. a file; I don't buy the Information
Resource moniker one bit), a server issues a 200 OK to indicate its
ability to serve a User Agent the resource it requested. Naturally, this
isn't how a Data Identifier works, since Identifiers are independent of
location, values, and structure (these are very old Identity principles
from way before the Web). Instead, you get a 303 if the Identifier looks
like a normal resource URL, or you leverage the Fragment Identifier
component of the URL by taking the remainder of the URL as the address of
the document containing the description of the HTTP URI's referent.
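
The two routes mentioned in the paragraph above (a 303 for slash-style identifiers, fragment stripping for hash-style ones) can be sketched as follows. The redirect table stands in for real HTTP responses and, like the example.org URIs and the function name, is purely hypothetical.

```python
from urllib.parse import urldefrag

# Hypothetical table standing in for a server's "303 See Other" responses.
SEE_OTHER = {
    "http://example.org/id/post-123": "http://example.org/doc/post-123",
}


def describing_document(identifier: str) -> str:
    """Return the address of the document that describes the identifier."""
    document, fragment = urldefrag(identifier)
    if fragment:
        return document                # hash URI: drop the fragment
    if identifier in SEE_OTHER:
        return SEE_OTHER[identifier]   # slash URI: follow the simulated 303
    return identifier                  # plain URL: it is the document address


print(describing_document("http://example.org/doc/post-123#this"))
print(describing_document("http://example.org/id/post-123"))
```

Both identifiers in the usage lines end up at the same document address, which is the association the paragraph describes.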

Thus, as I've stated before (elsewhere), in my world view, all data
objects are equal i.e., if something is worth describing (e.g. a
Document or Data Container or File), it deserves an Identifier, and in
the context of HTTP based data networks -what Linked Data is about - it
means: a Generic HTTP scheme URI.

I assume you've noticed the dearth of RDF examples that include
descriptions of RDF files that are distinct, but connected, to the file
contents.


> Cheers,
>
> Peter

Kingsley Idehen

Nov 30, 2009, 5:37:20 PM
to Ian Davis, Peter Ansell, pedant...@googlegroups.com, publi...@w3.org
Ian Davis wrote:
>> I assume you've noticed the dearth of RDF examples that include descriptions
>> of RDF files that are distinct, but connected, to the file contents.
>>
>
> People have been doing that for years using foaf:primaryTopic. See
> example at http://xmlns.com/foaf/spec/#term_PersonalProfileDocument
> and substitute URIs for the nodeIDs
>
> Ian
>
>
Ian,

Dearth:
noun [in sing. ]
a scarcity or lack of something : there is a dearth of evidence. See
note at lack .

I never said: non existent. A majority of RDF files don't express the
aforementioned relationship.

If you look up Linked Data from spaces associated with myself or OpenLink,
you will see use of the aforementioned property re. the missing relation.
Also, you may find that a few people added the missing triple to their
RDF files after nudges from me.

I hope I've made things clearer?

Peter Ansell

Nov 30, 2009, 7:02:54 PM
to Ian Davis, Kingsley Idehen, pedant...@googlegroups.com, publi...@w3.org
2009/12/1 Ian Davis <li...@iandavis.com>:
> On Mon, Nov 30, 2009 at 10:37 PM, Kingsley Idehen
> <kid...@openlinksw.com> wrote:
>
>>
>> If you lookup Linked Data from spaces associated with myself of OpenLink you
>> will see use the aforementioned property re. missing relation. Also, you may
>> also find out that few people added the missing triple to their RDF files
>> after nudges from me.
>>
>> I hope I've made things clearer?
>
> I've read this thread and I don't understand the fuss. Some people
> aren't linking the document to the data it contains so we should
> encourage them to. Don't know why that is characterised as a debacle.
>

The necessary declaration of "document" as distinct, and yet necessary
for the definition of "data", and the necessity of different URI's for
these two concepts, are fundamental sticking points for many people.

If the HTTP web no longer existed (or the internet connection was
temporarily down), the discussion about document versus data would be
moot. Simple RDF Triple database queries, that do not rely on HTTP
communication, have no necessary need to refer to the
Document/Artifact. Only "data" would exist in the RDF triples (unless
you deliberately blur the division using the notion of foaf:Document
via foaf:primaryTopic for instance). Hence the debacle with saying
that Document is a necessary element to understand and use RDF data
linked together using resolvable HTTP URI's when to many it is just an
artifact that doesn't influence, and shouldn't need to semantically
interfere with, the data/information content that is actually being
referenced.

In the long term, I see it as introducing a permanent link from a
semantic RDF (or other similar format) universe to the current
document segregated web that wouldn't be there if everyone shared
their RDF information through some other system, and for example only
used the URI verbatim to do queries on some global hashtable/index
somewhere where there was no concept of document at the native RDF
level. The definition of Linked Data doesn't specifically say that
HTTP URI's have to be resolved using HTTP GET requests over TCP port
80 using DNS for an intermediate host name lookup as necessary, so why
should it require the notion of documents to be necessary containers
for data pretty much just because that is how HTTP GET semantics work.

I characterise it as a debacle because it has been a recurring
discussion for many years and shows that the semantic community
hasn't quite cleaned up its architecture/philosophy enough for it to
be clear to people who are trying to understand it and utilise it
without delving into philosophical debates.

Cheers,

Peter

Ian Davis

Nov 30, 2009, 5:44:25 PM
to Kingsley Idehen, Peter Ansell, pedant...@googlegroups.com, publi...@w3.org
On Mon, Nov 30, 2009 at 10:37 PM, Kingsley Idehen
<kid...@openlinksw.com> wrote:

>
> If you lookup Linked Data from spaces associated with myself of OpenLink you
> will see use the aforementioned property re. missing relation. Also, you may
> also find out that few people added the missing triple to their RDF files
> after nudges from me.
>
> I hope I've made things clearer?

I've read this thread and I don't understand the fuss. Some people
aren't linking the document to the data it contains so we should
encourage them to. Don't know why that is characterised as a debacle.

Ian

Ian Davis

Nov 30, 2009, 5:28:14 PM
to Kingsley Idehen, Peter Ansell, pedant...@googlegroups.com, publi...@w3.org
>
> I assume you've noticed the dearth of RDF examples that include descriptions
> of RDF files that are distinct, but connected, to the file contents.

Ian Davis

Nov 30, 2009, 8:37:06 PM
to Peter Ansell, Kingsley Idehen, pedant...@googlegroups.com, publi...@w3.org
On Tue, Dec 1, 2009 at 12:02 AM, Peter Ansell <ansell...@gmail.com> wrote:
> The necessary declaration of "document" as distinct, and yet necessary
> for the definition of "data", and the necessity of different URI's for
> these two concepts, are fundamental sticking points for many people.

Who is getting stuck on this point? Documents have URIs, as do the
things documents might contain data about.

> If the HTTP web no longer existed (or the internet connection was
> temporarily down), the discussion about document versus data would be
> moot. Simple RDF Triple database queries, that do not rely on HTTP
> communication, have no necessary need to refer to the
> Document/Artifact. Only "data" would exist in the RDF triples (unless
> you deliberately blur the division using the notion of foaf:Document
> via foaf:primaryTopic for instance). Hence the debacle with saying
> that Document is a necessary element to understand and use RDF data
> linked together using resolvable HTTP URI's when to many it is just an
> artifact that doesn't influence, and shouldn't need to semantically
> interfere with, the data/information content that is actually being
> referenced.

I disagree. Documents aren't HTTP artefacts: they exist happily on
disks, printouts and in books. You can identify the medium (the data
container in Kingsley's words) separately from the things it is
describing (the data items). In fact it is usually necessary to do so,
and intuitive for most people, who can distinguish the publisher of a
book from the protagonist it describes.

>
> In the long term, I see it as introducing a permanent link from a
> semantic RDF (or other similar format) universe to the current
> document segregated web that wouldn't be there if everyone shared
> their RDF information through some other system, and for example only
> used the URI verbatim to do queries on some global hashtable/index
> somewhere where there was no concept of document at the native RDF
> level. The definition of Linked Data doesn't specifically say that
> HTTP URI's have to be resolved using HTTP GET requests over TCP port
> 80 using DNS for an intermediate host name lookup as necessary, so why
> should it require the notion of documents to be necessary containers
> for data pretty much just because that is how HTTP GET semantics work.
>
> I characterise it as a debacle because it has been a recurring
> discussion for many years and shows that the semantic community
> hasn't quite cleaned up its architecture/philosophy enough for it to
> be clear to people who are trying to understand it and utilise it
> without delving into philosophical debates.

It seems pretty clear to me and many others in my experience,
certainly not a debacle.

>
> Cheers,
>
> Peter
>

Ian

Nathan

Dec 1, 2009, 12:38:18 PM
to Kingsley Idehen, Linked Data community, pedant...@googlegroups.com, SIOC-Dev
Hi All,

To follow on a conversation I'm having with Kingsley at the minute, and
to make it public, I'm also cc'ing in public-lod, pedantic-web and the
sioc user list, as it is to do with all 3. Please do give feedback and
correct me where I'm wrong. Especially if you can inline comment where
something is wrong in my understanding.

Kingsley Idehen wrote:
> Nathan wrote:
>> so do / should the Post, HTML Document and RDF Document all have
>> different Identifiers?
> If you want to make a statement (create a record) describing anything
> you need an Identifier for the subject of your description. If you want
> said description (a graph pictorial) to be fully explorable using HTTP
> (what Linked Data is about) then you shouldn't use the URL (Address of a
> Resource) as its Identifier. An HTTP GET against a URL has specific
> consequences distinct from an HTTP GET against a Generic HTTP scheme URI
> (a genuine Identifier/Name that Identifies an Object/Resource/Data
> Item/Entity).
>
> Rather than do the whole 303 and hash URI dance (counter productive
> since it dances around the issue of Data Identity), see if this document
> of Data Object Identity clarifies things for you re. Identifiers.
>
> Links:
>
> 1.
> http://www.cs.cmu.edu/afs/cs.cmu.edu/user/clamen/OODBMS/Manifesto/htManifesto/node4.html
>

okay.. here's the set-up; I have:

* a "Post" which is a <sioc:Post>
* an HTML Document which contains (among other things) a human readable
representation of the <sioc:Post> at a URL
* an RDF Document which contains a graph pictorial of the <sioc:Post>
which is published at a URL

to describe or reference the <sioc:Post> I have to give it a URI:
<http://example.lod/uri/post-123>

to describe or reference the HTML Document I have to give it a URI:
<http://example.lod/uri/html-document-123>
in addition the HTML document has a URL
<http://example.lod/documents/html-document-123.html>

to describe or reference the RDF Document I have to give it a URI:
<http://example.lod/uri/rdf-graph-123>
in addition the RDF document has a URL
<http://example.lod/documents/rdf-document-123.rdf>


now, I'm assuming the RDF Document will need to be self describing (that
is, also contain a graph pictorial about itself, as well as the
<sioc:Post>). Here's a very simplified version of the triples it'd contain:

<http://example.lod/uri/rdf-graph-123> <rdf:type> <foaf:Document> ;
<dc:title> "SIOC Post profile for post-123"@en ;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .

<http://example.lod/uri/post-123> <rdf:type> <sioc:Post> .

Q1: is <foaf:primaryTopic> correct here?

to say that the <sioc:Post> is contained by this graph we'd add the triple:
<http://example.lod/uri/post-123>
<sioc:link> <http://example.lod/uri/rdf-graph-123> .

then we need to say where the rdf graph can be found (provide its URL):
<http://example.lod/uri/rdf-graph-123>
<??????> <http://example.lod/documents/rdf-document-123.rdf> .

Q2: which ontology does one use for <??????> in the above triple?

then we need to say that the HTML document is a document, that contains
a human readable version of the <sioc:Post> (amongst other things)

<http://example.lod/uri/html-document-123>
<rdf:type> <foaf:Document> ;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .

Q3: is the HTML Document a <sioc:Container>, which is a container of the
<sioc:Post>?
<http://example.lod/uri/html-document-123>
<rdf:type> <foaf:Document> , <sioc:Container> ;
<foaf:primaryTopic> <http://example.lod/uri/post-123> ;
<sioc:container_of> <http://example.lod/uri/post-123> .

Q4: should we also say the description of the HTML Document is also
contained by this graph?
<http://example.lod/uri/post-123>
<sioc:link> <http://example.lod/uri/rdf-graph-123> .

Q5: how do we specify the URL of the HTML Document?
<http://example.lod/uri/html-document-123>
<?????> <http://example.lod/documents/html-document-123.html> .

I think that's enough for now; all feedback welcome!

regards

nathan
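
The document-to-topic links in the set-up above can be sketched in Python as plain tuples, together with a small check for documents that are not linked to what they describe. The URIs come from the message; the tuple representation, the prefixed property strings and the function name are assumptions made only for illustration.

```python
POST = "http://example.lod/uri/post-123"
RDF_DOC = "http://example.lod/uri/rdf-graph-123"
HTML_DOC = "http://example.lod/uri/html-document-123"

# Triples as (subject, predicate, object), with prefixed names as strings.
triples = {
    (RDF_DOC, "rdf:type", "foaf:Document"),
    (RDF_DOC, "foaf:primaryTopic", POST),
    (HTML_DOC, "rdf:type", "foaf:Document"),
    (HTML_DOC, "foaf:primaryTopic", POST),
    (POST, "rdf:type", "sioc:Post"),
}


def documents_missing_topic(triples):
    """Return every foaf:Document that has no foaf:primaryTopic."""
    docs = {s for (s, p, o) in triples
            if p == "rdf:type" and o == "foaf:Document"}
    with_topic = {s for (s, p, o) in triples if p == "foaf:primaryTopic"}
    return docs - with_topic


print(documents_missing_topic(triples))
```

A check like this is one way to spot the missing document-to-data relation discussed earlier in the thread; it says nothing about Q2 or Q5, which remain open in the message.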

Ed Summers

Dec 1, 2009, 3:54:22 PM
to pedant...@googlegroups.com
On Nov 30, 8:37 pm, Ian Davis <li...@iandavis.com> wrote:
> Who is getting stuck on this point? Documents have URIs, as do the
> things documents might contain data about.

Perhaps this is just an odd library-land corner case, but I got stuck
when working on Linked Data views for Chronicling America [1]. I
wanted to mint URIs for newspaper pages [2]. At first it seemed to me
that a page of a newspaper was a Document or Information Resource,
since:

"""
The distinguishing characteristic of these resources is that all of
their essential characteristics can be conveyed in a message.
"""

The representation for a Chronicling America page resource is pretty
rich, and allows you to view the textual details of a page up close.
It didn't seem to me that there were any details of the resource that
would be lost in the HTML representation, other than perhaps olfactory
or tactile information. But I felt like I was missing the point of
what a Document is in the context of the web.

At the same time I wanted to assert when the newspaper page was
originally published using dcterms:issued [3]

"""
Date of formal issuance (e.g., publication) of the resource.
"""

So assuming a URI chronam:1234 for the newspaper Page information
resource I'd assert:

chronam:1234 dcterms:issued
"1903-10-04"^^<http://www.w3.org/2001/XMLSchema#date> .

But then a question arose about whether this assertion was saying the
Newspaper Page resource was published in 1903, or if (more strangely)
the HTML document was published in 1903. This led me to think I really
needed to have two resources, one Information Resource for the HTML
document representation of the Newspaper Page, and a Real World Object
resource for the abstract notion of the Newspaper Page as it exists in
the world [4].

Xiaoshu Wang has argued [5] that perhaps what is needed instead is
more precise vocabulary that takes Web Architecture into account. So
in my example I'd have a new property for distinguishing the issue
date of a representation versus the issue date of the resource.

chronam:1234 ex:representationIssued
"2009-12-01"^^<http://www.w3.org/2001/XMLSchema#date> .

In his own words:

"""
... the so-called URI identity issue is unwarranted. The URI's
ambiguity, if there is one, is caused by our ambiguous wording, which
can be simply clarified by using more refined ontological terms.
"""

I find it hard to argue with Xiaoshu's position. I also find it
increasingly difficult aligning the notions of Representation and
Resource from REST with the notions of Information Resource and
Document and Real World Object from the W3C. But I've chalked that up
to not really understanding all the issues and being a newbie to the
area. Any advice for keeping them straight would be appreciated.

I ended up minting two URIs [2,6] and toeing what I thought was the
line. But the experience left me feeling like I was a bit daft, or
missing some key insight. Perhaps the httpbis [7] effort will bring
some clarity?

//Ed

[1] http://chroniclingamerica.loc.gov
[2] http://chroniclingamerica.loc.gov/lccn/sn85066387/1903-10-04/ed-1/seq-45/
[3] http://dublincore.org/documents/dcmi-terms/#terms-issued
[4] http://www.w3.org/TR/cooluris/#semweb
[5] http://dfdf.inesc-id.pt/tr/web-arch#sec4-1
[6] http://chroniclingamerica.loc.gov/lccn/sn85066387/1903-10-04/ed-1/seq-45#page
[7] http://www.ietf.org/dyn/wg/charter/httpbis-charter.html
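
A small Python sketch of the two-URI outcome, using the URIs from [2] and [6]. The 2009-12-01 date is borrowed from the ex:representationIssued example in the message, and the foaf:primaryTopic link between the two resources is an assumption added for illustration rather than something the message states.

```python
DOCUMENT = "http://chroniclingamerica.loc.gov/lccn/sn85066387/1903-10-04/ed-1/seq-45/"
PAGE = "http://chroniclingamerica.loc.gov/lccn/sn85066387/1903-10-04/ed-1/seq-45#page"

triples = [
    (PAGE, "dcterms:issued", "1903-10-04"),      # the newspaper page itself
    (DOCUMENT, "dcterms:issued", "2009-12-01"),  # assumed date for the HTML view
    (DOCUMENT, "foaf:primaryTopic", PAGE),       # assumed link between the two
]


def issued(subject):
    """Return the dcterms:issued values asserted for one subject."""
    return [o for (s, p, o) in triples
            if s == subject and p == "dcterms:issued"]


print(issued(PAGE))
print(issued(DOCUMENT))
```

With separate subjects, the same property can carry both dates without ambiguity; the alternative raised in the message would instead keep one URI and use a more specific property for the representation.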

Kingsley Idehen

Dec 1, 2009, 5:16:58 PM
to nat...@webr3.org, Linked Data community, pedant...@googlegroups.com, SIOC-Dev
Assumption: your Identifiers are slash terminated (i.e. Slash style of
Generic HTTP URI).
>
> now, I'm assuming the RDF Document will need to be self describing (also
> contain a graph pictorial about itself, as well as the <sioc:Post> -
> here's a very simplified version of the triples it'd contain.
>
So the RDF data container (resource) is:

<http://example.lod/documents/rdf-document-123.rdf>, right?

> <http://example.lod/uri/rdf-graph-123> <rdf:type> <foaf:Document> ;
> <dc:title> "SIOC Post profile for post-123"@en
> <foaf:primaryTopic> <http://example.lod/uri/post-123> .
>
> <http://example.lod/uri/post-123> <rdf:type> <sioc:Post> .
>
> Q1: is <foaf:primaryTopic> correct here?
>
Yep.
> to say that the <sioc:Post> is contained by this graph we'd add the triple:
> <http://example.lod/uri/post-123>
> <sioc:link> <http://example.lod/uri/rdf-graph-123> .
>
Redundant, but not necessarily incorrect. You can make redundant
statements :-)
> then we need to say where the rdf graph can be found (provide its URL):
> <http://example.lod/uri/rdf-graph-123>
> <??????> <http://example.lod/documents/rdf-document-123.rdf> .
>

<http://example.lod/documents/rdf-document-123.rdf> is a data set container so you identify it properly as in: <http://example.lod/documents/rdf-document-123.rdf#this>, via a simple URL to Generic HTTP URI hack, with Linked Data de-referencing in mind re. exploration of the description of this Thing/Object/Entity/Data Item. Note: a little change-up as I've added a new Identifier but taken the cheap # route via fragment identifier.

This also means you could have stated the following at the top:

<http://example.lod/documents/rdf-document-123.rdf#this> <rdf:type> <foaf:Document> ;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .

<http://example.lod/uri/post-123> <rdf:type> <sioc:Post>;
<dc:title> "SIOC Post profile for post-123"@en.

OR even the following, assuming you'd already assigned these URIs and discovered that <http://example.lod/uri/rdf-graph-123> is basically the same as <http://example.lod/documents/rdf-document-123.rdf#this> i.e., RDF data set containers (documents or information resources):

<http://example.lod/documents/rdf-document-123.rdf#this> <rdf:type> <foaf:Document> ;
<owl:sameAs> <http://example.lod/uri/rdf-graph-123>;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .

<http://example.lod/uri/post-123> <rdf:type> <sioc:Post>;
<dc:title> "SIOC Post profile for post-123"@en.
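
The snippets in this message write prefixed names inside angle brackets as shorthand. For anyone who wants to feed the second variant to a Turtle parser, here is a minimal sketch with the prefixes declared. The namespace URIs are the commonly used ones; binding dc: to DC Terms is an assumption, since the thread does not say which Dublin Core namespace is meant.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix sioc: <http://rdfs.org/sioc/ns#> .
@prefix dc:   <http://purl.org/dc/terms/> .   # assumption: DC Terms

# The RDF document (data container), named via the #this fragment.
<http://example.lod/documents/rdf-document-123.rdf#this>
    rdf:type foaf:Document ;
    owl:sameAs <http://example.lod/uri/rdf-graph-123> ;
    foaf:primaryTopic <http://example.lod/uri/post-123> .

# The post itself, described separately from the document about it.
<http://example.lod/uri/post-123>
    rdf:type sioc:Post ;
    dc:title "SIOC Post profile for post-123"@en .
```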




> Q2: which ontology does one use for <??????> in the above triple?
>
None.
> then we need to say that the HTML document is a document, that contains
> a human readable version of the <sioc:Post> (amongst other things)
>
> <http://example.lod/uri/html-document-123>
> <rdf:type> <foaf:Document> ;
> <foaf:primaryTopic> <http://example.lod/uri/post-123> .
>
> Q3: is the HTML Document a <sioc:Container>, which is a container of the
> <sioc:Post>?
> <http://example.lod/uri/html-document-123>
> <rdf:type> <foaf:Document> , <sioc:Container> ;
> <foaf:primaryTopic> <http://example.lod/uri/post-123> ;
> <sioc:container_of> <http://example.lod/uri/post-123> .
>
Yes, esp. as <sioc:Post> <rdfs:subClassOf> <sioc:Item> .

Note same applies to the RDF data container as in:

<http://example.lod/uri/rdf-graph-123> <rdf:type> <foaf:Document> , <sioc:Container> ;
<foaf:primaryTopic> <http://example.lod/uri/post-123> ;
<sioc:container_of> <http://example.lod/uri/post-123> .

OR
<http://example.lod/uri/rdf-graph-123> <rdf:type> <foaf:Document> , <sioc:Container> ;
<owl:sameAs> <http://example.lod/documents/rdf-document-123.rdf#this>;
<foaf:primaryTopic> <http://example.lod/uri/post-123> ;
<sioc:container_of> <http://example.lod/uri/post-123> .



> Q4: should we also say the description of the HTML Document is also
> contained by this graph?
> <http://example.lod/uri/post-123>
> <sioc:link> <http://example.lod/uri/rdf-graph-123> .
>

<http://example.lod/uri/rdf-graph-123> <sioc:link> <http://example.lod/uri/html-document-123>.
or even:
<http://example.lod/uri/rdf-graph-123> <foaf:topic> <http://example.lod/uri/html-document-123>.


> Q5: how do we specify the URL of the HTML Document?
> <http://example.lod/uri/html-document-123>
> <?????> <http://example.lod/documents/html-document-123.html> .
>
Remember the earlier statement re. the RDF document (resource):


<http://example.lod/documents/rdf-document-123.rdf#this> <rdf:type> <foaf:Document> ;
<owl:sameAs> <http://example.lod/uri/rdf-graph-123>;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .


Re. HTML resource description same thing applies re. association with the sioc:Post:

<http://example.lod/documents/html-document-123.html#this> <rdf:type> <foaf:Document>;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .

<http://example.lod/uri/post-123> <rdf:type> <sioc:Post>;
<dc:title> "SIOC Post profile for post-123"@en.

OR

<http://example.lod/documents/html-document-123.html#this> <rdf:type> <foaf:Document> ;
<owl:sameAs> <http://example.lod/uri/html-document-123>;
<foaf:primaryTopic> <http://example.lod/uri/post-123> .


<http://example.lod/uri/post-123> <rdf:type> <sioc:Post>;
<dc:title> "SIOC Post profile for post-123"@en.


> I think that's enough for now; all feedback welcome!
>
> regards
>
> nathan
>
>
Bar any typos or cut&paste snafus, I've hopefully answered your questions.
Ultimately, the file (information resource, document, data container)
has its own set of attributes e.g. format (dcterms:format), actual file
name (not title of the content), creation date etc.. Distinct from the
description of its content (hence the use of foaf:primaryTopic as
conduit to content description graph).
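
A rough sketch of that separation, using the example URIs from this thread. The media type and date values are hypothetical, and a plain literal is used for dcterms:format only for brevity.

```turtle
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix sioc:    <http://rdfs.org/sioc/ns#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

# Attributes of the file (the data container) itself.
<http://example.lod/documents/rdf-document-123.rdf#this>
    a foaf:Document ;
    dcterms:format "application/rdf+xml" ;        # hypothetical value
    dcterms:created "2009-12-01"^^xsd:date ;      # hypothetical value
    foaf:primaryTopic <http://example.lod/uri/post-123> .

# Attributes of the content the file is about.
<http://example.lod/uri/post-123>
    a sioc:Post ;
    dcterms:title "SIOC Post profile for post-123"@en .
```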

Link:

1.
http://linkeddata.uriburner.com/about/html/http://news.cnet.com/8301-13577_3-10407056-36.html?tag=newsEditorsPicksArea.0
- example of Linked Data graph that describes a document (information
resource) in a manner distinct from its content (see the data exposed by
foaf:primaryTopic) .

Nathan

Dec 1, 2009, 5:37:57 PM
to Kingsley Idehen, Linked Data community, pedant...@googlegroups.com, SIOC-Dev
perfect, thanks kingsley :)

only q (which i still don't follow) is that afaik I *need* to specify in
rdf where one can find the HTML document, no point describing something
people can't find... noted that in your own rdf you use:
<resource>
<http://www.openlinksw.com/schema/attribution#isDescribedUsing> <url>

i essentially need the equiv for anything;

<http://example.lod/uri/html-document-123>
<canBeFound> <here/URL> .
or
<has_link> <here/URL> .

the thing I'm describing can be found at web address, ie show the human
this version etc etc (if you follow)

regards,

nathan

natlu2809

Dec 1, 2009, 5:40:18 PM
to pedant...@googlegroups.com, nat...@webr3.org, Linked Data community, SIOC-Dev
Maybe I'm not understanding the dichotomy here:

  • A URI represents a thing, or is an address for a thing
  • Different things have different URIs
  • Different URIs represent different things - the POST, the html doc/serialisation, the rdf doc/serialisation
  • URIs are a front for code that generates things
but
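  • A URI can represent the same thing in different serialisations depending on which agent/device/lens you look at it with
but
  • a different URI can represent the same thing as another URI - http://example.lod/doc.html can be the same thing as http://example.lod/resource/doc when requested by a html agent ?

The identity however is maintained by the "fingerprint" of the object graphs, and the URI is just an image of that fingerprint at some point in time/location ?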

Nathan

Dec 1, 2009, 5:43:24 PM
to Kingsley Idehen, Linked Data community, pedant...@googlegroups.com, SIOC-Dev
perhaps foaf:page ? like dbpedia uses?

Kingsley Idehen

Dec 1, 2009, 6:30:12 PM
to nat...@webr3.org, Linked Data community, pedant...@googlegroups.com, SIOC-Dev
Nathan,

<http://www.openlinksw.com/schema/attribution#isDescribedUsing> takes URIs of ontologies/schemas/vocabs as values (i.e. object slot in triple statements we make).

For a simple outbound link, <sioc:links_to> would be fine.

Note though, my examples above imply that URLs shouldn't be in triples: use #this to make them URIs, and then for visual cues you can use an icon to capture the URL (for link-out purposes, which is where #this is a cheap solution) while the URI enables Linked Data traversal. If you look at our pages we use "owl:sameAs" to similar effect (note the link-out icons that typically match the content type, thereby enabling users to exploit the effect of URLs, since apps know what to do with the data retrieved from URLs).

Kingsley Idehen

Dec 1, 2009, 6:32:04 PM
to pedant...@googlegroups.com, Linked Data community, SIOC-Dev
Nathan wrote:
> [SNIP]
>> perfect, thanks kingsley :)
>>
>> only q (which i still don't follow) is that afaik I *need* to specify in
>> rdf where one can find the HTML document, no point describing something
>> people can't find... noted that in your own rdf you use:
>> <resource>
>> <http://www.openlinksw.com/schema/attribution#isDescribedUsing> <url>
>>
>> i essentially need the equiv for anything;
>>
>> <http://example.lod/uri/html-document-123>
>> <canBeFound> <here/URL> .
>> or
>> <has_link> <here/URL> .
>>
>> the thing I'm describing can be found at web address, ie show the human
>> this version etc etc (if you follow)
>>
>
> perhaps foaf:page ? like dbpedia uses?
>
>
No problem, but note my comment re. use of icons as visual cue if you
are making an HTML based browser page. Either way, foaf:page is fine.
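
For reference, a minimal sketch of the foaf:page approach with the example URIs from this thread. FOAF defines foaf:page as the inverse of foaf:topic, so either direction says the same thing; which resource to put in the subject slot is a modelling choice, and linking from the post is only one option.

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# foaf:page points from a thing to a page or document about it.
<http://example.lod/uri/post-123>
    foaf:page <http://example.lod/documents/html-document-123.html> .

# The same relationship stated from the document's side.
<http://example.lod/documents/html-document-123.html>
    foaf:topic <http://example.lod/uri/post-123> .
```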

Nathan

Dec 1, 2009, 6:40:02 PM
to kid...@openlinksw.com, pedant...@googlegroups.com, Linked Data community
Kingsley Idehen wrote:
> Nathan wrote:
>> [SNIP]

i think it's safe to say I grok this all now (lod); armed with
everything i need, and full comprehension to do a month's work in the
next 4 days!

kingsley, sincerely, thank you for everything - you've been invaluable
in this process and looking forward to helping spread the word and help
people understand + implement wherever possible over the coming months
(and on).

many many thanks to all who've helped out, by no means am i undermining
the help of you all by specifically mentioning kingsley, just he's
committed a vast amount of hours from both himself and the team at
openlink to getting me through this.

many regards,

nathan

Richard Cyganiak

Dec 2, 2009, 6:25:38 AM
to pedant...@googlegroups.com
Hi Ed,

A newspaper page (even an abstract one, that is, a Manifestation
rather than Item in the FRBR sense) is not the same as a web page.

You have a web page whose topic is a newspaper page.

The newspaper page perhaps is an information resource as well,
according to the AWWW definition of information resource, but knowing
that doesn't actually change anything; a web page describing an
information resource is no different from a web page describing, let's
say, a person, from the web architecture POV.

About Xiaoshu's view, his way of looking at the world is an
alternative possibility to the one that's commonly used in LOD -- I
will call that one the “LOD view”, but you might call it “Richard's
view” if you prefer. So, both the LOD view and Xiaoshu's view are
reasonable, but *combining* both leads to all sorts of issues, so you
should pick one and stick to it. The difference, explained on the
example of FRBR, is that Xiaoshu would have a single identifier, and
different properties to distinguish the “facets” of the resource:

<http://example.com/book123>
frbr:workIssued "1872";
frbr:expressionIssued "1878";
frbr:manifestationIssued "2006";
frbr:itemIssued "2009";
.

The LOD modelling would be:

<http://example.com/book123#work> dc:issued "1872" .
<http://example.com/book123#expression> dc:issued "1878" .
<http://example.com/book123#manifestation> dc:issued "2006" .
<http://example.com/book123#item> dc:issued "2009" .

plus FRBR properties for relating the four resources (frbr:realizes
etc). In the LOD view, having different resources is indicated because
the things have different identity (i.e., were created at different
times).
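
A sketch of what the linked version might look like in full. It assumes the FRBR Core vocabulary at http://purl.org/vocab/frbr/core# and its realizationOf / embodimentOf / exemplarOf properties; those property and class names are an assumption here and should be checked against whichever FRBR vocabulary is actually used. dc: is bound to DC Terms, where "issued" is defined.

```turtle
@prefix frbr: <http://purl.org/vocab/frbr/core#> .   # assumption: FRBR Core
@prefix dc:   <http://purl.org/dc/terms/> .

<http://example.com/book123#work>
    a frbr:Work ;
    dc:issued "1872" .

<http://example.com/book123#expression>
    a frbr:Expression ;
    frbr:realizationOf <http://example.com/book123#work> ;
    dc:issued "1878" .

<http://example.com/book123#manifestation>
    a frbr:Manifestation ;
    frbr:embodimentOf <http://example.com/book123#expression> ;
    dc:issued "2006" .

<http://example.com/book123#item>
    a frbr:Item ;
    frbr:exemplarOf <http://example.com/book123#manifestation> ;
    dc:issued "2009" .
```

Each resource carries its own dc:issued, so one property suffices where the single-identifier model needs four.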

Personally I agree with Xiaoshu that the definition of Information
Resource from AWWW is not particularly helpful. Personally, these days
I try to talk about “web document” and “thing described in a web
document.”

Best,
Richard

Kingsley Idehen

Dec 2, 2009, 7:07:49 AM
to natlu2809, pedant...@googlegroups.com, nat...@webr3.org, Linked Data community, SIOC-Dev
natlu2809 wrote:
> Maybe I'm not understanding the dichotomy here:
>
> * A URI represents a thing, or is an address for a thing
>
URI Identifies a Thing. URIs basically have Referents (the things they
Identify).
A URL is a Resource Location/Address.
>
> * Different things have different URIs
>
Yes, as is the case in real life. Everything of importance to you has an
Identifier, otherwise you wouldn't be able to describe or recognize it
as distinct from other things.
>
> * Different URIs represent different things - the POST, the html
> doc/serialisation, the rdf doc/serialisation
> * URIs are a front for code that generates things
>
I would say a powerful abstraction, especially when looking at Generic
HTTP scheme URIs. For instance, each component of said URIs affects the
Data Representation that manifests when you issue an HTTP GET. This is
kind of like a composite (compound / concatenated) key in an RDBMS:
change a component and all associated data changes, and said changes
imply anything from different data representations to the construction
or breakage of data relations. You basically get two things in one:
Identity (Reference)/Access (Address) duality, with Generic HTTP URIs.

Now here is the problem (as I've seen and experienced it), there is a
tendency to conflate a Generic URI with a Generic HTTP URI, the former
includes schemes like URN while the latter doesn't. Even worse, there is
a tendency to simply never mention URLs, and thereby conflate this
Location / Address oriented Identifier with a Generic HTTP URI which
simply makes everything confusing and inconsistent.
>
> but
>
> * A URI can represent the same thing in different serialisations
> depending on which agent/device/lens you look at it with
>
A Generic HTTP URI is a conduit to a myriad of associated data
representations (remember its duality).
>
> but
>
> * a different URI can represent the same thing as another URI -
> http://example.lod/doc.html can be the same thing as
> http://example.lod/resource/doc when requested by a html agent ?
>
>
You can have different Identifiers for the same thing irrespective of
URI scheme. The Generic HTTP URI simply adds resolvability (data access)
to the mix courtesy of the HTTP scheme.
> The identity however is maintained by the "fingerprint" of the object
> graphs, and the URI is just an image of that fingerprint at some point
> in time/location ?
I think Identity is managed by the beholder of things, the one that
deems them important enough to be described, mentioned, talked about, or
referenced :-)

Kingsley
>>>> http://www.cs.cmu.edu/afs/cs.cmu.edu/user/clamen/OODBMS/Manifesto/htManifesto/node4.html