Re: Atom Triples Internet Draft

Story Henry

Jul 2, 2008, 10:47:43 AM
to Beckett Dave, semantic-web@w3.org Web, atom...@googlegroups.com, Mark Nottingham, Atom-Protocol Protocol
Thanks Dave for this proposal to link atom and rdf.

A few remarks:

1. one can already embed rdf in atom
------------------------------------

Just as a matter of interest for those who do not know the atom
format, one can already embed rdf in atom quite simply.

<entry>
<title>syndeocms Project</title>
<link href="http://doapspace.org/doap/sf/syndeocms"/>
<id>http://doapspace.org/doap/sf/syndeocms</id>
<updated>2007-12-13T18:30:02Z</updated>
<summary>Some text..</summary>
<content type="text/rdf+n3">
@prefix doap: &lt;http://usefulinc.com/ns/doap#&gt; .
&lt;http://projects.com/1&gt; a doap:Project;
doap:name "Project 1" .
</content>
</entry>

It is interesting to think about what the Atom Triples proposal adds
to this. More below.
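
As a rough sketch of what a consumer sees with this kind of embedding,
the following Python (standard library only) parses an entry like the
one above and reads its content element. The Atom namespace
declaration and the variable names are assumptions added for
illustration. The point is only that the N3 arrives as an opaque
string tagged with a media type, and nothing forces a reader to merge
it with anything else.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# The entry from the example above, with the Atom namespace declared
# (assumed here; the snippet in the text omits it for brevity).
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>syndeocms Project</title>
  <link href="http://doapspace.org/doap/sf/syndeocms"/>
  <id>http://doapspace.org/doap/sf/syndeocms</id>
  <updated>2007-12-13T18:30:02Z</updated>
  <summary>Some text..</summary>
  <content type="text/rdf+n3">
    @prefix doap: &lt;http://usefulinc.com/ns/doap#&gt; .
    &lt;http://projects.com/1&gt; a doap:Project;
      doap:name "Project 1" .
  </content>
</entry>"""

entry = ET.fromstring(ENTRY)
content = entry.find(ATOM + "content")
media_type = content.get("type")
# The XML parser has already decoded &lt; and &gt; at this point.
n3_literal = content.text.strip()

print(media_type)
print(n3_literal)
```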

2. Mappings from Atom to RDF already exist
------------------------------------------

As another point of reference one has to point out that links between
rdf and atom are being worked on.
Projects such as the atom-owl group have started looking at designing
an ontology and a mapping for this.
http://groups.google.com/group/atom-owl
The latest spec is available here:
http://bblfish.net/work/atom-owl/2006-06-06/AtomOwl.html
with XSLT and XQuery transforms. David Powell has also worked on what
turns out to be a nearly isomorphic ontology atom-rdf

Clearly the RDF in the content of that atom entry has to be
interpreted as a literal, otherwise a feed with a number of entries
saying contradictory things could produce a nonsensical graph on GRDDL
extraction. That is, the content is close to an N3 graph. We could
nearly translate the example above into the following N3:

@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] a :Entry;
  :title "syndeocms Project";
  :alternate <http://doapspace.org/doap/sf/syndeocms>;
  :id "http://doapspace.org/doap/sf/syndeocms"^^xsd:anyURI;
  :updated "2007-12-13T18:30:02Z"^^xsd:dateTime;
  :summary "Some text";
  :content {
    @prefix doap: <http://usefulinc.com/ns/doap#> .

    <http://projects.com/1> a doap:Project;
      doap:name "Project 1" .
  } .

There is no general rule that one has to merge any two graphs one
finds on the internet. How, and when to merge two graphs is a matter
of trust and choice of lifting rules. RDF semantics does state rules
about merging two graphs one believes to be true.

Given the way atom is used, though, it is quite possible that two
entries in the same feed with the same id have contradictory content,
the second entry being an update of the first. Perhaps a mistake was
made on first publication. It follows that the content above need not
be merged with the surrounding context, or with the content of other
entries in the same feed, let alone other feeds. As a result, it is
true that the content would not be the place to add new metadata about
the feed or entry objects themselves.

This points to the need of something like what is being proposed by
the AtomTriples specification.


3. embedding rdf in atom
------------------------

It is clearly stated in the introduction that the aim of this format
is to embed rdf in atom:

[[
This specification describes AtomTriples, a set of Atom [RFC4287]
extension elements for embedding RDF
[W3C.WD-rdf-syntax-grammar-20031010] statements in Atom documents
(both element and feed), as well as declaring how they can be derived
from existing content.
]]

The at:md element does in fact allow the embedding of any rdf in the
atom feed or entry elements. The following statement is very odd though:

[[
Likewise, the mechanics of combining metadata from multiple instances
of the same entry, or from multiple feed documents, is out of the
scope of this specification.
]]

This cannot be right. You cannot use rdf in a format and yet say
nothing about how to use that RDF. RDF semantics makes it very clear
how to merge relations from different graphs. If you are embedding RDF
in atom, there has to be a way to make sense of what that embedding
means, or else how can one know that it is rdf that you have embedded
in atom, and not something completely different that just happens to
look very much like a well known serialisation of rdf?

That is, one has to have a story of what the merged graph looks like
if one believes all the information to be correct. Someone who
publishes a feed clearly makes a statement about the content of the
feed, and the statement should be taken at face value to say something
true.

If one really does not want such a merge, then would the right place
to put this metadata not be the <content> element, where clearly we
have a literal, and there is no obligation to merge the log:semantics
of literals?

Or are you saying that the rdf in the at:md elements is really a
literal? How then does it differ from the content?


4. problems with at:md element
------------------------------

The at:md element allows one to place rdf anywhere in a feed or entry
element, and allows one to speak of anything.

[[
The subject of these statements is, by default, the value of the
atom:id element in the same context (atom:element or atom:feed).
However, this behaviour MAY be overridden by specifying the subject
attribute.
]]

Since the rdf one can place inside the at:md element can be about
anything, one wonders what the point of the default behavior in the
spec is. It turns out that by default the subject of the at:md element
should be the resource identified by the atom:id of the feed or entry,
or some resource related by a link relation.
Furthermore, the way to find the URI of the link relation is extremely
contorted:

[[
It MUST contain a URI which MUST be interpreted as a link relation;
the first such occurrence of an atom:link element in the same context
as its parent element with that relation (in lexical order) will
indicate the URI to use as the subject.
]]

Since the order of link relations in atom is insignificant, this
breaks what little semantics atom defines.
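
To make the ordering problem concrete, here is a minimal Python
sketch of one reading of the quoted rule: pick the href of the first
atom:link with the requested relation, in document order. The function
name, the "related" relation and the example.com URLs are all
hypothetical. Two entries that carry the same set of links, merely
written in a different order, end up with different subjects.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def subject_by_first_link(entry_xml, rel):
    """Return the href of the first atom:link with the given rel, in document order."""
    entry = ET.fromstring(entry_xml)
    for link in entry.findall(ATOM + "link"):
        if link.get("rel") == rel:
            return link.get("href")
    return None

# Hypothetical links; only their order differs between the two entries.
LINK_A = '<link rel="related" href="http://example.com/a"/>'
LINK_B = '<link rel="related" href="http://example.com/b"/>'
TEMPLATE = '<entry xmlns="http://www.w3.org/2005/Atom">{0}{1}</entry>'

first = subject_by_first_link(TEMPLATE.format(LINK_A, LINK_B), "related")
second = subject_by_first_link(TEMPLATE.format(LINK_B, LINK_A), "related")
print(first, second)
```

A consumer that treats link order as insignificant, as atom allows,
has no principled way to choose between the two results.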

Furthermore both the id and the link relations MUST have URIs! So
there is absolutely no need to have these default behaviors since RDF
has many constructs to make speaking about anything with a URI very
easy.

And even worse than all of the above, the default behavior makes it
difficult to speak about the one thing that it may be important to
speak about, namely THE ENTRY (or the feed) ITSELF!!!!

In Atom Owl every entry is identified by a blank node, which has a
functional relation awol:id to an id, and a functional relation
awol:updated to a time stamp. It is not easy to speak about individual
entries in atom since they don't have identifiers. So the obvious
location to put information about them is as children of that element
as hinted at by the atom spec in section 6.4.1, which admittedly is
about simple extensions, but nevertheless makes the point well:

[[
The element can be interpreted as a simple property (or name/value
pair) of the parent element that encloses it. The pair consisting of
the namespace-URI of the element and the local name of the element can
be interpreted as the name of the property. The character data content
of the element can be interpreted as the value of the property. If the
element is empty, then the property value can be interpreted as an
empty string.
]]
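
The reading suggested by that section can be sketched in a few lines
of Python. This is only an illustration under assumptions: the
function name and the ex: namespace are made up, and "simple" is taken
to mean a non-Atom child element with no attributes and no child
elements. Each such child becomes a (namespace-URI, local name, value)
property of the enclosing entry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def simple_extension_properties(entry_xml):
    """Collect (namespace-URI, local-name, value) for simple extension children."""
    entry = ET.fromstring(entry_xml)
    props = []
    for child in entry:
        # ElementTree spells qualified names as "{namespace}local".
        if not child.tag.startswith("{"):
            continue
        ns, local = child.tag[1:].split("}", 1)
        is_simple = len(child) == 0 and not child.attrib
        if ns != ATOM_NS and is_simple:
            # An empty element is read as the empty string.
            props.append((ns, local, child.text or ""))
    return props

# Hypothetical entry using a made-up extension namespace.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
                  xmlns:ex="http://example.org/ns#">
  <id>http://example.com/a</id>
  <title>Test</title>
  <ex:rating>5</ex:rating>
  <ex:note/>
</entry>"""

props = simple_extension_properties(ENTRY)
print(props)
```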


5. the mapping section
----------------------

The mapping section that allows one to make ad hoc mappings from atom
elements to rdf relations is broken in two ways.

Take the examples:

[[
<at:map property="http://purl.org/dc/elements/1.1/title">atom:title</at:map>

indicates that the atom:title element's content should be mapped to
the http://purl.org/dc/elements/1.1/title property. Given the entry

<atom:entry>
<atom:id>http://example.com/a</atom:id>
<atom:title>Test</atom:title>
</atom:entry>

and the map above as a child of at:entrymap, the following triple
would be implied;

<http://example.com/a> <http://purl.org/dc/elements/1.1/title>
"Test" .

]]


5.1 Wrong place to put the mapping
- - - - - - - - - - - - - - - - - -

This is the wrong place to put these mappings. Much better would be to
create a general semantics of atom, and then use RDF semantics to
create these mappings. So for example it would be easy to add the
following to atom owl:

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix awol: <http://bblfish.net/work/atom-owl/2006-06-06/#> .

awol:title rdfs:subPropertyOf dc:title .

Why add it every time to the atom document? Is this something that you
think is going to be changing from one atom feed to another?
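
For readers unfamiliar with how such a statement does the work of a
per-feed map, here is a small Python sketch of the RDFS
subproperty rule applied to triples held as plain tuples. It is not a
real RDF library; the function name and the blank node label
"_:entry1" are assumptions for illustration. Stating the subproperty
once in the ontology is enough for every awol:title to imply a
dc:title.

```python
DC = "http://purl.org/dc/elements/1.1/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
AWOL = "http://bblfish.net/work/atom-owl/2006-06-06/#"

def apply_subproperty(triples):
    """Add (s, q, o) for every (s, p, o) where p rdfs:subPropertyOf q, to a fixed point."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        sub_props = [(s, o) for (s, p, o) in inferred
                     if p == RDFS + "subPropertyOf"]
        for (p, q) in sub_props:
            for (s, pred, o) in list(inferred):
                if pred == p and (s, q, o) not in inferred:
                    inferred.add((s, q, o))
                    changed = True
    return inferred

graph = {
    # Stated once, in the ontology.
    (AWOL + "title", RDFS + "subPropertyOf", DC + "title"),
    # Extracted from some feed; the subject is the entry's blank node.
    ("_:entry1", AWOL + "title", "Test"),
}
closed = apply_subproperty(graph)
print(("_:entry1", DC + "title", "Test") in closed)
```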


5.2 the semantics are wrong
- - - - - - - - - - - - - -


After a lot of work the AtomOwl group has developed a semantics for
atom expressed in RDF. David Powell independently developed an
ontology that was shown to be mostly isomorphic. Both would agree on
the following: since atom allows two entries to have the same id, one
should *not* make the id URI the subject of the title relations. The
following feed is valid atom:


<atom:feed>
...
<atom:entry>
<atom:id>http://example.com/a</atom:id>
<atom:title>Gold increases</atom:title>
<atom:updated>2008-06-13T18:30:02Z</atom:updated>
<atom:content>The price of gold has just gone up</atom:content>
</atom:entry>

<atom:entry>
<atom:id>http://example.com/a</atom:id>
<atom:title>Gold value rises</atom:title>
<atom:updated>2008-06-14T02:30:02Z</atom:updated>
<atom:content>The price of gold has just gone up by 20%</atom:content>
</atom:entry>
...
</atom:feed>

if the suggested mapping were right then the meaning of this would be

[] a awol:Feed;
  awol:entry <http://example.com/a> .

<http://example.com/a> dc:title "Gold increases", "Gold value rises" ;
  awol:content "The price of gold has just gone up by 20%",
               "The price of gold has just gone up" ;
  awol:updated "2008-06-13T18:30:02Z"^^xsd:dateTime,
               "2008-06-14T02:30:02Z"^^xsd:dateTime .


which would make it impossible to work out:
- which title goes with which content
- which title goes with which time stamp
- which time stamp goes with which content


yet that information is very clear in the atom document, and can
furthermore be very clearly expressed in rdf without ambiguity, (but
with some simplification) as:


@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] a :Feed;
  :entry [
    :id "http://example.com/a"^^xsd:anyURI;
    :title "Gold increases";
    :updated "2008-06-13T18:30:02Z"^^xsd:dateTime;
    :content "The price of gold has just gone up";
  ];
  :entry [
    :id "http://example.com/a"^^xsd:anyURI;
    :title "Gold value rises";
    :updated "2008-06-14T02:30:02Z"^^xsd:dateTime;
    :content "The price of gold has just gone up by 20%";
  ]
.

Notice how we can very well associate which title goes with which
entry, which updated time stamp, and which content.
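
The difference between the two mappings can be checked mechanically.
The Python sketch below is a toy model, not an RDF library: the
function names and blank node labels are made up. It builds triples
for the two gold entries both ways and groups the values by subject.
With the id URI as subject the two titles collapse onto one node; with
one blank node per entry each title stays with its own time stamp and
content.

```python
from collections import defaultdict

# The two entries from the feed above, as plain dictionaries.
ENTRIES = [
    {"id": "http://example.com/a", "title": "Gold increases",
     "updated": "2008-06-13T18:30:02Z",
     "content": "The price of gold has just gone up"},
    {"id": "http://example.com/a", "title": "Gold value rises",
     "updated": "2008-06-14T02:30:02Z",
     "content": "The price of gold has just gone up by 20%"},
]

def triples_with_id_subject(entries):
    """Use the atom:id URI as the subject of every property."""
    return {(e["id"], key, e[key])
            for e in entries for key in ("title", "updated", "content")}

def triples_with_blank_nodes(entries):
    """Give every entry its own blank node; the id becomes one more property."""
    return {("_:e%d" % i, key, e[key])
            for i, e in enumerate(entries) for key in e}

def values_by_subject(triples):
    """Group objects by subject and then by predicate."""
    grouped = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        grouped[s][p].add(o)
    return grouped

merged = values_by_subject(triples_with_id_subject(ENTRIES))
separate = values_by_subject(triples_with_blank_nodes(ENTRIES))
print(len(merged["http://example.com/a"]["title"]))  # titles on the shared subject
print(len(separate["_:e0"]["title"]))                # titles on the first entry node
```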

The current Atom Triples spec would force the wrong default
interpretation of atom.


Given the above, I have to conclude that the spec is badly broken. I
would suggest first working on an official semantics for atom, then
working on a better way to add general extensions to it that would
work well with the semantics. This would be useful in getting a
general semantics together and in making sure the extensions were
meaningful.


Yours sincerely,

Henry Story


On 1 Jul 2008, at 21:29, Dave Beckett wrote:
> Mark Nottingham and myself have co-authored an internet draft for
> transporting RDF in Atom. We've called it "Atom Triples" since
> the focus is on Atom, annotating/adding the triples to the existing
> format. Where we had a choice of the atom way or the rdf way, we
> picked the atom way.
>
> So the purpose of this format is to allow adding of triples for
> descriptions of the resources in an atom feed, using the URI
> of one atom:link as the main resource. The body of the at:md
> is typically the blank-node-closure of the graph associated with
> the main resource. Or at least, that's how I've done it so far.
>
> This is Version 0 and we know there are some things in the example
> that need clarifying and expanding and other questions, but here it
> is:
>
> AtomTriples: Embedding RDF Statements in Atom
> http://www.ietf.org/internet-drafts/draft-nottingham-atomtriples-00.txt
>
> Usual IETF I-D caveat: this URL will expire.
>
> Dave
> & Mark

Story Henry

Jul 2, 2008, 12:10:30 PM
to Phil Archer, Beckett Dave, semantic-web@w3.org Web, atom...@googlegroups.com, Mark Nottingham, Atom-Protocol Protocol
Hi Phil,

thanks for the pointer to the POWDER spec. I have not used POWDER yet,
so I can't comment on that side of things. I do think it provides a
very good example to the atom working group, in that it provides the
following:

1. an RDF vocabulary and spec
2. a simple-to-use XML format
3. a GRDDL transform of the XML into RDF

This is very similar in structure to what the AtomOwl group is
proposing for atom, namely to complement the atom syntax with an
ontology and a transform. I would like to remind the atom working
group that this was part of the initial plan for atom.

On 2 Jul 2008, at 17:47, Phil Archer wrote:
> Just a quick note in this thread. POWDER (Protocol for Web
> Description Resources) is close to Last Call on some of its
> documents, notably [1, 2]. In section 4.1.3 of [1] we give an
> example of linking an Atom feed and an individual entry within it to
> a POWDER document, the processing of which can reveal triples about
> multiple resources, such as all the elements in the feed.

I think the interpretation of this is quite easy given the section in
the atom spec on simple extension elements:

<entry>
<wdrs:describedby>http://monarchy.example.org/powder2.xml</wdrs:describedby>
<title>Divine Right</title>
<link href="http://monarchy.example.org/divine_right.html"/>
<id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6b</id>
<updated>2007-06-06T13:43:54Z</updated>
<summary>Divine Right was claimed by several English monarchs,
notably Charles I...</summary>
</entry>

would be translated in atom owl as

[] a :Entry;
  wdrs:describedby "http://monarchy.example.org/powder2.xml";
  :alternate <http://monarchy.example.org/divine_right.html> .

If you wanted the wdrs:describedby to be automatically understood as
pointing to a resource, you should use the atom link element:

<link rel="http://wdrsnamespace/describedby" href="http://monarchy.example.org/powder2.xml"/>

Otherwise things are uncontroversial.

I would need to look at what you put into the powder2.xml to see if I
agree with how you think you can describe the atom entries. Do you
have an example of that? Or is that outside of the spec? (I have not
had time to get a close look at powder yet)

Henry


>
>
> The key thing for this discussion is that in a Rec Track document,
> we're already showing how one can add RDF directly to an Atom feed.
> I just hope we got it right so far!
>
> Phil.
>
> --
> Phil Archer
> Chief Technical Officer,
> Family Online Safety Institute
> w. http://www.fosi.org/people/philarcher/
>
> [1] http://www.w3.org/TR/powder-dr/
> [2] http://www.w3.org/TR/powder-grouping/

Story Henry

Jul 3, 2008, 4:13:24 PM
to Booth, David (HP Software - Boston), semantic-web@w3.org Web, atom...@googlegroups.com

On 3 Jul 2008, at 21:12, Booth, David (HP Software - Boston) wrote:

>
>> From: Story Henry <henry...@bblfish.net>
>> [ . . . ]
>> Clearly that atom in the content has to be interpreted as a literal,
>> otherwise a feed with a number of entries saying contradictory things
>> could produce on GRDDL extraction a nonsensical graph.
>
> Would named graphs help here, i.e., having one named graph per entry?
> http://www.w3.org/2004/03/trix/


Well one named graph per content would be more precise. Since there is
one content per entry, that also comes down to one per entry of
course. But that is exactly what the N3 [1] example I gave and which I
have reproduced below says.

-----------------
@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

[] a :Entry;
:title "syndeocms Project";
:alternate <http://doapspace.org/doap/sf/syndeocms>;
:id "http://doapspace.org/doap/sf/syndeocms"^^xsd:anyURI;
:updated "2007-12-13T18:30:02Z"^^xsd:dateTime;
:summary "Some text";
:content {
@prefix doap: <http://usefulinc.com/ns/doap#> .

<http://projects.com/1> a doap:Project;
doap:name "Project 1" .
}

-----------------


The content of the curly brackets '{' '}' above refers to anonymous
graphs, which if you were to give them a name would be your named
graphs.

The :content relation is defined in AtomOwl as a relation to a
Content, which you can think of as a literal. N3 literals have a
log:semantics, so in the above I just wrote it out as a shorthand.
Really I should have created a new :contentSemantics relation, which
would be defined as

{ ?entry :content ?c . ?c log:semantics ?sem } => { ?entry :contentSemantics ?sem } .

and I should have written the entry like this

-----------
[] a :Entry;
:contentSemantics {

<http://projects.com/1> a doap:Project;
doap:name "Project 1" .
}

-----------

Henry

[1] see the section on Rules of the n3 tutorial http://www.w3.org/2000/10/swap/doc/Rules

Story Henry

Jul 6, 2008, 8:35:05 AM
to Taylor Cowan, Atom-Protocol Protocol, atom...@googlegroups.com

On 6 Jul 2008, at 03:43, Taylor Cowan wrote:

> Henry,
>
> Where you say "since atom allows two entries to have
> the same id" I'm confused. I think they are supposed to be globally
> unique, even outside the scope of the immediate feed.

You have to distinguish 3 things: the entry, the id relationship, and
the object of the id relationship, namely the URI.

<entry>
<id>http://bblfish.net/blog/page1.html#1</id>
...
</entry>

which in N3 can be translated as:

[] a :Entry;
:id "http://bblfish.net/blog/page1.html#1"^^xsd:anyURI .


There is an :id relation between the anonymous node [] of the entry
and the URI, which is unique.
The :id relationship is clearly not the same as the owl:sameAs
relationship, since a feed can have multiple entries with different
contents and updated time stamps which have the same id.

In section 4.1.1 it says:
[[
If multiple atom:entry elements with the same atom:id value appear in
an Atom Feed Document, they represent the same entry. Their
atom:updated timestamps SHOULD be different.
]]

so the :id relation is an owl:FunctionalProperty only: each entry has
exactly one id, but several entries can share it.

This is not that different from the relationship between your passport
number, and any temporal instances of you.
Each temporal instance of you has the same temp:id relationship to the
passport number. Yet each temporal instance of you has different
properties. So

cowanOnMonday temp:id "22002342";
location Paris .

cowanOnTuesday temp:id "22002342";
location NewYork .
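
In code terms, a feed reader might keep every entry as its own record
and treat the id only as a key for grouping versions. The Python
sketch below is one possible consumer policy under stated assumptions:
the function name, time stamps and titles are invented, and choosing
the most recently updated version is a design choice of this sketch
rather than something the atom spec dictates.

```python
from datetime import datetime

def latest_by_id(entries):
    """Map each id to the entry version with the most recent 'updated' time stamp."""
    latest = {}
    for entry in entries:
        stamp = datetime.strptime(entry["updated"], "%Y-%m-%dT%H:%M:%SZ")
        seen = latest.get(entry["id"])
        if seen is None or stamp > seen[0]:
            latest[entry["id"]] = (stamp, entry)
    return {entry_id: pair[1] for entry_id, pair in latest.items()}

# Two hypothetical versions of one entry: same id, different properties.
versions = [
    {"id": "http://bblfish.net/blog/page1.html#1",
     "updated": "2008-07-01T10:00:00Z", "title": "First draft"},
    {"id": "http://bblfish.net/blog/page1.html#1",
     "updated": "2008-07-02T09:30:00Z", "title": "Corrected version"},
]
newest = latest_by_id(versions)
print(newest["http://bblfish.net/blog/page1.html#1"]["title"])
```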

Hope this helps.

Clearly this question will come up again and again. So I should add
this explanation to the
http://bblfish.net/work/atom-owl/2006-06-06/AtomOwl.html#id

section of the atom-owl spec.


Henry

> From the spec...
>
> "Instances of atom:id elements can be compared to determine whether
> an entry or feed is the same as one seen before."
>
> Taylor

Story Henry

Aug 1, 2008, 4:52:33 AM
to Mark Birbeck, Niklas Lindström, Beckett Dave, Mark Nottingham, Booth, David (HP Software - Boston), semantic-web@w3.org Web, atom...@googlegroups.com
On 4 Jul 2008, at 13:36, Mark Birbeck wrote:
> Hi Niklas,
>
> I won't comment on all of your post, just a couple of small things:
>
>> I think that "Atom Triples" can be somewhat equivalent to Atom as
>> RDFa
>> is to XHTML.
>
> RDFa was designed for use with *any* mark-up, since the parsing
> algorithms are defined merely in terms of navigating a tree. So the
> parsing would work on any XML-based language, such as SVG or Atom, and
> of course with HTML.

I like the RDFa idea a lot. Btw, there would be a question as to the
namespace of the rdfa attributes if it
were to work with atom.

> [snip]
> I'd suggest a slightly different approach. First, I'd add the RDFa
> @about to 'entry' to give you the subject of the statements:
>
> <entry about="http://purl.org/NET/dust/foaf#me">
> <id>http://purl.org/NET/dust/foaf#me</id>
>

Though that makes sense in a way, my feeling is that an atom:Entry is
an information object,
and so is owl:disjointWith a foaf:Person . I would argue this by
looking at the main use case for atom: the Atom Protocol. Atom entries
get published, are GETable, etc. Human beings are not. There is
furthermore the problem of it being possible to have multiple entries
with the same id, which would suggest that at the very most an entry
would have to be a time slice of a person, not the full person.
Certainly you would not want the about url to be the same as the atom
id, that would make updates impossible.

If people want to use AtomPub to publish information about people,
they should really place a foaf file in the content. If you want to
specify information about the state of people at different times of
their life, you should develop a time slice ontology [1] for people
(perhaps as part of a Resume ontology), and add information about
people to their slices. Something like

:hs :during [ :from "1996-10-01"^^xsd:date; :to "2001-08-01"^^xsd:date;
:workedFor [ foaf:homepage <http://altavista.com> ] ] .


Even though the semantic web makes it easy to extend things, logic
will set limits to what can be said. A Person is not a document. So
perhaps it would be worth considering some slightly more realistic
examples...

You could then use atom to publish that, and people subscribing to
your feed would have a way to know about all the resources you update
without being forced to crawl your whole site. So atom is still very
useful. It just does not need to be forced to do everything.

Henry


[1] CYC has some such construct here
http://www.cyc.com/cycdoc/vocab/top-vocab.html#timeSlices


> Next, I'd add a single element that allows properties to be set.
> Yahoo!'s DataRSS adds the 'meta' element which seems a reasonable
> enough choice, and mirrors what we've done in XHTML 2.
>
> Whatever the element is, it would allow the RDFa attributes @property,
> @content and @datatype.
>
> There are two reasons that I think this is better than simply placing
> elements in 'entry'.
>
> The first is that space is then left for Atom to add whatever it wants
> to in the future, and it is clear where the metadata for Atom itself,
> versus the metadata being carried, are distinct.
>
> Second, QNames don't allow for all possible resources--a problem that
> RDF/XML has, too.
>
> So, the plain literals in your example would become:
>
> <title>Niklas Lindström</title>
> <meta property="foaf:givenname">Niklas</meta>
> <summary>About Niklas.</summary>
> <content src="http://neverspace.net/me.html" type="text/html"/>
>
>
> Now, @rel is used slightly differently in RDFa, but before we get to
> that, RDFa supports @typeof which is a shorthand for rdf:type, so we
> can remove the first 'link' in your example. @typeof needs to go with
> the subject though, so the 'entry' would now look like this:
>
> <entry about="http://purl.org/NET/dust/foaf#me" typeof="foaf:Person">
>
> Nested statements in RDFa are 'about' the object of the parent (if
> present), so the next part of your example is easily converted to the
> following (ignore @rel again, for the moment):
>
> <link rel="http://xmlns.com/foaf/0.1/homepage"
> href="http://neverspace.net/">
> <meta property="dc:title" xml:lang="en">Neverspace</meta>
> </link>
>
> In RDFa, @rel would hold a CURIE, which is usually a prefix/suffix
> combination:
>
> <link rel="foaf:homepage" href="http://neverspace.net/">
> <meta property="dc:title" xml:lang="en">Neverspace</meta>
> </link>
>
> But it can also come from a set of predefined values, such as:
>
> <link rel="homepage" href="http://neverspace.net/">
> <meta property="dc:title" xml:lang="en">Neverspace</meta>
> </link>
>
> These values can be defined by a host language, possibly even
> dynamically.
>
> The finished example, of using RDFa to carry RDF in Atom, might look
> like this:
>
> <entry about="http://purl.org/NET/dust/foaf#me" typeof="foaf:Person">
> <id>http://purl.org/NET/dust/foaf#me</id>
> <title>Niklas Lindström</title>
> <meta property="foaf:givenname">Niklas</meta>
> <summary>About Niklas.</summary>
> <content src="http://neverspace.net/me.html" type="text/html"/>
> <link rel="foaf:homepage" href="http://neverspace.net/">
> <meta property="dc:title" xml:lang="en">Neverspace</meta>
> </link>
> </entry>
>
> Regards,
>
> Mark
>
> --
> Mark Birbeck, webBackplane
>
> mark.b...@webBackplane.com
>
> http://webBackplane.com/mark-birbeck
>
> webBackplane is a trading name of Backplane Ltd. (company number
> 05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
> London, EC2A 4RR)
>
