Atom serialization and RDF snippets

Herbert van de Sompel

Jun 25, 2008, 10:36:52 AM

to oai...@googlegroups.com

hi all

There is an interesting discussion regarding ORE that was initiated by Erik Wilde on xml.com:

http://www.oreillynet.com/xml/blog/2008/06/oaiore_compound_documents_draf.html

There are quite a few arguments made in Erik's blog and in the follow-up comments. However, there is one core aspect that we would very much like to hear your opinion about, to support making an informed decision for the version 1.0 release of the ORE Specification:

The ORE Specifications propose serializations of Resource Maps in RDF/XML, RDFa and Atom. We think that there are potential use cases for each of these serializations, and feel quite strongly that we should stick to them.

Since the ORE data model is based on RDF, the RDF/XML and RDFa serializations are a natural fit. And while we agree that there are a few points of tension in the proposed ORE Profile of Atom (author of Resource Map expressed in atom:generator ; automatic inheritance of feed-level atom:author if an atom entry has no author), we feel the Atom serialization generally fits surprisingly well with the essence of the ORE Model. For sure, the Atom serialization is able to readily express Aggregations that have a simple tree structure. The RDF serializations are readily able to express complex Aggregation graphs.

In order to achieve the same level of expressiveness for the Atom serialization as for the RDF serializations, the ORE Specifications propose embedding RDF snippets at the atom:feed and atom:entry levels. This allows going beyond tree-structured Aggregations (expressed using Atom entries), and potentially conveying complex graphs within an Atom serialization.

However, this combination of the "feed paradigm" with the "RDF paradigm" is identified as a point of concern in Erik Wilde's blog. Hence, we would very much like to hear from you on this subject. Specifically, we wonder whether ORE Atom serializations should have the same level of expressiveness as ORE RDF serializations have? Or should the Atom ORE Profile, for example, be restricted to the simple case of tree-structured Aggregations, which can be expressed using native Atom elements only?
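To make the tree-versus-graph distinction concrete, here is a minimal Python sketch. It is not part of the ORE Specifications. It assumes the structure of an Aggregation can be written as (parent, child) pairs, and it treats that structure as a tree when the root has no parent and every other node has exactly one parent and is reachable from the root. The name `is_tree` and the identifiers `A`, `AR-1`, `AR-2`, `AR-3` are illustrative only.

```python
from collections import defaultdict


def is_tree(edges, root):
    """Return True if the directed (parent, child) edges form a tree rooted at `root`."""
    children = defaultdict(list)
    parents = defaultdict(set)
    nodes = {root}
    for parent, child in set(edges):  # ignore duplicate edges
        children[parent].append(child)
        parents[child].add(parent)
        nodes.update((parent, child))

    # The root must have no parent; every other node must have exactly one.
    if parents[root]:
        return False
    if any(len(parents[node]) != 1 for node in nodes if node != root):
        return False

    # Every node must be reachable from the root.
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children[node])
    return seen == nodes


# Hypothetical structure: A contains AR-1 and AR-2, and AR-2 contains AR-3.
tree_edges = [("A", "AR-1"), ("A", "AR-2"), ("AR-2", "AR-3")]
# Adding a second parent for AR-3 turns the tree into a general graph.
graph_edges = tree_edges + [("AR-1", "AR-3")]

print(is_tree(tree_edges, "A"))
print(is_tree(graph_edges, "A"))
```

Under these assumptions the first structure could be carried by nested or listed entries alone, while the second needs some way to state the extra relationship, which is where the embedded RDF snippets come in.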

Examples:

(*) Atom feed expressing tree-structured Aggregation:
http://www.openarchives.org/ore/0.9/atom-examples/atom_dlib_mini.atom

(*) Atom feed expressing graph-structured Aggregation:
http://www.openarchives.org/ore/0.9/atom-examples/atom_dlib_maxi.atom

(*) RDF/XML expressing the same graph-structured Aggregation:
http://www.openarchives.org/ore/0.9/atom-examples/atom_dlib_maxi.rdf

Looking forward to your insights.

Cheers

Herbert Van de Sompel

Richard Jones

Jun 25, 2008, 10:57:40 AM

to oai...@googlegroups.com

Hi Herbert,

> However, this combination of the "feed paradigm" with the "RDF
> paradigm" is identified as a point of concern in Erik Wilde's blog.
> Hence, we would very much like to hear from you on this subject.
> Specifically, we wonder whether ORE Atom serializations should have
> the same level of expressiveness as ORE RDF serializations have? Or
> should the Atom ORE Profile, for example, be restricted to the simple
> case of tree-structured Aggregations, which can be expressed using
> native Atom elements only?

I have to confess to having suffered considerable pain trying to
implement a general solution for expressing complete resource maps in
ATOM. As you say, purely hierarchic ones fit reasonably well, provided
that we are happy with the use of embedded RDF. The crucial point for
non-hierarchic graphs that needs to be addressed is that of shared
resources in the graph between aggregated resources (entries) and
aggregations (feeds). Where are these encoded? Can they be duplicated
in the feed and entry elements? And which takes precedence if there is a
conflict* in a supplied resource map?

* - the idea of a conflict here is a bit abstract, and might be just
that two serialisations of a resource do not expose the same triples in
both the feed and the entry, or the particular semantics of the ontology
leveraged by the resource (i.e. if the sum of all RDF fragments in all
feed and entry elements describing the resource express contradictory
semantics).

It would concern me to have serialisations which were not isomorphic,
as you are a) guaranteed data loss during communication, b) no longer
able to assert that an Aggregation isDescribedBy an RDF/XML as well as
an ATOM resource map, since they will no longer contain the same
information, and may therefore be considered to be different Aggregations.

Cheers,

Richard

--
=======================================================================
Richard Jones | Hewlett-Packard Limited
Research Engineer, HP Labs | registered office:
Bristol, UK | Cain Road, Bracknell,
| Berks, RG12 1HN.
| Registered No: 690597 England
eml: richard...@hp.com -------------------------------------
blg: http://chronicles-of-richard.blogspot.com/
========================================================================

pkeane

Jun 25, 2008, 10:13:36 PM

to OAI-ORE

Hi Herbert-

I really appreciate you posing this question to the list! Obviously
(as my comments on Erik's blog post indicate), I am uncomfortable with
the idea of mixing in graph semantics with Atom's tree semantics. If
the OAI-ORE spec has a useful place for Atom, I think that'd be
great. But an Atom serialization that's isomorphic with the RDF
serialization is (I think) impossible without badly misusing Atom. I
could see having an Atom-based "simple view" of an aggregation that's
not round-trippable, but its usefulness may not justify the
overhead.

Trying to use Atom only for aggregations that have a simple tree
structure may be problematic for two reasons: One, I am not sure the
distinction between tree & graph-based aggregation would be at all
clear cut, and maintaining that distinction as sets of resources are
modelled as aggregations seems nearly impossible (i.e., what good is
an aggregation w/o graph structures :-)), and secondly, while XML is
suitable for tree structures of arbitrary depth, Atom is not. Atom is
good for lists. And probably lists of lists. But beyond that, Atom
begins to break down. A few examples of successful nested hierarchies
in Atom include the Media RSS extension
(http://search.yahoo.com/mrss/), the Atom Threading Extension, and
Google's FeedLink
(http://code.google.com/apis/gdata/elements.html#gdFeedLink). Those are
the logical starting points for an exploration of feed nesting w/ Atom.
If OAI-ORE does continue to pursue Atom as a serialization option, I
would hope (in the way of open source development) that effort would
result in some benefit to the Atom community in the form of one or
more standardized Atom extensions. I can imagine real utility in a
simple extension that provided some opportunity for expressing
hierarchies more effectively in Atom. See especially this thread:
http://www.mail-archive.com/atom-...@imc.org/msg01281.html .

Also... In thinking about all of this the last few days, I cannot
help but see the wisdom in Mark D's postings about the need for
defined content types. Without that, modelling sets of resource sets
into aggregations that will be of actual usefulness both for
"standard" use cases AND for unforeseen use cases ("engineer for
serendipity" as Roy F. says) will be nearly impossible.

thanks much-
peter keane


Robert Sanderson

Jun 26, 2008, 6:45:44 AM

to oai...@googlegroups.com

The factors to be considered, as I see them, are:

Pro Atom:
* It's very commonly used
* It actually does have the right semantics: a list of resources
* It works in browsers and has real utility -- subscribing to an Aggregation is a meaningful thing to do
* For simple aggregations it can capture all of the information
* A lot of use cases can probably get away with using just Atom -- even though we talk a lot about complex ones, it's because the simple ones aren't problematic!

Against Atom:
* Some of the elements do not map well (eg entry/updated) to anything in the ORE model without breaking Atom's semantics
* Some Atom elms (eg rel/related) map differently at different levels
* As soon as you need to say something extra, it becomes very unclear as to the best way to do it.

All of the other problems I see stem from this second one.

* RDF/XML blocks in Atom are seen as bad form, even though it's perfectly within the atom spec to do this.
* rdf:Description blocks are just ugly!
* The number of occurrences of some elements in Atom (eg Generator, Title only appearing once) means that some information may occur in atom and the same semantics in RDF
* The graph model is (of course) unordered, so pulling in atom to a graph and then serialising back again will produce equivalent information but not necessarily identical XML.
* It's very hard to know where to put the extra information. Eg, if you get a triple which isn't directly about an Aggregation, Resource Map or Aggregated Resource, is it at the feed level or in one or more entries?
* Constructing the atom serialisation from a graph is really hard work! (However now that Richard and I have done it, hopefully that will forestall others having to go through the same headaches)

The real question to be answered I think is:
Must all serializations be isomorphic?

If no, then Simple Atom is fine.
If yes, then either Atom needs to be dropped or we need to accept some use of extensions will be required. At which point we must ask, what is the best way to extend Atom to ensure isomorphism?

Currently we use striped rdf:Description blocks, but there are other serialisations for RDF/XML. The 'pretty' version (I posted a cut-down example of a full resource map in this style to the O'Reilly blog) could sit quite happily in Atom I think, even though it's RDF/XML -- it doesn't *look* like RDF/XML.

I note that Peter suggests:

If OAI-ORE does continue to pursue Atom as a serialization option, I
would hope (in the way of open source development) that effort would
result in some benefit to the Atom community in the form of one or
more standardized Atom extensions.

So long as this doesn't imply yet another set of arbitrary XML elements, then I agree. However, if the Atom is to be fully isomorphic, it has to capture everything in RDF, which means that the extension is just some sort of RDF serialised in XML, which we already have several ways to do.
If it's not to be isomorphic, then we don't need an extension at all.

My opinion, finally:

I think we must keep Atom. It's a big selling point.

I'm less convinced that we need to maintain isomorphism, but if we do, then instead of striped rdf:Descriptions, we should use the prettier, nested RDF/XML serialisation with some explicit restrictions as to how it should be included. This would be much easier to process with XML tools rather than RDF tools, as if you have an RDF library already, then you're more likely to just use a straight RDF serialisation rather than Atom.

Rob

Robert Sanderson

Jun 26, 2008, 6:47:08 AM

to oai...@googlegroups.com

That's what we have the Wiki for :)
Please feel free to start a page about content types etc. :)

Herbert van de Sompel

Jun 26, 2008, 12:46:04 PM

to oai...@googlegroups.com

hi all,

The question of isomorphism between e.g. RDF/XML and Atom serializations has come up several times now. I think I should mention that, as things stand in the ORE Data Model, there is no requirement for different serializations of the same Aggregation to be isomorphic. http://www.openarchives.org/ore/0.9/datamodel#Resource_Map states:

An Authoritative Resource Map is one that is accessible via a dereference of the URI-A of the Aggregation that it describes. Details about the mechanisms of access are described in ORE User Guide - HTTP Implementation and Multiple Serializations. There MUST be at least one Authoritative Resource Map for each Aggregation. There MAY be many Authoritative Resource Maps for an Aggregation, each with a unique URI-R. As noted above each URI-R MUST dereference to one serialization. The serializations available from this set of URI-Rs SHOULD be in distinct formats (e.g., ATOM, RDF/XML). However, the RDF graph derivable from those serializations MUST express the same Aggregation Graph and define the same Proxies.

Which means that the essence of the Aggregation expressed in various ReMs (Aggregation URI-A, Aggregated Resources AR-i, and the ore:aggregates relationships between URI-A and AR-i) must be the same, but not everything needs to be the same. So, the current data model does allow for non isomorphic serializations. One can debate whether that is a good or a bad thing, but it is the present situation.
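As a rough illustration of this requirement, the Python sketch below compares two hypothetical sets of triples, one as it might be derived from an RDF/XML Resource Map and one from an Atom Resource Map. It keeps only the ore:aggregates triples whose subject is URI-A and checks that those match, while the full sets of triples differ. The example URIs, the helper name `aggregation_core`, and the use of http://www.openarchives.org/ore/terms/aggregates as the predicate URI are assumptions made for the sketch, and Proxies are left out for brevity.

```python
# Assumed predicate URIs, used here only for illustration.
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"
DC_CREATOR = "http://purl.org/dc/terms/creator"

# Hypothetical Aggregation URI (URI-A) and Aggregated Resources (AR-i).
URI_A = "http://example.org/aggregation"
AR_HTML = "http://example.org/article.html"
AR_PDF = "http://example.org/article.pdf"


def aggregation_core(triples, uri_a):
    """Keep only the (URI-A, ore:aggregates, AR-i) triples of a Resource Map."""
    return {(s, p, o) for (s, p, o) in triples if s == uri_a and p == ORE_AGGREGATES}


# Triples as they might be derived from an RDF/XML serialization:
# the core plus one extra statement about an Aggregated Resource.
rdf_xml_triples = {
    (URI_A, ORE_AGGREGATES, AR_HTML),
    (URI_A, ORE_AGGREGATES, AR_PDF),
    (AR_PDF, DC_CREATOR, "http://example.org/people/author"),
}

# Triples as they might be derived from a simpler Atom serialization.
atom_triples = {
    (URI_A, ORE_AGGREGATES, AR_PDF),
    (URI_A, ORE_AGGREGATES, AR_HTML),
}

same_core = aggregation_core(rdf_xml_triples, URI_A) == aggregation_core(atom_triples, URI_A)
identical = rdf_xml_triples == atom_triples

print(same_core)   # the ore:aggregates relationships agree
print(identical)   # the full sets of triples do not
```

In this reading, the two serializations describe the same Aggregation even though only one of them carries the additional statement.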

Personally, I do think it's a good thing to allow for serializations with a different level of expressiveness, as long as - as stated in the above - the essence is the same, mainly because:
- different serializations serve different purposes: by having RDF and Atom serializations ORE can potentially play a role in both the Linked Data and Web 2.0 environments
- not all purposes require the same amount of information: as we can tell from the widespread use of feed technologies on the web, simple lists go a long way. Those simple lists are what the ORE Atom serialization is about

I think it is the aforementioned opening in the ORE data model that allows us to discuss whether the Atom serialization should be restricted to only things that can be expressed by using the native Atom elements, i.e. declare all non-Atom elements (so obviously the RDF description elements) out of scope for ORE processors.

cheers

Herbert

pkeane

Jun 26, 2008, 1:26:26 PM

to OAI-ORE

Hi Herbert et al. -

I am still a little bit unclear on what the ORE effort brings in terms
of a "value add" to an Atom serialization of web resources. I would
see great value in the ORE effort using Atom, defining standard
content types ("use cases", as it were), best practices, a standard
set of "categories" for resource-maps-as-atom, and perhaps, if
necessary, by proposing an Internet Draft that would register any new
atom:link "rel" values with IANA. That's what I had hoped for when I
heard that ORE would be using Atom, but obviously, that's not been the
approach.

As a simple exercise, I have written a python script that "aggregates"
the same article that you used in your examples yesterday. You simply
give it a URL, a title (I should really just grab the '<title/>' from
the retrieved html), and a type (in this case journal article). The
script uses wget (recursive level set to '1'), then creates an Atom
Feed from the resulting directory of files.
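For readers who want a feel for the second step, here is a minimal sketch of turning a directory of already-downloaded files into an Atom feed using only the Python standard library. It is not the script linked below; the function name `build_feed`, its parameters, and the choice of one entry per file are assumptions made for illustration, and the wget step is left out.

```python
import mimetypes
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import quote

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_feed(directory, base_url, feed_id, title, author_name):
    """Return an Atom feed (as a string) with one entry per file under `directory`."""
    # Serialize Atom elements in the default namespace (no prefix).
    ET.register_namespace("", ATOM_NS)
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    def add(parent, tag, text=None, **attrs):
        element = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}", attrs)
        element.text = text
        return element

    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    add(feed, "id", feed_id)
    add(feed, "title", title)
    add(feed, "updated", now)
    author = add(feed, "author")
    add(author, "name", author_name)

    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(directory).as_posix()
        url = base_url.rstrip("/") + "/" + quote(relative)
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        entry = add(feed, "entry")
        add(entry, "id", url)
        add(entry, "title", relative)
        add(entry, "updated", now)
        add(entry, "link", href=url, rel="alternate", type=mime)

    return ET.tostring(feed, encoding="unicode")
```

A call such as `build_feed("mirror", "http://example.org/mirror", "http://example.org/aggregation", "Aggregation of an article", "A. N. Aggregator")` (all values hypothetical) would return the feed as a string that can be written to a file.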

python script:
https://webspace.utexas.edu/keanepj/www/code/aggregator.py.html

resulting feed:
https://webspace.utexas.edu/keanepj/www/feeds/aggregation_one.atom

As an "implementor" what does an ORE atom feed aggregation give me
that my atom feed aggregation lacks?

--peter keane


Herbert van de Sompel

Jun 30, 2008, 7:05:22 PM

to oai...@googlegroups.com

Peter,

First, I need to state that I can not look at the problem in the way you formulate it, i.e. what is the value that ORE adds to Atom? ORE is not an effort about profiling Atom, it is an effort about devising a way to deal with aggregations of web resources. And in order to tackle that problem, ORE has not started by selecting a specific serialization. Instead, it has worked on a data model that builds, among others, on the web architecture. With the data model in place, serializations have been devised. Atom is one such serialization, and we have proposed a way to express the ORE data model using Atom serializations. We have attempted:
- to make the Atom documents human friendly
- to make the Atom documents understandable for regular Atom processors (both semantic and syntactic)
- to allow ORE-Atom processors to understand the "deeper" ORE meaning expressed in the Atom documents

We very much understand the need for e.g. interoperable categorizations of aggregations, aggregated resources, etc but hoped this would be a problem that would be tackled by other community efforts. Interestingly enough, there was an indication earlier today that the joint DSpace/Fedora effort might be looking into parts of that problem space (see http://wiki.dspace.org/index.php/DSpace_Fedora_Collaboration). ORE has limited itself to specifying a very limited ORE-specific vocabulary, and recommending the use of some general purpose other vocabulary terms (see http://www.openarchives.org/ore/0.9/vocabulary). We felt that beyond this, things became a matter of community profiling. We hope to see efforts emerging in this realm via the ORE Cookbook wiki (see http://foresite.cheshire3.org/wiki/).

I don't have time to go into a lot of detail about your example, but I would like to indicate that it illustrates why ORE has defined a model that contains an Aggregation (conceptual) and a Resource Map describing the Aggregation (concrete). Here is a snippet from your example, in which the Atom feed is supposed to represent the aggregation of all resources for a specific D-Lib magazine article:

<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://oreproxy.org/r?what=http://www.dlib.org/dlib/february06/smith/02smith.html</id>

  <title>Aggregation of "Observed Web Robot Behavior on Decaying Web Subsites"</title>
  <updated>2008-06-26T15:34:28Z</updated>
  <link href="https://webspace.utexas.edu/keanepj/www/feeds/aggregation_one.atom" rel="self" />

  <author>
    <name>Peter Keane, aggregator of resources</name>
    <uri>http://daseproject.org/pk</uri>
  </author>

I note the following:
- the content of <title> is the title of the D-Lib article
- the content of <author> is the author of the Atom document, i.e. you. It is not the author of the D-Lib article.

So, there is some discrepancy involved. I mention this merely to illustrate that, in the ORE context, there are 2 resources involved, each of which can have its own properties and relationships: the Resource Map (~Atom document) and the Aggregation described by the Resource Map. The Atom ORE Profile document (see http://www.openarchives.org/ore/0.9/atom) details how to express this 2-resource ORE model in Atom serializations.

Oh, and one more detail: Proxy URIs were introduced in ORE to allow referencing an aggregated resource the way it exists in the context of an aggregation. It's quite bizarre to use them to identify an aggregation as you do with:

http://oreproxy.org/r?what=http://www.dlib.org/dlib/february06/smith/02smith.html

For one, using this approach one can't follow one's nose from the URI of the Aggregation (the URI above) to the URI of the Resource Map (the URI in <link rel="self"> in your example). That ability is required in ORE.
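To show what such a URI carries, here is a small Python sketch that splits a URI of this form into its query parameters. It assumes that `what` names the aggregated resource and that an optional `where` parameter names the aggregation providing the context. The helper name `split_proxy_uri` and the `where` value in the second example are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def split_proxy_uri(proxy_uri):
    """Return the (what, where) query values of a proxy-style URI, or None if absent."""
    query = parse_qs(urlsplit(proxy_uri).query)
    what = query.get("what", [None])[0]
    where = query.get("where", [None])[0]
    return what, where


# The URI used as the feed id in the example: only `what` is present.
feed_id = "http://oreproxy.org/r?what=http://www.dlib.org/dlib/february06/smith/02smith.html"
print(split_proxy_uri(feed_id))

# The same URI with a hypothetical `where` value appended.
with_context = feed_id + "&where=http://example.org/aggregation"
print(split_proxy_uri(with_context))
```

Either way, the URI points at a resource in the context of an aggregation rather than at the aggregation itself, which is the mismatch noted above.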

cheers

herbert

pkeane

Jun 30, 2008, 10:34:15 PM

to OAI-ORE

Thanks very much Herbert! This is quite helpful. I'll make a few
comments below.

On Jun 30, 6:05 pm, "Herbert van de Sompel" <hvds...@gmail.com> wrote:
> Peter,
>
> First, I need to state that I can not look at the problem in the way you
> formulate it, i.e. what is the value that ORE adds to Atom? ORE is not an
> effort about profiling Atom, it is an effort about devising a way to deal
> with aggregations of web resources. And in order to tackle that problem, ORE
> has not started by selecting a specific serialization. Instead, it has

The web is awash now with documents in Atom format. Likewise, there
are numerous tools for managing content that already speak Atom
(notably, DSpace is not among them). I could see great utility in
starting with Atom and working out from there to express the sorts of
relationships that ORE aims to express.

> worked on a data model that builds, among others, on the web architecture.

I guess I would have to ask "which web?" As Erik Wilde pointed out in
a comment on his blog posting (
http://www.oreillynet.com/xml/blog/2008/06/oaiore_compound_documents_draf.html),
the "web" and the "semantic web" are two *very* different things. I
await the day that the semantic web brings its promised benefits.
Meanwhile, I find myself in the current iteration of the web (I'm told
we are up to 2.0 at this point ;-)), getting lots done with stuff that
has just a URI and a MIME Type.

> With the data model in place, serializations have been devised. Atom is one
> such serialization, and we have proposed a way to express the ORE data model
> using Atom serializations. We have attempted:
> - to make the Atom documents human friendly
> - to make the Atom documents understandable for regular Atom processors
> (both semantic and syntactic)

I'd have to ask for what value of "understandable?" That's really the
core point of my last message. I don't think that an ORE Atom
serialization would be any *more* understandable than my wget
+autogenerated Atom to any Atom processor I am aware of. I just do
not see the use. Surely web browsers are more ubiquitous than feed
readers. Why not just focus on XHTML+RDFa for the "human-friendly"
version of ORE? I find it misleading to suggest that an Atom
serialization of an ORE aggregation offers any "semantic" richness at
all, since the RDF-kind of richness was explicitly rejected by the
Atom WG.

> - to allow ORE-Atom processors to understand the "deeper" ORE meaning
> expressed in the Atom documents

Surely an ORE-Atom processor would require RDF-capable tools, which,
as Erik Wilde pointed out, is a whole other kettle of fish than XML.

>
> We very much understand the need for e.g. interoperable categorizations of
> aggregations, aggregated resources, etc but hoped this would be a problem
> that would be tackled by other community efforts. Interestingly enough,
> there was an indication earlier today that the joint DSpace/Fedora effort
> might be looking into parts of that problem space (see http://wiki.dspace.org/index.php/DSpace_Fedora_Collaboration). ORE has

I fear that OAI-ORE is at its essence a DSpace<->Fedora<->ePrints
solution. That's really unfortunate, since the de facto world of
"institutional repositories" spans far beyond those applications.
Undoubtedly, the big 3 IRs were developed without the same eye towards
interoperability that newer applications and standards (Atom, JCR,
Flickr, Drupal, Wordpress, Google Docs) adopt as a matter of course.
I'd hate to see all of this work being done simply to compensate for
an inadequate content/interop model in a few applications. Am I
correct in thinking that OAI-ORE would be useful for preserving and
making useful assertions about a set of documents a faculty member put
on Google Docs or a bunch of photos they put on Flickr? Obviously,
herein lies the need for concrete use cases and scenarios.

> limited itself to specifying a very limited ORE-specific vocabulary, and
> recommending the use of some general purpose other vocabulary terms (see http://www.openarchives.org/ore/0.9/vocabulary). We felt that beyond this,
> things became a matter of community profiling. We hope to see efforts
> emerging in this realm via the ORE Cookbook wiki (see http://foresite.cheshire3.org/wiki/).

>
> I don't have time to go into a lot of detail about your example, but I would
> like to indicate that it illustrates why ORE has defined a model that
> contains an Aggregation (conceptual) and a Resource Map describing the
> Aggregation (concrete). Here is a snippet from your example, in which the
> Atom feed is supposed to represent the aggregation of all resources for a
> specific D-Lib magazine article:
>
> <feed xmlns="http://www.w3.org/2005/Atom">

> <id>http://oreproxy.org/r?what=http://www.dlib.org/dlib/february06/smith/...</id>

> <title>Aggregation of "Observed Web Robot Behavior on Decaying Web
> Subsites"</title>
> <updated>2008-06-26T15:34:28Z</updated>
> <link href="https://webspace.utexas.edu/keanepj/www/feeds/aggregation_one.atom"
> rel="self" />
> <author>
> <name>Peter Keane, aggregator of resources</name>
> <uri>http://daseproject.org/pk</uri>
> </author>
>
> I note the following:
> - the content of <title> is the title of the D-Lib article

Most decidedly not. The atom:title is the title of the aggregation-as-
feed (as per RFC 4287). Sorry, I thought I made that obvious.

> - the content of <author> is the author of the Atom document, i.e. you. It

again, as per RFC 4287.

> is not the author of the D-Lib article.
>
> So, there is some discrepancy involved.

Sorry, but there is not. The sample is pure, valid RFC 4287 (Atom),
with no more and no less semantic richness than the spec allows.

> By which I merely want to illustrate
> that, in the ORE context, there are 2 resources involved, each of which can
> have their own properties and relationships: the Resource Map (~Atom
> document) and the Aggregation described by the Resource Map. The Atom ORE
> Profile document (see http://www.openarchives.org/ore/0.9/atom) details how
> to express this 2 resource ORE model in Atom serializations.
>
> Oh, and one more detail: Proxy URIs were introduced in ORE to allow
> referencing an aggregated resource the way it exists in the context of an
> aggregation. It's quite bizarre to use them to identify an aggregation as you
> do with:
>

> http://oreproxy.org/r?what=http://www.dlib.org/dlib/february06/smith/...
>

Yes indeed. I left the "&where=..." off of that atom:id. My
mistake.

I will note that stuffing all of this semantics into the id seems way
beyond being a proper use of Atom. The fact is, Atom provides a much
cleaner alternative with "atom:category," which is exactly the place
to express the meaning of this resource's "aggregatedness" in the
context of this specific aggregation. The problem (well, for Atom) is
that the ORE abstract model (5.3) states that a "URI-AR is not
specific to the Aggregation." Well, in Atom there is no such thing as
a relationship outside of that expressed in the Atom document. Thus,
category works just fine. It's the classic tension between an
"essentialist" world view, and an "existentialist" world view. That's
the semantic web vs. human web dichotomy all over again. Both valid,
of course, but awfully hard to reconcile.

> For one, using this approach one can't follow its nose from the URI of the
> Aggregation (the URI above here) to the URI of the Resource Map (the URI in
> <link rel="self"> in your example). That ability is required in ORE.

Ultimately, though, I find myself far more interested in doing some
useful & interesting things (moving an electronic journal from one
system to another, archiving a scholar's set of images they have
posted on Flickr, "porting" a set of Google Docs into an institutional
repository, creating a re-usable XML export of a FileMaker database,
etc.) than meeting the requirements of the OAI-ORE spec. Of course in
the best-world scenario, the latter enables the former. And I truly
hope that'll be the case.

I suspect my comments are not terribly helpful at this point (and
perhaps they strike most readers here as utterly irrelevant! :-)), so
I will certainly not press the point. It is likely the case that I am
wanting OAI-ORE to be something it is not. In my work as a librarian/
content manager/archivist, I have found very real benefit in looking
for analogs to my work outside the typical library/archive sphere.
It's been enlightening and also a bit frightening as I realize that
much (perhaps the vast majority) of intellectual "product" is born,
lives, and thrives outside of this more traditional sphere. I would
hate to see the library world not take the opportunity to join in.

best regards-
Peter

>
> cheers
>
> herbert


Jeff Young

Jul 8, 2008, 9:46:01 AM

to OAI-ORE

Sorry I'm so far behind in the ORE discussions.

OAI-ORE has a domain model that a UML diagram would help clarify. This
domain model is similar to Atom, but only approximately. Herbert
recalled that Pete Johnston started doing one for ORE, but he didn't
see the final result.

Herbert also pointed out that a Schematron schema is available for ORE
Atom feeds (http://www.openarchives.org/ore/atom-tron). Since there is
some question about the ability of Atom to represent ORE in its full
richness, though, it sounds like the Atom and ORE domains should be
analyzed separately and compared for overlap. Apparently some objects
in the ORE domain model are not covered by Atom and should probably be
represented by a separate schema(s).

Assuming ORE identifies a set of domain objects beyond the Atom
domain, someone might argue that raw RDF triples could cover them. I
would suggest "Striped RDF" instead. Striped RDF is basically an XML
document decorated with a pattern of RDF tags. I've asked my colleague
Andy Houghton to write up an analysis of it for my Q6 blog, but he's
on vacation this week.

Jeff
