If ore:Aggregation is a subClassOf dctype:Collection...

Mark Diggory

Mar 30, 2008, 12:25:03 PM

to oai...@googlegroups.com

ORE Group,

If ore:Aggregation is a subClassOf dctype:Collection, then I
speculate there is some type of relationship between DCMI Collection
Profile for which this dctype:Collection is primarily used and
ore:Aggregation?

> ore:Aggregation
> a s:Class;
> s:comment
> "A set of related resources (Aggregated Resources), grouped
> together such that the set can be treated as a single resource.
> This is the entity described within the ORE interoperability
> framework by a Resource Map.";
> s:isDefinedBy
> ore:;
> s:label
> "Aggregation";
> s:subClassOf
> dctype:Collection.

Yet when I look at it, I do not see a very clean alignment. The DCMI CAP
specifies that the following properties are mandatory when describing a
Collection.

> http://dublincore.org/groups/collections/collection-application-profile
>
> dc:type = dcmitypes:Collection (http://dublincore.org/groups/collections/collection-application-profile/#coldctype1)
> dc:title (http://dublincore.org/groups/collections/collection-application-profile/#coldctitle)
> dcterms:abstract (http://dublincore.org/groups/collections/collection-application-profile/#coldctermsabstract)
>
> and this one highly recommended:
>
> dc:identifier (http://dublincore.org/groups/collections/collection-application-profile/#coldcidentifier)
>
> And it has the optional dc:relations that allow you to describe the
> relationship between the collection and others
>
> dcterms:hasPart (http://dublincore.org/groups/collections/collection-application-profile/#coldctermshasPart)
> dcterms:isPartOf (http://dublincore.org/groups/collections/collection-application-profile/#coldctermsisPartOf)

ORE:Aggregation sort of has a structural element out of this
[type,hasPart (ore:aggregates)] but none else, and I bring it up
because most of what is in the CAP has to do with describing the
metadata as a "thing", which is what is going on in the ReM.

I wonder then about ORE's usage of this dctype? What is the intention
in ORE's usage of this type?

Is it really of any use to identify an Aggregation as a subclass of
dcmi:Collection and DCMI CAP as a dctype:Collection?

Is it wrong to extrapolate that a ore:Aggregation has a relationship
to a CAP?

Perhaps this is only a sign that having "Classes" that are empty of
further definition (in the dctypes namespace) creates a horrible
vagueness about the usage of such a Class?

I've been taking both ORE and the DCMI CAP quite literally and
attempting to encode examples in RDF, but I want to avoid the
replication. Graham Triggs recommended that I drop ORE in
this context (describing Sites, Communities and Collections of Items),
and I tend to agree: I would reserve ORE for just our Items
and the "abstract aggregations (subjects, authors, topics, types,
etc...) that may exist in their metadata". But if I do that, it
rather restricts our usage of ORE to the same context as METS (i.e.
describing Items).

Example of a DSpace Community:
http://dspace-test.mit.edu/metadata/handle/1721.1/29806/rdf.xml

> <?xml version="1.0" encoding="UTF-8"?>
> <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
>     xmlns:dcterms="http://purl.org/dc/terms/"
>     xmlns:ds="http://www.dspace.org/objectModel#"
>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>     xmlns:xsi="http://www.w3.org/2001/XMLSchema#"
>     xmlns:lw="http://simile.mit.edu/longwell/"
>     xmlns:ow="http://www.w3.org/2002/07/owl#"
>     xmlns:ore="http://www.openarchives.org/ore/terms/">
>   <rdf:Description rdf:about="http://dspace-test.mit.edu:80/metadata/handle/1721.1/29806/rdf.xml">
>     <rdf:type rdf:resource="http://www.dspace.org/objectModel#Community"/>
>     <rdf:type rdf:resource="http://purl.org/dc/dcmitype/Collection"/>
>     <dc:type rdf:resource="http://purl.org/dc/dcmitype/Collection"/>
>     <dc:identifier rdf:resource="hdl:1721.1/29806"/>
>     <dc:title>CSAIL Digital Archive</dc:title>
>     <dc:creator>DSpace at MIT</dc:creator>
>     <dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-03-30</dcterms:modified>
>     <dcterms:isPartOf rdf:resource="http://dspace-test.mit.edu:80/metadata/handle/1721.1/5458/rdf.xml"/>
>     <dcterms:hasPart rdf:resource="http://dspace-test.mit.edu:80/metadata/handle/1721.1/29807/rdf.xml"/>
>     <dcterms:hasPart rdf:resource="http://dspace-test.mit.edu:80/metadata/handle/1721.1/29808/rdf.xml"/>
>     <rdf:type rdf:resource="http://www.openarchives.org/ore/terms/ResourceMap"/>
>     <ore:describes rdf:resource="http://dspace-test.mit.edu:80/metadata/handle/1721.1/29806/rdf.xml#aggregation"/>
>   </rdf:Description>
>   <rdf:Description rdf:about="http://dspace-test.mit.edu:80/metadata/handle/1721.1/29806/rdf.xml#aggregation">
>     <rdf:type rdf:resource="http://www.openarchives.org/ore/terms/Aggregation"/>
>     <ore:analogousTo rdf:resource="http://hdl.handle.net/1721.1/29806"/>
>     <ore:aggregates rdf:resource="http://dspace-test.mit.edu:80/metadata/handle/1721.1/29807/rdf.xml"/>
>     <ore:aggregates rdf:resource="http://dspace-test.mit.edu:80/metadata/handle/1721.1/29808/rdf.xml"/>
>   </rdf:Description>
> </rdf:RDF>

As a graph in the validator:
http://www.w3.org/RDF/Validator/ARPServlet?URI=http%3A%2F%2Fdspace-test.mit.edu%2Fmetadata%2Fhandle%2F1721.1%2F29806%2Frdf.xml&PARSE=Parse+URI%3A+&TRIPLES_AND_GRAPH=PRINT_BOTH&FORMAT=PNG_EMBED&NTRIPLES=on&NODE_COLOR=Black&NODE_TEXT_COLOR=Blue&EDGE_COLOR=Darkgreen&EDGE_TEXT_COLOR=Red&FONT_SIZE=10&ORIENTATION=LR

Cheers,
Mark Diggory

~~~~~~~~~~~~~
Mark R. Diggory - DSpace Developer and Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology

Graham Triggs

Mar 31, 2008, 9:21:17 AM

to OAI-ORE

On Mar 30, 5:25 pm, Mark Diggory <mdigg...@MIT.EDU> wrote:
> I've been taking both the ore and DCMI CAP quite literally and
> attempting to encode examples in RDF. But I want to avoid the
> replication, Graham Triggs recommended to me to drop using ore in
> this context (describing Sites, Communities and Collections of Items)
> and I'm tending to agree and just reserve my ore to just our Items
> and the "abstract aggregations (subjects, authors, topics, types,
> etc...) that may exist in their metadata". But if I do that, it
> rather restricts our usage of ORE to then same context as METS (i.e.
> describing Items).

Mark,

My recommendation was that OAI aggregations should only be defined for
things that will be considered, referenced, or cited as a whole. For the
most part, a repository, community or collection in a 'regular' DSpace
installation doesn't fit that criterion: it has references to many
things that can be discovered within that context, but there generally
isn't a concept of two items being part of a higher conceptual whole.

G

pkeane

Mar 31, 2008, 9:39:01 AM

to OAI-ORE

If so, that's a bit disappointing. I had expected ORE to be a conceptual
model for *any* aggregation (and repository, community, or collection
specifically). Saying it does *not* cover such things seems to close
some pretty significant doors. Is there a better way to take a DSpace
repository (whole) and save/serialize/move/preserve it outside of
DSpace? Because (for me at least) that is a critical issue.

-peter keane

> G

Mark Diggory

Mar 31, 2008, 10:54:01 AM

to oai...@googlegroups.com

Peter and Graham,

I think there is a confusion about messages and services here... So
far (and others can correct me where I am wrong), ORE is just a few
properties that can be attached to an RDF Description and some
structural recommendations on how to organize those RDF descriptions...

The Semantic Web and efforts such as LOD (Linked Open Data) work to
expose whole data sets of content via RDF/SPARQL regardless of what
schemas/ontologies are used. So, Peter, I wouldn't say that ORE
shouldn't deliver a conceptual model for repositories, communities,
collections expressed in RDF. There is already enough in pre-existing
schemas/ontologies to support representing such structure. What I'd
like to see is more consideration of what is already out there,
because "defining your own model" for representing your system
immediately restricts the usability of your repository as a data
source in the Semantic Web, simply through obscurity and the isolating
effect of not using commonly used pre-existing models to represent
your data source.

That said, I'm still not sure I buy that ORE is just for describing
Items and not for describing Collections of Items...

And I'm a little disconcerted with any of the assumptions/conclusions
in this group because responses on this list led by any actual
technical team members from the ORE community have been extremely
sparse.

Sincerely,
Mark

Pete Johnston

Mar 31, 2008, 11:27:29 AM

to oai...@googlegroups.com

Hi Mark,

Disclosure: I'm a former chair of the DCMI Collection Description WG
which developed the DC Collections Application Profile and I'm also a
member of the ORE Technical Committee - but these are just some personal
comments/observations, not the views of either of those groups! :-)

Thanks for the query. As it happens, a variation of your question came
up in another conversation I was having a couple of weeks ago, and up
until that point it wasn't an issue I'd given much thought to.

> If ore:Aggregation is a subClassOf dctype:Collection, then I
> speculate there is some type of relationship between DCMI
> Collection Profile for which this dctype:Collection is
> primarily used and ore:Aggregation?

A couple of quick points about the DC CAP:

- the purpose of the DC Collections Application Profile is not to say
that all instances of the class dcmitype:Collection should/must be
described using the set of constraints specified by the DC CAP; rather
it provides one profile, one set of constraints, which may be applied
for describing a (hopefully quite a wide) range of instances of the
class dcmitype:Collection, in order to support certain
functions/operations, primarily related to the discovery of collections
or the selection of a collection from amongst a list of discovered
collections. But I could still create a description of a resource using
some set of properties I choose without reference to the DC CAP and say
that resource is an instance of dcmitype:Collection and I break nothing
by doing that.

- the primary focus of DC CAP is on describing the collection as an
entity, on the attributes of the collection and on some relationships
with other resources, principally (as noted above) to support
discovery/selection. It doesn't concern itself with enumerating the
members of a collection.

- the DC CAP is very permissive (perhaps too permissive!) about what
other resources as well as the collection might be described in the
graph (in DCMI terms, description set)

From the ORE side of things:

- the ORE Abstract Data Model focuses primarily on describing the
"structural" relationships between an Aggregation and its enumerated
component/member resources

- the ORE Abstract Data Model allows the provider of a Resource Map to
include in that Resource Map (more or less) any metadata about an ORE
Aggregation

So on that basis, although the structural constraints specified by the
ORE ADM don't _require_ a subgraph corresponding to the DC CAP
constraints, they do (I think - there may be some issues around the use
of blank nodes) _permit_ the inclusion of such a subgraph.

So I think the provider of an ORE ReM could indeed include a description
of the Aggregation reflecting the constraints of the DC CAP (i.e.
including triples with the dc:identifier, dc:type, dc:title,
dcterms:abstract predicates required by DC CAP). (And I suppose you
could have an "ORE ReM profile" which required that)
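
A minimal sketch of what such a profile check might look like, in Python. It is not part of ORE or the DC CAP: triples are modelled as plain tuples, the function name `missing_cap_properties` and the shortened resource labels are hypothetical, and the property lists are only the ones named earlier in this thread (dc:type, dc:title and dcterms:abstract as mandatory, dc:identifier as highly recommended).

```python
# Hypothetical profile check: which of the CAP properties named in this
# thread are absent from the description of a given subject?
MANDATORY = ("dc:type", "dc:title", "dcterms:abstract")
RECOMMENDED = ("dc:identifier",)


def missing_cap_properties(triples, subject):
    """Return the mandatory and recommended properties the subject lacks."""
    present = {p for s, p, _ in triples if s == subject}
    return {
        "mandatory": [p for p in MANDATORY if p not in present],
        "recommended": [p for p in RECOMMENDED if p not in present],
    }


# Shortened stand-ins for the full URIs used in the example Resource Map.
AGG = "rdf.xml#aggregation"
triples = [
    (AGG, "rdf:type", "ore:Aggregation"),
    (AGG, "dc:type", "dctype:Collection"),
    (AGG, "dc:title", "CSAIL Digital Archive"),
    (AGG, "ore:aggregates", "29807/rdf.xml"),
]

report = missing_cap_properties(triples, AGG)
print(report)
```

An Aggregation that fails this check is still a valid ORE Aggregation; the check only expresses the extra requirement that an optional profile would add.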

(On a slightly tangential but related note, I do worry a little bit that
in adopting a very general/generic notion of "aggregation" in ORE, we
may be glossing over some rather subtle but important distinctions
between different flavours of part/whole relationships, and distinctions
between "aggregation" and "composition", but I guess that is a broader
issue.)

I agree they both deal with descriptions of collections, though I think
they focus on different aspects of the collection, in order to support
different requirements.

> I wonder then about ORE's usage of this dctype? What is the
> intention in ORE's usage of this type?

I wasn't sure I understood the question, but an ORE ReM could include a
triple with a dc:type predicate and an object referring to a class from
one of the vocabs mentioned in DC CAP. The ORE ADM doesn't require it
but it does permit it. (And I think that is probably the right approach:
ORE has tried to be as permissive as possible, I think)

> Is it really of any use to identify an Aggregation as a
> subclass of dcmi:Collection and DCMI CAP as a dctype:Collection?

I'm not sure it does, TBH!

(Also, the aggregation v composition issue does worry me slightly. i.e.
Is it the case that everything ORE thinks of as an instance of
ore:Aggregation is also an instance of dcmitype:Collection? In making
the subclass assertion we say that is the case.)

> Is it wrong to extrapolate that a ore:Aggregation has a
> relationship to a CAP?

For the reasons outlined above, I think it would be wrong to extrapolate
that because ore:Aggregation is a subclass of dcmitype:Collection, then
all instances of ore:Aggregation should be described using a subgraph
corresponding to the constraints of the DC CAP.

But yes, I think there is a relationship. I agree that - as currently
defined - ORE concerns itself with the description of collections, and
the set of structural constraints on an RDF Graph specified by the ORE
Abstract Data Model (which I might call a "graph profile") plays a
similar role to the set of structural constraints on a DC description
set specified by what DCMI calls a "description set profile". And one
could probably reformulate the ORE structural constraints in terms of a
DCMI description set profile, I think.

> Perhaps this is only a sign that having "Classes" that are
> empty of further definition (in the dctypes namespace)
> creates a horrible vagueness about the usage of such a Class?

I have some sympathy with that comment, yes.

In this example, I didn't quite grasp why you were describing

http://dspace-test.mit.edu:80/metadata/handle/1721.1/29806/rdf.xml

as an instance of ore:ResourceMap and also of dcmitype:Collection (OK,
it's a collection of triples, but it seems a stretch! :-) ).

The instance of dcmitype:Collection here is the Aggregation

http://dspace-test.mit.edu:80/metadata/handle/1721.1/29806/rdf.xml#aggregation

which, as you say, can be inferred from the subclass relationship for
ore:Aggregation so doesn't have to be asserted explicitly here.
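
The inference described here can be sketched in a few lines of Python. This is only an illustration of the RDFS subclass rule (rdfs9) over tuple triples, not a real RDF library; the function name `infer_types` and the shortened `#aggregation` label are assumptions made for the example.

```python
# Minimal sketch of RDFS subclass inference (rule rdfs9):
# if X rdf:type C and C rdfs:subClassOf D, then X rdf:type D.
def infer_types(triples):
    """Return the input triples plus every rdf:type triple rdfs9 entails."""
    subclass = {(s, o) for s, p, o in triples if p == "rdfs:subClassOf"}
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple appears (handles chains)
        changed = False
        for s, p, o in list(inferred):
            if p != "rdf:type":
                continue
            for sub, sup in subclass:
                new = (s, "rdf:type", sup)
                if o == sub and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


graph = [
    ("ore:Aggregation", "rdfs:subClassOf", "dctype:Collection"),
    ("#aggregation", "rdf:type", "ore:Aggregation"),
]
closure = infer_types(graph)
print(("#aggregation", "rdf:type", "dctype:Collection") in closure)
```

Note that nothing is inferred about the Resource Map itself: only resources typed as ore:Aggregation pick up the dctype:Collection type.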

Pete
---
Pete Johnston
Technical Researcher, Eduserv Foundation
Web: http://www.eduserv.org.uk/foundation/people/petejohnston/
Weblog: http://efoundations.typepad.com/efoundations/
Email: pete.j...@eduserv.org.uk
Tel: +44 (0)1225 474323

Mark Diggory

Mar 31, 2008, 11:38:33 AM

to oai...@googlegroups.com

On Mar 31, 2008, at 7:54 AM, Mark Diggory wrote:
> And I'm a little disconcerted with any of the assumptions/conclusions
> in this group because responses on this list led by any actual
> technical team members from the ORE community have been extremely
> sparse.

Before there is any backlash, after review, I think this statement
was overly critical concerning the activity of the list. I'm not
interested in fostering any negativity here and am thankful to see an
avenue for discussion about ORE and its relationship to the community
and other standards. I know that often it takes a considerable
amount of time and work to build momentum in a community.

If it's any consolation, it was about 7:00 and I'd yet to have a cup
of coffee. ;-)

Mark Diggory

Mar 31, 2008, 7:26:41 PM

to oai...@googlegroups.com

Pete,

Thank you for a great reply! I will respond inline below.

On Mar 31, 2008, at 8:27 AM, Pete Johnston wrote:

>
> Hi Mark,
>
> Disclosure: I'm a former chair of the DCMI Collection Description WG
> which developed the DC Collections Application Profile and I'm also a
> member of the ORE Technical Committee - but these are just some
> personal
> comments/observations, not the views of either of those groups! :-)

I've been reading your work in the DCMI group for several years
now; it is a very reputable body of research. I will take these comments
appropriately as yours.

>
> A couple of quick points about the DC CAP:
>
> - the purpose of the DC Collections Application Profile is not to say
> that all instances of the class dcmitype:Collection should/must be
> described using the set of constraints specified by the DC CAP; rather
> it provides one profile, one set of constraints, which may be applied
> for describing a (hopefully quite a wide) range of instances of the
> class dcmitype:Collection, in order to support certain
> functions/operations, primarily related to the discovery of
> collections
> or the selection of a collection from amongst a list of discovered
> collections. But I could still create a description of a resource
> using
> some set of properties I choose without reference to the DC CAP and
> say
> that resource is an instance of dcmitype:Collection and I break
> nothing
> by doing that.

Point taken.

> - the primary focus of DC CAP is on describing the collection as an
> entity, on the attributes of the collection and on some relationships
> with other resources, principally (as noted above) to support
> discovery/selection. It doesn't concern itself with enumerating the
> members of a collection.

That is certainly understandable as they could be quite numerous and
listing may require a complex negotiation involving searching,
browsing and/or paging.

> - the DC CAP is very permissive (perhaps too permissive!) about what
> other resources as well as the collection might be described in the
> graph (in DCMI terms, description set)
>
> From the ORE side of things:
>
> - the ORE Abstract Data Model focuses primarly on describing the
> "structural" relationships between an Aggregation and its enumerated
> component/member resources

I think even so, in its current state there may still be the same
limitation in design caused by the number of resources in an
aggregation.

> - the ORE Abstract Data Model allows the provider of a Resource Map to
> include in that Resource Map (more or less) any metadata about an ORE
> Aggregation
>
> So on that basis, although the structural constraints specified by the
> ORE ADM don't _require_ a subgraph corresponding to the DC CAP
> constraints, they do (I think - there may be some issues around the
> use
> of blank nodes) _permit_ the inclusion of such a subgraph.

If it is the intention of DC CAP to describe the Collection, I'm
struggling with the ADM being of type dcmitype:Collection, because
IMO, the ReM is describing a Collection of resources (which may
include describing other related collections or finding aids), and
the ADM is actually enumerating the contents of the collection. In
that sense, I'd see it more appropriate to have the ResourceMap
extend dcmitype:Collection and not the Aggregation.

> So I think the provider of an ORE ReM could indeed include a
> description
> of the Aggregation reflecting the constraints of the DC CAP (i.e.
> including triples with the dc:identifier, dc:type, dc:title,
> dcterms:abstract predicates required by DC CAP). (And I suppose you
> could have an "ORE ReM profile" which required that)

I think so too.

> (On a slightly tangential but related note, I do worry a little bit
> that
> in adopting a very general/generic notion of "aggregation" in ORE, we
> may be glossing over some rather subtle but important distinctions
> between different flavours of part/whole relationships, and
> distinctions
> between "aggregation" and "composition", but I guess that is a broader
> issue.)

Yes we may even have an example of this in the DSpace realm where
Items can be either "owned by a Collection" or "Mapped to a
Collection" reflecting a quality of Composition vs Aggregation. We
also likewise are introducing the ability to version an Item in the
near future and this likewise creates a similar scenario between
DSpace Items and their Bitstreams.

>>> dcterms:hasPart (http://dublincore.org/groups/collections/
>>> collection-application-profile/#coldctermshasPart)
>>> dcterms:isPartOf (http://dublincore.org/groups/collections/
>>> collection-application-profile/#coldctermsisPartOf)
>>
>> ORE:Aggregation sort of has a structural element out of this
>> [type,hasPart (ore:aggregates)] but none else, and I bring it
>> up because most of what is in the CAP has to do with
>> describing the metadata as a "thing", which is what is going
>> on in the ReM.
>
> I agree they both deal with descriptions of collections, though I
> think
> they focus on different aspects of the collection, in order to support
> different requirements.

What I perceive is that one resource is describing the Collection and
its relationship to others (CAP and ResourceMap), while the other is an
enumeration of the contents of a collection (an aggregation). This
continues to worry me, because while the ResourceMap may simply be
descriptive, the AGM is structural in nature and will immediately be
subject to the mechanisms of discovery and the size of a Collection.
A simple list will work for smaller collections, but will certainly
not scale well to larger collections. Also, given that there may be
more than one way or service to enumerate the contents, being limited
to one "#aggregation" identifier to specify just one enumeration has
limitations. (I think Graham was pointing that out earlier).

>> I wonder then about ORE's usage of this dctype? What is the
>> intention in ORE's usage of this type?
>
> I wasn't sure I understood the question, but an ORE ReM could
> include a
> triple with a dc:type predicate and an object referring to a class
> from
> one of the vocabs mentioned in DC CAP. The ORE ADM doesn't require it
> but it does permit it. (And I think that is probably the right
> approach:
> ORE has tried to be as permissible as possible, I think)

I agree, but take a particular stance that neither CAP nor ORE is in
a position to be considered the "Container" of Description from the
standpoint of an expression in RDF. Meaning, in my example, there are
statements about a subject (a DSpace Collection or Item) and they
might be ORE, CAP or any other set of predicates that may be
appropriate to make about the subject being described.

>> Is it really of any use to identify an Aggregation as a
>> subclass of dcmi:Collection and DCMI CAP as a dctype:Collection?
>
> I'm not sure it does, TBH!
>
> (Also, the aggregation v composition issue does worry me slightly.
> i.e.
> Is it the case that everything ORE thinks of as an instance of
> ore:Aggregation is also an instance of dcmitype:Collection? In making
> the subclass assertion we say that is the case.)

I assume it is, but without a stronger definition for the type, it
seems to carry less meaning for me than the presence of the
actual enumerated contents. It would possibly become more meaningful
if the aggregated contents were not enumerated and required some
further resolution to acquire a list of them or a query interface for them.

>> Is it wrong to extrapolate that a ore:Aggregation has a
>> relationship to a CAP?
>
> For the reasons outlined above, I think it would be wrong to
> extrapolate
> that because ore:Aggregation is a subclass of dcmitype:Collection,
> then
> all instances of ore:Aggregation should be described using a subgraph
> corresponding to the constraints of the DC CAP.

I understand.

> But yes, I think there is a relationship. I agree that - as curently
> defined - ORE concerns itself with the description of collections, and
> the set of structural constraints on an RDF Graph specified by the ORE
> Abstract Data Model (which I might call a "graph profile") play a
> similar role to the set of structural constraints on a DC description
> set specified by what DCMI calls a "description set profile". And one
> could probably reformulate the ORE structural constraints in terms
> of a
> DCMI description set profile, I think.

I've come across a wiki page on the subject:
http://dublincore.org/architecturewiki/DescriptionSetProfile

It may take me a couple reads to fully grok, but one question:

Are description set profiles expressible in RDFS or OWL? Mostly
because I'm curious of their usage in existing validation engines etc...

>> Perhaps this is only a sign that having "Classes" that are
>> empty of further definition (in the dctypes namespace)
>> creates a horrible vagueness about the usage of such a Class?
>
> I have some sympathy with that comment, yes.

[smile]

>
>> I've been taking both the ore and DCMI CAP quite literally
>> and attempting to encode examples in RDF. But I want to avoid
>> the replication, Graham Triggs recommended to me to drop
>> using ore in this context (describing Sites, Communities and
>> Collections of Items) and I'm tending to agree and just
>> reserve my ore to just our Items and the "abstract
>> aggregations (subjects, authors, topics, types,
>> etc...) that may exist in their metadata". But if I do that,
>> it rather restricts our usage of ORE to then same context as
>> METS (i.e.
>> describing Items).
>>
>> Example of a DSpace Community:
>> http://dspace-test.mit.edu/metadata/handle/1721.1/29806/rdf.xml
>>
>

> In this example, I didn't quite grasp why you were describing
>
> http://dspace-test.mit.edu:80/metadata/handle/1721.1/29806/rdf.xml
>
> as an instance of ore:ResourceMap and also of dcmitype:Collection (OK,
> it's a collection of triples, but it seems a stretch! :-) ).

The ReM is a DCMI CAP; it uses hasPart to describe its sub-collections.
It may be confusing because I described the sub-collections
in the Aggregation as well. Now that I've removed that, it
should be clearer that unless I express an aggregation, all I really
have is a DCMI CAP.

> @prefix dc: <http://purl.org/dc/elements/1.1/>.
> @prefix dctype: <http://purl.org/dc/dcmitype/>.
> @prefix dcterms: <http://purl.org/dc/terms/>.
> @prefix ds: <http://www.dspace.org/objectModel#>.
>
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/39118/rdf.xml>
> dc:creator
> "DSpace at MIT";
> dc:identifier
> <hdl:1721.1/39118>;
> dc:title
> "Abdul Latif Jameel Poverty Action Lab (J-PAL)";
> dc:type
> dctype:Collection;
> dcterms:abstract
> """The Abdul Latif Jameel Poverty Action Lab (J-PAL)
> serves as a focal point for development and poverty research
> based on randomized trials. The objective is to improve the
> effectiveness of poverty
> programs by providing policy makers with clear scientific results
> that help shape successful policies to combat poverty. J-PAL
> works with NGOs, international organizations, and others to
> evaluate programs and disseminate the results of high quality
> research. We work on issues as diverse as boosting girls'
> attendance at school, improving the output of farmers in
> sub-Saharan Africa, racial bias in employment in the US, and the
> role of women political leaders in India.""";
> dcterms:hasPart
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/39119/rdf.xml>;
> dcterms:isPartOf
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/0/rdf.xml>;
> dcterms:modified
> "2008-03-31";
> ds:logoURL
> "http://dspace-test.mit.edu:80/retrieve/230228/pal_logo.jpg";
> a dctype:Collection, ds:Community.

I keep looking at the ADM and now think it would be more appropriate
for it to contain a union of all the "Items" found in all the
sub-collections specified in "hasPart". That exposes the scalability
issue even on a small repository like DSpace@MIT, because if we go all
the way back to the top, we are talking about 27,000 Items and
growing. But I can show you a smaller example from a DSpace
Collection here:

If I had been rendering ORE on the Community this collection was in,
it would have looked like...

> @prefix dc: <http://purl.org/dc/elements/1.1/>.
> @prefix dctype: <http://purl.org/dc/dcmitype/>.
> @prefix dcterms: <http://purl.org/dc/terms/>.
> @prefix ore: <http://www.openarchives.org/ore/terms/>.
> @prefix ds: <http://www.dspace.org/objectModel#>.
>
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/39118/rdf.xml>
> dc:creator
> "DSpace at MIT";
> dc:identifier
> <hdl:1721.1/39118>;
> dc:title
> "Abdul Latif Jameel Poverty Action Lab (J-PAL)";
> dc:type
> dctype:Collection;
> dcterms:abstract
> """The Abdul Latif Jameel Poverty Action Lab (J-PAL)
> serves as a focal point for development and poverty research
> based on randomized trials. The objective is to improve the
> effectiveness of poverty
> programs by providing policy makers with clear scientific results
> that help shape successful policies to combat poverty. J-PAL
> works with NGOs, international organizations, and others to
> evaluate programs and disseminate the results of high quality
> research. We work on issues as diverse as boosting girls'
> attendance at school, improving the output of farmers in
> sub-Saharan Africa, racial bias in employment in the US, and the
> role of women political leaders in India.""";
> dcterms:hasPart
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/39119/rdf.xml>;
> dcterms:isPartOf
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/0/rdf.xml>;
> dcterms:modified
> "2008-03-31";
> ds:logoURL
> "http://dspace-test.mit.edu:80/retrieve/230228/pal_logo.jpg";
> ore:describes
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/39118/rdf.xml#aggregation>;
> a dctype:Collection, ds:Community, ore:ResourceMap.
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/39118/rdf.xml#aggregation>
> ore:aggregates
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/39123/rdf.xml>,
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/39124/rdf.xml>,
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/39125/rdf.xml>,
> <http://dspace-test.mit.edu:80/metadata/handle/1721.1/39126/rdf.xml>.

> The instance of dcmitype:Collection here is the Aggregation
>
> http://dspace-test.mit.edu:80/metadata/handle/1721.1/29806/rdf.xml#aggregation
>
> which, as you say, can be inferred from the subclass relationship for
> ore:Aggregation so doesn't have to be asserted explicitly here.
>
> Pete

Looking over other aspects of the DCMI CAP, I wonder if what an ADM
is really about is a sort of "Finding Aid" or "Service" with which to
explore the Collection's contents. I speculate that in its simplest
form it may be a list of resources, and in its most complex form a
search interface or database access point. In either case, I think
there needs to be some flexibility in its identification and
resolution to allow a realistically scalable situation...

I will continue to adjust my examples and prototype. I've recently
removed the use of Aggregations to describe my Site and Communities
ATM because I feel that the DCMI CAP captures that relationship
better. I'm exploring using Aggregations just to enumerate the items
in any one of our communities, sub-communities or collections, but
until I can come up with a scalable solution to enumerate 27,000
items in my Top Level Aggregation, I'll probably not be putting that
one in place immediately ;-).

Site (DCMI CAP):
http://dspace-test.mit.edu/metadata/handle/1721.1/0/rdf.xml

Community (DCMI CAP):
http://dspace-test.mit.edu/metadata/handle/1721.1/39118/rdf.xml

Collection (DCMI CAP, ReM and AG):
http://dspace-test.mit.edu/metadata/handle/1721.1/39119/rdf.xml

Item (ReM and AG):
http://dspace-test.mit.edu/metadata/handle/1721.1/39123/rdf.xml

Thank you,
Mark

Mark Diggory

Mar 31, 2008, 8:48:47 PM

to oai...@googlegroups.com

I realized something after writing this that does solve some of my
problem....

Aggregation ore:aggregates <uri> can point at other Aggregations

http://www.openarchives.org/ore/0.2/datamodel#nestedAggregations

Thus enumerating individual items is not necessary in parent
Communities and this will probably be my avenue for keeping ORE
Aggregations in my Site and Parent Communities.
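
A minimal Turtle sketch of that nesting, assuming the ORE terms namespace http://www.openarchives.org/ore/terms/ and made-up example.org URIs (these are not the dspace-test handles). The parent Community's Aggregation aggregates only the child Collection's Aggregation, and only the Collection enumerates Items:

```turtle
@prefix ore: <http://www.openarchives.org/ore/terms/> .

# Hypothetical URIs, for illustration only.
# The Community's Aggregation points at the child Collection's
# Aggregation rather than listing every Item itself.
<http://example.org/community/1/rdf.xml#aggregation>
    a ore:Aggregation ;
    ore:aggregates
        <http://example.org/collection/1/rdf.xml#aggregation> .

# Only the Collection's Aggregation enumerates the Items.
<http://example.org/collection/1/rdf.xml#aggregation>
    a ore:Aggregation ;
    ore:aggregates
        <http://example.org/item/1/rdf.xml#aggregation> ,
        <http://example.org/item/2/rdf.xml#aggregation> .
```

In this sketch the Items are reached from the Community by following the nested Aggregations, so the parent's Resource Map stays small however many Items sit below it.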

-Mark

pkeane

Apr 1, 2008, 3:59:47 AM

to OAI-ORE

On Mar 31, 9:54 am, Mark Diggory <mdigg...@MIT.EDU> wrote:
>
> Peter and Graham,
>
> I think there is a confusion about messages and services here... So
> far (and others can correct me where I am wrong). ORE is just a few
> properties that can be attached to an RDF Description and some
> structural recommendations on how to organize those RDF descriptions...
>
> The Semantic Web and efforts such as LOD (Linked Open Data) work to
> expose whole Data-sets of content via RDF/SPARQL regardless of what
> schema/ontologies are used. So, Peter, I wouldn't say that ORE
> shouldn't deliver a conceptual model for repositories, communities,
> collections expressed in RDF, There is already enough in pre-existing
> schema/ontologies to support representing such structure. What I'd
> like to see is more consideration of what is already out there,
> because "defining your own model" for representing your system
> immediately restricts the usability of your repository as a data-
> source in the Semantic Web simply by obscurity and the isolating
> properties of not using commonly used preexisting models to represent
> your data-source with.
>
> That said, I'm still not sure I buy that ORE is just for describing
> Items and not for describing Collections of Items...
>

Mark-

Thanks -- that's helpful. I need to do some more investigation of
Linked Open Data, etc.

My initial take on OAI-ORE (from a few years back and then upon
hearing Carl Lagoze talk about it at Open Repositories 2007) was
that while OAI-PMH was about publishing/harvesting metadata, ORE
would be a means to publish/harvest the objects as well. It may have
been a misunderstanding on my part, or perhaps that idea is still
in there and I am missing it. There does seem to be a pretty high
cognitive/conceptual load in the current documentation & discussion.
I was very pleased to see an Atom-based serialization, since Atom is
both ubiquitous AND presents a low barrier to entry, but the highly
conceptual bits of ORE raise that bar substantially.

I have been involved in the development of a digital object
repository/management tool at UT Austin. It's heavily used, esp. by
faculty in the visual arts, but other areas as well (it holds not just
images, but audio, video, pdfs, etc.). It comprises "collections",
each with a custom metadata scheme. It uses Atom/AtomPub throughout.
It has consolidated many FileMaker, Access, Excel, etc. databases around
campus into a centralized, standard system. It plays well with Google
Spreadsheets (thank you Atom/AtomPub) and presumably other Google-type
services as well.

Perhaps I am barking up the wrong tree with OAI-ORE, but I really need
a simple means by which to "describe" any collection such that interop
with ArtStor, DSpace, GoogleBase, etc. is easy & painless. Also, if I
have such a format/protocol, I am pretty sure it'll be an excellent
preservation format as well. I have found METS simply too heavyweight
for my needs here, although it is an obvious option.

As it stands, I can do much of what needs doing with Atom. All the
domain objects (collections,items) can be modelled in Atom, and
the custom metadata is captured in an RDF-ish data structure in
atom:content. (Actually it is an xhtml definition list with some
classes to make it GRDDL-able, or such is the plan.)

One principle I do intend to stick with is that the system needs to
be extremely simple (a bunch of addressable digital objects, and a
bag of key-value pairs for each), such that new, different, & perhaps
unintended uses can emerge. It needs to scale both up AND down. I am
aiming it to be the "Wordpress" of digital repositories ;-).

I'd be curious to know how folks might see OAI-ORE as the "backend"
format for such a system. Obviously, as faculty come & go there WILL
be a need to describe the collections they put together and in that
regard, at least, OAI-ORE seems appropriate.

thanks-
Peter Keane

Robert Sanderson

Apr 1, 2008, 11:18:41 AM

to oai...@googlegroups.com

While I agree in principle with defining aggregations only for things that you would want to discuss as a single entity, there may very well be reasons to create aggregations like this in DSpace. Nothing /prevents/ the creation of Resource Maps describing aggregations of aggregations, and we explicitly allow for it.

For example an Aggregation for a journal would aggregate aggregations for issues, which would aggregate aggregations for articles, which might then aggregate aggregations of pages/paragraphs/figures/data/whatever.

Rob

On Mon, Mar 31, 2008 at 1:21 PM, Graham Triggs <graham...@gmail.com> wrote:

On Mar 30, 5:25 pm, Mark Diggory <mdigg...@MIT.EDU> wrote:
> I've been taking both the ore and DCMI CAP quite literally and
> attempting to encode examples in RDF. But I want to avoid the
> replication, Graham Triggs recommended to me to drop using ore in
> this context

Mark Diggory

Apr 1, 2008, 3:28:33 PM

to oai...@googlegroups.com

On Apr 1, 2008, at 8:18 AM, Robert Sanderson wrote:

>
> While I agree in principle with defining aggregations only for
> things that you would want to discuss as a single entity, there may
> very well be reasons to create aggregations like this in DSpace.
> Nothing /prevents/ the creation of Resource Maps describing
> aggregations of aggregations, and we explicitly allow for it.

I've continued on to take this approach, with the Aggregations
correctly pointing at the #aggregation in the child collection. So
in effect what it is saying is that any descendant Item in any
sub-collection is also a member of this parent. This is a little more
explicit a statement than I see coming from something like the DCMI
Collection Application Profile, which, I think, implies such a
relationship via dcterms:hasPart and dcterms:isPartOf. But I would not
necessarily want to say in DCMI CAP that my Items are
sub-collections of their parents. More pointedly, the DCMI CAP doesn't
give me a way to enumerate the Items, while ORE does.

So ultimately what the ORE Aggregation is providing is a way to
explicitly enumerate the child elements contained in each of the
collections defined by my DCMI CAP profiles.
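
A rough Turtle sketch of how the two vocabularies could sit side by side. Everything here is an assumption for illustration: the example.org URIs are made up, the title and abstract are placeholders, and putting the CAP properties on the Aggregation itself follows Pete's reading that the dcmitype:Collection instance is the Aggregation. The CAP properties describe the collection and where it sits; ore:aggregates enumerates the Items:

```turtle
@prefix ore:      <http://www.openarchives.org/ore/terms/> .
@prefix dc:       <http://purl.org/dc/elements/1.1/> .
@prefix dcterms:  <http://purl.org/dc/terms/> .
@prefix dcmitype: <http://purl.org/dc/dcmitype/> .

# Hypothetical collection, for illustration only.
<http://example.org/collection/1/rdf.xml#aggregation>
    a ore:Aggregation ;

    # DCMI CAP side: collection-level description and hierarchy.
    dc:type dcmitype:Collection ;
    dc:title "Example collection" ;
    dcterms:abstract "A short description of the collection." ;
    dcterms:isPartOf <http://example.org/community/1/rdf.xml#aggregation> ;

    # ORE side: explicit enumeration of the Items.
    ore:aggregates
        <http://example.org/item/1/rdf.xml#aggregation> ,
        <http://example.org/item/2/rdf.xml#aggregation> .
```

The Items appear only under ore:aggregates, never under dcterms:hasPart, so nothing here claims they are sub-collections.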

The payoff is that (and my prototype still needs a little more work
in this area) I may be able to first explore the DCMI CAP for my
Collections to get the landscape of my repository, and then explore
the Aggregations afterward to harvest whatever content I choose. What
do I have in the end? I have a "machine navigable" and "machine
actionable" Repository that can be explored using generic semantic
web browsing, aggregating and searching technologies.

> For example an Aggregation for a journal would aggregate
> aggregations for issues, which would aggregate aggregations for
> articles, which might then aggregate aggregations of pages/
> paragraphs/figures/data/whatever.

Yes, I do not think that ultimately these will be the "only" ORE
aggregations that my repository has. I would like to have
aggregations that are driven off the Metadata in my items, such that
one can pull up an aggregation of Journal Articles, FRBR Works or
some other aggregation that isn't necessarily expressed in the
default hierarchy of the collections. But, all in all, even those
cases will be contingent on any interoperability (in the semantic web
sense) as well.

> On Mon, Mar 31, 2008 at 1:21 PM, Graham Triggs
> <graham...@gmail.com> wrote:
>
> My recommendation was that OAI aggregations should only be defined for
> things that will be considered, referenced, cited as a whole. For the
> most part, a repository, community or collection in 'regular' DSpace
> installations don't fit that criteria -

I agree with Rob, and IMHO don't think the standard should be saying
what should be aggregated or whether a Repository, Community, or
Collection is citable as a whole. And I think that my position is
very well aligned with the Semantic Web and Linked Data Communities
when it comes to this subject. I'm personally hoping that ORE is an
enabling technology that fits well into the burgeoning Semantic Web;
in that regard, its evolution and application are best expressed in
real world examples driven by actual use cases and needs.

[slips and falls off soap box]

Cheers,
Mark
