On Thu, Jan 14, 2010 at 12:10 PM, Alma <brianan...@sbcglobal.net> wrote:
> First, we certainly have an aggregate object problem: images often
> come in series, with multiple versions, layers of analysis, multiple
> attached annotations/metadata, and related publications.
Sounds like an ideal scenario to use ORE in! :)
> My laboratory
> has been considering creating and implementing an ontology to keep
> track of these types of relationships semantically, instead of just
> having everything sitting in a shared directory or dumb links. It
> appears this is at least part of what OAI-ORE’s resource maps are
> intended to do, correct?
Yep.
> However, the relationships currently possible to
> describe seem quite limited, and definitely designed with library
> applications in mind. Is the standard designed to be extensible? Would
> we be able to create our own relations to describe relationships
> between aggregated resources?
Yes absolutely. You can use any relationships or ontologies that are
appropriate for your use case. When designing the ontology you might
want to keep in mind the basic ones -- for example dcterms:hasFormat
to link between different formats for the images, or
dcterms:hasVersion to link between different versions, but you can
always extend them if they're not suitable.
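
As a rough illustration of mixing the standard terms with your own, here
is a minimal sketch that writes a resource map as N-Triples using only
the Python standard library. The example.org URIs, the LAB namespace and
its segmentationOf relation are all made up for the example; only the ORE
and Dublin Core terms are real vocabulary. It also leaves out the creator
and modification-date metadata that a full resource map should carry.

```python
ORE = "http://www.openarchives.org/ore/terms/"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
LAB = "http://example.org/lab-ontology#"   # hypothetical lab-specific ontology
BASE = "http://example.org/series42/"      # hypothetical image series


def build_resource_map():
    """Return (subject, predicate, object) URI triples for one small aggregation."""
    rem = BASE + "rem"
    agg = BASE + "aggregation"
    raw = BASE + "slice001.tif"
    png = BASE + "slice001.png"
    seg = BASE + "slice001-segmented.tif"
    return [
        (rem, RDF_TYPE, ORE + "ResourceMap"),
        (rem, ORE + "describes", agg),
        (agg, RDF_TYPE, ORE + "Aggregation"),
        (agg, ORE + "aggregates", raw),
        (agg, ORE + "aggregates", png),
        (agg, ORE + "aggregates", seg),
        # Standard Dublin Core relation: the PNG is another format of the TIFF.
        (raw, DCTERMS + "hasFormat", png),
        # Custom relation from the hypothetical lab ontology.
        (seg, LAB + "segmentationOf", raw),
    ]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join("<%s> <%s> <%s> ." % t for t in triples) + "\n"


if __name__ == "__main__":
    print(to_ntriples(build_resource_map()), end="")
```

The custom relation sits next to dcterms:hasFormat without any special
treatment, which is really all "extensible" means here.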
> The second problem is object transport/harvesting. Am I correct in
> understanding that ORE could be used like OAI-PMH for harvesting, but
> to transport multiple kinds of files (say, bioimaging ones), instead
> of just XML metadata?
I'm not sure that it's entirely appropriate for medical images, but
ORE is just a description mechanism for collections of web resources.
So you would just give the URI for your image, and then a client would
retrieve it in the same way as any other web resource. For your use
case you would obviously need secure transmission (https, with good
authentication and authorization) but it seems quite possible.
I hope that helps!
Rob
Let me be more specific about my harvesting scenario. We're working in
basic science. These are non-clinical applications for medical and
scientific imaging technologies. We have multiple labs whose annotated
imaging work we would like to harvest automatically into a central
repository (which people could also get things back out of), metadata
and data together. We don't want to retrieve things on the fly; we
want them harvested at regular intervals. I imagine this as a problem
somewhat akin to getting faculty publications into the Institutional
Repository. Wasn't someone in Texas doing this somehow with OAI-ORE/
PMH into DSpace on this board? I tried to look at that PPP, but there
was not enough detail on how the actual data harvesting could be
conducted. Can anyone point me towards something along those lines?
Could you use something like Atom to do this? Again, forgive my total
ignorance here, I'm way outside my area of expertise.
>So you would just give the URI for your image..
I understand in ORE that you use these as the identifiers for
everything, but a URI doesn't have to resolve to a retrievable
address, does it? Also, I have a vague feeling there might be
potential problems with defining a URI for everything. Can you tell me
what some objections to that might be?
One more question: How does this data model cope with an aggregate
resource that grows over time? Is that considered a version, the same
thing with a new timestamp in the last-updated property, or an
entirely new resource every time something is added?
Yes, the project you refer to is this one:
http://www.mail-archive.com/dspace...@lists.sourceforge.net/msg00182.html
The issue to consider is that ORE is a description format, not a
wrapper format like (say) METS. ORE only discusses things by
reference, so to harvest the full images (in your case) you would
still need to dereference those pointers (via HTTP, I expect). All
you would get in the ORE description is the metadata.
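
To make the "dereference those pointers" step concrete, here is a
minimal sketch of a harvester, assuming the resource map has already
been parsed into (subject, predicate, object) URI triples. The function
names are mine, not part of any ORE library, and a real harvester would
add authentication, retries and error handling.

```python
import urllib.request

ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"


def aggregated_uris(triples, aggregation_uri):
    """List the URIs that the given aggregation says it aggregates."""
    return [o for s, p, o in triples
            if s == aggregation_uri and p == ORE_AGGREGATES]


def http_fetch(uri):
    """Dereference one URI over HTTP(S) and return the raw bytes."""
    with urllib.request.urlopen(uri) as response:
        return response.read()


def harvest(triples, aggregation_uri, fetch=http_fetch):
    """Fetch every aggregated resource; returns {uri: bytes}.

    The resource map only supplies the list of URIs; the data itself
    comes from one request per aggregated resource.
    """
    return {uri: fetch(uri)
            for uri in aggregated_uris(triples, aggregation_uri)}
```

Running something like harvest() from a scheduled job would give you
harvesting at regular intervals rather than on-the-fly retrieval.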
If it's important that all of the data as well as metadata be included
in the response, then ORE is not the way to go. I don't know all of
your requirements and current architecture, but it seems reasonable
that each image would have its own URI... in the simplest case it must
have an identifier or filename already, which can be mapped into URI
space.
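
For instance, a mapping from existing file paths into URI space could be
as simple as the sketch below. The base URI is a placeholder, and I'm
assuming your files live under a single shared root directory.

```python
from urllib.parse import quote


def filename_to_uri(relative_path, base="http://example.org/images/"):
    """Map a path relative to the shared root onto a URI under `base`.

    Backslashes are normalized to forward slashes, and characters that
    are not safe in a URI path (such as spaces) are percent-encoded.
    """
    normalized = relative_path.replace("\\", "/").lstrip("/")
    return base + quote(normalized)


if __name__ == "__main__":
    print(filename_to_uri("lab a/slice 001.tif"))
```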
>>So you would just give the URI for your image..
>
> I understand in ORE that you use these as the identifiers for
> everything, but a URI doesn't have to resolve to a retrievable
> address, does it? Also, I have a vague feeling there might be
> potential problems with defining a URI for everything. Can you tell me
> what some objections to that might be?
It doesn't have to resolve, but it's very strongly recommended.
The potential problems revolve around keeping the URIs alive. They are
free (or very cheap) to obtain, but more expensive to maintain, which
is why they get compared to kittens :)
> One more question: How does this data model cope with an aggregate
> resource that grows over time? Is that considered a version, the same
> thing with a new timestamp in the last-updated property, or an
> entirely new resource every time something is added?
ORE doesn't mandate any one approach to versioning. You can either
change the Aggregation in place or create a new version of it; it's
up to you.
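
Just to sketch the two options side by side, again treating a resource
map as a list of (subject, predicate, object) triples. The function
names are made up, and using dcterms:isVersionOf to link versions is
one reasonable choice rather than something ORE requires.

```python
DCTERMS = "http://purl.org/dc/terms/"
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"


def add_in_place(triples, rem_uri, agg_uri, new_resource, timestamp):
    """Option 1: same Aggregation URI, refreshed dcterms:modified on the map."""
    kept = [t for t in triples
            if not (t[0] == rem_uri and t[1] == DCTERMS + "modified")]
    return kept + [
        (agg_uri, ORE_AGGREGATES, new_resource),
        (rem_uri, DCTERMS + "modified", timestamp),  # timestamp is a literal
    ]


def add_as_new_version(triples, old_agg, new_agg, new_resource):
    """Option 2: a new Aggregation URI that points back at the old one."""
    carried_over = [(new_agg, p, o) for s, p, o in triples
                    if s == old_agg and p == ORE_AGGREGATES]
    return triples + carried_over + [
        (new_agg, ORE_AGGREGATES, new_resource),
        (new_agg, DCTERMS + "isVersionOf", old_agg),
    ]
```

The first keeps one stable URI that always means "the series as it is
now"; the second leaves every earlier state citable under its own URI.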
Rob