On Tue, Aug 24, 2010 at 7:56 PM, Aaron Rubinstein <rubin...@gmail.com> wrote:
> Thought I'd try and spark some discussion on this list...
>
> At UMass, we're working on simple representations of our collections
> in RDF. Our current plans are to add some basic RDFa to our HTML
> finding aids. The RDF would look something like this:
>
> @prefix dcterms: <http://purl.org/dc/terms/>.
> @prefix foaf: <http://xmlns.com/foaf/0.1/>.
> @prefix pvn: <http://purl.org/archival/provenance/0.1#>.
> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
>
> <http://example.com/mums589#collection>
> rdf:type pvn:Collection;
> dcterms:title "W. C. Wheeler Scrapbook";
> dcterms:extent "1 vol. (0.5 linear ft.)";
> dcterms:abstract "A resident of Haydenville, Mass., during the
> 1930s, C. H. Wheeler..." ;
> dcterms:creator "Wheeler, C. H.";
> pvn:heldBy <http://library.umass.edu/spcoll#archive> .
This seems nice, but the location and date should be accessible rather
than just in text strings.
Are there best practices around extent? (i.e., is there a way to encode
it so that '1 vol.' is separate from '.5 linear ft.'?)
>
> Another possible strategy would be to host the RDF descriptions
> separately from the finding aid and link to the static html finding
> aid using foaf:page or equivalent.
Content negotiation, maybe?
> The dcterms:creator could also
> link to a resource like this: http://gslis.simmons.edu/archival/5d9920257771ee469c5edc0da779ab17#person.
Yes, especially if that site could provide info for humans as well as
machines. But fine even so; links will help humans.
>
> As approaches to modeling the arrangement of collections become
> clearer, that metadata could be added to these descriptions as well.
> The advantage here is to: 1. Separate essential data about
> collections from display information. The co-mingling of structure
> and display metadata in EAD v. 2002 makes it almost impossible to
> model in RDF. 2. Provide a basic RDF representation of collections
> without trying to recreate EAD but allowing for modular addons as
> vocabularies develop.
>
> What do folks out there think of this approach?
>
> Aaron
-Jodi
On 8/25/2010 6:13 PM, Jodi Schneider wrote:
> LOCAH is using Linked Data for archives, too. They've written about it here:
> http://blogs.ukoln.ac.uk/locah/2010/08/18/some-thoughts-on-architecture-and-workflows/
I have seen this and am eager to see how they are modeling MODS and EAD
in RDF.
> On Tue, Aug 24, 2010 at 7:56 PM, Aaron Rubinstein<rubin...@gmail.com> wrote:
>> Thought I'd try and spark some discussion on this list...
>>
>> At UMass, we're working on simple representations of our collections
>> in RDF. Our current plans are to add some basic RDFa to our HTML
>> finding aids. The RDF would look something like this:
>>
>> @prefix dcterms:<http://purl.org/dc/terms/>.
>> @prefix foaf:<http://xmlns.com/foaf/0.1/>.
>> @prefix pvn:<http://purl.org/archival/provenance/0.1#>.
>> @prefix rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
>>
>> <http://example.com/mums589#collection>
>> rdf:type pvn:Collection;
>> dcterms:title "W. C. Wheeler Scrapbook";
>> dcterms:extent "1 vol. (0.5 linear ft.)";
>> dcterms:abstract "A resident of Haydenville, Mass., during the
>> 1930s, C. H. Wheeler..." ;
>> dcterms:creator "Wheeler, C. H.";
>> pvn:heldBy<http://library.umass.edu/spcoll#archive> .
>
> This seems nice, but the location and date should be accessible rather
> than just in text strings.
I somehow forgot to add dcterms:created to my example. Still, this
would only appear as a string literal. Could you explain a little more
what you mean by accessibility? If you mean using URIs to represent
dates, I could see the benefit but there is just as much benefit, in my
opinion, in having a literal date encoded in a standard format.
> Are there best practices around extent? (i.e., is there a way to encode
> it so that '1 vol.' is separate from '.5 linear ft.'?)
The extent element here is extracted from our EAD, which follows DACS's
rules for formatting. There is no reason why we couldn't use a
combination of integers and attributes to add more granularity to extent
in our EAD but there's been little practical reason to do so and no best
practices for how to do that. In RDF, I'm not sure of the best way to add
more granularity here. In fact, looking again at the dcterms ontology,
a range has appeared for dcterms:extent, which is the class
dcterms:SizeOrDuration. dcterms:SizeOrDuration does not appear as a
domain of any property in the ontology so I'm not entirely sure how
extent should be used.
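
One possible reading of the dcterms:SizeOrDuration range is that the extent could be a resource rather than a plain literal. A minimal sketch of that idea follows, assuming a blank node that carries the DACS-formatted string as a label and the number as rdf:value. The ex: namespace and the ex:unit property are invented here for illustration and do not come from any published vocabulary.

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:      <http://example.org/vocab#> .   # hypothetical namespace

<http://example.com/mums589#collection>
    dcterms:extent [
        a dcterms:SizeOrDuration ;
        rdfs:label "1 vol. (0.5 linear ft.)" ;   # human-readable string as extracted from the EAD
        rdf:value "0.5"^^xsd:decimal ;           # machine-readable quantity
        ex:unit "linear ft."                     # hypothetical property naming the unit
    ] .
```

This keeps the display string intact while exposing the number separately, at the cost of a more verbose description.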
>> Another possible strategy would be to host the RDF descriptions
>> separately from the finding aid and link to the static html finding
>> aid using foaf:page or equivalent.
> Content negotiation, maybe?
This would certainly be the ideal. For now, I think we're stuck adding
RDFa to the HTML finding aids, though we also have a collections catalog
in Wordpress that could have this RDFa and point to the full finding
aids. We don't have the flexibility to completely rethink how we
deliver collection information so we are trying to work with what we
have, though we are certainly working towards a more elegant system.
>> The dcterms:creator could also
>> link to a resource like this: http://gslis.simmons.edu/archival/5d9920257771ee469c5edc0da779ab17#person.
> Yes, especially if that site could provide info for humans as well as
> machines. But fine even so; links will help humans.
The link I gave is to a linked data service that content-negotiates to HTML or RDF.
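
As a rough sketch of the two ideas discussed in this message (linking the creator to a resource, and pointing from the collection to the HTML finding aid with foaf:page), the description might look something like the following. The finding aid URL http://example.com/mums589 is an assumption made for illustration; the creator URI is the one given in the original message.

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .

<http://example.com/mums589#collection>
    # creator as a linked resource instead of the string "Wheeler, C. H."
    dcterms:creator <http://gslis.simmons.edu/archival/5d9920257771ee469c5edc0da779ab17#person> ;
    # assumed URL of the static HTML finding aid
    foaf:page <http://example.com/mums589> .
```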
I'd like to follow up on this issue, and to make a more general point
about the future of EAD. As Aaron and others do this sort of work, it
would be useful for those of us involved in the EAD revision process
to know where they are finding difficulty. While it might be a stretch
to create a more "RDF-friendly" version of EAD, at least in an initial
revision cycle, there is at least some interest in developing a
stricter subset or profile for data-oriented work.
Regarding the LOCAH folks, I've been trying to get them to join this
list, but I'm not sure if they have. :)
Mark A. Matienzo
Digital Archivist, Manuscripts and Archives
Yale University Library
> Regarding the LOCAH folks, I've been trying to get them to join this
> list, but I'm not sure if they have. :)
I have and I'm following this... I'm just trying to get something
written up at the moment, and will share it here as soon as I get it
done.
Pete
http://blogs.ukoln.ac.uk/locah/2010/09/28/model-a-first-cut/
As I try to emphasise, it's very much "a first cut" at the problem,
and I'm sure we'll find lots of things need changing along the way,
but I hope it's a starting point.
Any thoughts are welcome!
Cheers
Pete
----
Pete Johnston
Just a belated comment on the dcterms:extent question as I'm thinking
about this myself in trying to model archival repositories for a
redevelopment of http://directory.archivists.org.au
According to my own limited understanding it seems that there are two
approaches in creating a more fine-grained representation of extent.
One would be to define a custom datatype (eg LinearMetre) and use that
to identify the nature of the literal value, something like:
dcterms:extent "23.4"^^<http://www.example.org/datatype/LinearMetre>
The second approach would be to create a subproperty of dcterms:extent
(eg linearMetre, or numberOfItems).
These alternatives are described in _Linked Data Modelling Patterns_ -
http://patterns.dataincubator.org/book/custom-datatype.html
From the recommendation there, I think that the subproperty approach
would be preferred, and it does seem sort of neater (to human eyes at
least). But I was wondering if anyone else had been thinking about
this or could point to any such examples in regard to dcterms:extent.
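
For comparison, the two approaches described above might be sketched side by side as follows. This is only an illustration: the collection URI, the ex: namespace, and the ex:linearMetres and ex:numberOfItems properties are all hypothetical names, not terms from an existing vocabulary.

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:      <http://www.example.org/vocab#> .   # hypothetical namespace

# Approach 1: a custom datatype tells consumers how to read the literal.
<http://www.example.org/collection/1>
    dcterms:extent "23.4"^^<http://www.example.org/datatype/LinearMetre> .

# Approach 2: subproperties of dcterms:extent carry the unit in the property name.
ex:linearMetres  rdfs:subPropertyOf dcterms:extent .
ex:numberOfItems rdfs:subPropertyOf dcterms:extent .

<http://www.example.org/collection/2>
    ex:linearMetres  "23.4"^^xsd:decimal ;
    ex:numberOfItems "120"^^xsd:integer .   # made-up count for illustration
```

With the second approach the values stay in standard XSD datatypes, which generic tools can already compare and sort.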
Cheers, Tim
--
Tim Sherratt (t...@discontents.com.au)
Words - http://www.discontents.com.au
Experiments - http://wraggelabs.com
@wragge on Twitter
Thanks for all the great feedback on the last data I sent. I was able
to incorporate many of your suggestions, though I'm still struggling
with others.
Here's some sample RDF for our collections:
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix arch: <http://purl.org/archival/vocab/arch#>.
@prefix pvn: <http://purl.org/archival/provenance/0.1#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.

<http://example.com/mums312#collection>
    rdf:type arch:Collection;
    dcterms:title "W. E. B. Du Bois Papers";
    dcterms:extent "380 boxes";
    dcterms:extent "165 linear feet";
    dcterms:abstract "Scholar, writer, editor of The Crisis and other journals...";
    dcterms:creator "Du Bois, W. E. B. (William Edward Burghardt), 1868-1963";
    arch:inclusiveStart "1803"^^xsd:date;
    arch:inclusiveEnd "1999"^^xsd:date;
    arch:bulkStart "1877"^^xsd:date;
    arch:bulkEnd "1963"^^xsd:date;
    dcterms:subject <http://id.loc.gov/authorities/label/African%20Americans--Civil%20rights>;
    dcterms:subject <http://id.loc.gov/authorities/label/African%20Americans--History>;
    pvn:heldBy <http://library.umass.edu/spcoll#archive> .
I've added inclusive and bulk dates by creating subproperties of
dcterms:created, and I've added subjects as well. I'm a little
stuck with the extent, however. Breaking out each measurement is
certainly a help but the content is still not easily parsed by a
machine. Tim's suggestion to create custom datatypes or subproperties
of dcterms:extent makes a lot of sense but would mean a change in our
encoding best practices, which currently have us putting the human
readable form (100 linear feet) in <extent>.
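
A note on the date properties mentioned above: xsd:date expects a full YYYY-MM-DD lexical form, so year-only values such as "1803" may be better typed as xsd:gYear. A minimal sketch, assuming the arch: date properties place no constraint on the datatype of their values:

```turtle
@prefix arch: <http://purl.org/archival/vocab/arch#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

<http://example.com/mums312#collection>
    # year-only values typed as xsd:gYear rather than xsd:date
    arch:inclusiveStart "1803"^^xsd:gYear ;
    arch:inclusiveEnd   "1999"^^xsd:gYear ;
    arch:bulkStart      "1877"^^xsd:gYear ;
    arch:bulkEnd        "1963"^^xsd:gYear .
```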
A couple other things...
I've revised the vocabulary that supports this data as well as the
UMass archival names service at http://gslis.simmons.edu/archival.
You can find the revised vocabulary here:
http://purl.org/archival/vocab/arch.
At this point, the RDF above will be embedded in the HTML version of
the finding aid as RDFa. The model, then, is that when you request
the URI for the collection http://example.com/coll1#collection, you
get the EAD/HTML representation as well as the RDF representation.
Of course, any and all feedback from the LOD/SW/Archives heads out
there would be very much appreciated.
Thanks!
Aaron