Re: Some MME questions

Tim Sherratt

Oct 12, 2010, 11:06:46 PM

to mmesit...@googlegroups.com

All,

Thanks Ingrid, Basil and Robina. Just following up on a few points.

On spatial data, while text values for places are better than nothing,
they're not going to show up in Research Data Australia's map
interface, so their usefulness for discovery is going to be limited.
Obviously it would be good to share geospatial coordinates if
possible, and I'm wondering what people are already doing in regard to
geocoding locations in collection dbs.

I haven't done any of this in the museums area, but did do some
heavy-duty geocoding for Mapping our Anzacs. I'm wondering whether
there are tools and approaches that we could share that would benefit
both the MME project and beyond. As Basil pointed out, the GeoNames
API is available. You can also download the GeoNames db and use it,
for example, to populate an autocomplete field in your own app.
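
As a rough illustration of the autocomplete idea, here is a minimal
Python sketch. It assumes the tab-separated layout of the GeoNames dump
files, where the first six columns are geonameid, name, asciiname,
alternatenames, latitude and longitude. The three sample rows are made
up for the example (placeholder ids, rounded coordinates), and the
function names are my own.

```python
import bisect

# Illustrative rows only: the ids are placeholders and the coordinates are rounded.
# Assumed column order: geonameid, name, asciiname, alternatenames, latitude, longitude.
SAMPLE_DUMP = (
    "1\tCanberra\tCanberra\t\t-35.28\t149.13\n"
    "2\tCairns\tCairns\t\t-16.92\t145.77\n"
    "3\tCanowindra\tCanowindra\t\t-33.57\t148.67\n"
)


def load_places(dump_text):
    """Return a sorted list of (lowercased name, name, lat, lon) tuples."""
    places = []
    for line in dump_text.splitlines():
        cols = line.split("\t")
        if len(cols) < 6:
            continue  # skip malformed rows
        name = cols[1]
        places.append((name.lower(), name, float(cols[4]), float(cols[5])))
    places.sort()
    return places


def autocomplete(places, prefix, limit=10):
    """Return up to `limit` (name, lat, lon) tuples whose name starts with `prefix`."""
    key = prefix.lower()
    # Binary search for the first entry that could match the prefix.
    start = bisect.bisect_left(places, (key,))
    matches = []
    for lowered, name, lat, lon in places[start:]:
        if not lowered.startswith(key):
            break
        matches.append((name, lat, lon))
        if len(matches) == limit:
            break
    return matches
```

Calling autocomplete(load_places(SAMPLE_DUMP), "can") returns the
Canberra and Canowindra rows with their coordinates, which is the sort
of thing a form field could offer as the user types.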

ANDS is funding the development of web services on top of Geoscience
Australia's Gazetteer, so this will make the geolocation of Australian
places much easier. I'm hoping too that placenames in the Australian
Gazetteer will be linked to GeoNames ids, to connect up with the
Linked Open Data cloud. But even when we have these APIs we'll have to
think about the best ways to use them.

Of course, all that assumes you already have your placenames
identified. If you're pulling them from text descriptions there are
things like Yahoo Placemaker, and the ever-growing range of entity
extraction tools.

On people/organisations, Ingrid, are you saying that you're only
intending to include these as subjects? Something like <subject
type="nla-party">http://nla.gov.au/nla.party-615689</subject>? This
might be ok if the person actually is the subject of the collection,
but if they're the collector, then it's quite misleading. It would
seem much better to me to include people/organisations as related
objects in RIF-CS and describe the relationships appropriately.
Indeed, I'm wondering whether 'isSubjectOf' should be added to the
relation types in RIF-CS. (See pp. 39-41 of the ANDS Content Providers
Guide for list of relation types -
http://ands.org.au/guides/content-providers-guide.html)
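
To make the related-object idea concrete, here is a hedged Python
sketch that builds a simplified RIF-CS-style record. The element names
(registryObject, key, collection, relatedObject, relation) follow my
reading of RIF-CS, but the record omits the namespace and other
required elements and has not been validated against the schema. The
collection key is made up, and 'hasCollector' is assumed to be the
appropriate relation type for a collector.

```python
import xml.etree.ElementTree as ET


def collection_record(collection_key, party_key, relation_type):
    """Build a simplified, namespace-free RIF-CS-style registryObject."""
    registry_object = ET.Element("registryObject")
    ET.SubElement(registry_object, "key").text = collection_key
    collection = ET.SubElement(registry_object, "collection", {"type": "collection"})
    # The party is linked as a related object with an explicit relation type,
    # rather than being listed as a subject of the collection.
    related = ET.SubElement(collection, "relatedObject")
    ET.SubElement(related, "key").text = party_key
    ET.SubElement(related, "relation", {"type": relation_type})
    return registry_object


record = collection_record(
    "example.org/collection/1",            # made-up collection key
    "http://nla.gov.au/nla.party-615689",  # party identifier quoted above
    "hasCollector",                        # assumed relation type for a collector
)
xml_text = ET.tostring(record, encoding="unicode")
```

The point of the sketch is only that the relation type travels with the
link, so a collector and a subject can be told apart.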

I don't understand what you mean by not getting into 'duplicating
party records'. What we're talking about is using (or minting) party
ids in People Australia to identify related objects in RIF-CS.
Similarly, the NLA isn't assigning relationships to collections,
that's what's meant to be happening in RDA. That's our responsibility.

Of course this all raises the bigger question of how the museums
sector is using People Australia identifiers in their collection dbs.
I'd be very interested to know what people are doing. People Australia
provides us with world-leading infrastructure for linking people data
across collections, databases, sectors and projects.

There's two parts to this - finding and using ids for people/orgs
already in People Australia, and providing data on people/orgs
associated with our own collections for harvest into People Australia
(entailing disambiguation and the minting of new ids where necessary).
The first is easy, you can look them up in Trove and just add the
identifiers to your db. My Identity Browser
(http://wraggelabs.com/identities/) makes it even easier by providing
a bookmarklet that you can use in any web-based form to easily look up
a name. It also provides some RDFa markup that you could use to
identify a person in a text description.

To see how you can use People Australia identifiers to build rich
semantic annotations around collection material, you might like to
check out the Flickr Machine Tag Challenge
(http://wraggelabs.com/fmtc/). Over 1000 photos in Flickr have been
annotated with machine tags using PA identifiers.
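
For anyone who hasn't met machine tags, they take the form
namespace:predicate=value. The Python sketch below builds and parses
one. The choice of foaf:depicts as the predicate is an assumption for
illustration, not a statement of exactly which tags the challenge
uses, and the helper names are my own.

```python
def make_machine_tag(namespace, predicate, value):
    """Format a machine tag as namespace:predicate=value."""
    return f"{namespace}:{predicate}={value}"


def parse_machine_tag(tag):
    """Split a machine tag into (namespace, predicate, value)."""
    # Split on the first "=" so that URLs in the value stay intact.
    left, equals, value = tag.partition("=")
    namespace, colon, predicate = left.partition(":")
    if not equals or not colon or not namespace or not predicate:
        raise ValueError(f"not a machine tag: {tag!r}")
    return namespace, predicate, value


# Assumed predicate; the identifier is the People Australia id quoted earlier.
tag = make_machine_tag("foaf", "depicts", "http://nla.gov.au/nla.party-615689")
```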

In terms of contributing to People Australia, as Basil has noted there
have been some new tools developed for the ARDC-PIP project that could
be very useful.

I've always thought that part of the point of the MME project is to
facilitate and encourage metadata enhancement, so while I appreciate
the time pressures associated with the project it would seem useful to
talk a bit about things like people and places before we lock into a
model.

Cheers, Tim

--
Tim Sherratt (t...@discontents.com.au)
National Museum of Australia
Adjunct Associate-Professor, Digital Design + Media Arts Research Cluster,
Faculty of Arts and Design, University of Canberra

Words - http://www.discontents.com.au
Experiments - http://wraggelabs.com
@wragge on Twitter

Robina Sanderson

Oct 12, 2010, 11:44:12 PM

to mmesit...@googlegroups.com

Hi Tim, Ingrid and others

Just a quick comment on the Gazetteer project: the contracting process is not yet complete, but we hope to soon be in a position to announce the tools that will be delivered as part of the Gazetteer project and when they are expected to be available. I'm sorry I can't be more explicit at this point, as I think it is one of the more exciting things ANDS is engaged in, but it is great to see the beginning of conversations about what geospatial information is being collected in Museum systems and what would need to be done to see collections discoverable spatially as well as by text.

regards
Robina
--
Robina Sanderson
Business Analyst

Australian National Data Service (ANDS)
1st Floor,
East Wing Building 122
Hancock Library
ANU
Canberra City ACT 2601
Australia

T: +61 2 6125 7162
E: robina.s...@ands.org.au

Alexander Johannesen

Oct 12, 2010, 11:58:23 PM

to mmesit...@googlegroups.com

Hi Tim,

> On people/organisations, Ingrid, are you saying that you're only
> intending to include these as subjects? Something like <subject
> type="nla-party">http://nla.gov.au/nla.party-615689</subject>? This
> might be ok if the person actually is the subject of the collection,
> but if they're the collector, then it's quite misleading.

The short answer: we're using a core and an upper ontology that may or
may not map to popular vocabs like DC and FOAF and friends (like
RIF-CS relationships), including multi-layered thesauri. Suggestions
more than welcome.

> It would
> seem much better to me to include people/organisations as related
> objects in RIF-CS and describe the relationships appropriately.

The granularity and extensibility of the RIF-CS seems rather limited
to me, but that may be because I'm somewhat new to those vocabs
defined through it. (I don't see roles beyond collector, owner and
manager, like curator, consumer, physical vs. abstract management and
so on, unless that is what they mean by 'managed'?) I also don't like binary
directional associations in federated metadata, but perhaps that's
just a personal preference. Is there a better description of the vocabs
and their relationships outside of the RIF-CS spec and guide (which
mostly duplicates the spec)?

> Indeed, I'm wondering whether 'isSubjectOf' should be added to the
> relation types in RIF-CS. (See pp. 39-41 of the ANDS Content Providers
> Guide for list of relation types -
> http://ands.org.au/guides/content-providers-guide.html)

Wouldn't it be better if this vocab was defined outside the RIF-CS as
an extensible ontology instead, and let it include the various parts of
the problem space and not just relations between objects?

> I'd be very interested to know what people are doing. People Australia
> provides us with world-leading infrastructure for linking people data
> across collections, databases, sectors and projects.

I can't speak for what is already in use (I suspect very little if
any), but having an authoritative repository of identifiers is a
welcome change provided the mechanisms for resolvable
human-understandable data be flexible (we must avoid semantic drift at
all costs). Has anyone raised a cross-linking to Wikipedia or other
external sources for anchoring, for example?

> There's two parts to this - finding and using ids for people/orgs
> already in People Australia, and providing data on people/orgs
> associated with our own collections for harvest into People Australia
> (entailing disambiguation and the minting of new ids where necessary).
> The first is easy, you can look them up in Trove and just add the
> identifiers to your db.

How is the second part supposed to work? I'm interested in the
creation of good identifiers, duplicates, synonymous / antonymous
identifiers, weak semantics, and similar things.

> To see how you can use People Australia identifiers to build rich
> semantic annotations around collection material, you might like to
> check out the Flickr Machine Tag Challenge
> (http://wraggelabs.com/fmtc/). Over 1000 photos in Flickr have been
> annotated with machine tags using PA identifiers.

Are you suggesting here that we use entities in PA as a basis for some
shared ontology?

> I've always thought that part of the point of the MME project is to
> facilitate and encourage metadata enhancement, so while I appreciate
> the time pressures associated with the project it would seem useful to
> talk a bit about things like people and places before we lock into a
> model.

Absolutely. Reusing PA meta data and identifiers is a good thing that
we'll push as hard as we can, but I do fear there will be semantic
mismatches across these layers of the metadata exchange, unless one
party is to be policing these things through harvesting and analysis?

Kind regards,

Alexander
--
Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ ----------------------------------------------
------------------ http://www.google.com/profiles/alexander.johannesen ---

Robina Sanderson

Oct 13, 2010, 12:08:56 AM

to mmesit...@googlegroups.com

Hi Alexander and others

On the vocabs in RIF-CS: most allow for user defined terms as well as the authorised terms in the vocabularies. In addition, contributors to RDA can ask for additional terms to be added to the authorised terms. As yet, the MME hasn't asked for any additions, however, I would be grateful for any recommendations for additional terms from the MME team after discussion within the community.

Regards

Robina
--
Robina Sanderson
Business Analyst

Australian National Data Service (ANDS)
1st Floor,
East Wing Building 122
Hancock Library
ANU
Canberra City ACT 2601
Australia

T: +61 2 6125 7162
E: robina.s...@ands.org.au

Basil Dewhurst

Oct 13, 2010, 12:17:27 AM

to MMEsitecoord

Hi all,

Do check out the scenarios section of the document at http://bit.ly/9CpX7X.
Given that the MME is getting off the ground after a number of early
ANDS projects, Scenarios 1 and 2 are the key scenarios for you and will
support the ARDC in the long term. Scenarios 3 and 4 are transitional
and it's expected that existing early ANDS projects would use these
and move from them as soon as possible.

Now, at the risk of over-simplifying things, working with the Party
Infrastructure isn't hard. You simply provide records using an OAI
Repository in Registry Interchange Format for Collections and Services
(RIF-CS)* or Encoded Archival Context for Corporate bodies, Persons
and Families (EAC-CPF)** format. We harvest these and attempt to
automatically match them (we're working on improving our automatic
matching algorithms as we speak) with records already in the NLA Party
Infrastructure corpus. If records fail automatching then they're made
available for contributors to match by hand or create as a new record
using the Party Administration Tool - our user-friendly matching
interface. Once records exist in our corpus they have an NLA Party
Identifier and are available for harvesting by ANDS and the record
contributor.
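
As a hedged sketch of the harvesting side, the Python below builds an
OAI-PMH ListRecords request and pulls record identifiers out of a
response. The verb, the metadataPrefix parameter and the response
namespace come from the OAI-PMH protocol; the base URL, the "rif"
prefix value and the sample response are assumptions for illustration
and do not describe the NLA's actual harvester.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Made-up response for illustration; a real repository returns much more.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:party-1</identifier></header></record>
    <record><header><identifier>oai:example.org:party-2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix):
    """Build the URL for an OAI-PMH ListRecords request."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def record_identifiers(response_xml):
    """Return the header identifiers found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Hypothetical repository address and assumed metadata prefix.
url = list_records_url("http://example.org/oai", "rif")
identifiers = record_identifiers(SAMPLE_RESPONSE)
```

A contributor only has to expose a repository that answers requests
like this one; the matching happens after the harvest.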

The nice thing about feeding party records to the Party Infrastructure
is that contributed records pick up context and value along the way
which contributors and others can benefit from. So in the case of
Douglas Mawson (http://nla.gov.au/nla.party-503844) we started with a
sparse Libraries Australia record, then the ADBOnline contributed a
record containing links to other people and biographical information,
then the Encyclopaedia of Australian Science (EOAS) did the same and
the record was further enriched with linked resources. Now anyone
visiting Trove or using our web services interface can get richer
information about Mawson and link to related services (ADBOnline,
EOAS)/resources. The EOAS are pulling back records from us to enrich
their records so over time this kind of activity enriches data across
the network (not just at the NLA) and allows contributors to enrich
their offer to their online users.

I fully agree with Tim here - in enhancing the data created/managed in
our Collection Information Management Systems (CIMS) and exposed on
the network there are benefits for all !

Basil

* http://ands.org.au/resource/rif-cs.html
** http://eac.staatsbibliothek-berlin.de/
***

Basil Dewhurst

Oct 13, 2010, 12:23:24 AM

to MMEsitecoord

Alex (Long time no see !)

We treat RIF-CS as an input format which we map to the richer EAC-CPF
standard. The ARDC Party Infrastructure Project will be providing
records mapped from EAC-CPF to RDF. This mapping is not complete yet
but included in the thinking are dc, foaf, bio and skos ontologies.
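
Purely as an illustration of the kind of output this could produce, and
not the project's actual mapping (which isn't finished), here is a
minimal Python sketch that writes two N-Triples statements for a party
using FOAF. The function name and the choice of foaf:Person and
foaf:name are assumptions.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def party_to_ntriples(party_uri, name, is_person=True):
    """Return N-Triples lines giving a party a FOAF class and a name."""
    foaf_class = FOAF + ("Person" if is_person else "Organization")
    # Escape backslashes and double quotes for the N-Triples string literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return [
        f"<{party_uri}> <{RDF_TYPE}> <{foaf_class}> .",
        f'<{party_uri}> <{FOAF}name> "{escaped}" .',
    ]


# Identifier and name taken from the Mawson example earlier in the thread.
triples = party_to_ntriples("http://nla.gov.au/nla.party-503844", "Douglas Mawson")
```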

hth,
Basil

Alexander Johannesen

Oct 13, 2010, 12:37:08 AM

to mmesit...@googlegroups.com

Hi Basil,

> Alex (Long time no see !)

Indeed. Thought you'd seen the last of me, I'm sure. :)

> We treat RIF-CS as an input format which we map to the richer EAC-CPF
> standard. The ARDC Party Infrastructure Project will be providing
> records mapped from EAC-CPF to RDF. This mapping is not complete yet
> but included in the thinking are dc, foaf, bio and skos ontologies.

Ok, that's interesting and matches my thinking. Any docos or other
sharings on this work?

I guess the bigger question is if EAC-CPF is a worthy core ontology?
It looks rather de-normalized and untyped for relational data, but
that may be my poor understanding of its use. I'm happy to use it more
directly (specifically, by converting it to an ontological expression
in Topic Maps or similar) if you guys are happy with it.

Btw, who's on the technical backend on this project? :)

Regards,

Alex

Basil Dewhurst

Oct 13, 2010, 12:47:00 AM

to MMEsitecoord

Hi Alex,

I intend seeking on our use of RDF - it's something I started last
month and am trying to get back to (many small fires to put out).
EAC-CPF is a vast improvement on EAC 2004 and its core value lies in its
ability to express wherever possible relationships between parties,
the names they use, the resources they create and are the subject of
as well as their functions/activities.

Paul Shields (who started after you left) is our developer and Simon
Jacob will be assisting with the Trove enhancements.

Cheers,
Basil

Tim Sherratt

Oct 13, 2010, 1:14:21 AM

to mmesit...@googlegroups.com

Alex and all,

I think we need to clarify what we're talking about here. My questions
were based on the templates and examples on the MME website. They seem
to reflect a rather limited model and I couldn't see how they would
enable the data of contributors to be expressed fully as RIF-CS (as
required of course by the project). So I wasn't arguing for the use of
RIF-CS, but that we need something that will *at least* enable us to
define the sorts of relationships between objects that are present in
RIF-CS.

If, however, the MME is developing a much richer model, as you
indicate, then hooray! I'm looking forward to seeing some details and
understanding how the inputs and outputs are mapped.

I think I am probably also guilty of mixing up the question of what
metadata we provide to MME with the question of how we might extend
and enrich the metadata we store in our own systems. It seems to me
that the real, long-term value of the MME project is not in the
development of an aggregation service (we all understand the problems
of sustainability), but in these sorts of discussions about what we
want to know, model and share about our collections. So I'm interested
in extending the discussion beyond what we supply to MME to tools,
approaches, methods, recipes etc.

I'll let Basil handle all the People Australia questions... :-)

>> To see how you can use People Australia identifiers to build rich
>> semantic annotations around collection material, you might like to
>> check out the Flickr Machine Tag Challenge
>> (http://wraggelabs.com/fmtc/). Over 1000 photos in Flickr have been
>> annotated with machine tags using PA identifiers.
>
> Are you suggesting here that we use entities in PA as a basis for some
> shared ontology?

err umm am I? I thought I was just showing how using existing
ontologies like FOAF and DC together with existing technologies like
machine tags and existing identifiers like People Australia we could
start right now in creating semantic linkages between people and
collection items.

>> I've always thought that part of the point of the MME project is to
>> facilitate and encourage metadata enhancement, so while I appreciate
>> the time pressures associated with the project it would seem useful to
>> talk a bit about things like people and places before we lock into a
>> model.
>
> Absolutely. Reusing PA meta data and identifiers is a good thing that
> we'll push as hard as we can, but I do fear there will be semantic
> mismatches across these layers of the metadata exchange, unless one
> party is to be policing these things through harvesting and analysis?

I don't understand what you mean here. Could you give some examples? I
like examples...

Alexander Johannesen

Oct 13, 2010, 1:29:39 AM

to mmesit...@googlegroups.com

Hiya,

> My questions
> were based on the templates and examples on the MME website.

My bad. Any pointers? (I'm a new addition to this thing :)

> If, however, the MME is developing a much richer model, as you
> indicate, then hooray! I'm looking forward to seeing some details and
> understanding how the inputs and outputs are mapped.

I'll make it as rich as it needs to be, but I do want to re-use
whatever I can, taking special care for the thesauri / labels part of
the equation. I'm happy to base the core ontology on the RIF-CS, for
example, if people who are familiar with it claim it to be good and
valid for most use cases we're bound to bump into. (I have some
experience in the past with EAD which was less satisfactory, for
example)

>> Are you suggesting here that we use entities in PA as a basis for some
>> shared ontology?
>
> err umm am I?

I don't know, it was a genuine question. :) I noticed that a lot of
those machine codes were linking to entities in PA, which is why I
asked. It's the end entities in any relationship we need to worry
about, all the RDFa / MC stuff is just wrappers with pointers. Since a
lot of the type entities used are ultimately in PA, I thought maybe
someone had created a collection of them, wrapped them in a handy
ontology, and created a simple hierarchy of PI's we could use.

> I thought I was just showing how using existing
> ontologies like FOAF and DC together with existing technologies like
> machine tags and existing identifiers like People Australia we could
> start right now in creating semantic linkages between people and
> collection items.

Yes, DC is fine for most stuff, especially if you mean the extended
DC. FoaF I have allergies of, but they can be rectified with
hand-holding and promises that it will turn out alright.

>> Reusing PA meta data and identifiers is a good thing that
>> we'll push as hard as we can, but I do fear there will be semantic
>> mismatches across these layers of the metadata exchange, unless one
>> party is to be policing these things through harvesting and analysis?
>
> I don't understand what you mean here. Could you give some examples? I
> like examples...

The simplest example is if two parties create records for the same
thing X, yet the meta data provided by both are ambiguous enough to
not be matched by the internal PA software, creating two identifiers
for the same thing. Two parties now have two different identifiers for
something they both would like to make statements on. And if they are
merged, what are the mechanics for updating the individual parties' own
systems? Are there going to be mechanics to fix these problems on the
fly, interfaces for merging of semantics and so on, self-contained
repository of valid data to match against, etc? (And yes, I'll let
Basil handle the PA questions :).
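
To make the merge problem concrete, here is a small Python sketch of
one possible mechanism: a table of "merged into" redirects that a
contributor could apply to refresh the identifiers held locally. This
is a hypothetical illustration with made-up identifiers, not a
description of how PA handles merges.

```python
def resolve(identifier, merged_into):
    """Follow merge redirects until reaching an identifier that was not merged."""
    seen = set()
    while identifier in merged_into:
        if identifier in seen:
            raise ValueError(f"cycle in merge table at {identifier!r}")
        seen.add(identifier)
        identifier = merged_into[identifier]
    return identifier


def refresh_local_ids(local_records, merged_into):
    """Return a copy of the local object-to-party mapping with merges applied."""
    return {obj: resolve(pid, merged_into) for obj, pid in local_records.items()}


# Made-up identifiers: party-C was merged into party-B, then party-B into party-A.
merges = {"party-B": "party-A", "party-C": "party-B"}
local = {"photo-1": "party-B", "photo-2": "party-A"}
refreshed = refresh_local_ids(local, merges)
```

After the refresh both photos point at party-A, which is the behaviour
I'd want from whatever mechanism ends up in place.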

Kind regards,

Alex

Tim Sherratt

Oct 13, 2010, 6:24:16 PM

to mmesit...@googlegroups.com

Alex and all,

On Wed, Oct 13, 2010 at 4:29 PM, Alexander Johannesen
<alexander....@gmail.com> wrote:

>> My questions
>> were based on the templates and examples on the MME website.
>
> My bad. Any pointers? (I'm a new addition to this thing :)

These are the only guidelines at the moment:

http://www.powerhousemuseum.com/museumexchange/index.php/collection-description-guidelines/writing-collection-descriptions

As institutions are being asked to make commitments based on this
information, it seems important to try and work out in some detail
what it all means!

> I'll make it as rich as it needs to be, but I do want to re-use
> whatever I can, taking special care for the thesauri / labels part of
> the equation. I'm happy to base the core ontology on the RIF-CS, for
> example, if people who are familiar with it claim it to be good and
> valid for most use cases we're bound to bump into. (I have some
> experience in the past with EAD which was less satisfactory, for
> example)

Well EAD is enough to send anyone mad, but then it was never designed
as a data model (although they're working on changing that now). The
'I' in RIF-CS is for 'interchange' and as you've already noted it has
major limitations. I would just be thinking of it as one export
format.

> I don't know, it was a genuine question. :) I noticed that a lot of
> those machine codes were linking to entities in PA, which is why I
> asked. It's the end entities in any relationship we need to worry
> about, all the RDFa / MC stuff is just wrappers with pointers. Since a
> lot of the type entities used are ultimately in PA, I thought maybe
> someone had created a collection of them, wrapped them in a handy
> ontology, and created a simple hierarchy of PI's we could use.

The Flickr Machine Tag Challenge only uses PA identifiers. The machine
tags themselves are generated by my Identity Browser. There's more on
the 'About' page.

Cheers, Tim

Basil Dewhurst

Oct 13, 2010, 6:33:26 PM

to mmesit...@googlegroups.com

Tim is right, users of EAD have been known to utter the phrase "That way madness lies !". Let's be careful to distinguish between EAD and EAC-CPF !

EAD = Encoded Archival Description (http://www.loc.gov/ead/). "EAD stands for Encoded Archival Description, and is a non-proprietary de facto standard for the encoding of finding aids for use in a networked (online) environment. Finding aids are inventories, indexes, or guides that are created by archival and manuscript repositories to provide information about specific collections. While the finding aids may vary somewhat in style, their common purpose is to provide detailed description of the content and intellectual organization of collections of archival materials. EAD allows the standardization of collection information in finding aids within and across repositories." (http://www.archivists.org/saagroups/ead/aboutEAD.html)

EAC-CPF = Encoded Archival Context - Corporate bodies, Persons and Families (http://eac.staatsbibliothek-berlin.de/). " [EAC-CPF] ... primarily addresses the description of individuals, families and corporate bodies that create, preserve, use and are responsible for and/or associated with records in a variety of ways. ... currently its primary purpose is to standardize the encoding of descriptions about agents to enable the sharing, discovery and display of this information in an electronic environment. It supports the linking of information about one agent to other agents to show/discover the relationships amongst record-creating entities, and the linking to descriptions of records and other contextual entities." (http://eac.staatsbibliothek-berlin.de/)

Basil
