SNAP ontology and cookbook: issues


Vladimir Alexiev

Sep 18, 2014, 1:43:19 PM
to ancient...@googlegroups.com
  1. Only familial/intimate/servant/slave, not professional (student, workshop of), nor groups. Professional are covered by AgRelOn and are needed by Getty ULAN. Ain't they in scope?
  2. All relations modeled as classes (good, allows for time/place/qualification). But the modeling of relations is asymmetric, even for inherently symmetric relations. Eg 
       <person1> snap:has-bond <person1/sibling>.
       <person1/sibling> a snap:Sibling; snap:bond-with <person2>.
     So eg to find all siblings of <person2>, one needs to use complicated queries searching in both directions.
  3. No abstraction of gender: many links implicitly say stuff about the gender of the linked people
  4. imports LAWD, SAWS, OA, PROV, FOAF, maybe ECRM? Why so many, do you use all of them?
  5. associatedDate & associatedPlace listed as both data and object properties? Protege got confused
  6. Try validating snap.owl with the Manchester Validator http://mowl-power.cs.man.ac.uk:8080/validator/. Try to stick to OWL QL or OWL RL. Reducing the imports should help
  7. No definitions of terms (!?!?!?). Eg what's the difference between Bond and Link, resp has-bond and has-link?
  8. snap:QuAC (?!?!) equivalentClass dct:Agent, foaf:Agent, lawd:Agent. A really poor name for your key class, allusions to ducks notwithstanding ;-)
  9. Property names using "-" are unusual. Better use camelCase (eg hasBond, hasLink)
  10. hi-jacking others' ontologies: efrbroo:F10_Person equivalentClass lawd:Person

  1. Not sure what is the utility of #this. You could just go like this, with the same effect (whole file being fetched at once)
   cito:citesAsEvidence <http://www.trismegistos.org/text/56> ;
  2. Using ISO 8601 time interval (<start>/<end>) for associatedDate: this is not supported by RDF literal data types (=XSD types), nor by the ISO 8601 profile adopted by W3C (http://www.w3.org/TR/NOTE-datetime, as cited by http://en.wikipedia.org/wiki/ISO_8601#cite_note-25). Putting two dates in one literal makes querying very inefficient. It's better to use separate fields for <start> and <end>
  3. cnt:chars is not widely used outside the OA community, and there mostly for structured text (eg an SVG shape). Better use rdf:value or rdfs:label?
  4. Prefer DCT instead of DC, because DCT defines ObjectProperties as appropriate. Eg use dct: for dc:publisher, dc:replaces
  5. The use of dc:identifier to link to other URLs for the same person (VIAF, DBpedia) is unusual. Both dc:identifier and dct:identifier allow literals. Better use rdfs:isDefinedBy or rdfs:seeAlso (but perhaps not owl:sameAs, which implies "smushing" semantics)
  6. snap:reason (this and following item): Define this property in the ontology
  7. In the example, use a (resolvable) URL instead of a URN
  8. Include the inverse link dct:isReplacedBy on the snap:MergedResource
  9. Maybe add a type (eg dc:type, or rdf:type) for typical reasons for replacement?
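
For item 2 above, splitting an interval literal into separate start and end values could look like the following minimal Python sketch. The function name `split_interval` and the sample value are illustrative assumptions, not part of SNAP or the cookbook, and the parts are kept as plain strings rather than parsed as dates.

```python
def split_interval(value):
    """Split an ISO 8601 '<start>/<end>' interval into its two parts.

    The parts are returned as strings and are not validated as dates.
    """
    parts = value.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"not a <start>/<end> interval: {value!r}")
    start, end = parts
    return start, end


# Hypothetical value; start and end can then be stored as two separate literals.
start, end = split_interval("0117/0138")
```

With the two values held separately, a query can filter on either bound without string-matching inside a combined literal.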

- LGPN
  1. http://clas-lgpn2.classics.ox.ac.uk/cgi-bin/lgpn_search.cgi?id=V5a-35652;style=rdf uses wrong language codes el-grc and el-grc-x-lexnoacc. "el" is modern Greek, "grc" is Ancient Greek, so they shouldn't be mixed. For the latter maybe better is grc-Latn-x-lexnoacc
  2. emailed lgpn at classics.ox.ac.uk

Sebastian Rahtz

Sep 24, 2014, 12:49:21 PM
to ancient...@googlegroups.com
On 18 September 2014 18:43, Vladimir Alexiev <vlad...@sirma.bg> wrote:

> - LGPN
>   1. http://clas-lgpn2.classics.ox.ac.uk/cgi-bin/lgpn_search.cgi?id=V5a-35652;style=rdf uses wrong language codes el-grc and el-grc-x-lexnoacc. "el" is modern Greek, "grc" is Ancient Greek, so they shouldn't be mixed. For the latter maybe better is grc-Latn-x-lexnoacc
>   2. emailed lgpn at classics.ox.ac.uk


I'll 'fess up on this one. Thanks, Vladimir, for pointing out a long-standing
misunderstanding on my part. I'll get it fixed as soon as I can

(not ignoring all the other interesting general points in your message, but this
one is specific to a dataset)
 
--

Sebastian Rahtz      

Director (Research) of Academic IT

University of Oxford IT Services

13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431


I am nothing.

I shall never be anything.

I cannot wish to be anything.

Apart from that, I have in me all the dreams of the world.

Vladimir Alexiev

Sep 26, 2014, 5:30:53 PM
to ancient...@googlegroups.com

Sebastian Rahtz

Oct 8, 2014, 12:16:01 PM
to ancient...@googlegroups.com
On 18 September 2014 18:43, Vladimir Alexiev <vlad...@sirma.bg> wrote:
> - LGPN
>   1. http://clas-lgpn2.classics.ox.ac.uk/cgi-bin/lgpn_search.cgi?id=V5a-35652;style=rdf uses wrong language codes el-grc and el-grc-x-lexnoacc. "el" is modern Greek, "grc" is Ancient Greek, so they shouldn't be mixed. For the latter maybe better is grc-Latn-x-lexnoacc

With apologies for the delay, I hope you'll now find that the LGPN output uses 'grc' language codes.

Gabriel Bodard

Oct 10, 2014, 7:10:22 AM
to ancient...@googlegroups.com
Thanks for these suggestions and observations, Vladimir. We're in the
process of discussing your points, and I'll reply below to a few that
I think I know the answers to. (Others may contribute to this
conversation to clarify, add to, or disagree with my comments.)

On 18 September 2014 18:43, Vladimir Alexiev <vlad...@sirma.bg> wrote:
> 1. Only familial/intimate/servant/slave, not professional (student,
> workshop of), nor groups. Professional are covered by AgRelOn and are needed
> by Getty ULAN. Ain't they in scope?

No, I don't think we felt we had a use-case for that yet. People in
the ancient prosopographies we're working with are commonly identified
as "Diogenes son of Aristarchos", "Metella wife of Philippos" or
"Corax freedman of Caracalla" but not "Isak office-mate of
Sebastianus" or "Barbara line-manager of Fidella". If we some day want
to ingest a prosopography that does record this kind of non-familial
relationships (maybe Ethan's Kerameikos database has some examples)
then we'll look at things like AgRelOn as an extension to SNAP.

> 2. All relations modeled as classes (good, allows for
> time/place/qualification). But the modeling of relations is asymmetric, even
> for inherently symmetric relations. Eg
> <person1> snap:has-bond <person1/sibling>.
> <person1/sibling> a snap:Sibling; snap:bond-with <person2>.
> So eg to find all siblings of <person2>, one needs to use complicated
> queries searching in both directions.

We considered modelling the Bond class slightly differently to allow
this sort of thing, but at present didn't think it would solve any
problems (and would cause a few others, among them added complexity of
modelling). This is open to renegotiation in the future, of course.

> 3. No abstraction of gender: many links implicitly say stuff about the
> gender of the linked people

Since most of our prosopographies don't explicitly assign gender to
individuals, we felt it wasn't our place to do so. If we could
abstract that information from certain relations, for example, then so
could others. Again, this could be done at a later date if it turned
out to be valuable. (But I'd still be nervous about _how_ to define
gender/sex in the absence of a sensible, inclusive standard for doing
so.)

> 5. associatedDate & associatedPlace listed as both data and object
> properties? Protege got confused

That's just an error; snap:associatedDate should be data, and
snap:associatedPlace should be object. I hope this will be fixed in
the latest version of the ontology.

> 10. hi-jacking others' ontologies: efrbroo:F10_Person equivalentClass
> lawd:Person

This equivalence isn't declared by SNAP, but (and only indirectly) by
LAWD and CIDOC (and appears in our ontology only via reasoning on the
ontologies we import). It's confusing, but I don't think it's our
fault.

> 3. cnt:chars is not widely used outside the OA community, and there mostly
> for structured text (eg an SVG shape). Better use rdf:value or rdfs:label?

Agreed that cnt:chars was being misused on Place and Bond: we've
replaced both of those with rdfs:label, as suggested. On Attestation
and Citation, however, the object being described *is* a string of
text, not just labelled by that text, so neither title, value nor
label would really work. cnt:ContentAsText does exactly what it says
on the tin here.

> 4. Prefer DCT instead of DC, because DCT defines ObjectProperties as
> appropriate. Eg use dct: for dc:publisher, dc:replaces

This is of course arbitrary, but agreed, dct: is more usual. Changed.

> 5. The use of dc:identifier to link to other URLs for the same person
> (VIAF, DBpedia) is unusual. Both dc:identifier and dct:identifier allow
> literals. Better use rdfs:isDefinedBy or rdfs:seeAlso (but perhaps not
> owl:sameAs, which implies "smushing" semantics)

This is a good point, but we don't think isDefinedBy or seeAlso are a
very good fit either. isDefinedBy is too specific, since the referent
of such a URI may not in fact define the current person, but just
claim to refer to the same person. Conversely, rdfs:seeAlso is not
specific enough, since we really want a term that we can unambiguously
use when and only when we mean "if two records both point to the same
URL here, then they are the same person". To use a modern example, a
prosopographical entry for Sonia Greene might well say "rdfs:seeAlso"
-> the Wikipedia entry for HP Lovecraft, since she doesn't have her
own WP entry. We wouldn't want our database to then assume that they
are the same person, since they're clearly not. I agree with you that
owl:sameAs brings too much baggage with it, since it would imply that
everything we say in one record should be taken as true of the other,
which just doesn't work. Identifier has the right semantics, although
as you say our putting URIs in there is a bit clunky. Happy for other
suggestions from the list?
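
To make the intended semantics concrete, here is a minimal Python sketch of "if two records both point to the same URL, then they are the same person". Everything in it is an assumption for illustration: the record URIs, the authority URL, and the dictionary layout are hypothetical and do not reflect the actual SNAP data or triplestore.

```python
from collections import defaultdict

# Hypothetical records: two share an identifier, a third merely has a
# seeAlso link to the same URL and must NOT be treated as the same person.
RECORDS = {
    "http://example.org/prosopographyA/person/1": {
        "identifier": ["http://example.org/authority/123"],
    },
    "http://example.org/prosopographyB/person/9": {
        "identifier": ["http://example.org/authority/123"],
    },
    "http://example.org/prosopographyB/person/10": {
        "seeAlso": ["http://example.org/authority/123"],
    },
}


def same_person_candidates(records):
    """Group record URIs by shared identifier value, ignoring seeAlso links."""
    by_identifier = defaultdict(set)
    for record_uri, props in records.items():
        for ident in props.get("identifier", ()):
            by_identifier[ident].add(record_uri)
    # Only identifiers shared by more than one record indicate a match.
    return {ident: uris for ident, uris in by_identifier.items() if len(uris) > 1}


matches = same_person_candidates(RECORDS)
```

Only the two records carrying the shared identifier end up grouped; the record with the seeAlso link is left alone, which is the distinction described above.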

> 8. Include the inverse link dct:isReplacedBy on the snap:MergedResource

This might not hurt, but as far as I can see the only benefit it would
have would be to make SPARQL queries a bit easier? We're actually
doing the querying we'd need to create this in the process of building
the human-readable person page over the triplestore, so it's sort of
simultaneously (a) easy and (b) not necessary. One objection is the
issue of maintaining the same piece of information in two places,
leading to the danger of that falling out of synch. Again though, it
would be easy to add this later if we decided it was useful.

Thanks again for your very thoughtful and attentive examination of our
documentation!

We hope to have a new version of the ontology and Cookbook (and a
first stable, public-facing version of the core data) to share with
you in the next few weeks.

Best,

Gabby

--
Dr Gabriel BODARD
Researcher in Digital Epigraphy

Digital Humanities
King's College London
Boris Karloff Building
26-29 Drury Lane
London WC2B 5RL

Email: gabriel...@kcl.ac.uk
Tel: +44 (0)20 7848 1388
Fax: +44 (0)20 7848 2980

http://www.digitalclassicist.org/
http://www.currentepigraphy.org/

Vladimir Alexiev

Oct 14, 2014, 4:48:53 AM
to ancient...@googlegroups.com
>> Professional are covered by AgRelOn and are needed
>> by Getty ULAN. Ain't they in scope?
> No, I don't think we felt we had a use-case for that yet.

Ok, I'm just asking.

> > 2. All relations modeled as classes (good, allows for
> > time/place/qualification). But the modeling of relations is asymmetric
> > So eg to find all siblings of <person2>, one needs to use complicated
> > queries searching in both directions.
> We considered modelling the Bond class slightly differently to allow
> this sort of thing, but at present didn't think it would solve any
> problems (and would cause a few others, among them added complexity of
> modelling). This is open to renegotiation in the future, of course.

So you're using this pattern: http://www.w3.org/TR/swbp-n-aryRelations/#useCase1 which singles out one participant.
My point is that for a symmetric relation, this is a more appropriate pattern: http://www.w3.org/TR/swbp-n-aryRelations/#useCase3
I don't think case3 is more complicated than case1.
I think that by inverting the outgoing arrow (or some similar "clever" trick) you can cater to the symmetric case.

No matter what the outcome, I think the cookbook should provide a larger number of typical queries.
(It's a good idea to think about them beforehand, following a "competency questions" approach.)
If the modeling is asymmetric, then to find e.g. the siblings of someone, you need a UNION query.
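
A minimal Python sketch of why both directions have to be searched under the asymmetric pattern. The triples are hypothetical and kept in a plain set rather than a triplestore; in SPARQL the two branches of the lookup would correspond to the two halves of a UNION.

```python
# Hypothetical data following the asymmetric has-bond / bond-with pattern.
TRIPLES = {
    ("person1", "snap:has-bond", "person1/sibling"),
    ("person1/sibling", "rdf:type", "snap:Sibling"),
    ("person1/sibling", "snap:bond-with", "person2"),
    ("person2", "snap:has-bond", "person2/sibling"),
    ("person2/sibling", "rdf:type", "snap:Sibling"),
    ("person2/sibling", "snap:bond-with", "person3"),
}


def siblings_of(person, triples):
    """Find siblings of `person` by following Sibling bonds in both directions."""
    bonds = {s for (s, p, o) in triples if p == "rdf:type" and o == "snap:Sibling"}
    found = set()
    for bond in bonds:
        holders = {s for (s, p, o) in triples if p == "snap:has-bond" and o == bond}
        targets = {o for (s, p, o) in triples if s == bond and p == "snap:bond-with"}
        if person in holders:   # direction 1: person has-bond -> bond-with other
            found |= targets
        if person in targets:   # direction 2: other has-bond -> bond-with person
            found |= holders
    found.discard(person)
    return found
```

Here `siblings_of("person2", TRIPLES)` needs both branches: person1 is only reachable through the incoming direction, person3 only through the outgoing one.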

In the same train of thought:
9. Are the inferences of familial relations supposed to be present?
E.g. if X parent Y parent Z, is the data supposed to include X grandparent Z?
Otherwise one needs to make further UNION queries to explore both paths.
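
As an illustration of what materializing such an inference would involve, a minimal Python sketch, assuming parent links are available as simple (parent, child) pairs. The function name and data shape are hypothetical, not something the SNAP ontology defines.

```python
def infer_grandparents(parent_of):
    """Given (parent, child) pairs, return the implied (grandparent, grandchild) pairs."""
    children = {}
    for parent, child in parent_of:
        children.setdefault(parent, set()).add(child)
    return {
        (grandparent, grandchild)
        for grandparent, kids in children.items()
        for kid in kids
        for grandchild in children.get(kid, ())
    }


# X parent Y, Y parent Z implies X grandparent Z.
inferred = infer_grandparents({("X", "Y"), ("Y", "Z")})
```

If the data does not include such derived pairs, every query for grandparents has to walk the two-step parent path itself, in addition to checking for any explicitly stated grandparent bond.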

> > 3. No abstraction of gender: many links implicitly say stuff about the
> > gender of the linked people
> If we could abstract that information from certain relations, for example, then so
> could others.

I mean a different thing!
If you put the gender at the person (where it belongs), and the relations DON’T reflect a gender,
you'll have a more economical and better-organized hierarchy of relations.
I attach the property hierarchy from AgRelOn; there's a presentation by the same name (I think on SlideShare), or I could send you the paper.

> > 10. hi-jacking others' ontologies: efrbroo:F10_Person equivalentClass
> > lawd:Person
> This equivalence isn't declared by SNAP, but (and only indirectly) by
> LAWD and CIDOC (and appears in our ontology only via reasoning on the
> ontologies we import

Do you need to import all of this stuff? I think you only need LAWD but I thought I also saw FRBRoo etc.

> > 5. The use of dc:identifier to link to other URLs for the same person
> > (VIAF, DBpedia) is unusual. Both dc:identifier and dct:identifier allow
> > literals. Better use rdfs:isDefinedBy or rdfs:seeAlso (but perhaps not
> > owl:sameAs, which implies "smushing" semantics)
> This is a good point, but we don't think isDefinedBy or seeAlso are a
> very good fit either. Happy for other suggestions from the list?

Ok then, use lvont:strictlySameAs.
See here for a nice discussion of various identity properties (and there's a paper):
http://www.lexvo.org/linkeddata/identity.html

> > 8. Include the inverse link dct:isReplacedBy on the snap:MergedResource
> it's sort of simultaneously (a) easy and (b) not necessary.

Querying relations in SPARQL is inherently reversible, so yes it's easy.
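
The same reversibility, sketched in Python over a hypothetical list of triples (the example.org URIs are placeholders): the inverse link can be read off the stored dct:replaces statements without being stored itself.

```python
# Hypothetical data: one resource replaces two earlier ones.
TRIPLES = [
    ("http://example.org/person/new", "dct:replaces", "http://example.org/person/old1"),
    ("http://example.org/person/new", "dct:replaces", "http://example.org/person/old2"),
]


def replaced_by(resource, triples):
    """Return every subject that dct:replaces `resource`,
    i.e. the values an explicit dct:isReplacedBy link would carry."""
    return {s for (s, p, o) in triples if p == "dct:replaces" and o == resource}
```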

Cheers!
AgRelOn- An Agent Relationship Ontology (MTSR 2012).png