I've been working on an R package for matching ULAN IDs to a list of artist names, optionally filtering by life dates:
https://github.com/mdlincoln/ulanr
I'd like to search against both prefLabel and altLabel for each artist. Right now I'm using the query below, which takes a required NAME argument and the optional arguments EARLYDATE and LATEDATE (if these are not specified, the query is constructed with -9999 and 2099 respectively):
SELECT ?id ?pref_name ?startdate ?enddate ?gender ?nationality
WHERE {
  ?artist skos:inScheme ulan: ;
          luc:term NAME ;
          rdf:type gvp:PersonConcept ;
          dc:identifier ?id ;
          gvp:prefLabelGVP [gvp:term ?pref_name] ;
          foaf:focus ?focus .
  ?focus gvp:biographyPreferred ?bio .
  ?bio gvp:estStart ?startdate ;
       gvp:estEnd ?enddate .
  OPTIONAL {
    ?bio schema:gender [gvp:prefLabelGVP [gvp:term ?gender]] .
  }
  OPTIONAL {
    ?focus gvp:nationalityPreferred [gvp:prefLabelGVP [gvp:term ?nationality]] .
  }
  FILTER(?startdate >= EARLYDATE^^xsd:gYear && ?enddate <= LATEDATE^^xsd:gYear)
}
LIMIT 1
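To make the placeholder substitution concrete, here is a minimal sketch of how NAME, EARLYDATE and LATEDATE could be filled in. It is written in Python purely for illustration (the package itself is R), the template is a shortened version of the query above, and build_query, sparql_string and gyear_lexical are hypothetical names, not functions from ulanr.

```python
# Illustrative sketch only: a shortened template and hypothetical helper names.
QUERY_TEMPLATE = """
SELECT ?id ?pref_name ?startdate ?enddate
WHERE {
  ?artist skos:inScheme ulan: ;
          luc:term NAME ;
          rdf:type gvp:PersonConcept ;
          dc:identifier ?id ;
          gvp:prefLabelGVP [gvp:term ?pref_name] ;
          foaf:focus [gvp:biographyPreferred ?bio] .
  ?bio gvp:estStart ?startdate ;
       gvp:estEnd ?enddate .
  FILTER(?startdate >= EARLYDATE^^xsd:gYear && ?enddate <= LATEDATE^^xsd:gYear)
}
LIMIT 1
"""


def sparql_string(text):
    """Quote text as a SPARQL string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def gyear_lexical(year):
    """Quoted lexical form for xsd:gYear: optional sign, at least four digits."""
    sign = "-" if year < 0 else ""
    return f'"{sign}{abs(year):04d}"'


def build_query(name, early_year=-9999, late_year=2099):
    """Fill the placeholders; the date defaults mirror the ones described above."""
    # Dates are substituted first so that a name containing the literal text
    # "EARLYDATE" or "LATEDATE" cannot be rewritten by a later replacement.
    return (QUERY_TEMPLATE
            .replace("EARLYDATE", gyear_lexical(early_year))
            .replace("LATEDATE", gyear_lexical(late_year))
            .replace("NAME", sparql_string(name)))


if __name__ == "__main__":
    print(build_query("Hendrik Hondius (I)", 1500, 1700))
```

The name is passed to luc:term unchanged apart from string quoting, so any parentheses in it reach the full-text index as typed.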
The program returns a table containing all the bindings from the SPARQL query, along with a column for the originally submitted name. If I understand correctly, this search will include both prefLabels and altLabels, and will return results ordered by score?
I find that the Lucene index seems to handle variant spellings fine, but it behaves unexpectedly when searching against names with numbers. For example:
> ulan_data(c("Hendrik Hondius (I)", "Hendrick Hondius (I)", "Hendrik Hondius", "Hendrick Hondius"))
Source: local data frame [4 x 7]
                  name        id           pref_name birth_year death_year gender nationality
1  Hendrik Hondius (I) 500006788 Hondius, Hendrik, I       1573       1650   male       Dutch
2 Hendrick Hondius (I) 500006788 Hondius, Hendrik, I       1573       1650   male       Dutch
3      Hendrik Hondius 500116744    Hondius, Hendrik       1615       1677   male       Dutch
4     Hendrick Hondius 500006787     Hondius, Gerrit       1891       1970   male       Dutch
The fourth query result was a bit of a surprise, given that none of the prefLabels or altLabels for 500006787 (Hondius, Gerrit) contain Hendrick/Hendrik. I would at least have expected it to return 500116744 (Hondius, Hendrik). Any thoughts?
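In case it helps frame the question: I have not confirmed how luc:term parses its input. Assuming it follows standard Lucene query syntax, parentheses would be grouping operators and bare words might be combined with OR, which could explain a match on "Hondius" alone. Below is a rough sketch, again in Python for illustration, of the preprocessing I could try: escape the Lucene special characters and require every token with AND. The names escape_lucene and strict_term are hypothetical, and I have not tested whether this actually changes the results.

```python
import re

# Characters and operators with special meaning in standard Lucene query syntax.
LUCENE_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape_lucene(token):
    """Prefix each Lucene special character (or operator) with a backslash."""
    return LUCENE_SPECIAL.sub(r"\\\1", token)


def strict_term(name):
    """Escape each whitespace-separated token and join them all with AND."""
    return " AND ".join(escape_lucene(token) for token in name.split())


if __name__ == "__main__":
    print(strict_term("Hendrik Hondius (I)"))
    print(strict_term("Hendrick Hondius"))
```

The resulting string would be used in place of the raw name in luc:term. Requiring every token would also drop the variant-spelling matches that currently work, so this is more of a diagnostic than a fix.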