Querying ULAN with Python


ziny...@gmail.com

Dec 12, 2017, 8:48:54 AM
to Getty Vocabularies as Linked Open Data
I'd like to search ULAN for artist names and output the preferred name with Python.
I found a useful library, http://skosprovider-getty.readthedocs.io/en/stable/index.html, but it seems to have only AAT and TGN providers.
Is there a way to accomplish this? A link to sample code would help a lot.

Thanks
Zin

Gabriel Kerneis

Dec 12, 2017, 9:53:27 AM
to ziny...@gmail.com, Getty Vocabularies as Linked Open Data

Gabriel


ziny...@gmail.com

Dec 12, 2017, 10:05:47 AM
to Getty Vocabularies as Linked Open Data
Meanwhile I was able to submit a query successfully:


'''
This script demonstrates using the ULANProvider to get the concept of Artists
'''

from skosprovider_getty.providers import ULANProvider

ulan = ULANProvider(metadata={'id': 'ULAN'})
artist = ulan.find({'label': 'Gončarova'})
#artist = ulan.get_by_id(500115630)
print(artist[0])

>>> {'id': '500115630', 'uri': 'http://vocab.getty.edu/ulan/500115630', 'type': 'Concept', 'label': "Goncharova, Natal'ya", 'lang': ''}


But, what I'm interested in is the preferred name as stated on http://www.getty.edu/vow/ULANFullDisplay?find=natalia+goncharova&role=&nation=&subjectid=500115630.
How do I get this?

vladimir...@ontotext.com

Dec 12, 2017, 10:24:19 AM
to Getty Vocabularies as Linked Open Data
> 'label': "Goncharova, Natal'ya"


The python provider gets correct info. You can see the pref and alt labels in RDF at http://vocab.getty.edu/ulan/500115630?inference=implicit
I can only surmise that the website data was updated recently and the RDF data will be updated within the next 2 weeks. Gregg, could you comment?

The change log of that ULAN concept shows that the last change to a term was in 2011, though of course it won't tell you what future changes will be made. You can see it using this query:
select * {
  [prov:used ulan:500115630; 
             dc:type ?type;
             prov:startedAtTime ?time;
             dc:description ?descr]
} order by ?time

I also notice that this concept dcterms:replaces ulan:500002610, ulan:500114934. I.e. those two have been merged into this one.

Zin, have you noticed any other ULAN pref label discrepancies between RDF and website?

ziny...@gmail.com

Dec 12, 2017, 10:41:53 AM
to Getty Vocabularies as Linked Open Data
This one also looks strange:

from skosprovider_getty.providers import ULANProvider

ulan = ULANProvider(metadata={'id': 'ULAN'})
artist = ulan.find({'label': 'monet'})

print(artist[0]['label'])

>>> HoschedÃ©-Monet, Blanche
Should be Hoschedé-Monet, Blanche

I have the following additional questions:
1. Is the first label returned always the preferred name?
2. How can I integrate an RDF query for ULAN within Python?

ziny...@gmail.com

Dec 12, 2017, 2:33:01 PM
to Getty Vocabularies as Linked Open Data
This is how I got it right:
print(artist[0]['label'].encode('latin1').decode('utf-8'))
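
For anyone wondering why this works: the label looks like UTF-8 bytes that were decoded as Latin-1 somewhere along the way, so re-encoding as Latin-1 recovers the original bytes, and decoding those as UTF-8 gives the intended text. A minimal sketch, where fix_mojibake is a hypothetical helper name (not part of skosprovider_getty) and the garbled string is built locally rather than fetched from ULAN:

```python
def fix_mojibake(text):
    """Undo UTF-8 text that was mistakenly decoded as Latin-1."""
    try:
        return text.encode("latin1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage (or already correct): leave it alone.
        return text


correct = "Hoschedé-Monet, Blanche"
# Simulate the damage: UTF-8 bytes read back as Latin-1.
garbled = correct.encode("utf-8").decode("latin1")

print(garbled)                # the damaged label
print(fix_mojibake(garbled))  # the repaired label
```

The try/except matters if some labels come back already correct: a plain encode/decode chain would raise on those, while this version returns them unchanged.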

Getty Vocabularies LOD

Dec 12, 2017, 3:44:41 PM
to Getty Vocabularies as Linked Open Data
The data was updated today, but this update will not fix the discrepancy between the skos:prefLabel for this record and the preferred term on the website. I did some checking: the reason is that the skos:prefTerm is marked as preferred for the language "undetermined" in the data, which is not correct. I see about 9000 records in ULAN where this is the case. This issue will be fixed in the next publish (January). Meanwhile, if you want to always get the preferred term for a ULAN record, I suggest using the gvp:prefLabelGVP property.

Gregg Garcia
Software Architect
Getty Digital

vladimir...@ontotext.com

Dec 13, 2017, 2:01:34 AM
to Getty Vocabularies as Linked Open Data
> suggest using the gvp:prefLabelGVP property.

To elaborate on this:
http://vocab.getty.edu/ulan/500115630 has gvp:prefLabelGVP http://vocab.getty.edu/ulan/term/1500203181.
http://vocab.getty.edu/ulan/term/1500203181 has xl:literalForm "Goncharova, Natalia".

You can get the prefLabelGVP with a query like this:
select * {
  ?x luc:term "Monet"; 
     skos:inScheme ulan: ; 
     gvp:prefLabelGVP/xl:literalForm ?name ;
     #rdf:type gvp:PersonConcept
 # filter(regex(?name,"Monet[^a-z]"))
}

luc: is Lucene FTS and it searches with some name variation. If you want an exact match, you can uncomment the filter.
If you want to exclude institutions (Musée Marmottan Monet), uncomment the rdf:type part.

See http://vocab.getty.edu/queries for a lot more examples.
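
A query like this can also be sent from Python with nothing but the standard library. This is a minimal sketch, not a tested client: it assumes the endpoint serves JSON results at http://vocab.getty.edu/sparql.json and predefines the luc:, skos:, ulan:, gvp: and xl: prefixes (the queries in this thread rely on that). build_query, extract_names and preferred_names are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Getty SPARQL endpoint returns JSON results at this URL.
ENDPOINT = "http://vocab.getty.edu/sparql.json"


def build_query(term):
    """Return the prefLabelGVP query from this thread for one search term."""
    # Escape backslashes and double quotes so the term stays inside the literal.
    safe = term.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "select ?x ?name {\n"
        f'  ?x luc:term "{safe}";\n'
        "     skos:inScheme ulan: ;\n"
        "     gvp:prefLabelGVP/xl:literalForm ?name\n"
        "}"
    )


def extract_names(results):
    """Map concept URI -> preferred name from SPARQL JSON results."""
    return {
        row["x"]["value"]: row["name"]["value"]
        for row in results["results"]["bindings"]
    }


def preferred_names(term, timeout=30):
    """Run the query over HTTP (needs network access)."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": build_query(term)})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_names(json.load(response))
```

If the endpoint behaves as assumed, preferred_names("Monet") returns a dict mapping concept URIs to their GVP-preferred names. The parsing step relies only on the standard SPARQL JSON results layout (results, then bindings, then one value per variable).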

> 1. Is the first label returned always the preferred name?

Consult the skosprovider_getty documentation or ask at its page (e.g. as a GitHub issue).

> 2. How can I integrate an RDF query for ULAN within Python?

You need to use a more general library. A quick search for "python sparql" on stackoverflow comes up with