Issues with reconciling person names to Wikidata


Brendan Honick

Feb 1, 2022, 10:59:30 PM
to OpenRefine
Hi everyone,

I've been running experiments on OpenRefine's reconciliation function using some of my institution's archival metadata, and I have encountered a strange issue when I attempt to reconcile person names to Wikidata.

For example, there are approximately 3900 person names (musicians, songwriters, etc.) associated with a collection of wax cylinders that we have. However, reconciling these people's names to Wikidata using the format "Firstname Middlename Lastname" yields negligible results. Even manually adjusting the relevant filters only gave about 36 matches. See below for an example:

after-automatic-reconciliation.PNG

However, I was able to manually match about 73% of the names to their corresponding entries in Wikidata.

after-manual-reconciliation.PNG

Although there are a variety of reasons why these names did not get matched automatically, such as multiple people sharing the same name in Wikidata, more than 36 names should have been matched automatically. For comparison, I tried reconciling the names to VIAF (using Jeff Chiu's tool here: http://refine.codefork.com/). Within a few minutes of tweaking the filters, I had 47% of the names in the metadata set linked.

Is there a reason why Wikidata isn't working for reconciling a large set of person names like this? Similar attempts at linking the data with some of our other datasets have failed as well.

Thanks for your help!

Brendan Honick

Feb 1, 2022, 11:10:41 PM
to OpenRefine
As a follow-up question, does reconciliation with Wikidata become less effective with larger datasets? Or are the problems I'm running into caused by an API issue?

Thanks!
Brendan Honick, Syracuse University

Antoine Beaubien

Feb 2, 2022, 2:00:01 AM
to openr...@googlegroups.com
Hi Brendan Honick,

Just to check, have you set up properties linked to columns? Also, for the birth/death dates, you must split them into two columns and link each one individually.

To link occupation (role_name), you must FIRST reconcile the occupations in the column with their Wikidata equivalent.

Here is an example:
image.png
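The date-splitting step can also be sketched outside OpenRefine. This is only an illustration: it assumes the source column holds year-only ranges such as "1870-1935" (the real format of the dates in the dataset is not shown in this thread), and `split_life_dates` is a hypothetical helper name.

```python
def split_life_dates(value):
    """Split a 'birth-death' year range into (birth, death).

    Assumes year-only ranges like '1870-1935'; a missing side
    comes back as an empty string. Full dates that contain
    hyphens themselves (e.g. '1870-01-02') are NOT handled.
    """
    # partition splits at the first hyphen only
    birth, _, death = (value or "").partition("-")
    return birth.strip(), death.strip()


# Hypothetical sample values, not taken from the real dataset
for raw in ["1870-1935", "1882-", "-1935", ""]:
    birth, death = split_life_dates(raw)
    print(repr(raw), "->", repr(birth), repr(death))
```

Each half then goes into its own column, so the birth column can be linked to the date of birth property (P569) and the death column to the date of death property (P570).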

Regards,
Antoine Beaubien


Owen Stephens

Feb 2, 2022, 6:15:47 AM
to OpenRefine
Hi Brendan,

When you are doing the reconciliation, are you specifying a 'type' to reconcile against (e.g. Q5 Human) or selecting "Reconcile against no particular type"?
I think there's a problem when the specified type has many subtypes in Wikidata. There is a discussion of how the Wikidata reconciliation service might address this at https://github.com/wetneb/openrefine-wikibase/issues/131

In the meantime, if you switch to "Reconcile against no particular type" you may have more success. For example, I tried reconciling "Louis Friedsell": if I specify Q5 I get nothing, but if I choose 'no particular type' it immediately finds the correct Wikidata entity.
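For anyone who wants to see what differs between the two settings, here is a rough sketch of the query batch a reconciliation client sends, following the general shape of the reconciliation service API (a JSON object keyed `q0`, `q1`, ... with `query`, optional `type`, and `limit` fields). `build_queries` is a hypothetical helper, and the sketch only builds the payload; it does not contact any service.

```python
import json


def build_queries(names, type_id=None, limit=3):
    """Build a reconciliation query batch keyed q0, q1, ...

    When type_id is None the 'type' field is left out entirely,
    which corresponds to "Reconcile against no particular type".
    """
    batch = {}
    for i, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_id is not None:
            query["type"] = type_id  # e.g. "Q5" for human
        batch[f"q{i}"] = query
    return batch


typed = build_queries(["Louis Friedsell"], type_id="Q5")
untyped = build_queries(["Louis Friedsell"])

print(json.dumps(typed, indent=2))
print(json.dumps(untyped, indent=2))
```

The only difference between the two batches is the `type` field, so any change in results comes from how the service filters candidates by type.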

Owen

Brendan Honick

Feb 8, 2022, 7:04:58 PM
to OpenRefine
Hi Antoine and Owen,

Thanks for your suggestions! What ended up working for me was using the properties linked to columns (including splitting up the birth/death dates) as well as "reconcile against no particular type." The error discussed in Owen's GitHub link is likely affecting me as well, since reconciling with Q5 did not work. I've included a screenshot showing the success! :-)

I appreciate the time you took to help me solve this issue!

Best,
Brendan Honick
success.PNG
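As a recap of the combination that worked (columns linked to properties, no type restriction), a single query might be shaped roughly like the sketch below. It assumes the reconciliation API's `properties` list of `{"pid", "v"}` pairs; `row_to_query` is a hypothetical helper, and the name and years are placeholders rather than real data from this collection.

```python
import json


def row_to_query(name, birth="", death="", limit=3):
    """Turn one row into a reconciliation query with no 'type' field.

    Birth and death values are attached as extra properties:
    P569 (date of birth) and P570 (date of death). Empty values
    are skipped so they do not count against a candidate.
    """
    properties = []
    if birth:
        properties.append({"pid": "P569", "v": birth})
    if death:
        properties.append({"pid": "P570", "v": death})

    query = {"query": name, "limit": limit}
    if properties:
        query["properties"] = properties
    return query


# Placeholder row, not a real person from the dataset
example = row_to_query("Example Person", birth="1870", death="1935")
print(json.dumps(example, indent=2))
```

How strongly the service weighs these property values against the name match is up to the service itself, so treat this only as an illustration of what gets sent.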