Wikidata reconciliation doesn't return expected results based on subclass

Jane

May 1, 2021, 5:37:51 PM
to OpenRefine
I'm not sure if this is the right place to ask this question, but I'm having trouble with Wikidata's reconciliation service. The documentation says that a reconciliation query should be able to return entities not just of the exact reconciliation type, but of all its subclasses as well. But sometimes that doesn't seem to work.

As an example, I have an item called "Abel Tasman Scenic Reserve".  When I run a reconciliation with the type "scenic reserve" (Q63248569), it matches to the reserve (Q89099173).  But when I run the reconciliation with type "geographic entity" (Q27096213), I get no results for this item even though "scenic reserve" is a subclass of "protected area of New Zealand", which is a subclass of "protected area", which is a subclass of "administrative territorial entity", which is a subclass of "human-geographic territorial entity", which is a subclass of "territorial entity", which is a subclass of "geographic region", which is a subclass of "geographic entity".
In fact, reconciliation works as expected (i.e. finds the reserve) for all types in that list up until geographic region.
Is there a limit on the depth of search through subclasses?  Or some other reason for the failure to find the reserve for some types?
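
One way to check the subclass chain independently of either reconciliation service is to ask the Wikidata Query Service directly. The sketch below is only an illustration: the helper names `subclass_path_query` and `run_ask` are made up for this example, and it assumes the public SPARQL endpoint at query.wikidata.org with its predefined `wd:` and `wdt:` prefixes. It builds an ASK query using the `P279*` ("subclass of", zero or more steps) property path.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata Query Service endpoint.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def subclass_path_query(subclass_qid: str, superclass_qid: str) -> str:
    """Build an ASK query: is subclass_qid reachable from superclass_qid
    by following 'subclass of' (P279) zero or more times?"""
    return f"ASK {{ wd:{subclass_qid} wdt:P279* wd:{superclass_qid} }}"


def run_ask(query: str) -> bool:
    """Send the ASK query and return its boolean answer (needs network access)."""
    url = SPARQL_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    # A descriptive User-Agent is good practice for Wikimedia services;
    # this value is a placeholder.
    request = urllib.request.Request(
        url, headers={"User-Agent": "subclass-check-example/0.1"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["boolean"]
```

Calling `run_ask(subclass_path_query("Q63248569", "Q27096213"))` would report whether "scenic reserve" currently reaches "geographic entity" through the subclass hierarchy. If it does, a missing match points at the reconciliation service rather than at the data.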

Thanks!

Owen Stephens

May 11, 2021, 4:46:20 AM
to OpenRefine
Hi Jane

Just testing this in my OpenRefine setup and it works OK for me:
  • I create a cell with the text "Abel Tasman Scenic Reserve"
  • I start reconciliation
  • I set the "Reconcile against type" to be "geographic entity" (Q27096213)
  • Hit "start reconciling"
  • Cell successfully reconciled to Q89099173
Is this the right sequence? Does it now work for you? (I'm thinking it could have been a temporary issue.) Or have I done something different, or does it fail only when there is more data to run through?

Thanks

Owen

Jane

May 11, 2021, 9:49:45 AM
to OpenRefine
The problem is definitely still there, though I think I may have found a solution of a sort. I only recently installed OpenRefine and I assumed that by now the default Wikidata reconciliation service would be the "new" reconci.link service, in part because the link to the Wikidata reconciliation API documentation from within the OpenRefine documentation goes to the reconci.link material. It also wasn't immediately clear to me how to check the address of the default service labeled "Wikidata (en)". Anyway, it turns out that's the old toolforge address, and using the reconci.link service (which is labeled "Wikidata reconci.link (en)") fixes the problem.
I think it might be helpful if the OpenRefine documentation were a little clearer on the differences between the toolforge service and the reconci.link service.
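
To compare the two services outside the OpenRefine UI, the same query can be sent to each endpoint. This is a minimal sketch, not how OpenRefine itself does it: the function names are hypothetical, the two `/en/api` URLs are assumed from the host names mentioned in this thread, and the request and response shapes follow my understanding of the Reconciliation Service API (a form-encoded `queries` parameter holding a JSON batch, answered with a `result` list per query key).

```python
import json
import urllib.parse
import urllib.request

# Assumption: English-language API paths on the two hosts discussed here.
ENDPOINTS = {
    "toolforge": "https://wdreconcile.toolforge.org/en/api",
    "reconci.link": "https://wikidata.reconci.link/en/api",
}


def build_queries(name: str, type_qid: str, limit: int = 3) -> dict:
    """Build a one-item reconciliation batch restricted to a type."""
    return {"q0": {"query": name, "type": type_qid, "limit": limit}}


def encode_form(queries: dict) -> bytes:
    """Form-encode the batch as the 'queries' POST parameter."""
    return urllib.parse.urlencode({"queries": json.dumps(queries)}).encode("utf-8")


def reconcile(endpoint: str, queries: dict) -> list:
    """POST the batch and return candidate ids for q0 (needs network access)."""
    request = urllib.request.Request(
        endpoint,
        data=encode_form(queries),
        headers={"User-Agent": "reconcile-compare-example/0.1"},  # placeholder
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return [candidate["id"] for candidate in body["q0"]["result"]]
```

Looping over `ENDPOINTS` and calling `reconcile(url, build_queries("Abel Tasman Scenic Reserve", "Q27096213"))` for each would show whether the two services return different candidates for the same type constraint.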

Antonin Delpeuch (lists)

Jun 3, 2021, 3:42:56 AM
to openr...@googlegroups.com
Hi Jane,

On 11/05/2021 15:49, Jane wrote:
> I think it might be helpful if the OpenRefine documentation were a
> little clearer on the differences between the toolforge service and the
> reconci.link service.

I could not agree more with you. The problem is that we are in a pretty
sorry state of affairs at the moment:

- the wdreconcile.toolforge.org service, which is currently used by
default in OpenRefine, is unmaintained. I used to maintain it but I am
not able to fix some bugs which seem inherent to the hosting
infrastructure (Toolforge) which is outside my control. Therefore I have
decided to deprecate it and no longer intervene when it breaks.

- the wikidata.reconci.link service is one I run and maintain in a
personal capacity, hosted on my own infrastructure. I proposed updating
OpenRefine to use this one by default, but there are ethical concerns
around this: it means I would be able to see the IP address and
reconciled data of anyone using OpenRefine to reconcile with Wikidata.
(With the toolforge service I am not able to see the IP address.)

The obvious solution would be for Wikidata to offer its own, official
reconciliation service. If you feel like requesting that from the Wikidata
team, perhaps it would help bump the priority of such a project on their
roadmap.

Best,
Antonin