Retrieve full URL of recon matches

Joe Wicentowski

Aug 6, 2016, 3:10:12 AM
to OpenRefine
Hi all,

I know of the cell.recon variables listed at https://github.com/OpenRefine/OpenRefine/wiki/Variables, and I like how we can retrieve a match's ID and other information about reconciliation status and judgments.  

Is there a way to retrieve the match's URL, not just the ID?

I have a feeling of deja vu: I think I once wrote to the list or filed a request for this, but I can't find any record of it. I seem to recall that the answer had something to do with the fact that the full URL isn't stored in the data, only the ID, and that the URL rules come from the current reconciliation service's definition. Is that right?

Even so, is there really no way to retrieve that base URL or URL-generation formula and make it available to a GREL expression in a cell/column transformation?

Thanks,
Joe

p.s. I've posted my reconciliation service code at https://github.com/HistoryAtState/people/blob/master/modules/reconcile.xq. It runs on eXist-db. Unfortunately, an nginx issue that I haven't fully debugged is getting in the way of accessing this service on the beta site, https://1861.history.state.gov/exist/apps/people/. It works when deployed locally, but on the remote server it stalls during the initial discovery step. Hopefully we will soon have the service available for testing.

Sean Crowe

May 22, 2017, 3:46:19 PM
to OpenRefine
I have this same question. I see that when exporting HTML, the match URL is included, but I can't seem to access the URL via GREL.

Thad Guidry

May 22, 2017, 5:40:02 PM
to OpenRefine
The truth is always in the source code. :)

We don't have the URL as part of the ReconCandidate model.  However...read on. :)

Take a look at the existing ReconCandidate model

You'll notice that we store only four things for a ReconCandidate:

public class ReconCandidate implements HasFields, Jsonizable {

    final public String id;
    final public String name;
    final public String[] types;
    final public double score;

    // ... (rest of the class omitted; note that there is no URL field)
}

But you might look through the many under-the-covers (undocumented) fields and just construct something of your own in a hackish way (that maybe works for you... or not):

cell.recon.service + "/" + cell.recon.match.id

cell.recon.judgmentBatchSize

cell.recon.schemaSpace

In fact, you can just play around with anything double-quoted in our source code for Recon, ReconCandidate, etc.: https://github.com/OpenRefine/OpenRefine/tree/master/main/src/com/google/refine/model

You will see lots of things like "judgmentBatchSize" and "judgmentAction" and "schemaSpace", "identifierSpace", etc.

Many of those that you will see double-quoted in that Recon.java source and elsewhere depend on WHAT the actual JSON looks like that is returned from the host Recon server you're hitting. (They may have implemented those or not, and if not, you will just get a null.)
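
As a rough illustration of the concatenation hack above and of the null caveat, here is a minimal plain-Java sketch. It is not OpenRefine code: `ReconSketch`, `serviceUrl`, `matchId`, and `hackishUrl` are hypothetical stand-ins for what cell.recon.service and cell.recon.match.id would give you, and the example.org endpoint is made up. Whether the joined string is a real, browsable URL depends entirely on the service you reconciled against.

```java
// Hypothetical stand-in for the two values the GREL hack reads.
// This is NOT the real com.google.refine.model.Recon class.
final class ReconSketch {
    final String serviceUrl; // stand-in for cell.recon.service (may be null)
    final String matchId;    // stand-in for cell.recon.match.id (null if unmatched)

    ReconSketch(String serviceUrl, String matchId) {
        this.serviceUrl = serviceUrl;
        this.matchId = matchId;
    }

    // Mirrors: cell.recon.service + "/" + cell.recon.match.id
    // Returns null when either piece is missing, rather than a bogus string.
    String hackishUrl() {
        if (serviceUrl == null || matchId == null) {
            return null;
        }
        // Avoid a doubled slash when the service URL already ends with one.
        String base = serviceUrl.endsWith("/")
                ? serviceUrl.substring(0, serviceUrl.length() - 1)
                : serviceUrl;
        return base + "/" + matchId;
    }

    public static void main(String[] args) {
        ReconSketch matched = new ReconSketch("https://example.org/reconcile", "Q42");
        ReconSketch unmatched = new ReconSketch("https://example.org/reconcile", null);
        System.out.println(matched.hackishUrl());
        System.out.println(unmatched.hackishUrl());
    }
}
```

The null check matters in practice: cells that were never matched have no match ID, so any expression built this way should be guarded before it is used in a transformation.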

Feel free to go in and play and write up what you find to expand our examples in our Wiki GREL reference for Recon - https://github.com/OpenRefine/OpenRefine/wiki/Variables

Thanks !
-Thad

--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sean Crowe

May 22, 2017, 8:01:56 PM
to openr...@googlegroups.com
Thanks so much for the clues. I think adding that ```service``` method to the wiki/Variables#recon would be pretty helpful for my use case (though it may not technically be a field). Appropriate to add a note, maybe?

Best, 
Sean

Thad Guidry

May 22, 2017, 11:05:12 PM
to openr...@googlegroups.com
When we say fields... we're also talking about the JSON fields that were returned for each entity record from the Recon service as well.

For instance, look at what Antonin has wired up to return from the Wikidata Recon service: https://tools.wmflabs.org/openrefine-wikidata/en/api

There you can see the various "fields" available.
I think you're actually asking for the "view": {"url": ???} and not the "service_url".

cell.recon.service.view.url

But it doesn't look like it's wired up in Recon.java (which is where the GREL syntax that you type for cell.recon is interpreted).

You can open an issue for this and perhaps we can get someone to add it in.
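
Until something like that exists, the link can be built by hand. In a reconciliation service's metadata, the view URL is a template in which the entity ID replaces an {{id}} placeholder, so once you know the template for your service you can expand it yourself. A minimal Java sketch of that substitution follows; the Wikidata-style template and the `ViewUrlSketch` and `expand` names are assumptions for illustration, not something read from Recon.java.

```java
// Hypothetical helper showing how a view URL template is expanded.
final class ViewUrlSketch {
    // Placeholder used in a reconciliation service's "view" URL template.
    static final String ID_PLACEHOLDER = "{{id}}";

    // Substitutes the entity ID into the template; null in, null out.
    static String expand(String viewUrlTemplate, String id) {
        if (viewUrlTemplate == null || id == null) {
            return null;
        }
        // String.replace does a literal (non-regex) replacement.
        return viewUrlTemplate.replace(ID_PLACEHOLDER, id);
    }

    public static void main(String[] args) {
        // Assumed template; check your own service's metadata for the real one.
        String template = "https://www.wikidata.org/wiki/{{id}}";
        System.out.println(expand(template, "Q42"));
    }
}
```

Inside OpenRefine, the same idea amounts to hard-coding the part of the template before the placeholder in a GREL expression, for example "https://www.wikidata.org/wiki/" + cell.recon.match.id, again assuming that is the right template for the service in question.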

-Thad

