As a follow-up: in my use case, what I need is something like an "OR"
between values.
Others might need an "AND".
And actually, an "AND" can be done by using multiple columns on
the same property. Candidates get a better score the more of those
values they have in the targeted property. A score of 100 means a full
"AND" match.
In my case, using one column and then one property constraint per
value (painter, sculptor, engraver...) lowers the scores, as one person
having all the possible occupations is very unlikely, though still
logically possible. In such a case, candidates with a score above ~70
could be considered full matches. This can actually be done in two
clicks thanks to the beautiful Open Refine reconciliation facets and
actions.
So I guess that's the way to go.
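To make the threshold idea concrete, here is a minimal sketch of the same filtering done outside Open Refine, on a list of candidates shaped like a reconciliation response (`id`, `name`, `score`, `match`). The candidate data, the `full_matches` helper and the 70 cut-off are all illustrative assumptions, not output from a real service.

```python
def full_matches(candidates, threshold=70.0):
    """Keep candidates whose score reaches the threshold (hypothetical helper)."""
    return [c for c in candidates if c["score"] >= threshold]


# Invented candidates, only here to show the filtering step.
candidates = [
    {"id": "hypothetical-1", "name": "Paul Girard", "score": 72.0, "match": False},
    {"id": "hypothetical-2", "name": "Paul Girard", "score": 41.5, "match": False},
    {"id": "hypothetical-3", "name": "Paul Girardet", "score": 70.0, "match": False},
]

kept = full_matches(candidates)
print([c["id"] for c in kept])
```

The threshold itself would have to be tuned by looking at the score facet, since it depends on how many property constraints are sent.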
And at least that way we stick to a single query per alignment.
To finish this very-long-talk-to-myself thread: what I would ideally
need here is a way to ask for a scoring mechanism that outputs a fixed
score whatever the number of "OR" values that match the property.
Very specific, I guess.
Above all, it's not in Open Refine's hands anyway, as "The way candidates
are retrieved from the underlying database and scored against the query
is left entirely at the discretion of the service."
https://reconciliation-api.github.io/specs/0.1/#a-note-on-candidate-retrieval-and-scoring
Ok let's dig then:
"For each supplied property, all query values are matched against
reference values and the maximum matching score of all pairs is used as
the similarity score for this property."
https://openrefine-wikibase.readthedocs.io/en/latest/scoring.html#global-matching-formula
But wait. If I understand correctly, this would actually mean that if
any value matches perfectly, the score should be maximal. That's an "OR"...
Hmm, what I see from playing with Open Refine does not sound like that
statement.
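A small sketch of how I read the quoted rule, and of one possible reason for the lower scores I observe. `property_score` follows the quoted sentence (maximum over all query/reference pairs). The `similarity` function is a toy stand-in, and `mean_over_constraints` is purely my assumption about what might happen when the same property is sent several times; neither is taken from the service's code.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Toy string similarity in [0, 100]; a stand-in for the service's matcher."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def property_score(query_values, reference_values):
    """Maximum similarity over all (query value, reference value) pairs."""
    return max(
        (similarity(q, r) for q in query_values for r in reference_values),
        default=0.0,
    )


def mean_over_constraints(constraints, reference_values):
    """ASSUMPTION: each repeated constraint is scored alone, then averaged."""
    scores = [property_score(values, reference_values) for values in constraints]
    return sum(scores) / len(scores)


reference = ["painter"]  # occupations of a hypothetical candidate

# One constraint carrying three values: behaves like an "OR".
one_constraint = property_score(["painter", "sculptor", "engraver"], reference)

# Three constraints with one value each: behaves more like an "AND".
three_constraints = mean_over_constraints(
    [["painter"], ["sculptor"], ["engraver"]], reference
)

print(one_constraint, three_constraints)
```

Under these assumptions the single multi-valued constraint reaches the maximum as soon as one value matches, while the repeated constraints drag the score down, which would be consistent with what I see.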
Can Open Refine actually send a list of multiple values for one
property? Or does it send the same property multiple times, with one
value each?
Sounds like the second option to me.
At that point I wish I knew a way to see the query sent by Open Refine
to the reconciliation service.
When I do a query with two different columns on the same property, I think
what Open Refine sends is:
```
{
  "q0": {
    "query": "Paul Girard",
    "type": "DifferentiatedPerson",
    "limit": 5,
    "properties": [
      {
        "pid": "occupation",
        "v": "painter"
      },
      {
        "pid": "occupation",
        "v": "sculptor"
      }
    ],
    "type_strict": "should"
  }
}
```
whereas it could also be:
```
{
  "q0": {
    "query": "Paul Girard",
    "type": "DifferentiatedPerson",
    "limit": 5,
    "properties": [
      {
        "pid": "occupation",
        "v": [
          "painter",
          "sculptor"
        ]
      }
    ],
    "type_strict": "should"
  }
}
```
Syntax taken from
https://reconciliation-api.github.io/specs/0.1/#structure-of-a-reconciliation-query
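To compare both shapes against a service by hand, the two payloads above can be generated with a few lines of Python. This is only a sketch: `build_query` and its `repeat_property` flag are names I made up, and I am assuming the batch is sent as a form-encoded `queries` parameter, as I understand the 0.1 spec. Nothing is actually sent here.

```python
import json
from urllib.parse import urlencode


def build_query(name, type_id, pid, values, repeat_property, limit=5):
    """Build one reconciliation query in either of the two shapes (hypothetical helper)."""
    if repeat_property:
        # Same property repeated, one value each.
        properties = [{"pid": pid, "v": v} for v in values]
    else:
        # One property carrying the list of values.
        properties = [{"pid": pid, "v": list(values)}]
    return {
        "query": name,
        "type": type_id,
        "limit": limit,
        "properties": properties,
        "type_strict": "should",
    }


occupations = ["painter", "sculptor"]
repeated = build_query("Paul Girard", "DifferentiatedPerson", "occupation", occupations, True)
listed = build_query("Paul Girard", "DifferentiatedPerson", "occupation", occupations, False)

# ASSUMPTION: the batch goes into a form-encoded "queries" parameter.
form_body = urlencode({"queries": json.dumps({"q0": listed})})
print(json.dumps({"q0": repeated}, indent=2))
```

Posting `form_body` for each shape to the same endpoint and comparing the returned scores would show whether the service treats them differently.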
To finish, number two: a very common use case for what I am describing
here is matching people (Q5) when you have a list of first names (P735).
Sorry for being so long.
Hope this makes any sense to someone.
Best regards,
Paul