matching my data with reconciled data


Robert Garrigos

Oct 24, 2022, 4:28:56 PM
to openrefine
Hi all,

There is something I don't quite understand yet. Sorry if this is a
silly question.

I have a CSV file of persons, each with a name, gender, and dates of birth
and death. I reconcile the names with Wikidata, then I add a column with
birth dates from the reconciled name column.

Now, how can I combine those birth dates, mine and Wikidata's, and update
Wikidata's values? Or update my values? Is there a way to do so? Is my whole
process wrong?

Thanks.
--
========================
Robert Garrigós i Castro
https://garrigos.cat
+34 620 91 87 01

Owen Stephens

Oct 24, 2022, 5:13:27 PM
to OpenRefine
I don't think your process is necessarily wrong, although it may depend on what you are aiming for of course.

The first question I'd ask is: how do you know which of the birth dates are correct, yours or Wikidata's?
If you can answer that question, I hope I can help with a process that gets you to the outcome you need.

Owen

Robert Garrigos

Oct 24, 2022, 10:03:02 PM
to openrefine
Thanks Owen,

That's the right question. Actually, I don't know which birth date is the right one, but let's say that I can filter a set of rows in my data whose values should replace Wikidata's. Is there a way to copy those to the reconciled column?


Robert Garrigós i Castro

Owen Stephens

Oct 25, 2022, 4:18:09 AM
to OpenRefine
So if you have a list of correct dates of birth in a column and you can filter it on some criteria, you don't need to copy it to the reconciled column. You simply filter to that set of rows, then use the schema editor to design a schema where you are adding your data. Set the configuration for the statement to have the behaviour you want (as per the discussion in https://groups.google.com/g/openrefine/c/HmoVxpcizwk/m/MQ6mk_QMBgAJ).
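
As a rough illustration of one possible filter criterion, here is a minimal Python sketch that compares your own date with the one fetched from Wikidata and labels each pair. It is not part of OpenRefine: the function name and the labels are hypothetical, and it assumes both values have already been normalised to YYYY, YYYY-MM or YYYY-MM-DD text.

```python
def classify_dates(own, wikidata):
    """Label how a local date compares with the Wikidata one.

    Both arguments are assumed (hypothetically) to be text in the form
    YYYY, YYYY-MM or YYYY-MM-DD, or empty/None when no date is known.
    """
    own = (own or "").strip()
    wikidata = (wikidata or "").strip()
    if not own:
        return "no-local-date"
    if not wikidata:
        return "missing-in-wikidata"
    # Compare only the part both values share, e.g. "1901" vs "1901-05-12".
    shared = min(len(own), len(wikidata))
    if own[:shared] != wikidata[:shared]:
        return "conflict"
    if len(own) == len(wikidata):
        return "agree"
    return "differs-in-precision"
```

Rows labelled "conflict" or "missing-in-wikidata" would be the candidates to review before deciding which ones to send to Wikidata.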


If you are taking the 'add new / delete old' approach, you can also include the statement that you got from Wikidata, with a delete configuration.

Now you can go ahead and load the data.

My approach is usually to filter to a single row first and check that all my processes work for that single row. Only once I'm confident would I then go ahead with mass changes.

Some notes:
* The format needed for date of birth is YYYY-MM-DD (month and day are optional), and it is expected as a text string, so make sure you are feeding text strings (not OpenRefine dates) into the schema
* Date of birth requires at least one reference, so you need to be able to cite a source for your information
* The 'editgroups' tool allows you to undo a whole set of Wikidata edits generated from an OpenRefine process in one go. Bookmark https://editgroups.toolforge.org, as it has saved me on more than one occasion when I've messed up an edit process
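
To make the first note about the date format concrete, here is a minimal Python sketch that builds such a text string from year, month and day parts, with month and day optional. The helper name is hypothetical and this is only an illustration of the YYYY-MM-DD shape described above, not an OpenRefine function.

```python
from datetime import date


def date_text(year, month=None, day=None):
    """Return 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' as a plain text string."""
    if month is None:
        if day is not None:
            raise ValueError("a day cannot be given without a month")
        return f"{year:04d}"
    if day is None:
        date(year, month, 1)  # raises ValueError if the month is invalid
        return f"{year:04d}-{month:02d}"
    # date() validates the full calendar date before formatting it.
    return date(year, month, day).isoformat()
```

For example, `date_text(1901, 5)` gives the text `1901-05`, while an impossible date such as `date_text(1901, 2, 30)` raises a `ValueError` instead of producing a bad string.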

Owen



Robert Garrigos

Oct 25, 2022, 7:57:54 AM
to openr...@googlegroups.com
Thanks Owen, very useful and informative answer, as always.

Indeed, this is probably what I was looking for.

And editgroups is a very interesting tool! I was going to give up
uploading directly from OpenRefine and do it through
QuickStatements instead, just because of the undo function in
QuickStatements. Now I don't have to :-)

Again, thanks a lot.



Thad Guidry

Oct 25, 2022, 9:57:25 AM
to openr...@googlegroups.com
It feels like part of your response, Owen, should be added to our docs. Maybe as a quick paragraph in a :::tip admonition or something?


> you simply filter to that set of rows, then use the schema editor to design a schema where you are adding your data - set the configuration for the statement to have the behaviour you want
>
> If you are taking the 'add new / delete old' approach, you can also include the statement that you got from Wikidata with a delete configuration
>
> Now you can go ahead and load the data
