ETL process from Wikidata


Pietro

Jul 22, 2022, 6:34:35 AM
to islandora
Hello,
I would like to import a Wikidata table or the results of a SPARQL query (for example, https://www.wikidata.org/wiki/Wikidata:WikiProject_sum_of_all_paintings/Collection/Uffizi_Gallery) so that I have some data to experiment with for collections, views, and so on.

Are there any examples or tools that show the best way to approach the extract, transform, and load process?

Thanks,
Pietro

Alexander O'Neill

Jul 22, 2022, 12:35:50 PM
to islandora
Hi Pietro,

You might get some ideas from this Drupal module I made for our Islandora RDM project. It queries configurable web services to serve as the back end for an auto-complete field, and the field stores the canonical URL along with a label. You could add this field to a taxonomy term, for example. It includes plugins that query using SPARQL, so you could point it at Wikidata fairly easily, I think.

Best,

 - alexander

Alexander O'Neill

Jul 22, 2022, 12:36:53 PM
to islandora

Oops, I forgot the module link. It's here:

Pietro

Jul 27, 2022, 6:00:55 PM
to islandora
Hi Alexander,
sorry for my late reply.

At the moment I'm not able to figure out how to use the linked_data_field module for bulk import, so I'm continuing to practice with the CSV importer after extracting the Wikidata table in Excel.
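
For reference, the extraction step could probably be scripted instead of going through Excel. What follows is only a minimal sketch, not a tested Islandora workflow. It assumes the public Wikidata query service at https://query.wikidata.org/sparql, and the identifiers in the query (Q3305213 for painting, Q51252 for the Uffizi Gallery, P195 for collection, P170 for creator) are assumptions that should be checked on Wikidata first. The function names, the User-Agent string, and the output columns are made up for illustration; the column names would still have to be mapped to whatever the CSV importer expects.

```python
import csv
import io
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata SPARQL endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"

# Assumption: Q3305213 = painting, Q51252 = Uffizi Gallery,
# P195 = collection, P170 = creator. Verify these before relying on them.
QUERY = """
SELECT ?item ?itemLabel ?creatorLabel WHERE {
  ?item wdt:P31 wd:Q3305213 ;
        wdt:P195 wd:Q51252 .
  OPTIONAL { ?item wdt:P170 ?creator . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 50
"""


def fetch_results(query, endpoint=ENDPOINT):
    """Extract: run the SPARQL query and return the parsed JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    # The User-Agent value is a placeholder; use one that identifies your script.
    request = urllib.request.Request(
        url, headers={"User-Agent": "wikidata-etl-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def bindings_to_csv(results):
    """Transform: flatten SPARQL JSON results into CSV text, one row per binding."""
    columns = results["head"]["vars"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for binding in results["results"]["bindings"]:
        # Variables left unbound by OPTIONAL clauses become empty cells.
        writer.writerow([binding.get(col, {}).get("value", "") for col in columns])
    return buffer.getvalue()


def export_to_csv(path, query=QUERY):
    """Write the query results to a CSV file ready for the load step."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(bindings_to_csv(fetch_results(query)))
```

Calling export_to_csv("uffizi_paintings.csv") would write one row per painting and creator pair, and that file could then be adjusted by hand before being handed to the CSV importer.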

The problem I have now is working out the format to use for fields that reference taxonomy terms, such as linked_agent | person | creator: what is the syntax for such fields?

Thanks, Pietro
