Wikidata GLAM Challenge / adding GLAM institutions by state

Tillman, Ruth Kitchin

Feb 18, 2020, 2:55:43 PM
to Archives and Linked Data Interest Group, Darnelle Melvin
Good afternoon all,

At our most recent meeting, I shared docs for adding repositories to Wikidata. They are based on Eira Tansey and Ben Goldman's RepoData project and include some geocoordinate reconciliation by Hillel Arnold and Ed Summers.

Elizabeth and I saw this as a chance to test out our recommendations for the FindingGLAMs Challenge:
https://meta.wikimedia.org/wiki/FindingGLAMs_Challenge/About/en
The RepoData project also has tons of smaller institutions which aren't on the major radar.

I've made this publicly accessible so anyone should be able to view the folder: https://drive.google.com/drive/folders/19Twmy224hg6we1moGno02aviTUC1C1UE

The Instructions document has a walkthrough of how to tackle each major field: https://docs.google.com/document/d/1kyFirXksrUpKITyR-8nirG-YCaHgP9w8B8elV4XriLI/edit?usp=sharing
The CSVs have some additional columns you won't use in the final product but which may be helpful. I'm happy to answer any questions.

Please sign up for the state you want to start with on https://docs.google.com/spreadsheets/d/1L7Uk3pgbWF9FStEmoVET0eRfMECfj8XwKlx1BIpvu8A/edit#gid=0 so we know who's doing what! If you want to pick a smaller one, I would recommend looking at file size (e.g. Delaware is 3KB).

Some critical instructions:
  1. You have to check every entity before creating a record for it. Some are already in Wikidata under weirdly different name forms or with different "instance of" types than you'd expect. I'd recommend doing that as a first (and boring, sorry) pass on the whole spreadsheet (or each smaller spreadsheet).
  2. For bigger states, you may want to make a few smaller spreadsheets, plus an especially small one (like 5 rows) to get started. The critical part: if you're not sure about it, start small, not big!
  3. Before you start reconciling (while deduping, etc.), you may want to work in Excel or LibreOffice. It is much easier to delete rows there.
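
For anyone who would rather script step 2 than copy rows by hand, here is a minimal Python sketch of one way to do it. It is not part of the shared instructions; the function name `split_csv`, the file naming, and the default sizes are all made up for illustration, and it assumes the state CSV is UTF-8 with a single header row.

```python
import csv
from pathlib import Path


def split_csv(source, out_dir, starter_rows=5, chunk_rows=50):
    """Write a small starter file plus fixed-size chunks, each with the header row.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    with source.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        rows = list(reader)

    # The first few rows become the practice file; the rest are chunked.
    groups = [rows[:starter_rows]]
    rest = rows[starter_rows:]
    groups += [rest[i:i + chunk_rows] for i in range(0, len(rest), chunk_rows)]

    written = []
    for index, group in enumerate(groups):
        if not group:
            continue
        target = out_dir / f"{source.stem}_part{index:02d}.csv"
        with target.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(group)
        written.append(target)
    return written
```

Calling `split_csv("delaware.csv", "delaware_parts")` would leave the original file untouched and write `delaware_part00.csv` (the 5-row starter) and so on into the output folder.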
Once you start reconciling, check out the Reconciling section of the instructions for some tips. I'm also happy to walk folks through it. There are ways to load batch data other than QuickStatements, but OpenRefine's export to Wikidata is not well suited to this kind of project (judging from my attempts to make it fit). It's better for enhancing existing records.
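
As a possible supplement to the manual check in step 1, the sketch below looks up a name through Wikidata's public `wbsearchentities` API and lists candidate items to review by eye. It is an illustration under assumptions, not the workflow from the Instructions document: the helper names and the User-Agent string are hypothetical, and a name search will not catch every variant name form, so it does not replace the manual pass.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://www.wikidata.org/w/api.php"


def search_url(name, language="en", limit=5):
    """Build a wbsearchentities request URL for an institution name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def summarize(payload):
    """Reduce an API response to (id, label, description) tuples."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in payload.get("search", [])
    ]


def find_candidates(name):
    """Fetch candidate items for a name (needs network access)."""
    # Hypothetical User-Agent; replace with something that identifies you.
    request = Request(search_url(name), headers={"User-Agent": "repodata-check-sketch/0.1"})
    with urlopen(request, timeout=30) as response:
        return summarize(json.load(response))
```

An empty list from `find_candidates(...)` only means that this exact search string found nothing; try alternate name forms before creating a new record.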

You can email me here, Elizabeth at eru...@emory.edu, the whole group via the listserv, and/or find me as ruthbrarian on various slacks if you want to problem-solve!

Sign up, take a sheet, and good luck!

Thanks,
Ruth

Ruth Kitchin Tillman
Cataloging Systems & Linked Data Strategist
Penn State University Libraries
Paterno Library 006 | 814-867-1038
rk...@psu.edu


she/her/hers

Tillman, Ruth Kitchin

Feb 18, 2020, 3:10:32 PM
to Archives and Linked Data Interest Group, Darnelle Melvin
Just one follow-up: be careful with Excel, as it may try to parse the P625 (coordinates) column in odd ways. I use LibreOffice instead.
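
One way to catch a mangled coordinates column after a spreadsheet round-trip is to validate it with a short script. The sketch below assumes the P625 column holds QuickStatements-style `@LAT/LON` values (for example `@39.7391/-75.5398`); that format is an assumption about these CSVs, and the function name is hypothetical.

```python
import re

# Assumed format: "@" + latitude + "/" + longitude, in decimal degrees.
COORD = re.compile(r"^@(-?\d+(?:\.\d+)?)/(-?\d+(?:\.\d+)?)$")


def bad_coordinates(values):
    """Return (row_number, value) pairs that don't look like @LAT/LON.

    Row numbers start at 2, on the assumption that row 1 is the header.
    """
    problems = []
    for row_number, value in enumerate(values, start=2):
        match = COORD.match(value.strip())
        if not match:
            problems.append((row_number, value))
            continue
        lat, lon = float(match.group(1)), float(match.group(2))
        # Flag values that parse but fall outside valid ranges.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            problems.append((row_number, value))
    return problems
```

Running this over the column both before and after editing makes it easy to see whether the spreadsheet program changed anything.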
