Stanford web archiving work cycle concluded


Nicholas Taylor

Jun 5, 2017, 7:10:25 PM
to WASAPI-Community
Hi everybody,

The web archiving sprint team – Christina, John, Naomi, Tommy, and I – has concluded its work cycle, having achieved its primary goal of developing a download utility to retrieve web archive data from the Archive-It data transfer API. This utility offers the following value:
•    Restores our ability to bulk-download web archives produced using Archive-It for SDR accessioning;
•    Provides a simple, well-engineered component that can be easily integrated into LOCKSS for import of web archive data into preservation networks;
•    Provides a documented, reusable tool that other institutions could use to retrieve their own web archive data from Archive-It;
•    Fulfills a core grant requirement, maintaining our commitment to our grant partners and funder, IMLS.

Code and documentation for the “wasapi-downloader” can be found here: https://github.com/sul-dlss/wasapi-downloader. The demo for the last sprint can be found here: https://www.youtube.com/watch?v=hrI1U6VDB7c.

Thanks go out to:
Christina – For plugging in so well to the effort already underway, and for contributing insightful architecture and data flow documentation and benchmarking.
John + Tommy – For their patience, for asking many of the right questions while getting up to speed on the web archiving code bases, and for working effectively as a team around the team’s outages.
Naomi – For stellar preparation and organization as tech lead, keeping the team focused within a tight timeframe.
Jefferson + Archive-It Engineering Team – For great documentation and responsiveness on questions that arose while interacting with the endpoint.
Fernando + Thib – For quick, generous, and time-saving assistance on an area of functionality where LOCKSS had much Java experience.

~Nicholas