Hi everybody,
The web archiving sprint team – Christina, John, Naomi, Tommy, and myself – has concluded its work cycle, having achieved its primary goal of developing a download utility to retrieve web archive data from the Archive-It data transfer API. This utility offers the following value:
• Restores our ability to bulk-download web archives produced using Archive-It for SDR accessioning;
• Provides a simple, well-engineered component that can be easily integrated into LOCKSS for import of web archive data into preservation networks;
• Provides a documented, reusable tool that other institutions could use to retrieve their own web archive data from Archive-It;
• Fulfills a core grant requirement, maintaining our commitment to our grant partners and funder, IMLS.
Code and documentation for the “wasapi-downloader” can be found here:
https://github.com/sul-dlss/wasapi-downloader
The demo for the last sprint can be found here:
https://www.youtube.com/watch?v=hrI1U6VDB7c
Thanks go out to:
Christina – For plugging in so well to an effort already underway, and for contributing insightful architecture and data flow documentation and benchmarking.
John + Tommy – For their patience, for asking many of the right questions while getting up to speed on the web archiving code bases, and for working effectively as a team around the team’s outages.
Naomi – For stellar preparation and organization as tech lead, keeping the team focused within a tight timeframe.
Jefferson + Archive-It Engineering Team – For great documentation and responsiveness on questions that arose while interacting with the endpoint.
Fernando + Thib – For quick, generous, and time-saving assistance on an area of functionality where LOCKSS had much Java experience.
~Nicholas