Hi everybody,
The team working on web archive crawl data download automation has concluded sprint 4 of its work cycle and is pleased to announce an accompanying demo video:
https://www.youtube.com/watch?v=VATmIhSmh6s
As of the end of this sprint, we can retrieve and parse JSON from the WASAPI endpoint, and we can download and validate WARC files.
Features completed this sprint:
• Can run the downloader from the command line to:
    o Parse the first page response
    o Make a request for the file list and get responses
    o Parse the response JSON into an object
• Can download and validate a WARC file, per command line arguments
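For anyone curious what these steps look like in practice, here is a minimal Python sketch. It is not the team's code. It assumes a file-list response shaped roughly like the WASAPI data transfer spec (a "files" array whose entries carry "filename", "checksums", "locations" and "size", plus a "next" link for paging), and it assumes that "validate" means comparing a checksum. The names WebdataFile, parse_file_list and checksum_matches are hypothetical, made up for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class WebdataFile:
    """One entry of the file list (hypothetical class; field names are assumed)."""
    filename: str
    checksums: dict = field(default_factory=dict)
    locations: list = field(default_factory=list)
    size: int = 0


def parse_file_list(response_text):
    """Parse one page of a file-list response into (files, next_page_url)."""
    page = json.loads(response_text)
    files = [
        WebdataFile(
            filename=entry["filename"],
            checksums=entry.get("checksums", {}),
            locations=entry.get("locations", []),
            size=entry.get("size", 0),
        )
        for entry in page.get("files", [])
    ]
    # "next" is assumed to hold the URL of the following page, or null.
    return files, page.get("next")


def checksum_matches(path, algorithm, expected_hex):
    """Return True if the file at `path` hashes to `expected_hex`."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large WARC files are not loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

A caller would fetch a page, pass its body to parse_file_list, download each file from one of its locations, and then call checksum_matches with one of the listed algorithms (for example "sha1") before keeping the file. Following the "next" link until it is null would cover the remaining pages.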
Look forward to more exciting features in the next (and final) installment of the web archive crawl data download automation work cycle!
~Nicholas