Work in progress:
https://ambuda.org/texts/downloads/ contains two downloads: `metadata.json` (a list of all texts with source info, download URLs, authors, etc.) and `tei-headers.xml` (the `<teiHeader>` elements from all documents on the site).
The goal is programmatic access to the full library for easier ingestion by other projects.
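
For anyone who wants to try ingesting these, here is a rough sketch of reading the two files with only the Python standard library. Several things in it are assumptions on my part rather than anything documented: the exact file URLs (I am guessing they sit directly under the downloads page), the `<headers>` wrapper element in the sample, and the helper names `fetch_bytes`, `local_name`, and `header_titles`. The only thing it relies on from TEI itself is that a `<teiHeader>` normally carries a `<title>` inside its `<titleStmt>`.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET

# ASSUMPTION: the two files are served directly under the downloads page.
DOWNLOADS = "https://ambuda.org/texts/downloads/"

# ASSUMPTION: made-up sample data. The <headers> wrapper and the titles are
# hypothetical and only illustrate the expected shape of the input.
SAMPLE_HEADERS = """<headers>
  <teiHeader xmlns="http://www.tei-c.org/ns/1.0">
    <fileDesc><titleStmt><title>Sample text</title></titleStmt></fileDesc>
  </teiHeader>
  <teiHeader xmlns="http://www.tei-c.org/ns/1.0">
    <fileDesc><titleStmt><title>Another text</title></titleStmt></fileDesc>
  </teiHeader>
</headers>"""


def fetch_bytes(url):
    """Download a URL and return the raw response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def header_titles(xml_data):
    """Return the first <title> text of each <teiHeader>, or None if absent.

    Matching is done on local names, so it works whether or not the
    elements are in the TEI namespace.
    """
    root = ET.fromstring(xml_data)
    titles = []
    for elem in root.iter():
        if local_name(elem.tag) != "teiHeader":
            continue
        title = next(
            (e for e in elem.iter() if local_name(e.tag) == "title"), None
        )
        text = "".join(title.itertext()).strip() if title is not None else None
        titles.append(text)
    return titles
```

Against the live site this would be used roughly as `header_titles(fetch_bytes(DOWNLOADS + "tei-headers.xml"))` and `json.loads(fetch_bytes(DOWNLOADS + "metadata.json"))`, again assuming those URLs are right. I have deliberately not shown any field access on the metadata, since the key names are not spelled out here.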
Still missing:
- bulk downloads of the texts themselves. Certainly XML, likely TXT, not sure about other formats (e.g. probably not PDFs in every choice of script)
- `updated_at` to see whether a text has changed
- license information (CC0 1.0 Universal vs. various combinations of BY/NC/SA)
Arun