Dear Dataverse community,
I am writing to let you know that the Python library Pooch recently added support for Dataverse repositories (in v1.7.0). Pooch is a BSD-3-licensed Python package for downloading data from within Python code. Among other things, it offers the following features:
* On-demand download and caching in OS-specific paths
* Automatic checksum verification
* Multiple download protocols
* Built-in support for archive decompression upon download
* Optional progress bars and logging statements
* Automatic download from sources specified by DOIs
For Dataverse users, DOI resolution is probably the most interesting feature. You can access data simply by specifying the DOI of a Dataverse-hosted dataset, for example like this:
import pooch
data = pooch.create(base_url="doi:10.11588/data/TJNQZG", path=pooch.os_cache("myproject"))
data.load_registry_from_doi()
datafile = data.fetch("nkd_fpl_valley_TF.json")
This resolves the DOI, determines the data repository type (all Dataverse instances are supported), and queries the Dataverse API for the files contained in the dataset and their checksums. The specified file is then downloaded on demand, and its local path is stored in the datafile variable. A second request for the same file returns the cached copy instead of downloading it again.
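To illustrate the caching and checksum behaviour described above, here is a minimal standard-library sketch of the general idea. It is not Pooch's actual implementation: the names fetch_cached, sha256_of and fake_download are hypothetical, the download step is replaced by a stand-in function so that no network access is needed, and SHA-256 is an arbitrary choice of hash algorithm for this sketch.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_cached(name, expected_sha256, cache_dir, download):
    """Return the local path of `name`, calling `download` only when needed.

    `download` is a hypothetical callable taking (name, target_path) that
    writes the file; it stands in for the real network request.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / name
    # Reuse the cached copy only if it exists and its checksum matches.
    if target.exists() and sha256_of(target) == expected_sha256:
        return target
    download(name, target)
    # Verify the freshly downloaded file before handing it to the caller.
    if sha256_of(target) != expected_sha256:
        raise ValueError(f"Checksum mismatch for {name}")
    return target


# Demonstration with a fake downloader that records how often it is called.
calls = []
payload = b'{"example": true}'


def fake_download(name, target):
    calls.append(name)
    Path(target).write_bytes(payload)


expected = hashlib.sha256(payload).hexdigest()
with tempfile.TemporaryDirectory() as tmp:
    first = fetch_cached("demo.json", expected, tmp, fake_download)
    second = fetch_cached("demo.json", expected, tmp, fake_download)
    same_path = first == second
download_count = len(calls)
```

In this sketch the second call finds a cached file with a matching checksum, so the stand-in downloader is invoked only once and both calls return the same local path.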
For more information, see the following sources:
* GitHub repository: https://github.com/fatiando/pooch
Pooch is available from PyPI and conda-forge.
I am happy to answer your questions,