Hi everyone!
I'm forwarding a request from Alvin Stockdale, who helps manage a dataset catalog that harvests metadata from repositories running Dataverse. Alvin is asking for advice on handling datasets that have been deleted from the source repository.
In later emails Alvin mentioned that they're harvesting the metadata through Dataverse's API rather than OAI-PMH. We also talked about how "deleted" datasets could mean either datasets that have been what we call "destroyed" or datasets whose versions have all been deaccessioned.
Could anyone recommend how best to know if a dataset has been deleted "without having to diff our entire catalog each time we grab new datasets and modified datasets"?
---------- Forwarded message ---------
From: Stockdale, Alvin (NIH/NLM)
Date: Wed, Jun 11, 2025 at 8:10 AM
Subject: Harvard Dataverse deletes
Hi Julian,
Things are progressing with NLM’s Dataset Catalog and we’re starting to figure out how we will handle datasets that have been deleted from the source repository...
We are trying to figure out how we will know if a dataset has been deleted without having to diff our entire catalog each time we grab new datasets and modified datasets. Any information you can provide would be greatly appreciated!
Thanks,
Alvin Stockdale
Senior Serials Specialist
Metadata Management Program
National Library of Medicine