Is there a way in the db to remove it? But then there are the 14,000+ files to remove from S3? And lots of table dependencies.
Any advice?
Thanks,
Sherry Lake
UVA Dataverse http://dataverse.lib.virginia.edu
Sherry,
Looks like you were discussing this in Zulip, but I can't tell if you've solved your issue. As you noted there, the 503 doesn't mean that Dataverse has failed, just that the timeout you have set for responses was shorter than the time the delete takes. If the delete did not finish for some reason, I suspect you could delete the datafile and filemetadata entries associated with that dataset/version in the db. It would be more work if there are categories/tags associated with the files, but if these were bulk uploaded, hopefully there aren't many. Cleaning up the physical files would be a pain, except that we now have the https://guides.dataverse.org/en/6.2/api/native-api.html#cleanup-storage-of-a-dataset API call, which will remove physical files that are not associated with the dataset in the database.
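In case it helps, here's a rough Python sketch of that cleanup call as I read the linked guide page (the server URL, API token, and persistent ID below are placeholders, not anything from your install; please double-check the endpoint and parameters against the guide for your version, and start with a dry run):

import requests

# Placeholders (assumptions, not from this thread): your server, a superuser
# API token, and the dataset's persistent identifier.
SERVER_URL = "https://dataverse.example.edu"
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
PERSISTENT_ID = "doi:10.5072/FK2/EXAMPLE"

resp = requests.get(
    f"{SERVER_URL}/api/datasets/:persistentId/cleanStorage",
    headers={"X-Dataverse-key": API_TOKEN},
    params={
        "persistentId": PERSISTENT_ID,
        # Dry run first: report orphaned physical files without deleting them.
        "dryrun": "true",
    },
    timeout=600,  # allow plenty of time with 14,000+ files
)
resp.raise_for_status()
print(resp.json())

Once the dry-run output looks right, you can rerun it with "dryrun": "false" to actually remove the orphaned files from S3.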
There's good news coming too (shameless plug): a new delete files API call that would let you delete the whole list of files at once. (It looks like the delete dataset/delete draft version calls are iterating through n calls to DeleteDatafileCommand right now, so those will still be slow unless/until we adapt them to delete all the files at once like the new API.) There have also been other improvements since 6.2 for dealing with large numbers of files, so, while scaling isn't a solved problem, we're making progress, and hearing about specific issues like this one is useful in planning further work.
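Until that bulk call is available, one possible workaround (just a sketch, and it still makes one request per file, so it will take a while for 14,000+ files, but each individual request should stay well under your proxy timeout) is to list the draft version's files and delete them one at a time with the existing per-file delete endpoint. The endpoints and JSON shape below are my reading of the current native API guide; verify them against the guide for your Dataverse version:

import requests

# Placeholders (assumptions, not from this thread).
SERVER_URL = "https://dataverse.example.edu"
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
PERSISTENT_ID = "doi:10.5072/FK2/EXAMPLE"
HEADERS = {"X-Dataverse-key": API_TOKEN}

# List the files in the draft version of the dataset.
files = requests.get(
    f"{SERVER_URL}/api/datasets/:persistentId/versions/:draft/files",
    headers=HEADERS,
    params={"persistentId": PERSISTENT_ID},
).json()["data"]

# Delete them one at a time; each call is short, so no single request
# should run into the timeout that produced the 503.
for f in files:
    file_id = f["dataFile"]["id"]
    r = requests.delete(f"{SERVER_URL}/api/files/{file_id}", headers=HEADERS)
    print(file_id, r.status_code)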
Hope that helps,
-- Jim