API AIP Deletion Request Test Broke Archival Storage Dashboard

Grayson Murphy

Dec 11, 2024, 12:57:08 PM
to archivematica
Hi all, 

This happened on a VM running Archivematica 1.16 on Ubuntu 22.04. We wanted to test bulk-deleting AIPs through the Storage Service API, because an issue on our production instance recently caused AIPs to be ingested that did not meet our specifications.

I created a Python script to automate the Delete Package Request API call by pulling the UUIDs from a CSV file. Here is the script:
-----------------------------------------------------
import csv
import requests
import json
from datetime import datetime

def delete_aip_from_csv(csv_file, api_key, base_url, event_reason, pipeline_uuid, user_id, user_email, log_file):
    headers = {
        "Authorization": f"ApiKey {api_key}",
        "Content-Type": "application/json"
    }

    with open(log_file, "a") as log:
        log.write(f"Log started at {datetime.now()}\n")
       
        with open(csv_file, "r") as file:
            reader = csv.DictReader(file)
            for row in reader:
                uuid = row.get("UUID")
                if not uuid:
                    log.write("Skipping row without UUID.\n")
                    continue

                url = f"{base_url}/api/v2/file/{uuid}/delete_aip/"
                payload = {
                    "event_reason": event_reason,
                    "pipeline": pipeline_uuid,
                    "user_id": user_id,
                    "user_email": user_email
                }

                try:
                    response = requests.post(url, headers=headers, data=json.dumps(payload))
                    # 202 Accepted means the deletion request was created and is awaiting approval.
                    if response.status_code in (200, 202):
                        log.write(f"SUCCESS: Deletion request created for UUID: {uuid}\n")
                    else:
                        log.write(f"FAILURE: UUID: {uuid}, Status: {response.status_code}, Response: {response.text}\n")
                except requests.RequestException as e:
                    log.write(f"ERROR: UUID: {uuid}, Exception: {e}\n")
       
        log.write(f"Log ended at {datetime.now()}\n")

csv_file = ""  
api_key = ""  
base_url = ""  
event_reason = ""
pipeline_uuid = ""  
user_id = None  # placeholder: set to the integer ID of the requesting user before running
user_email = ""  
log_file = ""

delete_aip_from_csv(csv_file, api_key, base_url, event_reason, pipeline_uuid, user_id, user_email, log_file)
----------------------------------------------------------------------
For each package, the API returned the following: Status: 202, Response: {"message": "Delete request created successfully." ...}.
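
A note for anyone adapting the script: HTTP 202 (Accepted) indicates that the deletion request was created, not that the package is already gone; the request still has to be approved in the Storage Service. Below is a minimal sketch of how the log labels could distinguish the two cases. The helper name `classify_delete_response` and the label strings are hypothetical, and treating 202 as the normal outcome is an assumption based only on the response quoted above.

```python
def classify_delete_response(status_code):
    """Map an HTTP status code to a log label for a deletion request.

    Assumption: 202 is the expected status when the request is queued
    for approval, as the response quoted in this thread suggests.
    """
    if status_code == 202:
        return "REQUESTED"  # request created, still awaiting approval
    if 200 <= status_code < 300:
        return "SUCCESS"    # any other 2xx status
    return "FAILURE"        # 4xx/5xx or anything unexpected
```

In the script, `log.write(f"{classify_delete_response(response.status_code)}: UUID: {uuid}\n")` would then replace the two-way check.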

However, when I opened the dashboard, this is what Archival Storage looked like: 
archival_storage.png
As you can see, AIPs no longer show up. Clicking "Download CSV" opens a new window that returns the error "Error accessing AIPs index", which appears to come from an Elasticsearch exception. Clicking "show files" does work, but I am unable to download the files because the Storage Service returns a 500 error.

On the storage service side, the requests showed up and I was able to delete the AIPs. I checked the backend and they were actually deleted as well. 

I have restarted Archivematica and rebooted the VM, to no avail. Any insights are appreciated!

Best, 
Grayson
Univ. of Alabama at Birmingham Libraries

Alfeu Uzai Tavares

Dec 12, 2024, 1:59:36 PM
to archivematica
Hello,

I just started to have a similar problem. I also have automated deletion requests and approval (with a crawler/bot).
It was running without problems for the last few months, but now the 'Archival Storage' tab just displays 'Elasticsearch Error'.
It seems to be related to this part of the code: https://github.com/artefactual/archivematica/blob/7fe56b18c012e7055caf6b9e416a4325e4146894/src/dashboard/src/components/archival_storage/views.py#L63

Is there a way to ask Elasticsearch to update the index for just the deleted AIPs?
Maybe rebuilding the whole index will fix the problem, but that would take a long time.
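
One possible direction, offered as a sketch rather than a confirmed fix: Elasticsearch has a `_delete_by_query` endpoint that removes only the documents matching a query, which would avoid a full reindex. The code below only builds the requests so they can be reviewed before anything is sent. The index names (`aips`, `aipfiles`) and field names (`uuid`, `AIPUUID`) are assumptions about how Archivematica lays out its indexes and should be checked against your own installation; the function names are hypothetical.

```python
import json

# Assumed (index, field) pairs; verify these against your own Elasticsearch.
ASSUMED_TARGETS = [("aips", "uuid"), ("aipfiles", "AIPUUID")]


def build_delete_by_query(es_url, index, field, aip_uuid):
    """Return (url, json_body) for an Elasticsearch _delete_by_query call."""
    url = f"{es_url.rstrip('/')}/{index}/_delete_by_query"
    # A term query matches documents whose field equals the UUID exactly.
    body = json.dumps({"query": {"term": {field: aip_uuid}}})
    return url, body


def requests_for_aip(es_url, aip_uuid, targets=ASSUMED_TARGETS):
    """Build one delete-by-query request per assumed index for a single AIP."""
    return [build_delete_by_query(es_url, index, field, aip_uuid)
            for index, field in targets]
```

Each pair could be printed for review and then sent as a POST with a `Content-Type: application/json` header. Trying this against a test instance or a snapshot first would be prudent, since it is not established in this thread that stale index entries are the cause.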

Alfeu Uzai Tavares

Dec 13, 2024, 12:24:36 PM
to archivematica
Hey, a quick update about the issue.

There were a lot of deletion requests (about 1000) waiting for approval/rejection.
I let the script that automatically approves the deletion run for a while. It was also running much slower than usual.
Now, with about 500 pending requests, the Archival Storage tab is loading again without errors, but very slowly.
Maybe having too many pending deletion requests is what causes the problem.
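
If the backlog of pending requests really is the trigger, a bulk script could throttle itself so the queue never grows that large. Here is a minimal sketch under that assumption. `submit` and `count_pending` are hypothetical callables you would supply yourself (for example, a wrapper around the delete request, and some way of counting requests still awaiting approval); the default threshold of 100 is a guess, not a tested limit.

```python
import time


def submit_throttled(uuids, submit, count_pending,
                     max_pending=100, poll_seconds=30, sleep=time.sleep):
    """Submit deletion requests while keeping the pending queue bounded.

    submit(aip_uuid)  -- hypothetical callable that creates one request
    count_pending()   -- hypothetical callable returning the queue size
    """
    submitted = []
    for aip_uuid in uuids:
        # Wait until approvals have drained the queue below the limit.
        while count_pending() >= max_pending:
            sleep(poll_seconds)
        submit(aip_uuid)
        submitted.append(aip_uuid)
    return submitted
```

The `sleep` parameter is injectable only so the function can be exercised without real waiting; in normal use the default `time.sleep` applies.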