Cleaning and maintenance of the MCP MySQL database


Andrew Berger

Mar 5, 2018, 6:53:30 PM
to Archivematica Tech
Hi,

We've ingested over 5000 packages via Archivematica and over time the MCP database has grown considerably. This slows performance when loading the transfer and ingest tabs, especially following a server restart. It doesn't seem to be a problem when actually running an ingest, but it may be a source of future problems that just aren't visible yet.

Is there a way to clean up the database? As I understand it, the only data that needs to stay in the MCP database is data that relates to running the microservices (and maybe the FPR?). Data related to previous transfers/SIPs could be exported elsewhere. Or, if that data is included in the AIPs themselves, it may not be necessary to keep it at all. 

Beyond that, is there a way to test the "health" of the database? Given that the database generally seems to be modified during each upgrade, it would be helpful to be able to analyze it for potential problems.

Thanks,
Andrew





vincent....@fr.ch

Mar 13, 2018, 2:58:50 AM
to Archivematica Tech
Hi,

I asked a similar question some time ago ("Consolidate Archivematica database") but have had no answer yet.
I've looked at the code for the transfer and ingest tabs in version 1.7, which should improve performance. In 1.6 everything is retrieved and then filtered in Archivematica...

But a way to keep the DB clean is still necessary.

Regards
Vincent

Charles Dixon

Mar 13, 2018, 4:27:55 AM
to Archivematica Tech
Hi,

You may already know about this, and it is far from database cleanup or health checking, but both the SIP and Transfer tables have a column named "hidden" that you can set to 1/true. I've found that this solves issues with page load times. I'm planning on writing a new microservice to add to the end of the process to do this automatically for our successful transfers (our pipeline is automated and involves other systems, so we get an email when it's completed anyway, and we don't actually store data in Archivematica).
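
To make the idea concrete, here is a minimal sketch of flipping the "hidden" flag from a script. It is only an illustration under assumptions: the table names (Transfers, SIPs) and key columns (transferUUID, sipUUID) are my guesses at the schema and should be checked against your own MCP database, hide_units is a hypothetical helper, and an in-memory SQLite database stands in for MySQL so the example runs anywhere. Back up the real database before trying anything like this.

```python
import sqlite3

# ASSUMPTION: table and key-column names; verify against your MCP schema.
TABLES = {"Transfers": "transferUUID", "SIPs": "sipUUID"}


def hide_units(conn, table, uuids):
    """Set hidden = 1 on the given rows; return how many rows changed."""
    if table not in TABLES:
        raise ValueError(f"unexpected table: {table}")
    key = TABLES[table]
    changed = 0
    cur = conn.cursor()
    for unit_uuid in uuids:
        # Table/column names come from the fixed mapping above; the UUID is
        # a bound parameter. SQLite uses "?" here, MySQL drivers use "%s".
        cur.execute(
            f"UPDATE {table} SET hidden = 1 WHERE {key} = ? AND hidden = 0",
            (unit_uuid,),
        )
        changed += cur.rowcount
    conn.commit()
    return changed


def demo():
    """Run hide_units against a throwaway stand-in for the Transfers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Transfers ("
        "transferUUID TEXT PRIMARY KEY, hidden INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO Transfers (transferUUID) VALUES (?)",
        [("t-1",), ("t-2",), ("t-3",)],
    )
    changed = hide_units(conn, "Transfers", ["t-1", "t-3"])
    visible = [
        row[0]
        for row in conn.execute(
            "SELECT transferUUID FROM Transfers WHERE hidden = 0 "
            "ORDER BY transferUUID"
        )
    ]
    return changed, visible
```

Calling demo() hides two of the three stand-in transfers and leaves one visible. Note that this only hides rows from the tabs; it does not remove anything from the database.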

Charles 

vincent....@fr.ch

Mar 13, 2018, 8:14:42 AM
to Archivematica Tech
Hi,

Yes, hiding improves loading a little on the browser side, but in my case this is not enough. My workflow hides the entries automatically (this can be done with a parameter in the automation tools or via the REST API), but the tabs still load slowly.

Regards
Vincent

Timothy Walsh

Mar 16, 2018, 1:27:37 PM
to Archivematica Tech
Hi all,

We've run into the same issue at CCA over time, and I would be interested in a script that periodically prunes unnecessary transfer and ingest information from the MCP database.

If such a script doesn't exist, we could use tickets to sponsor the work. Would anyone else with a support contract be interested in going in on this together?

Tim