Bulk corrections/changes on ATOM

Stuart Bligh

Aug 1, 2023, 7:40:03 AM

to AtoM Users

Hi - I'm wondering what the best way is to make bulk changes/corrections on ATOM - for example changes to a storage area reference when a large number of items are moved to a new store. I've searched the forum and can't find any threads relating to this.

I'm guessing one way of doing this could be to export the relevant collection data to a spreadsheet, make the changes using Excel and then import the data back into ATOM again to replace the existing data. I'd be grateful for some advice from anyone who's done bulk changes ... either via export/import or there may be a way of doing it within ATOM itself.

Thanks

Stuart Bligh

Dan Gillean

Aug 1, 2023, 9:00:46 AM

to ica-ato...@googlegroups.com

Hi Stuart,

It definitely depends on what changes you are trying to make, and to what entities. Some things will be easier via the user interface - for example, editing a single controlled vocabulary term will update it automatically on all linked records. The same is true for storage containers linked to descriptions, authorities, etc. Additionally, if you use the Move module to move a description from one parent record to another, all descendant records will also be moved.

Similarly, AtoM uses inheritance in description hierarchies for repository names and creator names, so if you link these just once at the appropriate level, and then edit them at that one level when needed, the changes will automatically be inherited at lower levels. In fact, there is now a command-line task that will check for unnecessary direct links between authorities and descriptions that would have the same outcome if inheritance were used instead (because this can have performance impacts) - see:

https://www.accesstomemory.org/docs/latest/admin-manual/maintenance/cli-tools/#unlink-creators-from-child-descriptions-and-reapply-inheritance-to-hierarchy

There are a number of other CLI tasks that can be handy for data cleanup in certain situations - for example:

There are also a few bulk operations that can be done using SQL. I strongly encourage you to make a database backup before pursuing this option; we have basic instructions on how to do that on the same page:

https://www.accesstomemory.org/docs/latest/admin-manual/maintenance/common-atom-queries/

If you are going to use CSV import and export to update some descriptions, it's also a good idea to make a backup first (just in case!). Additionally, your update process will be MUCH easier if you can use the command-line to run the re-import. The user interface documentation still provides a good overview of the process and how to prepare your CSV:

https://www.accesstomemory.org/en/docs/latest/user-manual/import-export/csv-import/#update-existing-descriptions-via-csv-import

We have tried to update these docs to make them as clear as possible, but... long story short, the original matching logic was not very intuitive, and mostly designed for system to system updates, not roundtripping in a single system. For more detail on how matching works, why it doesn't work great when roundtripping in a single system, and some suggestions for working around these limitations, see:

https://groups.google.com/g/ica-atom-users/c/HMoN4tEuJ10/m/n4e1gqftBQAJ
Same info, but another version: https://groups.google.com/g/ica-atom-users/c/bCWYFIkLCRs/m/7QTPfLnQAQAJ

However, the command-line task has an additional --roundtrip option that we added as a stop-gap to help address the challenges of the user interface roundtrip process. When used, it essentially ignores all existing matching logic, and instead looks for an exact match on the AtoM database objectID (a system-wide unique value), which is what AtoM will add to the legacyId column of the CSV template on export. This bypasses all the challenges with the current system and works much more reliably for roundtrips in a single system - in the future we hope to add user interface support for this option. In the meantime, see:

https://www.accesstomemory.org/docs/latest/admin-manual/maintenance/cli-import-export/#importing-archival-descriptions

So the basic process would be:

1. Select the descriptions you want to update, add them to the clipboard, and export them.
2. Open the CSV in a spreadsheet application.
3. It's not required, but you can remove columns you don't intend to change - just make sure you keep and DO NOT CHANGE the legacyId column.
4. Make your changes and save.
5. Use the command-line import task, with the --update="match-and-update" and --roundtrip options. I also suggest using the --skip-unmatched option - that way, if no match is found for a particular row, it will not fall back to the default of creating a new (i.e. duplicate) description, and will just skip to the next row instead.
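
For anyone who would rather script the spreadsheet step, here is a minimal Python sketch of the edit and re-import, not an official AtoM tool. It rewrites one column of an exported CSV while copying legacyId through untouched, and assembles the import command from the options named above. Assumptions: the column name accessConditions and its values are purely illustrative (check the header row of your own export); the task name csv:import comes from the AtoM CLI documentation rather than this thread, so verify it against your version; the file path is a placeholder.

```python
import csv
import io


def update_export(src, dst, column, old, new, keep=None):
    """Copy an exported CSV from src to dst, replacing `old` with `new` in `column`.

    legacyId is always kept and never modified. If `keep` is given, only
    legacyId, `column`, and the columns listed in `keep` are written.
    Returns the number of rows changed.
    """
    if column == "legacyId":
        raise ValueError("legacyId must not be edited; --roundtrip matches on it")
    reader = csv.DictReader(src)
    fields = reader.fieldnames or []
    for required in ("legacyId", column):
        if required not in fields:
            raise ValueError(f"column {required!r} not found in export")
    if keep is None:
        out_fields = list(fields)          # keep every exported column
    else:
        wanted = {"legacyId", column, *keep}
        out_fields = [f for f in fields if f in wanted]
    writer = csv.DictWriter(dst, fieldnames=out_fields,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    changed = 0
    for row in reader:
        if row[column] == old:
            row[column] = new
            changed += 1
        writer.writerow(row)
    return changed


def import_command(csv_path):
    """Build the re-import command as an argument list (it is not executed here).

    The task name csv:import is an assumption taken from the AtoM docs;
    the three options are the ones recommended in this thread.
    """
    return ["php", "symfony", "csv:import",
            "--update=match-and-update", "--roundtrip", "--skip-unmatched",
            csv_path]


# Illustrative data only: a tiny stand-in for a clipboard export.
exported = io.StringIO(
    "legacyId,title,accessConditions\n"
    "101,Minute book,Closed\n"
    "102,Ledger,Open\n"
)
edited = io.StringIO()
changed = update_export(exported, edited, "accessConditions",
                        "Closed", "Open", keep=[])
print(changed, "row(s) changed")
print(" ".join(import_command("/path/to/edited.csv")))
```

With real files, open both with newline="" and encoding="utf-8" and pass the file objects in place of the StringIO buffers. The command would be run from the AtoM root directory, after taking a database backup.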

Be sure to spot-check the outcome afterwards. If you make a database backup first, you have an easy way to roll back if something unexpected happens, which is why we generally recommend a backup any time you are making bulk changes to the system.

Good luck!

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056

@accesstomemory

he / him

Stuart Bligh

Aug 1, 2023, 10:26:54 AM

to ica-ato...@googlegroups.com

Thanks again Dan - much appreciated ... I'll have a read through and try some of your suggested solutions ....

Stuart Bligh

Archive Advisor

Mob: 07949377526

Max Communications Ltd.
2-3, Gunnery Terrace
Cornwallis Road
London SE18 6SW

www.maxcommunications.co.uk

www.royalwarrant.org/company/max-communications-ltd

Ed Warga

Sep 26, 2023, 4:38:20 PM

to AtoM Users

Hello Dan,

I hope it is okay to respond here. I am working on bulk updates, too. I am having trouble with something you describe in your response above, "Select the descriptions you want to update, add them to the clipboard, and export them". When I try to export the descriptions from the clipboard, the job never completes. It hangs and just stops progressing. What can I do to troubleshoot this? I am running version 2.6.

Thank you,

Ed Warga

Library Systems Specialist

St. Cloud State University Library

Dan Gillean

Sep 27, 2023, 8:26:04 AM

to ica-ato...@googlegroups.com

Hi Ed,

How many descriptions are you adding at once, and how much memory does your AtoM installation have? It's possible that the export is exhausting resources. There are some useful tips for managing the job scheduler on this page of the docs:

https://www.accesstomemory.org/docs/latest/admin-manual/maintenance/asynchronous-jobs/#gearman-job-worker-management

I will highlight the key points for you below:

First, from the root installation directory (typically /usr/share/nginx/atom if you followed our recommended installation instructions), you can:

Check the status of the atom-worker with: sudo systemctl status atom-worker
Kill all stalled and running jobs with: php symfony jobs:clear

Keep in mind the "all" part of the command above - if you have other jobs in the queue waiting behind the stalled job that you don't want to lose and have to manually restart, you can also try killing only the job that is stalled. This is a bit more involved, however, as you will need to use SQL queries against the database to do so. Some instructions:

Now the atom-worker managed by the job scheduler does have its own ability to attempt a restart when it hits a problem and stalls. However, to prevent it from getting caught in an endless loop of restarting when doing so won't actually resolve the issue, we've added a failure limit counter. If there are more than 3 restart attempts in 24 hours, then this fail counter is filled, and the restart command won't work until the counter is reset. So, let's next reset the fail counter, and then restart the atom-worker:

sudo systemctl reset-failed atom-worker
sudo systemctl restart atom-worker
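
To picture why the reset comes before the restart, here is a toy Python model of the failure limit as described above. It is only an illustration of the stated behaviour (3 restart attempts per 24 hours, then refusal until reset), not how systemd or the atom-worker actually implements it; the class and method names are hypothetical.

```python
from datetime import datetime, timedelta


class RestartLimiter:
    """Toy model: allow at most `limit` restart attempts per `window`."""

    def __init__(self, limit=3, window=timedelta(hours=24)):
        self.limit = limit
        self.window = window
        self.attempts = []  # timestamps of restart attempts still in the window

    def try_restart(self, now):
        # Forget attempts that have aged out of the window.
        self.attempts = [t for t in self.attempts if now - t < self.window]
        if len(self.attempts) >= self.limit:
            return False  # counter is full: restart refused
        self.attempts.append(now)
        return True

    def reset_failed(self):
        # Plays the role of `systemctl reset-failed` in this model.
        self.attempts.clear()


limiter = RestartLimiter()
start = datetime(2023, 9, 27, 8, 0)
results = [limiter.try_restart(start + timedelta(hours=h)) for h in range(4)]
print(results)  # [True, True, True, False]

limiter.reset_failed()
print(limiter.try_restart(start + timedelta(hours=4)))  # True
```

In the model, the fourth attempt inside 24 hours is refused, and only clearing the counter lets the next restart through, which mirrors the order of the two commands above.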

Now, try your export again. Hopefully everything will work!

If it doesn't, then it's time for us to get a bit more information. First, the job scheduler has its own error log that you can check - please share any relevant error message you find when running:

sudo journalctl -f -u atom-worker

It's also helpful to know a bit more about your AtoM installation. You mentioned version 2.6 - can you give me the full version number found in Admin > Settings?

Additionally, we provide some recommended hardware minimum targets for a production-ready AtoM deployment - does your installation meet or surpass these?

Processor: 2 vCPUs @ 2.3GHz
Memory: 7GB
Disk space (processing): 50GB at a minimum for AtoM’s core stack; more storage would be required to support any substantial number of digital objects.
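
If it helps to check a server against these targets quickly, the following is a rough Python sketch, assuming a Linux host. The function names are hypothetical, clock speed is not checked, and a machine sold as "7GB" may report slightly less usable memory, so treat a near miss as a prompt to look closer rather than a verdict.

```python
import os
import shutil

# Recommended minimums quoted above.
MIN_CPUS = 2
MIN_MEMORY_GB = 7
MIN_DISK_GB = 50


def shortfalls(cpus, memory_gb, disk_gb):
    """Return a list of messages for each resource below the recommended minimum."""
    problems = []
    if cpus < MIN_CPUS:
        problems.append(f"cpus: {cpus} is below the recommended {MIN_CPUS}")
    if memory_gb < MIN_MEMORY_GB:
        problems.append(
            f"memory: {memory_gb:.1f} GB is below the recommended {MIN_MEMORY_GB} GB")
    if disk_gb < MIN_DISK_GB:
        problems.append(
            f"disk: {disk_gb:.1f} GB is below the recommended {MIN_DISK_GB} GB")
    return problems


def probe(path="/"):
    """Read CPU count, physical memory, and total disk size for `path` (Linux)."""
    cpus = os.cpu_count() or 0
    memory_gb = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 1024 ** 3
    disk_gb = shutil.disk_usage(path).total / 1024 ** 3
    return cpus, memory_gb, disk_gb


if __name__ == "__main__":
    found = shortfalls(*probe())
    for problem in found:
        print(problem)
    if not found:
        print("meets the recommended minimums (clock speed not checked)")
```

Pass the mount point that holds the AtoM installation to probe() if it is not on the root filesystem.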

Finally, did you follow our recommended installation instructions for 2.6? If not, what changes have been made locally? Does your site have any local customizations (including a custom theme plugin, etc) that we should know about?

Hopefully those initial steps will sort things out for you - if not, then with a bit more information, we can likely provide further suggestions on next steps.

Cheers,
