How to update records from a csv in this case

Carlos Moreno

Mar 9, 2024, 1:03:50 PM

to AtoM Users

Greetings to this excellent group.
Dan, you are always there to help us. I have a question.

I am preparing a mass migration from a CSV.

This first migration will include around 25 thousand records.

My intention is to prepare AtoM by first creating the hierarchy tree.

That is, create the fonds first, then the subfonds, and then the series and subseries.

Then, in the CSV, I would use the qubitParentSlug column.

One of my questions is whether I can later edit some of those records.

Can I do it with another CSV? If I do not have legacyId or parentId values, will the match be based on the title of the description?

In that case, what would the CLI command for the update be?

Thanks to all.

Dan Gillean

Mar 11, 2024, 8:49:26 AM

to ica-ato...@googlegroups.com

Hi Carlos,

Your plan can work! Just be sure that you make periodic backups as you progress, and review the results carefully across all related entities (not just descriptions) after an import before determining whether or not to proceed with the next part of your migration project.

The short version of my suggestions is:

Currently the best way to attempt updates to existing records in AtoM via CSV import is by using the command-line task with the --roundtrip option. With this method you would essentially:

Make a backup of your data while it's in a good state! See:

https://www.accesstomemory.org/docs/latest/admin-manual/maintenance/common-atom-queries/#backing-up-the-database

Export all the records you need to update
Make changes to the relevant fields. However, do NOT change the legacyId value that is included in your export, as this is the matching criterion used!
Import via the command-line using the --update="match-and-update" option along with the --roundtrip option
Perform a careful review of the outcome. Be sure to check any related entities as well (such as authority records, access point terms, etc.)
If things look good, then make another backup, and proceed!
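
Before running the import step above, it can help to confirm that the edited CSV still carries exactly the same legacyId values as the original export. The following is a minimal Python sketch of that check, not part of AtoM itself. The function names are hypothetical, the CSV text is made-up sample data, and the `php symfony import:csv` task name is assumed from the AtoM command-line documentation linked in this thread; only the --update and --roundtrip options come from the steps above. The command is only assembled as a list of arguments, not executed.

```python
import csv
import io


def legacy_ids(csv_text):
    """Return the legacyId values of a CSV, in row order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["legacyId"] for row in reader]


def changed_legacy_ids(exported_text, edited_text):
    """Return (missing, unexpected) legacyId sets between export and edited copy."""
    before = set(legacy_ids(exported_text))
    after = set(legacy_ids(edited_text))
    return before - after, after - before


def roundtrip_update_command(csv_path):
    """Build the argument list for the update import (assumed task name, not run here)."""
    return [
        "php", "symfony", "import:csv",
        "--update=match-and-update",
        "--roundtrip",
        csv_path,
    ]


# Made-up sample data: only the title of the first row was edited.
exported = "legacyId,title\n101,Old title\n102,Other title\n"
edited = "legacyId,title\n101,New title\n102,Other title\n"

missing, unexpected = changed_legacy_ids(exported, edited)
if missing or unexpected:
    print("legacyId values changed; fix the CSV before importing")
else:
    print(" ".join(roundtrip_update_command("edited-export.csv")))
```

In practice the two CSV texts would be read from the exported file and the edited copy. If either set is non-empty, a legacyId was altered or a row was added or dropped, and the matching described above may not behave as expected.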

For this to be successful in the future, it's important that you understand how AtoM's import matching and update logic currently works, and the limitations of each method. Some helpful links:

Be sure you understand which fields can and can't be updated via import - see:

https://www.accesstomemory.org/docs/latest/user-manual/import-export/csv-import/#update-existing-descriptions-via-csv-import

If your content is multilingual, you will also want to review how translation rows are handled with multilingual content in import and export:

https://www.accesstomemory.org/docs/latest/user-manual/import-export/csv-import/#importing-translations

In general, just be sure to review all of the CSV import documentation and be sure you are clear on the formatting expectations for each field type. See:

https://accesstomemory.org/docs/latest/user-manual/import-export/csv-import/

We also have the following slide deck which helps summarize the key points for preparing archival description CSV files for import:

https://www.slideshare.net/accesstomemory/csv-import-in-atom

Additionally, in 2.7 we now have CSV validation, which can check for common issues in CSV import files, and is supported through both the user interface and the command-line. See:

For more information on the --roundtrip option and other command-line task details, see:

https://www.accesstomemory.org/docs/latest/admin-manual/maintenance/cli-import-export/#importing-archival-descriptions

Currently, the --roundtrip option is only supported via the command-line. We hope to add support for it to the user interface in the future, but for now, if you are trying to update existing data in your own system via CSV import, your best bet for success is to export it first, edit the exported CSV, and reimport it. It also helps to understand a bit about the history of AtoM's CSV update import and why it currently works the way it does. For a longer overview of how the update import was originally designed, and why matching is hard to do, please see the following older forum threads:

https://groups.google.com/g/ica-atom-users/c/HMoN4tEuJ10/m/n4e1gqftBQAJ
Same info, but another version: https://groups.google.com/g/ica-atom-users/c/bCWYFIkLCRs/m/7QTPfLnQAQAJ

So, with all that in mind, one more consideration, for the initial migration imports:

If possible, I do recommend that you perform your initial imports via the command-line. The --roundtrip option WILL work in the future even if you don't assign legacyId values on the initial imports. However, you don't know how else you will want to use this data in the future, so I would recommend you do all you can to ensure success given the quirks of how AtoM's import and export matching logic currently works.

This means: if possible, I suggest that you take the time to assign each row in your prepared imports a unique legacyId value. This can be made up, and it can be alphanumeric: start at A000001 or similar (the "A" is just so you don't have to fight with the auto-formatting that many spreadsheet applications apply to numbers with leading zeroes). Additionally, when importing, you might want to break up those 25K records into a few CSVs, and use the --source-name CLI option to give each a related, unique source name on import, such as "migration-001". By default, AtoM uses the legacyId value and the source name of a CSV as some of the matching criteria for update imports, so it's helpful to have those populated just in case, especially if there may be updates coming from external sources.
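
As an illustration of the numbering and splitting suggested above, here is a small Python sketch. It is only one possible way to prepare the rows, under stated assumptions: the function names, the batch size, and the sample rows are all hypothetical, and the rows are plain dictionaries standing in for CSV rows.

```python
def assign_legacy_ids(rows, prefix="A", width=6, start=1):
    """Return copies of the rows with a sequential legacyId such as A000001."""
    return [
        {**row, "legacyId": f"{prefix}{number:0{width}d}"}
        for number, row in enumerate(rows, start=start)
    ]


def split_into_batches(rows, batch_size):
    """Yield (source_name, batch) pairs, e.g. ('migration-001', [...])."""
    for index in range(0, len(rows), batch_size):
        batch_number = index // batch_size + 1
        yield f"migration-{batch_number:03d}", rows[index:index + batch_size]


# Made-up sample rows standing in for the prepared import data.
rows = [{"title": f"Item {i}"} for i in range(1, 6)]
numbered = assign_legacy_ids(rows)

for source_name, batch in split_into_batches(numbered, batch_size=2):
    first, last = batch[0]["legacyId"], batch[-1]["legacyId"]
    print(f"{source_name}: {len(batch)} rows ({first} to {last})")
```

Each batch would then be written to its own CSV and imported with the matching --source-name value. This sketch splits purely by row count; if rows refer to one another (for example through parentId), it is probably safer to keep related rows in the same file, which the sketch does not attempt.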

Finally, one more reason to use the command-line for all stages of this process: known issues with non-English data on AtoM's exports.

I believe the maintainers are hoping to patch this in the next major release, but for now: AtoM has some known problems exporting non-English data from the user interface. The short version is that not all non-English data is properly exported when using the user interface. However, if you run a bulk CSV export from the command-line, all data across all cultures should be exported. See:

Bug ticket: https://projects.artefactual.com/issues/12155
Wish list enhancement ticket: https://projects.artefactual.com/issues/12107
Related forum thread where these issues were first discussed: https://groups.google.com/g/ica-atom-users/c/HSNdPCY-Ey4/m/duHABP1_AAAJ
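
One way to review an export for the issue described above is to count how many rows it contains per language. The Python sketch below assumes the exported CSV has a `culture` column holding a language code for each row; the function name and the sample data are hypothetical, and this is a sanity check rather than anything AtoM provides.

```python
import csv
import io
from collections import Counter


def rows_per_culture(csv_text):
    """Count the rows of a CSV by the value of its culture column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["culture"] for row in reader)


# Made-up sample export with one translated row.
sample_export = (
    "legacyId,title,culture\n"
    "1,Fonds A,en\n"
    "1,Fondo A,es\n"
    "2,Series B,en\n"
)

for culture, count in sorted(rows_per_culture(sample_export).items()):
    print(culture, count)
```

If a culture you know is present in the database is missing from the counts, or has far fewer rows than expected, that suggests the export did not include all translations.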

As I said, I believe that the maintainers are working on a patch so that, by default, all rows across all cultures are exported from the UI as well. In the meantime, I hope the information here will help you succeed in your project, and bypass some of AtoM's known quirks and challenges.

Good luck!

Cheers,

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056

@accesstomemory

he / him

Carlos Moreno

Mar 13, 2024, 4:58:36 AM

to ica-ato...@googlegroups.com

Hi Dan, and thanks again for your guidance.

In fact, I have managed to do the update correctly by following your instructions. It's the safest way.

I will also keep in mind the suggestion to split the import into parts, to keep the process safe and under better control.

Thanks for your answers.
