Delete unlinked taxonomy terms (Subjects or Places)

66 views
Skip to first unread message

Rodolfo Peres Rodrigues

unread,
Jul 28, 2023, 4:40:33 PM7/28/23
to AtoM Users
Hello everyone,

We're managing our taxonomy terms (Subjects or Places), we know about the normalization task, but we looking for something else to help us.

There is a lot of terms which aren't associated with any archival descriptions or authority records. Subjects and places currently not used.

Is there a command line tool which is capable to delete taxonomy terms (Subjects or Places) which aren't associated with any archival descriptions or authority records?

Maybe a command just like "php symfony physicalobject:delete-unlinked", but wich identifys subjects and places that are not linked to any archival descriptions ou  authority records and then deletes them.

It will be awesome.

Thanks,
Rodolfo Peres Rodrigues

Dan Gillean

unread,
Jul 31, 2023, 9:58:48 AM7/31/23
to ica-ato...@googlegroups.com
Hi Rodolfo, 

Unfortunately, we do not currently have such a task. It could likely be done with SQL, but it will take some research and testing to figure out how exactly. I don't have a query prepared for you, but I can point you to some of the pieces if you have some basic SQL familiarity! 

First, some basic resources:
A very brief and incomplete overview of AtoM's database schema
  • Because AtoM is a multilingual application, most entities have 2 tables - a base table with fields that don't need translations (such as identifiers, etc), and another table where any string that can be translated lives - this second table will have the same name, but with _i18n appended to the name. This is a common developer shorthand for "internationalization" (because that word is 18 characters long and devs love shortcuts). So for example here are some of the main entities: 
    • Descriptions: information_object and information_object_i18n
    • Authority record: actor and actor_i18n
    • Accessions: accession and accession_i18n
    • Archival institution: repository and repository_i18n
    • etc
  • Despite the above, not ALL data from a single entity will necessarily be on these two tables alone. In many cases there are additional tables where data may be found. For example, there are note and note_i18n tables - so any field in descriptions, actors, terms, etc that is considered a "note" may in fact be stored in these tables and linked to the parent entity. Similarly, there are separate contact_information and contact_information_i18n tables that are used for donors, rightsholders, repositories, and any other entity that has a contact information section. 
  • There are also the property and property_i18n tables - these have been sort of used as key-value pair tables for fields that didn't easily fit into the existing database tables. For example, the description tables were originally modeled around ISAD fields - but the Canadian RAD standard has many fields that don't map easily to the international description standard, so a number of RAD's more unique fields are stored in the property tables instead. 
  • There are tables for taxonomies, and for the terms in a taxonomy. The term table has a taxonomy_id field to link it to the taxonomy to which it belongs
  • Most main entities in AtoM, including terms, are linked to a base object table. The object table holds the unique ID (i.e. the object ID) of each record - usually this is the key on other tables. So for example, the ID field on the term table will typically be the same ID of the related entry on the object table. 
  • Here's the complicated part - because terms can relate to multiple different entities, instead of having a foreign key directly on the terms table, we have a separate object_term_relation table. Each row will have its own ID, but also an object_id (of the related entity) and a term_id  (of the related term) to link them all together. I suspect this table will be important to you, so you can select the right terms for deletion. 
Also note: as a MySQL database, you can get AtoM's db to work with a graphical tool like MySQL Workbench, or PHPMyAdmin, if you prefer a UI to explore and construct queries. 

So, tables you will probably need in your join:
  • object
  • term and term_i18n
  • taxonomy
  • object_term_relation
Remember to make a backup before trying any SQL operation that will make changes! I would try to make your DELETE query from the object table, as I believe that this will cascade to other tables, so you don't have to perform deletes on all those tables that might have parts of the related term data. You will likely need the other tables when making your JOINs in the query, to properly select the right records that don't have entries in the object_term_relation table. 

We DO have a sample query in the docs that can be used to check how many times a particular term (once you know its ID) is used as an access point in a particular culture - this example may also help you construct the correct JOINS across tables. See: 
That is about as much help as I can personally offer at this point! I hope that this will be useful to you! 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him


--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/db58adc5-a6f0-4e92-9859-f434c4c14c08n%40googlegroups.com.

tat...@gmail.com

unread,
Aug 23, 2023, 3:05:52 PM8/23/23
to AtoM Users
Hi Dan,

is there a cli command to delete an authority record?

or API?

Thanks,
Tati Canelhas

Dan Gillean

unread,
Aug 23, 2023, 3:39:30 PM8/23/23
to ica-ato...@googlegroups.com
Hi Tatiana, 

Unfortunately at this time there is not - there is a task to delete a description from the CLI, but not an authority. Additionally, the API endpoints in AtoM are currently geared more towards reading rather than writing. 

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him

tat...@gmail.com

unread,
Aug 24, 2023, 2:17:30 PM8/24/23
to AtoM Users
Thanks Dan!!
:)

Reply all
Reply to author
Forward
0 new messages