Hi Matt,
Unfortunately, AtoM doesn't currently have an established method of updating terms the way that descriptive data can be updated. However, I have identified a simple way you can modify the taxonomy normalize task to accomplish what you want.
First, my warnings: I AM NOT A DEVELOPER. Please back up all your data, and proceed at your own risk. I tested this locally and it worked, but you take on the responsibility for the outcome if you choose to proceed.
The next warning first requires a brief explanation of how this task works. You can see the code for the task in lib/task/taxonomy/taxonomyNormalizeTask.class.php (shown here in our code repository). Essentially, it fetches all terms with an exact match on term name and orders them by the term's object ID. The default sort is ascending. When the merge is executed, it merges the information object relations from any duplicate terms into the first term - which, given the ascending sort order of the term IDs, tends to mean that the oldest term is preserved.
It's important to note that this task does NOT have any capacity to merge the data stored in the terms themselves - that could get rather complicated for non-repeatable fields (like code, broader term, etc.). Do you want to cram the data from both (or multiple) records into a single field? Do you want one to overwrite the other? If so, which? And so on.
Instead, all this task is doing is moving the information object relations (AKA the links to archival descriptions) from the duplicate term to the one that is going to be preserved.
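To make that behaviour concrete, here is a small Python model of the logic as I've described it. It is NOT the task's actual PHP code: the function name, the dictionaries, and the sample IDs are all made up for illustration, and it only mimics "group by exact name, sort by ID ascending, keep the first, re-point the links".

```python
from collections import defaultdict


def normalize(terms, relations):
    """Toy model of the merge (hypothetical, not AtoM code).

    terms:     dict of term id -> term name
    relations: list of (description, term id) links
    """
    # Group term ids by exact name, visiting ids in ascending order
    by_name = defaultdict(list)
    for term_id in sorted(terms):
        by_name[terms[term_id]].append(term_id)

    kept = dict(terms)
    links = list(relations)
    for ids in by_name.values():
        keeper, duplicates = ids[0], ids[1:]  # lowest id (oldest) is kept
        for dup in duplicates:
            # Re-point description links from the duplicate to the keeper
            links = [(d, keeper if t == dup else t) for d, t in links]
            # The duplicate's own field data is discarded, not merged
            del kept[dup]
    return kept, links


terms = {10: "Beverages", 11: "Farming", 42: "Beverages"}
relations = [("description-A", 10), ("description-B", 42), ("description-C", 11)]
kept, links = normalize(terms, relations)
print(kept)   # {10: 'Beverages', 11: 'Farming'}
print(links)  # [('description-A', 10), ('description-B', 10), ('description-C', 11)]
```

Note that term 42 simply disappears: its links survive on term 10, but nothing else about it does.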
Now, if you wanted to make sure that the newest term duplicates (aka your newly imported subjects) are preserved instead of the oldest, then we can make a very small change on line 112 of this task. Right now it reads:
As I mentioned, the default sort here is in ascending order. However, we can modify this line to order the terms in descending order like so:
Here's an image of the modification I made locally:
Hopefully that might help you achieve what you need. However, I do want to pass on one more important warning about hierarchical relations.
Let's say you have the following term hierarchy:
Now, you run your import, and you end up with a new duplicate "Beverages" term, like so:
If you modify the task to sort descending and then run the taxonomy normalize task, what will happen? It turns out that Warm beverages, Coffee, and Tea will all be deleted.
So, put another way, the warning is this: ONLY information object relations are passed from the term(s) to be deleted to the one being kept. Hierarchical relations are NOT. When the task progresses to deleting the duplicate terms, this delete action cascades, so that descendant terms are also deleted.
This means that if you have hierarchical relations in your existing subjects that are not recreated in your new import, then all those child terms will be lost if you attempt this method. If your taxonomy is flat, this won't be an issue, but if not, you may need to do some manual work either before or after running the task to recreate lost terms and relations.
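If it helps to see the cascade spelled out, here is a rough Python model of the Beverages example with the sort flipped to descending. Again, this is only an illustration and not AtoM's code: the IDs, the parent map, and the exact shape of the hierarchy (Coffee and Tea sitting under Warm beverages) are my assumptions.

```python
def descendants(term_id, parents):
    """Return every term id below term_id in the (hypothetical) hierarchy."""
    found = []
    for child, parent in parents.items():
        if parent == term_id:
            found.append(child)
            found.extend(descendants(child, parents))
    return found


def normalize_newest_first(terms, parents, relations):
    """Toy model: keep the highest id per name, cascade-delete the rest."""
    terms = dict(terms)
    relations = list(relations)
    keeper_for = {}  # term name -> id of the term being preserved
    for term_id in sorted(terms, reverse=True):  # descending: newest first
        if term_id not in terms:
            continue  # already removed by an earlier cascade
        name = terms[term_id]
        if name not in keeper_for:
            keeper_for[name] = term_id
            continue
        keeper = keeper_for[name]
        # Only the description links move over to the preserved term
        relations = [(d, keeper if t == term_id else t) for d, t in relations]
        # Deleting the duplicate takes its whole subtree with it
        for doomed in [term_id] + descendants(term_id, parents):
            terms.pop(doomed, None)
    return terms, relations


terms = {1: "Beverages", 2: "Warm beverages", 3: "Coffee", 4: "Tea",
         5: "Beverages"}  # 5 is the newly imported duplicate
parents = {1: None, 2: 1, 3: 2, 4: 2, 5: None}
relations = [("description-A", 1), ("description-B", 5)]

remaining, links = normalize_newest_first(terms, parents, relations)
print(remaining)  # {5: 'Beverages'}
print(links)      # [('description-A', 5), ('description-B', 5)]
```

Both descriptions end up linked to the new Beverages term, but Warm beverages, Coffee, and Tea are gone because they hung off the old one.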
I hope this might still help!
Finally: don't forget to change the task back afterwards - or at least, don't forget about the modification you've made!
Cheers,