Disabling nested set while deleting


Raphaël Barman

Nov 7, 2023, 4:19:40 AM
to AtoM Users
Hi,

We use AtoM with a large number of archival descriptions (around 3.9 million at the moment), and we import many single descriptions per day (around 500 new descriptions per day) as EAD XML via curl requests (so through the "UI" import and not the CLI import).

The biggest bottleneck has always been the nested set update that occurs when importing data.
Since this nested set update can be disabled for many of the imports using the command line (--skip-nested-set-build), we made a small change to the XML import code so that the nested set update is also disabled there. We felt relatively safe doing so since this behavior was already implemented. We then use a cron job that rebuilds the nested set nightly.
This works great for us as we are OK with not having a consistent nested set on the day of the import since everything is correctly available the day after.
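To illustrate what such a nightly rebuild amounts to, here is a minimal in-memory sketch in Python. It assumes each description only stores its parent id and recomputes the lft/rgt bounds with one depth-first pass. The function name and the dictionary layout are hypothetical and chosen for illustration; this is not AtoM's actual rebuild task.

```python
from collections import defaultdict


def rebuild_nested_set(parents):
    """Compute (lft, rgt) for every node from a {node_id: parent_id} map.

    Root nodes have parent_id None. Illustrative only: a real rebuild
    would read from and write to the database.
    """
    children = defaultdict(list)
    for node_id, parent_id in parents.items():
        children[parent_id].append(node_id)

    lft, rgt = {}, {}
    counter = 0
    # Iterative depth-first traversal, so deep hierarchies do not hit
    # Python's recursion limit.
    stack = [(root, False) for root in sorted(children[None], reverse=True)]
    while stack:
        node_id, leaving = stack.pop()
        counter += 1
        if leaving:
            rgt[node_id] = counter
            continue
        lft[node_id] = counter
        stack.append((node_id, True))
        for child in sorted(children[node_id], reverse=True):
            stack.append((child, False))
    return {n: (lft[n], rgt[n]) for n in parents}


bounds = rebuild_nested_set({1: None, 2: 1, 3: 1, 4: 2})
print(bounds)
```

The point of the sketch is that a full rebuild touches every row once, whatever the number of imports since the last run, which is why batching it into a nightly job can be cheaper than updating on every import.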

We now have the issue that we sometimes want to delete many descriptions because of errors unrelated to AtoM (usually bad data was sent to the import).

Since the deletion also updates the nested set, it can sometimes take more than a minute to delete a single description, which makes it quite unusable. Even worse, since a deletion does not run as a background task but in the "foreground", and it locks the tables it updates, all our import tasks fail while a deletion is being performed.
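As a rough illustration of why a single deletion can be so expensive in a nested set, the following Python sketch removes one node and closes the gap it leaves. Every row located "to the right" of the deleted node has its lft and/or rgt shifted. The function and the in-memory dictionary are hypothetical; the real work happens in SQL UPDATE statements.

```python
def delete_subtree(bounds, node_id):
    """Remove node_id and its descendants, then close the gap.

    bounds maps node id -> (lft, rgt). Returns (new_bounds, rows_updated),
    where rows_updated counts surviving rows whose lft or rgt changed.
    """
    lft, rgt = bounds[node_id]
    width = rgt - lft + 1
    new_bounds = {}
    rows_updated = 0
    for other, (o_lft, o_rgt) in bounds.items():
        if lft <= o_lft and o_rgt <= rgt:
            continue  # inside the deleted subtree
        # Shift every bound that lies after the deleted interval.
        n_lft = o_lft - width if o_lft > rgt else o_lft
        n_rgt = o_rgt - width if o_rgt > rgt else o_rgt
        if (n_lft, n_rgt) != (o_lft, o_rgt):
            rows_updated += 1
        new_bounds[other] = (n_lft, n_rgt)
    return new_bounds, rows_updated


tree = {1: (1, 8), 2: (2, 5), 3: (6, 7), 4: (3, 4)}
remaining, touched = delete_subtree(tree, 4)
print(remaining, touched)
```

In this four-node example, deleting a single leaf changes all three remaining rows. With millions of rows, one deletion near the start of the tree can rewrite a large share of the table, which is consistent with the long runtimes and table locks described above.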

We would like to solve this issue in the same way as we did for the import, that is, by also disabling the nested set update during the deletion of a description through a small change to the deletion code.
However, since this is not something that is done by other CLI tasks (as was the case for the import), we are unsure if there are some side effects that may occur when deleting and importing data on an "inconsistent" nested set.
We ran a few tests and have not seen any issues so far, but it would be great if someone from Artefactual could chime in and give some insight on whether this is an acceptable solution.
I have attached the patch of the changes we did.

Thanks a lot for your input,
Best,
Raphaël
disable_nested_set_during_deleting.patch

José Raddaoui

Nov 7, 2023, 9:46:09 AM
to AtoM Users
Hi Raphaël,

You are totally right about the nested set being one of the biggest bottlenecks in the application. We have the following tickets (and many others related to them) explaining the situation and some improvements we have been able to make over time, especially since the introduction of CTE queries in MySQL 8.


We were able to improve the situation a little in the 2.6.x release, where we implemented changes similar to the ones you are describing for both operations:

- Avoid nested set update on CSV import (commit) (ticket)
- Normalize hierarchy deletion (commit) (ticket)

I wonder if you are using 2.6.x or higher. If not, and considering the size of your collection, I'd try to upgrade to the latest release. You'll also notice performance improvements in the indexing process (which affects the CSV import and many other operations).

Best regards,
Radda.

Raphaël Barman

Nov 13, 2023, 10:31:40 AM
to AtoM Users
Hi Radda,

Thanks a lot for the useful feedback!

We are using 2.6.4, so we should have the changes you mentioned.
I had already seen that the normalization of the hierarchy was waiting until the last deletion. Would that mean that completely disabling the normalization during a deletion and running a normalization once per day is an acceptable solution?

Best,
Raphaël

José Raddaoui

Nov 13, 2023, 1:25:32 PM
to AtoM Users
Hi Raphaël,

Waiting until the last deletion may be a good enough optimization to the point where you don't need to disable it. If it's still an issue for you, then I think it's an acceptable solution as long as you are okay with some inconsistency in the hierarchy during that period (mostly in the full-width tree-view and exports).
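The general idea of deferring the renumbering can be sketched in Python as follows. This is a generic nested set illustration with hypothetical function names, not the code from the 2.6.x commits: several subtrees are dropped while leaving gaps in the lft/rgt values, and a single pass then renumbers everything.

```python
def delete_without_renumbering(bounds, node_ids):
    """Drop each node and its descendants, leaving gaps in lft/rgt."""
    doomed = [bounds[n] for n in node_ids if n in bounds]
    return {
        n: (l, r)
        for n, (l, r) in bounds.items()
        if not any(dl <= l and r <= dr for dl, dr in doomed)
    }


def normalize(bounds):
    """Renumber all lft/rgt values to 1..2N in one pass, keeping their order."""
    values = sorted(v for pair in bounds.values() for v in pair)
    rank = {v: i + 1 for i, v in enumerate(values)}
    return {n: (rank[l], rank[r]) for n, (l, r) in bounds.items()}


tree = {1: (1, 10), 2: (2, 5), 3: (6, 7), 4: (3, 4), 5: (8, 9)}
gapped = delete_without_renumbering(tree, [4, 3])
print(gapped)
print(normalize(gapped))
```

In this sketch the gapped values still keep their relative order, so ancestor and descendant comparisons between surviving rows remain valid. Anything that assumes contiguous numbering, such as deriving a descendant count from (rgt - lft - 1) / 2, is off until the normalization runs.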

Best regards,
Radda.