sally-an...@york.ac.uk

Sep 23, 2020, 3:43:12 AM
to AtoM Users
Hello! I have a quick question for anyone who is running an AtoM instance with a large number of archival descriptions or particularly large archive catalogues. My colleagues and I have noticed that our instance of AtoM has been slower than usual in recent months. We've been importing additional CSV files into existing (and very large) catalogues, and even a small CSV import can take several hours to process. In one case the import ultimately crashed, leaving us with 42 'active jobs' that won't complete and that we can't delete from the front end either.

We've been in contact with our IT support and they can't see anything wrong at their end, although they did wonder if we might be hitting the PHP execution limits. I was wondering whether the slowdown is due to the large size of the catalogues we're adding to. When AtoM adds to a catalogue, does it somehow reload or reprocess the whole thing, or just the small section we're adding?

Thanks for any advice on this!

José Raddaoui

Sep 23, 2020, 11:23:28 AM
to AtoM Users
Hi there,

You're on the right track: the number of descriptions in the instance can affect the speed and the resources needed for those import processes. AtoM uses a nested set implementation for the hierarchical structure of archival descriptions. This has its benefits for reading, but it requires expensive SQL queries to insert, delete, or move resources in the tree. How expensive those queries are depends on the number of descriptions and on where in the hierarchy they are imported.
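
To make that cost concrete, here is a minimal Python sketch of a nested set insert. It is an illustration only and not AtoM's actual code: the `Node` class, the `insert_child` and `descendants` helpers, and the four-node demo tree are all hypothetical. Each node stores a left/right pair, reading a whole subtree is a single range check, but adding one node means renumbering every node whose boundaries sit at or after the insertion point.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    lft: int
    rgt: int


def insert_child(nodes, parent_name, child_name):
    """Append a last child under parent; return how many existing rows changed."""
    parent = next(n for n in nodes if n.name == parent_name)
    boundary = parent.rgt  # the new node will occupy boundary and boundary + 1
    rewritten = 0
    for n in nodes:
        before = (n.lft, n.rgt)
        # Open a gap of width 2 by shifting every boundary at or after the insert point.
        if n.rgt >= boundary:
            n.rgt += 2
        if n.lft > boundary:
            n.lft += 2
        if (n.lft, n.rgt) != before:
            rewritten += 1
    nodes.append(Node(child_name, boundary, boundary + 1))
    return rewritten


def descendants(nodes, parent_name):
    """Reading a subtree is cheap: one range comparison per node, no recursion."""
    parent = next(n for n in nodes if n.name == parent_name)
    return [n.name for n in nodes if parent.lft < n.lft and n.rgt < parent.rgt]


def demo_tree():
    # Hypothetical catalogue: a root with two fonds, one of which has one item.
    return [
        Node("root", 1, 8),
        Node("Fonds A", 2, 5),
        Node("Item A1", 3, 4),
        Node("Fonds B", 6, 7),
    ]


if __name__ == "__main__":
    tree = demo_tree()
    changed = insert_child(tree, "Fonds A", "Item A2")
    print(f"rows rewritten by one insert: {changed} of 4")
    print(descendants(tree, "Fonds A"))
```

In this toy tree, adding a single item under "Fonds A" rewrites three of the four existing rows. In a catalogue with hundreds of thousands of descriptions, an insert near the start of the tree can touch most of the table in the same way, which is why the position of the import matters.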

Recently, thanks to the support for CTE (common table expression) queries in MySQL 8, we've been trying to improve this situation by moving to a mix of adjacency lists and nested sets until we can fully remove the latter. So my first suggestion would be to try AtoM 2.6 which, apart from the CTE queries, includes several enhancements to the CSV import process.
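
For readers unfamiliar with the adjacency list approach, the sketch below shows the general idea. It uses Python's built-in `sqlite3` module as a stand-in for MySQL 8 (both accept `WITH RECURSIVE`), and the `description` table, its columns, and the sample rows are assumptions for illustration, not AtoM's real schema. Each row only stores its parent's id, so adding a description is a single `INSERT`, and a recursive CTE walks the tree when a whole subtree is needed.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory adjacency list table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE description (id INTEGER PRIMARY KEY, parent_id INTEGER, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO description (id, parent_id, title) VALUES (?, ?, ?)",
        [
            (1, None, "root"),
            (2, 1, "Fonds A"),
            (3, 2, "Item A1"),
            (4, 1, "Fonds B"),
        ],
    )
    return conn


def add_description(conn, new_id, parent_id, title):
    """Inserting touches exactly one row; no other rows are renumbered."""
    conn.execute(
        "INSERT INTO description (id, parent_id, title) VALUES (?, ?, ?)",
        (new_id, parent_id, title),
    )


def descendants(conn, root_id):
    """Return (id, title, depth) for root_id and everything beneath it."""
    return conn.execute(
        """
        WITH RECURSIVE subtree(id, title, depth) AS (
            SELECT id, title, 0 FROM description WHERE id = ?
            UNION ALL
            SELECT d.id, d.title, s.depth + 1
            FROM description AS d
            JOIN subtree AS s ON d.parent_id = s.id
        )
        SELECT id, title, depth FROM subtree ORDER BY depth, id
        """,
        (root_id,),
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    add_description(conn, 5, 2, "Item A2")
    for row in descendants(conn, 2):
        print(row)
```

The trade-off is the reverse of the nested set: writes become cheap and reads of a subtree need the recursive query, which is presumably why a mix of the two is used during the transition.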

Alternatively, if upgrading is not an option, you could try running these imports from the CLI with the --skip-nested-set-build option, more info here.

Best regards,
Radda.

Dan Gillean

Sep 23, 2020, 4:28:20 PM
to ICA-AtoM Users
Just a quick follow up on this: 

If you have jobs stuck in the queue, there are a couple ways you can clear these on the backend. 

First, there is a command that will terminate all running jobs: 
  • php symfony jobs:clear
Keep in mind that this will kill ALL jobs, so if there are some in the queue you actually want, you'll need to restart the process afterwards (i.e. redo your import, etc). 

See: 
Additionally, we've also provided a SQL query that can be used to terminate individual jobs. See: 
Radda is correct to point out that using the command line and disabling the nested set build will definitely improve the speed. On the command line, indexing as the task progresses is also disabled by default, whereas in the user interface each row triggers an update to both the nested set and the index. It is generally these two processes that make additions to large hierarchies slow. So long as you remember to run the tasks to rebuild the nested set and repopulate the search index after your import, using the command line for these imports will likely vastly improve the process, as will upgrading to 2.6 where, as Radda mentioned, we've added a number of performance optimizations.
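
As a rough sketch of that sequence (import first, then rebuild the nested set, then repopulate the index), the small Python wrapper below builds and optionally runs the three commands in order. Only `--skip-nested-set-build` comes from this thread; the task names `csv:import`, `propel:build-nested-set`, and `search:populate` are given as I understand them from the AtoM documentation and should be checked against your version, and the file path and AtoM root directory are hypothetical placeholders.

```python
import shlex
import subprocess


def import_commands(csv_path):
    """Return the import and follow-up commands in the order they should run."""
    return [
        # 1. Import without rebuilding the nested set after every row.
        ["php", "symfony", "csv:import", "--skip-nested-set-build", csv_path],
        # 2. Rebuild the nested set once, after all rows are in (assumed task name).
        ["php", "symfony", "propel:build-nested-set"],
        # 3. Repopulate the search index once (assumed task name).
        ["php", "symfony", "search:populate"],
    ]


def run_import(csv_path, atom_root, dry_run=True):
    """Print the commands, or run them from the AtoM root if dry_run is False."""
    for cmd in import_commands(csv_path):
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops the sequence if any step fails.
            subprocess.run(cmd, cwd=atom_root, check=True)


if __name__ == "__main__":
    # Hypothetical paths; dry_run=True only prints what would be executed.
    run_import("/tmp/additions.csv", "/usr/share/nginx/atom", dry_run=True)
```

Running with `dry_run=True` first is a cheap way to confirm the commands look right before letting them touch a production catalogue.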

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



sally-an...@york.ac.uk

Sep 24, 2020, 4:09:20 AM
to AtoM Users
Dear Dan and José,

Thank you so much for your speedy replies and explanations - that makes much more sense now! It looks as though we will be updating to AtoM 2.6 in the near future anyway, so I hope this will help to address the issue. I'm happy to report we've also managed to get rid of the 'active jobs' that had been hanging around for weeks.

Thanks again for your help,

Sally   
