Hi all,
We are running into some pretty significant indexing issues here at Yale and would appreciate your perspective on what might be happening and how best to tune certain indexer behaviors.
I am getting reports that staff performing routine accessioning steps (adding a new archival object to a series, and creating a new top container instance attached to that AO) on a large resource containing thousands of AOs/TCs are experiencing long delays.
After creating a new TC, it is invisible in both Manage Top Containers and Instances > Add Container Instance > Browse. I confirmed the container and AO records are in the database, but they do not show up in the index for 15 to 30+ minutes. Bumping system_mtime on the affected records did not help.
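(For reference, the system_mtime bump was a direct statement against the MySQL database, along these lines; the ID here is a placeholder, not an actual record:)

    UPDATE top_container
       SET system_mtime = NOW()
     WHERE id = 12345;  -- placeholder ID; we ran this per affected record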
The ASpace logs show:
E, [2025-08-19T08:51:26.083272 #589] ERROR -- : Thread-3308: SolrIndexerError when committing:
Timeout error with POST {"commit":{"softCommit":false}}.
Please check your :indexer_solr_timeout_seconds, :indexer_thread_count, and :indexer_records_per_thread settings in your config.rb file.
We upped AppConfig[:indexer_solr_timeout_seconds] to 600, but this made no noticeable difference.
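Our current indexer settings in config.rb look like this; the timeout is the only one we have changed, and the other two are at what I believe are the defaults (please correct me if those have moved):

    AppConfig[:indexer_solr_timeout_seconds] = 600
    # untouched; I believe these are the shipped defaults:
    AppConfig[:indexer_thread_count] = 4
    AppConfig[:indexer_records_per_thread] = 25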
When I reviewed the backend and Solr logs, it looks like small edits (like adding a TC) trigger a resource-wide reindex: I'm seeing multiple "deleteByQuery" calls against the entire resource, followed by the indexer re-adding large batches of tree node documents for the PUI. That seems excessive for a one-container change.
Has anyone seen massive re-indexing behavior triggered by small edits like this? Is there any way to adjust this behavior so the indexer "radius" is limited and doesn't traverse the entire collection tree?
Many thanks for any advice or guidance here.
Mary
Mary Kidd (she/her)
Technical Lead, Archival Systems
Yale Library IT – Client Services and IT Operations