Hi Andreas
I did this; see the orange circle at the bottom right-hand corner:
https://psis.theahg.co.za/heritage
Yes, we've built exactly this on top of AtoM, and it works well. A few notes from our experience that might help:
RAG over the catalogue. We index archival descriptions (title, scope & content, dates, named entities) into a vector store and run a hybrid retriever: semantic/vector search combined with keyword search (BM25 via the existing OpenSearch/Elasticsearch index), plus entity- and hierarchy-aware strategies. Hybrid matters a lot for archives: pure vector search is poor at identifiers, reference codes, and exact name lookups, which users rely on. Answers are grounded in the retrieved descriptions and linked back to the actual records, so they are citable rather than hallucinated.
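As an illustration only, one common way to merge a keyword result list with a vector result list is reciprocal rank fusion (RRF). The fusion method, the function name, the constant k=60, and the reference codes below are all assumptions made for the sketch; none of them describes the actual setup above.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of record IDs into one ranking.

    Each list contributes 1 / (k + rank) per record, so a record that
    ranks well in either list (or moderately in both) rises to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, record_id in enumerate(ranking, start=1):
            scores[record_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda rid: (-scores[rid], rid))

# Hypothetical reference codes standing in for real search results.
keyword_hits = ["ZA-0042", "ZA-0007", "ZA-0113"]  # e.g. from BM25
vector_hits = ["ZA-0007", "ZA-0300", "ZA-0042"]   # e.g. from embeddings

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

An exact reference-code match that only the keyword side finds still makes it into the fused list, which is the point of going hybrid.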
Chat agent. On top of that we run a conversational assistant over the collection (with conversation history), and the same retrieval layer feeds NER, summarisation, and translation pipelines.
Agent-over-API. Both approaches you mention are viable. AtoM's data is very accessible: the REST API, plus a GraphQL layer we added, make good "tools" for an agent. We route all model calls through a single gateway (keyed/metered, with self-hostable/offline models), which we'd strongly recommend: archives often can't send descriptions to a third-party cloud, and a gateway lets you swap models and keep an audit trail.
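To make the gateway idea concrete, here is a minimal sketch under stated assumptions: the class name, method names, and audit fields are hypothetical, and the "model" is a local stand-in function, not a real model client. It shows only the shape of the pattern: every call goes through one place, models are swappable by name, and each call leaves an audit entry.

```python
import hashlib
import time
from typing import Callable, Dict, List

class ModelGateway:
    """Hypothetical single entry point for all model calls."""

    def __init__(self) -> None:
        self._backends: Dict[str, Callable[[str], str]] = {}
        self.audit_log: List[dict] = []

    def register(self, name: str, backend: Callable[[str], str]) -> None:
        # A backend is any callable taking a prompt and returning text,
        # e.g. a wrapper around a self-hosted model.
        self._backends[name] = backend

    def complete(self, model: str, prompt: str, api_key: str) -> str:
        if model not in self._backends:
            raise KeyError(f"unknown model: {model}")
        reply = self._backends[model](prompt)
        # Assumption: log a hash of the prompt, not the text itself,
        # so the audit trail does not duplicate restricted descriptions.
        self.audit_log.append({
            "time": time.time(),
            "api_key": api_key,
            "model": model,
            "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
            "reply_chars": len(reply),
        })
        return reply

gateway = ModelGateway()
gateway.register("local-echo", lambda prompt: prompt.upper())  # stand-in model
answer = gateway.complete("local-echo", "summarise fonds 12", api_key="key-1")
```

Swapping to a different model is then a matter of registering another backend under a new name; callers and the audit trail stay unchanged.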
A few hard-won gotchas if you build the RAG path:
- Keep the vector index in sync with deletions. Deleted records that linger in the vector store surface as high-ranked phantom hits.
- Make sure the search index your retriever queries is sourced from the same corpus/DB you hydrate results from. An index-name mismatch gives you the classic "N results found, but nothing displays" symptom.
- Apply your access/embargo/publication filtering at the retrieval layer, not just in the UI. Otherwise the model can surface restricted material in its answers.
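The three gotchas above can be sketched with a toy in-memory index. Everything here is a hypothetical stand-in: the class and function names are invented for illustration, substring matching replaces real similarity search, and a plain dict replaces the catalogue database.

```python
from dataclasses import dataclass

@dataclass
class IndexedRecord:
    record_id: str
    text: str
    published: bool

class RetrievalIndex:
    """Toy stand-in for a vector or search index."""

    def __init__(self):
        self._records = {}

    def upsert(self, record):
        self._records[record.record_id] = record

    def delete(self, record_id):
        self._records.pop(record_id, None)

    def ids(self):
        return list(self._records)

    def search(self, query, allow_unpublished=False):
        hits = [r for r in self._records.values()
                if query.lower() in r.text.lower()]
        # Gotcha 3: filter inside the retriever, before anything
        # reaches the model or the UI.
        if not allow_unpublished:
            hits = [r for r in hits if r.published]
        return [r.record_id for r in hits]

def sync_deletions(index, live_ids):
    """Gotcha 1: drop index entries whose source record is gone."""
    stale = set(index.ids()) - set(live_ids)
    for record_id in stale:
        index.delete(record_id)
    return stale

def hydrate(hit_ids, database):
    """Gotcha 2: fail loudly if hits do not exist in the source DB."""
    missing = [rid for rid in hit_ids if rid not in database]
    if missing:
        raise LookupError(f"index and database disagree on: {missing}")
    return [database[rid] for rid in hit_ids]

# Hypothetical catalogue: a3 was deleted from the DB but is still indexed.
database = {"a1": "Minutes of the harbour board, 1902",
            "a2": "Harbour board correspondence"}
index = RetrievalIndex()
index.upsert(IndexedRecord("a1", database["a1"], published=True))
index.upsert(IndexedRecord("a2", database["a2"], published=False))
index.upsert(IndexedRecord("a3", "Harbour photographs", published=True))

removed = sync_deletions(index, database.keys())
public_hits = index.search("harbour")
titles = hydrate(public_hits, database)
```

Raising in hydrate rather than silently skipping missing records turns the "N results found, but nothing displays" symptom into an explicit error that is easy to trace.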
Happy to go into more detail off-list if useful.
Groete / Regards
Johan Pieterse (PhD)
The Archive and Heritage Group (Pty) Ltd