Persistent Issue: Diacritics Ignored in Exact Search on AtoM 2.9.1 (Potential Override in arSearchQueryString.class.php)


Rodolfo Peres Rodrigues

Nov 30, 2025, 11:53:56 PM
to AtoM Users

Hello everyone,

I'm encountering a persistent issue on AtoM 2.9.1 (using Elasticsearch 5.x/6.x): the exact search (submitted by pressing Enter) does not normalize diacritics, so users must type the exact accented form to find a result (e.g., searching for "cafe" fails, but "café" works). The autocomplete suggestions, however, match regardless of accents, which suggests that the problem lies in search-time query processing rather than in index-time analysis.

I have implemented comprehensive fixes, including:

  1. Complete Analyzer Configuration: The brazilian analyzer has been fully configured in search.yml with lowercase, brazilian_stop, preserved_asciifolding, and brazilian_stem.

  2. Explicit Field Mapping: The field_map: section is correctly set to use analyzer: brazilian and, crucially, search_analyzer: brazilian for full_text, title, and name.

  3. Reindexing: The search index has been rebuilt multiple times (php symfony search:populate).
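
For reference, the analyzer described in item 1 might look roughly like the fragment below. This is only a sketch, not a copy of the actual search.yml: the filter chain comes from the list above, but the filter definitions (the stop-word set, the stemmer language, and preserve_original on the ASCII-folding filter) and the standard tokenizer are assumptions based on standard Elasticsearch token filters, and the surrounding keys of the file are omitted.

```yaml
# Sketch only: analysis settings as they might appear in search.yml.
# Filter definitions are assumed, not taken from the real file.
analysis:
  analyzer:
    brazilian:
      tokenizer: standard
      # Order matters: fold accents before stemming.
      filter: [lowercase, brazilian_stop, preserved_asciifolding, brazilian_stem]
  filter:
    brazilian_stop:
      type: stop
      stopwords: _brazilian_
    brazilian_stem:
      type: stemmer
      language: brazilian
    preserved_asciifolding:
      type: asciifolding
      # Emits both the folded token ("cafe") and the original ("café").
      preserve_original: true
```

If the filters are defined this way, "café" would be indexed under both its folded and original forms, so a query for "cafe" should match as long as the same analyzer is applied to the query text (stemming aside).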

The Code Override Hypothesis

The failure persists despite the correct Elasticsearch configuration, which strongly suggests that the AtoM application code is overriding the search_analyzer when building the final query_string.

My proposed solution/question is: Should we be explicitly forcing the analyzer on the query construction level?

Suggested Action: Inject the analyzer parameter into the array that builds the Elasticsearch query_string (likely within plugins/arElasticSearchPlugin/lib/model/search/arSearchQueryString.class.php).

Example of parameter injection:

$params = array(
  'default_operator' => $this->getDefaultOperator(),
  'analyzer' => 'brazilian', // Force the analyzer for the query_string here
);

I would appreciate guidance from the community: Is this the recommended fix, or is there a known configuration flag or bug that is causing the search_analyzer defined in search.yml to be ignored in this specific AtoM version?

Thank you for your assistance.

Sincerely,
Rodolfo Peres Rodrigues



Dan Gillean

Dec 5, 2025, 8:21:06 AM
to ica-ato...@googlegroups.com
Hi Rodolfo, 

When you say exact search, do you mean you are putting your search terms in quotations, or do you just mean that you are typing a phrase and hitting enter to submit it?

If you mean you are putting your search query in quotations: 
That performs an exact search in ES, meaning that the proximity and exact ordering of terms matters - but I suspect it also means "search for exactly these characters and not close approximations"! In which case, personally this is the behavior I would expect from an exact search. See: 
If, on the other hand, you just mean you are typing a phrase, seeing autocomplete results where diacritics are normalized, but then no longer seeing any substitutions in the results after hitting enter, then... maybe it's a bug? Agreed that this is not what I would expect, but there are also many complexities with ES I don't personally understand. 

If it is the latter, I would suggest that you open a bug ticket on the AtoM repository. You can mention you have a possible fix, and ask for their take on the issue. The AtoM Maintainers will review it and if they agree it is a bug, you could then open a pull request? 

Cheers, 


Dan Gillean, MAS, MLIS
Business & User Experience Analyst
Artefactual Systems, Inc.
604-527-2056
he / him

