ElasticSearch Exception on specific string


Bob Higgie

Jun 27, 2023, 10:33:58 AM
to AtoM Users
Hi,
I am running AtoM 2.6, installed exactly as per the instructions, as the only software on an Intel Mac with 4GB of RAM. 
We use the following style of reference code 1974/014/R/01. 
Entering that into the search bar, either the general one in the top bar or as an advanced search on the reference code field results in an ElasticSearch response exception.

Narrowing this down it appears that // is OK in a search string but /// isn't. 

Is this a known limitation? 

Apologies if this is well known, I'm a new user for the Waterworks Museum Hereford. 

Regards Bob. 

Dan Gillean

Jun 28, 2023, 2:00:56 PM
to ica-ato...@googlegroups.com
Hi Bob, 

The short answer is: yes it's a known limitation, but there's a workaround - though the workaround itself has some gotchas. 

More details: 

AtoM uses ElasticSearch (currently ES version 5.6 in AtoM 2.6 and 2.7) as its search index. In the ES query syntax, the / forward slash is a reserved character: a pair of slashes marks the start and end of a regular expression, much like quotation marks around a phrase. That's most likely why three slashes cause an error but two are okay - two slashes pair up, while a third is left open with nothing to close it. (The character used for escaping other special characters is the \ backslash.) It's a bit weird, I know. Some further details, from the ES 5.6 documentation:

If you need to use any of the characters which function as operators in your query itself (and not as operators), then you should escape them with a leading backslash. For instance, to search for (1+1)=2, you would need to write your query as \(1\+1\)\=2.

The reserved characters are: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /

Failing to escape these special characters correctly could lead to a syntax error which prevents your query from running.
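
To make the escaping rule above concrete, here is a minimal Python sketch. It is only an illustration and not how AtoM itself does it: the function name escape_query_string is made up for this example, and dropping < and > (rather than escaping them) is an assumption based on my understanding that those two characters cannot be escaped in the ES query string syntax.

```python
# Reserved characters from the ES documentation quoted above.
# "&&" and "||" are covered by escaping each "&" and "|" individually.
RESERVED = r'+-=&|!(){}[]^"~*?:\/'


def escape_query_string(text: str) -> str:
    """Backslash-escape reserved characters in a query string.

    Hypothetical helper for illustration only. "<" and ">" are removed
    rather than escaped (an assumption, see the note above).
    """
    text = text.replace("<", "").replace(">", "")
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


if __name__ == "__main__":
    print(escape_query_string("1974/014/R/01"))
    print(escape_query_string("(1+1)=2"))
```

With the reference code from this thread, each slash gets a leading backslash, so the search is no longer read as containing regular expression delimiters.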

You can also see a list of other reserved characters that are used as Boolean operators, and how they affect searching in AtoM, here in the Advanced Search documentation: 
Now for the workarounds...

There are in fact two: 

The simplest is just to wrap your search in quotation marks - i.e. try searching for "1974/014/R/01". In such a case, the special characters are automatically escaped, and the exact string and order of the elements of the string are searched. This means you are highly likely to get accurate results (the good news). The bad news is that end users won't automatically know how to do this, and there are many search cases where users might not want quotation marks (which ensure all terms appear together in the exact order found).

The second workaround is better for end users, but has that gotcha I mentioned: 

We have added a setting in AtoM that will automatically escape special characters when they are added to the setting - so that you can use slashes in your reference codes for example. See: 
But I also have to mention the gotcha to this workaround, which is mentioned in the IMPORTANT admonition at the end of the settings docs linked above. Essentially, when you add a slash to this setting and save, searching for a reference code like 1974/014/R/01 will actually now search for each part separately, with an AND Boolean - i.e. 
  • Return records that include 1974 AND 014 AND R AND 01
Meaning that there are likely to be more results, the later part of which may not be very relevant, as 01 might be found in the scope and content for example, while 1974 is part of a date of creation, etc. 
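
The difference between the two workarounds can be illustrated with a small toy model in Python. This is not AtoM or ElasticSearch code; the tokenizer, the function names and the two sample records are all invented for the example. It only shows why an AND search over the separate parts matches more records than an exact phrase search.

```python
import re


def tokens(text):
    """Split text into lowercase alphanumeric terms (toy tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_terms(query, record_text):
    """AND semantics: every query term appears somewhere in the record."""
    words = set(tokens(record_text))
    return all(term in words for term in tokens(query))


def matches_phrase(query, record_text):
    """Phrase semantics: query terms appear consecutively, in order."""
    q, r = tokens(query), tokens(record_text)
    return any(r[i:i + len(q)] == q for i in range(len(r) - len(q) + 1))


# Invented sample records, for illustration only.
records = {
    "A": "Reference code 1974/014/R/01. Pump house drawings.",
    "B": "Created 1974. Box 014, shelf R, item 01 of a different series.",
}

if __name__ == "__main__":
    for key, text in records.items():
        print(key,
              "AND:", matches_all_terms("1974/014/R/01", text),
              "phrase:", matches_phrase("1974/014/R/01", text))
```

In this sketch both records satisfy the AND search, but only record A satisfies the phrase search, which is the trade-off described above.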

I would suggest that you play around with both and see what works best for you. As an added suggestion, note that many AtoM-using institutions have used AtoM's static pages and the ability to customize the menus to add guides to help end users with searching and browsing. A few examples: 
  • The custom guided buttons on the homepage that take you on a sort of user tour of the Mills Archive
  • The Search help and Browse help static pages added to the Quick Links menu of the Dalhousie University Archives
  • The YouTube tutorial video playlist that the MemoryNS portal has embedded on a static page (there are 11 videos in total if you expand the playlist)
I mention a few more examples of this in this slide deck about AtoM community implementations, and you can likely find many more creative uses by exploring some of our example User sites listed on the wiki. 

Hope this helps!

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



Bob Higgie

Jun 29, 2023, 4:22:22 AM
to AtoM Users
Thanks Dan, that is really helpful. I will discuss this with the archivists and see which method is best for them. 