Using AWS OpenSearch instead of ElasticSearch? / What version(s) of ElasticSearch does AM currently support?

John Gostick

Mar 18, 2022, 12:54:34 PM
to archivematica
Hi

I'm curious if anyone is successfully either:

1) using AWS OpenSearch with Archivematica, instead of ElasticSearch?  
2) using ElasticSearch v7.x with Archivematica, instead of v6.x?

OpenSearch is Amazon's cloud-based fork of open-source Elasticsearch 7.10. According to the AWS documentation, the REST APIs for ingest, search, and management are backwards compatible, and the query syntax and responses are the same. They do note, though, that "some clients or tools may include code, such as version checks, that may cause the client or tool to not work with OpenSearch."
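To illustrate the kind of version check AWS is warning about, here is a minimal sketch in Python. It is not Archivematica's code; the function names and the rule "accept only major version 6" are assumptions made for the example. It works on the JSON body that a cluster returns from its root endpoint, where OpenSearch reports its own version number along with a "distribution" value of "opensearch".

```python
def major_version(info):
    """Return the major version from a cluster root-endpoint response body."""
    return int(info["version"]["number"].split(".")[0])


def looks_supported(info, required_major=6):
    """Hypothetical client-side check: accept only Elasticsearch `required_major`.x.

    `info` is the parsed JSON from `GET /` on the cluster. Elasticsearch
    responses have no "distribution" key, so a missing key is treated as
    Elasticsearch.
    """
    version = info.get("version", {})
    if version.get("distribution", "elasticsearch") != "elasticsearch":
        return False
    return major_version(info) == required_major


# Example response bodies, trimmed to the fields the check reads.
es_68 = {"version": {"number": "6.8.0"}}
es_710 = {"version": {"number": "7.10.2"}}
opensearch_11 = {"version": {"distribution": "opensearch", "number": "1.1.0"}}
```

A client written this way would accept the 6.8 cluster and reject the other two, even if every API call it makes would have worked.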

According to the latest Archivematica documentation, AM v1.13 requires ElasticSearch version 6.x (and is tested with 6.5.4). From digging around in the issues on GitHub and other posts in this group, I get the impression there may be some issues if you try using ElasticSearch v7.x.

We're just experimenting with using OpenSearch 1.1 now, and while AM seems to be able to index items OK, search queries are failing with errors such as:

RequestError: RequestError(400, u'search_phase_execution_exception', u'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [sipuuid] in order to load field data by uninverting the inverted index. Note that this can use significant memory.')
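For anyone reading the error above, a minimal sketch of what it is asking for: sorting and aggregations need a field mapped as "keyword" rather than "text". The helper names below are hypothetical and this is not Archivematica's actual mapping; only the field name sipuuid is taken from the error message.

```python
def keyword_mapping(field):
    """Index-creation body that maps `field` as an exact-value keyword.

    Uses the typeless layout of Elasticsearch 7.x / OpenSearch; on 6.x the
    "properties" block would sit under a mapping type name instead.
    """
    return {"mappings": {"properties": {field: {"type": "keyword"}}}}


def terms_aggregation(field, size=10):
    """Search body that buckets documents by the exact value of `field`.

    "size": 0 at the top level suppresses the hits so only buckets return.
    """
    return {
        "size": 0,
        "aggs": {"by_" + field: {"terms": {"field": field, "size": size}}},
    }


mapping_body = keyword_mapping("sipuuid")
query_body = terms_aggregation("sipuuid")
```

Run against a field mapped as "text", a query like this produces the fielddata error quoted above; against a "keyword" field it should not.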

We're continuing to investigate, but I'm conscious that if AM doesn't support ElasticSearch 7.x it is unlikely to support OpenSearch either, so using it may not currently be an option at all. I'd love to hear if anyone else has tried this!

Thanks

John Gostick

John Gostick

May 3, 2022, 6:51:35 AM
to archivematica
Replying to my original post with an update that might be useful to others trying this:

- If you're using AWS, you can actually choose to deploy a 6.x version of ElasticSearch under their 'OpenSearch' service, rather than the 'OpenSearch' fork of ES 7.x

- We deployed v6.8 of ElasticSearch on AWS under OpenSearch and Archivematica seems to work correctly with this version, displaying none of the issues recorded above

- One thing that briefly caught us out (not sure this is an issue/bug as such, but happy to report it if you think I should) was 'special' characters in the ElasticSearch user password embedded in the ElasticSearch endpoint URL we specified in the Archivematica configuration. The password we initially used had various characters like { [ ] } etc. that are considered 'non-safe' for URLs and would normally be URL encoded. Specifying the endpoint URL with these characters left unencoded seemingly caused Archivematica to crash on start-up. URL encoding the password part of the URL worked, as did simply changing the password in ElasticSearch so that it didn't include any of these characters
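In case it helps, here is a minimal sketch of URL encoding the credentials with Python's standard library before building the endpoint URL. The function name, user name, password, and host are made up for the example, and where the resulting URL goes in your Archivematica configuration depends on your deployment.

```python
from urllib.parse import quote, unquote, urlsplit


def build_endpoint_url(user, password, host, port, scheme="https"):
    """Return scheme://user:password@host:port with credentials percent-encoded.

    safe="" makes quote() encode every reserved character, including
    "/", ":" and "@", which would otherwise be read as URL delimiters.
    """
    return "{}://{}:{}@{}:{}".format(
        scheme, quote(user, safe=""), quote(password, safe=""), host, port
    )


# Hypothetical credentials and host, for illustration only.
url = build_endpoint_url("am", "p{a[s]s}", "search.example.org", 443)
print(url)

# The original password is recovered by decoding the password component.
recovered = unquote(urlsplit(url).password)
```

The encoded form only needs to appear in the URL; the password stored in ElasticSearch itself stays unchanged.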
