Dear iRODS Consortium team,
I am writing to request assistance regarding several issues we are encountering with the iRODS Indexing capability in our environment. We are currently using iRODS 5.0.1 in a Dockerized setup and we are following the instructions provided in the official documentation at https://docs.irods.org/5.0.1/capabilities/indexing/.
1. Document-type rule engine plugin: in the 5.0.1 documentation, the following rule engine plugin is still referenced in the server configuration:
{
    "instance_name": "irods_rule_engine_plugin-document_type-instance",
    "plugin_name": "irods_rule_engine_plugin-document_type",
    "plugin_specific_configuration": {}
}
However, in the March 2024 development update (https://irods.org/2024/03/irods-development-update-march-2024/), it is stated that:
"the document-type rule engine plugin is no longer provided by the Indexing capability plugin and as a result, you'll need to remove the document-type rule engine plugin from your server_config.json."
This suggests that the plugin has been deprecated or removed, but the 5.0.1 documentation still includes it.
Could you please clarify whether the document-type plugin should or should not be used with iRODS 5.0.1?
2. Mapping file: in the irods_capability_indexing GitHub repository, the es_mapping.json file cannot be applied using the documented curl commands.
Elasticsearch returns an error unless I manually remove the top-level "mappings" keyword and keep only the "properties" section.
3. Indexing not working: following the documentation, I created and tagged a collection:
imkdir indexme
Although the queue seems to process tasks (iqstat seems to work), Elasticsearch returns zero hits for every search query.
I am not able to get it to work.
This happens with both Elasticsearch 7.17.24 and Elasticsearch 8.12.2.
Could you please confirm which Elasticsearch versions are officially supported to work with iRODS 5.0.1?
4. Logs: can you suggest how to investigate the problem, using logs or any other means? Elasticsearch itself works correctly, and so does creating an index via curl.
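On point 2, the workaround of dropping the top-level "mappings" key can be sketched as below. Elasticsearch's create-index endpoint (PUT /<index>) accepts a body containing a "mappings" key, while the update-mapping endpoint (PUT /<index>/_mapping) expects the "properties" object directly, which may explain the error. This is only a sketch: the helper name and the sample field are illustrative and are not taken from es_mapping.json.

```python
import json


def body_for_put_mapping(mapping_doc: dict) -> dict:
    """Return a body suitable for PUT /<index>/_mapping.

    If the document is wrapped in a top-level "mappings" key (the shape
    accepted by PUT /<index>), unwrap it; otherwise return it unchanged.
    """
    return mapping_doc.get("mappings", mapping_doc)


# Illustrative content only; the real es_mapping.json has different fields.
wrapped = {"mappings": {"properties": {"example_field": {"type": "text"}}}}
print(json.dumps(body_for_put_mapping(wrapped)))
```

A document that already starts at "properties" passes through untouched, so the same helper can be used whichever shape the file has.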
Thanks in advance,
Best regards,
Laura
--
--
The Integrated Rule-Oriented Data System (iRODS) - https://irods.org
iROD-Chat: http://groups.google.com/group/iROD-Chat
---
You received this message because you are subscribed to the Google Groups "iRODS-Chat" group.
To unsubscribe from this group and stop receiving emails from it, send an email to irod-chat+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/irod-chat/6114b009-cd6a-40c7-85a8-adcadd44a6e9n%40googlegroups.com.
Hi Alan,
Thank you for your instructions. I followed the slides you provided (https://slides.com/irods/ugm-2025-indexing), but I am still experiencing the same issues with the Elasticsearch indexing capability.
My setup:
-iRODS 5.0.1 running in a Docker container (Ubuntu 24.04) with the following plugins installed: irods-rule-engine-plugin-indexing, irods-rule-engine-plugin-elasticsearch
-Elasticsearch 8.12.2 running in a separate Docker container (I also tried 8.12.1)
-A CLI container authenticated as the rods user, from which we execute curl calls to initialize indices (the Elasticsearch indices full_text_index and metadata_index have been created successfully with the proper mappings). I can now use the JSON mapping file as specified in the slides.
Both plugins are configured in server_config.json:
{
    "instance_name": "irods_rule_engine_plugin-indexing-instance",
    "plugin_name": "irods_rule_engine_plugin-indexing",
    "plugin_specific_configuration": {
        "job_limit_per_collection_indexing_operation": 1000,
        "maximum_delay_time": 30,
        "minimum_delay_time": 1
    }
},
{
    "instance_name": "irods_rule_engine_plugin-elasticsearch-instance",
    "plugin_name": "irods_rule_engine_plugin-elasticsearch",
    "plugin_specific_configuration": {
        "bulk_count": 100,
        "hosts": [
        ],
        "read_size": 4194304
    }
},
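The "hosts" list appears empty in the excerpt above; it may simply have been stripped when the message was posted. A small sketch for confirming what the server actually has configured follows. It assumes the plugin entries sit under plugin_configuration.rule_engines in server_config.json; the function name is made up for illustration, and the sample dictionary just mirrors the excerpt.

```python
def elasticsearch_hosts(server_config: dict) -> list:
    """Return the "hosts" list of the elasticsearch rule engine plugin, or []."""
    engines = server_config.get("plugin_configuration", {}).get("rule_engines", [])
    for engine in engines:
        if engine.get("plugin_name") == "irods_rule_engine_plugin-elasticsearch":
            return engine.get("plugin_specific_configuration", {}).get("hosts", [])
    return []


# Sample mirroring the excerpt above, with the hosts list left empty.
sample = {
    "plugin_configuration": {
        "rule_engines": [
            {
                "instance_name": "irods_rule_engine_plugin-elasticsearch-instance",
                "plugin_name": "irods_rule_engine_plugin-elasticsearch",
                "plugin_specific_configuration": {
                    "bulk_count": 100,
                    "hosts": [],
                    "read_size": 4194304,
                },
            }
        ]
    }
}

hosts = elasticsearch_hosts(sample)
print("hosts configured:", hosts if hosts else "none")
```

Against a live server, the dictionary would instead be loaded with json.load from /etc/irods/server_config.json.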
After the Docker containers are up, I used the CLI to perform the following:
-imkdir indexed_collection
-imeta set -C indexed_collection irods::indexing::index full_text_index::full_text elasticsearch
-iput -r ./books indexed_collection/books0
The AVU metadata is correctly applied to the collection, and iqstat shows that 104 delayed jobs have been queued. However, when querying Elasticsearch, the hits array remains empty: no documents appear to have been indexed.
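One way to check whether anything has reached the index, independent of the search query used, is Elasticsearch's _count endpoint, which returns a JSON object with a "count" field. The sketch below assumes Elasticsearch is reachable over plain HTTP without authentication at a hypothetical address (http://elasticsearch:9200); both the address and the function names are assumptions to adjust for the actual deployment.

```python
import json
import urllib.request


def count_url(base_url: str, index: str) -> str:
    """Build the URL of the Elasticsearch _count endpoint for one index."""
    return f"{base_url.rstrip('/')}/{index}/_count"


def document_count(base_url: str, index: str) -> int:
    """Ask Elasticsearch how many documents the index currently holds."""
    with urllib.request.urlopen(count_url(base_url, index)) as response:
        return json.load(response)["count"]


# Assumption: hypothetical host name; no request is sent by this line.
print(count_url("http://elasticsearch:9200/", "full_text_index"))
```

Calling document_count("http://elasticsearch:9200", "full_text_index") before and after the iput would show whether the count changes at all.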
I don't see any errors in the legacy log category.
Is there a way to manually trigger the processing of the delayed queue? It seems the jobs remain pending and the documents are not being indexed even after waiting.
Best regards,
Laura
Hi Alan,
ps aux | grep "irods.*Server" shows only the irodsServer process, but no irodsDelayServer process.
Delayed rules remain in the queue indefinitely and are never executed:
$ iqstat
id name
10015 writeLine("serverLog", "Delayed Execution");
Therefore, no delay server logs appear in the log files.
For additional context, the server_config.json file and the server_config section (inside the iRODS provider Docker container) contain the following settings:
    "delay_rule_executors": [],
    "delay_server_sleep_time_in_seconds": 30,
    "maximum_size_of_delay_queue_in_bytes": 0,
    "migrate_delay_server_sleep_time_in_seconds": 5,
    "number_of_concurrent_delay_rule_executors": 4,
This suggests irodsServer isn't launching irodsDelayServer as a child process automatically, which should happen during normal startup for iRODS 5.0.1, correct?
We use an unattended installation to install iRODS in our container.
Hi Alan,
the hostnames do match in our environment:
$ cat /etc/irods/server_config.json | grep '"host":' --context 1
"graceful_shutdown_timeout_in_seconds": 30,
"host": "irodscp-dev1.iit.local",
"host_access_control": {
--
"database": {
"host": "irods-catalog",
"name": "ICAT",
and:
$ iadmin get_delay_server_info
{
    "leader": "irodscp-dev1.iit.local",
    "successor": ""
}
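The comparison made by eye above can also be scripted, which helps catch differences that are easy to miss, such as trailing whitespace or a case mismatch. This is a minimal sketch: the function name is hypothetical, the values are copied from the output above, and it performs an exact string comparison without any DNS resolution or normalization.

```python
import json


def leader_matches_host(server_config: dict, delay_server_info: dict) -> bool:
    """True when the configured server host equals the delay server leader."""
    return server_config.get("host") == delay_server_info.get("leader")


# Values copied from the grep and iadmin output shown above.
server_config = {"host": "irodscp-dev1.iit.local"}
delay_server_info = json.loads(
    '{"leader": "irodscp-dev1.iit.local", "successor": ""}'
)
print(leader_matches_host(server_config, delay_server_info))  # True
```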
Are there any other configuration aspects or logs we should examine to troubleshoot irodsDelayServer?
Laura