iRODS CPU Hogging phantom processes.

31 views
Skip to first unread message

Jan de Graaf

unread,
Apr 24, 2024, 8:20:32 AMApr 24
to iRODS-Chat
Hi,

Recently our iRODS iCAT server started hogging the CPU's and when stopping the main iRODS proces the iRODS Servers keep running (at 100% 24x7)
The only thing that the server is running is the iCAT server and the delay server.

The only way to stop the process if to kill them hard at the command line.

See attached screenshot for an example.

irods CPU Load and processes.png

Best,
Jan de Graaf

Terrell Russell

unread,
Apr 24, 2024, 8:34:56 AMApr 24
to irod...@googlegroups.com
Well that is not good at all.   A few questions...

Is there anything interesting/useful in the logs?

Does it happen immediately upon restart again?

What version of iRODS server is this?

Terrell


--
--
The Integrated Rule-Oriented Data System (iRODS) - https://irods.org
 
iROD-Chat: http://groups.google.com/group/iROD-Chat
---
You received this message because you are subscribed to the Google Groups "iRODS-Chat" group.
To unsubscribe from this group and stop receiving emails from it, send an email to irod-chat+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/irod-chat/eafc246d-9f74-488f-ab9c-e074c5254d1fn%40googlegroups.com.

Alan King

unread,
Apr 24, 2024, 9:45:26 AMApr 24
to irod...@googlegroups.com
If this is iRODS 4.3.1, the leftover processes even after shutdown might be this issue: https://github.com/irods/irods/issues/7619 That issue will be resolved in 4.3.2.

The processes hogging cores would be something else entirely, of course. Haven't seen that.



--
Alan King
Senior Software Developer | iRODS Consortium

James, Justin Kyle

unread,
Apr 24, 2024, 12:54:21 PMApr 24
to irod...@googlegroups.com
Also please provide the resource configuration.

ilsresc -l <resc>

From: irod...@googlegroups.com <irod...@googlegroups.com> on behalf of Terrell Russell <terrell...@gmail.com>
Sent: Wednesday, April 24, 2024 8:34 AM
To: irod...@googlegroups.com <irod...@googlegroups.com>
Subject: Re: [iROD-Chat:21853] iRODS CPU Hogging phantom processes.
 

Jan de Graaf

unread,
Apr 25, 2024, 3:03:18 AMApr 25
to iRODS-Chat
Hi,

We're running  irods 4.3.1 with tiering plugin and ingestion (both latest versions)
It doesn't happen directly after restart. I seems to build up. 
Also this setup isn't in production yet so not really much is happening is on this setup yet.

This is our resource configuration

RES_IMG_Store_01:passthru
└── RES_IMG_Store_PT01:passthru
    └── IMG_Store_01:unixfilesystem -> local NFS storage
RES_IMG_Store_02:passthru
└── RES_IMG_Store_PT02:passthru
    └── IMG_Store_02:unixfilesystem -> directs to a SMB mount (central network storage)
RES_IMG_Store_03:passthru
└── IMG_Store_PT03:passthru -
    └── IMG_Store_03:unixfilesystem  > NFS mount to Azure Blob storage
RES_IMG_Store_04:passthru
└── IMG_Store_PT04:passthru
    └── IMG_Store_04:unixfilesystem  > NFS mount to Azure Blob storage
RES_IMG_Store_05:passthru
└── IMG_Store_PT05:passthru
    └── IMG_Store_05:unixfilesystem -> directs to a SMB mount (HPC storage)

Were 01-04 resources are in a tiering group.

Extended resource group output

resource name: IMG_Store_PT05
id: 147560
zone: nkiImaging
type: passthru
location: EMPTY_RESC_HOST
vault: EMPTY_RESC_PATH
free space:
free space time: : Never
status:
info:
comment:
create time: 01713365572: 2024-04-17.14:52:52
modify time: 01713508936: 2024-04-19.06:42:16
context:
parent: 147559
parent context:
----
resource name: RES_IMG_Store_02
id: 35070
zone: nkiImaging
type: passthru
location: EMPTY_RESC_HOST
vault: EMPTY_RESC_PATH
free space:
free space time: : Never
status:
info:
comment:
create time: 01707315397: 2024-02-07.14:16:37
modify time: 01707315397: 2024-02-07.14:16:37
context:
parent:
parent context:
----
resource name: IMG_Store_04
id: 78841
zone: nkiImaging
type: unixfilesystem
location: p-irods-004.nki.nl
vault: /mtn_NFS_IMG_Archive
free space:
free space time: : Never
status:
info:
comment:
create time: 01711709270: 2024-03-29.10:47:50
modify time: 01712060557: 2024-04-02.12:22:37
context:
parent: 78843
parent context:
----
resource name: RES_IMG_Store_05
id: 147559
zone: nkiImaging
type: passthru
location: EMPTY_RESC_HOST
vault: EMPTY_RESC_PATH
free space:
free space time: : Never
status:
info:
comment:
create time: 01713365538: 2024-04-17.14:52:18
modify time: 01713777369: 2024-04-22.09:16:09
context:
parent:
parent context:
----
resource name: RES_IMG_Store_PT02
id: 35072
zone: nkiImaging
type: passthru
location: EMPTY_RESC_HOST
vault: EMPTY_RESC_PATH
free space:
free space time: : Never
status:
info:
comment:
create time: 01707315415: 2024-02-07.14:16:55
modify time: 01709043754: 2024-02-27.14:22:34
context:
parent: 35070
parent context:
----
resource name: nkiResc3
id: 10050
zone: nkiImaging
type: unixfilesystem
location: p-irods-003.nki.nl
vault: /var/lib/irods/Vault
free space:
free space time: : Never
status:
info:
comment:
create time: 01705920536: 2024-01-22.10:48:56
modify time: 01705920536: 2024-01-22.10:48:56
context:
parent:
parent context:
----
resource name: IMG_Store_01
id: 35069
zone: nkiImaging
type: unixfilesystem
location: p-irods-003.nki.nl
vault: /irods
free space:
free space time: : Never
status:
info:
comment:
create time: 01707314900: 2024-02-07.14:08:20
modify time: 01707314964: 2024-02-07.14:09:24
context:
parent: 35068
parent context:
----
resource name: RES_IMG_Store_01
id: 35067
zone: nkiImaging
type: passthru
location: p-irods-003.nki.nl
vault: EMPTY_RESC_PATH
free space:
free space time: : Never
status:
info:
comment:
create time: 01707314192: 2024-02-07.13:56:32
modify time: 01712126551: 2024-04-03.06:42:31
context:
parent:
parent context:
----
resource name: IMG_Store_05
id: 147562
zone: nkiImaging
type: unixfilesystem
location: p-irods-003.nki.nl
vault: /mntHIVE3020_IRODS
free space:
free space time: : Never
status:
info:
comment:
create time: 01713365760: 2024-04-17.14:56:00
modify time: 01713775617: 2024-04-22.08:46:57
context:
parent: 147560
parent context:
----
resource name: nkiResc2
id: 10044
zone: nkiImaging
type: unixfilesystem
location: p-irods-002.nki.nl
vault: /var/lib/irods/Vault
free space:
free space time: : Never
status:
info:
comment:
create time: 01705918499: 2024-01-22.10:14:59
modify time: 01705918499: 2024-01-22.10:14:59
context:
parent:
parent context:
----
resource name: IMG_Store_02
id: 35077
zone: nkiImaging
type: unixfilesystem
location: p-irods-002.nki.nl
vault: /data/irods/data
free space:
free space time: : Never
status:
info:
comment:
create time: 01707316134: 2024-02-07.14:28:54
modify time: 01712915891: 2024-04-12.09:58:11
context:
parent: 35072
parent context:
----
resource name: nkiResc4
id: 35304
zone: nkiImaging
type: unixfilesystem
location: p-irods-004
vault: /var/lib/irods/Vault
free space:
free space time: : Never
status:
info:
comment:
create time: 01708352048: 2024-02-19.14:14:08
modify time: 01711720792: 2024-03-29.13:59:52
context:
parent:
parent context:
----
resource name: RES_IMG_Store_PT01
id: 35068
zone: nkiImaging
type: passthru
location: EMPTY_RESC_HOST
vault: EMPTY_RESC_PATH
free space:
free space time: : Never
status:
info:
comment:
create time: 01707314520: 2024-02-07.14:02:00
modify time: 01709043835: 2024-02-27.14:23:55
context:
parent: 35067
parent context:
----
resource name: RES_IMG_Store_03
id: 35071
zone: nkiImaging
type: passthru
location: p-irods-004.nki.nl
vault: EMPTY_RESC_PATH
free space:
free space time: : Never
status:
info:
comment:
create time: 01707315402: 2024-02-07.14:16:42
modify time: 01712130006: 2024-04-03.07:40:06
context:
parent:
parent context:
----
resource name: IMG_Store_03
id: 78840
zone: nkiImaging
type: unixfilesystem
location: p-irods-004.nki.nl
vault: /mtn_NFS_IMG_Cold
free space:
free space time: : Never
status:
info:
comment:
create time: 01711709241: 2024-03-29.10:47:21
modify time: 01712060551: 2024-04-02.12:22:31
context:
parent: 35073
parent context:
----
resource name: RES_IMG_Store_04
id: 78842
zone: nkiImaging
type: passthru
location: EMPTY_RESC_HOST
vault: EMPTY_RESC_PATH
free space:
free space time: : Never
status:
info:
comment:
create time: 01711709319: 2024-03-29.10:48:39
modify time: 01711709319: 2024-03-29.10:48:39
context:
parent:
parent context:
----
resource name: IMG_Store_PT03
id: 35073
zone: nkiImaging
type: passthru
location: EMPTY_RESC_HOST
vault: EMPTY_RESC_PATH
free space:
free space time: : Never
status:
info:
comment:
create time: 01707315420: 2024-02-07.14:17:00
modify time: 01709544713: 2024-03-04.09:31:53
context:
parent: 35071
parent context:
----
resource name: nkiResc
id: 10013
zone: nkiImaging
type: unixfilesystem
location: p-irods-001.nki.nl
vault: /var/lib/irods/Vault
free space:
free space time: : Never
status:
info:
comment:
create time: 01705573066: 2024-01-18.10:17:46
modify time: 01705573066: 2024-01-18.10:17:46
context:
parent:
parent context:
----
resource name: IMG_Store_PT04
id: 78843
zone: nkiImaging
type: passthru
location: EMPTY_RESC_HOST
vault: EMPTY_RESC_PATH
free space:
free space time: : Never
status:
info:
comment:
create time: 01711709333: 2024-03-29.10:48:53
modify time: 01711709409: 2024-03-29.10:50:09
context:
parent: 78842
parent context:
irods@p-irods-001:~$



Op woensdag 24 april 2024 om 14:34:56 UTC+2 schreef Terrell Russell:

Jan de Graaf

unread,
Apr 25, 2024, 10:40:55 AMApr 25
to iRODS-Chat
Just rebooted the iCAT server and watched the logfile while the system was basically idle, no user interaction was being done.

The logfile did show some errors in the first few minutes, then the system did not do or logged anything and then suddenly it jumped to 100% CPU.
During this jump to 100% nothing showed up in the log file.

This is the output of the logfile so far.

 {"log_category":"server","log_level":"info","log_message":"Initializing server ...","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T13:45:24.118Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"info","log_message":"Setting up UNIX domain socket for agent factory ...","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T13:45:24.123Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"info","log_message":"Forking agent factory ...","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T13:45:24.124Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"info","log_message":"Connecting to agent factory [agent_factory_pid=234503] ...","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T13:45:24.124Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"info","log_message":"Initializing agent factory ...","server_host":"p-irods-001","server_pid":234503,"server_timestamp":"2024-04-25T13:45:24.125Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"local_hostname":"localhost, p-irods-001, p-irods-001.nki.nl","log_category":"server","log_level":"info","port":"1247","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T13:45:24.186Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"info","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T13:45:24.295Z","server_type":"server","server_zone":"nkiImaging","zone_info.host":"localhost","zone_info.name":"nkiImaging","zone_info.port":"1247","zone_info.type":"LOCAL_ICAT"}
 {"log_category":"server","log_level":"info","log_message":"rodsServer Release version rods4.3.1 - API Version d is up","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T13:45:24.295Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"legacy","log_level":"info","log_message":">>> control plane :: listening on port 1248\n","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T13:45:24.296Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"info","log_message":"Forking Delay Server (irodsDelayServer) ...","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T13:45:24.951Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"info","log_message":"Initializing delay server ...","server_host":"p-irods-001","server_pid":234516,"server_timestamp":"2024-04-25T13:45:24.978Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"legacy","log_level":"error","log_message":"[-]\t/irods_source/server/core/src/rsApiHandler.cpp:565:int readAndProcClientMsg(rsComm_t *, int) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']\n\t[-]\t/irods_source/lib/core/src/sockComm.cpp:229:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']\n\t\t[-]\t/irods_source/plugins/network/src/ssl.cpp:528:irods::error ssl_read_msg_header(irods::plugin_context &, void *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [read 0 expected 4]\n\n","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":237784,"server_timestamp":"2024-04-25T13:52:30.947Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"agent","log_level":"error","log_message":"[-]\t/irods_source/server/core/src/rodsAgent.cpp:725:int runIrodsAgentFactory(sockaddr_un) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [failed to call 'agent stop']\n\t[-]\t/irods_source/lib/core/src/sockComm.cpp:191:irods::error sockAgentStop(irods::network_object_ptr) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [failed to call 'agent stop']\n\t\t[-]\t/irods_source/plugins/network/src/ssl.cpp:850:irods::error ssl_agent_stop(irods::plugin_context &) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [error completing shutdown of SSL connection]\n\n","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":237784,"server_timestamp":"2024-04-25T13:52:30.947Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":"Agent factory returned with error code [-2104000].","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":237784,"server_timestamp":"2024-04-25T13:52:30.947Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"error","log_message":"operator():337 - rcRuleExecMod failed -4000 for rule ID 148821","server_host":"p-irods-001","server_pid":234516,"server_timestamp":"2024-04-25T13:52:30.947Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"error","log_message":"Rule exec for [148821] failed. status = [-4000]","server_host":"p-irods-001","server_pid":234516,"server_timestamp":"2024-04-25T13:52:30.947Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [237782] exited with status [11].","server_host":"p-irods-001","server_pid":234503,"server_timestamp":"2024-04-25T13:52:30.964Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [237784] exited with status [1].","server_host":"p-irods-001","server_pid":234503,"server_timestamp":"2024-04-25T13:52:30.965Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [238043] terminated by signal [13].","server_host":"p-irods-001","server_pid":234503,"server_timestamp":"2024-04-25T13:52:30.995Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [238044] terminated by signal [13].","server_host":"p-irods-001","server_pid":234503,"server_timestamp":"2024-04-25T13:52:31.188Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":" 0# stacktrace_signal_handler in /lib/libirods_server.so.4.3.1\n 1# 0x00007FC8151F2420 in /lib/x86_64-linux-gnu/libpthread.so.0\n 2# void std::__1::__tree_balance_after_insert<std::__1::__tree_node_base<void*>*>(std::__1::__tree_node_base<void*>*, std::__1::__tree_node_base<void*>*) in /usr/sbin/irodsServer\n 3# 0x00007FC811327D06\n 4# 0x00007FC81131EB78\n 5# 0x00007FC8113287E2\n 6# 0x00007FC8113292A7\n 7# boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&) in /lib/libirods_server.so.4.3.1\n 8# boost::asio::detail::scheduler::run(boost::system::error_code&) in /lib/libirods_server.so.4.3.1\n 9# boost::asio::detail::posix_thread::func<boost::asio::thread_pool::thread_function>::run() in /lib/libirods_server.so.4.3.1\n10# boost_asio_detail_posix_thread_function in /lib/libirods_server.so.4.3.1\n11# 0x00007FC8151E6609 in /lib/x86_64-linux-gnu/libpthread.so.0\n12# clone in /lib/x86_64-linux-gnu/libc.so.6\n","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T13:52:38.519Z","server_type":"server","server_zone":"nkiImaging","stacktrace_agent_pid":"237782","stacktrace_timestamp_epoch_milliseconds":"945","stacktrace_timestamp_epoch_seconds":"1714053150","stacktrace_timestamp_utc":"2024-04-25T13:52:30.945Z"}
 {"log_category":"legacy","log_level":"error","log_message":"[-]\t/irods_source/server/core/src/rsApiHandler.cpp:565:int readAndProcClientMsg(rsComm_t *, int) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']\n\t[-]\t/irods_source/lib/core/src/sockComm.cpp:229:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']\n\t\t[-]\t/irods_source/plugins/network/src/ssl.cpp:528:irods::error ssl_read_msg_header(irods::plugin_context &, void *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [read 0 expected 4]\n\n","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":243258,"server_timestamp":"2024-04-25T14:03:24.401Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"agent","log_level":"error","log_message":"[-]\t/irods_source/server/core/src/rodsAgent.cpp:725:int runIrodsAgentFactory(sockaddr_un) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [failed to call 'agent stop']\n\t[-]\t/irods_source/lib/core/src/sockComm.cpp:191:irods::error sockAgentStop(irods::network_object_ptr) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [failed to call 'agent stop']\n\t\t[-]\t/irods_source/plugins/network/src/ssl.cpp:850:irods::error ssl_agent_stop(irods::plugin_context &) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [error completing shutdown of SSL connection]\n\n","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":243258,"server_timestamp":"2024-04-25T14:03:24.402Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":"Agent factory returned with error code [-2104000].","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":243258,"server_timestamp":"2024-04-25T14:03:24.403Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"error","log_message":"operator():337 - rcRuleExecMod failed -4000 for rule ID 148821","server_host":"p-irods-001","server_pid":234516,"server_timestamp":"2024-04-25T14:03:24.401Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"error","log_message":"Rule exec for [148821] failed. status = [-4000]","server_host":"p-irods-001","server_pid":234516,"server_timestamp":"2024-04-25T14:03:24.401Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [243256] exited with status [11].","server_host":"p-irods-001","server_pid":234503,"server_timestamp":"2024-04-25T14:03:24.405Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [243258] exited with status [1].","server_host":"p-irods-001","server_pid":234503,"server_timestamp":"2024-04-25T14:03:24.405Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [243495] terminated by signal [13].","server_host":"p-irods-001","server_pid":234503,"server_timestamp":"2024-04-25T14:03:24.525Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [243496] terminated by signal [13].","server_host":"p-irods-001","server_pid":234503,"server_timestamp":"2024-04-25T14:03:24.525Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":" 0# stacktrace_signal_handler in /lib/libirods_server.so.4.3.1\n 1# 0x00007FC8151F2420 in /lib/x86_64-linux-gnu/libpthread.so.0\n 2# void std::__1::__tree_balance_after_insert<std::__1::__tree_node_base<void*>*>(std::__1::__tree_node_base<void*>*, std::__1::__tree_node_base<void*>*) in /usr/sbin/irodsServer\n 3# 0x00007FC811327D06\n 4# 0x00007FC81131EB78\n 5# 0x00007FC8113287E2\n 6# 0x00007FC8113292A7\n 7# boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&) in /lib/libirods_server.so.4.3.1\n 8# boost::asio::detail::scheduler::run(boost::system::error_code&) in /lib/libirods_server.so.4.3.1\n 9# boost::asio::detail::posix_thread::func<boost::asio::thread_pool::thread_function>::run() in /lib/libirods_server.so.4.3.1\n10# boost_asio_detail_posix_thread_function in /lib/libirods_server.so.4.3.1\n11# 0x00007FC8151E6609 in /lib/x86_64-linux-gnu/libpthread.so.0\n12# clone in /lib/x86_64-linux-gnu/libc.so.6\n","server_host":"p-irods-001","server_pid":234501,"server_timestamp":"2024-04-25T14:03:33.398Z","server_type":"server","server_zone":"nkiImaging","stacktrace_agent_pid":"243256","stacktrace_timestamp_epoch_milliseconds":"399","stacktrace_timestamp_epoch_seconds":"1714053804","stacktrace_timestamp_utc":"2024-04-25T14:03:24.399Z"}

Op donderdag 25 april 2024 om 09:03:18 UTC+2 schreef Jan Graaf, de:

Alan King

unread,
Apr 25, 2024, 4:08:55 PMApr 25
to irod...@googlegroups.com
The logs indicate that the agent factory died: "Agent factory returned with error code [-2104000]."

After that, I see that the delay server emits an error as it attempted to update a delayed rule, and then a bunch of agents fail (possibly because the agent factory died). Is it possible that the delay server is causing the machine to run out of resources do to a very large number of delayed jobs? Try examining R_RULE_EXEC (or running iqstat -a) to see how many delayed rules are enqueued and roughly what the delayed rules are doing.

You may need to adjust the settings around the delay queue and executors. Try adjusting "number_of_concurrent_delay_rule_executors" and "maximum_size_of_delay_queue_in_bytes" to see if these help. More information on these settings can be found here: https://docs.irods.org/4.3.1/system_overview/configuration/#etcirodsserver_configjson

We should probably capture this in a GitHub issue because some of the output here is very concerning, specifically the stacktrace at the end.

Jan de Graaf

unread,
Apr 26, 2024, 1:32:33 AMApr 26
to iRODS-Chat
Hi Alan,

The only rule running is the rule for the tiering engine. But this rule has no work to do because the system is idle (no user working with irods).

irods@p-irods-001:~$ iqstat -a
id     name
148821 {"rule-engine-operation":"irods_policy_storage_tiering","storage-tier-groups":["TG_Imaging"]}

irods@p-irods-001:~$ iqstat -l
id: 148821
name: {"rule-engine-operation":"irods_policy_storage_tiering","storage-tier-groups":["TG_Imaging"]}
rei_file_path: EMPTY_REI_PATH
user_name: rods
address:
time:  : Never
frequency: REPEAT FOR EVER
priority: 5
estimated_exe_time:
notification_addr:
last_exe_time: 1714058819
exec_status:

Rule definition:
irods@p-irods-001:~$ cat /etc/irods/TG_Imaging.r
{
   "rule-engine-instance-name": "irods_rule_engine_plugin-unified_storage_tiering-instance",
   "rule-engine-operation": "irods_policy_schedule_storage_tiering",
   "delay-parameters": "<INST_NAME>irods_rule_engine_plugin-unified_storage_tiering-instance</INST_NAME><PLUSET>1s</PLUSET><EF>REPEAT FOR EVER</EF>",
   "storage-tier-groups": [
       "TG_Imaging"
   ]
}
INPUT null
OUTPUT ruleExecOut

Op donderdag 25 april 2024 om 22:08:55 UTC+2 schreef alankin...@gmail.com:

Jan de Graaf

unread,
Apr 26, 2024, 4:15:09 AMApr 26
to iRODS-Chat
Okay, one thing caught my attention after my post.

<PLUSET>1s</PLUSET> in the tering rule itself.

In the rule. I guess this got in  there due to copy/paste examples from the examples.

I adjusted this to <PLUSET>60s</PLUSET>
Now the server seems to behave? When idle, with no other rules but tiering running. Normally the server starts hogging CPU after 20-30 minutes. Now running 60 minutes without hogging the CPU.

So i will start enabling other rules and start using the system and see if it now keeps behaving and this was causing the CPU hogging.

But still having errors in the log file:


 {"log_category":"server","log_level":"info","log_message":"Initializing server ...","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:41:45.827Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"info","log_message":"Setting up UNIX domain socket for agent factory ...","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:41:45.832Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"info","log_message":"Forking agent factory ...","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:41:45.832Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"info","log_message":"Connecting to agent factory [agent_factory_pid=5449] ...","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:41:45.833Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"info","log_message":"Initializing agent factory ...","server_host":"p-irods-001","server_pid":5449,"server_timestamp":"2024-04-26T07:41:45.833Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"local_hostname":"localhost, p-irods-001, p-irods-001.nki.nl","log_category":"server","log_level":"info","port":"1247","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:41:45.893Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"info","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:41:45.996Z","server_type":"server","server_zone":"nkiImaging","zone_info.host":"localhost","zone_info.name":"nkiImaging","zone_info.port":"1247","zone_info.type":"LOCAL_ICAT"}
 {"log_category":"server","log_level":"info","log_message":"rodsServer Release version rods4.3.1 - API Version d is up","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:41:45.996Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"legacy","log_level":"info","log_message":">>> control plane :: listening on port 1248\n","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:41:45.998Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"info","log_message":"Forking Delay Server (irodsDelayServer) ...","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:41:46.531Z","server_type":"server","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"info","log_message":"Initializing delay server ...","server_host":"p-irods-001","server_pid":5462,"server_timestamp":"2024-04-26T07:41:46.561Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":" 0# 0x00007F3FCF43794C\n 1# 0x00007F3FCDFBD420\n 2# 0x00007F3FCDFB3FC4\n 3# 0x00007F3FCF1F1FF9\n 4# 0x00007F3FCE1F0732\n 5# 0x00007F3FCE1E11DE\n 6# 0x00007F3FCE2C9E36\n 7# 0x00007F3FCF09DDF7\n 8# 0x00007F3FCA0E9C47\n 9# 0x00007F3FCA0F37E2\n10# 0x00007F3FCA0F42A7\n11# 0x00007F3FCF06CAB5\n12# 0x00007F3FCF06C5A1\n13# 0x00007F3FCF06D4B0\n14# 0x00007F3FCF06C3FD\n15# 0x00007F3FCDFB1609\n16# 0x00007F3FCDED4353\n","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:41:46.647Z","server_type":"server","server_zone":"nkiImaging","stacktrace_agent_pid":"5169","stacktrace_timestamp_epoch_milliseconds":"272","stacktrace_timestamp_epoch_seconds":"1714117253","stacktrace_timestamp_utc":"2024-04-26T07:40:53.272Z"}
 {"log_category":"legacy","log_level":"error","log_message":"[-]\t/irods_source/server/core/src/rsApiHandler.cpp:565:int readAndProcClientMsg(rsComm_t *, int) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']\n\t[-]\t/irods_source/lib/core/src/sockComm.cpp:229:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']\n\t\t[-]\t/irods_source/plugins/network/src/ssl.cpp:528:irods::error ssl_read_msg_header(irods::plugin_context &, void *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [read 0 expected 4]\n\n","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":8593,"server_timestamp":"2024-04-26T07:47:06.998Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"error","log_message":"operator():337 - rcRuleExecMod failed -4000 for rule ID 148892","server_host":"p-irods-001","server_pid":5462,"server_timestamp":"2024-04-26T07:47:06.999Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"error","log_message":"Rule exec for [148892] failed. status = [-4000]","server_host":"p-irods-001","server_pid":5462,"server_timestamp":"2024-04-26T07:47:06.999Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"agent","log_level":"error","log_message":"[-]\t/irods_source/server/core/src/rodsAgent.cpp:725:int runIrodsAgentFactory(sockaddr_un) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [failed to call 'agent stop']\n\t[-]\t/irods_source/lib/core/src/sockComm.cpp:191:irods::error sockAgentStop(irods::network_object_ptr) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [failed to call 'agent stop']\n\t\t[-]\t/irods_source/plugins/network/src/ssl.cpp:850:irods::error ssl_agent_stop(irods::plugin_context &) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [error completing shutdown of SSL connection]\n\n","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":8593,"server_timestamp":"2024-04-26T07:47:06.999Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":"Agent factory returned with error code [-2104000].","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":8593,"server_timestamp":"2024-04-26T07:47:07.000Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [8592] exited with status [11].","server_host":"p-irods-001","server_pid":5449,"server_timestamp":"2024-04-26T07:47:07.023Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [8593] exited with status [1].","server_host":"p-irods-001","server_pid":5449,"server_timestamp":"2024-04-26T07:47:07.023Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":" 0# stacktrace_signal_handler in /lib/libirods_server.so.4.3.1\n 1# 0x00007F003F003420 in /lib/x86_64-linux-gnu/libpthread.so.0\n 2# 0x00007F003B1389DF\n 3# 0x00007F003B1389F3\n 4# 0x00007F003B1389E7\n 5# 0x00007F003B1389E7\n 6# 0x00007F003B1389E7\n 7# 0x00007F003B1389F3\n 8# 0x00007F003B1389F3\n 9# 0x00007F003B1389F3\n10# 0x00007F003B1389F3\n11# 0x00007F003B1389E7\n12# 0x00007F003B1389F3\n13# 0x00007F003B12B06C\n14# 0x00007F003B12EDE3\n15# 0x00007F003B0FF74B\n16# 0x00007F003B12472F\n17# irods::pluggable_rule_engine<std::__1::tuple<> >::exec_rule_expression(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback) in /lib/libirods_server.so.4.3.1\n18# irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)0>::exec_rule_expression(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*) in /lib/libirods_server.so.4.3.1\n19# rsExecRuleExpression(RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n20# irods::api_call_adaptor<ExecRuleExpression*>::operator()(irods::plugin_context&, RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n21# std::__1::__function::__func<irods::api_call_adaptor<ExecRuleExpression*>, std::__1::allocator<irods::api_call_adaptor<ExecRuleExpression*> >, irods::error (irods::plugin_context&, RsComm*, ExecRuleExpression*)>::operator()(irods::plugin_context&, RsComm*&&, ExecRuleExpression*&&) in /lib/libirods_server.so.4.3.1\n22# int irods::api_entry::call_handler<ExecRuleExpression*>(RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n23# rsApiHandler(RsComm*, int, BytesBuf*, BytesBuf*) in /lib/libirods_server.so.4.3.1\n24# readAndProcClientMsg(RsComm*, int) in /lib/libirods_server.so.4.3.1\n25# agentMain(RsComm*) in /lib/libirods_server.so.4.3.1\n26# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n27# main::$_5::operator()() const at rodsServer.cpp:?\n28# main in /usr/sbin/irodsServer\n29# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6\n30# _start in /usr/sbin/irodsServer\n","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:47:08.874Z","server_type":"server","server_zone":"nkiImaging","stacktrace_agent_pid":"8592","stacktrace_timestamp_epoch_milliseconds":"996","stacktrace_timestamp_epoch_seconds":"1714117626","stacktrace_timestamp_utc":"2024-04-26T07:47:06.996Z"}
 {"log_category":"legacy","log_level":"error","log_message":"[-]\t/irods_source/server/core/src/rsApiHandler.cpp:565:int readAndProcClientMsg(rsComm_t *, int) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']\n\t[-]\t/irods_source/lib/core/src/sockComm.cpp:229:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']\n\t\t[-]\t/irods_source/plugins/network/src/ssl.cpp:528:irods::error ssl_read_msg_header(irods::plugin_context &, void *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [read 0 expected 4]\n\n","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":9112,"server_timestamp":"2024-04-26T07:48:07.131Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"agent","log_level":"error","log_message":"[-]\t/irods_source/server/core/src/rodsAgent.cpp:725:int runIrodsAgentFactory(sockaddr_un) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [failed to call 'agent stop']\n\t[-]\t/irods_source/lib/core/src/sockComm.cpp:191:irods::error sockAgentStop(irods::network_object_ptr) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [failed to call 'agent stop']\n\t\t[-]\t/irods_source/plugins/network/src/ssl.cpp:850:irods::error ssl_agent_stop(irods::plugin_context &) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [error completing shutdown of SSL connection]\n\n","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":9112,"server_timestamp":"2024-04-26T07:48:07.132Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"error","log_message":"operator():337 - rcRuleExecMod failed -4000 for rule ID 148892","server_host":"p-irods-001","server_pid":5462,"server_timestamp":"2024-04-26T07:48:07.131Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"error","log_message":"Rule exec for [148892] failed. status = [-4000]","server_host":"p-irods-001","server_pid":5462,"server_timestamp":"2024-04-26T07:48:07.131Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":"Agent factory returned with error code [-2104000].","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":9112,"server_timestamp":"2024-04-26T07:48:07.132Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [9111] exited with status [11].","server_host":"p-irods-001","server_pid":5449,"server_timestamp":"2024-04-26T07:48:07.136Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [9112] exited with status [1].","server_host":"p-irods-001","server_pid":5449,"server_timestamp":"2024-04-26T07:48:07.136Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [9246] terminated by signal [13].","server_host":"p-irods-001","server_pid":5449,"server_timestamp":"2024-04-26T07:48:07.167Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [9247] terminated by signal [13].","server_host":"p-irods-001","server_pid":5449,"server_timestamp":"2024-04-26T07:48:07.257Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":" 0# stacktrace_signal_handler in /lib/libirods_server.so.4.3.1\n 1# 0x00007F003F003420 in /lib/x86_64-linux-gnu/libpthread.so.0\n 2# void std::__1::__tree_balance_after_insert<std::__1::__tree_node_base<void*>*>(std::__1::__tree_node_base<void*>*, std::__1::__tree_node_base<void*>*) in /usr/sbin/irodsServer\n 3# 0x00007F003B138D06\n 4# 0x00007F003B12FB78\n 5# 0x00007F003B1397E2\n 6# 0x00007F003B13A2A7\n 7# boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&) in /lib/libirods_server.so.4.3.1\n 8# boost::asio::detail::scheduler::run(boost::system::error_code&) in /lib/libirods_server.so.4.3.1\n 9# boost::asio::detail::posix_thread::func<boost::asio::thread_pool::thread_function>::run() in /lib/libirods_server.so.4.3.1\n10# boost_asio_detail_posix_thread_function in /lib/libirods_server.so.4.3.1\n11# 0x00007F003EFF7609 in /lib/x86_64-linux-gnu/libpthread.so.0\n12# clone in /lib/x86_64-linux-gnu/libc.so.6\n","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T07:48:09.236Z","server_type":"server","server_zone":"nkiImaging","stacktrace_agent_pid":"9111","stacktrace_timestamp_epoch_milliseconds":"128","stacktrace_timestamp_epoch_seconds":"1714117687","stacktrace_timestamp_utc":"2024-04-26T07:48:07.128Z"}
 {"log_category":"legacy","log_level":"error","log_message":"[-]\t/irods_source/server/core/src/rsApiHandler.cpp:565:int readAndProcClientMsg(rsComm_t *, int) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']\n\t[-]\t/irods_source/lib/core/src/sockComm.cpp:229:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']\n\t\t[-]\t/irods_source/plugins/network/src/ssl.cpp:528:irods::error ssl_read_msg_header(irods::plugin_context &, void *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [read 0 expected 4]\n\n","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":17269,"server_timestamp":"2024-04-26T08:02:39.500Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"agent","log_level":"error","log_message":"[-]\t/irods_source/server/core/src/rodsAgent.cpp:725:int runIrodsAgentFactory(sockaddr_un) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [failed to call 'agent stop']\n\t[-]\t/irods_source/lib/core/src/sockComm.cpp:191:irods::error sockAgentStop(irods::network_object_ptr) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [failed to call 'agent stop']\n\t\t[-]\t/irods_source/plugins/network/src/ssl.cpp:850:irods::error ssl_agent_stop(irods::plugin_context &) :  status [SSL_SHUTDOWN_ERROR]  errno [] -- message [error completing shutdown of SSL connection]\n\n","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":17269,"server_timestamp":"2024-04-26T08:02:39.500Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":"Agent factory returned with error code [-2104000].","request_api_name":"GEN_QUERY_AN","request_api_number":702,"request_api_version":"d","request_client_user":"rods","request_host":"172.31.32.83","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"p-irods-001","server_pid":17269,"server_timestamp":"2024-04-26T08:02:39.500Z","server_type":"agent","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"error","log_message":"operator():337 - rcRuleExecMod failed -4000 for rule ID 148892","server_host":"p-irods-001","server_pid":5462,"server_timestamp":"2024-04-26T08:02:39.500Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [17268] exited with status [11].","server_host":"p-irods-001","server_pid":5449,"server_timestamp":"2024-04-26T08:02:39.503Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"agent_factory","log_level":"error","log_message":"Agent process [17269] exited with status [1].","server_host":"p-irods-001","server_pid":5449,"server_timestamp":"2024-04-26T08:02:39.503Z","server_type":"agent_factory","server_zone":"nkiImaging"}
 {"log_category":"delay_server","log_level":"error","log_message":"Rule exec for [148892] failed. status = [-4000]","server_host":"p-irods-001","server_pid":5462,"server_timestamp":"2024-04-26T08:02:39.500Z","server_type":"delay_server","server_zone":"nkiImaging"}
 {"log_category":"server","log_level":"critical","log_message":" 0# stacktrace_signal_handler in /lib/libirods_server.so.4.3.1\n 1# 0x00007F003F003420 in /lib/x86_64-linux-gnu/libpthread.so.0\n 2# 0x00007F003B1389DF\n 3# 0x00007F003B1389F3\n 4# 0x00007F003B1389E7\n 5# 0x00007F003B1389E7\n 6# 0x00007F003B1389E7\n 7# 0x00007F003B1389F3\n 8# 0x00007F003B1389E7\n 9# 0x00007F003B12B06C\n10# 0x00007F003B12EDE3\n11# 0x00007F003B0FF74B\n12# 0x00007F003B12472F\n13# irods::pluggable_rule_engine<std::__1::tuple<> >::exec_rule_expression(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback) in /lib/libirods_server.so.4.3.1\n14# irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)0>::exec_rule_expression(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*) in /lib/libirods_server.so.4.3.1\n15# rsExecRuleExpression(RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n16# irods::api_call_adaptor<ExecRuleExpression*>::operator()(irods::plugin_context&, RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n17# std::__1::__function::__func<irods::api_call_adaptor<ExecRuleExpression*>, std::__1::allocator<irods::api_call_adaptor<ExecRuleExpression*> >, irods::error (irods::plugin_context&, RsComm*, ExecRuleExpression*)>::operator()(irods::plugin_context&, RsComm*&&, ExecRuleExpression*&&) in /lib/libirods_server.so.4.3.1\n18# int irods::api_entry::call_handler<ExecRuleExpression*>(RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n19# rsApiHandler(RsComm*, int, BytesBuf*, BytesBuf*) in /lib/libirods_server.so.4.3.1\n20# readAndProcClientMsg(RsComm*, int) in /lib/libirods_server.so.4.3.1\n21# agentMain(RsComm*) in /lib/libirods_server.so.4.3.1\n22# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n23# main::$_5::operator()() const at rodsServer.cpp:?\n24# main in /usr/sbin/irodsServer\n25# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6\n26# _start in /usr/sbin/irodsServer\n","server_host":"p-irods-001","server_pid":5448,"server_timestamp":"2024-04-26T08:02:43.955Z","server_type":"server","server_zone":"nkiImaging","stacktrace_agent_pid":"17268","stacktrace_timestamp_epoch_milliseconds":"497","stacktrace_timestamp_epoch_seconds":"1714118559","stacktrace_timestamp_utc":"2024-04-26T08:02:39.497Z"}


Op vrijdag 26 april 2024 om 07:32:33 UTC+2 schreef Jan Graaf, de:

Terrell Russell

unread,
Jun 8, 2024, 10:28:00 AMJun 8
to irod...@googlegroups.com
We met at the iRODS UGM 2024 last week and I believe we got everything working?

Jan, is this true?   Can you explain here what the issue was, if it was resolved?

Terrell



Jan de Graaf

unread,
Jul 2, 2024, 3:44:24 AM (5 days ago) Jul 2
to iRODS-Chat
I stop hogging so no issue any more.

Op zaterdag 8 juni 2024 om 16:28:00 UTC+2 schreef Terrell Russell:
Reply all
Reply to author
Forward
0 new messages