jose cm
unread,Jan 8, 2024, 6:55:42 AMJan 8Sign in to reply to author
Sign in to forward
You do not have permission to delete messages in this group
Either email addresses are anonymous for this group or you need the view member email addresses permission to view the original message
to rabbitmq-users
Hi people,
We are using RabbitMQ server cluster with 4 nodes.
All 4 nodes are identical, running rabbit 3.11.10 and Erlang OTP 25.2.3
on Suse SLES 15 sp4
The issue is that randomly (twice a week) systemd service reports
monotonic time has gone backwards and restarts the rabbitmq server service.
as seen in systemd logs :
Jan 03 14:16:49 srv1wc0 rabbitmq-server[27462]: Monotonic time stepped backwards!
Jan 03 14:16:49 srv1wc0 rabbitmq-server[27462]: Previous time: 2861975396900129
Jan 03 14:16:49 srv1wc0 rabbitmq-server[27462]: Current time: 2861975396900128
Jan 03 14:16:50 srv1wc0 systemd[1]: URabbitMQ.service: Main process exited, code=killed, status=6/ABRT
Jan 03 14:16:50 srv1wc0 systemd[1]: URabbitMQ.service: Failed with result 'signal'.
Jan 03 14:17:20 srv1wc0 systemd[1]: URabbitMQ.service: Scheduled restart job, restart counter is at 1.
on the rabbit node logs we see a msg about Mnesia db stopped then the typical restart logs (systemd controlled) :
2024-01-03 14:17:25.630012+01:00 [notice] <0.44.0> Application mnesia exited with reason: stopped
2024-01-03 14:17:25.630223+01:00 [notice] <0.232.0> Feature flags: checking nodes `rabbit@RMAHRF02` and `rabbit@RMAHRB01` compatibility...
2024-01-03 14:17:25.633573+01:00 [notice] <0.232.0> Feature flags: nodes `rabbit@RMAHRF02` and `rabbit@RMAHRB01` are compatible
2024-01-03 14:17:25.656600+01:00 [info] <0.232.0> ra: starting system quorum_queues
2024-01-03 14:17:25.656676+01:00 [info] <0.232.0> starting Ra system: quorum_queues in directory: /users/rmq00/rmq/mnesia/rabbit@RMAHRF02/quorum/rabbit@RMAHRF02
2024-01-03 14:17:25.657611+01:00 [info] <0.338.0> ra system 'quorum_queues' running pre init for 1 registered servers
2024-01-03 14:17:25.658642+01:00 [info] <0.339.0> ra: meta data store initialised for system quorum_queues. 1 record(s) recovered
2024-01-03 14:17:25.658804+01:00 [notice] <0.344.0> WAL: ra_log_wal init, open tbls: ra_log_open_mem_tables, closed tbls: ra_log_closed_mem_tables
2024-01-03 14:17:25.662368+01:00 [info] <0.232.0> ra: starting system coordination
2024-01-03 14:17:25.662412+01:00 [info] <0.232.0> starting Ra system: coordination in directory: /users/rmq00/rmq/mnesia/rabbit@RMAHRF02/coordination/rabbit@RMAHRF02
2024-01-03 14:17:25.663298+01:00 [info] <0.355.0> ra system 'coordination' running pre init for 1 registered servers
2024-01-03 14:17:25.666412+01:00 [info] <0.356.0> ra: meta data store initialised for system coordination. 1 record(s) recovered
2024-01-03 14:17:25.666536+01:00 [notice] <0.365.0> WAL: ra_coordination_log_wal init, open tbls: ra_coordination_log_open_mem_tables, closed tbls: ra_coordination_log_closed_mem_tables
2024-01-03 14:17:26.505286+01:00 [info] <0.232.0>
2024-01-03 14:17:26.505286+01:00 [info] <0.232.0> Starting RabbitMQ 3.11.10 on Erlang 25.2.3 [jit]
We tried adding Erlang VM parameter modifier :
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+C multi_time_warp"
into cfg file rabbitmq-env.conf, but with no effect.
We don't know if the issue comes from
Rabbit itself , systemd or maybe something OS related (cpu/ntp, etc).
So, please if someone can shed a light it will be much appreciated !
Cheers!
José.