RabbitMQ queue crashed and failed to restart

金甲虫

May 28, 2024, 11:50:41 AM
to rabbitmq-users
Our RabbitMQ cluster restarted after hitting the memory high watermark.
Soon after, some queues crashed and failed to restart.
Version: 3.7.17
OS: CentOS
Cluster: 3 nodes, each 8 CPU cores / 16 GB RAM
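
For context on when the memory alarm fires: the broker raises it once its memory use crosses a configured fraction of the RAM it detects. The sketch below works that out under two assumptions that the post does not confirm: that "16G" means 16 GiB of RAM per node, and that `vm_memory_high_watermark` is left at its default relative value of 0.4. The helper name `watermark_bytes` is made up for illustration.

```python
GIB = 1024 ** 3


def watermark_bytes(total_ram_bytes: int, relative: float = 0.4) -> int:
    """Return the memory alarm threshold in bytes.

    `relative` mirrors a relative vm_memory_high_watermark setting;
    0.4 is assumed here because the post does not show the node's config.
    """
    return int(total_ram_bytes * relative)


# Assumed node size: 16 GiB of RAM.
limit = watermark_bytes(16 * GIB)
print(f"alarm threshold: {limit / GIB:.1f} GiB")
```

If the nodes use an absolute limit or a different fraction, the threshold differs accordingly; the node's effective configuration is the authority.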

```log
2024-05-28 07:02:56.909 [warning] <0.5960.622> Binding error: home node 'rabbit@dms-vm-e745e1f6-rabbitmq-1' of durable queue 'com.anta.srm.purchasing.service.DemandToOrderMessageHandler.handlerCreatOrderFlagByFilaOneMessage(DemandToOrderDto)anta' in vhost 'ascm' is down or inaccessible
2024-05-28 07:12:14.422 [info] <0.26331.613> Making sure data directory '/opt/dms/data/rabbitmq/rabbit@dms-vm-e745e1f6-rabbitmq-1/msg_stores/vhosts/D3SEFWKV9AZ7151W6XGAFREY6' for vhost 'ascm-prod' exists
2024-05-28 07:12:14.427 [info] <0.26331.613> Starting message stores for vhost 'ascm-prod'
2024-05-28 07:12:14.427 [info] <0.12039.664> Message store "D3SEFWKV9AZ7151W6XGAFREY6/msg_store_transient": using rabbit_msg_store_ets_index to provide index
2024-05-28 07:12:14.428 [info] <0.26331.613> Started message store of type transient for vhost 'ascm-prod'
2024-05-28 07:12:14.428 [info] <0.14138.666> Message store "D3SEFWKV9AZ7151W6XGAFREY6/msg_store_persistent": using rabbit_msg_store_ets_index to provide index
2024-05-28 07:12:14.428 [warning] <0.14138.666> Message store "D3SEFWKV9AZ7151W6XGAFREY6/msg_store_persistent": rebuilding indices from scratch
2024-05-28 07:12:14.429 [info] <0.26331.613> Started message store of type persistent for vhost 'ascm-prod'
2024-05-28 07:12:34.891 [info] <0.1550.657> Setting permissions for 'ascm-user' in 'ascm-prod' to '.*', '.*', '.*'
2024-05-28 07:31:05.856 [info] <0.25110.685> Setting permissions for 'dms' in 'crm-kolon' to '.*', '.*', '.*'
2024-05-28 07:53:12.939 [warning] <0.304.0> memory resource limit alarm set on node 'rabbit@dms-vm-e745e1f6-rabbitmq-2'.

**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************
2024-05-28 07:53:14.941 [warning] <0.304.0> memory resource limit alarm cleared on node 'rabbit@dms-vm-e745e1f6-rabbitmq-2'
2024-05-28 07:53:14.941 [warning] <0.304.0> memory resource limit alarm cleared across the cluster
2024-05-28 07:53:16.951 [warning] <0.304.0> memory resource limit alarm set on node 'rabbit@dms-vm-e745e1f6-rabbitmq-2'.

**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************
2024-05-28 07:53:20.970 [warning] <0.304.0> memory resource limit alarm cleared on node 'rabbit@dms-vm-e745e1f6-rabbitmq-2'
2024-05-28 07:53:20.970 [warning] <0.304.0> memory resource limit alarm cleared across the cluster
2024-05-28 07:58:36.446 [error] <0.28457.753> Restarting crashed queue 'queue.member.coupon.write.off.initiative' in vhost 'crm-kolon'.
2024-05-28 07:58:36.446 [error] <0.15376.666> ** Generic server <0.15376.666> terminating
** Last message in was {init,new}
** When Server state == {q,{amqqueue,{resource,<<"crm-kolon">>,queue,<<"queue.member.coupon.write.off.initiative">>},true,false,none,[{<<"x-queue-type">>,longstr,<<"classic">>}],<0.15376.666>,[],[],[],[{vhost,<<"crm-kolon">>},{name,<<"queue.member.coupon.write.off.initiative">>},{pattern,<<"queue.member.coupon.write.off.initiative">>},{'apply-to',<<"queues">>},{definition,[{<<"expires">>,1}]},{priority,0}],undefined,[],[],live,0,[],<<"crm-kolon">>,#{user => <<"odcrm">>}},none,false,undefined,undefined,{state,{queue,[],[],0},{active,-576450215338394,1.0}},undefined,undefined,undefined,undefined,{state,fine,5000,undefined},{0,nil},undefined,undefined,undefined,{state,{dict,0,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},delegate},undefined,undefined,undefined,undefined,'drop-head',0,0,running}
** Reason for termination ==
** {{badmatch,true},[{rabbit_queue_index,init,3,[{file,"src/rabbit_queue_index.erl"},{line,315}]},{rabbit_variable_queue,init,6,[{file,"src/rabbit_variable_queue.erl"},{line,537}]},{rabbit_priority_queue,init,3,[{file,"src/rabbit_priority_queue.erl"},{line,151}]},{rabbit_amqqueue_process,init_it2,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,207}]},{rabbit_amqqueue_process,handle_call,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,1164}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1035}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2024-05-28 07:58:36.446 [error] <0.15376.666> CRASH REPORT Process <0.15376.666> with 0 neighbours exited with reason: no match of right hand value true in rabbit_queue_index:init/3 line 315 in gen_server2:terminate/3 line 1172
2024-05-28 07:58:36.446 [error] <0.24966.758> Supervisor {<0.24966.758>,rabbit_amqqueue_sup} had child rabbit_amqqueue started with rabbit_prequeue:start_link({amqqueue,{resource,<<"crm-kolon">>,queue,<<"queue.member.coupon.write.off.initiative">>},true,false,...}, declare, <0.9394.758>) at <0.15376.666> exit with reason no match of right hand value true in rabbit_queue_index:init/3 line 315 in context child_terminated
2024-05-28 07:58:36.447 [error] <0.19815.761> ** Generic server <0.19815.761> terminating
** Last message in was {'$gen_cast',{method,{'queue.declare',0,<<"queue.member.coupon.write.off.initiative">>,false,true,false,false,false,[{<<"x-queue-type">>,longstr,<<"classic">>}]},none,noflow}}
** When Server state == {ch,running,rabbit_framing_amqp_0_9_1,1,<0.32244.753>,<0.31130.763>,<0.32244.753>,<<"172.31.1.200:13792 -> 172.31.1.230:5672">>,rabbit_reader,{lstate,<0.26156.727>,false},none,1,{[],[]},{user,<<"odcrm">>,[administrator],[{rabbit_auth_backend_internal,none}]},<<"crm-kolon">>,<<"queue.member.coupon.write.off.event">>,#{},{state,{dict,0,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},erlang},#{},#{},{set,0,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},<0.23877.737>,{state,fine,5000,#Ref<0.693285892.3135504385.83844>},false,1,{{0,nil},{0,nil}},[],[],{{0,nil},{0,nil}},[{<<"exchange_exchange_bindings">>,bool,true},{<<"connection.blocked">>,bool,true},{<<"authentication_failure_close">>,bool,true},{<<"basic.nack">>,bool,true},{<<"publisher_confirms">>,bool,true},{<<"consumer_cancel_notify">>,bool,true}],none,0,none,flow,[]}
** Reason for termination ==
** {{{{badmatch,true},[{rabbit_queue_index,init,3,[{file,"src/rabbit_queue_index.erl"},{line,315}]},{rabbit_variable_queue,init,6,[{file,"src/rabbit_variable_queue.erl"},{line,537}]},{rabbit_priority_queue,init,3,[{file,"src/rabbit_priority_queue.erl"},{line,151}]},{rabbit_amqqueue_process,init_it2,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,207}]},{rabbit_amqqueue_process,handle_call,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,1164}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1035}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]},{gen_server2,call,[<0.15376.666>,{init,new},infinity]}},[{gen_server2,call,3,[{file,"src/gen_server2.erl"},{line,335}]},{rabbit_channel,handle_method,6,[{file,"/root/rabbitmq-server-3.7.17/deps/rabbit/src/rabbit_channel.erl"},{line,2246}]},{rabbit_channel,handle_method,3,[{file,"/root/rabbitmq-server-3.7.17/deps/rabbit/src/rabbit_channel.erl"},{line,1456}]},{rabbit_channel,handle_cast,2,[{file,"/root/rabbitmq-server-3.7.17/deps/rabbit/src/rabbit_channel.erl"},{line,567}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1056}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2024-05-28 07:58:36.447 [error] <0.19815.761> CRASH REPORT Process <0.19815.761> with 0 neighbours exited with reason: {{{badmatch,true},[{rabbit_queue_index,init,3,[{file,"src/rabbit_queue_index.erl"},{line,315}]},{rabbit_variable_queue,init,6,[{file,"src/rabbit_variable_queue.erl"},{line,537}]},{rabbit_priority_queue,init,3,[{file,"src/rabbit_priority_queue.erl"},{line,151}]},{rabbit_amqqueue_process,init_it2,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,207}]},{rabbit_amqqueue_process,handle_call,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,1164}]},{gen_server2,handle_msg,2,[{file,"src/gen_serv..."},...]},...]},...} in gen_server2:call/3 line 335 in gen_server2:terminate/3 line 1172
2024-05-28 07:58:36.447 [error] <0.14415.767> Supervisor {<0.14415.767>,rabbit_channel_sup} had child channel started with rabbit_channel:start_link(1, <0.32244.753>, <0.31130.763>, <0.32244.753>, <<"172.31.1.200:13792 -> 172.31.1.230:5672">>, rabbit_framing_amqp_0_9_1, {user,<<"odcrm">>,[administrator],[{rabbit_auth_backend_internal,none}]}, <<"crm-kolon">>, [{<<"exchange_exchange_bindings">>,bool,true},{<<"connection.blocked">>,bool,true},{<<"authentica...">>,...},...], <0.23877.737>, <0.26156.727>) at <0.19815.761> exit with reason {{{badmatch,true},[{rabbit_queue_index,init,3,[{file,"src/rabbit_queue_index.erl"},{line,315}]},{rabbit_variable_queue,init,6,[{file,"src/rabbit_variable_queue.erl"},{line,537}]},{rabbit_priority_queue,init,3,[{file,"src/rabbit_priority_queue.erl"},{line,151}]},{rabbit_amqqueue_process,init_it2,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,207}]},{rabbit_amqqueue_process,handle_call,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,1164}]},{gen_server2,handle_msg,2,[{file,"src/gen_serv..."},...]},...]},...} in gen_server2:call/3 line 335 in context child_terminated
2024-05-28 07:58:36.447 [error] <0.14415.767> Supervisor {<0.14415.767>,rabbit_channel_sup} had child channel started with rabbit_channel:start_link(1, <0.32244.753>, <0.31130.763>, <0.32244.753>, <<"172.31.1.200:13792 -> 172.31.1.230:5672">>, rabbit_framing_amqp_0_9_1, {user,<<"odcrm">>,[administrator],[{rabbit_auth_backend_internal,none}]}, <<"crm-kolon">>, [{<<"exchange_exchange_bindings">>,bool,true},{<<"connection.blocked">>,bool,true},{<<"authentica...">>,...},...], <0.23877.737>, <0.26156.727>) at <0.19815.761> exit with reason reached_max_restart_intensity in context shutdown
2024-05-28 07:58:36.449 [warning] <0.32244.753> Non-AMQP exit reason '{{{{badmatch,true},[{rabbit_queue_index,init,3,[{file,"src/rabbit_queue_index.erl"},{line,315}]},{rabbit_variable_queue,init,6,[{file,"src/rabbit_variable_queue.erl"},{line,537}]},{rabbit_priority_queue,init,3,[{file,"src/rabbit_priority_queue.erl"},{line,151}]},{rabbit_amqqueue_process,init_it2,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,207}]},{rabbit_amqqueue_process,handle_call,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,1164}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1035}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]},{gen_server2,call,[<0.15376.666>,{init,new},infinity]}},[{gen_server2,call,3,[{file,"src/gen_server2.erl"},{line,335}]},{rabbit_channel,handle_method,6,[{file,"/root/rabbitmq-server-3.7.17/deps/rabbit/src/rabbit_channel.erl"},{line,2246}]},{rabbit_channel,handle_method,3,[{file,"/root/rabbitmq-server-3.7.17/deps/rabbit/src/rabbit_channel.erl"},{line,1456}]},{rabbit_channel,handle_cast,2,[{file,"/root/rabbitmq-server-3.7.17/deps/rabbit/src/rabbit_channel.erl"},{line,567}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1056}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}'
2024-05-28 08:17:55.262 [warning] <0.254.0> rabbit_sysmon_handler busy_dist_port <0.17036.0> [{name,delegate_management_0},{initial_call,{delegate,init,1}},{gen_server2,process_next_msg,1},{message_queue_len,0}] {#Port<0.1036>,unknown}
2024-05-28 08:56:11.816 [error] <0.5493.914> Restarting crashed queue 'otd_file_reportForm_queue' in vhost 'otd'.
2024-05-28 08:56:11.816 [error] <0.26939.920> ** Generic server <0.26939.920> terminating
** Last message in was {init,new}
** When Server state == {q,{amqqueue,{resource,<<"otd">>,queue,<<"otd_file_reportForm_queue">>},true,false,none,[],<0.26939.920>,[],[],[],undefined,undefined,[],[],live,0,[],<<"otd">>,#{user => <<"OTD">>}},none,false,undefined,undefined,{state,{queue,[],[],0},{active,-576446759943434,1.0}},undefined,undefined,undefined,undefined,{state,fine,5000,undefined},{0,nil},undefined,undefined,undefined,{state,{dict,0,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},delegate},undefined,undefined,undefined,undefined,'drop-head',0,0,running}
** Reason for termination ==
** {{badmatch,true},[{rabbit_queue_index,init,3,[{file,"src/rabbit_queue_index.erl"},{line,315}]},{rabbit_variable_queue,init,6,[{file,"src/rabbit_variable_queue.erl"},{line,537}]},{rabbit_priority_queue,init,3,[{file,"src/rabbit_priority_queue.erl"},{line,151}]},{rabbit_amqqueue_process,init_it2,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,207}]},{rabbit_amqqueue_process,handle_call,3,[{file,"src/rabbit_amqqueue_process.erl"},{line,1164}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1035}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2024-05-28 08:56:11.816 [error] <0.26939.920> CRASH REPORT Process <0.26939.920> with 0 neighbours exited with reason: no match of right hand value true in rabbit_queue_index:init/3 line 315 in gen_server2:terminate/3 line 1172
2024-05-28 08:56:11.816 [error] <0.19563.913> Supervisor {<0.19563.913>,rabbit_amqqueue_sup} had child rabbit_amqqueue started with rabbit_prequeue:start_link({amqqueue,{resource,<<"otd">>,queue,<<"otd_file_reportForm_queue">>},true,false,none,[],none,[],...}, declare, <0.4758.925>) at <0.26939.920> exit with reason no match of right hand value true in rabbit_queue_index:init/3 line 315 in context child_terminated
```
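
With stack traces this long, it can help to reduce the log to just the queues that crashed. The following is a minimal sketch that matches the `Restarting crashed queue` lines exactly as they appear in the log above. The function name `crashed_queues` and the `SAMPLE` list are illustrative, and the pattern is only known to fit the line format shown in this thread.

```python
import re

# Pattern for lines such as:
# 2024-05-28 07:58:36.446 [error] <0.28457.753> Restarting crashed queue '...' in vhost '...'.
CRASH_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[error\] <[\d.]+> "
    r"Restarting crashed queue '(?P<queue>[^']+)' in vhost '(?P<vhost>[^']+)'\."
)


def crashed_queues(lines):
    """Yield (timestamp, vhost, queue) for every crashed-queue restart line."""
    for line in lines:
        match = CRASH_RE.match(line)
        if match:
            yield match.group("ts"), match.group("vhost"), match.group("queue")


# Three lines copied from the log above; the middle one should be ignored.
SAMPLE = [
    "2024-05-28 07:58:36.446 [error] <0.28457.753> Restarting crashed queue "
    "'queue.member.coupon.write.off.initiative' in vhost 'crm-kolon'.",
    "2024-05-28 07:53:14.941 [warning] <0.304.0> memory resource limit alarm "
    "cleared across the cluster",
    "2024-05-28 08:56:11.816 [error] <0.5493.914> Restarting crashed queue "
    "'otd_file_reportForm_queue' in vhost 'otd'.",
]

events = list(crashed_queues(SAMPLE))
for ts, vhost, queue in events:
    print(f"{ts}  vhost={vhost}  queue={queue}")
```

To run it against a real log, pass an open file object instead of `SAMPLE`, since the function only needs an iterable of lines.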

Michal Kuratczyk

May 28, 2024, 12:13:37 PM
to rabbitm...@googlegroups.com
You are running a version that has been out of support for years.

The components involved in this failure have been completely rewritten over the last 2 years or so.

Do not expect support from the RabbitMQ team until you upgrade to 3.13.

Of course community members can chime in and try to help.


--
Michal
RabbitMQ Team
