We had a network outage and RabbitMQ (RMQ) has not recovered cleanly. I see RMQ errors in the logs of various OpenStack components:
/var/log/neutron/neutron-metadata-agent.log:
error: [Errno 32] Broken pipe
/var/log/designate/designate-central.log:
2019-12-02 15:28:14.203 4106 ERROR oslo.messaging._drivers.impl_rabbit [req-155b359b-4468-46ce-a95a-e2a8a50fc3ee 5b288691527245bda715ab7744a193e9 deea2d8541f741eda6fb0d242d16bb23 - - -] [ee9e3b84-55c7-40a4-a7e5-7f9c72688002] AMQP server on us01odc-p02-ctrl2.internal.synopsys.com:5672 is unreachable: [Errno 32] Broken pipe. Trying again in 1 seconds.: error: [Errno 32] Broken pipe
If I look at cluster_status it looks fine:
root@us01odc-p02-ctrl1:/var/log/rabbitmq# rabbitmqctl cluster_status
Cluster status of node 'rabbit@us01odc-p02-ctrl1'
[{nodes,[{disc,['rabbit@us01odc-p02-ctrl1']},
{ram,['rabbit@us01odc-p02-ctrl3','rabbit@us01odc-p02-ctrl2']}]},
{running_nodes,['rabbit@us01odc-p02-ctrl3','rabbit@us01odc-p02-ctrl2',
'rabbit@us01odc-p02-ctrl1']},
{cluster_name,<<"rab...@us01odc-p02-ctrl1.internal.synopsys.com">>},
{partitions,[]},
{alarms,[{'rabbit@us01odc-p02-ctrl3',[]},
{'rabbit@us01odc-p02-ctrl2',[]},
{'rabbit@us01odc-p02-ctrl1',[]}]}]
But when I go to the web interface and look at the Summary, I see 1000 Ready messages and 4500 Unacked. If I go to Queues and sort by Total, I see the unacked messages under q-plugin.
The q-plugin queue was out of sync, so I synced it, but that didn't fix the problem. If I try to get messages from q-plugin, I get the error "Queue is empty." If I purge it, the unacked messages don't go away. How can I help this queue recover from the network outage?
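In case it helps anyone compare numbers, the same per-queue counts can be read from the management HTTP API instead of the web interface. Below is a rough Python sketch, not something taken from this deployment: the URL http://localhost:15672 and the guest/guest credentials are assumed defaults, and fetch_queues and busiest_queues are hypothetical helper names. It reads GET /api/queues and sorts by ready plus unacked, which should correspond to sorting the Queues page by Total.

```python
import base64
import json
import urllib.request


def fetch_queues(base_url="http://localhost:15672", user="guest", password="guest"):
    """Return the list of queue objects from GET /api/queues.

    base_url, user and password are assumptions (management plugin defaults);
    replace them with the values for your own cluster.
    """
    request = urllib.request.Request(base_url + "/api/queues")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def busiest_queues(queues, limit=10):
    """Sort queues by ready + unacked, largest first, and keep the top few."""
    rows = [
        (
            q["name"],
            q.get("messages_ready", 0),            # waiting in the queue
            q.get("messages_unacknowledged", 0),   # delivered, not yet acked
            q.get("consumers", 0),                 # consumers attached to the queue
        )
        for q in queues
    ]
    rows.sort(key=lambda row: row[1] + row[2], reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for name, ready, unacked, consumers in busiest_queues(fetch_queues()):
        print(f"{name}: ready={ready} unacked={unacked} consumers={consumers}")
```

The consumers column is included because unacked messages are ones that have been delivered to a consumer and not yet acknowledged, so it is useful to see whether q-plugin still has consumers attached.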