rabbitmqctl list_queues crashes RabbitMQ


Henning Torsteinsen

Nov 8, 2021, 5:04:46 AM
to rabbitmq-users
Listing the queues from the command line, or looking at the queues in the web admin interface crashes RabbitMQ. I haven't found any similar problems after Googling for a few hours, so here goes...

The problem was first noticed this Monday; it was not there on Friday. The server admins have since patched and upgraded both the VM host and Windows. I do not know whether this has anything to do with the problem, but I thought it was worth mentioning. I am responsible for RabbitMQ and the server admins for running Windows Server, so I'm on my own.

I have a 3-node cluster, and I have crashed 2 of the nodes. I haven't tried it on the last node.

The nodes are at 3.9.6, Erlang is v24.0.
I upgraded one of the nodes to 3.9.8 in case this could help, but no joy.
The nodes are running on Windows Server 2019 (1809). There's lots of RAM and disk space available.

I can crash RabbitMQ by opening the Queues tab and then pressing "Total" to sort the queues. The UI then shows an error message:
"Error: could not connect to server since 2021-11-08 10:50:54."

The log file then contains two notices:
2021-11-08 10:51:00.267000+01:00 [notice] <0.229.0> Logging: configured log handlers are now ACTIVE
2021-11-08 10:51:03.595000+01:00 [notice] <0.44.0> Application mnesia exited with reason: stopped

The rest of the log is related to RabbitMQ starting up. I haven't compared the startup log of today to Friday (yet).

There are a few warnings; I'm not sure whether they're relevant...
2021-11-08 10:51:04.285000+01:00 [warning] <0.229.0> Feature flags: the previous instance of this node must have failed to write the `feature_flags` file at `c:/Users/XXX/AppData/Roaming/RabbitMQ/db/rabbit@NORENAPP001V-feature_flags`:
2021-11-08 10:51:04.285000+01:00 [warning] <0.229.0> Feature flags:   - list of previously disabled feature flags now marked as such: [empty_basic_get_metric]

There is no Erlang crash dump file to be found, at least not on C:

I really hope someone can help me with this!

Cheers
Henning

Walt Pang

Nov 8, 2021, 6:00:59 AM
to rabbitm...@googlegroups.com
It seems the underlying Mnesia server has stopped working.
Are you sure the disk has enough free space/inodes? And is the file system not read-only?
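
A minimal sketch of these two checks (free space and write access), assuming Python is available on the node. The function name `check_data_dir` and the 1 GiB threshold are illustrative assumptions, not RabbitMQ settings. The result reflects the account that runs the script, so it is most meaningful when run under the same account as the RabbitMQ service.

```python
import shutil
import tempfile


def check_data_dir(path, min_free_bytes=1 * 1024**3):
    """Report free space under `path` and whether a file can be created there.

    `min_free_bytes` is an arbitrary illustrative threshold (1 GiB by default).
    Raises OSError if `path` does not exist.
    """
    usage = shutil.disk_usage(path)
    try:
        # Creating (and automatically deleting) a temporary file is a direct
        # test of write access: it fails on a read-only file system as well
        # as on a directory whose permissions deny writes.
        with tempfile.TemporaryFile(dir=path):
            writable = True
    except OSError:
        writable = False
    return {
        "free_bytes": usage.free,
        "enough_space": usage.free >= min_free_bytes,
        "writable": writable,
    }
```

Calling `check_data_dir(...)` on the node's data directory (the feature-flags warning in the first message points at `c:/Users/XXX/AppData/Roaming/RabbitMQ/db`) returns a small dictionary; a `writable` value of `False` would support the read-only theory.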

regards.

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/rabbitmq-users/9c5d4a94-4f12-48d3-9503-041837e668acn%40googlegroups.com.

Henning Torsteinsen

Nov 8, 2021, 6:19:17 AM
to rabbitmq-users
Grafana reports more than 28GB of disc space, and 3GB of RAM.
I'll have to check the read-only part; is there any specific place I should look?

-- 
Henning

Walt Pang

Nov 8, 2021, 6:31:45 AM
to rabbitm...@googlegroups.com
And please shut down any antivirus software.


On Mon, Nov 8, 2021 at 7:19 PM, Henning Torsteinsen <henning.negaa...@gmail.com> wrote:

Henning Torsteinsen

Nov 8, 2021, 7:46:34 AM
to rabbitmq-users
Disabling Symantec Endpoint protection didn't help :-(

Luke Bakken

Nov 8, 2021, 8:37:24 AM
to rabbitmq-users
Hi Henning,

We need to get some more detailed information from you. I have questions in-line:

> I have 3 node cluster, and I have crashed 2 of the nodes. I haven't tried it on the last node.

When you say "crashed", is the node not processing messages any more? Is the erl.exe process still running?

Do you know what the previous Windows version was prior to the upgrade? Were the servers rebooted as part of the upgrade?
 
Can you describe exactly how you installed RabbitMQ on these servers? Environment variables, configuration, etc?

> The nodes are at 3.9.6, Erlang is v24.0.
> I upgraded one of the nodes to 3.9.8 in case this could help, but no joy.
> The nodes are running at Windows Server 2019 (1809). There's lots of RAM and disk space available.
>
> I can crash RabbitMQ by opening the Queues tab, and then press "Total" to sort the queues. The UI then shows an error message:
> "Error: could not connect to server since 2021-11-08 10:50:54."
>
> The log file then contains two notices:
> 2021-11-08 10:51:00.267000+01:00 [notice] <0.229.0> Logging: configured log handlers are now ACTIVE
> 2021-11-08 10:51:03.595000+01:00 [notice] <0.44.0> Application mnesia exited with reason: stopped

If you can provide the complete configuration and log files from all three nodes that would be great. Please feel free to use the "reply privately" feature if you're worried about the contents of the files.

Thanks,
Luke

Henning Torsteinsen

Nov 10, 2021, 4:52:42 AM
to rabbitmq-users

> When you say "crashed", is the node not processing messages any more? Is the erl.exe process still running?

erl.exe is still running, but I'm not entirely sure whether the node stops processing messages.
I've noticed that the connected clients are disconnected, so I assume the worst.


> Do you know what the previous Windows version was prior to the upgrade? Were the servers rebooted as part of the upgrade?

I do not know what version they had before the upgrade.
The underlying VM host was also patched over the weekend.
The servers were rebooted, and I tried to reboot one node as well.


>  Can you describe exactly how you installed RabbitMQ on these servers? Environment variables, configuration, etc?

Well, I used Chocolatey to do the installation; I've made a script that does that for me. See attached file.
Initially I installed RabbitMQ 3.8.5 on Erlang 22.3, but I have since upgraded to the latest versions.


> If you can provide the complete configuration and log files from all three nodes that would be great. Please feel free to use the "reply privately" feature if you're worried about the contents of the files.

The install script is included, and I'll send the server logs in a private reply. I had to change the extension to .txt because .ps1 files are not accepted.

-- 
Henning
InstallRabbitMQ.txt

Wes Peng

Nov 10, 2021, 5:42:01 AM
to rabbitm...@googlegroups.com
Generally the OS log and the RabbitMQ logs can tell you the details of the crash.

regards.

Henning Torsteinsen

Nov 10, 2021, 5:55:19 AM
to rabbitmq-users
On Wednesday, November 10, 2021 at 11:42:01 AM UTC+1 wes....@yahoo.com wrote:
> Generally the OS log and Rabbitmq logs could tell you the crashing details.

Probably true, but I'm not competent enough to understand what is wrong, and how I can fix it ;-)
-- 
Henning 

Henning Torsteinsen

Nov 10, 2021, 5:59:41 AM
to rabbitmq-users

It actually looks like erl.exe crashed as well.
I've added the crash report in case someone finds it interesting.

-- 
Henning

Report.wer

Wes Peng

Nov 10, 2021, 7:18:30 AM
to rabbitm...@googlegroups.com
I would suggest these additional checks:
1. File system issues (such as a read-only FS).
2. Interference from antivirus software.

Regards



Henning Torsteinsen

Nov 15, 2021, 1:52:35 AM
to rabbitmq-users
Thanks Wes, 
I tried disabling Symantec Endpoint Protection, but it still crashes.

Erl / RabbitMQ is running as SYSTEM, so I'm assuming read only FS shouldn't be a problem.
Is there any particular file / directory I should check for read-only-ness?

-- 
Henning

Luke Bakken

Nov 15, 2021, 9:02:42 AM
to rabbitmq-users
Hi Henning,

I see evidence of network issues between nodes in your clusters:

2021-11-08 07:52:00.201000+01:00 [info] <0.453.0> node rabbit@NORENAPP001V down: connection_closed 
2021-11-08 07:52:32.392000+01:00 [info] <0.453.0> node rabbit@NORENAPP001V up 
2021-11-08 07:52:40.798000+01:00 [info] <0.453.0> rabbit on node rabbit@NORENAPP001V up 
2021-11-08 07:53:11.206000+01:00 [info] <0.453.0> rabbit on node rabbit@NORENAPP001V down 
2021-11-08 07:53:11.237000+01:00 [info] <0.453.0> Node rabbit@NORENAPP001V is down, deleting its listeners 
2021-11-08 07:53:11.284000+01:00 [info] <0.453.0> node rabbit@NORENAPP001V down: connection_closed 
2021-11-08 07:53:11.534000+01:00 [info] <0.453.0> node rabbit@NORENAPP001V up
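
The up/down flapping in the excerpt above can be tallied with a short script. This is only a sketch: the function name `count_node_events` is hypothetical, and the regular expression is written against the log lines quoted here, not against a documented RabbitMQ log format. It keeps "node ... up/down" (connection between nodes) separate from "rabbit on node ... up/down" (the RabbitMQ application on that node).

```python
import re
from collections import Counter

# Assumption: event lines look like the ones quoted in this message, i.e.
# "node <name> up|down" or "rabbit on node <name> up|down".
# The longer alternative is listed first so it wins when both could match.
NODE_EVENT = re.compile(
    r"(?P<scope>rabbit on node|node) (?P<node>\S+) (?P<state>up|down)\b"
)


def count_node_events(lines):
    """Count up/down events per (scope, node, state) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = NODE_EVENT.search(line)
        if match:
            key = (match.group("scope"), match.group("node"), match.group("state"))
            counts[key] += 1
    return counts
```

Passing an open log file to `count_node_events` returns a `Counter`; a node that shows several "down" events within a few minutes, as in the excerpt, is a candidate for a network or host-level problem rather than a RabbitMQ one.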

I also see where node NORENAPP006V crashed (confirmed by your system event log). It looks like it may have been a memory access fault in the Erlang VM itself.

Prior to this I saw many events in all logs where nodes were asked to stop and then started again. Presumably this is someone managing the cluster? There were quite a few sequences of RabbitMQ shutting down cleanly then starting right back up again.

The Erlang VM crashing is extremely rare. Ensure your nodes have sufficient RAM, and please be sure to use the latest version of Erlang 24, which is 24.1.5 - http://erlang.org/download/otp_versions_tree.html

Upgrade Erlang and let us know how your cluster is operating.

Thanks -
Luke

vineesh m

Nov 15, 2021, 12:25:15 PM
to rabbitmq-users
I'm facing a similar issue in our dev RabbitMQ cluster. The servers are hosted on Linux; the admins applied patches and the nodes came back up fine. But the next day we noticed that one of the nodes was down, and it affected the whole cluster: the other nodes, which were still running, stopped processing messages. I tried restarting all the nodes and none of them came back up (the RabbitMQ app would not start). I searched on Google and found that the Mnesia file was corrupted, so we deleted that file and started the RabbitMQ app again, and it came online. But we lost all the data, so we had to recreate everything.

Henning Torsteinsen

Nov 16, 2021, 2:13:08 AM
to rabbitm...@googlegroups.com
Great news!
Upgrading to Erlang 24.1.5 and RabbitMQ 3.9.9 made the problem (or symptom) go away.

Thanks for your help, and patience!

Cheers
Henning


