Generic server rabbit_epmd_monitor terminating

ufuk varol

Jul 30, 2018, 3:20:52 AM
to rabbitmq-users

We have been seeing a crash report for a while after starting the RabbitMQ service. Below you can find the crash report text and the status output.

Crash Report
2018-07-28 07:47:54.182 [error] <0.322.0> ** Generic server rabbit_epmd_monitor terminating
** Last message in was check
** When Server state == {state,{erlang,#Ref<0.695054671.1317011457.252848>},erl_epmd,"rabbit","testserver1",25672}
** Reason for termination ==
** {enoent,[{erlang,open_port,[{spawn_executable,false},[{args,["-sname","epmd-starter-133468332","-noshell","-eval","halt()."]},exit_status,stderr_to_stdout,use_stdio]],[{file,"erlang.erl"},{line,2213}]},{rabbit_nodes_common,ensure_epmd,0,[{file,"src/rabbit_nodes_common.erl"},{line,76}]},{rabbit_epmd_monitor,check_epmd,1,[{file,"src/rabbit_epmd_monitor.erl"},{line,108}]},{rabbit_epmd_monitor,handle_info,2,[{file,"src/rabbit_epmd_monitor.erl"},{line,79}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,637}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,711}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2018-07-28 07:47:54.183 [error] <0.322.0> CRASH REPORT Process rabbit_epmd_monitor with 0 neighbours crashed with reason: {enoent,[{erlang,open_port,[{spawn_executable,false},[{args,["-sname","epmd-starter-133468332","-noshell","-eval","halt()."]},exit_status,stderr_to_stdout,use_stdio]],[{file,"erlang.erl"},{line,2213}]},{rabbit_nodes_common,ensure_epmd,0,[{file,"src/rabbit_nodes_common.erl"},{line,76}]},{rabbit_epmd_monitor,check_epmd,1,[{file,"src/rabbit_epmd_monitor.erl"},{line,108}]},{rabbit_epmd_monitor,handle_info,2,[{file,"src/rabbit_epmd_monitor.erl"},{line,79}]},{gen_server,try_dispatch,4,[{file,"gen..."},...]},...]}
2018-07-28 07:47:54.184 [error] <0.321.0> Supervisor rabbit_epmd_monitor_sup had child rabbit_epmd_monitor started with rabbit_epmd_monitor:start_link() at <0.322.0> exit with reason {enoent,[{erlang,open_port,[{spawn_executable,false},[{args,["-sname","epmd-starter-133468332","-noshell","-eval","halt()."]},exit_status,stderr_to_stdout,use_stdio]],[{file,"erlang.erl"},{line,2213}]},{rabbit_nodes_common,ensure_epmd,0,[{file,"src/rabbit_nodes_common.erl"},{line,76}]},{rabbit_epmd_monitor,check_epmd,1,[{file,"src/rabbit_epmd_monitor.erl"},{line,108}]},{rabbit_epmd_monitor,handle_info,2,[{file,"src/rabbit_epmd_monitor.erl"},{line,79}]},{gen_server,try_dispatch,4,[{file,"gen..."},...]},...]} in context child_terminated
2018-07-28 07:48:54.177 [error] <0.760.0> ** Generic server rabbit_epmd_monitor terminating

rabbitmqctl status output

Status of node rabbit@testserver1 ...
[{pid,5780},
{running_applications,
[{rabbitmq_management,"RabbitMQ Management Console","3.7.7"},
{rabbitmq_management_agent,"RabbitMQ Management Agent","3.7.7"},
{rabbitmq_web_stomp,"Rabbit WEB-STOMP - WebSockets to Stomp adapter",
"3.7.7"},
{rabbitmq_stomp,"RabbitMQ STOMP plugin","3.7.7"},
{rabbitmq_mqtt,"RabbitMQ MQTT Adapter","3.7.7"},
{rabbitmq_delayed_message_exchange,"RabbitMQ Delayed Message Exchange",
"20171201-3.7.x"},
{rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.7.7"},
{rabbit,"RabbitMQ","3.7.7"},
{amqp_client,"RabbitMQ AMQP Client","3.7.7"},
{rabbit_common,
"Modules shared by rabbitmq-server and rabbitmq-erlang-client",
"3.7.7"},
{ranch_proxy_protocol,"Ranch Proxy Protocol Transport","1.5.0"},
{cowboy,"Small, fast, modern HTTP server.","2.2.2"},
{ranch,"Socket acceptor pool for TCP protocols.","1.5.0"},
{ssl,"Erlang/OTP SSL application","9.0"},
{mnesia,"MNESIA CXC 138 12","4.15.4"},
{public_key,"Public key infrastructure","1.6"},
{asn1,"The Erlang ASN1 compiler version 5.0.6","5.0.6"},
{os_mon,"CPO CXC 138 46","2.4.5"},
{recon,"Diagnostic tools for production use","2.3.2"},
{cowlib,"Support library for manipulating Web protocols.","2.1.0"},
{crypto,"CRYPTO","4.3"},
{jsx,"a streaming, evented json parsing toolkit","2.8.2"},
{inets,"INETS CXC 138 49","7.0"},
{xmerl,"XML parser","1.3.17"},
{lager,"Erlang logging framework","3.6.3"},
{goldrush,"Erlang event stream processor","0.1.9"},
{compiler,"ERTS CXC 138 10","7.2"},
{syntax_tools,"Syntax tools","2.1.5"},
{syslog,"An RFC 3164 and RFC 5424 compliant logging framework.","3.4.2"},
{sasl,"SASL CXC 138 11","3.2"},
{stdlib,"ERTS CXC 138 10","3.5"},
{kernel,"ERTS CXC 138 10","6.0"}]},
{os,{win32,nt}},
{erlang_version,
"Erlang/OTP 21 [erts-10.0] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:64]\n"},
{memory,
[{connection_readers,34720},
{connection_writers,1356},
{connection_channels,7212},
{connection_other,104364},
{queue_procs,49864},
{queue_slave_procs,0},
{plugins,1808180},
{other_proc,31758652},
{metrics,209740},
{mgmt_db,281488},
{mnesia,105032},
{other_ets,2980928},
{binary,796720},
{msg_index,29504},
{code,28069482},
{atom,1172689},
{other_system,10116293},
{allocated_unused,11864880},
{reserved_unallocated,0},
{strategy,rss},
{total,[{erlang,77526224},{rss,89391104},{allocated,89391104}]}]},
{alarms,[]},
{listeners,
[{clustering,25672,"::"},
{amqp,5672,"::"},
{amqp,5672,"0.0.0.0"},
{'amqp/ssl',5671,"::"},
{'amqp/ssl',5671,"0.0.0.0"},
{mqtt,1884,"::"},
{mqtt,1884,"0.0.0.0"},
{'mqtt/ssl',8883,"::"},
{'mqtt/ssl',8883,"0.0.0.0"},
{stomp,61613,"::"},
{stomp,61613,"0.0.0.0"},
{'http/web-stomp',1983,"::"},
{'http/web-stomp',1983,"0.0.0.0"},
{'https/web-stomp',15671,"::"},
{'https/web-stomp',15671,"0.0.0.0"},
{http,15672,"::"},
{http,15672,"0.0.0.0"}]},
{vm_memory_calculation_strategy,rss},
{vm_memory_high_watermark,0.4},
{vm_memory_limit,3435785420},
{disk_free_limit,50000000},
{disk_free,112424361984},
{file_descriptors,
[{total_limit,8092},
{total_used,3},
{sockets_limit,7280},
{sockets_used,1}]},
{processes,[{limit,1048576},{used,558}]},
{run_queue,1},
{uptime,331},
{kernel,{net_ticktime,60}}]

Michael Klishin

Jul 30, 2018, 4:30:28 PM
to rabbitm...@googlegroups.com
The error says that the node tried to start a subprocess that ensures that epmd is running.
It couldn't because the OS returned an "ENOENT" ("no entry"), which is a curious error code
for starting a subprocess.

epmd monitor is not a critically important component to RabbitMQ operation, at least not on Windows.

Security frameworks, tools and system limits are the most likely suspects here.


--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Luke Bakken

Jul 30, 2018, 11:00:27 PM
to rabbitmq-users
Hello,

This is the line that is failing:


The gist of it is that the erl.exe program is not in the PATH, so os:find_executable is failing. You can see the result from os:find_executable (which is "false") in the stack trace you provided, passed as the first argument to erlang:open_port -

{spawn_executable,false}
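
To make that failure mode concrete, here is a minimal Python sketch of the same pattern. It is not the RabbitMQ or OTP code: find_executable below is only a loose analogue of os:find_executable (returning None where Erlang returns false), and spawn_target is a hypothetical helper that checks the lookup result before anything is spawned.

```python
import os
import tempfile


def find_executable(name, path_value):
    """Return the full path of `name` found in a PATH-style string, or None.

    Loose analogue of Erlang's os:find_executable, which returns the atom
    `false` when nothing is found. This is a model, not the OTP code.
    """
    if not path_value:
        return None
    for directory in path_value.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


def spawn_target(name, path_value):
    """Hypothetical helper: refuse to continue when the lookup failed."""
    exe = find_executable(name, path_value)
    if exe is None:
        # Checking here gives a clearer error than handing the failed
        # lookup result straight to the spawn call.
        raise FileNotFoundError(f"{name} not found on PATH")
    return exe


# Create a throwaway directory containing a fake erl.exe to search for.
with tempfile.TemporaryDirectory() as bin_dir:
    open(os.path.join(bin_dir, "erl.exe"), "w").close()
    found = find_executable("erl.exe", bin_dir)   # full path inside bin_dir
    missing = find_executable("erl.exe", "")      # None: nothing to search
```

In the stack trace the failed lookup result was passed on unchecked, which is why the error only surfaces later as enoent from open_port.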

So, my question is - how did you install Erlang and RabbitMQ? Did you install both using an administrative user? The

I'll see if I can reproduce this tomorrow.
Thanks,
Luke

Hugh Knaus

Jul 31, 2020, 11:50:19 AM
to rabbitmq-users
Luke, I know your post is 2 years old, but were you ever able to reproduce this? And did you find a resolution? My team has just run into this very issue, but it appears to have been ongoing for some time, since we upgraded to RabbitMQ version 3.7.8 (it was actually a re-install to move the previous install from a developer's profile to a system profile [i.e. rabbitmq user], which also meant that the newer version of Erlang was installed using the same system user account).

I've also found https://groups.google.com/forum/#!topic/rabbitmq-users/uSL4W4s8j7Q. I've gone through the firewall to ensure the recommended ports are open. It's a Windows HA install and ERL_MAX_PORTS is at the default value (I think 1024), which we don't come close to reaching. As for the third part, relating to any additional security restrictions, I'm not entirely sure.

Luke Bakken

Jul 31, 2020, 6:01:39 PM
to rabbitmq-users
Hi Hugh,

In general it's best to not reply to old threads as there's a chance it will be missed.

I never received the requested information from that user so I was unable to assist. I have never been able to reproduce this issue.

Like I stated, it's most likely due to erl.exe not being in the PATH when RabbitMQ is running as a Windows service.

If there are exact steps to reproduce this I'm sure it would be easy to explain and / or fix. In your case, how exactly was the upgrade performed?

Thanks,
Luke

Hugh Knaus

Aug 3, 2020, 10:33:27 AM
to rabbitmq-users
Thank you for the reply. Since this is an inherited issue, I can't say precisely how the install went, but I believe the upgrade was done by exporting the schema (I think we were on 3.6.1 or 3.6.2), uninstalling RabbitMQ (which was installed under a developer's profile), then creating a new user "rabbitmq" with admin rights, logging in as rabbitmq, installing Erlang 10.0, then installing RabbitMQ and importing the schema. If the developer didn't do an import/export of the schema, I bet he/she recreated them manually.

Since you mentioned the environment variable path for ERLANG_HOME I double checked the path and found something suspect... The Erlang folder structure has erl.exe in two folders:

C:\Program Files\erl10.0\bin
C:\Program Files\erl10.0\erts-10.0\bin

The environment variable is set to:
ERLANG_HOME = C:\Program Files\erl10.0

The RabbitMQ service is pointed at:
C:\Program Files\erl10.0\erts-10.0\bin\erlsrv.exe

I have a test machine that I can do an install on to check the Erlang paths and whether that's a good install or not (since I'm unsure what to expect).

Hugh Knaus

Aug 3, 2020, 11:22:12 AM
to rabbitmq-users
Did a fresh install on my spare machine and the paths checked out, appearing the same as our production server.

Hugh Knaus

Aug 3, 2020, 11:36:49 AM
to rabbitmq-users
Also, PATH contains C:\Program Files\erl10.0\bin but not C:\Program Files\erl10.0\erts-10.0\bin\

Luke Bakken

Aug 3, 2020, 3:56:51 PM
to rabbitmq-users
Hi Hugh,

Thanks for doing the thorough investigation. Yes that is as expected.

Do you consistently see this error?

** Reason for termination ==
** {enoent,[{erlang,open_port,[{spawn_executable,false},[{args,["-sname","epmd-starter-133468332","-noshell","-eval","halt()."]

Hugh Knaus

Aug 3, 2020, 5:47:28 PM
to rabbitmq-users
Yeah, it's writing to the log every minute on both nodes (two different VMs).  Here's the log snippet:

2020-08-03 15:42:44.262 [error] <0.1238.6> ** Generic server rabbit_epmd_monitor terminating
** Last message in was check
** When Server state == {state,{erlang,#Ref<0.2191993907.1373372418.46831>},erl_epmd,"rabbit","NODE1",25672}

** Reason for termination ==
** {enoent,[{erlang,open_port,[{spawn_executable,false},[{args,["-sname","epmd-starter-530722723","-noshell","-eval","halt()."]},exit_status,stderr_to_stdout,use_stdio]],[{file,"erlang.erl"},{line,2213}]},{rabbit_nodes_common,ensure_epmd,0,[{file,"src/rabbit_nodes_common.erl"},{line,76}]},{rabbit_epmd_monitor,check_epmd,1,[{file,"src/rabbit_epmd_monitor.erl"},{line,108}]},{rabbit_epmd_monitor,handle_info,2,[{file,"src/rabbit_epmd_monitor.erl"},{line,79}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,637}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,711}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2020-08-03 15:42:44.262 [error] <0.1238.6> CRASH REPORT Process rabbit_epmd_monitor with 0 neighbours crashed with reason: {enoent,[{erlang,open_port,[{spawn_executable,false},[{args,["-sname","epmd-starter-530722723","-noshell","-eval","halt()."]},exit_status,stderr_to_stdout,use_stdio]],[{file,"erlang.erl"},{line,2213}]},{rabbit_nodes_common,ensure_epmd,0,[{file,"src/rabbit_nodes_common.erl"},{line,76}]},{rabbit_epmd_monitor,check_epmd,1,[{file,"src/rabbit_epmd_monitor.erl"},{line,108}]},{rabbit_epmd_monitor,handle_info,2,[{file,"src/rabbit_epmd_monitor.erl"},{line,79}]},{gen_server,try_dispatch,4,[{file,"gen..."},...]},...]}
2020-08-03 15:42:44.262 [error] <0.368.0> Supervisor rabbit_epmd_monitor_sup had child rabbit_epmd_monitor started with rabbit_epmd_monitor:start_link() at <0.1238.6> exit with reason {enoent,[{erlang,open_port,[{spawn_executable,false},[{args,["-sname","epmd-starter-530722723","-noshell","-eval","halt()."]},exit_status,stderr_to_stdout,use_stdio]],[{file,"erlang.erl"},{line,2213}]},{rabbit_nodes_common,ensure_epmd,0,[{file,"src/rabbit_nodes_common.erl"},{line,76}]},{rabbit_epmd_monitor,check_epmd,1,[{file,"src/rabbit_epmd_monitor.erl"},{line,108}]},{rabbit_epmd_monitor,handle_info,2,[{file,"src/rabbit_epmd_monitor.erl"},{line,79}]},{gen_server,try_dispatch,4,[{file,"gen..."},...]},...]} in context child_terminated

Luke Bakken

Aug 4, 2020, 4:24:39 PM
to rabbitmq-users
Hi Hugh,

Thanks for that information. Could you open an administrative command prompt, change to this directory...

C:\Program Files\RabbitMQ Server\rabbitmq_server-3.7.8

...and run this command:

.\sbin\rabbitmqctl.bat eval "os:getenv(""PATH"")."

I would like to see the output. Thanks!

Hugh Knaus

Aug 4, 2020, 5:48:41 PM
to rabbitmq-users
All it returned was:

false

Luke Bakken

Aug 4, 2020, 8:20:57 PM
to rabbitmq-users
Hi Hugh,

That is the problem. There is no PATH available for the account under which the RabbitMQ service is running.

Could you please check to see what account is running the service? It should be SYSTEM or NT AUTHORITY\SYSTEM

Thanks,
Luke

Hugh Knaus

Aug 5, 2020, 10:32:28 AM
to rabbitmq-users
The service is running as the Local System account... is that not correct?

Hugh Knaus

Aug 5, 2020, 11:00:45 AM
to rabbitmq-users
Hey Luke, I went back and ran this:

.\sbin\rabbitmqctl.bat eval "os:getenv()."

I noticed in its output that we do have a "PATH" variable, however, its casing was not all upper case like you had me run, it was: "Path".  So I reran what you gave me but with the proper casing and then it returns the appropriate variable setting... Does the system environment variable need to be all caps??

This worked:

C:\Program Files\RabbitMQ Server\rabbitmq_server-3.7.8>.\sbin\rabbitmqctl.bat eval "os:getenv(""Path"")."
"C:\\Program Files\\erl10.0\\erts-10.0\\bin;C:\\ProgramData\\Oracle\\Java\\javapath;C:\\Windows\\system32;C:\\Windows;C:\\Windows\\System32\\Wbem;C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\;C:\\Program Files\\erl10.0\\bin;C:\\Program Files\\erl10.0\\erts-10.0\\bin"

Luke Bakken

Aug 5, 2020, 4:09:31 PM
to rabbitmq-users
Hi Hugh,

You know that could be it. Please change the variable to all-caps. Just to be sure the change sticks, stop and restart the RabbitMQ windows service after making the system-wide change. Then verify using rabbitmqctl.

Thanks,
Luke

Hugh Knaus

Aug 5, 2020, 5:15:54 PM
to rabbitmq-users
So I changed the system env variable from "Path" to "PATH" by completely deleting the variable and then re-adding it. I took the RabbitMQ service one step further: I removed it, reinstalled it, started it, and ran start_app. I re-ran:

.\sbin\rabbitmqctl.bat eval "os:getenv(""PATH"")."

Returned: false

Re-ran:

.\sbin\rabbitmqctl.bat eval "os:getenv()."

It still shows mixed case "Path=..." (and that is after closing and reopening the command prompt)

Re-ran:

.\sbin\rabbitmqctl.bat eval "os:getenv(""Path"")."

Returned the expected paths.

Double checked the system env variable and it is indeed in all caps. Still getting the same crash reports.

Ran the set command, file is attached.
settings.txt

Luke Bakken

Aug 5, 2020, 5:58:14 PM
to rabbitmq-users
Hi Hugh,

As you can see in the set command output, the variable is still mixed-case.

You may have to delete the variable and re-create it. Be sure to save the value first, of course. A system restart may be necessary too.

This is the code responsible for what you are seeing - https://github.com/erlang/otp/blob/master/lib/kernel/src/os.erl#L167-L168

It only takes an all-uppercase PATH into account.
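
As a rough illustration of why the casing matters, the Python sketch below models the two lookup behaviours seen in this thread: an exact-match lookup that misses "Path" when asked for "PATH", and a case-insensitive lookup that matches how Windows itself treats variable names. The function names and the sample environment are hypothetical and chosen only for this example. This is not the OTP implementation and not necessarily what the PR does.

```python
def getenv_exact(env, name):
    """Case-sensitive lookup: returns None unless the key matches exactly."""
    return env.get(name)


def getenv_any_case(env, name):
    """Case-insensitive lookup, the way Windows treats variable names."""
    wanted = name.upper()
    for key, value in env.items():
        if key.upper() == wanted:
            return value
    return None


# Hypothetical environment resembling the one reported in this thread:
# the variable exists, but it is spelled "Path" rather than "PATH".
env = {
    "Path": r"C:\Program Files\erl10.0\bin",
    "ERLANG_HOME": r"C:\Program Files\erl10.0",
}

exact = getenv_exact(env, "PATH")
relaxed = getenv_any_case(env, "PATH")

print(exact)    # None
print(relaxed)  # C:\Program Files\erl10.0\bin
```

The exact-match result corresponds to the "false" returned by os:getenv("PATH") above, while the relaxed lookup finds the value regardless of how the name is cased.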

Once this PR ships this all will be a non-issue, but it won't be sooner than 3.8.7 - https://github.com/rabbitmq/rabbitmq-common/pull/407

Thanks for continuing to work on this issue!
Luke

Luke Bakken

Aug 5, 2020, 6:02:47 PM
to rabbitmq-users
One more thing Hugh -

I'm sure you're already doing this but be sure to change the system-wide environment variable to all uppercase. Sometimes individual user accounts also define PATH but it's the system-wide one that LocalSystem uses I believe.

You may have to use this trick to start a cmd.exe prompt as the LocalSystem user - https://stackoverflow.com/questions/77528/how-do-you-run-cmd-exe-under-the-local-system-account

You can then use the setx command to ensure the variable is set correctly, and use that prompt to double-check.

Luke

Hugh Knaus

Aug 6, 2020, 9:58:41 AM
to rabbitmq-users
I'll post here if I'm able to get the variable casing resolved on the server, otherwise, like I think you're suggesting, we'll aim for the upgrade. Thank you again Luke!

Hugh Knaus

Aug 6, 2020, 10:48:09 AM
to rabbitmq-users
Luke, I just ran into this as well: https://bugs.erlang.org/browse/ERL-644.

Hugh Knaus

Aug 6, 2020, 11:02:11 AM
to rabbitmq-users
I also checked the System Environment Variables, the User Environment Variables, and how they are stored in the Registry (after a reboot). They all show uppercase "PATH" but os:getenv still returns mixed case. I also ran the command prompt as the local system account like you suggested... still the same.

Hugh Knaus

Aug 6, 2020, 1:16:51 PM
to rabbitmq-users
Here's a kicker for you Luke: we happen to have several servers with a couple of different versions of RabbitMQ installed. I attached a screenshot that shows a Windows Server 2008 R2 machine with the mixed-case environment variable (i.e. "Path"), running RabbitMQ 3.6.2 and Erlang OTP 18, and it can retrieve the variable either way: "os:getenv(""PATH"")." or "os:getenv(""Path"")." or "os:getenv(""path"")."
2020-08-06 11_10_58-Clipboard.png

Luke Bakken

Aug 6, 2020, 1:36:26 PM
to rabbitmq-users
Hi Hugh,

"I also ran the command prompt as the local system account like you suggested... still the same"

Please always run the commands I suggest, capture the output, and attach it to your responses. The above statement doesn't help me help you - I don't know what "still the same" means. Do you mean that, when running as LocalSystem, the PATH env var is spelled PATH or Path? That is why I asked for the output of the set command.

It does appear you are running into ERL-644, and that the bug wasn't fixed correctly. I will follow-up.

Thanks,
Luke

Luke Bakken

Aug 6, 2020, 2:27:53 PM
to rabbitmq-users
Hi Hugh,

I can confirm that the mixed-case env variable issue (ERL-644) is fixed using Erlang 23.0.3 on Windows.


I'm hoping that fixes the issue for you.

Hugh Knaus

Aug 7, 2020, 10:21:35 AM
to rabbitmq-users
Unfortunately, I don't think that Erlang version is compatible with RabbitMQ 3.7.8 and we're not quite in a position to upgrade RabbitMQ yet.