Crash in rabbit_mgmt_external_stats:init (bad argument) (3.6.14)


Måns Andersson

Dec 28, 2017, 1:59:43 AM
to rabbitmq-users
Hi,

I upgraded to 3.6.14 yesterday (from 3.6.2) and am now seeing crash reports in my log file that I don't remember seeing before. Has anyone else seen these, and should I be worried?
I'm running RabbitMQ on Windows with Erlang OTP 20.2 in a cluster of 3 servers.


=CRASH REPORT==== 27-Dec-2017::16:59:38 ===
  crasher:
    initial call: rabbit_mgmt_external_stats:init/1
    pid: <0.492.0>
    registered_name: rabbit_mgmt_external_stats
    exception error: bad argument
      in function  port_command/2
         called as port_command(#Port<0.32201>,[])
      in call from os:cmd/1 (os.erl, line 242)
      in call from rabbit_mgmt_external_stats:get_used_fd/1 (src/rabbit_mgmt_external_stats.erl, line 132)
      in call from rabbit_mgmt_external_stats:get_used_fd/0 (src/rabbit_mgmt_external_stats.erl, line 60)
      in call from rabbit_mgmt_external_stats:'-infos/2-lc$^0/1-0-'/2 (src/rabbit_mgmt_external_stats.erl, line 174)
      in call from rabbit_mgmt_external_stats:emit_update/1 (src/rabbit_mgmt_external_stats.erl, line 369)
      in call from rabbit_mgmt_external_stats:handle_info/2 (src/rabbit_mgmt_external_stats.erl, line 356)
      in call from gen_server:try_dispatch/4 (gen_server.erl, line 616)
    ancestors: [rabbit_mgmt_agent_sup,rabbit_mgmt_agent_sup_sup,<0.480.0>]
    message_queue_len: 1
    messages: [{'DOWN',#Ref<0.3329576275.3491758083.65367>,port,
                          #Port<0.32201>,normal}]
    links: [<0.482.0>]
    dictionary: [{logged_used_fd_error,true}]
    trap_exit: false
    status: running
    heap_size: 2586
    stack_size: 27
    reductions: 7745867
  neighbours:

=SUPERVISOR REPORT==== 27-Dec-2017::16:59:38 ===
     Supervisor: {local,rabbit_mgmt_agent_sup}
     Context:    child_terminated
     Reason:     {badarg,
                     [{erlang,port_command,
                          [#Port<0.32201>,[]],
                          [{file,"erlang.erl"},{line,3042}]},
                      {os,cmd,1,[{file,"os.erl"},{line,242}]},
                      {rabbit_mgmt_external_stats,get_used_fd,1,
                          [{file,"src/rabbit_mgmt_external_stats.erl"},
                           {line,132}]},
                      {rabbit_mgmt_external_stats,get_used_fd,0,
                          [{file,"src/rabbit_mgmt_external_stats.erl"},
                           {line,60}]},
                      {rabbit_mgmt_external_stats,'-infos/2-lc$^0/1-0-',2,
                          [{file,"src/rabbit_mgmt_external_stats.erl"},
                           {line,174}]},
                      {rabbit_mgmt_external_stats,emit_update,1,
                          [{file,"src/rabbit_mgmt_external_stats.erl"},
                           {line,369}]},
                      {rabbit_mgmt_external_stats,handle_info,2,
                          [{file,"src/rabbit_mgmt_external_stats.erl"},
                           {line,356}]},
                      {gen_server,try_dispatch,4,
                          [{file,"gen_server.erl"},{line,616}]}]}
     Offender:   [{pid,<0.492.0>},
                  {id,rabbit_mgmt_external_stats},
                  {mfargs,{rabbit_mgmt_external_stats,start_link,[]}},
                  {restart_type,permanent},
                  {shutdown,5000},
                  {child_type,worker}]

Michael Klishin

Dec 28, 2017, 9:13:32 AM
to rabbitm...@googlegroups.com
For starters, 20.2 is not a version we officially support [1].

However, the error is not necessarily related. All it says is that a RabbitMQ node tried to run
an external command to gather some node stats (e.g. how much disk space is free) and that failed
with a "bad argument". There can be all kinds of reasons for this, from OS/runtime limits
such as the number of file descriptors [2] to anti-virus software.

If this was a one-off event, you likely don't have anything to worry about. The worst outcome is that some node-wide stats won't be available.

1. http://www.rabbitmq.com/which-erlang.html
2. http://www.rabbitmq.com/networking.html
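
To illustrate the failure mode described above, here is a minimal Python sketch of a stats collector that shells out to an external command and reports "unknown" instead of crashing when the command cannot be run. This is not RabbitMQ's code (RabbitMQ is written in Erlang and the trace shows it using os:cmd/1); the function name `run_stat_command` and the timeout value are assumptions made for this example only.

```python
import subprocess
import sys


def run_stat_command(argv, timeout=5.0):
    """Run an external command and return its stdout, or None on failure.

    Hypothetical helper: a missing executable, a non-zero exit status,
    or a timeout all result in None, so the caller can report the
    statistic as unavailable rather than crash.
    """
    try:
        result = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # OSError covers a missing or non-executable program;
        # SubprocessError covers non-zero exits and timeouts.
        return None
    return result.stdout


if __name__ == "__main__":
    # Uses the running Python interpreter as a stand-in external command.
    print(run_stat_command([sys.executable, "--version"]))
    # A command that does not exist yields None instead of an exception.
    print(run_stat_command(["definitely-not-a-real-command-xyz"]))
```

The design choice here mirrors the point made above: losing one statistic is a tolerable outcome, so the helper degrades to "no value" rather than propagating the error.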


--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-users+unsubscribe@googlegroups.com.
To post to this group, send email to rabbitmq-users@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Måns Andersson

Dec 28, 2017, 9:36:10 AM
to rabbitmq-users
Darn it, must have missed that page when I selected Erlang version. I figured "Erlang OTP 20" support in release notes meant any 20.x version. Anyway, it has been running okay since yesterday so let's keep our fingers crossed :-)

I also had another error about a missing "handle.exe" file in PATH, which I've now fixed, and the crash seems to have disappeared since. I didn't think the two messages were related. Thanks for the help!

// Måns

Michael Klishin

Dec 29, 2017, 11:48:30 AM
to rabbitm...@googlegroups.com
I cannot argue with "20.2 works for us", so if it works fine, no need to change it. We'd appreciate an
experience report from you in a few weeks ;)

Yes, a missing handle.exe could lead to the behavior you've reported. Glad you figured it out.
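
For anyone hitting the same thing, a quick way to confirm whether a tool such as handle.exe is reachable through PATH is sketched below in Python. The helper name `find_tool` is an assumption for this example; `shutil.which` is the standard-library lookup it wraps. Note that it checks the PATH of the process running the script, which may differ from the PATH seen by the RabbitMQ Windows service.

```python
import shutil


def find_tool(name, path=None):
    """Return the full path of an executable found on PATH, or None.

    Hypothetical helper: `path` may be given to search a specific
    directory list instead of the current process's PATH.
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    location = find_tool("handle.exe")
    if location is None:
        print("handle.exe was not found on PATH")
    else:
        print("handle.exe found at", location)
```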


Måns Andersson

Jan 2, 2018, 9:29:04 AM
to rabbitmq-users
I think I'll keep posting the errors I get in this thread and thereby turn it into an "OTP 20.2" status thread. Hope that's okay :-)

I got another error on Sunday that wasn't critical (i.e. the system kept running), but it's in my log so I'm curious about it. Is this something you're aware of?

Thanks,
// Måns

=CRASH REPORT==== 31-Dec-2017::12:01:53 ===
  crasher:
    initial call: rabbit_disk_monitor:init/1
    pid: <0.305.0>
    registered_name: rabbit_disk_monitor
    exception error: bad argument
      in function  port_command/2
         called as port_command(#Port<0.388353>,[])
      in call from os:cmd/1 (os.erl, line 242)
      in call from rabbit_disk_monitor:get_disk_free/2 (src/rabbit_disk_monitor.erl, line 222)
      in call from rabbit_disk_monitor:internal_update/1 (src/rabbit_disk_monitor.erl, line 197)
      in call from rabbit_disk_monitor:handle_info/2 (src/rabbit_disk_monitor.erl, line 169)
      in call from gen_server:try_dispatch/4 (gen_server.erl, line 616)
      in call from gen_server:handle_msg/6 (gen_server.erl, line 686)
    ancestors: [rabbit_disk_monitor_sup,rabbit_sup,<0.295.0>]
    message_queue_len: 29
    messages: [{#Port<0.388353>,
                   {data,<<" Volume in drive C has no label.\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<" Volume Serial Number is 90AB-2E58\r\n">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<" Directory of c:\\Uplink\\CONFIG~1\\RabbitMQ\\db\\RABBIT~1\r\n\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"[.]                             [..]">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"cluster_nodes.config            DECISION_TAB.LOG">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"LATEST.LOG                      [msg_store_persistent]">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"[msg_store_transient]           nodes_running_at_shutdown">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"[queues]                        rabbit_durable_exchange.DCD">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"rabbit_durable_queue.DCD        rabbit_durable_queue.DCL">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"rabbit_durable_route.DCD        rabbit_runtime_parameters.DCD">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"rabbit_serial                   rabbit_user.DCD">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"rabbit_user_permission.DCD      rabbit_vhost.DCD">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"recovery.dets                   schema.DAT">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"schema_version                  ">>}},
                  {#Port<0.388353>,{data,<<"\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"              16 File(s)          55634 bytes\r\n">>}},
                  {#Port<0.388353>,
                   {data,<<"               5 Dir(s)    214213492736 bytes free\r\n">>}},
                  {'DOWN',#Ref<0.3329576275.3565420546.7019>,port,
                          #Port<0.388353>,normal}]
    links: [<0.304.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 6772
    stack_size: 27
    reductions: 103999242
  neighbours:
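
The messages queued in the mailbox above look like the output of the Windows `dir` command, ending in a "bytes free" line. As a rough illustration of how a free-space figure could be pulled out of such output, here is a small Python sketch. It is not the rabbit_disk_monitor implementation; the function name `parse_dir_free_bytes` is hypothetical, and the pattern assumes English-language output without thousands separators, as in the report.

```python
import re


def parse_dir_free_bytes(output):
    """Return the free-byte count from `dir`-style output, or None.

    Hypothetical helper: takes the last "<digits> bytes free" match,
    since the summary line appears at the end of the listing.
    """
    matches = re.findall(r"(\d+)\s+bytes free", output)
    return int(matches[-1]) if matches else None


# The two summary lines copied from the crash report above.
sample = (
    "              16 File(s)          55634 bytes\r\n"
    "               5 Dir(s)    214213492736 bytes free\r\n"
)

if __name__ == "__main__":
    print(parse_dir_free_bytes(sample))
```

Returning None for unparseable output lets a caller treat the value as unavailable, which matches the non-critical behavior described in this thread.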