rabbitmqctl fails to connect after upgrade to RMQ 3.8.5

Vitaliy K

Jul 7, 2020, 6:15:29 PM
to rabbitmq-users
Hello guys,
After an in-place upgrade of RMQ from 3.7.14/Erlang 21.3.3 to RMQ 3.8.5/Erlang 23.0.2, rabbitmqctl fails to connect to the local node.
Host OS: Ubuntu 18.04
Everything worked well before the upgrade.

rab...@rmq-dev-01.host.local:
  * connected to epmd (port 4369) on rmq-dev-01.host.local
  * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
  * TCP connection succeeded but Erlang distribution failed

  * Remote host closed TCP connection before completing authentication. Is the Erlang distribution using TLS?


Current node details:
 * node name: 'rabbitmqcli...@rmq-dev-01.host.local'
 * effective user's home directory: /var/lib/rabbitmq
 * Erlang cookie hash: ***

The cookie hash matches the hash I see in the server log.
I get the same error if I supply the Erlang cookie on the command line.

I use TLS. ERL_SSL_PATH was updated to reflect the new version; the only other change I made to the configuration was to turn off HiPE.

cat /etc/rabbitmq/rabbitmq-env.conf

ERL_SSL_PATH=/usr/lib/erlang/lib/ssl-10.0/ebin

SERVER_ADDITIONAL_ERL_ARGS="-pa $ERL_SSL_PATH \
  -proto_dist inet_tls \
  -ssl_dist_optfile /etc/rabbitmq/ssl_dist.config"

CTL_ERL_ARGS="-pa $ERL_SSL_PATH \
  -proto_dist inet_tls \
  -ssl_dist_optfile /etc/rabbitmq/ssl_dist.config"

USE_LONGNAME=true
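
Because the ssl application's version (and so the ssl-10.0 directory name) changes between Erlang releases, a hard-coded ERL_SSL_PATH is easy to leave stale after an upgrade. The following is a minimal sketch of one way to look the directory up instead. The helper name find_ssl_ebin is hypothetical, and it assumes the ssl-<version>/ebin layout under /usr/lib/erlang/lib shown in the config above.

```shell
# Hypothetical helper: print the first ssl-<version>/ebin directory found
# under an Erlang lib directory, so the version need not be hard-coded.
find_ssl_ebin() {
  lib_dir=${1:-/usr/lib/erlang/lib}
  for candidate in "$lib_dir"/ssl-*/ebin; do
    # If the glob matched nothing, the literal pattern fails this check.
    if [ -d "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "no ssl-*/ebin directory under $lib_dir" >&2
  return 1
}

# In rabbitmq-env.conf this could replace the hard-coded assignment:
# ERL_SSL_PATH=$(find_ssl_ebin /usr/lib/erlang/lib)
```

If several ssl-* directories are installed side by side, this picks the first one in glob order, which may not be the one the running Erlang uses, so the result is worth checking by hand.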

The server log does not contain any error messages, and I can connect to the instance using the management plugin over HTTPS.

Any ideas what else to look for?

Thanks in advance

Luke Bakken

Jul 7, 2020, 6:22:48 PM
to rabbitmq-users
Hello,

There is a bug that has already been addressed. For now, rename CTL_ERL_ARGS to RABBITMQ_CTL_ERL_ARGS as a workaround.

Thanks,
Luke
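
As a sketch of the workaround described above, the rabbitmq-env.conf fragment below sets both the unprefixed and the RABBITMQ_-prefixed variable from one shared value, so the two cannot drift apart. TLS_DIST_ARGS is a hypothetical helper variable introduced only for this illustration; the paths are the ones from the original post.

```shell
# rabbitmq-env.conf fragment (the file is sourced as POSIX shell).
ERL_SSL_PATH=/usr/lib/erlang/lib/ssl-10.0/ebin

# TLS_DIST_ARGS is a hypothetical helper name, not a RabbitMQ setting.
TLS_DIST_ARGS="-pa $ERL_SSL_PATH -proto_dist inet_tls -ssl_dist_optfile /etc/rabbitmq/ssl_dist.config"

# Usual (unprefixed) name for rabbitmq-env.conf.
CTL_ERL_ARGS="$TLS_DIST_ARGS"

# Workaround for 3.8.5: also set the prefixed name.
RABBITMQ_CTL_ERL_ARGS="$TLS_DIST_ARGS"
```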


Vitaliy K

Jul 7, 2020, 6:59:11 PM
to rabbitmq-users
Thanks a lot!

Should I keep both settings in the file (for the future release that fixes it)?

Luke Bakken

Jul 7, 2020, 8:52:07 PM
to rabbitmq-users
Hello,

You can keep both settings. Once version 3.8.6 is released you can use either setting, though it's "typical" to use CTL_ERL_ARGS in rabbitmq-env.conf

Thanks,
Luke

Vitaliy K

Jul 7, 2020, 10:50:58 PM
to rabbitmq-users
Unfortunately, it didn't work.

Now I'm getting a slightly different message in response to rabbitmqctl status:
  
  * connected to epmd (port 4369) on rmq-dev-01.dl.internal.tree.com
  * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
  * TCP connection succeeded but Erlang distribution failed
  * suggestion: check if the Erlang cookie identical for all server nodes and CLI tools
  * suggestion: check if all server nodes and CLI tools use consistent hostnames when addressing each other
  * suggestion: check if inter-node connections may be configured to use TLS. If so, all nodes and CLI tools must do that
   * suggestion: see the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more

Vitaliy K

Jul 7, 2020, 11:05:13 PM
to rabbitmq-users
Full configuration 

$ cat /etc/rabbitmq/rabbitmq.conf
#loopback_users.guest = true
log.default.level = debug
# TLS
listeners.ssl.default = 5671

ssl_options.cacertfile = /etc/rabbitmq/tls/wildcard.host.local.interm.cer
ssl_options.certfile   = /etc/rabbitmq/tls/wildcard.host.local.cert.cer
ssl_options.keyfile    = /etc/rabbitmq/tls/wildcard.host.local.key
#ssl_options.verify     = verify_peer
ssl_options.verify     = verify_none
ssl_options.fail_if_no_peer_cert = false
ssl_options.versions.1 = tlsv1.2
ssl_options.versions.2 = tlsv1.1

# RabbitMQ Management plugin
management.listener.port = 15672
management.listener.ssl = true
# management.tcp.compress = true
management.listener.ssl_opts.cacertfile = /etc/rabbitmq/tls/wildcard.host.local.interm.cer
management.listener.ssl_opts.certfile = /etc/rabbitmq/tls/wildcard.host.local.cert.cer
management.listener.ssl_opts.keyfile = /etc/rabbitmq/tls/wildcard.host.local.key

# cluster
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_classic_config
queue_master_locator=min-masters

# plugins
management.load_definitions = /etc/rabbitmq/definitions.json


$ cat /etc/rabbitmq/advanced.config
[
 {rabbit,
  [{tcp_listeners, []}
  ]}
].



$ cat /etc/rabbitmq/rabbitmq-env.conf
ERL_SSL_PATH=/usr/lib/erlang/lib/ssl-10.0/ebin

SERVER_ADDITIONAL_ERL_ARGS="-pa $ERL_SSL_PATH \
  -proto_dist inet_tls \
  -ssl_dist_optfile /etc/rabbitmq/ssl_dist.config"

CTL_ERL_ARGS="-pa $ERL_SSL_PATH \
  -proto_dist inet_tls \
  -ssl_dist_optfile /etc/rabbitmq/ssl_dist.config"

# remove after 3.8.6 release - RABBITMQ_CTL_ERL_ARGS
RABBITMQ_CTL_ERL_ARGS="-pa $ERL_SSL_PATH \
  -proto_dist inet_tls \
  -ssl_dist_optfile /etc/rabbitmq/ssl_dist.config"

RABBITMQ_USE_LONGNAME=true


$ cat /etc/rabbitmq/ssl_dist.config
[
  {server, [
    {cacertfile, "/etc/rabbitmq/tls/wildcard.host.local.interm.cer"},
    {certfile, "/etc/rabbitmq/tls/wildcard.host.local.cert.cer"},
    {keyfile,  "/etc/rabbitmq/tls/wildcard.host.local.key"},
    {secure_renegotiate, true},
    {verify, verify_none},
    {fail_if_no_peer_cert, true}
  ]},
  {client, [
    {cacertfile, "/etc/rabbitmq/tls/wildcard.host.local.interm.cer"},
    {certfile, "/etc/rabbitmq/tls/wildcard.host.local.cert.cer"},
    {keyfile, "/etc/rabbitmq/tls/wildcard.host.local.key"},
    {secure_renegotiate, true},
    {verify, verify_none},
    {fail_if_no_peer_cert, true}
  ]}
].


Log file is attached
log.txt

Luke Bakken

Jul 8, 2020, 9:17:27 AM
to rabbitmq-users
Hello,

Could you please find the full path to the rabbitmqctl shell script and run it this way:

/bin/sh -x /usr/lib/path/to/rabbitmqctl status

At some point the script calls the erl executable and I would like to see the arguments getting passed to it. Thank you!
Luke
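
The `sh -x` trace is long, so it can help to filter it down to the line of interest. The sketch below assumes the default trace prefix (`+ `) and that the script ends in an `exec erl ...` call, as described above. Both function names are hypothetical.

```shell
# Print only the traced `exec erl ...` line(s) from `sh -x` output on stdin.
extract_erl_exec() {
  grep -E '^\++ exec erl '
}

# Succeed if the given erl command line carries both flags needed for
# TLS-enabled Erlang distribution.
has_tls_dist_flags() {
  case $1 in *'-proto_dist inet_tls'*) ;; *) return 1 ;; esac
  case $1 in *'-ssl_dist_optfile '*) ;; *) return 1 ;; esac
}

# Usage (the trace goes to stderr, hence 2>&1):
# sh -x /usr/lib/path/to/rabbitmqctl status 2>&1 | extract_erl_exec
```

If the filter prints nothing, the script being traced is probably a wrapper that re-executes another script, and the trace needs to be repeated on that one.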

Vitaliy K

Jul 8, 2020, 9:31:35 AM
to rabbitmq-users
Sure,
this is what I see
$ sudo /bin/sh -x /usr/sbin/rabbitmqctl status
+ basename /usr/sbin/rabbitmqctl
+ SCRIPT=rabbitmqctl
+ main status
+ ensure_we_are_in_a_readable_dir
+ cd /var/lib/rabbitmq
+ current_user_is_rabbitmq
+ id -un
+ [ root = rabbitmq ]
+ current_user_is_rabbitmq
+ id -un
+ [ root = rabbitmq ]
+ current_user_is_root
+ id -u
+ [ 0 = 0 ]
+ calling_rabbitmq_plugins
+ [ rabbitmqctl = rabbitmq-plugins ]
+ current_user_is_root
+ id -u
+ [ 0 = 0 ]
+ exec_script_as_root status
+ [ -x /sbin/runuser ]
+ /sbin/runuser --version
+ grep -qF util-linux
+ exec /sbin/runuser -u rabbitmq -- /usr/lib/rabbitmq/bin/rabbitmqctl status

Vitaliy K

Jul 8, 2020, 9:36:27 AM
to rabbitmq-users
and one more

$ sudo /bin/sh -x /usr/lib/rabbitmq/bin/rabbitmqctl status
+ set -e
+ set -a
+ . /usr/lib/rabbitmq/bin/rabbitmq-env
+ [  = 1 ]
+ [ -z  ]
+ set +e
+ SCRIPT_PATH=/usr/lib/rabbitmq/bin/rabbitmqctl
+ [ -h /usr/lib/rabbitmq/bin/rabbitmqctl ]
+ readlink -f /usr/lib/rabbitmq/bin/rabbitmqctl
+ FULL_PATH=/usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/sbin/rabbitmqctl
+ [ 0 != 0 ]
+ SCRIPT_PATH=/usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/sbin/rabbitmqctl
+ [ -h /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/sbin/rabbitmqctl ]
+ set -e
+ dirname /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/sbin/rabbitmqctl
+ RABBITMQ_SCRIPTS_DIR=/usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/sbin
+ rmq_realpath /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/sbin/..
+ local path=/usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/sbin/..
+ [ -d /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/sbin/.. ]
+ cd /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/sbin/..
+ pwd
+ RABBITMQ_HOME=/usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5
+ ESCRIPT_DIR=/usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/escript
+ . /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/sbin/rabbitmq-defaults
+ SYS_PREFIX=
+ CLEAN_BOOT_FILE=start_clean
+ SASL_BOOT_FILE=start_sasl
+ BOOT_MODULE=rabbit
+ test -z
+ test -z
+ CONF_ENV_FILE=/etc/rabbitmq/rabbitmq-env.conf
+ saved_RABBITMQ_PID_FILE=
+ [ x = x ]
+ RABBITMQ_CONF_ENV_FILE=/etc/rabbitmq/rabbitmq-env.conf
+ [ -f /etc/rabbitmq/rabbitmq-env.conf ]
+ CONF_ENV_FILE_PHASE=rabbitmq-env
+ . /etc/rabbitmq/rabbitmq-env.conf
+ ERL_SSL_PATH=/usr/lib/erlang/lib/ssl-10.0/ebin
+ SERVER_ADDITIONAL_ERL_ARGS=-pa /usr/lib/erlang/lib/ssl-10.0/ebin   -proto_dist inet_tls   -ssl_dist_optfile /etc/rabbitmq/ssl_dist.config
+ RABBITMQ_CTL_ERL_ARGS=-pa /usr/lib/erlang/lib/ssl-10.0/ebin   -proto_dist inet_tls   -ssl_dist_optfile /etc/rabbitmq/ssl_dist.config
+ USE_LONGNAME=true
+ [ -n  ]
+ [ -n  ]
+ DEFAULT_SCHEDULER_BIND_TYPE=db
+ [ -n  ]
+ SCHEDULER_BIND_TYPE=db
+ [ -n  ]
+ RABBITMQ_SCHEDULER_BIND_TYPE=db
+ DEFAULT_DISTRIBUTION_BUFFER_SIZE=128000
+ [ -n  ]
+ DISTRIBUTION_BUFFER_SIZE=128000
+ [ -n  ]
+ RABBITMQ_DISTRIBUTION_BUFFER_SIZE=128000
+ DEFAULT_MAX_NUMBER_OF_PROCESSES=1048576
+ [ -n  ]
+ MAX_NUMBER_OF_PROCESSES=1048576
+ [ -n  ]
+ RABBITMQ_MAX_NUMBER_OF_PROCESSES=1048576
+ DEFAULT_MAX_NUMBER_OF_ATOMS=5000000
+ [ -n  ]
+ MAX_NUMBER_OF_ATOMS=5000000
+ [ -n  ]
+ RABBITMQ_MAX_NUMBER_OF_ATOMS=5000000
+ SERVER_ERL_ARGS= +P 1048576 +t 5000000 +stbt db +zdbbl 128000
+ [ x = x ]
+ RABBITMQ_IO_THREAD_POOL_SIZE=
+ [ x = x ]
+ RABBITMQ_SERVER_ERL_ARGS= +P 1048576 +t 5000000 +stbt db +zdbbl 128000
+ [ x = x ]
+ RABBITMQ_SERVER_START_ARGS=
+ [ x = x ]
+ RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS=-pa /usr/lib/erlang/lib/ssl-10.0/ebin   -proto_dist inet_tls   -ssl_dist_optfile /etc/rabbitmq/ssl_dist.config
+ [ x = x ]
+ RABBITMQ_SERVER_CODE_PATH=
+ [ x = x ]
+ RABBITMQ_IGNORE_SIGINT=true
+ [ xtrue = xtrue ]
+ RABBITMQ_IGNORE_SIGINT_FLAG=+B i
+ [ -n  ]
+ [ x = x ]
+ RABBITMQ_BOOT_MODULE=rabbit
+ RABBITMQ_ENV_LOADED=1
+ true
+ run_escript rabbitmqctl_escript /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/escript/rabbitmqctl status
+ escript_main=rabbitmqctl_escript
+ shift
+ escript=/usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/escript/rabbitmqctl
+ shift
+ _rmq_env_set_erl_libs
+ [ -n  ]
+ export ERL_LIBS=/usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/plugins
+ exec erl +B -boot start_clean -noinput -noshell -hidden -smp enable -pa /usr/lib/erlang/lib/ssl-10.0/ebin -proto_dist inet_tls -ssl_dist_optfile /etc/rabbitmq/ssl_dist.config -run escript start -escript main rabbitmqctl_escript -extra /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.5/escript/rabbitmqctl status

Luke Bakken

Jul 8, 2020, 12:28:45 PM
to rabbitmq-users
Thank you, that is valuable information. I am investigating.

Vitaliy K

Jul 8, 2020, 12:33:06 PM
to rabbitmq-users
Thanks a lot.
Please let me know if you need more details.

Luke Bakken

Jul 8, 2020, 6:51:46 PM
to rabbitmq-users
Hello,

I can't reproduce your issue using RabbitMQ 3.8.5 and Erlang 23.0.2.

This repository contains the complete set of certificates and test process I used: https://github.com/lukebakken/tls-dist-not-working-26raub5hLCI

You can use the above to double-check your environment.

The only issue I found was that RABBITMQ_CTL_ERL_ARGS is required in 3.8.5 as I mentioned already.

Thanks,
Luke

Vitaliy K

Jul 9, 2020, 3:46:34 PM
to rabbitmq-users
Hello, Luke!

I changed the config a little, but it still does not work.

I noticed one interesting thing: when I run rabbitmq-diagnostics, I get the following error. Notice longnames: false in the output, even though I have USE_LONGNAME=true in rabbitmq-env.conf (see below).


sudo rabbitmq-diagnostics erlang_cookie_hash
Asking node rab...@rmq-dev-01.host.local its Erlang cookie hash...
Stack trace:

** (FunctionClauseError) no function clause matching in RabbitMQ.CLI.Diagnostics.Commands.ErlangCookieHashCommand.output/2
    (rabbitmqctl) lib/rabbitmq/cli/diagnostics/commands/erlang_cookie_hash_command.ex:28: RabbitMQ.CLI.Diagnostics.Commands.ErlangCookieHashCommand.output({:badrpc, :nodedown}, %{longnames: false, node: :"rab...@rmq-dev-01.host.local", timeout: :infinity})
    (rabbitmqctl) lib/rabbitmqctl.ex:166: RabbitMQCtl.maybe_run_command/3
    (rabbitmqctl) lib/rabbitmqctl.ex:134: anonymous fn/5 in RabbitMQCtl.do_exec_parsed_command/5
    (rabbitmqctl) lib/rabbitmqctl.ex:605: RabbitMQCtl.maybe_with_distribution/3
    (rabbitmqctl) lib/rabbitmqctl.ex:106: RabbitMQCtl.exec_command/2
    (rabbitmqctl) lib/rabbitmqctl.ex:45: RabbitMQCtl.main/1
    (elixir) lib/kernel/cli.ex:105: anonymous fn/3 in Kernel.CLI.exec_fun/2

Error:
:function_clause


rabbitmq-env.conf
readonly ERL_SSL_PATH=/usr/lib/erlang/lib/ssl-10.0/ebin

readonly _base_dir='/etc/rabbitmq'
readonly USE_LONGNAME=true

readonly SERVER_ADDITIONAL_ERL_ARGS="-pa $ERL_SSL_PATH -proto_dist inet_tls -ssl_dist_optfile $_base_dir/ssl_dist.config"

readonly CTL_ERL_ARGS="-pa $ERL_SSL_PATH -proto_dist inet_tls -ssl_dist_optfile $_base_dir/ssl_dist.config"

# remove after 3.8.6 release - RABBITMQ_CTL_ERL_ARGS
readonly RABBITMQ_CTL_ERL_ARGS="$CTL_ERL_ARGS"
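
The CLI tools also accept a --longnames option, which gives a way to check whether the longnames: false seen above is part of the problem, independently of rabbitmq-env.conf. The sketch below only prints the command it would suggest. The helper name needs_longnames and the node name rabbit@node1.example.com are hypothetical.

```shell
# Succeed if the host part of an Erlang node name is fully qualified
# (contains a dot), in which case long node names are needed.
needs_longnames() {
  host=${1#*@}
  case $host in
    *.*) return 0 ;;
    *)   return 1 ;;
  esac
}

node='rabbit@node1.example.com'   # hypothetical node name
if needs_longnames "$node"; then
  echo "rabbitmq-diagnostics --longnames --node $node erlang_cookie_hash"
fi
```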

Vitaliy K

Jul 10, 2020, 2:55:40 PM
to rabbitmq-users
Hey Luke,
Your configuration worked for me, thanks a lot!

Initially we had a wildcard certificate (the same one for both) configured for server and client.
I disabled the client certificate and turned off peer verification and fail_if_no_peer_cert, and now it works fine.
I wonder what has changed there.
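
The final working file is not shown in the thread. As a rough sketch, and purely an inference from the description above (no client certificate, verify_none, fail_if_no_peer_cert off), it might look like what the hypothetical helper below writes out. The certificate paths are the ones from the earlier ssl_dist.config.

```shell
# Hypothetical helper: write an ssl_dist.config matching the description
# above to the given path. The content is an assumption, not the poster's
# actual file.
write_ssl_dist_config() {
  target=$1
  cat > "$target" <<'EOF'
[
  {server, [
    {cacertfile, "/etc/rabbitmq/tls/wildcard.host.local.interm.cer"},
    {certfile, "/etc/rabbitmq/tls/wildcard.host.local.cert.cer"},
    {keyfile, "/etc/rabbitmq/tls/wildcard.host.local.key"},
    {secure_renegotiate, true},
    {verify, verify_none},
    {fail_if_no_peer_cert, false}
  ]},
  {client, [
    {secure_renegotiate, true},
    {verify, verify_none}
  ]}
].
EOF
}

# Usage: write_ssl_dist_config /etc/rabbitmq/ssl_dist.config
```

With verify_none on both sides the inter-node link is encrypted but peers are not authenticated by certificate, so this is a way to get unblocked rather than a hardened setup.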

Luke Bakken

Jul 13, 2020, 6:04:25 PM
to rabbitmq-users
Hello,

There is a very good chance that certificate validation for wildcard certs changed between Erlang 21.3.3 and Erlang 23.0.2.

If you could provide instructions for generating wildcard certs in the same manner as you did, or at least provide enough information that I could use tls-gen to generate comparable certs, I could try to see what changed. It may be a bug in Erlang/OTP.

tls-gen allows me to customize the CN= value of a certificate - https://github.com/michaelklishin/tls-gen

I'm not sure if that is sufficient to reproduce what you are doing, however.

Thanks -
Luke

vkr2...@gmail.com

Jul 15, 2020, 2:31:07 PM
to rabbitmq-users
Hi Luke,
Here are some certificate parameters. We do not use self-signed certificates.

Issuer:
CN = Sectigo RSA Organization Validation Secure Server CA
O = Sectigo Limited
L = Salford
S = Greater Manchester
C = GB

Subject:
CN = *.x.y.host.local
OU = X
O = X
L = X
S = X
C = US

Enhanced Key Usage:
Server Authentication (1.3.6.1.5.5.7.3.1)
Client Authentication (1.3.6.1.5.5.7.3.2)

Key Usage:
Digital Signature, Key Encipherment (a0)

Basic constraints:
Subject Type=End Entity
Path Length Constraint=None

Subject alternative name:
DNS Name=*.x.y.host.local
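
When comparing a node's hostname against a wildcard SAN like the one above, the commonly applied rule is that the wildcard covers exactly one leftmost label. The sketch below illustrates that rule only; it is not a description of how Erlang/OTP validates hostnames, and both the function name wildcard_matches and the host node1.x.y.host.local are hypothetical.

```shell
# Succeed if `host` matches `pattern` under the single-label wildcard rule:
# "*.example.com" covers "a.example.com" but not "a.b.example.com".
# Comparison is case-sensitive here for brevity; real DNS matching is not.
wildcard_matches() {
  pattern=$1
  host=$2
  case $pattern in
    '*.'*)
      suffix=${pattern#'*.'}   # pattern with the wildcard label removed
      rest=${host#*.}          # host with its leftmost label removed
      [ "$rest" != "$host" ] && [ "$rest" = "$suffix" ]
      ;;
    *)
      [ "$pattern" = "$host" ]
      ;;
  esac
}

# Usage:
# wildcard_matches '*.x.y.host.local' 'node1.x.y.host.local' && echo match
```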

