Erlang 26 breaks inter-node client with fail_if_no_peer_cert=true


Julio Polo

May 8, 2024, 11:52:17 PM
to rabbitmq-users
I have determined that going from Erlang 25.3.2.12-1.el8 to Erlang 26.0-1.el8 (any Erlang 26 version actually) causes the following issues in my cluster:
  • Starting rabbitmq-server on the first node works fine, but running any CLI command on that same node, such as "rabbitmqctl cluster_status", results in "TCP connection succeeded but Erlang distribution failed"
  • Attempts to start rabbitmq-server on the other two nodes get stuck "waiting for Mnesia tables" and never succeed. The eventual error after the last attempt also says "TCP connection succeeded but Erlang distribution failed"
Both of these problems go away if I change the client fail_if_no_peer_cert from true to false in the -ssl_dist_optfile:

[
  {server, [
    {versions, [ 'tlsv1.3' ]},
    {cacertfile, "/etc/pki/tls/certs/node-cacert"},
    {certfile, "/etc/pki/tls/certs/node-cert"},
    {keyfile,  "/etc/pki/tls/private/node-key"},
    {verify, verify_peer},
    {depth, 4},
    {fail_if_no_peer_cert, true}
  ]},
  {client, [
    {versions, [ 'tlsv1.3' ]},
    {cacertfile, "/etc/pki/tls/certs/node-cacert"},
    {certfile, "/etc/pki/tls/certs/node-cert"},
    {keyfile,  "/etc/pki/tls/private/node-key"},
    {verify, verify_peer},
    {depth, 4},
    {fail_if_no_peer_cert, false}
  ]}
].


This is strange because the RabbitMQ logs do not show any cert errors when the client fail_if_no_peer_cert is set to true.

I have used the openssl command to verify that each node listening on 25672 returns an error if I do not send a client cert. I have also verified that sending the client certs configured in the above ssl_dist_optfile does not return any errors (including no verification errors).

My original setting of client fail_if_no_peer_cert=true works for Erlang 25 and RabbitMQ versions 3.11.13 through 3.12.13. I have also tried upgrading to the latest RabbitMQ 3.13.2-1.el8 and Erlang 26.2.5-1.el8, but that didn't help either. I made sure I updated ERL_SSL_PATH to match whatever was installed after an Erlang upgrade before I started rabbitmq-server.

This all seems to point to a problem introduced when going from Erlang 25 to Erlang 26.  Hopefully someone in the Erlang or RabbitMQ team can confirm that this is indeed a bug.

-julio

Julio Polo

May 9, 2024, 3:04:14 PM
to rabbitmq-users
Never mind.  I found https://github.com/erlang/otp/issues/7497 which explains that fail_if_no_peer_cert is not valid as a client setting.  Sorry for the false alarm.
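
For anyone landing here with the same error, here is a minimal sketch of what the corrected -ssl_dist_optfile could look like, assuming the rest of the file from the first message is otherwise valid. The only change is that fail_if_no_peer_cert is dropped from the client section; the paths and other options are copied from the original post, and the comments are illustrative rather than taken from the linked issue.

```erlang
%% Sketch only: same file as in the first message, with the server-only
%% option removed from the client section.
[
  {server, [
    {versions, ['tlsv1.3']},
    {cacertfile, "/etc/pki/tls/certs/node-cacert"},
    {certfile, "/etc/pki/tls/certs/node-cert"},
    {keyfile,  "/etc/pki/tls/private/node-key"},
    {verify, verify_peer},
    {depth, 4},
    %% Server-side option: reject connecting nodes that present no certificate.
    {fail_if_no_peer_cert, true}
  ]},
  {client, [
    {versions, ['tlsv1.3']},
    {cacertfile, "/etc/pki/tls/certs/node-cacert"},
    {certfile, "/etc/pki/tls/certs/node-cert"},
    {keyfile,  "/etc/pki/tls/private/node-key"},
    %% The client still verifies the server's certificate chain.
    {verify, verify_peer},
    {depth, 4}
    %% No fail_if_no_peer_cert here: it is not a valid client setting.
  ]}
].
```

With verify_peer on both sides and fail_if_no_peer_cert on the server side only, each node should still have to present a valid certificate when it connects to a peer, so mutual authentication between nodes is kept.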

-julio
