Peer discovery kubernetes not able to connect with k8s apiserver

Dennis Winter

Sep 6, 2018, 5:35:47 PM
to rabbitmq-users
Hello,

I'm currently trying to build a cluster on k8s with the rabbitmq_peer_discovery_k8s plugin. 
The problem seems to be some kind of incompatibility with certificates.
We are using self-signed certs, but since connecting to the apiserver via cURL works without any issue, I don't think the certs themselves are the problem.

2018-09-06 21:26:06.912 [info] <0.217.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2018-09-06 21:26:06.912 [debug] <0.217.0> Peer discovery backend supports initialisation.
2018-09-06 21:26:06.912 [debug] <0.217.0> Peer discovery Kubernetes: initialising...
2018-09-06 21:26:06.912 [debug] <0.217.0> HTTP client proxy is not configured
2018-09-06 21:26:06.912 [debug] <0.217.0> Peer discovery backend initialisation succeeded.
2018-09-06 21:26:06.912 [info] <0.217.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2018-09-06 21:26:06.913 [info] <0.217.0> Peer discovery backend does not support locking, falling back to randomized delay
2018-09-06 21:26:06.913 [info] <0.217.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2018-09-06 21:26:06.920 [info] <0.197.0> SSL WARNING: Ignoring a CA cert as it could not be correctly decoded.

2018-09-06 21:26:06.927 [info] <0.236.0> TLS client: In state certify at ssl_handshake.erl:339 generated CLIENT ALERT: Fatal - Internal Error - {unexpected_error,{case_clause,{error,{asn1,{...}}}}}

2018-09-06 21:26:06.927 [debug] <0.217.0> Response: {error,{failed_connect,[{to_address,{"manager.k8s.local",6443}},{inet,[inet],{tls_alert,"internal error"}}]}}
2018-09-06 21:26:06.927 [debug] <0.217.0> HTTP Error {failed_connect,[{to_address,{"manager.k8s.local",6443}},{inet,[inet],{tls_alert,"internal error"}}]}
2018-09-06 21:26:06.927 [info] <0.217.0> Failed to get nodes from k8s - {failed_connect,[{to_address,{"manager.k8s.local",6443}},
                 {inet,[inet],{tls_alert,"internal error"}}]}
2018-09-06 21:26:06.927 [error] <0.216.0> CRASH REPORT Process <0.216.0> with 0 neighbours exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"manager.k8s.local\",6443}},\n                 {inet,[inet],{tls_alert,\"internal error\"}}]}"} in rabbit_mnesia:init_from_config/0 line 164 in application_master:init/4 line 134
2018-09-06 21:26:06.927 [info] <0.33.0> Application rabbit exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"manager.k8s.local\",6443}},\n                 {inet,[inet],{tls_alert,\"internal error\"}}]}"} in rabbit_mnesia:init_from_config/0 line 164
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,"{failed_connect,[{to_address,{\"manager.k8s.loca
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"manager.k8s.local\\",6443}},\n                 {inet,[inet],{tls_alert,\\"internal error\\"}}]}\"}},[{rabbit_mnesia,init_from_config,0,[{file,\"src/rabbit_mnesia.erl\"},{line,164}]},{rabbit_mnesia,init_with_lock,3,[{file,\"src/rabbit_mnesia.erl\"},{line,144}]},{rabbit_mnesia,init,0,[{file,\"src/rabbit_mnesia.erl\"},{line,111}]},{rabbit_boot_steps,'-run_step/2-lc$^1/1-1-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,run_step,2,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,'-run_boot_steps/1-lc$^0/1-0-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,26}]},{rabbit_boot_steps,run_boot_steps,1,[{file,\"src/rabbit_boot_steps.erl\"},{line,26}]},{rabbit,start,2,[{file,\"src/rabbit.erl\"},{line,805}]}]}}}}}"}

The actual configuration used: 
[
      {ssl, [
        {versions, ['tlsv1.2', 'tlsv1.1']},
        {ciphers,  [
          {ecdhe_rsa,aes_256_gcm,null,sha384},
          {ecdhe_rsa,aes_256_cbc,sha384,sha384},
          {ecdh_rsa,aes_256_gcm,null,sha384},
          {ecdh_rsa,aes_256_cbc,sha384,sha384},
          {dhe_rsa,aes_256_gcm,null,sha384},
          {dhe_dss,aes_256_gcm,null,sha384},
          {dhe_rsa,aes_256_cbc,sha256},
          {dhe_dss,aes_256_cbc,sha256},
          {rsa,aes_256_gcm,null,sha384},
          {rsa,aes_256_cbc,sha256},
          {ecdhe_rsa,aes_128_gcm,null,sha256},
          {ecdhe_rsa,aes_128_cbc,sha256,sha256},
          {ecdh_rsa,aes_128_gcm,null,sha256},
          {ecdh_rsa,aes_128_cbc,sha256,sha256},
          {dhe_rsa,aes_128_gcm,null,sha256},
          {dhe_dss,aes_128_gcm,null,sha256},
          {dhe_rsa,aes_128_cbc,sha256},
          {ecdh_rsa,aes_128_gcm,null,sha256}
        ]}
      ]},
      {rabbitmq_management, [
        {listener, [
          {port, 15672}
        ]},
        {load_definitions, "/etc/rabbitmq/definitions.json"}
      ]},
      {rabbit, [
        {log, [
          {console, [
            {enabled, true},
            {level,   debug}
          ]}
        ]},
        {cluster_partition_handling, autoheal},
        {queue_master_locator, "min-masters"},
        {loopback_users, []},
        {cluster_formation, [
          {peer_discovery_backend, rabbit_peer_discovery_k8s},
          {peer_discovery_k8s, [
            {k8s_host, "manager.k8s.local"},
            {k8s_port, 6443},
            {k8s_address_type, ip}
          ]},
          {node_cleanup, [
            {interval, 30},
            {only_log_warning, true}
          ]}
        ]},
        {ssl_options, [
          {versions, ['tlsv1.2', 'tlsv1.1']},
          %% This list is just an example!
          %% Not all cipher suites are available on all machines.
          %% Cipher suite order is important: preferred suites
          %% should be listed first.
          %% Different suites have different security and CPU load characteristics.
          {ciphers,  [
            {ecdhe_rsa,aes_256_gcm,null,sha384},
            {ecdhe_rsa,aes_256_cbc,sha384,sha384},
            {ecdh_rsa,aes_256_gcm,null,sha384},
            {ecdh_rsa,aes_256_cbc,sha384,sha384},
            {dhe_rsa,aes_256_gcm,null,sha384},
            {dhe_dss,aes_256_gcm,null,sha384},
            {dhe_rsa,aes_256_cbc,sha256},
            {dhe_dss,aes_256_cbc,sha256},
            {rsa,aes_256_gcm,null,sha384},
            {rsa,aes_256_cbc,sha256},
            {ecdhe_rsa,aes_128_gcm,null,sha256},
            {ecdhe_rsa,aes_128_cbc,sha256,sha256},
            {ecdh_rsa,aes_128_gcm,null,sha256},
            {ecdh_rsa,aes_128_cbc,sha256,sha256},
            {dhe_rsa,aes_128_gcm,null,sha256},
            {dhe_dss,aes_128_gcm,null,sha256},
            {dhe_rsa,aes_128_cbc,sha256},
            {ecdh_rsa,aes_128_gcm,null,sha256}
          ]}
        ]}
      ]}
    ].

As one can see, I have already tried a few things, but nothing really works. I've already updated Erlang to 20.3.8.6 and RabbitMQ to 3.7.7.
Any help is appreciated.
Thanks 

Michael Klishin

Sep 6, 2018, 5:46:00 PM
to rabbitm...@googlegroups.com
And what were those things you already tried?

The error says there was a TLS alert. See Kubernetes API logs if any. I highly recommend relaxing the cipher suite
and maybe even version requirements and reintroducing them one by one (e.g. removing cipher suites one by one),
then you will know which one is the crucial intersection.

A TLS alert that says "internal error" to me suggests it's a TLS implementation incompatibility but what that might be is impossible to tell
without a lengthy traffic dump inspection session.

Most content covered in [1] applies to any TLS client and server combination.
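
The "one suite at a time" check described above can also be run from a different client. What follows is only a sketch in Python, not part of RabbitMQ or the peer discovery plugin. The helper names (handshake_ok, accepted_suites) are made up for this example, the OpenSSL suite names are an assumed translation of a few of the Erlang tuples from the configuration in the original post, and the client is capped at TLS 1.2 because set_ciphers() does not control TLS 1.3 suites.

```python
import socket
import ssl

# OpenSSL names for a few suites from the posted configuration (assumed mapping),
# e.g. {ecdhe_rsa,aes_256_gcm,null,sha384} -> ECDHE-RSA-AES256-GCM-SHA384.
CANDIDATES = [
    "ECDHE-RSA-AES256-GCM-SHA384",
    "ECDHE-RSA-AES128-GCM-SHA256",
    "DHE-RSA-AES256-GCM-SHA384",
    "AES256-GCM-SHA384",
]


def handshake_ok(host, port, suite, cafile=None, timeout=5.0):
    """Return True if a verified TLS 1.2 handshake limited to `suite` succeeds."""
    ctx = ssl.create_default_context(cafile=cafile)
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    try:
        # Raises ssl.SSLError if the local OpenSSL does not know the suite.
        ctx.set_ciphers(suite)
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:
        # ssl.SSLError, timeouts and DNS failures are all OSError subclasses.
        return False


def accepted_suites(suites, probe):
    """Try each suite on its own and keep those for which probe(suite) is true."""
    return [s for s in suites if probe(s)]
```

For example, accepted_suites(CANDIDATES, lambda s: handshake_ok("manager.k8s.local", 6443, s, cafile="ca.crt")) would return the subset the API server accepts from this client, where ca.crt is a placeholder for your CA file. A certificate that cannot be verified makes every suite look rejected, so it is worth checking the CA file first.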


--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Jonh Dow

Sep 18, 2019, 4:28:56 AM
to rabbitmq-users
Hi,
Did you manage to fix that issue?
If yes, how?
I am experiencing the same issue when I try to install RabbitMQ image 3.7.15-debian-9-r8 on Kubernetes 1.15.
I have not set any ciphers in the RabbitMQ config, but I am getting:
Peer discovery backend rabbit_peer_discovery_k8s

SSL WARNING: Ignoring a CA cert as it could not be correctly decoded.
2019-09-18 08:07:16.096 [info] <0.246.0> TLS client: In state certify at ssl_handshake.erl:364 generated CLIENT ALERT: Fatal - Internal Error - {unexpected_error,{case_clause,{error,{asn1,{...}}}}} 
when the container starts. The CA cert seems OK as I can access it from another pod.

Any suggestions what could be the issue?

Thanks



Michael Klishin

Sep 18, 2019, 1:44:11 PM
to rabbitmq-users
> 2018-09-06 21:26:06.920 [info] <0.197.0> SSL WARNING: Ignoring a CA cert as it could not be correctly decoded.

is yet another confirmation (in addition to a new thread by J Dow) that the ASN.1 decoder in Erlang runs into an exception.

This likely is ERL-968 [1].
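
Since the warning is about a single CA certificate that could not be decoded, it may help to find out how many certificates the CA file contains and to look at each one on its own. A minimal sketch in Python follows. The function names are hypothetical, and the check goes through OpenSSL (via Python's ssl module), not through Erlang's ASN.1 decoder, so a certificate that passes here can still trip Erlang, which would be consistent with cURL and openssl working in this thread. The value is in isolating individual certificates, for example for a bug report.

```python
import re
import ssl

# One PEM-encoded certificate, including its BEGIN/END marker lines.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem_bundle(text):
    """Return every PEM certificate block found in a CA bundle, in order."""
    return PEM_BLOCK.findall(text)


def openssl_can_load(pem):
    """Return True if OpenSSL accepts this single certificate as a CA cert."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        ctx.load_verify_locations(cadata=pem)
        return True
    except (ssl.SSLError, ValueError):
        return False


def check_bundle(path):
    """Return (position, loaded_ok) for each certificate in the file at `path`."""
    with open(path) as fh:
        blocks = split_pem_bundle(fh.read())
    return [(i, openssl_can_load(pem)) for i, pem in enumerate(blocks, start=1)]
```

Calling check_bundle("/var/run/secrets/kubernetes.io/serviceaccount/ca.crt") would list each certificate's position and whether OpenSSL loaded it. That path is the conventional service account mount inside a pod and is an assumption here, so substitute whatever CA file your node is configured to use.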

Michael Klishin

Sep 18, 2019, 1:48:19 PM
to rabbitmq-users
A fix for ERL-968 is available in Erlang 21.3.8.7, 22.0.7 and 22.1, the latest versions
in the 21 and 22 release series. See [1] to learn how to provision them on Debian-based systems
and [2] for RPM-based ones.

Jonh Dow

Sep 19, 2019, 11:34:38 AM
to rabbitmq-users
Thanks Michael,
The thing is, I am using a RabbitMQ Docker image, which I assume is itself built on an Erlang image.
Do you know by any chance which RabbitMQ image is likely to be based on an Erlang image that includes the bug fix?

Thank you
Regards
Stan

Jonh Dow

Sep 19, 2019, 4:37:49 PM
to rabbitmq-users
Hi Michael,
I have used another RabbitMQ image that is based on Erlang 22.1 and still have the TLS error below:
******************************************************************************************************
Starting RabbitMQ 3.7.18 on Erlang 22.1
..............
2019-09-19 20:14:04.472 [debug] <0.234.0> GET https://kubernetes.default.svc.cluster.local:443/api/v1/namespaces/helm/endpoints/rabbitmq-headless
2019-09-19 20:14:04.495 [debug] <0.234.0> Response: {error,{failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}},{inet,[inet],{tls_alert,{internal_error,"TLS client: In state certify at ssl_handshake.erl:376 generated CLIENT ALERT: Fatal - Internal Error - {unexpected_error,{case_clause,{error,{asn1,{...}}}}}\n"}}}]}}
2019-09-19 20:14:04.495 [debug] <0.234.0> HTTP Error {failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}},{inet,[inet],{tls_alert,{internal_error,"TLS client: In state certify at ssl_handshake.erl:376 generated CLIENT ALERT: Fatal - Internal Error - {unexpected_error,{case_clause,{error,{asn1,{...}}}}}\n"}}}]}
2019-09-19 20:14:04.496 [info] <0.234.0> Failed to get nodes from k8s - {failed_connect,
[{to_address,{"kubernetes.default.svc.cluster.local",443}},
{inet,
[inet],
{tls_alert,
{internal_error,
"TLS client: In state certify at ssl_handshake.erl:376 generated CLIENT ALERT: Fatal - Internal Error - {unexpected_error,{case_clause,{error,{asn1,{...}}}}}\n"}}}]}
2019-09-19 20:14:04.496 [error] <0.233.0> CRASH REPORT Process <0.233.0> with 0 neighbours exited with reason: no case clause matching {error,"{failed_connect,\n [{to_address,{\"kubernetes.default.svc.cluster.local\",443}},\n {inet,\n [inet],\n {tls_alert,\n {internal_error,\n \"TLS client: In state certify at ssl_handshake.erl:376 generated CLIENT ALERT: Fatal - Internal Error
*******************************************************************************************************************************************************************
Once again, I am able to reach the CA certificate from the Pod with the openssl command before the container has fully started.
I am really not sure what could be the reason for this error.
Any idea?
Thanks
John

Michael Klishin

Sep 19, 2019, 8:00:54 PM
to rabbitmq-users
The Docker image is automatically rebuilt when a new version of Erlang, RabbitMQ or OpenSSL is released.


Michael Klishin

Sep 19, 2019, 8:02:21 PM
to rabbitmq-users
Please report this to [1] and provide as much information as possible for the Erlang maintainers to reproduce.
[2] helps you narrow things down and collect evidence of connections succeeding with a different (and very minimalistic) client.

ASN.1 and TLS are implemented by Erlang, not RabbitMQ.
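
As an illustration of what a very minimal client can look like, here is a sketch in Python using only the standard library. It is not the tool referenced in [2], and the function names are hypothetical. It completes one verified handshake and returns the negotiated protocol version, the cipher suite, and the subject and issuer of the server certificate, which is the kind of evidence that could be attached to a report for the Erlang maintainers.

```python
import socket
import ssl


def build_context(cafile=None):
    """Client context that verifies the server certificate and host name.

    With cafile=None the system trust store is used; otherwise only `cafile`.
    """
    return ssl.create_default_context(cafile=cafile)


def probe(host, port, cafile=None, timeout=5.0):
    """Complete one TLS handshake and return what was negotiated."""
    ctx = build_context(cafile)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "protocol": tls.version(),
                "cipher": tls.cipher()[0],
                "subject": cert.get("subject"),
                "issuer": cert.get("issuer"),
            }
```

From inside the pod, probe("kubernetes.default.svc.cluster.local", 443, cafile=...) with the CA file the node is configured to use would either return these details or raise the underlying ssl.SSLError, and both outcomes are useful to include alongside the Erlang log lines.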



Jonh Dow

Sep 23, 2019, 11:37:01 AM
to rabbitmq-users
Hi Michael,
I have also checked on the Erlang side.
It seems more detailed logs are needed, especially for the part that is elided after asn1,{... in the following line:
Internal Error - {unexpected_error,{case_clause,{error,{asn1,{...}}}}}

Do you know how I can enable more detailed SSL logging in the RabbitMQ setup?

Thank you
Regards
John