(gnutls_handshake) ssl handshake fails after upgrade to 2.15

Atttila Santo

Jun 14, 2016, 3:14:26 AM
to ganeti
hi,

I recently upgraded an old ganeti 2.3 cluster to 2.12 and subsequently to 2.15. (It was a single node at the time; the second node was down due to a hardware failure and was to be re-added after the upgrade.) The upgrade went smoothly, although I have to admit I had expected otherwise.

gnt-cluster verify looked fine, but complained about self-signed certificates. So I figured the certs needed to be renewed and ran:

gnt-cluster renew-crypto --new-cluster-certificate --new-node-certificates



Ever since, gnt-cluster verify fails like this:

Submitted jobs 1174353, 1174354
Waiting for job 1174353 ...
Tue Jun 14 08:18:14 2016 * Verifying cluster config
Tue Jun 14 08:18:14 2016 * Verifying cluster certificate files
Tue Jun 14 08:18:14 2016 * Verifying hypervisor parameters
Tue Jun 14 08:18:14 2016 * Verifying all nodes belong to an existing group
Waiting for job 1174354 ...
Tue Jun 14 08:18:15 2016 * Verifying group 'default'
Tue Jun 14 08:18:15 2016 * Gathering data (1 nodes)
Tue Jun 14 08:18:15 2016 * Gathering information about nodes (1 nodes)
Tue Jun 14 08:18:15 2016 * Gathering disk information (1 nodes)
Tue Jun 14 08:18:15 2016   - ERROR: node emotion1.emotion.eu: while getting disk information: Error 35: gnutls_handshake() failed: Handshake failed
Tue Jun 14 08:18:15 2016 * Verifying configuration file consistency
Tue Jun 14 08:18:15 2016   - ERROR: node emotion1.emotion.eu: Could not verify the SSH setup of this node.
Tue Jun 14 08:18:15 2016   - ERROR: node emotion1.emotion.eu: Node did not return file checksum data
Tue Jun 14 08:18:15 2016 * Verifying node status
Tue Jun 14 08:18:15 2016   - ERROR: node emotion1.emotion.eu: while contacting node: Error 35: gnutls_handshake() failed: Handshake failed
Tue Jun 14 08:18:15 2016 * Verifying instance status
Tue Jun 14 08:18:15 2016   - ERROR: instance inst3.emotion.eu: instance not running on its primary node emotion1.emotion.eu
Tue Jun 14 08:18:15 2016   - ERROR: instance inst3.emotion.eu: couldn't retrieve status for disk/0 on emotion1.emotion.eu: Error 35: gnutls_handshake() failed: Handshake failed
Tue Jun 14 08:18:15 2016   - ERROR: node emotion1.emotion.eu: instance inst3.emotion.eu, connection to primary node failed
Tue Jun 14 08:18:15 2016   - ERROR: node emotion1.emotion.eu: instance inst2.emotion.eu, connection to primary node failed
Tue Jun 14 08:18:15 2016   - ERROR: instance inst1.emotion.eu: instance not running on its primary node emotion1.emotion.eu
Tue Jun 14 08:18:15 2016   - ERROR: instance inst1.emotion.eu: couldn't retrieve status for disk/0 on emotion1.emotion.eu: Error 35: gnutls_handshake() failed: Handshake failed
Tue Jun 14 08:18:16 2016   - ERROR: node emotion1.emotion.eu: instance inst1.emotion.eu, connection to primary node failed
Tue Jun 14 08:18:16 2016 * Verifying orphan volumes
Tue Jun 14 08:18:16 2016 * Verifying N+1 Memory redundancy
Tue Jun 14 08:18:16 2016 * Other Notes
Tue Jun 14 08:18:16 2016   - NOTICE: 3 non-redundant instance(s) found.
Tue Jun 14 08:18:16 2016  - WARNING: Communication failure to node 87111631-99df-480b-bdb6-0a5296c59e7b: Error 35: gnutls_handshake() failed: Handshake failed
Tue Jun 14 08:18:16 2016 * Hooks Results
Tue Jun 14 08:18:16 2016   - ERROR: node 87111631-99df-480b-bdb6-0a5296c59e7b: Communication failure in hooks execution: Error 35: gnutls_handshake() failed: Handshake failed




/var/log/ganeti/node-daemon.log shows:

ganeti-noded pid=11902 ERROR Error while handling request from 192.168.1.101:33637
Traceback (most recent call last):
  File "/usr/share/ganeti/2.15/ganeti/http/server.py", line 594, in _IncomingConnection
    self.request_executor(self, self.handler, connection, client_addr)
  File "/usr/share/ganeti/2.15/ganeti/server/noded.py", line 158, in __init__
    http.server.HttpServerRequestExecutor.__init__(self, *args, **kwargs)
  File "/usr/share/ganeti/2.15/ganeti/http/server.py", line 422, in __init__
    http.Handshake(sock, self.WRITE_TIMEOUT)
  File "/usr/share/ganeti/2.15/ganeti/http/__init__.py", line 539, in Handshake
    raise HttpError("Error in SSL handshake: %s" % err)
HttpError: Error in SSL handshake: ([('SSL routines', 'SSL3_GET_CLIENT_CERTIFICATE', 'peer did not return a certificate')],)




gnt-node seems to be working (info and list):

- Node name: emotion1.emotion.eu
  primary ip: 192.168.1.101
  secondary ip: 10.168.0.1
  master candidate: True
  drained: False
  offline: False
  master_capable: True
  vm_capable: True
  primary for instances:
    - inst1.emotion.eu
    - inst2.emotion.eu
    - inst3.emotion.eu
  secondary for instances:
  node parameters:
    cpu_speed: default (1)
    exclusive_storage: default (False)
    oob_program: default ()
    ovs: default (False)
    ovs_link: default ()
    ovs_name: default (switch1)
    spindle_count: default (1)
    ssh_port: default (22)



Node                DTotal  DFree MTotal MNode MFree Pinst Sinst
emotion1.emotion.eu 821.4G 291.4G  15.7G 10.7G  6.0G     3     0



And so does gnt-instance list

# gnt-instance list

Instance         Hypervisor OS                  Primary_node        Status     Memory
inst1.emotion.eu kvm        debootstrap+default emotion1.emotion.eu running      3.0G
inst2.emotion.eu kvm        debootstrap+default emotion1.emotion.eu ADMIN_down      -
inst3.emotion.eu kvm        debootstrap+default emotion1.emotion.eu running      6.0G




gnt-instance info inst1, however, fails:

# gnt-instance info inst1

Failure: command execution error:
Error checking node emotion1.emotion.eu: Error 35: gnutls_handshake() failed: Handshake failed




I've tried the steps from https://code.google.com/p/ganeti/wiki/GanetiAndSSL over and over, to no avail. In fact, it seems that ssconf_master_candidates_certs is created as soon as gnt-cluster verify is issued, but client.pem is never (re-)created, not even after issuing gnt-cluster renew-crypto --new-node-certificates.


As mentioned in another thread, I tried to verify client.pem against the new server.pem. To me it looks OK, but combined with client.pem not being re-created, it might well be the source of my trouble.

# openssl verify -CAfile /var/lib/ganeti/server.pem /var/lib/ganeti/client.pem
/var/lib/ganeti/client.pem: CN = ganeti.example.com
error 18 at 0 depth lookup:self signed certificate
OK
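
To tell whether renew-crypto actually rewrote the files, one could record modification times and checksums before and after a run and compare them. A minimal Python sketch, not part of Ganeti: the helper name describe_cert_file is made up for illustration, and the two paths are simply the ones used in the openssl command above.

```python
import hashlib
import os
import time


def describe_cert_file(path):
    """Return (exists, mtime_text, sha256_hex); missing files give (False, None, None)."""
    if not os.path.isfile(path):
        return (False, None, None)
    with open(path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    mtime_text = time.strftime(
        "%Y-%m-%d %H:%M:%S", time.localtime(os.path.getmtime(path))
    )
    return (True, mtime_text, digest)


if __name__ == "__main__":
    # Paths as quoted in this thread; adjust them if your data directory differs.
    for path in ("/var/lib/ganeti/server.pem", "/var/lib/ganeti/client.pem"):
        exists, mtime_text, digest = describe_cert_file(path)
        if exists:
            print("%s  modified %s  sha256 %s" % (path, mtime_text, digest))
        else:
            print("%s  is missing" % path)
```

Running it once before and once after gnt-cluster renew-crypto and comparing the two outputs would show which files changed; an unchanged checksum for client.pem would match the observation that it is not being re-created.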





Extra info just in case it helps solving this case:

This server is on debian 8.5; 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux

Ganeti is installed from the jessie-backports repository.

# gnt-cluster version

Software version: 2.15.2
Internode protocol: 2150000
Configuration format: 2150000
OS api version: 20
Export interface: 0
VCS version: (ganeti) version 2.15.2-1~bpo8+1


# dpkg -l|grep -E "tls|ssl|ganeti"

ii  ganeti                      2.15.2-1~bpo8+1   all   cluster virtualization manager
ii  ganeti-2.11                 2.11.6-1~bpo70+1  all   cluster virtualization manager - Python components
rc  ganeti-2.12                 2.12.4-1+deb8u3   all   cluster virtualization manager - Python components
ii  ganeti-2.15                 2.15.2-1~bpo8+1   all   cluster virtualization manager - Python components
ii  ganeti-haskell-2.11         2.11.6-1~bpo70+1  amd64 cluster virtualization manager - Haskell components
rc  ganeti-haskell-2.12         2.12.4-1+deb8u3   amd64 cluster virtualization manager - Haskell components
ii  ganeti-haskell-2.15         2.15.2-1~bpo8+1   amd64 cluster virtualization manager - Haskell components
ii  ganeti-htools-2.11          2.11.6-1~bpo70+1  amd64 cluster virtualization manager - tools for Ganeti 2.11
rc  ganeti-htools-2.12          2.12.4-1+deb8u3   amd64 cluster virtualization manager - tools for Ganeti 2.12
ii  ganeti-htools-2.15          2.15.2-1~bpo8+1   amd64 cluster virtualization manager - tools for Ganeti 2.15
ii  ganeti-instance-debootstrap 0.14-2            all   debootstrap-based instance OS definition for ganeti
ii  gnutls-bin                  3.3.8-6+deb8u3    amd64 GNU TLS library - commandline utilities
ii  libcurl3-gnutls:amd64       7.38.0-4+deb8u3   amd64 easy-to-use client-side URL transfer library (GnuTLS flavour)
ii  libcurl4-openssl-dev:amd64  7.38.0-4+deb8u3   amd64 development files and documentation for libcurl (OpenSSL flavour)
ii  libflac8:amd64              1.3.0-3           amd64 Free Lossless Audio Codec - runtime C library
ii  libgnutls-deb0-28:amd64     3.3.8-6+deb8u3    amd64 GNU TLS library - main runtime library
ii  libgnutls-openssl27:amd64   3.3.8-6+deb8u3    amd64 GNU TLS library - OpenSSL wrapper
ii  libgnutls26:amd64           2.12.20-8+deb7u5  amd64 GNU TLS library - runtime library
ii  libgnutls28-dev:amd64       3.3.8-6+deb8u3    amd64 GNU TLS library - development files
ii  libgnutlsxx28:amd64         3.3.8-6+deb8u3    amd64 GNU TLS library - C++ runtime library
ii  libio-socket-ssl-perl       2.002-2+deb8u1    all   Perl module implementing object oriented interface to SSL sockets
ii  libnet-smtp-ssl-perl        1.01-3            all   Perl module providing SSL support to Net::SMTP
ii  libnet-ssleay-perl          1.65-1+b1         amd64 Perl module for Secure Sockets Layer (SSL)
ii  libssl-dev:amd64            1.0.1t-1+deb8u2   amd64 Secure Sockets Layer toolkit - development files
ii  libssl-doc                  1.0.1t-1+deb8u2   all   Secure Sockets Layer toolkit - development documentation
ii  libssl0.9.8                 0.9.8o-4squeeze14 amd64 SSL shared libraries
ii  libssl1.0.0:amd64           1.0.1t-1+deb8u2   amd64 Secure Sockets Layer toolkit - shared libraries
ii  libwavpack1:amd64           4.70.0-1          amd64 audio codec (lossy and lossless) - library
ii  openssl                     1.0.1t-1+deb8u2   amd64 Secure Sockets Layer toolkit - cryptographic utility
ii  python-gnutls               2.0.1-2           amd64 Python wrapper for the GNUTLS library
ii  python-openssl              0.15.1-2~bpo8+1   all   Python 2 wrapper around the OpenSSL library
ii  ssl-cert                    1.0.35            all   simple debconf wrapper for OpenSSL




After searching for a solution for days, I'm at a loss and hope for some support or hint.

regards,
Attila

Atttila Santo

Jul 1, 2016, 6:04:35 PM
to ganeti
any hints? anything? I'm still stuck.

sorry for the bump.

Viktor Bachraty

Jul 4, 2016, 9:14:22 AM
to gan...@googlegroups.com
I'm pretty sure gnt-cluster renew-crypto --new-node-certificates should recreate client.pem on all nodes, while --new-cluster-certificate recreates server.pem. If you say it didn't get regenerated, did the command just fail silently? Another thing that comes to mind is redist-conf (but that should only push ssconf files).

robin.w...@tnp.net.uk

Aug 19, 2016, 2:38:09 PM
to ganeti
Hi Atttila,

Having the same problem here after a debian upgrade which updated openssl - did you ever get to the bottom of this?

Thanks,
Robin.

robin.w...@tnp.net.uk

Aug 21, 2016, 1:01:37 PM
to ganeti
In case it helps anyone: the cause (for me at least) was that during an upgrade from the Jessie stable repo (Ganeti 2.12.4), which included some Xen packages and a kernel but no ganeti packages, the list of ciphers in /usr/share/ganeti/2.12/ganeti/_constants.py had been replaced with a NULL version:

# Generated automatically from Haskell constant 'opensslCiphers' in file 'src/Ganeti/Constants.hs'
#OPENSSL_CIPHERS = "HIGH:-DES:-3DES:-EXPORT:-ADH"
OPENSSL_CIPHERS = "NULL"

No idea how that could happen, but after switching back, everything was fine again.  Would be interested to hear if this happens for anyone else.
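
For anyone wanting to check their own nodes for the same symptom, here is a minimal Python sketch. It is an assumption-laden illustration rather than anything Ganeti ships: the function names are hypothetical, and it only looks for an uncommented OPENSSL_CIPHERS assignment of the shape quoted above in the file path mentioned above.

```python
import os
import re

# Matches an uncommented assignment such as: OPENSSL_CIPHERS = "HIGH:-DES"
# Lines starting with '#' are skipped because only spaces/tabs may precede the name.
ASSIGNMENT = re.compile(
    r'^[ \t]*OPENSSL_CIPHERS[ \t]*=[ \t]*["\']([^"\']*)["\']', re.MULTILINE
)


def active_cipher_string(source_text):
    """Return the value of the last uncommented OPENSSL_CIPHERS assignment, or None."""
    values = ASSIGNMENT.findall(source_text)
    return values[-1] if values else None


def is_null_cipher_list(source_text):
    """True if the active cipher string is exactly NULL (case-insensitive)."""
    value = active_cipher_string(source_text)
    return value is not None and value.strip().upper() == "NULL"


if __name__ == "__main__":
    # Path as quoted in this thread (2.12); other versions live in other directories.
    path = "/usr/share/ganeti/2.12/ganeti/_constants.py"
    if os.path.isfile(path):
        with open(path) as handle:
            text = handle.read()
        print("active OPENSSL_CIPHERS:", active_cipher_string(text))
        if is_null_cipher_list(text):
            print("WARNING: cipher list is NULL")
    else:
        print("%s not found" % path)
```

Taking the last assignment mirrors how Python itself would evaluate the file if the constant were assigned more than once.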

I found this bug, part of which suggests switching the ciphers to NULL (and it relates to version 2.12.4), but I'm not sure whether that means this change occurred because of a package update:
https://code.google.com/p/ganeti/issues/detail?id=1104

From the apt history.log, these are the packages that were updated:

Start-Date: 2016-08-19  11:04:23
Commandline: apt-get dist-upgrade
Upgrade: libxapian22:amd64 (1.2.19-1, 1.2.19-1+deb8u1), lvm2:amd64 (2.02.111-2.2, 2.02.111-2.2+deb8u1), perl:amd64 (5.20.2-3+deb8u4, 5.20.2-3+deb8u6), linux-image-3.16.0-4-amd64:amd64 (3.16.7-ckt25-2, 3.16.7-ckt25-2+deb8u3), libssl1.0.0:amd64 (1.0.1k-3+deb8u5, 1.0.1t-1+deb8u2), exim4-base:amd64 (4.84.2-1, 4.84.2-2+deb8u1), perl-base:amd64 (5.20.2-3+deb8u4, 5.20.2-3+deb8u6), dpkg:amd64 (1.17.26, 1.17.27), libxen-4.4:amd64 (4.4.1-9+deb8u5, 4.4.1-9+deb8u6), dmsetup:amd64 (1.02.90-2.2, 1.02.90-2.2+deb8u1), exim4:amd64 (4.84.2-1, 4.84.2-2+deb8u1), libfontconfig1:amd64 (2.11.0-6.3, 2.11.0-6.3+deb8u1), libksba8:amd64 (1.3.2-1, 1.3.2-1+deb8u1), libxenstore3.0:amd64 (4.4.1-9+deb8u5, 4.4.1-9+deb8u6), openssh-server:amd64 (6.7p1-5+deb8u2, 6.7p1-5+deb8u3), liblvm2cmd2.02:amd64 (2.02.111-2.2, 2.02.111-2.2+deb8u1), fontconfig:amd64 (2.11.0-6.3, 2.11.0-6.3+deb8u1), openssh-sftp-server:amd64 (6.7p1-5+deb8u2, 6.7p1-5+deb8u3), exim4-daemon-light:amd64 (4.84.2-1, 4.84.2-2+deb8u1), libmodule-build-perl:amd64 (0.421000-2, 0.421000-2+deb8u1), xen-hypervisor-4.4-amd64:amd64 (4.4.1-9+deb8u5, 4.4.1-9+deb8u6), xenstore-utils:amd64 (4.4.1-9+deb8u5, 4.4.1-9+deb8u6), fontconfig-config:amd64 (2.11.0-6.3, 2.11.0-6.3+deb8u1), base-files:amd64 (8+deb8u4, 8+deb8u5), gnupg:amd64 (1.4.18-7+deb8u1, 1.4.18-7+deb8u2), initramfs-tools:amd64 (0.120+deb8u1, 0.120+deb8u2), libspice-server1:amd64 (0.12.5-1+deb8u2, 0.12.5-1+deb8u3), libintl-perl:amd64 (1.23-1, 1.23-1+deb8u1), xen-utils-common:amd64 (4.4.1-9+deb8u5, 4.4.1-9+deb8u6), perl-modules:amd64 (5.20.2-3+deb8u4, 5.20.2-3+deb8u6), openssh-client:amd64 (6.7p1-5+deb8u2, 6.7p1-5+deb8u3), exim4-config:amd64 (4.84.2-1, 4.84.2-2+deb8u1), libdevmapper-event1.02.1:amd64 (1.02.90-2.2, 1.02.90-2.2+deb8u1), dmeventd:amd64 (1.02.90-2.2, 1.02.90-2.2+deb8u1), xen-linux-system-3.16.0-4-amd64:amd64 (3.16.7-ckt25-2, 3.16.7-ckt25-2+deb8u3), libcurl3:amd64 (7.38.0-4+deb8u3, 7.38.0-4+deb8u4), libdevmapper1.02.1:amd64 (1.02.90-2.2, 1.02.90-2.2+deb8u1), gpgv:amd64 (1.4.18-7+deb8u1, 1.4.18-7+deb8u2), xen-utils-4.4:amd64 (4.4.1-9+deb8u5, 4.4.1-9+deb8u6), libexpat1:amd64 (2.1.0-6+deb8u2, 2.1.0-6+deb8u3), tzdata:amd64 (2016d-0+deb8u1, 2016f-0+deb8u1), openssl:amd64 (1.0.1k-3+deb8u5, 1.0.1t-1+deb8u2), xen-system-amd64:amd64 (4.4.1-9+deb8u5, 4.4.1-9+deb8u6), libcurl3-gnutls:amd64 (7.38.0-4+deb8u3, 7.38.0-4+deb8u4), libgcrypt20:amd64 (1.6.3-2+deb8u1, 1.6.3-2+deb8u2)
End-Date: 2016-08-19  11:05:36


I'd log it as a bug, as it seems quite important given that it broke the cluster. But at this point I'm not sure whether the change was caused by ganeti (none of the packages above are ganeti packages), although it is a ganeti file that caused the breakage.

Does anyone who knows more about this have any thoughts?

Cheers,
Rob

Benjamin Redling

Aug 22, 2016, 5:13:18 PM
to gan...@googlegroups.com
Hi,

maybe I am mixing things up here, but your "NULL" in _constants.py
sounded familiar...

On 2016-08-21 19:01, robin.w...@tnp.net.uk wrote:
> [...] using Jessie stable repo (2.12.4) the list of CIPHERS in the
> /usr/share/ganeti/2.12/ganeti/_constants.py file had been replaced with a
> NULL version:
>
> # Generated automatically from Haskell constant 'opensslCiphers' in file
> 'src/Ganeti/Constants.hs'
> #OPENSSL_CIPHERS = "HIGH:-DES:-3DES:-EXPORT:-ADH"
> OPENSSL_CIPHERS = "NULL"
> No idea how that could happen, but after switching back, everything was
> fine again.

$subject says you are on 2.15. So I am confused about what that issue with
your (parallel?) 2.12 means.

Regarding "switching back": gnt-backup export still working?
Putting NULL in constants.py could be a (questionable?) hack to
circumvent another problem (with socat/TLS).


> Would be interested to hear if this happens for anyone else.
> I found this bug in which part of it suggested switching ciphers to null
> (and it relates to 2.12.4 version), but not sure if that means this change
> occurred because of a package update?
> https://code.google.com/p/ganeti/issues/detail?id=1104

gnt-backup export stops working after an upgrade to 2.15 due to a protocol
version naming issue. Related?
https://groups.google.com/forum/#!topic/ganeti/Mm8LrdXYsr8

Regards,
Benjamin
--
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
vox: +49 3641 9 44323 | fax: +49 3641 9 44321