I am running a two-node cluster on CentOS 7.1. I recently updated Ganeti from 2.12.1 to 2.12.4. Now, when I run "gnt-cluster master-failover -d" on the master candidate node, I get this:
2015-06-26 14:20:40,788: gnt-cluster master-failover pid=4790 cli:2706 DEBUG Command line: gnt-cluster master-failover -d
2015-06-26 14:20:40,789: gnt-cluster master-failover pid=4790 node:93 INFO Using PycURL libcurl/7.29.0 NSS/3.15.4 zlib/1.2.7 libidn/1.28 libssh2/1.4.3
2015-06-26 14:20:40,792: gnt-cluster master-failover pid=4790 client:142 DEBUG Starting request <ganeti.http.client.HttpClientRequest 192.168.2.202:1811 POST /master_node_name at 0x13313d0>
2015-06-26 14:20:40,792: gnt-cluster master-failover pid=4790 client:142 DEBUG Starting request <ganeti.http.client.HttpClientRequest 192.168.2.201:1811 POST /master_node_name at 0x1331490>
2015-06-26 14:20:40,903: gnt-cluster master-failover pid=4790 client:228 DEBUG Request <ganeti.http.client.HttpClientRequest 192.168.2.201:1811 POST /master_node_name at 0x1331490> finished, errmsg=None
2015-06-26 14:20:40,904: gnt-cluster master-failover pid=4790 client:228 DEBUG Request <ganeti.http.client.HttpClientRequest 192.168.2.202:1811 POST /master_node_name at 0x13313d0> finished, errmsg=None
2015-06-26 14:20:40,906: gnt-cluster master-failover pid=4790 process:217 INFO RunCmd /usr/lib64/ganeti/daemon-util start ganeti-wconfd --force-node --no-voting --yes-do-it
2015-06-26 14:20:40,960: gnt-cluster master-failover pid=4790 process:217 INFO RunCmd /usr/lib64/ganeti/daemon-util stop ganeti-wconfd
2015-06-26 14:20:41,008: gnt-cluster master-failover pid=4790 cli:2713 ERROR Error during command processing
Traceback (most recent call last):
  File "/usr/share/ganeti/2.12/ganeti/cli.py", line 2709, in GenericMain
    result = func(options, args)
  File "/usr/share/ganeti/2.12/ganeti/rpc/node.py", line 141, in wrapper
    return fn(*args, **kwargs)
  File "/usr/share/ganeti/2.12/ganeti/client/gnt_cluster.py", line 861, in MasterFailover
    rvlaue, msgs = bootstrap.MasterFailover(no_voting=opts.no_voting)
  File "/usr/share/ganeti/2.12/ganeti/bootstrap.py", line 1071, in MasterFailover
    cfg = config.GetConfig(None, livelock, accept_foreign=True)
  File "/usr/share/ganeti/2.12/ganeti/config.py", line 105, in GetConfig
    kwargs['wconfd'] = wc.Client()
  File "/usr/share/ganeti/2.12/ganeti/wconfd.py", line 64, in __init__
    self._InitTransport()
  File "/usr/share/ganeti/2.12/ganeti/rpc/client.py", line 199, in _InitTransport
    timeouts=self.timeouts)
  File "/usr/share/ganeti/2.12/ganeti/rpc/transport.py", line 101, in __init__
    args=(self.socket, address, self._ctimeout))
  File "/usr/share/ganeti/2.12/ganeti/utils/retry.py", line 173, in Retry
    return fn(*args)
  File "/usr/share/ganeti/2.12/ganeti/rpc/transport.py", line 130, in _Connect
    raise errors.NoMasterError(address)
NoMasterError: /var/run/ganeti/socket/ganeti-wconfd
Cannot communicate with socket '/var/run/ganeti/socket/ganeti-wconfd'.
Is the process running and listening for connections?
The cluster is otherwise healthy: instances are running and performing normally, and I can migrate instances between nodes.
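The error message asks whether a process is listening on the wconfd socket. A small probe like the one below could answer that directly on the node. It is only a minimal sketch: `socket_is_listening` is a hypothetical helper, not part of Ganeti, and it assumes the socket is a stream-type UNIX socket at the path shown in the error.

```python
import socket

# Path taken from the NoMasterError message above.
WCONFD_SOCKET = "/var/run/ganeti/socket/ganeti-wconfd"


def socket_is_listening(path, timeout=2.0):
    """Return True if some process accepts connections on the UNIX socket at path.

    Hypothetical helper for illustration; it only attempts a connect and
    sends no data, so it does not speak any Ganeti protocol.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except (socket.error, OSError):
        # Covers a missing socket file, a refused connection, and a timeout.
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    print(socket_is_listening(WCONFD_SOCKET))
```

A result of False would only say that nothing accepted a connection at that moment. In the log above the daemon is started and then stopped again within a fraction of a second, so the timing of the check matters.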
Thanks.