In master-daemon.log I don't see anything after the node2 reboot, so I tried:
node2# service ganeti status
* ganeti-noded is running
* ganeti-masterd is not running
* ganeti-rapi is running
* ganeti-luxid is running
* ganeti-kvmd is not running
* ganeti-confd is running
* ganeti-mond is running
masterd isn't running! Trying to start it gives these errors:
node2# service ganeti start
* Starting Ganeti cluster
* ganeti-noded... [ OK ]
* ganeti-masterd...
ERROR:root:RPC error in master_node_name on node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
WARNING:root:Error contacting node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
ERROR:root:RPC error in master_node_name on node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
WARNING:root:Error contacting node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
ERROR:root:RPC error in master_node_name on node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
WARNING:root:Error contacting node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
ERROR:root:RPC error in master_node_name on node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
WARNING:root:Error contacting node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
ERROR:root:RPC error in master_node_name on node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
WARNING:root:Error contacting node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
ERROR:root:RPC error in master_node_name on node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
WARNING:root:Error contacting node node1.mydomain.com: Error 7: Failed to connect to 192.168.111.201 port 1811: No route to host
CRITICAL:root:Cluster inconsistent, most of the nodes didn't answer after multiple retries. Aborting startup
CRITICAL:root:Use the --no-voting option if you understand what effects it has on the cluster state
So the problem is that ganeti-masterd can't start because it can't communicate with node1, even though I put node1 in offline mode before.
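As a rough illustration of why a two-node cluster ends up here, this is a simplified model that assumes masterd needs answers from a strict majority of the nodes before it will start. It is not Ganeti's actual code, and `has_quorum` is a hypothetical helper made up for this sketch:

```shell
#!/bin/sh
# Hypothetical helper (not part of Ganeti): succeeds only if a strict
# majority of the nodes answered.
# Usage: has_quorum TOTAL_NODES ANSWERING_NODES
has_quorum() {
    total=$1
    answering=$2
    # "answering > total / 2", written without fractions
    [ $((answering * 2)) -gt "$total" ]
}

# Two-node cluster with node1 down: only node2 itself can answer.
if has_quorum 2 1; then
    echo "2 nodes, 1 answering: quorum"
else
    echo "2 nodes, 1 answering: no quorum"
fi
```

Under that assumption, one answer out of two is never a majority, so with only two nodes the loss of either one would block a normal masterd start.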
I tried putting the --no-voting option in /etc/default/ganeti. When I start the service, it warns that this option is dangerous and asks for confirmation. I confirmed, and now masterd starts.
So, is this the correct recovery process for a two-node cluster in this situation?
---
David Sedeño