Error: gnt commands refused to work


ayebou afianké komla joseph

Feb 5, 2017, 4:16:12 PM
to ganeti
I'm using Debian jessie, and I installed Ganeti with apt-get.

After a power failure, my Ganeti cluster refuses to execute gnt commands.
I get this error for every gnt command:

# gnt-cluster verify
Timeout while talking to the master daemon. Jobs might have been submitted and will continue to run even if the call timed out. Useful commands in this situation are "gnt-job list", "gnt-job cancel" and "gnt-job watch". Error:
Connect timed out

I also can't reach the cluster IP address, although both the master and the second node are up.

I need help; this is a production cluster.
Thanks

Iustin Pop

Feb 5, 2017, 4:26:26 PM
to gan...@googlegroups.com
A small checklist:

- Are there any hardware errors on the nodes? Are all HDDs fine, does
'dmesg' report errors?
- Are the node daemons started on all nodes? If not, what are the
errors?
- Is the master daemon running on the master node? If not, what does it
say when you launch it manually in debug mode?


Basically either the problem is simple (daemons not started), or complex
(hardware failure or software misconfiguration since you last rebooted
things).

regards,
iustin
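
The daemon part of this checklist can be scripted. A minimal sketch, assuming a procps-style 'ps' and that the node daemon shows up in the process list under the name ganeti-noded (the name used in the logs later in this thread); missing_daemons and list_command_lines are hypothetical helper names, not part of Ganeti:

```python
import subprocess


def missing_daemons(ps_output, expected):
    """Return the names from `expected` found in no command line of `ps_output`."""
    lines = ps_output.splitlines()
    # Plain substring match: an unrelated command that merely mentions the
    # daemon name (for example a grep for it) would also count as a hit.
    return [name for name in expected
            if not any(name in line for line in lines)]


def list_command_lines():
    """Return one command line per running process (procps `ps`, no header)."""
    result = subprocess.run(["ps", "-eo", "args="],
                            check=True, capture_output=True, text=True)
    return result.stdout
```

On a node, missing_daemons(list_command_lines(), ["ganeti-noded"]) returning an empty list means a matching process was found; it says nothing about whether that daemon is healthy.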

ayebou afianké komla joseph

Feb 5, 2017, 4:48:11 PM
to ganeti
# dmesg

 device eth0 entered promiscuous mode
[    6.770628] r8169 0000:02:00.0: firmware: failed to load rtl_nic/rtl8168g-2.fw (-2)
[    6.770629] r8169 0000:02:00.0: Direct firmware load failed with error -2
[    6.770629] r8169 0000:02:00.0: Falling back to user helper
[    6.770836] r8169 0000:02:00.0 eth0: unable to load firmware patch rtl_nic/rtl8168g-2.fw (-12)
[    6.781586] r8169 0000:02:00.0 eth0: link down

# tail /var/log/ganeti/node-daemon.log
  http.server.HttpServerRequestExecutor.__init__(self, *args, **kwargs)
  File "/usr/share/ganeti/2.12/ganeti/http/server.py", line 439, in __init__
    request_msg_reader, force_close)
  File "/usr/share/ganeti/2.12/ganeti/http/__init__.py", line 522, in ShutdownConnection
    raise HttpError("Error while shutting down connection: %s" % err)
HttpError: Error while shutting down connection: ([],)
2017-02-05 19:49:23,617: ganeti-noded pid=1424 INFO Received signal 15 asking for shutdown
2017-02-05 19:49:24,544: ganeti-noded pid=2038 INFO ganeti-noded daemon startup
2017-02-05 19:49:25,104: ganeti-noded pid=2051 INFO 10.10.0.51:54979 POST /master_node_name HTTP/1.1 200
2017-02-05 19:49:26,626: ganeti-noded pid=2090 INFO 10.10.0.51:54981 POST /master_node_name HTTP/1.1 200

If you tell me where to look, I can send you more logs.

ayebou afianké komla joseph

Feb 5, 2017, 4:53:25 PM
to ganeti
This is the last line of the syslog:

# tail /var/log/syslog

feb  5 21:50:01 host1 CRON[3114]: (root) CMD ([ -x /usr/sbin/ganeti-watcher ] && /usr/sbin/ganeti-watcher)

Iustin Pop

Feb 5, 2017, 5:02:36 PM
to gan...@googlegroups.com
On 2017-02-05 13:48:11, ayebou afianké komla joseph wrote:
> # dmesg
>
> device eth0 entered promiscuous mode
> [ 6.770628] r8169 0000:02:00.0: firmware: failed to load
> rtl_nic/rtl8168g-2.fw (-2)
> [ 6.770629] r8169 0000:02:00.0: Direct firmware load failed with error -2
> [ 6.770629] r8169 0000:02:00.0: Falling back to user helper
> [ 6.770836] r8169 0000:02:00.0 eth0: unable to load firmware patch
> rtl_nic/rtl8168g-2.fw (-12)
> [ 6.781586] r8169 0000:02:00.0 eth0: link down

So it looks like the network is broken on this node because the firmware
for the NIC is missing.

Can you ping (ideally from each node, but from the master node at least)
all the other nodes?

iustin
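
To list which firmware files the kernel could not load, the dmesg lines quoted above can be scanned. A minimal sketch, assuming the exact "firmware: failed to load <file> (<code>)" wording from that excerpt (other kernels may word it differently); missing_firmware is a hypothetical helper name:

```python
import re

# Matches lines such as:
#   r8169 0000:02:00.0: firmware: failed to load rtl_nic/rtl8168g-2.fw (-2)
FIRMWARE_FAILURE = re.compile(r"firmware: failed to load (\S+) \((-?\d+)\)")


def missing_firmware(dmesg_text):
    """Return the sorted, de-duplicated firmware paths reported as not loadable."""
    return sorted({m.group(1) for m in FIRMWARE_FAILURE.finditer(dmesg_text)})
```

The function only reads text, so it can be fed the saved output of dmesg from any node.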

ayebou afianké komla joseph

Feb 5, 2017, 5:09:00 PM
to ganeti
Yes, I can ping the second node. I have just two nodes.

But I can't ping the cluster IP.

ayebou afianké komla joseph

Feb 5, 2017, 5:28:52 PM
to ganeti
# tail /var/log/ganeti/commands.log
    cl = GetClient()
  File "/usr/share/ganeti/2.12/ganeti/runtime.py", line 265, in GetClient
    client = luxi.Client(address=pathutils.QUERY_SOCKET)
  File "/usr/share/ganeti/2.12/ganeti/luxi.py", line 101, in __init__
    self._InitTransport()
  File "/usr/share/ganeti/2.12/ganeti/rpc/client.py", line 199, in _InitTransport
    timeouts=self.timeouts)
  File "/usr/share/ganeti/2.12/ganeti/rpc/transport.py", line 103, in __init__
    raise errors.TimeoutError("Connect timed out")
TimeoutError: Connect timed out

Iustin Pop

Feb 5, 2017, 5:41:01 PM
to gan...@googlegroups.com
On 2017-02-05 14:09:00, ayebou afianké komla joseph wrote:
> Yes, I can ping the second node. I have just two nodes.
>
> But I can't ping the cluster IP.

The cluster IP is activated automatically when the master daemon (and the
rest) starts up; I'm no longer sure about newer versions.

I would check what happens if you try to restart ganeti, or what the
ganeti watcher log says.

iustin

ayebou afianké komla joseph

Feb 5, 2017, 5:49:51 PM
to ganeti
# tail -n 30 /var/log/syslog
Feb  5 22:39:41 host1 kernel: [    9.066807] br-hotlan: port 1(eth0.200) entered forwarding state
Feb  5 22:39:41 host1 kernel: [    9.066813] IPv6: ADDRCONF(NETDEV_CHANGE): eth0.250: link becomes ready
Feb  5 22:39:41 host1 kernel: [    9.067001] br-lanP: port 1(eth0.250) entered forwarding state
Feb  5 22:39:41 host1 kernel: [    9.067003] br-lanP: port 1(eth0.250) entered forwarding state
Feb  5 22:39:41 host1 kernel: [    9.067009] IPv6: ADDRCONF(NETDEV_CHANGE): eth0.255: link becomes ready
Feb  5 22:39:41 host1 kernel: [    9.067199] br-lanE: port 1(eth0.255) entered forwarding state
Feb  5 22:39:41 host1 kernel: [    9.067200] br-lanE: port 1(eth0.255) entered forwarding state
Feb  5 22:39:41 host1 kernel: [    9.067209] IPv6: ADDRCONF(NETDEV_CHANGE): br-rep: link becomes ready
Feb  5 22:39:41 host1 kernel: [    9.067219] IPv6: ADDRCONF(NETDEV_CHANGE): br-dmz: link becomes ready
Feb  5 22:39:41 host1 kernel: [    9.067229] IPv6: ADDRCONF(NETDEV_CHANGE): br-hotlan: link becomes ready
Feb  5 22:39:41 host1 kernel: [    9.067239] IPv6: ADDRCONF(NETDEV_CHANGE): br-lanP: link becomes ready
Feb  5 22:39:41 host1 kernel: [    9.067248] IPv6: ADDRCONF(NETDEV_CHANGE): br-lanE: link becomes ready
Feb  5 22:39:43 host1 ganeti[1132]: Starting Ganeti cluster:ganeti-noded...done.
Feb  5 22:39:44 host1 ganeti[1132]: ganeti-wconfd...Error in the RPC HTTP reply from 'Node {nodeName = "host3.ucao-uut.tg", nodePrimaryIp = "10.10.0.53", nodeSecondaryIp = "10.10.100.53", nodeMasterCandidate = True, nodeOffline = False, nodeDrained = False, nodeGroup = "66efcb8e-b16f-4e3d-9554-d44bef17fa0a", nodeMasterCapable = True, nodeVmCapable = True, nodeNdparams = PartialNDParams {ndpOobProgramP = Nothing, ndpSpindleCountP = Nothing, ndpExclusiveStorageP = Nothing, ndpOvsP = Nothing, ndpOvsNameP = Nothing, ndpOvsLinkP = Nothing, ndpSshPortP = Nothing, ndpCpuSpeedP = Nothing}, nodePowered = True, nodeCtime = Wed Jun 22 14:50:19 GMT 2016, nodeMtime = Wed Jun 22 14:50:19 GMT 2016, nodeUuid = "66437f56-7c88-4c73-8dc1-582aba341d02", nodeSerial = 1, nodeTags = fromList []}': CurlLayerError "code: CurlCouldntConnect, explanation: Failed to connect to 10.10.0.53 port 1811: Connection refused"
Feb  5 22:39:44 host1 ganeti[1132]: No voting RPC result from ["host3.ucao-uut.tg"]
Feb  5 22:39:45 host1 ntpd[1197]: Listen normally on 6 br-lanE fe80::fabc:12ff:fe79:cafe UDP 123
Feb  5 22:39:45 host1 ntpd[1197]: Listen normally on 7 br-lanP fe80::fabc:12ff:fe79:cafe UDP 123
Feb  5 22:39:45 host1 ntpd[1197]: Listen normally on 8 br-hotlan fe80::fabc:12ff:fe79:cafe UDP 123
Feb  5 22:39:45 host1 ntpd[1197]: Listen normally on 9 br-dmz fe80::fabc:12ff:fe79:cafe UDP 123
Feb  5 22:39:45 host1 ntpd[1197]: Listen normally on 10 br-rep fe80::fabc:12ff:fe79:cafe UDP 123
Feb  5 22:39:45 host1 ntpd[1197]: Listen normally on 11 br-lan fe80::fabc:12ff:fe79:cafe UDP 123
Feb  5 22:39:45 host1 ntpd[1197]: peers refreshed
Feb  5 22:39:53 host1 ganeti[1132]: done.
Feb  5 22:39:54 host1 ganeti[1132]: ganeti-rapi...done.
Feb  5 22:39:55 host1 ganeti[1132]: ganeti-luxid...done.
Feb  5 22:39:56 host1 ganeti[1132]: ganeti-kvmd...done.
Feb  5 22:39:56 host1 ganeti[1132]: ganeti-confd...done.
Feb  5 22:39:57 host1 ganeti[1132]: ganeti-mond...done.
Feb  5 22:40:01 host1 CRON[1531]: (root) CMD ([ -x /usr/sbin/ganeti-watcher ] && /usr/sbin/ganeti-watcher)
Feb  5 22:45:01 host1 CRON[1583]: (root) CMD ([ -x /usr/sbin/ganeti-watcher ] && /usr/sbin/ganeti-watcher)
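
The long ganeti-wconfd line above carries the three facts that matter: which node, which address and port, and why the connection failed. A minimal sketch that pulls them out, assuming the line layout shown in this excerpt; parse_rpc_failure is a hypothetical helper, not a Ganeti function:

```python
import re

NODE_NAME = re.compile(r'nodeName = "([^"]+)"')
CONNECT_FAILURE = re.compile(r'Failed to connect to (\S+) port (\d+): ([^"]+)')


def parse_rpc_failure(line):
    """Return node, host, port and reason from an RPC error line, or None."""
    node = NODE_NAME.search(line)
    failure = CONNECT_FAILURE.search(line)
    if node is None or failure is None:
        return None
    host, port, reason = failure.groups()
    return {
        "node": node.group(1),
        "host": host,
        "port": int(port),
        "reason": reason.strip(),
    }
```

Run against the syslog line above, this reduces it to host3.ucao-uut.tg, 10.10.0.53, port 1811 and "Connection refused".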

ayebou afianké komla joseph

Feb 5, 2017, 5:51:03 PM
to ganeti
# tail -n 30 /var/log/ganeti/watcher.log

  File "/usr/share/ganeti/2.12/ganeti/luxi.py", line 101, in __init__
    self._InitTransport()
  File "/usr/share/ganeti/2.12/ganeti/rpc/client.py", line 199, in _InitTransport
    timeouts=self.timeouts)
  File "/usr/share/ganeti/2.12/ganeti/rpc/transport.py", line 103, in __init__
    raise errors.TimeoutError("Connect timed out")
TimeoutError: Connect timed out
2017-02-05 22:50:02,211: ganeti-watcher pid=1627 INFO RunCmd /usr/lib/ganeti/daemon-util check-and-start ganeti-noded
2017-02-05 22:50:02,221: ganeti-watcher pid=1627 INFO RunCmd /usr/lib/ganeti/daemon-util check-and-start ganeti-confd
2017-02-05 22:50:02,230: ganeti-watcher pid=1627 INFO RunCmd /usr/lib/ganeti/daemon-util check-and-start ganeti-mond
2017-02-05 22:50:02,239: ganeti-watcher pid=1627 INFO RunCmd /usr/lib/ganeti/daemon-util check-and-start ganeti-kvmd
2017-02-05 22:50:12,736: ganeti-watcher pid=1627 ERROR Connect timed out
Traceback (most recent call last):
  File "/usr/share/ganeti/2.12/ganeti/watcher/__init__.py", line 906, in Main
    return fn(options)
  File "/usr/share/ganeti/2.12/ganeti/rapi/client.py", line 263, in wrapper
    return fn(*args, **kwargs)
  File "/usr/share/ganeti/2.12/ganeti/watcher/__init__.py", line 696, in _GlobalWatcher
    client = GetLuxiClient(True)
  File "/usr/share/ganeti/2.12/ganeti/watcher/__init__.py", line 602, in GetLuxiClient
    return cli.GetClient()

  File "/usr/share/ganeti/2.12/ganeti/runtime.py", line 265, in GetClient
    client = luxi.Client(address=pathutils.QUERY_SOCKET)
  File "/usr/share/ganeti/2.12/ganeti/luxi.py", line 101, in __init__
    self._InitTransport()
  File "/usr/share/ganeti/2.12/ganeti/rpc/client.py", line 199, in _InitTransport
    timeouts=self.timeouts)
  File "/usr/share/ganeti/2.12/ganeti/rpc/transport.py", line 103, in __init__
    raise errors.TimeoutError("Connect timed out")
TimeoutError: Connect timed out
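
The traceback fails while the client opens its transport to the local socket named by pathutils.QUERY_SOCKET, so the same connect can be tried by hand to tell "socket file missing" apart from "nobody answers". A minimal sketch, assuming the socket is a Unix stream socket and that its path is /var/run/ganeti/socket/ganeti-query (the path luxid logs later in this thread; it may differ on other versions); probe_unix_socket is a hypothetical helper:

```python
import socket

# Assumption: path taken from the luxi-daemon.log excerpt later in this thread.
QUERY_SOCKET = "/var/run/ganeti/socket/ganeti-query"


def probe_unix_socket(path=QUERY_SOCKET, timeout=5.0):
    """Try to connect to a Unix stream socket.

    Returns None if the connect succeeds, otherwise a short error description.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except OSError as err:  # covers missing file, refused connect and timeout
        return str(err) or type(err).__name__
    finally:
        sock.close()
    return None
```

A result of None only means that something accepted the connection; it does not show that the daemon behind the socket answers requests.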

Iustin Pop

Feb 5, 2017, 6:25:04 PM
to gan...@googlegroups.com
OK, this tells us that on the master node, wconfd can't start because it
can't talk to host3.ucao-uut.tg; is the node daemon running on that
node?

With only two nodes, you'll hit this kind of problem often; I'd
recommend running a small third node as a tie-breaker.

iustin
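
A quick way to check this from the master is to try a TCP connection to the port named in the error message. A minimal sketch, assuming port 1811 (taken from the "Failed to connect to 10.10.0.53 port 1811" line) is where the node daemon listens; can_connect is a hypothetical helper, and a successful connect only shows that something accepts connections there:

```python
import socket

# Port taken from the error message in the syslog excerpt above.
NODED_PORT = 1811


def can_connect(host, port=NODED_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, name resolution failed
        return False
```

For example, can_connect("10.10.0.53") run on host1 would repeat the connection attempt that wconfd reported as refused.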

ayebou afianké komla joseph

Feb 6, 2017, 2:56:48 AM
to ganeti
This is what I have on host3:
# tail /var/log/ganeti/node-daemon.log
2017-02-06 06:25:04,737: ganeti-noded pid=1468 INFO Received request to reopen log files

How can I verify that the node daemon is running on that node?

ayebou afianké komla joseph

Feb 6, 2017, 3:15:43 AM
to ganeti
How can I add a new node? None of the gnt commands work.

ayebou afianké komla joseph

Feb 6, 2017, 5:09:10 AM
to ganeti
I have two ideas:

- Promote the secondary node to master. Is that a good idea?
- Install a new node as master and try to add the old nodes to it as slaves.

Is there a way to recover my instances if nothing else works?

ayebou afianké komla joseph

Feb 6, 2017, 7:08:23 AM
to ganeti
I tried to promote the slave in a lost-master scenario.
I halted the master node and executed:

# gnt-cluster master-failover --no-voting
This will perform the failover even if most other nodes are down, or
if this node is outdated. This is dangerous as it can lead to a non-
consistent cluster. Check the gnt-cluster(8) man page before
proceeding. Continue?
y/[n]/?: y
Could not disable the master IP: Error 7: Failed to connect to 10.10.0.51 port 1811: No route to host
Could not disable the master role on the old master host1.ucao-uut.tg, please disable manually: Error 7: Failed to connect to 10.10.0.51 port 1811: No route to host
The master IP did not come up within 30 seconds; the cluster should still be working and reachable via host3.ucao-uut.tg, but not via the master IP address


The slave became the master node, but the problem is still the same on it.

# gnt-cluster verify   (on the new master)

Timeout while talking to the master daemon. Jobs might have been submitted and will continue to run even if the call timed out. Useful commands in this situation are "gnt-job list", "gnt-job cancel" and "gnt-job watch". Error:
Connect timed out.


The system can't see the cluster IP (is that normal?). I had the same thing on the old master (with a different HWaddr).
# ifconfig br-lan:0   (on the new master node)
br-lan:0  Link encap:Ethernet  HWaddr 90:1b:0e:47:80:eb 
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

Then I restarted the old master node. It is still running as master node and doesn't detect any conflict.
I now have two master nodes (I have added a new difficulty to my problem).

Is there a way to reinstall everything and recover my instances?


Iustin Pop

Feb 6, 2017, 3:37:39 PM
to gan...@googlegroups.com
On 2017-02-06 04:08:22, ayebou afianké komla joseph wrote:
>
> *I tried to promote the slave in a lost-master scenario. I halted the
> master node and executed:*
> # gnt-cluster master-failover --no-voting
> This will perform the failover even if most other nodes are down, or
> if this node is outdated. This is dangerous as it can lead to a non-
> consistent cluster. Check the gnt-cluster(8) man page before
> proceeding. Continue?
> y/[n]/?: y
> Could not disable the master IP: Error 7: Failed to connect to 10.10.0.51
> port 1811: No route to host
> Could not disable the master role on the old master host1.ucao-uut.tg,
> please disable manually: Error 7: Failed to connect to 10.10.0.51 port
> 1811: No route to host
> The master IP did not come up within 30 seconds; the cluster should still
> be working and reachable via host3.ucao-uut.tg, but not via the master IP
> address
>
>
> *The slave became the master node, but the problem is still the same on it.*
>
> # gnt-cluster verify (on the new master)
> Timeout while talking to the master daemon. Jobs might have been submitted
> and will continue to run even if the call timed out. Useful commands in
> this situation are "gnt-job list", "gnt-job cancel" and "gnt-job watch".
> Error:
> Connect timed out.
>
>
> *The system can't see the cluster IP (is that normal?). I had the same thing
> on the old master (with a different HWaddr).*
> # ifconfig br-lan:0 (on the new master node)
> br-lan:0 Link encap:Ethernet HWaddr 90:1b:0e:47:80:eb
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>
>
>
>
> *Then I restarted the old master node. It is still running as master node
> and doesn't detect any conflict. I now have two master nodes (I have added
> a new difficulty to my problem). Is there a way to reinstall everything and
> recover my instances?*

It's possible but complicated. Try to copy /var/lib/ganeti/ss* from the
node you want as master to the other node.

As for your question about how to check that the node daemon is running:
do you have a ganeti-noded process running?

iustin
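
Before and after copying, it can help to compare what each node believes about the master. A minimal sketch, assuming each node's /var/lib/ganeti has been copied (or mounted) somewhere readable and that the master's name is stored in a file called ssconf_master_node, which is an assumption about what the ss* glob covers; the helper names are hypothetical:

```python
from pathlib import Path

# Assumption: one of the ss* files holds the master node's name.
MASTER_FILE = "ssconf_master_node"


def read_master_node(ganeti_dir):
    """Return the master node name recorded in one copy of /var/lib/ganeti."""
    return (Path(ganeti_dir) / MASTER_FILE).read_text().strip()


def master_views(dirs_by_node):
    """Map each node label to the master name found in its directory copy."""
    return {node: read_master_node(path) for node, path in dirs_by_node.items()}


def nodes_disagree(views):
    """Return True if the nodes do not all name the same master."""
    return len(set(views.values())) > 1
```

If nodes_disagree(master_views({...})) is True, the two nodes still name different masters and the copy step has not taken effect everywhere.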

ayebou afianké komla joseph

Feb 6, 2017, 5:40:30 PM
to ganeti
Can you help me recover my instances?
Is there a guide I can follow?

Iustin Pop

Feb 6, 2017, 5:59:54 PM
to gan...@googlegroups.com
On 2017-02-06 14:40:30, ayebou afianké komla joseph wrote:
> Can you help me recover my instances?
> Is there a guide I can follow?

It is hard if you don't have Linux skills. Again, the steps are:

- ensure that on both machines, ganeti-noded runs successfully (ps shows
it)
- solve the master node role (copy /var/lib/ganeti from one node to the
other manually)
- eventually reboot the nodes, ensuring that everything comes up
- check all the ganeti logs and match the timestamps with your restart
attempt

At this point, ganeti-wconfd should start successfully, and everything
should work.

regards,
iustin
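
For the last step in the list, the log lines around a restart can be picked out mechanically. A minimal sketch, assuming every relevant line starts with a "YYYY-MM-DD HH:MM:SS" timestamp as in the Ganeti log excerpts in this thread, and that all logs use the same time zone; line_time and lines_near are hypothetical helpers:

```python
from datetime import datetime, timedelta


def line_time(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' of a log line, or return None."""
    try:
        return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None


def lines_near(lines, restart, window=timedelta(minutes=5)):
    """Return the timestamped lines within `window` of the restart time.

    Lines without a leading timestamp (traceback continuation lines, for
    example) are skipped.
    """
    selected = []
    for line in lines:
        stamp = line_time(line)
        if stamp is not None and abs(stamp - restart) <= window:
            selected.append(line)
    return selected
```

Feeding it the lines of each file under /var/log/ganeti with the time of the restart attempt gives a short list per daemon to read side by side.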

ayebou afianké komla joseph

Feb 8, 2017, 11:48:11 AM
to ganeti
After the copy, the master role problem is solved,
but gnt-cluster commands still give me an error.

# gnt-cluster info

Timeout while talking to the master daemon. Jobs might have been submitted and will continue to run even if the call timed out. Useful commands in this situation are "gnt-job list", "gnt-job cancel" and "gnt-job watch". Error:
Connect timed out

These are my logs on the master node:

# tail  /var/log/ganeti/rapi-daemon.log
2017-02-06 11:03:47,669: ganeti-rapi pid=1458 INFO ganeti-rapi daemon startup
2017-02-06 11:03:47,679: ganeti-rapi pid=1458 INFO Reading users file at /var/lib/ganeti/rapi/users
2017-02-06 11:03:47,679: ganeti-rapi pid=1458 WARNING No users file at /var/lib/ganeti/rapi/users
2017-02-06 16:07:33,593: ganeti-rapi pid=1463 INFO ganeti-rapi daemon startup
2017-02-06 16:07:33,607: ganeti-rapi pid=1463 INFO Reading users file at /var/lib/ganeti/rapi/users
2017-02-06 16:07:33,607: ganeti-rapi pid=1463 WARNING No users file at /var/lib/ganeti/rapi/users
2017-02-08 14:46:32,437: ganeti-rapi pid=1463 INFO Received signal 15 asking for shutdown
2017-02-08 14:47:02,210: ganeti-rapi pid=1459 INFO ganeti-rapi daemon startup
2017-02-08 14:47:02,229: ganeti-rapi pid=1459 INFO Reading users file at /var/lib/ganeti/rapi/users
2017-02-08 14:47:02,230: ganeti-rapi pid=1459 WARNING No users file at /var/lib/ganeti/rapi/users

# tail  /var/log/ganeti/conf-daemon.log
2017-02-06 08:20:53,144084000000 GMT: ganeti-confd pid=6625/ThreadId 3 INFO Starting up in inotify mode
2017-02-06 11:03:49,573614000000 GMT: ganeti-confd pid=1502/ThreadId 3 NOTICE ganeti-confd daemon startup
2017-02-06 11:03:49,614389000000 GMT: ganeti-confd pid=1502/ThreadId 3 INFO Loaded new config, serial 1199
2017-02-06 11:03:49,614618000000 GMT: ganeti-confd pid=1502/ThreadId 3 INFO Starting up in inotify mode
2017-02-06 16:07:35,585320000000 GMT: ganeti-confd pid=1506/ThreadId 3 NOTICE ganeti-confd daemon startup
2017-02-06 16:07:35,627115000000 GMT: ganeti-confd pid=1506/ThreadId 3 INFO Loaded new config, serial 1199
2017-02-06 16:07:35,627370000000 GMT: ganeti-confd pid=1506/ThreadId 3 INFO Starting up in inotify mode
2017-02-08 14:47:04,133091000000 GMT: ganeti-confd pid=1506/ThreadId 3 NOTICE ganeti-confd daemon startup
2017-02-08 14:47:04,182297000000 GMT: ganeti-confd pid=1506/ThreadId 3 INFO Loaded new config, serial 1199
2017-02-08 14:47:04,182528000000 GMT: ganeti-confd pid=1506/ThreadId 3 INFO Starting up in inotify mode


# tail  /var/log/ganeti/wconf-daemon.log
2017-02-05 22:57:35,233166000000 GMT: ganeti-wconfd pid=1789/ThreadId 15 INFO Cleaning up stale file /var/run/ganeti/livelocks/wconf-daemon_1486334393
2017-02-05 23:02:46,333679000000 GMT: ganeti-wconfd pid=1789/ThreadId 15 INFO Cleaning up stale file /var/run/ganeti/livelocks/luxi-daemon_1486335456
2017-02-06 06:25:03,485605000000 GMT: ganeti-wconfd pid=1789/ThreadId 18 INFO Reopening log files after receiving SIGHUP
2017-02-06 08:20:50,651680000000 GMT: ganeti-wconfd pid=6557/ThreadId 6 NOTICE ganeti-wconfd daemon startup
2017-02-06 08:20:50,652145000000 GMT: ganeti-wconfd pid=6557/ThreadId 6 INFO Changing permissions of /var/run/ganeti/socket/ganeti-wconfd to 600
2017-02-06 08:20:50,659125000000 GMT: ganeti-wconfd pid=6557/ThreadId 15 INFO Cleaning up stale file /var/run/ganeti/livelocks/wconf-daemon_1486335455
2017-02-06 08:26:01,759644000000 GMT: ganeti-wconfd pid=6557/ThreadId 15 INFO Cleaning up stale file /var/run/ganeti/livelocks/luxi-daemon_1486369252
2017-02-08 14:47:01,060282000000 GMT: ganeti-wconfd pid=1437/ThreadId 6 NOTICE ganeti-wconfd daemon startup
2017-02-08 14:47:01,125756000000 GMT: ganeti-wconfd pid=1437/ThreadId 6 INFO Changing permissions of /var/run/ganeti/socket/ganeti-wconfd to 600
2017-02-08 14:52:12,042798000000 GMT: ganeti-wconfd pid=1437/ThreadId 15 INFO Cleaning up stale file /var/run/ganeti/livelocks/luxi-daemon_1486565223

# tail  /var/log/ganeti/node-daemon.log
2017-02-06 16:07:21,599: ganeti-noded pid=1425 INFO ganeti-noded daemon startup
2017-02-06 16:07:22,502: ganeti-noded pid=1438 INFO 10.10.0.51:44027 POST /master_node_name HTTP/1.1 200
2017-02-06 16:07:30,998: ganeti-noded pid=1440 INFO 10.10.0.53:46809 POST /master_node_name HTTP/1.1 200
2017-02-06 16:07:32,590: ganeti-noded pid=1441 INFO 10.10.0.51:44029 POST /master_node_name HTTP/1.1 200
2017-02-06 16:07:34,337: ganeti-noded pid=1478 INFO 10.10.0.51:44031 POST /master_node_name HTTP/1.1 200
2017-02-06 16:07:36,072: ganeti-noded pid=1518 INFO 10.10.0.53:46811 POST /master_node_name HTTP/1.1 200
2017-02-08 14:46:32,490: ganeti-noded pid=1425 INFO Received signal 15 asking for shutdown
2017-02-08 14:47:00,098: ganeti-noded pid=1422 INFO ganeti-noded daemon startup
2017-02-08 14:47:01,039: ganeti-noded pid=1435 INFO 10.10.0.51:55534 POST /master_node_name HTTP/1.1 200
2017-02-08 14:47:02,947: ganeti-noded pid=1474 INFO 10.10.0.51:55536 POST /master_node_name HTTP/1.1 200


# tail  /var/log/ganeti/luxi-daemon.log

ganeti-luxid: /var/lib/ganeti/queue/job-83878: can't watch what isn't there!: does not exist
2017-02-08 14:47:02,979185000000 GMT: ganeti-luxid pid=1477/ThreadId 6 NOTICE ganeti-luxid daemon startup
2017-02-08 14:47:03,018574000000 GMT: ganeti-luxid pid=1477/ThreadId 6 INFO Changing permissions of /var/run/ganeti/socket/ganeti-query to 660
2017-02-08 14:47:03,025840000000 GMT: ganeti-luxid pid=1477/ThreadId 6 INFO Loaded new config, serial 1199
2017-02-08 14:47:03,025941000000 GMT: ganeti-luxid pid=1477/ThreadId 6 INFO Starting up in inotify mode
2017-02-08 14:47:03,026124000000 GMT: ganeti-luxid pid=1477/ThreadId 6 INFO Loading job queue
2017-02-08 14:47:03,030826000000 GMT: ganeti-luxid pid=1477/ThreadId 6 INFO Non-archived jobs on disk: 838781493
2017-02-08 14:47:03,047574000000 GMT: ganeti-luxid pid=1477/ThreadId 6 INFO Waiting jobs: [83878]; running jobs: []
2017-02-08 14:47:03,047685000000 GMT: ganeti-luxid pid=1477/ThreadId 6 INFO Starting jobs: 83878
ganeti-luxid: /var/lib/ganeti/queue/job-83878: can't watch what isn't there!: does not exist


# tail  /var/log/ganeti/kvm-daemon.log
2017-02-08 16:10:02,146107000000 GMT: ganeti-kvmd pid=2228/ThreadId 3 NOTICE ganeti-kvmd daemon startup
2017-02-08 16:10:02,146480000000 GMT: ganeti-kvmd pid=2228/ThreadId 3 INFO User shutdown not enabled, exiting
2017-02-08 16:15:02,198377000000 GMT: ganeti-kvmd pid=2266/ThreadId 3 NOTICE ganeti-kvmd daemon startup
2017-02-08 16:15:02,198757000000 GMT: ganeti-kvmd pid=2266/ThreadId 3 INFO User shutdown not enabled, exiting
2017-02-08 16:20:02,288006000000 GMT: ganeti-kvmd pid=2309/ThreadId 3 NOTICE ganeti-kvmd daemon startup
2017-02-08 16:20:02,288407000000 GMT: ganeti-kvmd pid=2309/ThreadId 3 INFO User shutdown not enabled, exiting
2017-02-08 16:25:02,364588000000 GMT: ganeti-kvmd pid=2351/ThreadId 3 NOTICE ganeti-kvmd daemon startup
2017-02-08 16:25:02,364967000000 GMT: ganeti-kvmd pid=2351/ThreadId 3 INFO User shutdown not enabled, exiting
2017-02-08 16:30:02,475104000000 GMT: ganeti-kvmd pid=2392/ThreadId 3 NOTICE ganeti-kvmd daemon startup
2017-02-08 16:30:02,475486000000 GMT: ganeti-kvmd pid=2392/ThreadId 3 INFO User shutdown not enabled, exiting


# tail  /var/log/ganeti/watcher.log

    return cli.GetClient()
  File "/usr/share/ganeti/2.12/ganeti/runtime.py", line 265, in GetClient
    client = luxi.Client(address=pathutils.QUERY_SOCKET)
  File "/usr/share/ganeti/2.12/ganeti/luxi.py", line 101, in __init__
    self._InitTransport()
  File "/usr/share/ganeti/2.12/ganeti/rpc/client.py", line 199, in _InitTransport
    timeouts=self.timeouts)
  File "/usr/share/ganeti/2.12/ganeti/rpc/transport.py", line 103, in __init__
    raise errors.TimeoutError("Connect timed out")
TimeoutError: Connect timed out

# tail  /var/log/ganeti/monitoring-daemon.log
Error on startup:
ExitSuccess
2017-02-06 08:20:53,617575000000 GMT: ganeti-mond pid=6639/ThreadId 3 NOTICE ganeti-mond daemon startup
Error on startup:
ExitSuccess
2017-02-06 11:03:50,160067000000 GMT: ganeti-mond pid=1516/ThreadId 3 NOTICE ganeti-mond daemon startup
2017-02-06 16:07:36,218072000000 GMT: ganeti-mond pid=1521/ThreadId 3 NOTICE ganeti-mond daemon startup
Error on startup:
ExitSuccess
2017-02-08 14:47:04,785791000000 GMT: ganeti-mond pid=1520/ThreadId 3 NOTICE ganeti-mond daemon startup


# tail  /var/log/ganeti/monitoring-daemon-error.log

[06/Feb/2017:10:36:54 +0000] Server.httpServe: BACKEND STOPPED
[06/Feb/2017:10:36:54 +0000] Error on startup:
ExitSuccess
[06/Feb/2017:11:03:50 +0000] Server.httpServe: START, binding to [http://0.0.0.0:1815/]
[06/Feb/2017:16:07:36 +0000] Server.httpServe: START, binding to [http://0.0.0.0:1815/]
[08/Feb/2017:14:46:32 +0000] Server.httpServe: SHUTDOWN
[08/Feb/2017:14:46:32 +0000] Server.httpServe: BACKEND STOPPED
[08/Feb/2017:14:46:32 +0000] Error on startup:
ExitSuccess
[08/Feb/2017:14:47:04 +0000] Server.httpServe: START, binding to [http://0.0.0.0:1815/]


# tail  /var/log/ganeti/kvm-daemon.log
2017-02-08 16:15:02,198377000000 GMT: ganeti-kvmd pid=2266/ThreadId 3 NOTICE ganeti-kvmd daemon startup
2017-02-08 16:15:02,198757000000 GMT: ganeti-kvmd pid=2266/ThreadId 3 INFO User shutdown not enabled, exiting
2017-02-08 16:20:02,288006000000 GMT: ganeti-kvmd pid=2309/ThreadId 3 NOTICE ganeti-kvmd daemon startup
2017-02-08 16:20:02,288407000000 GMT: ganeti-kvmd pid=2309/ThreadId 3 INFO User shutdown not enabled, exiting
2017-02-08 16:25:02,364588000000 GMT: ganeti-kvmd pid=2351/ThreadId 3 NOTICE ganeti-kvmd daemon startup
2017-02-08 16:25:02,364967000000 GMT: ganeti-kvmd pid=2351/ThreadId 3 INFO User shutdown not enabled, exiting
2017-02-08 16:30:02,475104000000 GMT: ganeti-kvmd pid=2392/ThreadId 3 NOTICE ganeti-kvmd daemon startup
2017-02-08 16:30:02,475486000000 GMT: ganeti-kvmd pid=2392/ThreadId 3 INFO User shutdown not enabled, exiting
2017-02-08 16:35:02,566717000000 GMT: ganeti-kvmd pid=2433/ThreadId 3 NOTICE ganeti-kvmd daemon startup
2017-02-08 16:35:02,567115000000 GMT: ganeti-kvmd pid=2433/ThreadId 3 INFO User shutdown not enabled, exiting


# tail  /var/log/ganeti/jobs.log
2017-02-05 12:20:06,578: job-84140 pid=15992 INFO Finished job 84140, status = success
2017-02-05 12:25:05,182: job-84141 pid=16331 INFO Restarting job 84141
2017-02-05 12:25:05,504: job-84141 pid=16331 INFO Op 1/1: opcode GROUP_VERIFY_DISKS(66efcb8e-b16f-4e3d-9554-d44bef17fa0a) waiting for locks
2017-02-05 12:25:06,671: job-84141 pid=16331 INFO Finished job 84141, status = success
2017-02-05 12:30:05,237: job-84142 pid=16593 INFO Restarting job 84142
2017-02-05 12:30:05,461: job-84142 pid=16593 INFO Op 1/1: opcode GROUP_VERIFY_DISKS(66efcb8e-b16f-4e3d-9554-d44bef17fa0a) waiting for locks
2017-02-05 12:30:06,653: job-84142 pid=16593 INFO Finished job 84142, status = success
2017-02-05 12:35:05,201: job-84143 pid=16858 INFO Restarting job 84143
2017-02-05 12:35:05,520: job-84143 pid=16858 INFO Op 1/1: opcode GROUP_VERIFY_DISKS(66efcb8e-b16f-4e3d-9554-d44bef17fa0a) waiting for locks
2017-02-05 12:35:06,713: job-84143 pid=16858 INFO Finished job 84143, status = success


# tail  /var/log/ganeti/commands.log

    cl = GetClient()
  File "/usr/share/ganeti/2.12/ganeti/runtime.py", line 265, in GetClient
    client = luxi.Client(address=pathutils.QUERY_SOCKET)
  File "/usr/share/ganeti/2.12/ganeti/luxi.py", line 101, in __init__
    self._InitTransport()
  File "/usr/share/ganeti/2.12/ganeti/rpc/client.py", line 199, in _InitTransport
    timeouts=self.timeouts)
  File "/usr/share/ganeti/2.12/ganeti/rpc/transport.py", line 103, in __init__
    raise errors.TimeoutError("Connect timed out")
TimeoutError: Connect timed out
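
One thing that can be cross-checked in the luxi-daemon.log excerpt earlier in this post: luxid lists job 83878 as waiting and then reports that /var/lib/ganeti/queue/job-83878 does not exist. A minimal sketch that compares the waiting job IDs with the files actually present, assuming job files are named job-<id> inside the queue directory as that path suggests; the helper names are hypothetical, and the sketch only reports the mismatch:

```python
import re
from pathlib import Path

WAITING = re.compile(r"Waiting jobs: \[([^\]]*)\]")


def waiting_jobs(log_text):
    """Return the sorted job IDs listed in 'Waiting jobs: [...]' log lines."""
    ids = set()
    for match in WAITING.finditer(log_text):
        for token in match.group(1).split(","):
            if token.strip():
                ids.add(int(token))
    return sorted(ids)


def jobs_without_file(job_ids, queue_dir):
    """Return the job IDs that have no job-<id> file in queue_dir."""
    queue = Path(queue_dir)
    return [job for job in job_ids if not (queue / f"job-{job}").exists()]
```

With the log above and the queue directory from the error message, jobs_without_file(waiting_jobs(log_text), "/var/lib/ganeti/queue") would show whether 83878 is the only waiting job without a file.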
