Ganeti - can not perform some gnt commands

Pavel Műller

Feb 4, 2016, 8:11:05 AM
to ganeti
Hello,

First of all, I would like to apologize for the basic question; I am a newbie here. I installed Ganeti three days ago, but it behaves strangely. The following describes exactly what I mean and what I found out.

I have created a cluster with one node and a web manager. It is for testing purposes, so no stress :) It seems to partly work: some "gnt-" commands work and some do not. I checked the log files, but to tell the truth I am not able to extract the right information about the problem and how to fix it.

Please see the command output below. When I run commands like gnt-cluster verify-disks or gnt-cluster verify, they finish with an error. That is what I need to solve.

### failed commands

root@sunfirex2200:~# gnt-cluster verify-disks
Failure: command execution error:
detected death of job 64

root@sunfirex2200:~# gnt-cluster verify
Failure: command execution error:
detected death of job 93

###


# Additional information
(ganeti_webmgr)root@sunfirex2200:/usr/sbin# gnt-node list
Node         DTotal DFree MTotal MNode MFree Pinst Sinst
sunfirex2200   1.7T  1.7T   3.9G  1.4G  3.0G     0     0


(ganeti_webmgr)root@sunfirex2200:/usr/sbin# gnt-cluster info
Cluster name: ganeti
Cluster UUID: d65b386b-44c2-433a-93a3-58394c1978e4
Creation time: 2016-02-02 10:58:16
Modification time: 2016-02-02 10:58:16
Master node: sunfirex2200
Architecture (this node): 64bits (x86_64)
Tags: (none)
Default hypervisor: kvm
Enabled hypervisors: kvm
Hypervisor parameters: 
  kvm: 
    acpi: True
    boot_order: disk
    cdrom2_image_path: 
    cdrom_disk_type: 
    cdrom_image_path: 
    cpu_cores: 0
    cpu_mask: all
    cpu_sockets: 0
    cpu_threads: 0
    cpu_type: 
    disk_aio: threads
    disk_cache: default
    disk_type: paravirtual
    floppy_image_path: 
    initrd_path: 
    kernel_args: ro
    kernel_path: /boot/vmlinuz-3-kvmU
    keymap: 
    kvm_extra: 
    kvm_flag: 
    kvm_path: /usr/bin/kvm
    machine_version: 
    mem_path: 
    migration_bandwidth: 32
    migration_caps: 
    migration_downtime: 30
    migration_mode: live
    migration_port: 8102
    nic_type: paravirtual
    reboot_behavior: reboot
    root_path: /dev/vda1
    security_domain: 
    security_model: none
    serial_console: True
    serial_speed: 38400
    soundhw: 
    spice_bind: 
    spice_image_compression: 
    spice_ip_version: 0
    spice_jpeg_wan_compression: 
    spice_password_file: 
    spice_playback_compression: True
    spice_streaming_video: 
    spice_tls_ciphers: HIGH:-DES:-3DES:-EXPORT:-ADH
    spice_use_tls: False
    spice_use_vdagent: True
    spice_zlib_glz_wan_compression: 
    usb_devices: 
    usb_mouse: 
    use_chroot: False
    use_localtime: False
    user_shutdown: False
    vga: 
    vhost_net: False
    virtio_net_queues: 1
    vnc_bind_address: 
    vnc_password_file: 
    vnc_tls: False
    vnc_x509_path: 
    vnc_x509_verify: False
    vnet_hdr: True
OS-specific hypervisor parameters: 
OS parameters: 
Hidden OSes: 
Blacklisted OSes: 
Cluster parameters: 
  candidate pool size: 10
  maximal number of jobs running simultaneously: 20
  maximal number of jobs simultaneously tracked by the scheduler: 25
  mac prefix: aa:00:00
  master netdev: kvm-br-public
  master netmask: 32
  use external master IP address setup script: False
  lvm volume group: data
  lvm reserved volumes: (none)
  drbd usermode helper: /bin/true
  file storage path: /srv/ganeti/file-storage
  shared file storage path: /srv/ganeti/shared-file-storage
  gluster storage path: /var/run/ganeti/gluster
  maintenance of node health: False
  uid pool: 
  default instance allocator: hail
  default instance allocator parameters: 
  primary ip version: 4
  preallocation wipe disks: False
  OS search path: /srv/ganeti/os, /usr/local/lib/ganeti/os, /usr/lib/ganeti/os, /usr/share/ganeti/os
  ExtStorage Providers search path: /srv/ganeti/extstorage, /usr/local/lib/ganeti/extstorage, /usr/lib/ganeti/extstorage, /usr/share/ganeti/extstorage
  enabled disk templates: drbd, plain
  install image: 
  instance communication network: 
  zeroing image: 
  compression tools: 
    - gzip
    - gzip-fast
    - gzip-slow
  enabled user shutdown: False
Default node parameters: 
  cpu_speed: 1
  exclusive_storage: False
  oob_program: 
  ovs: False
  ovs_link: 
  ovs_name: switch1
  spindle_count: 1
  ssh_port: 22
Default instance parameters: 
  default: 
    always_failover: False
    auto_balance: True
    maxmem: 128
    minmem: 128
    spindle_use: 1
    vcpus: 1
Default nic parameters: 
  default: 
    link: kvm-br-public
    mode: bridged
    vlan: 
Default disk parameters: 
  blockdev: 
  diskless: 
  drbd: 
    c-delay-target: 1
    c-fill-target: 0
    c-max-rate: 61440
    c-min-rate: 4096
    c-plan-ahead: 20
    data-stripes: 1
    disk-barriers: n
    disk-custom: 
    dynamic-resync: False
    meta-barriers: False
    meta-stripes: 1
    metavg: data
    net-custom: 
    protocol: C
    resync-rate: 61440
  ext: 
  file: 
  gluster: 
    access: kernelspace
    host: 127.0.0.1
    port: 24007
    volume: gv0
  plain: 
    stripes: 1
  rbd: 
    access: kernelspace
    pool: rbd
  sharedfile: 
Instance policy - limits for instances: 
  bounds specs: 
    - max/0: 
        cpu-count: 8
        disk-count: 16
        disk-size: 1048576
        memory-size: 32768
        nic-count: 8
        spindle-use: 12
      min/0: 
        cpu-count: 1
        disk-count: 1
        disk-size: 1024
        memory-size: 128
        nic-count: 1
        spindle-use: 1
  std: 
    cpu-count: 1
    disk-count: 1
    disk-size: 1024
    memory-size: 128
    nic-count: 1
    spindle-use: 1
  allowed disk templates: drbd, plain
  vcpu-ratio: 4
  spindle-ratio: 32


gnt-job list
ID Status Summary
61 error  CLUSTER_VERIFY
62 error  CLUSTER_VERIFY
63 error  OS_DIAGNOSE
64 error  CLUSTER_VERIFY_DISKS
65 error  OOB_COMMAND
66 error  NODE_QUERYVOLS
67 error  OS_DIAGNOSE
68 error  OS_DIAGNOSE
69 error  OS_DIAGNOSE
70 error  OS_DIAGNOSE
71 error  OS_DIAGNOSE
72 error  OS_DIAGNOSE
73 error  OS_DIAGNOSE
74 error  OS_DIAGNOSE
75 error  OS_DIAGNOSE
76 error  OOB_COMMAND
77 error  OOB_COMMAND
78 error  OS_DIAGNOSE
79 error  INSTANCE_CREATE(test)
80 error  INSTANCE_CREATE(test)
81 error  INSTANCE_CREATE(test1)
82 error  OOB_COMMAND


root@sunfirex2200:~# gnt-cluster version
Software version: 2.12.4
Internode protocol: 2120000
Configuration format: 2120000
OS api version: 20
Export interface: 0
VCS version: (ganeti) version v2.12.4

root@sunfirex2200:~# uname -a
Linux sunfirex2200 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u3 (2016-01-17) x86_64 GNU/Linux

root@sunfirex2200:~# systemctl status ganeti
● ganeti.service - LSB: Ganeti Cluster Manager
   Loaded: loaded (/etc/init.d/ganeti)
   Active: active (running) since Wed 2016-02-03 16:12:49 CET; 16h ago
  Process: 12697 ExecStop=/etc/init.d/ganeti stop (code=exited, status=0/SUCCESS)
  Process: 12743 ExecStart=/etc/init.d/ganeti start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/ganeti.service
           ├─12762 /usr/bin/python /usr/sbin/ganeti-noded -b 192.168.4.47
           ├─12813 /usr/sbin/ganeti-luxid
           ├─12860 /usr/sbin/ganeti-confd -b 192.168.4.47
           └─12885 /usr/sbin/ganeti-mond

Feb 03 16:12:39 sunfirex2200 ganeti[12743]: --yes-do-it        Force a dangerous operation
Feb 03 16:12:39 sunfirex2200 ganeti[12743]: -h  --help             show help
Feb 03 16:12:39 sunfirex2200 ganeti[12743]: -V  --version          show the version of the program
Feb 03 16:12:39 sunfirex2200 ganeti[12743]: --help-completion  show completion info
Feb 03 16:12:39 sunfirex2200 ganeti[12743]: failed (exit code 2).
Feb 03 16:12:42 sunfirex2200 ganeti[12743]: ganeti-rapi...done.
Feb 03 16:12:44 sunfirex2200 ganeti[12743]: ganeti-luxid...done.
Feb 03 16:12:46 sunfirex2200 ganeti[12743]: ganeti-kvmd...done.
Feb 03 16:12:48 sunfirex2200 ganeti[12743]: ganeti-confd...done.
Feb 03 16:12:49 sunfirex2200 ganeti[12743]: ganeti-mond...done.


Phil Regnauld

Feb 4, 2016, 8:17:29 AM
to gan...@googlegroups.com
Pavel Műller (pavel.muller) writes:
> Hello,
>
> First of all, I would like to apologize for the basic question; I am a
> newbie here. I installed Ganeti three days ago, but it behaves strangely.
> The following describes exactly what I mean and what I found out.
>
> I have created a cluster with one node and a web manager. It is for
> testing purposes, so no stress :) It seems to partly work: some "gnt-"
> commands work and some do not. I checked the log files, but to tell the
> truth I am not able to extract the right information about the problem
> and how to fix it.
>
> Please see the command output below. When I run commands like
> gnt-cluster verify-disks or gnt-cluster verify, they finish with an
> error. That is what I need to solve.

Hi,

Have you looked in /var/log/ganeti/, specifically:

node-daemon.log
master-daemon.log
luxi-daemon.log

?


Pavel Műller

Feb 4, 2016, 9:05:21 AM
to ganeti
Yes, that was the first thing I did, but I am not able to fix the problem even with the information from the log files. There is no master-daemon.log among my log files. The node-daemon.log and luxi-daemon.log are attached.

PM

On Thursday, 4 February 2016 at 14:17:29 UTC+1, Phil Regnauld wrote:

Attachment: ganeti_logs.txt

Phil Regnauld

Feb 4, 2016, 9:11:07 AM
to gan...@googlegroups.com
Pavel Műller (pavel.muller) writes:
> Yes, that was the first thing I did, but I am not able to fix the problem
> even with the information from the log files. There is no master-daemon.log
> among my log files. The node-daemon.log and luxi-daemon.log are attached.

Not having a master-daemon.log tells me something is odd - but you
wouldn't be able to run any commands if that was the case.

It's a single-node cluster at the moment, correct?

What do you get if you try to restart Ganeti? And what do you get
in the system logs (/var/log/system.log and similar)?

What does gnt-node list show?

And gnt-cluster verify? (I'll check your previous mail in case you
sent that.)

Pavel Műller

Feb 4, 2016, 9:25:49 AM
to ganeti
Exactly, at the moment it is a single-node cluster.

(ganeti_webmgr)root@sunfirex2200:~# systemctl restart ganeti.service

root@sunfirex2200:~# tail -f /var/log/syslog
Feb  4 15:19:11 sunfirex2200 ganeti[14717]: Stopping Ganeti cluster:ganeti-metad...done.
Feb  4 15:19:11 sunfirex2200 ganeti[14717]: ganeti-mond...done.
Feb  4 15:19:11 sunfirex2200 ganeti[14717]: ganeti-confd...done.
Feb  4 15:19:11 sunfirex2200 ganeti[14717]: ganeti-kvmd...done.
Feb  4 15:19:11 sunfirex2200 ganeti[14717]: ganeti-luxid...done.
Feb  4 15:19:12 sunfirex2200 ganeti[14717]: ganeti-rapi...done.
Feb  4 15:19:12 sunfirex2200 ganeti[14717]: ganeti-wconfd...done.
Feb  4 15:19:12 sunfirex2200 ganeti[14717]: ganeti-noded...done.
Feb  4 15:19:15 sunfirex2200 ganeti[14763]: Starting Ganeti cluster:ganeti-noded...done.
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: ganeti-wconfd...Command line error: unrecognized option `-b'
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: ganeti-wconfd (ganeti) version v2.12.4
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: Usage: ganeti-wconfd [OPTION...]
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: -f  --foreground       Don't detach from the current terminal
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --no-user-checks   Ignore user checks
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: -d  --debug            Enable debug messages
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --syslog=SYSLOG    Enable logging to syslog (except debug messages); one of 'no', 'yes' or 'only' [no]
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --force-node       Force the daemon to run on a different node than the master
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --no-voting        Skip node agreement check (dangerous)
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --yes-do-it        Force a dangerous operation
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: -h  --help             show help
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: -V  --version          show the version of the program
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --help-completion  show completion info
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: failed (exit code 2).
Feb  4 15:19:21 sunfirex2200 ganeti[14763]: ganeti-rapi...done.
Feb  4 15:19:22 sunfirex2200 ganeti[14763]: ganeti-luxid...done.
Feb  4 15:19:24 sunfirex2200 ganeti[14763]: ganeti-kvmd...done.
Feb  4 15:19:26 sunfirex2200 ganeti[14763]: ganeti-confd...done.
Feb  4 15:19:28 sunfirex2200 ganeti[14763]: ganeti-mond...done.
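
In a long startup log like the one above, the failing daemon is easy to miss among the "done." lines. The following is a minimal sketch, assuming the init script always prints "<daemon>...<outcome>" and, on failure, a later "failed (exit code N)." line, as it does in this excerpt. The function name and the regular expressions are illustrative and not part of Ganeti.

```python
import re

# "ganeti-wconfd...done." or "ganeti-wconfd...<error text>"
DAEMON_RE = re.compile(r"(ganeti-\w+)\.\.\.(.*)$")
FAILED_RE = re.compile(r"failed \(exit code (\d+)\)\.")


def failed_daemons(log_lines):
    """Return {daemon: (first error text, exit code)} for daemons not reporting 'done.'."""
    failures = {}
    pending = None  # daemon whose failed start has no exit code yet
    for line in log_lines:
        # Drop the syslog prefix ("Feb  4 15:19:17 host ganeti[pid]: ").
        message = line.split("]: ", 1)[-1].rstrip()
        started = DAEMON_RE.search(message)
        if started:
            daemon, outcome = started.groups()
            if outcome == "done.":
                pending = None
            else:
                pending = daemon
                failures[daemon] = (outcome, None)
            continue
        failed = FAILED_RE.search(message)
        if failed and pending:
            failures[pending] = (failures[pending][0], int(failed.group(1)))
            pending = None
    return failures


SAMPLE = """\
Feb  4 15:19:15 sunfirex2200 ganeti[14763]: Starting Ganeti cluster:ganeti-noded...done.
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: ganeti-wconfd...Command line error: unrecognized option `-b'
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: ganeti-wconfd (ganeti) version v2.12.4
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: failed (exit code 2).
Feb  4 15:19:21 sunfirex2200 ganeti[14763]: ganeti-rapi...done.
""".splitlines()

if __name__ == "__main__":
    for daemon, (error, code) in failed_daemons(SAMPLE).items():
        print(f"{daemon}: {error} (exit code {code})")
```

On the excerpt above this singles out ganeti-wconfd, the only daemon that did not report "done.".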


(ganeti_webmgr)root@sunfirex2200:~# gnt-node list
Node         DTotal DFree MTotal MNode MFree Pinst Sinst
sunfirex2200   1.7T  1.7T   3.9G  1.2G  3.1G     0     0

(ganeti_webmgr)root@sunfirex2200:~# gnt-cluster verify
Failure: command execution error:
detected death of job 94



On Thursday, 4 February 2016 at 15:11:07 UTC+1, Phil Regnauld wrote:

Klaus Aehlig

Feb 4, 2016, 9:27:15 AM
to gan...@googlegroups.com
> Not having a master-daemon.log tells me something is odd - but you
> wouldn't be able to run any commands if that was the case.

Not having master-daemon.log is fine in Ganeti 2.12 and higher (and
the first mail mentions that Ganeti 2.12.4 is used). The log files
to look at are luxi-daemon.log and, especially with those symptoms,
jobs.log. If the latter doesn't exist, I would manually try what happens
when calling jqueue/exec.py (on a file-system layout like the one Debian uses,
it should be located under /usr/share/ganeti/2.12/ganeti/jqueue/exec.py).

Regards,
Klaus

--
Klaus Aehlig
Google Germany GmbH, Erika-Mann-Str. 33, 80636 Muenchen
Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Matthew Scott Sucherman, Paul Terence Manicle

Phil Regnauld

Feb 4, 2016, 9:29:28 AM
to 'Klaus Aehlig' via ganeti
'Klaus Aehlig' via ganeti (ganeti) writes:
> > Not having a master-daemon.log tells me something is odd - but you
> > wouldn't be able to run any commands if that was the case.
>
> Not having master-daemon.log is fine in Ganeti 2.12 and higher (and
> the first mail mentions that Ganeti 2.12.4 is used).

Aha, thanks for the pointer, I'm still on 2.11 on most of my nodes :)

Pavel Műller

Feb 4, 2016, 9:37:32 AM
to ganeti
# trying to start exec.py
(ganeti_webmgr)root@sunfirex2200:~# /usr/share/ganeti/2.12/ganeti/jqueue/exec.py
/usr/share/ganeti/2.12/ganeti/jqueue/exec.py: line 35: $'Module implementing executing of a job as a separate process\n\nThe complete protocol of initializing a job is described in the haskell\nmodule Ganeti.Query.Exec\n': command not found
/usr/share/ganeti/2.12/ganeti/jqueue/exec.py: line 37: import: command not found
/usr/share/ganeti/2.12/ganeti/jqueue/exec.py: line 38: import: command not found
/usr/share/ganeti/2.12/ganeti/jqueue/exec.py: line 39: import: command not found
/usr/share/ganeti/2.12/ganeti/jqueue/exec.py: line 40: import: command not found
/usr/share/ganeti/2.12/ganeti/jqueue/exec.py: line 41: import: command not found
/usr/share/ganeti/2.12/ganeti/jqueue/exec.py: line 42: import: command not found
from: can't read /var/mail/ganeti
from: can't read /var/mail/ganeti.server
from: can't read /var/mail/ganeti.rpc
from: can't read /var/mail/ganeti
from: can't read /var/mail/ganeti
from: can't read /var/mail/ganeti.utils
/usr/share/ganeti/2.12/ganeti/jqueue/exec.py: line 52: syntax error near unexpected token `('
/usr/share/ganeti/2.12/ganeti/jqueue/exec.py: line 52: `def _GetMasterInfo():'
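
The "import: command not found" and "from: can't read /var/mail/..." messages above are what a shell prints when it is handed Python source, which suggests the file was run as a shell script and not by a Python interpreter. The following is a minimal sketch of that check, assuming the file has no "#!" line. The helper name and the interpreter name "python" are illustrative, and the thread does not show what exec.py does when started by hand.

```python
import sys


def launch_command(path, interpreter="python"):
    """Return the argv for `path`: direct if it starts with '#!', else via `interpreter`."""
    with open(path, "rb") as handle:
        has_shebang = handle.read(2) == b"#!"
    # Without a shebang the kernel cannot pick an interpreter, and an
    # interactive shell falls back to treating the file as shell code.
    return [path] if has_shebang else [interpreter, path]


if __name__ == "__main__":
    for script in sys.argv[1:]:
        print(" ".join(launch_command(script)))
```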


#last job...
root@sunfirex2200:~# less /var/log/ganeti/jobs.log
2016-02-04 15:33:40,326: job-95 pid=14978 ERROR Exception when trying to run job 95
Traceback (most recent call last):
  File "/usr/share/ganeti/2.12/ganeti/jqueue/exec.py", line 83, in main
    context = masterd.GanetiContext(livelock_name)
  File "/usr/share/ganeti/2.12/ganeti/server/masterd.py", line 337, in __init__
    cfg = self.GetConfig(None)
  File "/usr/share/ganeti/2.12/ganeti/server/masterd.py", line 355, in GetConfig
    return config.GetConfig(ec_id, self.livelock)
  File "/usr/share/ganeti/2.12/ganeti/config.py", line 105, in GetConfig
    kwargs['wconfd'] = wc.Client()
  File "/usr/share/ganeti/2.12/ganeti/wconfd.py", line 64, in __init__
    self._InitTransport()
  File "/usr/share/ganeti/2.12/ganeti/rpc/client.py", line 199, in _InitTransport
    timeouts=self.timeouts)
  File "/usr/share/ganeti/2.12/ganeti/rpc/transport.py", line 103, in __init__
    raise errors.TimeoutError("Connect timed out")
TimeoutError: Connect timed out



On Thursday, 4 February 2016 at 15:27:15 UTC+1, Klaus Aehlig wrote:

Klaus Aehlig

Feb 4, 2016, 9:42:59 AM
to gan...@googlegroups.com
> # last job...
> root@sunfirex2200:~# less /var/log/ganeti/jobs.log
> 2016-02-04 15:33:40,326: job-95 pid=14978 ERROR Exception when trying to run job 95
> Traceback (most recent call last):
>   File "/usr/share/ganeti/2.12/ganeti/jqueue/exec.py", line 83, in main
>     context = masterd.GanetiContext(livelock_name)
>   File "/usr/share/ganeti/2.12/ganeti/server/masterd.py", line 337, in __init__
>     cfg = self.GetConfig(None)
>   File "/usr/share/ganeti/2.12/ganeti/server/masterd.py", line 355, in GetConfig
>     return config.GetConfig(ec_id, self.livelock)
>   File "/usr/share/ganeti/2.12/ganeti/config.py", line 105, in GetConfig
>     kwargs['wconfd'] = wc.Client()
>   File "/usr/share/ganeti/2.12/ganeti/wconfd.py", line 64, in __init__
>     self._InitTransport()
>   File "/usr/share/ganeti/2.12/ganeti/rpc/client.py", line 199, in _InitTransport
>     timeouts=self.timeouts)
>   File "/usr/share/ganeti/2.12/ganeti/rpc/transport.py", line 103, in __init__
>     raise errors.TimeoutError("Connect timed out")
> TimeoutError: Connect timed out

That fits with the error in your earlier mail about restarting Ganeti:
wconfd is not up. According to that report, "systemctl restart ganeti.service"
somehow passes incorrect arguments to that daemon (either directly or
indirectly; /etc/default/ganeti certainly is a place to check).

Pavel Műller

Feb 4, 2016, 10:04:43 AM
to ganeti
BINGO! I started wconfd manually:
ganeti-wconfd --no-user-checks

I am heading to the next steps which, I hope, will get my "cluster" working.

(ganeti_webmgr)root@sunfirex2200:~# gnt-cluster verify
Submitted jobs 99, 100
Waiting for job 99 ...
Thu Feb  4 15:45:33 2016 * Verifying cluster config
Thu Feb  4 15:45:33 2016 * Verifying cluster certificate files
Thu Feb  4 15:45:33 2016 * Verifying hypervisor parameters
Thu Feb  4 15:45:33 2016 * Verifying all nodes belong to an existing group
Waiting for job 100 ...
Thu Feb  4 15:45:37 2016 * Verifying group 'default'
Thu Feb  4 15:45:37 2016 * Gathering data (1 nodes)
Thu Feb  4 15:45:37 2016 * Gathering information about nodes (1 nodes)
Thu Feb  4 15:45:40 2016 * Gathering disk information (1 nodes)
Thu Feb  4 15:45:40 2016 * Verifying configuration file consistency
Thu Feb  4 15:45:40 2016   - ERROR: cluster: The cluster's list of master candidate certificates is empty. If you just updated the cluster, please run 'gnt-cluster renew-crypto --new-node-certificates'.
Thu Feb  4 15:45:40 2016 * Verifying node status
Thu Feb  4 15:45:40 2016 * Verifying instance status
Thu Feb  4 15:45:40 2016 * Verifying orphan volumes
Thu Feb  4 15:45:40 2016 * Verifying N+1 Memory redundancy
Thu Feb  4 15:45:40 2016 * Other Notes
Thu Feb  4 15:45:40 2016 * Hooks Results


(ganeti_webmgr)root@sunfirex2200:~# gnt-cluster renew-crypto --new-node-certificates
This requires all daemons on all nodes to be restarted and may take
some time. Continue?
y/[n]/?: y
Gathering cluster information
Blocking watcher
Stopping master daemons
Stopping daemons on sunfirex2200
Updating certificates and keys
Starting daemons on sunfirex2200
Failure: command execution error:
Failed to run command /usr/lib/ganeti/daemon-util start-all on node sunfirex2200 : exitcode 1 and error Command line error: unrecognized option `-b'

ganeti-wconfd (ganeti) version v2.12.4
Usage: ganeti-wconfd [OPTION...]
  -f  --foreground       Don't detach from the current terminal
      --no-user-checks   Ignore user checks
  -d  --debug            Enable debug messages
      --syslog=SYSLOG    Enable logging to syslog (except debug messages); one of 'no', 'yes' or 'only' [no]
      --force-node       Force the daemon to run on a different node than the master
      --no-voting        Skip node agreement check (dangerous)
      --yes-do-it        Force a dangerous operation
  -h  --help             show help
  -V  --version          show the version of the program
      --help-completion  show completion info
exit code 2

# here is my /etc/default/ganeti
(ganeti_webmgr)root@sunfirex2200:~# cat /etc/default/ganeti
# Default arguments for Ganeti daemons
NODED_ARGS="-b 192.168.4.47"
RAPI_ARGS="-b 192.168.4.47"
CONFD_ARGS="-b 192.168.4.47"
WCONFD_ARGS="-b 192.168.4.47"
LUXID_ARGS=""




On Thursday, 4 February 2016 at 15:42:59 UTC+1, Klaus Aehlig wrote:

Klaus Aehlig

Feb 4, 2016, 10:21:30 AM
to gan...@googlegroups.com
> # here is my /etc/default/ganeti
> (ganeti_webmgr)root@sunfirex2200:~# cat /etc/default/ganeti
> # Default arguments for Ganeti daemons
> NODED_ARGS="-b 192.168.4.47"
> RAPI_ARGS="-b 192.168.4.47"
> CONFD_ARGS="-b 192.168.4.47"
> WCONFD_ARGS="-b 192.168.4.47"

There we have the problem: wconfd only binds to a UNIX domain socket
anyway; therefore, it does not support a -b option.
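
A UNIX domain socket is addressed by a filesystem path, not by an IP address, so there is nothing for a bind-address option to configure. The following is a minimal, generic sketch of how a client can probe such a socket. It uses a throwaway path in a temporary directory, not Ganeti's real socket path, which this thread does not state. It needs a platform where Python provides socket.AF_UNIX, such as Linux.

```python
import os
import socket
import tempfile


def can_connect(path, timeout=1.0):
    """Return True if something accepts connections on the UNIX socket at `path`."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(path)
        return True
    except OSError:
        # Covers a missing socket file, a refused connection and a timeout.
        return False
    finally:
        client.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical path for the demo
        print(can_connect(path))  # no daemon is listening yet
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)
        print(can_connect(path))  # now a listener exists
        server.close()
```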

Pavel Műller

Feb 4, 2016, 10:40:16 AM
to ganeti
I changed the file /etc/default/ganeti:
root@sunfirex2200:~# cat /etc/default/ganeti
# Default arguments for Ganeti daemons
NODED_ARGS="-b 192.168.4.47"
RAPI_ARGS="-b 192.168.4.47"
CONFD_ARGS="-b 192.168.4.47"
WCONFD_ARGS=""
LUXID_ARGS=""

But I still get an error when I start the ganeti service:

(ganeti_webmgr)root@sunfirex2200:~# systemctl status ganeti.service
● ganeti.service - LSB: Ganeti Cluster Manager
   Loaded: loaded (/etc/init.d/ganeti)
   Active: active (running) since Thu 2016-02-04 16:37:59 CET; 21s ago
  Process: 17686 ExecStop=/etc/init.d/ganeti stop (code=exited, status=0/SUCCESS)
  Process: 17733 ExecStart=/etc/init.d/ganeti start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/ganeti.service
           ├─17751 /usr/bin/python /usr/sbin/ganeti-noded -b 192.168.4.47
           ├─17788 /usr/bin/python /usr/sbin/ganeti-rapi -b 192.168.4.47
           ├─17805 /usr/sbin/ganeti-luxid
           ├─17836 /usr/sbin/ganeti-confd -b 192.168.4.47
           └─17850 /usr/sbin/ganeti-mond

Feb 04 16:37:46 sunfirex2200 ganeti[17733]: Starting Ganeti cluster:ganeti-noded...done.
Feb 04 16:37:48 sunfirex2200 ganeti[17733]: ganeti-wconfd...Error when starting the daemon process: user error (Initialization of the daemon failedUnhandle...n denied))
Feb 04 16:37:48 sunfirex2200 ganeti[17733]: failed (exit code 1).
Feb 04 16:37:51 sunfirex2200 ganeti[17733]: ganeti-rapi...done.
Feb 04 16:37:53 sunfirex2200 ganeti[17733]: ganeti-luxid...done.
Feb 04 16:37:55 sunfirex2200 ganeti[17733]: ganeti-kvmd...done.
Feb 04 16:37:57 sunfirex2200 ganeti[17733]: ganeti-confd...done.
Feb 04 16:37:59 sunfirex2200 ganeti[17733]: ganeti-mond...done.
Hint: Some lines were ellipsized, use -l to show in full.






On Thursday, 4 February 2016 at 16:21:30 UTC+1, Klaus Aehlig wrote:

Pavel Muller

Feb 4, 2016, 10:55:22 AM
to ganeti
Problem solved:


(ganeti_webmgr)root@sunfirex2200:~# systemctl status ganeti.service
● ganeti.service - LSB: Ganeti Cluster Manager
   Loaded: loaded (/etc/init.d/ganeti)
   Active: active (running) since Thu 2016-02-04 16:51:41 CET; 4s ago
  Process: 18489 ExecStop=/etc/init.d/ganeti stop (code=exited, status=0/SUCCESS)
  Process: 18536 ExecStart=/etc/init.d/ganeti start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/ganeti.service
           ├─18555 /usr/bin/python /usr/sbin/ganeti-noded -b 192.168.4.47
           ├─18570 /usr/sbin/ganeti-wconfd
           ├─18592 /usr/bin/python /usr/sbin/ganeti-rapi -b 192.168.4.47
           ├─18609 /usr/sbin/ganeti-luxid
           ├─18646 /usr/sbin/ganeti-confd -b 192.168.4.47
           └─18671 /usr/sbin/ganeti-mond

Feb 04 16:51:28 sunfirex2200 ganeti[18536]: Starting Ganeti cluster:ganeti-noded...done.
Feb 04 16:51:30 sunfirex2200 ganeti[18536]: ganeti-wconfd...done.
Feb 04 16:51:33 sunfirex2200 ganeti[18536]: ganeti-rapi...done.
Feb 04 16:51:35 sunfirex2200 ganeti[18536]: ganeti-luxid...done.
Feb 04 16:51:37 sunfirex2200 ganeti[18536]: ganeti-kvmd...done.
Feb 04 16:51:39 sunfirex2200 ganeti[18536]: ganeti-confd...done.
Feb 04 16:51:41 sunfirex2200 ganeti[18536]: ganeti-mond...done.

It was necessary to add permissions on the files mentioned in /var/log/ganeti/wconf-daemon.log:
chmod a+rwx /var/lib/ganeti/tempres.data
chmod a+rwx /var/lib/ganeti/locks.data
-- 
Best regards,

Pavel Muller
Systems Engineer
Mobile: +420 603 436 153
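
The chmod a+rwx commands above make both files readable, writable and executable by every user on the machine. The following is a minimal sketch for inspecting the resulting mode and owner, which may help when looking for a narrower permission change later. The thread does not say which user wconfd runs as, so the sketch only reports what is on disk. The helper names are illustrative and the demo works on a temporary file, not on the Ganeti data files.

```python
import os
import stat
import tempfile


def describe(path):
    """Return (mode string, uid, gid) for `path`, in the style of `ls -l`."""
    info = os.stat(path)
    return stat.filemode(info.st_mode), info.st_uid, info.st_gid


def world_writable(path):
    """Return True if users outside the owner and group may write to `path`."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as demo:
        os.chmod(demo.name, 0o640)
        print(describe(demo.name)[0], world_writable(demo.name))
        # Equivalent of `chmod a+rwx`: add every permission bit for everyone.
        os.chmod(demo.name, os.stat(demo.name).st_mode | 0o777)
        print(describe(demo.name)[0], world_writable(demo.name))
```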

candlerb

Feb 5, 2016, 6:30:41 AM
to ganeti
On Thursday, 4 February 2016 14:25:49 UTC, Pavel Műller wrote:
...
Feb  4 15:19:12 sunfirex2200 ganeti[14717]: ganeti-noded...done.
Feb  4 15:19:15 sunfirex2200 ganeti[14763]: Starting Ganeti cluster:ganeti-noded...done.
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: ganeti-wconfd...Command line error: unrecognized option `-b'
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: ganeti-wconfd (ganeti) version v2.12.4
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: Usage: ganeti-wconfd [OPTION...]
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: -f  --foreground       Don't detach from the current terminal
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --no-user-checks   Ignore user checks
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: -d  --debug            Enable debug messages
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --syslog=SYSLOG    Enable logging to syslog (except debug messages); one of 'no', 'yes' or 'only' [no]
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --force-node       Force the daemon to run on a different node than the master
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --no-voting        Skip node agreement check (dangerous)
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --yes-do-it        Force a dangerous operation
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: -h  --help             show help
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: -V  --version          show the version of the program
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: --help-completion  show completion info
Feb  4 15:19:17 sunfirex2200 ganeti[14763]: failed (exit code 2).
Feb  4 15:19:21 sunfirex2200 ganeti[14763]: ganeti-rapi...done.

Bleurgh. Whatever the startup script is, it's wrong. It's passing an unknown option (-b).

Could you tell us which operating system and version you are running on, and also how you installed ganeti: i.e. from source, from standard OS package repo, or from a third-party package repo?

I have some experience with a "-b" flag which might be relevant.

I was helping someone with a test Ganeti cluster under Debian Jessie. It worked fine, except that the RAPI wouldn't accept connections over the network. It turned out this was because Debian's default config had locked it down by passing some additional flags to the RAPI daemon ("-b 127.0.0.1 --require-authentication"). This was configured in /etc/default/ganeti, in something like RAPI_ARGS.

This was running version 2.12.x out of the standard Jessie package repos. However the rest of the cluster was working fine - in particular there weren't any wconfd problems as far as I know.