Intermittent failures when doing DRBD live migration

785 views
Skip to first unread message

candlerb

unread,
Jul 11, 2013, 7:31:09 AM7/11/13
to gan...@googlegroups.com
Two nodes (Dell R210-II), Debian Wheezy, ganeti 2.6.2, LVM is md1 disk pair, cluster burnin works fine.

Problem: live migrate works quite often, but sometimes aborts and leaves drbd in a broken state. I can replicate this with a loop of live migrations:

root@wrn-vm1:~# while gnt-instance migrate -f briantest.int.example.com; do echo "==="; sleep 2; done
...
Thu Jul 11 10:59:47 2013 Migrating instance briantest.int.example.com
Thu Jul 11 10:59:48 2013 * checking disk consistency between source and target
Thu Jul 11 10:59:48 2013 * switching node wrn-vm2.int.example.com to secondary mode
Thu Jul 11 10:59:48 2013 * changing into standalone mode
Thu Jul 11 10:59:48 2013 * changing disks into dual-master mode
Thu Jul 11 10:59:49 2013 * wait until resync is done
Failure: command execution error:
Cannot resync disks on node wrn-vm2.int.example.com: DRBD device <<class 'ganeti.bdev.DRBD8'>: unique_id: ('192.168.8.102', 11002, '192.168.8.101', 11002, 0, '139d530e0df6e7ce718a1dc7cd330f7b6e27d872'), children: [<<class 'ganeti.bdev.LogicalVolume'>: unique_id: ('xenvg', 'b14730aa-6838-4a2f-80c5-3d44bb6683d4.disk0_data'), children: [], 253:5, /dev/xenvg/b14730aa-6838-4a2f-80c5-3d44bb6683d4.disk0_data>, <<class 'ganeti.bdev.LogicalVolume'>: unique_id: ('xenvg', 'b14730aa-6838-4a2f-80c5-3d44bb6683d4.disk0_meta'), children: [], 253:6, /dev/xenvg/b14730aa-6838-4a2f-80c5-3d44bb6683d4.disk0_meta>], 147:0, /dev/drbd0> is not in sync: stats=<ganeti.bdev.DRBD8Status object at 0x2542b50>
root@wrn-vm1:~# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:792 dr:4246768 al:32 bm:256 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:72
 1: cs:Unconfigured
 2: cs:Unconfigured

This loop started at 10:56:50, so it took about 3 minutes worth of migrating back and forth to fail. The main log error appears to be this one:

Jul 11 10:59:46 wrn-vm2 kernel: [236603.130604] block drbd0: uuid_compare()=-1 by rule 35
Jul 11 10:59:46 wrn-vm2 kernel: [236603.135779] block drbd0: I shall become SyncTarget, but I am primary!
Jul 11 10:59:46 wrn-vm2 kernel: [236603.142336] block drbd0: ASSERT( os.conn == C_WF_REPORT_PARAMS ) in /build/linux-s5x2oE/linux-3.2.46/drivers/block/drbd/drbd_receiver.c:3245
Jul 11 10:59:46 wrn-vm2 kernel: [236603.155084] block drbd0: peer( Primary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )

This is obviously a serious issue. "gnt-instance migrate --cleanup" doesn't seem to fix it either, it hangs at the point "changing disks into single-master mode" and then times out after a couple of minutes.

Thu Jul 11 12:23:03 2013  - INFO: Not checking memory on the secondary node as instance will not be started
Thu Jul 11 12:23:03 2013 Migrating instance briantest.int.example.com
Thu Jul 11 12:23:03 2013 * checking where the instance actually runs (if this hangs, the hypervisor might be in a bad state)
Thu Jul 11 12:23:03 2013 * instance confirmed to be running on its primary node (wrn-vm1.int.example.com)
Thu Jul 11 12:23:03 2013 * switching node wrn-vm2.int.example.com to secondary mode
Thu Jul 11 12:23:03 2013 * wait until resync is done
Thu Jul 11 12:23:19 2013 * changing into standalone mode
Thu Jul 11 12:23:19 2013 * changing disks into single-master mode
Failure: command execution error:
Cannot change disks config on node wrn-vm2.int.example.com: Timeout in disk reconnecting

The only way I can find to recover is to shutdown the instance, change it to -t file and then back to -t drbd.

Kernel and drbd versions:

root@wrn-vm1:~# uname -a
Linux wrn-vm1.int.example.com 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux
root@wrn-vm1:~# dpkg-query -l | grep -i drbd
ii  drbd8-utils                                    2:8.3.13-2                   amd64        RAID 1 over tcp/ip for Linux utilities

This is fairly easy for me to reproduce, so is there anything I can do here to narrow it down?

Regards,

Brian.

Logs from primary node:

root@wrn-vm1:~# grep -2 drbd /var/log/syslog
...
Jul 11 10:59:20 wrn-vm1 kernel: [236543.929504] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:20 wrn-vm1 kernel: [236544.339164] mgmt-br0: port 2(tap0) entering forwarding state
Jul 11 10:59:27 wrn-vm1 kernel: [236550.610398] mgmt-br0: port 2(tap0) entering forwarding state
--
Jul 11 10:59:27 wrn-vm1 kernel: [236550.625758] device tap0 left promiscuous mode
Jul 11 10:59:27 wrn-vm1 kernel: [236550.630237] mgmt-br0: port 2(tap0) entering disabled state
Jul 11 10:59:28 wrn-vm1 kernel: [236551.616598] block drbd0: role( Primary -> Secondary )
Jul 11 10:59:28 wrn-vm1 kernel: [236551.621896] block drbd0: bitmap WRITE of 0 pages took 0 jiffies
Jul 11 10:59:28 wrn-vm1 kernel: [236551.628144] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:28 wrn-vm1 kernel: [236551.951400] block drbd0: peer( Primary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Jul 11 10:59:28 wrn-vm1 kernel: [236551.961691] block drbd0: asender terminated
Jul 11 10:59:28 wrn-vm1 kernel: [236551.966001] block drbd0: Terminating drbd0_asender
Jul 11 10:59:28 wrn-vm1 kernel: [236551.971317] block drbd0: Connection closed
Jul 11 10:59:28 wrn-vm1 kernel: [236551.975552] block drbd0: conn( TearDown -> Unconnected )
Jul 11 10:59:28 wrn-vm1 kernel: [236551.981069] block drbd0: receiver terminated
Jul 11 10:59:28 wrn-vm1 kernel: [236551.985447] block drbd0: Restarting drbd0_receiver
Jul 11 10:59:28 wrn-vm1 kernel: [236551.990374] block drbd0: receiver (re)started
Jul 11 10:59:28 wrn-vm1 kernel: [236551.994850] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:28 wrn-vm1 kernel: [236552.000758] block drbd0: conn( WFConnection -> Disconnecting )
Jul 11 10:59:28 wrn-vm1 kernel: [236552.006804] block drbd0: Discarding network configuration.
Jul 11 10:59:28 wrn-vm1 kernel: [236552.012498] block drbd0: Connection closed
Jul 11 10:59:28 wrn-vm1 kernel: [236552.016712] block drbd0: conn( Disconnecting -> StandAlone )
Jul 11 10:59:28 wrn-vm1 kernel: [236552.022611] block drbd0: receiver terminated
Jul 11 10:59:28 wrn-vm1 kernel: [236552.026990] block drbd0: Terminating drbd0_receiver
Jul 11 10:59:28 wrn-vm1 kernel: [236552.187138] block drbd0: conn( StandAlone -> Unconnected )
Jul 11 10:59:28 wrn-vm1 kernel: [236552.192868] block drbd0: Starting receiver thread (from drbd0_worker [8055])
Jul 11 10:59:28 wrn-vm1 kernel: [236552.200125] block drbd0: receiver (re)started
Jul 11 10:59:28 wrn-vm1 kernel: [236552.204609] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:29 wrn-vm1 kernel: [236552.988235] block drbd0: Handshake successful: Agreed network protocol version 96
Jul 11 10:59:29 wrn-vm1 kernel: [236552.996087] block drbd0: Peer authenticated using 16 bytes of 'md5' HMAC
Jul 11 10:59:29 wrn-vm1 kernel: [236553.002930] block drbd0: conn( WFConnection -> WFReportParams )
Jul 11 10:59:29 wrn-vm1 kernel: [236553.009123] block drbd0: Starting asender thread (from drbd0_receiver [12490])
Jul 11 10:59:29 wrn-vm1 kernel: [236553.016579] block drbd0: data-integrity-alg: <not-used>
Jul 11 10:59:29 wrn-vm1 kernel: [236553.021922] block drbd0: drbd_sync_handshake:
Jul 11 10:59:29 wrn-vm1 kernel: [236553.031692] block drbd0: self 100B80B148D9872E:0000000000000000:82B9378155996F1D:82B8378155996F1D bits:0 flags:0
Jul 11 10:59:29 wrn-vm1 kernel: [236553.041979] block drbd0: peer 0D9DDADA643F4AD5:100B80B148D9872F:82B9378155996F1D:82B8378155996F1D bits:0 flags:4
Jul 11 10:59:29 wrn-vm1 kernel: [236553.052267] block drbd0: uuid_compare()=-1 by rule 50
Jul 11 10:59:29 wrn-vm1 kernel: [236553.057424] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
Jul 11 10:59:29 wrn-vm1 kernel: [236553.087816] block drbd0: conn( WFBitMapT -> WFSyncUUID )
Jul 11 10:59:29 wrn-vm1 kernel: [236553.142670] block drbd0: updated sync uuid 100C80B148D9872E:0000000000000000:82B9378155996F1D:82B8378155996F1D
Jul 11 10:59:29 wrn-vm1 kernel: [236553.189275] block drbd0: helper command: /bin/true before-resync-target minor-0
Jul 11 10:59:29 wrn-vm1 kernel: [236553.197379] block drbd0: helper command: /bin/true before-resync-target minor-0 exit code 0 (0x0)
Jul 11 10:59:29 wrn-vm1 kernel: [236553.206398] block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
Jul 11 10:59:29 wrn-vm1 kernel: [236553.214890] block drbd0: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Jul 11 10:59:29 wrn-vm1 kernel: [236553.224219] block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 10:59:29 wrn-vm1 kernel: [236553.231109] block drbd0: updated UUIDs 0D9DDADA643F4AD4:0000000000000000:100C80B148D9872E:100B80B148D9872F
Jul 11 10:59:29 wrn-vm1 kernel: [236553.240877] block drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Jul 11 10:59:29 wrn-vm1 kernel: [236553.260004] block drbd0: helper command: /bin/true after-resync-target minor-0
Jul 11 10:59:29 wrn-vm1 kernel: [236553.268064] block drbd0: helper command: /bin/true after-resync-target minor-0 exit code 0 (0x0)
Jul 11 10:59:29 wrn-vm1 kernel: [236553.278205] block drbd0: bitmap WRITE of 32 pages took 0 jiffies
Jul 11 10:59:29 wrn-vm1 kernel: [236553.294355] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:34 wrn-vm1 kernel: [236557.562849] block drbd0: peer( Primary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Jul 11 10:59:34 wrn-vm1 kernel: [236557.573073] block drbd0: conn( TearDown -> Disconnecting )
Jul 11 10:59:34 wrn-vm1 kernel: [236557.582398] block drbd0: asender terminated
Jul 11 10:59:34 wrn-vm1 kernel: [236557.586705] block drbd0: Terminating drbd0_asender
Jul 11 10:59:34 wrn-vm1 kernel: [236557.586845] block drbd0: Connection closed
Jul 11 10:59:34 wrn-vm1 kernel: [236557.586854] block drbd0: conn( Disconnecting -> StandAlone )
Jul 11 10:59:34 wrn-vm1 kernel: [236557.586925] block drbd0: receiver terminated
Jul 11 10:59:34 wrn-vm1 kernel: [236557.586928] block drbd0: Terminating drbd0_receiver
Jul 11 10:59:34 wrn-vm1 kernel: [236557.769356] block drbd0: conn( StandAlone -> Unconnected )
Jul 11 10:59:34 wrn-vm1 kernel: [236557.775055] block drbd0: Starting receiver thread (from drbd0_worker [8055])
Jul 11 10:59:34 wrn-vm1 kernel: [236557.782314] block drbd0: receiver (re)started
Jul 11 10:59:34 wrn-vm1 kernel: [236557.786792] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:35 wrn-vm1 kernel: [236558.511784] block drbd0: Handshake successful: Agreed network protocol version 96
Jul 11 10:59:35 wrn-vm1 kernel: [236558.519635] block drbd0: Peer authenticated using 16 bytes of 'md5' HMAC
Jul 11 10:59:35 wrn-vm1 kernel: [236558.526481] block drbd0: conn( WFConnection -> WFReportParams )
Jul 11 10:59:35 wrn-vm1 kernel: [236558.532703] block drbd0: Starting asender thread (from drbd0_receiver [12541])
Jul 11 10:59:35 wrn-vm1 kernel: [236558.540134] block drbd0: data-integrity-alg: <not-used>
Jul 11 10:59:35 wrn-vm1 kernel: [236558.545482] block drbd0: drbd_sync_handshake:
Jul 11 10:59:35 wrn-vm1 kernel: [236558.549951] block drbd0: self 0D9DDADA643F4AD4:0000000000000000:100C80B148D9872E:100B80B148D9872F bits:0 flags:0
Jul 11 10:59:35 wrn-vm1 kernel: [236558.560239] block drbd0: peer 22A56126142FEB7B:0D9DDADA643F4AD5:100C80B148D9872F:100B80B148D9872F bits:0 flags:4
Jul 11 10:59:35 wrn-vm1 kernel: [236558.570526] block drbd0: uuid_compare()=-1 by rule 50
Jul 11 10:59:35 wrn-vm1 kernel: [236558.575684] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
Jul 11 10:59:35 wrn-vm1 kernel: [236558.610699] block drbd0: conn( WFBitMapT -> WFSyncUUID )
Jul 11 10:59:35 wrn-vm1 kernel: [236558.665737] block drbd0: role( Secondary -> Primary )
Jul 11 10:59:35 wrn-vm1 kernel: [236558.677221] block drbd0: updated sync uuid 0D9EDADA643F4AD5:0000000000000000:100C80B148D9872E:100B80B148D9872F
Jul 11 10:59:35 wrn-vm1 kernel: [236558.707854] block drbd0: helper command: /bin/true before-resync-target minor-0
Jul 11 10:59:35 wrn-vm1 kernel: [236558.715944] block drbd0: helper command: /bin/true before-resync-target minor-0 exit code 0 (0x0)
Jul 11 10:59:35 wrn-vm1 kernel: [236558.724946] block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
Jul 11 10:59:35 wrn-vm1 kernel: [236558.733430] block drbd0: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Jul 11 10:59:35 wrn-vm1 kernel: [236558.753078] block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 10:59:35 wrn-vm1 kernel: [236558.759979] block drbd0: updated UUIDs 22A56126142FEB7B:0000000000000000:0D9EDADA643F4AD5:0D9DDADA643F4AD5
Jul 11 10:59:35 wrn-vm1 kernel: [236558.769763] block drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Jul 11 10:59:35 wrn-vm1 kernel: [236558.794404] block drbd0: helper command: /bin/true after-resync-target minor-0
Jul 11 10:59:35 wrn-vm1 kernel: [236558.802474] block drbd0: helper command: /bin/true after-resync-target minor-0 exit code 0 (0x0)
Jul 11 10:59:35 wrn-vm1 kernel: [236558.819775] block drbd0: bitmap WRITE of 32 pages took 1 jiffies
Jul 11 10:59:35 wrn-vm1 kernel: [236558.834307] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:35 wrn-vm1 kernel: [236559.083477] device tap0 entered promiscuous mode
Jul 11 10:59:35 wrn-vm1 kernel: [236559.088259] mgmt-br0: port 2(tap0) entering forwarding state
Jul 11 10:59:35 wrn-vm1 kernel: [236559.094033] mgmt-br0: port 2(tap0) entering forwarding state
Jul 11 10:59:42 wrn-vm1 kernel: [236566.475644] block drbd0: peer( Primary -> Secondary )
Jul 11 10:59:43 wrn-vm1 kernel: [236566.815233] block drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Jul 11 10:59:43 wrn-vm1 kernel: [236566.819529] block drbd0: sock was shut down by peer
Jul 11 10:59:43 wrn-vm1 kernel: [236566.819549] block drbd0: meta connection shut down by peer.
Jul 11 10:59:43 wrn-vm1 kernel: [236566.836760] block drbd0: new current UUID F346E637932F81FD:22A56126142FEB7B:0D9EDADA643F4AD5:0D9DDADA643F4AD5
Jul 11 10:59:43 wrn-vm1 kernel: [236566.850645] block drbd0: asender terminated
Jul 11 10:59:43 wrn-vm1 kernel: [236566.854985] block drbd0: Terminating drbd0_asender
Jul 11 10:59:43 wrn-vm1 kernel: [236566.867327] block drbd0: Connection closed
Jul 11 10:59:43 wrn-vm1 kernel: [236566.871555] block drbd0: conn( Disconnecting -> StandAlone )
Jul 11 10:59:43 wrn-vm1 kernel: [236566.877481] block drbd0: receiver terminated
Jul 11 10:59:43 wrn-vm1 kernel: [236566.881880] block drbd0: Terminating drbd0_receiver
Jul 11 10:59:43 wrn-vm1 kernel: [236567.043437] block drbd0: conn( StandAlone -> Unconnected )
Jul 11 10:59:43 wrn-vm1 kernel: [236567.049168] block drbd0: Starting receiver thread (from drbd0_worker [8055])
Jul 11 10:59:43 wrn-vm1 kernel: [236567.056403] block drbd0: receiver (re)started
Jul 11 10:59:43 wrn-vm1 kernel: [236567.060905] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:44 wrn-vm1 kernel: [236567.780332] block drbd0: Handshake successful: Agreed network protocol version 96
Jul 11 10:59:44 wrn-vm1 kernel: [236567.788185] block drbd0: Peer authenticated using 16 bytes of 'md5' HMAC
Jul 11 10:59:44 wrn-vm1 kernel: [236567.795029] block drbd0: conn( WFConnection -> WFReportParams )
Jul 11 10:59:44 wrn-vm1 kernel: [236567.801191] block drbd0: Starting asender thread (from drbd0_receiver [12613])
Jul 11 10:59:44 wrn-vm1 kernel: [236567.808643] block drbd0: data-integrity-alg: <not-used>
Jul 11 10:59:44 wrn-vm1 kernel: [236567.813998] block drbd0: drbd_sync_handshake:
Jul 11 10:59:44 wrn-vm1 kernel: [236567.818461] block drbd0: self F346E637932F81FD:22A56126142FEB7B:0D9EDADA643F4AD5:0D9DDADA643F4AD5 bits:0 flags:0
Jul 11 10:59:44 wrn-vm1 kernel: [236567.828754] block drbd0: peer 22A56126142FEB7A:0000000000000000:0D9EDADA643F4AD5:0D9DDADA643F4AD5 bits:0 flags:4
Jul 11 10:59:44 wrn-vm1 kernel: [236567.839040] block drbd0: uuid_compare()=1 by rule 70
Jul 11 10:59:44 wrn-vm1 kernel: [236567.844109] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
Jul 11 10:59:44 wrn-vm1 kernel: [236567.880876] block drbd0: helper command: /bin/true before-resync-source minor-0
Jul 11 10:59:44 wrn-vm1 kernel: [236567.888941] block drbd0: helper command: /bin/true before-resync-source minor-0 exit code 0 (0x0)
Jul 11 10:59:44 wrn-vm1 kernel: [236567.897945] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
Jul 11 10:59:44 wrn-vm1 kernel: [236567.906512] block drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
Jul 11 10:59:44 wrn-vm1 kernel: [236567.906544] block drbd0: updated sync UUID F346E637932F81FD:22A66126142FEB7B:22A56126142FEB7B:0D9EDADA643F4AD5
Jul 11 10:59:44 wrn-vm1 kernel: [236568.065036] block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 10:59:44 wrn-vm1 kernel: [236568.071941] block drbd0: updated UUIDs F346E637932F81FD:0000000000000000:22A66126142FEB7B:22A56126142FEB7B
Jul 11 10:59:44 wrn-vm1 kernel: [236568.081717] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Jul 11 10:59:44 wrn-vm1 kernel: [236568.116981] block drbd0: bitmap WRITE of 32 pages took 5 jiffies
Jul 11 10:59:44 wrn-vm1 kernel: [236568.132489] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:46 wrn-vm1 kernel: [236569.934524] tap0: no IPv6 routers present
Jul 11 10:59:48 wrn-vm1 kernel: [236572.240308] block drbd0: peer( Secondary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Jul 11 10:59:48 wrn-vm1 kernel: [236572.250750] block drbd0: new current UUID 2D4D56DBB857BD95:F346E637932F81FD:22A66126142FEB7B:22A56126142FEB7B
Jul 11 10:59:48 wrn-vm1 kernel: [236572.263463] block drbd0: conn( TearDown -> Disconnecting )
Jul 11 10:59:48 wrn-vm1 kernel: [236572.268593] block drbd0: asender terminated
Jul 11 10:59:48 wrn-vm1 kernel: [236572.268607] block drbd0: Terminating drbd0_asender
Jul 11 10:59:48 wrn-vm1 kernel: [236572.278375] block drbd0: Connection closed
Jul 11 10:59:48 wrn-vm1 kernel: [236572.282688] block drbd0: conn( Disconnecting -> StandAlone )
Jul 11 10:59:48 wrn-vm1 kernel: [236572.288592] block drbd0: receiver terminated
Jul 11 10:59:48 wrn-vm1 kernel: [236572.292995] block drbd0: Terminating drbd0_receiver
Jul 11 10:59:48 wrn-vm1 kernel: [236572.438029] block drbd0: conn( StandAlone -> Unconnected )
Jul 11 10:59:48 wrn-vm1 kernel: [236572.443723] block drbd0: Starting receiver thread (from drbd0_worker [8055])
Jul 11 10:59:48 wrn-vm1 kernel: [236572.451037] block drbd0: receiver (re)started
Jul 11 10:59:48 wrn-vm1 kernel: [236572.455522] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:49 wrn-vm1 kernel: [236573.183959] block drbd0: Handshake successful: Agreed network protocol version 96
Jul 11 10:59:49 wrn-vm1 kernel: [236573.191806] block drbd0: Peer authenticated using 16 bytes of 'md5' HMAC
Jul 11 10:59:49 wrn-vm1 kernel: [236573.198648] block drbd0: conn( WFConnection -> WFReportParams )
Jul 11 10:59:49 wrn-vm1 kernel: [236573.204816] block drbd0: Starting asender thread (from drbd0_receiver [12654])
Jul 11 10:59:49 wrn-vm1 kernel: [236573.212370] block drbd0: data-integrity-alg: <not-used>
Jul 11 10:59:49 wrn-vm1 kernel: [236573.217705] block drbd0: drbd_sync_handshake:
Jul 11 10:59:49 wrn-vm1 kernel: [236573.222178] block drbd0: self 2D4D56DBB857BD95:F346E637932F81FD:22A66126142FEB7B:22A56126142FEB7B bits:0 flags:0
Jul 11 10:59:49 wrn-vm1 kernel: [236573.232475] block drbd0: peer F346E637932F81FC:0000000000000000:22A66126142FEB7A:22A56126142FEB7B bits:0 flags:4
Jul 11 10:59:49 wrn-vm1 kernel: [236573.242762] block drbd0: uuid_compare()=1 by rule 70
Jul 11 10:59:49 wrn-vm1 kernel: [236573.247832] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
Jul 11 10:59:49 wrn-vm1 kernel: [236573.283858] block drbd0: helper command: /bin/true before-resync-source minor-0
Jul 11 10:59:49 wrn-vm1 kernel: [236573.291936] block drbd0: helper command: /bin/true before-resync-source minor-0 exit code 0 (0x0)
Jul 11 10:59:49 wrn-vm1 kernel: [236573.300942] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
Jul 11 10:59:49 wrn-vm1 kernel: [236573.309527] block drbd0: conn( SyncSource -> WFBitMapS ) pdsk( Inconsistent -> Consistent )
Jul 11 10:59:49 wrn-vm1 kernel: [236573.309560] block drbd0: updated sync UUID 2D4D56DBB857BD95:F347E637932F81FD:F346E637932F81FD:22A66126142FEB7B
Jul 11 10:59:49 wrn-vm1 kernel: [236573.309565] block drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
Jul 11 10:59:49 wrn-vm1 kernel: [236573.335887] block drbd0: peer( Secondary -> Primary )
Jul 11 10:59:49 wrn-vm1 kernel: [236573.443871] block drbd0: pdsk( Consistent -> Inconsistent )
Jul 11 10:59:49 wrn-vm1 kernel: [236573.458723] block drbd0: pdsk( Inconsistent -> Consistent )
Jul 11 10:59:50 wrn-vm1 kernel: [236573.570170] block drbd0: sock was shut down by peer
Jul 11 10:59:50 wrn-vm1 kernel: [236573.570175] block drbd0: meta connection shut down by peer.
Jul 11 10:59:50 wrn-vm1 kernel: [236573.570183] block drbd0: peer( Primary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( Consistent -> DUnknown )
Jul 11 10:59:50 wrn-vm1 kernel: [236573.582983] block drbd0: asender terminated
Jul 11 10:59:50 wrn-vm1 kernel: [236573.582987] block drbd0: Terminating drbd0_asender
Jul 11 10:59:50 wrn-vm1 kernel: [236573.583714] block drbd0: bitmap WRITE of 32 pages took 4 jiffies
Jul 11 10:59:50 wrn-vm1 kernel: [236573.594487] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:50 wrn-vm1 kernel: [236573.614403] block drbd0: Connection closed
Jul 11 10:59:50 wrn-vm1 kernel: [236573.618627] block drbd0: conn( NetworkFailure -> Unconnected )
Jul 11 10:59:50 wrn-vm1 kernel: [236573.624659] block drbd0: receiver terminated
Jul 11 10:59:50 wrn-vm1 kernel: [236573.629042] block drbd0: Restarting drbd0_receiver
Jul 11 10:59:50 wrn-vm1 kernel: [236573.633944] block drbd0: receiver (re)started
Jul 11 10:59:50 wrn-vm1 kernel: [236573.638428] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:50 wrn-vm1 kernel: [236574.107179] mgmt-br0: port 2(tap0) entering forwarding state
Jul 11 11:00:01 wrn-vm1 /USR/SBIN/CRON[12669]: (root) CMD ([ -x /usr/local/sbin/ganeti-watcher ] && /usr/local/sbin/ganeti-watcher)

Logs from secondary node:

root@wrn-vm2:~# grep -2 drbd /var/log/syslog
...
Jul 11 10:59:17 wrn-vm2 kernel: [236573.843746] mgmt-br0: port 2(tap0) entering forwarding state
Jul 11 10:59:24 wrn-vm2 kernel: [236581.262453] block drbd0: peer( Primary -> Secondary )
Jul 11 10:59:25 wrn-vm2 kernel: [236581.577412] block drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Jul 11 10:59:25 wrn-vm2 kernel: [236581.581688] block drbd0: sock was shut down by peer
Jul 11 10:59:25 wrn-vm2 kernel: [236581.581706] block drbd0: meta connection shut down by peer.
Jul 11 10:59:25 wrn-vm2 kernel: [236581.598938] block drbd0: new current UUID 0D9DDADA643F4AD5:100B80B148D9872F:82B9378155996F1D:82B8378155996F1D
Jul 11 10:59:25 wrn-vm2 kernel: [236581.618135] block drbd0: asender terminated
Jul 11 10:59:25 wrn-vm2 kernel: [236581.622476] block drbd0: Terminating drbd0_asender
Jul 11 10:59:25 wrn-vm2 kernel: [236581.643131] block drbd0: Connection closed
Jul 11 10:59:25 wrn-vm2 kernel: [236581.647353] block drbd0: conn( Disconnecting -> StandAlone )
Jul 11 10:59:25 wrn-vm2 kernel: [236581.653289] block drbd0: receiver terminated
Jul 11 10:59:25 wrn-vm2 kernel: [236581.657678] block drbd0: Terminating drbd0_receiver
Jul 11 10:59:25 wrn-vm2 kernel: [236581.885183] block drbd0: conn( StandAlone -> Unconnected )
Jul 11 10:59:25 wrn-vm2 kernel: [236581.890905] block drbd0: Starting receiver thread (from drbd0_worker [2621])
Jul 11 10:59:25 wrn-vm2 kernel: [236581.898265] block drbd0: receiver (re)started
Jul 11 10:59:25 wrn-vm2 kernel: [236581.902759] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:26 wrn-vm2 kernel: [236582.603953] block drbd0: Handshake successful: Agreed network protocol version 96
Jul 11 10:59:26 wrn-vm2 kernel: [236582.611753] block drbd0: Peer authenticated using 16 bytes of 'md5' HMAC
Jul 11 10:59:26 wrn-vm2 kernel: [236582.618577] block drbd0: conn( WFConnection -> WFReportParams )
Jul 11 10:59:26 wrn-vm2 kernel: [236582.624738] block drbd0: Starting asender thread (from drbd0_receiver [6472])
Jul 11 10:59:26 wrn-vm2 kernel: [236582.632271] block drbd0: data-integrity-alg: <not-used>
Jul 11 10:59:26 wrn-vm2 kernel: [236582.637616] block drbd0: drbd_sync_handshake:
Jul 11 10:59:26 wrn-vm2 kernel: [236582.642083] block drbd0: self 0D9DDADA643F4AD5:100B80B148D9872F:82B9378155996F1D:82B8378155996F1D bits:0 flags:0
Jul 11 10:59:26 wrn-vm2 kernel: [236582.652378] block drbd0: peer 100B80B148D9872E:0000000000000000:82B9378155996F1D:82B8378155996F1D bits:0 flags:4
Jul 11 10:59:26 wrn-vm2 kernel: [236582.662664] block drbd0: uuid_compare()=1 by rule 70
Jul 11 10:59:26 wrn-vm2 kernel: [236582.667733] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
Jul 11 10:59:26 wrn-vm2 kernel: [236582.709295] block drbd0: helper command: /bin/true before-resync-source minor-0
Jul 11 10:59:26 wrn-vm2 kernel: [236582.717334] block drbd0: helper command: /bin/true before-resync-source minor-0 exit code 0 (0x0)
Jul 11 10:59:26 wrn-vm2 kernel: [236582.726336] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
Jul 11 10:59:26 wrn-vm2 kernel: [236582.734905] block drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
Jul 11 10:59:26 wrn-vm2 kernel: [236582.734937] block drbd0: updated sync UUID 0D9DDADA643F4AD5:100C80B148D9872F:100B80B148D9872F:82B9378155996F1D
Jul 11 10:59:26 wrn-vm2 kernel: [236582.917507] block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 10:59:26 wrn-vm2 kernel: [236582.924409] block drbd0: updated UUIDs 0D9DDADA643F4AD5:0000000000000000:100C80B148D9872F:100B80B148D9872F
Jul 11 10:59:26 wrn-vm2 kernel: [236582.934193] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Jul 11 10:59:26 wrn-vm2 kernel: [236582.965261] block drbd0: bitmap WRITE of 32 pages took 6 jiffies
Jul 11 10:59:26 wrn-vm2 kernel: [236582.975656] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:28 wrn-vm2 kernel: [236584.722569] tap0: no IPv6 routers present
Jul 11 10:59:30 wrn-vm2 kernel: [236587.188883] block drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Jul 11 10:59:30 wrn-vm2 kernel: [236587.199763] block drbd0: new current UUID 22A56126142FEB7B:0D9DDADA643F4AD5:100C80B148D9872F:100B80B148D9872F
Jul 11 10:59:30 wrn-vm2 kernel: [236587.202457] block drbd0: meta connection shut down by peer.
Jul 11 10:59:30 wrn-vm2 kernel: [236587.227467] block drbd0: asender terminated
Jul 11 10:59:30 wrn-vm2 kernel: [236587.231795] block drbd0: Terminating drbd0_asender
Jul 11 10:59:30 wrn-vm2 kernel: [236587.231863] block drbd0: Connection closed
Jul 11 10:59:30 wrn-vm2 kernel: [236587.231873] block drbd0: conn( Disconnecting -> StandAlone )
Jul 11 10:59:30 wrn-vm2 kernel: [236587.231896] block drbd0: receiver terminated
Jul 11 10:59:30 wrn-vm2 kernel: [236587.231899] block drbd0: Terminating drbd0_receiver
Jul 11 10:59:30 wrn-vm2 kernel: [236587.405182] block drbd0: conn( StandAlone -> Unconnected )
Jul 11 10:59:30 wrn-vm2 kernel: [236587.410904] block drbd0: Starting receiver thread (from drbd0_worker [2621])
Jul 11 10:59:30 wrn-vm2 kernel: [236587.418206] block drbd0: receiver (re)started
Jul 11 10:59:30 wrn-vm2 kernel: [236587.422683] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:31 wrn-vm2 kernel: [236588.127534] block drbd0: Handshake successful: Agreed network protocol version 96
Jul 11 10:59:31 wrn-vm2 kernel: [236588.135301] block drbd0: Peer authenticated using 16 bytes of 'md5' HMAC
Jul 11 10:59:31 wrn-vm2 kernel: [236588.142132] block drbd0: conn( WFConnection -> WFReportParams )
Jul 11 10:59:31 wrn-vm2 kernel: [236588.148319] block drbd0: Starting asender thread (from drbd0_receiver [6506])
Jul 11 10:59:31 wrn-vm2 kernel: [236588.155833] block drbd0: data-integrity-alg: <not-used>
Jul 11 10:59:31 wrn-vm2 kernel: [236588.161179] block drbd0: drbd_sync_handshake:
Jul 11 10:59:31 wrn-vm2 kernel: [236588.165642] block drbd0: self 22A56126142FEB7B:0D9DDADA643F4AD5:100C80B148D9872F:100B80B148D9872F bits:0 flags:0
Jul 11 10:59:31 wrn-vm2 kernel: [236588.175935] block drbd0: peer 0D9DDADA643F4AD4:0000000000000000:100C80B148D9872E:100B80B148D9872F bits:0 flags:4
Jul 11 10:59:31 wrn-vm2 kernel: [236588.186220] block drbd0: uuid_compare()=1 by rule 70
Jul 11 10:59:31 wrn-vm2 kernel: [236588.191290] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
Jul 11 10:59:31 wrn-vm2 kernel: [236588.232398] block drbd0: helper command: /bin/true before-resync-source minor-0
Jul 11 10:59:31 wrn-vm2 kernel: [236588.240477] block drbd0: helper command: /bin/true before-resync-source minor-0 exit code 0 (0x0)
Jul 11 10:59:31 wrn-vm2 kernel: [236588.249485] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
Jul 11 10:59:31 wrn-vm2 kernel: [236588.258065] block drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
Jul 11 10:59:31 wrn-vm2 kernel: [236588.258085] block drbd0: updated sync UUID 22A56126142FEB7B:0D9EDADA643F4AD5:0D9DDADA643F4AD5:100C80B148D9872F
Jul 11 10:59:31 wrn-vm2 kernel: [236588.275978] block drbd0: peer( Secondary -> Primary )
Jul 11 10:59:31 wrn-vm2 kernel: [236588.427265] block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 10:59:31 wrn-vm2 kernel: [236588.434163] block drbd0: updated UUIDs 22A56126142FEB7B:0000000000000000:0D9EDADA643F4AD5:0D9DDADA643F4AD5
Jul 11 10:59:31 wrn-vm2 kernel: [236588.443940] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Jul 11 10:59:31 wrn-vm2 kernel: [236588.455178] block drbd0: bitmap WRITE of 32 pages took 0 jiffies
Jul 11 10:59:31 wrn-vm2 kernel: [236588.473835] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:32 wrn-vm2 kernel: [236588.847244] mgmt-br0: port 2(tap0) entering forwarding state
Jul 11 10:59:38 wrn-vm2 kernel: [236595.052608] mgmt-br0: port 2(tap0) entering forwarding state
Jul 11 10:59:38 wrn-vm2 kernel: [236595.058487] device tap0 left promiscuous mode
Jul 11 10:59:38 wrn-vm2 kernel: [236595.062963] mgmt-br0: port 2(tap0) entering disabled state
Jul 11 10:59:39 wrn-vm2 kernel: [236596.055853] block drbd0: role( Primary -> Secondary )
Jul 11 10:59:39 wrn-vm2 kernel: [236596.061152] block drbd0: bitmap WRITE of 0 pages took 0 jiffies
Jul 11 10:59:39 wrn-vm2 kernel: [236596.075052] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:39 wrn-vm2 kernel: [236596.420519] block drbd0: peer( Primary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Jul 11 10:59:39 wrn-vm2 kernel: [236596.430798] block drbd0: asender terminated
Jul 11 10:59:39 wrn-vm2 kernel: [236596.435114] block drbd0: Terminating drbd0_asender
Jul 11 10:59:39 wrn-vm2 kernel: [236596.449122] block drbd0: Connection closed
Jul 11 10:59:39 wrn-vm2 kernel: [236596.453346] block drbd0: conn( TearDown -> Unconnected )
Jul 11 10:59:39 wrn-vm2 kernel: [236596.458869] block drbd0: receiver terminated
Jul 11 10:59:39 wrn-vm2 kernel: [236596.463257] block drbd0: Restarting drbd0_receiver
Jul 11 10:59:39 wrn-vm2 kernel: [236596.468175] block drbd0: receiver (re)started
Jul 11 10:59:39 wrn-vm2 kernel: [236596.472651] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:39 wrn-vm2 kernel: [236596.492650] block drbd0: conn( WFConnection -> Disconnecting )
Jul 11 10:59:39 wrn-vm2 kernel: [236596.498720] block drbd0: Discarding network configuration.
Jul 11 10:59:39 wrn-vm2 kernel: [236596.504370] block drbd0: Connection closed
Jul 11 10:59:39 wrn-vm2 kernel: [236596.508592] block drbd0: conn( Disconnecting -> StandAlone )
Jul 11 10:59:40 wrn-vm2 kernel: [236596.514520] block drbd0: receiver terminated
Jul 11 10:59:40 wrn-vm2 kernel: [236596.518964] block drbd0: Terminating drbd0_receiver
Jul 11 10:59:40 wrn-vm2 kernel: [236596.670275] block drbd0: conn( StandAlone -> Unconnected )
Jul 11 10:59:40 wrn-vm2 kernel: [236596.681547] block drbd0: Starting receiver thread (from drbd0_worker [2621])
Jul 11 10:59:40 wrn-vm2 kernel: [236596.688793] block drbd0: receiver (re)started
Jul 11 10:59:40 wrn-vm2 kernel: [236596.693275] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:40 wrn-vm2 kernel: [236597.396132] block drbd0: Handshake successful: Agreed network protocol version 96
Jul 11 10:59:40 wrn-vm2 kernel: [236597.403933] block drbd0: Peer authenticated using 16 bytes of 'md5' HMAC
Jul 11 10:59:40 wrn-vm2 kernel: [236597.410769] block drbd0: conn( WFConnection -> WFReportParams )
Jul 11 10:59:40 wrn-vm2 kernel: [236597.416948] block drbd0: Starting asender thread (from drbd0_receiver [6571])
Jul 11 10:59:40 wrn-vm2 kernel: [236597.424351] block drbd0: data-integrity-alg: <not-used>
Jul 11 10:59:40 wrn-vm2 kernel: [236597.429695] block drbd0: drbd_sync_handshake:
Jul 11 10:59:40 wrn-vm2 kernel: [236597.434179] block drbd0: self 22A56126142FEB7A:0000000000000000:0D9EDADA643F4AD5:0D9DDADA643F4AD5 bits:0 flags:0
Jul 11 10:59:40 wrn-vm2 kernel: [236597.444468] block drbd0: peer F346E637932F81FD:22A56126142FEB7B:0D9EDADA643F4AD5:0D9DDADA643F4AD5 bits:0 flags:4
Jul 11 10:59:40 wrn-vm2 kernel: [236597.454756] block drbd0: uuid_compare()=-1 by rule 50
Jul 11 10:59:40 wrn-vm2 kernel: [236597.459912] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
Jul 11 10:59:40 wrn-vm2 kernel: [236597.490569] block drbd0: conn( WFBitMapT -> WFSyncUUID )
Jul 11 10:59:41 wrn-vm2 kernel: [236597.548357] block drbd0: updated sync uuid 22A66126142FEB7A:0000000000000000:0D9EDADA643F4AD5:0D9DDADA643F4AD5
Jul 11 10:59:41 wrn-vm2 kernel: [236597.564690] block drbd0: helper command: /bin/true before-resync-target minor-0
Jul 11 10:59:41 wrn-vm2 kernel: [236597.572540] block drbd0: helper command: /bin/true before-resync-target minor-0 exit code 0 (0x0)
Jul 11 10:59:41 wrn-vm2 kernel: [236597.581548] block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
Jul 11 10:59:41 wrn-vm2 kernel: [236597.590034] block drbd0: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Jul 11 10:59:41 wrn-vm2 kernel: [236597.606526] block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 10:59:41 wrn-vm2 kernel: [236597.613433] block drbd0: updated UUIDs F346E637932F81FC:0000000000000000:22A66126142FEB7A:22A56126142FEB7B
Jul 11 10:59:41 wrn-vm2 kernel: [236597.623208] block drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Jul 11 10:59:41 wrn-vm2 kernel: [236597.639619] block drbd0: helper command: /bin/true after-resync-target minor-0
Jul 11 10:59:41 wrn-vm2 kernel: [236597.647659] block drbd0: helper command: /bin/true after-resync-target minor-0 exit code 0 (0x0)
Jul 11 10:59:41 wrn-vm2 kernel: [236597.657841] block drbd0: bitmap WRITE of 32 pages took 0 jiffies
Jul 11 10:59:41 wrn-vm2 kernel: [236597.673201] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:45 wrn-vm2 kernel: [236601.866594] block drbd0: peer( Primary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Jul 11 10:59:45 wrn-vm2 kernel: [236601.884419] block drbd0: meta connection shut down by peer.
Jul 11 10:59:45 wrn-vm2 kernel: [236601.890120] block drbd0: asender terminated
Jul 11 10:59:45 wrn-vm2 kernel: [236601.894441] block drbd0: Terminating drbd0_asender
Jul 11 10:59:45 wrn-vm2 kernel: [236601.894566] block drbd0: Connection closed
Jul 11 10:59:45 wrn-vm2 kernel: [236601.894575] block drbd0: conn( Disconnecting -> StandAlone )
Jul 11 10:59:45 wrn-vm2 kernel: [236601.894654] block drbd0: receiver terminated
Jul 11 10:59:45 wrn-vm2 kernel: [236601.894658] block drbd0: Terminating drbd0_receiver
Jul 11 10:59:45 wrn-vm2 kernel: [236602.079615] block drbd0: conn( StandAlone -> Unconnected )
Jul 11 10:59:45 wrn-vm2 kernel: [236602.085346] block drbd0: Starting receiver thread (from drbd0_worker [2621])
Jul 11 10:59:45 wrn-vm2 kernel: [236602.092586] block drbd0: receiver (re)started
Jul 11 10:59:45 wrn-vm2 kernel: [236602.097073] block drbd0: conn( Unconnected -> WFConnection )
Jul 11 10:59:46 wrn-vm2 kernel: [236602.799789] block drbd0: Handshake successful: Agreed network protocol version 96
Jul 11 10:59:46 wrn-vm2 kernel: [236602.807584] block drbd0: Peer authenticated using 16 bytes of 'md5' HMAC
Jul 11 10:59:46 wrn-vm2 kernel: [236602.814422] block drbd0: conn( WFConnection -> WFReportParams )
Jul 11 10:59:46 wrn-vm2 kernel: [236602.820613] block drbd0: Starting asender thread (from drbd0_receiver [6617])
Jul 11 10:59:46 wrn-vm2 kernel: [236602.828068] block drbd0: data-integrity-alg: <not-used>
Jul 11 10:59:46 wrn-vm2 kernel: [236602.833423] block drbd0: drbd_sync_handshake:
Jul 11 10:59:46 wrn-vm2 kernel: [236602.837890] block drbd0: self F346E637932F81FC:0000000000000000:22A66126142FEB7A:22A56126142FEB7B bits:0 flags:0
Jul 11 10:59:46 wrn-vm2 kernel: [236602.848187] block drbd0: peer 2D4D56DBB857BD95:F346E637932F81FD:22A66126142FEB7B:22A56126142FEB7B bits:0 flags:4
Jul 11 10:59:46 wrn-vm2 kernel: [236602.858475] block drbd0: uuid_compare()=-1 by rule 50
Jul 11 10:59:46 wrn-vm2 kernel: [236602.863632] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
Jul 11 10:59:46 wrn-vm2 kernel: [236602.893555] block drbd0: conn( WFBitMapT -> WFSyncUUID )
Jul 11 10:59:46 wrn-vm2 kernel: [236602.933004] block drbd0: updated sync uuid F347E637932F81FC:0000000000000000:22A66126142FEB7A:22A56126142FEB7B
Jul 11 10:59:46 wrn-vm2 kernel: [236602.950997] block drbd0: helper command: /bin/true before-resync-target minor-0
Jul 11 10:59:46 wrn-vm2 kernel: [236602.958541] block drbd0: role( Secondary -> Primary )
Jul 11 10:59:46 wrn-vm2 kernel: [236602.964370] block drbd0: helper command: /bin/true before-resync-target minor-0 exit code 0 (0x0)
Jul 11 10:59:46 wrn-vm2 kernel: [236602.973426] block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
Jul 11 10:59:46 wrn-vm2 kernel: [236602.981954] block drbd0: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Jul 11 10:59:46 wrn-vm2 kernel: [236602.989931] block drbd0: unexpected cstate (SyncTarget) in receive_bitmap
Jul 11 10:59:46 wrn-vm2 kernel: [236603.007056] block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 10:59:46 wrn-vm2 kernel: [236603.013955] block drbd0: updated UUIDs 2D4D56DBB857BD95:0000000000000000:F347E637932F81FD:F346E637932F81FD
Jul 11 10:59:46 wrn-vm2 kernel: [236603.023731] block drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Jul 11 10:59:46 wrn-vm2 kernel: [236603.042558] block drbd0: helper command: /bin/true after-resync-target minor-0
Jul 11 10:59:46 wrn-vm2 kernel: [236603.050575] block drbd0: helper command: /bin/true after-resync-target minor-0 exit code 0 (0x0)
Jul 11 10:59:46 wrn-vm2 kernel: [236603.068838] block drbd0: bitmap WRITE of 32 pages took 2 jiffies
Jul 11 10:59:46 wrn-vm2 kernel: [236603.079056] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jul 11 10:59:46 wrn-vm2 kernel: [236603.086402] block drbd0: drbd_sync_handshake:
Jul 11 10:59:46 wrn-vm2 kernel: [236603.090881] block drbd0: self 2D4D56DBB857BD95:0000000000000000:F347E637932F81FD:F346E637932F81FD bits:0 flags:0
Jul 11 10:59:46 wrn-vm2 kernel: [236603.101187] block drbd0: peer 2D4D56DBB857BD95:F347E637932F81FD:F346E637932F81FD:22A66126142FEB7B bits:0 flags:4
Jul 11 10:59:46 wrn-vm2 kernel: [236603.111488] block drbd0: was SyncTarget, peer missed the resync finished event, corrected peer:
Jul 11 10:59:46 wrn-vm2 kernel: [236603.120310] block drbd0: peer 2D4D56DBB857BD95:0000000000000000:F347E637932F81FD:F346E637932F81FD bits:0 flags:4
Jul 11 10:59:46 wrn-vm2 kernel: [236603.130604] block drbd0: uuid_compare()=-1 by rule 35
Jul 11 10:59:46 wrn-vm2 kernel: [236603.135779] block drbd0: I shall become SyncTarget, but I am primary!
Jul 11 10:59:46 wrn-vm2 kernel: [236603.142336] block drbd0: ASSERT( os.conn == C_WF_REPORT_PARAMS ) in /build/linux-s5x2oE/linux-3.2.46/drivers/block/drbd/drbd_receiver.c:3245
Jul 11 10:59:46 wrn-vm2 kernel: [236603.155084] block drbd0: peer( Primary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Jul 11 10:59:46 wrn-vm2 kernel: [236603.165727] block drbd0: error receiving ReportState, l: 4!
Jul 11 10:59:46 wrn-vm2 kernel: [236603.165765] block drbd0: new current UUID 4342617FBA6D19AB:2D4D56DBB857BD95:F347E637932F81FD:F346E637932F81FD
Jul 11 10:59:46 wrn-vm2 kernel: [236603.181488] block drbd0: asender terminated
Jul 11 10:59:46 wrn-vm2 kernel: [236603.185807] block drbd0: Terminating drbd0_asender
Jul 11 10:59:46 wrn-vm2 kernel: [236603.185867] block drbd0: Connection closed
Jul 11 10:59:46 wrn-vm2 kernel: [236603.185872] block drbd0: conn( Disconnecting -> StandAlone )
Jul 11 10:59:46 wrn-vm2 kernel: [236603.185881] block drbd0: receiver terminated
Jul 11 10:59:46 wrn-vm2 kernel: [236603.185882] block drbd0: Terminating drbd0_receiver
Jul 11 11:00:01 wrn-vm2 /USR/SBIN/CRON[6637]: (root) CMD ([ -x /usr/local/sbin/ganeti-watcher ] && /usr/local/sbin/ganeti-watcher)
Jul 11 11:05:01 wrn-vm2 /USR/SBIN/CRON[6686]: (root) CMD ([ -x /usr/local/sbin/ganeti-watcher ] && /usr/local/sbin/ganeti-watcher)

wrn-vm2's clock is about 3 seconds behind wrn-vm1 (sorry, I will fix this)


candlerb

unread,
Jul 11, 2013, 11:19:48 AM7/11/13
to gan...@googlegroups.com
This is looking very much like a race condition.

I attached "strace -v -t -f -p <pid> >node.out 2>&1" to ganeti-noded on both servers, and the migrations ran continuously back and forth for over an hour. But after I killed the strace processes, within 2 minutes the failure had reoccured.

These servers are both new and doing nothing else. There is a 1G switch between the management interfaces, and the storage network is a 10G back-to-back link between two 10G NICs, mtu 9000. Interface error counters are all zero.

root@wrn-vm2:~# ping -f 192.168.8.101
PING 192.168.8.101 (192.168.8.101) 56(84) bytes of data.
^C
--- 192.168.8.101 ping statistics ---
269001 packets transmitted, 269001 received, 0% packet loss, time 14618ms
rtt min/avg/max/mdev = 0.025/0.029/0.158/0.007 ms, ipg/ewma 0.054/0.029 ms

candlerb

unread,
Jul 12, 2013, 5:03:52 AM7/12/13
to gan...@googlegroups.com
Small patch made to show the status after failed sync, submitted at https://code.google.com/p/ganeti/issues/detail?id=525

Now the migration error looks like this:

Fri Jul 12 09:33:23 2013 Migrating instance briantest.int.example.com
Fri Jul 12 09:33:23 2013 * checking disk consistency between source and target
Fri Jul 12 09:33:24 2013 * switching node wrn-vm2.int.example.com to secondary mode
Fri Jul 12 09:33:24 2013 * changing into standalone mode
Fri Jul 12 09:33:24 2013 * changing disks into dual-master mode
Fri Jul 12 09:33:26 2013 * wait until resync is done
Failure: command execution error:
Cannot resync disks on node wrn-vm2.int.example.com: DRBD device <<class 'ganeti.bdev.DRBD8'>: unique_id: ('192.168.8.102', 11002, '192.168.8.101', 11002, 0, '4f63c8749c743aa5db07342d9a14c531af8478af'), children: [<<class 'ganeti.bdev.LogicalVolume'>: unique_id: ('xenvg', '412856f4-ec4f-42e4-b321-72120ec81209.disk0_data'), children: [], 253:5, /dev/xenvg/412856f4-ec4f-42e4-b321-72120ec81209.disk0_data>, <<class 'ganeti.bdev.LogicalVolume'>: unique_id: ('xenvg', '412856f4-ec4f-42e4-b321-72120ec81209.disk0_meta'), children: [], 253:6, /dev/xenvg/412856f4-ec4f-42e4-b321-72120ec81209.disk0_meta>], 147:0, /dev/drbd0> is not in sync: stats=<<class 'ganeti.bdev.DRBD8Status'>: cstatus=StandAlone, lrole=Primary, rrole=Unknown, ldisk=UpToDate, rdisk=DUnknown>

I was able to get this error to occur with strace attached, by limiting strace to just "-e trace=execve", i.e.

strace -e trace=execve -tt -v -f -p <pid> >execve.log 2>&1

The logs are shown below for the last three migrations: successful migrations started at 09:32:47 and 09:33:04, and the failed migration started at 09:33:22. I have merged them so you can see how the activities on the two machines correlated - they are now both ntp-synced.

Any takers for what's happening?

09:32:48.056608 [vm2] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:32:48.521946 [vm1] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:32:48.532583 [vm2] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:32:48.686284 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:48.781334 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:48.940586 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:49.099578 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:49.372258 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:49.463028 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:49.551932 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "secondary"]
09:32:49.679846 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:49.681164 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:49.768090 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:49.779703 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:49.858771 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "disconnect"]
09:32:49.881660 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "disconnect"]
09:32:50.025287 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:50.029986 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:50.135814 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:50.139538 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:50.230205 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "syncer", "-r", "61440", "--create-device"]
09:32:50.237590 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "syncer", "-r", "61440", "--create-device"]
09:32:50.255106 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "net", "ipv4:192.168.8.101:11002", "ipv4:192.168.8.102:11002", "C", "-A", "discard-zero-changes", "-B", "consensus", "--create-device", "-m", "-a", "md5", "-x", "4f63c8749c743aa5db07342d9a14c531"...]
09:32:50.262141 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "net", "ipv4:192.168.8.102:11002", "ipv4:192.168.8.101:11002", "C", "-A", "discard-zero-changes", "-B", "consensus", "--create-device", "-m", "-a", "md5", "-x", "4f63c8749c743aa5db07342d9a14c531"...]
09:32:50.284461 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "show"]
09:32:50.291940 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "show"]
09:32:51.177573 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "primary"]
09:32:51.184845 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "primary"]
09:32:51.579769 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:51.584927 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:51.687877 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:51.691878 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:51.896459 [vm2] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:32:51.950310 [vm2] execve("/usr/local/lib/ganeti/kvm-ifup", ["/usr/local/lib/ganeti/kvm-ifup", "tap0"]
09:32:51.959335 [vm2] execve("/sbin/ip", ["ip", "link", "show", "tap0"]
09:32:51.959881 [vm2] execve("/usr/bin/awk", ["awk", "{if ($1 == \"link/ether\") printf("...]
09:32:51.964488 [vm2] execve("/sbin/ip", ["ip", "link", "set", "tap0", "address", "fe:3b:c3:e7:1d:06"]
09:32:51.968689 [vm2] execve("/sbin/ip", ["ip", "link", "set", "tap0", "up"]
09:32:51.974503 [vm2] execve("/sbin/ip", ["ip", "link", "set", "tap0", "mtu", "1500"]
09:32:51.977960 [vm2] execve("/sbin/brctl", ["brctl", "addif", "mgmt-br0", "tap0"]
09:32:52.110289 [vm2] execve("/usr/bin/kvm", ["/usr/bin/kvm", "-name", "briantest.int.example.com", "-m", "128", "-smp", "1", "-pidfile", "/var/run/ganeti/kvm-hypervisor/p"..., "-balloon", "virtio", "-daemonize", "-drive", "file=/var/run/ganeti/instance-di"..., "-kernel", "/boot/vmlinuz-3.2.0-4-amd64", "-initrd", "/boot/initrd.img-3.2.0-4-amd64", "-append", "root=/dev/vda1 ro console=ttyS0,"..., "-monitor", "unix:/var/run/ganeti/kvm-hypervi"..., "-serial", "unix:/var/run/ganeti/kvm-hypervi"..., "-usbdevice", "tablet", "-vnc", ":5100", "-netdev", "type=tap,id=netdev0,fd=8", "-device", "virtio-net-pci,mac=aa:00:00:21:c"..., "-incoming", "tcp:192.168.8.102:8102", "-qmp", "unix:/var/run/ganeti/kvm-hypervi"...]
09:32:52.347111 [vm2] execve("/bin/sh", ["/bin/sh", "-c", "echo 'info cpus' | /usr/bin/soca"...]
09:32:52.350296 [vm2] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:32:52.350528 [vm2] <... execve resumed> ) = 0
09:32:52.963404 [vm2] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:32:53.116883 [vm1] execve("/bin/sh", ["/bin/sh", "-c", "echo 'migrate_set_speed 32m' | /"...]
09:32:53.119778 [vm1] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:32:53.119987 [vm1] <... execve resumed> ) = 0
09:32:53.651306 [vm1] execve("/bin/sh", ["/bin/sh", "-c", "echo 'migrate_set_downtime 30ms'"...]
09:32:53.653697 [vm1] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:32:53.653904 [vm1] <... execve resumed> ) = 0
09:32:54.185065 [vm1] execve("/bin/sh", ["/bin/sh", "-c", "echo 'migrate -d tcp:192.168.8.1"...]
09:32:54.187396 [vm1] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:32:54.187624 [vm1] <... execve resumed> ) = 0
09:32:54.829748 [vm1] execve("/bin/sh", ["/bin/sh", "-c", "echo 'info migrate' | /usr/bin/s"...]
09:32:54.832297 [vm1] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:32:56.435344 [vm1] execve("/bin/sh", ["/bin/sh", "-c", "echo 'info migrate' | /usr/bin/s"...]
09:32:56.437802 [vm1] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:32:56.438012 [vm1] <... execve resumed> ) = 0
09:32:58.044140 [vm1] execve("/bin/sh", ["/bin/sh", "-c", "echo 'info migrate' | /usr/bin/s"...]
09:32:58.055719 [vm1] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:32:58.055970 [vm1] <... execve resumed> ) = 0
09:32:59.687622 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:59.791358 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:32:59.884917 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "secondary"]
09:33:00.039259 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.054437 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.135134 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.155997 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.372550 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.385966 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.460792 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.481650 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.572853 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "disconnect"]
09:33:00.575710 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "disconnect"]
09:33:00.770367 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.784610 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.882283 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:00.975216 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "syncer", "-r", "61440", "--create-device"]
09:33:01.000696 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "net", "ipv4:192.168.8.102:11002", "ipv4:192.168.8.101:11002", "C", "-A", "discard-zero-changes", "-B", "consensus", "--create-device", "-a", "md5", "-x", "4f63c8749c743aa5db07342d9a14c531"...]
09:33:01.030325 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "show"]
09:33:01.042487 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:01.137131 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "syncer", "-r", "61440", "--create-device"]
09:33:01.160639 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "net", "ipv4:192.168.8.101:11002", "ipv4:192.168.8.102:11002", "C", "-A", "discard-zero-changes", "-B", "consensus", "--create-device", "-a", "md5", "-x", "4f63c8749c743aa5db07342d9a14c531"...]
09:33:01.190834 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "show"]
09:33:02.193756 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:02.210622 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:02.309592 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:02.412262 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:05.618867 [vm1] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:33:06.071925 [vm2] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:33:06.082540 [vm1] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:33:06.233538 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:06.333905 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:06.502166 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:06.657548 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:06.918893 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:07.018840 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:07.118838 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "secondary"]
09:33:07.245373 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:07.251213 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:07.351738 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:07.374835 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:07.443832 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "disconnect"]
09:33:07.465943 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "disconnect"]
09:33:07.631000 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:07.633793 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:07.745302 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:07.749066 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:07.841025 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "syncer", "-r", "61440", "--create-device"]
09:33:07.846028 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "syncer", "-r", "61440", "--create-device"]
09:33:07.865605 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "net", "ipv4:192.168.8.102:11002", "ipv4:192.168.8.101:11002", "C", "-A", "discard-zero-changes", "-B", "consensus", "--create-device", "-m", "-a", "md5", "-x", "4f63c8749c743aa5db07342d9a14c531"...]
09:33:07.869866 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "net", "ipv4:192.168.8.101:11002", "ipv4:192.168.8.102:11002", "C", "-A", "discard-zero-changes", "-B", "consensus", "--create-device", "-m", "-a", "md5", "-x", "4f63c8749c743aa5db07342d9a14c531"...]
09:33:07.896069 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "show"]
09:33:07.899427 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "show"]
09:33:08.789040 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "primary"]
09:33:08.790731 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "primary"]
09:33:09.154611 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:09.170216 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:09.248671 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:09.280124 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:09.490899 [vm1] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:33:09.544779 [vm1] execve("/usr/local/lib/ganeti/kvm-ifup", ["/usr/local/lib/ganeti/kvm-ifup", "tap0"]
09:33:09.553803 [vm1] execve("/sbin/ip", ["ip", "link", "show", "tap0"]
09:33:09.554046 [vm1] execve("/usr/bin/awk", ["awk", "{if ($1 == \"link/ether\") printf("...]
09:33:09.554122 [vm1] <... execve resumed> ) = 0
09:33:09.554277 [vm1] <... execve resumed> ) = 0
09:33:09.558646 [vm1] execve("/sbin/ip", ["ip", "link", "set", "tap0", "address", "fe:31:db:79:7b:f1"]
09:33:09.562969 [vm1] execve("/sbin/ip", ["ip", "link", "set", "tap0", "up"]
09:33:09.568700 [vm1] execve("/sbin/ip", ["ip", "link", "set", "tap0", "mtu", "1500"]
09:33:09.572045 [vm1] execve("/sbin/brctl", ["brctl", "addif", "mgmt-br0", "tap0"]
09:33:09.703122 [vm1] execve("/usr/bin/kvm", ["/usr/bin/kvm", "-name", "briantest.int.example.com", "-m", "128", "-smp", "1", "-pidfile", "/var/run/ganeti/kvm-hypervisor/p"..., "-balloon", "virtio", "-daemonize", "-drive", "file=/var/run/ganeti/instance-di"..., "-kernel", "/boot/vmlinuz-3.2.0-4-amd64", "-initrd", "/boot/initrd.img-3.2.0-4-amd64", "-append", "root=/dev/vda1 ro console=ttyS0,"..., "-monitor", "unix:/var/run/ganeti/kvm-hypervi"..., "-serial", "unix:/var/run/ganeti/kvm-hypervi"..., "-usbdevice", "tablet", "-vnc", ":5100", "-netdev", "type=tap,id=netdev0,fd=8", "-device", "virtio-net-pci,mac=aa:00:00:21:c"..., "-incoming", "tcp:192.168.8.101:8102", "-qmp", "unix:/var/run/ganeti/kvm-hypervi"...]
09:33:10.037627 [vm1] execve("/bin/sh", ["/bin/sh", "-c", "echo 'info cpus' | /usr/bin/soca"...]
09:33:10.039868 [vm1] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:33:10.040077 [vm1] <... execve resumed> ) = 0
09:33:10.641866 [vm1] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:33:10.810248 [vm2] execve("/bin/sh", ["/bin/sh", "-c", "echo 'migrate_set_speed 32m' | /"...]
09:33:10.813352 [vm2] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:33:10.813574 [vm2] <... execve resumed> ) = 0
09:33:11.343810 [vm2] execve("/bin/sh", ["/bin/sh", "-c", "echo 'migrate_set_downtime 30ms'"...]
09:33:11.346126 [vm2] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:33:11.878433 [vm2] execve("/bin/sh", ["/bin/sh", "-c", "echo 'migrate -d tcp:192.168.8.1"...]
09:33:11.880788 [vm2] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:33:11.880983 [vm2] <... execve resumed> ) = 0
09:33:12.524433 [vm2] execve("/bin/sh", ["/bin/sh", "-c", "echo 'info migrate' | /usr/bin/s"...]
09:33:12.526819 [vm2] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:33:14.131899 [vm2] execve("/bin/sh", ["/bin/sh", "-c", "echo 'info migrate' | /usr/bin/s"...]
09:33:14.134911 [vm2] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:33:14.135106 [vm2] <... execve resumed> ) = 0
09:33:15.746841 [vm2] execve("/bin/sh", ["/bin/sh", "-c", "echo 'info migrate' | /usr/bin/s"...]
09:33:15.749399 [vm2] execve("/usr/bin/socat", ["/usr/bin/socat", "STDIO", "UNIX-CONNECT:/var/run/ganeti/kvm"...]
09:33:15.749675 [vm2] <... execve resumed> ) = 0
09:33:17.351033 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:17.445710 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:17.537360 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "secondary"]
09:33:17.729187 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:17.735442 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:17.824880 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:17.846654 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:18.060267 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:18.062689 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:18.151347 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:18.240340 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "disconnect"]
09:33:18.365502 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:18.476382 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "disconnect"]
09:33:18.637652 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:18.637661 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:18.726426 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:18.748102 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:18.827480 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "syncer", "-r", "61440", "--create-device"]
09:33:18.848930 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "syncer", "-r", "61440", "--create-device"]
09:33:18.850927 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "net", "ipv4:192.168.8.101:11002", "ipv4:192.168.8.102:11002", "C", "-A", "discard-zero-changes", "-B", "consensus", "--create-device", "-a", "md5", "-x", "4f63c8749c743aa5db07342d9a14c531"...]
09:33:18.874891 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "net", "ipv4:192.168.8.102:11002", "ipv4:192.168.8.101:11002", "C", "-A", "discard-zero-changes", "-B", "consensus", "--create-device", "-a", "md5", "-x", "4f63c8749c743aa5db07342d9a14c531"...]
09:33:18.880361 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "show"]
09:33:18.904489 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "show"]
09:33:19.924359 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:19.936098 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:20.028602 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:20.044156 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:23.000220 [vm2] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:33:23.464968 [vm2] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:33:23.466536 [vm1] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:33:23.617502 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:23.711247 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:23.879848 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:24.043287 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:24.312280 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:24.403153 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:24.493854 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "secondary"]
09:33:24.623350 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:24.627684 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:24.711436 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:24.725329 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:24.804341 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "disconnect"]
09:33:24.825650 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "disconnect"]
09:33:24.966856 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:24.974045 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:25.081733 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:25.097482 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:25.177193 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "syncer", "-r", "61440", "--create-device"]
09:33:25.200978 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "net", "ipv4:192.168.8.101:11002", "ipv4:192.168.8.102:11002", "C", "-A", "discard-zero-changes", "-B", "consensus", "--create-device", "-m", "-a", "md5", "-x", "4f63c8749c743aa5db07342d9a14c531"...]
09:33:25.231012 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "show"]
09:33:25.250479 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "syncer", "-r", "61440", "--create-device"]
09:33:25.274626 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "net", "ipv4:192.168.8.102:11002", "ipv4:192.168.8.101:11002", "C", "-A", "discard-zero-changes", "-B", "consensus", "--create-device", "-m", "-a", "md5", "-x", "4f63c8749c743aa5db07342d9a14c531"...]
09:33:25.304177 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "show"]
09:33:26.121748 [vm1] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "primary"]
09:33:26.200937 [vm2] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "primary"]
09:33:26.568007 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:26.590973 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:26.676211 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:33:26.821862 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--separator=,", "--units=m", "--nosuffix", "-olv_attr,lv_kernel_major,lv_ker"..., "/dev/xenvg/412856f4-ec4f-42e4-b3"...]
09:35:02.474637 [vm2] execve("/sbin/vgs", ["vgs", "--noheadings", "--nosuffix", "--units=m", "--unbuffered", "--separator=|", "-ovg_name,vg_free,vg_attr,vg_siz"...]
09:35:02.486002 [vm1] execve("/sbin/vgs", ["vgs", "--noheadings", "--nosuffix", "--units=m", "--unbuffered", "--separator=|", "-ovg_name,vg_free,vg_attr,vg_siz"...]
09:35:02.663910 [vm2] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:35:02.750273 [vm1] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:35:03.700100 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--units=m", "--nosuffix", "--separator=|", "-ovg_name,lv_name,lv_size,lv_att"...]
09:35:03.711416 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--units=m", "--nosuffix", "--separator=|", "-ovg_name,lv_name,lv_size,lv_att"...]
09:40:02.592194 [vm1] execve("/sbin/vgs", ["vgs", "--noheadings", "--nosuffix", "--units=m", "--unbuffered", "--separator=|", "-ovg_name,vg_free,vg_attr,vg_siz"...]
09:40:02.604413 [vm2] execve("/sbin/vgs", ["vgs", "--noheadings", "--nosuffix", "--units=m", "--unbuffered", "--separator=|", "-ovg_name,vg_free,vg_attr,vg_siz"...]
09:40:02.706357 [vm1] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:40:02.747726 [vm2] execve("/usr/bin/kvm", ["/usr/bin/kvm", "--help"]
09:40:03.699732 [vm1] execve("/sbin/lvs", ["lvs", "--noheadings", "--units=m", "--nosuffix", "--separator=|", "-ovg_name,lv_name,lv_size,lv_att"...]
09:40:03.707604 [vm2] execve("/sbin/lvs", ["lvs", "--noheadings", "--units=m", "--nosuffix", "--separator=|", "-ovg_name,lv_name,lv_size,lv_att"...]


Here's the drbd status from both sides:

root@wrn-vm1:~# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:4195268 dr:53616 al:5 bm:256 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Unconfigured
 2: cs:Unconfigured

root@wrn-vm2:~# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:964 dr:4279608 al:14 bm:256 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Unconfigured
 2: cs:Unconfigured

Finally, a gnt-cluster verify:

root@wrn-vm1:~# gnt-cluster verify
Submitted jobs 3026, 3027
Waiting for job 3026 ...
Fri Jul 12 10:01:28 2013 * Verifying cluster config
Fri Jul 12 10:01:28 2013 * Verifying cluster certificate files
Fri Jul 12 10:01:28 2013 * Verifying hypervisor parameters
Fri Jul 12 10:01:28 2013 * Verifying all nodes belong to an existing group
Waiting for job 3027 ...
Fri Jul 12 10:01:28 2013 * Verifying group 'default'
Fri Jul 12 10:01:28 2013 * Gathering data (2 nodes)
Fri Jul 12 10:01:29 2013 * Gathering disk information (2 nodes)
Fri Jul 12 10:01:29 2013 * Verifying configuration file consistency
Fri Jul 12 10:01:29 2013 * Verifying node status
Fri Jul 12 10:01:29 2013 * Verifying instance status
Fri Jul 12 10:01:29 2013 * Verifying orphan volumes
Fri Jul 12 10:01:29 2013   - ERROR: node wrn-vm2.int.example.com: volume xenvg/home is unknown
Fri Jul 12 10:01:29 2013   - ERROR: node wrn-vm2.int.example.com: volume xenvg/usr is unknown
Fri Jul 12 10:01:29 2013   - ERROR: node wrn-vm2.int.example.com: volume xenvg/var is unknown
Fri Jul 12 10:01:29 2013   - ERROR: node wrn-vm2.int.example.com: volume xenvg/swap is unknown
Fri Jul 12 10:01:29 2013   - ERROR: node wrn-vm2.int.example.com: volume xenvg/root_fs is unknown
Fri Jul 12 10:01:29 2013   - ERROR: node wrn-vm1.int.example.com: volume xenvg/home is unknown
Fri Jul 12 10:01:29 2013   - ERROR: node wrn-vm1.int.example.com: volume xenvg/usr is unknown
Fri Jul 12 10:01:29 2013   - ERROR: node wrn-vm1.int.example.com: volume xenvg/var is unknown
Fri Jul 12 10:01:29 2013   - ERROR: node wrn-vm1.int.example.com: volume xenvg/swap is unknown
Fri Jul 12 10:01:29 2013   - ERROR: node wrn-vm1.int.example.com: volume xenvg/root_fs is unknown
Fri Jul 12 10:01:29 2013 * Verifying N+1 Memory redundancy
Fri Jul 12 10:01:29 2013 * Other Notes
Fri Jul 12 10:01:29 2013 * Hooks Results
root@wrn-vm1:~#

(As you can see I've used some logical volumes from xenvg for the host system's purposes, but I don't see the harm in that)

tschend

unread,
Jul 12, 2013, 6:04:03 AM7/12/13
to gan...@googlegroups.com
Hi,

we are running almost the same setup.

Wheezy + stock kernel (3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux) and stock drbd (2:8.3.13-2).

The only difference maybe is that we use openvswitch and the snf-ganeti because we use ganeti as a cloud backend.

I am running the live migration of a windows 2008 r2 box for about 10 minutes now and it works flawlessly.
Will keeo this running für about 2 hours and report back.

Regards
Thomas

tschend

unread,
Jul 12, 2013, 9:03:07 AM7/12/13
to gan...@googlegroups.com
Just a little update.

Worked like a charm for about 2 hours. No crash.

We run a dedicated vlan for the drbd/migration traffic.

Hope that helps.

Regards
Thomas

candlerb

unread,
Jul 12, 2013, 11:18:23 AM7/12/13
to gan...@googlegroups.com


On Friday, 12 July 2013 14:03:07 UTC+1, tschend wrote:

Worked like a charm for about 2 hours. No crash.

Thanks for trying it. It seems to be on a knife-edge, because if I add a full strace then it doesn't fall over for me either, so it's very timing-sensitive.

One option I might try is to insert sleep(0.5) at various points in the code, and see where makes a difference.
 
We run a dedicated vlan for the drbd/migration traffic.

I use a dedicated 10G link for that.

Regards,

Brian.

candlerb

unread,
Jul 14, 2013, 5:20:54 AM7/14/13
to gan...@googlegroups.com
This is looking more and more like some sort of race.

WIth the following patch, the migrations ran back and forth for 18 hours:

--- backend.py.orig 2013-06-25 15:38:16.700108568 +0100
+++ backend.py 2013-07-13 07:53:19.105850717 +0100
@@ -3414,6 +3414,7 @@
   for rd in bdevs:
     try:
       rd.AttachNet(multimaster)
+      time.sleep(0.5)
     except errors.BlockDeviceError, err:
       _Fail("Can't change network configuration: %s", err)

Then one failed with a different error:

Sun Jul 14 01:54:10 2013 Migrating instance briantest.int.example.com
Sun Jul 14 01:54:10 2013 * checking disk consistency between source and target
Sun Jul 14 01:54:10 2013 * switching node wrn-vm1.int.example.com to secondary mode
Sun Jul 14 01:54:10 2013 * changing into standalone mode
Sun Jul 14 01:54:11 2013 * changing disks into dual-master mode
Failure: command execution error:
Cannot change disks config on node wrn-vm1.int.example.com: Can't change to primary mode: drbd0: can't make drbd device primary: /dev/drbd0: State change failed: (-2) Need access to UpToDate data

This appears to be an error message from drbd itself, e.g. see http://john.wesorick.com/2012/02/error-when-promoting-drbd-primary.html

However the drbd status looks OK to me:

root@wrn-vm1:~# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:6080 dw:4202356 dr:2546952 al:31 bm:273 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Unconfigured
 2: cs:Unconfigured
root@wrn-vm2:~# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:6080 nr:0 dw:8052 dr:6802788 al:91 bm:271 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Unconfigured
 2: cs:Unconfigured

And "gnt-instance migrate --cleanup" was able to sort things out.

With the following variation, which moves the sleep slightly, it ran for nearly 2 hours and gave the same error ("Need access to UpToDate data")

--- backend.py.orig 2013-06-25 15:38:16.700108568 +0100
+++ backend.py 2013-07-14 10:07:22.433363561 +0100
@@ -3416,6 +3416,7 @@
       rd.AttachNet(multimaster)
     except errors.BlockDeviceError, err:
       _Fail("Can't change network configuration: %s", err)
+  time.sleep(0.5)

   # wait until the disks are connected; we need to retry the re-attach
   # if the device becomes standalone, as this might happen if the one

However if I run with the following patch instead, which puts the sleep earlier after the disconnects, the migrations fall over after a few minutes in the original way ("is not in sync: stats=<<class 'ganeti.bdev.DRBD8Status'>: cstatus=StandAlone, lrole=Primary, rrole=Unknown, ldisk=UpToDate, rdisk=DUnknown>") which requires a full switch to -t plain and back to -t drbd to fix.

--- backend.py.orig 2013-06-25 15:38:16.700108568 +0100
+++ backend.py 2013-07-14 10:00:40.604378983 +0100
@@ -3392,6 +3392,7 @@
   for rd in bdevs:
     try:
       rd.DisconnectNet()
+      time.sleep(0.5)
     except errors.BlockDeviceError, err:
       _Fail("Can't change network configuration to standalone mode: %s",
             err, exc=True)


candlerb

unread,
Jul 16, 2013, 9:01:06 AM7/16/13
to gan...@googlegroups.com
Some feedback from the drbd-user list:

http://lists.linbit.com/pipermail/drbd-user/2013-July/020166.html

"> Jul 11 10:59:46 wrn-vm2 kernel: [236603.135779] block drbd0: I shall
> become SyncTarget, but I am primary!
The message above is MUCH more frightening than the line below.

Apparently a node was promoted right in the middle of a resync
handshake, and did not like that at all.
...
In that backend script,

add a loop before the promote,
that checks that the connection state really is "Connected",
and the disk state really is "UpToDate"."

kargig...@gmail.com

unread,
Nov 4, 2013, 5:43:06 AM11/4/13
to gan...@googlegroups.com, ganeti...@googlegroups.com
Hello all,

it seems that we are facing the same problem on our ganeti clusters that have been upgraded from squeeze to wheezy. The live migration failure ratio is huge, 1 out of 8-10 migrations fail. When these nodes where running squeeze there were absolutely no such issues.
For comparison purposes here are the package details of our clusters:

Squeeze:
drbd8-utils:
  Installed: 2:8.3.7-2.1
ganeti2:
  Installed: 2.4.2+ippool5-1
linux-image-2.6.32-5-amd64:
  Installed: 2.6.32-48squeeze3

Wheezy:
drbd8-utils:
  Installed: 2:8.3.13-2
ganeti2:
  Installed: 2.8.1-1~bpo70+httpboot
linux-image-3.2.0-4-amd64:
  Installed: 3.2.51-1

I am attaching you the relevant files of two failed migrations. To replicate the issue, just migrate a VM in while true; loop...and wait for a little bit.
We have experienced the same issue with ganeti 2.6, 2.7 and 2.8, but also with squeeze and kernel 3.2 from backports.
The common denominator in all problematic situations seems to be kernel 3.2 but maybe there's a way to overcome this issue in ganeti itself.

Can someone with better insight of drbd/ganeti/kernel take a look at the proposed "option a" fix from:
http://lists.linbit.com/pipermail/drbd-user/2013-July/020173.html would that work?

We would really like some help from the ganeti-devel team to solve this grave issue. We are available for any testing that might be needed.

Regards,
George Kargiotakis
migration-fail-gnt7-05.txt
migration2-fail-ganeti.txt
migration-fail-ganeti.txt
migration2-fail-gnt7-05.txt
migration2-fail-gnt7-01.txt
migration-fail-gnt7-01.txt

candlerb

unread,
Nov 5, 2013, 6:43:58 AM11/5/13
to gan...@googlegroups.com, ganeti...@googlegroups.com
After upgrading ganeti from 2.6.2 to 2.8.1, which got rid of my "sleep" patches, the very first live migrate I did suffered the same problem:

root@wrn-vm1:~# gnt-instance migrate testvm.int.example.com
Instance testvm.int.example.com will be migrated. Note that
migration might impact the instance if anything goes wrong (e.g. due
to bugs in the hypervisor). Continue?
y/[n]/?: y
Tue Nov  5 11:08:30 2013 Migrating instance testvm.int.example.com
Tue Nov  5 11:08:30 2013 * checking disk consistency between source and target
Tue Nov  5 11:08:31 2013 * switching node wrn-vm2.int.example.com to secondary mode
Tue Nov  5 11:08:31 2013 * changing into standalone mode
Tue Nov  5 11:08:31 2013 * changing disks into dual-master mode
Tue Nov  5 11:08:32 2013 * wait until resync is done
Failure: command execution error:
Cannot resync disks on node wrn-vm2.int.example.com: DRBD device <<class 'ganeti.bdev.DRBD8'>: unique_id: ('192.168.8.102', 11001, '192.168.8.101', 11001, 0, 'c3e57192c295a443428f68ec8f9ff141f692f78f'), children: [<<class 'ganeti.bdev.LogicalVolume'>: unique_id: ('xenvg', '10e24636-0481-4ded-b1d7-b9f97b2919d2.disk0_data'), children: [], 253:7, /dev/xenvg/10e24636-0481-4ded-b1d7-b9f97b2919d2.disk0_data>, <<class 'ganeti.bdev.LogicalVolume'>: unique_id: ('xenvg', '10e24636-0481-4ded-b1d7-b9f97b2919d2.disk0_meta'), children: [], 253:8, /dev/xenvg/10e24636-0481-4ded-b1d7-b9f97b2919d2.disk0_meta>], 147:0, /dev/drbd0> is not in sync: stats=<ganeti.bdev.DRBD8Status object at 0x23d6c10>

As a reminder, the underlying platform is Debian Wheezy on Dell R210-II with 10G back-to-back for the replication network.

I would say this is a very high priority issue, given that Ganeti is apparently driving DRBD wrongly and it can cause these failures on more installations than just mine. The detailled post at http://lists.linbit.com/pipermail/drbd-user/2013-July/020173.html has all the details. Personally I'd say that Ganeti should be using the drbd "dual master" mode (which means both sides *may* be promoted to master, but don't *have* to be), but fixing the current disconnect/reconnect approach is also described in that post.

Anyway, I have put back my bandaid and it's OK for now:

--- /usr/local/lib/python2.7/dist-packages/ganeti/backend.py.orig 2013-11-05 10:58:42.161871214 +0000
+++ /usr/local/lib/python2.7/dist-packages/ganeti/backend.py 2013-11-05 11:18:37.614989450 +0000
@@ -3607,6 +3607,7 @@

Apollon Oikonomopoulos

unread,
Nov 5, 2013, 9:24:15 AM11/5/13
to gan...@googlegroups.com, ganeti...@googlegroups.com
Hi,

On 03:43 Tue 05 Nov , candlerb wrote:
> After upgrading ganeti from 2.6.2 to 2.8.1, which got rid of my "sleep"
> patches, the very first live migrate I did suffered the same problem:

We managed to find the root cause of this and we are now testing it
thoroughly on a cluster. There is a race in the way dual-primary DRBD is
set up.

If it still hasn't crashed in a couple of hours from now (it was
reproducible in 5-10' before that), I'll submit the patch.

Regards,
Apollon

Apollon Oikonomopoulos

unread,
Nov 5, 2013, 9:27:48 AM11/5/13
to gan...@googlegroups.com, ganeti...@googlegroups.com
On 16:24 Tue 05 Nov , Apollon Oikonomopoulos wrote:
> If it still hasn't crashed in a couple of hours from now (it was
> reproducible in 5-10' before that), I'll submit the patch.
>
> Regards,
> Apollon

On second thoughts, I'll post it now so you can give it a try as well if
you like. Two affected systems are better than one :-).

Apollon Oikonomopoulos

unread,
Nov 5, 2013, 6:42:37 AM11/5/13
to kargig...@gmail.com, gan...@googlegroups.com, ganeti...@googlegroups.com
Ok, it looks like we got it! Basically the thread above is correct, an
analysis and patch will follow.

Regards,
Apollon

candlerb

unread,
Nov 14, 2013, 10:19:30 AM11/14/13
to gan...@googlegroups.com, kargig...@gmail.com, ganeti...@googlegroups.com
I've upgraded to 2.8.2 and ran migrations for an hour with no problems.

Many thanks!

Reply all
Reply to author
Forward
0 new messages