How to resync a split-brain drbd volume?


candlerb

Nov 7, 2016, 1:05:21 PM
to ganeti
I have a two-node ganeti-2.11 cluster where one of the drbd instances is in a bad state:

# drbd-overview
  0:??not-found??  Connected  Primary/Secondary UpToDate/UpToDate C      r-----
  1:??not-found??  Connected  Primary/Secondary UpToDate/UpToDate C      r-----
  2:??not-found??  Connected  Primary/Secondary UpToDate/UpToDate C      r-----
  4:??not-found??  Connected  Primary/Secondary UpToDate/UpToDate C      r-----
  5:??not-found??  Connected  Primary/Secondary UpToDate/UpToDate C      r-----
  6:??not-found??  Connected  Primary/Secondary UpToDate/UpToDate C      r-----
  7:??not-found??  Connected  Primary/Secondary UpToDate/UpToDate C      r-----
 10:??not-found??  StandAlone Primary/Unknown   UpToDate/DUnknown r-----
 11:??not-found??  Connected  Primary/Secondary UpToDate/UpToDate C      r-----

The other side shows:

...
10:??not-found??  StandAlone Secondary/Unknown UpToDate/DUnknown
...
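
With many minors listed, the disconnected one can be picked out mechanically. A minimal sketch, assuming the column layout shown above (first field `<minor>:<resource>`, second field the connection state); `standalone_minors` is a made-up helper name, not part of DRBD or Ganeti:

```shell
# List DRBD minors whose connection state is StandAlone.
# Reads drbd-overview-style text on stdin.
standalone_minors() {
    awk '$2 == "StandAlone" { split($1, parts, ":"); print parts[1] }'
}

# Sample lines copied from the output above.
sample=' 10:??not-found??  StandAlone Primary/Unknown   UpToDate/DUnknown r-----
 11:??not-found??  Connected  Primary/Secondary UpToDate/UpToDate C      r-----'

printf '%s\n' "$sample" | standalone_minors
```

On a live node the same filter could be fed from `drbd-overview` directly instead of the sample text.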

I have found the instance which uses /dev/drbd10. gnt-instance info shows:

  Disk template: drbd
  Disks:
    - disk/0: drbd, size 20.0G
      access mode: rw
      nodeA: wrn-vm1.int.example.net, minor=10
      nodeB: wrn-vm2.int.example.net, minor=10
      port: 11026
      auth key: 416bdb66767e7664133766f7e2b41d55af1ecdb0
      on primary: /dev/drbd10 (147:10) in sync, status *DEGRADED*
      on secondary: /dev/drbd10 (147:10) in sync, status *DEGRADED*
      name: None
      UUID: be8c8774-6727-4c82-b83a-ff51e5aca425
      child devices:
        - child 0: plain, size 20.0G
          logical_id: xenvg/f50973f4-e40f-46fb-bee8-831cecaf051c.disk0_data
          on primary: /dev/xenvg/f50973f4-e40f-46fb-bee8-831cecaf051c.disk0_data (253:25)
          on secondary: /dev/xenvg/f50973f4-e40f-46fb-bee8-831cecaf051c.disk0_data (253:23)
          name: None
          UUID: 53b1b9f3-5d33-449f-b6c1-59a95d861379
        - child 1: plain, size 128M
          logical_id: xenvg/f50973f4-e40f-46fb-bee8-831cecaf051c.disk0_meta
          on primary: /dev/xenvg/f50973f4-e40f-46fb-bee8-831cecaf051c.disk0_meta (253:26)
          on secondary: /dev/xenvg/f50973f4-e40f-46fb-bee8-831cecaf051c.disk0_meta (253:24)
          name: None
          UUID: 02330058-838a-4ae3-ba6f-6085aa00bdf5

(Aside: I don't understand how it can be "in sync" and yet "degraded" at the same time...)

dmesg | grep drbd10 shows:

[9623265.691560] block drbd10: conn( StandAlone -> Unconnected )
[9623265.697449] block drbd10: Starting receiver thread (from drbd10_worker [4780])
[9623265.705032] block drbd10: receiver (re)started
[9623265.709782] block drbd10: conn( Unconnected -> WFConnection )
[9623266.214534] block drbd10: Handshake successful: Agreed network protocol version 96
[9623266.229363] block drbd10: Peer authenticated using 16 bytes of 'md5' HMAC
[9623266.236380] block drbd10: conn( WFConnection -> WFReportParams )
[9623266.242686] block drbd10: Starting asender thread (from drbd10_receiver [20302])
[9623266.250333] block drbd10: data-integrity-alg: <not-used>
[9623266.255853] block drbd10: drbd_sync_handshake:
[9623266.260498] block drbd10: self B5BE145BFFB86D7B:D66CEA14B365CF09:AA37400902A6071A:AA36400902A6071B bits:40099 flags:0
[9623266.271347] block drbd10: peer 6CD10A4C30600E74:D66CEA14B365CF09:AA37400902A6071B:AA36400902A6071B bits:11 flags:0
[9623266.281898] block drbd10: uuid_compare()=100 by rule 90
[9623266.287320] block drbd10: helper command: /bin/true initial-split-brain minor-10
[9623266.295266] block drbd10: helper command: /bin/true initial-split-brain minor-10 exit code 0 (0x0)
[9623266.304451] block drbd10: Split-Brain detected but unresolved, dropping connection!
[9623266.312319] block drbd10: helper command: /bin/true split-brain minor-10
[9623266.319519] block drbd10: helper command: /bin/true split-brain minor-10 exit code 0 (0x0)
[9623266.328089] block drbd10: conn( WFReportParams -> Disconnecting )
[9623266.334564] block drbd10: error receiving ReportState, l: 4!
[9623266.340521] block drbd10: asender terminated
[9623266.344991] block drbd10: Terminating drbd10_asender
[9623266.345037] block drbd10: Connection closed
[9623266.345043] block drbd10: conn( Disconnecting -> StandAlone )
[9623266.345055] block drbd10: receiver terminated
[9623266.345056] block drbd10: Terminating drbd10_receiver

OK, so it's a split brain. The instance is running on node1, so I just want to force it to resync so that node2 has an identical copy (i.e. sync primary to secondary). But how to do this on a 2-node cluster?

I found https://blkperl.github.io/split-brain-ganeti.html which suggests moving the secondary to a third node. I don't have one.

I could convert to plain and back to drbd again, but that would involve shutting down the instance twice for an extended time, which I'd rather avoid.

I tried using gnt-instance replace-disks -s, but that didn't work:

root@wrn-vm1:~# gnt-instance replace-disks -s instance-name
Mon Nov  7 17:53:51 2016 Replacing disk(s) 0 for instance 'instance-name.int.example.net'
Mon Nov  7 17:53:51 2016 Current primary node: wrn-vm1.int.example.net
Mon Nov  7 17:53:51 2016 Current seconary node: wrn-vm2.int.example.net
Mon Nov  7 17:53:51 2016 STEP 1/6 Check device existence
Mon Nov  7 17:53:51 2016  - INFO: Checking disk/0 on wrn-vm1.int.example.net
Mon Nov  7 17:53:52 2016  - INFO: Checking disk/0 on wrn-vm2.int.example.net
Mon Nov  7 17:53:53 2016  - INFO: Checking volume groups
Mon Nov  7 17:53:54 2016 STEP 2/6 Check peer consistency
Mon Nov  7 17:53:54 2016  - INFO: Checking disk/0 consistency on node wrn-vm1.int.example.net
Failure: command execution error:
Node wrn-vm1.int.example.net has degraded storage, unsafe to replace disks for instance instance-name.int.example.net

There doesn't seem to be an option to force it. Using "-a" instead of "-s" it says there's no problem:

root@wrn-vm1:~# gnt-instance replace-disks -a instance-name
Mon Nov  7 17:59:00 2016  - INFO: Checking disk/0 on wrn-vm1.int.example.net
Mon Nov  7 17:59:02 2016  - INFO: Checking disk/0 on wrn-vm2.int.example.net
Mon Nov  7 17:59:04 2016 No disks need replacement for instance 'instance-name.int.example.net'

Any other clues for how to proceed?

Thanks,

Brian.

Phil Regnauld

Nov 7, 2016, 2:58:50 PM
to gan...@googlegroups.com
candlerb (b.candler) writes:
>
> I could convert to plain and back to drbd again, but that would involve
> shutting down the instance twice for an extended time, which I'd rather
> avoid.

Extended time? About 1-2 minutes each time:

gnt-instance shutdown instance
gnt-instance modify -t plain instance    # takes 10-20 secs
gnt-instance modify -t drbd -n <secondary-node> --no-wait-for-sync instance
gnt-instance start instance

> Any other clues for how to proceed?

What does "active-disks" say ?

But I think the last time I helped someone fix this, it was
drbd->plain->drbd.

candlerb

Nov 7, 2016, 3:16:07 PM
to ganeti
>         What does "active-disks" say ? 

You mean "activate-disks" ?  It said everything was OK.
 
Anyway, after digging around the drbdsetup manpage, I think I found the solution. On the secondary node I just did this:

# drbdsetup /dev/drbd10 invalidate

After a few minutes, everything kicked off. Primary shows:

 10:??not-found??  SyncSource Primary/Secondary UpToDate/Inconsistent C r-----
[>...................] sync'ed:  9.7% (18500/20480)Mfinish: 0:18:50 speed: 16,748 (20,236) K/sec

Secondary shows:

10:??not-found??  SyncTarget Secondary/Primary Inconsistent/UpToDate
[=>..................] sync'ed: 10.4% (18364/20480)Mfinish: 0:29:07 speed: 10,752 (18,660) want: 5,840 K/sec
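
To watch the resync from a script rather than by eye, the percentage can be pulled out of a progress line. A small sketch, assuming the `sync'ed: N%` wording shown above stays the same; `sync_percent` is a hypothetical helper name:

```shell
# Extract the "sync'ed" percentage from a drbd-overview progress line on stdin.
sync_percent() {
    sed -n "s/.*sync'ed:[[:space:]]*\([0-9.]*\)%.*/\1/p"
}

# Progress line copied from the primary's output above.
line="[>...................] sync'ed:  9.7% (18500/20480)Mfinish: 0:18:50 speed: 16,748 (20,236) K/sec"

printf '%s\n' "$line" | sync_percent
```

Lines without a progress bar produce no output, so an empty result can be treated as "no resync running".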

dmesg shows:

[10033.664658] block drbd10: disk( UpToDate -> Inconsistent )
[10033.709663] block drbd10: bitmap WRITE of 160 pages took 3 jiffies
[10033.727035] block drbd10: 20 GB (5242880 bits) marked out-of-sync by on disk bit-map.
...
[10106.618678] block drbd10: drbd_sync_handshake:
[10106.623179] block drbd10: self 6CD10A4C30600E74:D66CEA14B365CF09:AA37400902A6071B:AA36400902A6071B bits:5242880 flags:0
[10106.634000] block drbd10: peer B5BE145BFFB86D7B:D66CEA14B365CF09:AA37400902A6071A:AA36400902A6071B bits:40216 flags:0
[10106.644726] block drbd10: uuid_compare()=100 by rule 90
[10106.649998] block drbd10: Becoming sync target due to disk states.
[10106.656225] block drbd10: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
[10106.688508] block drbd10: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 15(1), total 15; compression: 100.0%
[10106.698942] block drbd10: conn( WFBitMapT -> WFSyncUUID )
[10106.783566] block drbd10: updated sync uuid D66DEA14B365CF08:0000000000000000:AA37400902A6071B:AA36400902A6071B
[10106.803809] block drbd10: helper command: /bin/true before-resync-target minor-10
[10106.811972] block drbd10: helper command: /bin/true before-resync-target minor-10 exit code 0 (0x0)
[10106.821109] block drbd10: conn( WFSyncUUID -> SyncTarget )
[10106.826733] block drbd10: Began resync as SyncTarget (will sync 20971520 KB [5242880 bits set]).
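
The numbers in that log can be cross-checked with a little arithmetic. This is only a sketch, assuming one bitmap bit per 4 KiB of data (which the log's own figures agree with) and a 4 KiB page size:

```shell
# Assumption: one DRBD bitmap bit covers 4 KiB; pages are 4 KiB.
bits=5242880
kib_per_bit=4

resync_kib=$((bits * kib_per_bit))         # data to resync, in KiB
resync_gib=$((resync_kib / 1024 / 1024))   # the same amount, in GiB
bitmap_pages=$((bits / 8 / 4096))          # bitmap size in 4 KiB pages

echo "resync: ${resync_kib} KiB (${resync_gib} GiB), bitmap: ${bitmap_pages} pages"
```

That gives 20971520 KiB, 20 GiB and 160 pages, matching "will sync 20971520 KB", "20 GB (5242880 bits)" and "bitmap WRITE of 160 pages". By the same reckoning, the primary's `bits:40099` before the invalidate would be roughly 157 MiB marked out of sync, so invalidating trades a small resync for a full one.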

Cheers,

Brian.

Iustin Pop

Nov 7, 2016, 3:21:14 PM
to gan...@googlegroups.com
You can always manually force DRBD to resync (thus without any
downtime), but it's a bit more complex and potentially more dangerous.

If you want to do it, look up the drbdsetup invalidate/invalidate-remote
commands. It's been a long time since I've done it, so I'm not entirely
sure, but I think a "drbdsetup invalidate <minor>" on the secondary and
then trying to re-activate the disks should do it. If not, try
invalidate-remote on the primary as well.

regards,
iustin
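
The danger mentioned above is mostly running invalidate on the wrong node, which would throw away the good copy. A minimal guard, sketched under the assumption that the status line has the drbd-overview layout quoted earlier in the thread; `safe_to_invalidate` is a hypothetical name, and the script only prints what it would do rather than calling drbdsetup:

```shell
# Succeed (and print "ok") only if the status line shows the device as
# StandAlone with the LOCAL role, the part before the slash, as Secondary.
safe_to_invalidate() {
    printf '%s\n' "$1" | awk '
        { split($3, role, "/") }
        $2 == "StandAlone" && role[1] == "Secondary" { print "ok"; found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Status lines copied from the two nodes earlier in the thread.
secondary_line='10:??not-found??  StandAlone Secondary/Unknown UpToDate/DUnknown'
primary_line=' 10:??not-found??  StandAlone Primary/Unknown   UpToDate/DUnknown r-----'

if safe_to_invalidate "$secondary_line" >/dev/null; then
    echo "secondary: would run: drbdsetup /dev/drbd10 invalidate"
fi
if ! safe_to_invalidate "$primary_line" >/dev/null; then
    echo "primary: refusing to invalidate"
fi
```

The role check is only a sanity test; it is still worth confirming by hand which node the instance is actually running on before discarding either copy.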

Phil Regnauld

Nov 7, 2016, 3:23:08 PM
to gan...@googlegroups.com
candlerb (b.candler) writes:
> > What does "active-disks" say ?
>
> You mean "activate-disks" ? It said everything was OK.

Yes, activate-disks, sorry. Ah ok, hadn't seen you trying
that, only "gnt-instance replace-disks -a instance-name"

> Anyway, after digging around the drbdsetup manpage, I think I found the
> solution. On the secondary node I just did this:
>
> # drbdsetup /dev/drbd10 invalidate

Ah neat. I see a thread on this from 2012 (although in that particular
case it didn't help):

https://groups.google.com/forum/#!topic/ganeti/7GsZ7zHIH1g


