(DRBD is no longer in sync: the DRBD devices listed in /proc/drbd show up as Primary/Unknown or Secondary/Unknown.)
You want to discard the data on one DRBD node and trust the other one to reconstruct the DRBD device as a RAID 1 disk.
1) To start, you need to know the drbd resource involved (DRBD minor, device and port):
[MASTER NODE] # gnt-instance info $INSTANCE | grep -A9 -- "- disk/"
- disk/0: drbd, size 20.0G
access mode: rw
port: 11020 *** DRBD PORT ***
auth key: 6af168277b7abc986b0d2c3abf12348ab5fa388f
on primary: /dev/drbd4 (147:4) in sync, status ok *** DRBD DEVICE ***
on secondary: /dev/drbd4 (147:4) in sync, status ok *** DRBD DEVICE ***
name: None
UUID: ab12e4c3-dcb4-bc45-a340-36fff8924c2
Usually on Ganeti the minor, device and resource share the same number, but it is better to check anyway.
Here, minor=4 device=drbd4 port=11020
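The values above can also be pulled out of the gnt-instance output with a small filter. This is only a sketch: parse_drbd_info is a hypothetical helper name, and it assumes the "port:" and "on primary:" lines look like the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads `gnt-instance info` disk output on stdin and
# prints "minor=N device=drbdN port=P" for the first disk it finds.
# Assumes the "port:" and "on primary:" line layout shown above.
parse_drbd_info() {
    awk '
        /port:/ && port == "" { port = $2 }
        /on primary:/ && dev == "" {
            dev = $3                    # e.g. /dev/drbd4
            sub(/^\/dev\//, "", dev)    # -> drbd4
            minor = $4                  # e.g. (147:4)
            gsub(/[()]/, "", minor)     # -> 147:4
            sub(/^.*:/, "", minor)      # -> 4
        }
        END { printf "minor=%s device=%s port=%s\n", minor, dev, port }
    '
}

# Usage on the master node:
#   gnt-instance info "$INSTANCE" | parse_drbd_info
```

It only reads text, so it is safe to run on a live cluster. Still compare its result with the raw output before using it.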
2) Then you need to know the corresponding drbd resource:
# drbdsetup show $MINOR (here: drbdsetup show 4)
resource resource4 {
options {
}
net {
cram-hmac-alg "md5";
shared-secret "6af168277b7abc986b0d2c3abf12348ab5fa388f";
after-sb-0pri discard-zero-changes;
after-sb-1pri consensus;
}
_remote_host {
}
_this_host {
volume 0 {
device minor 4;
disk "/dev/vg-cluster/d3764526-e655-4d59-bd8b-57333ee83497.disk0_data";
meta-disk "/dev/vg-cluster/d3764526-e655-4d59-bd8b-57333ee83497.disk0_meta";
disk {
size 41943040s; # bytes
resync-rate 1048576k; # bytes/second
}
}
}
}
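The resource name is the second word of the first line of that output (here: resource4). A minimal sketch to capture it in $RESOURCE, assuming the output starts with a "resource NAME {" line as above (resource_of_minor is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: reads `drbdsetup show $MINOR` output on stdin and
# prints the resource name found on the "resource NAME {" line.
resource_of_minor() {
    awk '$1 == "resource" && $3 == "{" { print $2; exit }'
}

# Usage on the node:
#   RESOURCE=$(drbdsetup show "$MINOR" | resource_of_minor)
```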
3) You want to discard the data from node01 and rebuild your DRBD device with the data from node02:
- use drbdsetup to force the primary/secondary roles if both nodes are currently primary or both are secondary:
drbdsetup primary $MINOR [--force]
drbdsetup secondary $MINOR [--force]
- disconnect the drbd device on each node:
[NODE01] drbdsetup disconnect 192.168.0.1:$PORT 192.168.0.2:$PORT --force=yes
[NODE02] drbdsetup disconnect 192.168.0.2:$PORT 192.168.0.1:$PORT --force=yes
- reconnect the drbd device on each node, discarding the data on node01:
[NODE01] drbdsetup connect $RESOURCE 192.168.0.1:$PORT 192.168.0.2:$PORT --discard-my-data=yes
[NODE02] drbdsetup connect $RESOURCE 192.168.0.2:$PORT 192.168.0.1:$PORT
Note: if you cannot reconnect at this point, set the correct net section, or empty it, on both nodes so that the DRBD resource has the same net section on each side.
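Since the node order in the arguments is easy to mix up, it can help to print the disconnect/connect commands above with the real values filled in before typing them. The sketch below is a dry run: it only echoes the commands from this note and runs nothing. The function name and the DISCARD/KEEP labels are hypothetical.

```shell
#!/bin/sh
# Dry run only: prints the commands from this note, executes nothing.
# Arguments: resource name, DRBD port, IP of the node whose data is
# discarded, IP of the node whose data is kept.
print_resync_plan() {
    res=$1 port=$2 discard_ip=$3 keep_ip=$4
    echo "[DISCARD] drbdsetup disconnect $discard_ip:$port $keep_ip:$port --force=yes"
    echo "[KEEP] drbdsetup disconnect $keep_ip:$port $discard_ip:$port --force=yes"
    echo "[DISCARD] drbdsetup connect $res $discard_ip:$port $keep_ip:$port --discard-my-data=yes"
    echo "[KEEP] drbdsetup connect $res $keep_ip:$port $discard_ip:$port"
}

# Usage with the values of this example (node01 is discarded):
#   print_resync_plan resource4 11020 192.168.0.1 192.168.0.2
```

Check that --discard-my-data=yes appears only on the line for the node you really want to overwrite.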
=> the drbd devices will resync fully; you can watch the progress in /proc/drbd
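To follow the resync without reading the whole file, the connection state of one minor can be extracted from /proc/drbd. This sketch assumes the DRBD 8.x layout, where each device line looks like " 4: cs:SyncTarget ro:Secondary/Primary ds:...". The helper name drbd_cs is hypothetical, and the layout may differ on other DRBD versions.

```shell
#!/bin/sh
# Hypothetical helper: reads /proc/drbd-style text on stdin and prints the
# connection state (the value after "cs:") of the minor given as $1.
# Assumes device lines of the form " 4: cs:STATE ro:... ds:...".
drbd_cs() {
    awk -v m="$1:" '$1 == m { sub(/^cs:/, "", $2); print $2; exit }'
}

# Usage: poll until the resync is over for minor 4.
#   while [ "$(drbd_cs 4 < /proc/drbd)" != "Connected" ]; do sleep 5; done
```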
At this point the device is fine, but your drbd resource will likely have an empty net section.
Stop and start the instance to fix it.
Jean-François