Master node crash - DRBD Split-brain

John N.

May 15, 2013, 1:58:42 PM

to gan...@googlegroups.com

Hi,

On a mini test cluster of two nodes running Ganeti 2.6.2, the USB key holding the OS and Ganeti installation on my master node just failed. So I simply took a new USB key and reinstalled the OS and Ganeti. In the meantime, on the second node, I ran the following commands to bring my three instances back up:

gnt-cluster master-failover --no-voting
gnt-node modify -O yes nodeKO
gnt-node failover nodeKO

As soon as my crashed node was reinstalled, I ran the following command on the second node:

gnt-node add --readd nodeKO

Then on the newly reinstalled node:

gnt-cluster master-failover

Then, when I tried to migrate my instances back to the reinstalled node with the "gnt-node migrate nodeKO" command, I got the following message:

No primary instances on node nodeKO.domain.tld, exiting.

Looking at the /proc/drbd file, I noticed that the DRBD connection between the two nodes is not active, for example:

0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:16 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:86016
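
For readers who want to check the same fields programmatically, here is a minimal Python sketch that pulls the connection state (cs), roles (ro), disk states (ds) and out-of-sync counter (oos) out of text shaped like the /proc/drbd excerpt above. The names parse_proc_drbd and SAMPLE are illustrative and not part of Ganeti or DRBD, and the regular expressions are only known to fit the DRBD 8.3-style output shown in this thread. The oos value is treated as KiB, which is how the DRBD documentation describes it.

```python
import re

# Sample copied from the post above; a real script would read /proc/drbd instead.
SAMPLE = """\
0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:16 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:86016
"""

# "<minor>: cs:<state>", optionally followed by ro: and ds: fields.
DEVICE_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)(?:\s+ro:(\S+))?(?:\s+ds:(\S+))?")
OOS_FIELD = re.compile(r"\boos:(\d+)")


def parse_proc_drbd(text):
    """Return one dict per DRBD minor found in /proc/drbd-style text."""
    devices = []
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            minor, cs, ro, ds = match.groups()
            devices.append(
                {"minor": int(minor), "cs": cs, "ro": ro, "ds": ds, "oos_kib": None}
            )
            continue
        # The counters line belongs to the most recently seen device.
        oos = OOS_FIELD.search(line)
        if oos and devices:
            devices[-1]["oos_kib"] = int(oos.group(1))
    return devices


if __name__ == "__main__":
    for dev in parse_proc_drbd(SAMPLE):
        print(dev)
```

On the sample, the device comes back with cs "StandAlone" and 86016 KiB out of sync, which is 84 MiB. That is small enough that a resync should be quick once the link is re-established.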

On the secondary node I can also see the following error message on the console:

block drbd0: Split-Brain detected, dropping connection!
block drbd1: Split-Brain detected, dropping connection!
etc.

I have two questions: is this normal behavior when the master node crashes, and, most importantly, how do I fix it?

An important remark: the xenvg volume group has not been harmed on either node, as it sits on a RAID 5 array of SAS disks. The OS and Ganeti run from a USB key, and only the USB key broke down. So the data on both nodes should be fine; it should "just" be a matter of bringing the DRBD link back up and syncing the deltas.

Regards,
John

John N.

May 15, 2013, 2:04:32 PM

to gan...@googlegroups.com

Hi Guido,

First of all, sorry: I would like to delete my first post, as I forgot to replace the real domains/names in the logs with fake ones. May I kindly ask you to delete your answer to that first post and use this one instead?

You suggested running "gnt-instance replace-disks -s instance". I tried this, but I get the following error:

sudo gnt-instance replace-disks -s inst1
Wed May 15 20:00:14 2013 Replacing disk(s) 0 for instance 'inst1.domain.tld'
Wed May 15 20:00:14 2013 node2
Wed May 15 20:00:14 2013 node1
Wed May 15 20:00:14 2013 STEP 1/6 Check device existence
Wed May 15 20:00:14 2013 - INFO: Checking disk/0 on node2
Wed May 15 20:00:15 2013 - INFO: Checking disk/0 on node1
Wed May 15 20:00:15 2013 - INFO: Checking volume groups
Wed May 15 20:00:15 2013 STEP 2/6 Check peer consistency
Wed May 15 20:00:15 2013 - INFO: Checking disk/0 consistency on node node2
Failure: command execution error:
Node node2 has degraded storage, unsafe to replace disks for instance inst2

Should I force it here, or is there another problem?

John

John N.

May 16, 2013, 3:46:35 AM

to gan...@googlegroups.com

I checked and there is no force option for the gnt-instance replace-disks command. Since the replace-disks command fails, what else can I do to fix this situation?

J.

Guido Trotter

May 16, 2013, 5:54:24 AM

to gan...@googlegroups.com

On Thu, May 16, 2013 at 8:46 AM, John N. <hosting...@gmail.com> wrote:
> I checked and there is no force option for the gnt-instance replace-disks
> command. Since the replace-disks command fails, what else can I do to
> fix this situation?
>

Try the following:
1) Make sure the node you're replacing on is actually the right one (-s
replaces on the secondary, but I am assuming that the instance has the
right one configured as primary).
2) Make sure the drbd devices are not up on the old node. Try shutting
them down completely with drbdsetup.
3) Make sure they are in the disconnected state on the primary (again
with drbdsetup).

This should allow you to replace-disks. You can then file a bug asking
for a way to "cleanup" a split brain situation specifying one node to
keep as "valid".

Thanks,

Guido

John N.

May 16, 2013, 8:07:54 AM

to gan...@googlegroups.com

Regarding your point 1): I tried replace-disks with both the --on-primary and --on-secondary options, but both give the same error message as I posted yesterday.

Regarding point 2): I am not familiar with DRBD and its drbdsetup command. Could you tell me which command(s) I need to run in order to shut down a DRBD device?

Same for point 3) :)

Regards,
J.

Thomas Thrainer

May 16, 2013, 8:43:01 AM

to gan...@googlegroups.com

Hi,

Regarding drbdsetup:

Look in /proc/drbd for the volume(s) you want to shut down on the old node. You can shut them down with `drbdsetup <minor> down`, where <minor> is the first number on the device's line in /proc/drbd. Depending on the state of your volume, `drbdsetup <minor> detach` or `drbdsetup <minor> disconnect` might help as well (although down should perform both for you).
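
As a hedged illustration of the lookup described above, the following Python sketch reads /proc/drbd-style text, picks the minors whose connection state matches a chosen set, and builds the corresponding "down" commands without running them. The function name down_commands, the SAMPLE text and the default choice of "StandAlone" as the state to act on are assumptions made for this example; which volumes to shut down remains the operator's decision. The commands use the /dev/drbd<minor> spelling that appears later in this thread.

```python
import re

# Illustrative sample in the DRBD 8.3 /proc/drbd layout seen in this thread.
SAMPLE = """\
version: 8.3.7 (api:88/proto:86-91)
 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:86016
 1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:16 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:77824
 2: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
    ns:0 nr:17660 dw:17660 dr:0 al:0 bm:5 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
"""

# The minor is the leading number on each device line.
DEVICE_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")


def down_commands(proc_drbd_text, states=("StandAlone",)):
    """Build drbdsetup 'down' commands for minors whose cs is in `states`.

    Dry run only: the commands are returned as strings and never executed.
    """
    commands = []
    for line in proc_drbd_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match and match.group(2) in states:
            commands.append(f"drbdsetup /dev/drbd{match.group(1)} down")
    return commands


if __name__ == "__main__":
    for command in down_commands(SAMPLE):
        print(command)
```

Printing the commands first, rather than executing them, leaves room to double-check that no healthy Connected device is in the list before anything is shut down.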

Cheers,

Thomas

--

Thomas Thrainer | Software Engineer | thom...@google.com |

Google Germany GmbH

Dienerstr. 12

80331 München

Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Graham Law, Katherine Stephens

John N.

May 16, 2013, 9:17:06 AM

to gan...@googlegroups.com

Hi Thomas,

As suggested, I ran the following command on the freshly reinstalled master node:

drbdsetup /dev/drbd0 down

That worked fine. Then I tried gnt-instance replace-disks -s myinstance (and also with -p), but both variants give me the following error:

Failure: command execution error:
Can't find disk/0 on node nodeA: disk not found

The output of /proc/drbd right now is:

0: cs:Unconfigured

I must be doing something wrong here?

J.

Guido Trotter

May 16, 2013, 9:36:29 AM

to gan...@googlegroups.com

Just to clarify, can we see /proc/drbd on both nodes, and the output of gnt-node list?

Could you also verify that the instance is still up?

Thanks a lot,

Guido

--
Guido Trotter
Ganeti Engineering
Google Germany GmbH
Dienerstr. 12, 80331, München

Tax number: 48/725/00206
VAT identification number: DE813741370

John N.

May 16, 2013, 9:42:56 AM

to gan...@googlegroups.com

Sure, here would be the info...

/proc/drbd from node A (the freshly re-installed master node):

version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757

0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:86016
1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:16 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:77824
2: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
    ns:0 nr:17660 dw:17660 dr:0 al:0 bm:5 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:8616 nr:0 dw:2757080 dr:23751404 al:1702 bm:1284 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

/proc/drbd from node B:

version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
    ns:0 nr:0 dw:680 dr:10949 al:30 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:31364
1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
    ns:0 nr:0 dw:33996 dr:47257 al:60 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:30888
2: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:2460256 nr:0 dw:8379200 dr:723444 al:712 bm:325 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
3: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
    ns:0 nr:8616 dw:23728600 dr:6116 al:34 bm:1280 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

Output from gnt-node list:

Node             DTotal DFree MTotal MNode MFree Pinst Sinst
nodeA.domain.tld   1.6T  1.6T  24.0G 1020M 20.7G     1     3
nodeB.domain.tld   1.6T  1.6T  48.0G 1020M 46.0G     3     1

All instances (4) are up and running.

Regards,
J.

Guido Trotter

May 16, 2013, 10:14:04 AM

to gan...@googlegroups.com

On Thu, May 16, 2013 at 2:42 PM, John N. <hosting...@gmail.com> wrote:

> Sure, here would be the info...
>
> /proc/drbd from node A (the freshly re-installed master node):
>
> version: 8.3.7 (api:88/proto:86-91)
> srcversion: EE47D8BF18AC166BE219757
> 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:86016
> 1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
>     ns:0 nr:0 dw:0 dr:0 al:0 bm:16 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:77824

These two are the problem. Can you shut them down?

(drbdsetup /dev/drbd0 down, drbdsetup /dev/drbd1 down)

Try also pausing the watcher first so that it doesn't try to reactivate them and put them in this broken state.

Other things you could try are drbdsetup /dev/drbd0 secondary, and even drbdsetup /dev/drbd0 invalidate, if needed.

Then you should be able to run replace-disks -s, supposing the two instances are running on node B.

> 2: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
>     ns:0 nr:17660 dw:17660 dr:0 al:0 bm:5 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
> 3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
>     ns:8616 nr:0 dw:2757080 dr:23751404 al:1702 bm:1284 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

These two are ok.

> /proc/drbd from node B:
>
> version: 8.3.7 (api:88/proto:86-91)
> srcversion: EE47D8BF18AC166BE219757
> 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
>     ns:0 nr:0 dw:680 dr:10949 al:30 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:31364
> 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
>     ns:0 nr:0 dw:33996 dr:47257 al:60 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:30888

These are probably ok. If it still doesn't work you can try running "disconnect" on these, or invalidate-remote.
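
The reading of the connection states given in this reply (StandAlone is the problem, WFConnection is probably fine, Connected is fine) can be sketched as a small lookup. This is only an illustration of the reasoning in this thread, not a general DRBD health check: the function name triage, the verdict labels and the NODE_A/NODE_B samples (device lines copied from the outputs quoted above) are all chosen for this example, and any state not discussed here is simply flagged for manual inspection.

```python
import re

# Device lines copied from the /proc/drbd outputs quoted in this thread.
NODE_A = """\
0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
2: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
"""
NODE_B = """\
0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
2: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
3: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
"""

# Verdicts mirror the comments made in this reply; the labels are illustrative.
VERDICTS = {
    "StandAlone": "problem",        # dropped the connection, not retrying
    "WFConnection": "probably ok",  # waiting for the peer to connect
    "Connected": "ok",
}

DEVICE_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")


def triage(proc_drbd_text):
    """Map each minor to a verdict based on its connection state (cs)."""
    result = {}
    for line in proc_drbd_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            state = match.group(2)
            result[int(match.group(1))] = VERDICTS.get(state, "check manually")
    return result


if __name__ == "__main__":
    print("node A:", triage(NODE_A))
    print("node B:", triage(NODE_B))
```

Running it on the two samples flags minors 0 and 1 on node A as the problem, matching the diagnosis above, while the same minors on node B are merely waiting for their peer.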

> 2: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
>     ns:2460256 nr:0 dw:8379200 dr:723444 al:712 bm:325 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
> 3: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
>     ns:0 nr:8616 dw:23728600 dr:6116 al:34 bm:1280 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>
> Output from gnt-node list:
>
> Node             DTotal DFree MTotal MNode MFree Pinst Sinst
> nodeA.domain.tld   1.6T  1.6T  24.0G 1020M 20.7G     1     3
> nodeB.domain.tld   1.6T  1.6T  48.0G 1020M 46.0G     3     1
>
> All instances (4) are up and running.

Thanks,

Guido

John N.

May 16, 2013, 10:33:19 AM

to gan...@googlegroups.com

> > 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
> >     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:86016
> > 1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
> >     ns:0 nr:0 dw:0 dr:0 al:0 bm:16 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:77824
>
> These two are the problem. Can you shut them down?
>
> (drbdsetup /dev/drbd0 down, drbdsetup /dev/drbd1 down)

The down command worked fine (no output) on both DRBD devices.

> Try also pausing the watcher first so that it doesn't try to reactivate them and put them in this broken state.
>
> Other things you could try are drbdsetup /dev/drbd0 secondary, and even drbdsetup /dev/drbd0 invalidate, if needed.
>
> Then you should be able to run replace-disks -s, supposing the two instances are running on node B.

Still the same problem. When I run replace-disks -s I get:

Thu May 16 16:24:23 2013 Replacing disk(s) 0 for instance 'inst1.domain.tld'
Thu May 16 16:24:23 2013 nodeB.domain.tld
Thu May 16 16:24:23 2013 nodeA.domain.tld
Thu May 16 16:24:23 2013 STEP 1/6 Check device existence
Thu May 16 16:24:23 2013 - INFO: Checking disk/0 on nodeB.domain.tld
Thu May 16 16:24:23 2013 - INFO: Checking disk/0 on nodeA.domain.tld
Failure: command execution error:
Can't find disk/0 on node nodeA.domain.tld: disk not found

As you suggested, I then tried the drbdsetup secondary and invalidate commands. The secondary command worked fine, but invalidate gave me the following output:

/dev/drbd0: State change failed: (-4) Can not resync without local disk

> > 2: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
> >     ns:0 nr:17660 dw:17660 dr:0 al:0 bm:5 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
> > 3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
> >     ns:8616 nr:0 dw:2757080 dr:23751404 al:1702 bm:1284 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>
> These two are ok.
>
> > /proc/drbd from node B:
> >
> > version: 8.3.7 (api:88/proto:86-91)
> > srcversion: EE47D8BF18AC166BE219757
> > 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
> >     ns:0 nr:0 dw:680 dr:10949 al:30 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:31364
> > 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
> >     ns:0 nr:0 dw:33996 dr:47257 al:60 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:30888
>
> These are probably ok. If it still doesn't work you can try running "disconnect" on these, or invalidate-remote.

As the previous commands on nodeA did not work, I went on to try "drbdadm disconnect " on nodeB, which worked fine, but drbdadm invalidate-remote did not:

/dev/drbd0: State change failed: (-15) Need a connection to start verify or resync

Any other ideas what I could try?

J.
