Can't bring up DRBD resource

Kristián Feldsam

Feb 20, 2015, 4:01:51 PM
to esos-...@googlegroups.com
Hi all,

I'm trying to set up a DRBD dual-primary cluster, following article 32 and Marc's blog post from 04/2013. I get exit code 20 every time and I don't know what I'm doing wrong. The DRBD service is also not listening on its TCP port. Could you help me? Thank you.

I am using ESOS r755.
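
This is roughly how I checked (a quick sketch; I'm assuming port 7788 from my resource config, and whatever netstat ESOS ships):

# is the kernel module loaded, and what does it report?
lsmod | grep drbd
cat /proc/drbd

# nothing is bound to the DRBD port
netstat -tln | grep 7788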

[root@storage1 ~]# cat /etc/drbd.d/global_common.conf
# $Id: global_common.conf 430 2013-03-31 14:30:51Z msmi...@gmail.com $

global {
        usage-count no;
        # minor-count dialog-refresh disable-ip-verification
}

common {
        handlers {
                pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
                fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
                after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
                #before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
                #after-resync-target "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh";
        }

        startup {
                # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
                #wfc-timeout 0;
                degr-wfc-timeout 120;
                outdated-wfc-timeout 2;
                become-primary-on both;
        }

        options {
                # cpu-mask on-no-data-accessible
                on-no-data-accessible io-error;
        }

        disk {
                # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
                # disk-drain md-flushes resync-rate resync-after al-extents
                # c-plan-ahead c-delay-target c-fill-target c-max-rate
                # c-min-rate disk-timeout
                on-io-error detach;
                disk-barrier no;
                disk-flushes no;
                fencing resource-only;
                al-extents 3389;
                c-plan-ahead 0;
                resync-rate 153M;
        }

        net {
                # protocol timeout max-epoch-size max-buffers unplug-watermark
                # connect-int ping-int sndbuf-size rcvbuf-size ko-count
                # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
                # after-sb-1pri after-sb-2pri always-asbp rr-conflict
                # ping-timeout data-integrity-alg tcp-cork on-congestion
                # congestion-fill congestion-extents csums-alg verify-alg
                # use-rle
                protocol C;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
                rr-conflict disconect;
                max-buffers 8000;
                max-epoch-size 8000;
                sndbuf-size 512k;
        }
}


[root@storage1 ~]# cat /etc/drbd.d/r0.res
# 20150220 Kristian Feldsam

resource r0 {
        net {
                allow-two-primaries;
        }
        on storage1.feldhost {
                device /dev/drbd0;
                disk /dev/disk-by-id/LUN_NAA-6003005700ebac201c787e523a9bd324;
                address 192.168.10.1:7788;
                meta-disk internal;
        }
        on storage2.feldhost {
                device /dev/drbd0;
                disk /dev/disk-by-id/LUN_NAA-6003005700ed39801c78848711b98ba9;
                address 192.168.10.2:7788;
                meta-disk internal;
        }
}


[root@storage1 ~]# drbdadm create-md r0
DRBD module version: 8.4.3
   userland version: 8.4.4
preferably kernel and userland versions should match.
You want me to create a v08 style flexible-size internal meta data block.
There appears to be a v08 flexible-size internal meta data block
already in place on /dev/disk-by-id/LUN_NAA-6003005700ebac201c787e523a9bd324 at byte offset 7998946930688
Do you really want to overwrite the existing v08 meta-data?
[need to type 'yes' to confirm] yes

Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.


[root@storage1 ~]# drbdadm up r0
DRBD module version: 8.4.3
   userland version: 8.4.4
preferably kernel and userland versions should match.
Command 'drbdsetup connect r0 ipv4:192.168.10.1:7788 ipv4:192.168.10.2:7788 --sndbuf-size=512k --max-epoch-size=8000 --max-buffers=8000 --rr-conflict=disconect --after-sb-2pri=disconnect --after-sb-1pri=discard-secondary --after-sb-0pri=discard-zero-changes --protocol=C --allow-two-primaries=yes' terminated with exit code 20


[root@storage1 ~]# /etc/rc.d/rc.drbd start
Setting DRBD parameters...
DRBD module version: 8.4.3
   userland version: 8.4.4
preferably kernel and userland versions should match.
Command '/sbin/drbdsetup connect r0 ipv4:192.168.10.1:7788 ipv4:192.168.10.2:7788 --sndbuf-size=512k --max-epoch-size=8000 --max-buffers=8000 --rr-conflict=disconect --after-sb-2pri=disconnect --after-sb-1pri=discard-secondary --after-sb-0pri=discard-zero-changes --protocol=C --allow-two-primaries=yes' terminated with exit code 20
Waiting for device creation...
DRBD module version: 8.4.3
   userland version: 8.4.4
preferably kernel and userland versions should match.
DRBD module version: 8.4.3
   userland version: 8.4.4
preferably kernel and userland versions should match.
Waiting for connection...
DRBD module version: 8.4.3
   userland version: 8.4.4
preferably kernel and userland versions should match.

Marc Smith

Feb 20, 2015, 5:06:48 PM
to esos-...@googlegroups.com
Ugh, the version mismatch caught my eye... anyhow, that's something
else to address.
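
If you want to see both versions side by side, something like this should do it (just a sketch; the first line of /proc/drbd carries the module version):

# kernel module version
head -1 /proc/drbd

# user-land tools version (if your build supports the flag)
drbdadm --version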


For your issue: can you try running the failed command by hand and see if it gives more errors?

/sbin/drbdsetup connect r0 ipv4:192.168.10.1:7788 ipv4:192.168.10.2:7788 --sndbuf-size=512k --max-epoch-size=8000 --max-buffers=8000 --rr-conflict=disconect --after-sb-2pri=disconnect --after-sb-1pri=discard-secondary --after-sb-0pri=discard-zero-changes --protocol=C --allow-two-primaries=yes

I did a quick Google search, and other users seem to indicate it can
fail if you're re-creating a resource and did not "down" it first
('drbdadm down r0'). I noticed the warning about overwriting the
existing meta-data.
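
So if you ran 'create-md' over an existing setup, it's probably worth cycling the resource first (a sketch, nothing ESOS-specific about it):

drbdadm down r0   # unconfigure the device and drop the connection
drbdadm up r0     # attach, set options, and connect again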

I also noticed a typo in your configuration file:
"rr-conflict disconect;"
(It should be "disconnect".)
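
Once it's fixed, you can have drbdadm re-parse and print the config as it sees it (it won't validate every option value, but it's a quick syntax check):

drbdadm dump r0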


--Marc

Kristián Feldsam

Feb 21, 2015, 12:28:53 PM
to esos-...@googlegroups.com
Hi, thank you, it was the typo in the config. It's working now.

I have four 1 Gbit interfaces, so I set up bonding with balance-rr and miimon=100. With the standard config I got about 90 MB/s for DRBD replication, so I played with various settings and got to about 160 MB/s with MTU 9000, txqueuelen 10000, and net.core.netdev_max_backlog = 3000 (the default was 250000); see the sketch below. Afterwards I pulled out one link, leaving the bond with 3 active links, and to my surprise the replication speed went up to 195 MB/s. What do you think about that? I am using Intel Pro/1000 quad-port controllers.
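
For reference, this is roughly how I applied the settings (a sketch; bond0 is my bond interface, and I used iproute2, but ifconfig would work too):

ip link set dev bond0 mtu 9000           # jumbo frames
ip link set dev bond0 txqueuelen 10000   # longer transmit queue
sysctl -w net.core.netdev_max_backlog=3000

# check the bond mode and which links are active
cat /proc/net/bonding/bond0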
 
On Friday, February 20, 2015 at 11:06:48 PM UTC+1, Marc Smith wrote:

Marc Smith

Feb 21, 2015, 11:00:22 PM
to esos-...@googlegroups.com
Those sound like pretty good speeds, but I'm honestly not a Linux bonding driver expert. ESOS is generic enough that any information you find about tuning the Linux bonding driver for performance should apply.

Also, in the latest revisions of ESOS I updated rc.drbd to suppress warning messages about kernel module / user-land version differences.


--Marc