
Bug#1054326: Can't start global heartbeat when drbd device is on top of LVM device


Dan Smolik

Oct 21, 2023, 5:30:05 PM
Package: ocfs2-tools
Version: 1.8.7-1+b1
Severity: important
X-Debbugs-Cc: mar...@mydatex.cz

Dear maintainer,
in a virtual environment I am trying to build a 2-node ocfs cluster. When the drbd device is on top of an md device everything works. But when the drbd device is on top of an LVM device, global heartbeat doesn't start.

Using config file '/etc/ocfs2/cluster.conf'
Initializing cluster stack
Checking heartbeat mode
Global heartbeat enabled
Heartbeat region 3D69F9BD7AF24C59BB600A8D7B1D4770
Scanning devices
About to start heartbeat
o2cb: Heartbeat region could not be found 3D69F9BD7AF24C59BB600A8D7B1D4770
root@drbd-server01:/home/marvin/ocfs2-tools/o2cb_ctl# mounted.ocfs2 -d -v -v
Probing device /dev/vda1
Probing device /dev/vdb
Probing device /dev/vdb1
Probing device /dev/vdb5
Probing device /dev/vdc
Probing device /dev/vdc1
Probing device /dev/sr0
Probing device /dev/md0
Probing device /dev/mapper/c-o
Probing device /dev/drbd0
Device           Stack  Cluster      F  UUID                              Label
/dev/mapper/c-o  o2cb   mdtxcluster  G  3D69F9BD7AF24C59BB600A8D7B1D4770  mdtx
/dev/drbd0       o2cb   mdtxcluster  G  3D69F9BD7AF24C59BB600A8D7B1D4770  mdtx



-- System Information:
Debian Release: 12.2
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'proposed-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-13-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=cs_CZ.UTF-8, LC_CTYPE=cs_CZ.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages ocfs2-tools depends on:
ii debconf [debconf-2.0] 1.5.82
ii init-system-helpers 1.65.2
ii libaio1 0.3.113-4
ii libc6 2.36-9+deb12u3
ii libcmap4 3.1.7-1
ii libcom-err2 1.47.0-2
ii libdlm3 4.2.0-2
ii libglib2.0-0 2.74.6-2
ii libreadline8 8.2-1.3
ii libuuid1 2.38.1-5+b1
ii lsb-base 11.6
ii psmisc 23.6-1
ii sysvinit-utils [lsb-base] 3.06-4

ocfs2-tools recommends no packages.

ocfs2-tools suggests no packages.

-- debconf information:
ocfs2-tools/idle_timeout: 30000
ocfs2-tools/heartbeat_threshold: 31
ocfs2-tools/reconnect_delay: 2000
ocfs2-tools/keepalive_delay: 2000
ocfs2-tools/init: false
ocfs2-tools/clustername: ocfs2

Valentin Vidic

Oct 22, 2023, 6:10:04 AM
On Sat, Oct 21, 2023 at 11:19:50PM +0200, Dan Smolik wrote:
> in virtual enviroment I try build 2 node ocfs cluster. When drbd
> device is on top of md device all works. But when drbd device is on
> top LVM device global heartbeat doesn't start.
>
> Using config file '/etc/ocfs2/cluster.conf'

Maybe you can share more info, like what the cluster.conf looks like
in this setup?

--
Valentin

Daniel Smolik

Oct 22, 2023, 6:43:10 AM
Yes no problem. There it is.

On 22. 10. 23 at 11:57, Valentin Vidic wrote:
drbd-vol1.res.node2
cluster.conf.node2
drbd-vol1.res.node1
cluster.conf.node1

Valentin Vidic

Oct 22, 2023, 7:40:05 AM
On Sun, Oct 22, 2023 at 12:11:19PM +0200, Daniel Smolik wrote:
> Yes no problem. There it is.

Thanks. My best guess is that the problem happens because the region
UUID is visible on both drbd and the lower device, so the global
heartbeat might work if drbd device is selected first (for example in
the case of md).
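
The ambiguity described here can be checked from the `mounted.ocfs2 -d` output in the original report. The following is only a minimal Python sketch: it assumes the whitespace-separated column layout shown in that output (UUID in the fifth column), and the helper names `devices_by_uuid` and `ambiguous_regions` are made up for illustration. It does not reproduce how o2cb itself scans devices.

```python
from collections import defaultdict

# Sample copied from the `mounted.ocfs2 -d` output in the original report.
SAMPLE = """\
Device Stack Cluster F UUID Label
/dev/mapper/c-o o2cb mdtxcluster G 3D69F9BD7AF24C59BB600A8D7B1D4770 mdtx
/dev/drbd0 o2cb mdtxcluster G 3D69F9BD7AF24C59BB600A8D7B1D4770 mdtx
"""


def devices_by_uuid(text):
    """Group device paths by OCFS2 UUID (hypothetical helper)."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a device row.
        if len(fields) < 5 or not fields[0].startswith("/dev/"):
            continue
        device, uuid = fields[0], fields[4]
        groups[uuid].append(device)
    return dict(groups)


def ambiguous_regions(text):
    """Return only the UUIDs that are visible on more than one device."""
    return {u: d for u, d in devices_by_uuid(text).items() if len(d) > 1}


if __name__ == "__main__":
    for uuid, devs in ambiguous_regions(SAMPLE).items():
        print(f"{uuid} is visible on: {', '.join(devs)}")
```

With the sample above, the single region UUID is reported for both `/dev/mapper/c-o` and `/dev/drbd0`, which is the situation the guess refers to.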

Since I don't know if it is possible to specify the device directly, your
best bet now is probably to use the local heartbeat mode (that should
work without the regions being specified).

--
Valentin

Daniel Smolik

Oct 22, 2023, 8:10:06 AM
I completely agree, but when it is on an md device the UUID is visible on
both nodes too. My guess is that the name of the LVM device is longer than
the name of the md or drbd device. I looked at the source code, and starting
the heartbeat fails here, in op_start.c:

        if (!(od->od_flags & O2CB_DEVICE_FOUND)) {
                tcom_err(ret, "%s", od->od_uuid);
                goto bail;
        }

This is only a test device; for the final solution local heartbeat isn't
possible. In o2cb it is not possible to specify the device on which to start
the heartbeat.






On 22. 10. 23 at 13:37, Valentin Vidic wrote:

Daniel Smolik

Oct 22, 2023, 5:50:04 PM
But when I use local heartbeat mode I can mount ocfs, but it does not work
in cluster mode. I mean that without global heartbeat you don't have
shared storage.

Regards
                Dan




On 22. 10. 23 at 13:37, Valentin Vidic wrote:

Valentin Vidic

Oct 22, 2023, 6:00:07 PM
On Sun, Oct 22, 2023 at 11:38:56PM +0200, Daniel Smolik wrote:
> But when use local heartbeat mode I  can  mount ocfs but not work in cluster
> mode. I mean that without global heartbeat you didn't have shared storage.

If I understand correctly with local heartbeat there is one heartbeat
running per ocfs2 mount and this should still allow cluster storage
to work.

Global heartbeat only optimizes this to run one or more heartbeat
devices for all ocfs2 mounts, so it should not be a strong requirement
for running a shared cluster storage.

--
Valentin

Daniel Smolik

Oct 22, 2023, 6:20:06 PM
Maybe, but this doesn't work.

I created an LVM volume for the heartbeat on both nodes and set the ocfs
heartbeat mode to local. Then, on both nodes:

o2cb add-heartbeat mdtxcluster /dev/c/heratbeat

and, again on both nodes:

o2cb start-heartbeat mdtxcluster

ocfs2: Mounting device (147,0) on (node 0, slot 0) with ordered data mode.
[ 1290.746345] o2dlm: Leaving domain 3D69F9BD7AF24C59BB600A8D7B1D4770
[ 1290.788991] ocfs2: Unmounting device (147,0) on (node 0)
[ 1828.717874] o2hb: Heartbeat stopped on region 3D69F9BD7AF24C59BB600A8D7B1D4770 (dm-0)
[ 2202.976390] o2hb: Heartbeat started on region FD5342AE132B4A40A3DE39D0406252E5 (dm-1)
[ 2205.008194] o2hb: Region FD5342AE132B4A40A3DE39D0406252E5 (dm-1) is now a quorum device
[ 2229.339352] o2dlm: Joining domain 3D69F9BD7AF24C59BB600A8D7B1D4770
[ 2229.339359] (
[ 2229.339362] 0
[ 2229.339364] ) 1 nodes
[ 2229.360976] ocfs2: Mounting device (147,0) on (node 0, slot 1) with ordered data mode.
[ 2229.366496] ocfs2: Begin replay journal (node 1, slot 0) on device (147,0)
[ 2229.380189] ocfs2: End replay journal (node 1, slot 0) on device (147,0)
[ 2256.920272] o2dlm: Leaving domain 3D69F9BD7AF24C59BB600A8D7B1D4770
[ 2256.960029] ocfs2: Unmounting device (147,0) on (node 0)
[ 2333.541256] o2dlm: Joining domain 3D69F9BD7AF24C59BB600A8D7B1D4770
[ 2333.541263] (
[ 2333.541266] 0
[ 2333.541268] ) 1 nodes
[ 2333.667424] ocfs2: Mounting device (147,0) on (node 0, slot 0) with ordered data mode.
root@drbd-server01:/home/marvin#  o2cb start-heartbeat mdtxcluster



o2hb: Heartbeat started on region C402D542D0D648FAA7EA1D8D7B6571B8 (dm-1)
[ 2802.429790] o2hb: Region C402D542D0D648FAA7EA1D8D7B6571B8 (dm-1) is now a quorum device
[ 2847.205047] o2dlm: Joining domain 3D69F9BD7AF24C59BB600A8D7B1D4770
[ 2847.205055] (
[ 2847.205058] 1
[ 2847.205061] ) 1 nodes
[ 2847.292767] ocfs2: Mounting device (147,0) on (node 1, slot 0) with ordered data mode.
[ 2875.108121] o2dlm: Leaving domain 3D69F9BD7AF24C59BB600A8D7B1D4770
[ 2875.144193] ocfs2: Unmounting device (147,0) on (node 1)
[ 2963.823619] o2dlm: Joining domain 3D69F9BD7AF24C59BB600A8D7B1D4770
[ 2963.823626] (
[ 2963.823629] 1
[ 2963.823631] ) 1 nodes
[ 2963.870726] ocfs2: Mounting device (147,0) on (node 1, slot 1) with ordered data mode.
[ 2963.879606] ocfs2: Begin replay journal (node 0, slot 0) on device (147,0)
[ 2963.893880] ocfs2: End replay journal (node 0, slot 0) on device (147,0)



You can see that I have two one-node clusters, but no shared storage and
not one two-node cluster.
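
The "1 nodes" observation can be pulled out of the kernel log mechanically. The sketch below is only an illustration in Python: it assumes the o2dlm join message is split over several log lines exactly as in the excerpts quoted above, and the function name `domain_sizes` is hypothetical.

```python
import re

# Excerpt from the kernel log quoted above (node 0), timestamps kept.
SAMPLE = """\
[ 2333.541256] o2dlm: Joining domain 3D69F9BD7AF24C59BB600A8D7B1D4770
[ 2333.541263] (
[ 2333.541266] 0
[ 2333.541268] ) 1 nodes
"""

TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
JOIN = re.compile(r"o2dlm: Joining domain (\S+)")
COUNT = re.compile(r"^\)\s*(\d+) nodes$")


def domain_sizes(log_text):
    """Return a (domain, node_count) pair for each o2dlm join in the log."""
    results = []
    current = None  # domain of the join message currently being read
    for raw in log_text.splitlines():
        line = TIMESTAMP.sub("", raw).strip()
        match = JOIN.search(line)
        if match:
            current = match.group(1)
            continue
        match = COUNT.match(line)
        if match and current is not None:
            results.append((current, int(match.group(1))))
            current = None
    return results


if __name__ == "__main__":
    for domain, count in domain_sizes(SAMPLE):
        print(f"domain {domain}: {count} node(s)")
```

On both nodes every join of domain 3D69F9BD7AF24C59BB600A8D7B1D4770 in the quoted logs reports a count of 1, which is what the "two one-node clusters" remark is based on.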




o2cb -v -v -v -v  start-heartbeat   mdtxcluster
Using config file '/etc/ocfs2/cluster.conf'
Initializing cluster stack
Checking heartbeat mode
Global heartbeat enabled
Heartbeat region C402D542D0D648FAA7EA1D8D7B6571B8
Scanning devices
Region C402D542D0D648FAA7EA1D8D7B6571B8 matched to device /dev/mapper/c-heratbeat
About to start heartbeat
Starting heartbeat on region C402D542D0D648FAA7EA1D8D7B6571B8, device /dev/mapper/c-heratbeat
Stop heartbeat on devices removed from config
Checking heartbeat mode
Global heartbeat enabled
Looking up active heartbeat regions
Global heartbeat started

I set the mode to local, but after the start it is set to global. I don't know why.
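
One thing that may be worth checking is what /etc/ocfs2/cluster.conf actually contains at this point, since the verbose output says that file is the config being used. The Python sketch below reads the stanza format of cluster.conf (a `name:` header followed by indented `key = value` lines) and reports the heartbeat mode and the configured regions. The sample file content is an assumption: only the cluster name and the region UUID come from this thread, the attached cluster.conf files are not reproduced here, and `parse_stanzas` and `heartbeat_summary` are hypothetical helper names.

```python
# Assumed cluster.conf content, for illustration only. The cluster name and
# region UUID are taken from the thread; everything else is a guess.
SAMPLE = """\
cluster:
\tname = mdtxcluster
\theartbeat_mode = global
\tnode_count = 2

heartbeat:
\tcluster = mdtxcluster
\tregion = C402D542D0D648FAA7EA1D8D7B6571B8
"""


def parse_stanzas(text):
    """Parse 'stanza:' headers and their 'key = value' lines into a list."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith(":") and "=" not in line:
            stanzas.append((line[:-1], {}))
        elif "=" in line and stanzas:
            key, value = line.split("=", 1)
            stanzas[-1][1][key.strip()] = value.strip()
    return stanzas


def heartbeat_summary(text):
    """Return (heartbeat_mode or None, list of heartbeat region UUIDs)."""
    mode = None
    regions = []
    for name, values in parse_stanzas(text):
        if name == "cluster" and "heartbeat_mode" in values:
            mode = values["heartbeat_mode"]
        elif name == "heartbeat" and "region" in values:
            regions.append(values["region"])
    return mode, regions


if __name__ == "__main__":
    mode, regions = heartbeat_summary(SAMPLE)
    print(f"heartbeat_mode: {mode}, regions: {regions}")
```

If the real file still carries `heartbeat_mode = global` together with a `heartbeat:` stanza, that would be consistent with the "Global heartbeat enabled" line in the verbose output, though this is only a guess about the cause.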


Regards
                Dan




On 22. 10. 23 at 23:48, Valentin Vidic wrote: