device mapper not reporting no-barrier-support?

Anders Henke

Feb 25, 2008, 8:33:28 AM
to linux-...@vger.kernel.org
Hi,

I'm currently stuck between Kernel LVM and DRBD, as I'm using Kernel
2.6.24.2 with DRBD 8.2.5 on top of an LVM2 device (LV).

-LVM2/device mapper doesn't support write barriers
-DRBD uses blkdev_issue_flush() to flush its metadata to disk.
On a no-barrier-device, DRBD should receive EOPNOTSUPP, but
it really does receive an EIO. Promptly, DRBD gives the
error message "drbd0: local disk flush failed with status -5".

The physical disk (in LVM speak) is a RAID1 on a 3ware 9650SE-2LP
controller; the driver 3w-9xxx supports barriers and after moving my
DRBD device from the LV to a single partition on the same RAID1, the
error messages from DRBD vanished.

I've posted a lengthy summary of my findings to

http://lists.linbit.com/pipermail/drbd-user/2008-February/008665.html

.. where Lars Ellenberg from DRBD basically responded in

http://lists.linbit.com/pipermail/drbd-user/2008-February/008666.html

.. that DRBD does catch EOPNOTSUPP for blkdev_issue_flush and
BIO_RW_BARRIER, but the LVM implementation of blkdev_issue_flush in
2.6.24.2 apparently returns EIO instead.

So, simply, the question: how should a top-layer driver check whether a lower
device supports barriers? md-raid checks this differently than
e.g. XFS does, and DRBD adds a third way of checking.
Or is this "merely" a bug in drivers/md/dm.c?


Anders
--
1&1 Internet AG System Architect
Brauerstrasse 48 v://49.721.91374.50
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger,
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren

Andrew Morton

Feb 25, 2008, 6:23:24 PM
to Anders Henke, linux-...@vger.kernel.org, dm-d...@redhat.com
On Mon, 25 Feb 2008 14:26:15 +0100 Anders Henke <anders...@1und1.de> wrote:

> Hi,
>
> I'm currently stuck between Kernel LVM and DRBD, as I'm using Kernel
> 2.6.24.2 with DRBD 8.2.5 on top of an LVM2 device (LV).
>
> -LVM2/device mapper doesn't support write barriers
> -DRBD uses blkdev_issue_flush() to flush its metadata to disk.
> On a no-barrier-device, DRBD should receive EOPNOTSUPP, but
> it really does receive an EIO. Promptly, DRBD gives the
> error message "drbd0: local disk flush failed with status -5".
>
> The physical disk (in LVM speak) is a RAID1 on a 3ware 9650SE-2LP
> controller; the driver 3w-9xxx supports barriers and after moving my
> DRBD device from the LV to a single partition on the same RAID1, the
> error messages from DRBD vanished.
>
> I've posted a lengthy summary of my findings to
>
> http://lists.linbit.com/pipermail/drbd-user/2008-February/008665.html
>
> ... where Lars Ellenberg from DRBD basically responded in
>
> http://lists.linbit.com/pipermail/drbd-user/2008-February/008666.html
>
> ... that DRBD does catch EOPNOTSUPP for blkdev_issue_flush and
> BIO_RW_BARRIER, but the LVM implementation of blkdev_issue_flush in
> 2.6.24.2 apparently returns EIO instead.
>
> So, simply, the question: how should a top-layer driver check whether a lower
> device supports barriers? md-raid checks this differently than
> e.g. XFS does, and DRBD adds a third way of checking.
> Or is this "merely" a bug in drivers/md/dm.c?
>

(cc dm-devel)

I'd say it's a DM bug. Probably a hard-to-fix one though.

Alasdair G Kergon

Feb 25, 2008, 8:37:34 PM
to Andrew Morton, Anders Henke, Jens Axboe, device-mapper development, linux-...@vger.kernel.org
On Mon, Feb 25, 2008 at 03:20:50PM -0800, Andrew Morton wrote:
> On Mon, 25 Feb 2008 14:26:15 +0100 Anders Henke <anders...@1und1.de> wrote:
> > I'm currently stuck between Kernel LVM and DRBD, as I'm using Kernel
> > 2.6.24.2 with DRBD 8.2.5 on top of an LVM2 device (LV).
> > -LVM2/device mapper doesn't support write barriers

That's right.

> > -DRBD uses blkdev_issue_flush() to flush its metadata to disk.

Which won't work if device-mapper is underneath.

> > On a no-barrier-device, DRBD should receive EOPNOTSUPP, but
> > it really does receive an EIO. Promptly, DRBD gives the
> > error message "drbd0: local disk flush failed with status -5".

> > I've posted a lengthy summary of my findings to
> > http://lists.linbit.com/pipermail/drbd-user/2008-February/008665.html

> > ... that DRBD does catch EOPNOTSUPP for blkdev_issue_flush and
> > BIO_RW_BARRIER, but the LVM implementation of blkdev_issue_flush in
> > 2.6.24.2 apparently returns EIO instead.

> I'd say it's a DM bug.

The dm code is unchanged, but look at the limited endio handling in
ll_rw_blk.c:

static void bio_end_empty_barrier(struct bio *bio, int err)
{
	if (err)
		clear_bit(BIO_UPTODATE, &bio->bi_flags);

	complete(bio->bi_private);
}

int blkdev_issue_flush(struct block_device *bdev, sector_t *error_sector)
{
..
	wait_for_completion(&wait);
	if (error_sector)
		*error_sector = bio->bi_sector;
	ret = 0;
	if (!bio_flagged(bio, BIO_UPTODATE))
		ret = -EIO;
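
To make the effect concrete, here is a small user-space model of the two fragments above. It is only a sketch under stated assumptions: struct old_bio, old_end_empty_barrier() and old_issue_flush() are hypothetical stand-ins invented for illustration, not kernel code, and a single boolean replaces the BIO_UPTODATE bit. It shows that the completion handler discards the error code, so the caller can only ever see 0 or -EIO.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct bio: only the one flag that matters here. */
struct old_bio {
	bool uptodate;	/* models the BIO_UPTODATE bit */
};

/* Models bio_end_empty_barrier() as quoted: the value of err is dropped. */
static void old_end_empty_barrier(struct old_bio *bio, int err)
{
	if (err)
		bio->uptodate = false;
}

/*
 * Models the tail of blkdev_issue_flush() as quoted. lower_err is what the
 * lower device (e.g. device-mapper) completed the empty barrier with.
 */
static int old_issue_flush(int lower_err)
{
	struct old_bio bio = { .uptodate = true };

	old_end_empty_barrier(&bio, lower_err);

	/* Only one bit of information survives, so every failure is -EIO. */
	return bio.uptodate ? 0 : -EIO;
}
```

In this model old_issue_flush(-EOPNOTSUPP) and old_issue_flush(-EIO) are indistinguishable to the caller, which matches the "status -5" that DRBD reports.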

Alasdair
--
a...@redhat.com

Jens Axboe

Feb 26, 2008, 11:18:09 AM
to Andrew Morton, Anders Henke, device-mapper development, linux-...@vger.kernel.org
> ...

> 	wait_for_completion(&wait);
> 	if (error_sector)
> 		*error_sector = bio->bi_sector;
> 	ret = 0;
> 	if (!bio_flagged(bio, BIO_UPTODATE))
> 		ret = -EIO;

You are right, the return value got broken there. Does this make it
return -EOPNOTSUPP properly for you?

diff --git a/block/blk-barrier.c b/block/blk-barrier.c
index 6901eed..55c5f1f 100644
--- a/block/blk-barrier.c
+++ b/block/blk-barrier.c
@@ -259,8 +259,11 @@ int blk_do_ordered(struct request_queue *q, struct request **rqp)

 static void bio_end_empty_barrier(struct bio *bio, int err)
 {
-	if (err)
+	if (err) {
+		if (err == -EOPNOTSUPP)
+			set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);
 		clear_bit(BIO_UPTODATE, &bio->bi_flags);
+	}

 	complete(bio->bi_private);
 }
@@ -309,7 +312,9 @@ int blkdev_issue_flush(struct block_device *bdev, sector_t *error_sector)
 		*error_sector = bio->bi_sector;

 	ret = 0;
-	if (!bio_flagged(bio, BIO_UPTODATE))
+	if (bio_flagged(bio, BIO_EOPNOTSUPP))
+		ret = -EOPNOTSUPP;
+	else if (!bio_flagged(bio, BIO_UPTODATE))
 		ret = -EIO;

 	bio_put(bio);
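
The combined effect of the two hunks can be modelled in user space. This is a rough sketch, not the kernel implementation: struct patched_bio, patched_end_empty_barrier() and patched_issue_flush() are hypothetical names made up for this example, and two booleans stand in for the BIO_UPTODATE and BIO_EOPNOTSUPP bits. The first function corresponds to the first hunk (record the reason), the second to the second hunk (report it).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct bio with the two flags the patch uses. */
struct patched_bio {
	bool uptodate;		/* models BIO_UPTODATE */
	bool eopnotsupp;	/* models BIO_EOPNOTSUPP */
};

/* Models the first hunk: remember that the failure was -EOPNOTSUPP. */
static void patched_end_empty_barrier(struct patched_bio *bio, int err)
{
	if (err) {
		if (err == -EOPNOTSUPP)
			bio->eopnotsupp = true;
		bio->uptodate = false;
	}
}

/* Models the second hunk: check the new flag before falling back to -EIO. */
static int patched_issue_flush(int lower_err)
{
	struct patched_bio bio = { .uptodate = true, .eopnotsupp = false };

	patched_end_empty_barrier(&bio, lower_err);

	if (bio.eopnotsupp)
		return -EOPNOTSUPP;
	if (!bio.uptodate)
		return -EIO;
	return 0;
}
```

Both halves are needed: setting the flag in the completion handler has no visible effect unless the caller side also tests it.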

--
Jens Axboe

Anders Henke

Feb 26, 2008, 2:34:42 PM
to Jens Axboe, Andrew Morton, device-mapper development, linux-...@vger.kernel.org


No, it doesn't.

I've applied your patch manually, as 2.6.24.2 doesn't have a "blk-barrier.c":

---cut
--- linux-2.6.24.2/block/ll_rw_blk.c.prepatch 2008-02-11 06:51:11.000000000 +0100
+++ linux-2.6.24.2/block/ll_rw_blk.c 2008-02-26 20:02:28.514641620 +0100
@@ -2667,8 +2667,11 @@

 static void bio_end_empty_barrier(struct bio *bio, int err)
 {
-	if (err)
+	if (err) {
+		if (err == -EOPNOTSUPP)
+			set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);
 		clear_bit(BIO_UPTODATE, &bio->bi_flags);
+	}

 	complete(bio->bi_private);
 }

---cut

.. and the resulting kernel shows exactly the same behaviour as before:

[ 752.301388] drbd0: Writing meta data super block now.
[ 752.349713] drbd0: local disk flush failed with status -5
[ 752.416256] drbd0: local disk flush failed with status -5
[ 753.419254] drbd0: local disk flush failed with status -5
[ 753.925726] drbd0: local disk flush failed with status -5
[ 754.551176] drbd0: local disk flush failed with status -5
[ 754.806052] drbd0: local disk flush failed with status -5
[ 755.327988] drbd0: local disk flush failed with status -5
[ 755.781863] drbd0: local disk flush failed with status -5
[ 756.266694] drbd0: local disk flush failed with status -5

Anders


Jens Axboe

Feb 26, 2008, 2:42:08 PM
to Anders Henke, Andrew Morton, device-mapper development, linux-...@vger.kernel.org
> ... and the resulting kernel shows exactly the same behaviour as before:

Not surprising, as you missed half of the patch:

> > @@ -309,7 +312,9 @@ int blkdev_issue_flush(struct block_device *bdev, sector_t *error_sector)
> > 		*error_sector = bio->bi_sector;
> >
> > 	ret = 0;
> > -	if (!bio_flagged(bio, BIO_UPTODATE))
> > +	if (bio_flagged(bio, BIO_EOPNOTSUPP))
> > +		ret = -EOPNOTSUPP;
> > +	else if (!bio_flagged(bio, BIO_UPTODATE))
> > 		ret = -EIO;
> >
> > 	bio_put(bio);

--
Jens Axboe


Anders Henke

Feb 26, 2008, 3:21:01 PM
to Jens Axboe, Andrew Morton, device-mapper development, linux-...@vger.kernel.org

Ouch. Thank you for pointing this out.

I've spent too much of the day on things that hurt my
concentration, and I shouldn't be patching kernels by hand at
this time of day.

Yes, it's useless to set a bit but never check it (as in my version of
your patch).

After adding the second part of your patch, the resulting kernel works as
intended:

[ 234.946192] drbd0: conn( WFSyncUUID -> SyncTarget )
[ 234.956176] drbd0: Began resync as SyncTarget (will sync 19542404 KB [4885601 bits set]).
[ 234.972567] drbd0: Writing meta data super block now.
[ 235.018203] drbd0: local disk flush failed with status -95

DRBD sees the EOPNOTSUPP, logs this message only once and doesn't try
any further barrier requests (as intended).
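
The "log once, then stop trying" behaviour can be sketched in user space as follows. This is not DRBD's code: struct upper_dev, upper_flush() and lower_without_barriers() are hypothetical names invented for illustration, and treating -EOPNOTSUPP as success for the caller is an assumption of this sketch rather than something confirmed in this thread. The point is only that such logic depends on the lower layer reporting -EOPNOTSUPP rather than -EIO.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical upper-layer device state; not DRBD's actual structures. */
struct upper_dev {
	bool flush_disabled;		/* set once the lower device reports no support */
	int flushes_issued;		/* counts calls that reached the lower device */
	int (*lower_flush)(void);	/* stand-in for blkdev_issue_flush() */
};

/* Flush the lower device unless it has already said it cannot. */
static int upper_flush(struct upper_dev *dev)
{
	int ret;

	if (dev->flush_disabled)
		return 0;

	dev->flushes_issued++;
	ret = dev->lower_flush();
	if (ret)
		fprintf(stderr, "local disk flush failed with status %d\n", ret);

	if (ret == -EOPNOTSUPP) {
		/* Not an I/O error: remember it and stop asking. */
		dev->flush_disabled = true;
		return 0;
	}
	return ret;
}

/* Hypothetical lower device without barrier support. */
static int lower_without_barriers(void)
{
	return -EOPNOTSUPP;
}
```

With lower_without_barriers() underneath, only the first upper_flush() call reaches the lower device and prints a message; later calls return immediately. If the lower layer returned -EIO instead, every call would be retried and logged, which is the repeated "status -5" seen earlier. On common Linux architectures such as x86, -5 is -EIO and -95 is -EOPNOTSUPP.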

Just for the record, the 2.6.24.2-ready version of your patch:

---cut
--- linux-2.6.24.2/block/ll_rw_blk.c.prepatch 2008-02-11 06:51:11.000000000 +0100
+++ linux-2.6.24.2/block/ll_rw_blk.c 2008-02-26 20:58:05.552467940 +0100
@@ -2667,8 +2667,11 @@

 static void bio_end_empty_barrier(struct bio *bio, int err)
 {
-	if (err)
+	if (err) {
+		if (err == -EOPNOTSUPP)
+			set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);
 		clear_bit(BIO_UPTODATE, &bio->bi_flags);
+	}

 	complete(bio->bi_private);
 }

@@ -2717,7 +2720,9 @@
 		*error_sector = bio->bi_sector;

 	ret = 0;
-	if (!bio_flagged(bio, BIO_UPTODATE))
+	if (bio_flagged(bio, BIO_EOPNOTSUPP))
+		ret = -EOPNOTSUPP;
+	else if (!bio_flagged(bio, BIO_UPTODATE))
 		ret = -EIO;

 	bio_put(bio);

---cut

Anders

Jens Axboe

Feb 26, 2008, 5:29:48 PM
to Anders Henke, Andrew Morton, device-mapper development, linux-...@vger.kernel.org

OK good, that's what I expected :-)

I'll queue the patch for 2.6.25; the 2.6.24 version should go to stable. Send me
a properly formatted patch and I'll make sure it goes that way.

Thanks for testing!

--
Jens Axboe

Anders Henke

Feb 28, 2008, 7:06:18 AM
to Jens Axboe, Andrew Morton, device-mapper development, linux-...@vger.kernel.org
On Feb 26 2008, Jens Axboe wrote:
> > [ 234.946192] drbd0: conn( WFSyncUUID -> SyncTarget )
> > [ 234.956176] drbd0: Began resync as SyncTarget (will sync 19542404 KB [4885601 bits set]).
> > [ 234.972567] drbd0: Writing meta data super block now.
> > [ 235.018203] drbd0: local disk flush failed with status -95
> >
> > DRBD sees the EOPNOTSUPP, logs this message only once and doesn't try
> > any further barrier requests (as intended).
>
> OK good, that's what I expected :-)
>
> I'll queue the patch for 2.6.25; the 2.6.24 version should go to stable. Send me
> a properly formatted patch and I'll make sure it goes that way.
>
> Thanks for testing!

A patch made with 'diff -up' is attached.

Anders

ll_rw_blk-eopnotsup-2.6.24.2.patch