[PATCH] block: split bio_kmalloc from bio_alloc_bioset

3 views
Skip to first unread message

Tadeusz Struk

unread,
Jul 18, 2022, 4:09:42 PM7/18/22
to syzbot+4f441e...@syzkaller.appspotmail.com, syzkaller-a...@googlegroups.com, tadeus...@linaro.org
#syz test: https://android.googlesource.com/kernel/common android12-5.10-lts

diff --git a/block/bio.c b/block/bio.c
index f8d26ce7b61b..be59276e462e 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -405,122 +405,101 @@ static void punt_bios_to_rescuer(struct bio_set *bs)
* @nr_iovecs: number of iovecs to pre-allocate
* @bs: the bio_set to allocate from.
*
- * Description:
- * If @bs is NULL, uses kmalloc() to allocate the bio; else the allocation is
- * backed by the @bs's mempool.
+ * Allocate a bio from the mempools in @bs.
*
- * When @bs is not NULL, if %__GFP_DIRECT_RECLAIM is set then bio_alloc will
- * always be able to allocate a bio. This is due to the mempool guarantees.
- * To make this work, callers must never allocate more than 1 bio at a time
- * from this pool. Callers that need to allocate more than 1 bio must always
- * submit the previously allocated bio for IO before attempting to allocate
- * a new one. Failure to do so can cause deadlocks under memory pressure.
+ * If %__GFP_DIRECT_RECLAIM is set then bio_alloc will always be able to
+ * allocate a bio. This is due to the mempool guarantees. To make this work,
+ * callers must never allocate more than 1 bio at a time from the general pool.
+ * Callers that need to allocate more than 1 bio must always submit the
+ * previously allocated bio for IO before attempting to allocate a new one.
+ * Failure to do so can cause deadlocks under memory pressure.
*
- * Note that when running under submit_bio_noacct() (i.e. any block
- * driver), bios are not submitted until after you return - see the code in
- * submit_bio_noacct() that converts recursion into iteration, to prevent
- * stack overflows.
+ * Note that when running under submit_bio_noacct() (i.e. any block driver),
+ * bios are not submitted until after you return - see the code in
+ * submit_bio_noacct() that converts recursion into iteration, to prevent
+ * stack overflows.
*
- * This would normally mean allocating multiple bios under
- * submit_bio_noacct() would be susceptible to deadlocks, but we have
- * deadlock avoidance code that resubmits any blocked bios from a rescuer
- * thread.
+ * This would normally mean allocating multiple bios under submit_bio_noacct()
+ * would be susceptible to deadlocks, but we have
+ * deadlock avoidance code that resubmits any blocked bios from a rescuer
+ * thread.
*
- * However, we do not guarantee forward progress for allocations from other
- * mempools. Doing multiple allocations from the same mempool under
- * submit_bio_noacct() should be avoided - instead, use bio_set's front_pad
- * for per bio allocations.
+ * However, we do not guarantee forward progress for allocations from other
+ * mempools. Doing multiple allocations from the same mempool under
+ * submit_bio_noacct() should be avoided - instead, use bio_set's front_pad
+ * for per bio allocations.
*
- * RETURNS:
- * Pointer to new bio on success, NULL on failure.
+ * Returns: Pointer to new bio on success, NULL on failure.
*/
struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
struct bio_set *bs)
{
gfp_t saved_gfp = gfp_mask;
- unsigned front_pad;
- unsigned inline_vecs;
- struct bio_vec *bvl = NULL;
struct bio *bio;
void *p;

- if (!bs) {
- if (nr_iovecs > UIO_MAXIOV)
- return NULL;
-
- p = kmalloc(struct_size(bio, bi_inline_vecs, nr_iovecs), gfp_mask);
- front_pad = 0;
- inline_vecs = nr_iovecs;
- } else {
- /* should not use nobvec bioset for nr_iovecs > 0 */
- if (WARN_ON_ONCE(!mempool_initialized(&bs->bvec_pool) &&
- nr_iovecs > 0))
- return NULL;
- /*
- * submit_bio_noacct() converts recursion to iteration; this
- * means if we're running beneath it, any bios we allocate and
- * submit will not be submitted (and thus freed) until after we
- * return.
- *
- * This exposes us to a potential deadlock if we allocate
- * multiple bios from the same bio_set() while running
- * underneath submit_bio_noacct(). If we were to allocate
- * multiple bios (say a stacking block driver that was splitting
- * bios), we would deadlock if we exhausted the mempool's
- * reserve.
- *
- * We solve this, and guarantee forward progress, with a rescuer
- * workqueue per bio_set. If we go to allocate and there are
- * bios on current->bio_list, we first try the allocation
- * without __GFP_DIRECT_RECLAIM; if that fails, we punt those
- * bios we would be blocking to the rescuer workqueue before
- * we retry with the original gfp_flags.
- */
-
- if (current->bio_list &&
- (!bio_list_empty(&current->bio_list[0]) ||
- !bio_list_empty(&current->bio_list[1])) &&
- bs->rescue_workqueue)
- gfp_mask &= ~__GFP_DIRECT_RECLAIM;
+ /* should not use nobvec bioset for nr_iovecs > 0 */
+ if (WARN_ON_ONCE(!mempool_initialized(&bs->bvec_pool) && nr_iovecs > 0))
+ return NULL;

+ /*
+ * submit_bio_noacct() converts recursion to iteration; this means if
+ * we're running beneath it, any bios we allocate and submit will not be
+ * submitted (and thus freed) until after we return.
+ *
+ * This exposes us to a potential deadlock if we allocate multiple bios
+ * from the same bio_set() while running underneath submit_bio_noacct().
+ * If we were to allocate multiple bios (say a stacking block driver
+ * that was splitting bios), we would deadlock if we exhausted the
+ * mempool's reserve.
+ *
+ * We solve this, and guarantee forward progress, with a rescuer
+ * workqueue per bio_set. If we go to allocate and there are bios on
+ * current->bio_list, we first try the allocation without
+ * __GFP_DIRECT_RECLAIM; if that fails, we punt those bios we would be
+ * blocking to the rescuer workqueue before we retry with the original
+ * gfp_flags.
+ */
+ if (current->bio_list &&
+ (!bio_list_empty(&current->bio_list[0]) ||
+ !bio_list_empty(&current->bio_list[1])) &&
+ bs->rescue_workqueue)
+ gfp_mask &= ~__GFP_DIRECT_RECLAIM;
+
+ p = mempool_alloc(&bs->bio_pool, gfp_mask);
+ if (!p && gfp_mask != saved_gfp) {
+ punt_bios_to_rescuer(bs);
+ gfp_mask = saved_gfp;
p = mempool_alloc(&bs->bio_pool, gfp_mask);
- if (!p && gfp_mask != saved_gfp) {
- punt_bios_to_rescuer(bs);
- gfp_mask = saved_gfp;
- p = mempool_alloc(&bs->bio_pool, gfp_mask);
- }
-
- front_pad = bs->front_pad;
- inline_vecs = BIO_INLINE_VECS;
}
-
if (unlikely(!p))
return NULL;

- bio = p + front_pad;
- bio_init(bio, NULL, 0);
-
- if (nr_iovecs > inline_vecs) {
+ bio = p + bs->front_pad;
+ if (nr_iovecs > BIO_INLINE_VECS) {
unsigned long idx = 0;
+ struct bio_vec *bvl = NULL;

bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx, &bs->bvec_pool);
if (!bvl && gfp_mask != saved_gfp) {
punt_bios_to_rescuer(bs);
gfp_mask = saved_gfp;
- bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx, &bs->bvec_pool);
+ bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx,
+ &bs->bvec_pool);
}

if (unlikely(!bvl))
goto err_free;

bio->bi_flags |= idx << BVEC_POOL_OFFSET;
+ bio_init(bio, bvl, bvec_nr_vecs(idx));
} else if (nr_iovecs) {
- bvl = bio->bi_inline_vecs;
+ bio_init(bio, bio->bi_inline_vecs, BIO_INLINE_VECS);
+ } else {
+ bio_init(bio, NULL, 0);
}

bio->bi_pool = bs;
- bio->bi_max_vecs = nr_iovecs;
- bio->bi_io_vec = bvl;
return bio;

err_free:
@@ -529,6 +508,31 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
}
EXPORT_SYMBOL(bio_alloc_bioset);

+/**
+ * bio_kmalloc - kmalloc a bio for I/O
+ * @gfp_mask: the GFP_* mask given to the slab allocator
+ * @nr_iovecs: number of iovecs to pre-allocate
+ *
+ * Use kmalloc to allocate and initialize a bio.
+ *
+ * Returns: Pointer to new bio on success, NULL on failure.
+ */
+struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned int nr_iovecs)
+{
+ struct bio *bio;
+
+ if (nr_iovecs > UIO_MAXIOV)
+ return NULL;
+
+ bio = kmalloc(struct_size(bio, bi_inline_vecs, nr_iovecs), gfp_mask);
+ if (unlikely(!bio))
+ return NULL;
+ bio_init(bio, nr_iovecs ? bio->bi_inline_vecs : NULL, nr_iovecs);
+ bio->bi_pool = NULL;
+ return bio;
+}
+EXPORT_SYMBOL(bio_kmalloc);
+
void zero_fill_bio_iter(struct bio *bio, struct bvec_iter start)
{
unsigned long flags;
diff --git a/block/bounce.c b/block/bounce.c
index 162a6eee8999..4da429de78a2 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -214,8 +214,7 @@ static void bounce_end_io_read_isa(struct bio *bio)
__bounce_end_io_read(bio, &isa_page_pool);
}

-static struct bio *bounce_clone_bio(struct bio *bio_src, gfp_t gfp_mask,
- struct bio_set *bs)
+static struct bio *bounce_clone_bio(struct bio *bio_src, gfp_t gfp_mask)
{
struct bvec_iter iter;
struct bio_vec bv;
@@ -242,8 +241,11 @@ static struct bio *bounce_clone_bio(struct bio *bio_src, gfp_t gfp_mask,
* asking for trouble and would force extra work on
* __bio_clone_fast() anyways.
*/
-
- bio = bio_alloc_bioset(gfp_mask, bio_segments(bio_src), bs);
+ if (bio_is_passthrough(bio_src))
+ bio = bio_kmalloc(gfp_mask, bio_segments(bio_src));
+ else
+ bio = bio_alloc_bioset(gfp_mask, bio_segments(bio_src),
+ &bounce_bio_set);
if (!bio)
return NULL;
bio->bi_disk = bio_src->bi_disk;
@@ -294,7 +296,6 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
unsigned i = 0;
bool bounce = false;
int sectors = 0;
- bool passthrough = bio_is_passthrough(*bio_orig);

bio_for_each_segment(from, *bio_orig, iter) {
if (i++ < BIO_MAX_PAGES)
@@ -305,14 +306,14 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
if (!bounce)
return;

- if (!passthrough && sectors < bio_sectors(*bio_orig)) {
+ if (!bio_is_passthrough(*bio_orig) &&
+ sectors < bio_sectors(*bio_orig)) {
bio = bio_split(*bio_orig, sectors, GFP_NOIO, &bounce_bio_split);
bio_chain(bio, *bio_orig);
submit_bio_noacct(*bio_orig);
*bio_orig = bio;
}
- bio = bounce_clone_bio(*bio_orig, GFP_NOIO, passthrough ? NULL :
- &bounce_bio_set);
+ bio = bounce_clone_bio(*bio_orig, GFP_NOIO);

/*
* Bvec table can't be updated by bio_for_each_segment_all(),
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 0d4d61743b1a..9a078779734d 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -395,6 +395,7 @@ extern int biovec_init_pool(mempool_t *pool, int pool_entries);
extern int bioset_init_from_src(struct bio_set *bs, struct bio_set *src);

extern struct bio *bio_alloc_bioset(gfp_t, unsigned int, struct bio_set *);
+struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned int nr_iovecs);
extern void bio_put(struct bio *);

extern void __bio_clone_fast(struct bio *, struct bio *);
@@ -407,11 +408,6 @@ static inline struct bio *bio_alloc(gfp_t gfp_mask, unsigned int nr_iovecs)
return bio_alloc_bioset(gfp_mask, nr_iovecs, &fs_bio_set);
}

-static inline struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned int nr_iovecs)
-{
- return bio_alloc_bioset(gfp_mask, nr_iovecs, NULL);
-}
-
extern blk_qc_t submit_bio(struct bio *);

extern void bio_endio(struct bio *);
--
2.36.1

syzbot

unread,
Jul 18, 2022, 4:24:09 PM7/18/22
to syzkaller-a...@googlegroups.com, tadeus...@linaro.org
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
no output from test machine



Tested on:

commit: ebc9fb07 ANDROID: random: fix CRC issues with the merge
git tree: android12-5.10-lts
console output: https://syzkaller.appspot.com/x/log.txt?x=154b12c2080000
kernel config: https://syzkaller.appspot.com/x/.config?x=80a126dc07d157bc
dashboard link: https://syzkaller.appspot.com/bug?extid=4f441e6ca0fcad141421
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=12265962080000

Reply all
Reply to author
Forward
0 new messages