[Kernel Bug] general protection fault in btrfs_lookup_csum

Zhiyu Zhang

Jun 6, 2025, 1:54:47 PM
to c...@fb.com, jo...@toxicpanda.com, dst...@suse.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkaller, coreg...@gmail.com
Dear Developers and Maintainers,

We would like to report a Linux kernel bug titled "general protection
fault in btrfs_lookup_csum" on Linux 6.12.28. We also reproduced it
with the PoC on the latest 6.15 kernel. Here are the relevant attachments:

kernel config: https://drive.google.com/file/d/15zwNg6D0mF6eeFOw5zz4QkkH1bcK8xCl/view?usp=sharing
report: https://drive.google.com/file/d/1BPmRKH5Not1_y5briNsAcaYi0hXTe5um/view?usp=sharing
syz reproducer:
https://drive.google.com/file/d/1xvAUqtN1mu-49xfCObEYRn1eFc2Tmk8F/view?usp=sharing
C reproducer: https://drive.google.com/file/d/1cdDqjEqpqhoenhWzxF_GNc06kRkrrjxa/view?usp=sharing

The crash happens on every read I/O against a broken btrfs image whose
checksum tree is missing/corrupted. Specifically,
fs/btrfs/file-item.c:search_csum_tree() calls "csum_root =
btrfs_csum_root(fs_info, disk_bytenr);", where csum_root can be NULL
under certain on-disk corruptions. Then btrfs_lookup_csum()
immediately dereferences root->fs_info, causing a general-protection
fault / KASAN report.

--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -201,6 +201,8 @@ btrfs_lookup_csum(struct btrfs_trans_handle *trans,
 		  struct btrfs_path *path,
 		  u64 bytenr, int cow)
 {
+	if (unlikely(!root))
+		return ERR_PTR(-EINVAL); /* or -ENOENT, see below */
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	int ret;

With this draft patch the PoC no longer panics the kernel.
search_csum_tree() converts -ENOENT (and -EFBIG) to 0, treating the
range as “no checksum” and continuing safely. If we instead return
-EINVAL, the error propagates upward and aborts the read outright. I
am unsure which behaviour is preferred: (1) -ENOENT: stays consistent
with the existing error handling and avoids spurious I/O errors, but
does so silently; (2) -EINVAL: treats the situation as fatal
corruption.
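
To make the two options concrete, here is a minimal user-space sketch of the behaviour described above. It is not the kernel code: err_ptr(), ptr_err() and is_err() are local stand-ins for the kernel's ERR_PTR helpers, and struct model_root, model_lookup_csum() and model_search_csum_tree() are hypothetical names invented for illustration. The only behaviour taken from the description above is that -ENOENT and -EFBIG are converted to 0 while other errors propagate.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

/* User-space stand-ins for the kernel's ERR_PTR helpers (illustrative only). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error) { return (void *)error; }
static inline long ptr_err(const void *ptr) { return (long)ptr; }
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical placeholder for struct btrfs_root. */
struct model_root { int unused; };

/* Models the draft patch: return an error pointer when root is NULL. */
static void *model_lookup_csum(struct model_root *root, int err_on_null)
{
	static int dummy_item;

	if (!root)
		return err_ptr(-err_on_null);
	return &dummy_item;
}

/*
 * Models the caller as described in the report: -ENOENT and -EFBIG
 * become 0 ("no checksum"), any other error is passed up.
 * Returns 1 when a checksum item was found.
 */
static int model_search_csum_tree(struct model_root *root, int err_on_null)
{
	void *item = model_lookup_csum(root, err_on_null);

	if (is_err(item)) {
		long ret = ptr_err(item);

		assert(ret < 0);
		if (ret == -ENOENT || ret == -EFBIG)
			return 0;
		return (int)ret;
	}
	return 1;
}
```

With a NULL root, model_search_csum_tree(NULL, ENOENT) yields 0 and the read would continue without a checksum, whereas model_search_csum_tree(NULL, EINVAL) yields -EINVAL and the read would fail.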

Advice on the expected semantics would be appreciated before I submit
a formal patch.

If the issue receives a CVE, we would be grateful to be listed as reporters:
Reported-by: Zhiyu Zhang <zhiyuz...@gmail.com>
Reported-by: Longxing Li <coreg...@gmail.com>

Please let us know if a different fix or additional diagnostics are
preferred. We will be happy to respin the patch accordingly.

Thank you for your time!

Best regards,
Zhiyu Zhang

Qu Wenruo

Jun 6, 2025, 6:52:34 PM
to Zhiyu Zhang, c...@fb.com, jo...@toxicpanda.com, dst...@suse.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkaller, coreg...@gmail.com


On 2025/6/7 03:24, Zhiyu Zhang wrote:
> Dear Developers and Maintainers,
>
> We would like to report a Linux kernel bug titled "general protection
> fault in btrfs_lookup_csum" on Linux 6.12.28. We also reproduced it
> with the PoC on the latest 6.15 kernel. Here are the relevant attachments:
>
> kernel config: https://drive.google.com/file/d/15zwNg6D0mF6eeFOw5zz4QkkH1bcK8xCl/view?usp=sharing
> report: https://drive.google.com/file/d/1BPmRKH5Not1_y5briNsAcaYi0hXTe5um/view?usp=sharing
> syz reproducer:
> https://drive.google.com/file/d/1xvAUqtN1mu-49xfCObEYRn1eFc2Tmk8F/view?usp=sharing
> C reproducer: https://drive.google.com/file/d/1cdDqjEqpqhoenhWzxF_GNc06kRkrrjxa/view?usp=sharing

Accessing files from an unknown source like this doesn't feel safe.

Can you let syzbot reproduce this and forward the report?

>
> The crash happens on every read I/O against a broken btrfs image whose
> checksum tree is missing/corrupted. Specifically,
> fs/btrfs/file-item.c:search_csum_tree() calls "csum_root =
> btrfs_csum_root(fs_info, disk_bytenr);", where csum_root can be NULL
> under certain on-disk corruptions. Then btrfs_lookup_csum()
> immediately dereferences root->fs_info, causing a general-protection
> fault / KASAN report.
>

In most cases, if the csum tree root is corrupted, btrfs can only be
mounted with rescue=ibadroots, and in that case btrfs should set
BTRFS_FS_STATE_NO_DATA_CSUMS, so nothing should trigger the csum tree
search at all (btrfs_lookup_bio_sums() will exit early).
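
The expected gating can be sketched in user space as follows. This is a simplified model, not the btrfs implementation: struct model_fs_info, MODEL_STATE_NO_DATA_CSUMS (including its bit position) and model_lookup_bio_sums() are hypothetical names, and the model only encodes the statement above that the lookup exits early once the no-data-csums state is set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit standing in for the "no data csums" fs state flag. */
enum { MODEL_STATE_NO_DATA_CSUMS = 1 << 0 };

struct model_fs_info {
	unsigned long fs_state;
	void *csum_root;	/* NULL when the csum tree root could not be read */
};

enum model_result {
	MODEL_SKIPPED,		/* early exit, csum tree never touched */
	MODEL_SEARCHED,		/* csum tree searched through a valid root */
	MODEL_WOULD_CRASH,	/* search attempted through a NULL root */
};

static enum model_result model_lookup_bio_sums(const struct model_fs_info *fs)
{
	assert(fs != NULL);

	/* With the flag set, the read path never reaches the csum tree. */
	if (fs->fs_state & MODEL_STATE_NO_DATA_CSUMS)
		return MODEL_SKIPPED;

	/* Flag missing but root NULL: the state seen in the reported crash. */
	if (fs->csum_root == NULL)
		return MODEL_WOULD_CRASH;

	return MODEL_SEARCHED;
}
```

In this model a NULL csum root is harmless as long as the flag is set; the problematic combination is a NULL root with the flag left clear.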

The only known exception is scrub, which is already fixed by
f95d186255b3 ("btrfs: avoid NULL pointer dereference if no valid csum
tree").


The call trace just looks like a regular page read, yet
BTRFS_FS_STATE_NO_DATA_CSUMS was not set, which isn't correct.

I'd prefer to dig deeper to find out why.

Thanks,
Qu

Zhiyu Zhang

Jun 7, 2025, 1:41:58 AM
to syzkaller
---------- Forwarded message ---------
From: Qu Wenruo <w...@suse.com>
Date: Sat, Jun 7, 2025, 07:49
Subject: [PATCH] btrfs: handle csum tree error with rescue=ibadroots correctly
To: <linux...@vger.kernel.org>
Cc: Zhiyu Zhang <zhiyuz...@gmail.com>, Longxing Li <coreg...@gmail.com>


[BUG]
There is a syzbot-based reproducer that can crash the kernel, with the
following call trace (with some debug output added):

DEBUG: rescue=ibadroots parsed
BTRFS: device fsid 14d642db-7b15-43e4-81e6-4b8fac6a25f8 devid 1
transid 8 /dev/loop0 (7:0) scanned by repro (1010)
BTRFS info (device loop0): first mount of filesystem
14d642db-7b15-43e4-81e6-4b8fac6a25f8
BTRFS info (device loop0): using blake2b (blake2b-256-generic)
checksum algorithm
BTRFS info (device loop0): using free-space-tree
BTRFS warning (device loop0): checksum verify failed on logical
5312512 mirror 1 wanted
0xb043382657aede36608fd3386d6b001692ff406164733d94e2d9a180412c6003
found 0x810ceb2bacb7f0f9eb2bf3b2b15c02af867cb35ad450898169f3b1f0bd818651
level 0
DEBUG: read tree root path failed for tree csum, ret=-5
BTRFS warning (device loop0): checksum verify failed on logical
5328896 mirror 1 wanted
0x51be4e8b303da58e6340226815b70e3a93592dac3f30dd510c7517454de8567a
found 0x51be4e8b303da58e634022a315b70e3a93592dac3f30dd510c7517454de8567a
level 0
BTRFS warning (device loop0): checksum verify failed on logical
5292032 mirror 1 wanted
0x1924ccd683be9efc2fa98582ef58760e3848e9043db8649ee382681e220cdee4
found 0x0cb6184f6e8799d9f8cb335dccd1d1832da1071d12290dab3b85b587ecacca6e
level 0
process 'repro' launched './file2' with NULL argv: empty string added
DEBUG: no csum root, idatacsums=0 ibadroots=134217728
Oops: general protection fault, probably for non-canonical address
0xdffffc0000000041: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000208-0x000000000000020f]
CPU: 5 UID: 0 PID: 1010 Comm: repro Tainted: G OE
6.15.0-custom+ #249 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
RIP: 0010:btrfs_lookup_csum+0x93/0x3d0 [btrfs]
Call Trace:
<TASK>
btrfs_lookup_bio_sums+0x47a/0xdf0 [btrfs]
btrfs_submit_bbio+0x43e/0x1a80 [btrfs]
submit_one_bio+0xde/0x160 [btrfs]
btrfs_readahead+0x498/0x6a0 [btrfs]
read_pages+0x1c3/0xb20
page_cache_ra_order+0x4b5/0xc20
filemap_get_pages+0x2d3/0x19e0
filemap_read+0x314/0xde0
__kernel_read+0x35b/0x900
bprm_execve+0x62e/0x1140
do_execveat_common.isra.0+0x3fc/0x520
__x64_sys_execveat+0xdc/0x130
do_syscall_64+0x54/0x1d0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
---[ end trace 0000000000000000 ]---

[CAUSE]
First, the fs has a corrupted csum tree root, so to mount it we have
to use the "ro,rescue=ibadroots" mount option.

Normally with that mount option, a bad csum tree root should set the
BTRFS_FS_STATE_NO_DATA_CSUMS flag, so that any future data read will
skip the csum search.

But in this particular case, the following code path leaves the csum
root NULL without setting BTRFS_FS_STATE_NO_DATA_CSUMS:

  load_global_roots_objectid():

		ret = btrfs_search_slot();
		/* Succeeded */
		btrfs_item_key_to_cpu()
		found = true;
		/* We found the root item for csum tree. */
		root = read_tree_root_path();
		if (IS_ERR(root)) {
			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS))
			/*
			 * Since we have rescue=ibadroots mount option,
			 * @ret is still 0.
			 */
			break;
	if (!found || ret) {
		/* @found is true, @ret is 0, error handling for csum
		 * tree is skipped.
		 */
	}

This means we completely skip setting BTRFS_FS_STATE_NO_DATA_CSUMS if
the csum tree is corrupted, which results in an unexpected csum lookup
later.

[FIX]
If read_tree_root_path() fails, always set @ret to the error number,
because at the end of the function we need @ret to determine whether
to do the extra error handling for the csum tree.

Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rb tree")

Reported-by: Zhiyu Zhang <zhiyuz...@gmail.com>
Reported-by: Longxing Li <coreg...@gmail.com>
Signed-off-by: Qu Wenruo <w...@suse.com>
---
fs/btrfs/disk-io.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f48f9d924a62..0d6ad7512f21 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2158,8 +2158,7 @@ static int load_global_roots_objectid(struct btrfs_root *tree_root,
 		found = true;
 		root = read_tree_root_path(tree_root, path, &key);
 		if (IS_ERR(root)) {
-			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS))
-				ret = PTR_ERR(root);
+			ret = PTR_ERR(root);
 			break;
 		}
 		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
--
2.49.0

Zhiyu Zhang

Jun 7, 2025, 2:02:06 AM
to syzkaller
Hi Qu,

Thank you for looking into this and for the detailed explanation.

> Accessing files from an unknown source like this doesn't feel safe.
>
> Can you let syzbot reproduce this and forward the report?

Apologies for the Google Drive links, which can look unsafe coming from an external site. Unfortunately, I have not yet figured out the proper way to let syzbot reproduce and forward our report.

From a security perspective, sharing through GitHub doesn't seem appropriate either, so I have inlined the crash report below instead:
======
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000041: 0000 [#1] PREEMPT SMP KASAN NOPTI

KASAN: null-ptr-deref in range [0x0000000000000208-0x000000000000020f]
CPU: 1 UID: 0 PID: 49740 Comm: syz.4.2981 Not tainted 6.12.28 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:btrfs_lookup_csum+0xa2/0x410 fs/btrfs/file-item.c:206
Code: 48 8b 04 25 28 00 00 00 48 89 84 24 d8 00 00 00 31 c0 e8 d1 83 f9 fd 49 8d bd 08 02 00 00 44 8b 4c 24 08 48 89 f8 48 c1 e8 03 <42> 80 3c 30 00 0f 85 1d 03 00 00 4d 8b bd 08 02 00 00 4c 8d b4 24
RSP: 0018:ffffc9000999ef60 EFLAGS: 00010206
RAX: 0000000000000041 RBX: 1ffff92001333df0 RCX: 000000000050d000
RDX: ffff8880841b8000 RSI: ffffffff8394411f RDI: 0000000000000208
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88812550006b R11: 0000000000000000 R12: ffff88801eae0420
R13: 0000000000000000 R14: dffffc0000000000 R15: 0000000000001000
FS:  00007f7ab8c5c640(0000) GS:ffff888135e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7ab8c5bff0 CR3: 000000002c354000 CR4: 0000000000752ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 80000000
Call Trace:
 <TASK>
 search_csum_tree fs/btrfs/file-item.c:313 [inline]
 btrfs_lookup_bio_sums+0x775/0xf50 fs/btrfs/file-item.c:410
 btrfs_submit_chunk fs/btrfs/bio.c:717 [inline]
 btrfs_submit_bbio+0x535/0x1a70 fs/btrfs/bio.c:792
 submit_one_bio+0x123/0x1c0 fs/btrfs/extent_io.c:127
 btrfs_readahead+0x520/0x6b0 fs/btrfs/extent_io.c:2381
 read_pages+0x1a8/0xdc0 mm/readahead.c:160
 page_cache_ra_unbounded+0x3c0/0x6c0 mm/readahead.c:290
 do_page_cache_ra mm/readahead.c:320 [inline]
 page_cache_ra_order+0x8f2/0xc80 mm/readahead.c:519
 page_cache_sync_ra+0x4b4/0x9c0 mm/readahead.c:607
 page_cache_sync_readahead include/linux/pagemap.h:1394 [inline]
 filemap_get_pages+0xd7c/0x1be0 mm/filemap.c:2559
 filemap_read+0x3b2/0xd50 mm/filemap.c:2657
 btrfs_file_read_iter+0x17b/0x1c0 fs/btrfs/file.c:3795
 __kernel_read+0x3f1/0xb50 fs/read_write.c:527
 kernel_read+0x55/0x70 fs/read_write.c:545
 prepare_binprm fs/exec.c:1721 [inline]
 search_binary_handler fs/exec.c:1770 [inline]
 exec_binprm fs/exec.c:1828 [inline]
 bprm_execve fs/exec.c:1880 [inline]
 bprm_execve+0x61e/0x18b0 fs/exec.c:1856
 do_execveat_common.isra.0+0x4f1/0x630 fs/exec.c:1985
 do_execveat fs/exec.c:2070 [inline]
 __do_sys_execveat fs/exec.c:2144 [inline]
 __se_sys_execveat fs/exec.c:2138 [inline]
 __x64_sys_execveat+0xda/0x120 fs/exec.c:2138
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
...
======
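
As a side note on reading the report, the non-canonical address and the KASAN range appear to be related through the usual x86_64 KASAN shadow mapping, shadow = (addr >> 3) + 0xdffffc0000000000. The sketch below is plain user-space arithmetic, not kernel code; the two function names are hypothetical, and the constants are the generic x86_64 KASAN values rather than anything taken from this particular kernel config.

```c
#include <assert.h>
#include <stdint.h>

/* Generic x86_64 KASAN parameters (assumed, not read from this config). */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Shadow byte address that KASAN checks before an access to addr. */
static uint64_t model_kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* First address of the 8-byte granule covered by a shadow byte. */
static uint64_t model_kasan_range_start(uint64_t shadow)
{
	assert(shadow >= KASAN_SHADOW_OFFSET);
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}
```

Plugging in the values from the report: 0x208 >> 3 is 0x41 (RAX), and adding 0xdffffc0000000000 (R14) gives the faulting address 0xdffffc0000000041, which maps back to the range starting at 0x208 (RDI). That is consistent with a load at offset 0x208 from a NULL base pointer; that this offset corresponds to root->fs_info in this build is an assumption based on the description above.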

While triaging we noticed two syzbot reports that look superficially similar but have a different root cause; they might be useful for comparison:
[1] https://groups.google.com/g/syzkaller-bugs/c/EV1hWaW8E08/m/IV1qNjHgAwAJ
[2] https://groups.google.com/g/syzkaller-bugs/c/skISr7ekIcY/m/QUAsiN6qAQAJ

> The call trace just looks like a regular page read, yet BTRFS_FS_STATE_NO_DATA_CSUMS was not set, which isn't correct.
>
> I'd prefer to dig deeper to find out why.

After reading your patch, I corrected my understanding of the issue:

With ro,rescue=ibadroots, the corrupted csum-tree root is detected in load_global_roots_objectid(). Because IGNOREBADROOTS is set, the code does not propagate the error into ret, so the later bad-root handling path is skipped and BTRFS_FS_STATE_NO_DATA_CSUMS is never set. A subsequent normal page read therefore performs a csum lookup and hits a NULL pointer dereference.
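
The control flow can be modelled in user space roughly as follows. This is a simplified sketch, not the kernel function: struct model_mount and model_load_csum_root() are hypothetical names, the search loop is collapsed into a single read result, and the tail handling (set the flag, then either return the error or clear it under rescue=ibadroots) reflects my reading of the function and should be treated as an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct model_mount {
	bool ignore_bad_roots;	/* models rescue=ibadroots (IGNOREBADROOTS) */
	bool no_data_csums;	/* models BTRFS_FS_STATE_NO_DATA_CSUMS */
};

/*
 * read_err: 0 if the csum root was read, a negative errno otherwise.
 * fixed:    false models the original logic, true models the patch.
 * Returns the value the mount path would see.
 */
static int model_load_csum_root(struct model_mount *m, int read_err, bool fixed)
{
	bool found = true;	/* the root item itself was found */
	int ret = 0;

	assert(read_err <= 0);

	if (read_err) {
		/* Original: ret is only set without rescue=ibadroots. */
		if (fixed || !m->ignore_bad_roots)
			ret = read_err;
	}

	if (!found || ret) {
		/* Bad-root handling for the csum tree. */
		m->no_data_csums = true;
		if (!m->ignore_bad_roots)
			ret = ret ? ret : -ENOENT;
		else
			ret = 0;
	}
	return ret;
}
```

With rescue=ibadroots and a read error of -EIO (the ret=-5 in the debug output), the original logic returns 0 but leaves no_data_csums false, while the patched logic returns 0 with no_data_csums set, so later reads would skip the csum lookup.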

I think your patch is more elegant and correct. Thank you very much.

Thanks again for your time!

Best regards,
Zhiyu Zhang


On Saturday, June 7, 2025 at 06:52:34 UTC+8, Qu Wenruo wrote: