[syzbot] [bcachefs?] kernel BUG in bch2_journal_key_insert_take

syzbot

Oct 21, 2024, 9:07:43 AM
to kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 15e7d45e786a Add linux-next specific files for 20241016
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=10a5c240580000
kernel config: https://syzkaller.appspot.com/x/.config?x=c36416f1c54640c0
dashboard link: https://syzkaller.appspot.com/bug?extid=47f334396d741f9cb1ce
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11044487980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12815830580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/cf2ad43c81cc/disk-15e7d45e.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/c85347a66a1c/vmlinux-15e7d45e.xz
kernel image: https://storage.googleapis.com/syzbot-assets/648cf8e59c13/bzImage-15e7d45e.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/6ba77e840d2c/mount_0.gz

The issue was bisected to:

commit d59f4aba096298347f0e0e5402843bb8505edc2d
Author: Kent Overstreet <kent.ov...@linux.dev>
Date: Sat Oct 12 02:53:09 2024 +0000

bcachefs: -o norecovery now bails out of recovery earlier

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1580c487980000
final oops: https://syzkaller.appspot.com/x/report.txt?x=1780c487980000
console output: https://syzkaller.appspot.com/x/log.txt?x=1380c487980000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+47f334...@syzkaller.appspotmail.com
Fixes: d59f4aba0962 ("bcachefs: -o norecovery now bails out of recovery earlier")
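
As an illustration of the tag format requested above, here is a minimal Python sketch that derives the `Fixes:` line from the full commit hash and subject given in the bisection result. The helper name `format_fixes_tag` is hypothetical, and the 12-character abbreviation is assumed from the tag syzbot printed.

```python
def format_fixes_tag(full_hash: str, subject: str, abbrev: int = 12) -> str:
    """Build a Fixes: trailer from a full commit hash and its subject line.

    The 12-character abbreviation matches the tag shown in this report;
    it is an assumption here, not something read from a git config.
    """
    if len(full_hash) < abbrev:
        raise ValueError("commit hash is shorter than the requested abbreviation")
    return f'Fixes: {full_hash[:abbrev]} ("{subject}")'


# Values copied from the bisection result in this report.
tag = format_fixes_tag(
    "d59f4aba096298347f0e0e5402843bb8505edc2d",
    "bcachefs: -o norecovery now bails out of recovery earlier",
)
print(tag)
```

The printed line should match the `Fixes:` tag that syzbot asks for.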

------------[ cut here ]------------
kernel BUG at fs/bcachefs/btree_journal_iter.c:190!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 1 UID: 0 PID: 1169 Comm: kworker/1:2 Not tainted 6.12.0-rc3-next-20241016-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: bcachefs_write_ref bch2_delete_dead_snapshots_work (bcachefs-delete-dead-snapshots/)
RIP: 0010:bch2_journal_key_insert_take+0x180f/0x1830 fs/bcachefs/btree_journal_iter.c:190
Code: f1 fc ff ff e8 d2 51 78 fd 90 0f 0b e8 ca 51 78 fd 90 0f 0b e8 c2 51 78 fd 90 0f 0b e8 ba 51 78 fd 90 0f 0b e8 b2 51 78 fd 90 <0f> 0b e8 4a a1 af 07 e8 a5 51 78 fd 90 0f 0b e8 9d 51 78 fd 90 0f
RSP: 0018:ffffc9000430edc0 EFLAGS: 00010293
RAX: ffffffff841c909e RBX: 0000000000000040 RCX: ffff8880272b8000
RDX: 0000000000000000 RSI: 0000000000000040 RDI: 0000000000000000
RBP: ffffc9000430ef30 R08: ffffffff841c7a8e R09: 1ffff1100de80035
R10: dffffc0000000000 R11: ffffed100de80036 R12: 0000000000000000
R13: ffff88806f400000 R14: dffffc0000000000 R15: ffff88806f44b310
FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000559ab32ac530 CR3: 00000000744be000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
bch2_journal_key_insert+0xb3/0x130 fs/bcachefs/btree_journal_iter.c:260
do_bch2_trans_commit_to_journal_replay+0x111/0x420 fs/bcachefs/btree_trans_commit.c:1003
__bch2_trans_commit+0x15d9/0x9420 fs/bcachefs/btree_trans_commit.c:1039
bch2_trans_commit fs/bcachefs/btree_update.h:184 [inline]
bch2_delete_dead_snapshots+0x19b6/0x5ae0 fs/bcachefs/snapshot.c:1655
bch2_delete_dead_snapshots_work+0x34/0x40 fs/bcachefs/snapshot.c:1730
process_one_work kernel/workqueue.c:3229 [inline]
process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
worker_thread+0x870/0xd30 kernel/workqueue.c:3391
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:bch2_journal_key_insert_take+0x180f/0x1830 fs/bcachefs/btree_journal_iter.c:190
Cod


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about the bisection process, see: https://goo.gl/tpsmEJ#bisection

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup
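
The reply commands listed above all share the shape `#syz <command>[: <argument>]`. The following Python sketch splits such a line into its command and argument. It is only an illustration of that shape; `parse_syz_command` is a hypothetical helper and not syzbot's own parser.

```python
from typing import Optional, Tuple


def parse_syz_command(line: str) -> Optional[Tuple[str, str]]:
    """Split a '#syz ...' reply line into (command, argument).

    Returns None when the line is not a syzbot command. The argument is
    an empty string for commands that take none, such as '#syz undup'.
    """
    line = line.strip()
    if not line.startswith("#syz "):
        return None
    body = line[len("#syz "):]
    # Only the first colon separates command from argument, so URLs such
    # as git://... in the argument are left intact.
    command, _, argument = body.partition(":")
    return command.strip(), argument.strip()


# Example lines taken from the instructions in this report.
for example in (
    "#syz fix: exact-commit-title",
    "#syz test: git://repo/address.git branch-or-commit-hash",
    "#syz set subsystems: new-subsystem",
    "#syz undup",
):
    print(parse_syz_command(example))
```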

Piotr Zalewski

Oct 21, 2024, 6:25:55 PM
to syzbot+47f334...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
#syz test
test.patch

syzbot

Oct 21, 2024, 6:58:08 PM
to linux-...@vger.kernel.org, pz01000...@proton.me, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: task hung in bch2_copygc_stop

INFO: task syz-executor:5854 blocked for more than 143 seconds.
Not tainted 6.12.0-rc3-next-20241021-syzkaller-g63b3ff03d91a-dirty #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor state:D stack:21216 pid:5854 tgid:5854 ppid:1 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5328 [inline]
__schedule+0x18af/0x4bd0 kernel/sched/core.c:6690
__schedule_loop kernel/sched/core.c:6767 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6782
schedule_timeout+0xb0/0x290 kernel/time/sleep_timeout.c:75
do_wait_for_common kernel/sched/completion.c:95 [inline]
__wait_for_common kernel/sched/completion.c:116 [inline]
wait_for_common kernel/sched/completion.c:127 [inline]
wait_for_completion+0x355/0x620 kernel/sched/completion.c:148
kthread_stop+0x19e/0x640 kernel/kthread.c:712
bch2_copygc_stop+0x57/0x140 fs/bcachefs/movinggc.c:411
__bch2_fs_read_only+0x47/0x430 fs/bcachefs/super.c:265
bch2_fs_read_only+0xbb9/0x1270 fs/bcachefs/super.c:355
__bch2_fs_stop+0x105/0x5c0 fs/bcachefs/super.c:620
generic_shutdown_super+0x139/0x2d0 fs/super.c:642
bch2_kill_sb+0x41/0x50 fs/bcachefs/fs.c:2278
deactivate_locked_super+0xc4/0x130 fs/super.c:473
cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373
task_work_run+0x24f/0x310 kernel/task_work.c:239
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x168/0x370 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5c4a37f327
RSP: 002b:00007fff1d156ed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f5c4a37f327
RDX: 0000000000000000 RSI: 0000000000000009 RDI: 00007fff1d156f90
RBP: 00007fff1d156f90 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000246 R12: 00007fff1d158010
R13: 00007f5c4a3f0134 R14: 0000000000018b83 R15: 00007fff1d158050
</TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/30:
#0: ffffffff8e939fa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
#0: ffffffff8e939fa0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
#0: ffffffff8e939fa0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x55/0x2a0 kernel/locking/lockdep.c:6720
2 locks held by getty/4988:
#0: ffff88814befb0a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
#1: ffffc900031232f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x6a6/0x1e00 drivers/tty/n_tty.c:2211
2 locks held by syz-executor/5854:
#0: ffff8880212ae0e0 (&type->s_umount_key#51){+.+.}-{3:3}, at: __super_lock fs/super.c:56 [inline]
#0: ffff8880212ae0e0 (&type->s_umount_key#51){+.+.}-{3:3}, at: __super_lock_excl fs/super.c:71 [inline]
#0: ffff8880212ae0e0 (&type->s_umount_key#51){+.+.}-{3:3}, at: deactivate_super+0xb5/0xf0 fs/super.c:505
#1: ffff888064900278 (&c->state_lock){+.+.}-{3:3}, at: __bch2_fs_stop+0xfd/0x5c0 fs/bcachefs/super.c:619

=============================================

NMI backtrace for cpu 1
CPU: 1 UID: 0 PID: 30 Comm: khungtaskd Not tainted 6.12.0-rc3-next-20241021-syzkaller-g63b3ff03d91a-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
nmi_cpu_backtrace+0x49c/0x4d0 lib/nmi_backtrace.c:113
nmi_trigger_cpumask_backtrace+0x198/0x320 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:223 [inline]
watchdog+0xff4/0x1040 kernel/hung_task.c:379
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 2471 Comm: kworker/u8:7 Not tainted 6.12.0-rc3-next-20241021-syzkaller-g63b3ff03d91a-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: events_unbound cfg80211_wiphy_work
RIP: 0010:orc_ip arch/x86/kernel/unwind_orc.c:80 [inline]
RIP: 0010:__orc_find arch/x86/kernel/unwind_orc.c:102 [inline]
RIP: 0010:orc_find arch/x86/kernel/unwind_orc.c:227 [inline]
RIP: 0010:unwind_next_frame+0x6d5/0x22d0 arch/x86/kernel/unwind_orc.c:494
Code: 89 c1 48 c1 f9 02 48 c1 e8 3f 48 01 c8 48 83 e0 fe 49 8d 1c 46 48 89 d8 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df 0f b6 04 08 <84> c0 75 27 48 63 03 48 01 d8 48 8d 4b 04 4c 39 f8 4c 0f 46 f1 48
RSP: 0018:ffffc900097e7090 EFLAGS: 00000a06
RAX: 0000000000000000 RBX: ffffffff9092dbe8 RCX: dffffc0000000000
RDX: 00000000000b0001 RSI: ffffffff913b0774 RDI: ffffffff81416920
RBP: ffffffff9092dbe8 R08: 0000000000000001 R09: ffffc900097e7250
R10: ffffc900097e71b0 R11: ffffffff8180a260 R12: ffffffff9092dbe8
R13: ffffffff9092dbe8 R14: ffffffff9092dbe8 R15: ffffffff8b2eca34
FS: 0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f36f83ff000 CR3: 000000000e736000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<NMI>
</NMI>
<TASK>
arch_stack_walk+0x11c/0x150 arch/x86/kernel/stacktrace.c:25
stack_trace_save+0x118/0x1d0 kernel/stacktrace.c:122
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
__kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394
kasan_kmalloc include/linux/kasan.h:260 [inline]
__do_kmalloc_node mm/slub.c:4283 [inline]
__kmalloc_noprof+0x285/0x4c0 mm/slub.c:4295
kmalloc_noprof include/linux/slab.h:905 [inline]
kzalloc_noprof include/linux/slab.h:1037 [inline]
ieee802_11_parse_elems_full+0xdb/0x2880 net/mac80211/parse.c:958
ieee802_11_parse_elems_crc net/mac80211/ieee80211_i.h:2384 [inline]
ieee802_11_parse_elems net/mac80211/ieee80211_i.h:2391 [inline]
ieee80211_rx_mgmt_probe_beacon net/mac80211/ibss.c:1575 [inline]
ieee80211_ibss_rx_queued_mgmt+0x4c8/0x2d70 net/mac80211/ibss.c:1606
ieee80211_iface_process_skb net/mac80211/iface.c:1603 [inline]
ieee80211_iface_work+0x8a5/0xf20 net/mac80211/iface.c:1657
cfg80211_wiphy_work+0x2db/0x490 net/wireless/core.c:440
process_one_work kernel/workqueue.c:3229 [inline]
process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
worker_thread+0x870/0xd30 kernel/workqueue.c:3391
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>


Tested on:

commit: 63b3ff03 Add linux-next specific files for 20241021
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12215c87980000
kernel config: https://syzkaller.appspot.com/x/.config?x=d2da11284432f402
dashboard link: https://syzkaller.appspot.com/bug?extid=47f334396d741f9cb1ce
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=14ae5c87980000

Piotr Zalewski

Oct 22, 2024, 4:21:33 PM
to syzbot, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
It seems this BUG is no longer being triggered. Output with 6.12-rc5:
```
executing program
[ 22.799942] loop0: detected capacity change from 0 to 32768
[ 22.835885] bcachefs (loop0): starting version 1.7: mi_btree_bitmap opts=metadata_checksum=none,data_checksum=ns
[ 22.837429] bcachefs (loop0): recovering from clean shutdown, journal seq 10
[ 22.843604] bcachefs (loop0): check_topology... done
[ 22.844050] bcachefs (loop0): accounting_read... done
[ 22.849861] bcachefs (loop0): alloc_read... done
[ 22.850282] bcachefs (loop0): stripes_read... done
[ 22.850738] bcachefs (loop0): snapshots_read... done
[ 22.851370] bcachefs (loop0): check_allocations...
[ 22.852430] btree ptr not marked in member info btree allocated bitmap
[ 22.852441] u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq ac62141f8dc7e261 written 24 min_key POS_MIN dg
[ 22.854503] bcachefs (loop0): Unable to continue, halting
[ 22.854977] bcachefs (loop0): bch2_gc_mark_key(): error fsck_errors_not_fixed
[ 22.855673] bcachefs (loop0): bch2_gc_btree(): error fsck_errors_not_fixed
[ 22.856268] bcachefs (loop0): bch2_gc_btrees(): error fsck_errors_not_fixed
[ 22.857042] bcachefs (loop0): bch2_check_allocations(): error fsck_errors_not_fixed
[ 22.857904] bcachefs (loop0): bch2_fs_recovery(): error fsck_errors_not_fixed
[ 22.858378] bcachefs (loop0): bch2_fs_start(): error starting filesystem fsck_errors_not_fixed
[ 22.858933] bcachefs (loop0): shutting down
[ 22.863434] bcachefs (loop0): shutdown complete
[ 23.256941] bcachefs: bch2_fs_get_tree() error: fsck_errors_not_fixed
[ 23.270355] repro.elf (193) used greatest stack depth: 25792 bytes left
```
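
To make the error propagation in the log above easier to read, here is a small Python sketch that pulls out the functions reporting an error and the error name each one carries. The regular expression is an assumption fitted to the lines in this particular log, not a general bcachefs log format, and `error_chain` is a hypothetical helper.

```python
import re

# Matches lines such as
#   "bcachefs (loop0): bch2_gc_btree(): error fsck_errors_not_fixed"
#   "bcachefs: bch2_fs_get_tree() error: fsck_errors_not_fixed"
# Assumption: the error name is the last word on the line.
ERROR_RE = re.compile(r"(bch2_\w+)\(\):? error:? .*?(\w+)$")


def error_chain(log_text: str):
    """Return (function, error) pairs in the order they appear in the log."""
    chain = []
    for line in log_text.splitlines():
        match = ERROR_RE.search(line.rstrip())
        if match:
            chain.append((match.group(1), match.group(2)))
    return chain


# A subset of the lines from the log above.
sample = """\
[ 22.854503] bcachefs (loop0): Unable to continue, halting
[ 22.854977] bcachefs (loop0): bch2_gc_mark_key(): error fsck_errors_not_fixed
[ 22.855673] bcachefs (loop0): bch2_gc_btree(): error fsck_errors_not_fixed
[ 22.857904] bcachefs (loop0): bch2_fs_recovery(): error fsck_errors_not_fixed
[ 22.858378] bcachefs (loop0): bch2_fs_start(): error starting filesystem fsck_errors_not_fixed
[ 22.858933] bcachefs (loop0): shutting down
[ 23.256941] bcachefs: bch2_fs_get_tree() error: fsck_errors_not_fixed
"""

for func, err in error_chain(sample):
    print(f"{func}: {err}")
```

On this sample, every matched function reports `fsck_errors_not_fixed`, which is consistent with the mount failing during `check_allocations` before the code path from the original BUG is reached.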



syzbot

Jan 13, 2025, 10:19:14 PM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
No recent activity, existing reproducers are no longer triggering the issue.