[syzbot] [kernfs?] KASAN: slab-out-of-bounds Read in wb_writeback


syzbot

Apr 1, 2024, 6:53:35 AM
to bra...@kernel.org, gre...@linuxfoundation.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, vi...@zeniv.linux.org.uk
Hello,

syzbot found the following issue on:

HEAD commit: a6bd6c933339 Add linux-next specific files for 20240328
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=122c4ffd180000
kernel config: https://syzkaller.appspot.com/x/.config?x=b0058bda1436e073
dashboard link: https://syzkaller.appspot.com/bug?extid=7b219b86935220db6dd8
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/7c1618ff7d25/disk-a6bd6c93.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/875519f620fe/vmlinux-a6bd6c93.xz
kernel image: https://storage.googleapis.com/syzbot-assets/ad92b057fb96/bzImage-a6bd6c93.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7b219b...@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: slab-out-of-bounds in __lock_acquire+0x78/0x1fd0 kernel/locking/lockdep.c:5005
Read of size 8 at addr ffff888023263fa8 by task kworker/u8:0/10

CPU: 0 PID: 10 Comm: kworker/u8:0 Not tainted 6.9.0-rc1-next-20240328-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Workqueue: writeback wb_workfn (flush-8:0)
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
__lock_acquire+0x78/0x1fd0 kernel/locking/lockdep.c:5005
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
wb_writeback+0x66f/0xd30 fs/fs-writeback.c:2160
wb_do_writeback fs/fs-writeback.c:2274 [inline]
wb_workfn+0x410/0x1090 fs/fs-writeback.c:2314
process_one_work kernel/workqueue.c:3218 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
</TASK>

Allocated by task 8:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
__kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
kasan_kmalloc include/linux/kasan.h:211 [inline]
__do_kmalloc_node mm/slub.c:4048 [inline]
kmalloc_node_track_caller_noprof+0x22a/0x450 mm/slub.c:4068
kmalloc_reserve+0x111/0x2a0 net/core/skbuff.c:599
__alloc_skb+0x1f3/0x440 net/core/skbuff.c:668
alloc_skb include/linux/skbuff.h:1318 [inline]
nsim_dev_trap_skb_build drivers/net/netdevsim/dev.c:748 [inline]
nsim_dev_trap_report drivers/net/netdevsim/dev.c:805 [inline]
nsim_dev_trap_report_work+0x254/0xaa0 drivers/net/netdevsim/dev.c:850
process_one_work kernel/workqueue.c:3218 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243

Freed by task 8:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2180 [inline]
slab_free mm/slub.c:4363 [inline]
kfree+0x149/0x350 mm/slub.c:4484
skb_kfree_head net/core/skbuff.c:1096 [inline]
skb_free_head net/core/skbuff.c:1108 [inline]
skb_release_data+0x585/0x870 net/core/skbuff.c:1136
skb_release_all net/core/skbuff.c:1202 [inline]
__kfree_skb net/core/skbuff.c:1216 [inline]
consume_skb+0xb3/0x160 net/core/skbuff.c:1432
nsim_dev_trap_report drivers/net/netdevsim/dev.c:821 [inline]
nsim_dev_trap_report_work+0x765/0xaa0 drivers/net/netdevsim/dev.c:850
process_one_work kernel/workqueue.c:3218 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243

The buggy address belongs to the object at ffff888023262000
which belongs to the cache kmalloc-4k of size 4096
The buggy address is located 4008 bytes to the right of
allocated 4096-byte region [ffff888023262000, ffff888023263000)
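
The "4008 bytes to the right" figure can be checked by hand from the addresses printed in the report. A minimal userspace sketch, assuming nothing beyond those values; the helper name `bytes_right_of` is hypothetical and is not part of KASAN:

```c
#include <stdint.h>

/*
 * Hypothetical helper: distance from the (exclusive) end of an object
 * to an address that lies past it. The caller must make sure that
 * addr is not below obj_start + obj_size.
 */
static uint64_t bytes_right_of(uint64_t obj_start, uint64_t obj_size,
			       uint64_t addr)
{
	uint64_t obj_end = obj_start + obj_size;	/* first byte after the object */

	return addr - obj_end;
}
```

With the values from this report, `bytes_right_of(0xffff888023262000, 4096, 0xffff888023263fa8)` evaluates to 0xfa8, which is the 4008 that KASAN prints.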

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x23260
head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0xfff80000000040(head|node=0|zone=1|lastcpupid=0xfff)
page_type: 0xffffefff(slab)
raw: 00fff80000000040 ffff888015042140 dead000000000100 dead000000000122
raw: 0000000000000000 0000000000040004 00000001ffffefff 0000000000000000
head: 00fff80000000040 ffff888015042140 dead000000000100 dead000000000122
head: 0000000000000000 0000000000040004 00000001ffffefff 0000000000000000
head: 00fff80000000003 ffffea00008c9801 ffffea00008c9848 00000000ffffffff
head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 1, tgid 393129986 (swapper/0), ts 1, free_ts 0
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1487
prep_new_page mm/page_alloc.c:1495 [inline]
get_page_from_freelist+0x2e8a/0x2f40 mm/page_alloc.c:3454
__alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4712
__alloc_pages_node_noprof include/linux/gfp.h:244 [inline]
alloc_pages_node_noprof include/linux/gfp.h:271 [inline]
alloc_slab_page+0x5f/0x120 mm/slub.c:2249
allocate_slab+0x5a/0x2e0 mm/slub.c:2412
new_slab mm/slub.c:2465 [inline]
___slab_alloc+0xea8/0x1430 mm/slub.c:3599
__slab_alloc+0x58/0xa0 mm/slub.c:3684
__slab_alloc_node mm/slub.c:3737 [inline]
slab_alloc_node mm/slub.c:3915 [inline]
kmalloc_trace_noprof+0x1d5/0x2b0 mm/slub.c:4074
kmalloc_noprof include/linux/slab.h:660 [inline]
kzalloc_noprof include/linux/slab.h:775 [inline]
kobject_uevent_env+0x28b/0x8e0 lib/kobject_uevent.c:525
driver_register+0x2d6/0x320 drivers/base/driver.c:254
do_one_initcall+0x248/0x880 init/main.c:1244
do_initcall_level+0x157/0x210 init/main.c:1306
do_initcalls+0x3f/0x80 init/main.c:1322
kernel_init_freeable+0x435/0x5d0 init/main.c:1555
kernel_init+0x1d/0x2b0 init/main.c:1444
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
page_owner free stack trace missing

Memory state around the buggy address:
 ffff888023263e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888023263f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888023263f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                  ^
 ffff888023264000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff888023264080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
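
The "Memory state" dump is easier to read with its layout spelled out. The following is a sketch, assuming generic KASAN, where one shadow byte describes 8 bytes of memory, so that each dump row of 16 shadow bytes covers 128 bytes. The macro and helper names are hypothetical:

```c
#include <stdint.h>

#define GRANULE_SIZE	8	/* bytes of memory per shadow byte (assumed: generic KASAN) */
#define SHADOW_PER_ROW	16	/* shadow bytes printed per dump row */

/* Start address of the dump row that contains addr. */
static uint64_t dump_row_base(uint64_t addr)
{
	uint64_t row_bytes = (uint64_t)GRANULE_SIZE * SHADOW_PER_ROW;	/* 128 */

	return addr - (addr % row_bytes);
}

/* Zero-based index of the shadow byte for addr within its dump row. */
static unsigned int dump_row_index(uint64_t addr)
{
	return (unsigned int)((addr - dump_row_base(addr)) / GRANULE_SIZE);
}
```

For the faulting address ffff888023263fa8 this gives row ffff888023263f80 (the line marked with `>`) and index 5, so the sixth shadow byte on that line describes the access. In generic KASAN a shadow value of fc marks slab redzone and 00 marks a fully accessible 8-byte granule.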


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

Apr 2, 2024, 10:38:27 AM
to bra...@kernel.org, gre...@linuxfoundation.org, ja...@suse.cz, konishi...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, vi...@zeniv.linux.org.uk
syzbot has found a reproducer for the following issue on:

HEAD commit: c0b832517f62 Add linux-next specific files for 20240402
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=14af7dd9180000
kernel config: https://syzkaller.appspot.com/x/.config?x=afcaf46d374cec8c
dashboard link: https://syzkaller.appspot.com/bug?extid=7b219b86935220db6dd8
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1729f003180000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17fa4341180000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0d36ec76edc7/disk-c0b83251.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/6f9bb4e37dd0/vmlinux-c0b83251.xz
kernel image: https://storage.googleapis.com/syzbot-assets/2349287b14b7/bzImage-c0b83251.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/9760c52a227c/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7b219b...@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: slab-out-of-bounds in __lock_acquire+0x78/0x1fd0 kernel/locking/lockdep.c:5005
Read of size 8 at addr ffff888020485fa8 by task kworker/u8:2/35

CPU: 0 PID: 35 Comm: kworker/u8:2 Not tainted 6.9.0-rc2-next-20240402-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Workqueue: writeback wb_workfn (flush-7:1)
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
__lock_acquire+0x78/0x1fd0 kernel/locking/lockdep.c:5005
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
wb_writeback+0x66f/0xd30 fs/fs-writeback.c:2160
wb_check_old_data_flush fs/fs-writeback.c:2233 [inline]
wb_do_writeback fs/fs-writeback.c:2286 [inline]
wb_workfn+0xba1/0x1090 fs/fs-writeback.c:2314
process_one_work kernel/workqueue.c:3218 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
</TASK>

Allocated by task 5052:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
__kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
kasan_kmalloc include/linux/kasan.h:211 [inline]
__do_kmalloc_node mm/slub.c:4048 [inline]
__kmalloc_noprof+0x200/0x410 mm/slub.c:4061
kmalloc_noprof include/linux/slab.h:664 [inline]
tomoyo_realpath_from_path+0xcf/0x5e0 security/tomoyo/realpath.c:251
tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
tomoyo_path_perm+0x2b7/0x740 security/tomoyo/file.c:822
security_inode_getattr+0xd8/0x130 security/security.c:2269
vfs_getattr+0x45/0x430 fs/stat.c:173
vfs_fstat fs/stat.c:198 [inline]
vfs_fstatat+0xd6/0x190 fs/stat.c:300
__do_sys_newfstatat fs/stat.c:468 [inline]
__se_sys_newfstatat fs/stat.c:462 [inline]
__x64_sys_newfstatat+0x125/0x1b0 fs/stat.c:462
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x72/0x7a

Freed by task 5052:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2180 [inline]
slab_free mm/slub.c:4363 [inline]
kfree+0x149/0x350 mm/slub.c:4484
tomoyo_realpath_from_path+0x5a9/0x5e0 security/tomoyo/realpath.c:286
tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
tomoyo_path_perm+0x2b7/0x740 security/tomoyo/file.c:822
security_inode_getattr+0xd8/0x130 security/security.c:2269
vfs_getattr+0x45/0x430 fs/stat.c:173
vfs_fstat fs/stat.c:198 [inline]
vfs_fstatat+0xd6/0x190 fs/stat.c:300
__do_sys_newfstatat fs/stat.c:468 [inline]
__se_sys_newfstatat fs/stat.c:462 [inline]
__x64_sys_newfstatat+0x125/0x1b0 fs/stat.c:462
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x72/0x7a

The buggy address belongs to the object at ffff888020484000
which belongs to the cache kmalloc-4k of size 4096
The buggy address is located 4008 bytes to the right of
allocated 4096-byte region [ffff888020484000, ffff888020485000)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x20480
head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0xfff80000000040(head|node=0|zone=1|lastcpupid=0xfff)
page_type: 0xffffefff(slab)
raw: 00fff80000000040 ffff888015042140 dead000000000100 dead000000000122
raw: 0000000000000000 0000000000040004 00000001ffffefff 0000000000000000
head: 00fff80000000040 ffff888015042140 dead000000000100 dead000000000122
head: 0000000000000000 0000000000040004 00000001ffffefff 0000000000000000
head: 00fff80000000003 ffffea0000812001 ffffea0000812048 00000000ffffffff
head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 1, tgid -957297381 (swapper/0), ts 1, free_ts 0
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1490
prep_new_page mm/page_alloc.c:1498 [inline]
get_page_from_freelist+0x2e7e/0x2f40 mm/page_alloc.c:3454
__alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4712
__alloc_pages_node_noprof include/linux/gfp.h:244 [inline]
alloc_pages_node_noprof include/linux/gfp.h:271 [inline]
alloc_slab_page+0x5f/0x120 mm/slub.c:2249
allocate_slab+0x5a/0x2e0 mm/slub.c:2412
new_slab mm/slub.c:2465 [inline]
___slab_alloc+0xea8/0x1430 mm/slub.c:3599
__slab_alloc+0x58/0xa0 mm/slub.c:3684
__slab_alloc_node mm/slub.c:3737 [inline]
slab_alloc_node mm/slub.c:3915 [inline]
kmalloc_node_trace_noprof+0x20c/0x300 mm/slub.c:4087
kmalloc_node_noprof include/linux/slab.h:677 [inline]
bdi_alloc+0x4f/0x140 mm/backing-dev.c:894
__alloc_disk_node+0xb8/0x590 block/genhd.c:1347
__blk_mq_alloc_disk+0x17d/0x260 block/blk-mq.c:4166
loop_add+0x448/0xba0 drivers/block/loop.c:2032
loop_init+0x17a/0x230 drivers/block/loop.c:2275
do_one_initcall+0x248/0x880 init/main.c:1258
do_initcall_level+0x157/0x210 init/main.c:1320
do_initcalls+0x3f/0x80 init/main.c:1336
page_owner free stack trace missing

Memory state around the buggy address:
 ffff888020485e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888020485f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888020485f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                  ^
 ffff888020486000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff888020486080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

Jan Kara

Apr 3, 2024, 5:47:22 AM
to Kemeng Shi, bra...@kernel.org, gre...@linuxfoundation.org, ja...@suse.cz, konishi...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, vi...@zeniv.linux.org.uk
On Tue 02-04-24 07:38:25, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: c0b832517f62 Add linux-next specific files for 20240402
> git tree: linux-next
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=14af7dd9180000
> kernel config: https://syzkaller.appspot.com/x/.config?x=afcaf46d374cec8c
> dashboard link: https://syzkaller.appspot.com/bug?extid=7b219b86935220db6dd8
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1729f003180000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17fa4341180000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/0d36ec76edc7/disk-c0b83251.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/6f9bb4e37dd0/vmlinux-c0b83251.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/2349287b14b7/bzImage-c0b83251.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/9760c52a227c/mount_0.gz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+7b219b...@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in __lock_acquire+0x78/0x1fd0 kernel/locking/lockdep.c:5005
> Read of size 8 at addr ffff888020485fa8 by task kworker/u8:2/35

Looks like the writeback cleanups are causing some use-after-free issues.
The code KASAN is complaining about is:

			/*
			 * Nothing written. Wait for some inode to
			 * become available for writeback. Otherwise
			 * we'll just busyloop.
			 */
			trace_writeback_wait(wb, work);
			inode = wb_inode(wb->b_more_io.prev);
>>>>>			spin_lock(&inode->i_lock); <<<<<<
			spin_unlock(&wb->list_lock);
			/* This function drops i_lock... */
			inode_sleep_on_writeback(inode);

in wb_writeback(). Looking at the changes, the commit
167d6693deb ("fs/writeback: bail out if there is no more inodes for IO and
queued once") is indeed buggy: it results in trying to fetch 'inode'
from an empty b_more_io list, and thus we'll corrupt memory. I think instead
of modifying the condition:

		if (list_empty(&wb->b_more_io)) {

we should do:

-		if (progress) {
+		if (progress || !queued) {
			spin_unlock(&wb->list_lock);
			continue;
		}
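
To illustrate why an empty b_more_io list is dangerous here: on an empty list, `.prev` points back at the list head itself, and converting that link to its "containing" object does not fail. It only subtracts a member offset and yields an address that does not belong to any inode. The following is a minimal userspace sketch of that effect, not the kernel code. The `fake_` structures, their layouts and the helper names are made up for illustration, and the emptiness check only demonstrates the hazard; it is not the fix proposed above.

```c
#include <stddef.h>
#include <stdint.h>

struct list_head { struct list_head *next, *prev; };

/* Made-up stand-in for struct inode: a lock followed (later) by a list link. */
struct fake_inode {
	long i_lock;
	char pad[200];
	struct list_head i_io_list;
};

/* Made-up stand-in for struct bdi_writeback. */
struct fake_wb {
	struct list_head b_more_io;
};

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

/*
 * Address that a container_of()-style conversion computes for a link,
 * kept as an integer so that nothing is dereferenced.
 */
static uintptr_t entry_addr(const struct list_head *link)
{
	return (uintptr_t)link - offsetof(struct fake_inode, i_io_list);
}

/* Same shape as wb_inode(): list link -> containing inode. */
static struct fake_inode *fake_wb_inode(struct list_head *link)
{
	return (struct fake_inode *)entry_addr(link);
}

/* Last inode on b_more_io, or NULL when the list is empty. */
static struct fake_inode *last_more_io_inode(struct fake_wb *wb)
{
	if (list_empty(&wb->b_more_io))
		return NULL;	/* .prev is the head here, not an inode */
	return fake_wb_inode(wb->b_more_io.prev);
}
```

In this sketch, `entry_addr(wb.b_more_io.prev)` on a freshly initialised `fake_wb` lands below the start of `wb`, which is the userspace analogue of the bogus `inode` whose `i_lock` the report shows being read from a slab redzone.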

Kemeng?

Honza
--
Jan Kara <ja...@suse.com>
SUSE Labs, CR

Christian Brauner

Apr 5, 2024, 7:06:08 AM
to Jan Kara, Kemeng Shi, gre...@linuxfoundation.org, konishi...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, vi...@zeniv.linux.org.uk
Fwiw, I observed this with xfstests too over the last few days and tracked it
down to this series. Here's the splat I got in case it helps:

Apr 05 00:33:06 localhost kernel: ==================================================================
Apr 05 00:33:06 localhost kernel: BUG: KASAN: slab-out-of-bounds in __lock_acquire.isra.0+0x1075/0x1280
Apr 05 00:33:06 localhost kernel: Read of size 8 at addr ffff88810ed40f48 by task kworker/u128:2/305560
Apr 05 00:33:06 localhost kernel:
Apr 05 00:33:06 localhost kernel: CPU: 5 PID: 305560 Comm: kworker/u128:2 Not tainted 99.9.0-rc2-gdebeafad51e2 #262
Apr 05 00:33:06 localhost kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/Incus, BIOS unknown 2/2/2022
Apr 05 00:33:06 localhost kernel: Workqueue: writeback wb_workfn (flush-259:0)
Apr 05 00:33:06 localhost kernel: Call Trace:
Apr 05 00:33:06 localhost kernel: <TASK>
Apr 05 00:33:06 localhost kernel: dump_stack_lvl+0x5a/0x90
Apr 05 00:33:06 localhost kernel: print_report+0xce/0x650
Apr 05 00:33:06 localhost kernel: ? __virt_addr_valid+0x217/0x320
Apr 05 00:33:06 localhost kernel: kasan_report+0xd7/0x110
Apr 05 00:33:06 localhost kernel: ? __lock_acquire.isra.0+0x1075/0x1280
Apr 05 00:33:06 localhost kernel: ? __lock_acquire.isra.0+0x1075/0x1280
Apr 05 00:33:06 localhost kernel: __lock_acquire.isra.0+0x1075/0x1280
Apr 05 00:33:06 localhost kernel: lock_acquire+0x136/0x330
Apr 05 00:33:06 localhost kernel: ? wb_writeback+0x255/0x870
Apr 05 00:33:06 localhost kernel: _raw_spin_lock+0x33/0x40
Apr 05 00:33:06 localhost kernel: ? wb_writeback+0x255/0x870
Apr 05 00:33:06 localhost kernel: wb_writeback+0x255/0x870
Apr 05 00:33:06 localhost kernel: ? __pfx_wb_writeback+0x10/0x10
Apr 05 00:33:06 localhost kernel: ? __pfx_lock_release+0x10/0x10
Apr 05 00:33:06 localhost kernel: wb_workfn+0x221/0xc80
Apr 05 00:33:06 localhost kernel: ? __pfx_wb_workfn+0x10/0x10
Apr 05 00:33:06 localhost kernel: ? lock_acquire+0x136/0x330
Apr 05 00:33:06 localhost kernel: process_one_work+0x82d/0x1790
Apr 05 00:33:06 localhost kernel: ? __pfx_process_one_work+0x10/0x10
Apr 05 00:33:06 localhost kernel: ? assign_work+0x16c/0x240
Apr 05 00:33:06 localhost kernel: worker_thread+0x724/0x1300
Apr 05 00:33:06 localhost kernel: ? __kthread_parkme+0xba/0x1f0
Apr 05 00:33:06 localhost kernel: ? __pfx_worker_thread+0x10/0x10
Apr 05 00:33:06 localhost kernel: kthread+0x2ed/0x3d0
Apr 05 00:33:06 localhost kernel: ? __pfx_kthread+0x10/0x10
Apr 05 00:33:06 localhost kernel: ret_from_fork+0x31/0x70
Apr 05 00:33:06 localhost kernel: ? __pfx_kthread+0x10/0x10
Apr 05 00:33:06 localhost kernel: ret_from_fork_asm+0x1a/0x30
Apr 05 00:33:06 localhost kernel: </TASK>
Apr 05 00:33:06 localhost kernel:
Apr 05 00:33:06 localhost kernel: Allocated by task 1:
Apr 05 00:33:06 localhost kernel: kasan_save_stack+0x33/0x60
Apr 05 00:33:06 localhost kernel: kasan_save_track+0x14/0x30
Apr 05 00:33:06 localhost kernel: __kasan_kmalloc+0xaa/0xb0
Apr 05 00:33:06 localhost kernel: psi_cgroup_alloc+0x57/0x2b0
Apr 05 00:33:06 localhost kernel: cgroup_mkdir+0x4f8/0xfb0
Apr 05 00:33:06 localhost kernel: kernfs_iop_mkdir+0x133/0x1c0
Apr 05 00:33:06 localhost kernel: vfs_mkdir+0x3b9/0x610
Apr 05 00:33:06 localhost kernel: do_mkdirat+0x27e/0x300
Apr 05 00:33:06 localhost kernel: __x64_sys_mkdir+0x65/0x80
Apr 05 00:33:06 localhost kernel: do_syscall_64+0x64/0x190
Apr 05 00:33:06 localhost kernel: entry_SYSCALL_64_after_hwframe+0x71/0x79
Apr 05 00:33:06 localhost kernel:
Apr 05 00:33:06 localhost kernel: Last potentially related work creation:
Apr 05 00:33:06 localhost kernel: kasan_save_stack+0x33/0x60
Apr 05 00:33:06 localhost kernel: __kasan_record_aux_stack+0xad/0xc0
Apr 05 00:33:06 localhost kernel: insert_work+0x32/0x1f0
Apr 05 00:33:06 localhost kernel: __queue_work+0x5cb/0xcb0
Apr 05 00:33:06 localhost kernel: call_timer_fn+0x16d/0x490
Apr 05 00:33:06 localhost kernel: __run_timers+0x488/0x980
Apr 05 00:33:06 localhost kernel: run_timer_base+0xfb/0x170
Apr 05 00:33:06 localhost kernel: run_timer_softirq+0x1a/0x30
Apr 05 00:33:06 localhost kernel: __do_softirq+0x26a/0x7d2
Apr 05 00:33:06 localhost kernel:
Apr 05 00:33:06 localhost kernel: Second to last potentially related work creation:
Apr 05 00:33:06 localhost kernel: kasan_save_stack+0x33/0x60
Apr 05 00:33:06 localhost kernel: __kasan_record_aux_stack+0xad/0xc0
Apr 05 00:33:06 localhost kernel: insert_work+0x32/0x1f0
Apr 05 00:33:06 localhost kernel: __queue_work+0x5cb/0xcb0
Apr 05 00:33:06 localhost kernel: call_timer_fn+0x16d/0x490
Apr 05 00:33:06 localhost kernel: __run_timers+0x488/0x980
Apr 05 00:33:06 localhost kernel: timer_expire_remote+0xe6/0x150
Apr 05 00:33:06 localhost kernel: tmigr_handle_remote+0x6e2/0xe00
Apr 05 00:33:06 localhost kernel: __do_softirq+0x26a/0x7d2
Apr 05 00:33:06 localhost kernel:
Apr 05 00:33:06 localhost kernel: The buggy address belongs to the object at ffff88810ed40000
which belongs to the cache kmalloc-2k of size 2048
Apr 05 00:33:06 localhost kernel: The buggy address is located 2792 bytes to the right of
allocated 1120-byte region [ffff88810ed40000, ffff88810ed40460)
Apr 05 00:33:06 localhost kernel:
Apr 05 00:33:06 localhost kernel: The buggy address belongs to the physical page:
Apr 05 00:33:06 localhost kernel: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88810ed45000 pfn:0x10ed40
Apr 05 00:33:06 localhost kernel: head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
Apr 05 00:33:06 localhost kernel: flags: 0x200000000000840(slab|head|node=0|zone=2)
Apr 05 00:33:06 localhost kernel: page_type: 0xffffffff()
Apr 05 00:33:06 localhost kernel: raw: 0200000000000840 ffff888100042f00 ffffea00040f1a00 dead000000000002
Apr 05 00:33:06 localhost kernel: raw: ffff88810ed45000 0000000080080006 00000001ffffffff 0000000000000000
Apr 05 00:33:06 localhost kernel: head: 0200000000000840 ffff888100042f00 ffffea00040f1a00 dead000000000002
Apr 05 00:33:06 localhost kernel: head: ffff88810ed45000 0000000080080006 00000001ffffffff 0000000000000000
Apr 05 00:33:06 localhost kernel: head: 0200000000000003 ffffea00043b5001 dead000000000122 00000000ffffffff
Apr 05 00:33:06 localhost kernel: head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000
Apr 05 00:33:06 localhost kernel: page dumped because: kasan: bad access detected
Apr 05 00:33:06 localhost kernel:
Apr 05 00:33:06 localhost kernel: Memory state around the buggy address:
Apr 05 00:33:06 localhost kernel:  ffff88810ed40e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Apr 05 00:33:06 localhost kernel:  ffff88810ed40e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Apr 05 00:33:06 localhost kernel: >ffff88810ed40f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Apr 05 00:33:06 localhost kernel:                                               ^
Apr 05 00:33:06 localhost kernel:  ffff88810ed40f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Apr 05 00:33:06 localhost kernel:  ffff88810ed41000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Apr 05 00:33:06 localhost kernel: ==================================================================
Apr 05 00:33:06 localhost kernel: Disabling lock debugging due to kernel taint
Apr 05 00:33:06 localhost kernel: INFO: trying to register non-static key.
Apr 05 00:33:06 localhost kernel: The code is fine but needs lockdep annotation, or maybe
Apr 05 00:33:06 localhost kernel: you didn't initialize this object before use?
Apr 05 00:33:06 localhost kernel: turning off the locking correctness validator.
Apr 05 00:33:06 localhost kernel: CPU: 5 PID: 305560 Comm: kworker/u128:2 Tainted: G B 99.9.0-rc2-gdebeafad51e2 #262
Apr 05 00:33:06 localhost kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/Incus, BIOS unknown 2/2/2022
Apr 05 00:33:06 localhost kernel: Workqueue: writeback wb_workfn (flush-259:0)
Apr 05 00:33:06 localhost kernel: Call Trace:
Apr 05 00:33:06 localhost kernel: <TASK>
Apr 05 00:33:06 localhost kernel: dump_stack_lvl+0x5a/0x90
Apr 05 00:33:06 localhost kernel: register_lock_class+0x11dd/0x1860
Apr 05 00:33:06 localhost kernel: ? add_taint+0x2a/0x90
Apr 05 00:33:06 localhost kernel: ? end_report+0x85/0x180
Apr 05 00:33:06 localhost kernel: ? __pfx_register_lock_class+0x10/0x10
Apr 05 00:33:06 localhost kernel: __lock_acquire.isra.0+0x7f/0x1280
Apr 05 00:33:06 localhost kernel: lock_acquire+0x136/0x330
Apr 05 00:33:06 localhost kernel: ? wb_writeback+0x255/0x870
Apr 05 00:33:06 localhost kernel: _raw_spin_lock+0x33/0x40
Apr 05 00:33:06 localhost kernel: ? wb_writeback+0x255/0x870
Apr 05 00:33:06 localhost kernel: wb_writeback+0x255/0x870
Apr 05 00:33:06 localhost kernel: ? __pfx_wb_writeback+0x10/0x10
Apr 05 00:33:06 localhost kernel: ? __pfx_lock_release+0x10/0x10
Apr 05 00:33:06 localhost kernel: wb_workfn+0x221/0xc80
Apr 05 00:33:06 localhost kernel: ? __pfx_wb_workfn+0x10/0x10
Apr 05 00:33:06 localhost kernel: ? lock_acquire+0x136/0x330
Apr 05 00:33:06 localhost kernel: process_one_work+0x82d/0x1790
Apr 05 00:33:06 localhost kernel: ? __pfx_process_one_work+0x10/0x10
Apr 05 00:33:06 localhost kernel: ? assign_work+0x16c/0x240
Apr 05 00:33:06 localhost kernel: worker_thread+0x724/0x1300
Apr 05 00:33:06 localhost kernel: ? __kthread_parkme+0xba/0x1f0
Apr 05 00:33:06 localhost kernel: ? __pfx_worker_thread+0x10/0x10
Apr 05 00:33:06 localhost kernel: kthread+0x2ed/0x3d0
Apr 05 00:33:06 localhost kernel: ? __pfx_kthread+0x10/0x10
Apr 05 00:33:06 localhost kernel: ret_from_fork+0x31/0x70
Apr 05 00:33:06 localhost kernel: ? __pfx_kthread+0x10/0x10
Apr 05 00:33:06 localhost kernel: ret_from_fork_asm+0x1a/0x30
Apr 05 00:33:06 localhost kernel: </TASK>

Jan Kara

Apr 5, 2024, 9:23:50 AM
to Christian Brauner, Jan Kara, Kemeng Shi, gre...@linuxfoundation.org, konishi...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, vi...@zeniv.linux.org.uk
OK, since this is apparently causing more issues and Kemeng hasn't replied
yet, here's a fix in the form of a patch. It has passed some basic
testing. Feel free to fold it into Kemeng's patch so that we don't keep
linux-next broken longer than necessary. Thanks!

Honza
0001-writeback-Fix-memory-corruption-in-writeback-code.patch

Christian Brauner

Apr 5, 2024, 9:54:54 AM
to Jan Kara, Kemeng Shi, gre...@linuxfoundation.org, konishi...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, vi...@zeniv.linux.org.uk
Thanks! Folded and I mentioned that I folded a fix from you into the
commit with a link to this patch.

Kemeng Shi

Apr 8, 2024, 4:12:08 AM
to Jan Kara, Christian Brauner, gre...@linuxfoundation.org, konishi...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, vi...@zeniv.linux.org.uk
Sorry for the late reply; I was on vacation for the past few days. Also, sorry
for introducing the bug. The change looks good to me. Thanks a lot
for helping to fix this in time.

Kemeng
