[syzbot] [mm?] KCSAN: data-race in copy_page_from_iter_atomic / pagecache_isize_extended

syzbot

May 6, 2025, 3:52:29 AM
to ak...@linux-foundation.org, baoli...@linux.alibaba.com, hu...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 01f95500a162 Merge tag 'uml-for-linux-6.15-rc6' of git://g..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17abbb68580000
kernel config: https://syzkaller.appspot.com/x/.config?x=6154604431d9aaf9
dashboard link: https://syzkaller.appspot.com/bug?extid=189d4742d07e937d68ea
compiler: Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/8d61c7d3421d/disk-01f95500.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d86d0377eab0/vmlinux-01f95500.xz
kernel image: https://storage.googleapis.com/syzbot-assets/a6f455ac4fd5/bzImage-01f95500.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+189d47...@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in copy_page_from_iter_atomic / pagecache_isize_extended

read to 0xffff88811d47e000 of 2048 bytes by task 37 on cpu 0:
memcpy_from_iter lib/iov_iter.c:73 [inline]
iterate_bvec include/linux/iov_iter.h:123 [inline]
iterate_and_advance2 include/linux/iov_iter.h:304 [inline]
iterate_and_advance include/linux/iov_iter.h:328 [inline]
__copy_from_iter lib/iov_iter.c:249 [inline]
copy_page_from_iter_atomic+0x77f/0xff0 lib/iov_iter.c:483
copy_folio_from_iter_atomic include/linux/uio.h:210 [inline]
generic_perform_write+0x2c2/0x490 mm/filemap.c:4121
shmem_file_write_iter+0xc5/0xf0 mm/shmem.c:3464
lo_rw_aio+0x5f7/0x7c0 drivers/block/loop.c:-1
do_req_filebacked drivers/block/loop.c:-1 [inline]
loop_handle_cmd drivers/block/loop.c:1866 [inline]
loop_process_work+0x52d/0xa60 drivers/block/loop.c:1901
loop_workfn+0x31/0x40 drivers/block/loop.c:1925
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0x4cb/0x9d0 kernel/workqueue.c:3319
worker_thread+0x582/0x770 kernel/workqueue.c:3400
kthread+0x486/0x510 kernel/kthread.c:464
ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:153
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

write to 0xffff88811d47e018 of 4072 bytes by task 4432 on cpu 1:
zero_user_segments include/linux/highmem.h:278 [inline]
folio_zero_segment include/linux/highmem.h:635 [inline]
pagecache_isize_extended+0x26f/0x340 mm/truncate.c:850
ext4_alloc_file_blocks+0x4ad/0x720 fs/ext4/extents.c:4545
ext4_do_fallocate fs/ext4/extents.c:4694 [inline]
ext4_fallocate+0x2b8/0x660 fs/ext4/extents.c:4750
vfs_fallocate+0x410/0x450 fs/open.c:338
ksys_fallocate fs/open.c:362 [inline]
__do_sys_fallocate fs/open.c:367 [inline]
__se_sys_fallocate fs/open.c:365 [inline]
__x64_sys_fallocate+0x7a/0xd0 fs/open.c:365
x64_sys_call+0x2b88/0x2fb0 arch/x86/include/generated/asm/syscalls_64.h:286
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd0/0x1a0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 4432 Comm: syz.8.11649 Not tainted 6.15.0-rc5-syzkaller-00022-g01f95500a162 #0 PREEMPT(voluntary)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/19/2025
==================================================================
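
As a quick check of how the two accesses in the report relate, the sketch below computes the overlap of the two address ranges. The struct and function names are made up for illustration; only the addresses and sizes come from the report.

```c
#include <stdint.h>

/* One access as printed by KCSAN: start address and length in bytes.
 * (Illustrative type, not a kernel structure.) */
struct access {
    uint64_t addr;
    uint64_t len;
};

/* Number of bytes shared by the half-open ranges [addr, addr + len). */
uint64_t overlap_bytes(struct access a, struct access b)
{
    uint64_t start = a.addr > b.addr ? a.addr : b.addr;
    uint64_t a_end = a.addr + a.len;
    uint64_t b_end = b.addr + b.len;
    uint64_t end = a_end < b_end ? a_end : b_end;

    return end > start ? end - start : 0;
}

/* Values taken from the report above. */
static const struct access report_read  = { 0xffff88811d47e000ULL, 2048 };
static const struct access report_write = { 0xffff88811d47e018ULL, 4072 };
```

Calling `overlap_bytes(report_read, report_write)` gives 2024: the read covers page offsets 0x000 to 0x7ff, the write covers 0x018 up to the end of the same 4 KiB page, so the two share offsets 0x018 to 0x7ff.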


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Jann Horn

May 12, 2025, 1:45:10 PM
to syzkaller, syzbot, ak...@linux-foundation.org, baoli...@linux.alibaba.com, hu...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
On Tue, May 6, 2025 at 9:52 AM syzbot
<syzbot+189d47...@syzkaller.appspotmail.com> wrote:
> HEAD commit: 01f95500a162 Merge tag 'uml-for-linux-6.15-rc6' of git://g..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=17abbb68580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=6154604431d9aaf9
> dashboard link: https://syzkaller.appspot.com/bug?extid=189d4742d07e937d68ea
> compiler: Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
[...]
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+189d47...@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KCSAN: data-race in copy_page_from_iter_atomic / pagecache_isize_extended

I think this is a problem with the KCSAN implementation.

This is a race between writing to a userspace-owned page and reading
from a userspace-owned page.

This kind of pattern should be fairly trivial to trigger: If userspace
tells the kernel to read from a GUP'd page or pagecache on one thread,
and simultaneously tells the kernel to write to the same page on
another thread, we'll get a data race. This is not really a kernel
data race; it is more like a userspace race whose memory accesses
happen to go through the kernel.
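
A minimal userspace sketch of that pattern, assuming a regular file in `/tmp`: one process repeatedly asks the kernel to write into the first page of the file while another asks the kernel to read the same page back. The function name, path template, and iteration count are illustrative only. Whether the two kernel-side copies actually overlap in time, and whether KCSAN reports anything, depends on the filesystem, its locking, and the kernel configuration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SKETCH_PAGE 4096

/* Returns 0 if both sides completed all iterations, -1 on any failure. */
int run_pagecache_overlap(int iterations)
{
    char path[] = "/tmp/kcsan-sketch-XXXXXX";
    char buf[SKETCH_PAGE];
    int status;
    int rc = 0;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */

    /* Populate one page so that every later pread() returns a full page. */
    memset(buf, 'A', sizeof(buf));
    if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf)) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }

    if (pid == 0) {
        /* Child: the kernel copies into the pagecache page on each call. */
        for (int i = 0; i < iterations; i++) {
            memset(buf, 'a' + (i % 26), sizeof(buf));
            if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
                _exit(1);
        }
        _exit(0);
    }

    /* Parent: the kernel copies out of the same pagecache page. */
    for (int i = 0; i < iterations; i++) {
        if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf)) {
            rc = -1;
            break;
        }
    }

    if (waitpid(pid, &status, 0) < 0)
        rc = -1;
    else if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        rc = -1;

    close(fd);
    return rc;
}
```

Nothing in this program is a kernel bug: any torn data the reader sees is a consequence of userspace issuing the two operations without coordinating them.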

So I think the fix would be for KCSAN to ignore accesses to such
pages. The hard part is that I'm not sure how the kernel can tell what
kind of page it is dealing with; some MM people might know.
Distinguishing normal pagecache/anon pages from other pages might be
doable, but I guess it gets hard when it comes to driver-allocated
pages that are mapped into userspace vs. driver-allocated pages that
are used only internally by the driver.

Jann Horn

May 12, 2025, 2:33:17 PM
to syzkaller, syzbot, ak...@linux-foundation.org, baoli...@linux.alibaba.com, hu...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
Or alternatively, if we really do want data_race() operations around
any memset() or memcpy() on userspace-controlled pages, I guess we'd
have to pepper a lot of those around the kernel.
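
A rough sketch of what one such annotation could look like. In the kernel, `data_race()` is provided by `<linux/compiler.h>` and marks the enclosed expression as an intentional race for KCSAN; the fallback macro below is only a userspace stand-in so the fragment compiles on its own, and the helper name is hypothetical, not an existing kernel function.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for builds outside the kernel tree; it does nothing except
 * evaluate the expression. The real macro lives in <linux/compiler.h>. */
#ifndef data_race
#define data_race(expr) (expr)
#endif

/* Hypothetical helper: copy out of a page that userspace may be modifying
 * concurrently, telling KCSAN that a race on the source is expected. */
static inline void *copy_from_racy_page(void *dst, const void *src, size_t len)
{
    /* memcpy() returns dst, so the call can be wrapped as an expression. */
    return data_race(memcpy(dst, src, len));
}
```

Every memcpy() or memset() call site that can touch such a page would need similar treatment, which is the "pepper a lot of those around" cost.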

Also, I didn't really think about some of what I wrote here - we
certainly wouldn't want to ignore unannotated accesses to some struct
located in pagecache that userspace can concurrently write to.

Maybe it would actually make sense to do the opposite of what I said
to some extent, special-case userspace-mapped pages such that KCSAN
_always_ alerts on plain access to them...

Marco Elver

May 12, 2025, 4:52:24 PM
to Jann Horn, syzkaller, syzbot, ak...@linux-foundation.org, baoli...@linux.alibaba.com, hu...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
There have been cases where user space was doing something unsafe, and
KCSAN caught it. While technically it's user space's bug to keep,
KCSAN is still telling us something's wrong here.

In the past we'd just ignore these bugs (never release them from
syzbot), but I think we recently changed the rules for some of these
to be sent to the mailing list. They can safely be ignored if deemed
"user space is doing something stupid".

I do think we want to surface such issues in one-off testing
scenarios. However, in the fuzzing/CI context it's not so helpful, so
we might need a way to suppress them. If there's a way to tell by
looking at the stacktrace, we could teach syzbot to ignore such data
races entirely.

Jann Horn

May 13, 2025, 12:43:21 PM
to Marco Elver, syzkaller, syzbot, ak...@linux-foundation.org, baoli...@linux.alibaba.com, hu...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
Hmm. I think it probably requires a kernel config flag then; I don't
think you can easily filter by stacktrace. In fuzzing builds you could
maybe do some basic checks on the folio to see if it's pagecache, an
anon folio, or a folio mapped into userspace. That would filter out
_most_ but not all cases.

syzbot

Jul 5, 2025, 12:37:20 AM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
Crashes did not happen for a while, no reproducer and no activity.