[syzbot] [cgroups?] KCSAN: data-race in cgroup_migrate_execute / memcpy_and_pad

syzbot

Jul 7, 2025, 7:57:33 AM
to cgr...@vger.kernel.org, han...@cmpxchg.org, linux-...@vger.kernel.org, mko...@suse.com, syzkall...@googlegroups.com, t...@kernel.org
Hello,

syzbot found the following issue on:

HEAD commit: d7b8f8e20813 Linux 6.16-rc5
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=176e828c580000
kernel config: https://syzkaller.appspot.com/x/.config?x=ce48b74bcedce10f
dashboard link: https://syzkaller.appspot.com/bug?extid=f3188428a0ed36870056
compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/6a4c012a3b93/disk-d7b8f8e2.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/e3a6050563a6/vmlinux-d7b8f8e2.xz
kernel image: https://storage.googleapis.com/syzbot-assets/2bdc7eedebcd/bzImage-d7b8f8e2.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f31884...@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in cgroup_migrate_execute / memcpy_and_pad

write to 0xffff888133646ad0 of 8 bytes by task 4554 on cpu 1:
__list_splice include/linux/list.h:533 [inline]
list_splice_tail_init include/linux/list.h:589 [inline]
cgroup_migrate_execute+0x6b5/0x7f0 kernel/cgroup/cgroup.c:2689
cgroup_update_dfl_csses kernel/cgroup/cgroup.c:3135 [inline]
cgroup_apply_control+0x3ab/0x410 kernel/cgroup/cgroup.c:3375
cgroup_subtree_control_write+0x7d5/0xb80 kernel/cgroup/cgroup.c:3520
cgroup_file_write+0x194/0x350 kernel/cgroup/cgroup.c:4183
kernfs_fop_write_iter+0x1be/0x2d0 fs/kernfs/file.c:334
new_sync_write fs/read_write.c:593 [inline]
vfs_write+0x49d/0x8e0 fs/read_write.c:686
ksys_write+0xda/0x1a0 fs/read_write.c:738
__do_sys_write fs/read_write.c:749 [inline]
__se_sys_write fs/read_write.c:746 [inline]
__x64_sys_write+0x40/0x50 fs/read_write.c:746
x64_sys_call+0x2cdd/0x2fb0 arch/x86/include/generated/asm/syscalls_64.h:2
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd2/0x200 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff888133646180 of 3200 bytes by task 4561 on cpu 0:
memcpy_and_pad+0x48/0x80 lib/string_helpers.c:1007
arch_dup_task_struct+0x2c/0x40 arch/x86/kernel/process.c:98
dup_task_struct+0x83/0x6a0 kernel/fork.c:873
copy_process+0x399/0x1f90 kernel/fork.c:1999
kernel_clone+0x16c/0x5b0 kernel/fork.c:2599
__do_sys_clone3 kernel/fork.c:2903 [inline]
__se_sys_clone3+0x1c2/0x200 kernel/fork.c:2882
__x64_sys_clone3+0x31/0x40 kernel/fork.c:2882
x64_sys_call+0x10c9/0x2fb0 arch/x86/include/generated/asm/syscalls_64.h:436
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd2/0x200 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 4561 Comm: syz.2.316 Not tainted 6.16.0-rc5-syzkaller #0 PREEMPT(voluntary)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Michal Koutný

Jul 10, 2025, 12:27:47 PM
to syzbot, cgr...@vger.kernel.org, han...@cmpxchg.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org
I assume the data race is on parent->cg_list (not child->cg_list, because
it is the parent that is read).
Clone vs. migration should be synchronized via
cgroup_threadgroup_rwsem, but there's a window for the race:

copy_process
  dup_task_struct
    child->cg_list = parent->cg_list
  ...
  [race window]
  ...
  cgroup_can_fork
    cgroup_css_set_fork
      cgroup_threadgroup_change_begin

  ...
  cgroup_post_fork
    css_set_move_task
      list_del_init(&child->cg_list);
      list_add_tail(&child->cg_list, cset->tasks)
    cgroup_threadgroup_change_end

The writer is
list_splice_tail_init(&cset->mg_tasks, &cset->tasks);
i.e. the parent is either being migrated itself or it's the last (tail)
task in the destination css_set of another migration.
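
To make the two cases above concrete, here is a minimal userspace sketch of
a tail splice. It is not the kernel source: struct list_head and the helpers
are reduced stand-ins written for illustration, and splice_demo() is a
hypothetical test driver. It shows that the splice stores into the old tail
entry of the destination list as well as into the first and last spliced
entries, so a task sitting in any of those positions has its cg_list
pointers written.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
    struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static int list_empty(const struct list_head *h)
{
    return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    struct list_head *prev = head->prev;

    n->next = head;
    n->prev = prev;
    prev->next = n;
    head->prev = n;
}

/* Move every entry of 'list' to the tail of 'head', then reset 'list'. */
static void list_splice_tail_init(struct list_head *list,
                                  struct list_head *head)
{
    struct list_head *first, *last, *old_tail;

    if (list_empty(list))
        return;

    first = list->next;
    last = list->prev;
    old_tail = head->prev;

    first->prev = old_tail;  /* store into the first spliced entry */
    old_tail->next = first;  /* store into the old tail of 'head' */
    last->next = head;       /* store into the last spliced entry */
    head->prev = last;

    INIT_LIST_HEAD(list);
}

/* Hypothetical driver: 'a' plays the old tail task, 'b' and 'c' migrate in. */
int splice_demo(void)
{
    struct list_head tasks, mg_tasks, a, b, c;

    INIT_LIST_HEAD(&tasks);
    INIT_LIST_HEAD(&mg_tasks);
    list_add_tail(&a, &tasks);
    list_add_tail(&b, &mg_tasks);
    list_add_tail(&c, &mg_tasks);

    list_splice_tail_init(&mg_tasks, &tasks);

    /* a.next changed even though 'a' itself was not migrated. */
    return a.next == &b && b.prev == &a && c.next == &tasks &&
           tasks.prev == &c && list_empty(&mg_tasks);
}
```

Calling splice_demo() from a test main() is enough to exercise it; the
non-migrated entry 'a' is the "last (tail) task" case.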

But whatever value the child copied over from the parent doesn't matter,
because it is overwritten when the child is inserted into the appropriate
cset in cgroup_post_fork (properly synced via css_set_lock).
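
The argument can be sketched in userspace as follows. This is an assumption-
laden model, not kernel code: struct task is a hypothetical two-field
stand-in for task_struct, dup_task() models only the byte copy done at
dup_task_struct time, and post_fork_link() models only the list_add_tail
step of cgroup_post_fork. Locking is omitted entirely.

```c
#include <assert.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical, heavily reduced stand-in for task_struct. */
struct task {
    int pid;
    struct list_head cg_list;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    struct list_head *prev = head->prev;

    n->next = head;  /* overwrites whatever n->next held before */
    n->prev = prev;  /* overwrites whatever n->prev held before */
    prev->next = n;
    head->prev = n;
}

/* Models the byte copy: the child inherits the parent's cg_list pointers. */
static void dup_task(struct task *child, const struct task *parent)
{
    memcpy(child, parent, sizeof(*child));
}

/* Models the insertion step: the copied pointers are replaced, never read. */
static void post_fork_link(struct task *child, struct list_head *cset_tasks)
{
    list_add_tail(&child->cg_list, cset_tasks);
}

/* Hypothetical driver returning 1 when the stale copy ends up irrelevant. */
int fork_copy_demo(void)
{
    struct list_head tasks;
    struct task parent = { .pid = 1 };
    struct task child;
    int stale;

    INIT_LIST_HEAD(&tasks);
    list_add_tail(&parent.cg_list, &tasks);

    dup_task(&child, &parent);
    child.pid = 2;

    /* Right after the copy, the child's pointers alias the parent's. */
    stale = child.cg_list.next == parent.cg_list.next &&
            child.cg_list.prev == parent.cg_list.prev;

    post_fork_link(&child, &tasks);

    /* After insertion, the child is a proper tail entry behind the parent. */
    return stale && child.cg_list.prev == &parent.cg_list &&
           child.cg_list.next == &tasks &&
           parent.cg_list.next == &child.cg_list &&
           tasks.prev == &child.cg_list;
}
```

Whether the copy observed the parent's pointers before or after a
concurrent splice, list_add_tail() stores fresh values into both fields of
the child's cg_list without reading them first, which is the point made
above.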

I.e. I'm not overexcited about this race but thanks syzbot.

Michal

syzbot

Aug 31, 2025, 8:45:22 PM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
Crashes did not happen for a while, no reproducer and no activity.