[syzbot] [mm?] KCSAN: data-race in mas_wr_store_entry / mtree_range_walk (2)

syzbot

Apr 17, 2026, 5:12:23 AM (4 days ago)
to Liam.H...@oracle.com, ak...@linux-foundation.org, ja...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, l...@kernel.org, pfal...@suse.de, syzkall...@googlegroups.com, vba...@kernel.org
Hello,

syzbot found the following issue on:

HEAD commit: 1d51b370a0f8 Merge tag 'jfs-7.1' of github.com:kleikamp/li..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=117dc4ce580000
kernel config: https://syzkaller.appspot.com/x/.config?x=7f207c4b1fbf85a3
dashboard link: https://syzkaller.appspot.com/bug?extid=38a879f4a73497f2dfef
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/e08ff8d2b0e5/disk-1d51b370.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/c11d4b098bbf/vmlinux-1d51b370.xz
kernel image: https://storage.googleapis.com/syzbot-assets/6a4691f32e3d/bzImage-1d51b370.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+38a879...@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in mas_wr_store_entry / mtree_range_walk

write to 0xffff888104f71d08 of 8 bytes by task 4757 on cpu 0:
mas_wr_slot_store lib/maple_tree.c:3232 [inline]
mas_wr_store_entry+0x3405/0x5ad0 lib/maple_tree.c:3528
mas_store_prealloc+0x43e/0x690 lib/maple_tree.c:4936
vma_iter_store_overwrite mm/vma.h:616 [inline]
commit_merge+0x6a1/0x720 mm/vma.c:766
vma_expand+0x301/0x460 mm/vma.c:1219
vma_merge_new_range+0x29c/0x320 mm/vma.c:1112
__mmap_region mm/vma.c:2766 [inline]
mmap_region+0x1073/0x2110 mm/vma.c:2856
do_mmap+0x9b2/0xbd0 mm/mmap.c:560
vm_mmap_pgoff+0x183/0x2d0 mm/util.c:581
ksys_mmap_pgoff+0xc1/0x310 mm/mmap.c:606
x64_sys_call+0x14df/0x3020 arch/x86/include/generated/asm/syscalls_64.h:10
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x12c/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff888104f71d08 of 8 bytes by task 4759 on cpu 1:
mtree_range_walk+0x1a6/0x490 lib/maple_tree.c:2032
mas_state_walk lib/maple_tree.c:2952 [inline]
mas_walk+0x1cc/0x370 lib/maple_tree.c:4366
lock_vma_under_rcu+0xc9/0x210 mm/mmap_lock.c:304
do_user_addr_fault+0x232/0x1050 arch/x86/mm/fault.c:1325
handle_page_fault arch/x86/mm/fault.c:1474 [inline]
exc_page_fault+0x62/0xa0 arch/x86/mm/fault.c:1527
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618

value changed: 0x00007f68dc2a5fff -> 0x00007f68dc284fff

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 4759 Comm: syz.5.348 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/18/2026
==================================================================
netlink: 64 bytes leftover after parsing attributes in process `syz.5.348'.


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Liam R. Howlett

Apr 17, 2026, 7:51:19 PM (3 days ago)
to syzbot, ak...@linux-foundation.org, ja...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, l...@kernel.org, pfal...@suse.de, syzkall...@googlegroups.com, vba...@kernel.org
* syzbot <syzbot+38a879...@syzkaller.appspotmail.com> [260417 05:12]:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 1d51b370a0f8 Merge tag 'jfs-7.1' of github.com:kleikamp/li..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=117dc4ce580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=7f207c4b1fbf85a3
> dashboard link: https://syzkaller.appspot.com/bug?extid=38a879f4a73497f2dfef
> compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>
> Unfortunately, I don't have any reproducer for this issue yet.

... and you won't. This will work unless we tear aligned unsigned long
writes/reads.

I'm debating marking these as data_race(). Marking them all as
READ_ONCE() and this one write as WRITE_ONCE() seems like overkill for
something that won't happen.

Alternatively, I can move the slot store fast path to need an
allocation, but that's worse.

Marco Elver

Apr 17, 2026, 8:26:13 PM (3 days ago)
to Liam R. Howlett, syzbot, ak...@linux-foundation.org, ja...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, l...@kernel.org, pfal...@suse.de, syzkall...@googlegroups.com, vba...@kernel.org
On Sat, 18 Apr 2026 at 01:51, 'Liam R. Howlett' via syzkaller-bugs
<syzkall...@googlegroups.com> wrote:
>
> * syzbot <syzbot+38a879...@syzkaller.appspotmail.com> [260417 05:12]:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 1d51b370a0f8 Merge tag 'jfs-7.1' of github.com:kleikamp/li..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=117dc4ce580000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=7f207c4b1fbf85a3
> > dashboard link: https://syzkaller.appspot.com/bug?extid=38a879f4a73497f2dfef
> > compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
>
> ... and you won't. This will work unless we tear aligned unsigned long
> writes/reads.
>
> I'm debating marking these as data_race(). Marking them all as
> READ_ONCE() and this one write as WRITE_ONCE() seems like overkill for
> something that won't happen.
>
> Alternatively, I can move the slot store fast path to need an
> allocation, but that's worse.

The writer:

> rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
> wr_mas->pivots[offset] = mas->index - 1; // <-- stores pivots[offset]

The reader races here:

> if (pivots[offset] >= mas->index) { // <-- load pivots[offset]
> max = pivots[offset]; // <-- load pivots[offset] again
> break;
> }

The compiler is free to reload them as written. What if there's a
concurrent update between the first and second load?
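
To make the concern concrete, here is a minimal userspace sketch of a
reader that snapshots each pivot once, so the value it tests and the
value it records as max cannot differ. This is not the code in
lib/maple_tree.c: find_slot() and shrink_pivot() are hypothetical
helpers, and the two macros are simplified stand-ins for the kernel's
READ_ONCE()/WRITE_ONCE().

```c
#include <stddef.h>

/*
 * Simplified userspace stand-ins (assumption): the kernel's real macros
 * do additional type checking. A volatile access stops the compiler
 * from reloading, fusing, or dropping the access.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/*
 * Hypothetical reader: return the offset of the first pivot >= index
 * and store that pivot in *max. Returns count and leaves *max untouched
 * if no pivot qualifies.
 */
static size_t find_slot(unsigned long *pivots, size_t count,
			unsigned long index, unsigned long *max)
{
	size_t offset;

	for (offset = 0; offset < count; offset++) {
		/* One load per pivot; the snapshot is used for both steps. */
		unsigned long pivot = READ_ONCE(pivots[offset]);

		if (pivot >= index) {
			*max = pivot;	/* no second load of pivots[offset] */
			break;
		}
	}
	return offset;
}

/* Hypothetical writer: a marked store of the new, smaller pivot. */
static void shrink_pivot(unsigned long *pivots, size_t offset,
			 unsigned long index)
{
	WRITE_ONCE(pivots[offset], index - 1);
}
```

With the snapshot, a concurrent shrink_pivot() can still make the
reader see either the old or the new pivot, but never one value in the
comparison and another in max. Whether that distinction matters for the
maple tree is exactly what is being discussed in this thread.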

Vlastimil Babka (SUSE)

4:36 AM (15 hours ago)
to Liam R. Howlett, syzbot, ak...@linux-foundation.org, ja...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, l...@kernel.org, pfal...@suse.de, syzkall...@googlegroups.com
On 4/18/26 01:50, Liam R. Howlett wrote:
> * syzbot <syzbot+38a879...@syzkaller.appspotmail.com> [260417 05:12]:
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit: 1d51b370a0f8 Merge tag 'jfs-7.1' of github.com:kleikamp/li..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=117dc4ce580000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=7f207c4b1fbf85a3
>> dashboard link: https://syzkaller.appspot.com/bug?extid=38a879f4a73497f2dfef
>> compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>
> ... and you won't. This will work unless we tear aligned unsigned long
> writes/reads.

Note I think the reproducer here means code that will trigger that KCSAN
alert deterministically (as opposed to fuzzing), not something that would
depend on whether the alert is pointing to a "real" problem.


Liam R. Howlett

3:29 PM (4 hours ago)
to Marco Elver, syzbot, ak...@linux-foundation.org, ja...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, l...@kernel.org, pfal...@suse.de, syzkall...@googlegroups.com, vba...@kernel.org
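* Marco Elver <el...@google.com> [260417 20:26]:
> The compiler is free to reload them as written. What if there's a
> concurrent update between the first and second load?

Then the benign race has happened.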

Looking at [1], we see that care has been taken to limit the slot store
code to only !rcu mode, except for a subset of cases. Digging through
the information in git will eventually lead you to this note Peng wrote:

commit 64891ba3e51fb841b0af70db029038eb93bd5a43
Author: Peng Zhang <zhangp...@bytedance.com>
Date: Wed Jun 28 15:36:57 2023 +0800

maple_tree: add a fast path case in mas_wr_slot_store()

When expanding a range in two directions, only partially overwriting the
previous and next ranges, the number of entries will not be increased, so
we can just update the pivots as a fast path. However, it may introduce
potential risks in RCU mode, because it updates two pivots. We only
enable it in non-RCU mode.

Link: https://lkml.kernel.org/r/20230628073657.75...@bytedance.com
Signed-off-by: Peng Zhang <zhangp...@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.H...@oracle.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>

So, you can see that the author of the initial code did look at race
conditions. I wanted to read the link for more information but that
link isn't working right now (403 error).

-----

Or, we can ask an LLM about it:

In mas_wr_store_type(), we only allow wr_slot_store in RCU mode for the narrow
case where wr_mas->offset_end - mas->offset == 1. That condition means the
update touches only one boundary between two adjacent ranges, so the in-place
mutation in mas_wr_slot_store() stays limited to a single slot/pivot boundary
update and is considered safe for lockless readers.

If the span is wider than that, we do not use in-place slot-store under RCU.
The broader in-place path in mas_wr_slot_store() is explicitly guarded with
WARN_ON_ONCE(mt_in_rcu(...)), and the store-type logic instead falls back to
node-replacement paths (wr_node_store/split/rebalance), which preserve RCU
reader safety by publishing a new node rather than mutating too much in place.

In non-RCU mode (!mt_in_rcu()), we allow the wider in-place cases because
readers are expected to be synchronized by locking, so the stricter
lockless-reader constraints do not apply.
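
The rule described above can be modelled in a few lines of C. This is a
simplified sketch and not the kernel's mas_wr_store_type():
enum store_kind, pick_store() and the same_entry_count flag are
hypothetical names, and the real function distinguishes more store
types than the two shown here.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of outcomes (assumption for illustration). */
enum store_kind {
	SLOT_STORE,	/* mutate pivots/slots in place */
	NODE_STORE,	/* build and publish a replacement node */
};

/*
 * in_rcu:           the tree may have lockless readers
 * offset:           first slot touched by the write
 * offset_end:       last slot touched by the write
 * same_entry_count: the write does not change the number of entries
 */
static enum store_kind pick_store(bool in_rcu, unsigned char offset,
				  unsigned char offset_end,
				  bool same_entry_count)
{
	/* In-place updates only apply when the entry count is unchanged. */
	if (!same_entry_count)
		return NODE_STORE;

	/* Readers synchronize by locking, so wider updates are fine. */
	if (!in_rcu)
		return SLOT_STORE;

	/* Under RCU, allow only a single boundary between two ranges. */
	if (offset_end - offset == 1)
		return SLOT_STORE;

	return NODE_STORE;
}
```

In this model, pick_store(true, 3, 4, true) selects the in-place path,
while pick_store(true, 3, 5, true) falls back to node replacement.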

-----

I am sorry, but I don't have time to work through the scenarios as this
is not an issue and I no longer have the time budget for mailing lists
as I once did.

If you can come up with a problem (and ideally a reproducer), then
please let me know.

[1]. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/maple_tree.c?id=c1f49dea2b8f335813d3b348fd39117fb8efb428#n3696