[syzbot] general protection fault in __device_attach


syzbot

Mar 14, 2022, 4:46:18 AM
to gre...@linuxfoundation.org, linux-...@vger.kernel.org, raf...@kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: e7e19defa575 Merge tag 'arm64-fixes' of git://git.kernel.o..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13ea76f6700000
kernel config: https://syzkaller.appspot.com/x/.config?x=442f8ac61e60a75e
dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+dd3c97...@syzkaller.appspotmail.com

general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f]
CPU: 1 PID: 14569 Comm: syz-executor.4 Not tainted 5.17.0-rc7-syzkaller-00068-ge7e19defa575 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:949
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90010a87b98 EFLAGS: 00010216
RAX: dffffc0000000000 RBX: 1ffff92002150f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000008 RDI: 0000000000000108
RBP: ffff88807829d030 R08: 0000000000000000 R09: ffffc90010a87ad7
R10: fffff52002150f5a R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff88807829d140
FS: 00007f7048b3e700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f704a2dd090 CR3: 0000000074ae3000 CR4: 0000000000350ee0
Call Trace:
<TASK>
proc_ioctl.part.0+0x48e/0x560 drivers/usb/core/devio.c:2340
proc_ioctl drivers/usb/core/devio.c:170 [inline]
proc_ioctl_compat drivers/usb/core/devio.c:2389 [inline]
usbdev_do_ioctl drivers/usb/core/devio.c:2705 [inline]
usbdev_ioctl+0xc01/0x36c0 drivers/usb/core/devio.c:2791
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f704a1c9049
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7048b3e168 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f704a2dbf60 RCX: 00007f704a1c9049
RDX: 0000000020000000 RSI: 00000000c00c5512 RDI: 0000000000000003
RBP: 00007f704a22308d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffc683ba24f R14: 00007f7048b3e300 R15: 0000000000022000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:949
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90010a87b98 EFLAGS: 00010216
RAX: dffffc0000000000 RBX: 1ffff92002150f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000008 RDI: 0000000000000108
RBP: ffff88807829d030 R08: 0000000000000000 R09: ffffc90010a87ad7
R10: fffff52002150f5a R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff88807829d140
FS: 00007f7048b3e700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0b074ee1b8 CR3: 0000000074ae3000 CR4: 0000000000350ef0
----------------
Code disassembly (best guess):
0: e8 03 42 80 3c callq 0x3c804208
5: 20 00 and %al,(%rax)
7: 0f 85 a3 03 00 00 jne 0x3b0
d: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
14: fc ff df
17: 4c 8b 65 48 mov 0x48(%rbp),%r12
1b: 49 8d bc 24 08 01 00 lea 0x108(%r12),%rdi
22: 00
23: 48 89 fa mov %rdi,%rdx
26: 48 c1 ea 03 shr $0x3,%rdx
* 2a: 0f b6 04 02 movzbl (%rdx,%rax,1),%eax <-- trapping instruction
2e: 84 c0 test %al,%al
30: 74 06 je 0x38
32: 0f 8e 6e 03 00 00 jle 0x3a6
38: 45 rex.RB
39: 0f .byte 0xf
3a: b6 b4 mov $0xb4,%dh
3c: 24 08 and $0x8,%al
3e: 01 00 add %eax,(%rax)
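
The non-canonical address in the fault line can be derived from the disassembly above. The instrumented code loads a pointer from 0x48(%rbp), adds 0x108, shifts the result right by 3 and adds the constant from the movabs before reading a shadow byte. With the loaded pointer (%r12) equal to NULL, that lookup lands on 0xdffffc0000000021. A minimal user-space sketch of the arithmetic follows; the helper name is made up for illustration, and the two constants are simply read off the movabs and shr instructions in the dump.

```c
#include <stdint.h>

/* Constants taken from the dump: the movabs immediate and the shr count. */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3

/*
 * Hypothetical helper (not a kernel function): address of the shadow byte
 * that the instrumented code reads before touching addr.
 */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}
```

Every address in the reported range [0x108, 0x10f] maps to the same shadow byte, which is why KASAN describes the fault as a null-ptr-deref in that range.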


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

syzbot

Jun 2, 2022, 3:49:30 PM
to gre...@linuxfoundation.org, linux-...@vger.kernel.org, raf...@kernel.org, syzkall...@googlegroups.com
syzbot has found a reproducer for the following issue on:

HEAD commit: d1dc87763f40 assoc_array: Fix BUG_ON during garbage collect
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=17d2e7f5f00000
kernel config: https://syzkaller.appspot.com/x/.config?x=c51cd24814bb5665
dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15613e2bf00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15c90adbf00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+dd3c97...@syzkaller.appspotmail.com

usb usb9: device_add((null)) --> -22
general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f]
CPU: 0 PID: 4190 Comm: syz-executor322 Not tainted 5.18.0-syzkaller-11972-gd1dc87763f40 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:948
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90003447b98 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff92000688f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000002 RDI: 0000000000000108
RBP: ffff88807a22f030 R08: 0000000000000000 R09: ffffffff8dbb1097
R10: fffffbfff1b76212 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff88807a22f0b0
FS: 0000555557335300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2b779a90b0 CR3: 000000007a1a7000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
proc_ioctl.part.0+0x48e/0x560 drivers/usb/core/devio.c:2356
proc_ioctl drivers/usb/core/devio.c:182 [inline]
proc_ioctl_default drivers/usb/core/devio.c:2391 [inline]
usbdev_do_ioctl drivers/usb/core/devio.c:2747 [inline]
usbdev_ioctl+0x2c08/0x36f0 drivers/usb/core/devio.c:2807
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl fs/ioctl.c:856 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f2b77979779
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe17c6ed98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f2b779bd184 RCX: 00007f2b77979779
RDX: 0000000020000040 RSI: 00000000c0105512 RDI: 0000000000000006
RBP: 00007ffe17c6edb0 R08: 0000000000000001 R09: 0000000000000001
R10: 000000000000ffff R11: 0000000000000246 R12: 0000000000000001
R13: 431bde82d7b634db R14: 0000000000000000 R15: 0000000000000000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:948
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90003447b98 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff92000688f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000002 RDI: 0000000000000108
RBP: ffff88807a22f030 R08: 0000000000000000 R09: ffffffff8dbb1097
R10: fffffbfff1b76212 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff88807a22f0b0
FS: 0000555557335300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2b779a90b0 CR3: 000000007a1a7000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
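
The RSI value in the user-space register dump is the ioctl request number. A small sketch that decodes it, assuming the generic Linux _IOC bit layout used on x86 (8-bit nr, 8-bit type, 14-bit size, 2-bit direction); the struct and function names are illustrative only.

```c
#include <stdint.h>

/* Decoded fields of an ioctl request number (illustrative type). */
struct ioc_fields {
	unsigned int dir;	/* bit 0 = write, bit 1 = read */
	unsigned int size;	/* size of the argument structure in bytes */
	char type;		/* subsystem "magic" character */
	unsigned int nr;	/* command number within the subsystem */
};

/* Split a request number into its fields using the generic layout. */
static struct ioc_fields ioc_decode(uint32_t req)
{
	struct ioc_fields f;

	f.nr = req & 0xff;
	f.type = (char)((req >> 8) & 0xff);
	f.size = (req >> 16) & 0x3fff;
	f.dir = (req >> 30) & 0x3;
	return f;
}
```

Decoding 0xc0105512 gives a read+write request of type 'U', number 18, with a 16-byte argument, which is consistent with the proc_ioctl_default frame in this trace. The first report's 0xc00c5512 differs only in the argument size (12 bytes), which is consistent with its proc_ioctl_compat frame.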

Hillf Danton

Jun 2, 2022, 11:35:47 PM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Thu, 02 Jun 2022 12:49:28 -0700
See if it is due to the race that can be cured with usb_lock_device(udev).

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git d1dc87763f40

--- y/drivers/usb/core/usb.c
+++ x/drivers/usb/core/usb.c
@@ -420,6 +420,8 @@ static void usb_release_dev(struct devic
 	kfree(udev->product);
 	kfree(udev->manufacturer);
 	kfree(udev->serial);
+	usb_lock_device(udev);
+	usb_unlock_device(udev);
 	kfree(udev);
 }

--

syzbot

Jun 2, 2022, 11:55:12 PM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in __device_attach

usb usb9: device_add((null)) --> -22
general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f]
CPU: 1 PID: 4084 Comm: syz-executor.0 Not tainted 5.18.0-syzkaller-11972-gd1dc87763f40-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:948
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90003347b98 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff92000668f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000002 RDI: 0000000000000108
RBP: ffff888021878030 R08: 0000000000000000 R09: ffffffff8dbb1097
R10: fffffbfff1b76212 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff8880218780b0
FS: 00007f8da1571700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8da059d090 CR3: 000000006b626000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
proc_ioctl.part.0+0x48e/0x560 drivers/usb/core/devio.c:2356
proc_ioctl drivers/usb/core/devio.c:182 [inline]
proc_ioctl_default drivers/usb/core/devio.c:2391 [inline]
usbdev_do_ioctl drivers/usb/core/devio.c:2747 [inline]
usbdev_ioctl+0x2c08/0x36f0 drivers/usb/core/devio.c:2807
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl fs/ioctl.c:856 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f8da0489109
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f8da1571168 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f8da059bf60 RCX: 00007f8da0489109
RDX: 0000000020000040 RSI: 00000000c0105512 RDI: 0000000000000005
RBP: 00007f8da04e308d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff3b94a64f R14: 00007f8da1571300 R15: 0000000000022000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:948
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90003347b98 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff92000668f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000002 RDI: 0000000000000108
RBP: ffff888021878030 R08: 0000000000000000 R09: ffffffff8dbb1097
R10: fffffbfff1b76212 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff8880218780b0
FS: 00007f8da1571700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8da059d090 CR3: 000000006b626000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: e8 03 42 80 3c callq 0x3c804208
5: 20 00 and %al,(%rax)
7: 0f 85 a3 03 00 00 jne 0x3b0
d: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
14: fc ff df
17: 4c 8b 65 48 mov 0x48(%rbp),%r12
1b: 49 8d bc 24 08 01 00 lea 0x108(%r12),%rdi
22: 00
23: 48 89 fa mov %rdi,%rdx
26: 48 c1 ea 03 shr $0x3,%rdx
* 2a: 0f b6 04 02 movzbl (%rdx,%rax,1),%eax <-- trapping instruction
2e: 84 c0 test %al,%al
30: 74 06 je 0x38
32: 0f 8e 6e 03 00 00 jle 0x3a6
38: 45 rex.RB
39: 0f .byte 0xf
3a: b6 b4 mov $0xb4,%dh
3c: 24 08 and $0x8,%al
3e: 01 00 add %eax,(%rax)


Tested on:

commit: d1dc8776 assoc_array: Fix BUG_ON during garbage collect
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=113f50ddf00000
kernel config: https://syzkaller.appspot.com/x/.config?x=c51cd24814bb5665
dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=12390667f00000

Hillf Danton

Jun 3, 2022, 3:44:54 AM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Thu, 02 Jun 2022 12:49:28 -0700
v1: see if it is due to the race that can be cured with usb_lock_device(udev).
v2: see if it is due to a device whose dev->p has not been set up.
--- y/drivers/usb/core/devio.c
+++ x/drivers/usb/core/devio.c
@@ -2352,7 +2352,7 @@ static int proc_ioctl(struct usb_dev_sta

 	/* let kernel drivers try to (re)bind to the interface */
 	case USBDEVFS_CONNECT:
-		if (!intf->dev.driver)
+		if (!intf->dev.driver && intf->dev.p)
 			retval = device_attach(&intf->dev);
 		else
 			retval = -EBUSY;
--
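
The effect of the extra condition can be sketched in user space. This is only an illustration with stand-in types: the mock_* names are hypothetical, the struct keeps just the two fields the check looks at, and the assumption (suggested by the trace, not confirmed here) is that dev->p is NULL for an interface whose device_add() failed.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the kernel structures; only the fields the check needs. */
struct mock_device_private {
	int unused;
};

struct mock_device {
	const void *driver;		/* non-NULL once a driver is bound */
	struct mock_device_private *p;	/* assumed NULL if registration failed */
};

/* Mock of device_attach(): pretend a matching driver was found and bound. */
static int mock_device_attach(struct mock_device *dev)
{
	(void)dev;
	return 1;
}

/* Mirrors the patched USBDEVFS_CONNECT branch: attach only if registered. */
static int mock_usbdevfs_connect(struct mock_device *dev)
{
	if (!dev->driver && dev->p)
		return mock_device_attach(dev);
	return -EBUSY;
}
```

With p == NULL the sketch returns -EBUSY instead of reaching the attach path, which is the behaviour the v2 patch is probing for.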

syzbot

Jun 3, 2022, 6:02:09 AM
to andriy.s...@linux.intel.com, gre...@linuxfoundation.org, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com
syzbot has bisected this issue to:

commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
Author: Andy Shevchenko <andriy.s...@linux.intel.com>
Date: Fri Jun 18 13:41:27 2021 +0000

ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1040b80df00000
start commit: d1dc87763f40 assoc_array: Fix BUG_ON during garbage collect
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=1240b80df00000
console output: https://syzkaller.appspot.com/x/log.txt?x=1440b80df00000
Reported-by: syzbot+dd3c97...@syzkaller.appspotmail.com
Fixes: a9c4cf299f5f ("ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

syzbot

Jun 3, 2022, 6:41:09 AM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+dd3c97...@syzkaller.appspotmail.com

Tested on:

commit: d1dc8776 assoc_array: Fix BUG_ON during garbage collect
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config: https://syzkaller.appspot.com/x/.config?x=c51cd24814bb5665
dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=1244e933f00000

Note: testing is done by a robot and is best-effort only.

Andy Shevchenko

Jun 3, 2022, 7:04:16 AM
to syzbot, gre...@linuxfoundation.org, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org, Alan Stern
On Fri, Jun 03, 2022 at 03:02:07AM -0700, syzbot wrote:
> syzbot has bisected this issue to:
>
> commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> Author: Andy Shevchenko <andriy.s...@linux.intel.com>
> Date: Fri Jun 18 13:41:27 2021 +0000
>
> ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros

Hmm... It's not obvious at all how this change can alter the behaviour so
drastically. device_add() is called from USB core with intf->dev.name == NULL
for some reason. A-ha, this looks like the fault injector at work: the call

	dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
		     dev->devpath, configuration, ifnum);

is missing a return code check.

But I'm not familiar with that code at all, adding Linux USB ML and Alan.
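
For illustration, the pattern in question can be sketched in user space. Everything below is a mock under stated assumptions: named_dev, mock_dev_set_name(), mock_register_interface() and the fail_next_alloc switch are hypothetical names, and the switch is only a crude stand-in for fault injection. The point is just that the caller looks at the return value before going any further.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a device that owns a heap-allocated name. */
struct named_dev {
	char *name;
};

/* Crude stand-in for fault injection: make the next allocation fail. */
static int fail_next_alloc;

/* Mock of a name setter that formats into freshly allocated memory. */
static int mock_dev_set_name(struct named_dev *dev, const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);	/* measure the formatted length */
	va_end(ap);
	if (len < 0)
		return -EINVAL;

	if (fail_next_alloc) {
		fail_next_alloc = 0;
		return -ENOMEM;			/* injected failure, name untouched */
	}
	buf = malloc((size_t)len + 1);
	if (!buf)
		return -ENOMEM;

	va_start(ap, fmt);
	vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);

	free(dev->name);
	dev->name = buf;
	return 0;
}

/* Caller that checks the return code instead of relying on a later failure. */
static int mock_register_interface(struct named_dev *intf, int busnum,
				   const char *devpath, int config, int ifnum)
{
	int ret = mock_dev_set_name(intf, "%d-%s:%d.%d",
				    busnum, devpath, config, ifnum);

	if (ret)
		return ret;	/* stop before the mock equivalent of device_add() */
	return 0;
}
```

Setting fail_next_alloc to 1 before the call makes mock_register_interface() return -ENOMEM and leaves the name NULL, which is the state the "device_add((null))" log line suggests.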

> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1040b80df00000
> start commit: d1dc87763f40 assoc_array: Fix BUG_ON during garbage collect
> git tree: upstream
> final oops: https://syzkaller.appspot.com/x/report.txt?x=1240b80df00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=1440b80df00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=c51cd24814bb5665
> dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15613e2bf00000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15c90adbf00000
>
> Reported-by: syzbot+dd3c97...@syzkaller.appspotmail.com
> Fixes: a9c4cf299f5f ("ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros")
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

--
With Best Regards,
Andy Shevchenko


Alan Stern

Jun 3, 2022, 11:42:23 AM
to Andy Shevchenko, syzbot, gre...@linuxfoundation.org, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org
On Fri, Jun 03, 2022 at 02:04:04PM +0300, Andy Shevchenko wrote:
> On Fri, Jun 03, 2022 at 03:02:07AM -0700, syzbot wrote:
> > syzbot has bisected this issue to:
> >
> > commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> > Author: Andy Shevchenko <andriy.s...@linux.intel.com>
> > Date: Fri Jun 18 13:41:27 2021 +0000
> >
> > ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros
>
> Hmm... It's not obvious at all how this change can alter the behaviour so
> drastically. device_add() is called from USB core with intf->dev.name == NULL
> for some reason. A-ha, this looks like the fault injector at work: the call
>
> 	dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
> 		     dev->devpath, configuration, ifnum);
>
> is missing a return code check.
>
> But I'm not familiar with that code at all, adding Linux USB ML and Alan.

I can't see any connection between this bug and acpi/sysfs.c. Is it a
bad bisection?

It looks like you're right about dev_set_name() failing. In fact, the
kernel appears to be littered with calls to that routine which do not
check the return code (the entire subtree below drivers/usb/ contains
only _one_ call that does check the return code!). The function doesn't
have any __must_check annotation, and its kerneldoc doesn't mention the
return code or the possibility of a failure.

Apparently the assumption is that if dev_set_name() fails then
device_add() later on will also fail, and the problem will be detected
then.

So now what should happen when device_add() for an interface fails in
usb_set_configuration()? I guess the interface should be deleted;
otherwise we have the possibility that people might still try to access
it via usbfs, as in the syzbot test run. Same goes for the
of_device_is_available() check.

Fixing that will be a little painful. Right now there are plenty of
places in the USB core that aren't prepared to cope with a non-existent
interface.

Alan Stern

Greg KH

Jun 3, 2022, 11:53:23 AM
to Alan Stern, Andy Shevchenko, syzbot, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org
But how can that really fail on a real system?

Is this just due to error-injection stuff? If so, I'm really loath to
rework the world for something that can never happen in real life.

Or is this a real syzbot-found-with-reproducer issue?

thanks,

greg k-h

Alan Stern

Jun 3, 2022, 12:03:35 PM
to Greg KH, Andy Shevchenko, syzbot, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org
Aren't there quite a few reasons why device_add() might fail? (Although
most of them probably are memory allocation errors...)

Basically, you have to make up your mind. If a function can fail, you
should be prepared to handle the failure. If it can't fail, there's no
point in even checking the return code.

Alan Stern

Greg KH

Jun 3, 2022, 12:12:00 PM
to Alan Stern, Andy Shevchenko, syzbot, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org
I was thinking of the dev_set_name() issue further back in the call
chain.

> Basically, you have to make up your mind. If a function can fail, you
> should be prepared to handle the failure. If it can't fail, there's no
> point in even checking the return code.

True, ok, we should unwind the mess. I'll try to look at it after the
merge window...

But again, is this a "real and able to be triggered from userspace"
problem, or just fault-injection-induced?

thanks,

greg k-h

Alan Stern

Jun 3, 2022, 12:27:28 PM
to Greg KH, Andy Shevchenko, syzbot, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org
On Fri, Jun 03, 2022 at 06:11:55PM +0200, Greg KH wrote:
> On Fri, Jun 03, 2022 at 12:03:32PM -0400, Alan Stern wrote:
> > On Fri, Jun 03, 2022 at 05:52:38PM +0200, Greg KH wrote:
> > > On Fri, Jun 03, 2022 at 11:42:19AM -0400, Alan Stern wrote:
> > > > So now what should happen when device_add() for an interface fails in
> > > > usb_set_configuration()?
> > >
> > > But how can that really fail on a real system?
> > >
> > > Is this just due to error-injection stuff? If so, I'm really loath to
> > > rework the world for something that can never happen in real life.
> > >
> > > Or is this a real syzbot-found-with-reproducer issue?
> >
> > Aren't there quite a few reasons why device_add() might fail? (Although
> > most of them probably are memory allocation errors...)
>
> I was thinking of the dev_set_name() issue further back in the call
> chain.

As far as I know, the only reason for dev_set_name() to fail is -ENOMEM.
That's not something the user can control directly.

> > Basically, you have to make up your mind. If a function can fail, you
> > should be prepared to handle the failure. If it can't fail, there's no
> > point in even checking the return code.
>
> True, ok, we should unwind the mess. I'll try to look at it after the
> merge window...
>
> But again, is this a "real and able to be triggered from userspace"
> problem, or just fault-injection-induced?

I don't think any of the failure paths here are controlled by the user.
They all seem to involve something going wrong internally in the kernel
(i.e., corruption or memory allocation failure for a small buffer).
Once that happens, the game is pretty much over anyway.

Is it worth handling this sort of thing, or should we ignore the
possibility and allow it to escalate to the point where the user can
potentially trigger a kernel panic? Another way of putting it is: How
gracefully do you want the kernel to collapse when this sort of
corruption happens?

Alan Stern

Dmitry Vyukov

Jun 4, 2022, 4:33:00 AM
to Greg KH, Alan Stern, Andy Shevchenko, syzbot, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org
Then this is something to fix in the fault injection subsystem.
Testing systems shouldn't be reporting false positives.
What allocations cannot fail in real life? Is it <=page_size?

Dan Carpenter

Jun 6, 2022, 8:39:07 AM
to Dmitry Vyukov, Greg KH, Alan Stern, Andy Shevchenko, syzbot, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org
Apparently in 2014, anything less than *EIGHT?!!* pages succeeded!

https://lwn.net/Articles/627419/

I have been on the lookout since that article and have never seen anyone
mention it changing. I think we should ignore that and say that
anything over PAGE_SIZE can fail. Possibly we could go smaller than
PAGE_SIZE...

regards,
dan carpenter

Dmitry Vyukov

Jun 7, 2022, 3:15:24 AM
to Dan Carpenter, Greg KH, Alan Stern, Andy Shevchenko, syzbot, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org, Linux-MM
+linux-mm for GFP expertise re what allocations cannot possibly fail
and should be excluded from fault injection.

Interesting, thanks for the link.

PAGE_SIZE looks like a good start. Once we have the predicate in
place, we can refine it later when/if we have more inputs.

But I wonder about GFP flags. They definitely have some impact on allocations.
If GFP_ACCOUNT is set, all allocations can fail, right?
If GFP_DMA/DMA32 is set, allocations can fail, right? What about other zones?
If GFP_NORETRY is set, allocations can fail?
What about GFP_NOMEMALLOC and GFP_ATOMIC?
What about GFP_IO/GFP_FS/GFP_DIRECT_RECLAIM/GFP_KSWAPD_RECLAIM? At
least some of these need to be set for allocations to not fail? Which
ones?
Are any other flags required to be set/unset for allocations to not fail?

FTR, here is a quick link to the flags list:
https://elixir.bootlin.com/linux/v5.19-rc1/source/include/linux/gfp.h#L32

Matthew Wilcox

Jun 7, 2022, 11:28:12 PM
to Dmitry Vyukov, Dan Carpenter, Greg KH, Alan Stern, Andy Shevchenko, syzbot, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org, Linux-MM
I'm not the expert on page allocation, but ...

I don't think GFP_ACCOUNT makes allocations fail. It might make reclaim
happen from within that cgroup, and it might cause an OOM kill for
something in that cgroup. But I don't think it makes a (low order)
allocation more likely to fail.

There's usually less memory available in DMA/DMA32 zones, but we have
so few allocations from those zones, I question the utility of focusing
testing on those allocations.

GFP_ATOMIC allows access to emergency pools, so I would say _less_ likely
to fail. KSWAPD_RECLAIM has no effect on whether _this_ allocation
succeeds or fails; it kicks kswapd to do reclaim, rather than doing
reclaim directly. DIRECT_RECLAIM definitely makes allocations more likely
to succeed. GFP_FS allows (direct) reclaim to happen from filesystems.
GFP_IO allows IO to start (ie writeback can start) in order to clean
dirty memory.

Anyway, I hope somebody who knows the page allocator better than I do
can say smarter things than this. Even better if they can put it into
Documentation/ somewhere ;-)

https://www.kernel.org/doc/html/latest/core-api/memory-allocation.html
exists but isn't quite enough to answer this question.

Dmitry Vyukov

Jun 8, 2022, 4:20:18 AM
to Matthew Wilcox, Dan Carpenter, Greg KH, Alan Stern, Andy Shevchenko, syzbot, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org, Linux-MM
Interesting.
I was thinking of some malicious, specifically crafted configurations
with a very low limit and a particular pattern of allocations. Also, what
if there is just 1 process (current)? Is it possible to kill and
reclaim the current process when a thread is stuck in the middle of
the kernel on a kmalloc?
Also I see e.g.:
Tasks with the OOM protection (oom_score_adj set to -1000)
are treated as an exception and are never killed.

I am not an expert on this either, but I think it may be hard to fight
with a specifically crafted attack.


> There's usually less memory available in DMA/DMA32 zones, but we have
> so few allocations from those zones, I question the utility of focusing
> testing on those allocations.
>
> GFP_ATOMIC allows access to emergency pools, so I would say _less_ likely
> to fail. KSWAPD_RECLAIM has no effect on whether _this_ allocation
> succeeds or fails; it kicks kswapd to do reclaim, rather than doing
> reclaim directly. DIRECT_RECLAIM definitely makes allocations more likely
> to succeed. GFP_FS allows (direct) reclaim to happen from filesystems.
> GFP_IO allows IO to start (ie writeback can start) in order to clean
> dirty memory.
>
> Anyway, I hope somebody who knows the page allocator better than I do
> can say smarter things than this. Even better if they can put it into
> Documentation/ somewhere ;-)

Even better to put this into code as a predicate function that fault
injection will use. It will also serve as precise up-to-date
documentation.

Dmitry Vyukov

Jun 8, 2022, 4:24:20 AM
to Matthew Wilcox, Dan Carpenter, Greg KH, Alan Stern, Andy Shevchenko, syzbot, hda...@sina.com, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, r...@rjwysocki.net, syzkall...@googlegroups.com, linu...@vger.kernel.org, Linux-MM
Also at the end of kmalloc as:
WARN_ON(!ret && !cant_fail(size, gfp));
!
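
A rough sketch of what such a predicate might look like, purely as an assumption for discussion: the thread has not settled which flags matter, so this takes the PAGE_SIZE threshold suggested above and additionally assumes that the allocation must be allowed to do direct reclaim and must not be marked no-retry. The flag bits and the page size below are mock values for this sketch, not the kernel's gfp_t definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mock values for this sketch only; NOT the kernel's GFP bits or PAGE_SIZE. */
#define MOCK_GFP_DIRECT_RECLAIM	(1u << 0)
#define MOCK_GFP_NORETRY	(1u << 1)
#define MOCK_PAGE_SIZE		4096u

/*
 * Hypothetical predicate: true if, under the assumptions stated above,
 * an allocation of this size with these flags is not expected to fail.
 */
static bool cant_fail(size_t size, unsigned int gfp)
{
	if (size > MOCK_PAGE_SIZE)
		return false;	/* larger requests are allowed to fail */
	if (!(gfp & MOCK_GFP_DIRECT_RECLAIM))
		return false;	/* cannot reclaim directly, so may fail */
	if (gfp & MOCK_GFP_NORETRY)
		return false;	/* caller asked the allocator to give up early */
	return true;
}
```

Fault injection could then skip any request for which the predicate returns true, and the conditions can be refined as more input arrives.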

syzbot

Jan 10, 2024, 8:12:06 AM
to 42.h...@gmail.com, andriy.s...@linux.intel.com, dan.ca...@oracle.com, dvy...@google.com, gre...@linuxfoundation.org, hda...@sina.com, kees...@chromium.org, le...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, linu...@vger.kernel.org, rafael.j...@intel.com, raf...@kernel.org, rien...@google.com, r...@rjwysocki.net, st...@rowland.harvard.edu, syzkall...@googlegroups.com, vba...@suse.cz, wi...@infradead.org
syzbot suspects this issue was fixed by commit:

commit 49378a05ce7f01a203550eb7c2ef772f6d24565c
Author: Vlastimil Babka <vba...@suse.cz>
Date: Thu Oct 26 15:45:42 2023 +0000

mm/slub: remove slab_alloc() and __kmem_cache_alloc_lru() wrappers

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15b179cde80000
start commit: d1dc87763f40 assoc_array: Fix BUG_ON during garbage collect
git tree: upstream
If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: mm/slub: remove slab_alloc() and __kmem_cache_alloc_lru() wrappers

syzbot

Mar 19, 2024, 7:28:13 PM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
No recent activity, existing reproducers are no longer triggering the issue.