[syzbot] BUG: sleeping function called from invalid context in smc_pnet_apply_ib

syzbot

Feb 17, 2022, 11:41:23 AM
to j...@ziepe.ca, liangw...@huawei.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, liwe...@huawei.com, net...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: c832962ac972 net: bridge: multicast: notify switchdev driv..
git tree: net
console output: https://syzkaller.appspot.com/x/log.txt?x=16b157bc700000
kernel config: https://syzkaller.appspot.com/x/.config?x=266de9da75c71a45
dashboard link: https://syzkaller.appspot.com/bug?extid=4f322a6d84e991c38775
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+4f322a...@syzkaller.appspotmail.com

infiniband syz1: set down
infiniband syz1: added lo
RDS/IB: syz1: added
smc: adding ib device syz1 with port count 1
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:577
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 17974, name: syz-executor.3
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
6 locks held by syz-executor.3/17974:
#0: ffffffff90865838 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv_msg+0x161/0x690 drivers/infiniband/core/netlink.c:164
#1: ffffffff8d04edf0 (link_ops_rwsem){++++}-{3:3}, at: nldev_newlink+0x25d/0x560 drivers/infiniband/core/nldev.c:1707
#2: ffffffff8d03e650 (devices_rwsem){++++}-{3:3}, at: enable_device_and_get+0xfc/0x3b0 drivers/infiniband/core/device.c:1321
#3: ffffffff8d03e510 (clients_rwsem){++++}-{3:3}, at: enable_device_and_get+0x15b/0x3b0 drivers/infiniband/core/device.c:1329
#4: ffff8880482c85c0 (&device->client_data_rwsem){++++}-{3:3}, at: add_client_context+0x3d0/0x5e0 drivers/infiniband/core/device.c:718
#5: ffff8880230a4118 (&pnettable->lock){++++}-{2:2}, at: smc_pnetid_by_table_ib+0x18c/0x470 net/smc/smc_pnet.c:1159
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 17974 Comm: syz-executor.3 Not tainted 5.17.0-rc3-syzkaller-00170-gc832962ac972 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9576
__mutex_lock_common kernel/locking/mutex.c:577 [inline]
__mutex_lock+0x9f/0x12f0 kernel/locking/mutex.c:733
smc_pnet_apply_ib+0x28/0x160 net/smc/smc_pnet.c:251
smc_pnetid_by_table_ib+0x2ae/0x470 net/smc/smc_pnet.c:1164
smc_ib_add_dev+0x4d7/0x900 net/smc/smc_ib.c:940
add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:720
enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1331
ib_register_device drivers/infiniband/core/device.c:1419 [inline]
ib_register_device+0x814/0xaf0 drivers/infiniband/core/device.c:1365
rxe_register_device+0x2fe/0x3b0 drivers/infiniband/sw/rxe/rxe_verbs.c:1146
rxe_add+0x1331/0x1710 drivers/infiniband/sw/rxe/rxe.c:246
rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:538
rxe_newlink drivers/infiniband/sw/rxe/rxe.c:268 [inline]
rxe_newlink+0xa9/0xd0 drivers/infiniband/sw/rxe/rxe.c:249
nldev_newlink+0x30a/0x560 drivers/infiniband/core/nldev.c:1717
rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x539/0x7e0 net/netlink/af_netlink.c:1343
netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:705 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:725
____sys_sendmsg+0x6e8/0x810 net/socket.c:2413
___sys_sendmsg+0xf3/0x170 net/socket.c:2467
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f909305f059
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f90919d4168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f9093171f60 RCX: 00007f909305f059
RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000004
RBP: 00007f90930b908d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff171c256f R14: 00007f90919d4300 R15: 0000000000022000
</TASK>

=============================
[ BUG: Invalid wait context ]
5.17.0-rc3-syzkaller-00170-gc832962ac972 #0 Tainted: G W
-----------------------------
syz-executor.3/17974 is trying to lock:
ffffffff8d710098 (smc_ib_devices.mutex){+.+.}-{3:3}, at: smc_pnet_apply_ib+0x28/0x160 net/smc/smc_pnet.c:251
other info that might help us debug this:
context-{4:4}
6 locks held by syz-executor.3/17974:
#0: ffffffff90865838 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv_msg+0x161/0x690 drivers/infiniband/core/netlink.c:164
#1: ffffffff8d04edf0 (link_ops_rwsem){++++}-{3:3}, at: nldev_newlink+0x25d/0x560 drivers/infiniband/core/nldev.c:1707
#2: ffffffff8d03e650 (devices_rwsem){++++}-{3:3}, at: enable_device_and_get+0xfc/0x3b0 drivers/infiniband/core/device.c:1321
#3: ffffffff8d03e510 (clients_rwsem){++++}-{3:3}, at: enable_device_and_get+0x15b/0x3b0 drivers/infiniband/core/device.c:1329
#4: ffff8880482c85c0 (&device->client_data_rwsem){++++}-{3:3}, at: add_client_context+0x3d0/0x5e0 drivers/infiniband/core/device.c:718
#5: ffff8880230a4118 (&pnettable->lock){++++}-{2:2}, at: smc_pnetid_by_table_ib+0x18c/0x470 net/smc/smc_pnet.c:1159
stack backtrace:
CPU: 1 PID: 17974 Comm: syz-executor.3 Tainted: G W 5.17.0-rc3-syzkaller-00170-gc832962ac972 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_lock_invalid_wait_context kernel/locking/lockdep.c:4678 [inline]
check_wait_context kernel/locking/lockdep.c:4739 [inline]
__lock_acquire.cold+0x213/0x3ab kernel/locking/lockdep.c:4977
lock_acquire kernel/locking/lockdep.c:5639 [inline]
lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5604
__mutex_lock_common kernel/locking/mutex.c:600 [inline]
__mutex_lock+0x12f/0x12f0 kernel/locking/mutex.c:733
smc_pnet_apply_ib+0x28/0x160 net/smc/smc_pnet.c:251
smc_pnetid_by_table_ib+0x2ae/0x470 net/smc/smc_pnet.c:1164
smc_ib_add_dev+0x4d7/0x900 net/smc/smc_ib.c:940
add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:720
enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1331
ib_register_device drivers/infiniband/core/device.c:1419 [inline]
ib_register_device+0x814/0xaf0 drivers/infiniband/core/device.c:1365
rxe_register_device+0x2fe/0x3b0 drivers/infiniband/sw/rxe/rxe_verbs.c:1146
rxe_add+0x1331/0x1710 drivers/infiniband/sw/rxe/rxe.c:246
rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:538
rxe_newlink drivers/infiniband/sw/rxe/rxe.c:268 [inline]
rxe_newlink+0xa9/0xd0 drivers/infiniband/sw/rxe/rxe.c:249
nldev_newlink+0x30a/0x560 drivers/infiniband/core/nldev.c:1717
rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x539/0x7e0 net/netlink/af_netlink.c:1343
netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:705 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:725
____sys_sendmsg+0x6e8/0x810 net/socket.c:2413
___sys_sendmsg+0xf3/0x170 net/socket.c:2467
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f909305f059
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f90919d4168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f9093171f60 RCX: 00007f909305f059
RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000004
RBP: 00007f90930b908d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff171c256f R14: 00007f90919d4300 R15: 0000000000022000
</TASK>
smc: ib device syz1 port 1 has pnetid SYZ2 (user defined)
lo speed is unknown, defaulting to 1000
lo speed is unknown, defaulting to 1000
lo speed is unknown, defaulting to 1000
lo speed is unknown, defaulting to 1000
lo speed is unknown, defaulting to 1000
lo speed is unknown, defaulting to 1000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

syzbot

Feb 17, 2022, 1:13:20 PM
to fmdefr...@gmail.com, j...@ziepe.ca, liangw...@huawei.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, liwe...@huawei.com, net...@vger.kernel.org, syzkall...@googlegroups.com
syzbot has found a reproducer for the following issue on:

HEAD commit: 5740d0689096 net: sched: limit TC_ACT_REPEAT loops
git tree: net
console output: https://syzkaller.appspot.com/x/log.txt?x=1474360e700000
kernel config: https://syzkaller.appspot.com/x/.config?x=88e226f0197aeba5
dashboard link: https://syzkaller.appspot.com/bug?extid=4f322a6d84e991c38775
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13dd93f2700000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16a497e2700000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+4f322a...@syzkaller.appspotmail.com

infiniband syz1: set active
infiniband syz1: added lo
RDS/IB: syz1: added
smc: adding ib device syz1 with port count 1
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:577
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3589, name: syz-executor180
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
6 locks held by syz-executor180/3589:
#0: ffffffff90865838 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv_msg+0x161/0x690 drivers/infiniband/core/netlink.c:164
#1: ffffffff8d04edf0 (link_ops_rwsem){++++}-{3:3}, at: nldev_newlink+0x25d/0x560 drivers/infiniband/core/nldev.c:1707
#2: ffffffff8d03e650 (devices_rwsem){++++}-{3:3}, at: enable_device_and_get+0xfc/0x3b0 drivers/infiniband/core/device.c:1321
#3: ffffffff8d03e510 (clients_rwsem){++++}-{3:3}, at: enable_device_and_get+0x15b/0x3b0 drivers/infiniband/core/device.c:1329
#4: ffff8880790445c0 (&device->client_data_rwsem){++++}-{3:3}, at: add_client_context+0x3d0/0x5e0 drivers/infiniband/core/device.c:718
#5: ffff88814a29c818 (&pnettable->lock){++++}-{2:2}, at: smc_pnetid_by_table_ib+0x18c/0x470 net/smc/smc_pnet.c:1159
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 0 PID: 3589 Comm: syz-executor180 Not tainted 5.17.0-rc3-syzkaller-00174-g5740d0689096 #0
RIP: 0033:0x7f7ef25bed59
Code: 28 c3 e8 5a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffcd0ce91d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7ef25bed59
RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000005
RBP: 00007f7ef25827c0 R08: 0000000000000014 R09: 0000000000000000
R10: 0000000000000041 R11: 0000000000000246 R12: 00007f7ef2582850
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>

=============================
[ BUG: Invalid wait context ]
5.17.0-rc3-syzkaller-00174-g5740d0689096 #0 Tainted: G W
-----------------------------
syz-executor180/3589 is trying to lock:
ffffffff8d7100d8 (smc_ib_devices.mutex){+.+.}-{3:3}, at: smc_pnet_apply_ib+0x28/0x160 net/smc/smc_pnet.c:251
other info that might help us debug this:
context-{4:4}
6 locks held by syz-executor180/3589:
#0: ffffffff90865838 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv_msg+0x161/0x690 drivers/infiniband/core/netlink.c:164
#1: ffffffff8d04edf0 (link_ops_rwsem){++++}-{3:3}, at: nldev_newlink+0x25d/0x560 drivers/infiniband/core/nldev.c:1707
#2: ffffffff8d03e650 (devices_rwsem){++++}-{3:3}, at: enable_device_and_get+0xfc/0x3b0 drivers/infiniband/core/device.c:1321
#3: ffffffff8d03e510 (clients_rwsem){++++}-{3:3}, at: enable_device_and_get+0x15b/0x3b0 drivers/infiniband/core/device.c:1329
#4: ffff8880790445c0 (&device->client_data_rwsem){++++}-{3:3}, at: add_client_context+0x3d0/0x5e0 drivers/infiniband/core/device.c:718
#5: ffff88814a29c818 (&pnettable->lock){++++}-{2:2}, at: smc_pnetid_by_table_ib+0x18c/0x470 net/smc/smc_pnet.c:1159
stack backtrace:
CPU: 0 PID: 3589 Comm: syz-executor180 Tainted: G W 5.17.0-rc3-syzkaller-00174-g5740d0689096 #0
RIP: 0033:0x7f7ef25bed59
Code: 28 c3 e8 5a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffcd0ce91d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7ef25bed59
RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000005
RBP: 00007f7ef25827c0 R08: 0000000000000014 R09: 0000000000000000
R10: 0000000000000041 R11: 0000000000000246 R12: 00007f7ef2582850
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

Tony Lu

Feb 21, 2022, 3:40:23 AM
to Fabio M. De Francesco, j...@ziepe.ca, liangw...@huawei.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, liwe...@huawei.com, net...@vger.kernel.org, syzkall...@googlegroups.com, syzbot
On Thu, Feb 17, 2022 at 07:05:31PM +0100, Fabio M. De Francesco wrote:
> If I recall it well, read_lock() disables preemption.
>
> smc_pnetid_by_table_ib() uses read_lock() and then it calls smc_pnet_apply_ib()
> which, in turn, calls mutex_lock(&smc_ib_devices.mutex). Therefore the code
> acquires a mutex while in atomic and we get a SAC bug.
>
> Actually, even if my argument is correct(?), I don't know if the read_lock()
> in smc_pnetid_by_table_ib() can be converted to a sleeping lock like a mutex or
> a semaphore.

I think it is okay to use a mutex here, because this path is not hot and
nothing in it requires a spinning lock. pnettable is accessed from the
netlink, syscall and netdevice notifier paths.
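
The sequence described in the quoted analysis can be modeled in userspace. This is only a sketch, not kernel code and not the patch that was tested: every model_* name is hypothetical, and the counter merely stands in for the preempt_count shown in the report ("preempt_count: 1, expected: 0"). It compares taking a sleeping lock under a spinning read lock with taking it under another sleeping lock.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical userspace model, NOT kernel code. All model_* names are
 * invented for illustration only.
 */
int model_preempt_count;   /* stands in for the task's preempt_count */
int model_splats;          /* how many times the might-sleep check fired */

/* A spinning reader lock disables preemption while it is held. */
void model_read_lock(void)
{
	model_preempt_count++;
}

void model_read_unlock(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* A sleeping lock first checks that sleeping is allowed in this context. */
void model_mutex_lock(void)
{
	if (model_preempt_count != 0) {
		model_splats++;
		printf("BUG: sleeping function called from invalid context "
		       "(preempt_count: %d, expected: 0)\n",
		       model_preempt_count);
	}
}

void model_mutex_unlock(void)
{
}

/* Shape of the reported path: a mutex taken under a spinning read lock. */
int model_rwlock_path(void)
{
	model_splats = 0;
	model_read_lock();      /* models read_lock(&pnettable->lock) */
	model_mutex_lock();     /* models mutex_lock(&smc_ib_devices.mutex) */
	model_mutex_unlock();
	model_read_unlock();
	return model_splats;
}

/* Shape after the suggested conversion: both locks are allowed to sleep. */
int model_mutex_path(void)
{
	model_splats = 0;
	model_mutex_lock();     /* models pnettable->lock as a mutex */
	model_mutex_lock();     /* models mutex_lock(&smc_ib_devices.mutex) */
	model_mutex_unlock();
	model_mutex_unlock();
	return model_splats;
}
```

In this model, model_rwlock_path() trips the check once and model_mutex_path() never does, which mirrors why converting the outer lock to a sleeping lock would silence the report.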

Tony Lu

Feb 21, 2022, 4:19:04 AM
to Fabio M. De Francesco, j...@ziepe.ca, liangw...@huawei.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, liwe...@huawei.com, net...@vger.kernel.org, syzkall...@googlegroups.com, syzbot
On Thu, Feb 17, 2022 at 07:05:31PM +0100, Fabio M. De Francesco wrote:
> On Thursday, 17 February 2022 17:41:22 CET syzbot wrote:
> If I recall it well, read_lock() disables preemption.
>
> smc_pnetid_by_table_ib() uses read_lock() and then it calls smc_pnet_apply_ib()
> which, in turn, calls mutex_lock(&smc_ib_devices.mutex). Therefore the code
> acquires a mutex while in atomic and we get a SAC bug.
>
> Actually, even if my argument is correct(?), I don't know if the read_lock()
> in smc_pnetid_by_table_ib() can be converted to a sleeping lock like a mutex or
> a semaphore.

See my previous email. I think it is safe to convert the read_lock() to a
mutex; this path already takes one, smc_ib_devices.mutex.

Thank you,
Tony Lu

syzbot

Feb 23, 2022, 4:06:10 AM
to fmdefr...@gmail.com, j...@ziepe.ca, liangw...@huawei.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, liwe...@huawei.com, net...@vger.kernel.org, syzkall...@googlegroups.com, ton...@linux.alibaba.com
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

net/smc/smc_pnet.h:32:2: error: unknown type name 'mutex'


Tested on:

commit: 5c1ee569 Merge branch 'for-5.17-fixes' of git://git.ke..
git tree: upstream
dashboard link: https://syzkaller.appspot.com/bug?extid=4f322a6d84e991c38775
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=116231fe700000
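
One possible reading of the build error above, assuming the first tested patch declared the new field as `mutex lock;` (the patch is only linked here, not quoted, so this is a guess): in C a struct tag is not a type name by itself, so the field has to be spelled `struct mutex`. The sketch below uses a hypothetical stand-in for the kernel's struct mutex and a hypothetical table type to show the accepted spelling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct mutex, NOT its real layout. */
struct mutex {
	int locked;
};

/* Hypothetical table type, loosely named after the pnettable in the trace. */
struct model_pnettable {
	struct mutex lock;   /* OK: the tag is used with the struct keyword */
	/* mutex lock; */    /* rejected by gcc: unknown type name 'mutex' */
	int entries;
};

/* Hypothetical initializer for the model type above. */
void model_pnettable_init(struct model_pnettable *t)
{
	t->lock.locked = 0;
	t->entries = 0;
}
```

Uncommenting the second field declaration reproduces the same class of diagnostic that syzbot reported for net/smc/smc_pnet.h.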

syzbot

Feb 23, 2022, 4:40:09 AM
to fmdefr...@gmail.com, j...@ziepe.ca, liangw...@huawei.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, liwe...@huawei.com, net...@vger.kernel.org, syzkall...@googlegroups.com, ton...@linux.alibaba.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+4f322a...@syzkaller.appspotmail.com

Tested on:

commit: 5c1ee569 Merge branch 'for-5.17-fixes' of git://git.ke..
git tree: upstream
kernel config: https://syzkaller.appspot.com/x/.config?x=15187fc11a461d83
dashboard link: https://syzkaller.appspot.com/bug?extid=4f322a6d84e991c38775
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=15bc3696700000

Note: testing is done by a robot and is best-effort only.