[syzbot] [net?] possible deadlock in team_del_slave (3)


syzbot

Apr 26, 2024, 7:59:35 AM
to da...@davemloft.net, edum...@google.com, ji...@resnulli.us, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 480e035fc4c7 Merge tag 'drm-next-2024-03-13' of https://gi..
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1662179e180000
kernel config: https://syzkaller.appspot.com/x/.config?x=1e5b814e91787669
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1058e7b9180000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11919365180000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/5f73b6ef963d/disk-480e035f.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/46c949396aad/vmlinux-480e035f.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e3b4d0f5a5f8/bzImage-480e035f.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+705c61...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.8.0-syzkaller-08073-g480e035fc4c7 #0 Not tainted
------------------------------------------------------
syz-executor419/5074 is trying to acquire lock:
ffff888023dc4d20 (team->team_lock_key){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988

but task is already holding lock:
ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
wiphy_lock include/net/cfg80211.h:5951 [inline]
ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
__dev_open+0x2d3/0x450 net/core/dev.c:1430
dev_open+0xae/0x1b0 net/core/dev.c:1466
team_port_add drivers/net/team/team.c:1214 [inline]
team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
do_set_master net/core/rtnetlink.c:2685 [inline]
do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
rtnl_setlink+0x40d/0x5a0 net/core/rtnetlink.c:3185
rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6d/0x75

-> #0 (team->team_lock_key){+.+.}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
call_netdevice_notifiers net/core/dev.c:2002 [inline]
unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
unregister_netdevice_many net/core/dev.c:11154 [inline]
unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
unregister_netdevice include/linux/netdevice.h:3115 [inline]
_cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6d/0x75

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key);

*** DEADLOCK ***

3 locks held by syz-executor419/5074:
#0: ffffffff8f3f1a30 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1216
#1: ffffffff8f38ce88 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x5f/0x8b0 net/wireless/nl80211.c:16401
#2: ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389

stack backtrace:
CPU: 1 PID: 5074 Comm: syz-executor419 Not tainted 6.8.0-syzkaller-08073-g480e035fc4c7 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
call_netdevice_notifiers net/core/dev.c:2002 [inline]
unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
unregister_netdevice_many net/core/dev.c:11154 [inline]
unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
unregister_netdevice include/linux/netdevice.h:3115 [inline]
_cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7f963cb981a9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffdde1419a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f963cbe53f6 RCX: 00007f963cb981a9
RDX: 0000000000000000 RSI: 0000000020000400 RDI: 0000000000000004
RBP: 00007f963cc17440 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000031
R13: 0000000000000003 R14: 0000000000050012 R15: 00007ffdde141a02
</TASK>
team0: Port device wlan0 removed


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Hillf Danton

Apr 26, 2024, 10:17:14 AM
to syzbot, edum...@google.com, linux-...@vger.kernel.org, net...@vger.kernel.org, Boqun Feng, syzkall...@googlegroups.com
On Fri, 26 Apr 2024 04:59:32 -0700
ASSERT_RTNL();
ASSERT_RTNL();
lockdep_assert_wiphy(sdata->local->hw.wiphy);

Given ASSERT_RTNL() on both sides, it is difficult to understand the
reported deadlock.
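
For readers following the report, the "possible circular locking dependency" warning can be pictured as a cycle check on a graph of lock classes. The sketch below is a minimal user-space illustration of that idea only; it is not lockdep's implementation, and the enum values and function names are hypothetical, chosen to mirror the two locks named in the report. It tracks only ordering between lock classes and does not model any outer lock such as RTNL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { WIPHY_MTX, TEAM_LOCK, NR_CLASSES };

/* dep[a][b] is true once some task has taken b while holding a. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is `to` reachable from `from` via recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to,
		      bool seen[NR_CLASSES])
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (dep[from][next] && !seen[next] &&
		    reachable((enum lock_class)next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquire `next` while holding `held`".
 * Returns false if the new edge would close a cycle, which is the
 * situation the warning describes.
 */
static bool record_acquire(enum lock_class held, enum lock_class next)
{
	bool seen[NR_CLASSES] = { false };

	assert(held != next);	/* same-class nesting is out of scope here */
	if (reachable(next, held, seen))
		return false;	/* next -> ... -> held already recorded */
	dep[held][next] = true;
	return true;
}
```

In this model, the team_add_slave() path records the edge from the team lock to the wiphy mutex, and the later interface-removal path, which holds the wiphy mutex and then asks for the team lock, is rejected because it would close the cycle.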

Jeongjun Park

Jul 3, 2024, 7:26:07 AM
to syzbot+705c61...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com

syzbot

Jul 3, 2024, 9:41:08 AM
to aha3...@gmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in team_del_slave

bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc6-syzkaller-00061-ge9d22f7a6655 #0 Not tainted
------------------------------------------------------
kworker/u8:4/61 is trying to acquire lock:
ffff888023524d20 (team->team_lock_key#4){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990

but task is already holding lock:
ffff8880226b0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
ffff8880226b0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x700 net/mac80211/iface.c:2280

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
wiphy_lock include/net/cfg80211.h:5966 [inline]
ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
__dev_open+0x2d3/0x450 net/core/dev.c:1472
dev_open+0xae/0x1b0 net/core/dev.c:1508
team_port_add drivers/net/team/team_core.c:1216 [inline]
team_add_slave+0x9b3/0x2750 drivers/net/team/team_core.c:1976
do_set_master net/core/rtnetlink.c:2701 [inline]
do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2907
__rtnl_newlink net/core/rtnetlink.c:3696 [inline]
rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
___sys_sendmsg net/socket.c:2639 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (team->team_lock_key#4){+.+.}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3869
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990
team_device_event+0x200/0x5b0 drivers/net/team/team_core.c:2984
notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
call_netdevice_notifiers net/core/dev.c:2044 [inline]
unregister_netdevice_many_notify+0xd75/0x16b0 net/core/dev.c:11219
unregister_netdevice_many net/core/dev.c:11277 [inline]
unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11156
unregister_netdevice include/linux/netdevice.h:3119 [inline]
_cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
ieee80211_remove_interfaces+0x4db/0x700 net/mac80211/iface.c:2305
ieee80211_unregister_hw+0x5d/0x2c0 net/mac80211/main.c:1658
mac80211_hwsim_del_radio+0x2c2/0x4c0 drivers/net/wireless/virtual/mac80211_hwsim.c:5576
hwsim_exit_net+0x5c1/0x670 drivers/net/wireless/virtual/mac80211_hwsim.c:6453
ops_exit_list net/core/net_namespace.c:173 [inline]
cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
process_one_work kernel/workqueue.c:3248 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key#4);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key#4);

*** DEADLOCK ***

5 locks held by kworker/u8:4/61:
#0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3223 [inline]
#0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3329
#1: ffffc900015c7d00 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3224 [inline]
#1: ffffc900015c7d00 (net_cleanup_work){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3329
#2: ffffffff8f5da690 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x16a/0xcc0 net/core/net_namespace.c:594
#3: ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: ieee80211_unregister_hw+0x55/0x2c0 net/mac80211/main.c:1651
#4: ffff8880226b0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
#4: ffff8880226b0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x700 net/mac80211/iface.c:2280

stack backtrace:
CPU: 0 PID: 61 Comm: kworker/u8:4 Not tainted 6.10.0-rc6-syzkaller-00061-ge9d22f7a6655 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: netns cleanup_net
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3869
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990
team_device_event+0x200/0x5b0 drivers/net/team/team_core.c:2984
notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
call_netdevice_notifiers net/core/dev.c:2044 [inline]
unregister_netdevice_many_notify+0xd75/0x16b0 net/core/dev.c:11219
unregister_netdevice_many net/core/dev.c:11277 [inline]
unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11156
unregister_netdevice include/linux/netdevice.h:3119 [inline]
_cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
ieee80211_remove_interfaces+0x4db/0x700 net/mac80211/iface.c:2305
ieee80211_unregister_hw+0x5d/0x2c0 net/mac80211/main.c:1658
mac80211_hwsim_del_radio+0x2c2/0x4c0 drivers/net/wireless/virtual/mac80211_hwsim.c:5576
hwsim_exit_net+0x5c1/0x670 drivers/net/wireless/virtual/mac80211_hwsim.c:6453
ops_exit_list net/core/net_namespace.c:173 [inline]
cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
process_one_work kernel/workqueue.c:3248 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
team0: Port device wlan1 removed
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
netdevsim netdevsim4 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim4 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim4 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim4 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
team0: Port device wlan1 removed
team0: Port device wlan1 removed
team0: Port device wlan1 removed
team0: Port device wlan1 removed
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed


Tested on:

commit: e9d22f7a Merge tag 'linux_kselftest-fixes-6.10-rc7' of..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14efde81980000
kernel config: https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.

Jeongjun Park

Jul 3, 2024, 9:44:45 AM
to syzbot+705c61...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
---
drivers/net/team/team_core.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..3ac82df876b0 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
 			  struct netlink_ext_ack *extack)
 {
 	struct team *team = netdev_priv(dev);
-	int err;
+	int err, locked;

-	mutex_lock(&team->lock);
+	locked = mutex_trylock(&team->lock);
 	err = team_port_add(team, port_dev, extack);
-	mutex_unlock(&team->lock);
+	if (locked)
+		mutex_unlock(&team->lock);

 	if (!err)
 		netdev_change_features(dev);
@@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
 static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
 {
 	struct team *team = netdev_priv(dev);
-	int err;
+	int err, locked;

-	mutex_lock(&team->lock);
+	locked = mutex_trylock(&team->lock);
 	err = team_port_del(team, port_dev);
-	mutex_unlock(&team->lock);
+	if (locked)
+		mutex_unlock(&team->lock);

 	if (err)
 		return err;
--

syzbot

Jul 3, 2024, 10:19:06 AM
to aha3...@gmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+705c61...@syzkaller.appspotmail.com

Tested on:

commit: e9d22f7a Merge tag 'linux_kselftest-fixes-6.10-rc7' of..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16dbf4e1980000
kernel config: https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=16fc5485980000

Note: testing is done by a robot and is best-effort only.

Jeongjun Park

Jul 3, 2024, 10:52:10 AM
to ji...@resnulli.us, syzbot+705c61...@syzkaller.appspotmail.com, da...@davemloft.net, edum...@google.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, Jeongjun Park
       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key#4);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key#4);

Deadlock occurs due to the above scenario. Therefore,
modify the code as shown in the patch below to prevent deadlock.

Regards,
Jeongjun Park.

Reported-and-tested-by: syzbot+705c61...@syzkaller.appspotmail.com
Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
Signed-off-by: Jeongjun Park <aha3...@gmail.com>

Jeongjun Park

Jul 3, 2024, 11:52:24 AM
to syzbot+705c61...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
drivers/net/team/team_core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..43d7c73b25aa 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
 	struct team *team = netdev_priv(dev);
 	int err;

-	mutex_lock(&team->lock);
+	if (!mutex_trylock(&team->lock))
+		return -EBUSY;
 	err = team_port_add(team, port_dev, extack);
 	mutex_unlock(&team->lock);

@@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
 	struct team *team = netdev_priv(dev);
 	int err;

-	mutex_lock(&team->lock);
+	if (!mutex_trylock(&team->lock))
+		return -EBUSY;
 	err = team_port_del(team, port_dev);
 	mutex_unlock(&team->lock);

--

Jeongjun Park

Jul 3, 2024, 12:02:19 PM
to michal...@intel.com, aha3...@gmail.com, da...@davemloft.net, edum...@google.com, ji...@resnulli.us, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzbot+705c61...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
>
> On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&rdev->wiphy.mtx);
> >                                lock(team->team_lock_key#4);
> >                                lock(&rdev->wiphy.mtx);
> >   lock(team->team_lock_key#4);
> >
> > Deadlock occurs due to the above scenario. Therefore,
> > modify the code as shown in the patch below to prevent deadlock.
> >
> > Regards,
> > Jeongjun Park.
>
> The commit message should contain the patch description only (without
> salutations, etc.).
>
> >
> > Reported-and-tested-by: syzbot+705c61...@syzkaller.appspotmail.com
> > Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
> > Signed-off-by: Jeongjun Park <aha3...@gmail.com>
> > ---
> >  drivers/net/team/team_core.c | 14 ++++++++------
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > index ab1935a4aa2c..3ac82df876b0 100644
> > --- a/drivers/net/team/team_core.c
> > +++ b/drivers/net/team/team_core.c
> > @@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> >                           struct netlink_ext_ack *extack)
> >  {
> >         struct team *team = netdev_priv(dev);
> > -       int err;
> > +       int err, locked;
> >
> > -       mutex_lock(&team->lock);
> > +       locked = mutex_trylock(&team->lock);
> >         err = team_port_add(team, port_dev, extack);
> > -       mutex_unlock(&team->lock);
> > +       if (locked)
> > +               mutex_unlock(&team->lock);
>
> This is not correct usage of 'mutex_trylock()' API. In such a case you
> could as well remove the lock completely from that part of code.
> If "mutex_trylock()" returns false it means the mutex cannot be taken
> (because it was already taken by other thread), so you should not modify
> the resources that were expected to be protected by the mutex.
> In other words, there is a risk of modifying resources using
> "team_port_add()" by several threads at a time.
>
> >
> >         if (!err)
> >                 netdev_change_features(dev);
> > @@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> >  static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> >  {
> >         struct team *team = netdev_priv(dev);
> > -       int err;
> > +       int err, locked;
> >
> > -       mutex_lock(&team->lock);
> > +       locked = mutex_trylock(&team->lock);
> >         err = team_port_del(team, port_dev);
> > -       mutex_unlock(&team->lock);
> > +       if (locked)
> > +               mutex_unlock(&team->lock);
>
> The same story as in case of "team_add_slave()".
>
> >
> >         if (err)
> >                 return err;
> > --
> >
>
> The patch does not seem to be a correct solution to remove a deadlock.
> Most probably a synchronization design needs an inspection.
> If you really want to use "mutex_trylock()" API, please consider several
> attempts of taking the mutex, but never modify the protected resources when
> the mutex is not taken successfully.
>

Thanks for your comments. I rewrote the patch based on them.
This time it returns an error when the mutex cannot be taken, so the
protected resources are never modified without holding the lock. I would
appreciate your feedback on this version.

> Thanks,
> Michal
>
>

Regards,
Jeongjun Park
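
The review point above, that a failed trylock must never be followed by a write to the protected data, can be sketched in user space with POSIX mutexes. This is only an illustration under stated assumptions: `struct port_table`, `port_table_add()` and the `attempts` parameter are hypothetical names, and `pthread_mutex_trylock()` stands in for the kernel's `mutex_trylock()` (note that the POSIX call returns 0 on success and `EBUSY` on contention, whereas the kernel call returns nonzero on success).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the state that team->lock protects. */
struct port_table {
	pthread_mutex_t lock;
	int nr_ports;
};

/*
 * Try to take the lock up to `attempts` times. On success the table is
 * modified under the lock; on failure it is left untouched and -EBUSY
 * is returned, so the caller has to cope with the failure.
 */
static int port_table_add(struct port_table *t, int attempts)
{
	for (int i = 0; i < attempts; i++) {
		int rc = pthread_mutex_trylock(&t->lock);

		if (rc == EBUSY)
			continue;	/* contended: retry, do not touch t */
		if (rc != 0)
			return -rc;	/* unexpected error from the lock */

		t->nr_ports++;		/* protected update */
		rc = pthread_mutex_unlock(&t->lock);
		assert(rc == 0);
		(void)rc;
		return 0;
	}
	return -EBUSY;
}
```

The sketch retries without any backoff, which is acceptable only for illustration. It also pushes the failure onto the caller, which is the property the next reply in the thread objects to for team_del_slave().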

Eric Dumazet

Jul 3, 2024, 12:30:23 PM
to Jeongjun Park, michal...@intel.com, da...@davemloft.net, ji...@resnulli.us, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzbot+705c61...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
Failing team_del_slave() is not an option. It will introduce various issues.

syzbot

Jul 3, 2024, 12:35:04 PM7/3/24
to aha3...@gmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+705c61...@syzkaller.appspotmail.com

Tested on:

commit: e9d22f7a Merge tag 'linux_kselftest-fixes-6.10-rc7' of..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14125485980000
kernel config: https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=1489b399980000

Jiri Pirko

Jul 4, 2024, 6:15:40 AM7/4/24
to syzbot, da...@davemloft.net, edum...@google.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com
I wonder, since we already rely on rtnl in lots of team code, perhaps we
can remove team->lock completely and convert the rest of the code to be
protected by rtnl lock as well.

Jeongjun Park

Jul 4, 2024, 6:43:38 AM7/4/24
to syzbot+705c61...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
>
> On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&rdev->wiphy.mtx);

Jeongjun Park

Jul 4, 2024, 6:45:49 AM7/4/24
to syzbot+705c61...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
drivers/net/team/team_core.c | 32 +++++++++++++++++++++++---------
1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..a12366fd420c 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1142,31 +1142,37 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
        char *portname = port_dev->name;
        int err;

+       rtnl_lock();
+
        if (port_dev->flags & IFF_LOOPBACK) {
                NL_SET_ERR_MSG(extack, "Loopback device can't be added as a team port");
                netdev_err(dev, "Device %s is loopback device. Loopback devices can't be added as a team port\n",
                           portname);
-               return -EINVAL;
+               err = -EINVAL;
+               goto err_out;
        }

        if (netif_is_team_port(port_dev)) {
                NL_SET_ERR_MSG(extack, "Device is already a port of a team device");
                netdev_err(dev, "Device %s is already a port "
                                "of a team device\n", portname);
-               return -EBUSY;
+               err = -EBUSY;
+               goto err_out;
        }

        if (dev == port_dev) {
                NL_SET_ERR_MSG(extack, "Cannot enslave team device to itself");
                netdev_err(dev, "Cannot enslave team device to itself\n");
-               return -EINVAL;
+               err = -EINVAL;
+               goto err_out;
        }

        if (netdev_has_upper_dev(dev, port_dev)) {
                NL_SET_ERR_MSG(extack, "Device is already an upper device of the team interface");
                netdev_err(dev, "Device %s is already an upper device of the team interface\n",
                           portname);
-               return -EBUSY;
+               err = -EBUSY;
+               goto err_out;
        }

        if (port_dev->features & NETIF_F_VLAN_CHALLENGED &&
@@ -1174,7 +1180,8 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
                NL_SET_ERR_MSG(extack, "Device is VLAN challenged and team device has VLAN set up");
                netdev_err(dev, "Device %s is VLAN challenged and team device has VLAN set up\n",
                           portname);
-               return -EPERM;
+               err = -EPERM;
+               goto err_out;
        }

        err = team_dev_type_check_change(dev, port_dev);
@@ -1185,13 +1192,16 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
                NL_SET_ERR_MSG(extack, "Device is up. Set it down before adding it as a team port");
                netdev_err(dev, "Device %s is up. Set it down before adding it as a team port\n",
                           portname);
-               return -EBUSY;
+               err = -EBUSY;
+               goto err_out;
        }

        port = kzalloc(sizeof(struct team_port) + team->mode->port_priv_size,
                       GFP_KERNEL);
-       if (!port)
-               return -ENOMEM;
+       if (!port) {
+               err = -ENOMEM;
+               goto err_out;
+       }

        port->dev = port_dev;
        port->team = team;
@@ -1213,7 +1223,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
                goto err_port_enter;
        }

+       mutex_unlock(&team->lock);
        err = dev_open(port_dev, extack);
+       mutex_lock(&team->lock);
        if (err) {
                netdev_dbg(dev, "Device %s opening failed\n",
                           portname);
@@ -1292,6 +1304,7 @@ static int team_port_add(struct team *team, struct net_device *port_dev,

        netdev_info(dev, "Port device %s added\n", portname);

+       rtnl_unlock();
        return 0;

 err_set_slave_promisc:
@@ -1321,7 +1334,8 @@ static int team_port_add(struct team *team, struct net_device *port_dev,

 err_set_mtu:
        kfree(port);
-
+err_out:
+       rtnl_unlock();
        return err;
 }

--

Jeongjun Park

Jul 4, 2024, 7:02:59 AM7/4/24
to syzbot+705c61...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
drivers/net/team/team_core.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..245566a1875d 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1213,7 +1213,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
                goto err_port_enter;
        }

+       mutex_unlock(&team->lock);
        err = dev_open(port_dev, extack);
+       mutex_lock(&team->lock);
        if (err) {
                netdev_dbg(dev, "Device %s opening failed\n",
                           portname);
--

syzbot

Jul 4, 2024, 12:07:04 PM7/4/24
to aha3...@gmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

possible deadlock in team_add_slave

bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
============================================
WARNING: possible recursive locking detected
6.10.0-rc6-syzkaller-00069-g795c58e4c7fc-dirty #0 Not tainted
--------------------------------------------
syz-executor.0/5159 is trying to acquire lock:
ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: team_port_add drivers/net/team/team_core.c:1145 [inline]
ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: team_add_slave+0xdd/0x2720 drivers/net/team/team_core.c:1990

but task is already holding lock:
ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x842/0x1180 net/core/rtnetlink.c:6632

other info that might help us debug this:
Possible unsafe locking scenario:

       CPU0
       ----
  lock(rtnl_mutex);
  lock(rtnl_mutex);

*** DEADLOCK ***

May be due to missing lock nesting notation

2 locks held by syz-executor.0/5159:
#0: ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
#0: ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x842/0x1180 net/core/rtnetlink.c:6632
#1: ffff88806a00cd20 (team->team_lock_key){+.+.}-{3:3}, at: team_add_slave+0xb0/0x2720 drivers/net/team/team_core.c:1989

stack backtrace:
CPU: 0 PID: 5159 Comm: syz-executor.0 Not tainted 6.10.0-rc6-syzkaller-00069-g795c58e4c7fc-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
check_deadlock kernel/locking/lockdep.c:3062 [inline]
validate_chain+0x15d3/0x5900 kernel/locking/lockdep.c:3856
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
team_port_add drivers/net/team/team_core.c:1145 [inline]
team_add_slave+0xdd/0x2720 drivers/net/team/team_core.c:1990
do_set_master net/core/rtnetlink.c:2701 [inline]
do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2907
__rtnl_newlink net/core/rtnetlink.c:3696 [inline]
rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
__sys_sendto+0x3a4/0x4f0 net/socket.c:2192
__do_sys_sendto net/socket.c:2204 [inline]
__se_sys_sendto net/socket.c:2200 [inline]
__x64_sys_sendto+0xde/0x100 net/socket.c:2200
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fe59307ed43
Code: 64 89 02 48 c7 c0 ff ff ff ff eb b7 66 2e 0f 1f 84 00 00 00 00 00 90 80 3d c1 91 10 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
RSP: 002b:00007fe5932df648 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007fe593ce4620 RCX: 00007fe59307ed43
RDX: 0000000000000028 RSI: 00007fe593ce4670 RDI: 0000000000000003
RBP: 0000000000000001 R08: 00007fe5932df664 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
R13: 0000000000000000 R14: 00007fe593ce4670 R15: 0000000000000000
</TASK>


Warning: Permanently added '10.128.0.29' (ED25519) to the list of known hosts.
2024/07/04 16:06:06 ignoring optional flag "sandboxArg"="0"
2024/07/04 16:06:07 parsed 1 programs
[ 63.126006][ T5090] cgroup: Unknown subsys name 'net'
[ 63.411712][ T5090] cgroup: Unknown subsys name 'rlimit'
[ 64.609660][ T5092] Adding 124996k swap on ./swap-file. Priority:0 extents:1 across:124996k
[ 65.064470][ T53] Bluetooth: hci0: unexpected cc 0x0c03 length: 249 > 1
[ 65.072894][ T53] Bluetooth: hci0: unexpected cc 0x1003 length: 249 > 9
[ 65.080962][ T53] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9
[ 65.090842][ T53] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4
[ 65.103105][ T53] Bluetooth: hci0: unexpected cc 0x0c25 length: 249 > 3
[ 65.121545][ T53] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2
[ 65.438608][ T1046] wlan0: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[ 65.457546][ T1046] wlan0: Creating new IBSS network, BSSID 50:50:50:50:50:50
[ 65.495887][ T2472] wlan1: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[ 65.504409][ T2472] wlan1: Creating new IBSS network, BSSID 50:50:50:50:50:50
[ 66.784406][ T5159] chnl_net:caif_netlink_parms(): no params data found
[ 66.879723][ T5159] bridge0: port 1(bridge_slave_0) entered blocking state
[ 66.889205][ T5159] bridge0: port 1(bridge_slave_0) entered disabled state
[ 66.896897][ T5159] bridge_slave_0: entered allmulticast mode
[ 66.905071][ T5159] bridge_slave_0: entered promiscuous mode
[ 66.914541][ T5159] bridge0: port 2(bridge_slave_1) entered blocking state
[ 66.922253][ T5159] bridge0: port 2(bridge_slave_1) entered disabled state
[ 66.929529][ T5159] bridge_slave_1: entered allmulticast mode
[ 66.937325][ T5159] bridge_slave_1: entered promiscuous mode
[ 66.974384][ T5159] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
[ 66.986385][ T5159] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
[ 67.014300][ T5159]
[ 67.016708][ T5159] ============================================
[ 67.022882][ T5159] WARNING: possible recursive locking detected
[ 67.029164][ T5159] 6.10.0-rc6-syzkaller-00069-g795c58e4c7fc-dirty #0 Not tainted
[ 67.036959][ T5159] --------------------------------------------
[ 67.043287][ T5159] syz-executor.0/5159 is trying to acquire lock:
[ 67.049949][ T5159] ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: team_add_slave+0xdd/0x2720
[ 67.058734][ T5159]
[ 67.058734][ T5159] but task is already holding lock:
[ 67.066374][ T5159] ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x842/0x1180
[ 67.075446][ T5159]
[ 67.075446][ T5159] other info that might help us debug this:
[ 67.083669][ T5159] Possible unsafe locking scenario:
[ 67.083669][ T5159]
[ 67.091184][ T5159] CPU0
[ 67.094442][ T5159] ----
[ 67.097721][ T5159] lock(rtnl_mutex);
[ 67.101746][ T5159] lock(rtnl_mutex);
[ 67.105832][ T5159]
[ 67.105832][ T5159] *** DEADLOCK ***
[ 67.105832][ T5159]
[ 67.113981][ T5159] May be due to missing lock nesting notation
[ 67.113981][ T5159]
[ 67.122398][ T5159] 2 locks held by syz-executor.0/5159:
[ 67.127935][ T5159] #0: ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x842/0x1180
[ 67.137584][ T5159] #1: ffff88806a00cd20 (team->team_lock_key){+.+.}-{3:3}, at: team_add_slave+0xb0/0x2720
[ 67.147698][ T5159]
[ 67.147698][ T5159] stack backtrace:
[ 67.153714][ T5159] CPU: 0 PID: 5159 Comm: syz-executor.0 Not tainted 6.10.0-rc6-syzkaller-00069-g795c58e4c7fc-dirty #0
[ 67.164638][ T5159] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
[ 67.174864][ T5159] Call Trace:
[ 67.178320][ T5159] <TASK>
[ 67.181235][ T5159] dump_stack_lvl+0x241/0x360
[ 67.185997][ T5159] ? __pfx_dump_stack_lvl+0x10/0x10
[ 67.191370][ T5159] ? print_deadlock_bug+0x479/0x620
[ 67.196552][ T5159] validate_chain+0x15d3/0x5900
[ 67.201409][ T5159] ? __pfx_validate_chain+0x10/0x10
[ 67.206707][ T5159] ? stack_trace_save+0x118/0x1d0
[ 67.211816][ T5159] ? __pfx_stack_trace_save+0x10/0x10
[ 67.217296][ T5159] ? lockdep_unlock+0x16a/0x300
[ 67.222144][ T5159] ? mark_lock+0x9a/0x350
[ 67.226480][ T5159] __lock_acquire+0x1346/0x1fd0
[ 67.231609][ T5159] lock_acquire+0x1ed/0x550
[ 67.236105][ T5159] ? team_add_slave+0xdd/0x2720
[ 67.240978][ T5159] ? __pfx_lock_acquire+0x10/0x10
[ 67.246022][ T5159] ? __pfx___might_resched+0x10/0x10
[ 67.251353][ T5159] ? __pfx___mutex_trylock_common+0x10/0x10
[ 67.257275][ T5159] __mutex_lock+0x136/0xd70
[ 67.261783][ T5159] ? team_add_slave+0xdd/0x2720
[ 67.266645][ T5159] ? team_add_slave+0xdd/0x2720
[ 67.271654][ T5159] ? __pfx___mutex_lock+0x10/0x10
[ 67.276846][ T5159] team_add_slave+0xdd/0x2720
[ 67.281508][ T5159] ? __pfx_lock_acquire+0x10/0x10
[ 67.286605][ T5159] ? deref_stack_reg+0x1c7/0x260
[ 67.291535][ T5159] ? __pfx_team_add_slave+0x10/0x10
[ 67.296823][ T5159] ? is_bpf_text_address+0x285/0x2a0
[ 67.302187][ T5159] ? is_bpf_text_address+0x26/0x2a0
[ 67.307484][ T5159] ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 67.313666][ T5159] ? kernel_text_address+0xa7/0xe0
[ 67.318792][ T5159] ? __kernel_text_address+0xd/0x40
[ 67.323994][ T5159] ? unwind_get_return_address+0x91/0xc0
[ 67.329730][ T5159] ? mutex_is_locked+0x12/0x50
[ 67.334513][ T5159] do_setlink+0xe70/0x41f0
[ 67.339161][ T5159] ? stack_trace_save+0x118/0x1d0
[ 67.344184][ T5159] ? __pfx_stack_trace_save+0x10/0x10
[ 67.349548][ T5159] ? __pfx_do_setlink+0x10/0x10
[ 67.354458][ T5159] ? __nla_validate_parse+0x26ce/0x3090
[ 67.360197][ T5159] ? kmalloc_trace_noprof+0x19c/0x2c0
[ 67.365569][ T5159] ? rtnl_newlink+0xf2/0x20a0
[ 67.370328][ T5159] ? __pfx___nla_validate_parse+0x10/0x10
[ 67.376043][ T5159] ? validate_linkmsg+0x71e/0x900
[ 67.381225][ T5159] rtnl_newlink+0x180b/0x20a0
[ 67.385888][ T5159] ? rtnl_newlink+0x4f1/0x20a0
[ 67.390647][ T5159] ? __pfx_rtnl_newlink+0x10/0x10
[ 67.395719][ T5159] ? __pfx___mutex_trylock_common+0x10/0x10
[ 67.401615][ T5159] ? rcu_is_watching+0x15/0xb0
[ 67.406401][ T5159] ? trace_contention_end+0x3c/0x120
[ 67.411876][ T5159] ? __mutex_lock+0x2ef/0xd70
[ 67.416760][ T5159] ? __pfx_lock_release+0x10/0x10
[ 67.421801][ T5159] ? __pfx_rtnl_newlink+0x10/0x10
[ 67.426918][ T5159] rtnetlink_rcv_msg+0x89b/0x1180
[ 67.432098][ T5159] ? rtnetlink_rcv_msg+0x208/0x1180
[ 67.437351][ T5159] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 67.442824][ T5159] ? is_bpf_text_address+0x285/0x2a0
[ 67.448107][ T5159] ? __pfx_validate_chain+0x10/0x10
[ 67.453468][ T5159] ? __pfx_validate_chain+0x10/0x10
[ 67.458685][ T5159] ? arch_stack_walk+0x16d/0x1b0
[ 67.463679][ T5159] ? mark_lock+0x9a/0x350
[ 67.468000][ T5159] ? __pfx_validate_chain+0x10/0x10
[ 67.473274][ T5159] ? __lock_acquire+0x1346/0x1fd0
[ 67.478279][ T5159] ? mark_lock+0x9a/0x350
[ 67.482618][ T5159] ? __lock_acquire+0x1346/0x1fd0
[ 67.487750][ T5159] netlink_rcv_skb+0x1e3/0x430
[ 67.492526][ T5159] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 67.497973][ T5159] ? __pfx_netlink_rcv_skb+0x10/0x10
[ 67.503358][ T5159] ? netlink_deliver_tap+0x2e/0x1b0
[ 67.508562][ T5159] netlink_unicast+0x7ea/0x980
[ 67.513366][ T5159] ? __pfx_netlink_unicast+0x10/0x10
[ 67.518654][ T5159] ? __virt_addr_valid+0x183/0x520
[ 67.524039][ T5159] ? __check_object_size+0x49c/0x900
[ 67.529325][ T5159] ? bpf_lsm_netlink_send+0x9/0x10
[ 67.534430][ T5159] netlink_sendmsg+0x8db/0xcb0
[ 67.539209][ T5159] ? __pfx_netlink_sendmsg+0x10/0x10
[ 67.544486][ T5159] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 67.550487][ T5159] ? aa_sock_msg_perm+0x91/0x160
[ 67.555506][ T5159] ? bpf_lsm_socket_sendmsg+0x9/0x10
[ 67.560779][ T5159] ? security_socket_sendmsg+0x87/0xb0
[ 67.566234][ T5159] ? __pfx_netlink_sendmsg+0x10/0x10
[ 67.571514][ T5159] __sock_sendmsg+0x221/0x270
[ 67.576197][ T5159] __sys_sendto+0x3a4/0x4f0
[ 67.580871][ T5159] ? __pfx___sys_sendto+0x10/0x10
[ 67.585926][ T5159] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 67.591896][ T5159] ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[ 67.598304][ T5159] __x64_sys_sendto+0xde/0x100
[ 67.603056][ T5159] do_syscall_64+0xf3/0x230
[ 67.607636][ T5159] ? clear_bhb_loop+0x35/0x90
[ 67.612394][ T5159] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 67.618397][ T5159] RIP: 0033:0x7fe59307ed43
[ 67.622867][ T5159] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b7 66 2e 0f 1f 84 00 00 00 00 00 90 80 3d c1 91 10 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
[ 67.642562][ T5159] RSP: 002b:00007fe5932df648 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[ 67.650973][ T5159] RAX: ffffffffffffffda RBX: 00007fe593ce4620 RCX: 00007fe59307ed43
[ 67.659020][ T5159] RDX: 0000000000000028 RSI: 00007fe593ce4670 RDI: 0000000000000003
[ 67.666978][ T5159] RBP: 0000000000000001 R08: 00007fe5932df664 R09: 000000000000000c
[ 67.675020][ T5159] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
[ 67.683433][ T5159] R13: 0000000000000000 R14: 00007fe593ce4670 R15: 0000000000000000
[ 67.691430][ T5159] </TASK>
[ 72.393400][ T1249] ieee802154 phy0 wpan0: encryption failed: -22
[ 72.399745][ T1249] ieee802154 phy1 wpan1: encryption failed: -22


syzkaller build log:
go env (err=<nil>)
GO111MODULE='auto'
GOARCH='amd64'
GOBIN=''
GOCACHE='/syzkaller/.cache/go-build'
GOENV='/syzkaller/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/syzkaller/jobs-2/linux/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/syzkaller/jobs-2/linux/gopath'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21.4'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/syzkaller/jobs-2/linux/gopath/src/github.com/google/syzkaller/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2557619195=/tmp/go-build -gno-record-gcc-switches'

git status (err=<nil>)
HEAD detached at edc5149ad2
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
go fmt ./sys/... >/dev/null
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=edc5149ad2ab7a38db6b3bcb1b594e0264a92163 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240621-090414'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-fuzzer github.com/google/syzkaller/syz-fuzzer
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=edc5149ad2ab7a38db6b3bcb1b594e0264a92163 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240621-090414'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
mkdir -p ./bin/linux_amd64
g++ -o ./bin/linux_amd64/syz-executor executor/executor.cc \
-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -std=c++17 -I. -Iexecutor/_include -fpermissive -w -DGOOS_linux=1 -DGOARCH_amd64=1 \
-DHOSTGOOS_linux=1 -DGIT_REVISION=\"edc5149ad2ab7a38db6b3bcb1b594e0264a92163\"



Tested on:

commit: 795c58e4 Merge tag 'trace-v6.10-rc6' of git://git.kern..
git tree: upstream
kernel config: https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=13e99bae980000

syzbot

Jul 4, 2024, 12:28:06 PM7/4/24
to aha3...@gmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+705c61...@syzkaller.appspotmail.com

Tested on:

commit: 795c58e4 Merge tag 'trace-v6.10-rc6' of git://git.kern..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13bf4581980000
kernel config: https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=104920a5980000

Jeongjun Park

Jul 5, 2024, 11:17:28 AM7/5/24
to edum...@google.com, aha3...@gmail.com, da...@davemloft.net, ji...@resnulli.us, ku...@kernel.org, linux-...@vger.kernel.org, michal...@intel.com, net...@vger.kernel.org, pab...@redhat.com, syzbot+705c61...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
Thank you for your comment.

So, how about briefly releasing the lock before calling dev_open()
in team_port_add() and then taking it again? dev_open() does not use
&team, so releasing the lock briefly will not cause any major problems.

Regards,
Jeongjun Park

---
drivers/net/team/team_core.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..245566a1875d 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c

Jeongjun Park

Jul 6, 2024, 12:13:46 AM7/6/24
to ji...@resnulli.us, syzbot+705c61...@syzkaller.appspotmail.com, da...@davemloft.net, edum...@google.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, Jeongjun Park
       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key#4);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key#4);

Deadlock occurs due to the above scenario. Therefore, you can prevent
deadlock by briefly releasing the lock before calling dev_open() in
team_port_add() and locking it again after it returns.

Reported-and-tested-by: syzbot+705c61...@syzkaller.appspotmail.com
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: Jeongjun Park <aha3...@gmail.com>
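
The way a lock-order checker arrives at the scenario above can be approximated with a small dependency graph. The sketch below is a deliberate simplification and not lockdep's actual implementation; `dep`, `reachable()` and `record_order()` are hypothetical names. Each "took B while holding A" event adds an edge A -> B, and an acquisition is rejected if it would close a cycle.

```c
#include <stdbool.h>

enum { LOCK_WIPHY, LOCK_TEAM, NR_LOCKS };

/* dep[a][b] is true once some path has taken b while holding a. */
static bool dep[NR_LOCKS][NR_LOCKS];

/*
 * Depth-first search: is `to` reachable from `from` via recorded edges?
 * The graph stays acyclic because record_order() never stores an edge
 * that closes a cycle, so the recursion terminates.
 */
static bool reachable(int from, int to)
{
	if (from == to)
		return true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (dep[from][next] && reachable(next, to))
			return true;
	return false;
}

/*
 * Record "acquired `next` while holding `held`".
 * Returns false, without recording, if the new edge would close a cycle.
 */
static bool record_order(int held, int next)
{
	if (reachable(next, held))
		return false;
	dep[held][next] = true;
	return true;
}
```

In the report, team_port_add() calling dev_open() corresponds to record_order(LOCK_TEAM, LOCK_WIPHY), which is accepted; the later team_del_slave() call made under the wiphy mutex corresponds to record_order(LOCK_WIPHY, LOCK_TEAM), which is rejected.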

Stephen Hemminger

Jul 6, 2024, 11:01:07 AM7/6/24
to Jeongjun Park, ji...@resnulli.us, syzbot+705c61...@syzkaller.appspotmail.com, da...@davemloft.net, edum...@google.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com
On Sat, 6 Jul 2024 13:13:29 +0900
Jeongjun Park <aha3...@gmail.com> wrote:

>        CPU0                    CPU1
>        ----                    ----
>   lock(&rdev->wiphy.mtx);
>                                lock(team->team_lock_key#4);
>                                lock(&rdev->wiphy.mtx);
>   lock(team->team_lock_key#4);
>
> Deadlock occurs due to the above scenario. Therefore, you can prevent
> deadlock by briefly releasing the lock before calling dev_open() in
> team_port_add() and locking it again after it returns.
>
> Reported-and-tested-by: syzbot+705c61...@syzkaller.appspotmail.com
> Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
> Signed-off-by: Jeongjun Park <aha3...@gmail.com>
> ---

But if you drop the lock, the actual data structures might have changed
by the time you take it again. Usually not a good idea.
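
To make this concern concrete, here is a minimal user-space sketch, assuming POSIX threads and a hypothetical `generation` counter that is bumped on every change to the protected state. None of these names exist in the team driver. Code that drops a lock around a call has to re-check its assumptions after re-taking the lock; `op_racing()` simulates another context changing the state in the unlocked window.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the state guarded by team->lock. */
struct team_state {
	pthread_mutex_t lock;
	unsigned int generation;	/* bumped on every state change */
};

/* Hypothetical callback type standing in for a call such as dev_open(). */
typedef int (*unlocked_op)(struct team_state *t);

/* Does nothing to the shared state. */
static int op_quiet(struct team_state *t)
{
	(void)t;
	return 0;
}

/* Simulates a concurrent change made while the lock is dropped. */
static int op_racing(struct team_state *t)
{
	pthread_mutex_lock(&t->lock);
	t->generation++;
	pthread_mutex_unlock(&t->lock);
	return 0;
}

/*
 * Caller holds t->lock on entry and on return.
 * Returns -EAGAIN if the state changed while the lock was released.
 */
static int call_unlocked_and_revalidate(struct team_state *t, unlocked_op op)
{
	unsigned int seen = t->generation;
	int err;

	pthread_mutex_unlock(&t->lock);
	err = op(t);
	pthread_mutex_lock(&t->lock);

	if (err)
		return err;
	if (t->generation != seen)
		return -EAGAIN;	/* caller must retry or unwind */
	return 0;
}
```

The proposed two-line patch has no equivalent of the generation check, which is the gap this reply points at.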

Jeongjun Park

Jul 7, 2024, 2:00:36 AM7/7/24
to syzbot+705c61...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
net/mac80211/iface.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index b935bb5d8ed1..7ac4a62ed536 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -2301,7 +2301,7 @@ void ieee80211_remove_interfaces(struct ieee80211_local *local)
                        ieee80211_vif_cfg_change_notify(sdata,
                                                        BSS_CHANGED_ARP_FILTER);

-               list_del(&sdata->list);
+               list_del_init(&sdata->list);
                cfg80211_unregister_wdev(&sdata->wdev);

                if (!netdev)
--

Jeongjun Park

Jul 7, 2024, 2:02:18 AM7/7/24
to syzbot+705c61...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com

Jeongjun Park

Jul 7, 2024, 2:07:00 AM7/7/24
to syzbot+705c61...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
net/mac80211/iface.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 7ac4a62ed536..e55b1c2654ab 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -2286,6 +2286,8 @@ void ieee80211_remove_interfaces(struct ieee80211_local *local)
        list_splice_init(&local->interfaces, &unreg_list);
        mutex_unlock(&local->iflist_mtx);

+       wiphy_unlock(local->hw.wiphy);
+
        list_for_each_entry_safe(sdata, tmp, &unreg_list, list) {
                bool netdev = sdata->dev;

@@ -2307,7 +2309,6 @@ void ieee80211_remove_interfaces(struct ieee80211_local *local)
                if (!netdev)
                        kfree(sdata);
        }
-       wiphy_unlock(local->hw.wiphy);
 }

 static int netdev_notify(struct notifier_block *nb,
--
2.34.1

syzbot

Jul 7, 2024, 2:23:03 AM7/7/24
to aha3...@gmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in team_del_slave

======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0 Not tainted
------------------------------------------------------
kworker/u8:0/11 is trying to acquire lock:
ffff888023258d20 (team->team_lock_key#4){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990

but task is already holding lock:
ffff88801eed0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
ffff88801eed0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x6f0 net/mac80211/iface.c:2280

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
wiphy_lock include/net/cfg80211.h:5966 [inline]
ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
__dev_open+0x2d3/0x450 net/core/dev.c:1472
dev_open+0xae/0x1b0 net/core/dev.c:1508
team_port_add drivers/net/team/team_core.c:1216 [inline]
team_add_slave+0x9b3/0x2750 drivers/net/team/team_core.c:1976
do_set_master net/core/rtnetlink.c:2701 [inline]
do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2907
__rtnl_newlink net/core/rtnetlink.c:3696 [inline]
rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
___sys_sendmsg net/socket.c:2639 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (team->team_lock_key#4){+.+.}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3869
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990
team_device_event+0x200/0x5b0 drivers/net/team/team_core.c:2984
notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
call_netdevice_notifiers net/core/dev.c:2044 [inline]
unregister_netdevice_many_notify+0xd75/0x16b0 net/core/dev.c:11219
unregister_netdevice_many net/core/dev.c:11277 [inline]
unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11156
unregister_netdevice include/linux/netdevice.h:3119 [inline]
_cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
ieee80211_remove_interfaces+0x4cd/0x6f0 net/mac80211/iface.c:2305
ieee80211_unregister_hw+0x5d/0x2c0 net/mac80211/main.c:1659
mac80211_hwsim_del_radio+0x2c2/0x4c0 drivers/net/wireless/virtual/mac80211_hwsim.c:5576
hwsim_exit_net+0x5c1/0x670 drivers/net/wireless/virtual/mac80211_hwsim.c:6453
ops_exit_list net/core/net_namespace.c:173 [inline]
cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
process_one_work kernel/workqueue.c:3248 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key#4);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key#4);

*** DEADLOCK ***

5 locks held by kworker/u8:0/11:
#0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3223 [inline]
#0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3329
#1: ffffc90000107d00 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3224 [inline]
#1: ffffc90000107d00 (net_cleanup_work){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3329
#2: ffffffff8f5da590 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x16a/0xcc0 net/core/net_namespace.c:594
#3: ffffffff8f5e6dc8 (rtnl_mutex){+.+.}-{3:3}, at: ieee80211_unregister_hw+0x55/0x2c0 net/mac80211/main.c:1652
#4: ffff88801eed0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
#4: ffff88801eed0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x6f0 net/mac80211/iface.c:2280

stack backtrace:
CPU: 1 PID: 11 Comm: kworker/u8:0 Not tainted 6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: netns cleanup_net
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3869
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990
team_device_event+0x200/0x5b0 drivers/net/team/team_core.c:2984
notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
call_netdevice_notifiers net/core/dev.c:2044 [inline]
unregister_netdevice_many_notify+0xd75/0x16b0 net/core/dev.c:11219
unregister_netdevice_many net/core/dev.c:11277 [inline]
unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11156
unregister_netdevice include/linux/netdevice.h:3119 [inline]
_cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
ieee80211_remove_interfaces+0x4cd/0x6f0 net/mac80211/iface.c:2305
ieee80211_unregister_hw+0x5d/0x2c0 net/mac80211/main.c:1659
mac80211_hwsim_del_radio+0x2c2/0x4c0 drivers/net/wireless/virtual/mac80211_hwsim.c:5576
hwsim_exit_net+0x5c1/0x670 drivers/net/wireless/virtual/mac80211_hwsim.c:6453
ops_exit_list net/core/net_namespace.c:173 [inline]
cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
process_one_work kernel/workqueue.c:3248 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
team0: Port device wlan1 removed
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
netdevsim netdevsim1 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
commit: c6653f49 Merge tag 'powerpc-6.10-4' of git://git.kerne..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1037d4a5980000
kernel config: https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=1063e781980000

syzbot

Jul 7, 2024, 2:44:04 AM7/7/24
to aha3...@gmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in team_del_slave

bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0 Not tainted
------------------------------------------------------
kworker/u8:5/1042 is trying to acquire lock:
ffff88802e894d20 (team->team_lock_key#4){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990

but task is already holding lock:
ffff88807e3f8768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
ffff88807e3f8768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x6f0 net/mac80211/iface.c:2280
5 locks held by kworker/u8:5/1042:
#0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3223 [inline]
#0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3329
#1: ffffc900041ffd00 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3224 [inline]
#1: ffffc900041ffd00 (net_cleanup_work){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3329
#2: ffffffff8f5da590 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x16a/0xcc0 net/core/net_namespace.c:594
#3: ffffffff8f5e6dc8 (rtnl_mutex){+.+.}-{3:3}, at: ieee80211_unregister_hw+0x55/0x2c0 net/mac80211/main.c:1652
#4: ffff88807e3f8768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
#4: ffff88807e3f8768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x6f0 net/mac80211/iface.c:2280

stack backtrace:
CPU: 0 PID: 1042 Comm: kworker/u8:5 Not tainted 6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0
netdevsim netdevsim2 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
console output: https://syzkaller.appspot.com/x/log.txt?x=14a88376980000
kernel config: https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=12446f81980000

syzbot

Jul 7, 2024, 3:04:05 AM7/7/24
to aha3...@gmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

. Setting the MTU to 1560 would solve the problem.
[ 65.749523][ T5188] batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
[ 65.794737][ T5188] hsr_slave_0: entered promiscuous mode
[ 65.800990][ T5188] hsr_slave_1: entered promiscuous mode
[ 65.807988][ T5188] debugfs: Directory 'hsr0' with parent 'hsr' already present!
[ 65.816311][ T5188] Cannot create hsr debugfs directory
[ 65.949582][ T5188] netdevsim netdevsim2 netdevsim0: renamed from eth0
[ 65.959631][ T5188] netdevsim netdevsim2 netdevsim1: renamed from eth1
[ 65.969541][ T5188] netdevsim netdevsim2 netdevsim2: renamed from eth2
[ 65.979136][ T5188] netdevsim netdevsim2 netdevsim3: renamed from eth3
[ 66.059668][ T5188] 8021q: adding VLAN 0 to HW filter on device bond0
[ 66.079370][ T5188] 8021q: adding VLAN 0 to HW filter on device team0
[ 66.091102][ T5170] bridge0: port 1(bridge_slave_0) entered blocking state
[ 66.098400][ T5170] bridge0: port 1(bridge_slave_0) entered forwarding state
[ 66.116765][ T5170] bridge0: port 2(bridge_slave_1) entered blocking state
[ 66.124047][ T5170] bridge0: port 2(bridge_slave_1) entered forwarding state
[ 66.159855][ T5188] hsr0: Slave A (hsr_slave_0) is not up; please bring it up to get a fully working HSR network
[ 66.170938][ T5188] hsr0: Slave B (hsr_slave_1) is not up; please bring it up to get a fully working HSR network
[ 66.209690][ T5188] 8021q: adding VLAN 0 to HW filter on device batadv0
[ 66.250323][ T5188] veth0_vlan: entered promiscuous mode
[ 66.267593][ T5188] veth1_vlan: entered promiscuous mode
[ 66.295077][ T5188] veth0_macvtap: entered promiscuous mode
[ 66.305000][ T5188] veth1_macvtap: entered promiscuous mode
[ 66.321220][ T5188] batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3e) already exists on: batadv_slave_0
[ 66.334443][ T5188] batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
[ 66.347433][ T5188] batman_adv: batadv0: Interface activated: batadv_slave_0
[ 66.364590][ T5188] batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3f) already exists on: batadv_slave_1
[ 66.375515][ T5188] batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
[ 66.387573][ T5188] batman_adv: batadv0: Interface activated: batadv_slave_1
[ 66.399784][ T5188] netdevsim netdevsim2 netdevsim0: set [1, 0] type 2 family 0 port 6081 - 0
[ 66.408658][ T5188] netdevsim netdevsim2 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
[ 66.417776][ T5188] netdevsim netdevsim2 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
[ 66.427420][ T5188] netdevsim netdevsim2 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
[ 66.499007][ T134] wlan0: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[ 66.507906][ T134] wlan0: Creating new IBSS network, BSSID 50:50:50:50:50:50
[ 66.537892][ T51] wlan1: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[ 66.546807][ T51] wlan1: Creating new IBSS network, BSSID 50:50:50:50:50:50
[ 69.815443][ T12] bridge_slave_1: left allmulticast mode
[ 69.821395][ T12] bridge_slave_1: left promiscuous mode
[ 69.844241][ T12] bridge0: port 2(bridge_slave_1) entered disabled state
[ 69.859472][ T12] bridge_slave_0: left allmulticast mode
[ 69.867932][ T12] bridge_slave_0: left promiscuous mode
[ 69.873969][ T12] bridge0: port 1(bridge_slave_0) entered disabled state
[ 70.095084][ T12] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
[ 70.107948][ T12] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
[ 70.118543][ T12] bond0 (unregistering): Released all slaves
[ 70.239777][ T12] hsr_slave_0: left promiscuous mode
[ 70.246490][ T12] hsr_slave_1: left promiscuous mode
[ 70.255808][ T12] batman_adv: batadv0: Interface deactivated: batadv_slave_0
[ 70.265531][ T12] batman_adv: batadv0: Removing interface: batadv_slave_0
[ 70.283319][ T12] batman_adv: batadv0: Interface deactivated: batadv_slave_1
[ 70.290777][ T12] batman_adv: batadv0: Removing interface: batadv_slave_1
[ 70.316399][ T12] veth1_macvtap: left promiscuous mode
[ 70.323105][ T12] veth0_macvtap: left promiscuous mode
[ 70.328757][ T12] veth1_vlan: left promiscuous mode
[ 70.335500][ T12] veth0_vlan: left promiscuous mode
[ 70.689807][ T12] team0 (unregistering): Port device team_slave_1 removed
[ 70.718877][ T12] team0 (unregistering): Port device team_slave_0 removed
[ 71.258531][ T12] netdevsim netdevsim2 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[ 71.857737][ T1248] ieee802154 phy0 wpan0: encryption failed: -22
[ 71.871811][ T1248] ieee802154 phy1 wpan1: encryption failed: -22
[ 72.118696][ T12] netdevsim netdevsim2 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[ 72.177872][ T12] netdevsim netdevsim2 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[ 72.269874][ T12] netdevsim netdevsim2 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[ 72.413143][ T12] bridge_slave_1: left allmulticast mode
[ 72.418935][ T12] bridge_slave_1: left promiscuous mode
[ 72.427494][ T12] bridge0: port 2(bridge_slave_1) entered disabled state
[ 72.440417][ T12] bridge_slave_0: left allmulticast mode
[ 72.448397][ T12] bridge_slave_0: left promiscuous mode
[ 72.455186][ T12] bridge0: port 1(bridge_slave_0) entered disabled state
[ 72.738209][ T12] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
[ 72.749757][ T12] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
[ 72.760471][ T12] bond0 (unregistering): Released all slaves
[ 73.080824][ T12] ------------[ cut here ]------------
[ 73.086430][ T12] WARNING: CPU: 0 PID: 12 at net/wireless/core.c:1197 _cfg80211_unregister_wdev+0x46d/0x560
[ 73.096902][ T12] Modules linked in:
[ 73.100847][ T12] CPU: 0 PID: 12 Comm: kworker/u8:1 Not tainted 6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0
[ 73.111921][ T12] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
[ 73.122420][ T12] Workqueue: netns cleanup_net
[ 73.127239][ T12] RIP: 0010:_cfg80211_unregister_wdev+0x46d/0x560
[ 73.134251][ T12] Code: 0f b6 04 38 84 c0 0f 85 ec 00 00 00 41 80 65 00 fe 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 04 07 c5 f6 90 <0f> 0b 90 e9 61 fc ff ff e8 f6 06 c5 f6 c6 05 2c ac c6 04 01 90 48
[ 73.154633][ T12] RSP: 0018:ffffc90000117798 EFLAGS: 00010293
[ 73.160724][ T12] RAX: ffffffff8ad120ac RBX: 0000000000000000 RCX: ffff8880176c5a00
[ 73.169098][ T12] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 73.177679][ T12] RBP: ffff888068618000 R08: ffffffff8ad11d02 R09: 1ffffffff1ebcdac
[ 73.186247][ T12] R10: dffffc0000000000 R11: fffffbfff1ebcdad R12: 0000000000000001
[ 73.194767][ T12] R13: ffff88801cf7ccb0 R14: ffff888068618700 R15: dffffc0000000000
[ 73.203072][ T12] FS: 0000000000000000(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
[ 73.212323][ T12] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 73.218895][ T12] CR2: 000055f5b8852950 CR3: 000000000e132000 CR4: 00000000003506f0
[ 73.227244][ T12] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 73.235570][ T12] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 73.243988][ T12] Call Trace:
[ 73.247277][ T12] <TASK>
[ 73.250193][ T12] ? __warn+0x163/0x4e0
[ 73.254816][ T12] ? _cfg80211_unregister_wdev+0x46d/0x560
[ 73.260665][ T12] ? report_bug+0x2b3/0x500
[ 73.265522][ T12] ? _cfg80211_unregister_wdev+0x46d/0x560
[ 73.271592][ T12] ? handle_bug+0x3e/0x70
[ 73.276016][ T12] ? exc_invalid_op+0x1a/0x50
[ 73.280956][ T12] ? asm_exc_invalid_op+0x1a/0x20
[ 73.286396][ T12] ? _cfg80211_unregister_wdev+0xc2/0x560
[ 73.292211][ T12] ? _cfg80211_unregister_wdev+0x46c/0x560
[ 73.298035][ T12] ? _cfg80211_unregister_wdev+0x46d/0x560
[ 73.303878][ T12] ? _cfg80211_unregister_wdev+0x46c/0x560
[ 73.309708][ T12] ieee80211_remove_interfaces+0x525/0x720
[ 73.315641][ T12] ? ieee80211_unregister_hw+0x55/0x2c0
[ 73.321227][ T12] ? __pfx_ieee80211_remove_interfaces+0x10/0x10
[ 73.327613][ T12] ieee80211_unregister_hw+0x5d/0x2c0
[ 73.333058][ T12] mac80211_hwsim_del_radio+0x2c2/0x4c0
[ 73.338612][ T12] ? __pfx_mac80211_hwsim_del_radio+0x10/0x10
[ 73.344761][ T12] hwsim_exit_net+0x5c1/0x670
[ 73.349450][ T12] ? __pfx_hwsim_exit_net+0x10/0x10
[ 73.354809][ T12] ? __ip_vs_dev_cleanup_batch+0x239/0x260
[ 73.360650][ T12] cleanup_net+0x802/0xcc0
[ 73.365129][ T12] ? __pfx_cleanup_net+0x10/0x10
[ 73.370092][ T12] ? process_scheduled_works+0x945/0x1830
[ 73.375946][ T12] process_scheduled_works+0xa2c/0x1830
[ 73.381548][ T12] ? __pfx_process_scheduled_works+0x10/0x10
[ 73.387971][ T12] ? assign_work+0x364/0x3d0
[ 73.392920][ T12] worker_thread+0x86d/0xd50
[ 73.397535][ T12] ? __kthread_parkme+0x169/0x1d0
[ 73.402621][ T12] ? __pfx_worker_thread+0x10/0x10
[ 73.407745][ T12] kthread+0x2f0/0x390
[ 73.411959][ T12] ? __pfx_worker_thread+0x10/0x10
[ 73.417079][ T12] ? __pfx_kthread+0x10/0x10
[ 73.421677][ T12] ret_from_fork+0x4b/0x80
[ 73.426151][ T12] ? __pfx_kthread+0x10/0x10
[ 73.430832][ T12] ret_from_fork_asm+0x1a/0x30
[ 73.435683][ T12] </TASK>
[ 73.438739][ T12] Kernel panic - not syncing: kernel: panic_on_warn set ...
[ 73.446004][ T12] CPU: 0 PID: 12 Comm: kworker/u8:1 Not tainted 6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0
[ 73.456604][ T12] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
[ 73.466655][ T12] Workqueue: netns cleanup_net
[ 73.471414][ T12] Call Trace:
[ 73.474709][ T12] <TASK>
[ 73.477655][ T12] dump_stack_lvl+0x241/0x360
[ 73.482346][ T12] ? __pfx_dump_stack_lvl+0x10/0x10
[ 73.487561][ T12] ? __pfx__printk+0x10/0x10
[ 73.492183][ T12] ? _printk+0xd5/0x120
[ 73.496355][ T12] ? vscnprintf+0x5d/0x90
[ 73.500704][ T12] panic+0x349/0x860
[ 73.504805][ T12] ? __warn+0x172/0x4e0
[ 73.509075][ T12] ? __pfx_panic+0x10/0x10
[ 73.513494][ T12] ? show_trace_log_lvl+0x4e6/0x520
[ 73.518728][ T12] ? ret_from_fork_asm+0x1a/0x30
[ 73.523686][ T12] __warn+0x346/0x4e0
[ 73.527674][ T12] ? _cfg80211_unregister_wdev+0x46d/0x560
[ 73.533569][ T12] report_bug+0x2b3/0x500
[ 73.537892][ T12] ? _cfg80211_unregister_wdev+0x46d/0x560
[ 73.543696][ T12] handle_bug+0x3e/0x70
[ 73.547850][ T12] exc_invalid_op+0x1a/0x50
[ 73.552387][ T12] asm_exc_invalid_op+0x1a/0x20
[ 73.557264][ T12] RIP: 0010:_cfg80211_unregister_wdev+0x46d/0x560
[ 73.563690][ T12] Code: 0f b6 04 38 84 c0 0f 85 ec 00 00 00 41 80 65 00 fe 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 04 07 c5 f6 90 <0f> 0b 90 e9 61 fc ff ff e8 f6 06 c5 f6 c6 05 2c ac c6 04 01 90 48
[ 73.583293][ T12] RSP: 0018:ffffc90000117798 EFLAGS: 00010293
[ 73.589358][ T12] RAX: ffffffff8ad120ac RBX: 0000000000000000 RCX: ffff8880176c5a00
[ 73.597318][ T12] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 73.605284][ T12] RBP: ffff888068618000 R08: ffffffff8ad11d02 R09: 1ffffffff1ebcdac
[ 73.613250][ T12] R10: dffffc0000000000 R11: fffffbfff1ebcdad R12: 0000000000000001
[ 73.621239][ T12] R13: ffff88801cf7ccb0 R14: ffff888068618700 R15: dffffc0000000000
[ 73.629242][ T12] ? _cfg80211_unregister_wdev+0xc2/0x560
[ 73.635048][ T12] ? _cfg80211_unregister_wdev+0x46c/0x560
[ 73.640870][ T12] ? _cfg80211_unregister_wdev+0x46c/0x560
[ 73.646705][ T12] ieee80211_remove_interfaces+0x525/0x720
[ 73.652529][ T12] ? ieee80211_unregister_hw+0x55/0x2c0
[ 73.658195][ T12] ? __pfx_ieee80211_remove_interfaces+0x10/0x10
[ 73.664568][ T12] ieee80211_unregister_hw+0x5d/0x2c0
[ 73.669955][ T12] mac80211_hwsim_del_radio+0x2c2/0x4c0
[ 73.675512][ T12] ? __pfx_mac80211_hwsim_del_radio+0x10/0x10
[ 73.681583][ T12] hwsim_exit_net+0x5c1/0x670
[ 73.686317][ T12] ? __pfx_hwsim_exit_net+0x10/0x10
[ 73.691541][ T12] ? __ip_vs_dev_cleanup_batch+0x239/0x260
[ 73.697357][ T12] cleanup_net+0x802/0xcc0
[ 73.701800][ T12] ? __pfx_cleanup_net+0x10/0x10
[ 73.706737][ T12] ? process_scheduled_works+0x945/0x1830
[ 73.712531][ T12] process_scheduled_works+0xa2c/0x1830
[ 73.718085][ T12] ? __pfx_process_scheduled_works+0x10/0x10
[ 73.724060][ T12] ? assign_work+0x364/0x3d0
[ 73.728641][ T12] worker_thread+0x86d/0xd50
[ 73.733244][ T12] ? __kthread_parkme+0x169/0x1d0
[ 73.738257][ T12] ? __pfx_worker_thread+0x10/0x10
[ 73.743365][ T12] kthread+0x2f0/0x390
[ 73.747424][ T12] ? __pfx_worker_thread+0x10/0x10
[ 73.752533][ T12] ? __pfx_kthread+0x10/0x10
[ 73.757113][ T12] ret_from_fork+0x4b/0x80
[ 73.761518][ T12] ? __pfx_kthread+0x10/0x10
[ 73.766100][ T12] ret_from_fork_asm+0x1a/0x30
[ 73.770872][ T12] </TASK>
[ 73.774152][ T12] Kernel Offset: disabled
[ 73.778533][ T12] Rebooting in 86400 seconds..
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2801630060=/tmp/go-build -gno-record-gcc-switches'

git status (err=<nil>)
HEAD detached at edc5149ad2
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
go fmt ./sys/... >/dev/null
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=edc5149ad2ab7a38db6b3bcb1b594e0264a92163 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240621-090414'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-fuzzer github.com/google/syzkaller/syz-fuzzer
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=edc5149ad2ab7a38db6b3bcb1b594e0264a92163 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240621-090414'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
mkdir -p ./bin/linux_amd64
g++ -o ./bin/linux_amd64/syz-executor executor/executor.cc \
-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -std=c++17 -I. -Iexecutor/_include -fpermissive -w -DGOOS_linux=1 -DGOARCH_amd64=1 \
-DHOSTGOOS_linux=1 -DGIT_REVISION=\"edc5149ad2ab7a38db6b3bcb1b594e0264a92163\"


Error text is too large and was truncated, full error text is at:
https://syzkaller.appspot.com/x/error.txt?x=10e59fbe980000


Tested on:

commit: c6653f49 Merge tag 'powerpc-6.10-4' of git://git.kerne..
git tree: upstream
kernel config: https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=116cb7c1980000

Michal Kubiak

Jul 8, 2024, 5:15:09 AM7/8/24
to Jeongjun Park, ji...@resnulli.us, syzbot+705c61...@syzkaller.appspotmail.com, da...@davemloft.net, edum...@google.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com
On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
>        CPU0                    CPU1
>        ----                    ----
>   lock(&rdev->wiphy.mtx);
>                                lock(team->team_lock_key#4);
>                                lock(&rdev->wiphy.mtx);
>   lock(team->team_lock_key#4);
>
> Deadlock occurs due to the above scenario. Therefore,
> modify the code as shown in the patch below to prevent deadlock.
>
> Regards,
> Jeongjun Park.

The commit message should contain the patch description only (without
salutations, etc.).

>
> Reported-and-tested-by: syzbot+705c61...@syzkaller.appspotmail.com
> Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
> Signed-off-by: Jeongjun Park <aha3...@gmail.com>
> ---
> drivers/net/team/team_core.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> index ab1935a4aa2c..3ac82df876b0 100644
> --- a/drivers/net/team/team_core.c
> +++ b/drivers/net/team/team_core.c
> @@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> struct netlink_ext_ack *extack)
> {
> struct team *team = netdev_priv(dev);
> - int err;
> + int err, locked;
>
> - mutex_lock(&team->lock);
> + locked = mutex_trylock(&team->lock);
> err = team_port_add(team, port_dev, extack);
> - mutex_unlock(&team->lock);
> + if (locked)
> + mutex_unlock(&team->lock);

This is not a correct use of the "mutex_trylock()" API. If the result is
ignored like this, you could just as well remove the lock from this part of
the code completely.
If "mutex_trylock()" returns false, the mutex could not be taken (because
another thread already holds it), so you must not modify the resources
that the mutex is expected to protect.
In other words, there is a risk that several threads modify the same
resources through "team_port_add()" at the same time.
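
To illustrate the point, here is a minimal userspace sketch that uses POSIX
threads instead of the kernel mutex API. The names "struct team_sketch",
"port_add_unsafe()" and "port_add_guarded()" are hypothetical and only mirror
the shape of the patch; the counter stands in for whatever state
"team_port_add()" modifies. Note one difference between the two APIs:
pthread_mutex_trylock() returns 0 on success, while the kernel's
mutex_trylock() returns 1 on success. This only shows the API contract and is
not meant as a fix for the reported deadlock.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the state protected by team->lock. */
struct team_sketch {
	pthread_mutex_t lock;
	int port_count;
};

/* Shape of the proposed patch: the update runs even if the lock was not taken. */
static int port_add_unsafe(struct team_sketch *t)
{
	int locked = (pthread_mutex_trylock(&t->lock) == 0);

	t->port_count++;	/* unprotected whenever locked == 0 */
	if (locked)
		pthread_mutex_unlock(&t->lock);
	return 0;
}

/* Guarded variant: a failed trylock leaves the protected state untouched. */
static int port_add_guarded(struct team_sketch *t)
{
	if (pthread_mutex_trylock(&t->lock) != 0)
		return -EBUSY;

	t->port_count++;
	pthread_mutex_unlock(&t->lock);
	return 0;
}
```

With the lock already held elsewhere, the guarded variant returns -EBUSY
without touching the counter, while the unsafe variant still modifies it.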

>
> if (!err)
> netdev_change_features(dev);
> @@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> {
> struct team *team = netdev_priv(dev);
> - int err;
> + int err, locked;
>
> - mutex_lock(&team->lock);
> + locked = mutex_trylock(&team->lock);
> err = team_port_del(team, port_dev);
> - mutex_unlock(&team->lock);
> + if (locked)
> + mutex_unlock(&team->lock);

The same applies here as in the case of "team_add_slave()".

>
> if (err)
> return err;
> --
>

The patch does not seem to be a correct way to remove the deadlock.
Most probably the synchronization design needs to be reviewed.
If you really want to use the "mutex_trylock()" API, please consider making
several attempts to take the mutex, but never modify the protected resources
when the mutex has not been taken successfully.
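
A bounded retry of that kind could look roughly like the userspace sketch
below. It uses POSIX threads rather than the kernel API, and
"LOCK_ATTEMPTS", "lock_with_retries()" and "guarded_increment()" are
hypothetical names chosen for illustration. Whether returning an error is
acceptable on the unregister path is a separate design question that this
sketch does not answer.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>

#define LOCK_ATTEMPTS 3	/* hypothetical retry budget */

/* Try to take the lock a bounded number of times: 0 on success, -EBUSY otherwise. */
static int lock_with_retries(pthread_mutex_t *lock, int attempts)
{
	int i;

	for (i = 0; i < attempts; i++) {
		if (pthread_mutex_trylock(lock) == 0)
			return 0;
		sched_yield();	/* give the current holder a chance to run */
	}
	return -EBUSY;
}

/* Modify the protected counter only when the lock is actually held. */
static int guarded_increment(pthread_mutex_t *lock, int *counter)
{
	int err = lock_with_retries(lock, LOCK_ATTEMPTS);

	if (err)
		return err;	/* protected state is left untouched */

	(*counter)++;
	pthread_mutex_unlock(lock);
	return 0;
}
```

The caller sees -EBUSY when every attempt fails and can decide how to
handle it; the protected data is never written without the lock.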

Thanks,
Michal

