[v6.1] possible deadlock in smc_release

syzbot

Jan 29, 2024, 11:17:26 AM
to syzkaller...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 883d1a956208 Linux 6.1.75
git tree: linux-6.1.y
console output: https://syzkaller.appspot.com/x/log.txt?x=14aec380180000
kernel config: https://syzkaller.appspot.com/x/.config?x=c421ffef554bdd17
dashboard link: https://syzkaller.appspot.com/bug?extid=edf7d3f5c58c3780ff30
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/1ff30edbe66d/disk-883d1a95.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1cd87be79a04/vmlinux-883d1a95.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e50c5a26cded/Image-883d1a95.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+edf7d3...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.1.75-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor.3/15397 is trying to acquire lock:
ffff0000d60b4c90 ((work_completion)(&new_smc->smc_listen_work)){+.+.}-{0:0}, at: __flush_work+0xd0/0x1c0 kernel/workqueue.c:3072

but task is already holding lock:
ffff0000d60b6670 (sk_lock-AF_SMC/1){+.+.}-{0:0}, at: smc_release+0x1e8/0x528

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (sk_lock-AF_SMC/1){+.+.}-{0:0}:
lock_sock_nested+0x78/0x138 net/core/sock.c:3483
smc_listen_out+0x10c/0x3bc net/smc/af_smc.c:1882
smc_listen_work+0x1e4/0x102c
process_one_work+0x7ac/0x1404 kernel/workqueue.c:2292
worker_thread+0x8e4/0xfec kernel/workqueue.c:2439
kthread+0x250/0x2d8 kernel/kthread.c:376
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:864

-> #0 ((work_completion)(&new_smc->smc_listen_work)){+.+.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3090 [inline]
check_prevs_add kernel/locking/lockdep.c:3209 [inline]
validate_chain kernel/locking/lockdep.c:3825 [inline]
__lock_acquire+0x3338/0x7680 kernel/locking/lockdep.c:5049
lock_acquire+0x26c/0x7cc kernel/locking/lockdep.c:5662
__flush_work+0xf8/0x1c0 kernel/workqueue.c:3072
__cancel_work_timer+0x3ec/0x548 kernel/workqueue.c:3163
cancel_work_sync+0x24/0x38 kernel/workqueue.c:3199
smc_clcsock_release+0x64/0xec net/smc/smc_close.c:29
__smc_release+0x55c/0x700 net/smc/af_smc.c:300
smc_close_non_accepted+0xd8/0x260 net/smc/af_smc.c:1816
smc_close_cleanup_listen net/smc/smc_close.c:45 [inline]
smc_close_active+0x9bc/0xd20 net/smc/smc_close.c:225
__smc_release+0xa0/0x700 net/smc/af_smc.c:276
smc_release+0x260/0x528 net/smc/af_smc.c:343
__sock_release net/socket.c:654 [inline]
sock_close+0xb8/0x1fc net/socket.c:1400
__fput+0x30c/0x7bc fs/file_table.c:320
____fput+0x20/0x30 fs/file_table.c:348
task_work_run+0x240/0x2f0 kernel/task_work.c:179
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
do_notify_resume+0x2148/0x3474 arch/arm64/kernel/signal.c:1132
prepare_exit_to_user_mode arch/arm64/kernel/entry-common.c:137 [inline]
exit_to_user_mode arch/arm64/kernel/entry-common.c:142 [inline]
el0_svc+0x9c/0x168 arch/arm64/kernel/entry-common.c:638
el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655
el0t_64_sync+0x18c/0x190 arch/arm64/kernel/entry.S:585

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(sk_lock-AF_SMC/1);
                               lock((work_completion)(&new_smc->smc_listen_work));
                               lock(sk_lock-AF_SMC/1);
  lock((work_completion)(&new_smc->smc_listen_work));

*** DEADLOCK ***

2 locks held by syz-executor.3/15397:
#0: ffff0000e26b0810 (&sb->s_type->i_mutex_key#11){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:756 [inline]
#0: ffff0000e26b0810 (&sb->s_type->i_mutex_key#11){+.+.}-{3:3}, at: __sock_release net/socket.c:653 [inline]
#0: ffff0000e26b0810 (&sb->s_type->i_mutex_key#11){+.+.}-{3:3}, at: sock_close+0x80/0x1fc net/socket.c:1400
#1: ffff0000d60b6670 (sk_lock-AF_SMC/1){+.+.}-{0:0}, at: smc_release+0x1e8/0x528

stack backtrace:
CPU: 1 PID: 15397 Comm: syz-executor.3 Not tainted 6.1.75-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Call trace:
dump_backtrace+0x1c8/0x1f4 arch/arm64/kernel/stacktrace.c:158
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:165
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x108/0x170 lib/dump_stack.c:106
dump_stack+0x1c/0x58 lib/dump_stack.c:113
print_circular_bug+0x150/0x1b8 kernel/locking/lockdep.c:2048
check_noncircular+0x2cc/0x378 kernel/locking/lockdep.c:2170
check_prev_add kernel/locking/lockdep.c:3090 [inline]
check_prevs_add kernel/locking/lockdep.c:3209 [inline]
validate_chain kernel/locking/lockdep.c:3825 [inline]
__lock_acquire+0x3338/0x7680 kernel/locking/lockdep.c:5049
lock_acquire+0x26c/0x7cc kernel/locking/lockdep.c:5662
__flush_work+0xf8/0x1c0 kernel/workqueue.c:3072
__cancel_work_timer+0x3ec/0x548 kernel/workqueue.c:3163
cancel_work_sync+0x24/0x38 kernel/workqueue.c:3199
smc_clcsock_release+0x64/0xec net/smc/smc_close.c:29
__smc_release+0x55c/0x700 net/smc/af_smc.c:300
smc_close_non_accepted+0xd8/0x260 net/smc/af_smc.c:1816
smc_close_cleanup_listen net/smc/smc_close.c:45 [inline]
smc_close_active+0x9bc/0xd20 net/smc/smc_close.c:225
__smc_release+0xa0/0x700 net/smc/af_smc.c:276
smc_release+0x260/0x528 net/smc/af_smc.c:343
__sock_release net/socket.c:654 [inline]
sock_close+0xb8/0x1fc net/socket.c:1400
__fput+0x30c/0x7bc fs/file_table.c:320
____fput+0x20/0x30 fs/file_table.c:348
task_work_run+0x240/0x2f0 kernel/task_work.c:179
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
do_notify_resume+0x2148/0x3474 arch/arm64/kernel/signal.c:1132
prepare_exit_to_user_mode arch/arm64/kernel/entry-common.c:137 [inline]
exit_to_user_mode arch/arm64/kernel/entry-common.c:142 [inline]
el0_svc+0x9c/0x168 arch/arm64/kernel/entry-common.c:638
el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655
el0t_64_sync+0x18c/0x190 arch/arm64/kernel/entry.S:585
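
For readers less familiar with lockdep output, the dependency chain in the report above can be replayed with a toy lock-order graph. This is only an illustrative sketch, not lockdep's actual algorithm or data structures: the `LockGraph` class and its `acquire` method are hypothetical names invented for this example, and the work item is modeled as an ordinary lock, on the assumption that the workqueue code holds the `(work_completion)` map while the work function runs and that `cancel_work_sync` acquires the same map while waiting.

```python
from collections import defaultdict


class LockGraph:
    """Toy lock-order graph (hypothetical helper, not a kernel API).

    An edge A -> B means 'B was acquired while A was held'.
    """

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search from src looking for dst.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record 'new' being taken while 'held' is held.

        Returns False (and records nothing) if the new edge would
        close a cycle in the graph.
        """
        if self._reachable(new, held):
            return False
        self.edges[held].add(new)
        return True


SK_LOCK = "sk_lock-AF_SMC/1"
LISTEN_WORK = "(work_completion)(&new_smc->smc_listen_work)"

graph = LockGraph()

# Chain entry #1: the worker runs smc_listen_work and takes the
# socket lock in smc_listen_out.
worker_ok = graph.acquire(held=LISTEN_WORK, new=SK_LOCK)

# Chain entry #0: smc_release holds the socket lock and reaches
# cancel_work_sync, which waits for the work item to finish.
release_ok = graph.acquire(held=SK_LOCK, new=LISTEN_WORK)

print(worker_ok, release_ok)  # prints "True False"
```

The second `acquire` is rejected because the graph already contains the opposite ordering, which is the same shape as the CPU0/CPU1 scenario printed in the report. Whether the two lock instances can actually collide at runtime is a separate question that the sketch does not answer.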


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

Jan 29, 2024, 3:17:35 PM
to syzkaller...@googlegroups.com
syzbot has found a reproducer for the following issue on:

HEAD commit: 883d1a956208 Linux 6.1.75
git tree: linux-6.1.y
console output: https://syzkaller.appspot.com/x/log.txt?x=14ec0490180000
kernel config: https://syzkaller.appspot.com/x/.config?x=c421ffef554bdd17
dashboard link: https://syzkaller.appspot.com/bug?extid=edf7d3f5c58c3780ff30
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=116042f3e80000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17201040180000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/1ff30edbe66d/disk-883d1a95.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1cd87be79a04/vmlinux-883d1a95.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e50c5a26cded/Image-883d1a95.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+edf7d3...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.1.75-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor373/4222 is trying to acquire lock:
ffff0000ddda1450 ((work_completion)(&new_smc->smc_listen_work)){+.+.}-{0:0}, at: __flush_work+0xd0/0x1c0 kernel/workqueue.c:3072

but task is already holding lock:
ffff0000ddda0130 (sk_lock-AF_SMC/1){+.+.}-{0:0}, at: smc_release+0x1e8/0x528
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x554/0x1a88 kernel/exit.c:869
do_group_exit+0x194/0x22c kernel/exit.c:1019
__do_sys_exit_group kernel/exit.c:1030 [inline]
__se_sys_exit_group kernel/exit.c:1028 [inline]
__wake_up_parent+0x0/0x60 kernel/exit.c:1028
__invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52
el0_svc_common+0x138/0x258 arch/arm64/kernel/syscall.c:142
do_el0_svc+0x64/0x218 arch/arm64/kernel/syscall.c:206
el0_svc+0x58/0x168 arch/arm64/kernel/entry-common.c:637
el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655
el0t_64_sync+0x18c/0x190 arch/arm64/kernel/entry.S:585

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(sk_lock-AF_SMC/1);
                               lock((work_completion)(&new_smc->smc_listen_work));
                               lock(sk_lock-AF_SMC/1);
  lock((work_completion)(&new_smc->smc_listen_work));

*** DEADLOCK ***

2 locks held by syz-executor373/4222:
#0: ffff0000df60c410 (&sb->s_type->i_mutex_key#11){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:756 [inline]
#0: ffff0000df60c410 (&sb->s_type->i_mutex_key#11){+.+.}-{3:3}, at: __sock_release net/socket.c:653 [inline]
#0: ffff0000df60c410 (&sb->s_type->i_mutex_key#11){+.+.}-{3:3}, at: sock_close+0x80/0x1fc net/socket.c:1400
#1: ffff0000ddda0130 (sk_lock-AF_SMC/1){+.+.}-{0:0}, at: smc_release+0x1e8/0x528

stack backtrace:
CPU: 0 PID: 4222 Comm: syz-executor373 Not tainted 6.1.75-syzkaller #0
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x554/0x1a88 kernel/exit.c:869
do_group_exit+0x194/0x22c kernel/exit.c:1019
__do_sys_exit_group kernel/exit.c:1030 [inline]
__se_sys_exit_group kernel/exit.c:1028 [inline]
__wake_up_parent+0x0/0x60 kernel/exit.c:1028
__invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52
el0_svc_common+0x138/0x258 arch/arm64/kernel/syscall.c:142
do_el0_svc+0x64/0x218 arch/arm64/kernel/syscall.c:206
el0_svc+0x58/0x168 arch/arm64/kernel/entry-common.c:637
el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655
el0t_64_sync+0x18c/0x190 arch/arm64/kernel/entry.S:585


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.