[v5.15] possible deadlock in rds_wake_sk_sleep

syzbot

Jun 5, 2023, 3:41:54 AM
to syzkaller...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 0ab06468cbd1 Linux 5.15.114
git tree: linux-5.15.y
console output: https://syzkaller.appspot.com/x/log.txt?x=17f12779280000
kernel config: https://syzkaller.appspot.com/x/.config?x=2d24dbde73b9b505
dashboard link: https://syzkaller.appspot.com/bug?extid=b2780e864eaa3d26da9b
compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=168994c9280000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=158b7d45280000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/48a0cd4fb454/disk-0ab06468.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d857ea64526d/vmlinux-0ab06468.xz
kernel image: https://storage.googleapis.com/syzbot-assets/330499c124c4/bzImage-0ab06468.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+b2780e...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
5.15.114-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor168/4091 is trying to acquire lock:
ffff88801a8bbc70 (&rs->rs_recv_lock){...-}-{2:2}, at: rds_wake_sk_sleep+0x2a/0xd0 net/rds/af_rds.c:109

but task is already holding lock:
ffff88801d600900 (&rm->m_rs_lock){..-.}-{2:2}, at: rds_send_remove_from_sock+0x129/0x800 net/rds/send.c:628

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rm->m_rs_lock){..-.}-{2:2}:
lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5622
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
rds_message_purge net/rds/message.c:138 [inline]
rds_message_put+0x148/0xc80 net/rds/message.c:180
rds_inc_put net/rds/recv.c:82 [inline]
rds_clear_recv_queue+0x2de/0x3b0 net/rds/recv.c:767
rds_release+0xc1/0x2e0 net/rds/af_rds.c:73
__sock_release net/socket.c:649 [inline]
sock_close+0xcd/0x230 net/socket.c:1317
__fput+0x3bf/0x890 fs/file_table.c:280
task_work_run+0x129/0x1a0 kernel/task_work.c:164
exit_task_work include/linux/task_work.h:32 [inline]
do_exit+0x6a3/0x2480 kernel/exit.c:872
do_group_exit+0x144/0x310 kernel/exit.c:994
get_signal+0xc66/0x14e0 kernel/signal.c:2889
arch_do_signal_or_restart+0xc3/0x1890 arch/x86/kernel/signal.c:865
handle_signal_work kernel/entry/common.c:148 [inline]
exit_to_user_mode_loop+0x97/0x130 kernel/entry/common.c:172
exit_to_user_mode_prepare+0xb1/0x140 kernel/entry/common.c:208
__syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
syscall_exit_to_user_mode+0x5d/0x250 kernel/entry/common.c:301
do_syscall_64+0x49/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x61/0xcb

-> #0 (&rs->rs_recv_lock){...-}-{2:2}:
check_prev_add kernel/locking/lockdep.c:3053 [inline]
check_prevs_add kernel/locking/lockdep.c:3172 [inline]
validate_chain+0x1646/0x58b0 kernel/locking/lockdep.c:3787
__lock_acquire+0x1295/0x1ff0 kernel/locking/lockdep.c:5011
lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5622
__raw_read_lock_irqsave include/linux/rwlock_api_smp.h:159 [inline]
_raw_read_lock_irqsave+0xd9/0x120 kernel/locking/spinlock.c:236
rds_wake_sk_sleep+0x2a/0xd0 net/rds/af_rds.c:109
rds_send_remove_from_sock+0x1c9/0x800 net/rds/send.c:634
rds_send_path_drop_acked+0x362/0x3a0 net/rds/send.c:710
rds_tcp_write_space+0x193/0x560 net/rds/tcp_send.c:198
tcp_new_space net/ipv4/tcp_input.c:5443 [inline]
tcp_check_space+0x141/0x970 net/ipv4/tcp_input.c:5462
tcp_data_snd_check net/ipv4/tcp_input.c:5471 [inline]
tcp_rcv_established+0xeb6/0x1e20 net/ipv4/tcp_input.c:5966
tcp_v4_do_rcv+0x423/0x960 net/ipv4/tcp_ipv4.c:1727
sk_backlog_rcv include/net/sock.h:1057 [inline]
__release_sock+0x198/0x4b0 net/core/sock.c:2690
release_sock+0x5d/0x1c0 net/core/sock.c:3231
rds_send_xmit+0x1d16/0x2530 net/rds/send.c:422
rds_sendmsg+0x1b97/0x2240 net/rds/send.c:1382
sock_sendmsg_nosec net/socket.c:704 [inline]
sock_sendmsg net/socket.c:724 [inline]
____sys_sendmsg+0x59e/0x8f0 net/socket.c:2412
___sys_sendmsg+0x252/0x2e0 net/socket.c:2466
__sys_sendmsg net/socket.c:2495 [inline]
__do_sys_sendmsg net/socket.c:2504 [inline]
__se_sys_sendmsg+0x19a/0x260 net/socket.c:2502
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x61/0xcb

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rm->m_rs_lock);
                               lock(&rs->rs_recv_lock);
                               lock(&rm->m_rs_lock);
  lock(&rs->rs_recv_lock);

*** DEADLOCK ***

3 locks held by syz-executor168/4091:
#0: ffff88807895c920 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1649 [inline]
#0: ffff88807895c920 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sock_set_cork+0x29/0x1b0 net/ipv4/tcp.c:3232
#1: ffff88807895cbd0 (k-clock-AF_INET){++.-}-{2:2}, at: rds_tcp_write_space+0x30/0x560 net/rds/tcp_send.c:184
#2: ffff88801d600900 (&rm->m_rs_lock){..-.}-{2:2}, at: rds_send_remove_from_sock+0x129/0x800 net/rds/send.c:628

stack backtrace:
CPU: 1 PID: 4091 Comm: syz-executor168 Not tainted 5.15.114-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
check_noncircular+0x2f8/0x3b0 kernel/locking/lockdep.c:2133
check_prev_add kernel/locking/lockdep.c:3053 [inline]
check_prevs_add kernel/locking/lockdep.c:3172 [inline]
validate_chain+0x1646/0x58b0 kernel/locking/lockdep.c:3787
__lock_acquire+0x1295/0x1ff0 kernel/locking/lockdep.c:5011
lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5622
__raw_read_lock_irqsave include/linux/rwlock_api_smp.h:159 [inline]
_raw_read_lock_irqsave+0xd9/0x120 kernel/locking/spinlock.c:236
rds_wake_sk_sleep+0x2a/0xd0 net/rds/af_rds.c:109
rds_send_remove_from_sock+0x1c9/0x800 net/rds/send.c:634
rds_send_path_drop_acked+0x362/0x3a0 net/rds/send.c:710
rds_tcp_write_space+0x193/0x560 net/rds/tcp_send.c:198
tcp_new_space net/ipv4/tcp_input.c:5443 [inline]
tcp_check_space+0x141/0x970 net/ipv4/tcp_input.c:5462
tcp_data_snd_check net/ipv4/tcp_input.c:5471 [inline]
tcp_rcv_established+0xeb6/0x1e20 net/ipv4/tcp_input.c:5966
tcp_v4_do_rcv+0x423/0x960 net/ipv4/tcp_ipv4.c:1727
sk_backlog_rcv include/net/sock.h:1057 [inline]
__release_sock+0x198/0x4b0 net/core/sock.c:2690
release_sock+0x5d/0x1c0 net/core/sock.c:3231
rds_send_xmit+0x1d16/0x2530 net/rds/send.c:422
rds_sendmsg+0x1b97/0x2240 net/rds/send.c:1382
sock_sendmsg_nosec net/socket.c:704 [inline]
sock_sendmsg net/socket.c:724 [inline]
____sys_sendmsg+0x59e/0x8f0 net/socket.c:2412
___sys_sendmsg+0x252/0x2e0 net/socket.c:2466
__sys_sendmsg net/socket.c:2495 [inline]
__do_sys_sendmsg net/socket.c:2504 [inline]
__se_sys_sendmsg+0x19a/0x260 net/socket.c:2502
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f609a864c79
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 11 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f609a7d4318 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f609a8ec448 RCX: 00007f609a864c79
RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000004
RBP: 00007f609a8ec440 R08: 00007f609a7d4700 R09: 0000000000000000
R10: 00007f609a7d4700 R11: 0000000000000246 R12: 00007f609a8ba074
R13: 00007ffd3d73ba3f R14: 00007f609a7d4400 R15: 0000000000022000
</TASK>
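
Editorial note: the "Possible unsafe locking scenario" in the report above comes down to two lock orderings that contradict each other. The sketch below models that idea only. It is not lockdep's implementation, and the function name find_cycle, the REPORTED_ORDERINGS list, and the shortened lock names are hypothetical labels chosen for illustration. Each pair means "this lock was held while that lock was acquired", taken from dependency chains #1 and #0 as summarised in the scenario.

```python
from collections import defaultdict

# (held, acquired) pairs as summarised by the scenario in the report.
# Chain #1: rs_recv_lock is held, then m_rs_lock is taken.
# Chain #0: m_rs_lock is held, then rs_recv_lock is taken.
REPORTED_ORDERINGS = [
    ("rs_recv_lock", "m_rs_lock"),
    ("m_rs_lock", "rs_recv_lock"),
]


def find_cycle(orderings):
    """Return a list of lock names forming a cycle, or None if there is none."""
    graph = defaultdict(set)
    for held, acquired in orderings:
        graph[held].add(acquired)

    def visit(node, path, on_path, done):
        if node in on_path:
            # Reached a lock that is already on the current path: cycle found.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        on_path.add(node)
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            cycle = visit(nxt, path, on_path, done)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    done = set()
    for start in sorted(graph):
        cycle = visit(start, [], set(), done)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(REPORTED_ORDERINGS))
```

With both orderings present the search returns a cycle through the two locks. With either ordering removed it returns None, which is why making the two code paths agree on one acquisition order is the usual direction for fixing this class of report.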


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to change bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

Jun 12, 2023, 2:04:57 AM
to syzkaller...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 2f3918bc53fb Linux 6.1.33
git tree: linux-6.1.y
console output: https://syzkaller.appspot.com/x/log.txt?x=10cc95b3280000
kernel config: https://syzkaller.appspot.com/x/.config?x=668ab7dd51e152ad
dashboard link: https://syzkaller.appspot.com/bug?extid=a18a2cae6d476734e11d
compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16f75807280000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1319bf1b280000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/148750653b59/disk-2f3918bc.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/f86efd682b25/vmlinux-2f3918bc.xz
kernel image: https://storage.googleapis.com/syzbot-assets/483386a4e270/bzImage-2f3918bc.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a18a2c...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.1.33-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor294/3900 is trying to acquire lock:
ffff888074a89d70 (&rs->rs_recv_lock){...-}-{2:2}, at: rds_wake_sk_sleep+0x2a/0xd0 net/rds/af_rds.c:109

but task is already holding lock:
ffff88807bac1900 (&rm->m_rs_lock){..-.}-{2:2}, at: rds_send_remove_from_sock+0x129/0x800 net/rds/send.c:628

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rm->m_rs_lock){..-.}-{2:2}:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5669
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
rds_message_purge net/rds/message.c:138 [inline]
rds_message_put+0x146/0xaa0 net/rds/message.c:180
rds_inc_put net/rds/recv.c:82 [inline]
rds_clear_recv_queue+0x2de/0x3b0 net/rds/recv.c:767
rds_release+0xc1/0x2e0 net/rds/af_rds.c:73
__sock_release net/socket.c:652 [inline]
sock_close+0xcd/0x230 net/socket.c:1370
__fput+0x3b7/0x890 fs/file_table.c:320
task_work_run+0x246/0x300 kernel/task_work.c:179
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x6fb/0x2300 kernel/exit.c:869
do_group_exit+0x202/0x2b0 kernel/exit.c:1019
get_signal+0x16f7/0x17d0 kernel/signal.c:2858
arch_do_signal_or_restart+0xb0/0x1a10 arch/x86/kernel/signal.c:869
exit_to_user_mode_loop+0x6a/0x100 kernel/entry/common.c:168
exit_to_user_mode_prepare+0xb1/0x140 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
syscall_exit_to_user_mode+0x60/0x270 kernel/entry/common.c:297
do_syscall_64+0x49/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #0 (&rs->rs_recv_lock){...-}-{2:2}:
check_prev_add kernel/locking/lockdep.c:3098 [inline]
check_prevs_add kernel/locking/lockdep.c:3217 [inline]
validate_chain+0x1667/0x58e0 kernel/locking/lockdep.c:3832
__lock_acquire+0x125b/0x1f80 kernel/locking/lockdep.c:5056
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5669
__raw_read_lock_irqsave include/linux/rwlock_api_smp.h:160 [inline]
_raw_read_lock_irqsave+0xd9/0x120 kernel/locking/spinlock.c:236
rds_wake_sk_sleep+0x2a/0xd0 net/rds/af_rds.c:109
rds_send_remove_from_sock+0x1c9/0x800 net/rds/send.c:634
rds_send_path_drop_acked+0x362/0x3a0 net/rds/send.c:710
rds_tcp_write_space+0x193/0x560 net/rds/tcp_send.c:198
tcp_new_space net/ipv4/tcp_input.c:5471 [inline]
tcp_check_space+0x176/0xab0 net/ipv4/tcp_input.c:5490
tcp_data_snd_check net/ipv4/tcp_input.c:5499 [inline]
tcp_rcv_established+0xedb/0x1f00 net/ipv4/tcp_input.c:6007
tcp_v4_do_rcv+0x487/0xb00 net/ipv4/tcp_ipv4.c:1671
sk_backlog_rcv include/net/sock.h:1111 [inline]
__release_sock+0x198/0x4b0 net/core/sock.c:2909
release_sock+0x5d/0x1c0 net/core/sock.c:3473
rds_send_xmit+0x1d16/0x2530 net/rds/send.c:422
rds_sendmsg+0x1b95/0x2240 net/rds/send.c:1382
sock_sendmsg_nosec net/socket.c:716 [inline]
sock_sendmsg net/socket.c:736 [inline]
____sys_sendmsg+0x59e/0x8f0 net/socket.c:2482
___sys_sendmsg net/socket.c:2536 [inline]
__sys_sendmsg+0x2a9/0x390 net/socket.c:2565
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rm->m_rs_lock);
                               lock(&rs->rs_recv_lock);
                               lock(&rm->m_rs_lock);
  lock(&rs->rs_recv_lock);

*** DEADLOCK ***

3 locks held by syz-executor294/3900:
#0: ffff888078ceef70 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1725 [inline]
#0: ffff888078ceef70 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sock_set_cork+0x29/0x1b0 net/ipv4/tcp.c:3339
#1: ffff888078cef1f8 (k-clock-AF_INET){++.-}-{2:2}, at: rds_tcp_write_space+0x30/0x560 net/rds/tcp_send.c:184
#2: ffff88807bac1900 (&rm->m_rs_lock){..-.}-{2:2}, at: rds_send_remove_from_sock+0x129/0x800 net/rds/send.c:628

stack backtrace:
CPU: 1 PID: 3900 Comm: syz-executor294 Not tainted 6.1.33-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
check_noncircular+0x2fa/0x3b0 kernel/locking/lockdep.c:2178
check_prev_add kernel/locking/lockdep.c:3098 [inline]
check_prevs_add kernel/locking/lockdep.c:3217 [inline]
validate_chain+0x1667/0x58e0 kernel/locking/lockdep.c:3832
__lock_acquire+0x125b/0x1f80 kernel/locking/lockdep.c:5056
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5669
__raw_read_lock_irqsave include/linux/rwlock_api_smp.h:160 [inline]
_raw_read_lock_irqsave+0xd9/0x120 kernel/locking/spinlock.c:236
rds_wake_sk_sleep+0x2a/0xd0 net/rds/af_rds.c:109
rds_send_remove_from_sock+0x1c9/0x800 net/rds/send.c:634
rds_send_path_drop_acked+0x362/0x3a0 net/rds/send.c:710
rds_tcp_write_space+0x193/0x560 net/rds/tcp_send.c:198
tcp_new_space net/ipv4/tcp_input.c:5471 [inline]
tcp_check_space+0x176/0xab0 net/ipv4/tcp_input.c:5490
tcp_data_snd_check net/ipv4/tcp_input.c:5499 [inline]
tcp_rcv_established+0xedb/0x1f00 net/ipv4/tcp_input.c:6007
tcp_v4_do_rcv+0x487/0xb00 net/ipv4/tcp_ipv4.c:1671
sk_backlog_rcv include/net/sock.h:1111 [inline]
__release_sock+0x198/0x4b0 net/core/sock.c:2909
release_sock+0x5d/0x1c0 net/core/sock.c:3473
rds_send_xmit+0x1d16/0x2530 net/rds/send.c:422
rds_sendmsg+0x1b95/0x2240 net/rds/send.c:1382
sock_sendmsg_nosec net/socket.c:716 [inline]
sock_sendmsg net/socket.c:736 [inline]
____sys_sendmsg+0x59e/0x8f0 net/socket.c:2482
___sys_sendmsg net/socket.c:2536 [inline]
__sys_sendmsg+0x2a9/0x390 net/socket.c:2565
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fc54e9d1c59
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 11 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc54e941318 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fc54ea59448 RCX: 00007fc54e9d1c59
RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000004
RBP: 00007fc54ea59440 R08: 00007fc54e941700 R09: 0000000000000000
R10: 00007fc54e941700 R11: 0000000000000246 R12: 00007fc54ea27074
R13: 00007fffa549bfef R14: 00007fc54e941400 R15: 0000000000022000