possible deadlock in rds_wake_sk_sleep


syzbot

Dec 15, 2021, 2:20:25 AM
to syzkaller...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 3f8a27f9e27b Linux 4.19.211
git tree: linux-4.19.y
console output: https://syzkaller.appspot.com/x/log.txt?x=16d25a7db00000
kernel config: https://syzkaller.appspot.com/x/.config?x=9b9277b418617afe
dashboard link: https://syzkaller.appspot.com/bug?extid=3b4069868f81d1bf6df1
compiler: gcc version 10.2.1 20210110 (Debian 10.2.1-6)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15f18325b00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11762bb1b00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3b4069...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
4.19.211-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor728/12995 is trying to acquire lock:
000000001eb09c18 (&rs->rs_recv_lock){....}, at: rds_wake_sk_sleep+0x1d/0xc0 net/rds/af_rds.c:109

but task is already holding lock:
0000000063aa4bb2 (&(&rm->m_rs_lock)->rlock){....}, at: rds_send_remove_from_sock+0x278/0x8b0 net/rds/send.c:618

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&(&rm->m_rs_lock)->rlock){....}:
rds_message_purge net/rds/message.c:138 [inline]
rds_message_put+0x198/0xd00 net/rds/message.c:180
rds_inc_put+0xf9/0x140 net/rds/recv.c:87
rds_clear_recv_queue+0x147/0x350 net/rds/recv.c:762
rds_release+0xc6/0x350 net/rds/af_rds.c:73
__sock_release+0xcd/0x2a0 net/socket.c:599
sock_close+0x15/0x20 net/socket.c:1214
__fput+0x2ce/0x890 fs/file_table.c:278
task_work_run+0x148/0x1c0 kernel/task_work.c:113
tracehook_notify_resume include/linux/tracehook.h:193 [inline]
exit_to_usermode_loop+0x251/0x2a0 arch/x86/entry/common.c:167
prepare_exit_to_usermode arch/x86/entry/common.c:198 [inline]
syscall_return_slowpath arch/x86/entry/common.c:271 [inline]
do_syscall_64+0x538/0x620 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&rs->rs_recv_lock){....}:
__raw_read_lock_irqsave include/linux/rwlock_api_smp.h:159 [inline]
_raw_read_lock_irqsave+0x93/0xd0 kernel/locking/spinlock.c:224
rds_wake_sk_sleep+0x1d/0xc0 net/rds/af_rds.c:109
rds_send_remove_from_sock+0xb1/0x8b0 net/rds/send.c:624
rds_send_path_drop_acked+0x2de/0x3c0 net/rds/send.c:700
rds_tcp_write_space+0x199/0x650 net/rds/tcp_send.c:203
tcp_new_space net/ipv4/tcp_input.c:5167 [inline]
tcp_check_space+0x407/0x6f0 net/ipv4/tcp_input.c:5178
tcp_data_snd_check net/ipv4/tcp_input.c:5188 [inline]
tcp_rcv_established+0x916/0x1ef0 net/ipv4/tcp_input.c:5681
tcp_v4_do_rcv+0x5d6/0x870 net/ipv4/tcp_ipv4.c:1547
sk_backlog_rcv include/net/sock.h:952 [inline]
__release_sock+0x134/0x3a0 net/core/sock.c:2362
release_sock+0x54/0x1b0 net/core/sock.c:2901
do_tcp_setsockopt.constprop.0+0x42e/0x2340 net/ipv4/tcp.c:3098
tcp_setsockopt net/ipv4/tcp.c:3110 [inline]
tcp_setsockopt+0xb2/0xd0 net/ipv4/tcp.c:3102
kernel_setsockopt+0x106/0x1c0 net/socket.c:3563
rds_tcp_cork net/rds/tcp_send.c:43 [inline]
rds_tcp_xmit_path_complete+0xbf/0x100 net/rds/tcp_send.c:57
rds_send_xmit+0x13b5/0x2290 net/rds/send.c:410
rds_sendmsg+0x289d/0x2ea0 net/rds/send.c:1367
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg+0xc3/0x120 net/socket.c:661
__sys_sendto+0x21a/0x320 net/socket.c:1899
__do_sys_sendto net/socket.c:1911 [inline]
__se_sys_sendto net/socket.c:1907 [inline]
__x64_sys_sendto+0xdd/0x1b0 net/socket.c:1907
do_syscall_64+0xf9/0x620 arch/x86/entry/common.c:293
entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&(&rm->m_rs_lock)->rlock);
                               lock(&rs->rs_recv_lock);
                               lock(&(&rm->m_rs_lock)->rlock);
  lock(&rs->rs_recv_lock);

*** DEADLOCK ***

3 locks held by syz-executor728/12995:
#0: 00000000589d3912 (k-sk_lock-AF_INET){+.+.}, at: lock_sock include/net/sock.h:1512 [inline]
#0: 00000000589d3912 (k-sk_lock-AF_INET){+.+.}, at: do_tcp_setsockopt.constprop.0+0x13f/0x2340 net/ipv4/tcp.c:2816
#1: 00000000b5d5c10f (k-clock-AF_INET){++.-}, at: rds_tcp_write_space+0x25/0x650 net/rds/tcp_send.c:189
#2: 0000000063aa4bb2 (&(&rm->m_rs_lock)->rlock){....}, at: rds_send_remove_from_sock+0x278/0x8b0 net/rds/send.c:618

stack backtrace:
CPU: 1 PID: 12995 Comm: syz-executor728 Not tainted 4.19.211-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1fc/0x2ef lib/dump_stack.c:118
print_circular_bug.constprop.0.cold+0x2d7/0x41e kernel/locking/lockdep.c:1222
check_prev_add kernel/locking/lockdep.c:1866 [inline]
check_prevs_add kernel/locking/lockdep.c:1979 [inline]
validate_chain kernel/locking/lockdep.c:2420 [inline]
__lock_acquire+0x30c9/0x3ff0 kernel/locking/lockdep.c:3416
lock_acquire+0x170/0x3c0 kernel/locking/lockdep.c:3908
__raw_read_lock_irqsave include/linux/rwlock_api_smp.h:159 [inline]
_raw_read_lock_irqsave+0x93/0xd0 kernel/locking/spinlock.c:224
rds_wake_sk_sleep+0x1d/0xc0 net/rds/af_rds.c:109
rds_send_remove_from_sock+0xb1/0x8b0 net/rds/send.c:624
rds_send_path_drop_acked+0x2de/0x3c0 net/rds/send.c:700
rds_tcp_write_space+0x199/0x650 net/rds/tcp_send.c:203
tcp_new_space net/ipv4/tcp_input.c:5167 [inline]
tcp_check_space+0x407/0x6f0 net/ipv4/tcp_input.c:5178
tcp_data_snd_check net/ipv4/tcp_input.c:5188 [inline]
tcp_rcv_established+0x916/0x1ef0 net/ipv4/tcp_input.c:5681
tcp_v4_do_rcv+0x5d6/0x870 net/ipv4/tcp_ipv4.c:1547
sk_backlog_rcv include/net/sock.h:952 [inline]
__release_sock+0x134/0x3a0 net/core/sock.c:2362
release_sock+0x54/0x1b0 net/core/sock.c:2901
do_tcp_setsockopt.constprop.0+0x42e/0x2340 net/ipv4/tcp.c:3098
tcp_setsockopt net/ipv4/tcp.c:3110 [inline]
tcp_setsockopt+0xb2/0xd0 net/ipv4/tcp.c:3102
kernel_setsockopt+0x106/0x1c0 net/socket.c:3563
rds_tcp_cork net/rds/tcp_send.c:43 [inline]
rds_tcp_xmit_path_complete+0xbf/0x100 net/rds/tcp_send.c:57
rds_send_xmit+0x13b5/0x2290 net/rds/send.c:410
rds_sendmsg+0x289d/0x2ea0 net/rds/send.c:1367
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg+0xc3/0x120 net/socket.c:661
__sys_sendto+0x21a/0x320 net/socket.c:1899
__do_sys_sendto net/socket.c:1911 [inline]
__se_sys_sendto net/socket.c:1907 [inline]
__x64_sys_sendto+0xdd/0x1b0 net/socket.c:1907
do_syscall_64+0xf9/0x620 arch/x86/entry/common.c:293
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f31e6ecc079
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f31e6e71308 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f31e6f53268 RCX: 00007f31e6ecc079
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 00007f31e6f53260 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f31e6f1a74c
R13: 00007ffed006e09f R14: 00007f31e6e71400 R15: 0000000000022000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches