inconsistent lock state in xprt_disconnect_done

5 views
Skip to first unread message

syzbot

unread,
Apr 26, 2022, 5:00:22 PM4/26/22
to syzkaller...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 15a1c6b6f516 Linux 4.14.276
git tree: linux-4.14.y
console output: https://syzkaller.appspot.com/x/log.txt?x=1398b3b8f00000
kernel config: https://syzkaller.appspot.com/x/.config?x=3ec12c117082c76f
dashboard link: https://syzkaller.appspot.com/bug?extid=1f4408d3941322a6de1f
compiler: gcc version 10.2.1 20210110 (Debian 10.2.1-6)

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+1f4408...@syzkaller.appspotmail.com

================================
WARNING: inconsistent lock state
4.14.276-syzkaller #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
ksoftirqd/0/7 [HC0[0]:SC1[5]:HE1:SE0] takes:
(&(&xprt->transport_lock)->rlock){+.?.}, at: [<ffffffff867297f9>] spin_lock_bh include/linux/spinlock.h:322 [inline]
(&(&xprt->transport_lock)->rlock){+.?.}, at: [<ffffffff867297f9>] xprt_disconnect_done+0x19/0x40 net/sunrpc/xprt.c:664
{SOFTIRQ-ON-W} state was registered at:
lock_acquire+0x170/0x3f0 kernel/locking/lockdep.c:3998
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
_raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:152
spin_lock include/linux/spinlock.h:317 [inline]
xprt_destroy+0x68/0x1c0 net/sunrpc/xprt.c:1528
xprt_destroy_kref net/sunrpc/xprt.c:1542 [inline]
kref_put include/linux/kref.h:70 [inline]
xprt_put+0x32/0x40 net/sunrpc/xprt.c:1566
xprt_switch_free_entries net/sunrpc/xprtmultipath.c:124 [inline]
xprt_switch_free net/sunrpc/xprtmultipath.c:135 [inline]
kref_put include/linux/kref.h:70 [inline]
xprt_switch_put+0x1ae/0x290 net/sunrpc/xprtmultipath.c:161
rpc_free_client+0x105/0x190 net/sunrpc/clnt.c:885
rpc_free_auth net/sunrpc/clnt.c:910 [inline]
rpc_release_client+0xc4/0x130 net/sunrpc/clnt.c:927
rpcb_getport_async+0x6f3/0xba0 net/sunrpc/rpcb_clnt.c:790
call_bind+0x140/0x1a0 net/sunrpc/clnt.c:1797
__rpc_execute+0x1cf/0xc90 net/sunrpc/sched.c:783
rpc_execute+0x273/0x350 net/sunrpc/sched.c:851
rpc_run_task+0x473/0x630 net/sunrpc/clnt.c:1064
rpc_call_sync+0x93/0x140 net/sunrpc/clnt.c:1093
rpc_ping+0xa4/0xf0 net/sunrpc/clnt.c:2520
rpc_create_xprt+0x299/0x340 net/sunrpc/clnt.c:479
rpc_create+0x2db/0x4f0 net/sunrpc/clnt.c:587
nfs_create_rpc_client+0x2ec/0x3b0 fs/nfs/client.c:528
nfs_init_client fs/nfs/client.c:639 [inline]
nfs_init_client+0x62/0xf0 fs/nfs/client.c:626
nfs_get_client+0xdc2/0x10c0 fs/nfs/client.c:430
nfs_init_server+0x21d/0xd20 fs/nfs/client.c:675
nfs_create_server+0x7a/0x490 fs/nfs/client.c:958
nfs_try_mount+0x122/0x750 fs/nfs/super.c:1879
nfs_fs_mount+0x1530/0x2b10 fs/nfs/super.c:2692
mount_fs+0x92/0x2a0 fs/super.c:1237
vfs_kern_mount.part.0+0x5b/0x470 fs/namespace.c:1046
vfs_kern_mount fs/namespace.c:1036 [inline]
do_new_mount fs/namespace.c:2572 [inline]
do_mount+0xe65/0x2a10 fs/namespace.c:2902
SYSC_mount fs/namespace.c:3118 [inline]
SyS_mount+0xa8/0x120 fs/namespace.c:3095
do_syscall_64+0x1d5/0x640 arch/x86/entry/common.c:292
entry_SYSCALL_64_after_hwframe+0x46/0xbb
irq event stamp: 815468
hardirqs last enabled at (815468): [<ffffffff81321821>] __local_bh_enable_ip+0xc1/0x170 kernel/softirq.c:190
hardirqs last disabled at (815467): [<ffffffff813217e9>] __local_bh_enable_ip+0x89/0x170 kernel/softirq.c:167
softirqs last enabled at (815418): [<ffffffff8760068b>] __do_softirq+0x68b/0x9ff kernel/softirq.c:314
softirqs last disabled at (815423): [<ffffffff81321920>] run_ksoftirqd+0x50/0x1a0 kernel/softirq.c:670

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&(&xprt->transport_lock)->rlock);
<Interrupt>
lock(&(&xprt->transport_lock)->rlock);

*** DEADLOCK ***

4 locks held by ksoftirqd/0/7:
#0: (rcu_read_lock){....}, at: [<ffffffff85c3bd72>] __write_once_size include/linux/compiler.h:212 [inline]
#0: (rcu_read_lock){....}, at: [<ffffffff85c3bd72>] __skb_unlink include/linux/skbuff.h:1917 [inline]
#0: (rcu_read_lock){....}, at: [<ffffffff85c3bd72>] __skb_dequeue include/linux/skbuff.h:1933 [inline]
#0: (rcu_read_lock){....}, at: [<ffffffff85c3bd72>] process_backlog+0x1c2/0x6f0 net/core/dev.c:5193
#1: (rcu_read_lock){....}, at: [<ffffffff860d164a>] __skb_pull include/linux/skbuff.h:2150 [inline]
#1: (rcu_read_lock){....}, at: [<ffffffff860d164a>] ip_local_deliver_finish+0x12a/0xab0 net/ipv4/ip_input.c:194
#2: (slock-AF_INET-RPC){+.-.}, at: [<ffffffff861758a8>] spin_lock include/linux/spinlock.h:317 [inline]
#2: (slock-AF_INET-RPC){+.-.}, at: [<ffffffff861758a8>] tcp_v4_err+0x3d8/0x1820 net/ipv4/tcp_ipv4.c:404
#3: (k-clock-AF_INET){++.-}, at: [<ffffffff86734fa5>] xs_tcp_state_change+0x25/0x7e0 net/sunrpc/xprtsock.c:1568

stack backtrace:
CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.14.276-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x1b2/0x281 lib/dump_stack.c:58
print_usage_bug.cold+0x42e/0x570 kernel/locking/lockdep.c:2589
valid_state kernel/locking/lockdep.c:2602 [inline]
mark_lock_irq kernel/locking/lockdep.c:2796 [inline]
mark_lock+0xb4d/0x1050 kernel/locking/lockdep.c:3194
mark_irqflags kernel/locking/lockdep.c:3072 [inline]
__lock_acquire+0xc81/0x3f20 kernel/locking/lockdep.c:3448
lock_acquire+0x170/0x3f0 kernel/locking/lockdep.c:3998
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
_raw_spin_lock_bh+0x2f/0x40 kernel/locking/spinlock.c:176
spin_lock_bh include/linux/spinlock.h:322 [inline]
xprt_disconnect_done+0x19/0x40 net/sunrpc/xprt.c:664
xs_sock_mark_closed net/sunrpc/xprtsock.c:811 [inline]
xs_tcp_state_change+0x3c4/0x7e0 net/sunrpc/xprtsock.c:1637
tcp_done+0x14f/0x210 net/ipv4/tcp.c:3427
tcp_v4_err+0x7dd/0x1820 net/ipv4/tcp_ipv4.c:523
icmp_socket_deliver+0x1a7/0x330 net/ipv4/icmp.c:836
icmp_unreach+0x268/0xae0 net/ipv4/icmp.c:953
icmp_rcv+0xb7f/0x1240 net/ipv4/icmp.c:1134
ip_local_deliver_finish+0x3f2/0xab0 net/ipv4/ip_input.c:216
NF_HOOK include/linux/netfilter.h:250 [inline]
ip_local_deliver+0x167/0x460 net/ipv4/ip_input.c:257
dst_input include/net/dst.h:476 [inline]
ip_rcv_finish+0x6e3/0x19f0 net/ipv4/ip_input.c:396
NF_HOOK include/linux/netfilter.h:250 [inline]
ip_rcv+0x8a7/0xf10 net/ipv4/ip_input.c:493
__netif_receive_skb_core+0x15ee/0x2a30 net/core/dev.c:4474
__netif_receive_skb+0x27/0x1a0 net/core/dev.c:4512
process_backlog+0x218/0x6f0 net/core/dev.c:5195
napi_poll net/core/dev.c:5604 [inline]
net_rx_action+0x466/0xfd0 net/core/dev.c:5670
__do_softirq+0x24d/0x9ff kernel/softirq.c:288
run_ksoftirqd+0x50/0x1a0 kernel/softirq.c:670
smpboot_thread_fn+0x5c1/0x920 kernel/smpboot.c:164
kthread+0x30d/0x420 kernel/kthread.c:232
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:404
print_req_error: I/O error, dev loop5, sector 24
base_sock_release(ffff88804ea4f200) sk=ffff88805a981580
print_req_error: I/O error, dev loop5, sector 24
base_sock_release(ffff88804ea80580) sk=ffff8880aec72880
base_sock_release(ffff88805a538980) sk= (null)
print_req_error: I/O error, dev loop5, sector 24
base_sock_release(ffff888097f28600) sk=ffff88804fd30a00
base_sock_release(ffff88809fa2e500) sk=ffff88808fde0980
base_sock_release(ffff88805a54b140) sk=ffff888092045180
base_sock_release(ffff88805a4d2a80) sk= (null)
base_sock_release(ffff88808e921580) sk= (null)
base_sock_release(ffff88805a4c9480) sk= (null)
base_sock_release(ffff88805a4c8200) sk= (null)
base_sock_release(ffff88805a4c5a80) sk= (null)
base_sock_release(ffff88808eb5a4c0) sk= (null)
base_sock_release(ffff88805a7f0ac0) sk= (null)
base_sock_release(ffff888097fd2540) sk= (null)
base_sock_release(ffff88808e3d7a80) sk= (null)
base_sock_release(ffff88805a7651c0) sk= (null)
base_sock_release(ffff88805a73fb00) sk= (null)
base_sock_release(ffff88805a73f200) sk= (null)
base_sock_release(ffff88808e82bb00) sk= (null)
base_sock_release(ffff888097c0b600) sk= (null)
base_sock_release(ffff8880a66840c0) sk= (null)
base_sock_release(ffff88808e3d65c0) sk= (null)
base_sock_release(ffff88808e3d7600) sk= (null)
base_sock_release(ffff88808eab1200) sk= (null)
base_sock_release(ffff8880a67e7ac0) sk= (null)
base_sock_release(ffff88808e35e100) sk= (null)
base_sock_release(ffff88808e35e580) sk= (null)
base_sock_release(ffff88808e3655c0) sk= (null)
base_sock_release(ffff88808e377a80) sk= (null)
base_sock_release(ffff888094d1f4c0) sk= (null)
base_sock_release(ffff88808e381140) sk= (null)
base_sock_release(ffff88808e3815c0) sk= (null)
base_sock_release(ffff88808eaee200) sk= (null)
base_sock_release(ffff88808ea09a40) sk= (null)
base_sock_release(ffff8880a67e7640) sk= (null)
base_sock_release(ffff8880a67e71c0) sk= (null)
base_sock_release(ffff88808eb00a80) sk= (null)
base_sock_release(ffff8880a67e9600) sk= (null)
base_sock_release(ffff88808ea454c0) sk= (null)
base_sock_release(ffff888094ca2980) sk= (null)
base_sock_release(ffff888094d1e000) sk= (null)
base_sock_release(ffff88808e372500) sk= (null)
base_sock_release(ffff88808e372080) sk= (null)
base_sock_release(ffff88805a4c5600) sk= (null)
base_sock_release(ffff8880906e8540) sk= (null)
base_sock_release(ffff88808e94e5c0) sk= (null)
base_sock_release(ffff88808e94e140) sk= (null)
base_sock_release(ffff88808eb6fb00) sk= (null)
base_sock_release(ffff88805a76c940) sk= (null)
base_sock_release(ffff88805a76c4c0) sk= (null)
base_sock_release(ffff88805a765640) sk= (null)
base_sock_release(ffff88808e36c140) sk= (null)
base_sock_release(ffff88808eb6f680) sk= (null)
base_sock_release(ffff8880a66cd1c0) sk= (null)
base_sock_release(ffff88808ea095c0) sk= (null)
base_sock_release(ffff88808eb29200) sk= (null)
base_sock_release(ffff888094d24480) sk= (null)
base_sock_release(ffff88805a4ed580) sk= (null)
base_sock_release(ffff88805a5205c0) sk= (null)
base_sock_release(ffff88805a5b9980) sk= (null)
base_sock_release(ffff88805a5df080) sk= (null)
base_sock_release(ffff88805a5d30c0) sk= (null)
base_sock_release(ffff88805a5d3540) sk= (null)
base_sock_release(ffff88805a4a5ac0) sk= (null)
base_sock_release(ffff88805a47c080) sk= (null)
base_sock_release(ffff88805a4ca040) sk= (null)
base_sock_release(ffff88808e35ea00) sk= (null)
base_sock_release(ffff88808e35c4c0) sk= (null)
base_sock_release(ffff88808e377600) sk= (null)
base_sock_release(ffff88805a715100) sk= (null)
base_sock_release(ffff88805a619180) sk= (null)
base_sock_release(ffff88805a6e1a40) sk=ffff8880937a9100
base_sock_release(ffff88808e82b200) sk=ffff88809188a280
base_sock_release(ffff88805a63b980) sk=ffff888059b1a2c0
base_sock_release(ffff888097fc0a00) sk=ffff8880abc5eec0
base_sock_release(ffff888094cd9600) sk= (null)
base_sock_release(ffff8880a6705640) sk=ffff88808baf5340
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 0 PID: 12968 Comm: syz-executor.3 Not tainted 4.14.276-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x1b2/0x281 lib/dump_stack.c:58
fail_dump lib/fault-inject.c:51 [inline]
should_fail.cold+0x10a/0x149 lib/fault-inject.c:149
should_failslab+0xd6/0x130 mm/failslab.c:32
slab_pre_alloc_hook mm/slab.h:421 [inline]
slab_alloc mm/slab.c:3376 [inline]
__do_kmalloc mm/slab.c:3718 [inline]
__kmalloc+0x2c1/0x400 mm/slab.c:3729
kmalloc include/linux/slab.h:493 [inline]
kzalloc include/linux/slab.h:661 [inline]
rds_message_alloc+0x32/0x220 net/rds/message.c:202
rds_sendmsg+0x7ed/0x1bf0 net/rds/send.c:1094
sock_sendmsg_nosec net/socket.c:646 [inline]
sock_sendmsg+0xb5/0x100 net/socket.c:656
___sys_sendmsg+0x6c8/0x800 net/socket.c:2062
__sys_sendmsg+0xa3/0x120 net/socket.c:2096
SYSC_sendmsg net/socket.c:2107 [inline]
SyS_sendmsg+0x27/0x40 net/socket.c:2103
do_syscall_64+0x1d5/0x640 arch/x86/entry/common.c:292
entry_SYSCALL_64_after_hwframe+0x46/0xbb
RIP: 0033:0x7f870ebf80e9
RSP: 002b:00007f870d56d168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f870ed0af60 RCX: 00007f870ebf80e9
RDX: 0000000000000000 RSI: 0000000020001600 RDI: 0000000000000003
RBP: 00007f870d56d1d0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffd6a0a8e7f R14: 00007f870d56d300 R15: 0000000000022000
9pnet: Insufficient options for proto=fd
9pnet: Insufficient options for proto=fd
9pnet: Insufficient options for proto=fd
RDS: rds_bind could not find a transport for 224.0.0.2, load rds_tcp or rds_rdma?
RDS: rds_bind could not find a transport for 224.0.0.2, load rds_tcp or rds_rdma?
RDS: rds_bind could not find a transport for 224.0.0.2, load rds_tcp or rds_rdma?
RDS: rds_bind could not find a transport for 224.0.0.1, load rds_tcp or rds_rdma?
RDS: rds_bind could not find a transport for 224.0.0.1, load rds_tcp or rds_rdma?
RDS: rds_bind could not find a transport for 224.0.0.2, load rds_tcp or rds_rdma?
RDS: rds_bind could not find a transport for 224.0.0.1, load rds_tcp or rds_rdma?
atomic_op ffff8880b50cf840 conn xmit_atomic (null)


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

syzbot

unread,
Dec 23, 2022, 6:50:28 AM12/23/22
to syzkaller...@googlegroups.com
Auto-closing this bug as obsolete.
Crashes did not happen for a while, no reproducer and no activity.
Reply all
Reply to author
Forward
0 new messages