[syzbot] WARNING: refcount bug in sk_psock_get


syzbot

Apr 9, 2021, 11:39:18 AM
to ak...@linux-foundation.org, and...@kernel.org, a...@kernel.org, bor...@nvidia.com, b...@alien8.de, b...@vger.kernel.org, dan...@iogearbox.net, da...@davemloft.net, h...@zytor.com, jmat...@google.com, john.fa...@gmail.com, jo...@8bytes.org, ka...@fb.com, kps...@kernel.org, ku...@kernel.org, k...@vger.kernel.org, linux-...@vger.kernel.org, mark.r...@arm.com, masa...@kernel.org, mi...@redhat.com, net...@vger.kernel.org, pbon...@redhat.com, pet...@infradead.org, rafael.j...@intel.com, ros...@goodmis.org, sea...@google.com, songliu...@fb.com, syzkall...@googlegroups.com, tg...@linutronix.de, vkuz...@redhat.com, wanp...@tencent.com, wi...@kernel.org, x...@kernel.org, y...@fb.com
Hello,

syzbot found the following issue on:

HEAD commit: 9c54130c Add linux-next specific files for 20210406
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17d8d7aad00000
kernel config: https://syzkaller.appspot.com/x/.config?x=d125958c3995ddcd
dashboard link: https://syzkaller.appspot.com/bug?extid=b54a1ce86ba4a623b7f0
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1729797ed00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1190f46ad00000

The issue was bisected to:

commit 997acaf6b4b59c6a9c259740312a69ea549cc684
Author: Mark Rutland <mark.r...@arm.com>
Date: Mon Jan 11 15:37:07 2021 +0000

lockdep: report broken irq restoration

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=11a6cc96d00000
final oops: https://syzkaller.appspot.com/x/report.txt?x=13a6cc96d00000
console output: https://syzkaller.appspot.com/x/log.txt?x=15a6cc96d00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+b54a1c...@syzkaller.appspotmail.com
Fixes: 997acaf6b4b5 ("lockdep: report broken irq restoration")

------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 1 PID: 8414 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
Modules linked in:
CPU: 1 PID: 8414 Comm: syz-executor793 Not tainted 5.12.0-rc6-next-20210406-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
Code: 1d 69 0c e6 09 31 ff 89 de e8 c8 b4 a6 fd 84 db 75 ab e8 0f ae a6 fd 48 c7 c7 e0 52 c2 89 c6 05 49 0c e6 09 01 e8 91 0f 00 05 <0f> 0b eb 8f e8 f3 ad a6 fd 0f b6 1d 33 0c e6 09 31 ff 89 de e8 93
RSP: 0018:ffffc90000eef388 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88801bbdd580 RSI: ffffffff815c2e05 RDI: fffff520001dde63
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff815bcc6e R11: 0000000000000000 R12: 1ffff920001dde74
R13: 0000000090200301 R14: ffff888026e00000 R15: ffffc90000eef3c0
FS: 0000000001422300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 0000000012b3b000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__refcount_add_not_zero include/linux/refcount.h:163 [inline]
__refcount_inc_not_zero include/linux/refcount.h:227 [inline]
refcount_inc_not_zero include/linux/refcount.h:245 [inline]
sk_psock_get+0x3b0/0x400 include/linux/skmsg.h:435
bpf_exec_tx_verdict+0x11e/0x11a0 net/tls/tls_sw.c:799
tls_sw_sendmsg+0xa41/0x1800 net/tls/tls_sw.c:1013
inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:821
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
sock_write_iter+0x289/0x3c0 net/socket.c:1001
call_write_iter include/linux/fs.h:2106 [inline]
do_iter_readv_writev+0x46f/0x740 fs/read_write.c:740
do_iter_write+0x188/0x670 fs/read_write.c:866
vfs_writev+0x1aa/0x630 fs/read_write.c:939
do_writev+0x27f/0x300 fs/read_write.c:982
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x43efa9
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe9279f418 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043efa9
RDX: 0000000000000001 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 0000000000402f90 R08: 0000000000400488 R09: 0000000000400488
R10: 0000000000000038 R11: 0000000000000246 R12: 0000000000403020
R13: 0000000000000000 R14: 00000000004ac018 R15: 0000000000400488
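
The "saturated; leaking memory" text and the refcount_inc_not_zero frames in the trace suggest that the increment found a counter outside its valid range. Below is a minimal single-threaded userspace sketch of that behaviour. It is an illustration only, not the kernel implementation: the model_* names are hypothetical, the real code uses atomics, and the garbage value -1 is chosen arbitrarily.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model: the kernel pins a broken counter at INT_MIN / 2. */
#define MODEL_REFCOUNT_SATURATED (INT_MIN / 2)

struct model_refcount {
	int refs;
};

/*
 * Single-threaded model of "take a reference unless the count is zero".
 * Returns true if the caller now holds a reference; *warned reports
 * whether the saturation warning would have fired.
 */
static bool model_inc_not_zero(struct model_refcount *r, bool *warned)
{
	int old = r->refs;

	*warned = false;
	if (old == 0)
		return false;	/* object is already dead, take nothing */

	if (old < 0 || old == INT_MAX) {
		/* Counter is negative or would overflow: pin it and warn. */
		r->refs = MODEL_REFCOUNT_SATURATED;
		*warned = true;
		return true;
	}

	r->refs = old + 1;
	return true;
}

/* Walks through the normal path and the saturation path. */
static void model_refcount_demo(void)
{
	struct model_refcount r = { .refs = 1 };
	bool warned;

	/* A healthy counter is simply incremented. */
	assert(model_inc_not_zero(&r, &warned) && !warned && r.refs == 2);

	/* A negative "counter" (e.g. unrelated memory) saturates and warns. */
	r.refs = -1;
	assert(model_inc_not_zero(&r, &warned) && warned);
	assert(r.refs == MODEL_REFCOUNT_SATURATED);
}
```

Calling model_refcount_demo() from a main() exercises both paths. The point of the model is that a saturation warning on an increment does not need billions of real references; any memory that reads as a negative int produces it on the first attempt.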


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

John Fastabend

Apr 9, 2021, 3:45:00 PM
to syzbot, ak...@linux-foundation.org, and...@kernel.org, a...@kernel.org, bor...@nvidia.com, b...@alien8.de, b...@vger.kernel.org, dan...@iogearbox.net, da...@davemloft.net, h...@zytor.com, jmat...@google.com, john.fa...@gmail.com, jo...@8bytes.org, ka...@fb.com, kps...@kernel.org, ku...@kernel.org, k...@vger.kernel.org, linux-...@vger.kernel.org, mark.r...@arm.com, masa...@kernel.org, mi...@redhat.com, net...@vger.kernel.org, pbon...@redhat.com, pet...@infradead.org, rafael.j...@intel.com, ros...@goodmis.org, sea...@google.com, songliu...@fb.com, syzkall...@googlegroups.com, tg...@linutronix.de, vkuz...@redhat.com, wanp...@tencent.com, wi...@kernel.org, x...@kernel.org, y...@fb.com
[...]

This is likely a problem with the latest round of sockmap patches. I'll
take a look.

syzbot

Apr 9, 2021, 4:49:11 PM
to syzkall...@googlegroups.com, xiyou.w...@gmail.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING: refcount bug in sk_psock_get

------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 0 PID: 10143 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
Modules linked in:
CPU: 0 PID: 10143 Comm: syz-executor.0 Not tainted 5.12.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
Code: 1d de 73 68 09 31 ff 89 de e8 88 fa aa fd 84 db 75 ab e8 cf f3 aa fd 48 c7 c7 00 e5 c1 89 c6 05 be 73 68 09 01 e8 d9 97 fb 04 <0f> 0b eb 8f e8 b3 f3 aa fd 0f b6 1d a8 73 68 09 31 ff 89 de e8 53
RSP: 0018:ffffc9000ad6f388 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880257b8000 RSI: ffffffff815c5205 RDI: fffff520015ade63
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff815bdf9e R11: 0000000000000000 R12: 1ffff920015ade74
R13: 000000008f9d1101 R14: ffff88801d3d8000 R15: ffffc9000ad6f3c0
FS: 00007fc7fad7d700(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055d3704dc8c7 CR3: 000000002541c000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__refcount_add_not_zero include/linux/refcount.h:163 [inline]
__refcount_inc_not_zero include/linux/refcount.h:227 [inline]
refcount_inc_not_zero include/linux/refcount.h:245 [inline]
sk_psock_get+0x3b0/0x400 include/linux/skmsg.h:435
bpf_exec_tx_verdict+0x11e/0x11a0 net/tls/tls_sw.c:799
tls_sw_sendmsg+0xa41/0x1800 net/tls/tls_sw.c:1013
inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:821
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
sock_write_iter+0x289/0x3c0 net/socket.c:1001
call_write_iter include/linux/fs.h:1977 [inline]
do_iter_readv_writev+0x46f/0x740 fs/read_write.c:740
do_iter_write+0x188/0x670 fs/read_write.c:866
vfs_writev+0x1aa/0x630 fs/read_write.c:939
do_writev+0x27f/0x300 fs/read_write.c:982
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x466459
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc7fad7d188 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000466459
RDX: 0000000000000001 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00000000004bf9fb R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60
R13: 00007ffdf0650a2f R14: 00007fc7fad7d300 R15: 0000000000022000


Tested on:

commit: 92d3bff2 Merge branch 'bpf/selftests: page size fixes'
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=158fa441d00000
kernel config: https://syzkaller.appspot.com/x/.config?x=995a68eaac242f29
dashboard link: https://syzkaller.appspot.com/bug?extid=b54a1ce86ba4a623b7f0
compiler:

Cong Wang

Apr 10, 2021, 12:42:01 PM
to John Fastabend, syzbot, Andrew Morton, Andrii Nakryiko, Alexei Starovoitov, bor...@nvidia.com, Borislav Petkov, bpf, Daniel Borkmann, David Miller, H. Peter Anvin, jmat...@google.com, Joerg Roedel, Martin KaFai Lau, kps...@kernel.org, Jakub Kicinski, kvm@vger.kernel.org list, LKML, Mark Rutland, masa...@kernel.org, Ingo Molnar, Linux Kernel Network Developers, pbon...@redhat.com, Peter Zijlstra, Rafael J. Wysocki, Steven Rostedt, sea...@google.com, Song Liu, syzkaller-bugs, Thomas Gleixner, vkuz...@redhat.com, wanp...@tencent.com, Will Deacon, x86, Yonghong Song
I bet this has nothing to do with my sockmap patches, as the
reproducer clearly does not use BPF at all. The reproducer actually
creates an SMC socket, which coincidentally uses sk_user_data too
and therefore triggers this bug.
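
To make the sk_user_data point concrete, here is a small userspace sketch of what happens when two owners share one untyped pointer slot. Everything in it is hypothetical (the model_* structs, their layout, and the 0xff fill bytes); it only shows the general mechanism, not the actual kernel structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical socket with a single untyped slot shared by all users. */
struct model_sock {
	void *user_data;
};

/* Owner A's view of the slot: a refcounted object. */
struct model_psock {
	void *owner;
	int refcnt;
};

/* Owner B's view of the slot: unrelated data of the same size. */
struct model_other {
	unsigned char payload[sizeof(struct model_psock)];
};

/*
 * Reads the bytes where owner A expects its refcount, regardless of who
 * actually stored the pointer. memcpy keeps the read well defined in C.
 */
static int model_read_refcnt_as_psock(const struct model_sock *sk)
{
	const unsigned char *base = sk->user_data;
	int refcnt;

	memcpy(&refcnt, base + offsetof(struct model_psock, refcnt),
	       sizeof(refcnt));
	return refcnt;
}

/* Owner B stores its object; owner A's accessor then misreads it. */
static int model_confusion_demo(void)
{
	struct model_other other;
	struct model_sock sk;

	memset(other.payload, 0xff, sizeof(other.payload));
	sk.user_data = &other;

	/* All-ones bytes read back as a negative "refcount". */
	return model_read_refcnt_as_psock(&sk);
}
```

With the all-ones fill used here, model_confusion_demo() returns a negative value, which is exactly the kind of input that makes a saturating refcount warn on the first increment.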

I think we should just prohibit TCP_ULP on SMC sockets, something
like this:

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 47340b3b514f..0d4d6d28f20c 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2162,6 +2162,9 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
 	struct smc_sock *smc;
 	int val, rc;

+	if (optname == TCP_ULP)
+		return -EOPNOTSUPP;
+
 	smc = smc_sk(sk);

 	/* generic setsockopts reaching us here always apply to the
@@ -2186,7 +2189,6 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
 	if (rc || smc->use_fallback)
 		goto out;
 	switch (optname) {
-	case TCP_ULP:
 	case TCP_FASTOPEN:
 	case TCP_FASTOPEN_CONNECT:
 	case TCP_FASTOPEN_KEY:
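
The patch above moves the TCP_ULP check ahead of the point where the option is passed on to the inner socket. The ordering difference can be sketched in userspace as follows. This is a simplified model with hypothetical model_* names; in particular, the -EINVAL returned by the old ordering is an assumption made for illustration, not a statement about what the switch in smc_setsockopt() does.

```c
#include <assert.h>
#include <errno.h>

/* Value of TCP_ULP in the Linux UAPI headers, repeated for the model. */
#define MODEL_TCP_ULP 31

/* Hypothetical inner socket that records whether a ULP was attached. */
struct model_inner {
	int ulp_installed;
};

/* The inner socket accepts every option, including the ULP one. */
static int model_inner_setsockopt(struct model_inner *in, int optname)
{
	if (optname == MODEL_TCP_ULP)
		in->ulp_installed = 1;
	return 0;
}

/* Old ordering: forward first, look at the option afterwards. */
static int model_setsockopt_before(struct model_inner *in, int optname)
{
	int rc = model_inner_setsockopt(in, optname);

	if (rc)
		return rc;
	switch (optname) {
	case MODEL_TCP_ULP:
		return -EINVAL;	/* assumed result; the ULP is already attached */
	default:
		return 0;
	}
}

/* New ordering: reject the option before anything is forwarded. */
static int model_setsockopt_after(struct model_inner *in, int optname)
{
	if (optname == MODEL_TCP_ULP)
		return -EOPNOTSUPP;
	return model_inner_setsockopt(in, optname);
}

/* Returns 0 if the two orderings behave as described. */
static int model_ordering_demo(void)
{
	struct model_inner a = { 0 }, b = { 0 };

	model_setsockopt_before(&a, MODEL_TCP_ULP);
	assert(a.ulp_installed == 1);	/* side effect happened anyway */

	assert(model_setsockopt_after(&b, MODEL_TCP_ULP) == -EOPNOTSUPP);
	assert(b.ulp_installed == 0);	/* inner socket never saw it */
	return 0;
}
```

In this model, an error returned after forwarding does not undo the side effect on the inner socket, which is why the early return is the part that matters.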

syzbot

Apr 10, 2021, 1:22:08 PM
to syzkall...@googlegroups.com, xiyou.w...@gmail.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+b54a1c...@syzkaller.appspotmail.com

Tested on:

commit: 78b9c1fc smc: disallow TCP_ULP
git tree: https://github.com/congwang/linux.git bpf-next
Note: testing is done by a robot and is best-effort only.