[syzbot] WARNING: suspicious RCU usage in bond_ethtool_get_ts_info

syzbot

May 12, 2022, 3:35:29 PM
to and...@kernel.org, an...@greyhouse.net, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, da...@davemloft.net, edum...@google.com, ha...@kernel.org, j.vos...@gmail.com, john.fa...@gmail.com, ka...@fb.com, kps...@kernel.org, ku...@kernel.org, linux-...@vger.kernel.org, liuha...@gmail.com, net...@vger.kernel.org, pab...@redhat.com, songliu...@fb.com, syzkall...@googlegroups.com, vfa...@gmail.com, y...@fb.com
Hello,

syzbot found the following issue on:

HEAD commit: 01f4685797a5 eth: amd: remove NI6510 support (ni65)
git tree: net-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=16391d99f00000
kernel config: https://syzkaller.appspot.com/x/.config?x=c04cc4641789ea51
dashboard link: https://syzkaller.appspot.com/bug?extid=92beb3d46aab498710fa
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17df03e1f00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12159cbef00000

The issue was bisected to:

commit aa6034678e873db8bd5c5a4b73f8b88c469374d6
Author: Hangbin Liu <liuha...@gmail.com>
Date: Fri Jan 21 08:25:18 2022 +0000

bonding: use rcu_dereference_rtnl when get bonding active slave

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=16fce349f00000
final oops: https://syzkaller.appspot.com/x/report.txt?x=15fce349f00000
console output: https://syzkaller.appspot.com/x/log.txt?x=11fce349f00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+92beb3...@syzkaller.appspotmail.com
Fixes: aa6034678e87 ("bonding: use rcu_dereference_rtnl when get bonding active slave")

=============================
WARNING: suspicious RCU usage
5.18.0-rc5-syzkaller-01392-g01f4685797a5 #0 Not tainted
-----------------------------
include/net/bonding.h:353 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


rcu_scheduler_active = 2, debug_locks = 1
1 lock held by syz-executor317/3599:
#0: ffff88801de78130 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1680 [inline]
#0: ffff88801de78130 (sk_lock-AF_INET){+.+.}-{0:0}, at: sock_setsockopt+0x1e3/0x2ec0 net/core/sock.c:1066

stack backtrace:
CPU: 0 PID: 3599 Comm: syz-executor317 Not tainted 5.18.0-rc5-syzkaller-01392-g01f4685797a5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
bond_option_active_slave_get_rcu include/net/bonding.h:353 [inline]
bond_ethtool_get_ts_info+0x32c/0x3a0 drivers/net/bonding/bond_main.c:5595
__ethtool_get_ts_info+0x173/0x240 net/ethtool/common.c:554
ethtool_get_phc_vclocks+0x99/0x110 net/ethtool/common.c:568
sock_timestamping_bind_phc net/core/sock.c:869 [inline]
sock_set_timestamping+0x3a3/0x7e0 net/core/sock.c:916
sock_setsockopt+0x543/0x2ec0 net/core/sock.c:1221
__sys_setsockopt+0x55e/0x6a0 net/socket.c:2223
__do_sys_setsockopt net/socket.c:2238 [inline]
__se_sys_setsockopt net/socket.c:2235 [inline]
__x64_sys_setsockopt+0xba/0x150 net/socket.c:2235
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f8902c8eb39
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Hangbin Liu

May 12, 2022, 9:51:43 PM
to Vladimir Oltean, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, net...@vger.kernel.org, pab...@redhat.com, David S. Miller, Jakub Kicinski, syzkall...@googlegroups.com, syzbot
Remove bpf guys from cc list.
Oh, I forgot to check setsockopt path...

Hi Vladimir, Jay,

Do you think I should revert my previous commit, or just add
rcu_read_lock()/rcu_read_unlock() back to bond_ethtool_get_ts_info()?

Thanks
Hangbin

Vladimir Oltean

May 13, 2022, 4:48:27 AM
to Hangbin Liu, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, net...@vger.kernel.org, pab...@redhat.com, David S. Miller, Jakub Kicinski, syzkall...@googlegroups.com, syzbot
Hi Hangbin,

sock_timestamping_bind_phc() appears to run unlocked, with the exception
of the rcu_read_lock() in dev_get_by_index() in which there is a dev_hold().
I'm thinking that this dev_hold ensures that the bonding interface does
not go away, but it still does not ensure that the active slave does not
go away.

I only looked at this superficially because I am AFK today, but I think
I would put rcu_read_lock() in bond_ethtool_get_ts_info(), not the way
it was before, but held for the entire time during which real_dev, ops
and phydev are being dereferenced.
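
The locking shape suggested above can be sketched in userspace C. This is
only an illustration, not the actual bonding patch: rcu_read_lock() and
rcu_read_unlock() here are counter-based stand-ins for the kernel
primitives, and mock_dev, mock_bond, mock_rcu_dereference() and
mock_get_ts_info() are hypothetical names invented for this sketch. The
point it shows is that the read-side section covers both the pointer fetch
and every later use of the fetched pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for kernel RCU primitives (assumptions, not kernel code). */
static int rcu_depth;           /* nesting depth of the mock read-side section */
static int unprotected_derefs;  /* dereferences made outside any section */

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

/* Hypothetical, heavily simplified stand-ins for the slave device and the bond. */
struct mock_dev  { int has_ts_info; int phc_index; };
struct mock_bond { struct mock_dev *curr_active_slave; };

/* Mock of rcu_dereference(): counts a "suspicious usage" instead of warning. */
static struct mock_dev *mock_rcu_dereference(struct mock_dev *p)
{
    if (rcu_depth == 0)
        unprotected_derefs++;
    return p;
}

/*
 * Sketch of the suggested shape: take the read lock once and keep it
 * until the last use of real_dev, rather than dropping it right after
 * the pointer has been fetched.
 */
static int mock_get_ts_info(struct mock_bond *bond, int *phc_index)
{
    struct mock_dev *real_dev;
    int ret = -1;

    *phc_index = -1;

    rcu_read_lock();
    real_dev = mock_rcu_dereference(bond->curr_active_slave);
    if (real_dev && real_dev->has_ts_info) {
        /* real_dev is still inside the read-side section here. */
        *phc_index = real_dev->phc_index;
        ret = 0;
    }
    rcu_read_unlock();

    return ret;
}
```

Calling mock_get_ts_info() on a bond with an active slave leaves
rcu_depth at 0 and unprotected_derefs at 0. Whether the real callbacks
reached through ops may be called inside an RCU read-side section is not
addressed by this sketch.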