[syzbot] [net?] WARNING: bad unlock balance in do_setlink


syzbot

Apr 7, 2025, 1:57:38 AM
to andrew...@lunn.ch, da...@davemloft.net, edum...@google.com, ho...@kernel.org, ku...@kernel.org, kun...@amazon.com, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, s...@fomichev.me, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 8bc251e5d874 Merge tag 'nf-25-04-03' of git://git.kernel.o..
git tree: net
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1133afb0580000
kernel config: https://syzkaller.appspot.com/x/.config?x=24f9c4330e7c0609
dashboard link: https://syzkaller.appspot.com/bug?extid=45016fe295243a7882d3
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1040823f980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=151d194c580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/a500d5daba83/disk-8bc251e5.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/2459c792199a/vmlinux-8bc251e5.xz
kernel image: https://storage.googleapis.com/syzbot-assets/558655fb055e/bzImage-8bc251e5.xz

The issue was bisected to:

commit dbfc99495d960134bfe1a4f13849fb0d5373b42c
Author: Stanislav Fomichev <s...@fomichev.me>
Date: Tue Apr 1 16:34:47 2025 +0000

net: dummy: request ops lock

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13233998580000
final oops: https://syzkaller.appspot.com/x/report.txt?x=10a33998580000
console output: https://syzkaller.appspot.com/x/log.txt?x=17233998580000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+45016f...@syzkaller.appspotmail.com
Fixes: dbfc99495d96 ("net: dummy: request ops lock")

=====================================
WARNING: bad unlock balance detected!
6.14.0-syzkaller-12504-g8bc251e5d874 #0 Not tainted
-------------------------------------
syz-executor814/5834 is trying to release lock (&dev_instance_lock_key) at:
[<ffffffff89f41f56>] netdev_unlock include/linux/netdevice.h:2756 [inline]
[<ffffffff89f41f56>] netdev_unlock_ops include/net/netdev_lock.h:48 [inline]
[<ffffffff89f41f56>] do_setlink+0xc26/0x43a0 net/core/rtnetlink.c:3406
but there are no more locks to release!

other info that might help us debug this:
1 lock held by syz-executor814/5834:
#0: ffffffff900fc408 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock net/core/rtnetlink.c:80 [inline]
#0: ffffffff900fc408 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock net/core/rtnetlink.c:341 [inline]
#0: ffffffff900fc408 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0xd68/0x1fe0 net/core/rtnetlink.c:4064

stack backtrace:
CPU: 0 UID: 0 PID: 5834 Comm: syz-executor814 Not tainted 6.14.0-syzkaller-12504-g8bc251e5d874 #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_unlock_imbalance_bug+0x185/0x1a0 kernel/locking/lockdep.c:5296
__lock_release kernel/locking/lockdep.c:5535 [inline]
lock_release+0x1ed/0x3e0 kernel/locking/lockdep.c:5887
__mutex_unlock_slowpath+0xee/0x800 kernel/locking/mutex.c:907
netdev_unlock include/linux/netdevice.h:2756 [inline]
netdev_unlock_ops include/net/netdev_lock.h:48 [inline]
do_setlink+0xc26/0x43a0 net/core/rtnetlink.c:3406
rtnl_group_changelink net/core/rtnetlink.c:3783 [inline]
__rtnl_newlink net/core/rtnetlink.c:3937 [inline]
rtnl_newlink+0x1619/0x1fe0 net/core/rtnetlink.c:4065
rtnetlink_rcv_msg+0x80f/0xd70 net/core/rtnetlink.c:6955
netlink_rcv_skb+0x208/0x480 net/netlink/af_netlink.c:2534
netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
netlink_unicast+0x7f8/0x9a0 net/netlink/af_netlink.c:1339
netlink_sendmsg+0x8c3/0xcd0 net/netlink/af_netlink.c:1883
sock_sendmsg_nosec net/socket.c:712 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:727
____sys_sendmsg+0x523/0x860 net/socket.c:2566
___sys_sendmsg net/socket.c:2620 [inline]
__sys_sendmsg+0x271/0x360 net/socket.c:2652
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f8427b614a9
Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff9b59f3a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fff9b59f578 RCX: 00007f8427b614a9
RDX: 0000000000000000 RSI: 0000200000000300 RDI: 0000000000000004
RBP: 00007f8427bd4610 R08: 000000000000000c R09: 00007fff9b59f578
R10: 000000000000001b R11: 0000000000000246 R12: 0000000000000001
R13:


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Kuniyuki Iwashima

Apr 7, 2025, 2:37:26 AM
to syzbot+45016f...@syzkaller.appspotmail.com, andrew...@lunn.ch, da...@davemloft.net, edum...@google.com, ho...@kernel.org, ku...@kernel.org, kun...@amazon.com, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, s...@fomichev.me, syzkall...@googlegroups.com
From: syzbot <syzbot+45016f...@syzkaller.appspotmail.com>
Date: Sun, 06 Apr 2025 22:57:35 -0700
#syz test

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index c23852835050..925d634f724e 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3027,7 +3027,7 @@ static int do_setlink(const struct sk_buff *skb, struct net_device *dev,

 	err = validate_linkmsg(dev, tb, extack);
 	if (err < 0)
-		goto errout;
+		return err;

 	if (tb[IFLA_IFNAME])
 		nla_strscpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
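
The one-line change above addresses a common error-path pattern: a `goto` to a cleanup label that releases a lock, taken from a point where the lock has not been acquired yet. The following is a minimal userspace sketch of that pattern, not kernel code. It assumes, as the patch and the lockdep trace suggest, that the `errout` path in do_setlink() ends in netdev_unlock_ops() and that validate_linkmsg() runs before the lock is taken. `struct fake_lock`, `setlink_buggy()` and `setlink_fixed()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device instance lock. */
struct fake_lock {
	int depth;        /* how many times the lock is currently held */
	int bad_unlocks;  /* releases attempted while the lock was not held */
};

void fake_lock_acquire(struct fake_lock *l)
{
	l->depth++;
}

void fake_lock_release(struct fake_lock *l)
{
	if (l->depth == 0) {
		/* Roughly what lockdep flags as "bad unlock balance". */
		l->bad_unlocks++;
		return;
	}
	l->depth--;
	assert(l->depth >= 0);
}

/* Buggy shape: the early failure jumps to a label that unlocks. */
int setlink_buggy(struct fake_lock *l, bool valid)
{
	int err = 0;

	if (!valid) {
		err = -EINVAL;
		goto errout;   /* the lock has not been taken yet */
	}

	fake_lock_acquire(l);
	/* ... apply the requested changes under the lock ... */
errout:
	fake_lock_release(l);
	return err;
}

/* Fixed shape: return directly when failing before the lock is taken. */
int setlink_fixed(struct fake_lock *l, bool valid)
{
	if (!valid)
		return -EINVAL;

	fake_lock_acquire(l);
	/* ... apply the requested changes under the lock ... */
	fake_lock_release(l);
	return 0;
}
```

Calling `setlink_buggy()` with `valid == false` records one unbalanced release, while `setlink_fixed()` records none; both behave identically on the success path.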

syzbot

Apr 7, 2025, 4:13:05 AM
to and...@lunn.ch, da...@davemloft.net, edum...@google.com, ho...@kernel.org, ku...@kernel.org, kun...@amazon.com, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, s...@fomichev.me, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
unregister_netdevice: waiting for DEV to become free

unregister_netdevice: waiting for batadv0 to become free. Usage count = 3


Tested on:

commit: 61f96e68 Merge tag 'net-6.15-rc1' of git://git.kernel...
git tree: net
console output: https://syzkaller.appspot.com/x/log.txt?x=111c523f980000
kernel config: https://syzkaller.appspot.com/x/.config?x=f2054704dd53fb80
dashboard link: https://syzkaller.appspot.com/bug?extid=45016fe295243a7882d3
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=12f3bd98580000

Stanislav Fomichev

Apr 7, 2025, 10:19:57 AM
to syzbot, and...@lunn.ch, da...@davemloft.net, edum...@google.com, ho...@kernel.org, ku...@kernel.org, kun...@amazon.com, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, s...@fomichev.me, syzkall...@googlegroups.com
On 04/07, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> unregister_netdevice: waiting for DEV to become free
>
> unregister_netdevice: waiting for batadv0 to become free. Usage count = 3

So it does fix the lock imbalance issue, but now there is a hang?

Kuniyuki Iwashima

Apr 7, 2025, 12:13:25 PM
to stfom...@gmail.com, and...@lunn.ch, da...@davemloft.net, edum...@google.com, ho...@kernel.org, ku...@kernel.org, kun...@amazon.com, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, s...@fomichev.me, syzbot+45016f...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
From: Stanislav Fomichev <stfom...@gmail.com>
Date: Mon, 7 Apr 2025 07:19:54 -0700
I think this is an orthogonal issue.

I saw this in another report as well.
https://lore.kernel.org/netdev/67f208ea.050a02...@google.com/

syzbot may want to find a better way to filter this kind of noise.

Aleksandr Nogikh

Apr 8, 2025, 4:11:36 AM
to Kuniyuki Iwashima, Dmitry Vyukov, stfom...@gmail.com, and...@lunn.ch, da...@davemloft.net, edum...@google.com, ho...@kernel.org, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, s...@fomichev.me, syzbot+45016f...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
Syzbot has treated this message as a problem worth reporting for a
long time (Cc'd Dmitry, who may remember the context):
https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396

Since v6.15-rc1, we have been seeing it at least 10x more often than
before, both during fuzzing and while processing #syz test commands:
https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84

--
Aleksandr

Dmitry Vyukov

Apr 8, 2025, 6:44:29 AM
to Aleksandr Nogikh, Kuniyuki Iwashima, stfom...@gmail.com, and...@lunn.ch, da...@davemloft.net, edum...@google.com, ho...@kernel.org, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, s...@fomichev.me, syzbot+45016f...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
IIUC this error means a reference count on a device was leaked: the
device and everything it references are leaked forever, and a kernel
thread loops forever. This does not look like noise.

Eric should know more. Eric fixed a bunch of these bugs and added a
ref count tracker to devices to provide better diagnostics. For some
reason I don't see the reftracker output in the console output, but
CONFIG_NET_DEV_REFCNT_TRACKER=y is enabled in the config.

Eric Dumazet

Apr 8, 2025, 7:33:30 AM
to Dmitry Vyukov, Aleksandr Nogikh, Kuniyuki Iwashima, stfom...@gmail.com, and...@lunn.ch, da...@davemloft.net, ho...@kernel.org, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, s...@fomichev.me, syzbot+45016f...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
I think that Kuniyuki's patch fixed the original syzbot report.

After fixing this trivial bug, another bug showed up,
and this second bug triggered the "syzbot may want to find a better way to
filter this kind of noise." comment.


-ETOOMANYBUGS.

Aleksandr Nogikh

Apr 8, 2025, 4:16:34 PM
to Eric Dumazet, Dmitry Vyukov, Kuniyuki Iwashima, stfom...@gmail.com, and...@lunn.ch, da...@davemloft.net, ho...@kernel.org, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, s...@fomichev.me, syzbot+45016f...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
FWIW I've just bisected the recent spike in "unregister_netdevice:
waiting for batadv0 to become free" and git bisect pointed to:

00b35530811f2aa3d7ceec2dbada80861c7632a8
Author: Eric Dumazet <edum...@google.com>
Date: Thu Feb 6 14:04:22 2025 +0000

batman-adv: adopt netdev_hold() / netdev_put()

Add a device tracker to struct batadv_hard_iface to help
debugging of network device refcount imbalances.


Eric, could you please have a look?

>
>
> -ETOOMANYBUGS.

Eric Dumazet

Apr 8, 2025, 4:41:46 PM
to Aleksandr Nogikh, Sven Eckelmann, Dmitry Vyukov, Kuniyuki Iwashima, stfom...@gmail.com, and...@lunn.ch, da...@davemloft.net, ho...@kernel.org, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, s...@fomichev.me, syzbot+45016f...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
My original patch was:
https://lore.kernel.org/netdev/CANn89i+ySFS5C24guM9E9UsP...@mail.gmail.com/T/

I think it was correct.

Then Sven added code to it instead of sending a separate patch.

I guess a fix would be:

diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index f145f96626531053bbf8f58a31f28f625a9d80f9..7cd4bdcee43935b9e5fb7d1696430909b7af67b4 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -725,7 +725,6 @@ int batadv_hardif_enable_interface(struct batadv_hard_iface *hard_iface,

 	kref_get(&hard_iface->refcount);

-	dev_hold(mesh_iface);
 	netdev_hold(mesh_iface, &hard_iface->meshif_dev_tracker, GFP_ATOMIC);
 	hard_iface->mesh_iface = mesh_iface;
 	bat_priv = netdev_priv(hard_iface->mesh_iface);
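
The diff above removes a `dev_hold()` that sits directly before a `netdev_hold()` on the same device. Since `netdev_hold()` itself takes a reference, the pair takes two. The following userspace sketch shows why that would leave the count elevated, which matches the "waiting for batadv0 to become free" symptom. It is an illustration only: `struct fake_dev` and the helper names are hypothetical, and it assumes the disable path drops exactly one reference (for example via a single `netdev_put()`), which this thread does not show.

```c
#include <assert.h>

/* Hypothetical stand-in for a net_device reference count. */
struct fake_dev {
	int refcnt;
};

void fake_hold(struct fake_dev *d)
{
	d->refcnt++;
}

void fake_put(struct fake_dev *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Enable path before the suggested fix: two references are taken. */
void enable_double_hold(struct fake_dev *mesh)
{
	fake_hold(mesh);  /* plays the role of dev_hold() */
	fake_hold(mesh);  /* plays the role of netdev_hold() */
}

/* Enable path after the suggested fix: one tracked reference. */
void enable_single_hold(struct fake_dev *mesh)
{
	fake_hold(mesh);  /* plays the role of netdev_hold() */
}

/* Disable path, assumed here to drop a single reference. */
void disable(struct fake_dev *mesh)
{
	fake_put(mesh);   /* plays the role of netdev_put() */
}
```

Starting from a count of 1, an enable/disable cycle through `enable_double_hold()` ends at 2, so the device can never be freed; through `enable_single_hold()` it returns to 1.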