[syzbot] WARNING in batadv_nc_mesh_free

6 views
Skip to first unread message

syzbot

unread,
Oct 21, 2021, 7:19:24ā€ÆPM10/21/21
to a...@unstable.cc, b.a.t...@lists.open-mesh.org, da...@davemloft.net, ku...@kernel.org, linux-...@vger.kernel.org, marekl...@neomailbox.ch, net...@vger.kernel.org, sv...@narfation.org, s...@simonwunderlich.de, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 2f111a6fd5b5 Merge tag 'ceph-for-5.15-rc7' of git://github..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=115750acb00000
kernel config: https://syzkaller.appspot.com/x/.config?x=d95853dad8472c91
dashboard link: https://syzkaller.appspot.com/bug?extid=28b0702ada0bf7381f58
compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1026ef2cb00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15c9c162b00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+28b070...@syzkaller.appspotmail.com

RBP: 00007ffef262e230 R08: 0000000000000002 R09: 00007fddc8003531
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
------------[ cut here ]------------
ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
WARNING: CPU: 0 PID: 6517 at lib/debugobjects.c:508 debug_print_object lib/debugobjects.c:505 [inline]
WARNING: CPU: 0 PID: 6517 at lib/debugobjects.c:508 debug_object_assert_init+0x1fa/0x250 lib/debugobjects.c:895
Modules linked in:
CPU: 0 PID: 6517 Comm: syz-executor011 Not tainted 5.15.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:debug_print_object lib/debugobjects.c:505 [inline]
RIP: 0010:debug_object_assert_init+0x1fa/0x250 lib/debugobjects.c:895
Code: e8 4b 15 b8 fd 4c 8b 45 00 48 c7 c7 a0 31 b4 8a 48 c7 c6 00 2e b4 8a 48 c7 c2 e0 33 b4 8a 31 c9 49 89 d9 31 c0 e8 b6 c6 36 fd <0f> 0b ff 05 3a 5c c5 09 48 83 c5 38 48 89 e8 48 c1 e8 03 42 80 3c
RSP: 0018:ffffc90002c7e698 EFLAGS: 00010046
RAX: cffa606352c78700 RBX: 0000000000000000 RCX: ffff888076ce9c80
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffffffff8a512d00 R08: ffffffff81693402 R09: ffffed1017383f2c
R10: ffffed1017383f2c R11: 0000000000000000 R12: dffffc0000000000
R13: ffff88801bcd1720 R14: 0000000000000002 R15: ffffffff90ba5a20
FS: 0000555557087300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f5473f3c000 CR3: 0000000070ca6000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
debug_timer_assert_init kernel/time/timer.c:739 [inline]
debug_assert_init kernel/time/timer.c:784 [inline]
del_timer+0xa5/0x3d0 kernel/time/timer.c:1204
try_to_grab_pending+0x151/0xbb0 kernel/workqueue.c:1270
__cancel_work_timer+0x14c/0x710 kernel/workqueue.c:3129
batadv_nc_mesh_free+0x4a/0xf0 net/batman-adv/network-coding.c:1869
batadv_mesh_free+0x6f/0x140 net/batman-adv/main.c:245
batadv_mesh_init+0x4e5/0x550 net/batman-adv/main.c:226
batadv_softif_init_late+0x8fe/0xd70 net/batman-adv/soft-interface.c:804
register_netdevice+0x826/0x1c30 net/core/dev.c:10229
__rtnl_newlink net/core/rtnetlink.c:3458 [inline]
rtnl_newlink+0x14b3/0x1d10 net/core/rtnetlink.c:3506
rtnetlink_rcv_msg+0x934/0xe60 net/core/rtnetlink.c:5572
netlink_rcv_skb+0x200/0x470 net/netlink/af_netlink.c:2510
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x814/0x9f0 net/netlink/af_netlink.c:1345
netlink_sendmsg+0xa29/0xe50 net/netlink/af_netlink.c:1935
sock_sendmsg_nosec net/socket.c:704 [inline]
sock_sendmsg net/socket.c:724 [inline]
____sys_sendmsg+0x5b9/0x910 net/socket.c:2409
___sys_sendmsg net/socket.c:2463 [inline]
__sys_sendmsg+0x36f/0x450 net/socket.c:2492
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fddc82bc7e9
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffef262e228 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fddc82bc7e9
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003
RBP: 00007ffef262e230 R08: 0000000000000002 R09: 00007fddc8003531
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Pavel Skripkin

unread,
Oct 22, 2021, 2:33:51ā€ÆPM10/22/21
to syzbot, a...@unstable.cc, b.a.t...@lists.open-mesh.org, da...@davemloft.net, ku...@kernel.org, linux-...@vger.kernel.org, marekl...@neomailbox.ch, net...@vger.kernel.org, sv...@narfation.org, s...@simonwunderlich.de, syzkall...@googlegroups.com
Looks like cancel_delayed_work_sync() is called before
INIT_DELAYED_WORK(), so calltrace looks like

batadv_mesh_init()
batadv_originator_init() <- injected allocation failure
batadv_mesh_free()
batadv_nc_mesh_free()
cancel_delayed_work_sync()


Quick fix can be moving INIT_DELAYED_WORK() from batadv_nc_init() to
batadv_mesh_init(), since there is complex dependencies between each
mech part, if I understood comments correctly


Just for thoughts and syzbot testing

#syz test
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master



With regards,
Pavel Skripkin


ph

syzbot

unread,
Oct 22, 2021, 4:20:09ā€ÆPM10/22/21
to a...@unstable.cc, b.a.t...@lists.open-mesh.org, da...@davemloft.net, ku...@kernel.org, linux-...@vger.kernel.org, marekl...@neomailbox.ch, net...@vger.kernel.org, paskr...@gmail.com, sv...@narfation.org, s...@simonwunderlich.de, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in batadv_nc_purge_paths

RBP: 00007fe7b40631d0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
R13: 00007ffe7ffd3def R14: 00007fe7b4063300 R15: 0000000000022000
general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
CPU: 1 PID: 9061 Comm: syz-executor.0 Not tainted 5.15.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:batadv_nc_purge_paths+0x38/0x3f0 net/batman-adv/network-coding.c:437
Code: 48 89 d3 49 89 f6 48 89 7c 24 58 49 bd 00 00 00 00 00 fc ff df e8 38 48 ab f7 4d 8d 7e 10 4c 89 f8 48 c1 e8 03 48 89 44 24 48 <42> 8a 04 28 84 c0 0f 85 88 03 00 00 41 8b 2f 31 ff 89 ee e8 20 4c
RSP: 0018:ffffc9000d04eac0 EFLAGS: 00010202
RAX: 0000000000000002 RBX: 0000000000000000 RCX: ffff888078270000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88807ec2cc80
RBP: 00000000fffffff4 R08: ffffffff8154e5b4 R09: ffffed100fd85adc
R10: ffffed100fd85adc R11: 0000000000000000 R12: ffff88807ec2cc80
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000010
FS: 00007fe7b4063700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f359172e000 CR3: 000000005e749000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
batadv_nc_mesh_free+0x7a/0xf0 net/batman-adv/network-coding.c:1869
batadv_mesh_free+0x6f/0x140 net/batman-adv/main.c:249
batadv_mesh_init+0x5b1/0x620 net/batman-adv/main.c:230
batadv_softif_init_late+0x8fe/0xd70 net/batman-adv/soft-interface.c:804
register_netdevice+0x826/0x1c30 net/core/dev.c:10229
__rtnl_newlink net/core/rtnetlink.c:3458 [inline]
rtnl_newlink+0x14b3/0x1d10 net/core/rtnetlink.c:3506
rtnetlink_rcv_msg+0x934/0xe60 net/core/rtnetlink.c:5572
netlink_rcv_skb+0x200/0x470 net/netlink/af_netlink.c:2510
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x814/0x9f0 net/netlink/af_netlink.c:1345
netlink_sendmsg+0xa29/0xe50 net/netlink/af_netlink.c:1935
sock_sendmsg_nosec net/socket.c:704 [inline]
sock_sendmsg net/socket.c:724 [inline]
____sys_sendmsg+0x5b9/0x910 net/socket.c:2409
___sys_sendmsg net/socket.c:2463 [inline]
__sys_sendmsg+0x36f/0x450 net/socket.c:2492
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fe7b48eda39
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fe7b4063188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fe7b49f0f60 RCX: 00007fe7b48eda39
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003
RBP: 00007fe7b40631d0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
R13: 00007ffe7ffd3def R14: 00007fe7b4063300 R15: 0000000000022000
Modules linked in:
---[ end trace 67ff054734964acf ]---
RIP: 0010:batadv_nc_purge_paths+0x38/0x3f0 net/batman-adv/network-coding.c:437
Code: 48 89 d3 49 89 f6 48 89 7c 24 58 49 bd 00 00 00 00 00 fc ff df e8 38 48 ab f7 4d 8d 7e 10 4c 89 f8 48 c1 e8 03 48 89 44 24 48 <42> 8a 04 28 84 c0 0f 85 88 03 00 00 41 8b 2f 31 ff 89 ee e8 20 4c
RSP: 0018:ffffc9000d04eac0 EFLAGS: 00010202
RAX: 0000000000000002 RBX: 0000000000000000 RCX: ffff888078270000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88807ec2cc80
RBP: 00000000fffffff4 R08: ffffffff8154e5b4 R09: ffffed100fd85adc
R10: ffffed100fd85adc R11: 0000000000000000 R12: ffff88807ec2cc80
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000010
FS: 00007fe7b4063700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc230f87020 CR3: 000000005e749000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 48 89 d3 mov %rdx,%rbx
3: 49 89 f6 mov %rsi,%r14
6: 48 89 7c 24 58 mov %rdi,0x58(%rsp)
b: 49 bd 00 00 00 00 00 movabs $0xdffffc0000000000,%r13
12: fc ff df
15: e8 38 48 ab f7 callq 0xf7ab4852
1a: 4d 8d 7e 10 lea 0x10(%r14),%r15
1e: 4c 89 f8 mov %r15,%rax
21: 48 c1 e8 03 shr $0x3,%rax
25: 48 89 44 24 48 mov %rax,0x48(%rsp)
* 2a: 42 8a 04 28 mov (%rax,%r13,1),%al <-- trapping instruction
2e: 84 c0 test %al,%al
30: 0f 85 88 03 00 00 jne 0x3be
36: 41 8b 2f mov (%r15),%ebp
39: 31 ff xor %edi,%edi
3b: 89 ee mov %ebp,%esi
3d: e8 .byte 0xe8
3e: 20 .byte 0x20
3f: 4c rex.WR


Tested on:

commit: 1d4590f5 Merge tag 'acpi-5.15-rc7' of git://git.kernel..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=156e86c4b00000
kernel config: https://syzkaller.appspot.com/x/.config?x=d95853dad8472c91
dashboard link: https://syzkaller.appspot.com/bug?extid=28b0702ada0bf7381f58
compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=16cb3daf300000

Pavel Skripkin

unread,
Oct 22, 2021, 4:57:17ā€ÆPM10/22/21
to syzbot, a...@unstable.cc, b.a.t...@lists.open-mesh.org, da...@davemloft.net, ku...@kernel.org, linux-...@vger.kernel.org, marekl...@neomailbox.ch, net...@vger.kernel.org, sv...@narfation.org, s...@simonwunderlich.de, syzkall...@googlegroups.com
On 10/22/21 23:20, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> general protection fault in batadv_nc_purge_paths


Oh, ok. Next clean up call in batadv_nc_mesh_free() caused GPF, since
fields are not initialized. Let's try to clean up one by one and do not
break dependencies.

Quite ugly one, but idea is correct, I guess

Also, make each *_init() call clean up all allocated stuff to not call
corresponding *_free() on error handling path, since it introduces
problems, as syzbot reported




With regards,
Pavel Skripkin

ph

Pavel Skripkin

unread,
Oct 22, 2021, 4:58:17ā€ÆPM10/22/21
to syzbot, a...@unstable.cc, b.a.t...@lists.open-mesh.org, da...@davemloft.net, ku...@kernel.org, linux-...@vger.kernel.org, marekl...@neomailbox.ch, net...@vger.kernel.org, sv...@narfation.org, s...@simonwunderlich.de, syzkall...@googlegroups.com
Whooops.... Forgot to ask syzbot to test the patch
ph

syzbot

unread,
Oct 23, 2021, 12:50:11ā€ÆAM10/23/21
to a...@unstable.cc, b.a.t...@lists.open-mesh.org, da...@davemloft.net, ku...@kernel.org, linux-...@vger.kernel.org, marekl...@neomailbox.ch, net...@vger.kernel.org, paskr...@gmail.com, sv...@narfation.org, s...@simonwunderlich.de, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+28b070...@syzkaller.appspotmail.com

Tested on:

commit: 9c0c4d24 Merge tag 'block-5.15-2021-10-22' of git://gi..
git tree: upstream
kernel config: https://syzkaller.appspot.com/x/.config?x=d95853dad8472c91
dashboard link: https://syzkaller.appspot.com/bug?extid=28b0702ada0bf7381f58
compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=1553d4c4b00000

Note: testing is done by a robot and is best-effort only.

Sven Eckelmann

unread,
Oct 23, 2021, 3:41:12ā€ÆAM10/23/21
to syzbot, a...@unstable.cc, b.a.t...@lists.open-mesh.org, da...@davemloft.net, ku...@kernel.org, linux-...@vger.kernel.org, marekl...@neomailbox.ch, net...@vger.kernel.org, s...@simonwunderlich.de, syzkall...@googlegroups.com, Pavel Skripkin, linus.l...@c0d3.blue
On Friday, 22 October 2021 22:58:15 CEST Pavel Skripkin wrote:
[...]
> > Oh, ok. Next clean up call in batadv_nc_mesh_free() caused GPF, since
> > fields are not initialized. Let's try to clean up one by one and do not
> > break dependencies.
> >
> > Quite ugly one, but idea is correct, I guess
> >
> > Also, make each *_init() call clean up all allocated stuff to not call
> > corresponding *_free() on error handling path, since it introduces
> > problems, as syzbot reported

Thanks for the patch + syzbot interactions. I just wanted to implement a
change - which would most likely have ended up the same way. Can you please
send it to netdev and Cc b.a.t...@lists.open-mesh.org? We don't have
anything else to submit at the moment for netdev and this patch can be applied
by netdev directly. I will add my Acked-by in this process.

Not sure about the Fixes. It is definitely wrong in the initial commit.... but
it got only really problematic when other features got introduced. I would
still say that the initial one should be mentioned.

Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")

@Linus, @Marek, @Antonio: Please check whether it is ok to move the
batadv_v_mesh_init after batadv_tt_init + batadv_originator_init.
batadv_v_mesh_init is basically there to initialize:

* bat_priv->bat_v.ogm_buff(|_len|_mutex)
* bat_priv->bat_v.ogm_seqno
* bat_priv->bat_v.ogm_wq

batadv_originator_init is there to initialize the

* bat_priv->orig_hash
* bat_priv->orig_work (batadv_purge_orig) + queue it up

batadv_tt_init is a lot more complex but should in theory not interact with
ogm specific algo ops.

I wouldn't know why there could be a problem but I would leave it to the
experts.

Kind regards,
Sven
signature.asc
Reply all
Reply to author
Forward
0 new messages