WARNING in xfrm_state_fini (2)

syzbot

Feb 4, 2018, 6:30:04 AM
to da...@davemloft.net, her...@gondor.apana.org.au, linux-...@vger.kernel.org, net...@vger.kernel.org, steffen....@secunet.com, syzkall...@googlegroups.com
Hello,

syzbot hit the following crash on upstream commit
4bf772b14675411a69b3c807f73006de0fe4b649 (Fri Feb 2 01:48:47 2018 +0000)
Merge tag 'drm-for-v4.16' of git://people.freedesktop.org/~airlied/linux

Unfortunately, I don't have any reproducer for this crash yet.
Raw console output is attached.
compiler: gcc (GCC) 7.1.1 20170620
.config is attached.
user-space arch: i386

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+0bf051...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for
details.
If you forward the report, please keep this part and the footer.

netlink: 3 bytes leftover after parsing attributes in process `syz-executor2'.
WARNING: CPU: 0 PID: 28 at net/xfrm/xfrm_state.c:2341 xfrm_state_fini+0x46a/0x620 net/xfrm/xfrm_state.c:2341
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 28 Comm: kworker/u4:2 Not tainted 4.15.0+ #202
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x211/0x2d0 lib/bug.c:184
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1097
RIP: 0010:xfrm_state_fini+0x46a/0x620 net/xfrm/xfrm_state.c:2341
RSP: 0000:ffff8801d993f148 EFLAGS: 00010293
RAX: ffff8801d9926000 RBX: ffff8801b899a100 RCX: ffffffff84be327a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff86ac93f8
RBP: ffff8801d993f2a0 R08: 1ffff1003b327dbc R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1003b327e2b
R13: ffff8801d993f278 R14: 1ffff1003b327e2f R15: ffff8801b899b500
xfrm_net_exit+0x25/0x70 net/xfrm/xfrm_policy.c:2978
ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:142
cleanup_net+0x6a3/0xcc0 net/core/net_namespace.c:517
process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:542
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzk...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is
merged into any tree, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Note: all commands must start from beginning of the line in the email body.
raw.log.txt
config.txt

syzbot

Feb 26, 2018, 6:08:03 PM
to da...@davemloft.net, her...@gondor.apana.org.au, linux-...@vger.kernel.org, net...@vger.kernel.org, steffen....@secunet.com, syzkall...@googlegroups.com
syzbot has found reproducer for the following crash on net-next commit
ba6056a41cb09575a5ffe2fcfa9a0afb1b60eb92 (Mon Feb 26 15:37:24 2018 +0000)
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

So far this crash happened 165 times on net-next, upstream.
C reproducer is attached.
syzkaller reproducer is attached.
Raw console output is attached.
compiler: gcc (GCC) 7.1.1 20170620
.config is attached.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+0bf051...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed.

WARNING: CPU: 1 PID: 21 at net/xfrm/xfrm_state.c:2341 xfrm_state_fini+0x46a/0x620 net/xfrm/xfrm_state.c:2341
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 21 Comm: kworker/u4:1 Not tainted 4.16.0-rc2+ #242
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x211/0x2d0 lib/bug.c:184
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x58/0x80 arch/x86/entry/entry_64.S:957
RIP: 0010:xfrm_state_fini+0x46a/0x620 net/xfrm/xfrm_state.c:2341
RSP: 0018:ffff8801d9447150 EFLAGS: 00010293
RAX: ffff8801d9436580 RBX: ffff8801cd526200 RCX: ffffffff84e9ea7a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff86ec96b8
RBP: ffff8801d94472a8 R08: 1ffff1003b288dbd R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1003b288e2c
R13: ffff8801d9447280 R14: 1ffff1003b288e30 R15: ffff8801cd527600
xfrm_net_exit+0x25/0x70 net/xfrm/xfrm_policy.c:2978
ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:146
cleanup_net+0x5a3/0xbf0 net/core/net_namespace.c:539
process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:407
raw.log.txt
repro.syz.txt
repro.c.txt
config.txt

Jason Litzinger

Jun 18, 2018, 12:14:06 AM
to syzkaller-bugs
I've simplified the reproducer provided by syzbot to the attached
version.  The warning reproduces 100% of the time using the qemu image
from the syzkaller docs, running the latest upstream and net trees.

As noted on the dashboard, this is similar to [1] in that an entry
remains in the xfrm_state_walk list.  It differs in that the protocol
is not 0 but 43, IPPROTO_ROUTING, which the fix for [1] accepts as
valid (see 6a53b7593233).

Unfortunately, when a namespace exits, xfrm_state_fini only flushes
IPsec protocols.  I don't have enough experience with the xfrm
subsystem to know whether this is correct; however, dc00a525603650a14
explicitly allows non-IPsec protocols, as well as 0 for "all".

Would it be more appropriate for the flush to also cover the non-IPsec
protocols allowed in xfrm_user.c:validate_tmpl (explicitly or with 0)?
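
To make the mismatch described above concrete, here is a small user-space
model of it.  It is only a sketch and not kernel code: it assumes the flush
matches protocols the way this message describes (the IPsec "any" selector
covers AH, ESP and IPCOMP only, while 0 matches everything).  The PROTO_*
names, proto_match() and flush_states() are hypothetical stand-ins invented
for the illustration, and the state list is reduced to an array of protocol
numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Protocol numbers, mirroring the IPPROTO_* / IPSEC_PROTO_ANY values. */
enum {
    PROTO_ROUTING = 43,   /* IPPROTO_ROUTING, the leftover entry's protocol */
    PROTO_ESP     = 50,   /* IPPROTO_ESP */
    PROTO_AH      = 51,   /* IPPROTO_AH */
    PROTO_COMP    = 108,  /* IPPROTO_COMP */
    PROTO_ANY     = 255   /* IPSEC_PROTO_ANY */
};

/*
 * Assumed matching rule (a model, not the kernel function itself):
 * 0 matches every protocol, an exact value matches itself, and
 * PROTO_ANY matches only the three IPsec protocols.
 */
static bool proto_match(int proto, int userproto)
{
    return userproto == 0 || proto == userproto ||
           (userproto == PROTO_ANY &&
            (proto == PROTO_AH || proto == PROTO_ESP || proto == PROTO_COMP));
}

/*
 * Hypothetical flush: drop every entry that matches userproto,
 * compact the survivors to the front, and return how many remain.
 */
static size_t flush_states(int *protos, size_t n, int userproto)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (!proto_match(protos[i], userproto))
            protos[kept++] = protos[i];
    }
    return kept;
}
```

In this model, flushing a list holding ESP, ROUTING and AH entries with
PROTO_ANY leaves the ROUTING entry behind, which corresponds to the
non-empty list that trips the warning at namespace exit.  Flushing the same
list with 0 leaves nothing behind.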

If someone with more experience with the subsystem believes that to be
the case, I'm happy to send a patch (against net or ipsec?); otherwise
I'm going to keep digging to see if a better option presents itself.

Regardless I hope the simplified reproducer might be useful.

-Jason

[1] https://syzkaller.appspot.com/bug?id=c922592229951800c197ce48a5eaab8877c33723

* I wasn't subscribed to the list for the original message, so I'm
  using the GUI to reply...apologies if anything is mangled.
simple.c

Dmitry Vyukov

Jun 18, 2018, 3:25:53 AM
to Jason Litzinger, David Miller, Herbert Xu, LKML, netdev, Steffen Klassert, syzkaller-bugs
+kernel developers back to CC

Jason did some debugging of this bug and has some questions about the
best way to proceed. Please read the above.

Steffen Klassert

Jun 19, 2018, 8:28:35 AM
to Dmitry Vyukov, Jason Litzinger, David Miller, Herbert Xu, LKML, netdev, syzkaller-bugs
Thanks for the info!

I'm traveling currently, so it may take some time
until I can look at it.