WARNING in mark_lock

syzbot

Jun 24, 2019, 9:37:06 PM
to linux-...@vger.kernel.org, syzkall...@googlegroups.com, tg...@linutronix.de
Hello,

syzbot found the following crash on:

HEAD commit: dc636f5d Add linux-next specific files for 20190620
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=162b68b1a00000
kernel config: https://syzkaller.appspot.com/x/.config?x=99c104b0092a557b
dashboard link: https://syzkaller.appspot.com/bug?extid=a861f52659ae2596492b
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=110b24f6a00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+a861f5...@syzkaller.appspotmail.com

------------[ cut here ]------------
DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 0 PID: 9968 at kernel/locking/lockdep.c:167 hlock_class kernel/locking/lockdep.c:167 [inline]
WARNING: CPU: 0 PID: 9968 at kernel/locking/lockdep.c:167 hlock_class kernel/locking/lockdep.c:156 [inline]
WARNING: CPU: 0 PID: 9968 at kernel/locking/lockdep.c:167 mark_lock+0x22b/0x11e0 kernel/locking/lockdep.c:3594
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 9968 Comm: syz-executor.2 Not tainted 5.2.0-rc5-next-20190620 #19
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
panic+0x2dc/0x755 kernel/panic.c:219
__warn.cold+0x20/0x4c kernel/panic.c:576
report_bug+0x263/0x2b0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:179 [inline]
fixup_bug arch/x86/kernel/traps.c:174 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:986
RIP: 0010:hlock_class kernel/locking/lockdep.c:167 [inline]
RIP: 0010:hlock_class kernel/locking/lockdep.c:156 [inline]
RIP: 0010:mark_lock+0x22b/0x11e0 kernel/locking/lockdep.c:3594
Code: d0 7c 08 84 d2 0f 85 33 0f 00 00 44 8b 15 4d 14 4a 08 45 85 d2 75 b6 48 c7 c6 c0 a6 8b 87 48 c7 c7 00 a7 8b 87 e8 ad e6 eb ff <0f> 0b 31 db e9 a8 fe ff ff 48 c7 c7 80 71 86 8a e8 f0 95 53 00 e9
RSP: 0018:ffff8880ae809ad0 EFLAGS: 00010082
RAX: 0000000000000000 RBX: 0000000000000f1d RCX: 0000000000000000
RDX: 0000000000010000 RSI: ffffffff815b37e6 RDI: ffffed1015d0134c
RBP: ffff8880ae809b20 R08: ffff88808662e0c0 R09: fffffbfff11b3285
R10: fffffbfff11b3284 R11: ffffffff88d99423 R12: 0000000000000000
R13: ffff88808662e9c8 R14: 000000000000004f R15: 00000000000c4f1d
mark_usage kernel/locking/lockdep.c:3485 [inline]
__lock_acquire+0x1e1a/0x4680 kernel/locking/lockdep.c:3839
lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4418
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159
try_to_wake_up+0x90/0x1430 kernel/sched/core.c:2000
wake_up_process+0x10/0x20 kernel/sched/core.c:2114
hrtimer_wakeup+0x48/0x60 kernel/time/hrtimer.c:1636
__run_hrtimer kernel/time/hrtimer.c:1388 [inline]
__hrtimer_run_queues+0x364/0xe40 kernel/time/hrtimer.c:1450
hrtimer_interrupt+0x314/0x770 kernel/time/hrtimer.c:1508
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1041 [inline]
smp_apic_timer_interrupt+0x12a/0x5b0 arch/x86/kernel/apic/apic.c:1066
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:806
</IRQ>
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

Thomas Gleixner

Jun 25, 2019, 2:20:58 AM
to syzbot, LKML, syzkall...@googlegroups.com, Peter Zijlstra
On Mon, 24 Jun 2019, syzbot wrote:

> Hello,

CC++ Peterz
^^^^^^^^^^^^^^

Eric Biggers

Jun 25, 2019, 3:29:45 AM
to b...@vger.kernel.org, syzbot, LKML, syzkall...@googlegroups.com, Thomas Gleixner, Peter Zijlstra
[+bpf list]

On Tue, Jun 25, 2019 at 08:20:56AM +0200, Thomas Gleixner wrote:
> On Mon, 24 Jun 2019, syzbot wrote:
>
> > Hello,
>
> CC++ Peterz
>
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: dc636f5d Add linux-next specific files for 20190620
> > git tree: linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=162b68b1a00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=99c104b0092a557b
> > dashboard link: https://syzkaller.appspot.com/bug?extid=a861f52659ae2596492b
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=110b24f6a00000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+a861f5...@syzkaller.appspotmail.com

The syz repro looks bpf related, and essentially the same repro is in lots of
other open syzbot reports which I've assigned to the bpf subsystem...
https://lore.kernel.org/lkml/20190624050...@sol.localdomain/

{"threaded":true,"repeat":true,"procs":6,"sandbox":"none","fault_call":-1,"tun":true,"netdev":true,"resetnet":true,"cgroups":true,"binfmt_misc":true,"close_fds":true,"tmpdir":true,"segv":true}
bpf$MAP_CREATE(0x0, &(0x7f0000000280)={0xf, 0x4, 0x4, 0x400, 0x0, 0x1}, 0x3c)
socket$rxrpc(0x21, 0x2, 0x800000000a)
r0 = socket$inet6_tcp(0xa, 0x1, 0x0)
setsockopt$inet6_tcp_int(r0, 0x6, 0x13, &(0x7f00000000c0)=0x100000001, 0x1d4)
connect$inet6(r0, &(0x7f0000000140), 0x1c)
bpf$MAP_CREATE(0x0, &(0x7f0000000000)={0x5}, 0xfffffffffffffdcb)
bpf$MAP_CREATE(0x2, &(0x7f0000003000)={0x3, 0x0, 0x77fffb, 0x0, 0x10020000000, 0x0}, 0x2c)
setsockopt$inet6_tcp_TCP_ULP(r0, 0x6, 0x1f, &(0x7f0000000040)='tls\x00', 0x4)

Peter Zijlstra

Jun 25, 2019, 7:03:05 AM
to Thomas Gleixner, syzbot, LKML, syzkall...@googlegroups.com
On Tue, Jun 25, 2019 at 08:20:56AM +0200, Thomas Gleixner wrote:
That's trying to acquire p->pi_lock, and the DEBUG_LOCKS_WARN_ON() that
triggers has the comment:

/*
* Someone passed in garbage, we give up.
*/

Which seems to indicate we triggered some use-after-free or other
corruption scenario (@p is buggered in any case).
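
As a rough illustration of that check (a simplified userspace model in Python, not the kernel code): a held lock carries only an index into a table of registered lock classes, and the lookup gives up when the index does not refer to a class that is in use. The names `LockClassTable`, `register`, and `lookup` are hypothetical and exist only for this sketch.

```python
class LockClassTable:
    """Toy model of a lock-class table indexed by small integers."""

    def __init__(self, size):
        self.classes = [None] * size
        self.in_use = [False] * size

    def register(self, idx, name):
        """Mark slot idx as holding a valid lock class."""
        self.classes[idx] = name
        self.in_use[idx] = True

    def lookup(self, idx):
        """Return the class name, or None if idx is garbage."""
        if not (0 <= idx < len(self.in_use)) or not self.in_use[idx]:
            # Someone passed in garbage, we give up.
            return None
        return self.classes[idx]


table = LockClassTable(size=8)
table.register(3, "&p->pi_lock")

print(table.lookup(3))      # a registered class
print(table.lookup(0xf1d))  # an index that was never registered
```

In this model a stale or overwritten index is caught only because it happens to point at an unused slot, which is why the warning suggests corruption rather than pinpointing its source.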

> > hrtimer_wakeup+0x48/0x60 kernel/time/hrtimer.c:1636
> > __run_hrtimer kernel/time/hrtimer.c:1388 [inline]
> > __hrtimer_run_queues+0x364/0xe40 kernel/time/hrtimer.c:1450
> > hrtimer_interrupt+0x314/0x770 kernel/time/hrtimer.c:1508
> > local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1041 [inline]
> > smp_apic_timer_interrupt+0x12a/0x5b0 arch/x86/kernel/apic/apic.c:1066
> > apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:806
> > </IRQ>
> > Kernel Offset: disabled
> > Rebooting in 86400 seconds..

If we look at the 'console output' provided above, we'll see that
there's two CPUs going *splat* at the same time, the above is first, but
the second does:

[ 93.859778][ T9977] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 93.869482][ T9977] general protection fault: 0000 [#1] PREEMPT SMP KASAN
[ 93.881836][ T9977] CPU: 1 PID: 9977 Comm: syz-executor.4 Not tainted 5.2.0-rc5-next-20190620 #19

[ 94.102920][ T9977] do_exit+0x81b/0x2f60

which also indicates things went sideways fast.

I'm not sure I've got enough hints to figure out where it goes
sideways, but it sure did.

Maybe someone forgot to cancel a timer before freeing?
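
To spell out that hypothesis, here is a deterministic toy model in Python (an assumption-laden sketch, not the hrtimer API): a timer callback holds a reference to a task, and if the task is released while the timer is still armed, the callback later touches a dead object. `Task`, `TimerQueue`, and `free_task` are hypothetical names invented for this sketch.

```python
class Task:
    def __init__(self, name):
        self.name = name
        self.freed = False
        self.wakeups = 0

    def wake_up(self):
        if self.freed:
            raise RuntimeError("use-after-free: woke freed task " + self.name)
        self.wakeups += 1


class TimerQueue:
    """Toy timer queue; callbacks run when run_expired() reaches their deadline."""

    def __init__(self):
        self.pending = []  # list of (deadline, callback) entries

    def arm(self, deadline, callback):
        entry = (deadline, callback)
        self.pending.append(entry)
        return entry

    def cancel(self, entry):
        # Drop the entry by identity; cancelling twice is harmless.
        self.pending = [e for e in self.pending if e is not entry]

    def run_expired(self, now):
        expired = [e for e in self.pending if e[0] <= now]
        self.pending = [e for e in self.pending if e[0] > now]
        for _, callback in expired:
            callback()


def free_task(task, queue, timer, cancel_first):
    """Release the task, optionally cancelling its timer beforehand."""
    if cancel_first:
        queue.cancel(timer)
    task.freed = True


queue = TimerQueue()
buggy = Task("buggy")
timer = queue.arm(10, buggy.wake_up)
free_task(buggy, queue, timer, cancel_first=False)
try:
    queue.run_expired(now=10)
except RuntimeError as err:
    print(err)
```

With `cancel_first=True` the expired-timer pass finds nothing to run and the freed task is never touched.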

Peter Zijlstra

Jun 25, 2019, 7:06:12 AM
to Thomas Gleixner, syzbot, LKML, syzkall...@googlegroups.com
On Tue, Jun 25, 2019 at 01:03:01PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 25, 2019 at 08:20:56AM +0200, Thomas Gleixner wrote:
> > On Mon, 24 Jun 2019, syzbot wrote:
>
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: dc636f5d Add linux-next specific files for 20190620
> > > git tree: linux-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=162b68b1a00000

syzkaller folks; why doesn't the above link include the actual kernel
boot, but only the userspace bits starting at syzkaller start?

I was trying to figure out the setup, but there's not enough information
here.

Dmitry Vyukov

Jun 25, 2019, 8:07:56 AM
to Peter Zijlstra, Thomas Gleixner, syzbot, LKML, syzkaller-bugs
Hi Peter,

Usually there is too much after-boot output, so boot output is evicted
anyway even if it was preserved initially. Also, it's usually not
important (this is the first time this has come up). Structurally, boot
is also a separate procedure in the syzkaller VM abstraction: a machine
is booted and its output is analyzed for potential crashes; the machine
is then considered to be in a known good state, some workload is started
as a separate procedure, and output capturing starts again from that
point.
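
The two-phase capture described above can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not syzkaller's actual implementation: the crash markers, the `max_kept` buffer size, and the function names are all hypothetical.

```python
from collections import deque


def analyze_for_crash(lines):
    """Return True if any line contains a (hypothetical) crash marker."""
    markers = ("WARNING:", "BUG:", "general protection fault", "Kernel panic")
    return any(m in line for line in lines for m in markers)


def run_instance(boot_lines, workload_lines, max_kept=4):
    """Model one VM run; returns (status, captured_output)."""
    # Phase 1: boot. The output is only checked for crashes.
    if analyze_for_crash(boot_lines):
        return "boot crash", list(boot_lines)

    # Phase 2: workload. Capturing restarts here with a bounded buffer,
    # so the oldest lines are evicted once more than max_kept arrive.
    captured = deque(maxlen=max_kept)
    for line in workload_lines:
        captured.append(line)
        if analyze_for_crash([line]):
            return "workload crash", list(captured)
    return "ok", list(captured)


boot = ["Linux version 5.2.0", "Command line: root=/dev/sda"]
workload = ["executing program"] * 5 + ["WARNING: CPU: 0 PID: 9968"]
status, log = run_instance(boot, workload)
print(status)
print(log)
```

In this model the boot lines never reach the crash log, which matches the effect seen in the linked console output: the kernel command line printed during boot is not part of what gets reported.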

What info are you interested in? Can it be obtained after boot?
Perhaps I can give it to you now. And there is also this long-standing request:
https://github.com/google/syzkaller/issues/466
to collect some kind of "machine info" along with crashes. Perhaps we
need to add the info you are looking for to that list.

Peter Zijlstra

Jun 25, 2019, 10:01:33 AM
to Dmitry Vyukov, Thomas Gleixner, syzbot, LKML, syzkaller-bugs
On Tue, Jun 25, 2019 at 02:07:42PM +0200, Dmitry Vyukov wrote:
> On Tue, Jun 25, 2019 at 1:06 PM Peter Zijlstra <pet...@infradead.org> wrote:
> >
> > On Tue, Jun 25, 2019 at 01:03:01PM +0200, Peter Zijlstra wrote:
> > > On Tue, Jun 25, 2019 at 08:20:56AM +0200, Thomas Gleixner wrote:
> > > > On Mon, 24 Jun 2019, syzbot wrote:
> > >
> > > > > syzbot found the following crash on:
> > > > >
> > > > > HEAD commit: dc636f5d Add linux-next specific files for 20190620
> > > > > git tree: linux-next
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=162b68b1a00000
> >
> > syzkaller folks; why doesn't the above link include the actual kernel
> > boot, but only the userspace bits starting at syzkaller start?
> >
> > I was trying to figure out the setup, but there's not enough information
> > here.
>
> Hi Peter,
>
> Usually there is too much after-boot output, so boot output is evicted
> anyway even if it was preserved initially. Also, it's usually not
> important (this is the first time this has come up). Structurally, boot
> is also a separate procedure in the syzkaller VM abstraction: a machine
> is booted and its output is analyzed for potential crashes; the machine
> is then considered to be in a known good state, some workload is started
> as a separate procedure, and output capturing starts again from that
> point.

Ah, for my own machines I spool all serial console output to a file,
everything is preserved until logrotate kills it after a week or so.

There is no distinction between boot and anything else, everything that
goes to serial (and I make sure everything does) lands together.

> What info are you interested in? Can it be obtained after boot?

I was interested in the kernel commandline; and in particular the
nohz_full configuration if any.
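
For reference, on a running Linux system the command line can be read back from /proc/cmdline after boot. A minimal Python sketch for checking a parameter such as nohz_full follows; the helper names and the sample command line are made up for illustration, and the naive whitespace split does not handle quoted values containing spaces.

```python
def cmdline_param(cmdline, name):
    """Return the value of `name` in a kernel command line string.

    Returns None if the parameter is absent and "" for a bare flag
    without "=value".
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name:
            return value if sep else ""
    return None


def read_cmdline(path="/proc/cmdline"):
    """Read the booted kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Hypothetical sample; on a live machine use read_cmdline() instead.
sample = "root=/dev/sda console=ttyS0 nohz_full=1-3 quiet"
print(cmdline_param(sample, "nohz_full"))
print(cmdline_param(sample, "isolcpus"))
```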

Dmitry Vyukov

Jun 25, 2019, 10:10:22 AM
to Peter Zijlstra, Thomas Gleixner, syzbot, LKML, syzkaller-bugs
We don't pass nohz_full or anything similar. The command line
arguments are available here:
https://github.com/google/syzkaller/blob/master/dashboard/config/upstream-apparmor.cmdline
and a few more are set here:
https://github.com/google/syzkaller/blob/master/tools/create-gce-image.sh#L211

syzbot

Jun 27, 2019, 6:03:02 PM
to a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, dvy...@google.com, ebig...@kernel.org, john.fa...@gmail.com, linux-...@vger.kernel.org, net...@vger.kernel.org, pet...@infradead.org, syzkall...@googlegroups.com, tg...@linutronix.de
syzbot has bisected this bug to:

commit e9db4ef6bf4ca9894bb324c76e01b8f1a16b2650
Author: John Fastabend <john.fa...@gmail.com>
Date: Sat Jun 30 13:17:47 2018 +0000

bpf: sockhash fix omitted bucket lock in sock_close

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1436e7e9a00000
start commit: dc636f5d Add linux-next specific files for 20190620
git tree: linux-next
final crash: https://syzkaller.appspot.com/x/report.txt?x=1636e7e9a00000
console output: https://syzkaller.appspot.com/x/log.txt?x=1236e7e9a00000
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=110b24f6a00000

Reported-by: syzbot+a861f5...@syzkaller.appspotmail.com
Fixes: e9db4ef6bf4c ("bpf: sockhash fix omitted bucket lock in sock_close")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection
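
The Fixes: line above follows the usual kernel convention of an abbreviated commit hash (12 characters here) followed by the quoted subject line. A small Python sketch of that formatting, with `fixes_tag` being a hypothetical helper name:

```python
def fixes_tag(full_sha, subject, abbrev=12):
    """Format a Fixes: tag from a full commit hash and its subject line."""
    return 'Fixes: {} ("{}")'.format(full_sha[:abbrev], subject)


print(fixes_tag(
    "e9db4ef6bf4ca9894bb324c76e01b8f1a16b2650",
    "bpf: sockhash fix omitted bucket lock in sock_close",
))
```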

John Fastabend

Jul 1, 2019, 1:32:14 AM
to Eric Biggers, b...@vger.kernel.org, syzbot, LKML, syzkall...@googlegroups.com, Thomas Gleixner, Peter Zijlstra
#syz test: git://github.com/cilium/linux ktls-unhash

syzbot

Jul 1, 2019, 1:51:01 AM
to b...@vger.kernel.org, ebig...@kernel.org, john.fa...@gmail.com, linux-...@vger.kernel.org, pet...@infradead.org, syzkall...@googlegroups.com, tg...@linutronix.de
Hello,

syzbot has tested the proposed patch but the reproducer still triggered
a crash:
KASAN: use-after-free Read in class_equal

==================================================================
BUG: KASAN: use-after-free in class_equal+0x40/0x50 kernel/locking/lockdep.c:1527
Read of size 8 at addr ffff88808a268ba0 by task syz-executor.1/9270

CPU: 0 PID: 9270 Comm: syz-executor.1 Not tainted 5.2.0-rc3+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:

Allocated by task 2647419968:
BUG: unable to handle page fault for address: ffffffff8c00b020
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 8a70067 P4D 8a70067 PUD 8a71063 PMD 0
Thread overran stack, or stack corrupted
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 9270 Comm: syz-executor.1 Not tainted 5.2.0-rc3+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:stack_depot_fetch+0x10/0x30 lib/stackdepot.c:203
Code: e9 7b fd ff ff 4c 89 ff e8 8d b4 62 fe e9 e6 fd ff ff 90 90 90 90 90 90 90 90 89 f8 c1 ef 11 25 ff ff 1f 00 81 e7 f0 3f 00 00 <48> 03 3c c5 20 6c 04 8b 48 8d 47 18 48 89 06 8b 47 0c c3 0f 1f 00
RSP: 0018:ffff88808a2688e8 EFLAGS: 00010006
RAX: 00000000001f8880 RBX: ffff88808a269304 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88808a2688f0 RDI: 0000000000003ff0
RBP: ffff88808a268908 R08: 0000000000000020 R09: ffffed1015d044fa
R10: ffffed1015d044f9 R11: ffff8880ae8227cf R12: ffffea0002289a00
R13: ffff88808a268ba0 R14: ffff8880aa58ec40 R15: ffff88808a269300
FS: 00005555570ba940(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff8c00b020 CR3: 000000008dd00000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
Modules linked in:
CR2: ffffffff8c00b020
---[ end trace 4acfe4b59fbc9cdb ]---
RIP: 0010:stack_depot_fetch+0x10/0x30 lib/stackdepot.c:203
Code: e9 7b fd ff ff 4c 89 ff e8 8d b4 62 fe e9 e6 fd ff ff 90 90 90 90 90 90 90 90 89 f8 c1 ef 11 25 ff ff 1f 00 81 e7 f0 3f 00 00 <48> 03 3c c5 20 6c 04 8b 48 8d 47 18 48 89 06 8b 47 0c c3 0f 1f 00
RSP: 0018:ffff88808a2688e8 EFLAGS: 00010006
RAX: 00000000001f8880 RBX: ffff88808a269304 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88808a2688f0 RDI: 0000000000003ff0
RBP: ffff88808a268908 R08: 0000000000000020 R09: ffffed1015d044fa
R10: ffffed1015d044f9 R11: ffff8880ae8227cf R12: ffffea0002289a00
R13: ffff88808a268ba0 R14: ffff8880aa58ec40 R15: ffff88808a269300
FS: 00005555570ba940(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff8c00b020 CR3: 000000008dd00000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


Tested on:

commit: 0b58d013 bpf: tls, implement unhash to avoid transition ou..
git tree: git://github.com/cilium/linux ktls-unhash
console output: https://syzkaller.appspot.com/x/log.txt?x=153368a3a00000
kernel config: https://syzkaller.appspot.com/x/.config?x=2cc918d28ebd06b4

John Fastabend

Jul 8, 2019, 12:21:25 PM
to syzbot, b...@vger.kernel.org, ebig...@kernel.org, john.fa...@gmail.com, linux-...@vger.kernel.org, pet...@infradead.org, syzkall...@googlegroups.com, tg...@linutronix.de
#syz test: git://github.com/cilium/linux fix-unhash

syzbot

Jul 8, 2019, 6:03:01 PM
to b...@vger.kernel.org, ebig...@kernel.org, john.fa...@gmail.com, linux-...@vger.kernel.org, pet...@infradead.org, syzkall...@googlegroups.com, tg...@linutronix.de
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger
a crash:

Reported-and-tested-by: syzbot+a861f5...@syzkaller.appspotmail.com

Tested on:

commit: 17b3f125 tls: working code
git tree: git://github.com/cilium/linux fix-unhash
kernel config: https://syzkaller.appspot.com/x/.config?x=dd16b8dc9d0d210c
compiler: gcc (GCC) 9.0.0 20181231 (experimental)

Note: testing is done by a robot and is best-effort only.

Eric Biggers

Aug 22, 2019, 12:19:33 PM
to syzbot, syzkall...@googlegroups.com
#syz fix: bpf: sockmap/tls, close can race with map free

But there's also a USB driver crash with this same signature:
https://syzkaller.appspot.com/text?tag=CrashReport&x=16d0224c600000