lockdep WARNING in get_online

Dmitry Vyukov

unread,

Apr 2, 2016, 9:02:22 AM4/2/16

to Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, Peter Zijlstra, LKML, Ingo Molnar, Oleg Nesterov, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin

Hello,

After syncing to 05cf8077e54b20dddb756eaa26f3aeb5c38dd3cf (Apr 1) I've
started getting the following warnings while running syzkaller fuzzer:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 7252 at kernel/locking/lockdep.c:2017[<
inline >] check_no_collision kernel/locking/lockdep.c:2017
WARNING: CPU: 0 PID: 7252 at kernel/locking/lockdep.c:2017[<
inline >] lookup_chain_cache kernel/locking/lockdep.c:2060
WARNING: CPU: 0 PID: 7252 at kernel/locking/lockdep.c:2017[<
inline >] validate_chain kernel/locking/lockdep.c:2132
WARNING: CPU: 0 PID: 7252 at kernel/locking/lockdep.c:2017[< none
>] __lock_acquire+0x39a5/0x5ff0 kernel/locking/lockdep.c:3221
DEBUG_LOCKS_WARN_ON(chain->depth != curr->lockdep_depth - (i - 1))
Modules linked in:
CPU: 0 PID: 7252 Comm: syz-executor Tainted: G W 4.6.0-rc1+ #340
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffffffff87eb25c0 ffff8800308cf6b8 ffffffff82c8e9df ffffffff81494638
fffffbfff0fd64b8 ffff8800308cf730 0000000000000000 ffffffff86aa7700
ffffffff81474095 0000000000000009 ffff8800308cf700 fffffff2016/04/02
12:00:56 executing program 2:
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82c8e9df>] dump_stack+0x12e/0x18f lib/dump_stack.c:51
[<ffffffff8136637f>] __warn+0x19f/0x1e0 kernel/panic.c:512
[<ffffffff8136646c>] warn_slowpath_fmt+0xac/0xd0 kernel/panic.c:527
[< inline >] check_no_collision kernel/locking/lockdep.c:2017
[< inline >] lookup_chain_cache kernel/locking/lockdep.c:2060
[< inline >] validate_chain kernel/locking/lockdep.c:2132
[<ffffffff81474095>] __lock_acquire+0x39a5/0x5ff0 kernel/locking/lockdep.c:3221
[<ffffffff81478ba2>] lock_acquire+0x222/0x4b0 kernel/locking/lockdep.c:3600
[< inline >] __mutex_lock_common kernel/locking/mutex.c:518
[<ffffffff867aeb61>] mutex_lock_nested+0xb1/0xa50 kernel/locking/mutex.c:618
[<ffffffff81367820>] get_online_cpus+0x70/0x90 kernel/cpu.c:174
[<ffffffff81200ee9>] arch_jump_label_transform+0x69/0x2e0
arch/x86/kernel/jump_label.c:108
[<ffffffff8166c468>] __jump_label_update+0x168/0x1d0 kernel/jump_label.c:197
[<ffffffff8166c697>] jump_label_update+0x1c7/0x250 kernel/jump_label.c:482
[<ffffffff8166cec1>] __static_key_slow_dec+0xc1/0x120 kernel/jump_label.c:85
[<ffffffff8166cf71>] static_key_slow_dec+0x51/0x90 kernel/jump_label.c:100
[< inline >] tracepoint_remove_func kernel/tracepoint.c:248
[<ffffffff815a4eed>] tracepoint_probe_unregister+0x49d/0x5f0
kernel/tracepoint.c:319
[<ffffffff815e213d>] trace_event_reg+0xed/0x340 kernel/trace/trace_events.c:324
[<ffffffff815eff31>] perf_trace_event_unreg.isra.1+0xe1/0x1f0
kernel/trace/trace_event_perf.c:152
[<ffffffff815f0b9e>] perf_trace_destroy+0xbe/0x100
kernel/trace/trace_event_perf.c:233
[<ffffffff81639e35>] tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7041
[<ffffffff81654f2f>] _free_event+0x23f/0xa40 kernel/events/core.c:3750
[<ffffffff81655762>] free_event+0x32/0xa0 kernel/events/core.c:3776
[<ffffffff81658cf5>] SYSC_perf_event_open+0x955/0x21d0
kernel/events/core.c:8675
[<ffffffff81660da9>] SyS_perf_event_open+0x39/0x50 kernel/events/core.c:8324
[<ffffffff867b9400>] entry_SYSCALL_64_fastpath+0x23/0xc1
arch/x86/entry/entry_64.S:207
---[ end trace 23451128bc05ed82 ]---

Dmitry Vyukov

unread,

Apr 4, 2016, 4:19:27 AM4/4/16

to Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, Peter Zijlstra, LKML, Ingo Molnar, Oleg Nesterov, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin

This happens in CONFIG_DEBUG_LOCKDEP code. Is it a bug in lockdep?
But I always see at the same stack involving perf and jump_label...

/*
* Checks whether the chain and the current held locks are consistent
* in depth and also in content. If they are not it most likely means
* that there was a collision during the calculation of the chain_key.
* Returns: 0 not passed, 1 passed
*/
static int check_no_collision(struct task_struct *curr,
struct held_lock *hlock,
struct lock_chain *chain)
{
#ifdef CONFIG_DEBUG_LOCKDEP
int i, j, id;

i = get_first_held_lock(curr, hlock);

if (DEBUG_LOCKS_WARN_ON(chain->depth != curr->lockdep_depth - (i - 1)))
return 0;

Peter Zijlstra

unread,

Apr 4, 2016, 5:01:48 AM4/4/16

to Dmitry Vyukov, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, LKML, Ingo Molnar, Oleg Nesterov, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin

On Mon, Apr 04, 2016 at 10:19:05AM +0200, Dmitry Vyukov wrote:
>
> This happens in CONFIG_DEBUG_LOCKDEP code. Is it a bug in lockdep?

We hope not; but it is a new test.

So lockdep needs to check each current lock stack against the recorded
lock dependencies to see if we've gotten ourselves a cycle. Doing this
check is _expensive_.

So what lockdep does is it computes a hash for each lock stack and only
if we've not seen this hash before (actually truncated since we don't
have a full 64bit hashtable) do we go look for cycles.

The new check tries to detect hash-collisions in this cache. A collision
would result in not checking for cycles, even if we've not seen the
stack before.

You've managed to tickle this.

Now, last week I found some bugs in there, and Alfredo added a pretty
printer, so maybe try and add these patches to your testing?

git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git locking/urgent

> But I always see at the same stack involving perf and jump_label...

So you have a simple reproducer? So that I can have a go at this.

Dmitry Vyukov

unread,

Apr 4, 2016, 5:50:35 AM4/4/16

to Peter Zijlstra, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, LKML, Ingo Molnar, Oleg Nesterov, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin

Yes, I am able to reproduce it by running:

# ./syz-execprog -cover=0 -repeat=0 -procs=20 -nobody=0
crash-qemu-26-1459761514194788294

crash-qemu-26-1459761514194788294 being:
https://gist.githubusercontent.com/dvyukov/bce719c2b7dede54f96d8bbb7c78714f/raw/08a7a237986f0e3bccc7c0e800a3bdaa32e1fc0b/gistfile1.txt

Peter Zijlstra

unread,

Apr 5, 2016, 5:41:38 AM4/5/16

to Dmitry Vyukov, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, LKML, Ingo Molnar, Oleg Nesterov, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin

On Mon, Apr 04, 2016 at 11:50:07AM +0200, Dmitry Vyukov wrote:
> Yes, I am able to reproduce it by running:
>
> # ./syz-execprog -cover=0 -repeat=0 -procs=20 -nobody=0
> crash-qemu-26-1459761514194788294
>
> crash-qemu-26-1459761514194788294 being:
> https://gist.githubusercontent.com/dvyukov/bce719c2b7dede54f96d8bbb7c78714f/raw/08a7a237986f0e3bccc7c0e800a3bdaa32e1fc0b/gistfile1.txt

root@ivb-ep:~/gopath/src/github.com/google/syzkaller# ./bin/syz-execprog -cover=0 -repeat=0 -procs=20 -nobody=0 gistfile1.txt
2016/04/05 11:39:37 parsed 391 programs
2016/04/05 11:39:37 executed 0 programs
result: failed=false hanged=false err=failed to start executor binary: fork/exec /root/gopath/src/github.com/google/syzkaller: permission denied

result: failed=false hanged=false err=failed to start executor binary: fork/exec /root/gopath/src/github.com/google/syzkaller: permission denied

result: failed=false hanged=false err=failed to start executor binary: fork/exec /root/gopath/src/github.com/google/syzkaller: permission denied

(which goes on ad inf.)

Clearly I'm not doing it right.. help?

Dmitry Vyukov

unread,

Apr 5, 2016, 5:44:33 AM4/5/16

to Peter Zijlstra, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, LKML, Ingo Molnar, Oleg Nesterov, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin

The latest version of syz-execproc uses ./syz-executor by default. If
you don't have ./syz-executor, add -executor=/path/to/syz-executor.

Peter Zijlstra

unread,

Apr 5, 2016, 7:18:24 AM4/5/16

to Dmitry Vyukov, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, LKML, Ingo Molnar, Oleg Nesterov, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin

I still have the version you helped make work; I saw that the config
file format had changed since so I've not updated because I didn't want
borkage.

In any case, that did indeed work. But it doesn't trigger the collision
for me :/

I'm running a fairly bloated (distro derived) .config with various debug
options (including lockdep and lockdep_debug) enabled.

Should I maybe do a 'normal' fuzz run to see if that will trigger it? I
need a config for that though, as I only have the perf specific one.
Also, how likely is running this thing as root on bare metal going to
wreck things?

Dmitry Vyukov

unread,

Apr 5, 2016, 7:24:01 AM4/5/16

to Peter Zijlstra, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, LKML, Ingo Molnar, Oleg Nesterov, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin

On Tue, Apr 5, 2016 at 1:18 PM, Peter Zijlstra <pet...@infradead.org> wrote:
>> >> Yes, I am able to reproduce it by running:
>> >>
>> >> # ./syz-execprog -cover=0 -repeat=0 -procs=20 -nobody=0
>> >> crash-qemu-26-1459761514194788294
>> >>
>> >> crash-qemu-26-1459761514194788294 being:
>> >> https://gist.githubusercontent.com/dvyukov/bce719c2b7dede54f96d8bbb7c78714f/raw/08a7a237986f0e3bccc7c0e800a3bdaa32e1fc0b/gistfile1.txt
>> >
>> >
>> > root@ivb-ep:~/gopath/src/github.com/google/syzkaller# ./bin/syz-execprog -cover=0 -repeat=0 -procs=20 -nobody=0 gistfile1.txt
>> > 2016/04/05 11:39:37 parsed 391 programs
>> > 2016/04/05 11:39:37 executed 0 programs
>> > result: failed=false hanged=false err=failed to start executor binary: fork/exec /root/gopath/src/github.com/google/syzkaller: permission denied
>> >
>> > result: failed=false hanged=false err=failed to start executor binary: fork/exec /root/gopath/src/github.com/google/syzkaller: permission denied
>> >
>> > result: failed=false hanged=false err=failed to start executor binary: fork/exec /root/gopath/src/github.com/google/syzkaller: permission denied
>> >
>> > (which goes on ad inf.)
>> >
>> > Clearly I'm not doing it right.. help?
>>
>> The latest version of syz-execproc uses ./syz-executor by default. If
>> you don't have ./syz-executor, add -executor=/path/to/syz-executor.
>
> I still have the version you helped make work; I saw that the config
> file format had changed since so I've not updated because I didn't want
> borkage.
>
> In any case, that did indeed work. But it doesn't trigger the collision
> for me :/
>
> I'm running a fairly bloated (distro derived) .config with various debug
> options (including lockdep and lockdep_debug) enabled.

Try to run it for several minutes, then reboot machine and try again.
It seems to be somewhat non-deterministic (maybe depends on order
chains are added to hashtable).

> Should I maybe do a 'normal' fuzz run to see if that will trigger it? I
> need a config for that though, as I only have the perf specific one.
> Also, how likely is running this thing as root on bare metal going to
> wreck things?

We are running syzkaller on an android device for a months, no issues
observed (we don't even reimage it).

My config just includes:

"disable_syscalls": [
"syz_open_dev$kvm"
],

and no "enable_syscalls". I don't know what would be a more
specialized config for this issue.

Reply all

Reply to author

Forward

lockdep WARNING in get_online_cpus

Dmitry Vyukov

Dmitry Vyukov

Peter Zijlstra

Dmitry Vyukov

Peter Zijlstra

Dmitry Vyukov

Peter Zijlstra

Dmitry Vyukov