WARNING in get_pi_state


syzbot

Oct 30, 2017, 3:44:01 PM
to dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, pet...@infradead.org, syzkall...@googlegroups.com, tg...@linutronix.de
Hello,

syzkaller hit the following crash on
8fd0520d9cec0896d48d3921bc642a9ee81eae0c
git://git.cmpxchg.org/linux-mmots.git/master
compiler: gcc (GCC) 7.1.1 20170620
.config is attached
Raw console output is attached.
C reproducer is attached
syzkaller reproducer is attached. See https://goo.gl/kgGztJ
for information about syzkaller reproducers


WARNING: CPU: 1 PID: 24353 at kernel/futex.c:818 get_pi_state+0x15b/0x190
kernel/futex.c:818
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 24353 Comm: syzkaller121915 Not tainted 4.14.0-rc2-mm1+ #10
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
panic+0x1e4/0x417 kernel/panic.c:181
__warn+0x1c4/0x1d9 kernel/panic.c:542
report_bug+0x211/0x2d0 lib/bug.c:183
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:get_pi_state+0x15b/0x190 kernel/futex.c:818
RSP: 0018:ffff8801bf2a77a8 EFLAGS: 00010097
RAX: ffff8801cb2d05c0 RBX: 0000000000000000 RCX: 1ffff10037e54efa
RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff8801d892cd40
RBP: ffff8801bf2a7838 R08: ffff8801cb2d1880 R09: 1ffff10037e54edf
R10: ffff8801cb2d05c0 R11: 0000000000000002 R12: 1ffff10037e54ef6
R13: ffff8801d892cd40 R14: 1ffff10037e54efa R15: ffff8801d892ce00
exit_pi_state_list+0x556/0x7a0 kernel/futex.c:932
mm_release+0x46d/0x590 kernel/fork.c:1191
exit_mm kernel/exit.c:499 [inline]
do_exit+0x481/0x1b00 kernel/exit.c:852
SYSC_exit kernel/exit.c:937 [inline]
SyS_exit+0x22/0x30 kernel/exit.c:935
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x40337e
RSP: 002b:00007fdd99af2d20 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
RAX: ffffffffffffffda RBX: 00007fdd99af3700 RCX: 000000000040337e
RDX: 000000000000003c RSI: 00000000007fb000 RDI: 0000000000000000
RBP: 0000000000000086 R08: 0000000020048000 R09: 0000000000000000
R10: 000000002000b000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffcfc38d0bf R14: 00007fdd99af39c0 R15: 0000000000000000
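
For reference, the WARN being hit is the refcount sanity check in
get_pi_state(), which in 4.14-era kernels is essentially the following
(kernel/futex.c, around the line 818 cited above):

	static void get_pi_state(struct futex_pi_state *pi_state)
	{
		/* Taking a reference on a pi_state whose refcount has
		 * already dropped to zero triggers this warning. */
		WARN_ON_ONCE(!atomic_inc_not_zero(&pi_state->refcount));
	}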


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzk...@googlegroups.com.

syzbot will keep track of this bug report.
Once a fix for this bug is committed, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Attachments: config.txt, raw.log, repro.txt, repro.c

Dmitry Vyukov

Oct 30, 2017, 3:53:42 PM
to syzbot, dvh...@infradead.org, LKML, Ingo Molnar, Peter Zijlstra, syzkall...@googlegroups.com, Thomas Gleixner
On Mon, Oct 30, 2017 at 10:44 PM, syzbot
<bot+2af19c9e1ffe4d4ee1...@syzkaller.appspotmail.com>
wrote:
> Hello,
>
> syzkaller hit the following crash on
> 8fd0520d9cec0896d48d3921bc642a9ee81eae0c
> git://git.cmpxchg.org/linux-mmots.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers


This also happened on more recent commits, including upstream commit
0787643a5f6aad1f0cdeb305f7fe492b71943ea4:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 6170 at kernel/futex.c:818
get_pi_state+0x15b/0x190 kernel/futex.c:818
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 6170 Comm: syz-executor1 Not tainted 4.14.0-rc5+ #142
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
panic+0x1e4/0x417 kernel/panic.c:181
__warn+0x1c4/0x1d9 kernel/panic.c:542
report_bug+0x211/0x2d0 lib/bug.c:183
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:get_pi_state+0x15b/0x190 kernel/futex.c:818
RSP: 0018:ffff8801cddbf1a0 EFLAGS: 00010097
RAX: ffff8801c9d662c0 RBX: 0000000000000000 RCX: 1ffff10039bb7e39
RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff8801d04f4d40
RBP: ffff8801cddbf230 R08: 0000000000000001 R09: 1ffff10039bb7e26
R10: ffff8801cddbf0f8 R11: 0000000000000002 R12: 1ffff10039bb7e35
R13: ffff8801d04f4d40 R14: 1ffff10039bb7e39 R15: ffff8801d04f4df0
exit_pi_state_list+0x556/0x7a0 kernel/futex.c:932
mm_release+0x46d/0x590 kernel/fork.c:1148
exit_mm kernel/exit.c:499 [inline]
do_exit+0x481/0x1ad0 kernel/exit.c:852
do_group_exit+0x149/0x400 kernel/exit.c:968
get_signal+0x73f/0x16d0 kernel/signal.c:2334
do_signal+0x94/0x1ee0 arch/x86/kernel/signal.c:808
exit_to_usermode_loop+0x214/0x310 arch/x86/entry/common.c:158
prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
syscall_return_slowpath+0x42f/0x510 arch/x86/entry/common.c:266
entry_SYSCALL_64_fastpath+0xbc/0xbe
RIP: 0033:0x452719
RSP: 002b:00007effb0461be8 EFLAGS: 00000212 ORIG_RAX: 00000000000000ca
RAX: 0000000000000000 RBX: 0000000000758190 RCX: 0000000000452719
RDX: 0000000000000004 RSI: 000080000000000b RDI: 000000002000cffc
RBP: 000000000000008f R08: 0000000020048000 R09: 0000000000000000
R10: 0000000020edfff0 R11: 0000000000000212 R12: 00000000006eee08
R13: 00000000ffffffff R14: 00007effb04626d4 R15: 0000000000000004

Peter Zijlstra

Oct 31, 2017, 4:36:47 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
On Mon, Oct 30, 2017 at 12:44:00PM -0700, syzbot wrote:
> WARNING: CPU: 1 PID: 24353 at kernel/futex.c:818 get_pi_state+0x15b/0x190
> kernel/futex.c:818

> exit_pi_state_list+0x556/0x7a0 kernel/futex.c:932
> mm_release+0x46d/0x590 kernel/fork.c:1191
> exit_mm kernel/exit.c:499 [inline]
> do_exit+0x481/0x1b00 kernel/exit.c:852
> SYSC_exit kernel/exit.c:937 [inline]
> SyS_exit+0x22/0x30 kernel/exit.c:935
> entry_SYSCALL_64_fastpath+0x1f/0xbe


Argh, I definitely messed that up. Let me have a prod..

Peter Zijlstra

Oct 31, 2017, 5:16:08 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de

So that provided repro.c thing doesn't work _at_all_.

It's stuck on trying to create a tunnel for some daft reason... I don't
have that.

I'll try and hack up the repro.c file to see if I can make it 'work',
but it would be nice if reproducers could actually be run without too
much crap.

Dmitry Vyukov

Oct 31, 2017, 5:30:11 AM
to Peter Zijlstra, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
I understand your sentiment, but it's definitely not _at all_: the
system compiled this exact code, ran it, and triggered the bug with it.
Do you have suggestions on how to make this code more portable? How
would this setup look on your system?

We do try hard to get rid of unnecessary stuff in reproducers. I think
what happened in this case is the following. This is a hard-to-reproduce
race. The bot was able to reproduce the crash with the initial program,
which uses tun; it then tried to strip the tun code and re-reproduce,
but the crash did not trigger that time, so it concluded that the tun
code is somehow necessary here. That's an unfortunate consequence of
testing complex concurrent code. It may become somewhat better once we
have KTSAN, the race detector.

Peter Zijlstra

Oct 31, 2017, 6:08:56 AM
to Dmitry Vyukov, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
> I understand your sentiment, but it's definitely not _at all_: the
> system compiled this exact code, ran it, and triggered the bug with it.
> Do you have suggestions on how to make this code more portable? How
> would this setup look on your system?

So I don't see the point of that tun stuff; what was it supposed to do?

All it ever did after creation was flush_tun(), which reads until empty.
But given nobody would ever write into it, that's an 'expensive' NO-OP.
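
For context, the flush_tun() helper in these generated reproducers is
presumably just a nonblocking drain loop of roughly this shape (a
sketch, not the literal repro.c code; tun_fd stands in for whatever
descriptor the reproducer opened):

	#include <unistd.h>

	static void flush_tun(int tun_fd)
	{
		char buf[1000];

		/* tun_fd is opened O_NONBLOCK, so this returns as soon as
		 * the queue is empty; if nothing ever writes to the device,
		 * every read() fails immediately, making this the
		 * 'expensive' NO-OP described above. */
		while (read(tun_fd, buf, sizeof(buf)) > 0)
			;
	}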

> We do try hard to get rid of unnecessary stuff in reproducers. I think
> what happened in this case is the following. This is a hard-to-reproduce
> race. The bot was able to reproduce the crash with the initial program,
> which uses tun; it then tried to strip the tun code and re-reproduce,
> but the crash did not trigger that time, so it concluded that the tun
> code is somehow necessary here. That's an unfortunate consequence of
> testing complex concurrent code. It may become somewhat better once we
> have KTSAN, the race detector.

I ripped out the tun bits and it reproduced in ~100 seconds. I've now
got it running for well over 30m on the fixed kernel while I'm trying to
come up with a comprehensible Changelog ;-)

Peter Zijlstra

Oct 31, 2017, 6:18:56 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
The below appears to cure the problem; I could (fairly quickly)
reproduce the issue once I hacked up the repro.c to not bother with
tunnels.

With the below patch, the reproducer has been running for a fairly long
time now without issue.

This should fix both that WARN and the UAF report; they were related
problems.

---
Subject: futex: Fix more put_pi_state() vs exit_pi_state_list() races

Dmitry (through syzbot) reported being able to trigger the WARN in
get_pi_state() and a use-after-free on
raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock).

Both are due to this race:

  exit_pi_state_list()				put_pi_state()

  lock(&curr->pi_lock)
  while() {
	pi_state = list_first_entry(head);
	hb = hash_futex(&pi_state->key);
	unlock(&curr->pi_lock);

						dec_and_test(&pi_state->refcount);

	lock(&hb->lock)
	lock(&pi_state->pi_mutex.wait_lock)	// uaf if pi_state free'd
	lock(&curr->pi_lock);

	....

	unlock(&curr->pi_lock);
	get_pi_state();				// WARN; refcount==0

The problem is that we take the reference count too late, and don't
allow it to be 0. Fix it by using inc_not_zero() and simply retrying the
loop when we fail to get a refcount. In that case put_pi_state() should
remove the entry from the list.

Cc: Gratian Crisan <gratian...@ni.com>
Reported-by: Dmitry Vyukov <dvy...@google.com>
Fixes: c74aef2d06a9 ("futex: Fix pi_state->owner serialization")
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
---
 kernel/futex.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 0518a0bfc746..ca5bb9cba5cf 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -903,11 +903,27 @@ void exit_pi_state_list(struct task_struct *curr)
 	 */
 	raw_spin_lock_irq(&curr->pi_lock);
 	while (!list_empty(head)) {
-
 		next = head->next;
 		pi_state = list_entry(next, struct futex_pi_state, list);
 		key = pi_state->key;
 		hb = hash_futex(&key);
+
+		/*
+		 * We can race against put_pi_state() removing itself from the
+		 * list (a waiter going away). put_pi_state() will first
+		 * decrement the reference count and then modify the list, so
+		 * it's possible to see the list entry but fail this reference
+		 * acquire.
+		 *
+		 * In that case, drop the locks to let put_pi_state() make
+		 * progress and retry the loop.
+		 */
+		if (!atomic_inc_not_zero(&pi_state->refcount)) {
+			raw_spin_unlock_irq(&curr->pi_lock);
+			cpu_relax();
+			raw_spin_lock_irq(&curr->pi_lock);
+			continue;
+		}
 		raw_spin_unlock_irq(&curr->pi_lock);
 
 		spin_lock(&hb->lock);
@@ -918,8 +934,10 @@ void exit_pi_state_list(struct task_struct *curr)
 		 * task still owns the PI-state:
 		 */
 		if (head->next != next) {
+			/* retain curr->pi_lock for the loop invariant */
 			raw_spin_unlock(&pi_state->pi_mutex.wait_lock);
 			spin_unlock(&hb->lock);
+			put_pi_state(pi_state);
 			continue;
 		}
 
@@ -927,9 +945,8 @@ void exit_pi_state_list(struct task_struct *curr)
 		WARN_ON(list_empty(&pi_state->list));
 		list_del_init(&pi_state->list);
 		pi_state->owner = NULL;
-		raw_spin_unlock(&curr->pi_lock);
 
-		get_pi_state(pi_state);
+		raw_spin_unlock(&curr->pi_lock);
 		raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
 		spin_unlock(&hb->lock);
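
For readers following along outside the kernel tree: the crux of the
patch is the "increment the refcount only if it is still nonzero"
idiom. A hypothetical userspace analogue using C11 atomics, with
illustrative names, would be:

	#include <stdatomic.h>
	#include <stdbool.h>

	/* Take a reference only if the object is still live. A failed
	 * attempt means a concurrent put dropped the last reference, so
	 * the caller must skip/retry instead of touching the object. */
	static bool get_ref_unless_zero(atomic_int *refcount)
	{
		int old = atomic_load(refcount);

		while (old != 0) {
			/* CAS old -> old+1; on failure, old is reloaded. */
			if (atomic_compare_exchange_weak(refcount, &old,
							 old + 1))
				return true;
		}
		return false;
	}

A plain unconditional increment cannot distinguish a live object from
one whose last reference was already dropped, which is exactly the
window the WARN caught.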

Dmitry Vyukov

Oct 31, 2017, 6:21:21 AM
to Peter Zijlstra, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
On Tue, Oct 31, 2017 at 1:08 PM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
>> I understand your sentiment, but it's definitely not _at all_: the
>> system compiled this exact code, ran it, and triggered the bug with it.
>> Do you have suggestions on how to make this code more portable? How
>> would this setup look on your system?
>
> So I don't see the point of that tun stuff; what was it supposed to do?
>
> All it ever did after creation was flush_tun(), which reads until empty.
> But given nobody would ever write into it, that's an 'expensive' NO-OP.

See the text below.
The bot does try to minimize both the program and the features used
(e.g. also those clunky NONFAILING macros, and the filesystem business).
But if it takes 100 seconds to reproduce, then minimization is hard.
Consider bisecting such bugs: that will also be hard and unreliable,
and you can end up with the wrong commit.

See this for an example of a much tidier reproducer:
https://groups.google.com/forum/#!topic/syzkaller-bugs/9nYn7hpNpEk
But that's a single-threaded bug that triggers instantly each time you
run the program.

Dmitry Vyukov

Oct 31, 2017, 6:23:34 AM
to Peter Zijlstra, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
On Tue, Oct 31, 2017 at 1:21 PM, Dmitry Vyukov <dvy...@google.com> wrote:
> On Tue, Oct 31, 2017 at 1:08 PM, Peter Zijlstra <pet...@infradead.org> wrote:
>> On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
>>> I understand your sentiment, but it's definitely not _at all_: the
>>> system compiled this exact code, ran it, and triggered the bug with it.
>>> Do you have suggestions on how to make this code more portable? How
>>> would this setup look on your system?
>>
>> So I don't see the point of that tun stuff; what was it supposed to do?
>>
>> All it ever did after creation was flush_tun(), which reads until empty.
>> But given nobody would ever write into it, that's an 'expensive' NO-OP.
>
> See the text below.
> The bot does try to minimize both the program and the features used
> (e.g. also those clunky NONFAILING macros, and the filesystem business).
> But if it takes 100 seconds to reproduce, then minimization is hard.
> Consider bisecting such bugs: that will also be hard and unreliable,
> and you can end up with the wrong commit.
>
> See this for an example of a much tidier reproducer:
> https://groups.google.com/forum/#!topic/syzkaller-bugs/9nYn7hpNpEk
> But that's a single-threaded bug that triggers instantly each time you
> run the program.


But having said that, the tun code is not supposed to break the
reproducer either. E.g. on our systems it just sets up tun
successfully and then proceeds to the actual code that triggers the
problem. What's the failure mode of the tun code on your system? If we
make it more portable, then such repros will work on your system as
well.
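
For reference, "sets up tun" is the standard /dev/net/tun clone-device
dance; a minimal sketch (the interface name is illustrative, and a real
run needs CAP_NET_ADMIN plus tun support in the kernel):

	#include <fcntl.h>
	#include <string.h>
	#include <unistd.h>
	#include <net/if.h>
	#include <sys/ioctl.h>
	#include <linux/if_tun.h>

	static int setup_tun(void)
	{
		struct ifreq ifr;
		/* Fails outright on kernels built without tun support,
		 * which matches the failure mode Peter describes below. */
		int fd = open("/dev/net/tun", O_RDWR | O_NONBLOCK);

		if (fd < 0)
			return -1;

		memset(&ifr, 0, sizeof(ifr));
		ifr.ifr_flags = IFF_TUN | IFF_NO_PI;
		strncpy(ifr.ifr_name, "syz0", IFNAMSIZ - 1);

		if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
			close(fd);
			return -1;
		}
		return fd;
	}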

Peter Zijlstra

Oct 31, 2017, 6:31:40 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
On Tue, Oct 31, 2017 at 11:18:53AM +0100, Peter Zijlstra wrote:
> On Tue, Oct 31, 2017 at 09:36:44AM +0100, Peter Zijlstra wrote:
> > On Mon, Oct 30, 2017 at 12:44:00PM -0700, syzbot wrote:
> > > WARNING: CPU: 1 PID: 24353 at kernel/futex.c:818 get_pi_state+0x15b/0x190
> > > kernel/futex.c:818
> >
> > > exit_pi_state_list+0x556/0x7a0 kernel/futex.c:932
> > > mm_release+0x46d/0x590 kernel/fork.c:1191
> > > exit_mm kernel/exit.c:499 [inline]
> > > do_exit+0x481/0x1b00 kernel/exit.c:852
> > > SYSC_exit kernel/exit.c:937 [inline]
> > > SyS_exit+0x22/0x30 kernel/exit.c:935
> > > entry_SYSCALL_64_fastpath+0x1f/0xbe
> >
> >
> > Argh, I definitely messed that up. Let me have a prod..
>
> The below appears to cure the problem; I could (fairly quickly)
> reproduce the issue once I hacked up the repro.c to not bother with
> tunnels.
>
> With the below patch, the reproducer has been running for a fairly long
> time now without issue.

And of course, now it went *splat*, lemme continue staring..

Peter Zijlstra

Oct 31, 2017, 6:36:30 AM
to Dmitry Vyukov, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
On Tue, Oct 31, 2017 at 01:23:13PM +0300, Dmitry Vyukov wrote:

> But having said that, the tun code is not supposed to break the
> reproducer either. E.g. on our systems it just sets up tun
> successfully and then proceeds to the actual code that triggers the
> problem. What's the failure mode of the tun code on your system? If we
> make it more portable, then such repros will work on your system as
> well.

It completely fails to create a tun (I probably don't have support for
that built in) and then just sits there doing nothing. I didn't spend
too much time analyzing it and just ripped it out.

Peter Zijlstra

Oct 31, 2017, 6:38:15 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
On Tue, Oct 31, 2017 at 11:31:34AM +0100, Peter Zijlstra wrote:

> And of course, now it went *splat*, lemme continue staring..

PEBKAC: it turns out I was running the kernel without the patch in, and
it only took 3100 seconds to trigger the splat...

I am now actually running the kernel with the patch in; I'll leave the
machine running it for a few hours.

Peter Zijlstra

Nov 1, 2017, 4:45:29 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
OK, I let the kernel+patch run overnight and it still lives. That's
almost 22 hours of running the repro without issue.

Dmitry Vyukov

Nov 7, 2017, 9:51:20 AM
to Peter Zijlstra, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner, andreyknvl
On Tue, Oct 31, 2017 at 11:08 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
>> I understand your sentiment, but it's definitely not _at all_: the
>> system compiled this exact code, ran it, and triggered the bug with it.
>> Do you have suggestions on how to make this code more portable? How
>> would this setup look on your system?
>
> So I don't see the point of that tun stuff; what was it supposed to do?
>
> All it ever did after creation was flush_tun(), which reads until empty.
> But given nobody would ever write into it, that's an 'expensive' NO-OP.

I've filed https://github.com/google/syzkaller/issues/414 for this. If
the program does not use tun, we will strip that code right away.
Thanks for the feedback.

Dmitry Vyukov

Nov 7, 2017, 11:16:34 AM
to Peter Zijlstra, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
Let's tell the bot about the fix; otherwise it will never report bugs
in get_pi_state again:

#syz fix: futex: Fix more put_pi_state() vs. exit_pi_state_list() races