WARNING in get_pi_state


syzbot

Oct 30, 2017, 3:44:01 PM
to dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, pet...@infradead.org, syzkall...@googlegroups.com, tg...@linutronix.de
Hello,

syzkaller hit the following crash on
8fd0520d9cec0896d48d3921bc642a9ee81eae0c
git://git.cmpxchg.org/linux-mmots.git/master
compiler: gcc (GCC) 7.1.1 20170620
.config is attached
Raw console output is attached.
C reproducer is attached
syzkaller reproducer is attached. See https://goo.gl/kgGztJ
for information about syzkaller reproducers


WARNING: CPU: 1 PID: 24353 at kernel/futex.c:818 get_pi_state+0x15b/0x190
kernel/futex.c:818
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 24353 Comm: syzkaller121915 Not tainted 4.14.0-rc2-mm1+ #10
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
panic+0x1e4/0x417 kernel/panic.c:181
__warn+0x1c4/0x1d9 kernel/panic.c:542
report_bug+0x211/0x2d0 lib/bug.c:183
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:get_pi_state+0x15b/0x190 kernel/futex.c:818
RSP: 0018:ffff8801bf2a77a8 EFLAGS: 00010097
RAX: ffff8801cb2d05c0 RBX: 0000000000000000 RCX: 1ffff10037e54efa
RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff8801d892cd40
RBP: ffff8801bf2a7838 R08: ffff8801cb2d1880 R09: 1ffff10037e54edf
R10: ffff8801cb2d05c0 R11: 0000000000000002 R12: 1ffff10037e54ef6
R13: ffff8801d892cd40 R14: 1ffff10037e54efa R15: ffff8801d892ce00
exit_pi_state_list+0x556/0x7a0 kernel/futex.c:932
mm_release+0x46d/0x590 kernel/fork.c:1191
exit_mm kernel/exit.c:499 [inline]
do_exit+0x481/0x1b00 kernel/exit.c:852
SYSC_exit kernel/exit.c:937 [inline]
SyS_exit+0x22/0x30 kernel/exit.c:935
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x40337e
RSP: 002b:00007fdd99af2d20 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
RAX: ffffffffffffffda RBX: 00007fdd99af3700 RCX: 000000000040337e
RDX: 000000000000003c RSI: 00000000007fb000 RDI: 0000000000000000
RBP: 0000000000000086 R08: 0000000020048000 R09: 0000000000000000
R10: 000000002000b000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffcfc38d0bf R14: 00007fdd99af39c0 R15: 0000000000000000
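
For reference, the WARN being hit is the refcount sanity check in
get_pi_state(), which in 4.14-era kernels is essentially the following
(kernel/futex.c, around the line 818 cited above):

	static void get_pi_state(struct futex_pi_state *pi_state)
	{
		/* Taking a reference on a pi_state whose refcount has
		 * already dropped to zero triggers this warning. */
		WARN_ON_ONCE(!atomic_inc_not_zero(&pi_state->refcount));
	}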


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzk...@googlegroups.com.

syzbot will keep track of this bug report.
Once a fix for this bug is committed, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Attachments: config.txt, raw.log, repro.txt, repro.c

Dmitry Vyukov

Oct 30, 2017, 3:53:42 PM
to syzbot, dvh...@infradead.org, LKML, Ingo Molnar, Peter Zijlstra, syzkall...@googlegroups.com, Thomas Gleixner
On Mon, Oct 30, 2017 at 10:44 PM, syzbot
<bot+2af19c9e1ffe4d4ee1...@syzkaller.appspotmail.com>
wrote:
> Hello,
>
> syzkaller hit the following crash on
> 8fd0520d9cec0896d48d3921bc642a9ee81eae0c
> git://git.cmpxchg.org/linux-mmots.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers


This also happened on more recent commits, including upstream commit
0787643a5f6aad1f0cdeb305f7fe492b71943ea4:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 6170 at kernel/futex.c:818
get_pi_state+0x15b/0x190 kernel/futex.c:818
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 6170 Comm: syz-executor1 Not tainted 4.14.0-rc5+ #142
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
panic+0x1e4/0x417 kernel/panic.c:181
__warn+0x1c4/0x1d9 kernel/panic.c:542
report_bug+0x211/0x2d0 lib/bug.c:183
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:get_pi_state+0x15b/0x190 kernel/futex.c:818
RSP: 0018:ffff8801cddbf1a0 EFLAGS: 00010097
RAX: ffff8801c9d662c0 RBX: 0000000000000000 RCX: 1ffff10039bb7e39
RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff8801d04f4d40
RBP: ffff8801cddbf230 R08: 0000000000000001 R09: 1ffff10039bb7e26
R10: ffff8801cddbf0f8 R11: 0000000000000002 R12: 1ffff10039bb7e35
R13: ffff8801d04f4d40 R14: 1ffff10039bb7e39 R15: ffff8801d04f4df0
exit_pi_state_list+0x556/0x7a0 kernel/futex.c:932
mm_release+0x46d/0x590 kernel/fork.c:1148
exit_mm kernel/exit.c:499 [inline]
do_exit+0x481/0x1ad0 kernel/exit.c:852
do_group_exit+0x149/0x400 kernel/exit.c:968
get_signal+0x73f/0x16d0 kernel/signal.c:2334
do_signal+0x94/0x1ee0 arch/x86/kernel/signal.c:808
exit_to_usermode_loop+0x214/0x310 arch/x86/entry/common.c:158
prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
syscall_return_slowpath+0x42f/0x510 arch/x86/entry/common.c:266
entry_SYSCALL_64_fastpath+0xbc/0xbe
RIP: 0033:0x452719
RSP: 002b:00007effb0461be8 EFLAGS: 00000212 ORIG_RAX: 00000000000000ca
RAX: 0000000000000000 RBX: 0000000000758190 RCX: 0000000000452719
RDX: 0000000000000004 RSI: 000080000000000b RDI: 000000002000cffc
RBP: 000000000000008f R08: 0000000020048000 R09: 0000000000000000
R10: 0000000020edfff0 R11: 0000000000000212 R12: 00000000006eee08
R13: 00000000ffffffff R14: 00007effb04626d4 R15: 0000000000000004

Peter Zijlstra

Oct 31, 2017, 4:36:47 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
On Mon, Oct 30, 2017 at 12:44:00PM -0700, syzbot wrote:
> WARNING: CPU: 1 PID: 24353 at kernel/futex.c:818 get_pi_state+0x15b/0x190
> kernel/futex.c:818

> exit_pi_state_list+0x556/0x7a0 kernel/futex.c:932
> mm_release+0x46d/0x590 kernel/fork.c:1191
> exit_mm kernel/exit.c:499 [inline]
> do_exit+0x481/0x1b00 kernel/exit.c:852
> SYSC_exit kernel/exit.c:937 [inline]
> SyS_exit+0x22/0x30 kernel/exit.c:935
> entry_SYSCALL_64_fastpath+0x1f/0xbe


Argh, I definitely messed that up. Let me have a prod..

Peter Zijlstra

Oct 31, 2017, 5:16:08 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de

So that provided repro.c thing doesn't work _at_all_.

It's stuck on trying to create a tunnel for some daft reason... I don't
have that.

I'll try and hack up the repro.c file to see if I can make it 'work',
but it would be nice if reproducers could actually be run without too
much crap.

Dmitry Vyukov

Oct 31, 2017, 5:30:11 AM
to Peter Zijlstra, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
I understand your sentiment, but it's definitely not _at all_: the
system compiled this exact code, ran it, and triggered the bug with it.
Do you have suggestions on how to make this code more portable? How
would this setup look on your system?

We do try hard to get rid of unnecessary stuff in reproducers. I think
what happened in this case is the following. This is a hard-to-reproduce
race. The bot was able to reproduce the crash with the initial program,
which uses tun; it then tried to strip the tun code and re-reproduce,
but the crash did not trigger that time, so it concluded that the tun
code is somehow necessary here. That's an unfortunate consequence of
testing complex concurrent code. It may become somewhat better once we
have KTSAN, the race detector.

Peter Zijlstra

Oct 31, 2017, 6:08:56 AM
to Dmitry Vyukov, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
> I understand your sentiment, but it's definitely not _at all_: the
> system compiled this exact code, ran it, and triggered the bug with it.
> Do you have suggestions on how to make this code more portable? How
> would this setup look on your system?

So I don't see the point of that tun stuff; what was it supposed to do?

All it ever did after creation was flush_tun(), which reads until empty.
But given nobody would ever write into it, that's an 'expensive' NO-OP.
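
For context, the flush_tun() helper in these generated reproducers is
presumably just a nonblocking drain loop of roughly this shape (a
sketch, not the literal repro.c code; tun_fd stands in for whatever
descriptor the reproducer opened):

	#include <unistd.h>

	static void flush_tun(int tun_fd)
	{
		char buf[1000];

		/* tun_fd is opened O_NONBLOCK, so this returns as soon as
		 * the queue is empty; if nothing ever writes to the device,
		 * every read() fails immediately, making this the
		 * 'expensive' NO-OP described above. */
		while (read(tun_fd, buf, sizeof(buf)) > 0)
			;
	}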

> We do try hard to get rid of unnecessary stuff in reproducers. I think
> what happened in this case is the following. This is a hard-to-reproduce
> race. The bot was able to reproduce the crash with the initial program,
> which uses tun; it then tried to strip the tun code and re-reproduce,
> but the crash did not trigger that time, so it concluded that the tun
> code is somehow necessary here. That's an unfortunate consequence of
> testing complex concurrent code. It may become somewhat better once we
> have KTSAN, the race detector.

I ripped out the tun bits and it reproduced in ~100 seconds. I've now
got it running for well over 30m on the fixed kernel while I'm trying to
come up with a comprehensible Changelog ;-)

Peter Zijlstra

Oct 31, 2017, 6:18:56 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
The below appears to cure the problem; I could (fairly quickly)
reproduce the issue once I hacked up the repro.c to not bother with
tunnels.

With the below patch, the reproducer has been running for a fairly long
time now without issue.

This should fix both that WARN and the UAF report; they were related
problems.

---
Subject: futex: Fix more put_pi_state() vs exit_pi_state_list() races

Dmitry (through syzbot) reported being able to trigger the WARN in
get_pi_state() and a use-after-free on
raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock).

Both are due to this race:

  exit_pi_state_list()				put_pi_state()

  lock(&curr->pi_lock)
  while() {
	pi_state = list_first_entry(head);
	hb = hash_futex(&pi_state->key);
	unlock(&curr->pi_lock);

						dec_and_test(&pi_state->refcount);

	lock(&hb->lock)
	lock(&pi_state->pi_mutex.wait_lock)	// uaf if pi_state free'd
	lock(&curr->pi_lock);

	....

	unlock(&curr->pi_lock);
	get_pi_state();				// WARN; refcount==0

The problem is that we take the reference count too late, and don't
allow it to be 0. Fix it by using inc_not_zero() and simply retrying the
loop when we fail to get a refcount. In that case put_pi_state() should
remove the entry from the list.

Cc: Gratian Crisan <gratian...@ni.com>
Reported-by: Dmitry Vyukov <dvy...@google.com>
Fixes: c74aef2d06a9 ("futex: Fix pi_state->owner serialization")
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
---
 kernel/futex.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 0518a0bfc746..ca5bb9cba5cf 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -903,11 +903,27 @@ void exit_pi_state_list(struct task_struct *curr)
 	 */
 	raw_spin_lock_irq(&curr->pi_lock);
 	while (!list_empty(head)) {
-
 		next = head->next;
 		pi_state = list_entry(next, struct futex_pi_state, list);
 		key = pi_state->key;
 		hb = hash_futex(&key);
+
+		/*
+		 * We can race against put_pi_state() removing itself from the
+		 * list (a waiter going away). put_pi_state() will first
+		 * decrement the reference count and then modify the list, so
+		 * it's possible to see the list entry but fail this reference
+		 * acquire.
+		 *
+		 * In that case, drop the locks to let put_pi_state() make
+		 * progress and retry the loop.
+		 */
+		if (!atomic_inc_not_zero(&pi_state->refcount)) {
+			raw_spin_unlock_irq(&curr->pi_lock);
+			cpu_relax();
+			raw_spin_lock_irq(&curr->pi_lock);
+			continue;
+		}
 		raw_spin_unlock_irq(&curr->pi_lock);
 
 		spin_lock(&hb->lock);
@@ -918,8 +934,10 @@ void exit_pi_state_list(struct task_struct *curr)
 		 * task still owns the PI-state:
 		 */
 		if (head->next != next) {
+			/* retain curr->pi_lock for the loop invariant */
 			raw_spin_unlock(&pi_state->pi_mutex.wait_lock);
 			spin_unlock(&hb->lock);
+			put_pi_state(pi_state);
 			continue;
 		}
 
@@ -927,9 +945,8 @@ void exit_pi_state_list(struct task_struct *curr)
 		WARN_ON(list_empty(&pi_state->list));
 		list_del_init(&pi_state->list);
 		pi_state->owner = NULL;
-		raw_spin_unlock(&curr->pi_lock);
 
-		get_pi_state(pi_state);
+		raw_spin_unlock(&curr->pi_lock);
 		raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
 		spin_unlock(&hb->lock);
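
For readers following along outside the kernel tree: the crux of the
patch is the "increment the refcount only if it is still nonzero"
idiom. A hypothetical userspace analogue using C11 atomics, with
illustrative names, would be:

	#include <stdatomic.h>
	#include <stdbool.h>

	/* Take a reference only if the object is still live. A failed
	 * attempt means a concurrent put dropped the last reference, so
	 * the caller must skip/retry instead of touching the object. */
	static bool get_ref_unless_zero(atomic_int *refcount)
	{
		int old = atomic_load(refcount);

		while (old != 0) {
			/* CAS old -> old+1; on failure, old is reloaded. */
			if (atomic_compare_exchange_weak(refcount, &old,
							 old + 1))
				return true;
		}
		return false;
	}

A plain unconditional increment cannot distinguish a live object from
one whose last reference was already dropped, which is exactly the
window the WARN caught.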

Dmitry Vyukov

Oct 31, 2017, 6:21:21 AM
to Peter Zijlstra, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
On Tue, Oct 31, 2017 at 1:08 PM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
>> I understand your sentiment, but it's definitely not _at all_: the
>> system compiled this exact code, ran it, and triggered the bug with it.
>> Do you have suggestions on how to make this code more portable? How
>> would this setup look on your system?
>
> So I don't see the point of that tun stuff; what was it supposed to do?
>
> All it ever did after creation was flush_tun(), which reads until empty.
> But given nobody would ever write into it, that's an 'expensive' NO-OP.

See the text below.
The bot does try to minimize both the program and the features used
(e.g. also those clunky NONFAILING macros, and the filesystem business).
But if it takes 100 seconds to reproduce, then minimization is hard.
Consider bisecting such bugs: that will also be hard and unreliable,
and you can end up with the wrong commit.

See this for an example of a much tidier reproducer:
https://groups.google.com/forum/#!topic/syzkaller-bugs/9nYn7hpNpEk
But that's a single-threaded bug that triggers instantly each time you
run the program.

Dmitry Vyukov

Oct 31, 2017, 6:23:34 AM
to Peter Zijlstra, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
On Tue, Oct 31, 2017 at 1:21 PM, Dmitry Vyukov <dvy...@google.com> wrote:
> On Tue, Oct 31, 2017 at 1:08 PM, Peter Zijlstra <pet...@infradead.org> wrote:
>> On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
>>> I understand your sentiment, but it's definitely not _at all_: the
>>> system compiled this exact code, ran it, and triggered the bug with it.
>>> Do you have suggestions on how to make this code more portable? How
>>> would this setup look on your system?
>>
>> So I don't see the point of that tun stuff; what was it supposed to do?
>>
>> All it ever did after creation was flush_tun(), which reads until empty.
>> But given nobody would ever write into it, that's an 'expensive' NO-OP.
>
> See the text below.
> The bot does try to minimize both the program and the features used
> (e.g. also those clunky NONFAILING macros, and the filesystem business).
> But if it takes 100 seconds to reproduce, then minimization is hard.
> Consider bisecting such bugs: that will also be hard and unreliable,
> and you can end up with the wrong commit.
>
> See this for an example of a much tidier reproducer:
> https://groups.google.com/forum/#!topic/syzkaller-bugs/9nYn7hpNpEk
> But that's a single-threaded bug that triggers instantly each time you
> run the program.


But having said that, the tun code is not supposed to break the
reproducer either. E.g. on our systems it just sets up tun
successfully and then proceeds to the actual code that triggers the
problem. What's the failure mode of the tun code on your system? If we
make it more portable, then such repros will work on your system as
well.
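
For reference, "sets up tun" is the standard /dev/net/tun clone-device
dance; a minimal sketch (the interface name is illustrative, and a real
run needs CAP_NET_ADMIN plus tun support in the kernel):

	#include <fcntl.h>
	#include <string.h>
	#include <unistd.h>
	#include <net/if.h>
	#include <sys/ioctl.h>
	#include <linux/if_tun.h>

	static int setup_tun(void)
	{
		struct ifreq ifr;
		/* Fails outright on kernels built without tun support,
		 * which matches the failure mode Peter describes below. */
		int fd = open("/dev/net/tun", O_RDWR | O_NONBLOCK);

		if (fd < 0)
			return -1;

		memset(&ifr, 0, sizeof(ifr));
		ifr.ifr_flags = IFF_TUN | IFF_NO_PI;
		strncpy(ifr.ifr_name, "syz0", IFNAMSIZ - 1);

		if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
			close(fd);
			return -1;
		}
		return fd;
	}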

Peter Zijlstra

Oct 31, 2017, 6:31:40 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
On Tue, Oct 31, 2017 at 11:18:53AM +0100, Peter Zijlstra wrote:
> On Tue, Oct 31, 2017 at 09:36:44AM +0100, Peter Zijlstra wrote:
> > On Mon, Oct 30, 2017 at 12:44:00PM -0700, syzbot wrote:
> > > WARNING: CPU: 1 PID: 24353 at kernel/futex.c:818 get_pi_state+0x15b/0x190
> > > kernel/futex.c:818
> >
> > > exit_pi_state_list+0x556/0x7a0 kernel/futex.c:932
> > > mm_release+0x46d/0x590 kernel/fork.c:1191
> > > exit_mm kernel/exit.c:499 [inline]
> > > do_exit+0x481/0x1b00 kernel/exit.c:852
> > > SYSC_exit kernel/exit.c:937 [inline]
> > > SyS_exit+0x22/0x30 kernel/exit.c:935
> > > entry_SYSCALL_64_fastpath+0x1f/0xbe
> >
> >
> > Argh, I definitely messed that up. Let me have a prod..
>
> The below appears to cure the problem; I could (fairly quickly)
> reproduce the issue once I hacked up the repro.c to not bother with
> tunnels.
>
> With the below patch, the reproducer has been running for a fairly long
> time now without issue.

And of course, now it went *splat*, lemme continue staring..

Peter Zijlstra

Oct 31, 2017, 6:36:30 AM
to Dmitry Vyukov, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
On Tue, Oct 31, 2017 at 01:23:13PM +0300, Dmitry Vyukov wrote:

> But having said that, the tun code is not supposed to break the
> reproducer either. E.g. on our systems it just sets up tun
> successfully and then proceeds to the actual code that triggers the
> problem. What's the failure mode of the tun code on your system? If we
> make it more portable, then such repros will work on your system as
> well.

It completely fails to create a tun (I probably don't have support for
that built in) and then just sits there doing nothing. I didn't spend
too much time analyzing it and just ripped it out.

Peter Zijlstra

Oct 31, 2017, 6:38:15 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
On Tue, Oct 31, 2017 at 11:31:34AM +0100, Peter Zijlstra wrote:

> And of course, now it went *splat*, lemme continue staring..

PEBKAC: it turns out I was running the kernel without the patch in, and
it only took 3100 seconds to trigger the splat...

I am now actually running the kernel with the patch in; I'll leave the
machine running it for a few hours.

Peter Zijlstra

Nov 1, 2017, 4:45:29 AM
to syzbot, dvh...@infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
OK, I let the kernel+patch run overnight and it still lives. That's
almost 22 hours of running the repro without issue.

Dmitry Vyukov

Nov 7, 2017, 9:51:20 AM
to Peter Zijlstra, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner, andreyknvl
On Tue, Oct 31, 2017 at 11:08 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
>> I understand your sentiment, but it's definitely not _at all_: the
>> system compiled this exact code, ran it, and triggered the bug with it.
>> Do you have suggestions on how to make this code more portable? How
>> would this setup look on your system?
>
> So I don't see the point of that tun stuff; what was it supposed to do?
>
> All it ever did after creation was flush_tun(), which reads until empty.
> But given nobody would ever write into it, that's an 'expensive' NO-OP.

I've filed https://github.com/google/syzkaller/issues/414 for this. If
the program does not use tun, we will strip that code right away.
Thanks for the feedback.

Dmitry Vyukov

Nov 7, 2017, 11:16:34 AM
to Peter Zijlstra, syzbot, dvh...@infradead.org, LKML, Ingo Molnar, syzkall...@googlegroups.com, Thomas Gleixner
Let's tell the bot about the fix; otherwise it will never report bugs
in get_pi_state again:

#syz fix: futex: Fix more put_pi_state() vs. exit_pi_state_list() races