WARNING in syscall_exit_to_user_mode

47 views
Skip to first unread message

syzbot

unread,
Sep 11, 2020, 6:39:20ā€ÆAM9/11/20
to kees...@chromium.org, linux-...@vger.kernel.org, mi...@kernel.org, syzkall...@googlegroups.com, tg...@linutronix.de
Hello,

syzbot found the following issue on:

HEAD commit: f4d51dff Linux 5.9-rc4
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12a4e7cd900000
kernel config: https://syzkaller.appspot.com/x/.config?x=a9075b36a6ae26c9
dashboard link: https://syzkaller.appspot.com/bug?extid=7ffc7214b893651d52b8
compiler: gcc (GCC) 10.1.0-syz 20200507
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=122d7335900000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13cea1a5900000

Bisection is inconclusive: the issue happens on the oldest tested release.

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=162247cd900000
final oops: https://syzkaller.appspot.com/x/report.txt?x=152247cd900000
console output: https://syzkaller.appspot.com/x/log.txt?x=112247cd900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7ffc72...@syzkaller.appspotmail.com

------------[ cut here ]------------
syscall 56 left IRQs disabled
WARNING: CPU: 1 PID: 31467 at kernel/entry/common.c:245 syscall_exit_to_user_mode_prepare kernel/entry/common.c:245 [inline]
WARNING: CPU: 1 PID: 31467 at kernel/entry/common.c:245 syscall_exit_to_user_mode+0xc1/0x2e0 kernel/entry/common.c:263
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 31467 Comm: syz-executor465 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x198/0x1fd lib/dump_stack.c:118
panic+0x347/0x7c0 kernel/panic.c:231
__warn.cold+0x20/0x46 kernel/panic.c:600
report_bug+0x1bd/0x210 lib/bug.c:198
handle_bug+0x38/0x90 arch/x86/kernel/traps.c:234
exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:254
asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:536
RIP: 0010:syscall_exit_to_user_mode_prepare kernel/entry/common.c:245 [inline]
RIP: 0010:syscall_exit_to_user_mode+0xc1/0x2e0 kernel/entry/common.c:263
Code: 58 00 48 8b 7c 24 20 5b 5d 41 5c 41 5d e9 97 f7 ff ff e8 72 88 85 f9 eb c6 0f 0b 48 63 f6 48 c7 c7 50 a9 6a 89 e8 61 c8 50 f9 <0f> 0b e8 b8 8e 85 f9 48 83 3d 28 b6 c1 01 00 0f 84 7d 01 00 00 fb
RSP: 0018:ffffc90003f27f30 EFLAGS: 00010082
RAX: 0000000000000000 RBX: 0000000000004000 RCX: 0000000000000000
RDX: ffff8880a2ec6480 RSI: ffffffff815db9a7 RDI: fffff520007e4fd8
RBP: ffffc90003f27f58 R08: 0000000000000000 R09: ffffffff89bcb4a3
R10: 00000000000007db R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
ret_from_fork+0x15/0x30 arch/x86/entry/entry_64.S:287
RIP: 0033:0x44da49
Code: ff 48 85 f6 0f 84 a7 dc fb ff 48 83 ee 10 48 89 4e 08 48 89 3e 48 89 d7 4c 89 c2 4d 89 c8 4c 8b 54 24 08 b8 38 00 00 00 0f 05 <48> 85 c0 0f 8c 7e dc fb ff 74 01 c3 31 ed 48 f7 c7 00 00 01 00 75
RSP: 002b:00007f7349191db0 EFLAGS: 00000202 ORIG_RAX: 0000000000000038
RAX: 0000000000000000 RBX: 00007f7349192700 RCX: 000000000044da49
RDX: 00007f73491929d0 RSI: 00007f7349191db0 RDI: 00000000003d0f00
RBP: 00007ffe9e518280 R08: 00007f7349192700 R09: 00007f7349192700
R10: 00007f73491929d0 R11: 0000000000000202 R12: 0000000000000000
R13: 00007ffe9e5181df R14: 00007f73491929c0 R15: 20c49ba5e353f7cf
Kernel Offset: disabled


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Kees Cook

unread,
Sep 12, 2020, 1:52:56ā€ÆPM9/12/20
to tg...@linutronix.de, linux-...@vger.kernel.org, mi...@kernel.org, syzbot, syzkall...@googlegroups.com, Andy Lutomirski
On Fri, Sep 11, 2020 at 03:39:19AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: f4d51dff Linux 5.9-rc4
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=12a4e7cd900000
> kernel config: https://syzkaller.appspot.com/x/.config?x=a9075b36a6ae26c9
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ffc7214b893651d52b8
> compiler: gcc (GCC) 10.1.0-syz 20200507
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=122d7335900000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13cea1a5900000
>
> Bisection is inconclusive: the issue happens on the oldest tested release.
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=162247cd900000
> final oops: https://syzkaller.appspot.com/x/report.txt?x=152247cd900000
> console output: https://syzkaller.appspot.com/x/log.txt?x=112247cd900000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+7ffc72...@syzkaller.appspotmail.com
>
> ------------[ cut here ]------------
> syscall 56 left IRQs disabled

This WARN appears reachable. :)

I also see on the dashboard these other problems with the new entry
code:

https://syzkaller.appspot.com/bug?extid=d4336c84ed0099fdbe47
https://syzkaller.appspot.com/bug?extid=c4af95386364bc59b13e

I can't tell if any of these have been looked at yet, though.

--
Kees Cook

Thomas Gleixner

unread,
Sep 13, 2020, 3:40:16ā€ÆPM9/13/20
to syzbot, kees...@chromium.org, linux-...@vger.kernel.org, mi...@kernel.org, syzkall...@googlegroups.com, Christian Brauner, Peter Zijlstra
On Fri, Sep 11 2020 at 03:39, syzbot wrote:
>
> ------------[ cut here ]------------
> syscall 56 left IRQs disabled

Ouch. syscall 56 == sys_clone

> WARNING: CPU: 1 PID: 31467 at kernel/entry/common.c:245 syscall_exit_to_user_mode_prepare kernel/entry/common.c:245 [inline]

<snip>

> ret_from_fork+0x15/0x30 arch/x86/entry/entry_64.S:287

But this does not make any sense for regular fork because
ret_from_fork() invokes schedule_tail() which drops the runqueue lock
and enables interrupts.

But for kernel threads that's more interesting because after
schedule_tail() the kernel thread function is invoked and this can call
kernel_execve() and return to ret_from_fork() and then run through
syscall_exit_to_user_mode().

Let me get that reproducer...

Thomas Gleixner

unread,
Sep 13, 2020, 3:54:36ā€ÆPM9/13/20
to Kees Cook, linux-...@vger.kernel.org, mi...@kernel.org, syzbot, syzkall...@googlegroups.com, Andy Lutomirski, Peter Zijlstra
On Sat, Sep 12 2020 at 10:52, Kees Cook wrote:
>> ------------[ cut here ]------------
>> syscall 56 left IRQs disabled
>
> This WARN appears reachable. :)

The above is hardly a problem of the new entry code. It's just detecting
the wreckage...

> I also see on the dashboard these other problems with the new entry
> code:

This one is also just the messenger. That's the

lockdep_assert_irqs_disabled();

in irqentry_exit() if I'm reading the reports correctly. That's a #PF
returning with interrupts enabled for whatever weird reason. Let me
stare at that...

> https://syzkaller.appspot.com/bug?extid=d4336c84ed0099fdbe47

This one is not a new entry code problem either:

> https://syzkaller.appspot.com/bug?extid=c4af95386364bc59b13e

INFO: task syz-executor.0:5956 can't die for more than 143 seconds.
task:syz-executor.0 state:R running task stack:25424 pid: 5956 ppid: 14284 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:3778 [inline]
__schedule+0x8e5/0x21e0 kernel/sched/core.c:4527
preempt_schedule_irq+0xb0/0x150 kernel/sched/core.c:4785
irqentry_exit_cond_resched kernel/entry/common.c:333 [inline]
irqentry_exit_cond_resched kernel/entry/common.c:325 [inline]
irqentry_exit+0x65/0x90 kernel/entry/common.c:363
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:581
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:653 [inline]
RIP: 0010:lock_acquire+0x27b/0xad0 kernel/locking/lockdep.c:5008
Code: 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 0f 85 f8 06 00 00 48 83 3d 6a d1 5b 08 00 0f 84 a6 05 00 00 48 8b 7c 24 08 57 9d <0f> 1f 44 00 00 48 b8 00 00 00 00 00 fc ff df 48 03 44 24 10 48 c7
RSP: 0018:ffffc900088477c0 EFLAGS: 00000286
RAX: 1ffffffff136c7d9 RBX: ffff88808766c200 RCX: 000000005603e267
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: 0000000000000286
RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8c6a59e7
R10: fffffbfff18d4b3c R11: 0000000000000001 R12: 0000000000000002
R13: ffffffff89c67640 R14: 0000000000000000 R15: ffff88808766c200
rcu_lock_acquire include/linux/rcupdate.h:248 [inline]
rcu_read_lock include/linux/rcupdate.h:641 [inline]
inet_twsk_purge+0x112/0x7c0 net/ipv4/inet_timewait_sock.c:268
ops_exit_list+0x10d/0x160 net/core/net_namespace.c:189
setup_net+0x508/0x850 net/core/net_namespace.c:364
copy_net_ns+0x31e/0x760 net/core/net_namespace.c:482
create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110

The new entry code is just in the stack trace because that task was
preempted after a timer interrupt.

Thanks,

tglx

Kees Cook

unread,
Sep 14, 2020, 4:07:02ā€ÆPM9/14/20
to Thomas Gleixner, linux-...@vger.kernel.org, mi...@kernel.org, syzbot, syzkall...@googlegroups.com, Andy Lutomirski, Peter Zijlstra
On Sun, Sep 13, 2020 at 09:54:34PM +0200, Thomas Gleixner wrote:
> On Sat, Sep 12 2020 at 10:52, Kees Cook wrote:
> >> ------------[ cut here ]------------
> >> syscall 56 left IRQs disabled
> >
> > This WARN appears reachable. :)
>
> The above is hardly a problem of the new entry code. It's just detecting
> the wreckage...

Oh, well that's actually even better! :)

> > I also see on the dashboard these other problems with the new entry
> > code:
>
> > https://syzkaller.appspot.com/bug?extid=d4336c84ed0099fdbe47
>
> This one is also just the messenger. That's the
>
> lockdep_assert_irqs_disabled();
>
> in irqentry_exit() if I'm reading the reports correctly. That's a #PF
> returning with interrupts enabled for whatever weird reason. Let me
> stare at that...

Cool; thanks.

> > https://syzkaller.appspot.com/bug?extid=c4af95386364bc59b13e
>
> This one is not a new entry code problem either:
> [...]
> The new entry code is just in the stack trace because that task was
> preempted after a timer interrupt.

This seems a bit like the first issue: it's just getting noticed now?

--
Kees Cook

Thomas Gleixner

unread,
Sep 14, 2020, 4:23:01ā€ÆPM9/14/20
to Kees Cook, linux-...@vger.kernel.org, mi...@kernel.org, syzbot, syzkall...@googlegroups.com, Andy Lutomirski, Peter Zijlstra
On Mon, Sep 14 2020 at 13:06, Kees Cook wrote:
> On Sun, Sep 13, 2020 at 09:54:34PM +0200, Thomas Gleixner wrote:
>> On Sat, Sep 12 2020 at 10:52, Kees Cook wrote:
>> >> ------------[ cut here ]------------
>> >> syscall 56 left IRQs disabled
>> >
>> > This WARN appears reachable. :)
>>
>> The above is hardly a problem of the new entry code. It's just detecting
>> the wreckage...
>
> Oh, well that's actually even better! :)

Of course I can't reproduce here ...

>> > I also see on the dashboard these other problems with the new entry
>> > code:
>>
>> > https://syzkaller.appspot.com/bug?extid=d4336c84ed0099fdbe47
>>
>> This one is also just the messenger. That's the
>>
>> lockdep_assert_irqs_disabled();
>>
>> in irqentry_exit() if I'm reading the reports correctly. That's a #PF
>> returning with interrupts enabled for whatever weird reason. Let me
>> stare at that...
>
> Cool; thanks.

No idea so far.

>> > https://syzkaller.appspot.com/bug?extid=c4af95386364bc59b13e
>>
>> This one is not a new entry code problem either:
>> [...]
>> The new entry code is just in the stack trace because that task was
>> preempted after a timer interrupt.
>
> This seems a bit like the first issue: it's just getting noticed now?

IIRC, I've seen such reports on older kernels as well, but not with the
new entry code in the traces.

Thanks,

tglx

syzbot

unread,
Nov 8, 2020, 12:22:09ā€ÆPM11/8/20
to b.zoln...@samsung.com, chri...@brauner.io, dan.ca...@oracle.com, george....@oracle.com, kees...@chromium.org, linux-...@vger.kernel.org, lu...@kernel.org, mi...@kernel.org, pet...@infradead.org, syzkall...@googlegroups.com, tg...@linutronix.de
syzbot suspects this issue was fixed by commit:

commit a49145acfb975d921464b84fe00279f99827d816
Author: George Kennedy <george....@oracle.com>
Date: Tue Jul 7 19:26:03 2020 +0000

fbmem: add margin check to fb_check_caps()

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=17a21f32500000
start commit: f4d51dff Linux 5.9-rc4
git tree: upstream
If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: fbmem: add margin check to fb_check_caps()

Dmitry Vyukov

unread,
Nov 11, 2020, 6:11:53ā€ÆAM11/11/20
to syzbot, Bartlomiej Zolnierkiewicz, Christian Brauner, Dan Carpenter, George Kennedy, Kees Cook, LKML, Andy Lutomirski, Ingo Molnar, Peter Zijlstra, syzkaller-bugs, Thomas Gleixner
Reply all
Reply to author
Forward
0 new messages