WARNING: suspicious RCU usage (3)

20 views
Skip to first unread message

syzbot

unread,
Nov 28, 2017, 3:45:04 PM11/28/17
to ak...@linux-foundation.org, c...@linux.com, iamjoon...@lge.com, linux-...@vger.kernel.org, linu...@kvack.org, pen...@kernel.org, rien...@google.com, syzkall...@googlegroups.com
Hello,

syzkaller hit the following crash on
b0a84f19a5161418d4360cd57603e94ed489915e
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
compiler: gcc (GCC) 7.1.1 20170620
.config is attached
Raw console output is attached.

Unfortunately, I don't have any reproducer for this bug yet.


WARNING: suspicious RCU usage
4.14.0-next-20171127+ #53 Not tainted
BUG: unable to handle kernel NULL pointer dereference at 0000000000000074
IP: virt_to_cache mm/slab.c:400 [inline]
IP: kfree+0xb2/0x250 mm/slab.c:3803
PGD 1cd9be067 P4D 1cd9be067 PUD 1c646d067 PMD 0
Oops: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 17319 Comm: syz-executor7 Not tainted 4.14.0-next-20171127+ #53
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
task: ffff8801c5442040 task.stack: ffff8801c7ed8000
RIP: 0010:virt_to_cache mm/slab.c:400 [inline]
RIP: 0010:kfree+0xb2/0x250 mm/slab.c:3803
RSP: 0018:ffff8801c7edf780 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8801c7edf948 RCX: ffffffffffffffff
RDX: ffffea00071fb7c0 RSI: 0000000000000000 RDI: ffff8801c7edf948
RBP: ffff8801c7edf7a0 R08: ffffed003b02866c R09: 0000000000000000
R10: 0000000000000001 R11: ffffed003b02866b R12: 0000000000000286
R13: 0000000000000000 R14: ffff8801c7edf948 R15: ffff8801c7edf8b0
FS: 00007ff14d179700(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000074 CR3: 00000001c6768000 CR4: 00000000001426e0
Call Trace:
blkcipher_walk_done+0x72b/0xde0 crypto/blkcipher.c:139
encrypt+0x50a/0xaf0 crypto/salsa20_generic.c:208
skcipher_crypt_blkcipher crypto/skcipher.c:622 [inline]
skcipher_encrypt_blkcipher+0x213/0x310 crypto/skcipher.c:631
crypto_skcipher_encrypt include/crypto/skcipher.h:445 [inline]
_skcipher_recvmsg crypto/algif_skcipher.c:144 [inline]
skcipher_recvmsg+0x9e8/0xf20 crypto/algif_skcipher.c:165
sock_recvmsg_nosec net/socket.c:805 [inline]
sock_recvmsg+0xc9/0x110 net/socket.c:812
___sys_recvmsg+0x29b/0x630 net/socket.c:2207
__sys_recvmsg+0xe2/0x210 net/socket.c:2252
SYSC_recvmsg net/socket.c:2264 [inline]
SyS_recvmsg+0x2d/0x50 net/socket.c:2259
entry_SYSCALL_64_fastpath+0x1f/0x96
RIP: 0033:0x4529d9
RSP: 002b:00007ff14d178c58 EFLAGS: 00000212 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 0000000000758190 RCX: 00000000004529d9
RDX: 0000000000010000 RSI: 0000000020d63fc8 RDI: 0000000000000018
RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000212 R12: 00000000006f2728
R13: 00000000ffffffff R14: 00007ff14d1796d4 R15: 0000000000000000
Code: c2 48 b8 00 00 00 00 00 ea ff ff 48 89 df 48 c1 ea 0c 48 c1 e2 06 48
01 c2 48 8b 42 20 48 8d 48 ff a8 01 48 0f 45 d1 4c 8b 6a 30 <49> 63 75 74
e8 b5 5c af ff 48 89 de 4c 89 ef 4c 8b 75 08 e8 76
RIP: virt_to_cache mm/slab.c:400 [inline] RSP: ffff8801c7edf780
RIP: kfree+0xb2/0x250 mm/slab.c:3803 RSP: ffff8801c7edf780
CR2: 0000000000000074
---[ end trace e3c719a9c9d01886 ]---


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzk...@googlegroups.com.
Please credit me with: Reported-by: syzbot <syzk...@googlegroups.com>

syzbot will keep track of this bug report.
Once a fix for this bug is committed, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Note: all commands must start from beginning of the line in the email body.
config.txt
raw.log

Andrew Morton

unread,
Nov 28, 2017, 4:30:30 PM11/28/17
to syzbot, c...@linux.com, iamjoon...@lge.com, linux-...@vger.kernel.org, linu...@kvack.org, pen...@kernel.org, rien...@google.com, syzkall...@googlegroups.com, Paul E. McKenney, Herbert Xu
On Tue, 28 Nov 2017 12:45:01 -0800 syzbot <bot+73a7bec1bc0f4fc051...@syzkaller.appspotmail.com> wrote:

> Hello,
>
> syzkaller hit the following crash on
> b0a84f19a5161418d4360cd57603e94ed489915e
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
>
> Unfortunately, I don't have any reproducer for this bug yet.
>
>
> WARNING: suspicious RCU usage

There's a bunch of other info which lockdep_rcu_suspicious() should
have printed out, but this trace doesn't have any of it. I wonder why.
It looks like blkcipher_walk_done() passed a bad address to kfree().

Paul E. McKenney

unread,
Nov 28, 2017, 5:30:47 PM11/28/17
to Andrew Morton, syzbot, c...@linux.com, iamjoon...@lge.com, linux-...@vger.kernel.org, linu...@kvack.org, pen...@kernel.org, rien...@google.com, syzkall...@googlegroups.com, Herbert Xu
On Tue, Nov 28, 2017 at 01:30:26PM -0800, Andrew Morton wrote:
> On Tue, 28 Nov 2017 12:45:01 -0800 syzbot <bot+73a7bec1bc0f4fc051...@syzkaller.appspotmail.com> wrote:
>
> > Hello,
> >
> > syzkaller hit the following crash on
> > b0a84f19a5161418d4360cd57603e94ed489915e
> > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> > compiler: gcc (GCC) 7.1.1 20170620
> > .config is attached
> > Raw console output is attached.
> >
> > Unfortunately, I don't have any reproducer for this bug yet.
> >
> > WARNING: suspicious RCU usage
>
> There's a bunch of other info which lockdep_rcu_suspicious() should
> have printed out, but this trace doesn't have any of it. I wonder why.

Yes, there should be more info printed, no idea why it would go missing.

Thanx, Paul
> > See https://urldefense.proofpoint.com/v2/url?u=https-3A__goo.gl_tpsmEJ&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=q4hkQkeaNH3IlTsPvEwkaUALMqf7y6jCMwT5b6lVQbQ&m=aTCoRUwymtfv220QJqsca2w9mocNKMrzqpgtUF-s558&s=2sG5vQqYSKlZIxmT377N3IAs1G31yBEQVetU4JSSt34&e= for details.

Eric Biggers

unread,
Nov 29, 2017, 12:06:09 AM11/29/17
to Andrew Morton, syzbot, c...@linux.com, iamjoon...@lge.com, linux-...@vger.kernel.org, linu...@kvack.org, pen...@kernel.org, rien...@google.com, syzkall...@googlegroups.com, Paul E. McKenney, Herbert Xu
On Tue, Nov 28, 2017 at 01:30:26PM -0800, Andrew Morton wrote:
>
> It looks like blkcipher_walk_done() passed a bad address to kfree().
>

Indeed, it's freeing uninitialized memory because the Salsa20 algorithms are
using the blkcipher_walk API incorrectly. I've sent a patch to fix it:

"crypto: salsa20 - fix blkcipher_walk API usage"

I am not sure why the bug reports show up as "suspicious RCU usage", though.

There were also a few other syzbot reports of this same underlying bug; I marked
them as duplicates of this one.

Eric

Dmitry Vyukov

unread,
Nov 29, 2017, 1:25:54 AM11/29/17
to Paul McKenney, Andrew Morton, syzbot, Christoph Lameter, Joonsoo Kim, LKML, Linux-MM, Pekka Enberg, David Rientjes, syzkall...@googlegroups.com, Herbert Xu
On Tue, Nov 28, 2017 at 11:30 PM, Paul E. McKenney
<pau...@linux.vnet.ibm.com> wrote:
> On Tue, Nov 28, 2017 at 01:30:26PM -0800, Andrew Morton wrote:
>> On Tue, 28 Nov 2017 12:45:01 -0800 syzbot <bot+73a7bec1bc0f4fc051...@syzkaller.appspotmail.com> wrote:
>>
>> > Hello,
>> >
>> > syzkaller hit the following crash on
>> > b0a84f19a5161418d4360cd57603e94ed489915e
>> > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> > compiler: gcc (GCC) 7.1.1 20170620
>> > .config is attached
>> > Raw console output is attached.
>> >
>> > Unfortunately, I don't have any reproducer for this bug yet.
>> >
>> > WARNING: suspicious RCU usage
>>
>> There's a bunch of other info which lockdep_rcu_suspicious() should
>> have printed out, but this trace doesn't have any of it. I wonder why.
>
> Yes, there should be more info printed, no idea why it would go missing.

I think that's because while reporting "suspicious RCU usage" kernel hit:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000074

and the rest of the report is actually about the NULL deref.

syzkaller hits too many crashes at the same time. And it's a problem
for us. We get reports with corrupted/intermixed titles,
corrupted/intermixed bodies, reports with same titles but about
different bugs, etc.

Paul E. McKenney

unread,
Nov 29, 2017, 3:51:07 PM11/29/17
to Dmitry Vyukov, Andrew Morton, syzbot, Christoph Lameter, Joonsoo Kim, LKML, Linux-MM, Pekka Enberg, David Rientjes, syzkall...@googlegroups.com, Herbert Xu
Got it, thank you!

Thanx, Paul

Eric Biggers

unread,
Nov 29, 2017, 7:47:47 PM11/29/17
to Andrew Morton, syzbot, c...@linux.com, iamjoon...@lge.com, linux-...@vger.kernel.org, linu...@kvack.org, pen...@kernel.org, rien...@google.com, syzkall...@googlegroups.com, Paul E. McKenney, Herbert Xu
The reason the "suspicious RCU usage" warning appeared is that due to the
incorrect call to blkcipher_walk_done(), kunmap_atomic() was being called
without a preceding kmap_atomic(), causing the preemption count to get screwed
up. This was in addition to the uninitialized pointer being kfree()'d.

Running a reproducer does show more information after the "WARNING: suspicious
RCU usage" (see below). So it does look like the report from syzkaller was
truncated, perhaps because two things went wrong right after each other.

Also, maybe enabling CONFIG_DEBUG_PREEMPT would be useful?

[ 9.136392]
[ 9.137202] =============================
[ 9.138014] WARNING: suspicious RCU usage
[ 9.138909] 4.15.0-rc1-00033-gef0010a30935 #113 Not tainted
[ 9.141195] -----------------------------
[ 9.142145] ./include/trace/events/kmem.h:142 suspicious rcu_dereference_check() usage!
[ 9.144400]
[ 9.144400] other info that might help us debug this:
[ 9.144400]
[ 9.146292]
[ 9.146292] rcu_scheduler_active = 2, debug_locks = 1
[ 9.148203] 1 lock held by syz_salsa20/625:
[ 9.149215] #0: (sk_lock-AF_ALG){+.+.}, at: [<00000000e0f6099e>] af_alg_wait_for_data+0xd8/0x150
[ 9.151682]
[ 9.151682] stack backtrace:
[ 9.152658] CPU: 1 PID: 625 Comm: syz_salsa20 Not tainted 4.15.0-rc1-00033-gef0010a30935 #113
[ 9.154669] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 9.156408] Call Trace:
[ 9.156964] dump_stack+0x7c/0xb3
[ 9.157696] kfree+0x1c1/0x210
[ 9.158377] blkcipher_walk_done+0x21c/0x2c0
[ 9.159319] encrypt+0x7b/0xd0
[ 9.160000] ? skcipher_decrypt_blkcipher+0x40/0x50
[ 9.161061] ? skcipher_recvmsg+0x37a/0x3a0
[ 9.161981] ? sock_read_iter+0x93/0xd0
[ 9.162835] ? __vfs_read+0xcc/0x140
[ 9.163582] ? vfs_read+0x9c/0x130
[ 9.164282] ? SyS_read+0x45/0xb0
[ 9.164974] ? entry_SYSCALL_64_fastpath+0x1f/0x96
[ 9.166015] kfree_debugcheck: out of range ptr 28h
[ 9.166985] ------------[ cut here ]------------
[ 9.167834] kernel BUG at mm/slab.c:2753!
[ 9.168584] invalid opcode: 0000 [#1] SMP
[ 9.169335] CPU: 1 PID: 625 Comm: syz_salsa20 Not tainted 4.15.0-rc1-00033-gef0010a30935 #113
[ 9.171067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 9.172689] task: 00000000ee01d793 task.stack: 0000000004031a33
[ 9.173885] RIP: 0010:kfree_debugcheck+0x23/0x30
[ 9.174833] RSP: 0018:ffffb46b0092fc80 EFLAGS: 00010096
[ 9.175857] RAX: 0000000000000026 RBX: 0000000000000028 RCX: 0000000000000000
[ 9.177218] RDX: 0000000000000001 RSI: ffff99daff5cccc8 RDI: ffff99daff5cccc8
[ 9.178555] RBP: 0000000000000206 R08: 0000000000000001 R09: 0000000000000001
[ 9.179923] R10: 000000001f5d6993 R11: 0000000000000000 R12: ffffffff85b64b1c
[ 9.181284] R13: 0000000000000000 R14: ffffb46b0092fd98 R15: ffff99daf87b9000
[ 9.182617] FS: 00000000013bb880(0000) GS:ffff99daff400000(0000) knlGS:0000000000000000
[ 9.184148] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9.185246] CR2: 00007f087bad7008 CR3: 0000000079f52003 CR4: 00000000001606e0
[ 9.186608] Call Trace:
[ 9.187081] kfree+0x5a/0x210
[ 9.187602] blkcipher_walk_done+0x21c/0x2c0
[ 9.188370] encrypt+0x7b/0xd0
[ 9.188933] ? skcipher_decrypt_blkcipher+0x40/0x50
[ 9.189796] ? skcipher_recvmsg+0x37a/0x3a0
[ 9.190541] ? sock_read_iter+0x93/0xd0
[ 9.191241] ? __vfs_read+0xcc/0x140
[ 9.191897] ? vfs_read+0x9c/0x130
[ 9.192502] ? SyS_read+0x45/0xb0
[ 9.193110] ? entry_SYSCALL_64_fastpath+0x1f/0x96
[ 9.193959] Code: 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 89 fb e8 32 f5 e1 ff 84 c0 74 02 5b c3 48 89 de 48 c7 c7 50 9c 21 86 e8 9a a0 f1 ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 57 89
[ 9.197364] RIP: kfree_debugcheck+0x23/0x30 RSP: ffffb46b0092fc80
[ 9.198455] ---[ end trace 833d54cb4ca6de67 ]---
[ 9.199291] Kernel panic - not syncing: Fatal exception in interrupt
[ 9.200595] Kernel Offset: 0x4600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 9.202405] Rebooting in 5 seconds..

Dmitry Vyukov

unread,
Nov 30, 2017, 3:04:23 AM11/30/17
to Eric Biggers, Andrew Morton, syzbot, Christoph Lameter, Joonsoo Kim, LKML, Linux-MM, Pekka Enberg, David Rientjes, syzkall...@googlegroups.com, Paul E. McKenney, Herbert Xu
On Thu, Nov 30, 2017 at 1:47 AM, Eric Biggers <ebig...@gmail.com> wrote:
> On Tue, Nov 28, 2017 at 09:06:06PM -0800, Eric Biggers wrote:
>> On Tue, Nov 28, 2017 at 01:30:26PM -0800, Andrew Morton wrote:
>> >
>> > It looks like blkcipher_walk_done() passed a bad address to kfree().
>> >
>>
>> Indeed, it's freeing uninitialized memory because the Salsa20 algorithms are
>> using the blkcipher_walk API incorrectly. I've sent a patch to fix it:
>>
>> "crypto: salsa20 - fix blkcipher_walk API usage"
>>
>> I am not sure why the bug reports show up as "suspicious RCU usage", though.
>>
>> There were also a few other syzbot reports of this same underlying bug; I marked
>> them as duplicates of this one.
>>
>
> The reason the "suspicious RCU usage" warning appeared is that due to the
> incorrect call to blkcipher_walk_done(), kunmap_atomic() was being called
> without a preceding kmap_atomic(), causing the preemption count to get screwed
> up. This was in addition to the uninitialized pointer being kfree()'d.
>
> Running a reproducer does show more information after the "WARNING: suspicious
> RCU usage" (see below). So it does look like the report from syzkaller was
> truncated, perhaps because two things went wrong right after each other.
>
> Also, maybe enabling CONFIG_DEBUG_PREEMPT would be useful?


DEBUG_PREEMPT depends on PREEMPT, which is not enabled. So it seems
there is nothing to debug. Or how would it help?
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bug...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/20171130004743.GB65846%40gmail.com.
> For more options, visit https://groups.google.com/d/optout.

Dmitry Vyukov

unread,
Dec 1, 2017, 3:28:31 AM12/1/17
to Eric Biggers, Andrew Morton, syzbot, Christoph Lameter, Joonsoo Kim, LKML, Linux-MM, Pekka Enberg, David Rientjes, syzkall...@googlegroups.com, Paul E. McKenney, Herbert Xu
On Thu, Nov 30, 2017 at 9:04 AM, Dmitry Vyukov <dvy...@google.com> wrote:
> On Thu, Nov 30, 2017 at 1:47 AM, Eric Biggers <ebig...@gmail.com> wrote:
>> On Tue, Nov 28, 2017 at 09:06:06PM -0800, Eric Biggers wrote:
>>> On Tue, Nov 28, 2017 at 01:30:26PM -0800, Andrew Morton wrote:
>>> >
>>> > It looks like blkcipher_walk_done() passed a bad address to kfree().
>>> >
>>>
>>> Indeed, it's freeing uninitialized memory because the Salsa20 algorithms are
>>> using the blkcipher_walk API incorrectly. I've sent a patch to fix it:
>>>
>>> "crypto: salsa20 - fix blkcipher_walk API usage"

This is already applied to crypto tree, so let's do:

#syz fix: crypto: salsa20 - fix blkcipher_walk API usage
Reply all
Reply to author
Forward
0 new messages