KASAN: use-after-free Read in sha512_ctx_mgr_resubmit

36 views
Skip to first unread message

syzbot

unread,
Aug 15, 2018, 12:00:05 PM8/15/18
to da...@davemloft.net, ebig...@google.com, her...@gondor.apana.org.au, h...@zytor.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
Hello,

syzbot found the following crash on:

HEAD commit: 7796916146b8 Merge branch 'x86-cpu-for-linus' of git://git..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=164b1922400000
kernel config: https://syzkaller.appspot.com/x/.config?x=265bef9882cce8d7
dashboard link: https://syzkaller.appspot.com/bug?extid=d5455bac3ba1ee9114e5
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=1013478c400000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1349c8aa400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+d5455b...@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: use-after-free in sha512_ctx_mgr_resubmit.part.3+0x3b1/0x4a0
arch/x86/crypto/sha512-mb/sha512_mb.c:136
Read of size 4 at addr ffff8801b0b9e838 by task kworker/0:1/13

CPU: 0 PID: 13 Comm: kworker/0:1 Not tainted 4.18.0+ #187
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: crypto mcryptd_flusher
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
print_address_description+0x6c/0x20b mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
__asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
sha512_ctx_mgr_resubmit.part.3+0x3b1/0x4a0
arch/x86/crypto/sha512-mb/sha512_mb.c:136
sha512_ctx_mgr_resubmit arch/x86/crypto/sha512-mb/sha512_mb.c:135 [inline]
sha512_ctx_mgr_flush+0x5c/0xb0 arch/x86/crypto/sha512-mb/sha512_mb.c:367
sha512_mb_flusher+0x27b/0x610 arch/x86/crypto/sha512-mb/sha512_mb.c:939
mcryptd_flusher+0x342/0x4b0 crypto/mcryptd.c:208
process_one_work+0xc73/0x1ba0 kernel/workqueue.c:2153
worker_thread+0x189/0x13c0 kernel/workqueue.c:2296
kthread+0x35a/0x420 kernel/kthread.c:246
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413

Allocated by task 23902:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
__do_kmalloc mm/slab.c:3718 [inline]
__kmalloc+0x14e/0x760 mm/slab.c:3727
kmalloc include/linux/slab.h:518 [inline]
sock_kmalloc+0x156/0x1f0 net/core/sock.c:1996
hash_accept_parent_nokey+0x58/0x2e0 crypto/algif_hash.c:438
hash_accept_parent+0x5b/0x80 crypto/algif_hash.c:465
af_alg_accept+0x127/0x7d0 crypto/af_alg.c:296
alg_accept+0x46/0x60 crypto/af_alg.c:332
__sys_accept4+0x3b2/0x8a0 net/socket.c:1600
__do_sys_accept4 net/socket.c:1635 [inline]
__se_sys_accept4 net/socket.c:1632 [inline]
__x64_sys_accept4+0x97/0xf0 net/socket.c:1632
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 23902:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
__cache_free mm/slab.c:3498 [inline]
kfree+0xd9/0x260 mm/slab.c:3813
__sock_kfree_s net/core/sock.c:2017 [inline]
sock_kfree_s+0x29/0x60 net/core/sock.c:2023
hash_sock_destruct+0x157/0x1c0 crypto/algif_hash.c:427
__sk_destruct+0x107/0xa60 net/core/sock.c:1573
sk_destruct+0x78/0x90 net/core/sock.c:1608
__sk_free+0xcf/0x300 net/core/sock.c:1619
sk_free+0x42/0x50 net/core/sock.c:1630
sock_put include/net/sock.h:1667 [inline]
af_alg_release+0x6e/0x90 crypto/af_alg.c:126
__sock_release+0xd7/0x260 net/socket.c:600
sock_close+0x19/0x20 net/socket.c:1151
__fput+0x355/0x8b0 fs/file_table.c:209
____fput+0x15/0x20 fs/file_table.c:243
task_work_run+0x1e8/0x2a0 kernel/task_work.c:113
tracehook_notify_resume include/linux/tracehook.h:192 [inline]
exit_to_usermode_loop+0x313/0x370 arch/x86/entry/common.c:166
prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8801b0b9e340
which belongs to the cache kmalloc-2048 of size 2048
The buggy address is located 1272 bytes inside of
2048-byte region [ffff8801b0b9e340, ffff8801b0b9eb40)
The buggy address belongs to the page:
page:ffffea0006c2e780 count:1 mapcount:0 mapping:ffff8801dac00c40 index:0x0
compound_mapcount: 0
flags: 0x2fffc0000008100(slab|head)
raw: 02fffc0000008100 ffffea0007543a88 ffffea000760b188 ffff8801dac00c40
raw: 0000000000000000 ffff8801b0b9e340 0000000100000003 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8801b0b9e700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801b0b9e780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff8801b0b9e800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8801b0b9e880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801b0b9e900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

Eric Biggers

unread,
Aug 20, 2018, 3:31:24 AM8/20/18
to Megha Dey, Tim Chen, da...@davemloft.net, her...@gondor.apana.org.au, syzbot, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, x...@kernel.org
[+sha512-mb maintainers...]
Apparently, the SHA multibuffer algorithms sometimes return success while the
hash request is still being asynchronously processed, which allows an in-use
request to be freed or reused. It also allows the wrong digest value to be
computed. The following patch fixes the crash for sha512_mb (the same fix is
also needed for sha1_mb and sha256_mb...), but I haven't properly reviewed yet
it as the multibuffer code is very difficult to understand...

diff --git a/arch/x86/crypto/sha512-mb/sha512_mb.c b/arch/x86/crypto/sha512-mb/sha512_mb.c
index 26b85678012d0..7708842447518 100644
--- a/arch/x86/crypto/sha512-mb/sha512_mb.c
+++ b/arch/x86/crypto/sha512-mb/sha512_mb.c
@@ -584,7 +584,7 @@ static int sha512_mb_update(struct ahash_request *areq)
return -EINPROGRESS;
done:
sha_complete_job(rctx, cstate, ret);
- return ret;
+ return -EINPROGRESS;
}

static int sha512_mb_finup(struct ahash_request *areq)
@@ -645,7 +645,7 @@ static int sha512_mb_finup(struct ahash_request *areq)
return -EINPROGRESS;
done:
sha_complete_job(rctx, cstate, ret);
- return ret;
+ return -EINPROGRESS;
}

static int sha512_mb_final(struct ahash_request *areq)
@@ -694,7 +694,7 @@ static int sha512_mb_final(struct ahash_request *areq)
return -EINPROGRESS;
done:
sha_complete_job(rctx, cstate, ret);
- return ret;
+ return -EINPROGRESS;
}

static int sha512_mb_export(struct ahash_request *areq, void *out)


And here's a simplified reproducer. It also works with sha1_mb and sha256_mb if
you change the .salg_name. Note: it actually causes a list corruption rather
than a use-after-free, but it seems to be the same bug...

#include <fcntl.h>
#include <linux/if_alg.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
struct sockaddr_alg addr = {
.salg_type = "hash",
.salg_name = "sha512_mb",
};
int algfd, reqfd, filefd;
int nprocs = 8 * sysconf(_SC_NPROCESSORS_ONLN);

algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(algfd, (void *)&addr, sizeof(addr));

if (fork()) {
while (nprocs-- > 0 && fork())
;
reqfd = accept(algfd, NULL, 0);
for (;;)
write(reqfd, "X", 1);
} else {
reqfd = accept(algfd, NULL, 0);
filefd = open("/bin/ls", O_RDONLY);
for (;;) {
loff_t offset = 0;

sendfile(reqfd, filefd, &offset, 256);
}
}
}

Personally, I'm wondering why the SHA multibuffer code is in the kernel at all
given all the severe issues it has that its authors/maintainers don't seem to be
working very hard to fix. The code is very difficult to understand due to the
weird 3-layer design with "mcryptd" and other issues, making debugging it very
time consuming; most of the code is duplicated in 3 places (sha1-mb, sha256-mb,
and sha512-mb), making maintenance even more difficult; and most importantly
there are severe bugs, including edge cases where it computes the wrong hash, as
shown not only by this bug but also the sha256_mb bug I recently ran into. It
seems the algorithms were never tested under load to cover these edge cases.

That's *not* acceptable for crypto code. Security and correctness come first.

Also as I've shown previously, in most cases the multibuffer SHA algorithms are
~1000x slower than the regular ones due to the flush delay. So the performance
argument for them actually seems pretty tenuous... And, isn't AVX2 multibuffer
useless on new processors, which have SHA instructions?

I'd also be very interested to hear an explanation for why systemwide sharing of
hash jobs doesn't enable side-channel attacks and isn't just the latest example
of prioritizing "performance" over security?

We need to have higher standards for crypto and not accept buggy spaghetti code
just because it's slightly faster in some artificial microbenchmark.

So unless major improvements are made, I personally think we'd be much better
off without the SHA multibuffer algorithms in the kernel.

Just my two cents...

- Eric

Ard Biesheuvel

unread,
Aug 21, 2018, 8:43:57 AM8/21/18
to Eric Biggers, Megha Dey, Tim Chen, David S. Miller, Herbert Xu, syzbot, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Linux Kernel Mailing List, syzkaller-bugs, the arch/x86 maintainers
[...]

>
> Personally, I'm wondering why the SHA multibuffer code is in the kernel at all
> given all the severe issues it has that its authors/maintainers don't seem to be
> working very hard to fix. The code is very difficult to understand due to the
> weird 3-layer design with "mcryptd" and other issues, making debugging it very
> time consuming; most of the code is duplicated in 3 places (sha1-mb, sha256-mb,
> and sha512-mb), making maintenance even more difficult; and most importantly
> there are severe bugs, including edge cases where it computes the wrong hash, as
> shown not only by this bug but also the sha256_mb bug I recently ran into. It
> seems the algorithms were never tested under load to cover these edge cases.
>
> That's *not* acceptable for crypto code. Security and correctness come first.
>
> Also as I've shown previously, in most cases the multibuffer SHA algorithms are
> ~1000x slower than the regular ones due to the flush delay. So the performance
> argument for them actually seems pretty tenuous... And, isn't AVX2 multibuffer
> useless on new processors, which have SHA instructions?
>
> I'd also be very interested to hear an explanation for why systemwide sharing of
> hash jobs doesn't enable side-channel attacks and isn't just the latest example
> of prioritizing "performance" over security?
>
> We need to have higher standards for crypto and not accept buggy spaghetti code
> just because it's slightly faster in some artificial microbenchmark.
>
> So unless major improvements are made, I personally think we'd be much better
> off without the SHA multibuffer algorithms in the kernel.
>

I agree. The code is obviously broken in a way that would have been
noticed if it were in wide use, and it is too complicated for mere
mortals to fix or maintain. I suggest we simply remove it for now, and
if anyone wants to reintroduce it, we can review the code *and* the
justification for the approach from scratch (in which case we should
consider factoring out the algo agnostics plumbing in a way that
allows it to be reused by other architectures as well)

Herbert Xu

unread,
Aug 22, 2018, 2:21:02 AM8/22/18
to Ard Biesheuvel, Eric Biggers, Megha Dey, Tim Chen, David S. Miller, syzbot, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Linux Kernel Mailing List, syzkaller-bugs, the arch/x86 maintainers
On Tue, Aug 21, 2018 at 02:43:56PM +0200, Ard Biesheuvel wrote:
>
> I agree. The code is obviously broken in a way that would have been
> noticed if it were in wide use, and it is too complicated for mere
> mortals to fix or maintain. I suggest we simply remove it for now, and
> if anyone wants to reintroduce it, we can review the code *and* the
> justification for the approach from scratch (in which case we should
> consider factoring out the algo agnostics plumbing in a way that
> allows it to be reused by other architectures as well)

I agree too. Could one of you guys send me a patch to remove
them?

Thanks,
--
Email: Herbert Xu <her...@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Megha Dey

unread,
Aug 27, 2018, 6:48:21 PM8/27/18
to Herbert Xu, Ard Biesheuvel, Eric Biggers, Tim Chen, David S. Miller, syzbot, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Linux Kernel Mailing List, syzkaller-bugs, the arch/x86 maintainers
On Wed, 2018-08-22 at 14:20 +0800, Herbert Xu wrote:
> On Tue, Aug 21, 2018 at 02:43:56PM +0200, Ard Biesheuvel wrote:
> >
> > I agree. The code is obviously broken in a way that would have been
> > noticed if it were in wide use, and it is too complicated for mere
> > mortals to fix or maintain. I suggest we simply remove it for now, and
> > if anyone wants to reintroduce it, we can review the code *and* the
> > justification for the approach from scratch (in which case we should
> > consider factoring out the algo agnostics plumbing in a way that
> > allows it to be reused by other architectures as well)
>
> I agree too. Could one of you guys send me a patch to remove
> them?
>

Hi,

We are working on a fix to solve these corner cases.

-Megha
> Thanks,


Ard Biesheuvel

unread,
Aug 27, 2018, 7:01:01 PM8/27/18
to Megha Dey, Herbert Xu, Eric Biggers, Tim Chen, David S. Miller, syzbot, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Linux Kernel Mailing List, syzkaller-bugs, the arch/x86 maintainers
Great. thanks.

But it would also be helpful if you could try and answer the questions
raised by Eric:
- in which cases does this driver result in a speedup?
- how should we tune the flush delay to prevent pathological
performance regressions?
- is it still safe in the post-Spectre era of computing to aggregate
hash input from different sources (which may be different users
altogether) and process them as a single source of input?

Megha Dey

unread,
Aug 28, 2018, 5:56:59 PM8/28/18
to Ard Biesheuvel, Herbert Xu, Eric Biggers, Tim Chen, David S. Miller, syzbot, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, Linux Kernel Mailing List, syzkaller-bugs, the arch/x86 maintainers
On Tue, 2018-08-28 at 01:01 +0200, Ard Biesheuvel wrote:
> On 28 August 2018 at 01:08, Megha Dey <megh...@linux.intel.com> wrote:
> > On Wed, 2018-08-22 at 14:20 +0800, Herbert Xu wrote:
> >> On Tue, Aug 21, 2018 at 02:43:56PM +0200, Ard Biesheuvel wrote:
> >> >
> >> > I agree. The code is obviously broken in a way that would have been
> >> > noticed if it were in wide use, and it is too complicated for mere
> >> > mortals to fix or maintain. I suggest we simply remove it for now, and
> >> > if anyone wants to reintroduce it, we can review the code *and* the
> >> > justification for the approach from scratch (in which case we should
> >> > consider factoring out the algo agnostics plumbing in a way that
> >> > allows it to be reused by other architectures as well)
> >>
> >> I agree too. Could one of you guys send me a patch to remove
> >> them?
> >>
> >
> > Hi,
> >
> > We are working on a fix to solve these corner cases.
> >
>
> Great. thanks.
>
> But it would also be helpful if you could try and answer the questions
> raised by Eric:
> - in which cases does this driver result in a speedup?

The multibuffer algorithm approach results in a speedup only when there
are a large number of independent jobs. Similarly if there is a steady
stream of independent jobs, multibuffer works well. So another way of
looking at this is that if there are a lot of flushes, then one
shouldn't be using multibuffer algorithms.

> - how should we tune the flush delay to prevent pathological
> performance regressions?

We have not optimized the flush delay on the ground that it should not
be occurring often. If this seems to be happening a lot, then perhaps we
could optimize it along the lines that if more than N "lanes" are full,
we will continue to use the present flush algorithm, else we could
switch to an optimized single buffer code for one of the lanes. Any
other suggestions are welcome too.

> - is it still safe in the post-Spectre era of computing to aggregate
> hash input from different sources (which may be different users
> altogether) and process them as a single source of input?

hmmm, I am not sure how one could do side-channel attack, could you
maybe give an example? One could do a denial of service attack, by
flooding it with too many requests.

Also Eric had raised the issue whether we need AVX2 multibuffer now that
we have SHA instructions. However, for cases where jobs come in fast
enough, sha1-mb is seen to perform better than sha-ni algorithm.

Since, we already have lowered the priority of the multibuffer
algorithms, users who know that they would have a consistent inflow of
jobs should only opt for the multibuffer algorithm.

From Herbert's suggestion, I am also working on removing the mcryptd
layer for the multibuffer algorithms, and follow the simd model to
simplify the multibuffer model.

Hence, instead of removing these algorithms, I would like to suggest the
following:
1. Find a fix for the corner cases of memory corruption
2. Remove mcryptd layer, add simd interface for hash
3. try to reuse code for hash multibuffer algorithms instead of
duplicating the same code 3 times.

Thanks,
Megha

Eric Biggers

unread,
Oct 5, 2018, 6:33:10 PM10/5/18
to syzbot, da...@davemloft.net, her...@gondor.apana.org.au, h...@zytor.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
On Wed, Aug 15, 2018 at 09:00:04AM -0700, syzbot wrote:
The "fix" for this is queued in cryptodev:

#syz fix: crypto: x86 - remove SHA multibuffer routines and mcryptd

- Eric
Reply all
Reply to author
Forward
0 new messages