KMSAN: uninit-value in aes_encrypt (4)

19 views
Skip to first unread message

syzbot

unread,
Sep 10, 2020, 10:09:26 AM9/10/20
to da...@davemloft.net, gli...@google.com, her...@gondor.apana.org.au, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 3b3ea602 x86: add failure injection to get/put/clear_user
git tree: https://github.com/google/kmsan.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=1030dda5900000
kernel config: https://syzkaller.appspot.com/x/.config?x=ee5f7a0b2e48ed66
dashboard link: https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d
compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17b15055900000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12931621900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+828dfc...@syzkaller.appspotmail.com

=====================================================
BUG: KMSAN: uninit-value in subshift lib/crypto/aes.c:149 [inline]
BUG: KMSAN: uninit-value in aes_encrypt+0x12c5/0x1bc0 lib/crypto/aes.c:282
CPU: 0 PID: 8537 Comm: syz-executor872 Not tainted 5.8.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x21c/0x280 lib/dump_stack.c:118
kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:121
__msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
subshift lib/crypto/aes.c:149 [inline]
aes_encrypt+0x12c5/0x1bc0 lib/crypto/aes.c:282
aesti_encrypt+0xe8/0x130 crypto/aes_ti.c:31
cipher_crypt_one crypto/cipher.c:75 [inline]
crypto_cipher_encrypt_one+0x1e2/0x3a0 crypto/cipher.c:82
crypto_cbcmac_digest_update+0x3e3/0x560 crypto/ccm.c:830
crypto_shash_update+0x455/0x5a0 crypto/shash.c:119
shash_ahash_finup+0x20b/0x7a0 crypto/shash.c:291
shash_async_finup+0xbb/0x110 crypto/shash.c:306
crypto_ahash_op+0x1c3/0x770 crypto/ahash.c:370
crypto_ahash_finup+0x8c/0xb0 crypto/ahash.c:395
crypto_ccm_auth+0x16b6/0x1780 crypto/ccm.c:221
crypto_ccm_encrypt+0x285/0x850 crypto/ccm.c:300
crypto_aead_encrypt+0x107/0x190 crypto/aead.c:94
tls_do_encryption net/tls/tls_sw.c:528 [inline]
tls_push_record+0x3cb9/0x4fc0 net/tls/tls_sw.c:762
bpf_exec_tx_verdict+0x195a/0x29e0 net/tls/tls_sw.c:802
tls_sw_do_sendpage+0x138a/0x1e30 net/tls/tls_sw.c:1213
tls_sw_sendpage+0x1da/0x250 net/tls/tls_sw.c:1277
inet_sendpage+0x1dc/0x2f0 net/ipv4/af_inet.c:828
kernel_sendpage net/socket.c:3642 [inline]
sock_sendpage+0x1dc/0x2b0 net/socket.c:945
pipe_to_sendpage+0x3f4/0x530 fs/splice.c:448
splice_from_pipe_feed fs/splice.c:502 [inline]
__splice_from_pipe+0x5e3/0xff0 fs/splice.c:626
splice_from_pipe fs/splice.c:661 [inline]
generic_splice_sendpage+0x1d5/0x2d0 fs/splice.c:834
do_splice_from fs/splice.c:846 [inline]
do_splice+0x2727/0x39e0 fs/splice.c:1144
__do_sys_splice fs/splice.c:1419 [inline]
__se_sys_splice+0x323/0x500 fs/splice.c:1401
__x64_sys_splice+0x6e/0x90 fs/splice.c:1401
do_syscall_64+0xad/0x160 arch/x86/entry/common.c:386
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x446a29
Code: Bad RIP value.
RSP: 002b:00007f91f8c0bd98 EFLAGS: 00000246 ORIG_RAX: 0000000000000113
RAX: ffffffffffffffda RBX: 00000000006dbc48 RCX: 0000000000446a29
RDX: 0000000000000004 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00000000006dbc40 R08: 00080000fffffffc R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc4c
R13: 0000000020000680 R14: 00000000004ae948 R15: 20c49ba5e353f7cf

Uninit was stored to memory at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
kmsan_internal_chain_origin+0xad/0x130 mm/kmsan/kmsan.c:310
__msan_chain_origin+0x50/0x90 mm/kmsan/kmsan_instr.c:165
__crypto_xor+0x251/0x1610 crypto/algapi.c:1000
crypto_xor include/crypto/algapi.h:152 [inline]
crypto_cbcmac_digest_update+0x2b8/0x560 crypto/ccm.c:824
crypto_shash_update+0x455/0x5a0 crypto/shash.c:119
shash_ahash_finup+0x20b/0x7a0 crypto/shash.c:291
shash_async_finup+0xbb/0x110 crypto/shash.c:306
crypto_ahash_op+0x1c3/0x770 crypto/ahash.c:370
crypto_ahash_finup+0x8c/0xb0 crypto/ahash.c:395
crypto_ccm_auth+0x16b6/0x1780 crypto/ccm.c:221
crypto_ccm_encrypt+0x285/0x850 crypto/ccm.c:300
crypto_aead_encrypt+0x107/0x190 crypto/aead.c:94
tls_do_encryption net/tls/tls_sw.c:528 [inline]
tls_push_record+0x3cb9/0x4fc0 net/tls/tls_sw.c:762
bpf_exec_tx_verdict+0x195a/0x29e0 net/tls/tls_sw.c:802
tls_sw_do_sendpage+0x138a/0x1e30 net/tls/tls_sw.c:1213
tls_sw_sendpage+0x1da/0x250 net/tls/tls_sw.c:1277
inet_sendpage+0x1dc/0x2f0 net/ipv4/af_inet.c:828
kernel_sendpage net/socket.c:3642 [inline]
sock_sendpage+0x1dc/0x2b0 net/socket.c:945
pipe_to_sendpage+0x3f4/0x530 fs/splice.c:448
splice_from_pipe_feed fs/splice.c:502 [inline]
__splice_from_pipe+0x5e3/0xff0 fs/splice.c:626
splice_from_pipe fs/splice.c:661 [inline]
generic_splice_sendpage+0x1d5/0x2d0 fs/splice.c:834
do_splice_from fs/splice.c:846 [inline]
do_splice+0x2727/0x39e0 fs/splice.c:1144
__do_sys_splice fs/splice.c:1419 [inline]
__se_sys_splice+0x323/0x500 fs/splice.c:1401
__x64_sys_splice+0x6e/0x90 fs/splice.c:1401
do_syscall_64+0xad/0x160 arch/x86/entry/common.c:386
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Uninit was created at:
kmsan_save_stack_with_flags+0x3c/0x90 mm/kmsan/kmsan.c:144
kmsan_internal_alloc_meta_for_pages mm/kmsan/kmsan_shadow.c:269 [inline]
kmsan_alloc_page+0xc5/0x1a0 mm/kmsan/kmsan_shadow.c:293
__alloc_pages_nodemask+0xdf0/0x1030 mm/page_alloc.c:4889
alloc_pages_current+0x685/0xb50 mm/mempolicy.c:2292
alloc_pages include/linux/gfp.h:545 [inline]
skb_page_frag_refill+0x540/0x780 net/core/sock.c:2487
sk_page_frag_refill+0xa3/0x3b0 net/core/sock.c:2507
sk_msg_alloc+0x26e/0x1340 net/core/skmsg.c:38
tls_alloc_encrypted_msg net/tls/tls_sw.c:289 [inline]
tls_sw_do_sendpage+0xb0d/0x1e30 net/tls/tls_sw.c:1191
tls_sw_sendpage+0x1da/0x250 net/tls/tls_sw.c:1277
inet_sendpage+0x1dc/0x2f0 net/ipv4/af_inet.c:828
kernel_sendpage net/socket.c:3642 [inline]
sock_sendpage+0x1dc/0x2b0 net/socket.c:945
pipe_to_sendpage+0x3f4/0x530 fs/splice.c:448
splice_from_pipe_feed fs/splice.c:502 [inline]
__splice_from_pipe+0x5e3/0xff0 fs/splice.c:626
splice_from_pipe fs/splice.c:661 [inline]
generic_splice_sendpage+0x1d5/0x2d0 fs/splice.c:834
do_splice_from fs/splice.c:846 [inline]
do_splice+0x2727/0x39e0 fs/splice.c:1144
__do_sys_splice fs/splice.c:1419 [inline]
__se_sys_splice+0x323/0x500 fs/splice.c:1401
__x64_sys_splice+0x6e/0x90 fs/splice.c:1401
do_syscall_64+0xad/0x160 arch/x86/entry/common.c:386
entry_SYSCALL_64_after_hwframe+0x44/0xa9
=====================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Eric Biggers

unread,
Sep 11, 2020, 1:01:53 PM9/11/20
to Boris Pismenny, Aviad Yehezkel, John Fastabend, Daniel Borkmann, Jakub Kicinski, net...@vger.kernel.org, da...@davemloft.net, gli...@google.com, her...@gondor.apana.org.au, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot
Looks like the tls subsystem is encrypting uninitialized memory again.
+tls maintainers and netdev.
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bug...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/0000000000008a7ae505aef61db1%40google.com.

syzbot

unread,
Sep 15, 2020, 6:23:09 AM9/15/20
to gli...@google.com, himadri...@gmail.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
KMSAN: uninit-value in aes_encrypt

=====================================================
BUG: KMSAN: uninit-value in subshift lib/crypto/aes.c:149 [inline]
BUG: KMSAN: uninit-value in aes_encrypt+0x12c5/0x1bc0 lib/crypto/aes.c:282
CPU: 1 PID: 10471 Comm: syz-executor.4 Not tainted 5.8.0-rc5-syzkaller #0
RIP: 0033:0x45d5b9
Code: Bad RIP value.
RSP: 002b:00007f5a38960c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000113
RAX: ffffffffffffffda RBX: 0000000000033a00 RCX: 000000000045d5b9
RDX: 0000000000000004 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 000000000118cf98 R08: 00080000fffffffc R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118cf4c
R13: 00007fffcc5402bf R14: 00007f5a389619c0 R15: 000000000118cf4c
Tested on:

commit: 3b3ea602 x86: add failure injection to get/put/clear_user
git tree: https://github.com/google/kmsan.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=147fcd2d900000

Eric Biggers

unread,
Apr 6, 2023, 7:21:26 PM4/6/23
to syzbot, syzkall...@googlegroups.com
On Fri, Sep 11, 2020 at 10:01:50AM -0700, Eric Biggers wrote:
> Looks like the tls subsystem is encrypting uninitialized memory again.
> +tls maintainers and netdev.

Moving out of "crypto" subsystem:

#syz set subsystems: net

Tetsuo Handa

unread,
Jun 28, 2023, 9:48:27 AM6/28/23
to Boris Pismenny, John Fastabend, Jakub Kicinski, gli...@google.com, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
syzbot is reporting uninit-value at aes_encrypt(), for block cipher assumes
that bytes to encrypt/decrypt is multiple of block size for that cipher but
tls_alloc_encrypted_msg() is not initializing padding bytes when
required_size is not multiple of block cipher's block size.

In order to make sure that padding bytes are automatically initialized,
enable __GFP_ZERO flag when setsockopt(SOL_TCP, TCP_ULP, "tls") is called.

Reported-by: syzbot <syzbot+828dfc...@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d
Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
---
According to C reproducer, this problem happens when bpf_exec_tx_verdict() is
called with lower 4 bits of required_size being 0001 and does not happen when
being 0100. Thus, I assumed that this problem is caused by lack of initializing
padding bytes.
But I couldn't figure out why KMSAN reports this problem when bpf_exec_tx_verdict()
is called with lower 4 bits of required_size being 0001 for the second time and
does not report this problem when bpf_exec_tx_verdict() is called with lower
4 bits of required_size being 0001 for the first time. More deeper problem exists?
KMSAN reporting this problem when accessing u64 relevant?

net/tls/tls_main.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index f2e7302a4d96..cd5366966864 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -1025,6 +1025,7 @@ static int tls_init(struct sock *sk)
struct tls_context *ctx;
int rc = 0;

+ sk->sk_allocation |= __GFP_ZERO;
tls_build_proto(sk);

#ifdef CONFIG_TLS_TOE
--
2.34.1

Jakub Kicinski

unread,
Jun 28, 2023, 5:03:20 PM6/28/23
to Tetsuo Handa, Boris Pismenny, John Fastabend, gli...@google.com, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On Wed, 28 Jun 2023 22:48:01 +0900 Tetsuo Handa wrote:
> syzbot is reporting uninit-value at aes_encrypt(), for block cipher assumes
> that bytes to encrypt/decrypt is multiple of block size for that cipher but
> tls_alloc_encrypted_msg() is not initializing padding bytes when
> required_size is not multiple of block cipher's block size.

Sounds odd, so crypto layer reads beyond what we submitted as
the buffer? I don't think the buffer needs to be aligned, so
the missing bits may well fall into a different (unmapped?) page.

This needs more careful investigation. Always zeroing the input
is just covering up the real issue.

Tetsuo Handa

unread,
Jun 28, 2023, 6:15:45 PM6/28/23
to Jakub Kicinski, Boris Pismenny, John Fastabend, gli...@google.com, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On 2023/06/29 6:03, Jakub Kicinski wrote:
> On Wed, 28 Jun 2023 22:48:01 +0900 Tetsuo Handa wrote:
>> syzbot is reporting uninit-value at aes_encrypt(), for block cipher assumes
>> that bytes to encrypt/decrypt is multiple of block size for that cipher but
>> tls_alloc_encrypted_msg() is not initializing padding bytes when
>> required_size is not multiple of block cipher's block size.
>
> Sounds odd, so crypto layer reads beyond what we submitted as
> the buffer? I don't think the buffer needs to be aligned, so
> the missing bits may well fall into a different (unmapped?) page.

Since passing __GFP_ZERO to skb_page_frag_refill() hides this problem,
I think that crypto layer is reading up to block size when requested
size is not multiple of block size.

>
> This needs more careful investigation. Always zeroing the input
> is just covering up the real issue.

Since block cipher needs to read up to block size, someone has to initialize
padding bytes. I guess that crypto API caller is responsible for allocating
and initializing padding bytes, otherwise such crypto API caller will fail to
encrypt/decrypt last partial bytes which are not multiple of cipher's block
size.

Which function in this report is responsible for initializing padding bytes?

Eric Biggers

unread,
Jun 30, 2023, 4:15:40 AM6/30/23
to Tetsuo Handa, Jakub Kicinski, Boris Pismenny, John Fastabend, gli...@google.com, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
According to the sample crash report from
https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d, the uninitialized
memory access happens while the TLS layer is doing an AES-CCM encryption
operation. CCM supports arbitrarily-aligned additional authenticated data and
plaintext/ciphertext. Also, an encryption with crypto_aead_encrypt() reads
exactly 'assoclen + cryptlen' bytes from the 'src' scatterlist; it's not
supposed to ever go past that, even if the data isn't "block aligned".

The "aead" API (include/crypto/aead.h) is still confusing and hard to use
correctly, though, mainly because of the weird scatterlist layout it expects
with the AAD and plaintext/ciphertext concatenated to each other. I wouldn't be
surprised if the TLS layer is making some error. What is the exact sequence of
crypto_aead_* calls that results in this issue?

- Eric

Ard Biesheuvel

unread,
Jun 30, 2023, 5:37:08 AM6/30/23
to Tetsuo Handa, Boris Pismenny, John Fastabend, Jakub Kicinski, gli...@google.com, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On Wed, 28 Jun 2023 at 15:49, Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
>
> syzbot is reporting uninit-value at aes_encrypt(), for block cipher assumes
> that bytes to encrypt/decrypt is multiple of block size for that cipher but
> tls_alloc_encrypted_msg() is not initializing padding bytes when
> required_size is not multiple of block cipher's block size.
>
> In order to make sure that padding bytes are automatically initialized,
> enable __GFP_ZERO flag when setsockopt(SOL_TCP, TCP_ULP, "tls") is called.
>
> Reported-by: syzbot <syzbot+828dfc...@syzkaller.appspotmail.com>
> Closes: https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d

Maybe I am missing something, but this syzkaller report appears to b a
huge collection of uninit-value reports, of which only one is related
to bpx_tx_verdict(), and that one has nor reproducer.

So even if this would be the right fix, I don't think closing that
issue makes sense, so? The issue refers to a couple of other
occcurences of uninit-value in aesti_encrypt, which do seem related.

> Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
> ---
> According to C reproducer, this problem happens when bpf_exec_tx_verdict() is
> called with lower 4 bits of required_size being 0001 and does not happen when
> being 0100. Thus, I assumed that this problem is caused by lack of initializing
> padding bytes.
> But I couldn't figure out why KMSAN reports this problem when bpf_exec_tx_verdict()
> is called with lower 4 bits of required_size being 0001 for the second time and
> does not report this problem when bpf_exec_tx_verdict() is called with lower
> 4 bits of required_size being 0001 for the first time. More deeper problem exists?
> KMSAN reporting this problem when accessing u64 relevant?
>

As Eric pointed out as well, zeroing the allocation papers over the problem.

Why are you sending this now? Do you have a reproducer for this issue?

Tetsuo Handa

unread,
Jun 30, 2023, 5:53:15 AM6/30/23
to Ard Biesheuvel, Boris Pismenny, John Fastabend, Jakub Kicinski, gli...@google.com, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On 2023/06/30 18:36, Ard Biesheuvel wrote:
> Why are you sending this now?

Just because this is currently top crasher and I can reproduce locally.

> Do you have a reproducer for this issue?

Yes. https://syzkaller.appspot.com/text?tag=ReproC&x=12931621900000 works.

Ard Biesheuvel

unread,
Jun 30, 2023, 6:02:21 AM6/30/23
to Tetsuo Handa, Boris Pismenny, John Fastabend, Jakub Kicinski, gli...@google.com, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
Could you please share your kernel config and the resulting kernel log
when running the reproducer? I'll try to reproduce locally as well,
and see if I can figure out what is going on in the crypto layer

Alexander Potapenko

unread,
Jun 30, 2023, 6:11:03 AM6/30/23
to Ard Biesheuvel, Tetsuo Handa, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
The config together with the repro is available at
https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d, see the
latest row of the "Crashes" table that contains a C repro.
Config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=ee5f7a0b2e48ed66
Report: https://syzkaller.appspot.com/text?tag=CrashReport&x=1325260d900000
Syz repro: https://syzkaller.appspot.com/text?tag=ReproSyz&x=11af973e900000
C repro: https://syzkaller.appspot.com/text?tag=ReproC&x=163a1e45900000

The bug is reproducible for me locally as well (and Tetsuo's patch
makes it disappear, although I have no opinion on its correctness).

Ard Biesheuvel

unread,
Jun 30, 2023, 6:18:17 AM6/30/23
to Alexander Potapenko, Tetsuo Handa, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On Fri, 30 Jun 2023 at 12:11, Alexander Potapenko <gli...@google.com> wrote:
>
> On Fri, Jun 30, 2023 at 12:02 PM Ard Biesheuvel <ar...@kernel.org> wrote:
> >
> > On Fri, 30 Jun 2023 at 11:53, Tetsuo Handa
> > <penguin...@i-love.sakura.ne.jp> wrote:
> > >
> > > On 2023/06/30 18:36, Ard Biesheuvel wrote:
> > > > Why are you sending this now?
> > >
> > > Just because this is currently top crasher and I can reproduce locally.
> > >
> > > > Do you have a reproducer for this issue?
> > >
> > > Yes. https://syzkaller.appspot.com/text?tag=ReproC&x=12931621900000 works.
> > >
> >
> > Could you please share your kernel config and the resulting kernel log
> > when running the reproducer? I'll try to reproduce locally as well,
> > and see if I can figure out what is going on in the crypto layer
>
> The config together with the repro is available at
> https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d, see the
> latest row of the "Crashes" table that contains a C repro.

Could you explain why that bug contains ~50 reports that seem entirely
unrelated? AIUI, this actual issue has not been reproduced since
2020??


> Config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=ee5f7a0b2e48ed66
> Report: https://syzkaller.appspot.com/text?tag=CrashReport&x=1325260d900000
> Syz repro: https://syzkaller.appspot.com/text?tag=ReproSyz&x=11af973e900000
> C repro: https://syzkaller.appspot.com/text?tag=ReproC&x=163a1e45900000
>
> The bug is reproducible for me locally as well (and Tetsuo's patch
> makes it disappear, although I have no opinion on its correctness).

What I'd like to do is run a kernel plus initrd locally in OVMF and
reproduce the issue - can I do that without all the syzkaller
machinery?

Tetsuo Handa

unread,
Jun 30, 2023, 7:11:49 AM6/30/23
to Ard Biesheuvel, Alexander Potapenko, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On 2023/06/30 19:18, Ard Biesheuvel wrote:
> On Fri, 30 Jun 2023 at 12:11, Alexander Potapenko <gli...@google.com> wrote:
>>
>> On Fri, Jun 30, 2023 at 12:02 PM Ard Biesheuvel <ar...@kernel.org> wrote:
>>>
>>> On Fri, 30 Jun 2023 at 11:53, Tetsuo Handa
>>> <penguin...@i-love.sakura.ne.jp> wrote:
>>>>
>>>> On 2023/06/30 18:36, Ard Biesheuvel wrote:
>>>>> Why are you sending this now?
>>>>
>>>> Just because this is currently top crasher and I can reproduce locally.
>>>>
>>>>> Do you have a reproducer for this issue?
>>>>
>>>> Yes. https://syzkaller.appspot.com/text?tag=ReproC&x=12931621900000 works.
>>>>
>>>
>>> Could you please share your kernel config and the resulting kernel log
>>> when running the reproducer? I'll try to reproduce locally as well,
>>> and see if I can figure out what is going on in the crypto layer
>>
>> The config together with the repro is available at
>> https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d, see the
>> latest row of the "Crashes" table that contains a C repro.

Kernel is commit e6bc8833d80f of https://github.com/google/kmsan/commits/master .
Config is available in the dashboard page, but a smaller one is available at
https://I-love.SAKURA.ne.jp/tmp/config-6.4.0-rc7-kmsan .

I'm using a debug printk() patch shown below.

----------------------------------------
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 1a53c8f481e9..b32bb015995c 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1210,7 +1210,8 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
if (!sk_stream_memory_free(sk))
goto wait_for_sndbuf;
alloc_payload:
- ret = tls_alloc_encrypted_msg(sk, required_size);
+ ret = tls_alloc_encrypted_msg(sk, required_size); /////
+ pr_info("required_size=%d ret=%d\n", required_size, ret);
if (ret) {
if (ret != -ENOSPC)
goto wait_for_memory;
@@ -1232,7 +1233,9 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,

tls_ctx->pending_open_record_frags = true;
if (full_record || eor || sk_msg_full(msg_pl)) {
- ret = bpf_exec_tx_verdict(msg_pl, sk, full_record,
+ pr_info("full_record=%d eor=%d sk_msg_full(msg_pl)=%d copied=%d\n",
+ full_record, eor, sk_msg_full(msg_pl), copied);
+ ret = bpf_exec_tx_verdict(msg_pl, sk, full_record, /////
record_type, &copied, flags);
if (ret) {
if (ret == -EINPROGRESS)
----------------------------------------

Output (on Ubuntu 22.04 on Oracle VM VirtualBox) is shown below.
Please check tendency of the sum of required_size= values up to the full_record= line.
It seems that the value of required_size= might vary depending on the timings, but
the sum of the values seems to have some rule.

4125+8221+12317+16413=41076 (the lower 4 bits are 0100)
2461+6557+10653+14749+16413=50833 (the lower 4 bits are 0001)
2461+6573+10669+14765+16413=50881 (the lower 4 bits are 0001)

KMSAN reports this problem when the lower 4 bits became 0001 for the second time.
Unless KMSAN's reporting is asynchronous, maybe the reason of "for the second time"
part is that the previous state is relevant...

----------------------------------------
[ 157.471712][ T3414] required_size=4125 ret=0
[ 157.475879][ T3414] required_size=8221 ret=0
[ 157.480471][ T3414] required_size=12317 ret=0
[ 157.484604][ T3414] required_size=16413 ret=0
[ 157.490499][ T3414] full_record=1 eor=0 sk_msg_full(msg_pl)=0 copied=4096
[ 157.513772][ T3414] required_size=4125 ret=0
[ 157.523782][ T3414] required_size=8221 ret=0
[ 157.533658][ T3414] required_size=12317 ret=0
[ 157.539579][ T3414] required_size=16413 ret=0
[ 157.543785][ T3414] full_record=1 eor=0 sk_msg_full(msg_pl)=0 copied=4096
[ 157.572869][ T3414] required_size=4125 ret=0
[ 157.579350][ T3414] required_size=8221 ret=0
[ 157.584699][ T3414] required_size=12317 ret=0
[ 157.591756][ T3414] required_size=16413 ret=0
[ 157.595891][ T3414] full_record=1 eor=0 sk_msg_full(msg_pl)=0 copied=4096
[ 157.790734][ T3424] required_size=2461 ret=0
[ 157.800725][ T3424] required_size=6557 ret=0
[ 157.804560][ T3424] required_size=10653 ret=0
[ 157.808433][ T3424] required_size=14749 ret=0
[ 157.810125][ T3424] required_size=16413 ret=0
[ 157.829564][ T3424] full_record=1 eor=0 sk_msg_full(msg_pl)=0 copied=1664
[ 157.848397][ T3424] required_size=2461 ret=0
[ 157.854875][ T3424] required_size=6573 ret=0
[ 157.860883][ T3424] required_size=10669 ret=0
[ 157.865463][ T3424] required_size=14765 ret=0
[ 157.871794][ T3424] required_size=16413 ret=0
[ 157.877333][ T3424] full_record=1 eor=0 sk_msg_full(msg_pl)=0 copied=1648
[ 157.885187][ T3424] =====================================================
[ 157.887262][ T3424] BUG: KMSAN: uninit-value in aes_encrypt+0x1692/0x1fa0
[ 157.887262][ T3424] aes_encrypt+0x1692/0x1fa0
[ 157.887262][ T3424] aesti_encrypt+0xe1/0x160
[ 157.887262][ T3424] crypto_cipher_encrypt_one+0x1d1/0x2e0
[ 157.887262][ T3424] crypto_cbcmac_digest_update+0x3ff/0x5a0
[ 157.887262][ T3424] shash_ahash_finup+0x79d/0xd00
[ 157.887262][ T3424] shash_async_finup+0xbf/0x110
[ 157.887262][ T3424] crypto_ahash_finup+0x244/0x500
[ 157.887262][ T3424] crypto_ccm_auth+0x14df/0x15a0
[ 157.887262][ T3424] crypto_ccm_encrypt+0x2ad/0x8b0
[ 157.887262][ T3424] crypto_aead_encrypt+0x116/0x1a0
[ 157.887262][ T3424] tls_push_record+0x2bbe/0x3bf0
[ 157.887262][ T3424] bpf_exec_tx_verdict+0x5ba/0x2530
[ 157.887262][ T3424] tls_sw_do_sendpage+0x1779/0x21f0
[ 157.887262][ T3424] tls_sw_sendpage+0x247/0x2b0
[ 157.887262][ T3424] inet_sendpage+0x1de/0x2f0
[ 157.887262][ T3424] kernel_sendpage+0x4cc/0x940
[ 158.004827][ T3424] sock_sendpage+0x162/0x220
[ 158.004827][ T3424] pipe_to_sendpage+0x3df/0x4f0
[ 158.004827][ T3424] __splice_from_pipe+0x5c7/0x1010
[ 158.004827][ T3424] generic_splice_sendpage+0x1c6/0x2a0
[ 158.004827][ T3424] do_splice+0x26d8/0x32f0
[ 158.004827][ T3424] __se_sys_splice+0x81f/0xba0
[ 158.004827][ T3424] __x64_sys_splice+0x1a1/0x200
[ 158.004827][ T3424] do_syscall_64+0x41/0x90
[ 158.004827][ T3424] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 158.004827][ T3424]
[ 158.004827][ T3424] Uninit was stored to memory at:
[ 158.004827][ T3424] __crypto_xor+0x285/0x1700
[ 158.004827][ T3424] crypto_cbcmac_digest_update+0x2ba/0x5a0
[ 158.004827][ T3424] shash_ahash_finup+0x79d/0xd00
[ 158.004827][ T3424] shash_async_finup+0xbf/0x110
[ 158.004827][ T3424] crypto_ahash_finup+0x244/0x500
[ 158.004827][ T3424] crypto_ccm_auth+0x14df/0x15a0
[ 158.004827][ T3424] crypto_ccm_encrypt+0x2ad/0x8b0
[ 158.004827][ T3424] crypto_aead_encrypt+0x116/0x1a0
[ 158.004827][ T3424] tls_push_record+0x2bbe/0x3bf0
[ 158.004827][ T3424] bpf_exec_tx_verdict+0x5ba/0x2530
[ 158.004827][ T3424] tls_sw_do_sendpage+0x1779/0x21f0
[ 158.004827][ T3424] tls_sw_sendpage+0x247/0x2b0
[ 158.004827][ T3424] inet_sendpage+0x1de/0x2f0
[ 158.004827][ T3424] kernel_sendpage+0x4cc/0x940
[ 158.004827][ T3424] sock_sendpage+0x162/0x220
[ 158.004827][ T3424] pipe_to_sendpage+0x3df/0x4f0
[ 158.004827][ T3424] __splice_from_pipe+0x5c7/0x1010
[ 158.004827][ T3424] generic_splice_sendpage+0x1c6/0x2a0
[ 158.004827][ T3424] do_splice+0x26d8/0x32f0
[ 158.004827][ T3424] __se_sys_splice+0x81f/0xba0
[ 158.004827][ T3424] __x64_sys_splice+0x1a1/0x200
[ 158.004827][ T3424] do_syscall_64+0x41/0x90
[ 158.004827][ T3424] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 158.004827][ T3424]
[ 158.004827][ T3424] Uninit was created at:
[ 158.004827][ T3424] __alloc_pages+0x925/0x1050
[ 158.004827][ T3424] alloc_pages+0xe30/0x11b0
[ 158.004827][ T3424] skb_page_frag_refill+0x362/0x910
[ 158.004827][ T3424] sk_page_frag_refill+0xa2/0x1c0
[ 158.004827][ T3424] sk_msg_alloc+0x278/0x1560
[ 158.004827][ T3424] tls_sw_do_sendpage+0xbec/0x21f0
[ 158.004827][ T3424] tls_sw_sendpage+0x247/0x2b0
[ 158.004827][ T3424] inet_sendpage+0x1de/0x2f0
[ 158.004827][ T3424] kernel_sendpage+0x4cc/0x940
[ 158.004827][ T3424] sock_sendpage+0x162/0x220
[ 158.004827][ T3424] pipe_to_sendpage+0x3df/0x4f0
[ 158.004827][ T3424] __splice_from_pipe+0x5c7/0x1010
[ 158.004827][ T3424] generic_splice_sendpage+0x1c6/0x2a0
[ 158.260226][ T3424] do_splice+0x26d8/0x32f0
[ 158.260226][ T3424] __se_sys_splice+0x81f/0xba0
[ 158.260226][ T3424] __x64_sys_splice+0x1a1/0x200
[ 158.260226][ T3424] do_syscall_64+0x41/0x90
[ 158.260226][ T3424] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 158.260226][ T3424]
[ 158.260226][ T3424] CPU: 7 PID: 3424 Comm: a.out Not tainted 6.4.0-rc7-ge6bc8833d80f-dirty #26
[ 158.260226][ T3424] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 158.260226][ T3424] =====================================================
[ 158.260226][ T3424] Disabling lock debugging due to kernel taint
[ 158.260226][ T3424] Kernel panic - not syncing: kmsan.panic set ...
[ 158.260226][ T3424] CPU: 7 PID: 3424 Comm: a.out Tainted: G B 6.4.0-rc7-ge6bc8833d80f-dirty #26
[ 158.320898][ T3424] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 158.334186][ T3424] Call Trace:
[ 158.334186][ T3424] <TASK>
[ 158.334186][ T3424] dump_stack_lvl+0x1f6/0x280
[ 158.334186][ T3424] dump_stack+0x29/0x30
[ 158.334186][ T3424] panic+0x4e7/0xc60
[ 158.334186][ T3424] ? add_taint+0x185/0x210
[ 158.334186][ T3424] kmsan_report+0x2d1/0x2e0
[ 158.334186][ T3424] ? __msan_warning+0x98/0x120
[ 158.334186][ T3424] ? aes_encrypt+0x1692/0x1fa0
[ 158.334186][ T3424] ? aesti_encrypt+0xe1/0x160
[ 158.334186][ T3424] ? crypto_cipher_encrypt_one+0x1d1/0x2e0
[ 158.334186][ T3424] ? crypto_cbcmac_digest_update+0x3ff/0x5a0
[ 158.334186][ T3424] ? shash_ahash_finup+0x79d/0xd00
[ 158.334186][ T3424] ? shash_async_finup+0xbf/0x110
[ 158.334186][ T3424] ? crypto_ahash_finup+0x244/0x500
[ 158.334186][ T3424] ? crypto_ccm_auth+0x14df/0x15a0
[ 158.334186][ T3424] ? crypto_ccm_encrypt+0x2ad/0x8b0
[ 158.334186][ T3424] ? crypto_aead_encrypt+0x116/0x1a0
[ 158.334186][ T3424] ? tls_push_record+0x2bbe/0x3bf0
[ 158.334186][ T3424] ? bpf_exec_tx_verdict+0x5ba/0x2530
[ 158.334186][ T3424] ? tls_sw_do_sendpage+0x1779/0x21f0
[ 158.334186][ T3424] ? tls_sw_sendpage+0x247/0x2b0
[ 158.334186][ T3424] ? inet_sendpage+0x1de/0x2f0
[ 158.334186][ T3424] ? kernel_sendpage+0x4cc/0x940
[ 158.334186][ T3424] ? sock_sendpage+0x162/0x220
[ 158.334186][ T3424] ? pipe_to_sendpage+0x3df/0x4f0
[ 158.334186][ T3424] ? __splice_from_pipe+0x5c7/0x1010
[ 158.334186][ T3424] ? generic_splice_sendpage+0x1c6/0x2a0
[ 158.334186][ T3424] ? do_splice+0x26d8/0x32f0
[ 158.334186][ T3424] ? __se_sys_splice+0x81f/0xba0
[ 158.334186][ T3424] ? __x64_sys_splice+0x1a1/0x200
[ 158.334186][ T3424] ? do_syscall_64+0x41/0x90
[ 158.334186][ T3424] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 158.334186][ T3424] ? filter_irq_stacks+0xb9/0x230
[ 158.334186][ T3424] ? __stack_depot_save+0x22/0x490
[ 158.334186][ T3424] ? kmsan_internal_set_shadow_origin+0x66/0xe0
[ 158.334186][ T3424] ? kmsan_internal_chain_origin+0x110/0x120
[ 158.334186][ T3424] ? kmsan_get_shadow_origin_ptr+0x4d/0xa0
[ 158.334186][ T3424] __msan_warning+0x98/0x120
[ 158.334186][ T3424] aes_encrypt+0x1692/0x1fa0
[ 158.334186][ T3424] aesti_encrypt+0xe1/0x160
[ 158.334186][ T3424] crypto_cipher_encrypt_one+0x1d1/0x2e0
[ 158.334186][ T3424] ? aesti_set_key+0xb0/0xb0
[ 158.334186][ T3424] ? kmsan_get_shadow_origin_ptr+0x4d/0xa0
[ 158.334186][ T3424] crypto_cbcmac_digest_update+0x3ff/0x5a0
[ 158.334186][ T3424] ? crypto_cbcmac_digest_init+0x140/0x140
[ 158.334186][ T3424] shash_ahash_finup+0x79d/0xd00
[ 158.334186][ T3424] ? kmsan_get_shadow_origin_ptr+0x4d/0xa0
[ 158.334186][ T3424] shash_async_finup+0xbf/0x110
[ 158.334186][ T3424] crypto_ahash_finup+0x244/0x500
[ 158.334186][ T3424] ? shash_async_final+0x3d0/0x3d0
[ 158.334186][ T3424] crypto_ccm_auth+0x14df/0x15a0
[ 158.334186][ T3424] crypto_ccm_encrypt+0x2ad/0x8b0
[ 158.334186][ T3424] ? kmsan_get_shadow_origin_ptr+0x4d/0xa0
[ 158.334186][ T3424] ? crypto_ccm_setauthsize+0x100/0x100
[ 158.334186][ T3424] crypto_aead_encrypt+0x116/0x1a0
[ 158.653332][ T3424] tls_push_record+0x2bbe/0x3bf0
[ 158.653332][ T3424] bpf_exec_tx_verdict+0x5ba/0x2530
[ 158.653332][ T3424] ? _printk+0x181/0x1b0
[ 158.653332][ T3424] ? tls_sw_do_sendpage+0xc81/0x21f0
[ 158.653332][ T3424] tls_sw_do_sendpage+0x1779/0x21f0
[ 158.653332][ T3424] tls_sw_sendpage+0x247/0x2b0
[ 158.653332][ T3424] ? tls_sw_do_sendpage+0x21f0/0x21f0
[ 158.653332][ T3424] inet_sendpage+0x1de/0x2f0
[ 158.653332][ T3424] ? inet_sendmsg+0x1d0/0x1d0
[ 158.653332][ T3424] kernel_sendpage+0x4cc/0x940
[ 158.653332][ T3424] sock_sendpage+0x162/0x220
[ 158.653332][ T3424] pipe_to_sendpage+0x3df/0x4f0
[ 158.653332][ T3424] ? sock_fasync+0x240/0x240
[ 158.653332][ T3424] __splice_from_pipe+0x5c7/0x1010
[ 158.653332][ T3424] ? generic_splice_sendpage+0x2a0/0x2a0
[ 158.653332][ T3424] generic_splice_sendpage+0x1c6/0x2a0
[ 158.653332][ T3424] ? iter_file_splice_write+0x1a30/0x1a30
[ 158.653332][ T3424] do_splice+0x26d8/0x32f0
[ 158.653332][ T3424] ? kmsan_get_shadow_origin_ptr+0x4d/0xa0
[ 158.653332][ T3424] ? __se_sys_splice+0x292/0xba0
[ 158.653332][ T3424] ? __msan_metadata_ptr_for_load_8+0x24/0x40
[ 158.653332][ T3424] ? filter_irq_stacks+0xb9/0x230
[ 158.653332][ T3424] __se_sys_splice+0x81f/0xba0
[ 158.870673][ T3424] __x64_sys_splice+0x1a1/0x200
[ 158.870673][ T3424] do_syscall_64+0x41/0x90
[ 158.870673][ T3424] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 158.870673][ T3424] RIP: 0033:0x7f6bbd51ea3d
[ 158.895223][ T3424] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 a3 0f 00 f7 d8 64 89 01 48
[ 158.895223][ T3424] RSP: 002b:00007f6bbd731e08 EFLAGS: 00000246 ORIG_RAX: 0000000000000113
[ 158.895223][ T3424] RAX: ffffffffffffffda RBX: 000055ccd9ea6080 RCX: 00007f6bbd51ea3d
[ 158.895223][ T3424] RDX: 0000000000000004 RSI: 0000000000000000 RDI: 0000000000000003
[ 158.895223][ T3424] RBP: 000055ccd9ea41f4 R08: 00080000fffffffc R09: 0000000000000000
[ 158.895223][ T3424] R10: 0000000000000000 R11: 0000000000000246 R12: 0100000000000000
[ 158.895223][ T3424] R13: e65b75b4ec4292eb R14: f2300cdb85a45425 R15: 000055ccd9ea6088
[ 159.041467][ T3424] </TASK>
[ 159.041467][ T3424] Kernel Offset: disabled
[ 159.041467][ T3424] Rebooting in 10 seconds..
----------------------------------------

>
> Could you explain why that bug contains ~50 reports that seem entirely
> unrelated? AIUI, this actual issue has not been reproduced since
> 2020??

Multiple different bugs are reported as the same problem.
Reproducer is available for only bpf_exec_tx_verdict() one, and the reproducer still works.

>
>
>> Config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=ee5f7a0b2e48ed66
>> Report: https://syzkaller.appspot.com/text?tag=CrashReport&x=1325260d900000
>> Syz repro: https://syzkaller.appspot.com/text?tag=ReproSyz&x=11af973e900000
>> C repro: https://syzkaller.appspot.com/text?tag=ReproC&x=163a1e45900000
>>
>> The bug is reproducible for me locally as well (and Tetsuo's patch
>> makes it disappear, although I have no opinion on its correctness).
>
> What I'd like to do is run a kernel plus initrd locally in OVMF and
> reproduce the issue - can I do that without all the syzkaller
> machinery?

I'm using Ubuntu 22.04 on Oracle VM VirtualBox.
I don't know if this can be reproduced with kernel plus initrd only. But
since the C reproducer is standalone, syzkaller machinery is not involved.

Eric Dumazet

unread,
Jun 30, 2023, 7:24:37 AM6/30/23
to Tetsuo Handa, Ard Biesheuvel, Alexander Potapenko, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Paolo Abeni
On Fri, Jun 30, 2023 at 1:11 PM Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
>
> On 2023/06/30 19:18, Ard Biesheuvel wrote:
> > On Fri, 30 Jun 2023 at 12:11, Alexander Potapenko <gli...@google.com> wrote:
> >>
> >> On Fri, Jun 30, 2023 at 12:02 PM Ard Biesheuvel <ar...@kernel.org> wrote:
> >>>
> >>> On Fri, 30 Jun 2023 at 11:53, Tetsuo Handa
> >>> <penguin...@i-love.sakura.ne.jp> wrote:
> >>>>
> >>>> On 2023/06/30 18:36, Ard Biesheuvel wrote:
> >>>>> Why are you sending this now?
> >>>>
> >>>> Just because this is currently top crasher and I can reproduce locally.
> >>>>
> >>>>> Do you have a reproducer for this issue?
> >>>>
> >>>> Yes. https://syzkaller.appspot.com/text?tag=ReproC&x=12931621900000 works.
> >>>>
> >>>
> >>> Could you please share your kernel config and the resulting kernel log
> >>> when running the reproducer? I'll try to reproduce locally as well,
> >>> and see if I can figure out what is going on in the crypto layer
> >>
> >> The config together with the repro is available at
> >> https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d, see the
> >> latest row of the "Crashes" table that contains a C repro.
>
> Kernel is commit e6bc8833d80f of https://github.com/google/kmsan/commits/master .
> Config is available in the dashboard page, but a smaller one is available at
> https://I-love.SAKURA.ne.jp/tmp/config-6.4.0-rc7-kmsan .
>
> I'm using a debug printk() patch shown below.
>

Please note that your patch is not correct, unless I am missing something.

sk_page_frag() will use a per-thread frag allocator
(current->task_frag), which might be fed with pages
allocated from other sockets (standard TCP ones), without __GFP_ZERO.

If we must (and I am not saying we should) take this route, we also
need to force
"sk->sk_use_task_frag = false" for TLS sockets.

Ard Biesheuvel

unread,
Jun 30, 2023, 7:32:50 AM6/30/23
to Tetsuo Handa, Alexander Potapenko, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On Fri, 30 Jun 2023 at 13:12, Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
>
> On 2023/06/30 19:18, Ard Biesheuvel wrote:
> > On Fri, 30 Jun 2023 at 12:11, Alexander Potapenko <gli...@google.com> wrote:
> >>
> >> On Fri, Jun 30, 2023 at 12:02 PM Ard Biesheuvel <ar...@kernel.org> wrote:
> >>>
> >>> On Fri, 30 Jun 2023 at 11:53, Tetsuo Handa
> >>> <penguin...@i-love.sakura.ne.jp> wrote:
> >>>>
> >>>> On 2023/06/30 18:36, Ard Biesheuvel wrote:
> >>>>> Why are you sending this now?
> >>>>
> >>>> Just because this is currently top crasher and I can reproduce locally.
> >>>>
> >>>>> Do you have a reproducer for this issue?
> >>>>
> >>>> Yes. https://syzkaller.appspot.com/text?tag=ReproC&x=12931621900000 works.
> >>>>
> >>>
> >>> Could you please share your kernel config and the resulting kernel log
> >>> when running the reproducer? I'll try to reproduce locally as well,
> >>> and see if I can figure out what is going on in the crypto layer
> >>
> >> The config together with the repro is available at
> >> https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d, see the
> >> latest row of the "Crashes" table that contains a C repro.
>
> Kernel is commit e6bc8833d80f of https://github.com/google/kmsan/commits/master .

That commit does not exist in that repo. Does it matter?

> Config is available in the dashboard page, but a smaller one is available at
> https://I-love.SAKURA.ne.jp/tmp/config-6.4.0-rc7-kmsan .
>

Thanks - I'll try to rebuild v6.4-rc7 with that.

Alexander Potapenko

unread,
Jun 30, 2023, 7:38:15 AM6/30/23
to Ard Biesheuvel, Tetsuo Handa, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On Fri, Jun 30, 2023 at 12:18 PM Ard Biesheuvel <ar...@kernel.org> wrote:
>
> On Fri, 30 Jun 2023 at 12:11, Alexander Potapenko <gli...@google.com> wrote:
> >
> > On Fri, Jun 30, 2023 at 12:02 PM Ard Biesheuvel <ar...@kernel.org> wrote:
> > >
> > > On Fri, 30 Jun 2023 at 11:53, Tetsuo Handa
> > > <penguin...@i-love.sakura.ne.jp> wrote:
> > > >
> > > > On 2023/06/30 18:36, Ard Biesheuvel wrote:
> > > > > Why are you sending this now?
> > > >
> > > > Just because this is currently top crasher and I can reproduce locally.
> > > >
> > > > > Do you have a reproducer for this issue?
> > > >
> > > > Yes. https://syzkaller.appspot.com/text?tag=ReproC&x=12931621900000 works.
> > > >
> > >
> > > Could you please share your kernel config and the resulting kernel log
> > > when running the reproducer? I'll try to reproduce locally as well,
> > > and see if I can figure out what is going on in the crypto layer
> >
> > The config together with the repro is available at
> > https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d, see the
> > latest row of the "Crashes" table that contains a C repro.
>
> Could you explain why that bug contains ~50 reports that seem entirely
> unrelated?

These are some unfortunate effects of syzbot trying to deduplicate
bugs. There's a tradeoff between reporting every single crash
separately and grouping together those that have e.g. the same origin.
Applying this algorithm transitively results in bigger clusters
containing unwanted reports.
We'll look closer.

> AIUI, this actual issue has not been reproduced since
> 2020??

Oh, sorry, I misread the table and misinformed you. The topmost row of
the table is indeed the _oldest_ one.
Another manifestation of the bug was on 2023/05/23
(https://syzkaller.appspot.com/text?tag=CrashReport&x=146f66b1280000)


>
> > Config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=ee5f7a0b2e48ed66
> > Report: https://syzkaller.appspot.com/text?tag=CrashReport&x=1325260d900000
> > Syz repro: https://syzkaller.appspot.com/text?tag=ReproSyz&x=11af973e900000
> > C repro: https://syzkaller.appspot.com/text?tag=ReproC&x=163a1e45900000
> >
> > The bug is reproducible for me locally as well (and Tetsuo's patch
> > makes it disappear, although I have no opinion on its correctness).
>
> What I'd like to do is run a kernel plus initrd locally in OVMF and
> reproduce the issue - can I do that without all the syzkaller
> machinery?

You can build the kernel from the config linked above, that's what I
did to reproduce it locally.
As for initrd, there are disk images attached to the reports, will that help?

E.g.
$ wget https://storage.googleapis.com/syzbot-assets/79bb4ff7cc58/disk-f93f2fed.raw.xz
$ unxz disk-f93f2fed.raw.xz
$ qemu-system-x86_64 -smp 2,sockets=2,cores=1 -m 4G -drive
file=disk-f93f2fed.raw -snapshot -nographic -enable-kvm

lets me boot syzkaller with the disk/kernel from that report of 2023/05/23.
Adding "-net user,hostfwd=tcp::10022-:22 -net nic,model=e1000" I am
also able to SSH into the machine (there's no password):

$ ssh -o "StrictHostKeyChecking no" -p 10022 root@localhost

Then the repro can be downloaded and executed:

$ wget "https://syzkaller.appspot.com/text?tag=ReproC&x=163a1e45900000" -O t.c
$ gcc t.c -static -o t
$ scp -o "StrictHostKeyChecking no" -P 10022 t root@localhost:
$ ssh -o "StrictHostKeyChecking no" -p 10022 root@localhost ./t

Within a couple minutes the kernel crashes with the report:

[ 151.522472][ T5865] =====================================================
[ 151.523843][ T5865] BUG: KMSAN: uninit-value in aes_encrypt+0x15cc/0x1db0
[ 151.525120][ T5865] aes_encrypt+0x15cc/0x1db0
[ 151.526113][ T5865] aesti_encrypt+0x7d/0xf0
[ 151.527057][ T5865] crypto_cipher_encrypt_one+0x112/0x200
[ 151.528224][ T5865] crypto_cbcmac_digest_update+0x301/0x4b0

The vmlinux binary (also available at the syzbot page) can be used to
symbolize this report, but perhaps at this point you'd love to switch
to your own kernel.

--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

Alexander Potapenko

unread,
Jun 30, 2023, 7:43:40 AM6/30/23
to Ard Biesheuvel, Tetsuo Handa, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
> > >>>
> > >>> Could you please share your kernel config and the resulting kernel log
> > >>> when running the reproducer? I'll try to reproduce locally as well,
> > >>> and see if I can figure out what is going on in the crypto layer
> > >>
> > >> The config together with the repro is available at
> > >> https://syzkaller.appspot.com/bug?extid=828dfc12440b4f6f305d, see the
> > >> latest row of the "Crashes" table that contains a C repro.
> >
> > Kernel is commit e6bc8833d80f of https://github.com/google/kmsan/commits/master .
>
> That commit does not exist in that repo. Does it matter?

Apologies for this mess.
https://github.com/google/kmsan/commits/master is force-updated once a
week to point to the latest release candidate with KMSAN-specific
patches.
Older releases are called e.g. kmsan-v6.4-rc7.
Right now there's only one patch required to run torvalds/master with
KMSAN (https://github.com/google/kmsan/commit/e6bc8833d80f).
That patch will hit upstream in v6.5-rc1, after which I am going to
switch syzbot to test upstream.

For such a long-standing bug the exact version of KMSAN shouldn't matter.

Ard Biesheuvel

unread,
Jun 30, 2023, 7:49:50 AM6/30/23
to Alexander Potapenko, Tetsuo Handa, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
That one has nothing to do with networking, so I don't see how this
patch would affect it.
OK, thanks for the instructions.

Out of curiosity - does the stack trace you cut off here include the
BPF routine mentioned in the report?

Alexander Potapenko

unread,
Jun 30, 2023, 7:55:38 AM6/30/23
to Ard Biesheuvel, Tetsuo Handa, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
I definitely have to be more attentive.
You are right that this bug report is also unrelated. Yet it is still
fine to use the build artifacts corresponding to it (which is what I
did).
I'll investigate why so many reports got clustered into this one.



> OK, thanks for the instructions.
>
> Out of curiosity - does the stack trace you cut off here include the
> BPF routine mentioned in the report?

It does:

[ 151.522472][ T5865] =====================================================
[ 151.523843][ T5865] BUG: KMSAN: uninit-value in aes_encrypt+0x15cc/0x1db0
[ 151.525120][ T5865] aes_encrypt+0x15cc/0x1db0
[ 151.526113][ T5865] aesti_encrypt+0x7d/0xf0
[ 151.527057][ T5865] crypto_cipher_encrypt_one+0x112/0x200
[ 151.528224][ T5865] crypto_cbcmac_digest_update+0x301/0x4b0
[ 151.529459][ T5865] shash_ahash_finup+0x66e/0xc00
[ 151.530541][ T5865] shash_async_finup+0x7f/0xc0
[ 151.531542][ T5865] crypto_ahash_finup+0x1b8/0x3e0
[ 151.532583][ T5865] crypto_ccm_auth+0x1269/0x1350
[ 151.533606][ T5865] crypto_ccm_encrypt+0x1c9/0x7a0
[ 151.534650][ T5865] crypto_aead_encrypt+0xe0/0x150
[ 151.535695][ T5865] tls_push_record+0x3bf3/0x4ec0
[ 151.539491][ T5865] bpf_exec_tx_verdict+0x46e/0x21d0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[ 151.540597][ T5865] tls_sw_do_sendpage+0x1150/0x1ad0
[ 151.541594][ T5865] tls_sw_sendpage+0x15b/0x1b0
[ 151.542500][ T5865] inet_sendpage+0x138/0x210
[ 151.543365][ T5865] kernel_sendpage+0x34c/0x6d0
[ 151.544269][ T5865] sock_sendpage+0xb0/0x160
[ 151.545117][ T5865] pipe_to_sendpage+0x304/0x3f0
[ 151.546051][ T5865] __splice_from_pipe+0x438/0xc20
[ 151.547013][ T5865] generic_splice_sendpage+0x100/0x160
[ 151.548068][ T5865] do_splice+0x213b/0x2d10
[ 151.548933][ T5865] __se_sys_splice+0x5ad/0x8f0
[ 151.549851][ T5865] __x64_sys_splice+0x11b/0x1a0
[ 151.550790][ T5865] do_syscall_64+0x41/0xc0
[ 151.551646][ T5865] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 151.552773][ T5865]
[ 151.553220][ T5865] Uninit was stored to memory at:
[ 151.554212][ T5865] __crypto_xor+0x171/0x1310
[ 151.555062][ T5865] crypto_cbcmac_digest_update+0x208/0x4b0
[ 151.556132][ T5865] shash_ahash_finup+0x66e/0xc00
[ 151.557084][ T5865] shash_async_finup+0x7f/0xc0
[ 151.557989][ T5865] crypto_ahash_finup+0x1b8/0x3e0
[ 151.558941][ T5865] crypto_ccm_auth+0x1269/0x1350
[ 151.559874][ T5865] crypto_ccm_encrypt+0x1c9/0x7a0
[ 151.560812][ T5865] crypto_aead_encrypt+0xe0/0x150
[ 151.561749][ T5865] tls_push_record+0x3bf3/0x4ec0
[ 151.562835][ T5865] bpf_exec_tx_verdict+0x46e/0x21d0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[ 151.563967][ T5865] tls_sw_do_sendpage+0x1150/0x1ad0
[ 151.565075][ T5865] tls_sw_sendpage+0x15b/0x1b0
[ 151.566107][ T5865] inet_sendpage+0x138/0x210
[ 151.567078][ T5865] kernel_sendpage+0x34c/0x6d0
[ 151.568087][ T5865] sock_sendpage+0xb0/0x160
[ 151.568960][ T5865] pipe_to_sendpage+0x304/0x3f0
[ 151.569909][ T5865] __splice_from_pipe+0x438/0xc20
[ 151.570886][ T5865] generic_splice_sendpage+0x100/0x160
[ 151.571946][ T5865] do_splice+0x213b/0x2d10
[ 151.572810][ T5865] __se_sys_splice+0x5ad/0x8f0
[ 151.573732][ T5865] __x64_sys_splice+0x11b/0x1a0
[ 151.574664][ T5865] do_syscall_64+0x41/0xc0
[ 151.575513][ T5865] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 151.576640][ T5865]
[ 151.577084][ T5865] Uninit was created at:
[ 151.577949][ T5865] __alloc_pages+0x9a4/0xe00
[ 151.578849][ T5865] alloc_pages+0xd01/0x1040
[ 151.579729][ T5865] skb_page_frag_refill+0x2bf/0x7c0
[ 151.580752][ T5865] sk_page_frag_refill+0x59/0x130
[ 151.581720][ T5865] sk_msg_alloc+0x198/0x10d0
[ 151.582611][ T5865] tls_sw_do_sendpage+0x98a/0x1ad0
[ 151.583599][ T5865] tls_sw_sendpage+0x15b/0x1b0
[ 151.584535][ T5865] inet_sendpage+0x138/0x210
[ 151.585404][ T5865] kernel_sendpage+0x34c/0x6d0
[ 151.586275][ T5865] sock_sendpage+0xb0/0x160
[ 151.587099][ T5865] pipe_to_sendpage+0x304/0x3f0
[ 151.588023][ T5865] __splice_from_pipe+0x438/0xc20
[ 151.588981][ T5865] generic_splice_sendpage+0x100/0x160
[ 151.590032][ T5865] do_splice+0x213b/0x2d10
[ 151.590910][ T5865] __se_sys_splice+0x5ad/0x8f0
[ 151.591840][ T5865] __x64_sys_splice+0x11b/0x1a0
[ 151.592780][ T5865] do_syscall_64+0x41/0xc0
[ 151.593748][ T5865] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Ard Biesheuvel

unread,
Jun 30, 2023, 11:16:22 AM6/30/23
to Alexander Potapenko, Tetsuo Handa, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
OK, so after poking around a little bit, I have managed to confirm
that the problem is in the TLS layer, and I am a bit out of my depth
debugging that.

With the debugging code below applied, KMSAN triggers on an
uninit-memory in the input scatterlist provided by the TLS layer into
the CCM code.

[ 148.375852][ T2424] =====================================================
[ 148.377269][ T2424] BUG: KMSAN: uninit-value in tls_push_record+0x2d9f/0x3eb0
[ 148.378623][ T2424] tls_push_record+0x2d9f/0x3eb0
[ 148.379559][ T2424] bpf_exec_tx_verdict+0x5ba/0x2530
[ 148.380534][ T2424] tls_sw_do_sendpage+0x169c/0x1f80
[ 148.381519][ T2424] tls_sw_sendpage+0x247/0x2b0
...
[ 148.411559][ T2424]
[ 148.412108][ T2424] Bytes 0-15 of 16 are uninitialized
[ 148.413379][ T2424] Memory access of size 16 starts at ffff8880157889c7

Note that this is the *input* scatterlist containing the AAD
(additional authenticated data) and the crypto input, and so there is
definitely a bug here that shouldn't be papered over by zero'ing the
allocation.

--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -543,6 +543,21 @@ static int tls_do_encryption(struct sock *sk,
list_add_tail((struct list_head *)&rec->list, &ctx->tx_list);
atomic_inc(&ctx->encrypt_pending);

+ {
+ int len = aead_req->assoclen + aead_req->cryptlen;
+ struct sg_mapping_iter miter;
+
+ sg_miter_start(&miter, rec->sg_aead_in,
+ sg_nents(rec->sg_aead_in),
+ SG_MITER_TO_SG | SG_MITER_ATOMIC);
+
+ while (len > 0 && sg_miter_next(&miter)) {
+ kmsan_check_memory(miter.addr, min(len,
(int)miter.length));
+ len -= miter.length;
+ }
+ sg_miter_stop(&miter);
+ }
+

The reason that this cascades all the way down to the AES cipher code
appears to be that sbox substitution involves array indexing, which is
one of the actions KMSAN qualifies as 'use' of uninit data.

Jakub Kicinski

unread,
Jun 30, 2023, 11:27:36 AM6/30/23
to Ard Biesheuvel, Alexander Potapenko, Tetsuo Handa, Boris Pismenny, John Fastabend, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On Fri, 30 Jun 2023 17:16:06 +0200 Ard Biesheuvel wrote:
> Note that this is the *input* scatterlist containing the AAD
> (additional authenticated data) and the crypto input, and so there is
> definitely a bug here that shouldn't be papered over by zero'ing the
> allocation.

Noob question, it's not the tag / AAD, right? We definitely don't init
that..

Ard Biesheuvel

unread,
Jun 30, 2023, 11:51:10 AM6/30/23
to Jakub Kicinski, Alexander Potapenko, Tetsuo Handa, Boris Pismenny, John Fastabend, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
assoclen + cryptlen does not cover the tag on encryption, so it would
be omitted here regardless of whether it was covered in the input
scatterlist or not.

Tetsuo Handa

unread,
Jul 1, 2023, 12:12:54 AM7/1/23
to Ard Biesheuvel, Alexander Potapenko, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
I'm trying to write a simplified reproducer that reproduces

----------------------------------------
[ 162.905919][ T3399] required_size=2461 ret=0
[ 162.916796][ T3399] required_size=6557 ret=0
[ 162.919587][ T3399] required_size=10653 ret=0
[ 162.923090][ T3399] required_size=14749 ret=0
[ 162.928686][ T3399] required_size=16413 ret=0
[ 162.936894][ T3399] full_record=1 eor=0 sk_msg_full(msg_pl)=0 copied=1664
[ 162.962097][ T3399] required_size=2461 ret=0
[ 162.967270][ T3399] required_size=6557 ret=0
[ 162.992866][ T3399] required_size=10653 ret=0
[ 162.999962][ T3399] required_size=14765 ret=0
[ 163.006420][ T3399] required_size=16413 ret=0
[ 163.012163][ T3399] full_record=1 eor=0 sk_msg_full(msg_pl)=0 copied=1648
----------------------------------------

part, and came to wonder if serialization between splice() and sendmsg() is correct.

A program where splice() and sendmsg() cannot run in parallel due to single-threaded
is shown below.

----------------------------------------
#define _GNU_SOURCE
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#define SOL_TCP 6
#define TCP_REPAIR 19
#define TCP_ULP 31
#define TLS_TX 1

int main(int argc, char *argv[])
{
struct iovec iov = {
.iov_base = "@@@@@@@@@@@@@@@@",
.iov_len = 16
};
struct msghdr hdr = {
.msg_iov = &iov,
.msg_iovlen = 1,
.msg_flags = MSG_FASTOPEN
};
const struct sockaddr_in6 addr = { .sin6_family = AF_INET6, .sin6_addr = in6addr_loopback };
const int one = 1;
int ret_ignored = 0;
const int fd = socket(PF_INET6, SOCK_STREAM, IPPROTO_IP);
int pipe_fds[2] = { -1, -1 };
static char buf[32768] = { };

ret_ignored += pipe(pipe_fds);
setsockopt(fd, SOL_TCP, TCP_REPAIR, &one, sizeof(one));
connect(fd, (struct sockaddr *) &addr, sizeof(addr));
setsockopt(fd, SOL_TCP, TCP_ULP, "tls", 4);
setsockopt(fd, SOL_TLS, TLS_TX,"\3\0035\0%T\244\205\333\f0\362B\221\243\234\206\216\220\243u\347\342P|1\24}Q@\377\227\353\222B\354\264u[\346", 40);
ret_ignored += write(pipe_fds[1], buf, 2432);
ret_ignored += write(pipe_fds[1], buf, 10676);
ret_ignored += write(pipe_fds[1], buf, 17996);
if (argc == 1) {
ret_ignored += splice(pipe_fds[0], NULL, fd, NULL, 1048576, 0);
ret_ignored += sendmsg(fd, &hdr, MSG_DONTWAIT | MSG_MORE);
} else {
ret_ignored += sendmsg(fd, &hdr, MSG_DONTWAIT | MSG_MORE);
ret_ignored += splice(pipe_fds[0], NULL, fd, NULL, 1048576, 0);
}
return ret_ignored * 0;
}
----------------------------------------

If you run this program with argc == 1, you will get below output.

----------------------------------------
[ 4030.292376] [ T3896] required_size=2461 ret=0
[ 4030.292376] [ T3896] required_size=6557 ret=0
[ 4030.292376] [ T3896] required_size=10653 ret=0
[ 4030.292376] [ T3896] required_size=14749 ret=0
[ 4030.292376] [ T3896] required_size=16413 ret=0
[ 4030.292376] [ T3896] full_record=1 eor=0 sk_msg_full(msg_pl)=0 copied=1664
[ 4030.330185] [ T3896] required_size=2461 ret=0
[ 4030.330185] [ T3896] required_size=6557 ret=0
[ 4030.330185] [ T3896] required_size=10653 ret=0
[ 4030.335443] [ T3896] required_size=14749 ret=0
[ 4030.335443] [ T3896] full_record=0 eor=1 sk_msg_full(msg_pl)=0 copied=4096
----------------------------------------

If you run this program with argc != 1, you will get below output.

----------------------------------------
[ 4044.312696] [ T3898] required_size=2477 ret=0
[ 4044.312696] [ T3898] required_size=6573 ret=0
[ 4044.312696] [ T3898] required_size=10669 ret=0
[ 4044.312696] [ T3898] required_size=14765 ret=0
[ 4044.312696] [ T3898] required_size=16413 ret=0
[ 4044.312696] [ T3898] full_record=1 eor=0 sk_msg_full(msg_pl)=0 copied=1648
[ 4044.340425] [ T3898] required_size=2477 ret=0
[ 4044.350515] [ T3898] required_size=6573 ret=0
[ 4044.350515] [ T3898] required_size=10669 ret=0
[ 4044.350515] [ T3898] required_size=14765 ret=0
[ 4044.350515] [ T3898] full_record=0 eor=1 sk_msg_full(msg_pl)=0 copied=4096
----------------------------------------

The difference is that the required_size= value differs by "struct iovec"->iov_len
bytes which is the value passed to sendmsg().

If splice() happens before sendmsg() happens, the output does not include
bytes passed to sendmsg(). If sendmsg() happens before splice() happens,
the output includes bytes passed to sendmsg().

Then, where does the difference between

----------------------------------------
[ 162.919587][ T3399] required_size=10653 ret=0
[ 162.923090][ T3399] required_size=14749 ret=0
[ 162.928686][ T3399] required_size=16413 ret=0
----------------------------------------

and

----------------------------------------
[ 162.992866][ T3399] required_size=10653 ret=0
[ 162.999962][ T3399] required_size=14765 ret=0
[ 163.006420][ T3399] required_size=16413 ret=0
----------------------------------------

come from? Both output had the same values until 10653, but
the next value differs by 16. This might suggest that a race between
splice() and sendmsg() caused unexpected required_size= value...

This could explain "for the second time" part below.

Tetsuo Handa

unread,
Jul 4, 2023, 9:32:20 AM7/4/23
to Ard Biesheuvel, Alexander Potapenko, Boris Pismenny, John Fastabend, Jakub Kicinski, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On 2023/07/01 13:12, Tetsuo Handa wrote:
> Then, where does the difference between
>
> ----------------------------------------
> [ 162.919587][ T3399] required_size=10653 ret=0
> [ 162.923090][ T3399] required_size=14749 ret=0
> [ 162.928686][ T3399] required_size=16413 ret=0
> ----------------------------------------
>
> and
>
> ----------------------------------------
> [ 162.992866][ T3399] required_size=10653 ret=0
> [ 162.999962][ T3399] required_size=14765 ret=0
> [ 163.006420][ T3399] required_size=16413 ret=0
> ----------------------------------------
>
> come from? Both output had the same values until 10653, but
> the next value differs by 16. This might suggest that a race between
> splice() and sendmsg() caused unexpected required_size= value...
>

I found a simplified reproducer.
This problem happens when splice() and sendmsg() run in parallel.

----------------------------------------
#define _GNU_SOURCE
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <sys/poll.h>
#define SOL_TCP 6
#define TCP_REPAIR 19
#define TCP_ULP 31
#define TLS_TX 1

int main(int argc, char *argv[])
{
struct iovec iov = {
.iov_base = "@@@@@@@@@@@@@@@@",
.iov_len = 16
};
struct msghdr hdr = {
.msg_iov = &iov,
.msg_iovlen = 1,
.msg_flags = MSG_FASTOPEN
};
const struct sockaddr_in6 addr = { .sin6_family = AF_INET6, .sin6_addr = in6addr_loopback };
const int one = 1;
int ret_ignored = 0;
const int fd = socket(PF_INET6, SOCK_STREAM, IPPROTO_IP);
int pipe_fds[2] = { -1, -1 };
static char buf[32768] = { };

ret_ignored += pipe(pipe_fds);
setsockopt(fd, SOL_TCP, TCP_REPAIR, &one, sizeof(one));
connect(fd, (struct sockaddr *) &addr, sizeof(addr));
setsockopt(fd, SOL_TCP, TCP_ULP, "tls", 4);
setsockopt(fd, SOL_TLS, TLS_TX,"\3\0035\0%T\244\205\333\f0\362B\221\243\234\206\216\220\243u\347\342P|1\24}Q@\377\227\353\222B\354\264u[\346", 40);
if (fork() == 0) {
ret_ignored += splice(pipe_fds[0], NULL, fd, NULL, 1048576, SPLICE_F_MORE);
_exit(0);
}
close(pipe_fds[0]);
ret_ignored += write(pipe_fds[1], buf, sizeof(buf));
poll(NULL, 0, 1);
ret_ignored += sendmsg(fd, &hdr, MSG_DONTWAIT | MSG_MORE);
return ret_ignored * 0;
}
----------------------------------------

Debug printk() patch and output are shown below.

----------------------------------------
diff --git a/net/core/sock.c b/net/core/sock.c
index 24f2761bdb1d..879c1d54deed 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2825,6 +2825,14 @@ static void sk_leave_memory_pressure(struct sock *sk)

DEFINE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);

+static inline char *kaddr(struct page *page)
+{
+ char *kaddr = kmap_atomic(page);
+
+ kunmap_atomic(kaddr);
+ return kaddr;
+}
+
/**
* skb_page_frag_refill - check that a page_frag contains enough room
* @sz: minimum size of the fragment we want to get
@@ -2840,10 +2848,17 @@ bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp)
if (pfrag->page) {
if (page_ref_count(pfrag->page) == 1) {
pfrag->offset = 0;
+ if (current->comm[0] && !strcmp(current->comm + 1, ".out")) {
+ pr_info("assigned %px\n", kaddr(pfrag->page));
+ }
return true;
}
- if (pfrag->offset + sz <= pfrag->size)
+ if (pfrag->offset + sz <= pfrag->size) {
+ if (current->comm[0] && !strcmp(current->comm + 1, ".out")) {
+ pr_info("assigned %d from %px\n", sz, kaddr(pfrag->page) + pfrag->offset);
+ }
return true;
+ }
put_page(pfrag->page);
}

@@ -2856,12 +2871,18 @@ bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp)
__GFP_NORETRY,
SKB_FRAG_PAGE_ORDER);
if (likely(pfrag->page)) {
+ if (current->comm[0] && !strcmp(current->comm + 1, ".out")) {
+ pr_info("allocated %px\n", kaddr(pfrag->page));
+ }
pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER;
return true;
}
}
pfrag->page = alloc_page(gfp);
if (likely(pfrag->page)) {
+ if (current->comm[0] && !strcmp(current->comm + 1, ".out")) {
+ pr_info("allocated %px\n", kaddr(pfrag->page));
+ }
pfrag->size = PAGE_SIZE;
return true;
}
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 1a53c8f481e9..dcf00d1f239f 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -543,6 +543,22 @@ static int tls_do_encryption(struct sock *sk,
list_add_tail((struct list_head *)&rec->list, &ctx->tx_list);
atomic_inc(&ctx->encrypt_pending);

+ {
+ int len = aead_req->assoclen + aead_req->cryptlen;
+ struct sg_mapping_iter miter;
+
+ sg_miter_start(&miter, rec->sg_aead_in,
+ sg_nents(rec->sg_aead_in),
+ SG_MITER_TO_SG | SG_MITER_ATOMIC);
+
+ while (len > 0 && sg_miter_next(&miter)) {
+ pr_info("addr=%px len=%d\n", miter.addr, min(len, (int)miter.length));
+ kmsan_check_memory(miter.addr, min(len, (int)miter.length));
+ len -= miter.length;
+ }
+ sg_miter_stop(&miter);
+ }
+
rc = crypto_aead_encrypt(aead_req);
if (!rc || rc != -EINPROGRESS) {
atomic_dec(&ctx->encrypt_pending);
@@ -1211,6 +1227,7 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
goto wait_for_sndbuf;
alloc_payload:
ret = tls_alloc_encrypted_msg(sk, required_size);
+ pr_info("required_size=%d ret=%d\n", required_size, ret);
if (ret) {
if (ret != -ENOSPC)
goto wait_for_memory;
@@ -1232,6 +1249,8 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,

tls_ctx->pending_open_record_frags = true;
if (full_record || eor || sk_msg_full(msg_pl)) {
+ pr_info("full_record=%d eor=%d sk_msg_full(msg_pl)=%d copied=%d\n",
+ full_record, eor, sk_msg_full(msg_pl), copied);
ret = bpf_exec_tx_verdict(msg_pl, sk, full_record,
record_type, &copied, flags);
if (ret) {
----------------------------------------

skb_page_frag_refill() assigned 32 bytes from ffff88810d40901d
but tls_do_encryption() accesses 16 bytes from ffff88810d40900d !?

----------------------------------------
[ 159.777428][ T3405] sock: allocated ffff88810d408000
[ 159.792685][ T3405] required_size=4125 ret=0
[ 159.802047][ T3395] sock: allocated ffff88810b000000
[ 159.799656][ T3405] sock: assigned 32 from ffff88810d40901d
[ 159.828229][ T3405] required_size=8237 ret=0
[ 159.848161][ T3405] sock: assigned 32 from ffff88810d40a01d
[ 159.849298][ T3405] required_size=12333 ret=0
[ 159.849382][ T3405] sock: assigned 32 from ffff88810d40b01d
[ 159.849416][ T3405] required_size=16413 ret=0
[ 159.849434][ T3405] full_record=1 eor=0 sk_msg_full(msg_pl)=0 copied=4080
[ 159.849472][ T3405] addr=ffff888115aa4658 len=13
[ 159.849492][ T3405] addr=ffff88811da48000 len=4096
[ 159.849513][ T3405] addr=ffff88810d40900d len=16
[ 159.849530][ T3405] =====================================================
[ 159.849618][ T3405] BUG: KMSAN: uninit-value in tls_push_record+0x2cff/0x4040
[ 159.849654][ T3405] tls_push_record+0x2cff/0x4040
[ 159.849686][ T3405] bpf_exec_tx_verdict+0x5ba/0x2530
[ 159.849715][ T3405] tls_sw_do_sendpage+0x1779/0x21f0
[ 159.849744][ T3405] tls_sw_sendpage+0x247/0x2b0
[ 159.849772][ T3405] inet_sendpage+0x1de/0x2f0
[ 159.849807][ T3405] kernel_sendpage+0x4cc/0x940
[ 159.849837][ T3405] sock_sendpage+0x162/0x220
[ 159.849864][ T3405] pipe_to_sendpage+0x3df/0x4f0
[ 159.849891][ T3405] __splice_from_pipe+0x5c7/0x1010
[ 159.849916][ T3405] generic_splice_sendpage+0x1c6/0x2a0
[ 159.849943][ T3405] do_splice+0x26d8/0x32f0
[ 159.849966][ T3405] __se_sys_splice+0x81f/0xba0
[ 159.849991][ T3405] __x64_sys_splice+0x1a1/0x200
[ 159.850016][ T3405] do_syscall_64+0x41/0x90
[ 159.850044][ T3405] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 159.850082][ T3405]
[ 159.850088][ T3405] Uninit was created at:
[ 159.850136][ T3405] __alloc_pages+0x925/0x1050
[ 159.850171][ T3405] alloc_pages+0xe30/0x11b0
[ 159.850204][ T3405] skb_page_frag_refill+0x539/0x1030
[ 159.850238][ T3405] sk_page_frag_refill+0xa2/0x1c0
[ 159.850271][ T3405] sk_msg_alloc+0x278/0x1560
[ 159.850307][ T3405] tls_sw_do_sendpage+0xbec/0x21f0
[ 159.850336][ T3405] tls_sw_sendpage+0x247/0x2b0
[ 159.850363][ T3405] inet_sendpage+0x1de/0x2f0
[ 159.850392][ T3405] kernel_sendpage+0x4cc/0x940
[ 159.850418][ T3405] sock_sendpage+0x162/0x220
[ 159.850446][ T3405] pipe_to_sendpage+0x3df/0x4f0
[ 159.850470][ T3405] __splice_from_pipe+0x5c7/0x1010
[ 159.850495][ T3405] generic_splice_sendpage+0x1c6/0x2a0
[ 159.850522][ T3405] do_splice+0x26d8/0x32f0
[ 159.850545][ T3405] __se_sys_splice+0x81f/0xba0
[ 159.850570][ T3405] __x64_sys_splice+0x1a1/0x200
[ 159.850595][ T3405] do_syscall_64+0x41/0x90
[ 159.850619][ T3405] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 159.850654][ T3405]
[ 159.850659][ T3405] Bytes 0-15 of 16 are uninitialized
[ 159.850672][ T3405] Memory access of size 16 starts at ffff88810d40900d
[ 159.850685][ T3405]
[ 159.850694][ T3405] CPU: 2 PID: 3405 Comm: a.out Not tainted 6.4.0-rc7-ge6bc8833d80f-dirty #32
[ 159.850725][ T3405] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 159.850741][ T3405] =====================================================
[ 159.850749][ T3405] Disabling lock debugging due to kernel taint
[ 159.850761][ T3405] Kernel panic - not syncing: kmsan.panic set ...
[ 159.850774][ T3405] CPU: 2 PID: 3405 Comm: a.out Tainted: G B 6.4.0-rc7-ge6bc8833d80f-dirty #32
[ 159.850801][ T3405] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 159.850814][ T3405] Call Trace:
[ 159.850824][ T3405] <TASK>
[ 159.850833][ T3405] dump_stack_lvl+0x1f6/0x280
[ 159.850863][ T3405] dump_stack+0x29/0x30
[ 159.850888][ T3405] panic+0x4e7/0xc60
[ 159.850928][ T3405] ? add_taint+0x185/0x210
[ 159.850969][ T3405] kmsan_report+0x2d1/0x2e0
[ 159.851008][ T3405] ? kmsan_internal_unpoison_memory+0x14/0x20
[ 159.851048][ T3405] ? __msan_metadata_ptr_for_load_4+0x24/0x40
[ 159.851089][ T3405] ? kmsan_internal_check_memory+0x47d/0x540
[ 159.851128][ T3405] ? kmsan_check_memory+0x1a/0x20
[ 159.851166][ T3405] ? tls_push_record+0x2cff/0x4040
[ 159.851197][ T3405] ? bpf_exec_tx_verdict+0x5ba/0x2530
[ 159.851228][ T3405] ? tls_sw_do_sendpage+0x1779/0x21f0
[ 159.851258][ T3405] ? tls_sw_sendpage+0x247/0x2b0
[ 159.851288][ T3405] ? inet_sendpage+0x1de/0x2f0
[ 159.851318][ T3405] ? kernel_sendpage+0x4cc/0x940
[ 159.851346][ T3405] ? sock_sendpage+0x162/0x220
[ 159.851374][ T3405] ? pipe_to_sendpage+0x3df/0x4f0
[ 159.851399][ T3405] ? __splice_from_pipe+0x5c7/0x1010
[ 159.851427][ T3405] ? generic_splice_sendpage+0x1c6/0x2a0
[ 159.851455][ T3405] ? do_splice+0x26d8/0x32f0
[ 159.851479][ T3405] ? __se_sys_splice+0x81f/0xba0
[ 159.851506][ T3405] ? __x64_sys_splice+0x1a1/0x200
[ 159.851532][ T3405] ? do_syscall_64+0x41/0x90
[ 159.851557][ T3405] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 159.851598][ T3405] ? vprintk_default+0x8a/0xa0
[ 159.851634][ T3405] ? vprintk+0x163/0x180
[ 159.851669][ T3405] ? _printk+0x181/0x1b0
[ 159.851706][ T3405] kmsan_internal_check_memory+0x47d/0x540
[ 159.851748][ T3405] ? kmsan_get_shadow_origin_ptr+0x4d/0xa0
[ 159.851790][ T3405] kmsan_check_memory+0x1a/0x20
[ 159.851828][ T3405] tls_push_record+0x2cff/0x4040
[ 159.851873][ T3405] bpf_exec_tx_verdict+0x5ba/0x2530
[ 159.851926][ T3405] ? _printk+0x181/0x1b0
[ 159.851965][ T3405] ? tls_sw_do_sendpage+0xbd1/0x21f0
[ 159.851998][ T3405] tls_sw_do_sendpage+0x1779/0x21f0
[ 159.852045][ T3405] tls_sw_sendpage+0x247/0x2b0
[ 159.852078][ T3405] ? tls_sw_do_sendpage+0x21f0/0x21f0
[ 159.852110][ T3405] inet_sendpage+0x1de/0x2f0
[ 159.852143][ T3405] ? inet_sendmsg+0x1d0/0x1d0
[ 159.852173][ T3405] kernel_sendpage+0x4cc/0x940
[ 159.852206][ T3405] sock_sendpage+0x162/0x220
[ 159.852238][ T3405] pipe_to_sendpage+0x3df/0x4f0
[ 159.852264][ T3405] ? sock_fasync+0x240/0x240
[ 159.852296][ T3405] __splice_from_pipe+0x5c7/0x1010
[ 159.852324][ T3405] ? generic_splice_sendpage+0x2a0/0x2a0
[ 159.852360][ T3405] generic_splice_sendpage+0x1c6/0x2a0
[ 159.852393][ T3405] ? iter_file_splice_write+0x1a30/0x1a30
[ 159.852421][ T3405] do_splice+0x26d8/0x32f0
[ 159.852446][ T3405] ? kmsan_internal_set_shadow_origin+0x66/0xe0
[ 159.852487][ T3405] ? _raw_spin_lock_irqsave+0x91/0x110
[ 159.852527][ T3405] ? kmsan_get_shadow_origin_ptr+0x4d/0xa0
[ 159.852567][ T3405] ? _raw_spin_lock_irqsave+0x91/0x110
[ 159.852606][ T3405] ? _raw_spin_unlock_irqrestore+0x76/0xb0
[ 159.852652][ T3405] __se_sys_splice+0x81f/0xba0
[ 159.852696][ T3405] __x64_sys_splice+0x1a1/0x200
[ 159.852735][ T3405] do_syscall_64+0x41/0x90
[ 159.852767][ T3405] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 159.852813][ T3405] RIP: 0033:0x7f5a347261aa
[ 159.852840][ T3405] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 13 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
[ 159.852873][ T3405] RSP: 002b:00007ffe1b674588 EFLAGS: 00000246 ORIG_RAX: 0000000000000113
[ 159.852905][ T3405] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5a347261aa
[ 159.852926][ T3405] RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000004
[ 159.852942][ T3405] RBP: 0000000000000001 R08: 0000000000100000 R09: 0000000000000004
[ 159.852959][ T3405] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
[ 159.852975][ T3405] R13: 000055675f9631c0 R14: 000055675f965d58 R15: 00007f5a348e8040
[ 159.853001][ T3405] </TASK>
[ 159.858725][ T3405] Kernel Offset: disabled
[ 159.858725][ T3405] Rebooting in 10 seconds..
----------------------------------------


Jakub Kicinski

unread,
Jul 6, 2023, 4:53:22 PM7/6/23
to Tetsuo Handa, Ard Biesheuvel, Alexander Potapenko, Boris Pismenny, John Fastabend, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On Tue, 4 Jul 2023 22:32:00 +0900 Tetsuo Handa wrote:
> I found a simplified reproducer.
> This problem happens when splice() and sendmsg() run in parallel.

Could you retry with the upstream (tip of Linus's tree) and see if it
still repros? I tried to get a KMSAN kernel to boot on QEMU but it
the kernel doesn't want to start, no idea what's going on :(

Tetsuo Handa

unread,
Jul 7, 2023, 5:41:58 AM7/7/23
to Jakub Kicinski, Ard Biesheuvel, Alexander Potapenko, Boris Pismenny, John Fastabend, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
I can't reproduce this problem as of commit a452483508d7 of linux.git tree, for
the simplified reproducer is failing with EBADMSG error. Unless what the simplified
reproducer is doing has become illegal, I need to bisect between

commit 219d92056ba3 ("splice, net: Fix SPLICE_F_MORE signalling in splice_direct_to_actor()")

which fails with EBADMSG error and

commit 8a0d57df8938 ("tls: improve lockless access safety of tls_err_abort()")

which shows this problem, with

commit e6bc8833d80f ("string: use __builtin_memcpy() in strlcpy/strlcat")

backported...

----------------------------------------
root@fuzz:~# strace -f ./a.out
execve("./a.out", ["./a.out"], 0x7ffedb58a368 /* 26 vars */) = 0
brk(NULL) = 0x564cb3f30000
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffd9c8980a0) = -1 EINVAL (Invalid argument)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f463d4e6000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=37735, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 37735, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f463d4dc000
close(3) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\237\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
pread64(3, "\4\0\0\0 \0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0"..., 48, 848) = 48
pread64(3, "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0i8\235HZ\227\223\333\350s\360\352,\223\340."..., 68, 896) = 68
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=2216304, ...}, AT_EMPTY_PATH) = 0
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 2260560, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f463d200000
mmap(0x7f463d228000, 1658880, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x28000) = 0x7f463d228000
mmap(0x7f463d3bd000, 360448, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bd000) = 0x7f463d3bd000
mmap(0x7f463d415000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x214000) = 0x7f463d415000
mmap(0x7f463d41b000, 52816, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f463d41b000
close(3) = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f463d4d9000
arch_prctl(ARCH_SET_FS, 0x7f463d4d9740) = 0
set_tid_address(0x7f463d4d9a10) = 3444
set_robust_list(0x7f463d4d9a20, 24) = 0
rseq(0x7f463d4da0e0, 0x20, 0, 0x53053053) = 0
mprotect(0x7f463d415000, 16384, PROT_READ) = 0
mprotect(0x564cb38ac000, 4096, PROT_READ) = 0
mprotect(0x7f463d520000, 8192, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
munmap(0x7f463d4dc000, 37735) = 0
socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 3
pipe2([4, 5], 0) = 0
setsockopt(3, SOL_TCP, TCP_REPAIR, [1], 4) = 0
connect(3, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
setsockopt(3, SOL_TCP, TCP_ULP, [7564404], 4) = 0
setsockopt(3, SOL_TLS, TLS_TX, "\3\0035\0%T\244\205\333\f0\362B\221\243\234\206\216\220\243u\347\342P|1\24}Q@\377\227"..., 40) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 3445 attached
, child_tidptr=0x7f463d4d9a10) = 3445
[pid 3444] close(4 <unfinished ...>
[pid 3445] set_robust_list(0x7f463d4d9a20, 24 <unfinished ...>
[pid 3444] <... close resumed>) = 0
[pid 3445] <... set_robust_list resumed>) = 0
[pid 3444] write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 32768 <unfinished ...>
[pid 3445] splice(4, NULL, 3, NULL, 1048576, SPLICE_F_MORE <unfinished ...>
[pid 3444] <... write resumed>) = 32768
[pid 3444] poll(NULL, 0, 1) = 0 (Timeout)
[pid 3444] sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="@@@@@@@@@@@@@@@@", iov_len=16}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_FASTOPEN}, MSG_DONTWAIT|MSG_MORE) = -1 EBADMSG (Bad message)
[pid 3445] <... splice resumed>) = -1 EBADMSG (Bad message)
[pid 3444] exit_group(0 <unfinished ...>
[pid 3445] exit_group(0 <unfinished ...>
[pid 3444] <... exit_group resumed>) = ?
[pid 3445] <... exit_group resumed>) = ?
[pid 3444] +++ exited with 0 +++
+++ exited with 0 +++
----------------------------------------

Tetsuo Handa

unread,
Jul 8, 2023, 6:35:03 AM7/8/23
to Jakub Kicinski, Ard Biesheuvel, Alexander Potapenko, Boris Pismenny, John Fastabend, her...@gondor.apana.org.au, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot, Eric Biggers, Aviad Yehezkel, Daniel Borkmann, net...@vger.kernel.org, David S. Miller, Eric Dumazet, Paolo Abeni
On 2023/07/07 18:41, Tetsuo Handa wrote:
> On 2023/07/07 5:53, Jakub Kicinski wrote:
>> On Tue, 4 Jul 2023 22:32:00 +0900 Tetsuo Handa wrote:
>>> I found a simplified reproducer.
>>> This problem happens when splice() and sendmsg() run in parallel.
>>
>> Could you retry with the upstream (tip of Linus's tree) and see if it
>> still repros? I tried to get a KMSAN kernel to boot on QEMU but it
>> the kernel doesn't want to start, no idea what's going on :(

(Returned from https://lkml.kernel.org/r/20230707180...@kernel.org .)

It seems that a series of changes that was merged in this merge window
made this problem no longer reproducible. ;-)

Which commit should we mark as a fix for this problem?

syzbot

unread,
Mar 18, 2024, 8:25:16 PMMar 18
to gli...@google.com, syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
No recent activity, existing reproducers are no longer triggering the issue.
Reply all
Reply to author
Forward
0 new messages