[syzbot] upstream boot error: WARNING in netlink_ack

25 views
Skip to first unread message

syzbot

unread,
Oct 4, 2022, 4:27:39 AM10/4/22
to b...@vger.kernel.org, da...@davemloft.net, edum...@google.com, f...@strlen.de, harshit.m....@oracle.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 725737e7c21d Merge tag 'statx-dioalign-for-linus' of git:/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10257034880000
kernel config: https://syzkaller.appspot.com/x/.config?x=486af5e221f55835
dashboard link: https://syzkaller.appspot.com/bug?extid=3a080099974c271cd7e9
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3a0800...@syzkaller.appspotmail.com

------------[ cut here ]------------
memcpy: detected field-spanning write (size 28) of single field "&errmsg->msg" at net/netlink/af_netlink.c:2447 (size 16)
WARNING: CPU: 3 PID: 3351 at net/netlink/af_netlink.c:2447 netlink_ack+0x8ac/0xb10 net/netlink/af_netlink.c:2447
Modules linked in:
CPU: 3 PID: 3351 Comm: dhcpcd Not tainted 6.0.0-syzkaller-00593-g725737e7c21d #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
RIP: 0010:netlink_ack+0x8ac/0xb10 net/netlink/af_netlink.c:2447
Code: fa ff ff e8 36 c3 e5 f9 b9 10 00 00 00 4c 89 ee 48 c7 c2 20 3f fb 8a 48 c7 c7 80 3f fb 8a c6 05 e8 98 34 06 01 e8 26 77 a6 01 <0f> 0b e9 3a fa ff ff 41 be 00 01 00 00 41 bd 14 00 00 00 e9 ea fd
RSP: 0018:ffffc900220e7758 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88801a798a80 RCX: 0000000000000000
RDX: ffff8880151c0180 RSI: ffffffff81611cb8 RDI: fffff5200441cedd
RBP: ffff88801ed850c0 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000080000000 R11: 0000000000000000 R12: 0000000000000000
R13: 000000000000001c R14: ffff88801ec8e400 R15: ffff88801ec8e414
FS: 00007faef0af8740(0000) GS:ffff88802cb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff6adbe000 CR3: 0000000027683000 CR4: 0000000000150ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
netlink_rcv_skb+0x33d/0x420 net/netlink/af_netlink.c:2507
genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:734
____sys_sendmsg+0x712/0x8c0 net/socket.c:2482
___sys_sendmsg+0x110/0x1b0 net/socket.c:2536
__sys_sendmsg+0xf3/0x1c0 net/socket.c:2565
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7faef0bf0163
Code: 64 89 02 48 c7 c0 ff ff ff ff eb b7 66 2e 0f 1f 84 00 00 00 00 00 90 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 89 54 24 1c 48
RSP: 002b:00007fff6adbdc48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faef0bf0163
RDX: 0000000000000000 RSI: 00007fff6adbdc90 RDI: 0000000000000010
RBP: 00007fff6adc1ed8 R08: 0000000000000000 R09: 0000000000000000
R10: 00007faef0c6ffc0 R11: 0000000000000246 R12: 0000000000000010
R13: 00007fff6adc1cf0 R14: 0000000000000000 R15: 000055e5ebce0290
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Dmitry Vyukov

unread,
Oct 4, 2022, 4:33:44 AM10/4/22
to syzbot, b...@vger.kernel.org, da...@davemloft.net, edum...@google.com, f...@strlen.de, harshit.m....@oracle.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, Kees Cook, linux-h...@vger.kernel.org
On Tue, 4 Oct 2022 at 10:27, syzbot
<syzbot+3a0800...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 725737e7c21d Merge tag 'statx-dioalign-for-linus' of git:/..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=10257034880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=486af5e221f55835
> dashboard link: https://syzkaller.appspot.com/bug?extid=3a080099974c271cd7e9
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+3a0800...@syzkaller.appspotmail.com

+linux-hardening
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bug...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/000000000000a793cc05ea313b87%40google.com.

Harshit Mogalapalli

unread,
Oct 4, 2022, 8:29:40 AM10/4/22
to syzbot, b...@vger.kernel.org, da...@davemloft.net, edum...@google.com, f...@strlen.de, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, Kees Cook, linux-h...@vger.kernel.org, Dmitry Vyukov, Vegard Nossum
#syz test: git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 725737e7c21d2d25a4312c2aaa82a52bd03e3126

On Tue, 4 Oct 2022 at 10:27, syzbot
<syzbot+3a0800...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    725737e7c21d Merge tag 'statx-dioalign-for-linus' of git:/..
> git tree:       upstream

> syzbot engineers can be reached at syzk...@googlegroups.com.
>
> syzbot will keep track of this issue. See:

>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bug...@googlegroups.com.
0001-netlink-Bounds-check-struct-nlmsgerr-creation.patch

syzbot

unread,
Oct 4, 2022, 8:29:55 AM10/4/22
to Harshit Mogalapalli, b...@vger.kernel.org, da...@davemloft.net, dvy...@google.com, edum...@google.com, f...@strlen.de, harshit.m....@oracle.com, kees...@chromium.org, ku...@kernel.org, linux-h...@vger.kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, vegard...@oracle.com
> #syz test: git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 725737e7c21d2d25a4312c2aaa82a52bd03e3126

"git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git" does not look like a valid git repo address.

Harshit Mogalapalli

unread,
Oct 4, 2022, 8:40:47 AM10/4/22
to syzbot, b...@vger.kernel.org, da...@davemloft.net, dvy...@google.com, edum...@google.com, f...@strlen.de, kees...@chromium.org, ku...@kernel.org, linux-h...@vger.kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, Vegard Nossum
0001-netlink-Bounds-check-struct-nlmsgerr-creation.patch

syzbot

unread,
Oct 4, 2022, 8:57:23 AM10/4/22
to b...@vger.kernel.org, da...@davemloft.net, dvy...@google.com, edum...@google.com, f...@strlen.de, harshit.m....@oracle.com, kees...@chromium.org, ku...@kernel.org, linux-h...@vger.kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, vegard...@oracle.com
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

asset storage also requires dashboard client

syzkaller build log:
go env (err=<nil>)
GO111MODULE="auto"
GOARCH="amd64"
GOBIN=""
GOCACHE="/syzkaller/.cache/go-build"
GOENV="/syzkaller/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/syzkaller/jobs/linux/gopath/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/syzkaller/jobs/linux/gopath"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/syzkaller/jobs/linux/gopath/src/github.com/google/syzkaller/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build46036865=/tmp/go-build -gno-record-gcc-switches"

git status (err=<nil>)
HEAD detached at feb563518
nothing to commit, working tree clean


go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=feb5635181eb12a6e3516172a3f5af06a3bc93e1 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20220930-160315'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-fuzzer github.com/google/syzkaller/syz-fuzzer
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=feb5635181eb12a6e3516172a3f5af06a3bc93e1 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20220930-160315'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=feb5635181eb12a6e3516172a3f5af06a3bc93e1 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20220930-160315'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-stress github.com/google/syzkaller/tools/syz-stress
mkdir -p ./bin/linux_amd64
gcc -o ./bin/linux_amd64/syz-executor executor/executor.cc \
-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -static-pie -fpermissive -w -DGOOS_linux=1 -DGOARCH_amd64=1 \
-DHOSTGOOS_linux=1 -DGIT_REVISION=\"feb5635181eb12a6e3516172a3f5af06a3bc93e1\"



Tested on:

commit: 725737e7 Merge tag 'statx-dioalign-for-linus' of git:/..
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config: https://syzkaller.appspot.com/x/.config?x=36e3ab6ff9643877
dashboard link: https://syzkaller.appspot.com/bug?extid=3a080099974c271cd7e9
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=12302bb8880000

Kees Cook

unread,
Oct 4, 2022, 10:36:59 AM10/4/22
to Dmitry Vyukov, syzbot, b...@vger.kernel.org, da...@davemloft.net, edum...@google.com, f...@strlen.de, harshit.m....@oracle.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, linux-h...@vger.kernel.org


On October 4, 2022 1:33:30 AM PDT, Dmitry Vyukov <dvy...@google.com> wrote:
>On Tue, 4 Oct 2022 at 10:27, syzbot
><syzbot+3a0800...@syzkaller.appspotmail.com> wrote:
>>
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit: 725737e7c21d Merge tag 'statx-dioalign-for-linus' of git:/..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=10257034880000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=486af5e221f55835
>> dashboard link: https://syzkaller.appspot.com/bug?extid=3a080099974c271cd7e9
>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+3a0800...@syzkaller.appspotmail.com
>
>+linux-hardening
>
>> ------------[ cut here ]------------
>> memcpy: detected field-spanning write (size 28) of single field "&errmsg->msg" at net/netlink/af_netlink.c:2447 (size 16)

This is fixed in the pending netdev tree coming for the merge window.

--
Kees Cook

Jakub Kicinski

unread,
Oct 4, 2022, 1:42:57 PM10/4/22
to Kees Cook, Dmitry Vyukov, syzbot, b...@vger.kernel.org, da...@davemloft.net, edum...@google.com, f...@strlen.de, harshit.m....@oracle.com, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, linux-h...@vger.kernel.org
On Tue, 04 Oct 2022 07:36:55 -0700 Kees Cook wrote:
> This is fixed in the pending netdev tree coming for the merge window.

This has been weighing on my conscience a little, I don't like how we
still depend on putting one length in the skb and then using a
different one for the actual memcpy(). How would you feel about this
patch on top (untested):

diff --git a/include/net/netlink.h b/include/net/netlink.h
index 4418b1981e31..6ad671441dff 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -931,6 +931,29 @@ static inline struct nlmsghdr *nlmsg_put(struct sk_buff *skb, u32 portid, u32 se
return __nlmsg_put(skb, portid, seq, type, payload, flags);
}

+/**
+ * nlmsg_append - Add more data to a nlmsg in a skb
+ * @skb: socket buffer to store message in
+ * @nlh: message header
+ * @payload: length of message payload
+ *
+ * Append data to an existing nlmsg, used when constructing a message
+ * with multiple fixed-format headers (which is rare).
+ * Returns NULL if the tailroom of the skb is insufficient to store
+ * the extra payload.
+ */
+static inline void *nlmsg_append(struct sk_buff *skb, struct nlmsghdr *nlh,
+ u32 size)
+{
+ if (unlikely(skb_tailroom(skb) < NLMSG_ALIGN(size)))
+ return NULL;
+
+ if (!__builtin_constant_p(size) || NLMSG_ALIGN(size) - size != 0)
+ memset(skb_tail_pointer(skb) + size, 0,
+ NLMSG_ALIGN(size) - size);
+ return __skb_put(NLMSG_ALIGN(size));
+}
+
/**
* nlmsg_put_answer - Add a new callback based netlink message to an skb
* @skb: socket buffer to store message in
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index a662e8a5ff84..bb3d855d1f57 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2488,19 +2488,28 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err,
flags |= NLM_F_ACK_TLVS;

skb = nlmsg_new(payload + tlvlen, GFP_KERNEL);
- if (!skb) {
- NETLINK_CB(in_skb).sk->sk_err = ENOBUFS;
- sk_error_report(NETLINK_CB(in_skb).sk);
- return;
- }
+ if (!skb)
+ goto err_bad_put;

rep = nlmsg_put(skb, NETLINK_CB(in_skb).portid, nlh->nlmsg_seq,
- NLMSG_ERROR, payload, flags);
+ NLMSG_ERROR, sizeof(*errmsg), flags);
+ if (!rep)
+ goto err_bad_put;
errmsg = nlmsg_data(rep);
errmsg->error = err;
- unsafe_memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg)
- ? nlh->nlmsg_len : sizeof(*nlh),
- /* Bounds checked by the skb layer. */);
+ memcpy(&errmsg->msg, nlh, sizeof(*nlh));
+
+ if (!(flags & NLM_F_CAPPED)) {
+ size_t data_len = nlh->nlmsg_len - sizeof(*nlh);
+ void *data;
+
+ data = nlmsg_append(skb, rep, data_len);
+ if (!data)
+ goto err_bad_put;
+
+ /* the nlh + 1 is probably going to make you unhappy? */
+ memcpy(data, nlh + 1, data_len);
+ }

if (tlvlen)
netlink_ack_tlv_fill(in_skb, skb, nlh, err, extack);
@@ -2508,6 +2517,12 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err,
nlmsg_end(skb, rep);

nlmsg_unicast(in_skb->sk, skb, NETLINK_CB(in_skb).portid);
+
+ return;
+
+err_bad_put:
+ NETLINK_CB(in_skb).sk->sk_err = ENOBUFS;
+ sk_error_report(NETLINK_CB(in_skb).sk);
}
EXPORT_SYMBOL(netlink_ack);

Kees Cook

unread,
Oct 4, 2022, 7:40:35 PM10/4/22
to Jakub Kicinski, Dmitry Vyukov, syzbot, b...@vger.kernel.org, da...@davemloft.net, edum...@google.com, f...@strlen.de, harshit.m....@oracle.com, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, linux-h...@vger.kernel.org
On Tue, Oct 04, 2022 at 10:42:53AM -0700, Jakub Kicinski wrote:
> On Tue, 04 Oct 2022 07:36:55 -0700 Kees Cook wrote:
> > This is fixed in the pending netdev tree coming for the merge window.
>
> This has been weighing on my conscience a little, I don't like how we
> still depend on putting one length in the skb and then using a
> different one for the actual memcpy(). How would you feel about this
> patch on top (untested):

tl;dr: yes, I like it. Please add a nlmsg_contents member. :)

Rambling below...

>
> diff --git a/include/net/netlink.h b/include/net/netlink.h
> index 4418b1981e31..6ad671441dff 100644
> --- a/include/net/netlink.h
> +++ b/include/net/netlink.h
> @@ -931,6 +931,29 @@ static inline struct nlmsghdr *nlmsg_put(struct sk_buff *skb, u32 portid, u32 se
> return __nlmsg_put(skb, portid, seq, type, payload, flags);
> }
>
> +/**
> + * nlmsg_append - Add more data to a nlmsg in a skb
> + * @skb: socket buffer to store message in
> + * @nlh: message header
> + * @payload: length of message payload
> + *
> + * Append data to an existing nlmsg, used when constructing a message
> + * with multiple fixed-format headers (which is rare).
> + * Returns NULL if the tailroom of the skb is insufficient to store
> + * the extra payload.
> + */
> +static inline void *nlmsg_append(struct sk_buff *skb, struct nlmsghdr *nlh,

nlh not needed here?

> + u32 size)
> +{
> + if (unlikely(skb_tailroom(skb) < NLMSG_ALIGN(size)))
> + return NULL;
> +
> + if (!__builtin_constant_p(size) || NLMSG_ALIGN(size) - size != 0)

why does a fixed size mean no memset?
Should it test this flag, or test if the sizes show the need for "extra"
payload length?

I always found the progression of sizes here to be confusing. "payload"
starts as sizeof(*errmsg), and gets nlmsg_len(nlh) added but only when if
"(err && !(nlk->flags & NETLINK_F_CAP_ACK)" was true. Why is
nlmsg_len(nlh) _wrong_ if the rest of its contents are correct? If this
was "0" in the other state, the logic would just be:

nlh_bytes = nlmsg_len(nlh);
total = sizeof(*errmsg);
total += nlh_bytes;
total += tlvlen;

and:

nlmsg_new(total, ...);
... nlmsg_put(..., sizeof(*errmsg), ...);
...
errmsg->error = err;
errmsg->nlh = *nlh;
if (nlh_bytes) {
data = nlmsg_append(..., nlh_bytes), ...);
...
memcpy(data, nlh->nlmsg_contents, nlh_bytes);
}

> + size_t data_len = nlh->nlmsg_len - sizeof(*nlh);

I think data_len here is also "payload - sizeof(*errmsg)"? So if it's >0,
we need to append the nlh contents.

> + void *data;
> +
> + data = nlmsg_append(skb, rep, data_len);
> + if (!data)
> + goto err_bad_put;
> +
> + /* the nlh + 1 is probably going to make you unhappy? */

Right, the compiler may think it is an object no larger than sizeof(*nlh).
My earliest attempt at changes here introduced a flex-array for the
contents, and split the memcpy:
https://lore.kernel.org/lkml/d7251d92-150b-5346...@rasmusvillemoes.dk/
which is basically the solution you have here, except it wasn't having
the nlmsg_*-helpers do the bounds checking.

> + memcpy(data, nlh + 1, data_len);

So with the struct nlmsghdr::nlmsg_contents member, this becomes:

memcpy(data, nlh->nlmsg_contents, data_len);

--
Kees Cook

Jakub Kicinski

unread,
Oct 4, 2022, 8:04:07 PM10/4/22
to Kees Cook, Dmitry Vyukov, syzbot, b...@vger.kernel.org, da...@davemloft.net, edum...@google.com, f...@strlen.de, harshit.m....@oracle.com, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, linux-h...@vger.kernel.org
On Tue, 4 Oct 2022 16:40:32 -0700 Kees Cook wrote:
> On Tue, Oct 04, 2022 at 10:42:53AM -0700, Jakub Kicinski wrote:
> > This has been weighing on my conscience a little, I don't like how we
> > still depend on putting one length in the skb and then using a
> > different one for the actual memcpy(). How would you feel about this
> > patch on top (untested):
>
> tl;dr: yes, I like it. Please add a nlmsg_contents member. :)

Can do, but you'll need to tell me how..

__DECLARE_FLEX_ARRAY(char, nlmsg_contents)

?

> > + u32 size)
> > +{
> > + if (unlikely(skb_tailroom(skb) < NLMSG_ALIGN(size)))
> > + return NULL;
> > +
> > + if (!__builtin_constant_p(size) || NLMSG_ALIGN(size) - size != 0)
>
> why does a fixed size mean no memset?

Copy and paste, it seems to originate from:

0c19b0adb8dd ("netlink: avoid memset of 0 bytes sparse warning")

Any idea why sparse would not like empty memsets?

> > rep = nlmsg_put(skb, NETLINK_CB(in_skb).portid, nlh->nlmsg_seq,
> > - NLMSG_ERROR, payload, flags);
> > + NLMSG_ERROR, sizeof(*errmsg), flags);
> > + if (!rep)
> > + goto err_bad_put;
> > errmsg = nlmsg_data(rep);
> > errmsg->error = err;
> > - unsafe_memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg)
> > - ? nlh->nlmsg_len : sizeof(*nlh),
> > - /* Bounds checked by the skb layer. */);
> > + memcpy(&errmsg->msg, nlh, sizeof(*nlh));
> > +
> > + if (!(flags & NLM_F_CAPPED)) {
>
> Should it test this flag, or test if the sizes show the need for "extra"
> payload length?
>
> I always found the progression of sizes here to be confusing. "payload"
> starts as sizeof(*errmsg), and gets nlmsg_len(nlh) added but only when if
> "(err && !(nlk->flags & NETLINK_F_CAP_ACK)" was true.

struct nlmsgerr is one of the least badly documented structs we have in
netlink so let me start with a copy & paste:

struct nlmsgerr {
int error;
struct nlmsghdr msg;
/*
* followed by the message contents unless NETLINK_CAP_ACK was set
* or the ACK indicates success (error == 0)
* message length is aligned with NLMSG_ALIGN()
*/
/*
* followed by TLVs defined in enum nlmsgerr_attrs
* if NETLINK_EXT_ACK was set
*/
};

*Why* that's the behavior - 🤷

> Why is nlmsg_len(nlh) _wrong_ if the rest of its contents are
> correct?

This is an ack message, to be clear, doesn't mean anything was wrong.
It just carries errno.

> If this was "0" in the other state, the logic would just be:
>
> nlh_bytes = nlmsg_len(nlh);
> total = sizeof(*errmsg);
> total += nlh_bytes;
> total += tlvlen;
>
> and:
>
> nlmsg_new(total, ...);
> ... nlmsg_put(..., sizeof(*errmsg), ...);
> ...
> errmsg->error = err;
> errmsg->nlh = *nlh;
> if (nlh_bytes) {
> data = nlmsg_append(..., nlh_bytes), ...);
> ...
> memcpy(data, nlh->nlmsg_contents, nlh_bytes);
> }
>
> > + size_t data_len = nlh->nlmsg_len - sizeof(*nlh);
>
> I think data_len here is also "payload - sizeof(*errmsg)"? So if it's
> >0, we need to append the nlh contents.

I was trying to avoid using payload in case it has overflown :S

Jakub Kicinski

unread,
Oct 4, 2022, 8:23:39 PM10/4/22
to Kees Cook, Dmitry Vyukov, syzbot, b...@vger.kernel.org, da...@davemloft.net, edum...@google.com, f...@strlen.de, harshit.m....@oracle.com, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, linux-h...@vger.kernel.org
On Tue, 4 Oct 2022 17:04:00 -0700 Jakub Kicinski wrote:
> > why does a fixed size mean no memset?
>
> Copy and paste, it seems to originate from:
>
> 0c19b0adb8dd ("netlink: avoid memset of 0 bytes sparse warning")
>
> Any idea why sparse would not like empty memsets?

Google answers is. Let me test if sparse still wants the workaround.

syzbot

unread,
Jan 2, 2023, 8:33:29 PM1/2/23
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
Crashes did not happen for a while, no reproducer and no activity.
Reply all
Reply to author
Forward
0 new messages