general protection fault in xsk_poll

14 views
Skip to first unread message

syzbot

unread,
Aug 19, 2019, 9:18:07 PM8/19/19
to a...@kernel.org, bjorn...@intel.com, b...@vger.kernel.org, dan...@iogearbox.net, da...@davemloft.net, ha...@kernel.org, jakub.k...@netronome.com, john.fa...@gmail.com, jonatha...@gmail.com, ka...@fb.com, linux-...@vger.kernel.org, magnus....@intel.com, net...@vger.kernel.org, songliu...@fb.com, syzkall...@googlegroups.com, xdp-n...@vger.kernel.org, y...@fb.com
Hello,

syzbot found the following crash on:

HEAD commit: da657043 Add linux-next specific files for 20190819
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=16af124c600000
kernel config: https://syzkaller.appspot.com/x/.config?x=739a9b3ab3d8c770
dashboard link: https://syzkaller.appspot.com/bug?extid=c82697e3043781e08802
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=109e1922600000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1445bf02600000

The bug was bisected to:

commit 77cd0d7b3f257fd0e3096b4fdcff1a7d38e99e10
Author: Magnus Karlsson <magnus....@intel.com>
Date: Wed Aug 14 07:27:17 2019 +0000

xsk: add support for need_wakeup flag in AF_XDP rings

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15e1ea4c600000
final crash: https://syzkaller.appspot.com/x/report.txt?x=17e1ea4c600000
console output: https://syzkaller.appspot.com/x/log.txt?x=13e1ea4c600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c82697...@syzkaller.appspotmail.com
Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP
rings")

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 7959 Comm: syz-executor611 Not tainted 5.3.0-rc5-next-20190819
#68
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:xsk_poll+0x95/0x540 net/xdp/xsk.c:386
Code: 80 3c 02 00 0f 85 70 04 00 00 4c 8b a3 88 04 00 00 48 b8 00 00 00 00
00 fc ff df 49 8d bc 24 96 00 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48
89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 bf 03 00 00
RSP: 0018:ffff8880926f7850 EFLAGS: 00010207
RAX: dffffc0000000000 RBX: ffff88809a141700 RCX: ffffffff859b07aa
RDX: 0000000000000012 RSI: ffffffff859b07c4 RDI: 0000000000000096
RBP: ffff8880926f7880 R08: ffff88809698a580 R09: ffffed1013428329
R10: ffffed1013428328 R11: ffff88809a141947 R12: 0000000000000000
R13: 0000000000000304 R14: ffff888095d4d840 R15: ffff888092bdd020
FS: 0000555557529880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000280 CR3: 0000000098281000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
sock_poll+0x15e/0x480 net/socket.c:1256
vfs_poll include/linux/poll.h:90 [inline]
do_pollfd fs/select.c:859 [inline]
do_poll fs/select.c:907 [inline]
do_sys_poll+0x7c2/0xde0 fs/select.c:1001
__do_sys_ppoll fs/select.c:1101 [inline]
__se_sys_ppoll fs/select.c:1081 [inline]
__x64_sys_ppoll+0x259/0x310 fs/select.c:1081
do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440159
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffd9fbd16e8 EFLAGS: 00000246 ORIG_RAX: 000000000000010f
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440159
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000020000280
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004019e0
R13: 0000000000401a70 R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
---[ end trace da907175426b4065 ]---
RIP: 0010:xsk_poll+0x95/0x540 net/xdp/xsk.c:386
Code: 80 3c 02 00 0f 85 70 04 00 00 4c 8b a3 88 04 00 00 48 b8 00 00 00 00
00 fc ff df 49 8d bc 24 96 00 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48
89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 bf 03 00 00
RSP: 0018:ffff8880926f7850 EFLAGS: 00010207
RAX: dffffc0000000000 RBX: ffff88809a141700 RCX: ffffffff859b07aa
RDX: 0000000000000012 RSI: ffffffff859b07c4 RDI: 0000000000000096
RBP: ffff8880926f7880 R08: ffff88809698a580 R09: ffffed1013428329
R10: ffffed1013428328 R11: ffff88809a141947 R12: 0000000000000000
R13: 0000000000000304 R14: ffff888095d4d840 R15: ffff888092bdd020
FS: 0000555557529880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000280 CR3: 0000000098281000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

Hillf Danton

unread,
Aug 19, 2019, 11:32:10 PM8/19/19
to syzbot, a...@kernel.org, bjorn...@intel.com, b...@vger.kernel.org, dan...@iogearbox.net, da...@davemloft.net, ha...@kernel.org, jakub.k...@netronome.com, john.fa...@gmail.com, jonatha...@gmail.com, ka...@fb.com, linux-...@vger.kernel.org, magnus....@intel.com, net...@vger.kernel.org, songliu...@fb.com, syzkall...@googlegroups.com, xdp-n...@vger.kernel.org, y...@fb.com

On Mon, 19 Aug 2019 18:18:06 -0700
Add umem check.

--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -381,9 +381,9 @@ static unsigned int xsk_poll(struct file
struct sock *sk = sock->sk;
struct xdp_sock *xs = xdp_sk(sk);
struct net_device *dev = xs->dev;
- struct xdp_umem *umem = xs->umem;
+ struct xdp_umem *umem = READ_ONCE(xs->umem);

- if (umem->need_wakeup)
+ if (umem && umem->need_wakeup)
dev->netdev_ops->ndo_xsk_wakeup(dev, xs->queue_id,
umem->need_wakeup);

--

Björn Töpel

unread,
Aug 20, 2019, 5:17:54 AM8/20/19
to Hillf Danton, syzbot, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, da...@davemloft.net, ha...@kernel.org, jakub.k...@netronome.com, john.fa...@gmail.com, jonatha...@gmail.com, ka...@fb.com, linux-...@vger.kernel.org, magnus....@intel.com, net...@vger.kernel.org, songliu...@fb.com, syzkall...@googlegroups.com, xdp-n...@vger.kernel.org, y...@fb.com
Thanks!

What do you think about making it a bit more generic, like:

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index ee4428a892fa..08bed5e92af4 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -356,13 +356,20 @@ static int xsk_generic_xmit(struct sock *sk,
struct msghdr *m,
return err;
}

+static bool xsk_is_bound(struct xdp_sock *xs)
+{
+ struct net_device *dev = READ_ONCE(xs->dev);
+
+ return dev && xs->state == XSK_BOUND;
+}
+
static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t
total_len)
{
bool need_wait = !(m->msg_flags & MSG_DONTWAIT);
struct sock *sk = sock->sk;
struct xdp_sock *xs = xdp_sk(sk);

- if (unlikely(!xs->dev))
+ if (unlikely(!xsk_is_bound(xs)))
return -ENXIO;
if (unlikely(!(xs->dev->flags & IFF_UP)))
return -ENETDOWN;
@@ -383,6 +390,9 @@ static unsigned int xsk_poll(struct file *file,
struct socket *sock,
struct net_device *dev = xs->dev;
struct xdp_umem *umem = xs->umem;

+ if (unlikely(!xsk_is_bound(xs)))
+ return mask;
+
if (umem->need_wakeup)
dev->netdev_ops->ndo_xsk_wakeup(dev, xs->queue_id,
umem->need_wakeup);
@@ -417,7 +427,7 @@ static void xsk_unbind_dev(struct xdp_sock *xs)
{
struct net_device *dev = xs->dev;

- if (!dev || xs->state != XSK_BOUND)
+ if (!xsk_is_bound(xs))
return;

xs->state = XSK_UNBOUND;

Björn Töpel

unread,
Aug 20, 2019, 6:05:00 AM8/20/19
to syzbot+c82697...@syzkaller.appspotmail.com, a...@kernel.org, dan...@iogearbox.net, net...@vger.kernel.org, bjorn...@intel.com, b...@vger.kernel.org, da...@davemloft.net, ha...@kernel.org, jakub.k...@netronome.com, john.fa...@gmail.com, jonatha...@gmail.com, ka...@fb.com, linux-...@vger.kernel.org, magnus....@intel.com, songliu...@fb.com, syzkall...@googlegroups.com, xdp-n...@vger.kernel.org, y...@fb.com, hda...@sina.com
From: Björn Töpel <bjorn...@intel.com>

The poll() implementation for AF_XDP sockets did not perform the
proper state checks, prior accessing the socket umem. This patch fixes
that by performing a xsk_is_bound() check.

Suggested-by: Hillf Danton <hda...@sina.com>
Reported-by: syzbot+c82697...@syzkaller.appspotmail.com
Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings")
Signed-off-by: Björn Töpel <bjorn...@intel.com>
---
net/xdp/xsk.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
--
2.20.1

Daniel Borkmann

unread,
Aug 20, 2019, 11:01:02 AM8/20/19
to Björn Töpel, syzbot+c82697...@syzkaller.appspotmail.com, a...@kernel.org, net...@vger.kernel.org, bjorn...@intel.com, b...@vger.kernel.org, da...@davemloft.net, ha...@kernel.org, jakub.k...@netronome.com, john.fa...@gmail.com, jonatha...@gmail.com, ka...@fb.com, linux-...@vger.kernel.org, magnus....@intel.com, songliu...@fb.com, syzkall...@googlegroups.com, xdp-n...@vger.kernel.org, y...@fb.com, hda...@sina.com
I think I'm a bit confused by your READ_ONCE() usage. ;-/ I can see why you're
using it in xsk_is_bound() above, but then at the same time all the other callbacks
like xsk_poll() or xsk_unbind_dev() above have a struct net_device *dev = xs->dev
right before the test. Could you elaborate?

Thanks,
Daniel

Björn Töpel

unread,
Aug 20, 2019, 11:29:23 AM8/20/19
to Daniel Borkmann, syzbot+c82697...@syzkaller.appspotmail.com, Alexei Starovoitov, Netdev, Björn Töpel, bpf, David Miller, Jesper Dangaard Brouer, Jakub Kicinski, John Fastabend, Jonathan Lemon, Martin KaFai Lau, LKML, Karlsson, Magnus, Song Liu, syzkall...@googlegroups.com, Xdp, Yonghong Song, hda...@sina.com
Yes, now I'm confused as well! Digging deeper... I believe there are a
couple of places in xsk.c that do not have
READ_ONCE/WRITE_ONCE-correctness. Various xdp_sock members are read
lock-less outside the control plane mutex (mutex member of struct
xdp_sock). This needs some re-work. I'll look into using the newly
introduced state member (with corresponding read/write barriers) for
this.

I'll cook some patch(es) that address this, but first it sounds like I
need to reread [1] two, or three times. At least. ;-)


Thanks,
Björn


[1] https://lwn.net/Articles/793253/


> Thanks,
> Daniel

Daniel Borkmann

unread,
Aug 20, 2019, 5:24:39 PM8/20/19
to Björn Töpel, syzbot+c82697...@syzkaller.appspotmail.com, Alexei Starovoitov, Netdev, Björn Töpel, bpf, David Miller, Jesper Dangaard Brouer, Jakub Kicinski, John Fastabend, Jonathan Lemon, Martin KaFai Lau, LKML, Karlsson, Magnus, Song Liu, syzkall...@googlegroups.com, Xdp, Yonghong Song, hda...@sina.com
Right, so even in above two cases, the compiler could have refetched, e.g.
dev variable could have first been NULL, but xsk_is_bound() later returns
true.
Reply all
Reply to author
Forward
0 new messages