netlink: NULL timer crash

119 views
Skip to first unread message

Dmitry Vyukov

unread,
Mar 23, 2017, 8:55:52 AM3/23/17
to David Miller, Cong Wang, Eric Dumazet, Herbert Xu, Alexei Starovoitov, netdev, LKML, syzkaller
Hello,

The following program triggers call of NULL timer func:

https://gist.githubusercontent.com/dvyukov/c210d01c74b911273469a93862ea7788/raw/2a3182772a6a6e20af3e71c02c2a1c2895d803fb/gistfile1.txt


BUG: unable to handle kernel NULL pointer dereference at (null)
IP: (null)
PGD 0
Oops: 0010 [#1] SMP KASAN
Modules linked in:
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc3+ #365
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88006c634300 task.stack: ffff88006c640000
RIP: 0010: (null)
RSP: 0018:ffff88006d1077c8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff880062bddb00 RCX: ffffffff8154e161
RDX: 1ffffffff090c1f1 RSI: 0000000000000000 RDI: ffff880062bddb00
RBP: ffff88006d1077e8 R08: fffffbfff0a936a8 R09: 0000000000000001
R10: 0000000000000001 R11: fffffbfff0a936a7 R12: ffffffff84860f80
R13: 0000000000000000 R14: ffff880062bddb60 R15: 1ffff1000da20f05
FS: 0000000000000000(0000) GS:ffff88006d100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000004e21000 CR4: 00000000001406e0
Call Trace:
<IRQ>
neigh_timer_handler+0x365/0xd40 net/core/neighbour.c:944
call_timer_fn+0x232/0x8c0 kernel/time/timer.c:1268
expire_timers kernel/time/timer.c:1307 [inline]
__run_timers+0x6f7/0xbd0 kernel/time/timer.c:1601
run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
__do_softirq+0x2d6/0xb54 kernel/softirq.c:284
invoke_softirq kernel/softirq.c:364 [inline]
irq_exit+0x1b1/0x1e0 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:657 [inline]
smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:487
RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
RSP: 0018:ffff88006c647dc0 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
RAX: dffffc0000000000 RBX: 1ffff1000d8c8fbb RCX: 0000000000000000
RDX: 1ffffffff09d8ed4 RSI: 0000000000000001 RDI: ffffffff84ec76a0
RBP: ffff88006c647dc0 R08: ffffed000d8c6861 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: fffffbfff09d8ed2
R13: ffff88006c647e78 R14: ffffffff84ec7690 R15: 0000000000000002
</IRQ>
arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
default_idle+0xba/0x450 arch/x86/kernel/process.c:275
arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:266
default_idle_call+0x37/0x80 kernel/sched/idle.c:97
cpuidle_idle_call kernel/sched/idle.c:155 [inline]
do_idle+0x230/0x380 kernel/sched/idle.c:244
cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:346
start_secondary+0x2a7/0x340 arch/x86/kernel/smpboot.c:275
start_cpu+0x14/0x14 arch/x86/kernel/head_64.S:306
Code: Bad RIP value.
RIP: (null) RSP: ffff88006d1077c8
CR2: 0000000000000000
---[ end trace 845120b8a0d21411 ]---

On commit 093b995e3b55a0ae0670226ddfcb05bfbf0099ae

Eric Dumazet

unread,
Mar 23, 2017, 10:53:08 AM3/23/17
to Dmitry Vyukov, David Miller, Cong Wang, Herbert Xu, Alexei Starovoitov, netdev, LKML, syzkaller
Nice !

Looks like neigh->ops->solicit is NULL

Eric Dumazet

unread,
Mar 23, 2017, 12:01:00 PM3/23/17
to Eric Dumazet, Dmitry Vyukov, David Miller, Cong Wang, Herbert Xu, Alexei Starovoitov, netdev, LKML, syzkaller
On Thu, 2017-03-23 at 07:53 -0700, Eric Dumazet wrote:

> Nice !
>
> Looks like neigh->ops->solicit is NULL

Apparently we allow admins to do really stupid things with neighbours
on tunnels.

Following patch should avoid the crash.

Anyone has better ideas ?


net/ipv4/arp.c | 5 +++++
net/ipv6/ndisc.c | 4 ++++
2 files changed, 9 insertions(+)

diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 51b27ae09fbd725bcd8030982e5850215ac4ce5c..963191b12e28041bf5df6f37f222a7155f83a414 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -146,8 +146,13 @@ static const struct neigh_ops arp_hh_ops = {
.connected_output = neigh_resolve_output,
};

+static void arp_no_solicit(struct neighbour *neigh, struct sk_buff *skb)
+{
+}
+
static const struct neigh_ops arp_direct_ops = {
.family = AF_INET,
+ .solicit = arp_no_solicit,
.output = neigh_direct_output,
.connected_output = neigh_direct_output,
};
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 7ebac630d3c603186be2fc0dcbaac7d7e74bfde6..86f290b749d5ca0db4310b17ebeff35d847540c7 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -99,9 +99,13 @@ static const struct neigh_ops ndisc_hh_ops = {
.connected_output = neigh_resolve_output,
};

+static void ndisc_no_solicit(struct neighbour *neigh, struct sk_buff *skb)
+{
+}

static const struct neigh_ops ndisc_direct_ops = {
.family = AF_INET6,
+ .solicit = ndisc_no_solicit,
.output = neigh_direct_output,
.connected_output = neigh_direct_output,
};


David Miller

unread,
Mar 23, 2017, 3:00:36 PM3/23/17
to eric.d...@gmail.com, edum...@google.com, dvy...@google.com, xiyou.w...@gmail.com, her...@gondor.apana.org.au, a...@kernel.org, net...@vger.kernel.org, linux-...@vger.kernel.org, syzk...@googlegroups.com
From: Eric Dumazet <eric.d...@gmail.com>
Date: Thu, 23 Mar 2017 09:00:58 -0700

> On Thu, 2017-03-23 at 07:53 -0700, Eric Dumazet wrote:
>
>> Nice !
>>
>> Looks like neigh->ops->solicit is NULL
>
> Apparently we allow admins to do really stupid things with neighbours
> on tunnels.
>
> Following patch should avoid the crash.
>
> Anyone has better ideas ?

This is probably good enough for now, but you need to also handle
dn_neigh_ops.

Another way to solve this is to add a NULL method check to the
one spot where we invoke this method. That clearly shows that
the method is optional.

Eric Dumazet

unread,
Mar 23, 2017, 3:32:30 PM3/23/17
to David Miller, edum...@google.com, dvy...@google.com, xiyou.w...@gmail.com, her...@gondor.apana.org.au, a...@kernel.org, net...@vger.kernel.org, linux-...@vger.kernel.org, syzk...@googlegroups.com
Yes, this would be a one liner. I will post this in a minute.

Thanks.



Eric Dumazet

unread,
Mar 23, 2017, 3:39:23 PM3/23/17
to David Miller, edum...@google.com, dvy...@google.com, xiyou.w...@gmail.com, her...@gondor.apana.org.au, a...@kernel.org, net...@vger.kernel.org, linux-...@vger.kernel.org, syzk...@googlegroups.com
From: Eric Dumazet <edum...@google.com>

Dmitry posted a nice reproducer of a bug triggering in neigh_probe()
when dereferencing a NULL neigh->ops->solicit method.

This can happen for arp_direct_ops/ndisc_direct_ops and similar,
which can be used for NUD_NOARP neighbours (created when dev->header_ops
is NULL). Admin can then force changing nud_state to some other state
that would fire neigh timer.

Signed-off-by: Eric Dumazet <edum...@google.com>
Reported-by: Dmitry Vyukov <dvy...@google.com>
---
net/core/neighbour.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index e7c12caa20c88acc9a5dd86f07d11644fb58341d..4526cbd7e28a1fcdecfc06a41985fd4d19634457 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -860,7 +860,8 @@ static void neigh_probe(struct neighbour *neigh)
if (skb)
skb = skb_clone(skb, GFP_ATOMIC);
write_unlock(&neigh->lock);
- neigh->ops->solicit(neigh, skb);
+ if (neigh->ops->solicit)
+ neigh->ops->solicit(neigh, skb);
atomic_inc(&neigh->probes);
kfree_skb(skb);
}


David Miller

unread,
Mar 24, 2017, 12:28:52 AM3/24/17
to eric.d...@gmail.com, edum...@google.com, dvy...@google.com, xiyou.w...@gmail.com, her...@gondor.apana.org.au, a...@kernel.org, net...@vger.kernel.org, linux-...@vger.kernel.org, syzk...@googlegroups.com
From: Eric Dumazet <eric.d...@gmail.com>
Date: Thu, 23 Mar 2017 12:39:21 -0700

> From: Eric Dumazet <edum...@google.com>
>
> Dmitry posted a nice reproducer of a bug triggering in neigh_probe()
> when dereferencing a NULL neigh->ops->solicit method.
>
> This can happen for arp_direct_ops/ndisc_direct_ops and similar,
> which can be used for NUD_NOARP neighbours (created when dev->header_ops
> is NULL). Admin can then force changing nud_state to some other state
> that would fire neigh timer.
>
> Signed-off-by: Eric Dumazet <edum...@google.com>
> Reported-by: Dmitry Vyukov <dvy...@google.com>

Applied and queued up for -stable, thanks Eric.

chunyu.w...@gmail.com

unread,
Jul 26, 2017, 9:09:28 AM7/26/17
to syzkaller, da...@davemloft.net, xiyou.w...@gmail.com, edum...@google.com, her...@gondor.apana.org.au, a...@kernel.org, net...@vger.kernel.org, linux-...@vger.kernel.org
Hi Dmitry,

By trying to apply your reproducer to normal kernels, this scenery can not be reproduced (on fedora). Does this C source only for  KASAN kernels?

- Chunyu

Dmitry Vyukov

unread,
Jul 26, 2017, 9:13:37 AM7/26/17
to chunyu.w...@gmail.com, syzkaller, David Miller, Cong Wang, Eric Dumazet, Herbert Xu, Alexei Starovoitov, netdev, LKML
On Wed, Jul 26, 2017 at 3:09 PM, <chunyu.w...@gmail.com> wrote:
> Hi Dmitry,
>
> By trying to apply your reproducer to normal kernels, this scenery can not
> be reproduced (on fedora). Does this C source only for KASAN kernels?

No, NULL derefs are detected without KASAN.
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to syzkaller+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

ChunYu Wang

unread,
Jul 26, 2017, 10:35:19 AM7/26/17
to Dmitry Vyukov, syzkaller, David Miller, Cong Wang, Eric Dumazet, Herbert Xu, Alexei Starovoitov, netdev, LKML
Wo, thanks!
--
CHUNYU WANG

ASSOCIATE QE

KERNEL ENG
Reply all
Reply to author
Forward
0 new messages