net: GPF in eth_header

277 views
Skip to first unread message

Dmitry Vyukov

unread,
Nov 26, 2016, 12:30:24 PM11/26/16
to David Miller, Tom Herbert, adu...@mirantis.com, Hannes Frederic Sowa, jb...@redhat.com, Sabrina Dubroca, netdev, LKML, Eric Dumazet, syzkaller
Hello,

The following program triggers GPF in eth_header:

https://gist.githubusercontent.com/dvyukov/613cadf05543b55a419f237e419cd495/raw/5471231523d1a07c3de55f11f87472c2816ee06c/gistfile1.txt

On commit 16ae16c6e5616c084168740990fc508bda6655d4 (Nov 24).

BUG: unable to handle kernel paging request at ffffed002d14d74a
IP: [<ffffffff86be3295>] eth_header+0x75/0x260 net/ethernet/eth.c:88
PGD 7fff6067 [ 50.787819] PUD 7fff5067
PMD 0 [ 50.787819]
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 2 PID: 6712 Comm: a.out Not tainted 4.9.0-rc6+ #55
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88003a1841c0 task.stack: ffff880034d08000
RIP: 0010:[<ffffffff86be3295>] [<ffffffff86be3295>]
eth_header+0x75/0x260 net/ethernet/eth.c:88
RSP: 0018:ffff880034d0eb68 EFLAGS: 00010a03
RAX: 1ffff1002d14d74a RBX: ffff880168a6ba4a RCX: ffff88006a9c7858
RDX: 000000000000dd86 RSI: dffffc0000000000 RDI: ffff880168a6ba56
RBP: ffff880034d0eb98 R08: 0000000000000000 R09: 0000000000000031
R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
R13: ffff88006c208d80 R14: 00000000000086dd R15: ffff88006a9c7858
FS: 0000000001a02940(0000) GS:ffff88006d000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffed002d14d74a CR3: 0000000037373000 CR4: 00000000000006e0
Stack:
000000316881ab40 ffff88006a9c76c0 ffff88006881ab40 ffff88006a9c77f8
0000000000000000 dffffc0000000000 ffff880034d0ee98 ffffffff86b31af9
ffffffff8719605c ffff880034d0f0f8 ffffffff000086dd ffffffff86be3220
Call Trace:
[< inline >] dev_hard_header ./include/linux/netdevice.h:2762
[<ffffffff86b31af9>] neigh_resolve_output+0x659/0xb20 net/core/neighbour.c:1302
[< inline >] dst_neigh_output ./include/net/dst.h:464
[<ffffffff8719605c>] ip6_finish_output2+0xb3c/0x2500 net/ipv6/ip6_output.c:121
[<ffffffff871a0b0b>] ip6_finish_output+0x2eb/0x760 net/ipv6/ip6_output.c:139
[< inline >] NF_HOOK_COND ./include/linux/netfilter.h:246
[<ffffffff871a1157>] ip6_output+0x1d7/0x9a0 net/ipv6/ip6_output.c:153
[< inline >] dst_output ./include/net/dst.h:501
[<ffffffff873312ea>] ip6_local_out+0x9a/0x180 net/ipv6/output_core.c:170
[<ffffffff871a3886>] ip6_send_skb+0xa6/0x340 net/ipv6/ip6_output.c:1712
[<ffffffff871a3bd8>] ip6_push_pending_frames+0xb8/0xe0
net/ipv6/ip6_output.c:1732
[< inline >] rawv6_push_pending_frames net/ipv6/raw.c:607
[<ffffffff8722acfb>] rawv6_sendmsg+0x250b/0x2c20 net/ipv6/raw.c:920
[<ffffffff8701c4f5>] inet_sendmsg+0x385/0x590 net/ipv4/af_inet.c:734
[< inline >] sock_sendmsg_nosec net/socket.c:621
[<ffffffff86a6ea9f>] sock_sendmsg+0xcf/0x110 net/socket.c:631
[<ffffffff86a6ee0b>] sock_write_iter+0x32b/0x620 net/socket.c:829
[<ffffffff81a6f153>] do_iter_readv_writev+0x363/0x670 fs/read_write.c:695
[<ffffffff81a71ba1>] do_readv_writev+0x431/0x9b0 fs/read_write.c:872
[<ffffffff81a726dc>] vfs_writev+0x8c/0xc0 fs/read_write.c:911
[<ffffffff81a72825>] do_writev+0x115/0x2d0 fs/read_write.c:944
[< inline >] SYSC_writev fs/read_write.c:1017
[<ffffffff81a75fdc>] SyS_writev+0x2c/0x40 fs/read_write.c:1014
[<ffffffff8814cf85>] entry_SYSCALL_64_fastpath+0x23/0xc6
arch/x86/entry/entry_64.S:209
Code: 41 83 fe 04 0f 84 aa 00 00 00 e8 17 4e b0 fa 48 8d 7b 0c 48 be
00 00 00 00 00 fc ff df 44 89 f2 66 c1 c2 08 48 89 f8 48 c1 e8 03 <0f>
b6 0c 30 48 8d 43 0d 49 89 c0 49 c1 e8 03 41 0f b6 34 30 49
RIP [<ffffffff86be3295>] eth_header+0x75/0x260 net/ethernet/eth.c:88
RSP <ffff880034d0eb68>
CR2: ffffed002d14d74a
---[ end trace a73fedfdc11bd60c ]---

Eric Dumazet

unread,
Nov 26, 2016, 1:28:05 PM11/26/16
to Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck, Hannes Frederic Sowa, Jiri Benc, Sabrina Dubroca, netdev, LKML, syzkaller
Hi Dmitry

I could not reproduce the issue. Might need some specific configuration...

loopback device has proper ethernet header (all 0)

Fault happens in :

0f b6 0c 30 movzbl (%rax,%rsi,1),%ecx

RAX=1ffff1002d14d74a which is RDI>>3, and RSI=dffffc0000000000

Could this be a KASAN problem ?

Andrey Konovalov

unread,
Nov 26, 2016, 2:07:10 PM11/26/16
to syzkaller, Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck, Hannes Frederic Sowa, Jiri Benc, Sabrina Dubroca, netdev, LKML
Hi Eric,

The crash happens when the kernel tries to access shadow for nonmapped memory.

The issue here is an integer overflow which happens in neigh_resolve_output().
skb_network_offset(skb) can return negative number, but __skb_pull()
accepts unsigned int as len.
As a result, the least significat bit in higher 32 bits of skb->data
gets set and we get an out-of-bounds with offset of 4 GB.

I've attached a short reproducer, but you either need KASAN or to add
a BUG_ON to see the crash.
In this reproducer skb_network_offset() becomes negative after merging
two ipv6 fragments.

I actually see multiple places where skb_network_offset() is used as
an argument to skb_pull().
So I guess every place can potentially be buggy.

Thanks!


>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
ipv6-poc1.c

Eric Dumazet

unread,
Nov 26, 2016, 3:05:09 PM11/26/16
to Andrey Konovalov, syzkaller, Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck, Hannes Frederic Sowa, Jiri Benc, Sabrina Dubroca, netdev, LKML
> Hi Eric,
>
> The crash happens when the kernel tries to access shadow for nonmapped memory.
>
> The issue here is an integer overflow which happens in neigh_resolve_output().
> skb_network_offset(skb) can return negative number, but __skb_pull()
> accepts unsigned int as len.
> As a result, the least significat bit in higher 32 bits of skb->data
> gets set and we get an out-of-bounds with offset of 4 GB.
>
> I've attached a short reproducer, but you either need KASAN or to add
> a BUG_ON to see the crash.
> In this reproducer skb_network_offset() becomes negative after merging
> two ipv6 fragments.
>
> I actually see multiple places where skb_network_offset() is used as
> an argument to skb_pull().
> So I guess every place can potentially be buggy.

Well, I think the intent is to accept a negative number.

This definitely was assumed by commit e1f165032c8bade authors !

I guess they were using a 32bit kernel for their tests.

Eric Dumazet

unread,
Nov 26, 2016, 3:34:40 PM11/26/16
to Andrey Konovalov, syzkaller, Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck, Hannes Frederic Sowa, Jiri Benc, Sabrina Dubroca, netdev, LKML
Correct fix would be to use

skb_push(skb, -skb_network_offset(skb));

As done in other locations...

Eric Dumazet

unread,
Nov 28, 2016, 1:51:21 PM11/28/16
to Andrey Konovalov, Hannes Frederic Sowa, syzkaller, Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck, Hannes Frederic Sowa, Jiri Benc, Sabrina Dubroca, netdev, LKML
I can not reproduce the bug on my hosts.
Quite hard to debug for me.

skb_network_offset() can not be negative at this point, unless there is
a bug upper in the stack.

Hannes, do you have an idea of what could be wrong in IPv6 stack ?

Thanks.


Andrey Konovalov

unread,
Nov 28, 2016, 2:04:46 PM11/28/16
to syzkaller, Hannes Frederic Sowa, Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck, Jiri Benc, Sabrina Dubroca, netdev, LKML
Hi Eric,

As far as I can see, skb_network_offset() becomes negative after
pskb_pull(skb, (u8 *) (fhdr + 1) - skb->data) in nf_ct_frag6_queue().
At least I'm able to detect that with a BUG_ON().

Also it seems that the issue is only reproducible (at least with the
poc I provided) for a short time after boot.

I hope that helps.

>
> Hannes, do you have an idea of what could be wrong in IPv6 stack ?
>
> Thanks.
>
>

Dmitry Vyukov

unread,
Nov 28, 2016, 2:34:37 PM11/28/16
to syzkaller, Hannes Frederic Sowa, David Miller, Tom Herbert, Alexander Duyck, Jiri Benc, Sabrina Dubroca, netdev, LKML
On Mon, Nov 28, 2016 at 8:04 PM, 'Andrey Konovalov' via syzkaller
Eric,

Is it enough to debug? Or maybe Andrey can trace some values for you.

Eric Dumazet

unread,
Nov 28, 2016, 2:48:41 PM11/28/16
to Dmitry Vyukov, syzkaller, Hannes Frederic Sowa, David Miller, Tom Herbert, Alexander Duyck, Jiri Benc, Sabrina Dubroca, netdev, LKML
On Mon, 2016-11-28 at 20:34 +0100, Dmitry Vyukov wrote:
> On Mon, Nov 28, 2016 at 8:04 PM, 'Andrey Konovalov' via syzkaller

> > Hi Eric,
> >
> > As far as I can see, skb_network_offset() becomes negative after
> > pskb_pull(skb, (u8 *) (fhdr + 1) - skb->data) in nf_ct_frag6_queue().
> > At least I'm able to detect that with a BUG_ON().
> >
> > Also it seems that the issue is only reproducible (at least with the
> > poc I provided) for a short time after boot.
>
>
> Eric,
>
> Is it enough to debug? Or maybe Andrey can trace some values for you.

Well, now we are talking, if you tell me how many modules you load, it
might help ;)

nf_ct_frag6_queue is nowhere to be seen in my kernels, that might
explain why I could not reproduce the bug.

Let me try ;)


Eric Dumazet

unread,
Nov 28, 2016, 4:06:55 PM11/28/16
to Dmitry Vyukov, Florian Westphal, syzkaller, Hannes Frederic Sowa, David Miller, Tom Herbert, Alexander Duyck, Jiri Benc, Sabrina Dubroca, netdev, LKML
Might be a bug added in commit daaa7d647f81f3
("netfilter: ipv6: avoid nf_iterate recursion")

Florian, what do you think of dropping a packet that presumably was
mangled badly by nf_ct_frag6_queue() ?

(Like about 48 byte pulled :(, and/or skb->csum changed )

diff --git a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
index f7aab5ab93a5..31aa947332d8 100644
--- a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
+++ b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
@@ -65,8 +65,8 @@ static unsigned int ipv6_defrag(void *priv,

err = nf_ct_frag6_gather(state->net, skb,
nf_ct6_defrag_user(state->hook, skb));
- /* queued */
- if (err == -EINPROGRESS)
+ /* queued or mangled ... */
+ if (err)
return NF_STOLEN;

return NF_ACCEPT;


Eric Dumazet

unread,
Nov 28, 2016, 4:19:09 PM11/28/16
to Dmitry Vyukov, Florian Westphal, syzkaller, Hannes Frederic Sowa, David Miller, Tom Herbert, Alexander Duyck, Jiri Benc, Sabrina Dubroca, netdev, LKML
Or more exactly , returning NF_DROP so that skb is really freed ;)


diff --git a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
index f7aab5ab93a5..508739a3ca2a 100644
--- a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
+++ b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
@@ -65,9 +65,9 @@ static unsigned int ipv6_defrag(void *priv,

err = nf_ct_frag6_gather(state->net, skb,
nf_ct6_defrag_user(state->hook, skb));
- /* queued */
- if (err == -EINPROGRESS)
- return NF_STOLEN;
+ /* queued or mangled ... */
+ if (err)
+ return (err == -EINPROGRESS) ? NF_STOLEN : NF_DROP;

return NF_ACCEPT;
}


Florian Westphal

unread,
Nov 28, 2016, 4:37:41 PM11/28/16
to Eric Dumazet, Dmitry Vyukov, Florian Westphal, syzkaller, Hannes Frederic Sowa, David Miller, Tom Herbert, Alexander Duyck, Jiri Benc, Sabrina Dubroca, netdev, LKML
Eric Dumazet <eric.d...@gmail.com> wrote:
> > Might be a bug added in commit daaa7d647f81f3
> > ("netfilter: ipv6: avoid nf_iterate recursion")
> >
> > Florian, what do you think of dropping a packet that presumably was
> > mangled badly by nf_ct_frag6_queue() ?

ipv4 definitely frees malformed packets.
In general, I think netfilter should avoid 'silent' drops if possible
and let skb continue, but of course such skbs should not be made worse
as what we ate to begin with...

> > (Like about 48 byte pulled :(, and/or skb->csum changed )

I think this warrants a review of ipv6 reassembly too, bug reported here
is because ipv6 nf defrag is also done on output.

> diff --git a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
> index f7aab5ab93a5..508739a3ca2a 100644
> --- a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
> +++ b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
> @@ -65,9 +65,9 @@ static unsigned int ipv6_defrag(void *priv,
>
> err = nf_ct_frag6_gather(state->net, skb,
> nf_ct6_defrag_user(state->hook, skb));
> - /* queued */
> - if (err == -EINPROGRESS)
> - return NF_STOLEN;
> + /* queued or mangled ... */
> + if (err)
> + return (err == -EINPROGRESS) ? NF_STOLEN : NF_DROP;
>
> return NF_ACCEPT;

Looks good, we'll need to change some of the errno return codes in
nf_ct_frag6_gather to 0 though for this to work, which should not be too
hard ;)

Eric Dumazet

unread,
Nov 28, 2016, 5:15:07 PM11/28/16
to Florian Westphal, Dmitry Vyukov, syzkaller, Hannes Frederic Sowa, David Miller, Tom Herbert, Alexander Duyck, Jiri Benc, Sabrina Dubroca, netdev, LKML
On Mon, 2016-11-28 at 22:34 +0100, Florian Westphal wrote:
> Eric Dumazet <eric.d...@gmail.com> wrote:
> > > Might be a bug added in commit daaa7d647f81f3
> > > ("netfilter: ipv6: avoid nf_iterate recursion")
> > >
> > > Florian, what do you think of dropping a packet that presumably was
> > > mangled badly by nf_ct_frag6_queue() ?
>
> ipv4 definitely frees malformed packets.
> In general, I think netfilter should avoid 'silent' drops if possible
> and let skb continue, but of course such skbs should not be made worse
> as what we ate to begin with...
>
> > > (Like about 48 byte pulled :(, and/or skb->csum changed )
>
> I think this warrants a review of ipv6 reassembly too, bug reported here
> is because ipv6 nf defrag is also done on output.


ip6_frag_queue() definitely frees bad/mangled skbs()


> Looks good, we'll need to change some of the errno return codes in
> nf_ct_frag6_gather to 0 though for this to work, which should not be too
> hard ;)

If the goal is to let buggy packets pass, then we might need to undo
changes in nf_ct_frag6_queue()



Florian Westphal

unread,
Nov 28, 2016, 5:22:20 PM11/28/16
to Eric Dumazet, Florian Westphal, Dmitry Vyukov, syzkaller, Hannes Frederic Sowa, David Miller, Tom Herbert, Alexander Duyck, Jiri Benc, Sabrina Dubroca, netdev, LKML
Eric Dumazet <eric.d...@gmail.com> wrote:
> On Mon, 2016-11-28 at 22:34 +0100, Florian Westphal wrote:
> > Eric Dumazet <eric.d...@gmail.com> wrote:
> > > > Might be a bug added in commit daaa7d647f81f3
> > > > ("netfilter: ipv6: avoid nf_iterate recursion")
> > > >
> > > > Florian, what do you think of dropping a packet that presumably was
> > > > mangled badly by nf_ct_frag6_queue() ?
> >
> > ipv4 definitely frees malformed packets.
> > In general, I think netfilter should avoid 'silent' drops if possible
> > and let skb continue, but of course such skbs should not be made worse
> > as what we ate to begin with...
> >
> > > > (Like about 48 byte pulled :(, and/or skb->csum changed )
> >
> > I think this warrants a review of ipv6 reassembly too, bug reported here
> > is because ipv6 nf defrag is also done on output.
>
>
> ip6_frag_queue() definitely frees bad/mangled skbs()

Yes, sorry. nf_ct_frag6_queue is mostly derived from ip6_frag_queue
so any bugs in one might also exist in other.
Thats all I wanted to say here. I'll check this tomorrow.

> > Looks good, we'll need to change some of the errno return codes in
> > nf_ct_frag6_gather to 0 though for this to work, which should not be too
> > hard ;)
>
> If the goal is to let buggy packets pass, then we might need to undo
> changes in nf_ct_frag6_queue()

It currently returns -EINVAL in cases where skb wasn't changed/altered
(e.g. because it doesn't have a fragment header), so we should ACCEPT in
that case.

As for 'buggy' packet, I think its ok to mimic ip6_frag_queue, i.e.
if it tosses returning NF_DROP under same circumstance seems ok.

(Passing however will -- on ingress side -- cause snmp stat increments
in ipv6 reassembly, this still might be desireable).

I'll check where undo might be possible/not too hard.

Thanks Eric for debugging this!

Eric Dumazet

unread,
Nov 28, 2016, 6:17:15 PM11/28/16
to Florian Westphal, Dmitry Vyukov, syzkaller, Hannes Frederic Sowa, David Miller, Tom Herbert, Alexander Duyck, Jiri Benc, Sabrina Dubroca, netdev, LKML
On Mon, 2016-11-28 at 23:19 +0100, Florian Westphal wrote:

> It currently returns -EINVAL in cases where skb wasn't changed/altered
> (e.g. because it doesn't have a fragment header), so we should ACCEPT in
> that case.
>

Maybe nf_ct_frag6_queue() should return direct NF_ codes then ...



Andrey Konovalov

unread,
Nov 29, 2016, 5:26:55 AM11/29/16
to syzkaller, Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck, Hannes Frederic Sowa, Jiri Benc, Sabrina Dubroca, netdev, LKML
On Sat, Nov 26, 2016 at 9:05 PM, Eric Dumazet <erd...@gmail.com> wrote:
>> I actually see multiple places where skb_network_offset() is used as
>> an argument to skb_pull().
>> So I guess every place can potentially be buggy.
>
> Well, I think the intent is to accept a negative number.

I'm not sure that was the intent since it results in a signedness
issue which leads to an out-of-bounds.

A quick grep shows that the same issue can potentially happen in
multiple places across the kernel:

net/ipv6/ip6_output.c:1655: __skb_pull(skb, skb_network_offset(skb));
net/packet/af_packet.c:2043: skb_pull(skb, skb_network_offset(skb));
net/packet/af_packet.c:2165: skb_pull(skb, skb_network_offset(skb));
net/core/neighbour.c:1301: __skb_pull(skb, skb_network_offset(skb));
net/core/neighbour.c:1331: __skb_pull(skb, skb_network_offset(skb));
net/core/dev.c:3157: __skb_pull(skb, skb_network_offset(skb));
net/sched/sch_teql.c:337: __skb_pull(skb, skb_network_offset(skb));
net/sched/sch_atm.c:479: skb_pull(skb, skb_network_offset(skb));
net/ipv4/ip_output.c:1385: __skb_pull(skb, skb_network_offset(skb));
net/ipv4/ip_fragment.c:391: if (!pskb_pull(skb, skb_network_offset(skb) + ihl))
drivers/net/vxlan.c:1440: __skb_pull(reply, skb_network_offset(reply));
drivers/net/vxlan.c:1902: __skb_pull(skb, skb_network_offset(skb));
drivers/net/vrf.c:220: __skb_pull(skb, skb_network_offset(skb));
drivers/net/vrf.c:314: __skb_pull(skb, skb_network_offset(skb));

A similar thing also happened to somebody else (on a receive path!):
https://forums.grsecurity.net/viewtopic.php?f=3&t=4550

Does it make sense to check skb_network_offset() before passing it to
skb_pull() everywhere?

>
> This definitely was assumed by commit e1f165032c8bade authors !
>
> I guess they were using a 32bit kernel for their tests.
>

Eric Dumazet

unread,
Nov 29, 2016, 9:58:24 AM11/29/16
to Andrey Konovalov, syzkaller, Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck, Hannes Frederic Sowa, Jiri Benc, Sabrina Dubroca, netdev, LKML
On Tue, 2016-11-29 at 11:26 +0100, Andrey Konovalov wrote:
> On Sat, Nov 26, 2016 at 9:05 PM, Eric Dumazet <erd...@gmail.com> wrote:
> >> I actually see multiple places where skb_network_offset() is used as
> >> an argument to skb_pull().
> >> So I guess every place can potentially be buggy.
> >
> > Well, I think the intent is to accept a negative number.
>
> I'm not sure that was the intent since it results in a signedness
> issue which leads to an out-of-bounds.
>

Hey, I already mentioned where was the bug.

You missed the investigation where I pointed it to FLorian ?

> A quick grep shows that the same issue can potentially happen in
> multiple places across the kernel:
>
> net/ipv6/ip6_output.c:1655: __skb_pull(skb, skb_network_offset(skb));
> net/packet/af_packet.c:2043: skb_pull(skb, skb_network_offset(skb));
> net/packet/af_packet.c:2165: skb_pull(skb, skb_network_offset(skb));
> net/core/neighbour.c:1301: __skb_pull(skb, skb_network_offset(skb));
> net/core/neighbour.c:1331: __skb_pull(skb, skb_network_offset(skb));
> net/core/dev.c:3157: __skb_pull(skb, skb_network_offset(skb));
> net/sched/sch_teql.c:337: __skb_pull(skb, skb_network_offset(skb));
> net/sched/sch_atm.c:479: skb_pull(skb, skb_network_offset(skb));
> net/ipv4/ip_output.c:1385: __skb_pull(skb, skb_network_offset(skb));
> net/ipv4/ip_fragment.c:391: if (!pskb_pull(skb, skb_network_offset(skb) + ihl))
> drivers/net/vxlan.c:1440: __skb_pull(reply, skb_network_offset(reply));
> drivers/net/vxlan.c:1902: __skb_pull(skb, skb_network_offset(skb));
> drivers/net/vrf.c:220: __skb_pull(skb, skb_network_offset(skb));
> drivers/net/vrf.c:314: __skb_pull(skb, skb_network_offset(skb));
>
> A similar thing also happened to somebody else (on a receive path!):
> https://forums.grsecurity.net/viewtopic.php?f=3&t=4550
>
> Does it make sense to check skb_network_offset() before passing it to
> skb_pull() everywhere?

Well, sure, we could add safety checks everywhere and slow the kernel
when debugging is requested.

But skb_network_offset() is not the problem here. Why are you focusing
on it ?

The real problem is in __skb_pull() or __skb_push() and all similar
helpers. Lots of added checks and slowdowns.

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 9c535fbccf2c7dbfae04cee393460e86d588c26b..d6116e37d054fc1536114347ed3c41fc7dc7a882 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1923,6 +1923,7 @@ static inline unsigned char *__skb_put(struct sk_buff *skb, unsigned int len)
unsigned char *skb_push(struct sk_buff *skb, unsigned int len);
static inline unsigned char *__skb_push(struct sk_buff *skb, unsigned int len)
{
+ BUG_ON((int)len < 0);
skb->data -= len;
skb->len += len;
return skb->data;
@@ -1931,6 +1932,7 @@ static inline unsigned char *__skb_push(struct sk_buff *skb, unsigned int len)
unsigned char *skb_pull(struct sk_buff *skb, unsigned int len);
static inline unsigned char *__skb_pull(struct sk_buff *skb, unsigned int len)
{
+ BUG_ON((int)len < 0);
skb->len -= len;
BUG_ON(skb->len < skb->data_len);
return skb->data += len;
@@ -1938,6 +1940,7 @@ static inline unsigned char *__skb_pull(struct sk_buff *skb, unsigned int len)

static inline unsigned char *skb_pull_inline(struct sk_buff *skb, unsigned int len)
{
+ BUG_ON((int)len < 0);
return unlikely(len > skb->len) ? NULL : __skb_pull(skb, len);
}

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d1d1a5a5ad24ded523fc12ffba8c602b03bd0830..7bf098c848fd857ba5d287fc91d43f62f381bd55 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1450,6 +1450,7 @@ EXPORT_SYMBOL(skb_put);
*/
unsigned char *skb_push(struct sk_buff *skb, unsigned int len)
{
+ BUG_ON((int)len < 0);
skb->data -= len;
skb->len += len;
if (unlikely(skb->data<skb->head))







Andrey Konovalov

unread,
Nov 29, 2016, 10:31:04 AM11/29/16
to syzkaller, Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck, Hannes Frederic Sowa, Jiri Benc, Sabrina Dubroca, netdev, LKML
The issue is not with skb_network_offset(), but with __skb_pull()
using skb_network_offset() as an argument.

I'm not sure what would be the beast way to fix this, to add a check
before every __skb_pull(skb_network_offset()), to fix __skb_pull() to
work with signed ints, to add BUG_ON()'s in __skb_pull, or something
else.

What I meant is that you fixed this very instance of the bug, and I'm
pointing out that a similar one might hit us again.

Eric Dumazet

unread,
Nov 29, 2016, 11:15:16 AM11/29/16
to Andrey Konovalov, syzkaller, Dmitry Vyukov, David Miller, Tom Herbert, Alexander Duyck, Hannes Frederic Sowa, Jiri Benc, Sabrina Dubroca, netdev, LKML
On Tue, 2016-11-29 at 16:31 +0100, Andrey Konovalov wrote:
=
> The issue is not with skb_network_offset(), but with __skb_pull()
> using skb_network_offset() as an argument.
>

No. The issue can happen with _any_ __skb_pull() with a 'negative'
argument, on 64bit arches.

skb_network_offset() is only one of the many cases this could happen if
a bug is added at some random place, including memory corruption from
a different kernel layer, or buggy hardware.

> I'm not sure what would be the beast way to fix this, to add a check
> before every __skb_pull(skb_network_offset()), to fix __skb_pull() to
> work with signed ints, to add BUG_ON()'s in __skb_pull, or something
> else.
>
> What I meant is that you fixed this very instance of the bug, and I'm
> pointing out that a similar one might hit us again.

As I said, adding a check in skb_network_offset() would not be generic
enough.

Sure, we can be proactive and add tests everywhere in the kernel, but we
also want to keep it reasonably fast.



Reply all
Reply to author
Forward
0 new messages