[PATCH] tun: Use netif_receive_skb instead of netif_rx


Andrey Konovalov

Nov 29, 2016, 10:25:52 AM11/29/16
to Herbert Xu, David S . Miller, Jason Wang, Eric Dumazet, Paolo Abeni, Michael S . Tsirkin, Soheil Hassas Yeganeh, Markus Elfring, Mike Rapoport, net...@vger.kernel.org, linux-...@vger.kernel.org, Dmitry Vyukov, Kostya Serebryany, syzk...@googlegroups.com, Andrey Konovalov
This patch changes tun.c to call netif_receive_skb instead of netif_rx
when a packet is received. The difference between the two is that netif_rx
queues the packet into the backlog, while netif_receive_skb processes the
packet in the current context.

This patch is required for syzkaller [1] to collect coverage from packet
receive paths when a packet is received through tun (syzkaller collects
coverage per process, in the process context).

A similar patch was introduced back in 2010 [2, 3], but the author found
out that it didn't help with the task he had in mind (letting cgroups
shape network traffic based on the originating process) and decided not
to go further with it. The main concern back then was possible stack
exhaustion with 4K stacks, but CONFIG_4KSTACKS has since been removed and
stacks are 8K now.

[1] https://github.com/google/syzkaller

[2] https://www.spinics.net/lists/netdev/thrd440.html#130570

[3] https://www.spinics.net/lists/netdev/msg130570.html

Signed-off-by: Andrey Konovalov <andre...@google.com>
---
drivers/net/tun.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 8093e39..4b56e91 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1304,7 +1304,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
skb_probe_transport_header(skb, 0);

rxhash = skb_get_hash(skb);
- netif_rx_ni(skb);
+ local_bh_disable();
+ netif_receive_skb(skb);
+ local_bh_enable();

stats = get_cpu_ptr(tun->pcpu_stats);
u64_stats_update_begin(&stats->syncp);
--
2.8.0.rc3.226.g39d4020

Eric Dumazet

Nov 29, 2016, 10:35:30 AM11/29/16
to Andrey Konovalov, Peter Klausler, Herbert Xu, David S . Miller, Jason Wang, Eric Dumazet, Paolo Abeni, Michael S . Tsirkin, Soheil Hassas Yeganeh, Markus Elfring, Mike Rapoport, net...@vger.kernel.org, linux-...@vger.kernel.org, Dmitry Vyukov, Kostya Serebryany, syzk...@googlegroups.com
On Tue, 2016-11-29 at 16:25 +0100, Andrey Konovalov wrote:
> This patch changes tun.c to call netif_receive_skb instead of netif_rx
> when a packet is received. The difference between the two is that netif_rx
> queues the packet into the backlog, while netif_receive_skb processes the
> packet in the current context.
>
> This patch is required for syzkaller [1] to collect coverage from packet
> receive paths when a packet is received through tun (syzkaller collects
> coverage per process, in the process context).
>
> A similar patch was introduced back in 2010 [2, 3], but the author found
> out that it didn't help with the task he had in mind (letting cgroups
> shape network traffic based on the originating process) and decided not
> to go further with it. The main concern back then was possible stack
> exhaustion with 4K stacks, but CONFIG_4KSTACKS has since been removed and
> stacks are 8K now.

Acked-by: Eric Dumazet <edum...@google.com>

We're using a similar patch written by Peter; let me copy part of his
changelog here, since the main motivation at that time was a speed
improvement:

commit 29aa09f47d43e93327a706cd835a37012ccc5b9e
Author: Peter Klausler <p...@google.com>
Date: Fri Mar 29 17:08:02 2013 -0400

net-tun: Add netif_rx_ni_immediate() variant to speed up tun/tap.

Speed up packet reception from (i.e., writes to) a tun/tap
device by adding an alternative netif_rx_ni_immediate()
interface that invokes netif_receive_skb() immediately rather
than enqueueing the packet in the backlog queue and then driving
its processing with do_softirq(). Forced queueing as a consequence
of an RPS CPU mask will still work as expected.

This change speeds up my closed-loop single-stream tap/OVS benchmark
by about 23%, from 700k packets/second to 867k packets/second.
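
For reference, the helper described in that changelog presumably boils
down to calling netif_receive_skb() with bottom halves disabled, as in the
patch at the top of this thread. A minimal sketch, assuming the usual
<linux/netdevice.h> environment (the body is a guess, not the actual
out-of-tree implementation):

static inline int netif_rx_ni_immediate(struct sk_buff *skb)
{
	int ret;

	/*
	 * netif_receive_skb() does the RPS lookup itself and enqueues the
	 * packet to the target CPU's backlog when an RPS mask forces
	 * queueing, so that behaviour is preserved.  BHs must be disabled
	 * around it when calling from process context.
	 */
	local_bh_disable();
	ret = netif_receive_skb(skb);
	local_bh_enable();
	return ret;
}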




Michael S. Tsirkin

Nov 29, 2016, 11:20:25 AM11/29/16
to Andrey Konovalov, Herbert Xu, David S . Miller, Jason Wang, Eric Dumazet, Paolo Abeni, Soheil Hassas Yeganeh, Markus Elfring, Mike Rapoport, net...@vger.kernel.org, linux-...@vger.kernel.org, Dmitry Vyukov, Kostya Serebryany, syzk...@googlegroups.com
On Tue, Nov 29, 2016 at 04:25:36PM +0100, Andrey Konovalov wrote:
> This patch changes tun.c to call netif_receive_skb instead of netif_rx
> when a packet is received. The difference between the two is that netif_rx
> queues the packet into the backlog, while netif_receive_skb processes the
> packet in the current context.
>
> This patch is required for syzkaller [1] to collect coverage from packet
> receive paths when a packet is received through tun (syzkaller collects
> coverage per process, in the process context).
>
> A similar patch was introduced back in 2010 [2, 3], but the author found
> out that it didn't help with the task he had in mind (letting cgroups
> shape network traffic based on the originating process) and decided not
> to go further with it. The main concern back then was possible stack
> exhaustion with 4K stacks, but CONFIG_4KSTACKS has since been removed and
> stacks are 8K now.
>
> [1] https://github.com/google/syzkaller
>
> [2] https://www.spinics.net/lists/netdev/thrd440.html#130570
>
> [3] https://www.spinics.net/lists/netdev/msg130570.html
>
> Signed-off-by: Andrey Konovalov <andre...@google.com>

This was on my list of things to investigate ever since the 8k stack
default went in. Thanks for looking into this!

I note that there still seem to be 3 architectures that have
CONFIG_4KSTACKS.

How about a wrapper that falls back to netif_rx_ni when CONFIG_4KSTACKS
is set?
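
For illustration, such a wrapper could look roughly like this (a sketch
only, assuming the usual tun.c includes; the helper name tun_rx() is made
up here, and the v2 patch below ends up open-coding the #ifdef in
tun_get_user() instead):

static void tun_rx(struct sk_buff *skb)
{
#ifdef CONFIG_4KSTACKS
	/* Small stacks: keep deferring the packet to the backlog. */
	netif_rx_ni(skb);
#else
	/*
	 * Process the packet inline; BHs must be disabled around
	 * netif_receive_skb() when called from process context.
	 */
	local_bh_disable();
	netif_receive_skb(skb);
	local_bh_enable();
#endif
}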

Andrey Konovalov

Dec 1, 2016, 4:34:50 AM12/1/16
to Herbert Xu, David S . Miller, Jason Wang, Eric Dumazet, Peter Klausler, Paolo Abeni, Michael S . Tsirkin, Soheil Hassas Yeganeh, Markus Elfring, Mike Rapoport, net...@vger.kernel.org, linux-...@vger.kernel.org, Dmitry Vyukov, Kostya Serebryany, syzk...@googlegroups.com, Andrey Konovalov
This patch changes tun.c to call netif_receive_skb instead of netif_rx
when a packet is received (unless CONFIG_4KSTACKS is enabled, to avoid
stack exhaustion). The difference between the two is that netif_rx queues
the packet into the backlog, while netif_receive_skb processes the packet
in the current context.

This patch is required for syzkaller [1] to collect coverage from packet
receive paths when a packet is received through tun (syzkaller collects
coverage per process, in the process context).

As mentioned by Eric, this change also speeds up tun/tap. As measured by
Peter, it speeds up his closed-loop single-stream tap/OVS benchmark by
about 23%, from 700k packets/second to 867k packets/second.

A similar patch was introduced back in 2010 [2, 3], but the author found
out that it didn't help with the task he had in mind (letting cgroups
shape network traffic based on the originating process) and decided not
to go further with it. The main concern back then was possible stack
exhaustion with 4K stacks.

[1] https://github.com/google/syzkaller

[2] https://www.spinics.net/lists/netdev/thrd440.html#130570

[3] https://www.spinics.net/lists/netdev/msg130570.html

Signed-off-by: Andrey Konovalov <andre...@google.com>
---

Changes since v1:
- incorporate Eric's note about speed improvements in the commit description
- use netif_receive_skb only if CONFIG_4KSTACKS is not enabled

drivers/net/tun.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 8093e39..d310b13 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1304,7 +1304,13 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
skb_probe_transport_header(skb, 0);

rxhash = skb_get_hash(skb);
+#ifndef CONFIG_4KSTACKS
+ local_bh_disable();
+ netif_receive_skb(skb);
+ local_bh_enable();
+#else
netif_rx_ni(skb);
+#endif

stats = get_cpu_ptr(tun->pcpu_stats);
u64_stats_update_begin(&stats->syncp);

Andrey Konovalov

Dec 1, 2016, 4:39:47 AM12/1/16
to syzkaller, Herbert Xu, David S . Miller, Jason Wang, Eric Dumazet, Paolo Abeni, Soheil Hassas Yeganeh, Markus Elfring, Mike Rapoport, netdev, LKML, Dmitry Vyukov, Kostya Serebryany
On Tue, Nov 29, 2016 at 5:20 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
> On Tue, Nov 29, 2016 at 04:25:36PM +0100, Andrey Konovalov wrote:
>> This patch changes tun.c to call netif_receive_skb instead of netif_rx
>> when a packet is received. The difference between the two is that netif_rx
>> queues the packet into the backlog, while netif_receive_skb processes the
>> packet in the current context.
>>
>> This patch is required for syzkaller [1] to collect coverage from packet
>> receive paths when a packet is received through tun (syzkaller collects
>> coverage per process, in the process context).
>>
>> A similar patch was introduced back in 2010 [2, 3], but the author found
>> out that it didn't help with the task he had in mind (letting cgroups
>> shape network traffic based on the originating process) and decided not
>> to go further with it. The main concern back then was possible stack
>> exhaustion with 4K stacks, but CONFIG_4KSTACKS has since been removed and
>> stacks are 8K now.
>>
>> [1] https://github.com/google/syzkaller
>>
>> [2] https://www.spinics.net/lists/netdev/thrd440.html#130570
>>
>> [3] https://www.spinics.net/lists/netdev/msg130570.html
>>
>> Signed-off-by: Andrey Konovalov <andre...@google.com>
>
> This was on my list of things to investigate ever since the 8k stack
> default went in. Thanks for looking into this!
>
> I note that there still seem to be 3 architectures that have
> CONFIG_4KSTACKS.

Hi Michael,

Missed that, mailed v2.

Thanks!

>
> How about a wrapper that falls back to netif_rx_ni when CONFIG_4KSTACKS
> is set?
>
>
>> ---
>> drivers/net/tun.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 8093e39..4b56e91 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1304,7 +1304,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>> skb_probe_transport_header(skb, 0);
>>
>> rxhash = skb_get_hash(skb);
>> - netif_rx_ni(skb);
>> + local_bh_disable();
>> + netif_receive_skb(skb);
>> + local_bh_enable();
>>
>> stats = get_cpu_ptr(tun->pcpu_stats);
>> u64_stats_update_begin(&stats->syncp);
>> --
>> 2.8.0.rc3.226.g39d4020
>

Jason Wang

Dec 1, 2016, 4:40:03 AM12/1/16
to Andrey Konovalov, Herbert Xu, David S . Miller, Eric Dumazet, Peter Klausler, Paolo Abeni, Michael S . Tsirkin, Soheil Hassas Yeganeh, Markus Elfring, Mike Rapoport, net...@vger.kernel.org, linux-...@vger.kernel.org, Dmitry Vyukov, Kostya Serebryany, syzk...@googlegroups.com
I get +20% tx pps from the guest with this patch.

Acked-by: Jason Wang <jaso...@redhat.com>

Michael S. Tsirkin

Dec 1, 2016, 8:05:35 AM12/1/16
to Andrey Konovalov, Herbert Xu, David S . Miller, Jason Wang, Eric Dumazet, Peter Klausler, Paolo Abeni, Soheil Hassas Yeganeh, Markus Elfring, Mike Rapoport, net...@vger.kernel.org, linux-...@vger.kernel.org, Dmitry Vyukov, Kostya Serebryany, syzk...@googlegroups.com
On Thu, Dec 01, 2016 at 10:34:40AM +0100, Andrey Konovalov wrote:
> This patch changes tun.c to call netif_receive_skb instead of netif_rx
> when a packet is received (unless CONFIG_4KSTACKS is enabled, to avoid
> stack exhaustion). The difference between the two is that netif_rx queues
> the packet into the backlog, while netif_receive_skb processes the packet
> in the current context.
>
> This patch is required for syzkaller [1] to collect coverage from packet
> receive paths when a packet is received through tun (syzkaller collects
> coverage per process, in the process context).
>
> As mentioned by Eric, this change also speeds up tun/tap. As measured by
> Peter, it speeds up his closed-loop single-stream tap/OVS benchmark by
> about 23%, from 700k packets/second to 867k packets/second.
>
> A similar patch was introduced back in 2010 [2, 3], but the author found
> out that it didn't help with the task he had in mind (letting cgroups
> shape network traffic based on the originating process) and decided not
> to go further with it. The main concern back then was possible stack
> exhaustion with 4K stacks.
>
> [1] https://github.com/google/syzkaller
>
> [2] https://www.spinics.net/lists/netdev/thrd440.html#130570
>
> [3] https://www.spinics.net/lists/netdev/msg130570.html
>
> Signed-off-by: Andrey Konovalov <andre...@google.com>

Acked-by: Michael S. Tsirkin <m...@redhat.com>

David Miller

Dec 1, 2016, 2:44:00 PM12/1/16
to andre...@google.com, her...@gondor.apana.org.au, jaso...@redhat.com, edum...@google.com, p...@google.com, pab...@redhat.com, m...@redhat.com, soh...@google.com, elf...@users.sourceforge.net, rp...@linux.vnet.ibm.com, net...@vger.kernel.org, linux-...@vger.kernel.org, dvy...@google.com, k...@google.com, syzk...@googlegroups.com
From: Andrey Konovalov <andre...@google.com>
Date: Thu, 1 Dec 2016 10:34:40 +0100

> This patch changes tun.c to call netif_receive_skb instead of netif_rx
> when a packet is received (unless CONFIG_4KSTACKS is enabled, to avoid
> stack exhaustion). The difference between the two is that netif_rx queues
> the packet into the backlog, while netif_receive_skb processes the packet
> in the current context.
>
> This patch is required for syzkaller [1] to collect coverage from packet
> receive paths when a packet is received through tun (syzkaller collects
> coverage per process, in the process context).
>
> As mentioned by Eric, this change also speeds up tun/tap. As measured by
> Peter, it speeds up his closed-loop single-stream tap/OVS benchmark by
> about 23%, from 700k packets/second to 867k packets/second.
>
> A similar patch was introduced back in 2010 [2, 3], but the author found
> out that it didn't help with the task he had in mind (letting cgroups
> shape network traffic based on the originating process) and decided not
> to go further with it. The main concern back then was possible stack
> exhaustion with 4K stacks.
>
> [1] https://github.com/google/syzkaller
>
> [2] https://www.spinics.net/lists/netdev/thrd440.html#130570
>
> [3] https://www.spinics.net/lists/netdev/msg130570.html
>
> Signed-off-by: Andrey Konovalov <andre...@google.com>
> ---
>
> Changes since v1:
> - incorporate Eric's note about speed improvements in the commit description
> - use netif_receive_skb only if CONFIG_4KSTACKS is not enabled

Applied to net-next, thanks!

Jason Wang

Dec 6, 2016, 10:21:20 PM12/6/16
to David Miller, andre...@google.com, her...@gondor.apana.org.au, edum...@google.com, p...@google.com, pab...@redhat.com, m...@redhat.com, soh...@google.com, elf...@users.sourceforge.net, rp...@linux.vnet.ibm.com, net...@vger.kernel.org, linux-...@vger.kernel.org, dvy...@google.com, k...@google.com, syzk...@googlegroups.com
David, looks like this commit is not in net-next.git.

Please help to check.

Thanks

David Miller

Dec 6, 2016, 10:25:28 PM12/6/16
to jaso...@redhat.com, andre...@google.com, her...@gondor.apana.org.au, edum...@google.com, p...@google.com, pab...@redhat.com, m...@redhat.com, soh...@google.com, elf...@users.sourceforge.net, rp...@linux.vnet.ibm.com, net...@vger.kernel.org, linux-...@vger.kernel.org, dvy...@google.com, k...@google.com, syzk...@googlegroups.com
From: Jason Wang <jaso...@redhat.com>
Date: Wed, 7 Dec 2016 11:21:11 +0800

> David, looks like this commit is not in net-next.git.
>
> Please help to check.

Take a look, it should be there now.

Jason Wang

Dec 6, 2016, 10:40:04 PM12/6/16
to David Miller, andre...@google.com, her...@gondor.apana.org.au, edum...@google.com, p...@google.com, pab...@redhat.com, m...@redhat.com, soh...@google.com, elf...@users.sourceforge.net, rp...@linux.vnet.ibm.com, net...@vger.kernel.org, linux-...@vger.kernel.org, dvy...@google.com, k...@google.com, syzk...@googlegroups.com
Yes, thanks.