Slow network speed outside netvm

Jarle Thorsen

Jan 31, 2018, 4:50:09 AM
to qubes-users
My netvm (Fedora 26 template) has a 10GbE network card, and from within the netvm I have no problem saturating the 10Gbit link using iperf to an external server.

However, in any vm sending traffic through this netvm I can only get around 2Gbit max using the same iperf command against the same server.

I have this problem both with appvms using the *same* Fedora template as the netvm, and also in Windows HVM.

Anybody have experience with Qubes in a 10GbE network?

Jarle Thorsen

Jan 31, 2018, 5:02:05 AM
to qubes-users
I do notice that the ksoftirqd process in the netvm is running at 100% CPU during the iperf test from the appvm. I'm guessing this is related to my problem?

Ilpo Järvinen

Jan 31, 2018, 5:12:33 AM
to Jarle Thorsen, qubes-users
SG being disabled for the Qubes inter-vm interfaces (for an unknown
reason; Marek no longer remembered the details of that change) might
have some (possibly huge) impact on performance, as it prevents the
kernel from using some of its high-speed features.


--
i.

Alex Dubois

Jan 31, 2018, 5:14:31 AM
to qubes-users

Interested to find out too. Have you tried from FirewallVM?

Could you also test what happens when you start a disposable VM during the load test? Does the throughput drop? (I am using a Qubes server as a firewall, and my kids complain that it freezes their game for a few seconds when I start a DispVM; my intranet card is in an appVM with the PCI NIC attached.)

Jarle Thorsen

Jan 31, 2018, 6:54:14 AM
to qubes-users
Not quite sure what "SG" is in this context. Are there any settings I can experiment with?

Are you talking about enabling sg on the virtual network device in the netvm?

Something like "sudo ethtool -K vif12.0 sg on" ?

Ilpo Järvinen

Jan 31, 2018, 6:59:57 AM
to Jarle Thorsen, qubes-users
On Wed, 31 Jan 2018, Jarle Thorsen wrote:

> onsdag 31. januar 2018 11.12.33 UTC+1 skrev Ilpo Järvinen følgende:
> > On Wed, 31 Jan 2018, Jarle Thorsen wrote:
> > > onsdag 31. januar 2018 10.50.09 UTC+1 skrev Jarle Thorsen følgende:
> > > > My netvm (Fedora 26 template) has a 10GbE network card, and from
> > > > within the netvm I have no problem saturating the 10Gbit link using
> > > > iperf to an external server.
> > > > However, in any vm sending traffic through this netvm I can only get
> > > > around 2Gbit max using the same iperf command against the same server.
> > > >
> > > > I have this problem both with appvms using the *same* Fedora template
> > > > as the netvm, and also in Windows HVM.
> > > >
> > > > Anybody have experience with Qubes in a 10Gbe network?
> > >
> > > I do notice that the ksoftirqd process in the netvm is running at 100%
> > > CPU during the iperf test from the appvm. I'm guessing this is related
> > > to my problem?
> >
> > SG being disabled for the Qubes inter-vm interfaces (for an unknown
> > reason; Marek no longer remembered the details of that change) might
> > have some (possibly huge) impact on performance, as it prevents the
> > kernel from using some of its high-speed features.
>
> Not quite sure what "SG" is in this context. Are there any settings I can
> experiment with?

Scatter-Gather.

> Are you talking about enabling sg on the virtual network device in the
> netvm?
>
> Something like "sudo ethtool -K vif12.0 sg on" ?

Yes. For both that and the eth0 in appvm.
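
Concretely, something like this (the vif number is just an example; pick
the one that matches your appvm, e.g. from "ip link" output in the netvm):

# in the netvm, on the backend interface of the appvm
sudo ethtool -K vif12.0 sg on

# in the appvm, on its own eth0
sudo ethtool -K eth0 sg on

# verify
sudo ethtool -k eth0 | grep scatter-gather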


--
i.

Jarle Thorsen

Jan 31, 2018, 7:15:01 AM
to qubes-users
Alex Dubois:

> Interested to find out too. Have you tried from FirewallVM?

Same slow performance when iperf is run from a FirewallVM that is connected to the netvm.

> Could you also test what happens when you start a disposable VM during the load test? Does the throughput drop?

Running iperf in the netvm and starting a disposable vm using the same netvm, there is no performance drop. (I get 8+ Gbit running one thread, and about 9.5 Gbit when using multiple threads.)

Jarle Thorsen

Jan 31, 2018, 7:24:14 AM
to qubes-users
Ilpo Järvinen:
> Scatter-Gather.
>
> > Are you talking about enabling sg on the virtual network device in the
> > netvm?
> >
> > Something like "sudo ethtool -K vif12.0 sg on" ?
>
> Yes. For both that and the eth0 in appvm.

This made a huge performance boost! (single threaded iperf went from 1.34 Gbit/sec to 3.75 Gbit/sec in appvm after enabling SG)

I'm still far away from the speed I get inside the netvm though... Maybe the netvm needs more than 2 vcpus to handle the stress from appvms?

Ilpo Järvinen

Jan 31, 2018, 7:35:44 AM
to Jarle Thorsen, qubes-users
On Wed, 31 Jan 2018, Jarle Thorsen wrote:

> Ilpo Järvinen:
> > Scatter-Gather.
> >
> > > Are you talking about enabling sg on the virtual network device in the
> > > netvm?
> > >
> > > Something like "sudo ethtool -K vif12.0 sg on" ?
> >
> > Yes. For both that and the eth0 in appvm.
>
> This made a huge performance boost! (single threaded iperf went from
> 1.34 Gbit/sec to 3.75 Gbit/sec in appvm after enabling SG)

Please also check that GSO (generic-segmentation-offload) is on at
the sending appvm eth0 (I don't remember whether the dependency logic causes
it to get toggled off when SG is disabled, and I cannot check it ATM myself).
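
E.g. (assuming the usual eth0 name in the appvm):

sudo ethtool -k eth0 | grep -E 'scatter-gather|generic-segmentation-offload'
# and if GSO shows as off:
sudo ethtool -K eth0 gso on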

> I'm still far away from the speed I get inside the netvm though... Maybe
> the netvm needs more than 2 vcpus to handle the stress from appvms?

Could be.

I'm not sure how your appvm is connected to the netvm; if there's a firewall
vm in between, that would also be relevant.

I guess you can try e.g. vmstat 1 in all vms that the traffic passes
through, to see if any of them is saturating a CPU.
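
For example, run this in each vm on the path while the iperf test is going
and watch the us/sy/id columns (mpstat is from the sysstat package, if it
happens to be installed):

vmstat 1
# per-CPU view, to spot a single saturated core such as a busy ksoftirqd:
mpstat -P ALL 1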


--
i.

Jarle Thorsen

Jan 31, 2018, 8:23:19 AM
to qubes-users
Ilpo Järvinen:
> Please also check that GSO (generic-segmentation-offload) is on at
> the sending appvm eth0 (I don't remember whether the dependency logic causes
> it to get toggled off when SG is disabled, and I cannot check it ATM myself).

Yes, GSO is automatically turned on when SG is enabled.

> > I'm still far away from the speed I get inside the netvm though... Maybe
> > the netvm needs more than 2 vcpus to handle the stress from appvms?
>
> Could be.

Going from 2 to 4 vcpus in the netvm does not make any difference.

> I'm not sure how your appvm is connected to the netvm; if there's a firewall
> vm in between, that would also be relevant.

The appvm connects directly to the netvm with nothing (no firewall) in between.

> I guess you can try e.g. vmstat 1 in all vms that the traffic passes
> through, to see if any of them is saturating a CPU.

The appvm hardly uses any CPU at all, and the netvm is below 50% CPU usage.

Mike Keehan

Jan 31, 2018, 9:56:55 AM
to qubes...@googlegroups.com
Hi Jarle,

It sounds a bit ambitious to run 10 Gbit per second from one VM through
another and onto the wire. I suspect you are memory-speed limited
if you are using a straightforward desktop PC.

Do you know if anyone has achieved this?

Mike.

Alex Dubois

Jan 31, 2018, 4:50:23 PM
to qubes-users

Thanks for testing, and for the SG feature, which I didn't know about before this thread. I'll have to dig into my problem...

Jarle Thorsen

Feb 1, 2018, 2:02:58 AM
to qubes-users
Mike Keehan:

> It sounds a bit ambitious to run 10 Gbit per second from one VM through
> another and onto the wire. I suspect you are memory-speed limited
> if you are using a straightforward desktop PC.

I'm not sure what the limiting factor is (memory speed, Xen overhead?), but I just did an iperf test between the netvm and the appvm only, and I still maxed out at 4 Gbit/s. This test takes the network card out of the equation and only tests the speed between the VMs. Maybe somebody can do similar tests on their system?
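
For anyone who wants to repeat it, the test is roughly as follows (assuming
iperf is installed in both VMs; 10.137.4.1 is just an example address, use
the netvm IP that shows up as the appvm's default gateway, and note that the
netvm's firewall may need to accept the iperf port first):

# in the netvm
iperf -s

# in the appvm
iperf -c 10.137.4.1 -t 30
iperf -c 10.137.4.1 -t 30 -P 4    # same test with four parallel flows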

> Do you know if anyone has achieved this?

I don't know.

Ilpo Järvinen

Feb 1, 2018, 3:34:31 AM
to Jarle Thorsen, qubes-users
On Wed, 31 Jan 2018, Jarle Thorsen wrote:

I'd next try to tweak the txqueuelen (at the netvm side):
sudo ifconfig vifxx.0 txqueuelen xxxx

The appvm side (eth0) seems to have 1000, but the other side (vifxx.0) has
only 64 by default, which seems a bit small for high-performance transfers.
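
(The same can be done with the ip tool if you prefer; the interface name is
an example:)

sudo ip link set dev vif12.0 txqueuelen 1000
ip link show dev vif12.0    # check the current value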



--
i.

Jarle Thorsen

Feb 1, 2018, 3:58:07 AM
to qubes-users
Ilpo Järvinen:
> I'd next try to tweak the txqueuelen (at the netvm side):
> sudo ifconfig vifxx.0 txqueuelen xxxx
>
> The appvm side (eth0) seems to have 1000, but the other side (vifxx.0) has
> only 64 by default, which seems a bit small for high-performance transfers.

Thanks a lot for your help so far!

Unfortunately setting txqueuelen to 1000 did not make any difference...

Ilpo Järvinen

Feb 1, 2018, 4:12:55 AM
to Jarle Thorsen, qubes-users
I found this:
https://wiki.xenproject.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing

It might be that roughly 4Gbps is what you can get for cross-vm traffic with
one flow (but those results are quite old).

I guess there are by default two queues per vif (based on the kernel
thread naming), which would explain why going beyond 2 VCPUs didn't help
any. ...Now we just need to figure out how to configure the number of
queues to see if that has some impact on performance (ethtool seems
unable to do that).
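
One possible knob (just a guess on my part, not verified on Qubes) is the
max_queues module parameter of the Xen network drivers; whether these paths
exist depends on how the kernel was built:

# backend side (netvm)
cat /sys/module/xen_netback/parameters/max_queues
# frontend side (appvm)
cat /sys/module/xen_netfront/parameters/max_queues
# a persistent change would go on the VM kernel command line, e.g.
# xen_netback.max_queues=4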

...And of course you'll need to use more than one flow to utilize all
those queues anyway, but I guess you also ran the inter-vm test with
more than one flow?


--
i.

Jarle Thorsen

Feb 1, 2018, 5:02:16 AM
to qubes-users
Ilpo Järvinen:
Thanks, I'll have a look.

> It might be that roughly 4Gbps is what you can get for cross-vm traffic with
> one flow (but those results are quite old).
>
> I guess there are by default two queues per vif (based on the kernel
> thread naming), which would explain why going beyond 2 VCPUs didn't help
> any. ...Now we just need to figure out how to configure the number of
> queues to see if that has some impact on performance (ethtool seems
> unable to do that).
>
> ...And of course you'll need to use more than one flow to utilize all
> those queues anyway, but I guess you also ran the inter-vm test with
> more than one flow?

Yes, using 2, 4, or 6 flows gives the same result.

Ilpo Järvinen

Feb 1, 2018, 5:06:14 PM
to Jarle Thorsen, qubes-users
Can you try if you get better throughput between a proxy vm and an appvm
using this kind of topology?

sys-net <-> iperf-srv (proxyvm) <-> iperf-cli (appvm)

I could push ~10Gbps with one flow and slightly more with more parallel
flows between them. But between sys-net and iperf-srv vms I've a lower
cap like you for some reason.

I also tried similar parameters for the kernel, but the 10Gbps result
was still achievable.
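
Setting up that topology is just a matter of pointing the test appvm at the
proxyvm (a sketch; the exact qvm-prefs syntax depends on the Qubes release,
and the VM names are examples; the iperf-srv VM of course has to be one that
provides networking, i.e. a proxyvm):

qvm-prefs iperf-cli netvm iperf-srv      # Qubes 4.0 style
qvm-prefs -s iperf-cli netvm iperf-srv   # Qubes 3.2 style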


--
i.

Jarle Thorsen

Feb 2, 2018, 3:19:10 AM
to qubes-users
Ilpo Järvinen:
> Can you try if you get better throughput between a proxy vm and an appvm
> using this kind of topology?
>
> sys-net <-> iperf-srv (proxyvm) <-> iperf-cli (appvm)
>
> I could push ~10Gbps with one flow and slightly more with more parallel
> flows between them.

Great find Ilpo! Did you have to do some iptables-trickery for this testing? I have ping working between proxy and appvm, but iperf and nc both tell me no route to host?

PROXY-VM:

$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.137.4.34  netmask 255.255.255.255  broadcast 10.255.255.255
        inet6 fe80::216:3eff:fe5e:6c20  prefixlen 64  scopeid 0x20<link>
        ether 00:16:3e:5e:6c:20  txqueuelen 1000  (Ethernet)
        RX packets 86  bytes 6193 (6.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 162  bytes 14313 (13.9 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1  (Local Loopback)
        RX packets 36  bytes 2016 (1.9 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 36  bytes 2016 (1.9 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

vif37.0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.137.6.1  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::fcff:ffff:feff:ffff  prefixlen 64  scopeid 0x20<link>
        ether fe:ff:ff:ff:ff:ff  txqueuelen 32  (Ethernet)
        RX packets 91  bytes 6489 (6.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 86  bytes 7993 (7.8 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

$ sudo iptables -L
Chain INPUT (policy DROP)
target     prot opt source               destination
DROP       udp  --  anywhere             anywhere             udp dpt:bootpc
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     icmp --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere
REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited

Chain FORWARD (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere
DROP       all  --  anywhere             anywhere
ACCEPT     udp  --  10.137.6.35          gateway              udp dpt:domain
ACCEPT     udp  --  10.137.6.35          10.137.4.254         udp dpt:domain
ACCEPT     tcp  --  10.137.6.35          gateway              tcp dpt:domain
ACCEPT     tcp  --  10.137.6.35          10.137.4.254         tcp dpt:domain
ACCEPT     icmp --  10.137.6.35          anywhere
DROP       tcp  --  10.137.6.35          10.137.255.254       tcp dpt:us-cli
ACCEPT     all  --  10.137.6.35          anywhere

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.137.4.1      0.0.0.0         UG    0      0        0 eth0
10.137.4.1      0.0.0.0         255.255.255.255 UH    0      0        0 eth0
10.137.6.35     0.0.0.0         255.255.255.255 UH    32715  0        0 vif37.0


APP-VM:
$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.137.6.35  netmask 255.255.255.255  broadcast 10.255.255.255
        inet6 fe80::216:3eff:fe5e:6c21  prefixlen 64  scopeid 0x20<link>
        ether 00:16:3e:5e:6c:21  txqueuelen 1000  (Ethernet)
        RX packets 86  bytes 6789 (6.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 91  bytes 7763 (7.5 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
DROP       udp  --  anywhere             anywhere             udp dpt:bootpc
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     icmp --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere
REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited
DROP       all  --  anywhere             anywhere

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
DROP       all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere
DROP       all  --  anywhere             anywhere

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.137.6.1      0.0.0.0         UG    0      0        0 eth0
10.137.6.1      0.0.0.0         255.255.255.255 UH    0      0        0 eth0

Ilpo Järvinen

Feb 2, 2018, 3:33:45 AM
to Jarle Thorsen, qubes-users
On Fri, 2 Feb 2018, Jarle Thorsen wrote:

> Ilpo Järvinen:
> > Can you try if you get better throughput between a proxy vm and an appvm
> > using this kind of topology?
> >
> > sys-net <-> iperf-srv (proxyvm) <-> iperf-cli (appvm)
> >
> > I could push ~10Gbps with one flow and slightly more with more parallel
> > flows between them.
>
> Great find Ilpo! Did you have to do some iptables-trickery for this
> testing? I have ping working between proxy and appvm, but iperf and nc
> both tell me no route to host?

Yes, I did (it replies with an ICMP error by default). You'll need to fill
in the vif IP address in this command:

sudo iptables -I INPUT 1 -i vif+ -p tcp -d ***proxyvmIPhere*** -j ACCEPT
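
With the addresses from the outputs above, that would be something like this
(plus starting the iperf server on the proxy vm):

# in the proxy vm -- 10.137.6.1 is its vif-side address from the ifconfig
# output earlier in the thread
sudo iptables -I INPUT 1 -i vif+ -p tcp -d 10.137.6.1 -j ACCEPT
iperf -s

# then from the appvm
iperf -c 10.137.6.1 -t 30 -P 4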



--
i.

Jarle Thorsen

Feb 2, 2018, 3:58:11 AM
to qubes-users
Ilpo Järvinen:
> > Great find Ilpo! Did you have to do some iptables-trickery for this
> > testing? I have ping working between proxy and appvm, but iperf and nc
> > both tell me no route to host?
>
> Yes, I did (it replies with an ICMP error by default). You'll need to fill
> in the vif IP address in this command:
>
> sudo iptables -I INPUT 1 -i vif+ -p tcp -d ***proxyvmIPhere*** -j ACCEPT

WOW! I can easily push 19Gbit/s from appvm to proxyvm (after turning on SG on appvm), but from proxyvm to netvm I can only push 3-4Gbit/s. There HAS to be room for some improvement here?
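
One thing probably worth re-checking on that slower proxyvm-to-netvm hop is
the same SG/offload state as earlier in the thread, since that hop has its
own vif/eth0 pair (the vif name is an example):

# in the netvm, on the proxyvm's backend interface
sudo ethtool -k vifxx.0 | grep -E 'scatter-gather|segmentation'
# in the proxyvm, on its uplink eth0
sudo ethtool -k eth0 | grep -E 'scatter-gather|segmentation'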