Intel 82599ES - performance issue


cibe...@gmail.com

Jan 6, 2016, 4:59:08 PM
to Snabb Switch development
Hi,

I need some advice to better understand some performance problems I'm facing with my Intel 82599ES 10G card.

The setup is very straightforward:
1. I have a server with 2 Intel 82599ES NICs

2. snabb is installed and configured according to https://github.com/SnabbCo/snabbswitch/blob/master/src/program/snabbnfv/doc/getting-started.md

3. two "snabb snabbnfv traffic" processes are running

4. no NUMA allocation is done

5. an Ubuntu VM with 3 interfaces is started: 1 for management and 2 connected to the two snabb processes

6. basic connectivity is there: traffic received on one interface can be forwarded by Ubuntu to the second

7. DPDK 2.0.0 is installed on the Ubuntu VM

8. the test tool "testpmd" is used to verify the performance we can get from the virtual instance connected to the snabb switch

9. in the testpmd configuration I have assigned different cores to different ports, like:

testpmd> set nbcore 2
Number of forwarding cores set to 2
testpmd> show config fwd
io packet forwarding - ports=2 - cores=2 - streams=2 - NUMA support disabled, MP over anonymous pages disabled
Logical Core 2 (socket 0) forwards packets on 1 streams:
  RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
Logical Core 3 (socket 0) forwards packets on 1 streams:
  RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00

10. I have a traffic generator tool connected to both 10G ports and I'm sending 500 Kpps of traffic.

Now the problem: the "ingress" snabb process is able to process only ~350 Kpps:
link report:
  11,289,284,558 sent on id1_NIC.tx -> id1_Virtio.rx (loss rate: 0%)
           3,454 sent on id1_Virtio.tx -> id1_NIC.rx (loss rate: 0%)
load: time: 1.00s  fps: 345,025   fpGbps: 4.154 fpb: 123 bpp: 1496 sleep: 0   us
load: time: 1.00s  fps: 345,407   fpGbps: 4.159 fpb: 125 bpp: 1496 sleep: 0   us
load: time: 1.00s  fps: 344,867   fpGbps: 4.152 fpb: 125 bpp: 1496 sleep: 0   us
load: time: 1.00s  fps: 344,177   fpGbps: 4.144 fpb: 122 bpp: 1496 sleep: 0   us
load: time: 1.00s  fps: 343,507   fpGbps: 4.136 fpb: 121 bpp: 1496 sleep: 0   us
load: time: 1.00s  fps: 344,604   fpGbps: 4.149 fpb: 124 bpp: 1496 sleep: 0   us
load: time: 1.00s  fps: 331,253   fpGbps: 3.988 fpb: 124 bpp: 1496 sleep: 0   us
load: time: 1.00s  fps: 330,553   fpGbps: 3.980 fpb: 121 bpp: 1496 sleep: 0   us

so, my question is - what should I do to get better performance?
It is said that, in theory, a snabb switch can handle 10G on each core. How should I set up the environment to reach that number?
According to a previous discussion with Luke, there are two possible ways to do it, but both have issues:
1. The VM instance must utilize all available CPU resources. But the current virtio drivers support only a single queue per port, which means we can effectively utilize only 2 CPUs (because we have 2 interfaces). If we want to use more CPUs we need more ports (a more complex setup) or more queues per port (the current driver doesn't support that).
2. Use multiple VMs connected to the same snabb process and segment the "physical" port into multiple VLANs. This also leads to a much more complex setup.

Is there any other way to achieve the full line rate? How is it done in real-life scenarios?

Best regards,
Konstantin

Luke Gorrie

Jan 6, 2016, 5:41:49 PM
to snabb...@googlegroups.com
Howdy!

On 6 January 2016 at 22:59, <cibe...@gmail.com> wrote:
I need some advice to better understand some performance problems I'm facing with my Intel 82599ES 10G card.

Quick initial thoughts:

Low packet rates like 0.3 Mpps sound like something broken somewhere rather than a tuning issue. You might lose half your performance due to unfortunate NUMA/hyperthread assignments, but we should still be talking about millions of packets per second.

Is your IOMMU enabled? If so please try disabling it by booting Linux with "intel_iommu=off" (e.g. in grub config) and see if that helps. We have seen IOMMU support interact badly with the way we take over PCI devices from the kernel (https://github.com/lukego/blog/issues/13) and don't have the full picture there yet.
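(A minimal sketch of the grub change, assuming Ubuntu defaults - file locations vary by distro:)

# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=off"

# regenerate the grub configuration and reboot
sudo update-grub
sudo reboot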

You could try enabling more debug info from the snabbnfv process e.g. the NIC register dumps (-D<seconds> argument).
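(For example - a hypothetical invocation that dumps diagnostics every 5 seconds; the PCI address and paths are placeholders:)

snabb snabbnfv traffic -k 10 -D 5 0000:03:00.0 /path/to/port.cfg /path/to/vm.socket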

Could you relocate your test to the Snabb Lab? Then the test environment would be accessible to other people too and we can take a look directly. This seems to be the easiest way we have for reproducing issues at the moment. https://github.com/SnabbCo/snabbswitch/wiki/Snabb-Lab

For real-life setup/tuning tips check out the Getting Started guide:
https://github.com/SnabbCo/snabbswitch/blob/master/src/program/snabbnfv/doc/getting-started.md

Let us know if you find these tips useful!

Cheers :)
-Luke


cibe...@gmail.com

Jan 7, 2016, 5:44:04 AM
to Snabb Switch development
Hi Luke,

I followed the Getting Started guide, so the IOMMU is disabled.

The Snabb Lab is a good idea... let me send the SSH key and get access to the dev environment.


One question about the setup we discussed before, where we have a single physical 10G NIC and use VLANs to segment that port.
I tried to do that but have some problems.
The port is configured as follows:

return {
  { vlan = 21,
    mac_address = "12:54:12:34:21:01",
    port_id = "p2v21",
  },
  { vlan = 22,
    mac_address = "12:54:12:34:22:01",
    port_id = "p2v22",
  },
}

I run the snabb process like:
  snabb snabbnfv traffic -k 10 -D 0 0000:03:00.0 /root/scripts/snabb_port1.cfg /root/vhost-sockets/vm1.socket

and can see that 2 virtio processes are started.

But I can start only a single VM accessing vm1.socket; if I try to start a second VM, it can't get access to that vm1.socket.

         
-chardev socket,id=char0,path=/root/vhost-sockets/vm1.socket,server \
-netdev type=vhost-user,id=net0,chardev=char0 \
-device virtio-net-pci,netdev=net0,mac=12:54:12:34:21:01 \

So, the question is - how can I run multiple VMs accessing the same snabb process?


Best,
Konstantin

Marcel Wiget

Jan 7, 2016, 7:01:15 AM
to snabb...@googlegroups.com
Use %s as a wildcard for the socket when launching snabb. This will use the port_ids from the config file to generate p2v21.socket and p2v22.socket in /root/vhost-sockets/, which you can then bind with qemu.

snabb snabbnfv traffic -k 10 -D 0 0000:03:00.0 /root/scripts/snabb_port1.cfg /root/vhost-sockets/%s.socket
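For example (a sketch based on the qemu flags quoted earlier in this thread; the rest of each VM's command line stays the same):

# VM 1 binds the socket generated for port_id "p2v21"
-chardev socket,id=char0,path=/root/vhost-sockets/p2v21.socket,server \
-netdev type=vhost-user,id=net0,chardev=char0 \
-device virtio-net-pci,netdev=net0,mac=12:54:12:34:21:01

# VM 2 binds the socket generated for port_id "p2v22"
-chardev socket,id=char0,path=/root/vhost-sockets/p2v22.socket,server \
-netdev type=vhost-user,id=net0,chardev=char0 \
-device virtio-net-pci,netdev=net0,mac=12:54:12:34:22:01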






cibe...@gmail.com

Jan 7, 2016, 3:23:54 PM
to Snabb Switch development
Hello Marcel,

thanks, it really works!
May I ask another question?
the VLAN id that we use in the port configuration files - is it really a VLAN tag id? Does it mean I should use tagged frames to send packets to the VMs?

I'm now sending untagged frames and it actually works somehow, but under high load, like 150 Kpps, only 100 Kpps goes through the VM and is seen on the "egress" snabb process.
Also the VM shows some strange errors:

root@ubuntu-1:~# dmesg  |tail
[ 1005.271388] eth0: bad gso type 139.
[ 1005.276708] eth0: bad gso type 139.
[ 1005.284418] eth0: bad gso type 139.
[ 1005.325198] eth0: bad gso type 139.
[ 1005.329673] eth0: bad gso type 139.
[ 1005.333645] eth0: bad gso type 139.
[ 1005.388844] eth0: bad gso type 139.
[ 1005.391828] eth0: bad gso type 139.
[ 1005.492153] eth0: bad gso type 139.
[ 1005.494391] eth0: bad gso type 139.

and sometimes:

root@ubuntu-1:~# dmesg |tail
[ 1169.382930] eth0: bad gso type 22.
[ 1169.782999] skbuff: bad partial csum: csum=35649/29830 len=124
[ 1194.483068] eth0: bad gso type 22.
[ 1194.883119] skbuff: bad partial csum: csum=35649/29830 len=124
[ 1200.985208] eth0: bad gso type 22.
[ 1201.025183] skbuff: bad partial csum: csum=35649/29830 len=124
[ 1203.915220] eth0: bad gso type 22.
[ 1203.955103] skbuff: bad partial csum: csum=35649/29830 len=124
[ 1205.805172] eth0: bad gso type 22.
[ 1205.845151] skbuff: bad partial csum: csum=35649/29830 len=124


I don't think I saw such errors when I configured "raw" ports, without VLAN segmentation.


I also saw in a previous thread that Luke wrote:


On 23 June 2014 14:12, Luke Gorrie <lu...@snabb.co> wrote:
 
...
 
FYI the numbers that I see on chur (2GHz Xeon) right now, for traffic being looped through a VM and back onto the network, is:

1514 byte packets: 9.82 Gbps
256 byte packets: 6.50 Gbps
64 byte packets: 1.68 Gbps

and we are focused on improving these scores over the coming week+.

Cheers,
-Luke


And I'm really interested - what did that test environment look like?



On Thursday, January 7, 2016 at 1:01:15 PM UTC+1, Marcel Wiget wrote:

Luke Gorrie

Jan 8, 2016, 1:51:32 AM
to snabb...@googlegroups.com
On 7 January 2016 at 21:23, <cibe...@gmail.com> wrote:
May I ask another question?
the VLAN id that we use in the port configuration files - is it really a VLAN tag id? Does it mean I should use tagged frames to send packets to the VMs?

Yes. This will be mapped directly onto an 802.1Q VLAN tag that will be automatically inserted/removed.
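(For illustration - not from the thread - one common way to generate tagged frames from a Linux traffic source is an 802.1Q subinterface; the interface name eth1 is hypothetical:)

# create a subinterface that tags egress traffic with VLAN id 21
ip link add link eth1 name eth1.21 type vlan id 21
ip link set eth1.21 up
# packets sent via eth1.21 now carry the 802.1Q tag that matches the port config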

Interesting. This looks like a bug in the snabbnfv checksum-offload feature, triggered by your specific configuration. It would be interesting to reproduce this in the lab to make a test case -> workaround -> fix.
 
FYI the numbers that I see on chur (2GHz Xeon) right now, for traffic being looped through a VM and back onto the network, is:

1514 byte packets: 9.82 Gbps
256 byte packets: 6.50 Gbps
64 byte packets: 1.68 Gbps

and we are focused on improving these scores over the coming week+.

Cheers,
-Luke


And I'm really interested - what did that test environment look like?

This is based on our dockerized test environment:

I am not sure if we have a write-up of the actual benchmark setup actually, but it is simple: we take two 10G ports that are cabled together, we hook one port up to a load generator (packetblaster) sending <n> byte packets, and we hook the other port up to an Ubuntu VM running the DPDK l2fwd reference application (receiving packets and sending them back to the same port). The benchmark measures the rate at which packets come back out of the VM and onto the network.
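(Illustrative only, not from the thread - a typical DPDK 2.x l2fwd invocation inside the VM might look like this; the core mask, memory-channel count, and build path are assumptions:)

# 2 lcores (mask 0x3), 4 memory channels; forward between both ports (mask 0x3), 1 queue per lcore
sudo ./build/l2fwd -c 0x3 -n 4 -- -p 0x3 -q 1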

This is a full-duplex benchmark. If this benchmark reports a result of 4 Mpps then it really means 8 Mpps processed by the Snabb process (half network->vm and half vm->network).

btw: The fastest speed I have seen reported on Snabb NFV was ~13.5 Mpps of unidirectional traffic from a DPDK-based VM onto the network. If you ever beat this then please send a note :-). I think the only difference between that setup and our reference benchmark is a faster CPU (chur is slow...) and a VM running a load generator rather than an L2 forwarder.

Please take it for a spin on chur and let us know how you go :-)

Cheers,
-Luke


cibe...@gmail.com

Jan 9, 2016, 6:28:15 PM
to Snabb Switch development

Hello Luke,

I have got access to the chur server and could start an Ubuntu server VM with 2 interfaces, connected to NICs "0000:01:00.0" and "0000:01:00.1".
So now I have a question - how can we generate some traffic that goes to that VM and gets forwarded/routed from one interface to the other?
Should I create another 2 VMs, connected to those NICs, and use them as traffic generators? But then I will not be able to bind those VMs to the same sockets...
Any suggestions?

Best.
Konstantin.

 

Luke Gorrie

Jan 11, 2016, 8:34:07 AM
to snabb...@googlegroups.com
Yes:

The NICs on the server are connected with SFP+ Direct Attach cables like this:

01:00.0 <-> 01:00.1
03:00.0 <-> 03:00.1
etc

So in the setup you mention above - VM connected to 01:00.0 and 01:00.1 - your two ports are cabled together and any packet you send on one port you will then receive on the other.

You could alternatively connect the VM to ports 01:00.0 and 03:00.0. Then you could run a load generator on 01:00.1 and/or 03:00.1 and it would send packets to the VM. The load generator could either be our simple packetblaster application or it could be something you run in another VM.

Below is a picture of the setup with two VMs that are connected to each other, not showing the snabbnfv processes and with physical cables drawn as dotted lines.

packetblaster is a simple and efficient load generator. You can generate any amount of traffic on any number of ports using a pcap file as the source. For example you could write "sudo snabb packetblaster replay myfile.cap 01:00.0 03:00.0" and it would generate load on both of those ports. (If you are using very small packets then you should also set numa affinity with "sudo numactl -N 0 snabb packetblaster ...")
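(Not from the thread: the replay source is an ordinary pcap file, so any capture tool can produce one - the interface name eth0 here is hypothetical:)

# capture 1000 packets from a live interface into a pcap suitable for replay
sudo tcpdump -i eth0 -c 1000 -w myfile.cap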

Usage here:

Does that help?

Cheers,
-Luke

[Attached image: diagram of the two-VM setup described above]

cibe...@gmail.com

Jan 11, 2016, 1:58:24 PM
to Snabb Switch development
hi Luke,

got it! thanks for the suggestion.
Let me try that.

Konstantin

cibe...@gmail.com

Jan 11, 2016, 4:43:26 PM
to Snabb Switch development
hi Luke,

I tried to run the packetblaster according to your suggestion, but it fails for some reason:

[ciberkot@chur:~]$ ps -ef | grep kvm
root        544     2  0 13:40 ?        00:00:00 [kvm-irqfd-clean]
root       5514     1 95 21:11 ?        00:27:55 qemu-system-x86_64 -daemonize -drive if=virtio,file=/home/ciberkot/ubuntu-s.qcow2 -M pc -smp 2 --enable-kvm -cpu host -m 1024 -numa node,memdev=mem -object memory-backend-file,id=mem,size=1024M,mem-path=/mnt/huge,share=on -chardev socket,id=char0,path=/home/ciberkot/vhost-sockets/vm-0-0.socket,server -netdev type=vhost-user,id=net0,chardev=char0 -device virtio-net-pci,netdev=net0,mac=52:54:11:00:00:01 -chardev socket,id=char1,path=/home/ciberkot/vhost-sockets/vm-3-0.socket,server -netdev type=vhost-user,id=net1,chardev=char1 -device virtio-net-pci,netdev=net1,mac=52:54:33:00:00:03 -serial telnet:127.0.0.1:10003,server,nowait,nodelay -serial file:./ubuntu-1_.log -netdev tap,id=hostnet3,ifname=tapMGMT,script=no -device e1000,netdev=hostnet3,id=net3,mac=52:54:77:00:00:07,bus=pci.0,addr=0xf -vnc :1
root       5522     2  0 21:11 ?        00:00:00 [kvm-pit/5514]
ciberkot   7840  7761  0 21:40 pts/5    00:00:00 grep kvm

[ciberkot@chur:~]$ ps -ef | grep traf
root       5415     1  4 21:10 ?        00:01:21 snabb snabbnfv traffic -k 10 -D 0 0000:01:00.0 ./port-0-0.cfg ./vhost-sockets/vm-0-0.socket
root       5420     1  4 21:11 ?        00:01:19 snabb snabbnfv traffic -k 10 -D 0 0000:03:00.0 ./port-3-0.cfg ./vhost-sockets/vm-3-0.socket
ciberkot   7842  7761  0 21:40 pts/5    00:00:00 grep traf

[ciberkot@chur:~]$ sudo snabb packetblaster replay test1.pcap 01:00.0 03:00.0
failed to lock /sys/bus/pci/devices/0000:01:00.0/resource0
lib/hardware/pci.lua:114: assertion failed!
stack traceback:
        core/main.lua:116: in function <core/main.lua:114>
        [C]: in function 'assert'
        lib/hardware/pci.lua:114: in function 'map_pci_memory'
        apps/intel/intel10g.lua:89: in function 'open'
        apps/intel/loadgen.lua:20: in function 'new'
        core/app.lua:165: in function <core/app.lua:162>
        core/app.lua:197: in function 'apply_config_actions'
        core/app.lua:110: in function 'configure'
        program/packetblaster/packetblaster.lua:51: in function 'run'
        core/main.lua:56: in function <core/main.lua:32>
        [C]: in function 'xpcall'
        core/main.lua:121: in main chunk
        [C]: at 0x0044f740
        [C]: in function 'pcall'
        core/startup.lua:1: in main chunk
        [C]: in function 'require'
        [string "require "core.startup""]:1: in main chunk





On Monday, January 11, 2016 at 2:34:07 PM UTC+1, Luke Gorrie wrote:

Luke Gorrie

Jan 12, 2016, 9:57:13 AM
to snabb...@googlegroups.com
Howdy,

On 11 January 2016 at 22:43, <cibe...@gmail.com> wrote:
hi Luke,

I tried to run the packetblaster according to your suggestion, but it fails for some reason:
[ciberkot@chur:~]$ sudo snabb packetblaster replay test1.pcap 01:00.0 03:00.0
failed to lock /sys/bus/pci/devices/0000:01:00.0/resource0
lib/hardware/pci.lua:114: assertion failed!

This "failed to lock" error means that some other Snabb process is using that PCI device.

Such is life with a shared machine :-). Can you try some other PCI addresses? You can also hop onto the #lab Slack chat to ask for help and find out about other tests that people are running on that machine.
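(Not from the thread, but a quick way to see which process holds the lock, assuming lsof is installed:)

sudo lsof /sys/bus/pci/devices/0000:01:00.0/resource0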

I can also install more 82599 ports into chur if that would help, just let me know.

(Plan is to bring more servers online this week and that will alleviate the load on chur.)


cibe...@gmail.com

Jan 12, 2016, 4:05:20 PM
to Snabb Switch development
hi Luke,

it seems I'm a bit lost...
those PCI addresses are used by "my" VM, and I thought I would be able to push some data into one of those interfaces and the VM would see it and process it somehow.
but it looks like I'm completely wrong, sorry...
it actually means I should use PCI 01:00.1 and 03:00.1 and push the traffic through the interconnecting cable to the interfaces that are connected to my VM... right?

Konstantin

On Tuesday, January 12, 2016 at 3:57:13 PM UTC+1, Luke Gorrie wrote:

cibe...@gmail.com

Jan 12, 2016, 4:23:07 PM
to Snabb Switch development
hi Luke,

I got it!

the result is something like this:
[root@chur:/home/ciberkot]# tail -f nohup1.out
load: time: 1.00s  fps: 854,090   fpGbps: 0.376 fpb: 6   bpp: 42   sleep: 0   us
load: time: 1.00s  fps: 855,193   fpGbps: 0.376 fpb: 6   bpp: 42   sleep: 0   us
load: time: 1.00s  fps: 855,802   fpGbps: 0.377 fpb: 6   bpp: 42   sleep: 2   us
load: time: 1.00s  fps: 856,801   fpGbps: 0.377 fpb: 6   bpp: 42   sleep: 0   us
link report:
                  0 sent on id3_NIC.tx -> id3_Virtio.rx (loss rate: 0%)
         19,541,020 sent on id3_Virtio.tx -> id3_NIC.rx (loss rate: 0%)
load: time: 1.00s  fps: 857,460   fpGbps: 0.377 fpb: 6   bpp: 42   sleep: 0   us
load: time: 1.00s  fps: 855,061   fpGbps: 0.376 fpb: 6   bpp: 42   sleep: 1   us
load: time: 1.00s  fps: 855,151   fpGbps: 0.376 fpb: 6   bpp: 42   sleep: 1   us
load: time: 1.00s  fps: 854,459   fpGbps: 0.376 fpb: 6   bpp: 42   sleep: 2   us
load: time: 1.00s  fps: 855,459   fpGbps: 0.376 fpb: 6   bpp: 42   sleep: 0   us
load: time: 1.00s  fps: 854,774   fpGbps: 0.376 fpb: 6   bpp: 42   sleep: 0   us
load: time: 1.00s  fps: 850,750   fpGbps: 0.374 fpb: 6   bpp: 42   sleep: 0   us



the test tool output looks like this:
[root@chur:/home/ciberkot]# snabb packetblaster replay test1.pcap 01:00.1
Transmissions (last 1 sec):
apps report:
nic1 0000:01:00.1    TXDGPC (TX packets)     12,241,923      GOTCL (TX octets)       783,374,592
Transmissions (last 1 sec):
apps report:
nic1 0000:01:00.1    TXDGPC (TX packets)     14,880,434      GOTCL (TX octets)       952,337,216
Transmissions (last 1 sec):
apps report:
nic1 0000:01:00.1    TXDGPC (TX packets)     14,880,852      GOTCL (TX octets)       952,371,648






On Tuesday, January 12, 2016 at 3:57:13 PM UTC+1, Luke Gorrie wrote:

Luke Gorrie

Jan 13, 2016, 3:46:46 PM
to snabb...@googlegroups.com
On 12 January 2016 at 22:23, <cibe...@gmail.com> wrote:
I got it!

:-)

Great work!

