TRex issue - Error disabling MSI-X interrupts


bmic...@redhat.com

Apr 3, 2017, 10:24:43 AM
to TRex Traffic Generator

I'm relatively new to using TRex. I've had success running TRex on RHEL 7.3 on a Haswell-based server. I've now moved to a Broadwell-based system and am seeing an error during TRex/DPDK initialization.

I'm seeing what appears to be an error disabling interrupts; TRex then exits with the error shown below. I start with 32 1G hugepages, and after the error all pages are consumed and not given back to the system.
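
To confirm whether the hugepages are really still held after the failed start, the kernel's counters can be read directly. The following is a minimal sketch, not part of TRex: the helper name hugepage_counters is made up for illustration, and /dev/hugepages is assumed to be the hugetlbfs mount point (adjust if yours differs). Files left in that directory after TRex has exited would be one possible explanation for pages not returning to the pool.

```python
import os
import re


def hugepage_counters(meminfo_text):
    """Return the HugePages_* counters found in /proc/meminfo-style text."""
    pairs = re.findall(r"^(HugePages_\w+):\s+(\d+)", meminfo_text, re.MULTILINE)
    return {name: int(value) for name, value in pairs}


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            for name, value in hugepage_counters(f.read()).items():
                print(f"{name}: {value}")
        # Assumed mount point; files left here keep their pages allocated.
        print("files in /dev/hugepages:", sorted(os.listdir("/dev/hugepages")))
    except OSError as exc:
        print(f"could not inspect hugepages: {exc}")
```

Running it once before starting TRex and once after the failure shows whether HugePages_Free actually dropped and stayed down.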

Here's the start up error:

#LD_LIBRARY_PATH=.:/opt/trex-core/external_libs/ibverbs ./_t-rex-64 -i -c 10 --checksum-offload
Starting TRex v2.22 please wait ...
set driver name net_i40e
zmq publisher at: tcp://*:4500
Number of ports found: 2
EAL: Error disabling MSI-X interrupts for fd 61
EAL: Error disabling MSI-X interrupts for fd 61
EAL: Error disabling MSI-X interrupts for fd 61
EAL: Error disabling MSI-X interrupts for fd 61
EAL: Error disabling MSI-X interrupts for fd 61
EAL: Error disabling MSI-X interrupts for fd 61
EAL: Error disabling MSI-X interrupts for fd 61
EAL: Error disabling MSI-X interrupts for fd 61
EAL: Error disabling MSI-X interrupts for fd 61
EAL: Error - exiting with code: 1
Cause: rte_eth_dev_start: err=-1, port=0

I do have two ports bound:

# /usr/share/dpdk/tools/dpdk-devbind.py --status
Network devices using DPDK-compatible driver
============================================
0000:04:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' drv=vfio-pci unused=i40e
0000:06:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' drv=vfio-pci unused=i40e
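
When the binding changes several times during debugging, it can help to turn the dpdk-devbind.py --status listing into something that is easy to compare between runs. This is a rough sketch that only understands lines shaped like the two shown above; parse_devbind_status is an illustrative name, not a DPDK or TRex API, and the sample text is copied from this post.

```python
import re

# Matches lines such as:
# 0000:04:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' drv=vfio-pci unused=i40e
LINE_RE = re.compile(
    r"^(?P<addr>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s+"
    r"'(?P<name>[^']*)'\s+(?P<rest>.*)$"
)


def parse_devbind_status(text):
    """Map each PCI address to its device name, bound driver and unused drivers."""
    devices = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # section headings, separators, blank lines
        fields = dict(kv.split("=", 1) for kv in match.group("rest").split() if "=" in kv)
        unused = fields.get("unused", "")
        devices[match.group("addr")] = {
            "name": match.group("name"),
            "drv": fields.get("drv"),
            "unused": unused.split(",") if unused else [],
        }
    return devices


SAMPLE = """\
Network devices using DPDK-compatible driver
============================================
0000:04:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' drv=vfio-pci unused=i40e
0000:06:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' drv=vfio-pci unused=i40e
"""

if __name__ == "__main__":
    for addr, info in parse_devbind_status(SAMPLE).items():
        print(addr, info["drv"], info["unused"])
```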

Kernel command line:

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-514.10.2.el7.x86_64 root=/dev/mapper/rhel_perf138-root ro crashkernel=auto rd.lvm.lv=rhel_perf138/root rd.lvm.lv=rhel_perf138/swap rhgb quiet LANG=en_US.UTF-8 isolcpus=1-47 intel_iommu=on iommu=pt default_hugepagesz=1G hugepagesz=1G hugepages=32 intel_pstate=disable nohz=on nohz_full=1-47 rcu_nocbs=1-47 intel_pstate=disable nosoftlockup


# cat /etc/trex_cfg.yaml
- port_limit : 2
  cpu_mask_offset : 8
  version : 2
  #List of interfaces. Change to suit your setup. Use ./dpdk_setup_ports.py -s to see available options
  interfaces : ["06:00.0","04:00.0"]
  port_info : # Port IPs. Change to suit your needs. In case of loopback, you can leave as is.
    - ip : 1.1.1.1
      default_gw : 2.2.2.2
    - ip : 2.2.2.2
      default_gw : 1.1.1.1

  platform:
    master_thread_id : 1
    rx_thread_id : 3
    dual_if:
      - socket : 1
        threads : [2,4,6,8,10,12,14,16,18,20]

If anyone has an idea how to get past this issue, or on how to collect additional information, I would really appreciate the help.

Thank you,
Bill

hanoh haim

Apr 3, 2017, 10:58:45 AM
to bmic...@redhat.com, TRex Traffic Generator
I would try uio instead of vfio-pci.
Try unbinding the ports from vfio-pci, then build and install the kernel module:


$ cd ./ko/src
$ make
$ make install

Then run TRex again. It should load the igb_uio driver.

Hanoh




--
You received this message because you are subscribed to the Google Groups "TRex Traffic Generator" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trex-tgn+unsubscribe@googlegroups.com.
To post to this group, send email to trex...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/trex-tgn/a2b15077-a01e-453e-b961-8a7f6683a874%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.




bmic...@redhat.com

Apr 3, 2017, 1:43:31 PM
to TRex Traffic Generator, bmic...@redhat.com
Thank you for the reply.

I could not get uio to bind to the XL710 ports, but I did successfully build igb_uio.ko. It bound successfully, but starting TRex still produces an error:

#LD_LIBRARY_PATH=.:/opt/trex-core/external_libs/ibverbs ./_t-rex-64 -i -c 10 --checksum-offload
Starting TRex v2.13 please wait ...
zmq publisher at: tcp://*:4500
Number of ports found: 2
set driver name rte_i40e_pmd
EAL: Error - exiting with code: 1
Cause: rte_eth_dev_start: err=-1, port=0

# /usr/share/dpdk/tools/dpdk-devbind.py --status

Network devices using DPDK-compatible driver
============================================
0000:04:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' drv=igb_uio unused=i40e,uio_pci_generic
0000:06:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' drv=igb_uio unused=i40e,uio_pci_generic

hanoh haim

Apr 3, 2017, 1:47:32 PM
to TRex Traffic Generator, bmic...@redhat.com
Please send the output of this command

./t-rex-64-debug -i -v 7




bmic...@redhat.com

Apr 3, 2017, 1:50:23 PM
to TRex Traffic Generator, bmic...@redhat.com
Here is the debug log with igb_uio still bound to the interfaces (as opposed to vfio-pci):

#./t-rex-64-debug -i -v 7
Creating huge node
Starting Scapy server.... Scapy server is started

Starting TRex v2.13 please wait ...
Using configuration file /etc/trex_cfg.yaml
port limit : 2
port_bandwidth_gb : 10
if_mask : None
thread_per_dual_if : 1
if : 06:00.0, 04:00.0,
enable_zmq_pub : 1
zmq_pub_port : 4500
m_zmq_rpc_port : 4501
src : 00:00:00:00:00:00
dest : 00:00:00:00:00:00
src : 00:00:00:00:00:00
dest : 00:00:00:00:00:00
memory per 2x10G ports
MBUF_64 : 16380
MBUF_128 : 8190
MBUF_256 : 8190
MBUF_512 : 8190
MBUF_1024 : 8190
MBUF_2048 : 4095
MBUF_4096 : 128
MBUF_9K : 512
TRAFFIC_MBUF_64 : 65520
TRAFFIC_MBUF_128 : 32760
TRAFFIC_MBUF_256 : 8190
TRAFFIC_MBUF_512 : 8190
TRAFFIC_MBUF_1024 : 8190
TRAFFIC_MBUF_2048 : 65520
TRAFFIC_MBUF_4096 : 128
TRAFFIC_MBUF_9K : 512
MBUF_DP_FLOWS : 524288
MBUF_GLOBAL_FLOWS : 5120
master thread : 1
rx thread : 3
dual_if : 0
socket : 1
[ 2 4 6 8 10 12 14 16 18 20 ]
CTimerWheelYamlInfo does not exist
flags : 8010f00
write_file : 0
verbose : 7
realtime : 1
flip : 0
cores : 1
single core : 0
flow-flip : 0
no clean close : 0
zmq_publish : 1
vlan_enable : 0
client_cfg : 0
mbuf_cache_disable : 0
vm mode : 0
cfg file :
mac file :
out file :
client cfg file :
duration : 0
factor : 1
mbuf_factor : 1
latency : 0 pkt/sec
zmq_port : 4500
telnet_port : 4501
expected_ports : 2
tw_bucket_usec : 20.000000 usec
tw_buckets : 1024 usec
tw_levels : 3 usec
port : 0 dst:00:00:00:00:00:00 src:00:00:00:00:00:00
port : 1 dst:00:00:00:00:00:00 src:00:00:00:00:00:00
port : 2 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 3 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 4 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 5 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 6 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 7 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 8 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 9 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 10 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 11 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 12 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 13 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 14 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
port : 15 dst:00:00:00:01:00:00 src:00:00:00:00:00:00
Total Memory :
MBUF_64 : 81900
MBUF_128 : 40950
MBUF_256 : 16380
MBUF_512 : 16380
MBUF_1024 : 16380
MBUF_2048 : 69615
MBUF_4096 : 256
MBUF_9K : 1024
MBUF_DP_FLOWS : 524288
MBUF_GLOBAL_FLOWS : 5120
get_each_core_dp_flows : 524288
Total memory : 104.00 Mbytes
core_mask e
sockets : 1
active sockets : 1
ports_sockets : 1
1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
phy | virt
2 1
args
xx
-c
0xe
-n
4
--log-level
8
--master-lcore
1
-w
06:00.0
-w
04:00.0
EAL: Detected 48 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:04:00.0 on NUMA socket 0
EAL: probe driver: 8086:1583 rte_i40e_pmd
PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.05 eetrack 800028a6
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: probe driver: 8086:1583 rte_i40e_pmd
PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.05 eetrack 800028a6
TRex cfg port id: 0 <-> DPDK port id: 1
TRex cfg port id: 1 <-> DPDK port id: 0
zmq publisher at: tcp://*:4500
Number of ports found: 2


if_index : 0
driver name : rte_i40e_pmd
min_rx_bufsize : 1024
max_rx_pktlen : 9728
max_rx_queues : 320
max_tx_queues : 320
max_mac_addrs : 64
rx_offload_capa : 2f
tx_offload_capa : 1bf
set driver name rte_i40e_pmd
port 0: FW ver 05.00.05
port 1: FW ver 05.00.05
port 0 desc: Ethernet Controller XL710 for 40GbE QSFP+
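
One detail in the log above that may be worth checking: EAL reports both devices on NUMA socket 0, while the posted /etc/trex_cfg.yaml sets socket : 1 under dual_if. The thread never establishes whether this contributed to the failure, so treat the following only as a way to gather more information. It is a small sketch that reads the numa_node attribute the kernel exposes in sysfs for a PCI device; the helper name numa_node and the cfg_socket variable are illustrative.

```python
from pathlib import Path


def numa_node(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the NUMA node the kernel reports for a PCI device (-1 if unknown)."""
    if pci_addr.count(":") == 1:
        pci_addr = "0000:" + pci_addr  # sysfs uses the full domain:bus:dev.fn form
    return int((Path(sysfs_root) / pci_addr / "numa_node").read_text().strip())


if __name__ == "__main__":
    cfg_socket = 1  # value of "socket" in the posted trex_cfg.yaml
    for addr in ("04:00.0", "06:00.0"):
        try:
            node = numa_node(addr)
        except (OSError, ValueError) as exc:
            print(f"{addr}: could not read numa_node ({exc})")
            continue
        note = "" if node == cfg_socket else "  <- differs from configured socket"
        print(f"{addr}: NUMA node {node}{note}")
```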

hanoh haim

Apr 3, 2017, 1:50:32 PM
to TRex Traffic Generator, bmic...@redhat.com
I think the issue is that you are running the underlying binary (_t-rex-64) directly.
Try running the wrapper script instead:

./t-rex-64 -i -c 4

There is no need for the --checksum-offload option in stateless mode.

Thanks
Hanoh

bmic...@redhat.com

Apr 3, 2017, 1:56:38 PM
to TRex Traffic Generator, bmic...@redhat.com
Thank you for looking - Unfortunately another failure:

# LD_LIBRARY_PATH=.:/opt/trex-core/external_libs/ibverbs ./t-rex-64 -i -c 10
Killing Scapy server... Scapy server is killed
Starting Scapy server.... Scapy server is started

bmic...@redhat.com

Apr 3, 2017, 2:01:47 PM
to TRex Traffic Generator, bmic...@redhat.com
Not sure if it's a clue or not, but as a side note, I am able to load DPDK + Open vSwitch with vfio-pci successfully. This issue seems specific to TRex + DPDK.

bmic...@redhat.com

Apr 3, 2017, 2:02:29 PM
to TRex Traffic Generator

hanoh haim

Apr 3, 2017, 2:03:26 PM
to TRex Traffic Generator, bmic...@redhat.com
Looking again at your trex_cfg, it does not seem OK.

Try creating the trex_cfg file with our script, or use this one:

- port_limit      : 2
  version         : 2
  #List of interfaces. Change according to your setup. Use ./dpdk_setup_ports.py -s to see available options.
  interfaces      : ["04:00.0", "06:00.0"]  #1
  port_info       :  # Port IPs. Change according to your needs. In case of loopback, you can leave as is.
          - ip         : 1.1.1.1
            default_gw : 2.2.2.2
          - ip         : 2.2.2.2
            default_gw : 1.1.1.1


Or generate it with the script:

sudo ./dpdk_setup_ports.py -i

Hanoh



bmic...@redhat.com

Apr 4, 2017, 8:29:06 AM
to TRex Traffic Generator, bmic...@redhat.com
Hello Hanoh:

I've had some success. There was a problem with the YAML file, but it's beyond me what it was; it is the same file I used on the Haswell systems.

Note that the enclosed configuration file did not work for me, but the dpdk_setup_ports.py utility generated a configuration file that does work.

Regardless, I seem to be past it - thank you again for all the support.

- Bill

Yichen Wang

Apr 28, 2017, 7:03:52 PM
to TRex Traffic Generator, bmic...@redhat.com
We are hitting exactly the same issue here. The NIC we are using is "Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)".

The magic is to put "port_bandwidth_gb: 40" in your /etc/trex_cfg.yaml. We spent quite some time figuring it out, and it is not really documented. We really need to have this documented so people can save some time in the future.
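
For anyone scripting their setup, here is a small sketch of adding that key to an existing config without pulling in a YAML library. It assumes the file starts its first entry with "- port_limit" (as the configs posted in this thread do) and places port_bandwidth_gb as a sibling key right after it; ensure_port_bandwidth is an illustrative helper, not something shipped with TRex.

```python
import re


def ensure_port_bandwidth(cfg_text, gb=40):
    """Return cfg_text with a port_bandwidth_gb key, adding one if absent."""
    if re.search(r"^\s*port_bandwidth_gb\s*:", cfg_text, flags=re.MULTILINE):
        return cfg_text  # already set; leave the existing value alone
    out = []
    inserted = False
    for line in cfg_text.splitlines():
        out.append(line)
        match = re.match(r"^(\s*)-\s+port_limit\s*:", line)
        if match and not inserted:
            # Sibling keys of "- port_limit" sit two columns past the dash.
            indent = match.group(1) + "  "
            out.append(f"{indent}port_bandwidth_gb : {gb}")
            inserted = True
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        "- port_limit : 2\n"
        "  version : 2\n"
        '  interfaces : ["04:00.0", "06:00.0"]\n'
    )
    print(ensure_port_bandwidth(example), end="")
```

If port_limit is not the first key of the entry, the function returns the text unchanged apart from a normalized trailing newline, so check the result before writing it back to /etc/trex_cfg.yaml.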

Thanks very much!

Regards,
Yichen

hanoh haim

Apr 29, 2017, 1:03:41 PM
to TRex Traffic Generator, Yichen Wang, bmic...@redhat.com
Hi, 
Thanks for looking into this.
What this option does is add more mbufs.
I will try to reproduce the problem with fewer mbufs.

Thanks,
Hanoh
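
For context on the mbuf counts, the debug log posted earlier in this thread (taken with port_bandwidth_gb : 10) lists a per-pool "MBUF_*" count, a "TRAFFIC_MBUF_*" count, and a "Total Memory" section. The short check below, using only numbers copied from that log, shows that each reported total is the sum of the two. How TRex scales these pools when port_bandwidth_gb is raised is not shown in the thread, so this sketch does not attempt to model it.

```python
# Values copied from the "memory per 2x10G ports" section of the debug log.
base = {"64": 16380, "128": 8190, "256": 8190, "512": 8190,
        "1024": 8190, "2048": 4095, "4096": 128, "9K": 512}
traffic = {"64": 65520, "128": 32760, "256": 8190, "512": 8190,
           "1024": 8190, "2048": 65520, "4096": 128, "9K": 512}
# Values copied from the "Total Memory" section of the same log.
reported_total = {"64": 81900, "128": 40950, "256": 16380, "512": 16380,
                  "1024": 16380, "2048": 69615, "4096": 256, "9K": 1024}

computed = {size: base[size] + traffic[size] for size in base}

for size in base:
    status = "ok" if computed[size] == reported_total[size] else "MISMATCH"
    print(f"MBUF_{size}: {base[size]} + {traffic[size]} = {computed[size]} ({status})")
```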


hanoh haim

Apr 29, 2017, 1:46:29 PM
to TRex Traffic Generator, Yichen Wang, bmic...@redhat.com
Hi, 
In v2.14 (not released yet, but it is the latest on GitHub) we might have fixed this issue, as we moved to more efficient mbuf usage on the receive side.

Thanks,
Hanoh

ido barnea

Apr 29, 2017, 1:53:32 PM
to hanoh haim, TRex Traffic Generator, Yichen Wang, bmic...@redhat.com
Small typo. Should be v2.24 (instead of v2.14)