Issues Running TRex on Azure


mrpa...@gmail.com

Apr 22, 2019, 9:47:21 AM
to TRex Traffic Generator
Right now the environment is a single server.
- Size: Standard F8s_v2 (8 vCPUs, 16 GB memory)
- OS: Ubuntu 16.04 (previous attempts with CentOS 7.5 and Ubuntu 18.04 gave the same results)
- eth0 = management
- eth1 & eth2 are on the same network

I've followed the Azure guide for getting DPDK installed, and TRex does detect the interfaces.

/opt/trex/v2.56$ sudo ./dpdk_setup_ports.py -t
+----+------+--------------+-------------------+---------------------------------------------------------------------+-----------+----------+--------+
| ID | NUMA | PCI | MAC | Name | Driver | Linux IF | Active |
+====+======+==============+===================+=====================================================================+===========+==========+========+
| 0 | 0 | 0002:00:02.0 | 00:0d:3a:53:a0:5f | MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] | mlx4_core | enP2p0s2 | |
+----+------+--------------+-------------------+---------------------------------------------------------------------+-----------+----------+--------+
| 1 | 0 | 0003:00:02.0 | 00:0d:3a:53:a7:e8 | MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] | mlx4_core | enP3p0s2 | |
+----+------+--------------+-------------------+---------------------------------------------------------------------+-----------+----------+--------+

Traffic-01:/opt/trex/v2.56$ cat /etc/trex_cfg.yaml
- version: 2
  interfaces: ['0002:00:02.0', '0003:00:02.0']
  #interfaces: ['eth1', 'eth2']
  port_info:
      - ip: 10.4.7.5
        default_gw: 10.4.7.7
      - ip: 10.4.7.7
        default_gw: 10.4.7.5

  platform:
      master_thread_id: 0
      latency_thread_id: 7
      dual_if:
        - socket: 0
          threads: [1,2,3,4,5,6]

However, when I run TRex I get an error about the missing ofed_info binary.

/opt/trex/v2.56$ sudo ./t-rex-64 -f cap2/dns.yaml
Warning: Mellanox NICs where tested only with RedHat/CentOS 7.4
Correct usage with other Linux distributions is not guaranteed.
OFED /usr/bin/ofed_info is not installed on this setup
ERROR encountered while configuring TRex system
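For what it's worth, the startup check just probes for the ofed_info binary, so a quick pre-flight sanity check can show what TRex will see before launching (a sketch; ofed_info -s is the standard MLNX_OFED version query):

```shell
#!/bin/sh
# Sketch: verify whether the MLNX_OFED userspace stack is visible on PATH,
# which is what TRex's startup check is effectively testing for.
if command -v ofed_info >/dev/null 2>&1; then
    ofed_info -s    # prints the installed MLNX_OFED version string
else
    echo "ofed_info not found: MLNX_OFED userspace is not installed (or not on PATH)"
fi
```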

I've tried to install MLNX_OFED versions 4.4-2.0.7.0 and 4.5-1.0.1.0 with zero success. Has anyone been able to get this working on Azure with DPDK?

TRex does work if I use eth1 & eth2, but I'm not confident I'm getting full performance.
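For reference, the kernel-interface fallback mentioned here only needs Linux interface names in place of the PCI addresses (a sketch based on the commented-out line in the config above; whether the --software flag is also required in this mode is an assumption, not confirmed in this message):

```yaml
- version: 2
  interfaces: ['eth1', 'eth2']   # kernel interface names instead of PCI addresses
  port_info:
      - ip: 10.4.7.5
        default_gw: 10.4.7.7
      - ip: 10.4.7.7
        default_gw: 10.4.7.5
```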

mail.sac...@gmail.com

Sep 13, 2019, 11:18:09 AM
to TRex Traffic Generator
Hi

Bumping this old thread. Were you able to move ahead?

Going by the other threads, I think TRex does not support the netvsc vdev yet (https://groups.google.com/forum/#!searchin/trex-tgn/azure%7Csort:date/trex-tgn/v8WZAeid6PA/gZtWZ5LxCAAJ). This email thread (https://groups.google.com/forum/#!searchin/trex-tgn/azure%7Csort:date/trex-tgn/7Y6akHkBfhc/5EuYN64xAAAJ) tried it, but with AF_PACKET, which obviously has performance limitations.

So if you were able to get the DPDK PMD working with TRex, it would be great to share with the wider audience.

Thanks
Sachin.

hanoh haim

Sep 14, 2019, 12:44:34 PM
to mail.sac...@gmail.com, TRex Traffic Generator
Azure is going to support Mellanox CX-4/5 VFs, which have excellent performance, instead of the old CX-3 driver. I would ask MS how to get early access to these types of instances.

Thanks
Hanoh

--
You received this message because you are subscribed to the Google Groups "TRex Traffic Generator" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trex-tgn+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/trex-tgn/a81a26d3-0a7a-498e-b71d-78139961f7cd%40googlegroups.com.
--
Hanoh
Sent from my iPhone

sad...@ncsu.edu

Sep 26, 2019, 11:42:50 PM
to TRex Traffic Generator
Hi,


I am trying to make TRex work on Azure for performance testing of NVAs. I installed the required DPDK and OFED packages, which seemed to be a requirement for TRex to run, but I have been running into the issue below, and help from the community would be appreciated:

[Centos-7 v2.61]$ sudo ./t-rex-64-debug-o -f cap2/dns.yaml -c 2 -m 1 -d 10
 The ports are bound/configured.
Starting  TRex v2.61 please wait  ...
ERROR in DPDK map
Could not find requested interface b767:00:02.0

If we check the DPDK interfaces, the interface above is listed:

[Centos-7 v2.61]$ sudo ./dpdk_setup_ports.py -s

Network devices using DPDK-compatible driver
============================================
95ac:00:02.0 'MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]' drv=mlx4_core unused=igb_uio,vfio-pci,uio_pci_generic
b767:00:02.0 'MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]' drv=mlx4_core unused=igb_uio,vfio-pci,uio_pci_generic               <------This is the interface

Network devices using kernel driver
===================================
95ac:00:02.0 'MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]' if=eth2 drv=mlx4_core unused=igb_uio,vfio-pci,uio_pci_generic
b767:00:02.0 'MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]' if=eth3 drv=mlx4_core unused=igb_uio,vfio-pci,uio_pci_generic

I created the TRex config file with the help of ./dpdk_setup_ports.py -i; below is the result:

- version: 2
  interfaces: ['95ac:00:02.0', 'b767:00:02.0']
  port_info:
      - ip: 10.0.0.7
        default_gw: 10.0.0.4
      - ip: 10.0.1.4
        default_gw: 10.0.1.5

  platform:
      master_thread_id: 0
      latency_thread_id: 3
      dual_if:
        - socket: 0
          threads: [1,2]


Am I missing anything here? Any help would be appreciated.
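A likely cause worth ruling out: Azure VF PCI addresses like b767:00:02.0 can change across reboots (this is an assumption on my part, not confirmed here), which would leave the addresses in trex_cfg.yaml stale. A quick way to re-check what is currently enumerated:

```shell
#!/bin/sh
# Sketch: list the PCI devices currently visible, so the addresses in
# trex_cfg.yaml can be compared against what is actually enumerated.
# Falls back to sysfs when lspci is unavailable; no match is not an error.
if command -v lspci >/dev/null 2>&1; then
    lspci -D | grep -i mellanox || true
else
    ls /sys/bus/pci/devices/ 2>/dev/null || true
fi
```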

Regards,
Charan

hanoh haim

Sep 27, 2019, 1:54:15 AM
to sad...@ncsu.edu, TRex Traffic Generator
Hi Charan,

I looked into this a few days ago.
Azure has two methods for DPDK:

1. Fail-safe PMD for DPDK: this supports both mlx4 (CX-3) and the new mlx5 (CX-4/5).
2. Linux AF_PACKET.

The problem with #1 is that TRex supports only working directly with the VF (mlx5 driver), and Azure does not support that (they drop specific packets).

I suggest moving to Linux mode (AF_PACKET); it won't have the best performance, but at least it will work.

I will update when the fail-safe PMD is integrated.



Sai Subha Charan Addala

Sep 27, 2019, 2:11:25 AM
to hanoh haim, TRex Traffic Generator
Thanks Hanoh for the response.

It seems like if I get hold of a CX-4/5-capable Azure VM, I can at least send some traffic. Correct me if I am wrong.

Also, do you have an ETA for when the team wants to resolve this for Azure VMs? Just curious about the timeline.

Let me know if there is anything we can help with from Azure's side.

Regards,
Charan
--
Regards,

Sai Subha Charan Addala,
Graduate Student,Computer Networking,
Electrical and Computer Engineering Department,
North Carolina State University,

Hanoch Haim

Sep 27, 2019, 5:36:18 AM
to TRex Traffic Generator
Hi Sai, 

I wasn't clear. 
With DPDK there is another layer that MS built, called fail-safe;
see here

TRex supports only the direct VF or PF of CX-5/CX-4 (the mlx5 driver).
The mlx4 (CX-3) is not maintained and not tested by us. I know the Mellanox guys are using it with TRex, but I would not be able to help with that.

I've tested the direct VF with the new Azure mlx5 support (CX-5 VF), and it's half-baked because the fail-safe layer is not there and only some packets work.
For now it is better to fall back to Linux AF_PACKET; see

It worked for me, but with much lower performance (~1 Mpps).
I'm looking for a way to add the fail-safe support.


thanks
Hanoh

Sai Subha Charan Addala

Sep 27, 2019, 12:02:58 PM
to Hanoch Haim, TRex Traffic Generator
Hi Hanoh,

It makes sense. Also, another point I missed is that we are trying to run TRex on accelerated-networking-enabled Azure VMs. So if the interaction between TRex (a DPDK application) and the mlx5 driver is not handled correctly, then it's an issue.

DPDK accelerated-networking flow in Azure (the 8th or 9th slide shows the driver interaction):
https://www.dpdk.org/wp-content/uploads/sites/35/2018/10/pm-05-DPDK-Azure.pdf                  <---------This presentation gives more context 

Can you update this thread whenever you have added fail-safe support? It will help many engineers who are looking for a traffic generator on the Azure platform.

Regards,
Sai Subha Charan Addala,


Sai Subha Charan Addala

Sep 27, 2019, 12:21:03 PM
to Hanoch Haim, TRex Traffic Generator, nva...@microsoft.com

+ Microsoft NVA engineering team

Regards,
Charan

hanoh haim

Sep 29, 2019, 7:13:51 AM
to Sai Subha Charan Addala, TRex Traffic Generator, nva...@microsoft.com
Hi Sai, 
I've added the code but couldn't test it:
https://github.com/hhaim/trex-core/tree/azure

(added tun/failsafe/vdev)

Could you try this package?

wget --no-cache https://trex-tgn.cisco.com/trex/release/v2.62-azure.tar.gz 

You can add the DPDK extension option (ext_dpdk_opt) to the trex_cfg.yaml:

- port_limit: 2
  version: 2
  interfaces: ['04:00.0', '04:00.1']
  ext_dpdk_opt: ['--vdev=net_vdev_netvsc0,iface=enp4s0f0', '--vdev=net_vdev_netvsc1,iface=enp4s0f1'] 
  port_info:
      - ip: 1.1.1.1
        default_gw: 2.2.2.2
      - ip: 2.2.2.2
        default_gw: 1.1.1.1

  platform:
      master_thread_id: 0
      latency_thread_id: 14
      dual_if:
        - socket: 0
          threads: [1,2,3,4,5,6,7,8,9,10,11,12,13]

Try running it with "-v 7" to look into the DPDK log:

./t-rex-64 -i -v 7 -c 1 --software 

When I tested it on my CentOS 7.4 it did not crash, but it does not work, as it does not find the vdev.

I will be back on Wednesday.

thanks
Hanoh


hanoh haim

Sep 29, 2019, 10:08:10 AM
to Sai Subha Charan Addala, TRex Traffic Generator, nva...@microsoft.com

Hi Sai, 

With this diff (on top of my branch) I've tested the code; the logs look OK, however it does not seem to work.
The code wasn't compiled against the Azure kernel; it was compiled against vanilla CentOS 7.6.

[]> git diff
diff --git a/src/dpdk/drivers/net/tap/tap_autoconf.h b/src/dpdk/drivers/net/tap/tap_autoconf.h
index 79636a92..753485e3 100644
--- a/src/dpdk/drivers/net/tap/tap_autoconf.h
+++ b/src/dpdk/drivers/net/tap/tap_autoconf.h
@@ -1,34 +1,36 @@
-#ifndef HAVE_TC_FLOWER
-#undef HAVE_TC_FLOWER
-#endif
+//#ifndef HAVE_TC_FLOWER
+#define HAVE_TC_FLOWER 1
+//#endif
 
-#ifndef HAVE_TC_VLAN_ID
-#undef HAVE_TC_VLAN_ID
-#endif
+//#ifndef HAVE_TC_VLAN_ID
+//#undef  HAVE_TC_VLAN_ID
+//#endif
 
-#ifndef HAVE_TC_BPF
-#undef HAVE_TC_BPF
-#endif
+//#ifndef HAVE_TC_BPF
+#define HAVE_TC_BPF 1
+//#endif
 
-#ifndef HAVE_TC_BPF_FD
-#undef HAVE_TC_BPF_FD
-#endif
+//HAVE_TC_BPF_FD
+//#ifndef HAVE_TC_BPF_FD
+//#define  HAVE_TC_BPF_FD 1
+//#endif
 
-#ifndef HAVE_TC_ACT_BPF
-#undef HAVE_TC_ACT_BPF
-#endif
+#define HAVE_TC_VLAN_ID 1
+//#ifndef HAVE_TC_ACT_BPF
+//#undef HAVE_TC_ACT_BPF
+//#endif
 
-#ifndef HAVE_TC_ACT_BPF_FD
-#undef HAVE_TC_ACT_BPF_FD
-#endif
+//#ifndef HAVE_TC_ACT_BPF_FD
+//#undef HAVE_TC_ACT_BPF_FD
+//#endif
 
-#include "linux/if_tun.h"
+//#include "linux/if_tun.h"
 
 /* for old kernel*/
-#ifndef IFF_DETACH_QUEUE
-#define IFF_DETACH_QUEUE 0x0400
-#endif
+//#ifndef IFF_DETACH_QUEUE
+//#define IFF_DETACH_QUEUE 0x0400
+//#endif
 
-#ifndef TUNSETQUEUE
-#define TUNSETQUEUE _IOW('T', 217, int)
-#endif
\ No newline at end of file
+//#ifndef TUNSETQUEUE
+//#define TUNSETQUEUE _IOW('T', 217, int)
+//#endif
\ No newline at end of file


DPDK args
 xx  -d  libmlx5-64-debug.so  -d  libmlx4-64-debug.so  -c  0x7  -n  4  --log-level  8  --master-lcore  0  -w  0002:00:02.0  -w  0003:00:02.0  --legacy-mem  --vdev=tvsc0,iface=eth1  --vdev=net_vdev_netvsc1,iface=eth2  
EAL: Detected 16 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No available hugepages reported in hugepages-1048576kB
 EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: PCI device 0002:00:02.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 15b3:1016 net_mlx5
net_mlx5: mlx5.c:1244: mlx5_dev_spawn(): tunnel offloading disabled due to old OFED/rdma-core version
net_mlx5: mlx5.c:1256: mlx5_dev_spawn(): MPLS over GRE/UDP tunnel offloading disabled due to old OFED/rdma-core version or firmware configuration
EAL: PCI device 0003:00:02.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 15b3:1016 net_mlx5
net_mlx5: mlx5.c:1244: mlx5_dev_spawn(): tunnel offloading disabled due to old OFED/rdma-core version
net_mlx5: mlx5.c:1256: mlx5_dev_spawn(): MPLS over GRE/UDP tunnel offloading disabled due to old OFED/rdma-core version or firmware configuration
net_vdev_netvsc: probably using routed NetVSC interface "eth1" (index 3)
net_vdev_netvsc: probably using routed NetVSC interface "eth2" (index 4)

                input : [0002:00:02.0, 0003:00:02.0]
                 dpdk : [0002:00:02.0, 0003:00:02.0]
             pci_scan : [0002:00:02.0, 0003:00:02.0]
                  map : [ 0, 1]
 TRex port mapping
 -----------------
 TRex vport: 0 dpdk_rte_eth: 0
 TRex vport: 1 dpdk_rte_eth: 1
 set driver name net_mlx5
 driver capability  : TCP_UDP_OFFLOAD  TSO
 set dpdk queues mode to DROP_QUE_FILTER
 Number of ports found: 2


if_index : 5
driver name : net_mlx5
min_rx_bufsize : 32
max_rx_pktlen  : 65536
max_rx_queues  : 4096
max_tx_queues  : 4096
max_mac_addrs  : 128
rx_offload_capa : 0x16a0f
tx_offload_capa : 0x802f
rss reta_size   : 512
flow_type_rss   : 0x3afbc
tx_desc_max     : 65535
tx_desc_min     : 0
rx_desc_max     : 65535
rx_desc_min     : 0
zmq publisher at: tcp://*:4500
 rx_data_q_num : 1
 rx_drop_q_num : 1
 rx_dp_q_num   : 0
 rx_que_total : 2
 --  
 rx_desc_num_data_q   : 4096
 rx_desc_num_drop_q   : 64
 rx_desc_num_dp_q     : 0
 total_desc           : 4160
 --  
 tx_desc_num     : 1024
port 0 desc: MT27710 Family [ConnectX-4 Lx Virtual Function]
 rx_qid: 0 (64)
 rx_qid: 1 (4096)
port 1 desc: MT27710 Family [ConnectX-4 Lx Virtual Function]
 rx_qid: 0 (64)

 rx_qid: 1 (4096)
 wait 1 sec .
port : 0
------------
link         :  link : Link Up - speed 4294967295 Mbps - full-duplex
promiscuous  : 0
port : 1
------------
link         :  link : Link Up - speed 4294967295 Mbps - full-duplex
promiscuous  : 0
 number of ports         : 2
 max cores for 2 ports   : 1
 tx queues per port      : 3
 -------------------------------
RX core uses TX queue number 1 on all ports
 core, c-port, c-queue, s-port, s-queue, lat-queue
 ------------------------------------------
 1        0      0       1       0      2  
 -------------------------------
base stack ctor
Legacy stack ctor
add port node

Charan Addala

Sep 30, 2019, 2:27:09 PM
to hanoh haim, Sai Subha Charan Addala, Stephen Hemminger, TRex Traffic Generator, NVA Engineering Team

+Stephen

Stephen asked me to add the rdma-core package and check if that works. Will try that now.
Regards,

Charan

hanoh haim

Oct 2, 2019, 6:24:00 AM
to Charan Addala, Sai Subha Charan Addala, Stephen Hemminger, TRex Traffic Generator, NVA Engineering Team
Hi Charan, 

It seems there is another issue in my code. I wasn't aware that adding the failsafe PMD adds more PMDs into the pool, and there is a need to choose the right one (failsafe).
I don't have a setup right now. Will update.

thanks
Hanoh

hanoh haim

Oct 2, 2019, 3:32:42 PM
to Charan Addala, Sai Subha Charan Addala, Stephen Hemminger, TRex Traffic Generator, NVA Engineering Team
Hi All,

I managed to make it work (with the help of the Cisco CSR team). I need to clean up the code a bit.
There are some limitations that I need to re-validate; for example, the failsafe counters are not accurate.
The performance with CX-4 cards (mlx5 driver) is about ~5 Mpps at 64B with one core; the CPU usage is ~40%, so there is some hardware backpressure.
(In our setup the performance is 10-15 Mpps for one core.)
tui>start -f stl/bench.py -t vm=cached,size=64 -m 3mpps  --force


thanks
Hanoh

Stephen Hemminger

Oct 2, 2019, 3:34:07 PM
to hanoh haim, Charan Addala, Sai Subha Charan Addala, TRex Traffic Generator, NVA Engineering Team
It would be interesting to compare failsafe versus the native netvsc PMD.


hanoh haim

Oct 2, 2019, 3:39:04 PM
to Stephen Hemminger, Charan Addala, Sai Subha Charan Addala, TRex Traffic Generator, NVA Engineering Team
Hi Stephen, 

Just to make it clear: I'm comparing it to our UCS setup with CX-5 Mellanox VF performance.
Would it be enough to choose the net_tap_vscx interfaces instead of the failsafe ones (net_failsafe_xx)? Or do I need to use the native vsc PMD (I would need to add support for it)?

thanks
Hanoh

Stephen Hemminger

Oct 2, 2019, 3:44:04 PM
to hanoh haim, Charan Addala, Sai Subha Charan Addala, TRex Traffic Generator, NVA Engineering Team
Using the native netvsc PMD requires the uio_hv_generic driver and reassigning the netvsc kernel device to uio
(similar to how other native drivers work).



hanoh haim

Oct 2, 2019, 3:56:51 PM
to Stephen Hemminger, Charan Addala, Sai Subha Charan Addala, TRex Traffic Generator, NVA Engineering Team
Hi Stephen, 
I assume there is value in the Mellanox CX-4/5 VF over native netvsc (similar to vmxnet3 for ESXi).
Do you have the expected performance numbers?
BTW, the current numbers (~5 Mpps) are way better than Amazon ENA, as it seems the DPDK ENA performance is worse than Linux's.

thanks
Hanoh


hanoh haim

Oct 3, 2019, 7:10:27 AM
to Stephen Hemminger, mbum...@cisco.com, Charan Addala, Sai Subha Charan Addala, TRex Traffic Generator, NVA Engineering Team
Hi All, 
+ Malcolm from the Cisco CSR team.
While playing with the setups:

The TRex VM is connected to a CSR VM (using routes) over 2 CX-4 cards.
Up to 100 kpps all the traffic gets to the CSR and back without any failures, but higher rates start showing drops.
We noticed that both UDP and TCP traffic with valid checksums were routed to the tap instead of the mlx5 driver (both in STL and ASTF mode).
This creates very heavy IRQ load that kills the management path (ZMQ SUB/PUB and REQ/RES just stop working under heavy traffic).


[]$ ifconfig
dtap0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::20d:3aff:fe55:8d3  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:55:08:d3  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 287399  bytes 140846732 (134.3 MiB)
        TX errors 0  dropped 8787 overruns 0  carrier 0  collisions 0

dtap1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::20d:3aff:fe14:49d9  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:14:49:d9  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 579825  bytes 44897401 (42.8 MiB)
        TX errors 0  dropped 21501 overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.90.52.101  netmask 255.255.255.0  broadcast 10.90.52.255
        inet6 fe80::20d:3aff:fe55:626  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:55:06:26  txqueuelen 1000  (Ethernet)
        RX packets 1071836  bytes 643353805 (613.5 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1006784  bytes 279824718 (266.8 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.90.23.101  netmask 255.255.255.0  broadcast 10.90.23.255
        inet6 fe80::f230:73e0:bbd5:5150  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:55:08:d3  txqueuelen 1000  (Ethernet)
        RX packets 14405974  bytes 1050416181 (1001.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 24  bytes 1896 (1.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.90.130.101  netmask 255.255.255.0  broadcast 10.90.130.255
        inet6 fe80::f3a0:fdd:399d:aa0a  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:14:49:d9  txqueuelen 1000  (Ethernet)
        RX packets 74216776  bytes 4758076483 (4.4 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 21  bytes 1686 (1.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth3: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 9238
        ether 00:0d:3a:55:08:d3  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12  bytes 720 (720.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth4: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 9238
        ether 00:0d:3a:14:49:d9  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12  bytes 720 (720.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 306775  bytes 70217233 (66.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 306775  bytes 70217233 (66.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


]$ ibv_devinfo
hca_id: mlx5_0
        transport:                      InfiniBand (0)
        fw_ver:                         14.24.4558
        node_guid:                      000d:3aff:fe55:08d3
        sys_image_guid:                 bc83:85fa:f7b1:0001
        vendor_id:                      0x02c9
        vendor_part_id:                 4118
        hw_ver:                         0x80
        board_id:                       MSF0010110035
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet

hca_id: mlx5_1
        transport:                      InfiniBand (0)
        fw_ver:                         14.24.4558
        node_guid:                      000d:3aff:fe14:49d9
        sys_image_guid:                 bc83:85fa:f7b1:0001
        vendor_id:                      0x02c9
        vendor_part_id:                 4118
        hw_ver:                         0x80
        board_id:                       MSF0010110035
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet

[]$ ibdev2netdev
mlx5_0 port 1 ==> eth3 (Up)
mlx5_1 port 1 ==> eth4 (Up)


eth3/eth4 are the Mellanox VFs, and there is no traffic on them.

Could you explain how we can make the VF handle the traffic instead of the dtap?


thanks
Hanoh


Stephen Hemminger

Oct 3, 2019, 11:13:41 AM
to hanoh haim, mbum...@cisco.com, Charan Addala, Sai Subha Charan Addala, TRex Traffic Generator, NVA Engineering Team
Statistics reported by kernel devices for DPDK over the mlx device do not include all packets.
The DPDK device driver uses ib-verbs, and InfiniBand packets don't seem to be counted.

The transmit path over DPDK failsafe is always through the mlx device (if it is present and up).
The receive path for DPDK is split: new flows go over the slow path (tap), and existing flows go via the VF (mlx).

You should see almost no interrupts. If you do, it is a sign that the VF device is not being picked up by the DPDK application.
This occurs if the MLX device driver is not part of DPDK, or if you don't whitelist the PCI address of the MLX device.

What does the startup log show?



hanoh haim

Oct 3, 2019, 11:28:35 AM
to Stephen Hemminger, Charan Addala, NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator, mbum...@cisco.com
Hi Stephen,

The Mellanox VF works fine.

Malcolm did more experiments, and this is what we found:

In the STL test (similar to IMIX) there are no new flows; it is about 1000 UDP flows.
In this case, when the packet size is 64, most of the traffic goes through the dtap (TX side), killing the management path.
When the packet size is 68, it goes through the VF: no management issues, and performance is about 3 Mpps.

The RX side is always through eth1/2 (Linux Mellanox).

We are still trying to understand the logic behind the decision to accelerate.

In ASTF (stateful) mode with HTTP/TCP flows with valid checksums (average packet size is high), the route decision is dtap for some reason.

Is there a way to better understand the reasons for the routing decision to dtap, so we can accelerate the traffic?

Thanks
Hanoh

Stephen Hemminger

Oct 3, 2019, 11:43:45 AM
to hanoh haim, Charan Addala, NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator, mbum...@cisco.com

Not sure why failsafe would ever send over dtap. It doesn’t have to at all.

hanoh haim

Oct 3, 2019, 2:44:41 PM
to Stephen Hemminger, Charan Addala, NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator, mbum...@cisco.com

While traffic (HTTP/TCP) is running at ~100 kpps (the diff is 1 sec between the reads):


[]$ sudo ethtool -S eth3 | grep vport_uni
     rx_vport_unicast_packets: 12749447777
     rx_vport_unicast_bytes: 1051709357511
     tx_vport_unicast_packets: 16940152019
     tx_vport_unicast_bytes: 1381023024927
[]$ sudo ethtool -S eth3 | grep vport_uni
     rx_vport_unicast_packets: 12749464757
     rx_vport_unicast_bytes: 1051733185053
     tx_vport_unicast_packets: 16940158111
     tx_vport_unicast_bytes: 1381023598000
[]$ sudo ethtool -S eth4 | grep vport_uni
     rx_vport_unicast_packets: 12883845064
     rx_vport_unicast_bytes: 1053057197008
     tx_vport_unicast_packets: 16831794821
     tx_vport_unicast_bytes: 1382705644149
[]$ sudo ethtool -S eth4 | grep vport_uni
     rx_vport_unicast_packets: 12883848642
     rx_vport_unicast_bytes: 1053057656011
     tx_vport_unicast_packets: 16831817184
     tx_vport_unicast_bytes: 1382735835231

[]$ ifconfig dtap0

dtap0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::20d:3aff:fe55:8d3  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:55:08:d3  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 178467  bytes 13238484 (12.6 MiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[]$ ifconfig dtap0

dtap0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::20d:3aff:fe55:8d3  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:55:08:d3  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 178951  bytes 13274300 (12.6 MiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[]$ ifconfig dtap1

dtap1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::20d:3aff:fe14:49d9  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:14:49:d9  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 861889  bytes 57731802 (55.0 MiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[]$ ifconfig dtap1

dtap1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::20d:3aff:fe14:49d9  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:14:49:d9  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 864328  bytes 57893198 (55.2 MiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[1]$ ifconfig eth1

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.90.23.101  netmask 255.255.255.0  broadcast 10.90.23.255
        inet6 fe80::f230:73e0:bbd5:5150  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:55:08:d3  txqueuelen 1000  (Ethernet)
        RX packets 14934836  bytes 1202553104 (1.1 GiB)

        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 24  bytes 1896 (1.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[1]$ ifconfig eth2

eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.90.130.101  netmask 255.255.255.0  broadcast 10.90.130.255
        inet6 fe80::f3a0:fdd:399d:aa0a  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:14:49:d9  txqueuelen 1000  (Ethernet)
        RX packets 75761468  bytes 4874522179 (4.5 GiB)

        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 21  bytes 1686 (1.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[1]$ ifconfig eth2

eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.90.130.101  netmask 255.255.255.0  broadcast 10.90.130.255
        inet6 fe80::f3a0:fdd:399d:aa0a  prefixlen 64  scopeid 0x20<link>
        ether 00:0d:3a:14:49:d9  txqueuelen 1000  (Ethernet)
        RX packets 75763987  bytes 4874688875 (4.5 GiB)

        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 21  bytes 1686 (1.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

thanks
Hanoh


hanoh haim

unread,
Oct 3, 2019, 2:48:49 PM10/3/19
to Stephen Hemminger, Charan Addala, NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator, mbum...@cisco.com
One more problem: with more than one queue configured (the default RSS mode), return traffic such as ARP replies does not make it back to the cores.

thanks
Hanoh
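
For context, a toy model (not TRex or DPDK code) of why multi-queue RSS strands ARP on one queue: the NIC hashes the IP 5-tuple to pick an RX queue, but a frame with no 5-tuple, like ARP, falls through to the default queue regardless of which core sent the request:

```python
# Toy RSS queue selection (illustration only; real NICs use a Toeplitz hash).
ETH_P_IP = 0x0800
ETH_P_ARP = 0x0806

def rss_queue(ethertype, five_tuple, n_queues):
    """Pick an RX queue for a frame. Non-IP frames (e.g. ARP) carry no
    5-tuple, so they cannot be hashed and land on the default queue 0."""
    if ethertype != ETH_P_IP:
        return 0
    return hash(five_tuple) % n_queues

# An ARP reply always lands on queue 0, even if core 3 sent the request:
assert rss_queue(ETH_P_ARP, None, 16) == 0
# IP flows are spread across all 16 queues:
assert 0 <= rss_queue(ETH_P_IP, ('10.0.1.4', '10.0.1.5', 6, 1234, 80), 16) < 16
```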

hanoh haim

unread,
Oct 3, 2019, 2:57:03 PM10/3/19
to Stephen Hemminger, Charan Addala, NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator, mbum...@cisco.com
Hi All, 
A preliminary version can be found here 


Getting the package 
$wget --no-cache https://trex-tgn.cisco.com/trex/release/v2.63-azure.tar.gz

trex_cfg.yaml 

- port_limit      : 2
  version         : 2
  interfaces  : ['0002:00:02.0', '0003:00:02.0'] # the PCI for Mellanox 
  ext_dpdk_opt: ['--vdev=net_vdev_netvsc0,iface=eth1', '--vdev=net_vdev_netvsc1,iface=eth2'] # ask for failsafe
  interfaces_vdevs : ['net_failsafe_vsc0','net_failsafe_vsc1'] # use failsafe
  port_info       :  # Port IPs. Change to suit your needs. In case of loopback, you can leave as is.
          - ip         : 10.90.23.101
            default_gw : 10.90.23.202
          - ip         : 10.90.130.101
            default_gw : 10.90.130.202

  platform:
      master_thread_id: 0
      latency_thread_id: 1
      dual_if:
        - socket: 0
          threads: [2,3,4,5,6,7,8,9,10,11,12,13,14,15]

Running it with one core
 sudo ./t-rex-64 -i  -c 1
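
As a quick sanity check on a platform section like the one above, the thread IDs must not overlap (a stdlib-only sketch; values copied from the config above):

```python
# Values from the trex_cfg.yaml platform section above.
master_thread_id = 0
latency_thread_id = 1
worker_threads = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

# Master and latency threads must not double as worker threads,
# and no worker thread may be listed twice.
assert master_thread_id not in worker_threads
assert latency_thread_id not in worker_threads
assert master_thread_id != latency_thread_id
assert len(set(worker_threads)) == len(worker_threads)
```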
thanks
Hanoh

Charan Addala

unread,
Oct 3, 2019, 6:24:36 PM10/3/19
to hanoh haim, Stephen Hemminger, NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator, mbum...@cisco.com

Hi Hanoh,

 

I downloaded the v2.63 package, but TRex (the DPDK app) is unable to resolve the MAC for the gateway IPs provided in the TRex config file.

 

TRex bails out with the failure below:

Failed resolving dest MAC for default gateway:10.0.1.5 on port 0

TX ARP request on port 0 - ip: 10.0.1.5 mac: Unknown

TX ARP request on port 1 - ip: 10.0.0.4 mac: Unknown

TX ARP request on port 0 - ip: 10.0.1.5 mac: Unknown

TX ARP request on port 1 - ip: 10.0.0.4 mac: Unknown
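
The "TX ARP request ... mac: Unknown" lines mean TRex is broadcasting who-has requests and getting no reply. For reference, such a request is a plain 42-byte Ethernet broadcast; a minimal stdlib sketch of building one (illustrative, not TRex's implementation; the MAC and IPs are placeholders):

```python
import struct, socket

def build_arp_request(src_mac, src_ip, target_ip):
    """Build a who-has ARP request frame as raw bytes (RFC 826)."""
    bcast = b'\xff' * 6
    eth = bcast + src_mac + struct.pack('!H', 0x0806)   # dst, src, EtherType=ARP
    arp = struct.pack('!HHBBH', 1, 0x0800, 6, 4, 1)     # Ethernet/IPv4, opcode 1 (request)
    arp += src_mac + socket.inet_aton(src_ip)           # sender MAC / IP
    arp += b'\x00' * 6 + socket.inet_aton(target_ip)    # target MAC unknown / IP
    return eth + arp

frame = build_arp_request(b'\x00\x0d\x3a\x53\xa0\x5f', '10.0.1.4', '10.0.1.5')
assert len(frame) == 42                 # 14-byte Ethernet header + 28-byte ARP payload
assert frame[12:14] == b'\x08\x06'      # EtherType is ARP
assert frame[:6] == b'\xff' * 6         # broadcast destination
```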

 

Below is the config file:

 

- version: 2
  interfaces: ['a1b0:00:02.0', 'b43d:00:02.0']
  ext_dpdk_opt: ['--vdev=net_vdev_netvsc0,iface=eth1', '--vdev=net_vdev_netvsc1,iface=eth2', '-w a1b0:00:02.0', '-w b43d:00:02.0']
  port_info:
      - ip: 10.0.1.4
        default_gw: 10.0.1.5
      - ip: 10.0.0.10
        default_gw: 10.0.0.4

 

 

I have added static routes for the gateways, and I am able to traceroute to the gateway IPs from the VM (the gateway IPs are on a different VM in the same VNET):

[Centos-7 v2.63-azure]$ traceroute 10.0.1.5
traceroute to 10.0.1.5 (10.0.1.5), 30 hops max, 60 byte packets
 1  10.0.1.5 (10.0.1.5)  2.186 ms  2.181 ms  2.124 ms

 

 

I also see EAL error logs where the tap interfaces fail to be created. Is this the cause, or is it an issue at all?

net_failsafe: sub_device 1 probe failed (File exists)

eth_dev_tap_create(): Unable to create TAP interface

eth_dev_tap_create(): TAP Unable to initialize net_tap_vsc0

EAL: Driver cannot attach the device (net_tap_vsc0)

EAL: Failed to attach device on primary process

 

I am on this CentOS kernel version:

[Centos-7 v2.63-azure]$ uname -a

Linux Centos-7.6 5.3.1-1.el7.elrepo.x86_64 #1 SMP Sat Sep 21 09:44:09 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux

 

To be clear, I have enabled the MLX4 and MLX5 PMDs in the DPDK config file (CONFIG_RTE_LIBRTE_MLX4_PMD=y).

 

Can you help point out the problem? The topology is simple: VM1 <-> VM2 (each has two NICs, with each pair in the same subnet).

 

Also, it would be great if we could document any caveats for bringing up TRex on the Azure platform.

 

Regards,

Charan

 

Malcolm Bumgardner (mbumgard)

unread,
Oct 3, 2019, 8:34:43 PM10/3/19
to Charan Addala, hanoh haim, Stephen Hemminger, NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator

Please try this syntax:

 

interfaces: ['a1b0:00:02.0', 'b43d:00:02.0']
ext_dpdk_opt: ['--vdev=net_vdev_netvsc0,iface=eth1', '--vdev=net_vdev_netvsc1,iface=eth2']
interfaces_vdevs: ['net_failsafe_vsc0', 'net_failsafe_vsc1']

 

Thanks,

-Malcolm

Charan Addala

unread,
Oct 4, 2019, 12:07:26 AM10/4/19
to Malcolm Bumgardner (mbumgard), hanoh haim, Stephen Hemminger, NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator

Hi Malcolm,

 

I have tried the TRex configuration below but still run into the same issue:

 

- version: 2
  interfaces: ['a1b0:00:02.0', 'b43d:00:02.0']
  ext_dpdk_opt: ['--vdev=net_vdev_netvsc0,iface=eth1', '--vdev=net_vdev_netvsc1,iface=eth2']
  interfaces_vdevs: ['net_failsafe_vsc0', 'net_failsafe_vsc1']
  port_info:
      - ip: 10.0.1.4
        default_gw: 10.0.1.5
      - ip: 10.0.0.10
        default_gw: 10.0.0.4
  platform:
      master_thread_id: 0
      latency_thread_id: 3
      dual_if:
        - socket: 0
          threads: [1,2]

 

EAL logs after running TRex:

[Centos-7 v2.63-azure]$ sudo ./t-rex-64 -f cap2/http_simple.yaml -c 1 -m 10000 -d 100 --no-flow-control-change

The ports are bound/configured.

Starting  TRex v2.62-azure please wait  ...

eth_dev_tap_create(): Unable to create TAP interface

eth_dev_tap_create(): TAP Unable to initialize net_tap_vsc0

EAL: Driver cannot attach the device (net_tap_vsc0)

EAL: Failed to attach device on primary process

net_failsafe: sub_device 1 probe failed (File exists)

eth_dev_tap_create(): Unable to create TAP interface

eth_dev_tap_create(): TAP Unable to initialize net_tap_vsc1

EAL: Driver cannot attach the device (net_tap_vsc1)

EAL: Failed to attach device on primary process

net_failsafe: sub_device 1 probe failed (File exists)

eth_dev_tap_create(): Unable to create TAP interface

eth_dev_tap_create(): TAP Unable to initialize net_tap_vsc0

vdev_probe(): failed to initialize net_tap_vsc0 device

eth_dev_tap_create(): Unable to create TAP interface

eth_dev_tap_create(): TAP Unable to initialize net_tap_vsc1

vdev_probe(): failed to initialize net_tap_vsc1 device

EAL: Bus (vdev) probe failed.

set driver name net_mlx4

driver capability  : TCP_UDP_OFFLOAD  TSO

set dpdk queues mode to ONE_QUE

Number of ports found: 2

zmq publisher at: tcp://*:4500

wait 1 sec .

port : 0

------------

link         :  link : Link Up - speed 40000 Mbps - full-duplex

promiscuous  : 0

port : 1

------------

link         :  link : Link Up - speed 40000 Mbps - full-duplex

promiscuous  : 0

 

net_failsafe: sub_device 1 probe failed (File exists)

Failed resolving dest MAC for default gateway:10.0.1.5 on port 0

 

 

For more context: I am giving TRex the eth1 and eth2 interfaces, but accelerated networking is also enabled on my eth0 interface.

 

[Centos-7 v2.63-azure]$ ip link show

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000

    link/ether 00:0d:3a:fd:f0:24 brd ff:ff:ff:ff:ff:ff

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000

    link/ether 00:0d:3a:6c:bd:bd brd ff:ff:ff:ff:ff:ff

4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000

    link/ether 00:0d:3a:fd:22:e2 brd ff:ff:ff:ff:ff:ff

5: eth3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master eth0 state UP mode DEFAULT group default qlen 1000

    link/ether 00:0d:3a:fd:f0:24 brd ff:ff:ff:ff:ff:ff

46: eth4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master eth1 state UP mode DEFAULT group default qlen 1000

    link/ether 00:0d:3a:6c:bd:bd brd ff:ff:ff:ff:ff:ff

47: eth5: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master eth2 state UP mode DEFAULT group default qlen 1000

    link/ether 00:0d:3a:fd:22:e2 brd ff:ff:ff:ff:ff:ff
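
The `ip link show` listing above shows the accelerated-networking pattern: each synthetic netvsc interface (eth0-eth2) has a Mellanox VF enslaved to it (eth3-eth5, flagged SLAVE with a `master` field and the same MAC). A small sketch of pairing them programmatically from text like that output (illustrative, not an official tool):

```python
import re

# Abbreviated sample in the same format as the `ip link show` output above.
ip_link_output = """\
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:6c:bd:bd brd ff:ff:ff:ff:ff:ff
46: eth4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master eth1 state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:6c:bd:bd brd ff:ff:ff:ff:ff:ff
"""

def vf_pairs(text):
    """Map each VF (SLAVE) interface to the synthetic master it backs."""
    pairs = {}
    for m in re.finditer(r'\d+: (\S+): <[^>]*SLAVE[^>]*>.* master (\S+)', text):
        pairs[m.group(1)] = m.group(2)
    return pairs

assert vf_pairs(ip_link_output) == {'eth4': 'eth1'}
```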

 

Is there anything else I am missing? If you were able to run TRex on an Azure VM, can you point me to the guest OS and kernel version? Apart from the defaults, I have compiled the following PMDs into DPDK:

 

CONFIG_RTE_LIBRTE_MLX4_PMD=y

CONFIG_RTE_LIBRTE_MLX5_PMD=y

CONFIG_RTE_LIBRTE_NETVSC_PMD=y

CONFIG_RTE_LIBRTE_VDEV_NETVSC_PMD=y
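
A tiny stdlib-only sketch of checking that all four flags listed above are enabled in a DPDK config fragment (illustrative; the parsing is simplistic):

```python
# Fragment in the same format as the DPDK .config lines above.
dpdk_config = """
CONFIG_RTE_LIBRTE_MLX4_PMD=y
CONFIG_RTE_LIBRTE_MLX5_PMD=y
CONFIG_RTE_LIBRTE_NETVSC_PMD=y
CONFIG_RTE_LIBRTE_VDEV_NETVSC_PMD=y
"""

required = {'MLX4', 'MLX5', 'NETVSC', 'VDEV_NETVSC'}
enabled = {line.split('=')[0].replace('CONFIG_RTE_LIBRTE_', '').replace('_PMD', '')
           for line in dpdk_config.split() if line.endswith('=y')}

# All PMDs needed for Azure (Mellanox VF + netvsc failsafe path) are on.
assert required <= enabled
```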

 

Regards,

Charan

hanoh haim

unread,
Oct 4, 2019, 1:06:25 AM10/4/19
to Charan Addala, Malcolm Bumgardner (mbumgard), Stephen Hemminger, NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator
Hi Charan, 

1. It does not seem that you are using the package I built: it should be v2.63-azure, not v2.62-azure. My private branch does not include the version change. Did you compile it yourself?
2. You don't need to compile DPDK on the machine. DPDK is part of the TRex image, has its own source code, and is cross-compiled, so the TAP kernel API headers are part of the TRex source tree. This way there is no need to compile the mlx5/tap PMDs on specific machines.
3. We tested TRex on the Azure CentOS 7.6 image (Malcolm spun up the VM for me).
4. You will need to follow the TRex Mellanox appendix to install the specific OFED/DPDK libs for ibv if you are going to use mlx5.
5. Please send the output with "-v 7" added, using the right image.
6. We tested it with the mlx5 driver, not mlx4. Could you try to spin up the same VM?

set driver name net_mlx4

thanks
Hanoh
 

Stephen Hemminger

unread,
Oct 4, 2019, 11:19:04 AM10/4/19
to hanoh haim, Charan Addala, Malcolm Bumgardner (mbumgard), NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator

Mellanox DPDK support does not need (and should not use) OFED on kernels 4.14 or later.


hanoh haim

unread,
Oct 6, 2019, 5:01:41 AM10/6/19
to Stephen Hemminger, Charan Addala, Malcolm Bumgardner (mbumgard), NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator
Hi All,

I've released v2.64 with first Azure DPDK failsafe support.
see 

Please verify and update if it works for you. 

thanks
Hanoh
** There are still things that do not work. Hope we can take it offline with MS guys to resolve it 


Charan Addala

unread,
Oct 8, 2019, 4:38:30 PM10/8/19
to hanoh haim, Malcolm Bumgardner (mbumgard), Stephen Hemminger, NVA Engineering Team, Sai Subha Charan Addala, TRex Traffic Generator

Hi Hanoh,

 

Using the latest TRex v2.64, we were able to push traffic up to 1 Gbps. This was with the mlx4 driver (ConnectX-3).

 

 

-Global stats enabled

Cpu Utilization : 11.1  %  18.2 Gb/core

Platform_factor : 1.0

Total-Tx        :       1.01 Gbps

Total-Rx        :     996.32 Mbps

Total-PPS       :     295.42 Kpps

Total-CPS       :       4.52 Kcps

 

Expected-PPS    :     300.64 Kpps

Expected-CPS    :       4.51 Kcps

Expected-BPS    :       1.06 Gbps

 

Active-flows    :     2935  Clients :      255   Socket-util : 0.0183 %

Open-flows      :    94023  Servers :    65535   Socket :     2935 Socket/Clients :  11.5

drop-rate       :       0.00  bps

current time    : 21.4 sec

test duration   : 28.6 sec
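
As a rough back-of-envelope check on the stats above, the average frame size falls out of the Tx bit rate and the packet rate:

```python
# Figures copied from the global stats output above.
total_tx_bps = 1.01e9     # Total-Tx : 1.01 Gbps
total_pps = 295.42e3      # Total-PPS: 295.42 Kpps

avg_frame_bytes = total_tx_bps / total_pps / 8
# Around 427 bytes/packet -- plausible for a mixed-size HTTP-style profile.
assert 400 < avg_frame_bytes < 450
```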

 

I will now try to run TRex on mlx5 and report back if I observe a better traffic rate.



hanoh haim

unread,
Apr 25, 2022, 4:57:18 AM4/25/22
to KAI CHEN, TRex Traffic Generator
Hi KAI, 
It is a big pain to run TRex on Azure; see this old wiki:
https://github.com/cisco-system-traffic-generator/trex-core/wiki/Build-For-Azure-Ubuntu-(mlx5)
We are not testing it regularly, so only the specific version in the wiki might work.

Thanks
Hanoh