High droprate when using IP addresses in config (Sending traffic to a host)


Dave Houser

Mar 15, 2021, 5:44:45 PM
to TRex Traffic Generator
Please understand that I am still learning how to use TRex, so I may be using the system incorrectly, which could be why I see this problem. Below are details on the problem, my system, troubleshooting, and configs.

Problem: 
TRex shows very high drop rates in stateful, ASTF, and stateless modes when IP addresses are used in trex-cfg.yaml. All communication is L2 (same subnet).

System:
- Physical host running RHEL 7.9, with two physical 10 Gbps ports connected to a Juniper QFX5100 and two 10 Gbps ports looped to each other.
- The clients receiving packets from TRex are virtual machines on VMware ESXi, also running RHEL 7.9.
- The default gateway is configured on the QFX5100. I confirmed I can reach it over L2 (so it should not be needed).

Troubleshooting:
- Confirmed L2 works from the Linux kernel by rebooting without configuring DPDK: I can ping, and the forwarding tables are populated.
- Put the system into ASTF mode, entered service mode, and ran arp; ARP resolves.
- Ran traces on the hosts I am sending packets to; they receive the packets. In the examples below they receive the TCP handshake as well as the HTTP packets, but TRex still shows major drop rates.
- With the stateful command "./t-rex-64 -f cap2/http_simple.yaml -c 1 -d 100" (and the config below), TRex reaches about 800 kbps TX, and the drop rate is close to that. I used 'bmon' on the clients to watch transfer rates; it shows about 2-3 kbps of data arriving at the client host. A trace shows the host receives all packets, but it looks like it never responds... maybe this is the cause of the drop rate?
- Tried adjusting the config to use MAC addresses. Good news: no drop rate. Bad news: the hosts do not receive packets at all. It is not clear to me how to direct traffic to specific MAC addresses; I assume this is why the hosts don't receive traffic.
- Checked MTU size; all interfaces use an MTU of 1500.
- Made sure the VMware distributed switch has forged transmits, promiscuous mode, and MAC changes enabled for the port network the client VM is using.

What am I missing here?

My config file:
### Config file generated by dpdk_setup_ports.py ### 
 
- version: 2 
  interfaces: ['5e:00.2', '5e:00.3'] 
  port_info: 
      - ip: 172.16.90.65 
        default_gw: 172.16.90.1 
      - ip: 172.16.90.66 
        default_gw: 172.16.90.1 
 
  platform: 
      master_thread_id: 0 
      latency_thread_id: 1 
      dual_if: 
        - socket: 0 
          threads: [2,4,6,8,10,12,14,16,18,20,22,24,26,28,30] 
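A quick sanity check on the port_info section above can be sketched in Python. This is only an illustration, not part of TRex: the helper name check_port_info is hypothetical, and the /24 prefix length is an assumption taken from the 172.16.90.0/24 subnet mentioned later in the thread, since trex-cfg.yaml itself carries no netmask.

```python
import ipaddress

def check_port_info(port_info, prefix_len=24):
    """Return a list of human-readable problems found in a port_info list.

    prefix_len is an assumption (the config file has no netmask field).
    """
    problems = []
    for idx, port in enumerate(port_info):
        iface = ipaddress.ip_interface(f"{port['ip']}/{prefix_len}")
        gw = ipaddress.ip_address(port["default_gw"])
        # The gateway must be on-link for ARP resolution to succeed.
        if gw not in iface.network:
            problems.append(f"port {idx}: gateway {gw} is outside {iface.network}")
    ips = [p["ip"] for p in port_info]
    if len(set(ips)) != len(ips):
        problems.append("duplicate port IP addresses")
    return problems

# Values copied from the config file above.
port_info = [
    {"ip": "172.16.90.65", "default_gw": "172.16.90.1"},
    {"ip": "172.16.90.66", "default_gw": "172.16.90.1"},
]
print(check_port_info(port_info))  # prints []
```

An empty list here only means the addressing is self-consistent; it says nothing about whether the switch actually forwards the generated traffic.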

Interfaces:

Network devices using DPDK-compatible driver
============================================
0000:5e:00.2 'Ethernet Controller X710 for 10GbE SFP+' drv=igb_uio unused=i40e,vfio-pci,uio_pci_generic
0000:5e:00.3 'Ethernet Controller X710 for 10GbE SFP+' drv=igb_uio unused=i40e,vfio-pci,uio_pci_generic

Network devices using kernel driver
===================================
0000:19:00.0 'I350 Gigabit Network Connection' if=em1 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:19:00.1 'I350 Gigabit Network Connection' if=em2 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:19:00.2 'I350 Gigabit Network Connection' if=em3 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:19:00.3 'I350 Gigabit Network Connection' if=em4 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic *Active*

Other network devices
=====================
0000:5e:00.0 'Ethernet Controller X710 for 10GbE SFP+' unused=i40e,igb_uio,vfio-pci,uio_pci_generic
0000:5e:00.1 'Ethernet Controller X710 for 10GbE SFP+' unused=i40e,igb_uio,vfio-pci,uio_pci_generic
0000:86:00.0 'Ethernet Controller X710 for 10GbE SFP+' unused=i40e,igb_uio,vfio-pci,uio_pci_generic
0000:86:00.1 'Ethernet Controller X710 for 10GbE SFP+' unused=i40e,igb_uio,vfio-pci,uio_pci_generic
0000:86:00.2 'Ethernet Controller X710 for 10GbE SFP+' unused=i40e,igb_uio,vfio-pci,uio_pci_generic
0000:86:00.3 'Ethernet Controller X710 for 10GbE SFP+' unused=i40e,igb_uio,vfio-pci,uio_pci_generic

portattr in astf mode:
trex>portattr
Port Status

     port       |          0           |          1
----------------+----------------------+---------------------
driver          |       net_i40e       |       net_i40e
description     |  Ethernet Controlle  |  Ethernet Controlle
link status     |          UP          |          UP
link speed      |       10 Gb/s        |       10 Gb/s
port status     |        LOADED        |        LOADED
promiscuous     |         off          |         off
multicast       |         off          |         off
flow ctrl       |         none         |         none
vxlan fs        |          -           |          -
--              |                      |
layer mode      |         IPv4         |         IPv4
src IPv4        |     172.16.90.65     |     172.16.90.66
IPv6            |         off          |         off
src MAC         |  f8:f2:1e:bc:f4:d2   |  f8:f2:1e:bc:f4:d3
---             |                      |
Destination     |     172.16.90.1      |     172.16.90.1
ARP Resolution  |  88:d9:8f:af:34:60   |  88:d9:8f:af:34:60
----            |                      |
VLAN            |          -           |          -
-----           |                      |
PCI Address     |     0000:5e:00.2     |     0000:5e:00.3
NUMA Node       |          0           |          0
RX Filter Mode  |    hardware match    |    hardware match
RX Queueing     |         off          |         off
Grat ARP        |  every 120 seconds   |  every 120 seconds
------          |                      |

Config file I tried using in stateful mode:
- duration : 0.1
  generator :
          distribution : "seq"
          clients_start : "172.16.90.100"
          clients_end   : "172.16.90.101"
          servers_start : "172.16.90.62"
          servers_end   : "172.16.90.62"
          clients_per_gb : 201
          min_clients    : 101
          dual_port_mask : "1.0.0.0"
          tcp_aging      : 0
          udp_aging      : 0
  cap_ipg    : true
  cap_info :
     - name: avl/delay_10_http_browsing_0.pcap
       cps : 2.776
       #cps : 1
       ipg : 10000
       rtt : 10000
       w   : 1

Config file I tried using in astf mode:
from trex.astf.api import *


class Prof1():
    def __init__(self):
        pass

    def get_profile(self, **kwargs):
        # ip generator
        ip_gen_c = ASTFIPGenDist(ip_range=["172.16.90.65", "172.16.90.66"], distribution="seq")
        ip_gen_s = ASTFIPGenDist(ip_range=["172.16.90.62", "172.16.90.62"], distribution="seq")
        ip_gen = ASTFIPGen(glob=ASTFIPGenGlobal(ip_offset="1.0.0.0"),
                           dist_client=ip_gen_c,
                           dist_server=ip_gen_s)

        return ASTFProfile(default_ip_gen=ip_gen,
                            cap_list=[ASTFCapInfo(file="../avl/delay_10_http_browsing_0.pcap",
                            cps=2.776)])


def register():
    return Prof1()
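One difference from the stateful config is that the ASTF client range above (172.16.90.65 to 172.16.90.66) is identical to the two TRex port IPs, while the stateful config uses 172.16.90.100 to 172.16.90.101. Whether that overlap matters to TRex is not established anywhere in this thread, so the following is only a sketch of how to spot it. The helper names are hypothetical and it uses nothing beyond the Python standard library.

```python
import ipaddress

def expand_range(start, end):
    """List every IPv4 address from start to end, inclusive."""
    first = int(ipaddress.IPv4Address(start))
    last = int(ipaddress.IPv4Address(end))
    return [str(ipaddress.IPv4Address(n)) for n in range(first, last + 1)]

def overlap_with_ports(ip_range, port_ips):
    """Return the addresses of ip_range that are also TRex port IPs."""
    common = set(expand_range(*ip_range)) & set(port_ips)
    return sorted(common, key=ipaddress.IPv4Address)

# Port IPs from trex-cfg.yaml, generator ranges from the ASTF profile above.
PORT_IPS = ["172.16.90.65", "172.16.90.66"]
print(overlap_with_ports(("172.16.90.65", "172.16.90.66"), PORT_IPS))
print(overlap_with_ports(("172.16.90.62", "172.16.90.62"), PORT_IPS))
```

The first call reports both port addresses and the second reports none, which at least makes the difference between the ASTF and stateful generator ranges explicit.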

Example output of stateful mode when running this command:
"./t-rex-64 -f cap2/http_simple.yaml -c 1 -d 100"

-Per port stats table
      ports |               0 |               1
 -----------------------------------------------------------------------------------------
   opackets |             420 |             690
     obytes |           34170 |         1003020
   ipackets |              46 |              46
     ibytes |            2944 |            2944
    ierrors |               0 |               0
    oerrors |               0 |               0
      Tx Bw |      24.41 Kbps |     682.61 Kbps

-Global stats enabled
 Cpu Utilization : 0.0  %
 Platform_factor : 1.0
 Total-Tx        :     707.02 Kbps
 Total-Rx        :       4.22 Kbps
 Total-PPS       :      95.64  pps
 Total-CPS       :       2.92  cps

 Expected-PPS    :     102.71  pps
 Expected-CPS    :       2.78  cps
 Expected-BPS    :     767.80 Kbps

 Active-flows    :        0  Clients :        2   Socket-util : 0.0000 %
 Open-flows      :       30  Servers :        1   Socket :        0 Socket/Clients :  0.0
 drop-rate       :     702.80 Kbps
 current time    : 12.3 sec
 test duration   : 87.7 sec
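The numbers in the output above are consistent with the drop-rate being the gap between Total-Tx and Total-Rx. That TRex derives it exactly this way is an assumption based only on these figures; the function name below is hypothetical.

```python
def tx_rx_gap_kbps(total_tx_kbps, total_rx_kbps):
    """Difference between transmitted and received bandwidth, in Kbps."""
    return round(total_tx_kbps - total_rx_kbps, 2)

# Total-Tx and Total-Rx copied from the global stats above.
print(tx_rx_gap_kbps(707.02, 4.22))  # prints 702.8, matching drop-rate 702.80 Kbps
```

Read this way, the drop-rate reflects how little traffic comes back into the TRex ports, whether or not the client hosts received the packets.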




hem...@mnkcg.com

Mar 15, 2021, 7:29:08 PM
to daveh...@gmail.com, trex...@googlegroups.com

Dave,

 

Would it be possible to see your Juniper switch configuration?  http_simple.yaml uses IP subnets in the 16.0.0.0 and 48.0.0.0 nets. The switch should have routes to the TRex ports and the port IPs.

 

Please see this 2-node trex topology with gory details that should help you. The DUT used is a VPP node.

 

https://fd.io/docs/vpp/master/usecases/simpleperf/trex.html

 

Hemant


Dave Houser

Mar 15, 2021, 7:43:03 PM
to TRex Traffic Generator
Hemant,

First of all, I changed those default IPs in the http config, as you can see in the configs above; I used the 172.16.90.0/24 subnet instead.
Second, here are the switch port status and interface configuration. STP is enabled. There is a trunk that is used to reach the ESXi host with the VMs on it, with nothing much to show there. The trunk carries the VLAN, and the ESXi port has STP enabled as well.
Third, I read a post on here where someone suggested turning off STP. Is this required?


# run show interfaces terse 
Interface               Admin Link Proto    Local                 Remote
xe-0/0/1               up    up
xe-0/0/1.0             up    up   eth-switch
xe-0/0/2               up    up
xe-0/0/2.0             up    up   eth-switch
irb                     up    up
irb.1024                up    up   inet     172.16.90.1/24

# run show configuration interfaces irb.1024
family inet {
    address 172.16.90.1/24;
}

# run show configuration vlans trex-1024
vlan-id 222;
l3-interface irb.222;

# run show configuration interfaces xe-0/0/1
description "to-trex-svr-p1";
unit 0 {
    family ethernet-switching {
        interface-mode access;
        vlan {
            members trex-222;
        }
        storm-control default;
    }
}

{master:0}[edit]
# run show configuration interfaces xe-0/0/2
description "to-trex-svr-p2";
unit 0 {
    family ethernet-switching {
        interface-mode access;
        vlan {
            members trex-222;
        }
        storm-control default;
    }
}

# run show configuration protocols rstp
interface xe-0/0/1;
interface xe-0/0/2;




hem...@mnkcg.com

Mar 15, 2021, 7:58:45 PM
to daveh...@gmail.com, trex...@googlegroups.com

Dave,

 

Not sure.  However, to debug, turn off STP and see what transpires.  I tested without using any trunk nor a VLAN. I would ping from the switch to trex Tx and Rx ports.  If both pings pass, layer-3 connectivity looks good.  If the switch has any packet trace facility, use it, because that’s the best help. 

Dave Houser

Mar 15, 2021, 8:08:56 PM
to TRex Traffic Generator
As I said in my original post, I can ping just fine. I know L3 is fine; it's on a flat L2 network.
What I am curious about is the following:

A) Am I using TRex correctly, based on the configurations I shared?
B) The client host packet traces show no response back to the hosts. Is this expected?
C) And of course, why do I have such a high drop rate, and is that expected with the http_simple file I am using in stateful mode and the http_simple file I am using in ASTF mode?

I will test turning off STP tomorrow and get back to you.

hem...@mnkcg.com

Mar 15, 2021, 8:23:34 PM
to daveh...@gmail.com, trex...@googlegroups.com

Dave,

 

Sorry, I missed that you can ping fine.  Let’s try a UDP test, since so far you have been using HTTP over TCP.  Use scripts/cap2/dns.yaml, since DNS uses UDP.  If the UDP test passes, we have narrowed the issue down to TCP.  Sorry, I don’t understand B), where no response is seen at the host. What is the expected response?

Dave Houser

Mar 16, 2021, 12:33:27 PM
to TRex Traffic Generator
Updates:
I performed a stateful test with dns.yaml and saw the same issue: a high drop rate, with close to all packets dropped (same as http_simple).
I took another look at my /etc/trex_cfg.yaml and noticed that I was using a "dummy" port. I adjusted the configuration to use the two PCI interfaces instead of one PCI interface and a dummy port. Now the drop rate is 0.0 bps when using stateful http_simple.yaml or dns.yaml. However, this does not fix ASTF mode at all, where I still see a very high drop rate.

New question: Why do I need both interfaces? Why can't I just use one interface? Does one interface TX and the other RX? Are two interfaces required?

Stateful command used for the above test: ./t-rex-64 -f cap2/http_simple.yaml -c 1 -d 100
astf command used for the above test which still shows high drop rate: start -f astf/http_simple.py -d 100 -m 1

This testing expands my questions into how to read bps / pps. I switched from 'bmon' to 'iftop' to make sure I tried different bandwidth benchmarking tools on the client. The results from both are the same: TRex shows it is trying to send at one rate, but the client shows a different one.

See the screenshot below. I believe I am reading this correctly; if not, please correct me. On the right is the TRex server running in stateful mode, and we can see it is trying to push 1.36 kbps. However, iftop on the client shows it only reaching 236 bps. Why?


[Attached screenshot: Untitled.png]

Re: "no response" clarification from my previous post. I thought the client was not responding to the far-end host (TRex) for HTTP GET requests and DNS dig requests, but my assumption was mistaken. There is no HTTP server or DNS server, and ports 80 and 53 are not open, so of course it would not respond.
However, I wanted to test further to see whether the traces on the client would show the same results with those services running on the client side.
To test this, I set up a BIND server and an httpd server on the client (172.16.90.62). I am able to dig against it remotely and to wget the Apache index.html, so it is now listening on ports 80 and 53. The BIND server is not able to forward to the internet, so I built a local A record for www.cisco.com (the same record that is requested in the TRex pcap for dns.yaml). Note that I also turned off the firewall on the client system.
I ran the tests again, and the HTTP and DNS results look the same: stateful no longer drops packets, while ASTF drops many packets. Also, the bandwidth figures on the client and on TRex do not align, and I still do not understand why.

Lastly, re: Spanning Tree (STP). There is no way we can disable this on our interfaces. We would lose connection to the system, as STP is used to route traffic through the backbone via layer 2 for test traffic. So STP has to stay. Is STP not supported by TRex?

So two major problems I am having now are the following:

- Why do the bandwidth averages differ between what TRex shows and what the client shows?
- Why does stateful not drop packets while ASTF does? Why does stateful drop packets when a dummy port is being used?

hem...@mnkcg.com

Mar 16, 2021, 1:50:00 PM
to daveh...@gmail.com, trex...@googlegroups.com

Hi Dave,

 

Please wait for trex folks to reply regarding astf drops.

 

Thanks,


Besart Dollma

Mar 19, 2021, 3:27:11 AM
to TRex Traffic Generator
Hi 
1) You need both interfaces in ASTF, one TX and one RX. Otherwise, how can we validate the drop rate if we don't check what was received against what was transmitted?
Take a look here on how it works: one instance is the client and the other is the server.
2) Based on the first question, you are probably not using ASTF correctly, so you need to give us more details so we can help you solve the problem.
3) STP is not supported by TRex.
Thanks,
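The transmitted-versus-received comparison described in 1) can be sketched with the per-port counters posted earlier in the thread. This is a simplified illustration under the assumption that everything sent on one port is expected back on the other; it is not TRex's actual accounting, and the function name is hypothetical.

```python
def direction_loss(tx_packets, rx_packets):
    """Packets missing in one direction, and the loss as a percentage of TX."""
    lost = tx_packets - rx_packets
    pct = 100.0 * lost / tx_packets if tx_packets else 0.0
    return lost, round(pct, 1)

# opackets / ipackets from the per-port stats table in the first post.
stats = {
    0: {"opackets": 420, "ipackets": 46},
    1: {"opackets": 690, "ipackets": 46},
}

# Traffic sent on port 0 is expected on port 1, and vice versa.
print(direction_loss(stats[0]["opackets"], stats[1]["ipackets"]))  # (374, 89.0)
print(direction_loss(stats[1]["opackets"], stats[0]["ipackets"]))  # (644, 93.3)
```

The counters are a snapshot taken mid-test, so packets still in flight count as lost here, but a gap this large suggests most traffic never reaches the opposite port.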