Observed TCP ramp-up behaviour on TRex v2.85


Darius Grassi

Sep 28, 2021, 8:44:46 PM
to TRex Traffic Generator
Hi,

I've been running more ASTF tests using TRex's Python automation API, and I've been enjoying TRex. I've had some success, but I've also observed some surprising results that I was hoping to get more insight into.
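For reference, my automation follows the standard ASTF client flow, roughly like the sketch below (the server address, profile path, and multiplier here are placeholders rather than my exact values):

from trex.astf.api import ASTFClient

c = ASTFClient(server='127.0.0.1')           # TRex server address (placeholder)
c.connect()
try:
    c.reset()                                # acquire ports, clear old state
    c.load_profile('astf/http_simple.py')    # profile path (placeholder)
    c.clear_stats()
    c.start(mult=10, duration=60)            # ~60 second run
    c.wait_on_traffic()                      # block until traffic finishes
    print(c.get_stats())                     # per-port and ASTF counters
finally:
    c.disconnect()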

My test runs for ~60 seconds. When it starts, I observe ~5 seconds of "ramp-up" time during which the traffic increases very slowly. After this period my throughput jumps up to ~3 Gbps, where it stays for the rest of the run. For clarity, here is a 7-second sample of my output:

(Note: client is on port 0, server is on port 1)
TRAFFIC RUNNING 0.01 SEC
Port | TX pps     | RX pps     | TX bps     | RX bps     |
----------------------------------------------------------
0    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
1    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
2    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
3    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |


TRAFFIC RUNNING 1.02 SEC
Port | TX pps     | RX pps     | TX bps     | RX bps     |
----------------------------------------------------------
0    | 12.0 pps   | 109.0 pps  | 6.7 Kbps   | 7.7 Mbps   |
1    | 109.0 pps  | 12.0 pps   | 7.7 Mbps   | 6.7 Kbps   |
2    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
3    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |


TRAFFIC RUNNING 2.03 SEC
Port | TX pps     | RX pps     | TX bps     | RX bps     |
----------------------------------------------------------
0    | 12.1 pps   | 250.0 pps  | 6.7 Kbps   | 17.8 Mbps  |
1    | 250.0 pps  | 12.1 pps   | 17.8 Mbps  | 6.7 Kbps   |
2    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
3    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |


TRAFFIC RUNNING 3.04 SEC
Port | TX pps     | RX pps     | TX bps     | RX bps     |
----------------------------------------------------------
0    | 17.2 pps   | 519.2 pps  | 9.4 Kbps   | 36.7 Mbps  |
1    | 519.2 pps  | 17.2 pps   | 36.7 Mbps  | 9.4 Kbps   |
2    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
3    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |


TRAFFIC RUNNING 4.05 SEC
Port | TX pps     | RX pps     | TX bps     | RX bps     |
----------------------------------------------------------
0    | 20.3 pps   | 946.3 pps  | 11.1 Kbps  | 67.0 Mbps  |
1    | 946.3 pps  | 20.3 pps   | 67.0 Mbps  | 11.1 Kbps  |
2    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
3    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |


TRAFFIC RUNNING 5.05 SEC
Port | TX pps     | RX pps     | TX bps     | RX bps     |
----------------------------------------------------------
0    | 20.5 pps   | 953.9 pps  | 11.1 Kbps  | 67.0 Mbps  |
1    | 954.0 pps  | 20.5 pps   | 67.0 Mbps  | 11.1 Kbps  |
2    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
3    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |


TRAFFIC RUNNING 6.06 SEC
Port | TX pps     | RX pps     | TX bps     | RX bps     |
----------------------------------------------------------
0    | 404.8 pps  | 36.6 Kpps  | 224.1 Kbps | 2.6 Gbps   |
1    | 36.6 Kpps  | 405.8 pps  | 2.6 Gbps   | 224.7 Kbps |
2    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
3    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |


TRAFFIC RUNNING 7.07 SEC
Port | TX pps     | RX pps     | TX bps     | RX bps     |
----------------------------------------------------------
0    | 512.6 pps  | 46.6 Kpps  | 282.9 Kbps | 3.3 Gbps   |
1    | 46.7 Kpps  | 511.6 pps  | 3.3 Gbps   | 282.4 Kbps |
2    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
3    | 0.0 pps    | 0.0 pps    | 0.0 bps    | 0.0 bps    |
.
.
.

What is causing this behaviour? Could it be due to the version of TRex I'm using? Is this the expected behaviour of TRex? Or could it be an issue with my profile?

If sharing any other code would be helpful for debugging or understanding this behaviour, please let me know and I'll be happy to share. Thanks!

Best,
Darius

hanoh haim

Sep 30, 2021, 4:47:41 AM
to Darius Grassi, TRex Traffic Generator
Hi Darius,
I suggest running the same profile using the Console and looking at more of the TCP/UDP counters for the issue.
Another thing that could help: start with loopback, since maybe your DUT is dropping packets.

Thanks
Hanoh

Linh Quack

Oct 4, 2021, 3:46:04 AM
to hanoh haim, Darius Grassi, TRex Traffic Generator
Hi!
How do I set up TRex for testing?

On Thu, Sep 30, 2021 at 15:48 hanoh haim <hhaim...@gmail.com> wrote:

Darius Grassi

Oct 5, 2021, 7:56:57 PM
to TRex Traffic Generator
Hi Hanoh,

I've recently been running my tests with these specs:
Server version:   v2.92 @ ASTF
Server mode:      Advanced Stateful
Server CPU:       1 x Intel(R) Xeon(R) Gold 5220R CPU @ 2.20GHz
Ports count:      2 x 40Gbps @ Ethernet Controller XL710 for 40GbE QSFP+

I'm currently running ASTF tests with TRex's stock profile "http_eflow2.py", with loopback set up correctly and using the console, and unfortunately I'm still seeing disappointing results.

My goal is to replicate the results from your older post (https://groups.google.com/g/trex-tgn/c/lBIajJKGNz8/m/ARBivohCCQAJ), where you reached ~10-15 Gbps TCP throughput per flow using this same profile.

Here are the commands I am using:
TRex: sudo ./t-rex-64 --cfg [config file location] --astf -c 1 -i

TRex Console: start -f astf/http_eflow2.py -m 10 -d 60 -l 1000 -t size=800,loop=100000,win=512,pipe=1
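For reference, the console forwards each -t key=value pair to the profile's get_profile() as a keyword argument. Below is a rough sketch of that pattern, modeled on the stock eflow profiles (the actual http_eflow2.py body differs; mss and pipe are omitted, and treating win as KB is an assumption):

from trex.astf.api import *

class Prof1():
    def get_profile(self, **kwargs):
        size = kwargs.get('size', 800)       # bytes per send block
        loop = kwargs.get('loop', 1)         # how many blocks to send
        win = kwargs.get('win', 32)          # TCP buffer/window size, in KB

        # Client pushes 'loop' blocks of 'size' bytes; server consumes them.
        prog_c = ASTFProgram()
        prog_c.set_var('var1', loop)
        prog_c.set_label('a:')
        prog_c.send('x' * size)
        prog_c.jmp_nz('var1', 'a:')          # repeat until the counter hits zero

        prog_s = ASTFProgram()
        prog_s.set_var('var2', loop)
        prog_s.set_label('b:')
        prog_s.recv(size)
        prog_s.jmp_nz('var2', 'b:')

        # Scale the TCP buffers with the 'win' tunable.
        info = ASTFGlobalInfo()
        info.tcp.rxbufsize = win * 1024
        info.tcp.txbufsize = win * 1024

        ip_gen = ASTFIPGen(
            glob=ASTFIPGenGlobal(ip_offset='1.0.0.0'),
            dist_client=ASTFIPGenDist(ip_range=['16.0.0.1', '16.0.0.255'],
                                      distribution='seq'),
            dist_server=ASTFIPGenDist(ip_range=['48.0.0.1', '48.0.255.255'],
                                      distribution='seq'))

        template = ASTFTemplate(
            client_template=ASTFTCPClientTemplate(program=prog_c, ip_gen=ip_gen),
            server_template=ASTFTCPServerTemplate(program=prog_s))
        return ASTFProfile(default_ip_gen=ip_gen, templates=template,
                           default_c_glob_info=info, default_s_glob_info=info)

def register():
    return Prof1()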

Here is a snapshot of the TRex interactive-mode output after starting a test with the commands above:
-Per port stats table
      ports |               0 |               1
 -----------------------------------------------------------------------------------------
   opackets |          272634 |        23383588
     obytes |        18319289 |     34856975402
   ipackets |        23383588 |          272634
     ibytes |     34856975402 |        18319289
    ierrors |               0 |               0
    oerrors |               0 |               0
      Tx Bw |     808.01 Kbps |       1.72 Gbps

-Global stats enabled
 Cpu Utilization : 5.6  %  61.8 Gb/core
 Platform_factor : 1.0
 Total-Tx        :       1.72 Gbps
 Total-Rx        :       1.72 Gbps
 Total-PPS       :     145.28 Kpps
 Total-CPS       :       0.00  cps

 Expected-PPS    :       0.00  pps
 Expected-CPS    :       0.00  cps
 Expected-L7-BPS :       0.00  bps

 Active-flows    :        1  Clients :        0   Socket-util : 0.0000 %
 Open-flows      :        1  Servers :        0   Socket :        0 Socket/Clients :  -nan
 drop-rate       :       0.00  bps
 current time    : 245.9 sec
 test duration   : 0.0 sec

Here are the counters shown to me by the console during the test:
Traffic stats summary.
                        |      client       |      server       |
------------------------+-------------------+-------------------+------------------------
         m_active_flows |                 1 |                 1 | active open flows
            m_est_flows |                 1 |                 1 | active established flows
           m_tx_bw_l7_r |             0 bps |         1.54 Gbps | tx L7 bw acked
     m_tx_bw_l7_total_r |             0 bps |         1.54 Gbps | tx L7 bw total
           m_rx_bw_l7_r |         1.54 Gbps |             0 bps | rx L7 bw acked
             m_tx_pps_r |        469.41 pps |         3.75 Kpps | tx pps
             m_rx_pps_r |       134.19 Kpps |        469.41 pps | rx pps
             m_avg_size |           1.43 KB |          45.52 KB | average pkt size
             m_tx_ratio |               0 % |             100 % | Tx acked/sent ratio
                      - |                   |                   |
                      - |                   |                   |
                    TCP |                   |                   |
                      - |                   |                   |
       tcps_connattempt |                 1 |                 0 | connections initiated
           tcps_accepts |                 0 |                 1 | connections accepted
          tcps_connects |                 1 |                 1 | connections established
         tcps_segstimed |                 2 |             63756 | segs where we tried to get rtt
        tcps_rttupdated |                 2 |             63786 | times we succeeded
            tcps_delack |               324 |                 0 | delayed acks sent
          tcps_sndtotal |             63787 |            508759 | total packets sent
           tcps_sndpack |                 1 |            508758 | data packets sent
           tcps_sndbyte |               249 |       26052378734 | data bytes sent by application
        tcps_sndbyte_ok |               249 |       26052378734 | data bytes sent by tcp
           tcps_sndctrl |                 1 |                 0 | control (SYN|FIN|RST) packets sent
           tcps_sndacks |             63785 |                 1 | ack-only packets sent
           tcps_rcvpack |          18187909 |                 1 | packets received in sequence
           tcps_rcvbyte |       26051854446 |               249 | bytes received in sequence
        tcps_rcvackpack |                 1 |             63786 | rcvd ack packets
        tcps_rcvackbyte |               249 |       26051854446 | tx bytes acked by rcvd acks
     tcps_rcvackbyte_of |                 0 |                 1 | tx bytes acked by rcvd acks - overflow acked
           tcps_preddat |          18187908 |                 0 | times hdr predict ok for data pkts
           tcps_predack |                 0 |             63428 | times hdr predict ok for acks
                      - |                   |                   |
                    UDP |                   |                   |
                      - |                   |                   |
                      - |                   |                   |
             Flow Table |                   |                   |
                      - |                   |                   |
       err_rx_throttled |              3980 |                 0 | rx thread was throttled

At this point, having ruled out an older TRex version and my DUT as causes, it seems to me the only knobs left are my TRex configuration file or something related to the err_rx_throttled counter shown in the console.

Can you please give me more insight into what could be behind the err_rx_throttled counter? Based on this information, do you have any idea why I am seeing only ~2 Gbps throughput?

Thanks, and apologies for the lengthy message!


Best,
Darius

hanoh haim

Oct 6, 2021, 11:36:54 AM
to Darius Grassi, TRex Traffic Generator

I've tested it on mlx5 (a 100 Gbps physical interface) and changed this to make the ramp-up faster:

info.tcp.initwnd = 20 # start big

But you are right that there is a limit to the maximum speed, due to the maximum window size on the rx side.
At a faster rate the BDP is higher and you need bigger windows; this creates bigger tx and rx bursts, but there is a (hardcoded) limit to the amount of traffic we can rx. Maybe you can try changing that limit in the code, to allow a bigger BDP (with a bigger window).
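To put rough numbers on it, a single flow cannot run faster than window/RTT (the RTT below is an assumed illustrative value, not a measurement):

# Window-limited TCP throughput: one flow cannot exceed window / RTT.
def max_tput_gbps(window_bytes, rtt_s):
    return window_bytes * 8 / rtt_s / 1e9

rtt = 200e-6                              # assumed 200 us round-trip time
print(max_tput_gbps(64 * 1024, rtt))      # ~2.6  -> a 64 KB window caps near 3 Gbps
print(max_tput_gbps(1024 * 1024, rtt))    # ~41.9 -> ~1 MB is needed for 40 Gbps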

Thanks
Hanoh


Darius Grassi

Oct 7, 2021, 4:54:06 PM
to TRex Traffic Generator
Hi Hanoh,

This test has produced some very interesting results.

Changes: I increased the initwnd tunable to 20 inside http_eflow2.py as you recommended, and ran the command you used: start -f astf/http_eflow2.py -m 1 -d 60 -t mss=5000,size=4000,loop=100000,win=2048,pipe=1. Otherwise I am still using my initial setup for these tests.

When running these tests, I observed the following behavior in "phases":
Phase 1: after starting the test, there is a "ramp-up" period; traffic slowly increases to around 100 Mbps over ~30 seconds.
Phase 2: traffic then instantly jumps up to 7-8 Gbps, where it stays consistently for a while.
Phase 3: traffic then quickly increases up to 40 Gbps, where it stays until the test completes.

It's also worth noting that the "queue_full" counter stays at nearly 0 until we enter phase 3, where it then rapidly starts to increase.

Here are my stats during phase 3:
[screenshot: Screen Shot 2021-10-07 at 1.26.45 PM.png]
---
And here are my TCP counters after the test completed:
[screenshot: Screen Shot 2021-10-07 at 1.27.52 PM.png]
---
The test also completes after 150 seconds, rather than 60 seconds (when not using --nc).

Do you observe the same behavior on your side?

Best,
Darius

hanoh haim

Oct 8, 2021, 1:13:40 AM
to Darius Grassi, TRex Traffic Generator
I didn't wait long enough for phase 3, but I've simulated it using a bigger window.

A bigger window is required to overcome the BDP: a higher rate means a higher BDP, requiring a bigger window.

However, TRex has one core with a limited queue size (Tx and Rx), and TCP gets to a point where it tries to burst more than the tx queue size and gets stuck with queue_full.

Try to tune the window size.

Thanks
Hanoh

--
Hanoh
Sent from my iPhone

Darius Grassi

Oct 12, 2021, 6:52:31 PM
to TRex Traffic Generator
Hi Hanoh,

I'm glad you were able to simulate the same results I observed.

I tested many larger window sizes. The general pattern, for window sizes anywhere from 10 MB up to 1 GB (the maximum), was:
Phase 1: a very long "ramp-up" period. Throughput eventually climbs to at most 1 Gbps, and the queue_full counter rises as throughput gets higher.
Phase 2: throughput then drops drastically and stays very low (<100 Mbps).

Your reasoning makes sense for why a larger window should resolve the issues I observed: the very long ramp-up period and the rising err_rx_throttled and queue_full counters.
I tried tuning the memory allocation in my config file, as well as mss and size, but nothing seemed to work. Your original test tunables are still ultimately the best performing; other parameters either never reach line rate or take much longer to do so.

Questions:
1) Why is this long ramp-up period so persistent for every test?
2) What is causing the queue_full counter to rapidly rise, even while using the maximum window size?
3) Why does the err_rx_throttled counter increase during these tests? Is this a sign that something is going wrong?
4) Is all the behavior I'm observing with these tests expected, or is there some other behavior that should be happening?

Thanks, and sorry for all the questions. I'd just like to understand what is causing the behavior I'm seeing, as it seems strange and erratic.

Best,
Darius

Darius Grassi

Oct 13, 2021, 3:12:29 PM
to TRex Traffic Generator
Hi Hanoh,

Through further experiments, I'm actually seeing tests with win=1024 reach ~36-38 Gbps in about 90 seconds. In these tests the queue_full counter does not increase at all, and I don't see the err_rx_throttled counter either.

It seems that, contrary to expectation, a smaller window actually performs better.

If you'd like to observe this yourself, use this command in the console:
start -f astf/http_eflow2.py -m 1 -d 120 --nc -t mss=5000,size=4000,loop=100000,win=1024,pipe=1

Regardless, all of these tests still show the very long "warmup" time before they eventually reach line rate, if they reach it at all.

Best,
Darius

hanoh haim

Oct 14, 2021, 2:47:41 AM
to Darius Grassi, TRex Traffic Generator
Hi Darius, 

The window-size behavior is expected; see what I wrote earlier:
"A bigger window is required to overcome the BDP: a higher rate means a higher BDP, requiring a bigger window.
However, TRex has one core with a limited queue size (Tx and Rx), and TCP gets to a point where it tries to burst more than the tx queue size and gets stuck with queue_full.
Try to tune the window size."

Regarding the slow ramp-up, we will look into the window size and congestion-control (New Reno) counters to verify.

Thanks
Hanoh

Besart Dollma

Oct 14, 2021, 7:50:54 AM
to TRex Traffic Generator
Hi Darius, 
We changed the behaviour of no_delay and unfortunately didn't fix all the profiles.
Please use info.tcp.no_delay = 0, and add another line defining the delay counter: info.tcp.no_delay_counter = 5 * mss
This should solve the ramp-up problem; see the sketch below.
We will fix this and document it better in the next release.
Thank you for reporting this.
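For example, applied in the profile's global-info setup it would look roughly like this (a sketch; the exact placement inside http_eflow2.py may differ):

from trex.astf.api import ASTFGlobalInfo

mss = 5000                             # matches the -t mss=5000 tunable
info = ASTFGlobalInfo()
info.tcp.mss = mss
info.tcp.initwnd = 20                  # the earlier ramp-up tweak
info.tcp.no_delay = 0                  # per the fix above
info.tcp.no_delay_counter = 5 * mss    # delay counter, per the fix above
# ...then pass 'info' as default_c_glob_info / default_s_glob_info
# when constructing the ASTFProfile.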

Darius Grassi

Oct 14, 2021, 1:48:23 PM
to TRex Traffic Generator
Hi,

I've just tested http_eflow2.py with these changes, and it works great! I'm now seeing an instant 40 Gbps line-rate transmit.

This has resolved the ramp-up issue I was seeing; everything seems to be working as expected now. Thank you both for your help!

Best,
Darius
