Reno vs CUBIC vs BBR vs BBRv2 via iperf3 over netem


Tao Tse

Jun 15, 2021, 7:11:26 AM
to BBR Development
Hi,
I ran some tests in a LAN environment, and I'm not sure whether something went wrong: BBR did not behave as expected.

The test environment:
sender: (10.10.61.58)
receiver: (192.168.8.89)
Link speed on eth0: 10000Mb/s
Kernel source git revision: 74f603c (https://github.com/google/bbr.git)

Firstly, I set a netem qdisc at the sender side:
tc qd replace dev eth0 root netem delay 10ms rate 100mbit loss 0.1%

According to the BBRv1 paper, it should fully utilize the BtlBw (100 Mbps) as long as the loss rate is under 20% (this was also discussed here), but it did not. See the detailed reports below:

Cubic
(graph attached: cubic.png)
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.8.89, port 16989
[  5] local 10.10.61.58 port 5201 connected to 192.168.8.89 port 16991
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  5]   0.00-1.00   sec  13.6 MBytes   114 Mbits/sec   10    452 KBytes       
[  5]   1.00-2.00   sec  9.99 MBytes  83.8 Mbits/sec    9    354 KBytes       
[  5]   2.00-3.00   sec  11.2 MBytes  94.3 Mbits/sec    8    272 KBytes       
[  5]   3.00-4.00   sec  7.49 MBytes  62.8 Mbits/sec   27    150 KBytes       
[  5]   4.00-5.00   sec  8.72 MBytes  73.2 Mbits/sec   11    331 KBytes       
[  5]   5.00-6.00   sec  11.2 MBytes  94.0 Mbits/sec    4    284 KBytes       
[  5]   6.00-7.00   sec  11.2 MBytes  94.0 Mbits/sec    0    315 KBytes       
[  5]   7.00-8.00   sec  8.74 MBytes  73.3 Mbits/sec   20    131 KBytes       
[  5]   8.00-9.00   sec  6.24 MBytes  52.3 Mbits/sec   19    145 KBytes       
[  5]   9.00-10.00  sec  7.45 MBytes  62.5 Mbits/sec   10    483 KBytes       
[  5]  10.00-11.00  sec  12.5 MBytes   105 Mbits/sec    0   1.05 MBytes       
[  5]  11.00-12.00  sec  10.0 MBytes  83.9 Mbits/sec   11    873 KBytes       
[  5]  12.00-13.00  sec  11.2 MBytes  94.4 Mbits/sec    9    620 KBytes       
[  5]  13.00-14.00  sec  11.2 MBytes  94.4 Mbits/sec    8    510 KBytes       
[  5]  14.00-15.00  sec  8.75 MBytes  73.4 Mbits/sec   27    138 KBytes       
[  5]  15.00-16.00  sec  11.2 MBytes  94.4 Mbits/sec    0    192 KBytes       
[  5]  16.00-17.00  sec  11.2 MBytes  94.4 Mbits/sec    4    191 KBytes       
[  5]  17.00-18.00  sec  10.0 MBytes  83.9 Mbits/sec    6    143 KBytes       
[  5]  18.00-19.00  sec  10.0 MBytes  83.9 Mbits/sec    9    153 KBytes       
[  5]  19.00-20.00  sec  10.0 MBytes  83.9 Mbits/sec    6    153 KBytes       
[  5]  20.00-20.03  sec  1.25 MBytes   372 Mbits/sec    0    154 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  5]   0.00-20.03  sec   203 MBytes  85.2 Mbits/sec  198             sender
[  5]   0.00-20.03  sec  0.00 Bytes  0.00 bits/sec                  receiver

Reno
(graph attached: reno.png)
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.8.89, port 17125
[  5] local 10.10.61.58 port 5201 connected to 192.168.8.89 port 17127
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  5]   0.00-1.00   sec  14.4 MBytes   121 Mbits/sec    0   1.38 MBytes       
[  5]   1.00-2.00   sec  11.2 MBytes  94.4 Mbits/sec    0   1.38 MBytes       
[  5]   2.00-3.00   sec  11.2 MBytes  94.4 Mbits/sec    0   1.38 MBytes       
[  5]   3.00-4.00   sec  12.5 MBytes   105 Mbits/sec    0   1.38 MBytes       
[  5]   4.00-5.00   sec  11.2 MBytes  94.4 Mbits/sec    0   1.38 MBytes       
[  5]   5.00-6.00   sec  11.2 MBytes  94.4 Mbits/sec    0   1.38 MBytes       
[  5]   6.00-7.00   sec  8.75 MBytes  73.4 Mbits/sec   14    548 KBytes       
[  5]   7.00-8.00   sec  6.25 MBytes  52.4 Mbits/sec   12    208 KBytes       
[  5]   8.00-9.00   sec  7.50 MBytes  62.9 Mbits/sec    9    164 KBytes       
[  5]   9.00-10.00  sec  10.0 MBytes  83.9 Mbits/sec    6    123 KBytes       
[  5]  10.00-11.00  sec  12.5 MBytes   105 Mbits/sec    0    220 KBytes       
[  5]  11.00-12.00  sec  7.50 MBytes  62.9 Mbits/sec   12    181 KBytes       
[  5]  12.00-13.00  sec  10.0 MBytes  83.9 Mbits/sec    6    192 KBytes       
[  5]  13.00-14.00  sec  3.75 MBytes  31.5 Mbits/sec   19    121 KBytes       
[  5]  14.00-15.00  sec  11.2 MBytes  94.4 Mbits/sec    0    218 KBytes       
[  5]  15.00-16.00  sec  12.5 MBytes   105 Mbits/sec    0    285 KBytes       
[  5]  16.00-17.00  sec  8.75 MBytes  73.4 Mbits/sec    6    248 KBytes       
[  5]  17.00-18.00  sec  10.0 MBytes  83.9 Mbits/sec    3    237 KBytes       
[  5]  18.00-19.00  sec  11.2 MBytes  94.4 Mbits/sec    0    299 KBytes       
[  5]  19.00-20.00  sec  11.2 MBytes  94.4 Mbits/sec    0    351 KBytes       
[  5]  20.00-20.04  sec  1.25 MBytes   275 Mbits/sec    0    352 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  5]   0.00-20.04  sec   204 MBytes  85.6 Mbits/sec   87             sender
[  5]   0.00-20.04  sec  0.00 Bytes  0.00 bits/sec                  receiver

BBR
(graph attached: bbr.png)
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.8.89, port 17201
[  5] local 10.10.61.58 port 5201 connected to 192.168.8.89 port 17203
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  5]   0.00-1.00   sec  13.6 MBytes   114 Mbits/sec    9    821 KBytes       
[  5]   1.00-2.00   sec  11.2 MBytes  94.4 Mbits/sec    8    670 KBytes       
[  5]   2.00-3.00   sec  10.0 MBytes  83.9 Mbits/sec    8    767 KBytes       
[  5]   3.00-4.00   sec  11.2 MBytes  94.4 Mbits/sec    0    297 KBytes       
[  5]   4.00-5.00   sec  11.2 MBytes  94.4 Mbits/sec    8    670 KBytes       
[  5]   5.00-6.00   sec  7.50 MBytes  62.9 Mbits/sec   17   1.01 MBytes       
[  5]   6.00-7.00   sec  8.75 MBytes  73.4 Mbits/sec   41   3.51 MBytes       
[  5]   7.00-8.00   sec  11.2 MBytes  94.4 Mbits/sec    0   3.51 MBytes       
[  5]   8.00-9.00   sec  11.2 MBytes  94.4 Mbits/sec   85   3.51 MBytes       
[  5]   9.00-10.00  sec  11.2 MBytes  94.4 Mbits/sec  246   3.51 MBytes       
[  5]  10.00-11.00  sec  8.75 MBytes  73.4 Mbits/sec    2    348 KBytes       
[  5]  11.00-12.00  sec  12.5 MBytes   105 Mbits/sec    0    299 KBytes       
[  5]  12.00-13.00  sec  7.50 MBytes  62.9 Mbits/sec   30   2.92 MBytes       
[  5]  13.00-14.00  sec  11.2 MBytes  94.4 Mbits/sec    0   3.13 MBytes       
[  5]  14.00-15.00  sec  11.2 MBytes  94.4 Mbits/sec    0   3.13 MBytes       
[  5]  15.00-16.00  sec  11.2 MBytes  94.4 Mbits/sec    0   3.12 MBytes       
[  5]  16.00-17.00  sec  11.2 MBytes  94.4 Mbits/sec    0   3.13 MBytes       
[  5]  17.00-18.00  sec  12.5 MBytes   105 Mbits/sec    0   3.13 MBytes       
[  5]  18.00-19.00  sec  5.00 MBytes  41.9 Mbits/sec   27   3.00 MBytes       
[  5]  19.00-20.00  sec  7.50 MBytes  62.9 Mbits/sec   15   1.44 MBytes       
[  5]  20.00-20.04  sec  0.00 Bytes  0.00 bits/sec    0   1.45 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  5]   0.00-20.04  sec   206 MBytes  86.3 Mbits/sec  496             sender
[  5]   0.00-20.04  sec  0.00 Bytes  0.00 bits/sec                  receiver

BBRv2
(graph attached: bbr2.png)
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.8.89, port 17277
[  5] local 10.10.61.58 port 5201 connected to 192.168.8.89 port 17279
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  5]   0.00-1.00   sec  11.4 MBytes  95.7 Mbits/sec   15    701 KBytes       
[  5]   1.00-2.00   sec  7.50 MBytes  62.9 Mbits/sec   23   1.48 MBytes       
[  5]   2.00-3.00   sec  12.5 MBytes   105 Mbits/sec    0   1.46 MBytes       
[  5]   3.00-4.00   sec  11.2 MBytes  94.4 Mbits/sec    0   1.47 MBytes       
[  5]   4.00-5.00   sec  11.2 MBytes  94.4 Mbits/sec    0    274 KBytes       
[  5]   5.00-6.00   sec  6.25 MBytes  52.4 Mbits/sec   43    157 KBytes       
[  5]   6.00-7.00   sec  11.2 MBytes  94.4 Mbits/sec    0    164 KBytes       
[  5]   7.00-8.00   sec  8.75 MBytes  73.4 Mbits/sec    8    312 KBytes       
[  5]   8.00-9.00   sec  8.75 MBytes  73.4 Mbits/sec    8    509 KBytes       
[  5]   9.00-10.00  sec  11.2 MBytes  94.4 Mbits/sec    0    509 KBytes       
[  5]  10.00-11.00  sec  11.2 MBytes  94.4 Mbits/sec    0    218 KBytes       
[  5]  11.00-12.00  sec  12.5 MBytes   105 Mbits/sec    0    435 KBytes       
[  5]  12.00-13.00  sec  11.2 MBytes  94.4 Mbits/sec    0    435 KBytes       
[  5]  13.00-14.00  sec  7.50 MBytes  62.9 Mbits/sec   32    509 KBytes       
[  5]  14.00-15.00  sec  8.75 MBytes  73.4 Mbits/sec   12    823 KBytes       
[  5]  15.00-16.00  sec  11.2 MBytes  94.4 Mbits/sec    0    823 KBytes       
[  5]  16.00-17.00  sec  11.2 MBytes  94.4 Mbits/sec    0    823 KBytes       
[  5]  17.00-18.00  sec  5.00 MBytes  41.9 Mbits/sec  198    269 KBytes       
[  5]  18.00-19.00  sec  10.0 MBytes  83.9 Mbits/sec   71    254 KBytes       
[  5]  19.00-20.00  sec  8.75 MBytes  73.4 Mbits/sec    8    291 KBytes       
[  5]  20.00-20.04  sec  0.00 Bytes  0.00 bits/sec    0    291 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  5]   0.00-20.04  sec   198 MBytes  82.7 Mbits/sec  418             sender
[  5]   0.00-20.04  sec  0.00 Bytes  0.00 bits/sec                  receiver

---

The packets captured by tcpdump are in the attached file.

---
xtao
(attachment: ccas.tar.gz)

Jonathan Morton

Jun 15, 2021, 8:30:29 AM
to Tao Tse, BBR Development
> On 15 Jun, 2021, at 2:11 pm, Tao Tse <g.xi...@gmail.com> wrote:
>
> Firstly, I set a netem qdisc at the sender side:
> tc qd replace dev eth0 root netem delay 10ms rate 100mbit loss 0.1%
>
> According to the BBRv1 paper, it should fully utilize the BltBw(100Mpbs) when loss rate is under 20% (this was also discussed here), but it did not.

It's common for netem to run out of internal buffer space by default, especially when the "delay" parameter is used, and that will increase the loss rate beyond what you have explicitly set. Choose a sufficiently large "limit" parameter, dimensioned in packets, to correct this.
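
For example, something along these lines (the limit value here is only illustrative; it needs to cover the bandwidth-delay product plus the emulated buffer):

tc qd replace dev eth0 root netem delay 10ms rate 100mbit loss 0.1% limit 1000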

- Jonathan Morton

Eric Dumazet

Jun 15, 2021, 8:35:05 AM
to Jonathan Morton, Tao Tse, BBR Development
Another suggestion: run netem at the receiver side.

netem on the sender might have unwanted effects that are not fully mitigated.



Neal Cardwell

Jun 15, 2021, 10:24:20 AM
to Tao Tse, BBR Development
As Jonathan notes, you'll need to set a limit for the netem qdisc (limit <L>) that is big enough to hold the packets in the emulated link, plus the emulated buffer. You can use the following Python helper to compute a limit for a netem qdisc, with a rate in Mbit/sec, delay in ms, and bottleneck buffer size in packets (assuming MTU=1500 bytes):

def netem_limit(rate, delay, buf):
    """Get netem limit in packets.

    Needs to hold the packets in the emulated pipe and the emulated buffer.
    """
    bdp_bits = (rate * 1000000.0) * (delay / 1000.0)
    bdp_bytes = bdp_bits / 8.0
    bdp = int(bdp_bytes / 1500.0)
    limit = bdp + buf
    return limit
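
For the scenario in this thread (rate=100, delay=10), the pipe alone holds int(100e6 * 0.010 / 8 / 1500) = 83 MTU-sized packets. So with, say, a 100-packet emulated buffer (an arbitrary example value), netem_limit(100, 10, 100) returns 183, and the netem "limit" should be at least that.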

Also, with regard to the "expected bandwidth", keep in mind that the application-visible throughput will not match the link bandwidth, due to header overhead. For TCP/IPv4 the maximum goodput will be something like 1448 / 1514 of the link bandwidth, or roughly 95.6% of the link bandwidth.
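
Concretely, for the 100 Mbit/sec netem rate used here, that works out to roughly 100 * 1448 / 1514 ≈ 95.6 Mbit/sec of achievable goodput (the exact ratio depends on the TCP options in use), so the per-second intervals around 94-95 Mbit/sec in the iperf3 output above are already close to the ceiling.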

Also, as Eric noted, tests will not get realistic results with netem running on the sender. For more details about why this is the case, and some suggested alternatives, please see this BBR FAQ entry:

https://github.com/google/bbr/blob/master/Documentation/bbr-faq.md#how-can-i-test-linux-tcp-bbr-with-an-emulated-network

"""

How can I test Linux TCP BBR with an emulated network?

For a feature-rich tool to test Linux TCP performance over emulated networks, check out the transperf tool, which handles the details of configuring network emulation on a single machine or sets of physical machines.

If you want to manually configure an emulated network scenario on Linux machines, you can use netem directly. However, keep in mind that TCP performance results are not realistic when netem is installed on the sending machine, due to interactions between netem and mechanisms like TSQ (TCP small queues). To get realistic TCP performance results with netem, the netem qdisc has to be installed either on an intermediate "router" machine or on the ingress path of the receiving machine.

For examples on how to install netem on the ingress of a machine, see the ifb0 example in the "How can I use netem on incoming traffic?" section of the linuxfoundation.org netem page.

Another factor to consider is that when you emulate loss with netem, the netem qdisc makes drop decisions in terms of entire sk_buff TSO bursts (of up to 44 MTU-sized packets), rather than individual MTU-sized packets. This makes the loss process highly unrealistic relative to a drop process that drops X% of MTU-sized packets: the time in between drops can be up to 44x longer, and the drops are much burstier (e.g. dropping 44 MTU-sized packets in a single sk_buff). For more realistic loss processes you may need to disable LRO and GRO.

"""

If you are still seeing unexpectedly low throughput, then please take sender-side packet traces and visualize them and/or share them:

e.g.:
tcpdump -w ./trace.pcap -s 120 -c 100000000 host $HOST &
thanks,
neal



Tao Tse

Jun 15, 2021, 10:55:39 AM
to BBR Development
Hi Jonathan & Neal,
Thanks for your detailed information about netem!
I'll try your suggestions later.

--
xtao

Dave Taht

Jun 15, 2021, 1:52:43 PM
to Neal Cardwell, Tao Tse, BBR Development
Some of us who care about induced latency tend to use a much smaller MSS than 1500 (~600 in my case), and thus the netem packet limit needs to be even larger to be accurate.
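
(For instance, at ~600-byte packets the 100 Mbit/s x 10 ms pipe discussed above holds roughly 100e6 * 0.010 / 8 / 600 ≈ 208 packets, versus ~83 at 1500 bytes.)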


