Share some test results of bbr-vs-cubic here


Tao Tse

Aug 4, 2020, 8:40:32 AM
to BBR Development
Hi,

Here are some test results I collected of fetching ~1 MB of data from a remote server using BBR vs. CUBIC.

Test description:
The client fetched ~1 MB of data from the remote server almost 2000 times over a WAN; the server was about 2000 km (~50 ms RTT) away from the client. The time cost (in seconds) was recorded for every try.
Note:
  - The tests ran one after another, not concurrently.
  - The client ran on a wired network.

[Attachment: 20200804201647.png]
* Y axis: time cost (in seconds); X axis: test case number.
* Orange dots: BBR's costs; blue dots: CUBIC's costs.

As you can see, BBR is steadier, and it completed the transfers in less time.

Furthermore, given the small data size, I guess BBR stayed in the STARTUP state throughout these tests. The main reason for BBR's good results may be that it pays less attention to packet loss and instead keeps its eye on the bottleneck bandwidth estimate.
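That STARTUP guess can be sanity-checked with rough slow-start arithmetic. The numbers below (initial window of 10 segments, ~1460-byte MSS, per-RTT doubling, zero loss) are illustrative assumptions, not measurements from these tests:

```shell
# Rough sketch: how many RTTs would loss-free slow start need for ~1 MB?
# Assumed: initial cwnd of 10 segments, 1460-byte MSS, doubling per RTT.
awk 'BEGIN {
    bytes = 1000000; mss = 1460; cwnd = 10; rtts = 0; sent = 0
    while (sent < bytes) { sent += cwnd * mss; cwnd *= 2; rtts++ }
    printf "about %d RTTs (~%d ms at a 50 ms RTT)\n", rtts, rtts * 50
}'
# -> about 7 RTTs (~350 ms at a 50 ms RTT)
```

If the slow CUBIC runs take many multiples of that, something (loss recovery, or a cached ssthresh) is probably cutting slow start short.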

--
xtao

Neal Cardwell

Aug 4, 2020, 9:06:23 AM
to Tao Tse, BBR Development
Thanks for sharing your experiment results.

One thing to consider here: it seems that the first few CUBIC transfers were quick, and the following transfers were not. My guess would be that after some CUBIC connection saw a packet loss, the server's kernel may have cached the low ssthresh value for the client's IP, causing further CUBIC transfers to exit slow start at that lower ssthresh value, and thus to be quite slow.

To see what TCP metrics have been cached, you can run:
  ip tcp_metrics show
Then you can grep for the client's IP to see if an ssthresh value has been saved for the client's IP.
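For example (a hypothetical sketch; the IP addresses and the metrics line below are made-up placeholders, not values from this thread):

```shell
# Hypothetical sketch: detect a cached ssthresh for one client IP in
# `ip tcp_metrics show` output. CLIENT_IP and the sample line are
# placeholders; on a real server, pipe the live command output instead.
CLIENT_IP="192.0.2.10"
sample='192.0.2.10 age 91.518sec ssthresh 7 cwnd 819 rtt 29711us source 198.51.100.1'
line=$(printf '%s\n' "$sample" | grep -F "$CLIENT_IP")
case "$line" in
    *ssthresh*) echo "ssthresh cached for $CLIENT_IP" ;;
    *)          echo "no ssthresh cached for $CLIENT_IP" ;;
esac
```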

It would be interesting to see experiment results where "ip tcp_metrics flush all" is run between experiments, to flush the ssthresh cache, or where the kernel is recent enough (v5.6 or later) to have the following patch that fixes this issue:

  65e6d90168f3 net-tcp: Disable TCP ssthresh metrics cache by default
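A dry-run sketch of that experiment loop (the URL is a placeholder; drop the `echo`s and run as root to actually execute it):

```shell
# Dry-run sketch: flush cached TCP metrics before every transfer so no
# stale ssthresh carries over between runs. "echo" keeps this harmless;
# remove it (and run as root) to perform the real experiment.
for i in 1 2 3; do
    echo ip tcp_metrics flush all
    echo curl -o /dev/null https://server.example/1MB.bin   # placeholder URL
done
```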

best,
neal


--
You received this message because you are subscribed to the Google Groups "BBR Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bbr-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bbr-dev/72122cab-88e3-4108-8196-ef2d42acb066n%40googlegroups.com.

Tao Tse

Aug 4, 2020, 10:59:07 PM
to BBR Development
Sorry, I forgot to include the server's kernel version info; here it is:
# uname -r
5.7.10-1.el7.elrepo.x86_64

Also, the server's congestion control is configured to bbr now. Here is the tcp_metrics info on the server for the client's IP; I'm not sure if this is helpful.
# ip tcp_metrics show | grep a.a.a.a
a.a.a.a age 91.518sec cwnd 819 rtt 29711us rttvar 2497us source b.b.b.b

Since the server is running an online service now, it's unavailable for testing at present. If I get a chance to run the test again, I'll share the results here.

--
xtao

Neal Cardwell

Aug 4, 2020, 11:18:25 PM
to Tao Tse, BBR Development
Thanks for the info about the server's kernel version. Since the kernel is 5.7.x and the fix (65e6d90168f3 net-tcp: Disable TCP ssthresh metrics cache by default) is in 5.6, my theory would not apply. And your dump of the tcp_metrics seems to confirm that. So the difference in performance is probably not due to that ssthresh caching issue for CUBIC, but is likely due to the differing responses to packet loss between CUBIC and BBR.

thanks,
neal



Neal Cardwell

Aug 4, 2020, 11:20:11 PM
to Tao Tse, BBR Development
It occurs to me that another possibility for the lower throughput for CUBIC is Hystart spuriously exiting slow-start. That seems less likely, given the client is on a wired network. But you could check nstat counters or packet traces on the server to determine whether that might be an ingredient.

best,
neal

Tao Tse

Aug 6, 2020, 5:25:39 AM
to BBR Development
The server isn't running CUBIC now; do the nstat counters still make sense? And which counters should I focus on to verify this Hystart guess?

# nstat
#kernel
IpInReceives 1251438358 0.0
IpInDelivers 1251438358 0.0
IpOutRequests 699994577 0.0
IcmpInMsgs 10588 0.0
IcmpInErrors 2119 0.0
IcmpInDestUnreachs 4313 0.0
IcmpInTimeExcds 71 0.0
IcmpInEchos 6202 0.0
IcmpInEchoReps 2 0.0
IcmpOutMsgs 6463 0.0
IcmpOutDestUnreachs 259 0.0
IcmpOutEchos 2 0.0
IcmpOutEchoReps 6202 0.0
IcmpMsgInType0 2 0.0
IcmpMsgInType3 4313 0.0
IcmpMsgInType8 6202 0.0
IcmpMsgInType11 71 0.0
IcmpMsgOutType0 6202 0.0
IcmpMsgOutType3 259 0.0
IcmpMsgOutType8 2 0.0
TcpActiveOpens 435214 0.0
TcpPassiveOpens 2056642 0.0
TcpAttemptFails 6195 0.0
TcpEstabResets 655548 0.0
TcpInSegs 1250767706 0.0
TcpOutSegs 2360625405 0.0
TcpRetransSegs 25106210 0.0
TcpInErrs 320 0.0
TcpOutRsts 354411 0.0
TcpInCsumErrors 272 0.0
UdpInDatagrams 659781 0.0
UdpNoPorts 259 0.0
UdpOutDatagrams 660750 0.0
TcpExtEmbryonicRsts 6195 0.0
TcpExtOutOfWindowIcmps 335 0.0
TcpExtTW 85422 0.0
TcpExtPAWSEstab 974867 0.0
TcpExtDelayedACKs 7281016 0.0
TcpExtDelayedACKLocked 2594 0.0
TcpExtDelayedACKLost 333471 0.0
TcpExtListenDrops 32 0.0
TcpExtTCPHPHits 250201435 0.0
TcpExtTCPPureAcks 500768044 0.0
TcpExtTCPHPAcks 297950962 0.0
TcpExtTCPRenoRecovery 1100 0.0
TcpExtTCPSackRecovery 1901270 0.0
TcpExtTCPSACKReneging 696 0.0
TcpExtTCPSACKReorder 33906514 0.0
TcpExtTCPRenoReorder 4367 0.0
TcpExtTCPTSReorder 202427 0.0
TcpExtTCPFullUndo 75516 0.0
TcpExtTCPPartialUndo 65122 0.0
TcpExtTCPDSACKUndo 56403 0.0
TcpExtTCPLossUndo 43826 0.0
TcpExtTCPLostRetransmit 3204792 0.0
TcpExtTCPRenoFailures 171 0.0
TcpExtTCPSackFailures 29131 0.0
TcpExtTCPLossFailures 7583 0.0
TcpExtTCPFastRetrans 23745657 0.0
TcpExtTCPSlowStartRetrans 876429 0.0
TcpExtTCPTimeouts 173993 0.0
TcpExtTCPLossProbes 332980 0.0
TcpExtTCPLossProbeRecovery 17017 0.0
TcpExtTCPRenoRecoveryFail 289 0.0
TcpExtTCPSackRecoveryFail 42554 0.0
TcpExtTCPBacklogCoalesce 2301223 0.0
TcpExtTCPDSACKOldSent 333284 0.0
TcpExtTCPDSACKOfoSent 178 0.0
TcpExtTCPDSACKRecv 5631738 0.0
TcpExtTCPDSACKOfoRecv 382127 0.0
TcpExtTCPAbortOnData 1 0.0
TcpExtTCPAbortOnClose 314117 0.0
TcpExtTCPAbortOnTimeout 4381 0.0
TcpExtTCPSACKDiscard 255 0.0
TcpExtTCPDSACKIgnoredOld 48281 0.0
TcpExtTCPDSACKIgnoredNoUndo 2764740 0.0
TcpExtTCPSpuriousRTOs 5398 0.0
TcpExtTCPSackShifted 38040062 0.0
TcpExtTCPSackMerged 47233125 0.0
TcpExtTCPSackShiftFallback 73104475 0.0
TcpExtTCPRcvCoalesce 77967701 0.0
TcpExtTCPOFOQueue 4668 0.0
TcpExtTCPOFOMerge 178 0.0
TcpExtTCPChallengeACK 7482 0.0
TcpExtTCPSYNChallenge 55 0.0
TcpExtTCPFastOpenCookieReqd 2 0.0
TcpExtTCPSpuriousRtxHostQueues 401 0.0
TcpExtTCPAutoCorking 2010662 0.0
TcpExtTCPFromZeroWindowAdv 112938 0.0
TcpExtTCPToZeroWindowAdv 114002 0.0
TcpExtTCPWantZeroWindowAdv 660166 0.0
TcpExtTCPSynRetrans 17415 0.0
TcpExtTCPOrigDataSent 2152532866 0.0
TcpExtTCPACKSkippedSynRecv 1891 0.0
TcpExtTCPACKSkippedPAWS 815737 0.0
TcpExtTCPACKSkippedSeq 167488 0.0
TcpExtTCPACKSkippedFinWait2 73 0.0
TcpExtTCPACKSkippedTimeWait 16 0.0
TcpExtTCPACKSkippedChallenge 36376 0.0
TcpExtTCPWinProbe 185007 0.0
TcpExtTCPKeepAlive 9858901 0.0
TcpExtTCPDelivered 2138230862 0.0
TcpExtTCPDeliveredCE 8756 0.0
TcpExtTCPAckCompressed 42 0.0
TcpExtTcpTimeoutRehash 134469 0.0
TcpExtTcpDuplicateDataRehash 8877 0.0
IpExtInOctets 3181319513129 0.0
IpExtOutOctets 3031910039135 0.0
IpExtInNoECTPkts 3103463378 0.0
IpExtInECT1Pkts 3420 0.0
IpExtInECT0Pkts 74073 0.0
IpExtInCEPkts 3 0.0                                              

BTW, what do you mean by "packet traces"?

--
xtao

Neal Cardwell

Aug 6, 2020, 8:33:04 AM
to Tao Tse, BBR Development
Regarding counters, the Hystart counters have the string "Hystart" in them. For example:

TcpExtTCPHystartTrainDetect     57538              0.0
TcpExtTCPHystartTrainCwnd       14774147           0.0
TcpExtTCPHystartDelayDetect     1262               0.0
TcpExtTCPHystartDelayCwnd       314881             0.0

Please see the CUBIC source code in tcp_cubic.c for the meaning of those counters.
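A quick way to pull those out on the server is a grep over the counter dump (sketch; the hardcoded sample text below stands in for live `nstat -az` output):

```shell
# Sketch: filter just the Hystart counters out of nstat output.
# The sample is illustrative; on the server, replace the printf with
# `nstat -az` to include zero-valued counters too.
sample='TcpExtTCPHystartTrainDetect 57538 0.0
TcpExtTCPHystartDelayDetect 1262 0.0
TcpExtTCPFastRetrans 23745657 0.0'
printf '%s\n' "$sample" | grep Hystart
```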

By "packet traces", I mean files recording packets captured by a tool like tcpdump ( https://www.tcpdump.org/ ). Here is a recipe for taking a trace with tcpdump and visualizing it:


best,
neal


XTao

Aug 9, 2020, 10:28:02 PM
to Neal Cardwell, BBR Development
Neal, thanks for your detailed information!

--
xtao