BBR test on wireless path


cf623...@gmail.com

Jan 11, 2017, 11:10:29 AM
to BBR Development
Hi All,

I have set up a wireless environment to test BBR: a PC host connected to a wireless router by cable serves as the sender, and a laptop connected to the router's WLAN hotspot serves as the data receiver.

 
      cable
PC ----------- Wireless Router  .)))  Laptop


BBR is deployed on the PC host, which runs kernel 4.9-rc2, and iperf is used to send/receive data on the PC/laptop. For contrast, the CUBIC algorithm was tested as well. What confuses me is that CUBIC gets higher throughput than BBR in my tests, both with and without fq qdisc pacing enabled. (Since BBR is rate-based rather than loss-based, I had expected it to reach higher throughput than a loss-based algorithm on an unreliable wireless link.)
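(Configuration note: the exact commands are not shown above. On a 4.9 kernel a setup like this is typically switched between algorithms with "sysctl -w net.ipv4.tcp_congestion_control=bbr" (or "=cubic") on the sender, fq is installed with "tc qdisc replace dev <dev> root fq", and fq's pacing can be toggled with tc-fq's pacing/nopacing options; the device name here is a placeholder.)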


Related test info as follows:



1. Sender with the CUBIC algorithm gets 15.7 Mbps throughput
 
    iperf2 -c 192.168.1.201 -p 7070 -i 1 -t 30 -P 1       
    
    [  3]  0.0-30.1 sec  56.4 MBytes  15.7 Mbits/sec

    -----------------------------------------------------------------------

     from tcptrace: (about 0.05% loss)

     actual data pkts:      17579           actual data pkts:          0      
     rexmt data pkts:           9           rexmt data pkts:           0  

    -------------------------------------------------------------------------

    from 1 s interval ss output:

            cubic wscale:0,7 rto:220 rtt:19.708/13.509 mss:1448 cwnd:47 ssthresh:34 bytes_acked:35900289 segs_out:25026 segs_in:12503 data_segs_out:25024 send 27.6Mbps lastsnd:9 lastrcv:1916349 lastack:9 pacing_rate 33.1Mbps unacked:45 retrans:0/8 rcv_space:29200 notsent:267880 minrtt:1.899
         cubic wscale:0,7 rto:290 rtt:89.453/6.28 mss:1448 cwnd:47 ssthresh:34 bytes_acked:37888768 segs_out:26412 segs_in:13196 data_segs_out:26410 send 6.1Mbps lastsnd:138 lastrcv:1917353 lastack:142 pacing_rate 7.3Mbps unacked:45 retrans:0/8 rcv_space:29200 notsent:222992 minrtt:1.899
         cubic wscale:0,7 rto:212 rtt:10.534/1.361 mss:1448 cwnd:47 ssthresh:34 bytes_acked:39452233 segs_out:27498 segs_in:13741 data_segs_out:27496 send 51.7Mbps lastsnd:1 lastrcv:1918358 lastack:1 pacing_rate 62.0Mbps unacked:41 retrans:0/8 rcv_space:29200 notsent:226961 minrtt:1.899
         cubic wscale:0,7 rto:261 rtt:60.695/5.236 mss:1448 cwnd:47 ssthresh:34 bytes_acked:41967409 segs_out:29253 segs_in:14616 data_segs_out:29251 send 9.0Mbps lastsnd:13 lastrcv:1919362 lastack:17 pacing_rate 10.8Mbps unacked:46 retrans:0/8 rcv_space:29200 notsent:348593 minrtt:1.899
         cubic wscale:0,7 rto:213 rtt:12.386/3.724 mss:1448 cwnd:47 ssthresh:34 bytes_acked:44046737 segs_out:30696 segs_in:15339 data_segs_out:30694 send 44.0Mbps lastsnd:21 lastrcv:1920366 lastack:22 pacing_rate 52.7Mbps unacked:43 retrans:0/8 rcv_space:29200 notsent:249056 minrtt:1.899


2. Sender with BBR + fq (pacing disabled) gets 13.2 Mbps

    iperf2 -c 192.168.1.201 -p 7070 -i 1 -t 30 -P 1 

    [  3]  0.0-30.1 sec  47.5 MBytes  13.2 Mbits/sec
  
    ------------------------------------------------------------------------

    from tcptrace: (about 0.03% loss)
 
     actual data pkts:      12478           actual data pkts:          0      
     rexmt data pkts:           4           rexmt data pkts:           0  

    ---------------------------------------------------------------------------
 
    from 1 s interval ss output:

     bbr wscale:0,7 rto:238 rtt:37.473/8.284 mss:1448 cwnd:42 ssthresh:36 bytes_acked:3434681 segs_out:2430 segs_in:1194 data_segs_out:2428 bbr mode:probe_bw,ca state:Open,maxbw:64.9M,minrtt:2.339,pacing gain:256,cwnd gain:512, send 13.0Mbps lastsnd:1 lastrcv:3475176 lastack:3 pacing_rate 67.2Mbps unacked:42 rcv_space:29200 notsent:654496 minrtt:2.339
         bbr wscale:0,7 rto:271 rtt:17.134/1.198 mss:1448 cwnd:24 ssthresh:36 bytes_acked:4976801 segs_out:3477 segs_in:1727 data_segs_out:3475 bbr mode:probe_bw,ca state:Open,maxbw:34.9M,minrtt:2.339,pacing gain:256,cwnd gain:512, send 16.2Mbps lastsnd:19 lastrcv:3476181 lastack:20 pacing_rate 36.2Mbps unacked:23 rcv_space:29200 notsent:955680 minrtt:2.339
         bbr wscale:0,7 rto:230 rtt:29.556/8.483 mss:1448 cwnd:16 ssthresh:36 bytes_acked:6233665 segs_out:4339 segs_in:2161 data_segs_out:4337 bbr mode:probe_bw,ca state:Open,maxbw:21.0M,minrtt:2.339,pacing gain:256,cwnd gain:512, send 6.3Mbps lastsnd:19 lastrcv:3477185 lastack:81 pacing_rate 21.7Mbps unacked:17 rcv_space:29200 notsent:758752 minrtt:2.339
         bbr wscale:0,7 rto:216 rtt:15.587/4.026 mss:1448 cwnd:32 ssthresh:36 bytes_acked:6943185 segs_out:4845 segs_in:2406 data_segs_out:4843 bbr mode:probe_bw,ca state:Open,maxbw:46.3M,minrtt:2.339,pacing gain:256,cwnd gain:512, send 23.8Mbps lastsnd:24 lastrcv:3478190 lastack:57 pacing_rate 48.0Mbps unacked:33 rcv_space:29200 notsent:726896 minrtt:2.339
         bbr wscale:0,7 rto:260 rtt:14.821/2.368 mss:1448 cwnd:18 ssthresh:36 bytes_acked:7991537 segs_out:5554 segs_in:2768 data_segs_out:5552 bbr mode:probe_bw,ca state:Open,maxbw:27.0M,minrtt:2.339,pacing gain:256,cwnd gain:512, send 14.1Mbps lastsnd:16 lastrcv:3479194 lastack:16 pacing_rate 27.9Mbps unacked:18 rcv_space:29200 notsent:751512 minrtt:2.339
         

3. Sender with BBR + fq (pacing enabled) gets 9.79 Mbps


    iperf2 -c 192.168.1.201 -p 7070 -i 1 -t 30 -P 1 

    [  3]  0.0-30.4 sec  35.5 MBytes  9.79 Mbits/sec
  
    ------------------------------------------------------------------------

    from tcptrace: (about 0.04% loss)
 
    actual data pkts:      12062           actual data pkts:          0        
     rexmt data pkts:           6           rexmt data pkts:           0    

    ---------------------------------------------------------------------------
 
   from 1 s interval ss output:

             bbr wscale:0,7 rto:208 rtt:7.359/1.107 mss:1448 cwnd:42 bytes_acked:3711249 segs_out:2628 segs_in:1293 data_segs_out:2626 bbr mode:probe_bw,ca state:Open,maxbw:48.3M,minrtt:3.753,pacing gain:192,cwnd gain:512, send 66.1Mbps lastsnd:1 lastrcv:3548446 lastack:5 pacing_rate 37.5Mbps unacked:42 retrans:0/1 rcv_space:29200 notsent:1560944 minrtt:3.753
         bbr wscale:0,7 rto:226 rtt:25.274/1.817 mss:1448 cwnd:20 bytes_acked:5343145 segs_out:3724 segs_in:1857 data_segs_out:3722 bbr mode:probe_bw,ca state:Open,maxbw:28.3M,minrtt:2.464,pacing gain:256,cwnd gain:512, send 9.2Mbps lastsnd:1 lastrcv:3549451 lastack:2 pacing_rate 29.3Mbps unacked:10 retrans:0/1 rcv_space:29200 notsent:1377048 minrtt:2.464
         bbr wscale:0,7 rto:212 rtt:11.586/3.568 mss:1448 cwnd:16 bytes_acked:6652137 segs_out:4629 segs_in:2309 data_segs_out:4627 bbr mode:probe_bw,ca state:Open,maxbw:19.2M,minrtt:2.464,pacing gain:256,cwnd gain:512, send 16.0Mbps lastsnd:1 lastrcv:3550456 lastack:1 pacing_rate 19.9Mbps unacked:11 retrans:0/1 rcv_space:29200 notsent:1404560 minrtt:2.464
         bbr wscale:0,7 rto:221 rtt:20.082/9.429 mss:1448 cwnd:18 bytes_acked:8076969 segs_out:5620 segs_in:2801 data_segs_out:5618 bbr mode:probe_bw,ca state:Open,maxbw:24.5M,minrtt:2.464,pacing gain:256,cwnd gain:512, send 10.4Mbps lastsnd:5 lastrcv:3551460 lastack:9 pacing_rate 25.4Mbps unacked:18 retrans:0/1 rcv_space:29200 notsent:1307544 minrtt:2.464
         bbr wscale:0,7 rto:212 rtt:11.775/2.485 mss:1448 cwnd:14 bytes_acked:9397545 segs_out:6527 segs_in:3257 data_segs_out:6525 bbr mode:probe_bw,ca state:Open,maxbw:15.8M,minrtt:2.464,pacing gain:256,cwnd gain:512, send 13.8Mbps lastsnd:17 lastrcv:3552465 lastack:21 pacing_rate 16.4Mbps unacked:13 retrans:0/1 rcv_space:29200 notsent:1332160 minrtt:2.464
         bbr wscale:0,7 rto:233 rtt:32.745/17.781 mss:1448 cwnd:14 bytes_acked:10408249 segs_out:7226 segs_in:3606 data_segs_out:7224 bbr mode:probe_bw,ca state:Open,maxbw:16.5M,minrtt:2.279,pacing gain:256,cwnd gain:512, send 5.0Mbps lastsnd:26 lastrcv:3553469 lastack:28 pacing_rate 17.1Mbps unacked:14 retrans:0/1 rcv_space:29200 notsent:1689816 minrtt:2.279



No packets were dropped at the TC layer in any test.

From my rough ss output, BBR's cwnd appears much more variable than CUBIC's. Please see the attachment for the tcpdump trace files.
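(The 1-second ss samples above can be collected with a loop along the lines of "while :; do ss -tin dst 192.168.1.201; sleep 1; done"; the exact command used is not shown here.)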

Any suggestions would be greatly appreciated!

Regards,
devin

bbr_cubic_pcap.rar

Dave Taht

Jan 11, 2017, 11:20:35 AM
to cf623...@gmail.com, BBR Development
On Wed, Jan 11, 2017 at 8:10 AM, <cf623...@gmail.com> wrote:
> Hi All,
>
> I have set up a wireless environment (a PC host connected to a wireless
> router by cable serves as the sender, and a laptop connected to the WLAN
> hotspot serves as the receiver)

Wifi drivers almost universally suck, and it would help to identify
the wireless cards under test here.

I've planned to add BBR to the make-wifi-fast test suite against the
new ath9k + fq_codel + ATF stuff for a while now, have not quite got
around to it yet.

I've generally expected, in simple scenarios such as yours, BBR's 1ms
pacing to interact badly (from a throughput perspective) with
aggregation (which happens essentially on 4+ms intervals); in more complex
scenarios (more clients/interference) I have not the foggiest idea -
and in either case there was a huge latency win.

Can you measure throughput and latency with flent.org's tcp_nup or
tcp_ndown tests with bbr vs cubic?
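(For reference: with a netperf server running on the receiver, a typical invocation would look something like "flent tcp_ndown -H <receiver> -l 60 -t bbr-test"; the host and title here are placeholders.)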



--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org

Neal Cardwell

Jan 11, 2017, 12:34:29 PM
to Dave Taht, cf623...@gmail.com, BBR Development
Thanks for the detailed report. This was a useful data point.

Apart from whatever may be going on at the wifi layer (which Dave nicely alludes to) the BBR trace seems to nicely correspond to a known area where we'd like to improve BBR: provisioning sufficient cwnd for paths with very delayed, stretched, or aggregated ACKs. Such behavior is common for cellular, cable modem, or wifi paths. And it shows up here, in a dramatic fashion: the min_rtt is around 2ms, but often the ACKs for a flight arrive about 20ms later, in a tight burst.

I have attached screenshots of the tcptrace/xplot output for the CUBIC and fq-paced BBR traces, for comparison.

In the BBR trace one can see that a big part of the dynamic is that the flow runs out of cwnd. This is because the cwnd in BBR is currently calculated as: cwnd = cwnd_gain * bw * min_rtt + 3 * tso_segs. In this particular case, as shown in the nice ss output, this can give a cwnd as small as 14 packets, which is not nearly enough to saturate this bottleneck.
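To make the arithmetic concrete, here is a small Python sketch of that formula, plugged with round numbers from the ss output above (cwnd gain 512/256 = 2, MSS 1448; the tso_segs value is an assumption, and this is an illustration, not the kernel code):

    MSS = 1448          # bytes, from the ss output
    CWND_GAIN = 2.0     # ss shows "cwnd gain:512", i.e. 512/256 in BBR's fixed-point units
    TSO_SEGS = 3        # assumed small; the formula adds 3 * tso_segs of headroom

    def bbr_cwnd_packets(bw_bps, min_rtt_s, tso_segs=TSO_SEGS):
        bdp_bytes = (bw_bps / 8.0) * min_rtt_s      # bandwidth-delay product
        return CWND_GAIN * bdp_bytes / MSS + 3 * tso_segs

    # Late in the paced-BBR trace: maxbw ~16 Mbit/s, min_rtt ~2.5 ms
    print(bbr_cwnd_packets(16e6, 0.0025))           # ~16 packets, matching the cwnd:14-18 in ss

    # But ACKs for a flight often arrive ~20 ms after the data was sent;
    # covering that gap would need roughly bw * 20 ms in flight:
    print((16e6 / 8.0) * 0.020 / MSS)               # ~28 packets, about twice what BBR allows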

For traffic traveling over the Internet, this is usually not an issue in our experience, since the longer RTT for the WAN portion of the path causes enough cwnd to be provisioned for the wifi hop. But if the path is just a wifi LAN, the cwnd can be insufficient.

As we have discussed in a thread related to cable modem behavior:
https://groups.google.com/d/msg/bbr-dev/Fj2emRS4Wn4/k23d7nPPCAAJ
we are experimenting with an approach that includes some budget in the cwnd for the aggregation/burstiness levels recently observed in the ACK stream. That might work reasonably well in this wifi LAN case, where the ACKs for nearly an entire flight quite often arrive in a tight burst that is easily measured.
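As a rough illustration of what such a budget could look like (a hedged sketch of the idea, not the actual patch; all names here are made up): track how far the bytes ACKed in a burst run ahead of what the estimated bandwidth predicts, and keep the maximum as extra cwnd headroom.

    class AckAggregationBudget:
        """Sketch: extra cwnd headroom for bursty/aggregated ACKs (illustrative)."""
        def __init__(self):
            self.epoch_start = None   # when the current aggregation epoch began
            self.acked_in_epoch = 0   # bytes ACKed since epoch_start
            self.extra_acked = 0      # max burstiness observed, in bytes

        def on_ack(self, now, acked_bytes, bw_bytes_per_sec):
            if self.epoch_start is None:
                self.epoch_start, self.acked_in_epoch = now, 0
            # Bytes we would expect to have been ACKed at the estimated rate.
            expected = bw_bytes_per_sec * (now - self.epoch_start)
            self.acked_in_epoch += acked_bytes
            extra = self.acked_in_epoch - expected
            if extra <= 0:
                # ACKs are keeping pace with the delivery rate: restart the epoch.
                self.epoch_start, self.acked_in_epoch = now, 0
            else:
                self.extra_acked = max(self.extra_acked, extra)

        def cwnd_budget_packets(self, mss=1448):
            # Headroom to add on top of cwnd_gain * bw * min_rtt.
            return int(self.extra_acked / mss)

With the ~20 ms ACK bursts seen in this trace, extra_acked would grow to roughly bw * 20 ms, restoring the two-dozen-packet flight the link needs.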

Thanks!

neal



2017-01-11-wifi-lan-cubic-1.png
2017-01-11-wifi-lan-bbr-1.png

Dave Taht

Jan 11, 2017, 1:37:44 PM
to Neal Cardwell, cf623...@gmail.com, BBR Development
On Wed, Jan 11, 2017 at 9:33 AM, Neal Cardwell <ncar...@google.com> wrote:
> Thanks for the detailed report. This was a useful data point.

If you want more terrifying traces I can supply a bunch from multiple
common wifi drivers - but not against BBR as yet.

> Apart from whatever may be going on at the wifi layer (which Dave nicely
> alludes to) the BBR trace seems to nicely correspond to a known area where

Btw: recently it was discovered that a large number of cellular modems
actually have Linux deeply embedded in them:

https://www.youtube.com/watch?v=sq9chzNVoXg

(Which (a) gives me hope that the firmware could be dramatically
improved to reduce bloat, and (b) gives me the heebie-jeebies - accessing
Linux via the Hayes AT command set? What could go wrong?)

I am a huge fan of BBR's approach but still feel co-evolving the
firmware would be most helpful, particularly if a huge swath like
QCA's gear could be improved, per above.....

> we'd like to improve BBR: provisioning sufficient cwnd for paths with very
> delayed, stretched, or aggregated ACKs. Such behavior is common for
> cellular, cable modem, or wifi paths. And it shows up here, in a dramatic
> fashion: the min_rtt is around 2ms, but often the ACKs for a flight arrive
> about 20ms later, in a tight burst.

This is a typical trace for an isolated single client wifi scenario.
The ack bursts get larger, and more random as you add clients. As
pretty as it is... want some stuff that's not as pretty? :)

> I have attached screenshots of the tcptrace/xplot output for the CUBIC and
> fq-paced BBR traces, for comparison.
>
> In the BBR trace one can see that a big part of the dynamic is that the flow
> runs out of cwnd. This is because the cwnd in BBR is currently calculated
> as: cwnd = cwnd_gain * bw * min_rtt + 3 * tso_segs. In this particular case,
> as shown in the nice ss output, this can give a cwnd as small as 14 packets,
> which is not nearly enough to saturate this bottleneck.

Yes. But I'd like to stress that finding the right rate for the
application at hand seems more important than finding the max rate a
test can generate. You might get a burst of acks every 100ms (or
worse!), but that does not translate to sending a huge burst back as a
"good" thing.

But scaling a burst to the client as opposed to direct pacing/ms... hmmm.

You can sort of determine the transmit rate of the wifi client from
the size of the ack burst (X fit into an aggregate), the amount of
contention on the network in the jitter/delay, but actually finding
the capacity for a given aggregated burst down to the client from the
AP... mmmmm....

> For traffic traveling over the Internet, this is usually not an issue in our
> experience, since the longer RTT for the WAN portion of the path causes
> enough cwnd to be provisioned for the wifi hop. But if the path is just a
> wifi LAN, the cwnd can be insufficient.
>
> As we have discussed in a thread related to cable modem behavior:
> https://groups.google.com/d/msg/bbr-dev/Fj2emRS4Wn4/k23d7nPPCAAJ
> we are experimenting with an approach that includes some budget in the cwnd
> for the aggregation/burstiness levels recently observed in the ACK stream.
> That might work reasonably well in this wifi LAN case, where the ACKs for
> nearly an entire flight quite often arrive in a tight burst that is easily
> measured.

Hope so. I'll put up a BBR server soon.

Neal Cardwell

Jan 11, 2017, 2:02:01 PM
to Dave Taht, cf623...@gmail.com, BBR Development
On Wed, Jan 11, 2017 at 1:37 PM, Dave Taht <dave...@gmail.com> wrote:
> On Wed, Jan 11, 2017 at 9:33 AM, Neal Cardwell <ncar...@google.com> wrote:
>
>> In the BBR trace one can see that a big part of the dynamic is that the flow
>> runs out of cwnd. This is because the cwnd in BBR is currently calculated
>> as: cwnd = cwnd_gain * bw * min_rtt + 3 * tso_segs. In this particular case,
>> as shown in the nice ss output, this can give a cwnd as small as 14 packets,
>> which is not nearly enough to saturate this bottleneck.
>
> Yes. But I'd like to stress that finding the right rate for the
> application at hand seems more important than finding the max rate a
> test can generate. You might get a burst of acks every 100ms (or
> worse!), but that does not translate to sending a huge burst back as a
> "good" thing.

Yes, I should clarify that I'm not proposing that huge bursts are a good thing, and the experiments we have undertaken do not aim for bigger bursts. I would emphasize that in the experiments where we allow a bigger cwnd for BBR in such situations, the pacing rate still dominates, so that BBR still paces packets at the bandwidth it has seen over the scale of a round-trip. It's not that there would be big bursts. It's just that BBR would allow itself to pace out more packets before waiting for an ACK.

neal

cf623...@gmail.com

Jan 12, 2017, 10:56:00 AM
to BBR Development, cf623...@gmail.com
Hi Dave,

Thanks for your reply!

> On Wed, Jan 11, 2017 at 8:10 AM, <cf623...@gmail.com> wrote:
>> Hi All,
>>
>> I have set up a wireless environment (a PC host connected to a wireless
>> router by cable serves as the sender, and a laptop connected to the WLAN
>> hotspot serves as the receiver)
>
> Wifi drivers almost universally suck, and it would help to identify
> the wireless cards under test here.

The wireless card is an Intel WiFi Link 1000 BGN, with driver version 13.5.0.6 released on 2011-01-19.

> I've planned to add BBR to the make-wifi-fast test suite against the
> new ath9k + fq_codel + ATF stuff for a while now, have not quite got
> around to it yet.

Great work, I will try to learn more about it.
> I've generally expected, in simple scenarios such as yours, BBR's 1ms
> pacing to interact badly (from a throughput perspective) with
> aggregation (which happens essentially on 4+ms intervals); in more complex
> scenarios (more clients/interference) I have not the foggiest idea -
> and in either case there was a huge latency win.
>
> Can you measure throughput and latency with flent.org's tcp_nup or
> tcp_ndown tests with bbr vs cubic?

I am trying to use flent to run the up/download tests as you suggested, but I cannot find a suitable netperf build for Windows for my laptop. I found a netperf at version 2.4.5, but it seems it can't work with the 2.7.0 version running on my Linux PC. I will keep trying and post the results once it works.



cf623...@gmail.com

Jan 12, 2017, 11:21:48 AM
to BBR Development, dave...@gmail.com, cf623...@gmail.com

> Thanks for the detailed report. This was a useful data point.
>
> Apart from whatever may be going on at the wifi layer (which Dave nicely alludes to) the BBR trace seems to nicely correspond to a known area where we'd like to improve BBR: provisioning sufficient cwnd for paths with very delayed, stretched, or aggregated ACKs. Such behavior is common for cellular, cable modem, or wifi paths. And it shows up here, in a dramatic fashion: the min_rtt is around 2ms, but often the ACKs for a flight arrive about 20ms later, in a tight burst.
>
> I have attached screenshots of the tcptrace/xplot output for the CUBIC and fq-paced BBR traces, for comparison.
>
> In the BBR trace one can see that a big part of the dynamic is that the flow runs out of cwnd. This is because the cwnd in BBR is currently calculated as: cwnd = cwnd_gain * bw * min_rtt + 3 * tso_segs. In this particular case, as shown in the nice ss output, this can give a cwnd as small as 14 packets, which is not nearly enough to saturate this bottleneck.

Thanks for your detailed explanation, but I still don't quite understand the tcptrace/xplot graph :). Can you give me some guidance on how to read it? Much appreciated! (How do you conclude from the graph that the lower throughput is caused by running out of cwnd?)

> For traffic traveling over the Internet, this is usually not an issue in our experience, since the longer RTT for the WAN portion of the path causes enough cwnd to be provisioned for the wifi hop. But if the path is just a wifi LAN, the cwnd can be insufficient.
>
> As we have discussed in a thread related to cable modem behavior:
> https://groups.google.com/d/msg/bbr-dev/Fj2emRS4Wn4/k23d7nPPCAAJ
> we are experimenting with an approach that includes some budget in the cwnd for the aggregation/burstiness levels recently observed in the ACK stream. That might work reasonably well in this wifi LAN case, where the ACKs for nearly an entire flight quite often arrive in a tight burst that is easily measured.

OK, I will give it a try.

Regards,
devin

Neal Cardwell

Jan 12, 2017, 12:23:18 PM
to cf623...@gmail.com, BBR Development, Dave Taht
On Thu, Jan 12, 2017 at 11:21 AM, <cf623...@gmail.com> wrote:

> Thanks for the detailed report. This was a useful data point.
>
> Apart from whatever may be going on at the wifi layer (which Dave nicely alludes to) the BBR trace seems to nicely correspond to a known area where we'd like to improve BBR: provisioning sufficient cwnd for paths with very delayed, stretched, or aggregated ACKs. Such behavior is common for cellular, cable modem, or wifi paths. And it shows up here, in a dramatic fashion: the min_rtt is around 2ms, but often the ACKs for a flight arrive about 20ms later, in a tight burst.
>
> I have attached screenshots of the tcptrace/xplot output for the CUBIC and fq-paced BBR traces, for comparison.
>
> In the BBR trace one can see that a big part of the dynamic is that the flow runs out of cwnd. This is because the cwnd in BBR is currently calculated as: cwnd = cwnd_gain * bw * min_rtt + 3 * tso_segs. In this particular case, as shown in the nice ss output, this can give a cwnd as small as 14 packets, which is not nearly enough to saturate this bottleneck.
>
> Thanks for your detailed explanation, but I still don't quite understand the tcptrace/xplot graph :). Can you give me some guidance on how to read it? Much appreciated!

The manual for tcptrace covers these graphs; please see the "Time Sequence Graph" section. In summary, the white line segments are transmitted packets, the green line is the cumulative ACK line (snd_una), and the yellow line is the limit imposed by the receiver window.

> (How do you conclude from the graph that the lower throughput is caused by running out of cwnd?)

From the paced BBR graph, the advancing diagonal line of white transmitted packets stops suddenly, around the point dictated by the cwnd. And aside from the graph, in general from first principles the cwnd needs to encompass at least the BDP of the path, and for wifi LAN paths the BDP is not well approximated by bandwidth*min_rtt.
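(Concretely, with round numbers from this trace: bandwidth * min_rtt is about 16 Mbit/s * 2.5 ms ≈ 5 KB, only 3-4 packets, while ACKs aggregated on a ~20 ms scale imply a needed flight closer to 16 Mbit/s * 20 ms ≈ 40 KB, around 28 packets.)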
 

> For traffic traveling over the Internet, this is usually not an issue in our experience, since the longer RTT for the WAN portion of the path causes enough cwnd to be provisioned for the wifi hop. But if the path is just a wifi LAN, the cwnd can be insufficient.
>
> As we have discussed in a thread related to cable modem behavior:
> https://groups.google.com/d/msg/bbr-dev/Fj2emRS4Wn4/k23d7nPPCAAJ
> we are experimenting with an approach that includes some budget in the cwnd for the aggregation/burstiness levels recently observed in the ACK stream. That might work reasonably well in this wifi LAN case, where the ACKs for nearly an entire flight quite often arrive in a tight burst that is easily measured.
>
> OK, I will give it a try.

OK, thanks. We will post a note to the list when the patch is out. Though at the moment we are focused on other aspects of BBR.

thanks,
neal

Jim Gettys

Jan 12, 2017, 12:25:34 PM
to Dave Taht, Neal Cardwell, cf623...@gmail.com, BBR Development
Cell phones are really bad; how bad has just become something I'm trying to figure out....

See below for a frightening factoid....

On Wed, Jan 11, 2017 at 1:37 PM, Dave Taht <dave...@gmail.com> wrote:
> On Wed, Jan 11, 2017 at 9:33 AM, Neal Cardwell <ncar...@google.com> wrote:
>> Thanks for the detailed report. This was a useful data point.
>
> If you want more terrifying traces I can supply a bunch from multiple
> common wifi drivers - but not against BBR as yet.
>
>> Apart from whatever may be going on at the wifi layer (which Dave nicely
>> alludes to) the BBR trace seems to nicely correspond to a known area where
>
> Btw: recently it was discovered that a large number of cellular modems
> actually have Linux deeply embedded in them:
>
> https://www.youtube.com/watch?v=sq9chzNVoXg
>
> (Which (a) gives me hope that the firmware could be dramatically
> improved to reduce bloat, and (b) gives me the heebie-jeebies - accessing
> Linux via the Hayes AT command set? What could go wrong?)

I'm going to be poking at a brand new Google Pixel phone (which, I gather, is Linux 3.18) shortly to figure out how Google has their phones configured; with luck, next week.

I do note that I have email from someone who looked at a Nexus 6 (which should easily have had fq_codel enabled, as it runs something later than Linux 3.5) last week and found pfifo_fast as the qdisc on the cellular interface... I don't know how many packets of buffering are in it, but will find out soon.

For the Google guys on this list, if any of you have contacts in the Android part of Google, please help them drain the Android swamp and educate the Android management to get the chip vendors to drain their swamp. Unfortunately, that will probably take on the order of the effort that has gone into WiFi....

Similarly to any Apple guys on the list....


> I am a huge fan of BBR's approach but still feel co-evolving the
> firmware would be most helpful, particularly if a huge swath like
> QCA's gear could be improved, per above.....

As am I.

But we need the QCA guys to fix their drivers too...  So if any of you know QCA guys.......

                                              - Jim
 


Dave Taht

Jan 12, 2017, 12:48:34 PM
to cf623...@gmail.com, BBR Development
On Thu, Jan 12, 2017 at 8:21 AM, <cf623...@gmail.com> wrote:
>
>> Thanks for the detailed report. This was a useful data point.
>>
>> Apart from whatever may be going on at the wifi layer (which Dave nicely
>> alludes to) the BBR trace seems to nicely correspond to a known area where
>> we'd like to improve BBR: provisioning sufficient cwnd for paths with very
>> delayed, stretched, or aggregated ACKs. Such behavior is common for
>> cellular, cable modem, or wifi paths. And it shows up here, in a dramatic
>> fashion: the min_rtt is around 2ms, but often the ACKs for a flight arrive
>> about 20ms later, in a tight burst.
>>
>> I have attached screenshots of the tcptrace/xplot output for the CUBIC and
>> fq-paced BBR traces, for comparison.
>>
>> In the BBR trace one can see that a big part of the dynamic is that the
>> flow runs out of cwnd. This is because the cwnd in BBR is currently
>> calculated as: cwnd = cwnd_gain * bw * min_rtt + 3 * tso_segs. In this
>> particular case, as shown in the nice ss output, this can give a cwnd as
>> small as 14 packets, which is not nearly enough to saturate this bottleneck.
>
> Thanks for your detailed explanation, but I still don't quite understand
> the tcptrace/xplot graph :). Can you give me some guidance on how to
> read it? Much appreciated! (How do you conclude from the graph that the
> lower throughput is caused by running out of cwnd?)

Interpreting xplot and tcptrace is an art! There are several resources
on the web for it. A basic intro to xplot that Stuart Cheshire did, which I
like very much, is about 10 minutes long, starting 16 minutes in:

https://plus.google.com/107942175615993706558/posts/1j8nXtLGZDm

The video is downloadable if you don't have OS X.
>>
>>
>> For traffic traveling over the Internet, this is usually not an issue in
>> our experience, since the longer RTT for the WAN portion of the path causes
>> enough cwnd to be provisioned for the wifi hop. But if the path is just a
>> wifi LAN, the cwnd can be insufficient.
>>
>> As we have discussed in a thread related to cable modem behavior:
>> https://groups.google.com/d/msg/bbr-dev/Fj2emRS4Wn4/k23d7nPPCAAJ
>> we are experimenting with an approach that includes some budget in the
>> cwnd for the aggregation/burstiness levels recently observed in the ACK
>> stream. That might work reasonably well in this wifi LAN case, where the
>> ACKs for nearly an entire flight quite often arrive in a tight burst that is
>> easily measured.
>
> Ok, i will have a try.
>
> Regards,
> devin



Jonathan Morton

Jan 12, 2017, 12:54:55 PM
to cf623...@gmail.com, BBR Development

> On 12 Jan, 2017, at 17:56, cf623...@gmail.com wrote:
>
> I am trying to use the flent to do up/download tests as your guide, but i can not find the suitable netperf for window for my Laptop.

I don’t think anyone has Flent running on Windows yet. I recommend booting your laptop into a Linux live environment, and running the netperf server within that.

- Jonathan Morton

Dave Taht

Jan 12, 2017, 1:13:20 PM
to Jonathan Morton, cf623...@gmail.com, BBR Development
Flent's analysis/browsing code does run on Windows, at least in the prior
release; what's in git head now is untested on Windows (it contains a lot
of the new wifi-specific tests and is much faster at plotting on
multi-core machines). So you can browse test results on OS X, Windows, or
Linux to your heart's content.

Driving tests in flent is currently limited to OS X and Linux. The
bloat results we got on Windows were so dismal on a variety of wifi
drivers two (three?) years back that we punted and merely begged several
contacts at Microsoft to add latency+load tests like ours to their
ethernet and wifi driver validation suites.

We have not tried Windows 10 at all, for either browsing results or
generating tests. I'll ping Rick to see if a netperf build exists for
that.

I don't mind at all people finding ways to test Windows better! Using
a fixed version of iperf combined with ping, without flent, seems the
only way forward as I write. And watch out for this iperf3 bug with
UDP tests!

http://burntchrome.blogspot.com/2016/09/iperf3-and-microbursts.html


Daniel Havey

Jan 12, 2017, 6:10:54 PM
to Dave Taht, Jonathan Morton, cf623...@gmail.com, BBR Development
Also we should look again at Windows. I bet I can get the Windows
WiFi team to help us. I bet I can even get the OEMs to play nice if
we are lucky :). Let's kill this bufferbloat pig!

On Thu, Jan 12, 2017 at 2:42 PM, Daniel Havey <dha...@gmail.com> wrote:
> A little off topic, but, as long as we are talking about iperf3 it
> forces a static buffer size of 212 KB and performance is stinky.
> Working with Cygwin to fix this.

cf623...@gmail.com

Jan 13, 2017, 7:27:24 AM
to BBR Development, cf623...@gmail.com, dave...@gmail.com


> From the paced BBR graph, the advancing diagonal line of white transmitted packets stops suddenly, around the point dictated by the cwnd. And aside from the graph, in general from first principles the cwnd needs to encompass at least the BDP of the path, and for wifi LAN paths the BDP is not well approximated by bandwidth*min_rtt.

OK, thanks for your guidance. I've learned a lot :)

> OK, thanks. We will post a note to the list when the patch is out. Though at the moment we are focused on other aspects of BBR.

 

Regards,
devin
