BBR over high BDP


Douglas Caetano dos Santos

Apr 12, 2017, 4:20:23 PM
to bbr...@googlegroups.com
Hey guys,

We have done some tests with CUBIC and BBR on our servers and got some
strange measurements.

Mainly, what we see is that CUBIC with PFIFO is faster than the other
combinations at downloading a file of about 3.2 MB over a 1 Gbps link with
~120 ms of RTT. Our measurements:

- CUBIC + PFIFO: ~0.83s
- CUBIC + FQ : ~1.50s
- BBR + FQ : ~0.93s

Both sides have dedicated 1 Gbps links to the Internet, with 10 Gbps optical
ports (i.e. the bottleneck is in the middle of the path). Measurements are
the same with or without concurrent traffic (around 500 Mbps from a third
host).
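
In case it helps to reproduce: each combination was selected roughly like
this (a sketch; "eth0" and the lack of qdisc parameters are just
placeholders, not our exact configuration):

  # CUBIC + PFIFO
  sysctl -w net.ipv4.tcp_congestion_control=cubic
  tc qdisc replace dev eth0 root pfifo
  # CUBIC + FQ
  sysctl -w net.ipv4.tcp_congestion_control=cubic
  tc qdisc replace dev eth0 root fq
  # BBR + FQ (on these kernels BBR relies on fq for pacing)
  sysctl -w net.ipv4.tcp_congestion_control=bbr
  tc qdisc replace dev eth0 root fq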

With CUBIC+FQ, we tweaked the tcp_pacing_{ss,ca}_ratio settings and got
somewhat better readings, but still worse than CUBIC+PFIFO.
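
(For reference, those knobs are sysctls; the values below are just the
kernel defaults, not the ones we ended up using:)

  # pace at 200% of cwnd/RTT during slow start, 120% during congestion avoidance
  sysctl -w net.ipv4.tcp_pacing_ss_ratio=200
  sysctl -w net.ipv4.tcp_pacing_ca_ratio=120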

When testing locally with two machines connected via 1GbE, BBR+FQ is
faster, but CUBIC+FQ is still the slowest.

Is it expected that high BDPs affect BBR and/or FQ?

-- Douglas

Yuchung Cheng

Apr 12, 2017, 4:42:16 PM
to Douglas Caetano dos Santos, BBR Development
Very interesting. Can you provide tcpdump traces for the 3 scenarios? Thanks.




Neal Cardwell

Apr 12, 2017, 4:51:07 PM
to Yuchung Cheng, Douglas Caetano dos Santos, BBR Development
Also, if you'd be able to run your "CUBIC + FQ" experiment with the following setting, that would be useful:

  echo 2 > /sys/module/tcp_cubic/parameters/hystart_detect

By default, CUBIC's Hystart includes an ACK-train heuristic that tends to exit slow start early when the sender paces packets, because pacing spreads the ACK trains out over the round trip. That's probably why CUBIC+FQ is slow here. The line above ensures that only the HYSTART_DELAY heuristic is used.
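
For context, hystart_detect is a bitmask in the tcp_cubic module (paraphrasing the mainline tcp_cubic.c; worth double-checking against your kernel version):

  # HYSTART_ACK_TRAIN = 0x1  (ACK-train heuristic; the one confused by pacing)
  # HYSTART_DELAY     = 0x2  (RTT-increase heuristic)
  # default is 3 (both); writing 2 keeps only the delay heuristic
  cat /sys/module/tcp_cubic/parameters/hystart_detect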

neal





Douglas Caetano dos Santos

Apr 12, 2017, 5:57:47 PM
to Neal Cardwell, Yuchung Cheng, BBR Development
Thanks for the prompt reply.

Here are the traces: https://www.taghos.com.br/bbr/traces-new.tar.gz

-- Douglas

Neal Cardwell

Apr 13, 2017, 10:04:32 PM
to Douglas Caetano dos Santos, Yuchung Cheng, BBR Development
On Wed, Apr 12, 2017 at 5:57 PM, Douglas Caetano dos Santos <doug...@taghos.com.br> wrote:
> Thanks for the prompt reply.
>
> Here are the traces: https://www.taghos.com.br/bbr/traces-new.tar.gz

Thanks for the traces! Attached are plots of each case (all generated with the "tcptrace" tool), along with a plot that superimposes each case, for easier comparison.

The plots show that the behaviors for all the cases are pretty similar; they are all within a round trip of each other.  There are small differences due to the details of pacing, or the lack thereof. But they have all fundamentally the same growth rate: doubling the sending rate each round trip. That's because the transfers are so small (a few megabytes) relative to the BDP of the path (.120 seconds * 1 Gbps = 15 megabytes) that the flows all spend their time in slow-start (CUBIC) or the analogous Startup mode in BBR; both of these modes double the sending rate each round trip in order to rapidly probe the available capacity.
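
As a very rough back-of-envelope (assuming the typical Linux initial window of 10 segments and ~1460-byte payloads; I haven't verified those against your traces):

  BDP             = 1 Gbps * 0.120 s          ~= 15 MB
  rounds to send  = log2(3.2 MB / ~14.6 KB)   ~= 8 round trips of slow start
  transfer time  ~= 8 * 0.120 s               ~= 0.96 s

That is in the same ballpark as the ~0.83-0.93 s you measured for the faster cases.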

The flows don't fill the pipe, so they don't reach the phases in their lifetimes where the significant differences in steady-state behavior between CUBIC and BBR would show up.

thanks,
neal

bbr-dev-2017-04-12-cubic-pfifo-tseq.png
bbr-dev-2017-04-12-cubic-fq-tseq.png
bbr-dev-2017-04-12-cubic-fq-hystart-tseq.png
bbr-dev-2017-04-12-bbr-fq-tseq.png
bbr-dev-2017-04-12-superimposed-tseq.png

Yuchung Cheng

Apr 14, 2017, 11:44:09 AM
to Neal Cardwell, Douglas Caetano dos Santos, BBR Development
On Thu, Apr 13, 2017 at 7:03 PM, Neal Cardwell <ncar...@google.com> wrote:
> On Wed, Apr 12, 2017 at 5:57 PM, Douglas Caetano dos Santos <doug...@taghos.com.br> wrote:
>> Thanks for the prompt reply.
>>
>> Here are the traces: https://www.taghos.com.br/bbr/traces-new.tar.gz
>
> Thanks for the traces! Attached are plots of each case (all generated with the "tcptrace" tool), along with a plot that superimposes each case, for easier comparison.
>
> The plots show that the behaviors for all the cases are pretty similar; they are all within a round trip of each other.  There are small differences due to the details of pacing, or the lack thereof. But they have all fundamentally the same growth rate: doubling the sending rate each round trip. That's because the transfers are so small (a few megabytes) relative to the BDP of the path (.120 seconds * 1 Gbps = 15 megabytes) that the flows all spend their time in slow-start (CUBIC) or the analogous Startup mode in BBR; both of these modes double the sending rate each round trip in order to rapidly probe the available capacity.

Notice that in terms of smoothness, BBR/fq-pacing > Cubic-tuned-hystart/fq-pacing > Cubic/pfifo. Our experience is that smoothing the slow-start burst significantly reduced the loss rate on shallow-buffered switches on Google's edge networks. Reducing potential losses in slow start is a big performance win even for Cubic, because it shortens the congestion-avoidance / cubic-growth phase toward the end of a transfer.

Douglas Caetano dos Santos

May 8, 2017, 5:36:10 PM
to Yuchung Cheng, Neal Cardwell, BBR Development
Hi Neal, Yuchung,

Sorry for the late answer. I was on vacation for a couple of weeks, then got busy
with other activities.

On 04/13/2017 11:03 PM, Neal Cardwell wrote:
> Thanks for the traces! Attached are plots of each case (all generated
> with the "tcptrace" tool), along with a plot that superimposes each
> case, for easier comparison.

Great tip on "tcptrace", thanks!

> The plots show that the behaviors for all the cases are pretty similar;
> they are all within a round trip of each other. There are small
> differences due to the details of pacing, or the lack thereof. But they
> have all fundamentally the same growth rate: doubling the sending rate
> each round trip. That's because the transfers are so small (a few
> megabytes) relative to the BDP of the path (.120 seconds * 1 Gbps = 15
> megabytes) that the flows all spend their time in slow-start (CUBIC) or
> the analogous Startup mode in BBR; both of these modes double the
> sending rate each round trip in order to rapidly probe the available
> capacity.

Yes, agreed. Between BBR and CUBIC/pfifo there is about one extra RTT of delay,
which makes sense, as BBR is pacing the packets.

I took another look at the traces I sent and noticed they don't show one case,
though: the longer time for CUBIC/fq without the hystart tuning. But, as you
already explained, the default behavior does affect slow start negatively.

> The flows don't fill the pipe, so they don't reach the phases in their
> lifetimes where the significant differences in steady-state behavior
> between CUBIC and BBR would show up.

I had considered that, but I still expected BBR to show a significant
difference over CUBIC.


On 04/14/2017 12:43 PM, Yuchung Cheng wrote:
> Notice that in terms of smoothness BBR/fq-pacing >
> Cubic-tuned-hystart/fq-pacing > Cubic/pfifo.

I noticed that, really nice!

> Our experience is that smoothing the slow start burst significantly reduced
> the loss rate on shallow-buffered switches on Google's edge networks. Reducing
> potential losses in slow start is a big performance improvement even in Cubic
> by reducing congestion avoidance / cubic-growth phase towards the end of a
> transfer.

Maybe a bit off-topic, but do you have any data or statistics on loss rates in
such shallow-buffered switches?

I did some local tests where I mistakenly introduced several packet losses
through a low packet limit on netem (which I was using to add delay). Could this
be similar to a shallow-buffered switch situation? What I got was that some
transfers took as long as ~10s to finish, while the median time was ~2s, using
BBR. Analyzing the packets from the 10s transfer, I saw ~11% packet loss in the
first 2s of the connection, during which BBR seems to settle on a fixed, slow
rate. Is this expected? I can send the traces if they're useful.
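
(For context, the netem setup was along these lines; the interface and the
limit value here are illustrative, not exactly what I used:)

  # add ~120 ms of delay; a small "limit" makes netem's internal queue
  # drop packets once it fills, a bit like a shallow-buffered switch
  tc qdisc replace dev eth0 root netem delay 120ms limit 100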

Thanks!
Douglas.

Neal Cardwell

May 16, 2017, 10:11:24 PM
to Douglas Caetano dos Santos, Yuchung Cheng, BBR Development
On Mon, May 8, 2017 at 5:35 PM, Douglas Caetano dos Santos <doug...@taghos.com.br> wrote:
> Hi Neal, Yuchung,
>
> Sorry for the late answer. I was on vacation for a couple of weeks, then got busy
> with other activities.

No worries, I understand. Sorry for my delay in getting back to this thread.
 
> On 04/13/2017 11:03 PM, Neal Cardwell wrote:
>> Thanks for the traces! Attached are plots of each case (all generated with the "tcptrace" tool), along with a plot that superimposes each case, for easier comparison.
>
> Great tip on "tcptrace", thanks!
>
>> The plots show that the behaviors for all the cases are pretty similar; they are all within a round trip of each other.  There are small differences due to the details of pacing, or the lack thereof. But they have all fundamentally the same growth rate: doubling the sending rate each round trip. That's because the transfers are so small (a few megabytes) relative to the BDP of the path (.120 seconds * 1 Gbps = 15 megabytes) that the flows all spend their time in slow-start (CUBIC) or the analogous Startup mode in BBR; both of these modes double the sending rate each round trip in order to rapidly probe the available capacity.
>
> Yes, agreed. Between BBR and CUBIC/pfifo there is about one extra RTT of delay,
> which makes sense, as BBR is pacing the packets.
>
> I took another look at the traces I sent and noticed they don't show one case,
> though: the longer time for CUBIC/fq without the hystart tuning. But, as you
> already explained, the default behavior does affect slow start negatively.
>
>> The flows don't fill the pipe, so they don't reach the phases in their lifetimes where the significant differences in steady-state behavior between CUBIC and BBR would show up.
>
> I had considered that, but I still expected BBR to show a significant
> difference over CUBIC.

BBR can yield a significant improvement in startup behavior over CUBIC if there is noise in the RTT that causes Hystart to exit slow-start prematurely. This happens a lot on cellular or wifi links. But otherwise the initial performance in slow-start (CUBIC) or Startup (BBR) is pretty similar.
 

> On 04/14/2017 12:43 PM, Yuchung Cheng wrote:
>> Notice that in terms of smoothness, BBR/fq-pacing > Cubic-tuned-hystart/fq-pacing > Cubic/pfifo.
>
> I noticed that, really nice!
>
>> Our experience is that smoothing the slow-start burst significantly reduced
>> the loss rate on shallow-buffered switches on Google's edge networks. Reducing
>> potential losses in slow start is a big performance win even for Cubic,
>> because it shortens the congestion-avoidance / cubic-growth phase toward the
>> end of a transfer.
>
> Maybe a bit off-topic, but do you have any data or statistics on loss rates in such shallow-buffered switches?

I think it would depend a lot on the characteristics of the traffic, since the bursts are due to coincident arrivals. So the loss rate will depend on the number of flows, how big their bursts are (e.g. TSO/GSO), how correlated the traffic is, etc. Maybe someone on the list will know of an existing publication that has looked at this.
 

> I did some local tests where I mistakenly introduced several packet losses
> through a low packet limit on netem (which I was using to add delay). Could this
> be similar to a shallow-buffered switch situation? What I got was that some
> transfers took as long as ~10s to finish, while the median time was ~2s, using
> BBR. Analyzing the packets from the 10s transfer, I saw ~11% packet loss in the
> first 2s of the connection, during which BBR seems to settle on a fixed, slow
> rate. Is this expected? I can send the traces if they're useful.

Yes, this could be similar in spirit to a shallow-buffered switch situation. Though how similar would depend on the constants. That sounds like a fairly synthetic result that may or may not correspond to realistic scenarios. So I think we will try to focus our attention on behavior in actual paths where the switch buffers are shallow relative to the BDP. But thank you for the offer!

cheers,
neal
 