BBR v Reno

Peter Dordal

Jan 23, 2017, 12:27:15 AM
to BBR Development
I'm trying to understand how BBR and Reno would interact. Here's my model. Suppose we have a BBR and a Reno connection sharing a bottleneck with a bandwidth of 2 packet/ms. The RTT is 100 ms, so BDP = 200 packets, and right now each connection has 100 packets in flight.

After eight RTTs (the length of the pacing cycle), the Reno inflight/cwnd is 108. At this point the pacing_gain=1.25 kicks in, so BBR's inflight is 125. These extra 25 packets, plus the 8 from Reno, boost the RTT to 233/2, so the next BBR bandwidth estimate is 125/(233/2) = 1.073 pkts/ms. The new BBR inflight becomes 1.073 times the RTTmin, or 107.3 packets. Because of the max filter on bandwidth, this new estimate should stick around for the next pacing cycle.
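
For concreteness, here is that arithmetic as a small Python sketch (the variable names are mine, chosen for illustration; it just restates the model above):

    # Model from the paragraph above: 2 pkt/ms bottleneck, 100 ms base RTT.
    bottleneck_bw = 2.0                      # packets per ms
    rtt_min = 100.0                          # ms
    reno_inflight = 108.0                    # Reno after 8 RTTs of +1/RTT growth
    bbr_inflight = 100.0 * 1.25              # pacing_gain = 1.25 -> 125 packets

    # Total inflight exceeds the 200-packet BDP, so the excess queues up.
    rtt_now = (reno_inflight + bbr_inflight) / bottleneck_bw   # 233/2 = 116.5 ms
    bw_est = bbr_inflight / rtt_now                            # ~1.073 pkt/ms
    new_bbr_inflight = bw_est * rtt_min                        # ~107.3 packets
    print(rtt_now, bw_est, new_bbr_inflight)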

This 7.3-packet increase is very close to Reno's increase in cwnd. The question is whether this is coincidental, or whether there are other factors driving BBR to keep up with Reno. I'm ignoring the PROBE_RTT phases, but those are infrequent.

Any ideas? One problem with my model is that it doesn't scale very well; change the 100 packets to 200 and the inflight increase is 18 packets. What drives BBR and Reno to find some sort of equilibrium?

Peter Dordal
Loyola University Chicago CS Dept

Jonathan Morton

Jan 23, 2017, 1:04:00 AM
to Peter Dordal, BBR Development
I prefer to think qualitatively, rather than quantitatively, when considering dynamics of this sort. Details can then be filled in by targeted quantitative analysis.

In general, Reno will (slowly) expand its cwnd to fill the pipe *and* the bottleneck link’s buffer. At that point, packet loss will occur, Reno’s cwnd will halve, and then it’ll start growing again.

It’s important to note that pacing_gain=1.25 applies *only* during maxBW probes, as it tends to start filling the pipe. After that, there is a “drain” phase with pacing_gain well below unity, followed by a steady-state phase with pacing_gain held fractionally below unity. You seem to have missed the latter two phases in your analysis.
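
For concreteness, here is the eight-phase PROBE_BW gain cycle as a sketch based on the Linux BBR v1 defaults (exact values may differ by version; the cruise phases there use a gain of 1.0):

    # Eight-phase PROBE_BW cycle: one probe, one drain, six cruise phases.
    PROBE_BW_GAINS = [1.25, 0.75, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

    def pacing_rate(max_bw_pkts_per_ms, phase):
        """Pacing rate (pkt/ms) for the given phase of the cycle."""
        return PROBE_BW_GAINS[phase % len(PROBE_BW_GAINS)] * max_bw_pkts_per_ms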

The cwnd gain factor gives BBR some short-term resilience against fluctuations in path characteristics, but if we ignore PROBE_RTT and assume a large link buffer, eventually BBR’s cwnd (calculated from a fixed minRTT and maxBW) will become the limiting factor, and BBR’s throughput will drop off, becoming at least partly ack-clocked instead of paced.

That is, to a first approximation, the pacing rate is calculated from maxBW alone, while the cwnd is calculated from maxBW *and* minRTT.
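
As a rough sketch of that relationship in Python (using the BBR v1 default cwnd_gain of 2; this is not the exact Linux code):

    def bbr_pacing_rate(max_bw, pacing_gain=1.0):
        # The pacing rate depends on the bandwidth estimate alone.
        return pacing_gain * max_bw

    def bbr_cwnd(max_bw, min_rtt, cwnd_gain=2.0):
        # The cwnd cap depends on both the bandwidth and the RTT estimates.
        return cwnd_gain * max_bw * min_rtt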

The effect of PROBE_RTT is to compensate for non-BBR competition (and also outright path changes) by detecting the effective minRTT imposed by the buffer occupancy of competing flows. The cwnd will rise as a result, and maxBW will remain relatively stable. Competing BBR flows are protected from escalation of this type by a strong tendency for PROBE_RTT phases to synchronise, so that all BBR flows measure the same (true) minRTT.

The long-term average competition between BBR and Reno depends on the size of the link buffer and whether it has AQM; AQM tends to make Reno behave as though there is a small link buffer, even if it is physically much larger. Very large buffers will tend to give Reno an advantage over BBR, because BBR has more opportunities to get and keep lower maxBW estimates during Reno’s growth phase, and because BBR tends to minimise its share of the link buffer. In particular, BBR is likely to suffer a lot if the buffer exceeds 10 seconds of link bandwidth, as that coincides with BBR’s PROBE_RTT cycle.

Actually, rather than theorising, why not measure? Flent has some TCP comparison tests which can easily be modified.

- Jonathan Morton

Neal Cardwell

Jan 23, 2017, 10:37:50 AM
to Jonathan Morton, Peter Dordal, BBR Development
Jonathan has given a nice discussion of many of the dynamics of Reno and BBR sharing a bottleneck. I would just add that in the BBR presentation that Yuchung and I gave at the IETF in November, we discussed these dynamics and presented some example results. See pages 23-24 of:


Here the experiments were with CUBIC, but in tests with Reno I have seen qualitatively similar results.

For those who have not seen it, the BBR article in ACM Queue discusses this briefly, in the section entitled "Competition with Loss-Based Congestion Control":

As Jonathan has noted, the long-term bandwidth shares of loss-based CC (Reno or CUBIC) and BBR depend mostly on the depth of the buffer, and the flavor of AQM, if any.

With shallow buffers or aggressive AQM, the loss-based CC will back off when more bandwidth is actually available, and thus BBR will get a higher proportion of bandwidth. If the buffer is significantly less than a BDP then Reno and CUBIC can underutilize the bandwidth even running by themselves; in such cases BBR can be thought of as utilizing that spare bandwidth that loss-based CC was wasting.

With deep buffers, loss-based CC will tend to cram the buffer full of excess packets. BBR's normal PROBE_BW and PROBE_RTT mechanisms will usually be enough to move BBR toward the flow's fair share of bandwidth, and to provision a pacing rate and cwnd that can maintain something acceptably close to that fair share, as long as the buffer is not "too deep" (e.g., up to 16x BDP in the graph on page 24 of the slide deck above). For buffers deeper than that, BBR's mechanisms (in PROBE_BW and PROBE_RTT) for periodically draining the queue and rediscovering and reestablishing a reasonable operating point are going to give BBR flows a slightly lower share of bandwidth than their loss-based counterparts.

neal





Peter Dordal

Jan 23, 2017, 10:27:45 PM
to BBR Development
I won't quibble with quantitative vs qualitative, but I'm looking for a small-scale explanation of this statement from one of the slides mentioned:

    BBR's bw and RTT probing tends to drive system toward fairness

What is the force towards fairness equilibrium? How stable is this equilibrium?

I do understand that the pacing_gain=1.25 applies for only one cycle out of eight. But because larger maxBW measurements persist for 10 RTTs, a higher maxBW measurement during the pacing_gain=1.25 period will continue to be used even as pacing_gain falls, and in fact for the rest of the pacing_gain cycle. If there is competition, BBR will measure an elevated maxBW during the pacing_gain=1.25 period. By contrast, if the BBR flow is alone on the bottleneck, the maxBW measurement made when pacing_gain = 1.25 does *not* show an increase.
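
A toy illustration of that ratcheting, assuming a simple sliding window over the last 10 per-RTT bandwidth samples (the real code uses a more compact windowed max-filter, but the effect is the same):

    from collections import deque

    class WindowedMax:
        """Track the max of the last `window` bandwidth samples."""
        def __init__(self, window=10):
            self.samples = deque(maxlen=window)

        def update(self, sample):
            self.samples.append(sample)
            return max(self.samples)

    # A sample elevated during the pacing_gain=1.25 phase remains the
    # filter output for the rest of the eight-phase cycle (up to 10 RTTs).
    f = WindowedMax()
    for bw in [1.0] * 8 + [1.073] + [1.0] * 7:
        print(f.update(bw))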

I did ignore RTT probing in my analysis, and this clearly does play a role in keeping up with Reno/Cubic. Each time minRTT increases, inflight increases in proportion. But what is the relative importance of the maxBW upwards ratcheting I described earlier versus the minRTT upwards ratcheting?

Peter

Jonathan Morton

Jan 24, 2017, 3:29:55 AM
to Peter Dordal, BBR Development
These dynamics don’t seem to be shown visually in the paper. However, I think I can shed some light on them:

I’ll assume we started the BBR flow with no competition, and that a new Reno flow has, for some reason, switched to congestion avoidance before reaching the full BDP. This implies a fairly high BDP, i.e. both bandwidth and RTT are significant. This situation generalises well.

So BBR starts with knowledge of the true minRTT and maxBW. From here, maxBW will not increase as long as the path remains constant. With competition, the actual delivery rate is reduced from maxBW, so inflight begins to increase and eventually hits cwnd = maxBW * minRTT * cwnd_gain. Entering a PROBE_BW phase with pacing_gain=1.25 here simply increases inflight faster in the short term; the delivery rate still cannot increase enough to match. At inflight == cwnd, pacing_gain is no longer the controlling factor (except possibly when pacing_gain=0.75), and BBR becomes ack-clocked.
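
A toy sketch of that dynamic with assumed numbers (not taken from the post): pacing at a stale maxBW while deliveries arrive at the lower actual share makes inflight climb by the difference each RTT until it hits the cwnd cap, after which the flow is ack-clocked.

    max_bw, min_rtt, cwnd_gain = 2.0, 100.0, 2.0   # pkt/ms, ms, BBR v1 default gain
    delivery_rate = 1.0                            # actual per-flow share, pkt/ms
    cwnd = cwnd_gain * max_bw * min_rtt            # 400-packet cap
    inflight = 200.0
    while inflight < cwnd:
        # Each RTT, (paced - delivered) packets accumulate in flight.
        inflight = min(cwnd, inflight + (max_bw - delivery_rate) * min_rtt)
        print(inflight)                            # 300, 400 -> ack-clocked from here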

A key distinction between BBR and Reno is that cwnd is capped at a multiple of the true BDP at this point. When BBR becomes ack-clocked, its buffer occupancy and induced latency with a dumb FIFO are significantly lower than with a loss-based CC.

After 20 seconds, the true maxBW and minRTT samples time out of the max- and min-filters, and new values are computed from more recent samples of delivery rate and effective RTT. The measurement window by now includes at least one cycle of the probe mechanism under competition. The maxBW value will therefore fall, and minRTT will rise.

Rising minRTT increases the cwnd cap, allowing BBR to compete fairly with loss-based CCs that fill a large buffer. Control of latency is lost, but that’s due to the competition ignoring it, not any failing in BBR.

Falling maxBW implements that fairness by sharing the buffer with the competition. Even at the true maxBW, BBR is only ever sending 25% faster than the path capacity, so there’s plenty of opportunity for Reno to insert packets as well. When ack-clocked, BBR is naturally limited to some relatively fair throughput, related to the ratio of cwnds at that time. In the worst case, the ack-clocked rate will eventually become the dominant value in the max-filtered delivery rate, as shown in Figure 3 in the paper.

Since Reno increases cwnd quite slowly in congestion-avoidance, BBR should initially converge on a slightly increased minRTT and a roughly halved maxBW. BW probes will *then* show a temporarily increased delivery rate, since they disturb the buffer-sharing equilibrium. However, because BBR paces its probes, they will never reveal the full path BW while there is competition for it, so BBR will always leave some room, and Reno will grow into it, forcing the RTT gradually upwards.

This does not mean that BBR is guaranteed to reach *equal* throughput with its competition. It is not, even approximately. But it won’t *starve* the competition. That’s what “fairness” means in congestion control.

Things get a bit more complicated when the buffer fills and starts dropping packets. That places a cap on both Reno’s cwnd and BBR’s minRTT estimate, and it is what produces the long-term sharing behaviour, which depends on the buffer size relative to the true BDP.

- Jonathan Morton

Neal Cardwell

Jan 24, 2017, 11:24:00 AM
to Peter Dordal, BBR Development
On Mon, Jan 23, 2017 at 10:27 PM, Peter Dordal <pdo...@cs.luc.edu> wrote:
I won't quibble with quantitative vs qualitative, but I'm looking for a small-scale explanation of this statement from one of the slides mentioned:

    BBR's bw and RTT probing tends to drive system toward fairness

Jonathan has provided a nice discussion of the dynamics, so I will just add a few more points, and a graph or two to illustrate.
 

What is the force towards fairness equilibrium?

Here is one "small-scale explanation" of the basic force that drives a mix of loss-based and BBR flows toward equilibrium:

Any loss-based CC (Reno or CUBIC) in the traffic mix drives the system into a state (oscillating around a full buffer) where the control loops of Reno/CUBIC and BBR behave in a roughly compatible fashion. Each flow of either flavor periodically probes for more bandwidth, finds more bandwidth available if its fair share is higher, and cannot persistently find more bandwidth once it has reached its fair share (because all the other flows are probing as well). And in practice it does not seem to matter too much that the criteria loss-based CC and BBR use to decide whether more bandwidth is available are very different: (a) loss for Reno/CUBIC versus (b) bandwidth saturation for BBR.

 
How stable is this equilibrium?

Pretty stable in our experience.

For a simple case, slide 23 of the ICCRG slides gives a long-running run with 1 CUBIC and 1 BBR:

If you are curious about the flavor of behavior for more flows, an example graph is attached showing 3 BBR flows and 3 CUBIC flows sharing a 10Mbit link with 50ms RTT and 24 BDP of buffering (again this is from an internal test/viz tool written by BBR team member Soheil Hassas Yeganeh). This is close to the worst case for BBR, since the buffer is so deep (24x BDP, or 1000 packets). And yet BBR holds a pretty respectable share of bandwidth.

If there is someone with cycles to quantify this further, it would be an interesting study to report back to the list.
 

I do understand that the pacing_gain=1.25 applies for only one cycle out of eight. But because larger maxBW measurements persist for 10 RTTs, a higher maxBW measurement during the pacing_gain=1.25 period will continue to be used even as pacing_gain falls, and in fact for the rest of the pacing_gain cycle. If there is competition, BBR will measure an elevated maxBW during the pacing_gain=1.25 period. By contrast, if the BBR flow is alone on the bottleneck, the maxBW measurement made when pacing_gain = 1.25 does *not* show an increase.

I did ignore RTT probing in my analysis, and this clearly does play a role in keeping up with Reno/Cubic. Each time minRTT increases, inflight increases in proportion. But what is the relative importance of the maxBW upwards ratcheting I described earlier versus the minRTT upwards ratcheting?

Both bw and min_rtt are important. The inflight must be high enough to reach and maintain the fair share, which means the cwnd needs to be at least fair_share_bw * min_rtt. So both the bw and min_rtt must converge upward toward the required value.
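
A back-of-the-envelope check using the numbers from the original example (2 pkt/ms bottleneck, 100 ms min_rtt, two flows); a sketch only:

    fair_share_bw = 2.0 / 2                    # 1 pkt/ms per flow
    min_rtt = 100.0                            # ms
    required_cwnd = fair_share_bw * min_rtt    # 100 packets
    # If either the bw or the min_rtt estimate stays below its true value,
    # cwnd_gain * bw_est * rtt_est can cap inflight below this, and the
    # flow cannot hold its fair share.
    print(required_cwnd)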

neal

2016-08-18-bba01cb9a521-3xbbr-vs-3xcubic-10M-50ms-1000pkt.buffer.420sec.1.util.png

Peter Dordal

Jan 28, 2017, 11:30:09 PM
to BBR Development, pdo...@cs.luc.edu


On Tuesday, January 24, 2017 at 10:24:00 AM UTC-6, Neal Cardwell wrote:


For a simple case, slide 23 of the ICCRG slides give a simple (but long-running) case with 1 CUBIC and 1 BBR:


Of all the graphs, this one struck me as the most interesting, because BBR clearly does *not* do well initially, but then recovers.  Am I correct in my understanding that the recovery begins at the time of the first PROBE_RTT phase?

I would love to run this in a simulator. Is there any ns-3 code yet for BBR?

Peter

Neal Cardwell

Jan 29, 2017, 4:49:27 PM
to Peter Dordal, BBR Development
On Sat, Jan 28, 2017 at 11:30 PM, Peter Dordal <pdo...@cs.luc.edu> wrote:


On Tuesday, January 24, 2017 at 10:24:00 AM UTC-6, Neal Cardwell wrote:


For a simple case, slide 23 of the ICCRG slides give a simple (but long-running) case with 1 CUBIC and 1 BBR:


Of all the graphs, this one struck me as the most interesting, because BBR clearly does *not* do well initially, but then recovers.  Am I correct in my understanding that the recovery begins at the time of the first PROBE_RTT phase?

Yes, the convergence starts once BBR's PROBE_RTT phase causes the min_rtt estimate to reflect the queuing delay imposed by the CUBIC flow. This allows BBR's inflight to grow enough for PROBE_BW to discover more bandwidth and converge toward its fair share.

In this case CUBIC is repeatedly filling the 8*BDP of buffer, leading to median RTTs of roughly 7.75x the min_rtt. When there is a standing queue, a flow's share of throughput is equal to its share of queue slots. The large number of packets that CUBIC is shoving into the queue initially gives it a large proportion of the queue slots, and thus a large share of the bandwidth.
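
A rough sketch of that queue-slot arithmetic, with purely illustrative numbers (not measurements from the experiment):

    # With a standing FIFO queue, a flow's throughput share tracks its
    # share of the queue slots.
    cubic_pkts_queued = 240
    bbr_pkts_queued = 27
    total = cubic_pkts_queued + bbr_pkts_queued
    print("CUBIC share ~", cubic_pkts_queued / total)   # ~0.90
    print("BBR share   ~", bbr_pkts_queued / total)     # ~0.10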

A significant increase in delay is ambiguous: it could be caused by (a) a route change to a longer path, (b) one or more loss-based flows filling the queue, (c) one or more BBR flows entering the path and still learning the bw and min_rtt, or some combination of those. There is a trade-off here: if BBR reacts to a delay increase by estimating that it is dealing with (a) or (b) and increases inflight, but the reality is (c), then it will have caused unnecessary queuing delay and packet loss. The current BBR approach tries to strike a balance in this trade-off.


I would love to run this in a simulator. Is there any ns-3 code yet for BBR?

I don't see ns-3 support for BBR in their repo. But you can check with some of the folks who chimed in on this thread:

 
In the meantime Linux netem is quite good for emulating various RTT and bandwidth parameters, if you want to experiment with the Linux TCP BBR code.

neal

Peter Dordal

Mar 5, 2017, 10:17:45 PM
to BBR Development


On Sunday, January 22, 2017 at 11:27:15 PM UTC-6, Peter Dordal wrote:
I'm trying to understand how BBR and Reno would interact...


OK, I finally got BBR installed. Here's my chart of BBR (green) vs Reno (red) throughput (actually goodput) for a queue size of 8 BDP, done under Mininet. The no-load RTT is 40 ms, the bandwidth is 1.25 MByte/s, and the queue is 267 packets (that works out to 8 BDP, assuming a packet size of ~1500 bytes). The graph covers 300 sec; the Reno connection finishes at 260 sec, at which point BBR consumes all the bandwidth.
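
A quick sanity check of those numbers (a sketch, assuming 1500-byte packets as stated):

    bw_bytes_per_sec = 1.25e6          # 1.25 MByte/s (= 10 Mbit/s)
    rtt_sec = 0.040                    # 40 ms no-load RTT
    pkt_bytes = 1500.0
    bdp_pkts = bw_bytes_per_sec * rtt_sec / pkt_bytes   # ~33.3 packets
    print(bdp_pkts, 8 * bdp_pkts)      # ~33.3 and ~267 -> matches the 267-packet queue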

The general shape is remarkably similar to page 23 of slides-97-iccrg-bbr-congestion-control-2016.pdf (there contrasting cubic and bbr). I have not yet been able to identify any qualitative difference between a reno-v-bbr and a cubic-v-bbr graph, at least for large BBR (unless the flightsize is plotted as well).

In the graph above, you can clearly see the bbr dip in throughput every 10 sec for the PROBE_RTT phase; Reno's throughput ticks up correspondingly. (That throughput uptick is presumably drawing from the queue, not from any change in cwnd.) It's a little harder to see the effect of the pacing-gain cycle, at least at this magnification. That may be an artifact due to the way I measured flightsize and throughput.

Above the throughput graphs are graphs of the flightsize, as measured by snooping TCP SEQ and ACK values. The labeled vertical scale for these graphs is meaningless, except that the heavy black line at 1.400 does represent the zero line.

The goodput is a little less than 1.25 MBps, but I assume the difference is due to headers. We're clearly keeping the queue utilized all the time.

Neal Cardwell

Mar 6, 2017, 4:09:34 PM
to Peter Dordal, BBR Development
Thanks, Peter, for the nice graph and analysis. This is indeed consistent with the kinds of results we see when testing BBR with Reno and CUBIC.

neal



Yuchung Cheng

Mar 6, 2017, 5:07:18 PM
to Neal Cardwell, Peter Dordal, BBR Development
Yes thanks for the test.

BTW, regarding your comment about not seeing a major qualitative difference between reno-v-bbr and cubic-v-bbr: I am not surprised, because at this BDP CUBIC would mostly run in its Reno-emulation mode, meaning it is essentially Reno.

