Some questions regarding BBR's performance (regarding BBR values from the "ss" tool)


Kano_N

Apr 25, 2018, 5:47:43 PM4/25/18
to BBR Development
Hi all,

I ran some iperf experiments and looked at BBR's values with the "ss" command-line tool. Below are the details:

Testbed: Two Linux machines A and B (with Linux kernel 4.16) connected by a NETGEAR switch. Machine A is the sender, machine B is the receiver.
Benchmark tool: iperf, run for 60 seconds
Bandwidth settings: Use netem with an ifb device to set the bandwidth on the receiver machine B. At time t=0s, set the bandwidth to 500 Mbps; at t=26s, halve it to 250 Mbps; at t=36s, double it back to 500 Mbps.
Delay settings: Add 10 ms of delay on the receiver machine B using htb.

I collected 3 BBR-related values from the "ss" tool {bw, pacing_rate, delivery_rate} every 1 ms, and also recorded the bandwidth reported by iperf. The values are plotted in the attached figures; the 3 figures show the collected values while the bandwidth is {decreasing, stable, increasing}. The original "ss" log is also attached.

There are a few aspects of the performance I don't quite understand, and I'm hoping to get your thoughts:

  1. The "pacing_rate" (shown in red line) is generally higher than the "bw"(shown in blue line). When pacing_rate equals to 1, shouldn't be the estimated "bw" be the same as "pacing_rate"?
  2. During the period when bandwidth doubles (around t=36s), it takes less than 2 cycles (1 cycle = 8 round trips) for the "bw" estimates to double. However, the paper (Figure 3 in https://queue.acm.org/detail.cfm?id=3022184) mentions the "bw" estimates increases 1.95X (1.25^3) in 3 cycles. May I know why it takes less time to double here?
  3. During the period when bandwidth halves -- After the "delivery_rate" (shown in yellow line) begins to decreases, it takes more than 5 cycles (40 rtts) for "bw" and "pacing_rate" to decrease. I thought the estimate bandwidth filter window was 10 rtts so "bw" would start decreasing after 10 rtts. Why it takes 40 rtts for "bw" to start decreasing?
Thanks,
Davy 


ChangeBW_decreasing.png
ChangeBW_increasing.png
ChangeBW_stable.png
ss.log

Neal Cardwell

Apr 25, 2018, 9:53:09 PM4/25/18
to kaya...@gmail.com, BBR Development
On Wed, Apr 25, 2018 at 5:47 PM Kano_N <kaya...@gmail.com> wrote:

Thanks for the nice graphs and good questions!
 
There are a few aspects of the performance I don't quite understand, and I'm hoping to get your thoughts:

  1. The "pacing_rate" (shown in red line) is generally higher than the "bw"(shown in blue line). When pacing_rate equals to 1, shouldn't be the estimated "bw" be the same as "pacing_rate"?

The pacing rate is computed in terms of actual bytes BBR tries to put on the wire per second; it is a function of the desired packet send rate times the full packet MTU (see bbr_bw_to_pacing_rate(), which uses tcp_mss_to_mtu()).

The bandwidth and delivery rate are estimates of application goodput; as such they are a function of the payload size, or MSS, of the connection (basically the packet delivery rate times the MSS). You can check out bbr_get_info() for the bw estimate and tcp_compute_delivery_rate() for the delivery rate, and note that both use tp->mss_cache.

Basically we have:

   MSS = MTU - headers = MTU - (ip_headers + tcp_headers)

Since the MSS is lower than the MTU, the bw and delivery rate (functions of MSS) are lower than the pacing rate (a function of MTU).
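
To make that concrete, here is a minimal standalone sketch (a toy illustration, not the kernel code; the packet rate, MSS, and MTU values are just examples) of how scaling the same packet rate by MTU versus MSS makes pacing_rate a few percent higher than bw:

    /* Toy illustration, not kernel code: the same packet rate scaled by
     * MTU (as the pacing rate is) vs. by MSS (as bw/delivery_rate are). */
    #include <stdio.h>

    int main(void)
    {
        double pkt_rate    = 42000.0; /* example delivery rate, packets/sec           */
        double pacing_gain = 1.0;     /* steady-state gain                             */
        int    mss         = 1448;    /* payload bytes per packet (like tp->mss_cache) */
        int    mtu         = 1500;    /* MSS + IP/TCP headers (like tcp_mss_to_mtu())  */

        double bw_bps     = pkt_rate * mss * 8;                /* "bw", "delivery_rate" */
        double pacing_bps = pkt_rate * mtu * 8 * pacing_gain;  /* "pacing_rate"         */

        printf("bw          = %.1f Mbit/s\n", bw_bps / 1e6);
        printf("pacing_rate = %.1f Mbit/s (%.1f%% higher)\n",
               pacing_bps / 1e6, 100.0 * (pacing_bps - bw_bps) / bw_bps);
        return 0;
    }

So with, say, a 1500-byte MTU and a 1448-byte MSS, the pacing rate comes out roughly 3-4% above the bw estimate even when the pacing gain is 1.0.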
 
  2. During the period when the bandwidth doubles (around t=36s), it takes less than 2 cycles (1 cycle = 8 round trips) for the "bw" estimate to double. However, the paper (Figure 3 in https://queue.acm.org/detail.cfm?id=3022184) says the "bw" estimate increases by 1.95x (1.25^3) over 3 cycles. Why does it take less time to double here?
The max-filtered bandwidth estimate immediately picks up any increase in bandwidth, which is immediately reflected in a higher pacing rate; and if we are probing for bandwidth, it also immediately becomes a higher inflight target for probing. So, depending on the timing, in some cases a quick, virtuous feedback loop can develop: the flow probes for bandwidth, which discovers a higher delivery rate, which immediately feeds back into a higher inflight target for probing, so that the flow rapidly discovers more bandwidth and plateaus once it has saturated the link (much like the feedback loop between increasing pacing rate and measured bandwidth in Startup mode).

One thing we have considered is tweaking the gain-cycling pacing_gain and the inflight target for probing to make this kind of virtuous feedback loop more likely: for example, probing for bandwidth with a pacing_gain of 1.125x while setting an inflight target of 1.25*BDP. That tends to make it likely that the sender will see the higher bandwidth samples starting at the end of the first round of probing, and usually starts the virtuous feedback loop during the second round of probing (if more bandwidth is available). But we haven't had time to investigate this further and test the implications of the revised scheme for fairness or stability. If any researchers have time to investigate, we would be interested to hear the results.
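
To illustrate the compounding, here is a deliberately oversimplified toy model (not BBR's actual state machine; it ignores the non-probing phases of the gain cycle and assumes the link delivers whatever is offered up to its capacity). Each iteration stands for one round trip in which the flow offers 1.25x its current bw estimate and the max filter immediately absorbs the resulting delivery-rate sample:

    /* Toy model of the feedback loop after the link doubles from 250 to
     * 500 Mbit/s.  Not BBR's real state machine: only probing round trips
     * are modeled, and the link is assumed to deliver whatever is offered
     * up to its capacity. */
    #include <stdio.h>

    int main(void)
    {
        double link_bw = 500.0;  /* Mbit/s, capacity after the increase      */
        double bw_est  = 250.0;  /* Mbit/s, max-filtered estimate before it  */

        for (int rtt = 1; rtt <= 5; rtt++) {
            double offered = 1.25 * bw_est;  /* probe at 1.25x the estimate  */
            double sample  = offered < link_bw ? offered : link_bw;

            if (sample > bw_est)  /* max filter picks up any increase at once */
                bw_est = sample;

            printf("round %d: offered %.0f Mbit/s -> bw_est %.0f Mbit/s\n",
                   rtt, offered, bw_est);
        }
        return 0;
    }

Under these (admittedly generous) assumptions the estimate doubles within a handful of round trips once the feedback loop gets going, rather than over three full 8-round-trip cycles.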
 
  3. During the period when the bandwidth halves: after the "delivery_rate" (shown as the yellow line) begins to decrease, it takes more than 5 cycles (40 RTTs) for "bw" and "pacing_rate" to decrease. I thought the bandwidth filter window was 10 RTTs, so "bw" would start decreasing after 10 RTTs. Why does it take 40 RTTs for "bw" to start decreasing?
The clock for the bandwidth filter is in terms of "number of packet-timed round trips elapsed". For the period where the delivery rate has dropped but the bandwidth estimate has not, sending is faster than delivery for a while, so a queue forms, which increases the total RTT and thus the length of these packet-timed round trips. This makes the estimated bandwidth persist longer than it otherwise would. You can see the RTT briefly goes up to about 47ms (quite a bit higher than the 10ms minimum) in the ss output during this period. Note that 10 * 47ms = 470ms, which is about the time that elapses between when the delivery rate drops and when the estimated bw drops, so I suspect the delay before the estimated bw drops is related to that. If you modify the kernel to log the bandwidth samples and bbr->rtt_cnt values, that could help verify this theory.
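
If it helps to see the mechanism, here is a rough sketch of a bw max filter clocked by packet-timed round trips (a simplification for illustration; the kernel actually uses a windowed min/max estimator that tracks several best samples). The point is that an old maximum only ages out after 10 of these round trips, and with each round trip stretched to ~47ms by the queue that is roughly 470ms of wall-clock time:

    /* Rough sketch, not the kernel's minmax code: a windowed max filter for
     * bw, clocked by packet-timed round trips.  With each round trip
     * stretched to ~47ms by the standing queue, expiring the old 500 Mbit/s
     * maximum takes about 10 * 47ms of wall-clock time. */
    #include <stdio.h>

    #define BW_FILTER_RTTS 10   /* window length, in packet-timed round trips */

    struct bw_filter {
        double   best_bw;       /* current windowed maximum                   */
        unsigned best_rtt_cnt;  /* round-trip count when that max was taken   */
    };

    static double bw_filter_update(struct bw_filter *f, double sample,
                                   unsigned rtt_cnt)
    {
        if (sample >= f->best_bw || rtt_cnt - f->best_rtt_cnt > BW_FILTER_RTTS) {
            f->best_bw = sample;        /* new max, or the old max aged out */
            f->best_rtt_cnt = rtt_cnt;
        }
        return f->best_bw;
    }

    int main(void)
    {
        struct bw_filter f = { .best_bw = 500.0, .best_rtt_cnt = 0 };
        double elapsed_ms = 0.0;

        /* After the link halves to 250 Mbit/s, every sample is ~250 and each
         * packet-timed round trip is inflated to ~47ms by the queue. */
        for (unsigned rtt_cnt = 1; rtt_cnt <= 12; rtt_cnt++) {
            elapsed_ms += 47.0;
            double bw = bw_filter_update(&f, 250.0, rtt_cnt);
            printf("rtt_cnt=%2u  t=%5.0f ms  bw_est=%.0f Mbit/s\n",
                   rtt_cnt, elapsed_ms, bw);
        }
        return 0;
    }

Measured against the 10ms minimum RTT, that ~470ms looks like 40-plus "RTTs", which matches what you observed.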

thanks,
neal

Kano_N

Apr 26, 2018, 3:33:59 PM4/26/18
to BBR Development
Hi Neal,

Great, thanks for the detailed and informative reply! It's really helpful.

Davy