BBR new implementation validation


Frederic Lecaille

Dec 3, 2024, 2:44:43 PM
to BBR Development
Hello,

Here is a graph which plots the BBR bottleneck bandwidth (->bw, in bytes/s) from a QUIC/BBR implementation during a 3GB object download, as a function of time (in seconds), on a 100 Mbit/s (or 12.5 MB/s) link.

My question is:
is the bandwidth estimate of more than 16 MB/s after 200s suspect/wrong or not during this download? Indeed, as the packet loss is not negligible during this download (more than 15% after reaching more than 16 MB/s on a 100 Mbit/s link), should I conclude that this implementation is overestimating the delivery rate, and as a consequence causing a big packet loss rate?

A second question would be:

If the delivery rate estimation wrong, which would lead BBR to take more than 2/3 of the connection time to reach the ~16 MB/s download rate (but with high packet loss) instead of the more realistic 12.5 MB/s rate?

Regards,
Fred.



bbr.bw.png

Frederic Lecaille

Dec 3, 2024, 3:04:07 PM
to BBR Development
Sorry for the typo, the 2nd question is:

*Is* the delivery rate estimation wrong, and is that what leads BBR to take more than 2/3 of the connection time to reach the ~16 MB/s download rate (but with high packet loss) instead of the more realistic 12.5 MB/s rate?

Neal Cardwell

Dec 3, 2024, 3:17:54 PM
to Frederic Lecaille, BBR Development
On Tue, Dec 3, 2024 at 2:48 PM Frederic Lecaille <flec...@haproxy.com> wrote:
Hello,

Here is a graph which plots the BBR bottleneck bandwidth (->bw, in bytes/s) from a QUIC/BBR implementation during a 3GB object download, as a function of time (in seconds), on a 100 Mbit/s (or 12.5 MB/s) link.

My question is:
is the bandwidth estimate of more than 16 MB/s after 200s suspect/wrong or not during this download?

The estimated bw of 16 MB/s is slightly suspicious, but that could be reasonable, depending on the exact timing behavior of ACKs. There are some common link aggregation/batching behaviors that can cause that kind of bandwidth overestimation.
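
For illustration, a rough sketch of how ACK batching can inflate a single delivery rate sample (the numbers below are made up for the example, not taken from this trace):

#include <stdint.h>
#include <stdio.h>

/* Sketch: how ACK aggregation can inflate a delivery rate sample.
 * All numbers are illustrative. */
int main(void)
{
    /* The link drains at 12.5 MB/s, but ACKs come back in a batch:
     * 128 KB worth of data is acked almost at once. */
    uint64_t delivered_bytes = 128 * 1024;

    /* Interval measured between the first and last ACK of the batch:
     * much shorter than the ~10.5 ms the link really needed. */
    double measured_interval_s = 0.008;  /* 8 ms */
    double true_interval_s = delivered_bytes / 12.5e6;

    printf("bw sample: %.1f MB/s (true: %.1f MB/s)\n",
           delivered_bytes / measured_interval_s / 1e6,
           delivered_bytes / true_interval_s / 1e6);
    /* -> bw sample: 16.4 MB/s (true: 12.5 MB/s); a max filter will
     * latch onto such inflated samples. */
    return 0;
}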
 
Indeed, as the packet loss is not negligible during this download (more than 15% after reaching more than 16 MB/s on a 100 Mbit/s link), should I conclude that this implementation is overestimating the delivery rate, and as a consequence causing a big packet loss rate?

Yes, it sounds like perhaps the bw estimate is too high, and your implementation is using BBRv1 rather than BBRv2 or BBRv3, and the bottleneck buffer is shallow, and the combination of those things is causing significant packet loss.

I would suggest using BBRv3, which should keep the loss rate at a *much* lower level, by learning volumes of in-flight data that are safe and cause acceptably low loss rates:

https://datatracker.ietf.org/doc/draft-ietf-ccwg-bbr/
A second question would be:

If the delivery rate estimation wrong, which would lead BBR to take more than 2/3 of the connection time to reach the ~16 MB/s download rate (but with high packet loss) instead of the more realistic 12.5 MB/s rate?

The bw oscillation between 5 MB/s and 15 MB/s until around t=220sec looks like a separate issue. Something seems to be causing the bw estimate to periodically erroneously drop from around 12 MB/s to around 5 MB/s. A hunch would be that somehow there is a buggy interaction between the bw estimator in that BBR CC implementation and the loss recovery or application-limited logic in that QUIC transport implementation.

I would suggest looking at two kinds of trace together to understand what's going on:

 (1) time-sequence plot visualizing X=time, Y=data_sequence, slope=bandwidth

 (2) logging for each ACK: wall clock time in microseconds, bytes or packets marked delivered on this ACK, bytes or packets marked lost on this ACK, bw sample, estimated bw, pacing rate, cwnd, BBR gain cycling phase

To debug the oscillation, it probably makes sense to drill down in the logs at the moments when the estimated bw drops, to understand why the bw estimate is dropping.
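
As a sketch, the per-ACK log line from (2) can be quite compact; the struct and field names below are hypothetical placeholders to be mapped onto the implementation's actual state:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of the per-ACK state suggested in (2) above. */
struct bbr_ack_dbg {
    uint64_t now_us;       /* wall clock time, microseconds      */
    uint64_t acked_bytes;  /* bytes marked delivered by this ACK */
    uint64_t lost_bytes;   /* bytes marked lost on this ACK      */
    uint64_t bw_sample;    /* rs.delivery_rate for this ACK      */
    uint64_t bw_est;       /* filtered/estimated bw (BBR.bw)     */
    uint64_t pacing_rate;
    uint64_t cwnd;
    int      phase;        /* BBR state / gain cycling phase     */
};

static void bbr_log_ack(const struct bbr_ack_dbg *d)
{
    fprintf(stderr,
            "ack us=%llu acked=%llu lost=%llu bw_smp=%llu bw=%llu "
            "pacing=%llu cwnd=%llu phase=%d\n",
            (unsigned long long)d->now_us,
            (unsigned long long)d->acked_bytes,
            (unsigned long long)d->lost_bytes,
            (unsigned long long)d->bw_sample,
            (unsigned long long)d->bw_est,
            (unsigned long long)d->pacing_rate,
            (unsigned long long)d->cwnd,
            d->phase);
}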

best regards,
neal

 
Regards,
Fred.




Neal Cardwell

Dec 3, 2024, 3:20:54 PM
to Frederic Lecaille, BBR Development
On Tue, Dec 3, 2024 at 3:18 PM Frederic Lecaille <flec...@haproxy.com> wrote:
Sorry for the typo, the 2nd question is:

*Is* the delivery rate estimation wrong, and is that what leads BBR to take more than 2/3 of the connection time to reach the ~16 MB/s download rate (but with high packet loss) instead of the more realistic 12.5 MB/s rate?

Yes, AFAICT it seems the bw estimator has some kind of issue. :-)

I think this part of my reply above still makes sense:

The bw oscillation between 5 MB/s and 15 MB/s until around t=220sec looks like a separate issue. Something seems to be causing the bw estimate to periodically erroneously drop from around 12 MB/s to around 5 MB/s. A hunch would be that somehow there is a buggy interaction between the bw estimator in that BBR CC implementation and the loss recovery or application-limited logic in that QUIC transport implementation.

I would suggest looking at two kinds of trace together to understand what's going on:

 (1) time-sequence plot visualizing X=time, Y=data_sequence, slope=bandwidth

 (2) logging for each ACK: wall clock time in microseconds, bytes or packets marked delivered on this ACK, bytes or packets marked lost on this ACK, bw sample, estimated bw, pacing rate, cwnd, BBR gain cycling phase

To debug the oscillation, it probably makes sense to drill down in the logs at the moments when the estimated bw drops, to understand why the bw estimate is dropping.

best regards,
neal

 

Frederic Lecaille

Dec 4, 2024, 1:48:55 PM
to Neal Cardwell, BBR Development
On 12/3/24 21:17, 'Neal Cardwell' via BBR Development wrote:
> On Tue, Dec 3, 2024 at 2:48 PM Frederic Lecaille <flec...@haproxy.com> wrote:
>> Hello,
>>
>> Here is a graph which plots the BBR bottleneck bandwidth (->bw, in
>> bytes/s) from a QUIC/BBR implementation during a 3GB object download,
>> as a function of time (in seconds), on a 100 Mbit/s (or 12.5 MB/s) link.
>>
>> My question is:
>> is the bandwidth estimate of more than 16 MB/s after 200s suspect/wrong
>> or not during this download?
>
> The estimated bw of 16 MB/s is slightly suspicious, but that could be
> reasonable, depending on the exact timing behavior of ACKs. There are
> some common link aggregation/batching behaviors that can cause that
> kind of bandwidth overestimation.

I see.

>> Indeed, as the packet loss is not negligible during this download
>> (more than 15% after reaching more than 16 MB/s on a 100 Mbit/s link),
>> should I conclude that this implementation is overestimating the
>> delivery rate, and as a consequence causing a big packet loss rate?
>
> Yes, it sounds like perhaps the bw estimate is too high, and your
> implementation is using BBRv1 rather than BBRv2 or BBRv3, and the
> bottleneck buffer is shallow, and the combination of those things is
> causing significant packet loss.

My buggy implementation is using BBRv3.

> I would suggest using BBRv3, which should keep the loss rate at a *much*
> lower level, by learning volumes of in-flight data that are safe and
> cause acceptably low loss rates:
>   https://datatracker.ietf.org/doc/draft-ietf-ccwg-bbr/
>
>> A second question would be:
>>
>> If the delivery rate estimation wrong, which would lead BBR to take
>> more than 2/3 of the connection time to reach the ~16 MB/s download
>> rate (but with high packet loss) instead of the more realistic
>> 12.5 MB/s rate?
>
>
> The bw oscillation between 5 MB/s and 15 MB/s until around t=220sec
> looks like a separate issue. Something seems to be causing the bw
> estimate to periodically erroneously drop from around 12 MB/s to around
> 5 MB/s. A hunch would be that somehow there is a buggy interaction
> between the bw estimator in that BBR CC implementation and the loss
> recovery or application-limited logic in that QUIC transport implementation.
>
> I would suggest looking at two kinds of trace together to understand
> what's going on:
>
>  (1) time-sequence plot visualizing X=time, Y=data_sequence, slope=bandwidth
>
>  (2) logging for each ACK: wall clock time in microseconds, bytes or
> packets marked delivered on this ACK, bytes or packets marked lost on
> this ACK, bw sample, estimated bw, pacing rate, cwnd, BBR gain cycling phase
>
> To debug the oscillation, it probably makes sense to drill down in the
> logs at the moments when the estimated bw drops, to understand why the
> bw estimate is dropping.

At least I have fixed the oscillation issue, which was due to 2 calls to
BBRInflight() with wrong values as <gain>. The side effect was to make
BBR stay in ProbeBW_DOWN for too long before deciding to cruise. Have a
look at the new plot PNG files attached to this mail.

What is weird: why does my BBR implementation eventually decide to stop
oscillating after downloading 2/3 of the object?

About the packet loss rate, it should not be more than 1% over the whole
connection lifetime, I guess? It is still too big on my side. Investigating...

Thanks a lot Neal.

Fred.
bbr.bw.2.png
bbr.bw.png

Neal Cardwell

Dec 4, 2024, 2:03:14 PM
to Frederic Lecaille, BBR Development
Great. That definitely looks like an improvement!

I guess the plots are showing the estimated bandwidth, and not the pacing rate? If so, it seems to me there is still too much oscillation in the estimated bandwidth, if it is a wired or emulated link. If it's a wifi or cellular bottleneck, then perhaps it is working as intended.

Since this is BBRv3, I would suggest double-checking that the estimated bandwidth is correctly computed using the max bandwidth sample from the last two bandwidth-probing cycles. The way the bandwidth in the graphs oscillates makes it seem like perhaps the estimated bandwidth is using the most recent bandwidth sample, and not the max over a longer time range?
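
For reference, a minimal sketch of such a two-slot max filter, following the draft's description of BBR.MaxBwFilter (names and types are illustrative, not from any particular implementation):

#include <stdint.h>

/* Two-slot windowed max filter for BBR.max_bw: one slot for the previous
 * bandwidth-probing cycle, one for the current one. cycle_count is the
 * virtual time; it must advance once per bw-probing cycle, not once per
 * ACK or per round trip. */
struct max_bw_filter {
    uint64_t slot[2];
    uint32_t cycle_count;
};

static void max_bw_update(struct max_bw_filter *f, uint64_t bw_sample)
{
    uint64_t *cur = &f->slot[f->cycle_count & 1];
    if (bw_sample > *cur)
        *cur = bw_sample;
}

/* Called once per probing cycle, one round after exiting ProbeBW_UP. */
static void max_bw_advance(struct max_bw_filter *f)
{
    f->cycle_count++;
    f->slot[f->cycle_count & 1] = 0;  /* the new slot starts empty */
}

static uint64_t max_bw_get(const struct max_bw_filter *f)
{
    return f->slot[0] > f->slot[1] ? f->slot[0] : f->slot[1];
}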
 
What is weird: why does my BBR implementation eventually decide to stop
oscillating after downloading 2/3 of the object?
 
Yes, that is strange. :-) It looks like perhaps the code is getting stuck in ProbeBW_UP. You may want to add debug logging to understand why it decides not to exit that state.

About the packet loss rate, it should not be more than 1% over the whole
connection lifetime, I guess? It is still too big on my side. Investigating...

Yes, a single BBRv3 flow sending through a bottleneck with a stable available bandwidth should typically have an average loss rate of less than 0.1%. In ProbeBW_UP it should experience around 2% loss for one round trip, and then it should experience dozens of round trips with 0 packet loss, with the exact time elapsed depending on the RTT, and then it should repeat.  I'd suggest double-checking that inflight_hi and inflight_lo are being set as expected.
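
As a back-of-the-envelope check (the round count below is an assumption for a ~100ms RTT path, not a measurement):

#include <stdio.h>

int main(void)
{
    double probe_round_loss = 0.02; /* ~2% loss in the one ProbeBW_UP round  */
    int    rounds_per_cycle = 40;   /* one lossy round + dozens of clean ones */

    /* -> ~0.05%, comfortably under 0.1% */
    printf("expected avg loss ~ %.3f%%\n",
           100.0 * probe_round_loss / rounds_per_cycle);
    return 0;
}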

best,
neal

Frederic Lecaille

Dec 5, 2024, 6:23:45 PM
to Neal Cardwell, BBR Development
On 12/4/24 20:02, Neal Cardwell wrote:
Sorry for my late reply.

Yes, the plots are showing the estimated bandwidth (named BBR.bw in the
RFC) on a wired link (BBR sender in Chicago (US), receiver in France)
with an RTT of ~100ms and nearly zero RTT variation most of the time.

> Since this is BBRv3, I would suggest double-checking that the estimated
> bandwidth is correctly computed using the max bandwidth sample from the
> last two bandwidth-probing cycles. The way the bandwidth in the graphs
> oscillates makes it seem like perhaps the estimated bandwidth is using
> the most recent bandwidth sample, and not the max over a longer time range?

I have checked the RFC and modified the code a little accordingly.

> What is weird: why does my BBR implementation eventually decide to stop
> oscillating after downloading 2/3 of the object?
>
>  
> Yes, that is strange. :-) It looks like perhaps the code is getting
> stuck in ProbeBW_UP. You may want to add debug logging to understand why
> it decides not to exit that state.

No, BBR is not stuck in ProbeBW_UP. Most of the time it is cycling
through cruising->refilling->probing up->and down, but it is losing too
many packets during the stable period after downloading for 200s (and
even during the oscillation phase). The ProbeBW_UP phases are short;
this is why they are not visible on the graphs. I am still investigating.

> About the packet loss rate, it should not be more than 1% over the whole
> connection lifetime, I guess? It is still too big on my side. Investigating...
>
>
> Yes, a single BBRv3 flow sending through a bottleneck with a stable
> available bandwidth should typically have an average loss rate of less
> than 0.1%. In ProbeBW_UP it should experience around 2% loss for one
> round trip, and then it should experience dozens of round trips with 0
> packet loss, with the exact time elapsed depending on the RTT, and then
> it should repeat.  I'd suggest double-checking that inflight_hi and
> inflight_lo are being set as expected.

I will do that asap. I was focusing on the oscillation. It is not easy at
all to analyze such issues, due to the huge amount of data to be logged.

To be sure: BBR should be continuously oscillating around the max
estimated bandwidth? Shouldn't it?

Please find attached to this mail a new plot (bbr.bw.3.png). The purple
curve plots the estimated bandwidth; the green one, the difference
between the pacing rate and the estimated bandwidth. bbr.bw.3.2.png is a
plot for the last ~100s.

Regards,
Fred.

bbr.bw.3.png
bbr.bw.3.2.png

Neal Cardwell

Dec 6, 2024, 12:17:35 PM
to Frederic Lecaille, BBR Development
To reduce the volume of data, you may want to do the initial debugging on a slower link? Perhaps a link to a machine at home connected via broadband? Or a link emulated with the netem qdisc? Or perhaps the transperf tool at https://github.com/google/transperf ? (It was designed for this kind of thing...)
 
To be sure: BBR should be continuously oscillating around the max
estimated bandwidth? Shouldn't it?

The pacing rate should be continuously oscillating around the max estimated bandwidth, but the max estimated bandwidth should not be oscillating at all if the link's available bandwidth is not oscillating.
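
In other words, the oscillation should come only from the pacing gain applied on top of a flat BBR.bw, along the lines of the draft's BBRSetPacingRateWithGain(); a sketch:

#include <stdint.h>

#define BBR_PACING_MARGIN_PERCENT 1  /* draft: BBRPacingMarginPercent */

/* The pacing rate tracks BBR.bw scaled by the current state's
 * pacing_gain, so it cycles with the gain (see the draft for the
 * per-state gain values) even while BBR.bw itself stays flat. */
static uint64_t bbr_pacing_rate(uint64_t bw, double pacing_gain)
{
    return (uint64_t)(pacing_gain * bw *
                      (100 - BBR_PACING_MARGIN_PERCENT) / 100);
}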
 
Please find attached to this mail a new plot (bbr.bw.3.png). The purple
curve plots the estimated bandwidth; the green one, the difference
between the pacing rate and the estimated bandwidth. bbr.bw.3.2.png is a
plot for the last ~100s.

Thanks. I think this comment from me still applies: "I would suggest double-checking that the estimated bandwidth is correctly computed using the max bandwidth sample from the last two bandwidth-probing cycles. The way the bandwidth in the graphs oscillates makes it seem like perhaps the estimated bandwidth is using the most recent bandwidth sample, and not the max over a longer time range?"

best regards,
neal

 

Regards,
Fred.

Neal Cardwell

Dec 9, 2024, 1:13:45 PM
to Frederic Lecaille, BBR Development


On Mon, Dec 9, 2024 at 12:02 PM Frederic Lecaille <flec...@haproxy.com> wrote:
On 12/6/24 18:17, Neal Cardwell wrote:
> [...] Or perhaps the transperf tool at https://github.com/google/transperf ?
> (It was designed for this kind of thing...)

As the oscillation issue occurs at the beginning of the download session,
I have reduced the download size to 200MB (~20s).

About transperf: at this time I am not sure I can install all its
requirements on the sender. I will check this.


>     To be sure: BBR should be continuously oscillating around the max
>     estimated bandwidth? Shouldn't it?
>
>
> The pacing rate should be continuously oscillating around the max
> estimated bandwidth, but the max estimated bandwidth should not be
> oscillating at all if the link's available bandwidth is not oscillating.

Ok.


>     Please find attached to this mail a new plot (bbr.bw.3.png). The
>     purple curve plots the estimated bandwidth; the green one, the
>     difference between the pacing rate and the estimated bandwidth.
>     bbr.bw.3.2.png is a plot for the last ~100s.
>
>
> Thanks. I think this comment from me still applies: "I would suggest
> double-checking that the estimated bandwidth is correctly computed using
> the max bandwidth sample from the last two bandwidth-probing cycles. The
> way the bandwidth in the graphs oscillates makes it seem like perhaps
> the estimated bandwidth is using the most recent bandwidth sample, and
> not the max over a longer time range?"

Ok. Perhaps I have missed something, but I have double-checked the code
for the max bandwidth (BBR.max_bw) filter (BBR.MaxBwFilter). We use the
same logic as the quiche QUIC implementation to implement the windowed
max filter:

https://quiche.googlesource.com/quiche/+/5be974e29f7e71a196e726d6e2272676d33ab77d/quic/core/congestion_control/windowed_filter.h

That said, I have just realized that this code is different from the one
used by the Linux kernel in lib/win_minmax.c: it is the timestamp which
is compared when possibly updating the 2nd and 3rd best choices in the
kernel, versus the value in quiche.

Here is a plot with the 3 sampled values for the max window filter (smp1
(green), smp2 (blue), smp3 (orange)). So, the BBR.max_bw value is equal
to smp1. In black we have the last sampled rs.delivery_rate value. The
BBR.cycle_count value is also plotted; this is the time used to update
the windowed filter for the max bandwidth (BBR.MaxBwFilter). Its scale is
on the right. During this test, it is the same code as the kernel's which
is used to update the max filter.

AFAICT there's a bug in the way the BBR.cycle_count variable is updated in your implementation. In BBRv3 it should only be updated once per bandwidth-probing cycle (as long as the flow was not app-limited while it was attempting to probe for bandwidth). I only see about 5 bandwidth-probing cycles in this trace.  So BBR.cycle_count should only be incremented about 5 times. The code seems to be incrementing BBR.cycle_count  about once per round trip, or something like that?
 
Note that during such a short test, there are big losses (more than
10%) during PROBE_BW_UP. Could this explain the max bandwidth oscillation?

No, the losses should not cause the max bw estimate to decrease. :-) The main issue seems to be the BBR.cycle_count maintenance.
 
As for the delivery_rate increase during PROBE_BW_UP, I do not see
why the max bandwidth would not increase during this state. This is the
state during which the max bandwidth increases the most. Then
BBR.max_bw stays stable for 2 cycles. Same thing for the other samples.

If the available bandwidth has not changed, then the max bandwidth estimate should not change while probing for bandwidth because the max bandwidth estimate should "remember" the delivery rate during the last time the flow probed for bandwidth.
 
Also please note that our pacer is newly implemented; perhaps there are
bugs related to it. One question that comes to mind about the
pacing is: what if the pacer "lies" to BBR? I mean, what if it does not
pace the flow at the rate computed by BBR?

Sure, BBR depends on a correct pacing implementation, so if the pacing is not working correctly then things will go awry. :-)
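
One cheap way to catch a pacer that "lies" is to periodically compare the bytes actually sent against what the pacing rate predicts over the same interval; a sketch, where the counters and the 10% tolerance are arbitrary assumptions:

#include <stdint.h>
#include <stdio.h>

static void pacer_sanity_check(uint64_t sent_bytes,  /* since last check */
                               uint64_t interval_us, /* since last check */
                               uint64_t pacing_rate) /* bytes per second */
{
    if (interval_us == 0)
        return;

    double actual   = sent_bytes * 1e6 / interval_us;
    double expected = (double)pacing_rate;

    /* Note: a lower actual rate can also simply mean the flow was
     * app-limited during the interval. */
    if (actual > expected * 1.1 || actual < expected * 0.9)
        fprintf(stderr, "pacer off: actual=%.0f B/s expected=%.0f B/s\n",
                actual, expected);
}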
 

Some other questions came to mind when I had to implement BBR. In
haproxy, the packet loss detection is done before processing the
acknowledged packets. I am not sure this is a good idea for BBR. So,
these BBR functions are called in this order:

BBRHandleLostPacket()
GenerateRateSample()
BBRUpdateOnACK()

That looks OK to me.
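
For context, a sketch of that per-ACK ordering (only the three BBR* names come from the draft's pseudocode; the types and the loss-detection helper are hypothetical):

struct quic_conn;
struct quic_ack;
struct quic_pkt;

void BBRHandleLostPacket(struct quic_conn *qc, struct quic_pkt *pkt);
void GenerateRateSample(struct quic_conn *qc);
void BBRUpdateOnACK(struct quic_conn *qc);
/* Hypothetical: walks newly lost packets, calling BBRHandleLostPacket(). */
void detect_and_mark_losses(struct quic_conn *qc, struct quic_ack *ack);

static void on_ack_received(struct quic_conn *qc, struct quic_ack *ack)
{
    /* 1. Loss detection first, so rs.lost is up to date before the
     *    rate sample is generated. */
    detect_and_mark_losses(qc, ack);
    /* 2. Build the rate sample from the newly acknowledged packets. */
    GenerateRateSample(qc);
    /* 3. Let BBR update its model and control parameters. */
    BBRUpdateOnACK(qc);
}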

best,
neal

 

Regards,
Fred.

Neal Cardwell

Dec 9, 2024, 3:08:07 PM
to Frederic Lecaille, BBR Development


On Mon, Dec 9, 2024 at 2:47 PM Frederic Lecaille <flec...@haproxy.com> wrote:
On 12/9/24 19:13, Neal Cardwell wrote:
>
> AFAICT there's a bug in the way the BBR.cycle_count variable is updated
> in your implementation. In BBRv3 it should only be updated once per
> bandwidth-probing cycle (as long as the flow was not app-limited while
> it was attempting to probe for bandwidth). I only see about 5 bandwidth-
> probing cycles in this trace.  So BBR.cycle_count should only be
> incremented about 5 times. The code seems to be incrementing
> BBR.cycle_count  about once per round trip, or something like that?

Well, I do not know how this is possible, because this involves an easy
part of the code. :-)

BBR.cycle_count is incremented by BBRAdvanceMaxBwFilter(). The latter
is called only under the same conditions as in BBRAdaptUpperBounds():

if (BBR.ack_phase == ACKS_PROBE_STOPPING and BBR.round_start)
  /* end of samples from bw probing phase */
  if (IsInAProbeBWState() and !rs.is_app_limited)
    BBRAdvanceMaxBwFilter()

Same code on my side.

BBR.ack_phase may be set to ACKS_PROBE_STOPPING only by
BBRCheckProbeRTT() and BBRStartProbeBW_DOWN(). But BBRCheckProbeRTT()
sets BBR.ack_phase to ACKS_PROBE_STOPPING when entering PROBE_RTT state.

Impossible here!

So I guess that if BBR.cycle_count is incremented when it should not be,
it is because BBR.ack_phase remains in the ACKS_PROBE_STOPPING phase for
too long.

Another condition for calling BBRAdvanceMaxBwFilter() is that
BBR.round_start is not null but, as far as I see/understand the RFC,
this happens each time the bytes/packets which were delivered have been
acknowledged (according to BBRUpdateRound()).

Sounds like something is up with your BBR.ack_phase state machine implementation. If that BBR.ack_phase approach is troublesome, you might try this alternate approach proposed by Joseph Beshay from Meta:

https://github.com/ietf-wg-ccwg/draft-ietf-ccwg-bbr/pull/5/commits/ee98c12ad6f0e93153656218a7df1b1ef92618d7

That BBRv3 state machine approach gets rid of the BBR.ack_phase variable entirely. I have not tried it myself, but it sounds simpler, and promising.
 
According to this part, I am not sure that the code mentioned above
ensures the cycle_count is correctly updated:

4.5.6. Tracking Time for the BBR.max_bw Max Filter

BBR tracks time for the BBR.max_bw filter window using a virtual
(non-wall-clock) time tracked by counting the cyclical progression
through ProbeBW cycles. Each time through the Probe bw cycle, one round
trip after exiting ProbeBW_UP (the point at which the flow has its best
chance to measure the highest throughput of the cycle), BBR increments
BBR.cycle_count, the virtual time used by the BBR.max_bw filter window.
Note that BBR.cycle_count only needs to be tracked with a single bit,
since the BBR.max_bw filter only needs to track samples from two time
slots: the previous ProbeBW cycle and the current ProbeBW cycle:

OK, if you see a specific problem there, please let me know. :-)

best,
neal
 

Neal Cardwell

Dec 10, 2024, 9:48:43 AM
to Frederic Lecaille, BBR Development


On Tue, Dec 10, 2024 at 6:01 AM Frederic Lecaille <flec...@haproxy.com> wrote:
> https://github.com/ietf-wg-ccwg/draft-ietf-ccwg-bbr/pull/5/commits/ee98c12ad6f0e93153656218a7df1b1ef92618d7
>
> That BBRv3 state machine approach gets rid of the BBR.ack_phase variable
> entirely. I have not tried it myself, but it sounds simpler, and promising.

Ok. It really seems promising. I have tested this patch and I confirm
that the max bw oscillation issue has disappeared, as shown by the last
plot file attached to this mail.

One issue fixed! Thank you Neal!

Great! You're welcome!
 
The next one is the remaining big packet loss issue. I am still
investigating. One thing I have noted on this plot is that BBR settles
on an overestimated max bw (~16 MB/s) I think, with a Startup period
which lasted 4s. This seems too long to me.

Agreed. From that plot it seems like the flow should have exited Startup mode around t=1sec, around the time that the packet loss starts (I'm guessing, based on the plot). Adding debug logging to the code that decides whether to exit Startup based on packet loss should reveal why the flow doesn't exit Startup.
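
A sketch of where such logging could sit, loosely following the draft's Startup loss-exit conditions (the field names and thresholds here are approximations; see draft-ietf-ccwg-bbr for the authoritative logic):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct startup_loss_dbg {
    bool     loss_round_start;     /* did a full loss round just end? */
    uint32_t loss_events_in_round; /* discontiguous loss bursts       */
    uint64_t lost_in_round;        /* bytes lost in this round        */
    uint64_t delivered_in_round;   /* bytes delivered in this round   */
};

static bool startup_should_exit_on_loss(const struct startup_loss_dbg *d,
                                        uint32_t full_loss_cnt, /* e.g. 6    */
                                        double loss_thresh)     /* e.g. 0.02 */
{
    bool exit_startup =
        d->loss_round_start &&
        d->loss_events_in_round >= full_loss_cnt &&
        d->lost_in_round > loss_thresh * (d->lost_in_round +
                                          d->delivered_in_round);

    fprintf(stderr, "startup-loss: round_start=%d events=%u lost=%llu "
            "delivered=%llu -> exit=%d\n",
            d->loss_round_start, d->loss_events_in_round,
            (unsigned long long)d->lost_in_round,
            (unsigned long long)d->delivered_in_round, exit_startup);
    return exit_startup;
}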

best regards,
neal

 
Regards,
Fred

Frederic Lecaille

Dec 10, 2024, 11:19:49 AM
to Neal Cardwell, BBR Development
On 12/10/24 15:48, Neal Cardwell wrote:
> Agreed. From that plot it seems like the flow should have exited Startup
> mode around t=1sec, around the time that the packet loss starts (I'm
> guessing, based on the plot). Adding debug logging to the code that
> decides whether to exit Startup based on packet loss should reveal why
> the flow doesn't exit Startup.

I will check this asap. I was analyzing the packet loss issue. I have
noticed that my code computed huge values for BBR.inflight_hi, due to the
fact that my implementation of BBRInflightHiFromLostPacket() does not
prevent such issues when computing the values for <inflight_prev> and
<lost_prefix> (underflow when subtracting lost_prev from BBRLossThresh *
inflight_prev).

So, what should this function return if

BBRLossThresh * inflight_prev < lost_prev?

inflight_prev, I guess?
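
For what it's worth, a guarded version of that computation could look like the following sketch, based on the draft's BBRInflightHiFromLostPacket() pseudocode with the underflow clamp added (the integer types and the 2% ratio encoding are assumptions):

#include <stdint.h>

#define BBR_LOSS_THRESH_NUM 2    /* BBRLossThresh = 2% */
#define BBR_LOSS_THRESH_DEN 100

/* Returns the inflight level at which losses crossed BBRLossThresh.
 * Assumes tx_in_flight >= packet_size and lost >= packet_size (the lost
 * packet is counted in both). */
static uint64_t inflight_hi_from_lost_packet(uint64_t tx_in_flight,
                                             uint64_t lost,
                                             uint64_t packet_size)
{
    /* inflight before this packet was sent */
    uint64_t inflight_prev = tx_in_flight - packet_size;
    /* losses before this packet was lost */
    uint64_t lost_prev = lost - packet_size;
    uint64_t loss_budget =
        inflight_prev * BBR_LOSS_THRESH_NUM / BBR_LOSS_THRESH_DEN;

    /* Losses already exceeded the threshold before this packet: no
     * loss-free prefix is left, so clamp and return inflight_prev
     * instead of letting the unsigned subtraction wrap. */
    if (lost_prev >= loss_budget)
        return inflight_prev;

    /* Size of the loss-free prefix at the threshold loss rate:
     * (BBRLossThresh * inflight_prev - lost_prev) / (1 - BBRLossThresh) */
    uint64_t lost_prefix = (loss_budget - lost_prev) *
                           BBR_LOSS_THRESH_DEN /
                           (BBR_LOSS_THRESH_DEN - BBR_LOSS_THRESH_NUM);

    return inflight_prev + lost_prefix;
}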


Frederic Lecaille

Dec 10, 2024, 3:38:50 PM
to Neal Cardwell, BBR Development
I have perhaps found another bug in the
BBRAdaptLowerBoundsFromCongestion() implementation on my side. It does
nothing if BBRIsProbingBW() returns true:

/* Once per round-trip respond to congestion */
BBRAdaptLowerBoundsFromCongestion():
  if (BBRIsProbingBW())
    return
  if (BBR.loss_in_round)
    BBRInitLowerBounds()
    BBRLossLowerBounds()


That said, BBRIsProbingBW() is not defined in the RFC pseudocode. My
implementation naively calls the same code as IsInAProbeBWState(). I
think this is not correct. It should rely on this part of the RFC, I guess:


4.5.10.3. When not Probing for Bandwidth

When not explicitly accelerating to probe for bandwidth (*Drain*,
*ProbeRTT*, *ProbeBW_DOWN*, *ProbeBW_CRUISE*), BBR responds to loss by
slowing down to some extent. This is because loss suggests that the
available bandwidth and safe volume of in-flight data may have decreased
recently, and the flow needs to adapt, slowing down toward the latest
delivery process. BBR flows implement this response by reducing the
short-term model parameters, BBR.bw_lo and BBR.inflight_lo


So BBRIsProbingBW() should be implemented as follows, which excludes all
the non-accelerating BBR states mentioned in this section:

BBRIsProbingBW():
  state = BBR.state
  return (state == ProbeBW_REFILL or
          state == ProbeBW_UP)

Perhaps this function should be renamed to reflect the fact that it is
only the accelerating probing states which must make
BBRAdaptLowerBoundsFromCongestion() return early, doing nothing.

Regards,
Fred

Frederic Lecaille

Dec 10, 2024, 5:15:49 PM
to Neal Cardwell, BBR Development
On 12/10/24 21:52, Frederic Lecaille wrote:

> I have added new plots to the png files: BBR.inflight_hi,
> BBR.inflight_lo and rs.loss, whose unit is packets. Their values are
> stored in bytes in the BBR state, but divided by 1252 when plotted,
> 1252 being the QUIC MTU.

I should have mentioned that the scale for inflight_hi, inflight_lo and
loss, with the packet as unit, is on the right side.


Frederic Lecaille

Dec 10, 2024, 5:15:53 PM
to Neal Cardwell, BBR Development
On 12/10/24 21:32, Frederic Lecaille wrote:
> So BBRIsProbingBW() should be implemented as follows, which excludes all
> the non-accelerating BBR states mentioned in this section:
>
> BBRIsProbingBW():
>   state = BBR.state
>   return (state == ProbeBW_REFILL or
>           state == ProbeBW_UP)
>
> Perhaps this function should be renamed to reflect the fact that it is
> only the accelerating probing states which must make
> BBRAdaptLowerBoundsFromCongestion() return early, doing nothing.
>
> Regards,
> Fred

It seems that this modification fixes at least part of the packet loss.
Please have a look at the plots attached to this mail:
bbr.loss.wo.patch.png shows the plots without the patch,
bbr.loss.patch.png with the patch mentioned above.
bbr.loss.patch.png
bbr.loss.wo.patch.png