Question on the loss threshold

ssw

Apr 2, 2025, 4:09:55 AM
to BBR Development

Hello, I have a question regarding the determination of the 2% loss threshold value.

I understand that setting a lower threshold may result in poor throughput when BBR competes with traditional congestion control algorithms, while setting a higher threshold could make BBR overly aggressive.

Therefore, I am curious about how the 2% threshold was derived. Was it calculated based on theoretical analysis, or is it primarily supported by empirical results?

Neal Cardwell

Apr 2, 2025, 10:11:57 AM
to ssw, BBR Development
On Wed, Apr 2, 2025 at 4:10 AM ssw <ssw3...@gmail.com> wrote:

Hello, I have a question regarding the determination of the 2% loss threshold value.

I understand that setting a lower threshold may result in poor throughput when BBR competes with traditional congestion control algorithms,

The 2% loss threshold is not necessary for achieving sufficient throughput when BBR competes with traditional (Reno/CUBIC) congestion control algorithms. Since Reno/CUBIC do a large multiplicative decrease (50% or 30%, respectively) upon any round trip with loss, tolerating up to 2% loss is not necessary to compete with them.
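
To make the comparison concrete, here is a minimal Python sketch of the window reduction Reno and CUBIC apply after a round trip with loss, using the 50% and 30% decreases mentioned above. The function name `cwnd_after_loss` and the 100-packet starting window are illustrative assumptions, not code from any TCP stack.

```python
# Multiplicative-decrease fractions taken from the text above:
# Reno cuts cwnd by 50%, CUBIC by 30%, after a round trip with loss.
DECREASE = {"reno": 0.50, "cubic": 0.30}

def cwnd_after_loss(cwnd_packets: float, algo: str) -> float:
    """Return the congestion window after one loss-triggered reduction."""
    return cwnd_packets * (1.0 - DECREASE[algo])

if __name__ == "__main__":
    # Hypothetical starting window of 100 packets.
    for algo in ("reno", "cubic"):
        print(f"{algo}: {cwnd_after_loss(100, algo):.0f} packets")
```

A single lossy round trip already costs these flows 30-50% of their window, which is why a loss tolerance as high as 2% is not needed just to keep up with them.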
 

while setting a higher threshold could make BBR overly aggressive.

Yes. 

Therefore, I am curious about how the 2% threshold was derived. Was it calculated based on theoretical analysis, or is it primarily supported by empirical results?

The loss_thresh value of 2% was supported with empirical results from measurement and testing in three primary environments:

+ the public Internet
+ Google's internal high-speed WANs
+ Google's datacenters

The 2% value was determined to strike a reasonable balance; stopping bandwidth probing at 2% loss over a single round trip is...

+ sufficient to generally achieve full throughput with up to 1% losses in a well-utilized high-speed shallow-buffered WAN using commodity switches
+ sufficient to react to even a single packet loss at typical broadband BDPs (at a BDP of 50 packets or less, reacting to loss at or above 2% means reacting to 1 or more packets lost), enhancing coexistence with Reno/CUBIC
+ sufficient to produce an acceptably low average loss rate that can be around 0.1%, since we only expect to encounter 2% loss in one round trip (the bandwidth probing round trip) out of a bandwidth-probing cycle of 20-60 round trips in a typical WAN path with an RTT of 100 ms or less
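
The arithmetic behind the last two points can be checked with a short Python sketch. It assumes, as a simplification, that a flow sees loss at exactly the 2% threshold in the single probing round and no loss in the other rounds of the cycle. The function names are illustrative and not taken from the BBR code.

```python
LOSS_THRESH = 0.02  # the 2% loss threshold discussed above

def packets_at_threshold(bdp_packets: int, loss_thresh: float = LOSS_THRESH) -> float:
    """Number of lost packets in one round trip that corresponds to loss_thresh."""
    return bdp_packets * loss_thresh

def average_loss_rate(cycle_rounds: int, loss_thresh: float = LOSS_THRESH) -> float:
    """Average loss rate over a probing cycle, assuming loss_thresh loss in
    one round and (as a simplification) zero loss in every other round."""
    return loss_thresh / cycle_rounds

if __name__ == "__main__":
    # At a BDP of 50 packets, 2% corresponds to a single packet.
    print(f"BDP 50 packets: {packets_at_threshold(50):.1f} packet(s) at threshold")
    # Cycle lengths of 20 and 60 round trips, the range quoted above.
    for rounds in (20, 60):
        print(f"{rounds} rounds: average loss {average_loss_rate(rounds):.3%}")
```

Under these assumptions the average works out to roughly 0.1% for a 20-round cycle and about 0.03% for a 60-round cycle.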

best regards,
neal


 

To view this discussion visit https://groups.google.com/d/msgid/bbr-dev/def71252-97fb-416a-a54a-b1f24ee4cea7n%40googlegroups.com.

ssw

Apr 3, 2025, 2:09:53 AM
to BBR Development
Thanks for the detailed answer, Neal!

Best regards,
ssw

On Wednesday, April 2, 2025 at 11:11:57 PM UTC+9, Neal Cardwell wrote:

Maxim Ivanov

Jun 18, 2025, 9:14:03 AM
to BBR Development
Hello Neal,

We see that a simulated 5% packet loss in a lab environment causes BBR to collapse the congestion window to what seems to be a small multiple of the MTU.

Do I understand correctly that it can be explained by this loss threshold? It seems that BBR doesn't enter the bandwidth probing state and therefore can't estimate the BDP properly to increase CWND above the minimum.

How would you recommend testing for bad networks, like low-signal WiFi, where packet loss is constantly present regardless of send rate?

Regards,
Maxim

Neal Cardwell

Jun 18, 2025, 12:11:39 PM
to Maxim Ivanov, BBR Development
On Wed, Jun 18, 2025 at 9:14 AM Maxim Ivanov <ma...@heroiclabs.com> wrote:
Hello Neal,

We see that a simulated 5% packet loss in a lab environment causes BBR to collapse the congestion window to what seems to be a small multiple of the MTU.
... 
Do I understand correctly that it can be explained by this loss threshold?

Which BBR version is this? With BBRv2 or BBRv3 this is expected in the presence of persistent 5% loss.
 
It seems that BBR doesn't enter the bandwidth probing state and therefore can't estimate the BDP properly to increase CWND above the minimum.

That should not happen. Are you seeing evidence of this from logging internal state, or speculating? BBR should always be periodically probing for bandwidth. But if there is persistent 5% loss then at high cwnd values BBR will almost always stop probing quickly, as soon as it measures the per-round-trip loss rate going above loss_thresh.
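
A rough way to see why probing stops quickly at high cwnd values: the Python sketch below computes the probability that the loss rate measured over one round trip reaches a 2% threshold when every packet is dropped independently with probability 5%. The independent-loss model, the simplified "at or above threshold" comparison, and the function name are assumptions made for illustration. This is not the logic BBR actually runs.

```python
from math import ceil, comb

def prob_round_reaches_thresh(packets: int, loss_prob: float = 0.05,
                              loss_thresh: float = 0.02) -> float:
    """P(lost / packets >= loss_thresh) with independent per-packet loss."""
    # Smallest whole number of lost packets that reaches the threshold.
    # The small epsilon guards against float round-off in the product.
    min_lost = max(1, ceil(packets * loss_thresh - 1e-9))
    # Binomial probability of losing fewer than min_lost packets.
    p_below = sum(
        comb(packets, k) * loss_prob**k * (1 - loss_prob)**(packets - k)
        for k in range(min_lost)
    )
    return 1.0 - p_below

if __name__ == "__main__":
    # Hypothetical numbers of packets sent in one round trip.
    for n in (10, 50, 100, 500):
        print(f"{n:4d} packets/round: P = {prob_round_reaches_thresh(n):.3f}")
```

Under these assumptions, a round of 10 packets reaches the threshold about 40% of the time, while a round of 100 packets does so more than 95% of the time.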
 
How would you recommend testing for bad networks, like low-signal WiFi, where packet loss is constantly present regardless of send rate?

I would recommend setting up a real-life bad network, and then testing it. I suspect you will find that with low-signal WiFi networks there is not much TCP-visible packet loss, due to link-layer retransmissions. That is what we typically see in looking at WiFi traces.
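
One way to see why link-layer retransmissions can hide loss from TCP: if each transmission attempt of a frame fails independently with probability p, the packet is lost from TCP's point of view only when the first attempt and every retry fail. The Python sketch below is a back-of-the-envelope model. The independence assumption, the example 20% frame error rate, and the retry limit of 7 are illustrative assumptions, not measurements.

```python
def residual_loss(frame_error_rate: float, max_retries: int) -> float:
    """Loss rate seen above the link layer when a frame is dropped only
    after the initial attempt and all max_retries retries have failed."""
    return frame_error_rate ** (max_retries + 1)

if __name__ == "__main__":
    # Hypothetical link: 20% of transmission attempts fail, up to 7 retries.
    print(f"TCP-visible loss: {residual_loss(0.20, 7):.2e}")
```

Real wireless losses tend to be bursty rather than independent, so this only indicates the direction of the effect. The cost of the retries shows up as added delay and reduced throughput rather than as loss.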

best,
neal

 