How was 200ms chosen as a floor for PROBE_RTT?

Aidan

Feb 5, 2024, 9:37:50 AM
to BBR Development
RFC says:
ProbeRTT lasts long enough (at least ProbeRTTDuration = 200 ms) to allow flows with different RTTs to have overlapping ProbeRTT state.

My question is how? The flows would enter ProbeRTT at different times, and there is no guarantee that multiple flows would be probing for RTT at the same time.

The flip side of this is when the flow goes from ProbeBW_REFILL to ProbeBW_UP. It tries to fill the "pipe" before probing up, but unlike ProbeRTT it does not wait; it just makes sure a round trip has passed while refilling.

Why 200ms and not 100ms? For that matter, why not wait a round trip like ProbeBW_REFILL does?

Would like to take this opportunity to thank everyone working on BBR. Looking forward to v3's RFC :)

Neal Cardwell

Feb 5, 2024, 10:44:30 AM
to Aidan, BBR Development
On Mon, Feb 5, 2024 at 9:37 AM Aidan <sale...@gmail.com> wrote:
RFC says:
ProbeRTT lasts long enough (at least ProbeRTTDuration = 200 ms) to allow flows with different RTTs to have overlapping ProbeRTT state.

My question is how? The flows would enter ProbeRTT at different times, and there is no guarantee that multiple flows would be probing for RTT at the same time.

Yes, by necessity the ProbeRTT mechanism is best-effort. There is (by necessity) no guarantee that multiple flows would be probing for the minimum RTT at the same time.

The goal is to have a best-effort mechanism that makes feasible trade-offs to try to sacrifice a small amount of throughput (roughly 2%) to greatly increase the odds that at least longer bulk flows will be able to measure a good approximation to the two-way propagation delay of their path. And note that, while there is certainly no guarantee that multiple flows would be probing for the minimum RTT at the same time, in practice experience seems to show that the best-effort synchronization approach for ProbeRTT does a reasonable job of often meeting that goal, for longer bulk flows over busy paths, where this is most important. For common application mixes of short/application-limited flows – web pages, chunked streaming video, videoconferences, RPCs – with reasonably well-provisioned network paths the utilization is often low enough for the queue to drain simply due to organic traffic fluctuations.
 
The flip side of this is when the flow goes from ProbeBW_REFILL to ProbeBW_UP. It tries to fill the "pipe" before probing up, but unlike ProbeRTT it does not wait; it just makes sure a round trip has passed while refilling.

Yes, that's because ProbeBW_REFILL has a different goal. ProbeBW_REFILL is not trying to coordinate action, because flows don't need to coordinate anything to raise their amount of in-flight data up to their estimated BDP. But ProbeRTT, to have a decent chance of success, requires coordination between different flows, because the flows cannot measure RTTs close to their two-way propagation delay unless *all* flows sharing the bottleneck have removed all their packets from the bottleneck queue.
 
Why 200ms and not 100ms? 
For that matter, why not wait a round trip like ProbeBW_REFILL does?

After the amount of in-flight data reaches the target amount, ProbeRTT waits for at least 200ms and at least one round trip to elapse.
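
As a rough illustration of the exit rule just described, here is a minimal Python sketch. The function name and its parameters are hypothetical and are not taken from the draft or from any BBR implementation; only the 200 ms value comes from the text above.

```python
# ProbeRTTDuration, as quoted from the draft text in this thread.
PROBE_RTT_DURATION_S = 0.200


def probe_rtt_may_exit(now_s, target_reached_at_s, round_elapsed_since_target):
    """Return True once both exit conditions described above hold.

    now_s: current time in seconds.
    target_reached_at_s: time at which in-flight data first reached the
        ProbeRTT target, or None if it has not reached it yet.
    round_elapsed_since_target: True once at least one full round trip
        has elapsed since the target was reached.
    (All three parameters are hypothetical names for this sketch.)
    """
    if target_reached_at_s is None:
        # The timer does not start until in-flight data reaches the target.
        return False
    waited_long_enough = (now_s - target_reached_at_s) >= PROBE_RTT_DURATION_S
    return waited_long_enough and round_elapsed_since_target
```

For example, a flow with a 10 ms RTT is held by the 200 ms floor, while a flow with a 300 ms RTT is held by the round-trip condition instead.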

The 200ms constant was chosen, as noted above, to allow diverse flows (e.g., flows with different RTTs or low sending rates) to have overlapping ProbeRTT states. More specifically:

+ (1) RTTs: In terms of diversity of RTTs, most Internet traffic probably has a minimum RTT less than 200ms, since 200ms is enough to cover most common continent-spanning paths and some of the most common intercontinental paths, like:
  + West Coast of North America to Eastern Europe
  + Eastern Asia to the West Coast of North America
  + Japan to the East Coast of North America 
And note that much of Internet traffic is streaming video or web traffic between widely-distributed Cloud/CDN nodes at the edge of the Internet and users in the same metropolitan region, with an RTT lower than 40ms or so. So having ProbeRTT last 200ms means that many flows will have a decent chance to get at least one or two round trips of decent, low RTT samples during a ProbeRTT phase.

+ (2) Rates: In terms of rate diversity, the main issue is that low-rate flows have long inter-packet gaps that would make it hard for them to obtain RTT samples if ProbeRTT were too brief. For example, a 64 Kbit/sec connection only sends one standard Ethernet-MTU packet every 1514*8 bits / 64000 bps ~= 189ms. So if ProbeRTT were significantly shorter than 200ms, it would make it difficult for slower flows like this to obtain an RTT sample during the ProbeRTT phase. And if ProbeRTT only waited a round trip, like ProbeBW_REFILL does, then it would make it very difficult for such low-rate flows to obtain a good minimum RTT sample. For example, suppose an intra-datacenter flow with a minimum RTT of 100 microseconds shared a bottleneck with a WAN flow with a minimum RTT of 100 milliseconds. If the intra-datacenter flow were to only stay in ProbeRTT for 100 microseconds, then the odds of the 100ms WAN flow getting an RTT sample during that brief 100 microsecond phase would not be acceptably high. :-)
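
The inter-packet gap arithmetic in point (2) can be reproduced with a small Python sketch. The helper names are hypothetical, and the "expected packets per window" figure is only a back-of-the-envelope ratio that assumes evenly paced packets; it is not a model of BBR itself.

```python
def inter_packet_gap_ms(packet_bytes, rate_bps):
    """Time between packet transmissions for a flow sending at rate_bps."""
    return packet_bytes * 8 / rate_bps * 1000.0


def packets_per_window(window_ms, packet_bytes, rate_bps):
    """Average number of packets sent in a window, assuming even pacing."""
    return window_ms / inter_packet_gap_ms(packet_bytes, rate_bps)


# The example from the text: 1514-byte packets at 64 Kbit/sec.
gap_ms = inter_packet_gap_ms(1514, 64_000)  # about 189 ms

# Compare a 200 ms window with a 100 ms window for the same flow.
in_200ms = packets_per_window(200.0, 1514, 64_000)  # slightly more than 1
in_100ms = packets_per_window(100.0, 1514, 64_000)  # roughly half a packet
```

Under these assumptions a 200 ms window covers at least one transmission from such a flow, while a 100 ms window would miss it about half the time.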

To try to clarify things a bit, I'm proposing the following change to that phrase in the draft text:

"ProbeRTT lasts long enough (at least ProbeRTTDuration = 200 ms) to allow flows with different RTTs to have overlapping ProbeRTT states"
->
"ProbeRTT lasts long enough (at least ProbeRTTDuration = 200 ms) to allow diverse flows (e.g., flows with different RTTs or lower rates and thus longer inter-packet gaps) to have overlapping ProbeRTT states"

Hopefully that makes things at least a little more clear.
 
Would like to take this opportunity to thank everyone working on BBR. Looking forward to v3's RFC :)

Thanks!

neal
 

To view this discussion on the web visit https://groups.google.com/d/msgid/bbr-dev/86bc3388-0b8f-4190-a7ad-64fa02bedf04n%40googlegroups.com.

Aidan

Feb 6, 2024, 10:46:59 PM
to BBR Development
Thank you Neal for the extremely thorough answer.

I like the proposed change to the draft text, and would suggest adding a mention of BBR's self-synchronization, as there is currently no mention of it.
If I understood correctly, other flows will get a new min_rtt sample when a flow goes into ProbeRTT, because the queue has drained a little. Their samples would then expire at roughly the same time, prompting them to probe for RTT at about the same time.
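
To make the self-synchronization idea described above concrete, here is a toy Python sketch. It is a simplification for illustration only: the `MinRttFilter` class, the 5-second expiry interval, and the RTT values are all assumptions of this sketch, not values confirmed in this thread.

```python
from dataclasses import dataclass

# Hypothetical expiry interval for the min_rtt sample (an assumption).
INTERVAL_S = 5.0


@dataclass
class MinRttFilter:
    min_rtt_ms: float  # lowest RTT seen so far
    stamp_s: float     # time at which that RTT was recorded

    def on_rtt_sample(self, rtt_ms, now_s):
        # A sample at or below the current minimum refreshes the timestamp.
        if rtt_ms <= self.min_rtt_ms:
            self.min_rtt_ms = rtt_ms
            self.stamp_s = now_s

    def expires_at(self, interval_s):
        return self.stamp_s + interval_s


# Three flows whose min_rtt samples were taken at different times.
flows = [MinRttFilter(30.0, 1.0), MinRttFilter(30.0, 2.5), MinRttFilter(30.0, 4.0)]
before = [f.expires_at(INTERVAL_S) for f in flows]  # three different expiry times

# Flow 0's sample expires at t=6.0, so it enters ProbeRTT and the queue drains.
# Shortly after, every flow sharing the bottleneck observes a lower RTT.
for f in flows:
    f.on_rtt_sample(20.0, 6.1)

after = [f.expires_at(INTERVAL_S) for f in flows]  # now all the same
```

In this toy model, one flow's ProbeRTT refreshes every flow's timestamp at about the same moment, so their next expiries line up.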
