After the amount of in-flight data reaches the target amount, ProbeRTT waits for at least 200ms and at least one round trip to elapse.
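To make that exit condition concrete, here is a rough sketch in Python. It is only an illustration of the "at least 200ms and at least one round trip" rule as described above, not the draft's pseudocode; the function and parameter names are made up for this example.

```python
# Illustrative sketch only: names are hypothetical, not taken from the draft.
PROBE_RTT_DURATION_MS = 200  # the 200ms constant discussed here


def probe_rtt_may_exit(now_ms, target_reached_at_ms, round_trips_since_target):
    """Return True once BOTH waits have elapsed since in-flight data hit the target.

    target_reached_at_ms is None while in-flight data is still above the target.
    """
    if target_reached_at_ms is None:
        return False  # the timer has not started yet
    waited_long_enough = (now_ms - target_reached_at_ms) >= PROBE_RTT_DURATION_MS
    saw_a_round_trip = round_trips_since_target >= 1
    return waited_long_enough and saw_a_round_trip
```

With this shape, a short-RTT flow is held in ProbeRTT by the 200ms floor, while a flow with an RTT above 200ms is held by the round-trip requirement.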
The 200ms constant was chosen, as noted above, to allow diverse flows (e.g., flows with different RTTs or low sending rates) to have overlapping ProbeRTT states. More specifically:
+ (1) RTTs: In terms of diversity of RTTs, most Internet traffic probably has a minimum RTT less than 200ms, since 200ms is enough to cover most common continent-spanning paths and some of the most common intercontinental paths, like:
+ Eastern Asia to the West Coast of North America
+ Japan to the East Coast of North America
Also note that much Internet traffic is streaming video or web traffic between widely-distributed Cloud/CDN nodes at the edge of the Internet and users in the same metropolitan region, with an RTT lower than 40ms or so. So having ProbeRTT last 200ms means that many flows have a decent chance of getting at least one or two round trips of good, low RTT samples during a ProbeRTT phase.
+ (2) Rates: In terms of rate diversity, the main issue is that low-rate flows have long inter-packet gaps that would make it hard for them to obtain RTT samples if ProbeRTT were too brief. For example, a 64 Kbit/sec connection only sends one standard Ethernet-MTU packet every 1514*8 bits / 64000 bps ~= 189ms. So if ProbeRTT were significantly shorter than 200ms, it would make it difficult for slower flows like this to obtain an RTT sample during the ProbeRTT phase. And if ProbeRTT only waited a round trip, like ProbeBW_REFILL does, then it would make it very difficult for such low-rate flows to obtain a good minimum RTT sample. For example, suppose an intra-datacenter flow with a minimum RTT of 100 microseconds shared a bottleneck with a WAN flow with a minimum RTT of 100 milliseconds. If the intra-datacenter flow were to only stay in ProbeRTT for 100 microseconds, then the odds of the 100ms WAN flow getting an RTT sample during that brief 100 microsecond phase would not be acceptably high. :-)
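For anyone who wants to check the numbers in (1) and (2), here is a small back-of-the-envelope sketch in Python. The helper names are hypothetical and the model is deliberately simple: it assumes evenly paced packets and just compares a ProbeRTT window against an inter-packet gap or a minimum RTT.

```python
# Back-of-the-envelope helpers; names are hypothetical, for illustration only.

def inter_packet_gap_ms(packet_bytes, rate_bps):
    """Time between packet departures for an evenly paced flow, in milliseconds."""
    return packet_bytes * 8 * 1000 / rate_bps


def round_trips_in_window(window_ms, min_rtt_ms):
    """How many of a flow's round trips fit inside a ProbeRTT window."""
    return window_ms / min_rtt_ms


# (2) Rates: a 64 Kbit/sec flow sending 1514-byte frames.
gap = inter_packet_gap_ms(1514, 64_000)
print(f"inter-packet gap: {gap:.0f} ms")

# (1) RTTs: a 40ms metro-area flow during a 200ms ProbeRTT.
print(f"round trips at 40ms RTT: {round_trips_in_window(200, 40):.0f}")

# A 100ms WAN flow during a hypothetical 100-microsecond (0.1ms) ProbeRTT.
print(f"round trips at 100ms RTT in 0.1ms: {round_trips_in_window(0.1, 100):.3f}")
```

The first figure is the ~189ms gap from the example above. The last one shows why waiting only a single (datacenter-scale) round trip would be too brief: the window covers roughly a thousandth of the WAN flow's round trip.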
To try to clarify things a bit, I'm proposing the following change to that phrase in the draft text:
"ProbeRTT lasts long enough (at least ProbeRTTDuration = 200 ms) to allow flows with different RTTs to have overlapping ProbeRTT states"
->
"ProbeRTT lasts long enough (at least ProbeRTTDuration = 200 ms) to allow diverse flows (e.g., flows with different RTTs or lower rates and thus longer inter-packet gaps) to have overlapping ProbeRTT states"
Hopefully that makes things at least a little clearer.