Hey Neal,

I was revisiting this thread before presenting this paper in iccrg tomorrow - and I was particularly intrigued by one of the motivations you mentioned for BBR:

"BBR is not trying to maintain a higher throughput than CUBIC in these kinds of scenarios with steady-state bulk flows. BBR is trying to be robust to the kinds of random packet loss that happen in the real world when there are flows dynamically entering/leaving a bottleneck."

BBRv1 essentially tried to deal with this problem by doing away with packet loss as a congestion signal and having an entirely different philosophy to congestion control. However, if we set aside the issue of buffer bloat, I would imagine packet loss is a bad congestion signal in this situation because most loss-based congestion control algorithms use it as a binary signal with a binary response (back-off or no back-off). In other words, I feel the blame must be placed on not just the congestion signal, but also on how most algorithms respond to this congestion signal.
On a per-packet basis, packet loss is a binary signal. But over a window, the loss percentage and distribution, for example, can be a rich signal. There is probably scope for differentiating between different kinds of packet losses (and deciding how to react to them) when packet loss is coupled with the most recent delay measurement too. Now that BBRv2 reacts to packet loss, are you making any of these considerations too?
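To make that concrete, here is a rough sketch of what I mean by treating loss as a windowed signal (purely illustrative - the class, the names, and the burstiness heuristic are mine, not from any real stack):

    from collections import deque

    class WindowedLossSignal:
        """Track per-packet outcomes over a sliding window, so loss becomes
        a graded signal (rate + distribution) instead of a binary one."""

        def __init__(self, window=1000):
            self.outcomes = deque(maxlen=window)  # 1 = lost, 0 = delivered

        def record(self, lost):
            self.outcomes.append(1 if lost else 0)

        def loss_rate(self):
            # Loss fraction over the window, not a single binary event.
            if not self.outcomes:
                return 0.0
            return sum(self.outcomes) / len(self.outcomes)

        def burstiness(self):
            # Fraction of losses immediately followed by another loss.
            # Bursty losses hint at queue overflow; isolated losses look
            # more like random, non-congestive loss.
            o = list(self.outcomes)
            losses = sum(o)
            if losses == 0:
                return 0.0
            pairs = sum(1 for a, b in zip(o, o[1:]) if a == 1 and b == 1)
            return pairs / losses

    sig = WindowedLossSignal()
    for lost in [0, 0, 1, 1, 0, 1, 0, 0]:
        sig.record(lost)
    print(sig.loss_rate(), sig.burstiness())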
This is not something I plan to present in iccrg tomorrow, just something I was curious about :)
Hi Ayush,

> On Mar 28, 2023, at 11:36, Ayush Mishra via Bloat <bl...@lists.bufferbloat.net> wrote:
>
> Hey Neal,
>
> I was revisiting this thread before presenting this paper in iccrg tomorrow - and I was particularly intrigued by one of the motivations you mentioned for BBR:
>
> "BBR is not trying to maintain a higher throughput than CUBIC in these kinds of scenarios with steady-state bulk flows. BBR is trying to be robust to the kinds of random packet loss that happen in the real world when there are flows dynamically entering/leaving a bottleneck."
But isn't "when there are flows dynamically entering" actually a bona fide reason for the already established flows to scale back a bit, to give the new-commers some room to establish themselves?
> BBRv1 essentially tried to deal with this problem by doing away with packet loss as a congestion signal and having an entirely different philosophy to congestion control. However, if we set aside the issue of buffer bloat, I would imagine packet loss is a bad congestion signal in this situation because most loss-based congestion control algorithms use it as a binary signal with a binary response (back-off or no back-off). In other words, I feel the blame must be placed on not just the congestion signal, but also on how most algorithms respond to this congestion signal.
Fair enough, but even if we assume a capacity-based loss we really do not know:
a) did the immediate traffic simply exceed the bottleneck's queue (assuming a fixed egress capacity/rate)?
b) did the immediate traffic simply exceed the bottleneck's egress capacity (think of a variable-rate link that just dropped in rate while the traffic rate was constant)?
In case a) we might be OK with a gentle reduction (and take a bit to do so); in case b) we probably should make a less gentle reduction, and preferably ASAP.
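To spell out the asymmetry I have in mind, a toy sketch (the factors are invented, and the classification input is exactly the part a single binary loss does not give us):

    def react_to_loss(cwnd, cause):
        # 'cause' is precisely what one binary loss event cannot tell us.
        if cause == "queue_overflow":      # case a)
            return 0.95 * cwnd             # gentle, can be spread over time
        elif cause == "capacity_drop":     # case b)
            return 0.5 * cwnd              # hard, and ASAP
        return cwnd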
>
> On a per-packet basis, packet loss is a binary signal. But over a window, the loss percentage and distribution, for example, can be a rich signal. There is probably scope for differentiating between different kinds of packet losses
Sure, as long as a veridical congestion detection is still timely enough not to make case b) above worse...
> (and deciding how to react to them) when packet loss is coupled with the most recent delay measurement too.
Hmm, say we get an "all is fine" delay probe at time X, at X+1 the capacity drops to 50% and we incur a drop; will the most recent delay data actually be informative for the near future?
Regards
Sebastian
On Sun, Apr 2, 2023 at 8:14 AM Sebastian Moeller <moel...@gmx.de> wrote:
Hi Ayush,
> On Mar 28, 2023, at 11:36, Ayush Mishra via Bloat <bl...@lists.bufferbloat.net> wrote:
>
> Hey Neal,
>
> I was revisiting this thread before presenting this paper in iccrg tomorrow - and I was particularly intrigued by one of the motivations you mentioned for BBR:
>
> "BBR is not trying to maintain a higher throughput than CUBIC in these kinds of scenarios with steady-state bulk flows. BBR is trying to be robust to the kinds of random packet loss that happen in the real world when there are flows dynamically entering/leaving a bottleneck."
But isn't "when there are flows dynamically entering" actually a bona fide reason for the already established flows to scale back a bit, to give the new-commers some room to establish themselves?Yes, I agree that "when there are flows dynamically entering" is actually a bona fide reason for the already established flows to scale back to give the newcomers some room to establish themselves. I'm not arguing against scaling back to give the newcomers some room to establish themselves. I'm arguing against the specific way that Reno and CUBIC behave to try to accomplish that. :-)
Hi Neal,

thanks for your response. To make it clear: I appreciate this discussion, and I in no way want to imply that the BBRs are doing anything untoward here; this is about understanding the principles better.
> On Apr 2, 2023, at 16:02, Neal Cardwell <ncar...@google.com> wrote:
>
>
>
> On Sun, Apr 2, 2023 at 8:14 AM Sebastian Moeller <moel...@gmx.de> wrote:
> Hi Ayush,
>
> > On Mar 28, 2023, at 11:36, Ayush Mishra via Bloat <bl...@lists.bufferbloat.net> wrote:
> >
> > Hey Neal,
> >
> > I was revisiting this thread before presenting this paper in iccrg tomorrow - and I was particularly intrigued by one of the motivations you mentioned for BBR:
> >
> > "BBR is not trying to maintain a higher throughput than CUBIC in these kinds of scenarios with steady-state bulk flows. BBR is trying to be robust to the kinds of random packet loss that happen in the real world when there are flows dynamically entering/leaving a bottleneck."
>
> But isn't "when there are flows dynamically entering" actually a bona fide reason for the already established flows to scale back a bit, to give the new-commers some room to establish themselves?
>
> Yes, I agree that "when there are flows dynamically entering" is actually a bona fide reason for the already established flows to scale back to give the newcomers some room to establish themselves. I'm not arguing against scaling back to give the newcomers some room to establish themselves. I'm arguing against the specific way that Reno and CUBIC behave to try to accomplish that. :-)
[SM] Fair enough. There likely is room for improvement.
>
> > BBRv1 essentially tried to deal with this problem by doing away with packet loss as a congestion signal and having an entirely different philosophy to congestion control. However, if we set aside the issue of buffer bloat, I would imagine packet loss is a bad congestion signal in this situation because most loss-based congestion control algorithms use it as a binary signal with a binary response (back-off or no back-off). In other words, I feel the blame must be placed on not just the congestion signal, but also on how most algorithms respond to this congestion signal.
>
> Fair enough, but even if we assume a capacity-based loss we really do not know:
> a) did the immediate traffic simply exceed the bottleneck's queue (assuming a fixed egress capacity/rate)?
> b) did the immediate traffic simply exceed the bottleneck's egress capacity (think of a variable-rate link that just dropped in rate while the traffic rate was constant)?
>
> In case a) we might be OK with a gentle reduction (and take a bit to do so); in case b) we probably should make a less gentle reduction, and preferably ASAP.
>
> Agreed. And that's the approach that BBRv2 takes; it would behave differently in the two cases. In case (a) it would essentially notice that packets are being dropped and yet the delivery rate remains high, so would infer that in-flight is too high but the estimated bandwidth seems OK, so it would immediately reduce the cwnd slightly but maintain the pacing rate.
[SM] Showing my confusion here: will reducing the cwnd not result in a reduced pacing rate at some point? Or are we talking about the immediate response here and not the (slightly) longer-term average?
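[SM] (For my own bookkeeping, my understanding of why the two can move separately, sketched below; the gains and naming are BBR-style, but the code is my own illustration, not anyone's implementation:)

    # In BBR-style control, pacing rate and cwnd are both derived from the
    # bandwidth estimate, not from each other, so one can be reduced
    # without immediately dragging the other down with it.
    def control_params(bw_est, min_rtt, pacing_gain=1.0, cwnd_gain=2.0):
        pacing_rate = pacing_gain * bw_est       # packets/sec
        cwnd = cwnd_gain * bw_est * min_rtt      # packets (BDP-scaled cap)
        return pacing_rate, cwnd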
> In case (b) it would notice that the loss rate is high and delivery rate has reduced substantially, so would immediately and substantially reduce both the cwnd and pacing rate.
[SM] But to notice a high loss rate, will we not have to wait and withhold our response (a bit) longer, or are we talking about DupACKs showing more than one segment missing? (Both can be fine and used in conjunction; I just wonder what you had in mind here.)
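[SM] If I translate what you describe into a toy rule (my own reading, not actual BBRv2 code; the thresholds are invented):

    def on_loss_event(loss_rate, delivery_rate, est_bw, cwnd, pacing_rate):
        # Case (a): losses, but delivery rate still near the bandwidth
        # estimate -> in-flight too high, estimate fine: trim cwnd a bit,
        # keep the pacing rate.
        if loss_rate > 0.0 and delivery_rate >= 0.9 * est_bw:
            return 0.9 * cwnd, pacing_rate
        # Case (b): high loss rate and delivery rate well below the
        # estimate -> capacity itself dropped: cut both, immediately.
        if loss_rate > 0.1 and delivery_rate < 0.7 * est_bw:
            return 0.5 * cwnd, 0.5 * pacing_rate
        return cwnd, pacing_rate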
>
> >
> > On a per-packet basis, packet loss is a binary signal. But over a window, the loss percentage and distribution, for example, can be a rich signal. There is probably scope for differentiating between different kinds of packet losses
>
> Sure, as long as a veridical congestion detection is still timely enough not to make case b) above worse...
>
> Agreed.
>
> > (and deciding how to react to them) when packet loss is coupled with the most recent delay measurement too.
>
> Hmm, say we get an "all is fine" delay probe at time X, at X+1 the capacity drops to 50% and we incur a drop; will the most recent delay data actually be informative for the near future?
>
> Usually it takes an ACK (a dupack or ACK carrying a SACK block) ACKing data that transited the network path *after* the loss to infer the loss (consistent with the RACK philosophy), and that ACK will usually provide a delay sample. So when there is loss usually there will be a delay signal that is at least as fresh as the loss signal, providing a hint about the state of the bottleneck queue after the loss. So even with loss I'd imagine that using that most recent delay data should usually be informative about the near future.
[SM] Thanks, I think I was confusing the timing of the bandwidth-probing steps with the latency measurements; thanks for clearing that up, and sorry... Yes, I agree that the delay measurement is as good as it gets, and yes, typically we should be able to extrapolate a bit into the future...
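[SM] For my own notes, the coupling you describe in toy code: a RACK-style inference where the SACK that reveals the loss also delivers the fresher RTT sample (illustrative only; real RACK additionally applies a reordering window before declaring loss):

    from dataclasses import dataclass

    @dataclass
    class Pkt:
        send_time: float
        sacked: bool = False

    def on_sack(sacked_pkt, outstanding, now):
        # The ACK that lets us infer the loss also yields an RTT sample
        # from data that crossed the path *after* the loss, so the delay
        # signal is at least as fresh as the loss signal.
        rtt_sample = now - sacked_pkt.send_time
        sacked_pkt.sacked = True
        presumed_lost = [p for p in outstanding
                         if p.send_time < sacked_pkt.send_time
                         and not p.sacked]
        return presumed_lost, rtt_sample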
Many Thanks & Best Regards