BBR with FQ_Codel on congested link

Scott Rosenberg

Apr 24, 2018, 1:06:54 PM
to BBR Development
I've been looking through this forum and found nothing that directly addresses this topic.

I understand there has been some limited discussion about fq_codel and BBR, where pacing support was recently added to the Linux kernel. That is not particularly relevant to my question.

If you have a router with a congested link that has fq_codel enabled, does fq_codel's packet dropping negatively affect a BBR flow, given that BBR wouldn't identify the dropped packets as congestion?

Neal Cardwell

Apr 24, 2018, 2:09:54 PM
to thescott...@gmail.com, BBR Development
In our tests so far, BBR works well with fq_codel at the bottleneck, because fq_codel shapes each flow to its fair share rate, and BBR adapts itself to that rate.
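
To make "fair share rate" concrete: a flow-queuing scheduler aims, roughly, at a max-min fair split of the bottleneck among the flows that have packets queued. The sketch below is an idealised calculation under that assumption, not fq_codel's actual DRR code; the function name and the example numbers are made up for illustration.

```python
def max_min_fair_share(capacity, demands):
    """Idealised max-min fair allocation of `capacity` among flows.

    `demands` lists the rate each flow would send at if unconstrained.
    Flows that want less than an equal split keep their demand; the
    leftover is divided equally among the remaining flows.
    """
    alloc = [0.0] * len(demands)
    remaining = capacity
    # Serve the smallest demands first.
    active = sorted(range(len(demands)), key=lambda i: demands[i])
    while active:
        share = remaining / len(active)
        smallest = active[0]
        if demands[smallest] <= share:
            # This flow is satisfied below the equal split.
            alloc[smallest] = demands[smallest]
            remaining -= demands[smallest]
            active.pop(0)
        else:
            # Every remaining flow wants more than the equal split.
            for i in active:
                alloc[i] = share
            break
    return alloc


# One sparse 10 Mbit/s flow and two greedy flows on a 100 Mbit/s link:
# the sparse flow keeps 10, the greedy flows get 45 each.
shares = max_min_fair_share(100, [10, 200, 200])
```

Under this idealisation, a BBR flow that is one of the greedy flows would see about 45 Mbit/s as its delivery rate, and that is the rate it would converge on.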

We would be interested to hear if you do find cases where BBR is not behaving well with an fq_codel bottleneck.

thanks,
neal

 

Dave Taht

Jul 15, 2018, 1:46:30 PM
to BBR Development
Now that sch_cake is mainlined, I have a rather huge battery of tests lined up taking a look at various TCPs left to run for the next couple weeks.

Yes, fq_codel-derived algos work great with BBR. But it drops more packets than I would like in the first 10 seconds and retains a bit of the latecomer problem, and I keep hoping to see a sane response to ECN emerge. Perhaps I'm annoyed enough at this to go hack at it whilst those tests run (after I get caught up on some other major overdue stuff).

Either:

A) tailoring BBR's ECN response to fq_codel's onset of CE. The window here for a new flow is typically only 100ms to start with, with a goal of only 5ms of queuing. Delay-based TCPs tend to work pretty well against fq_codel in the first place, and CE marks from the field, as near as I can tell, are mostly coming from fq_codel deployments, so if you assume FQ is a given, perhaps some things get easier.

B) looking at repurposing ce_threshold as per L4S.

 

 

Dave Taht

Jul 15, 2018, 2:03:15 PM
to BBR Development
Oops, I said that badly. Restating: delay-based TCPs tend to work less than optimally against delay-based AQMs in the first place (reverting to drop), but with a known congestion response window for CE, perhaps that can be made better for BBR.

Neal Cardwell

Jul 15, 2018, 2:29:23 PM
to Dave Taht, BBR Development
On Sun, Jul 15, 2018 at 1:46 PM Dave Taht <dave...@gmail.com> wrote:
> On Tuesday, April 24, 2018 at 11:09:54 AM UTC-7, Neal Cardwell wrote:
> > On Tue, Apr 24, 2018 at 1:06 PM Scott Rosenberg <thescott...@gmail.com> wrote:
> > > I've been looking through this forum and found nothing properly identifying this topic.
> > >
> > > I understand there has been some limited discussion about fq_codel and bbr, where pacing support was recently added to the linux kernel. That is not particularly relevant to my question.
> > >
> > > If you have a router with a congested link that has fq_codel enabled, does this behavior negatively affect a BBR flow due to dropping packets, where as BBR wouldn't identify the dropped packet as congestion?
> >
> > In our tests so far, BBR works well with fq_codel at the bottleneck, because fq_codel shapes each flow to its fair share rate, and BBR adapts itself to that rate.
> >
> > We would be interested to hear if you do find cases where BBR is not behaving well with an fq_codel bottleneck.
> >
> > thanks,
>
> Now that sch_cake is mainlined, I have a rather huge battery of tests lined up taking a look at various TCPs left to run for the next couple weeks.
>
> yes, fq_codel derived algos work great with BBR. But it drops more packets than I would like in the first 10 seconds,

We are actively working on improving BBR's response to packet loss in Startup. Do you have packet traces from your tests that illustrate the behavior you are seeing? I'd be interested to see if the mechanisms we are working with would cover the cases you have in mind there.
 
> retains a bit of the latecomer problem, and I keep hoping to see a sane response to ECN emerge, and perhaps I'm annoyed at this to go hack at it whilst those tests run (after I get caught up on some other major overdue stuff)
>
> Either:
>
> A)  tailoring BBR's ecn response to fq_codel's onset of CE.

We are also working on BBR's response to ECN (specifically, adding such a response). There, too, it would be interesting to see traces showing the ECN mark patterns from your tests.

cheers,
neal

Jonathan Morton

Jul 15, 2018, 3:11:08 PM
to Neal Cardwell, Dave Taht, BBR Development
> On 15 Jul, 2018, at 9:29 pm, 'Neal Cardwell' via BBR Development <bbr...@googlegroups.com> wrote:
>
> A) tailoring BBR's ecn response to fq_codel's onset of CE.
>
> We are also working on BBR's response to ECN (specifically, adding such a response). There, too, it would be interesting to see traces showing the ECN mark patterns from your tests.

Given that BBR currently disables advertisement of ECN capability (correctly, since it isn't), getting such traces may be tricky. Would ECN responses to other TCPs, such as NewReno and Cubic, be helpful? Or mangling ECT(0) into a BBR flow and watching the fireworks? (Since BBR should be able to adapt to the path bandwidth - eventually - without ECN assistance, that might not be as bad as it seems.)
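
For reference when reading traces: the ECN field is the low two bits of the IPv4 TOS (or IPv6 Traffic Class) byte, with the codepoints defined in RFC 3168. The sketch below only shows the bit arithmetic behind "mangling ECT(0)" into a packet; the helper names are made up, and how the rewritten byte gets applied to live traffic is left out.

```python
# ECN codepoints from RFC 3168 (low two bits of the TOS / Traffic Class byte).
NOT_ECT = 0b00  # sender is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport, codepoint ECT(1)
ECT_0 = 0b10    # ECN-capable transport, codepoint ECT(0)
CE = 0b11       # congestion experienced (set by the AQM instead of dropping)


def get_ecn(tos):
    """Return the ECN codepoint carried in a TOS byte."""
    return tos & 0x03


def set_ecn(tos, codepoint):
    """Return the TOS byte with its ECN bits replaced, DSCP bits untouched."""
    return (tos & 0xFC) | (codepoint & 0x03)


# A Not-ECT packet with DSCP EF (TOS 0xB8) rewritten to ECT(0) becomes 0xBA;
# an AQM that later marks it produces 0xBB (CE).
marked = set_ecn(set_ecn(0xB8, ECT_0), CE)
```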

In general, Codel's behaviour is as follows:

- when packet sojourn in queue is less than target (typically 5ms), no marks are made.

- when packet sojourn is observed over target *continuously* for at least the interval, the first mark is made. At this point the marking rate is one mark per interval. Interval is typically 100ms, and is intended to match a prior estimate of the path RTT.

- as long as packet sojourn remains above target, the marking frequency increases linearly with time. To achieve this, the interval between marks decreases on an inverse-square-root schedule per mark (the gap before the n-th mark is interval / sqrt(n)). The second mark will therefore, by default, occur about 70ms after the first.

- when packet sojourn falls below target, marking ceases.

- when packet sojourn remains below target, the marking frequency is reduced according to some rule. The rule for reference Codel is a bit non-intuitive and, I suspect, has not been carefully thought out. Cake's version decreases it linearly, by running the inverse-square-root-schedule in reverse (marks are scheduled but not actually made during this time); eventually it returns to the baseline interval.

- if packet sojourn rises above target very soon after it fell below, the delay before marking resumes, and the marking frequency, may therefore be shorter/faster than the initial condition. This helps Codel to adapt to RTTs shorter than estimated.
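
As a rough illustration of the schedule in the list above, the following sketch computes when marks would fall if sojourn time stayed above target the whole time. It follows the description given here (first mark after one interval, then gaps of interval / sqrt(n)); implementations differ in how they initialise the mark count, so the exact position of the early marks may differ between reference Codel and Cake. The function name is made up for illustration.

```python
import math


def codel_mark_times(interval_ms=100.0, n_marks=5):
    """Times of successive marks, in ms after sojourn first exceeded target.

    Assumes sojourn stays above target throughout, so marking never stops.
    """
    times = []
    t = interval_ms  # first mark: one full interval above target
    for count in range(1, n_marks + 1):
        times.append(t)
        # The gap before the next mark shrinks as interval / sqrt(count + 1).
        t += interval_ms / math.sqrt(count + 1)
    return times


times = codel_mark_times()
gaps = [b - a for a, b in zip(times, times[1:])]
# gaps[0] is 100 / sqrt(2), about 70.7 ms; later gaps keep shrinking.
```

Because the gap before the n-th mark is interval / sqrt(n), the instantaneous marking rate grows as sqrt(n) / interval, which works out to roughly linear growth in marking frequency over time.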

The potential positions of marks can also be inferred from packet losses from the same AQM action on a non-ECT flow, which would result in SACKs in roughly the same pattern as ECE flags arriving at the sender. This is true because Codel both marks and drops packets from the head of the queue, immediately prior to delivery. In general, Codel drops non-ECT packets at the same times as it would mark ECT ones, but it delivers *some* packet in both cases; in case of a drop, it'll be the following packet in the same flow (ensuring fast recognition that a drop has occurred). If the queue is configured deeply, it won't show overflow loss as a FIFO would, and in a lab environment it's also possible to assert that no random loss occurs.

An exception to the mark-loss equivalence rule is that Cake won't drop the last remaining packet in the queue, but might still mark it. Codel and fq_codel might still drop a tail packet, if it's been sitting in the queue long enough.

- Jonathan Morton

Dave Taht

Jul 15, 2018, 10:19:59 PM
to Jonathan Morton, Neal Cardwell, BBR Development
On Sun, Jul 15, 2018 at 12:11 PM Jonathan Morton <chrom...@gmail.com> wrote:
>
> > On 15 Jul, 2018, at 9:29 pm, 'Neal Cardwell' via BBR Development <bbr...@googlegroups.com> wrote:
> >
> > A) tailoring BBR's ecn response to fq_codel's onset of CE.
> >
> > We are also working on BBR's response to ECN (specifically, adding such a response). There, too, it would be interesting to see traces showing the ECN mark patterns from your tests.
>
> Given that BBR currently disables advertisement of ECN capability (correctly, since it isn't),

It does now? It didn't use to: enabling ECN for TCP and turning BBR on
or off were entirely independent last I checked (and at the time, I
had hoped they would become co-dependent until such time as they were
both supported at the same time, so as to minimize user error). I'm
setting up some tests this week, will look into it.
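
For anyone checking this on their own machine: on Linux the two knobs are separate sysctls, net.ipv4.tcp_ecn and net.ipv4.tcp_congestion_control. The sketch below reads both and spells out the documented tcp_ecn modes 0 to 2; the function names are made up, and the /proc path is an assumption about a standard Linux layout.

```python
from pathlib import Path

# Documented meanings of net.ipv4.tcp_ecn values 0-2 on Linux.
TCP_ECN_MODES = {
    0: "ECN off (neither requested nor accepted)",
    1: "ECN requested on outgoing and accepted on incoming connections",
    2: "ECN accepted on incoming connections only (not requested)",
}


def describe_tcp_settings(congestion_control, tcp_ecn):
    """Summarise the two independent settings in one line."""
    mode = TCP_ECN_MODES.get(tcp_ecn, "other/unknown mode")
    return f"cc={congestion_control} tcp_ecn={tcp_ecn}: {mode}"


def read_tcp_settings(root="/proc/sys/net/ipv4"):
    """Read both sysctls from procfs (Linux only; path is an assumption)."""
    base = Path(root)
    cc = (base / "tcp_congestion_control").read_text().strip()
    ecn = int((base / "tcp_ecn").read_text().strip())
    return cc, ecn
```

On a Linux host, `print(describe_tcp_settings(*read_tcp_settings()))` shows whether ECN negotiation is enabled alongside whichever congestion control is the default.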

>getting such traces may be tricky. Would ECN responses to other TCPs, such as NewReno and Cubic, be helpful? Or mangling ECT(0) into a BBR flow and watching the fireworks? (Since BBR should be able to adapt to the path bandwidth - eventually - without ECN assistance, that might not be as bad as it seems.)

It was a matter of hooking it up. There has been some published work
at the ietf on applying l4s's notions to BBR, but it was
insufficiently detailed to readily duplicate. Still:

https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-bbr-congestion-control-with-l4s-support-02

Those results seemed terrible enough to try and tackle with something
attuned to fq-ing and codel-ling. Unlike some, I'd be totally happy
with some bandwidth sacrifice, 0 queuing delay (with fq_codel staying
in its "new" flow phase consistently), and finding some way to
observe when it slips into "old flow" phase, to stay around that.
Since that's a truly tiny number of ms I imagine it's currently
measured in the "noise" that BBR currently rejects, but since we
obsoleted the field of LPCC and there's kind of a need for consistent,
if lower than max, bandwidth for many apps, I'd like to try.

I had three basic approaches in mind:
1) force a ProbeRTT phase on receipt of CE;
2) decrease the RTT estimate by somewhere between 5 and 20ms, reducing
cwnd to suit something less aggressive than cubic but harsher than reno;
3) observe the frequency of CE marks and drops in general, and buffer
depths, and observed RTTs, with various sorts of tcp related traffic
using flent's toolset.

Please note my principal objective is to beat up the "final" version
of sch_cake (after the 19 not so thoroughly evaluated revisions prior
to upstreaming last week) so I'm going to hit it with cdg, vegas,
cubic, and dctcp, and it's just that I'm sufficiently annoyed by BBR
lacking ecn still to want to wrap my hands around it and shake it hard
- but i do got other things on my plate and won't mind if someone
produces patches....

I was pleased to see a couple very good improvements to dctcp land in
net-next recently.

> In general, Codel's behaviour is as follows:

This is a nice summary below, although it's preaching to the largely
converted. :)

>
> - when packet sojourn in queue is less than target (typically 5ms), no marks are made.
>
> - when packet sojourn is observed over target *continuously* for at least the interval, the first mark is made. At this point the marking frequency is equal to the interval. Interval is typically 100ms, and is intended to match a prior estimate of the path RTT.
>
> - as long as packet sojourn remains above target, the marking frequency increases linearly per time. To achieve this, the interval between marks decreases on a inverse-square-root schedule per mark. The second mark will therefore, by default, occur about 70ms after the first.
>
> - when packet sojourn falls below target, marking ceases.
>
> - when packet sojourn remains below target, the marking frequency is reduced according to some rule. The rule for reference Codel is a bit non-intuitive and, I suspect, has not been carefully thought out. Cake's version decreases it linearly, by running the inverse-square-root-schedule in reverse (marks are scheduled but not actually made during this time); eventually it returns to the baseline interval.

One thing from netdevconf was the concept of ABC in the cellular
network, where permission to accelerate is granted by the aqm in the
middle. A lot of the data these folk are dealing with is flawed (they
*start* with traces that have 150-250ms delay already in them before
applying their "solutions") which makes me want to tune out...

https://netdevconf.org/0x12/session.html?congestion-control-for-cellular-wireless-networks

An idea towards "permission to accelerate" is to try and get into the
new/old flow bifurcation fq_codel has. One means is "hard", I think -
as BBR seems (?) to pace a minimum of two packets at a time which will
tend to force things too often into the old flows before being acked.

> - if packet sojourn rises above target very soon after it fell below, the delay before marking resumes, and the marking frequency, may therefore be shorter/faster than the initial condition. This helps Codel to adapt to RTTs shorter than estimated.
>
> The potential positions of marks can also be inferred from packet losses from the same AQM action on a non-ECT flow, which would result in SACKs in roughly the same pattern as ECE flags arriving at the sender. This is true because Codel both marks and drops packets from the head of the queue, immediately prior to delivery. In general, Codel drops non-ECT packets at the same times as it would mark ECT ones, but it delivers *some* packet in both cases; in case of a drop, it'll be the following packet in the same flow (ensuring fast recognition that a drop has occurred).

Um, er,

"in case of a drop, it'll be the following packet in the same flow
(ensuring fast recognition that a drop has occurred)"

is only true on a fq_codel derived algorithm. In the case of an aqm of
any sort, that does not do fq, it's anybody's guess as to what packet
from the flow will arrive next in time. (win!, for fq_codel/fq_pie,
for stability of rate and markings, again).

>If the queue is configured deeply, it won't show overflow loss as a FIFO would, and in a lab environment it's also possible to assert that no random loss occurs.
>
> An exception to the mark-loss equivalence rule is that Cake won't drop the last remaining packet in the queue, but might still mark it. Codel and fq_codel might still drop a tail packet, if it's been sitting in the queue long enough.

I note (from a completely separate, and often heated, thread on the cake
list) that not dropping the last packet in a queue when things are
overloaded is *not* something I agree with at lower bandwidths. It is
better to force an RTO when things are that congested, and/or rely on
other packets already in the pipe to arrive to replace it, in order to
retain low queuing latency, store stuff "in the network and not the
queue". That observation is what led to fq_codel's current, deployed
"drop even if it's the last packet" behavior.

(I don't think that the BBR list is particularly the right place for
furthering this but I did want to make it clear that I think it's
controversial)

Also, since we got cake to 50gbit seeing what happens with > 1024
flows should be "interesting"

>
> - Jonathan Morton
>


--

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619

Yuchung Cheng

Jul 16, 2018, 12:47:03 PM
to Jonathan Morton, Neal Cardwell, Dave Taht, BBR Development
On Sun, Jul 15, 2018 at 3:11 PM, Jonathan Morton <chrom...@gmail.com> wrote:
> On 15 Jul, 2018, at 9:29 pm, 'Neal Cardwell' via BBR Development <bbr...@googlegroups.com> wrote:
>
> A)  tailoring BBR's ecn response to fq_codel's onset of CE.
>
> We are also working on BBR's response to ECN (specifically, adding such a response). There, too, it would be interesting to see traces showing the ECN mark patterns from your tests.

> Given that BBR currently disables advertisement of ECN capability (correctly, since it isn't), getting such traces may be tricky. Would ECN responses to other TCPs, such as NewReno and Cubic, be helpful? Or mangling ECT(0) into a BBR flow and watching the fireworks? (Since BBR should be able to adapt to the path bandwidth - eventually - without ECN assistance, that might not be as bad as it seems.)

BBR does not disable advertisement of ECN. It does not react to ECN, but we're working on it, as Neal replied.

While the exact response function is not settled yet, BBR will likely support a richer form of ECN such as AccurateECN or DCTCP-style ECN, rather than classic ECN (RFC 3168).

> In general, Codel's behaviour is as follows:
>
> - when packet sojourn in queue is less than target (typically 5ms), no marks are made.
>
> - when packet sojourn is observed over target *continuously* for at least the interval, the first mark is made. At this point the marking frequency is equal to the interval. Interval is typically 100ms, and is intended to match a prior estimate of the path RTT.

continuously == marking only starts if every single packet sojourn time over the interval exceeds the target?


> - as long as packet sojourn remains above target, the marking frequency increases linearly per time. To achieve this, the interval between marks decreases on a inverse-square-root schedule per mark. The second mark will therefore, by default, occur about 70ms after the first.
>
> - when packet sojourn falls below target, marking ceases.
>
> - when packet sojourn remains below target, the marking frequency is reduced according to some rule. The rule for reference Codel is a bit non-intuitive and, I suspect, has not been carefully thought out. Cake's version decreases it linearly, by running the inverse-square-root-schedule in reverse (marks are scheduled but not actually made during this time); eventually it returns to the baseline interval.
>
> - if packet sojourn rises above target very soon after it fell below, the delay before marking resumes, and the marking frequency, may therefore be shorter/faster than the initial condition. This helps Codel to adapt to RTTs shorter than estimated.
>
> The potential positions of marks can also be inferred from packet losses from the same AQM action on a non-ECT flow, which would result in SACKs in roughly the same pattern as ECE flags arriving at the sender. This is true because Codel both marks and drops packets from the head of the queue, immediately prior to delivery. In general, Codel drops non-ECT packets at the same times as it would mark ECT ones, but it delivers *some* packet in both cases; in case of a drop, it'll be the following packet in the same flow (ensuring fast recognition that a drop has occurred). If the queue is configured deeply, it won't show overflow loss as a FIFO would, and in a lab environment it's also possible to assert that no random loss occurs.
>
> An exception to the mark-loss equivalence rule is that Cake won't drop the last remaining packet in the queue, but might still mark it. Codel and fq_codel might still drop a tail packet, if it's been sitting in the queue long enough.
>
> - Jonathan Morton


Jonathan Morton

Jul 16, 2018, 1:25:57 PM
to Yuchung Cheng, Neal Cardwell, Dave Taht, BBR Development


> On 16 Jul, 2018, at 7:46 pm, Yuchung Cheng <ych...@google.com> wrote:
>
> BBR does not disable advertisement of ECN. It does not react to ECN but we're working on it as Neal replied.

Hmm. I was sure that someone said ECN was being disabled with BBR - but maybe that was something specific to Google's test deployment servers?

>> - when packet sojourn is observed over target *continuously* for at least the interval, the first mark is made. At this point the marking frequency is equal to the interval. Interval is typically 100ms, and is intended to match a prior estimate of the path RTT.

> continuously == marking only starts if every single packet sojourn time over the interval exceeds the target?

Yes - with the caveat that in fq_codel and Cake there's a Codel instance per flow queue, and only the packets in that particular queue count against these rules. So a mixture of saturating and sparse flows won't confuse the logic, and only the saturating flows will get AQM activity.

Succinctly, Codel aims to signal about *persistent* queuing which is typically caused by too high a send rate or window, while ignoring *transient* queuing which is typically the result of a short burst transmission (eg. GSO, wifi aggregation, DOCSIS MAC grant). So if the queue empties below its target, the previous queue observed must have been transient and not really the sender's fault.
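
The persistent-versus-transient distinction can be sketched as a small detector over the sojourn times of one flow queue. This is a simplified model of the rule as described in this thread, not the kernel code; the function name, the sample format, and the example numbers are made up for illustration.

```python
def first_mark_time(samples, target_ms=5.0, interval_ms=100.0):
    """Return the time of the first mark, or None if no mark is made.

    `samples` is a sequence of (time_ms, sojourn_ms) pairs, one per packet
    dequeued from a single flow queue, in time order.
    """
    above_since = None
    for t, sojourn in samples:
        if sojourn < target_ms:
            # Queue drained below target: whatever came before was transient.
            above_since = None
        elif above_since is None:
            # Start of a new above-target episode.
            above_since = t
        elif t - above_since >= interval_ms:
            # Above target continuously for a full interval: persistent queue.
            return t
    return None


# A short burst that drains within 20 ms never triggers a mark.
burst = [(0, 20.0), (10, 20.0), (20, 2.0), (30, 1.0)]

# A queue that stays at 20 ms of sojourn is marked once a full interval has passed.
standing = [(t, 20.0) for t in range(0, 200, 10)]

# A single sub-target sample at 95 ms restarts the clock, delaying the mark.
interrupted = (
    [(t, 20.0) for t in range(0, 100, 10)]
    + [(95, 2.0)]
    + [(t, 20.0) for t in range(100, 260, 10)]
)
```

With fq_codel or Cake, one such detector would run per flow queue, so a sparse flow sharing the link with a saturating one never accumulates an above-target episode of its own.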

- Jonathan Morton
