On Mon, Aug 26, 2019 at 9:17 PM Dave Taht <dave...@gmail.com> wrote:
>
> Thank you for the flame graph.
> The context of that fq vs fq_codel debate was quite painful - and in the context of a safe default for the internet. Where things stood at that point in time was:
>
> https://github.com/systemd/systemd/issues/9725#issuecomment-413369212
>
> Which is a benchmark you can run on your own deployments.
>
> If it isn't clear from that:
>
> IF your server's traffic is just tcp, and you are not using network namespaces, vpns or vms or containers, and especially if you want to use BBR - BY ALL MEANS switch to sch_fq. (does edf work on quic/udp stuff now?)

I should clarify this a bit more. If you are using a reverse proxy (as
you are) to get at your containers, sch_fq with pacing is even more the
right thing (with or without bbr). In these other circumstances, where
linux is acting essentially as a router, not so much.
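
As a quick way to see which side of that advice a given box currently sits on, here is a minimal sketch that reads the two relevant sysctls from /proc. The helper name read_sysctl and its root parameter are made up for illustration; the sysctl keys net.core.default_qdisc and net.ipv4.tcp_congestion_control are the standard Linux ones.

```python
from pathlib import Path
from typing import Optional


def read_sysctl(name: str, root: str = "/proc/sys") -> Optional[str]:
    """Return the value of a dotted sysctl name, or None if it cannot be read.

    `root` is only a parameter so the helper can be pointed at a test
    directory; on a real Linux host the default is what you want.
    """
    path = Path(root).joinpath(*name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        # Not Linux, key missing, or no permission.
        return None


if __name__ == "__main__":
    for key in ("net.core.default_qdisc", "net.ipv4.tcp_congestion_control"):
        print(key, "=", read_sysctl(key))
```

Note that net.core.default_qdisc is only the default for newly attached qdiscs; what is actually installed on an interface is what `tc qdisc show` reports.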

As for tcp - particularly when observing large rtts - I'd like it if
more folk were actually monitoring their rtts relative to what the
physical path should be achieving. "Out there" are tons of folk trying
via inadequate means like policers and via inbound shaping to keep
their networks usable for gaming, voip and videoconferencing, and
making the assumption that a tcp will react to a drop with a reno-like
response that bbrv1 does not have. Overbuffering along the edge is oft
measured in seconds, when tens of ms is needed.
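
To make "relative to what the physical path should be achieving" concrete, here is a rough sketch of the arithmetic. It assumes light travels through fiber at about two thirds of its vacuum speed and that the path length in km is known; the function names and the 0.67 factor are illustrative choices, not something from this thread.

```python
SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # km travelled per millisecond in vacuum
FIBER_VELOCITY_FACTOR = 0.67           # assumption: roughly 2/3 c in fiber


def min_rtt_ms(path_km: float,
               velocity_factor: float = FIBER_VELOCITY_FACTOR) -> float:
    """Lower bound on round-trip time for a path of the given length."""
    one_way_ms = path_km / (SPEED_OF_LIGHT_KM_PER_MS * velocity_factor)
    return 2.0 * one_way_ms


def excess_delay_ms(measured_rtt_ms: float, path_km: float) -> float:
    """RTT above the physical floor: queueing, processing, routing detours."""
    return max(0.0, measured_rtt_ms - min_rtt_ms(path_km))


def buffer_drain_ms(buffer_bytes: float, link_bits_per_s: float) -> float:
    """Time a full buffer adds in front of a link of the given rate."""
    return buffer_bytes * 8.0 / link_bits_per_s * 1000.0


if __name__ == "__main__":
    # Hypothetical numbers: a 4000 km path measured at 250 ms RTT.
    print("floor  :", round(min_rtt_ms(4000), 1), "ms")
    print("excess :", round(excess_delay_ms(250, 4000), 1), "ms")
    # 1.25 MB of buffering ahead of a 10 Mbit/s uplink.
    print("buffer :", buffer_drain_ms(1_250_000, 10_000_000), "ms")
```

The last call is the overbuffering case in miniature: 1.25 MB queued in front of a 10 Mbit/s link takes a full second to drain.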

As for the cpu hit you are observing - I have no way to duplicate your
workload. It does sound buggy, and a worthwhile thing to analyze
independently. But I'd recommend a kernel upgrade first as a test -
and then, packet captures and more flame graphs. :/

sch_fq can self congest - I've seen googlers recommend using a shaper
on it - and nobody actually knows how aws does rate management in the
first place! I'd love it if more interactivish apps used tcp_ lowat.

BBRv2 looks quite promising except for the rfc3168 and sce vs L4S debate.
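
On the low-water-mark remark above: assuming it refers to the Linux TCP_NOTSENT_LOWAT socket option (a guess, the name in the mail is incomplete), a per-socket sketch looks like the following. The helper name and the 16 KiB threshold are arbitrary illustrations.

```python
import socket
from typing import Optional


def set_notsent_lowat(sock: socket.socket, nbytes: int) -> Optional[int]:
    """Cap the amount of unsent data the kernel will queue for this socket.

    Returns the value the kernel reports back, or None when the option is
    not available on this platform or Python build.
    """
    opt = getattr(socket, "TCP_NOTSENT_LOWAT", None)
    if opt is None:
        return None
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, nbytes)
        return sock.getsockopt(socket.IPPROTO_TCP, opt)
    except OSError:
        # Option known to Python but rejected by the running kernel.
        return None


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print("notsent_lowat:", set_notsent_lowat(s, 16 * 1024))
    finally:
        s.close()
```

With a small threshold the socket only reports writable once the unsent backlog drops below it, so an interactive app can decide what to send as late as possible instead of queueing stale data in the kernel.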
> It's not clear to me how well sch_fq actually works in a vm; my impression is google mostly runs it on bare metal.
>
> My own fear is that fq_codel is the only thing keeping billions of containers from melting down the internet, but
> I have no data on it aside from a few benchmarks like that. I would welcome more testing and it would be
> great to have one all-singing, all-dancing qdisc.
>
--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740