On Thu, Jan 8, 2015 at 10:39 AM, Oliver Mattos <oma...@gmail.com> wrote:
> Designing a system which has arbitrary bad paths, but that can then probe to
> get better paths seems suboptimal... That probe process is going to take a
> long time,
In the case of mosh, it doesn't matter how long it takes. The protocol is very
lightweight and is basically sending a diff from the last known state of the
terminal. If all but one of 99 probes fail, you still won.
In the case of QUIC, you have the inevitable keep-alive process you need
to maintain anyway, after connection establishment.
> and how do you know if you have a "good" or "bad" or "quite good"
> path? It becomes the multi-armed bandit problem!
mosh-multipath measures RTT as its primary metric.
Try it.
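To make that concrete, here's a rough sketch of the idea (not the actual
mosh-multipath code; the server address, source-port range, and timeout are
made up for illustration): send a probe from each of several candidate source
ports, time the replies, and keep whichever path answers fastest.

    import socket
    import time

    SERVER = ("203.0.113.7", 60001)          # hypothetical mosh-style UDP endpoint
    CANDIDATE_PORTS = range(61000, 61010)    # nail up a small range of source ports
    TIMEOUT = 2.0

    def probe(local_port):
        # send one datagram from a specific source port and time the round trip
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("0.0.0.0", local_port))
        s.settimeout(TIMEOUT)
        try:
            start = time.monotonic()
            s.sendto(b"probe", SERVER)
            s.recvfrom(1500)                 # assumes the far end echoes something
            return time.monotonic() - start
        except socket.timeout:
            return None                      # this path is dead or lossy right now
        finally:
            s.close()

    rtts = {p: probe(p) for p in CANDIDATE_PORTS}
    alive = {p: r for p, r in rtts.items() if r is not None}
    if alive:
        best, rtt = min(alive.items(), key=lambda kv: kv[1])
        # even if 9 of the 10 probes failed, the surviving path still wins
        print("using source port %d, rtt %.1f ms" % (best, rtt * 1000))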
Mosh was one of those tools that I went around installing *everywhere*, much
like I did Mosaic back in the day. I have nearly completely abandoned ssh for
interactive (terminal) traffic as a result. mosh-multipath is making an
occasionally annoying problem (fq hash collision) vanish completely.
In the case of QUIC with multiple connections, I would imagine periodic
measurements of idle RTT, and preserving a mapping between
(observed capacity, connection).
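Something like this, say (pseudocode in Python syntax; there is no real QUIC
stack here, and the ping() call and capacity bookkeeping are assumed to exist):
keep a per-connection stats table, refresh the idle RTT on each keep-alive
tick, and pick the connection with the best (capacity, RTT) pair when you have
bulk data to move.

    import time

    class PathStats(object):
        def __init__(self):
            self.rtt = None        # last idle RTT, in seconds
            self.capacity = 0.0    # rough delivered bytes/sec seen on this connection

    stats = {}                     # connection -> PathStats

    def keepalive_tick(connections):
        # periodically measure idle RTT on every nailed-up connection
        for conn in connections:
            st = stats.setdefault(conn, PathStats())
            start = time.monotonic()
            conn.ping()            # assumed: blocks until the keep-alive is acked
            st.rtt = time.monotonic() - start

    def best_connection(connections):
        # prefer high observed capacity, then low idle RTT
        measured = [c for c in connections
                    if c in stats and stats[c].rtt is not None]
        if not measured:
            return None
        return max(measured, key=lambda c: (stats[c].capacity, -stats[c].rtt))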
>
> Just make sure we don't get bad paths in the first place by using N hashing
> algorithms in the FQ design, and take the min response, a bit like a
> counting bloom filter
You are possibly conflating two things. In this case the "fq" stuff is on a
bottleneck link somewhere else on the network. There is no way for the
application to infer the number of queues, the kind of stochastic hash
(2-tuple? 5-tuple?), the actual value of the hash (randomly permuted), etc.
An application can, however, observe differences in RTT when nailed up
via different IPs and port ranges. The mosh-multipath technique was
originally intended for dealing with source-specific routing, where a
given set of source IPs might end up going out vastly different
(Ethernet, wifi, LTE) paths, with intermittent connectivity on all of
them.
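In sketch form (the addresses below are illustrative; in practice you'd
enumerate the host's interfaces), that just means binding the probe socket to
each candidate source address and letting source-specific routing pick the
path:

    import socket
    import time

    SERVER = ("203.0.113.7", 60001)   # same hypothetical endpoint as above
    LOCAL_ADDRS = ["192.0.2.10",      # ethernet
                   "198.51.100.20",   # wifi
                   "203.0.113.30"]    # LTE

    def probe_from(addr):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((addr, 0))             # source address selects the outgoing path
        s.settimeout(2.0)
        try:
            start = time.monotonic()
            s.sendto(b"probe", SERVER)
            s.recvfrom(1500)
            return time.monotonic() - start
        except (socket.timeout, OSError):
            return None               # that interface may be down right now
        finally:
            s.close()

    print({a: probe_from(a) for a in LOCAL_ADDRS})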
> If you have 2^20 hash buckets, you just reduced your
Although the fq_codel algorithm, as implemented, supports 2^16 hash buckets,
the default is 2^10. This is partially due to keeping codel state on a
per-queue basis on links in the 0-1 gigabit range and Monte Carlo testing (see
https://tools.ietf.org/html/draft-hoeiland-joergensen-aqm-fq-codel-01),
and partially due to trying to keep memory overhead low. The "right"
number has not been derived for faster links or for "all" traffic
loads; 2^10 was merely a good-seeming number given traffic conditions
today.
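(For reference, the bucket count is just a qdisc parameter; on a Linux box
something like the following sets it explicitly, with eth0 standing in for
whatever your interface actually is:)

    # flows defaults to 1024 (2^10); the implementation accepts up to 65536 (2^16)
    tc qdisc add dev eth0 root fq_codel flows 1024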
The full-blown sch_fq algorithm (widely deployed now in Linux-based
datacenters) uses a red-black tree to do completely fair queueing with
no stochastic hash, scaling to millions of flows.
SFQ, as deployed, uses 127 queues, and many deployments permute the
hash every 10 seconds (scrambling long-duration flows and acting as a
poor man's AQM).
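(That's the classic perturb knob, e.g.:)

    tc qdisc add dev eth0 root sfq perturb 10   # re-seed the hash every 10 seconds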
Other combinations of FQ + AQM technologies do the FQ first, across N
flows, then apply a single AQM algorithm to the resulting flow.
> chance of full collision to about 2^(20*N), as long as the count of elephant
> flows << 2^20.
Well, there is the birthday problem, and the fact that few hosts have
more than a few IP addresses to use as source or destination.
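For a rough sense of scale (my arithmetic, using the usual birthday-problem
approximation, not a measurement):

    import math

    def p_collision(flows, buckets):
        # approximate chance that at least two flows share a hash bucket
        return 1 - math.exp(-flows * (flows - 1) / (2.0 * buckets))

    print(p_collision(38, 2 ** 10))   # ~0.50: ~38 flows already collide half the time
    print(p_collision(38, 2 ** 20))   # ~0.0007 with 2^20 buckets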