On 3/1/22 12:05 PM, Mike O'Connor wrote:
> sure. i've gotten really good at standing up Linodes. would a few of those work? we could sprinkle them around various Linode datacenters. their network seems pretty good. i can build them, or i can show you how -- it's just a public stackscript that throws all the needed stuff together.
And pretty much without my asking, Mike just set one up, contacted me
and let me use it! (Not for free of course, I owe him some beers when we
meet in person :-)
Kudos and many thanks for the help!
A summary of what this looks like is below...
Best,
-- Fernando
--------
The third path, to the Linode instance, clarified things a bit more. I
think I know what is happening, but please correct me if I am making a
mistake...
< these are selected parts of the email I sent to Sonic (they are
silent, so far) >
This is what I have tested with (detailed traceroute below):
a) Sonic -> Cenic -> Stanford
b) Comcast -> He.net -> Stanford
c) Sonic -> Cogentco -> Linode instance (in NJ!)
a -> dropouts (incoming udp traffic from the world only, outgoing is fine)
b -> no dropouts (0% packet loss)
c -> no dropouts (0% packet loss)
Comparing a and b we could say that Stanford is not at fault. Comparing
a and c we could say that the Sonic routers (probably) are not at fault
and it is Cenic that has a misconfiguration in a router, or is actively
throttling UDP traffic from the world to Sonic (or more accurately, to
my OMT, which is my only test point).
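The elimination argument above can be written down as a small sketch.
The three paths and their loss results are copied from the list above;
the function name `suspects` is only illustrative. It keeps the networks
that sit on every lossy path and on no clean path. It cannot tell apart
a fault inside a network from a fault on the link leading into it.

```python
# Paths a, b, c as summarized above: (networks on the path, saw dropouts?)
paths = {
    "a": (["Sonic", "Cenic", "Stanford"], True),
    "b": (["Comcast", "He.net", "Stanford"], False),
    "c": (["Sonic", "Cogentco", "Linode"], False),
}

def suspects(paths):
    """Networks present on every lossy path and absent from every clean path."""
    lossy = [set(hops) for hops, loss in paths.values() if loss]
    clean = [set(hops) for hops, loss in paths.values() if not loss]
    if not lossy:
        return set()
    candidates = set.intersection(*lossy)
    for hops in clean:
        candidates -= hops  # a network on a clean path is (probably) fine
    return candidates

print(suspects(paths))
```

With only three paths this is weak evidence, but it leaves Cenic as the
only candidate.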
This is the connection point through cogentco (no dropped packets):
----
10 102.ae1.nrd1.pao1.sonic.net (70.36.205.6) 4.411 ms
   100.ae1.nrd1.equinix-sj.sonic.net (75.101.33.185) 4.545 ms
   102.ae1.nrd1.pao1.sonic.net (70.36.205.6) 4.748 ms
11 hu0-3-0-2.ccr31.sjc04.atlas.cogentco.com (38.104.141.81) 5.119 ms 4.778 ms 4.844 ms
----
and this is the connection point through Cenic (dropped packets):
----
10 100.ae1.nrd1.equinix-sj.sonic.net (75.101.33.185) 4.664 ms 4.520 ms 4.735 ms
11 dc-svl-agg8--svl-agg10-300g.cenic.net (137.164.11.81) 6.076 ms
   eqix-sv5.cenic.com (206.223.117.118) 4.206 ms 4.599 ms
----
It does appear that 75.101.33.185 on Sonic's end is connecting to both
(I do not understand why 3 IP addresses are in hop 10 of the Cogentco
route; while I am a sysadmin, I am not a network specialist...), so it
is very likely the problem is in Cenic (or in the port in Sonic's router
that connects to Cenic).
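A quick way to check which routers the two excerpts have in common is to
pull the addresses out of each hop and intersect them. This is only a
sketch: the hop text is pasted from the two excerpts above, and
`hop_ips` is a made-up helper name. For context, traceroute sends three
probes per hop by default, so a hop can list more than one responder
when the probes are answered by different routers.

```python
import re

# Hops 10 and 11 of each route, pasted from the excerpts above.
cogent_route = {
    10: "102.ae1.nrd1.pao1.sonic.net (70.36.205.6) 4.411 ms "
        "100.ae1.nrd1.equinix-sj.sonic.net (75.101.33.185) 4.545 ms "
        "102.ae1.nrd1.pao1.sonic.net (70.36.205.6) 4.748 ms",
    11: "hu0-3-0-2.ccr31.sjc04.atlas.cogentco.com (38.104.141.81) "
        "5.119 ms 4.778 ms 4.844 ms",
}
cenic_route = {
    10: "100.ae1.nrd1.equinix-sj.sonic.net (75.101.33.185) "
        "4.664 ms 4.520 ms 4.735 ms",
    11: "dc-svl-agg8--svl-agg10-300g.cenic.net (137.164.11.81) 6.076 ms "
        "eqix-sv5.cenic.com (206.223.117.118) 4.206 ms 4.599 ms",
}

def hop_ips(text):
    """Set of IPv4 addresses that appear in parentheses in one hop."""
    return set(re.findall(r"\((\d+\.\d+\.\d+\.\d+)\)", text))

for hop in sorted(cogent_route):
    shared = hop_ips(cogent_route[hop]) & hop_ips(cenic_route[hop])
    print(hop, "shared:", sorted(shared))
```

Hop 10 shares 75.101.33.185 and hop 11 shares nothing, which is where
the two routes split. Hop 10 of the Cogentco route holds three replies
but only two distinct addresses.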
I would imagine Sonic would now contact Cenic and kindly tell them to
stop throttling traffic[*] coming into Sonic from the world :-)
< I can dream, can't I? >
....
[*] == A bit more about traffic:
The nature of the packet loss is very clear and repeatable. At low
aggregated bandwidth and packet size (what you would find in voice over
ip traffic) there are dropped packets but very few. This would degrade
the quality of the call, but probably not that much, or not in a way
that would immediately be noticed.
As the bandwidth goes up, the % packet loss goes up. With the bandwidth
used by the software I have been using (approx. 1.5 Mbit/s, not much in
the context of a 1 Gbit/s connection) I see what for me are substantial
drops, as detailed in previous emails. Furthermore, across many tests
the number of dropped packets is approximately constant, about 1 dropped
packet per second. The interval between dropped packets is randomized,
but overall I was getting between 29 and 33 dropped packets in a 30
second interval over many tests, at all times (i.e., not really
depending on when I did the test). This would suggest this is not a
misconfiguration.
The weird part is that if I keep the same bandwidth but change the
size of the packets (i.e., sending more or fewer packets but keeping the
Mbit/s constant), the NUMBER of dropped packets does not change
significantly. In my tests yesterday evening I still kept seeing about
30 packets dropped in a 30 second interval, all the while going from
about 64 bytes per packet to 1024+ or so.
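To put numbers on this: at a fixed bitrate the packet rate changes with
the packet size, so a constant one drop per second is a very different
loss percentage at each size. The sketch below assumes the stated sizes
are the whole packet (headers ignored) and rounds the observed rate to
exactly 1 drop per second; `loss_percent` is an illustrative name.

```python
BITRATE = 1.5e6          # bits per second, as stated above
DROPS_PER_SECOND = 1.0   # rounded from 29 to 33 drops per 30 seconds

def packets_per_second(packet_bytes):
    """Packet rate needed to carry BITRATE at a given packet size."""
    return BITRATE / (packet_bytes * 8)

def loss_percent(packet_bytes):
    """Loss percentage if DROPS_PER_SECOND packets vanish every second."""
    return 100.0 * DROPS_PER_SECOND / packets_per_second(packet_bytes)

for size in (64, 256, 1024):
    print(f"{size:5d} bytes: {packets_per_second(size):7.1f} packets/s, "
          f"loss {loss_percent(size):.3f}%")
```

Under these assumptions the loss is roughly 0.03% at 64 bytes and
roughly 0.55% at 1024 bytes, a factor of 16, even though the drop count
is the same.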
So, something in their router is, for the bandwidth in my connection
(1.5 Mbit/s), picking one packet randomly every second and dropping it,
regardless of the packet size. Not very effective for congestion
control, so perhaps this is just a misconfiguration? Or maybe it is just
a simple way to cut corners and not route so much traffic.
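A toy model of that guess, to see what it would predict. This is not a
claim about what the router actually does: it simply assumes that in
each one-second window exactly one packet, chosen at random, is
discarded. `simulate_drops` and its parameters are hypothetical.

```python
import random

def simulate_drops(packet_bytes, bitrate=1.5e6, seconds=30, seed=0):
    """Times (in seconds) of dropped packets under the toy model:
    one randomly chosen packet is discarded in every one-second window."""
    rng = random.Random(seed)
    pps = int(bitrate / (packet_bytes * 8))  # packets sent per second
    drop_times = []
    for second in range(seconds):
        victim = rng.randrange(pps)          # which packet in this window
        drop_times.append(second + victim / pps)
    return drop_times

for size in (64, 1024):
    times = simulate_drops(size)
    gaps = [b - a for a, b in zip(times, times[1:])]
    print(f"{size} bytes: {len(times)} drops, "
          f"gaps from {min(gaps):.2f} s to {max(gaps):.2f} s")
```

The model gives 30 drops in 30 seconds at any packet size, with
irregular gaps between them, which is close to what I measured. The real
counts varied between 29 and 33, so the actual mechanism is not exactly
this.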
== route to linode instance
$ traceroute 172.104.5.245
traceroute to 172.104.5.245 (172.104.5.245), 30 hops max, 60 byte packets
1 ControlPanel.Home (192.168.42.1) 1.077 ms 1.366 ms 1.645 ms
2 lo0.bras1.sncrca11.sonic.net (157.131.132.81) 5.541 ms 5.648 ms 5.854 ms
3 * 157-131-210-226.static.sonic.net (157.131.210.226) 21.433 ms *
4 157-131-210-193.static.sonic.net (157.131.210.193) 19.307 ms
   157-131-210-174.static.sonic.net (157.131.210.174) 17.942 ms 17.968 ms
5 0.ae2.cr1.lsatca11.sonic.net (157.131.209.161) 8.933 ms
   0.ae1.cr1.colaca01.sonic.net (157.131.209.65) 22.265 ms 22.285 ms
6 0.ae1.cr1.snjsca11.sonic.net (157.131.209.149) 16.089 ms 12.142 ms
   0.ae0.cr1.lsatca11.sonic.net (157.131.209.86) 4.626 ms
7 * * *
8 * * *
9 100.ae1.nrd1.equinix-sj.sonic.net (75.101.33.185) 4.300 ms 4.524 ms *
10 102.ae1.nrd1.pao1.sonic.net (70.36.205.6) 4.411 ms
   100.ae1.nrd1.equinix-sj.sonic.net (75.101.33.185) 4.545 ms
   102.ae1.nrd1.pao1.sonic.net (70.36.205.6) 4.748 ms
11 hu0-3-0-2.ccr31.sjc04.atlas.cogentco.com (38.104.141.81) 5.119 ms 4.778 ms 4.844 ms
12 be2430.ccr22.sfo01.atlas.cogentco.com (154.54.88.185) 6.196 ms
   hu0-3-0-2.ccr31.sjc04.atlas.cogentco.com (38.104.141.81) 4.763 ms 4.831 ms
13 be3109.ccr21.slc01.atlas.cogentco.com (154.54.44.138) 20.833 ms
   be2379.ccr21.sfo01.atlas.cogentco.com (154.54.42.157) 5.943 ms
   be3109.ccr21.slc01.atlas.cogentco.com (154.54.44.138) 20.228 ms
14 be3110.ccr32.slc01.atlas.cogentco.com (154.54.44.142) 20.895 ms
   be3109.ccr21.slc01.atlas.cogentco.com (154.54.44.138) 20.875 ms
   be3037.ccr21.den01.atlas.cogentco.com (154.54.41.146) 30.789 ms
15 be3035.ccr21.mci01.atlas.cogentco.com (154.54.5.90) 42.214 ms 42.268 ms 42.408 ms
16 be3036.ccr22.mci01.atlas.cogentco.com (154.54.31.90) 43.070 ms
   be2831.ccr41.ord01.atlas.cogentco.com (154.54.42.166) 52.963 ms 53.155 ms
17 be2831.ccr41.ord01.atlas.cogentco.com (154.54.42.166) 53.629 ms
   be2832.ccr42.ord01.atlas.cogentco.com (154.54.44.170) 53.835 ms
   be2718.ccr22.cle04.atlas.cogentco.com (154.54.7.130) 60.409 ms
18 be2717.ccr21.cle04.atlas.cogentco.com (154.54.6.222) 60.493 ms
   be2718.ccr22.cle04.atlas.cogentco.com (154.54.7.130) 60.548 ms
   be2717.ccr21.cle04.atlas.cogentco.com (154.54.6.222) 60.825 ms
19 be2889.ccr41.jfk02.atlas.cogentco.com (154.54.47.50) 73.082 ms
   be2890.ccr42.jfk02.atlas.cogentco.com (154.54.82.246) 73.231 ms
   be3294.ccr31.jfk05.atlas.cogentco.com (154.54.47.218) 74.328 ms
20 38.104.75.138 (38.104.75.138) 77.339 ms 73.552 ms
   be3295.ccr31.jfk05.atlas.cogentco.com (154.54.80.2) 72.333 ms
21 38.104.75.138 (38.104.75.138) 73.749 ms 74.011 ms *
22 * * *
23 * * *
24 * xxxx.ip.linodeusercontent.com (xx.xx.xx.xx) 74.000 ms *
== route to stanford server
$ traceroute cm-toast.stanford.edu
traceroute to cm-toast.stanford.edu (171.64.197.122), 30 hops max, 60 byte packets
1 ControlPanel.Home (192.168.42.1) 1.181 ms 0.630 ms 0.847 ms
2 lo0.bras1.sncrca11.sonic.net (157.131.132.81) 3.532 ms 3.534 ms 3.430 ms
3 157-131-210-226.static.sonic.net (157.131.210.226) 21.318 ms 21.251 ms 21.249 ms
4 157-131-210-193.static.sonic.net (157.131.210.193) 24.229 ms
   157-131-210-174.static.sonic.net (157.131.210.174) 22.930 ms 23.037 ms
5 0.ae2.cr1.lsatca11.sonic.net (157.131.209.161) 9.212 ms
   0.ae1.cr1.colaca01.sonic.net (157.131.209.65) 9.279 ms
   0.ae2.cr1.lsatca11.sonic.net (157.131.209.161) 9.089 ms
6 *
   0.ae0.cr1.lsatca11.sonic.net (157.131.209.86) 7.135 ms
   0.ae1.cr1.snjsca11.sonic.net (157.131.209.149) 21.811 ms
7 * * 0.ae1.cr1.snjsca11.sonic.net (157.131.209.149) 19.939 ms
8 * * *
9 * 100.ae1.nrd1.equinix-sj.sonic.net (75.101.33.185) 3.838 ms 4.456 ms
10 100.ae1.nrd1.equinix-sj.sonic.net (75.101.33.185) 4.664 ms 4.520 ms 4.735 ms
11 dc-svl-agg8--svl-agg10-300g.cenic.net (137.164.11.81) 6.076 ms
   eqix-sv5.cenic.com (206.223.117.118) 4.206 ms 4.599 ms
12 dc-stanford--svl-agg4-100ge.cenic.net (137.164.23.145) 4.576 ms
   dc-svl-agg8--svl-agg10-300g.cenic.net (137.164.11.81) 5.389 ms 5.397 ms
13 dc-stanford--svl-agg4-100ge.cenic.net (137.164.23.145) 5.220 ms 5.821 ms 5.751 ms
14 * noa-east-rtr-vl2.SUNet (171.64.255.134) 5.837 ms *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *