SSH at Utah InstaGENI painfully slow

Sarah Edwards

Aug 8, 2014, 10:25:28 AM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon
Hi InstaGENI folks,

In an effort to reproduce a bug report from an experimenter, I reserved an OpenVZ node at Utah InstaGENI. It claims to be ready but ssh into the node takes a very, very long time (probably at least a minute and definitely long enough that I gave up and thought it wasn't going to work).
I have an example up now in Slice URN:
urn:publicid:IDN+ch.geni.net:tutorial+slice+test
Could someone check to see what's happening? Is there a systematic issue?

Thanks,
Sarah

*******************************************************************************
Sarah Edwards
GENI Project Office

BBN Technologies
Cambridge, MA
phone: (617) 873-2329
email: sedw...@bbn.com





Leigh Stoller

Aug 8, 2014, 10:50:45 AM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon
> In an effort to reproduce a bug report from an experimenter, I reserved an OpenVZ node at Utah InstaGENI. It claims to be ready but ssh into the node takes a very, very long time (probably at least a minute and definitely long enough that I gave up and thought it wasn't going to work).
> I have an example up now in Slice URN:
> urn:publicid:IDN+ch.geni.net:tutorial+slice+test
> Could someone check to see what's happening? Is there a systematic issue?

I don't see any problems. I could ssh in as root, and it took about 1 second.
Can you ping/traceroute to it from your location?

I also did this from Oregon: ssh -p 30522 pc1.utah.geniracks.net
without any problems.

Leigh





Sarah Edwards

Aug 8, 2014, 11:27:12 AM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon
I'm on BBN's internal network.

I just did:
ssh sedw...@pc1.utah.geniracks.net -p 30522

It took 1 min 17 sec to ssh in.

Ping is timing out:
$ ping pc1.utah.geniracks.net
PING pc1.utah.geniracks.net (155.98.34.11): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3

Traceroute doesn't look promising either:
$ traceroute pc1.utah.geniracks.net
traceroute to pc1.utah.geniracks.net (155.98.34.11), 64 hops max, 52 byte packets
1 128.89.72.2 (128.89.72.2) 2.165 ms 3.772 ms 1.186 ms
2 128.33.96.10 (128.33.96.10) 0.756 ms 0.454 ms 0.417 ms
3 192.1.101.3 (192.1.101.3) 0.912 ms 1.394 ms 0.853 ms
4 te0-0-1-1.214.nr11.b002250-1.bos01.atlas.cogentco.com (38.104.187.117) 1.718 ms 1.834 ms 1.819 ms
5 te4-4.mag01.bos01.atlas.cogentco.com (154.24.14.201) 1.676 ms 1.289 ms
te4-4.mag02.bos01.atlas.cogentco.com (154.24.14.205) 1.682 ms
6 te0-3-1-3.ccr22.bos01.atlas.cogentco.com (154.54.7.41) 1.901 ms
te0-4-0-0.ccr21.bos01.atlas.cogentco.com (154.54.43.49) 1.513 ms
te0-3-1-3.ccr22.bos01.atlas.cogentco.com (154.54.7.41) 1.832 ms
7 be2138.ccr42.ord01.atlas.cogentco.com (154.54.43.201) 24.899 ms
be2137.ccr41.ord01.atlas.cogentco.com (154.54.43.193) 24.691 ms 24.714 ms
8 be2156.ccr21.mci01.atlas.cogentco.com (154.54.6.85) 37.094 ms 37.122 ms 37.498 ms
9 be2130.ccr22.den01.atlas.cogentco.com (154.54.26.121) 48.701 ms
be2128.ccr21.den01.atlas.cogentco.com (154.54.25.173) 49.055 ms 48.880 ms
10 be2126.ccr21.slc01.atlas.cogentco.com (154.54.25.66) 59.483 ms 59.377 ms 59.453 ms
11 te4-1.mag01.slc01.atlas.cogentco.com (154.54.87.9) 58.724 ms
te2-1.mag01.slc01.atlas.cogentco.com (154.54.86.73) 59.006 ms 58.866 ms
12 te0-0-2-0.nr11.b020767-1.slc01.atlas.cogentco.com (154.24.0.234) 59.958 ms
te0-0-2-3.nr11.b020767-1.slc01.atlas.cogentco.com (154.24.0.238) 59.463 ms
te0-0-2-0.nr11.b020767-1.slc01.atlas.cogentco.com (154.24.0.234) 59.835 ms
13 38.104.174.66 (38.104.174.66) 59.673 ms 60.648 ms 59.864 ms
14 140.197.253.23 (140.197.253.23) 61.875 ms 61.946 ms 61.759 ms
15 140.197.253.23 (140.197.253.23) 63.652 ms 71.569 ms 61.279 ms
16 140.197.253.139 (140.197.253.139) 58.293 ms 60.213 ms 58.677 ms
17 199.104.93.33 (199.104.93.33) 60.183 ms 58.486 ms 60.108 ms
18 * * *
19 * * *
20 155.98.127.45 (155.98.127.45) 62.522 ms 64.310 ms 60.616 ms
21 155.98.127.46 (155.98.127.46) 65.665 ms 62.770 ms 62.288 ms
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
31 * * *
32 * * *
33 * * *
34 * * *
35 * * *
36 * * *
37 * * *
38 * * *
39 * * *
40 * * *
41 * * *
42 * * *
43 * * *
44 * * *
45 * * *
46 * * *
47 * * *
48 * * *
49 * * *
50 * * *
51 * * *
52 * * *
53 * * *
54 * * *
55 * * *
56 * * *
57 * * *
58 * * *
59 * * *
60 * * *
61 * * *
62 * * *
63 * * *
64 * * *
> --
> GENI Users is a community supported mailing list, so please help by responding to questions you know the answer to.
>
> If this is your first time posting a question to this list, please review http://groups.geni.net/geni/wiki/GENIExperimenter/CommunityMailingList
> ---
> You received this message because you are subscribed to the Google Groups "GENI Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to geni-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Sarah Edwards

Aug 8, 2014, 11:38:01 AM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon
Two things:
1) Tim says he can ping pc2, pc4, and pc5 at utah-ig
He cannot ping or traceroute to pc1 or pc3

2) Running ssh with -v shows it trying to connect via IPv6 first; that times out, and then it works with IPv4:
$ ssh sedw...@pc1.utah.geniracks.net -p 30522 -v
<snip>
debug1: Connecting to pc1.utah.geniracks.net [2011:1948:417:4:ea39:35ff:feb1:f94] port 30522.
<hangs a long time here>
debug1: connect to address 2011:1948:417:4:ea39:35ff:feb1:f94 port 30522: Operation timed out
debug1: Connecting to pc1.utah.geniracks.net [155.98.34.11] port 30522.
debug1: Connection established.
<prints a bunch of stuff and I'm in>


If I do the same thing with a node I have at Wisconsin, it starts by trying to connect to the IPv4 address and works immediately:
$ ssh sedw...@pc2.instageni.wisc.edu -p 32058 -v
<snip>
debug1: Connecting to pc2.instageni.wisc.edu [128.104.159.21] port 32058.
debug1: Connection established.
<prints a bunch of stuff and I'm in>
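
Editorial sketch: the order in which ssh tried each address, and how each attempt ended, can be pulled out of `ssh -v` output like the transcripts above. The following is a minimal sketch, assuming the debug line formats shown in this thread; the function name `connection_attempts` is hypothetical and the patterns may not match other OpenSSH versions.

```python
import re

# Patterns are based only on the debug lines quoted in this thread (assumption).
CONNECTING = re.compile(r"debug1: Connecting to (\S+) \[([^\]]+)\] port (\d+)\.")
FAILED = re.compile(r"debug1: connect to address (\S+) port (\d+): (.+)")
ESTABLISHED = "debug1: Connection established."


def connection_attempts(log_text):
    """Return (address, outcome) tuples in the order ssh tried the addresses."""
    attempts = []
    for raw in log_text.splitlines():
        line = raw.strip()
        started = CONNECTING.match(line)
        if started:
            # A new attempt begins; its outcome is not known yet.
            attempts.append([started.group(2), "pending"])
            continue
        failed = FAILED.match(line)
        if failed and attempts:
            # Record the error text ssh reported for the latest attempt.
            attempts[-1][1] = failed.group(3)
            continue
        if line == ESTABLISHED and attempts:
            attempts[-1][1] = "established"
    return [tuple(a) for a in attempts]
```

Fed the Utah transcript above, this should list the IPv6 address first with a timeout and the IPv4 address second as established, which is the pattern that points at an unreachable IPv6 address rather than a slow route.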

Leigh Stoller

Aug 8, 2014, 12:00:14 PM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon
> 1) Tim says he can ping pc2, pc4, and pc5 at utah-ig
> He cannot ping or traceroute to pc1 or pc3

This is expected. pc1 and pc3 are the shared hosts and are heavily
firewalled these days. This has been rolling out to all racks over the last
few months. You can ssh into your VMs, but most stuff is blocked to the
physical host.

> 2) Running ssh with -v shows it trying to connect via IPv6 first;
> that times out, and then it works with IPv4:
> $ ssh sedw...@pc1.utah.geniracks.net -p 30522 -v

Utah IG/PG do not do IPv6 ... our infrastructure does not support it. The
DDC might someday, but we have not had the time to deal with it.

I can still get there just fine from Oregon, so I really do not have any
ideas on where along the route you are having problems. Sorry.

Leigh





Sarah Edwards

Aug 8, 2014, 12:11:55 PM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon
inline
On Aug 8, 2014, at 12:00 PM, Leigh Stoller <lbst...@gmail.com> wrote:

>> 2) Running ssh with -v shows it trying to connect via IPv6 first;
>> that times out, and then it works with IPv4:
>> $ ssh sedw...@pc1.utah.geniracks.net -p 30522 -v
>
> Utah IG/PG do not do ipv6 ... our infrastructure does not support it. The
> DDC might someday, but not had the time to deal with it.
>
> I can still get there just fine from Oregon, so I really do not have any
> ideas on where along the route you are having problems. Sorry.

Two more pieces of information:
1) I spoke with our IT staff and they indicated that while BBN has IPv6 connectivity to the wider world, a key machine that provides it was out of commission for the last 3 months or so and was just replaced in the last day or two.

2) Tim points out that DNS lookups using the 'host' tool show v4 and v6 for utah-ig experimental nodes, but only v4 for utahddc-ig.
For example:
$ host pc1.utah.geniracks.net
pc1.utah.geniracks.net has address 155.98.34.11
pc1.utah.geniracks.net has IPv6 address 2011:1948:417:4:ea39:35ff:feb1:f94
$ host pc1.utahddc.geniracks.net
pc1.utahddc.geniracks.net has address 155.99.144.15
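
Editorial sketch: the same A/AAAA check can be done from Python with the standard `socket.getaddrinfo` call. This is a minimal sketch; the helper names `addresses_by_family` and `resolve` are hypothetical, and the order of the results depends on the local resolver configuration.

```python
import socket


def addresses_by_family(addrinfo):
    """Group getaddrinfo() results into IPv4 and IPv6 lists, keeping resolver order."""
    labels = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    grouped = {"IPv4": [], "IPv6": []}
    for family, _socktype, _proto, _canonname, sockaddr in addrinfo:
        label = labels.get(family)
        # sockaddr[0] is the address string for both IPv4 and IPv6 entries.
        if label and sockaddr[0] not in grouped[label]:
            grouped[label].append(sockaddr[0])
    return grouped


def resolve(host, port):
    """Resolve host with the system resolver and group the results by family."""
    info = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return addresses_by_family(info)
```

Calling `resolve("pc1.utah.geniracks.net", 30522)` needs working DNS; a non-empty `"IPv6"` list means clients that prefer IPv6 may try that address first.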

So maybe the issue is that BBN has v6 connectivity and our clients prefer v6. Then, because Utah IG has a v6 address, we try that first, fail, and fall back to the v4 address.

Perhaps something changed in the last 3 months with the addresses and we're just noticing at BBN because of the hardware issue at BBN?

Maybe....

Sarah
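
Editorial sketch: one way to test the hypothesis above is to time a TCP connect to each address separately, so that a slow IPv6 attempt is not hidden behind the eventual IPv4 success. This is a minimal sketch under that assumption; `time_connect` is a hypothetical helper and the 5-second default timeout is arbitrary.

```python
import socket
import time


def time_connect(address, port, timeout=5.0):
    """Try one TCP connect to a literal IP address; return (succeeded, seconds)."""
    # A colon in the literal is taken to mean IPv6 (assumption: no hostnames here).
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    start = time.monotonic()
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect((address, port))
        return True, time.monotonic() - start
    except OSError:
        # Covers timeouts, refusals, unreachable networks, and missing IPv6 support.
        return False, time.monotonic() - start
```

Running it once with each of the two addresses that `host` printed for pc1.utah.geniracks.net, on the same port used for ssh, would show whether the IPv6 attempt alone accounts for the delay.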

Brecht Vermeulen

Aug 8, 2014, 12:13:44 PM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon

I think it's the default IPv6 resolve, which is bad if you don't have
the connectivity on IPv6.
(I think you should remove the IPv6 DNS entries)
I see the same here:

ssh -v pc2.utah.geniracks.net
OpenSSH_6.0p1 Debian-4, OpenSSL 1.0.1e 11 Feb 2013
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug1: Connecting to pc2.utah.geniracks.net [2011:1948:417:4:ea39:35ff:feb1:c7c] port 22.
debug1: connect to address 2011:1948:417:4:ea39:35ff:feb1:c7c port 22: Connection timed out
debug1: Connecting to pc2.utah.geniracks.net [155.98.34.12] port 22.
debug1: Connection established.
debug1: permanently_set_uid: 0/0
debug1: identity file /root/.ssh/id_rsa type -1
debug1: identity file /root/.ssh/id_rsa-cert type -1
debug1: identity file /root/.ssh/id_dsa type -1
debug1: identity file /root/.ssh/id_dsa-cert type -1
debug1: identity file /root/.ssh/id_ecdsa type -1
debug1: identity file /root/.ssh/id_ecdsa-cert type -1
debug1: Remote protocol version 1.99, remote software version OpenSSH_5.4p1 FreeBSD-20100308
debug1: match: OpenSSH_5.4p1 FreeBSD-20100308 pat OpenSSH_5*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.0p1 Debian-4
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-ctr hmac-md5 none
debug1: kex: client->server aes128-ctr hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Server host key: RSA 46:63:92:67:c8:75:20:4e:52:9f:2d:f6:cb:58:16:77
The authenticity of host 'pc2.utah.geniracks.net (155.98.34.12)' can't be established.
RSA key fingerprint is 46:63:92:67:c8:75:20:4e:52:9f:2d:f6:cb:58:16:77.
Are you sure you want to continue connecting (yes/no)?


Brecht
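
Editorial sketch: until the IPv6 DNS entries are removed, a client-side workaround is to force IPv4 with OpenSSH's `-4` option. The sketch below only builds the command line; the helper name `ssh_ipv4_command` and the user name `alice` are hypothetical.

```python
import shlex


def ssh_ipv4_command(user, host, port):
    """Build an ssh argument list that restricts the connection to IPv4."""
    # -4 tells OpenSSH to use IPv4 addresses only, skipping any AAAA record.
    return ["ssh", "-4", "-p", str(port), f"{user}@{host}"]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing connects by accident.
    print(shlex.join(ssh_ipv4_command("alice", "pc1.utah.geniracks.net", 30522)))
```

The same effect can be made persistent with `AddressFamily inet` in the ssh client configuration for the affected hosts.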

Leigh Stoller

Aug 8, 2014, 1:57:15 PM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon
> I think it's the default IPv6 resolve, which is bad if you don't have the connectivity on IPv6.
> (I think you should remove the IPv6 DNS entries)
> I see the same here:

Yep, sorry about that. I had left the IPV6 configure variable set
from when I was testing the integration of the ipv6 code you gave us.
I just removed that, but it might take some time for the map change
to propagate.

Leigh





Brecht Vermeulen

Aug 8, 2014, 2:00:37 PM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon

> Yep, sorry about that. I had left the IPV6 configure variable set
> from when I was testing the integration of the ipv6 code you gave us.
> I just removed that, but it might take some time for the map change
> to propagate.
>
looks way faster from here now (and no IPv6 try first).
Sorry for the patch :-)

Brecht

Leigh Stoller

Aug 8, 2014, 2:04:34 PM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon
> Sorry for the patch :-)

Totally my fault!

Leigh





Sarah Edwards

Aug 8, 2014, 2:31:53 PM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon
Works great for me too!

Thank you so much.

Now I wonder if this was actually causing the issues my experimenter was seeing....

Brecht Vermeulen

Aug 11, 2014, 11:12:03 AM
to geni-...@googlegroups.com, Sarah Edwards, Jongwon Yoon

I think it's the default IPv6 resolve, which is bad if you don't have
the connectivity on IPv6.
(I think you should remove the IPv6 DNS entries)
I see the same here:
