This started happening recently, for no apparent reason.
I have systems A, B, C, D in my LAN. When I SSH from A into B the
connection gets established immediately. However when I do it from A to C
I get the following SSH traces in A:
OpenSSH_7.4p1, OpenSSL 1.0.2u 20 Dec 2019
debug1: Reading configuration data /home/xyz/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug2: resolving "C" port 22
It hangs at that point for 20 seconds or so, but then the connection
succeeds. Once it is established, the performance of the session is as
expected. The same behavior is observed when I try to SSH from A into D.
Taking the line 'resolving "C" port 22' as a clue, I thought that I
might be having DNS problems in A. That does not seem to be the case
though:
$ dig C
; <<>> DiG 9.11.19 <<>> C
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63806
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;C. IN A
;; ANSWER SECTION:
C. 0 IN A 192.168.0.9
;; Query time: 0 msec
;; SERVER: 192.168.0.7#53(192.168.0.7)
;; WHEN: Tue Jul 21 09:07:02 MDT 2020
;; MSG SIZE rcvd: 50
The resolution happens with no delay, but the SSH connection attempt to C
keeps taking 20 seconds, as above. And analogously for D.
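One caveat with the dig test: dig sends its query straight to the DNS server, whereas ssh resolves names through the C library's getaddrinfo(), which follows the system resolver configuration and may also look up IPv6 addresses. A rough way to time that second path is sketched below. The script and the function name time_lookup are hypothetical, written only for illustration; it assumes Python 3 is available on A, and it only measures the lookups without fixing anything.

```python
import socket
import sys
import time


def time_lookup(host, family=socket.AF_UNSPEC, port=22):
    """Time one getaddrinfo() call; return (seconds, addresses or error text)."""
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address in both the IPv4 and IPv6 cases.
        result = sorted({info[4][0] for info in infos})
    except socket.gaierror as exc:
        result = "lookup failed: %s" % exc
    return time.monotonic() - start, result


if __name__ == "__main__":
    # Hosts are taken from the command line; with no arguments nothing runs.
    families = (
        ("any", socket.AF_UNSPEC),
        ("IPv4", socket.AF_INET),
        ("IPv6", socket.AF_INET6),
    )
    for host in sys.argv[1:]:
        for label, family in families:
            elapsed, result = time_lookup(host, family)
            print("%s [%s]: %.2fs -> %s" % (host, label, elapsed, result))
```

Saved under a hypothetical name such as lookup_time.py and run on A as "python3 lookup_time.py B C D", this would show whether the 20 seconds are spent inside getaddrinfo() and, if so, whether only one address family is slow. If the lookups return quickly here as well, the delay would have to come from somewhere other than name resolution.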
Now once I have SSH'd into C, I can SSH from C into D with no
delay. I can even SSH from C into A (or B, or D) with no delay
whatsoever. In general, if A is not involved as a client things seem to
be normal.
So the problem seems to be confined to A acting as the client, and even
then it does not affect every target - only C and D, but not B.
Any suggestions on how to diagnose this? I can't see anything
relevant in the logs of the systems involved - the only symptom is
the delay I mentioned, in the circumstances that I described. In A I have
the following iptables rules:
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy DROP)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
whereas in the other systems I have no iptables rules. I stopped and
restarted the network in A, to no avail. I guess that rebooting A might
solve the problem. Or not. Even if it does, I would not understand why.