
Ssh connection hangs. Ignored ACK packet?


Bernardo Dal Seno

Mar 17, 2008, 11:40:21 AM
I have intermittent problems transferring files between two
machines via scp. Symptoms: when transferring a large file from
the server to the client, scp transfers a few kilobytes and then says
"stalled".

After having this problem for a while, I investigated it and captured
the TCP packets on both machines. Here is a sample of a hanging
connection (dump taken on the server side):
http://home.dei.polimi.it/dalseno/dumpssh-broken .

From what I recall of TCP, this behavior puzzles me. After
the first packets are transmitted without any problem, a packet from
the server doesn't get through (has the bandwidth limit of the client's
DSL connection been hit?), and the client sends ACKs requesting a
retransmission (packets 145--162 in the dump). After the server
retransmits the lost packet (163), the client asks for the
retransmission of another, more recent packet (164), but the server
keeps retransmitting the first lost packet, as if the subsequent ACKs
were ignored. But the dump was taken on the server, so the ACKs have
definitely been received, and the first rule of the iptables INPUT
chain is
-m state --state RELATED,ESTABLISHED -j ACCEPT

The same connection seen from the client side:
http://home.dei.polimi.it/dalseno/dumpssh-broken-client . (Please
note that the client is NATted.)


What could be going wrong?

Searching with Google didn't help. I found only very old posts
(more than 2-3 years old) or problems with MTU discovery. Any help is
appreciated, as I don't know how to handle this.


Additional information.

The server is a desktop AMD Sempron running an SSH server, with a
public IP, behind a firewall; the client is an AMD Duron connected to
a DSL line and is double NATted (NAT is used by my ISP, and I have a
NATting firewall between my local LAN and a non-NATting DSL router).
Both machines run Debian Sid; I tried updating the kernel (which
contains the TCP/IP stack), using both the stock Debian kernel image
(2.6.24-4) and a build of the latest 2.6.24.3 from kernel.org.

Some software versions:
server: ssh 1:4.7p1-4, libc6 2.7-6, kernel: linux-image-2.6.24-1-686 (2.6.24-4)
client: ssh 1:4.7p1-4, libc6 2.7-8, kernel: custom-built,
linux-source-2.6.23: 2.6.23-2
Server network interface:
eth0: RealTek RTL8139 at 0xd000, 00:05:5d:4c:66:0d, IRQ 18
eth0: Identified 8139 chip type 'RTL-8100B/8139D'
Client network interface:
eth0: VIA Rhine II at 0x19000, 00:0e:a6:1d:45:ca, IRQ 18.
eth0: MII PHY found at address 1, status 0x786d advertising 01e1 Link 45e1.

I used the command
tcpdump -c 100000 -s 0 -p -w <filename> tcp port 22 or icmp
to capture packets, and then I used Wireshark to select the packets
belonging to one connection. No ICMP packets were captured.


Best regards,
Bernardo



Ken Irving

Mar 17, 2008, 6:10:10 PM
On Mon, Mar 17, 2008 at 02:49:21PM +0100, Bernardo Dal Seno wrote:
> I have intermittent problems transferring files between two
> machines via scp. Symptoms: when transferring a large file from
> the server to the client, scp transfers a few kilobytes and then says
> "stalled".
>
> ...

> What could be going wrong?
>
> Searching with Google didn't help. I found only very old posts
> (more than 2-3 years old) or problems with MTU discovery. Any help is
> appreciated, as I don't know how to handle this.
>
> ...

> Additional information.
>
> The server is a desktop AMD Sempron running an SSH server, with a
> public IP, behind a firewall; the client is an AMD Duron connected to
> a DSL line and is double NATted (NAT is used by my ISP, and I have a
> NATting firewall between my local LAN and a non-NATting DSL router).
> ...

MTU is my standard WAG for this kind of thing, having had problems in
the distant past. interfaces(5) should show how to set the MTU for
the interface; on my local box behind a DSL line I have:

auto eth0
iface eth0 inet static
    address 192.168.1.5
    netmask 255.255.255.0
    network 192.168.1.0
    gateway 192.168.1.1
    broadcast 192.168.1.255
    mtu 1452

ifconfig(8) should show the MTU value, and maybe can be used to set it.
I was able to set the MTU using ip(8), so it should be pretty easy to
test, e.g.:

$ sudo ip link set eth0 mtu 1452
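
Before changing anything, it may help to check whether packets of a
given size actually pass end to end with fragmentation forbidden. The
following is only a sketch under assumptions: the helper name
icmp_payload_for_mtu and the SERVER variable are made up for this
example, the arithmetic assumes an IPv4 header without options, and
-M do is the Linux iputils ping option that sets the don't-fragment bit.

```shell
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet of
# the given MTU: subtract 20 bytes of IPv4 header (no options) and
# 8 bytes of ICMP header. (Hypothetical helper, not from the thread.)
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# SERVER is a placeholder; the probe only runs if it is set.
SERVER=${SERVER:-}
if [ -n "$SERVER" ]; then
    # -M do forbids fragmentation, so an oversized probe fails
    # instead of being silently split along the path.
    ping -M do -c 3 -s "$(icmp_payload_for_mtu 1452)" "$SERVER"
fi
```

If the probe for 1452 gets replies while the one for 1500 does not, the
smaller MTU is worth trying; if both get replies, MTU is probably not
the problem.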

Ken

--
Ken Irving, fnkci+de...@uaf.edu

Bernardo Dal Seno

Mar 17, 2008, 6:50:17 PM
On 17/03/2008, Ken Irving <fn...@uaf.edu> wrote:
> MTU is my standard WAG for this kind of thing, having had problems in
> the distant past.

I don't understand how MTU could be the culprit, as my problem seems
to be that a packet is not resent, and not that a packet doesn't
arrive. Anyway, as I'm not 100% sure of having interpreted the TCP
dump correctly, I tried to lower the MTU to 1400 on both machines. No
improvement. :-(

Bernardo

Bernardo Dal Seno

Mar 18, 2008, 1:00:21 PM
I think I've found out why the TCP connection hangs: someone is messing
with TCP sequence numbers and getting them wrong.

I studied some advanced features of TCP, and discovered the existence
of "selective acknowledgment" (SACK), which is a very nice feature, by
the way. By comparing packets at the two ends of the connection, it
is clear that sequence numbers are rewritten in the standard TCP
header, but not in the SACK option. This should be a good way to
confuse the TCP stack at the sender side and break the connection.

I suspect my ISP (which does NAT), but I have to do some more
experiments to be sure.
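
To make the suspected failure concrete, here is a small arithmetic
sketch. It is not taken from the captures: all numbers, the offset
delta, and the function name sack_in_window are invented for
illustration, and 32-bit sequence wraparound is ignored. It assumes a
middlebox that adds a fixed offset to the server's sequence numbers and
translates the ACK field back, but leaves the SACK option as the client
wrote it.

```shell
#!/bin/sh
# Illustrative numbers only; none are taken from the captures.
delta=1000000   # hypothetical offset the middlebox adds to server sequence numbers
snd_una=5000    # server: oldest unacknowledged byte
snd_nxt=20000   # server: next byte it will send

# Succeeds if the SACK block [left, right) lies within the data the
# server currently has outstanding. Wraparound is ignored for simplicity.
sack_in_window() {
    [ "$1" -ge "$snd_una" ] && [ "$2" -le "$snd_nxt" ]
}

# The client received bytes 8000..9448 (server numbering) out of order,
# but it sees and reports them in the shifted numbering.
client_left=$((8000 + delta))
client_right=$((9448 + delta))

# A consistent middlebox would shift the SACK block back, like the ACK field.
if sack_in_window $((client_left - delta)) $((client_right - delta)); then
    echo "translated SACK: usable"
fi

# Leaving the option untouched hands the server a block outside its window.
if ! sack_in_window "$client_left" "$client_right"; then
    echo "untranslated SACK: outside window"
fi
```

If this is what happens, turning SACK off on one endpoint (on Linux,
the net.ipv4.tcp_sack sysctl) should make the stalls disappear, which
would be a quick experiment to confirm or rule out the theory.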


Thanks for the suggestions I received.
