
tcp/ip hangs on Solaris 2.2


Joachim Bartsch

Jan 10, 1994, 11:08:01 AM
Hi netlanders,

I have several problems running Solaris 2.2 (patches installed) on a Sun 10/40.
A process listens on a socket, starts child-processes and continues listening.
All further communication on the established connection is done by the
child, until the child dies. The parent starts listening after the fork()
call and will create new children on connection requests.
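
A minimal sketch of the fork-per-connection pattern described above, written in Python because the post does not say what the original server was written in. The names `handle_client`, `serve_forking`, and `reap_children` are hypothetical, and the echo loop is only a stand-in for the real per-connection work. It illustrates the pattern; it is not a diagnosis of the hang.

```python
import os
import socket


def handle_client(conn):
    """Child-side work (placeholder): echo bytes back until the peer closes."""
    while True:
        data = conn.recv(4096)
        if not data:
            break
        conn.sendall(data)


def reap_children(block=False):
    """Collect exited children so they do not linger as zombies."""
    reaped = 0
    flags = 0 if block else os.WNOHANG
    while True:
        try:
            pid, _status = os.waitpid(-1, flags)
        except ChildProcessError:
            return reaped  # no children left at all
        if pid == 0:
            return reaped  # children exist, but none has exited yet
        reaped += 1


def serve_forking(listener, max_connections):
    """Accept max_connections clients, forking one child per client."""
    for _ in range(max_connections):
        conn, _addr = listener.accept()
        pid = os.fork()
        if pid == 0:
            # Child: it never accepts, so it drops the listening socket.
            listener.close()
            try:
                handle_client(conn)
            finally:
                conn.close()
                os._exit(0)
        # Parent: close its own copy of the connected socket. The connection
        # stays open until every process holding a descriptor has closed it.
        conn.close()
        reap_children()  # non-blocking cleanup of children that already exited
    reap_children(block=True)  # wait for the remaining children before returning
```

A real server would loop indefinitely rather than stop after `max_connections`; the limit is there only so the sketch terminates.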

On Solaris 2.2, the process hangs after a few hours and stops accepting
connections, but I don't know why. I never had problems like this on other
operating systems. And Solaris itself seems to crash later: first, TCP/IP
stops working entirely; then the kernel dies silently.

One interesting detail is that closed sockets sometimes remain (as netstat
shows) for as long as the listening process exists, regardless of whether
the child is still alive.

Has anybody ever had similar problems with Solaris? Any help would be
greatly appreciated.

Thanks in advance
Joachim
--
----------------------------------------------------------------------------
Joachim Bartsch (b...@wintermute.ma.bb.de) * systems programming task force(?)
BeBit Infotechnik GmbH, Mannheim. Voice: +49 621-459-2652 Fax: 621-459-2501
> type c) to continue, d) to dump core or b) to reboot:

Ian Donaldson

Jan 10, 1994, 5:39:25 PM
b...@bbma2.ma.bb.de (Joachim Bartsch) writes:

>Hi netlanders,

>I have several problems running Solaris 2.2 (patches installed) on a Sun 10/40.
>A process listens on a socket, starts child-processes and continues listening.
>All further communication on the established connection is done by the
>child, until the child dies. The parent starts listening after the fork()
>call and will create new children on connection requests.

>On Solaris 2.2, the process hangs after a few hours and stops accepting
>connections, but I don't know why. I never had problems like this on other
>operating systems. And Solaris itself seems to crash later: first, TCP/IP
>stops working entirely; then the kernel dies silently.

>One interesting detail is that closed sockets sometimes remain (as netstat
>shows) for as long as the listening process exists, regardless of whether
>the child is still alive.

>Has anybody ever had similar problems with Solaris? Any help would be
>greatly appreciated.

I just installed Solaris 2.3 on an SS1000 and experienced a stuck TCP
connection on an rsh stream.

I did this from a Sol2.3 box to an IPX running 4.1.2:

2.3host% rsh 4.1.2host gunzip < /net/4.1.2host/somewhere.gz | tar tvf -

and the TCP connection hung after tar had listed about 100 KB worth
of files (3 files, actually). This is 100% repeatable.

trace(1) of gunzip on the 4.1.2 host shows it stuck in write()

truss(1) of tar(1) on the 2.3 host shows it stuck in read()

tcpdump(1) shows an ACK war on the wire between the two hosts which goes
on for a very long time. Here is an excerpt.


10:01:58.796842 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 11465)
10:01:58.797206 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 63764)
10:01:58.797798 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 11466)
10:01:58.798199 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 63765)
10:01:58.798757 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 11467)
10:01:58.799196 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 63766)
10:01:58.799794 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 11468)
10:01:58.800219 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 63767)
10:01:58.800704 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 11469)
10:01:59.624692 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 12206)
10:01:59.625080 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 64556)
10:01:59.625602 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 12207)
10:01:59.626028 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 64557)
10:01:59.626584 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 12208)
10:01:59.626961 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 64558)
10:01:59.627540 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 12209)
10:01:59.627937 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 64559)
10:01:59.628467 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 12210)
10:01:59.628929 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 64560)
10:01:59.629465 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 12211)
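
To see how repetitive a capture like the excerpt above is, the ACK-only lines can be tallied by sender, ack number, and advertised window. This is a minimal Python sketch, assuming the tcpdump text format shown in the excerpt. `parse_ack_line` and `summarize_acks` are hypothetical helper names, and the regular expression handles only pure-ACK lines of this shape.

```python
import re
from collections import Counter

# Matches lines such as:
# 10:01:58.796842 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) ...
ACK_LINE = re.compile(
    r"^(\d\d:\d\d:\d\d\.\d+) (\S+) > (\S+): \. ack (\d+) win (\d+)"
)


def parse_ack_line(line):
    """Return the fields of one ACK-only line, or None if it does not match."""
    match = ACK_LINE.match(line.strip())
    if match is None:
        return None
    time, src, dst, ack, win = match.groups()
    return {"time": time, "src": src, "dst": dst,
            "ack": int(ack), "win": int(win)}


def summarize_acks(lines):
    """Count how often each (sender, ack number, window) combination occurs."""
    counts = Counter()
    for line in lines:
        segment = parse_ack_line(line)
        if segment is not None:
            counts[(segment["src"], segment["ack"], segment["win"])] += 1
    return counts


# The first four lines of the excerpt, copied verbatim.
SAMPLE = [
    "10:01:58.796842 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 11465)",
    "10:01:58.797206 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 63764)",
    "10:01:58.797798 2.3host.1022 > 4.1.2host.shell: . ack 37870 win 8760 (DF) (ttl 255, id 11466)",
    "10:01:58.798199 4.1.2host.shell > 2.3host.1022: . ack 36887 win 0 (ttl 60, id 63765)",
]

if __name__ == "__main__":
    for key, count in summarize_acks(SAMPLE).items():
        print(key, count)
```

In the excerpt, each host repeats a single (ack, win) pair throughout, and 4.1.2host advertises win 0 in every segment it sends.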


On the 2.3host, I have the following patch installed

2.3host% what /kernel/drv/tcp
/kernel/drv/tcp:
SunOS 5.3 Generic 101346-03 November 1993

Anybody else seen this?

Ian D

Casper H.S. Dik

Jan 11, 1994, 5:49:39 AM
ia...@labtam.labtam.oz.au (Ian Donaldson) writes:

>I just installed Solaris 2.3 on an SS1000 and experienced a stuck TCP
>connection on an rsh stream.

>I did this from a Sol2.3 box to an IPX running 4.1.2:

> 2.3host% rsh 4.1.2host gunzip < /net/4.1.2host/somewhere.gz | tar tvf -

>and the TCP connection hung after tar had listed about 100 KB worth
>of files (3 files, actually). This is 100% repeatable.

Yep, 100% repeatable, but even from a Solaris 2.3 host to itself.

It seems that a TCP/IP connection in Solaris can't cope with
two-way traffic once it exceeds a couple of MB,
especially when there is processing going on.

% rsh 2.3host cat < /opt/patches/5.3/101318-13.tar.Z | wc

hangs or slows to a crawl.

After a while, cat gets EPIPE and exits.
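
The same traffic shape (bulk data going out while the echoed copy comes back on one connection) can be exercised without rsh. The following is a minimal Python sketch under stated assumptions: `echo_once` and `push_and_count` are hypothetical names, the echo thread stands in for the remote `cat`, and the byte counter stands in for `wc`. On a healthy TCP stack it simply completes; a stalled connection would surface as a timeout rather than a silent hang.

```python
import socket
import threading


def echo_once(listener):
    """Accept one connection and echo everything back until EOF."""
    conn, _addr = listener.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            conn.sendall(data)


def push_and_count(address, total_bytes, timeout=30.0):
    """Send total_bytes to an echo peer while reading the echo concurrently.

    Returns the number of bytes read back. Blocking calls raise
    socket.timeout after `timeout` seconds instead of waiting forever.
    """
    sock = socket.create_connection(address, timeout=timeout)
    chunk = b"x" * 8192

    def writer():
        remaining = total_bytes
        while remaining > 0:
            n = min(remaining, len(chunk))
            sock.sendall(chunk[:n])
            remaining -= n
        sock.shutdown(socket.SHUT_WR)  # signal end of input, like stdin EOF

    thread = threading.Thread(target=writer)
    thread.start()
    received = 0
    try:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            received += len(data)
    finally:
        thread.join()
        sock.close()
    return received


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    server = threading.Thread(target=echo_once, args=(listener,), daemon=True)
    server.start()
    print(push_and_count(listener.getsockname(), 4 * 1024 * 1024))
    listener.close()
```

Reading and writing run in separate threads so that neither side blocks the other; a single-threaded client that wrote everything before reading could deadlock on its own once the socket buffers fill, independent of any kernel bug.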

Casper

Rob Healey

Jan 11, 1994, 2:04:31 PM
In article <iand.758241565@labtam>,
Ian Donaldson <ia...@labtam.labtam.oz.au> wrote:
>>I have several problems running Solaris 2.2 (patches installed) on a Sun 10/40.
>
>I just installed Solaris 2.3 on an SS1000 and experienced a stuck TCP
>connection on an rsh stream.
>
>On the 2.3host, I have the following patch installed
>
>2.3host% what /kernel/drv/tcp
>/kernel/drv/tcp:
> SunOS 5.3 Generic 101346-03 November 1993
>
Which patches? Minimally for network problems you probably want
at least 101318-21, the kernel patch with gobs of networking fixes.
Might be others too but this one is definitely a must.

-Rob

Casper H.S. Dik

Jan 11, 1994, 3:10:35 PM
rhe...@sirius.aggregate.com (Rob Healey) writes:

>In article <iand.758241565@labtam>,
>Ian Donaldson <ia...@labtam.labtam.oz.au> wrote:

>>I just installed Solaris 2.3 on an SS1000 and experienced a stuck TCP
>>connection on an rsh stream.

> Which patches? Minimally for network problems you probably want
> at least 101318-21, the kernel patch with gobs of networking fixes.
> Might be others too but this one is definitely a must.

This bug still exists in Solaris 2.3 w/ patch 101318-21 installed
and can be easily reproduced.


Casper
