We are seeing a large number of TCP connections in the IDLE
state, which netstat displays as:
Local Address        Remote Address       State
-------------------- -------------------- -------
*.*                  *.*                  IDLE
*.*                  *.*                  IDLE
...
lsof is a little more helpful and shows:
COMMAND ... FD ... INODE NAME
ns-httpd ... 121u ... TCP elvis:*->remote-1:* (IDLE QR=0 QS=0 WR=0 WW=0)
ns-httpd ... 122u ... TCP elvis:*->remote-2:* (IDLE QR=644464758 QS=0 WR=0 WW=0)
...
The remote addresses are browsers that connected to the web
server on elvis and then failed - apparently due to a problem
on the web server side. Restarting the web server removes
the IDLE connections.
The above behavior does not quite match the description of
the IDLE state in the netstat man page. What other conditions
would move a connection to the IDLE state? Thanks for any help.
-- Alec
IDLE generally means the socket is open and any one of several things
is true:
- the application called socket(), but never called bind() or
connect().
- the application called bind(), but the binding failed.
- the application is using TPI and it did a T_UNBIND_REQ.
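To make the first case concrete, here is a minimal sketch in C, not
taken from any real server. The helper names open_unbound_tcp_socket()
and local_port() are made up for illustration. It opens a TCP socket,
skips bind() and connect(), and asks the kernel what local port the
socket has. A socket like this has nothing bound to it, so the port
comes back as zero, which tools such as lsof print as "*".

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP socket and deliberately stop there.
 * No bind(), no connect(), no listen(). Returns the descriptor, or -1
 * on error. */
int open_unbound_tcp_socket(void)
{
    return socket(AF_INET, SOCK_STREAM, 0);
}

/* Hypothetical helper: return the local port (host byte order) that
 * the kernel reports for fd, or -1 if getsockname() fails. */
int local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    if (getsockname(fd, (struct sockaddr *)&addr, &len) != 0)
        return -1;
    return (int)ntohs(addr.sin_port);
}
```

Calling open_unbound_tcp_socket() and then local_port() on the result
should report port 0. While the descriptor stays open, I'd expect
netstat on Solaris to list it as *.* / *.* / IDLE, as in the first
bullet above, but that part is an assumption about netstat rather than
something this code checks.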
> COMMAND ... FD ... INODE NAME
> ns-httpd ... 121u ... TCP elvis:*->remote-1:* (IDLE QR=0 QS=0 WR=0 WW=0)
> ns-httpd ... 122u ... TCP elvis:*->remote-2:* (IDLE QR=644464758 QS=0 WR=0 WW=0)
> ...
The second of those doesn't really look healthy, I think. ":*" means
the port number is zero, doesn't it?
> The remote addresses are browsers that connected to the web
> server on elvis and then failed - apparently due to a problem
> on the web server side. Restarting the web server removes
> the IDLE connections.
>
> This above behavior does not quite match the description of
> the IDLE state in the netstat man page. What other conditions
> would move a connection to the idle state? Thanks for any help.
At a guess (I don't work on that code a great deal and I *don't* work
in support, so *please* don't bother reporting the problem to *me*), I
think the only way this could happen is if TCP is unable to establish
the demultiplexing IRE in IP. I would think there'd be some
interesting logs on the server side, and I'd expect that anything that
allows accept() to succeed but leaves the TCP connection as IDLE would
be a bug.
Of course, there could possibly be other reasons for this. Please
contact support.
(Posting more information, such as the version of Solaris you're
running [see "uname -a"] and the name and version of the web server,
might help. It's possible that this is a known problem.)
--
James Carlson, Solaris Networking <james.d...@east.sun.com>
SUN Microsystems / 1 Network Drive 71.234W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.497N Fax +1 781 442 1677
Thanks.
> > COMMAND ... FD ... INODE NAME
> > ns-httpd ... 121u ... TCP elvis:*->remote-1:* (IDLE QR=0 QS=0 WR=0 WW=0)
> > ns-httpd ... 122u ... TCP elvis:*->remote-2:* (IDLE QR=644464758 QS=0 WR=0 WW=0)
> > ...
>
> The second of those doesn't really look healthy, I think. ":*" means
> the port number is zero, doesn't it?
>
> > The remote addresses are browsers that connected to the web
> > server on elvis and then failed - apparently due to a problem
> > on the web server side. Restarting the web server removes
> > the IDLE connections.
> >
> > This above behavior does not quite match the description of
> > the IDLE state in the netstat man page. What other conditions
> > would move a connection to the idle state? Thanks for any help.
>
> At a guess (I don't work on that code a great deal and I *don't* work
> in support, so *please* don't bother reporting the problem to *me*), I
> think the only way this could happen is if TCP is unable to establish
> the demultiplexing IRE in IP. I would think there'd be some
> interesting logs on the server side, and I'd expect that anything that
> allows accept() to succeed but leaves the TCP connection as IDLE would
> be a bug.
The web server logs don't have any new information. Are there any other
logs that would provide a clue?
> Of course, there could possibly be other reasons for this. Please
> contact support.
>
> (Posting more information, such as the version of Solaris you're
> running [see "uname -a"] and the name and version of the web server
> might help. It's possible that this is a known problem.)
The server is running SunOS 5.6 and has all networking patches installed.
The web server is iPlanet 4.1 SP 5. The server is also running Central
Dispatch from Resonate.
-- Alec