I am using SunOS 5.5.1.
I've written a small server program that binds to port 8000 and listens
for incoming requests.
When a client connects to it, it sends back a simple message like "Hello
World".
The client can send a special terminate message, in which case the server
closes the socket and terminates.
My problem is that after the client sends the terminate message ( and the
server shuts down ), I cannot immediately start the server, I get the
following error :
"Address 8000 already in use"
If I do a "netstat -f inet | grep 8000", I see that port 8000 has gone
into the TIME_WAIT state.
After some time the OS releases the port, and I can restart my server.
Is there any way I can avoid this TIME_WAIT state? Am I not closing the
server socket properly? (I do a close( socket_id ).)
Is there any option I can set on the socket to avoid the TIME_WAIT state?
Any help will be greatly appreciated.
Thanx,
This example sets the timeout to 5 seconds. I have a command like this
that executes during boot.
Gary
Humm, I would *not* recommend setting the time_wait state to 5 seconds.
Believe it or not, the t_w state was designed into TCP for a real reason.
Rich Stevens
You don't want to avoid the T_W state. What you should be doing is
setting the SO_REUSEADDR socket option before calling bind() in your
server. Look at the TCP/IP FAQ or the socket programming FAQ--they
both talk about this.
Rich Stevens
::>You don't want to avoid the T_W state. What you should be doing is
:>setting the SO_REUSEADDR socket option before calling bind() in your
:>server. Look at the TCP/IP FAQ or the socket programming FAQ--they
:>both talk about this.
What do you do when it's not your server code? SFAIK, my only
recourse is ndd with both time_wait (60 seconds) and close_wait (1
second). I have telemetry processing for spacecraft which need to
open and close sockets quickly.
Are there alternatives?
TIA,
Dave Howell
Software Engineer
ONSI
"At Absolute Zero, Resistance is Useless"
>On 18 Dec 1997 23:12:24 GMT, rste...@noao.edu (W. Richard Stevens)
>spake unto the masses:
>::>You don't want to avoid the T_W state. What you should be doing is
>:>setting the SO_REUSEADDR socket option before calling bind() in your
>:>server. Look at the TCP/IP FAQ or the socket programming FAQ--they
>:>both talk about this.
>What do you do when it's not your server code? SFAIK, my only
>recourse is ndd with both time_wait (60 seconds) and close_wait (1
>second). I have telemetry processing for spacecraft which need to
>open and close sockets quickly.
>Are there alternatives?
Scream and yell at whoever wrote the broken code to fix the blasted
thing?
Really, if code requires such horrible hacks to work then the
code is broken. So broken that it makes you wonder how the rest
can work at all if that bit indicates the extent of the author's
ability.
:>In <3499e645...@news.erols.com> dho...@bitbucket.erols.com (Dave Howell) writes:
:>
:>>What do you do when it's not your server code? SFAIK, my only
:>>recourse is ndd with both time_wait (60 seconds) and close_wait (1
:>>second). I have telemetry processing for spacecraft which need to
:>>open and close sockets quickly.
:>
:>>Are there alternatives?
:>
:>Scream and yell at whoever wrote the broken code to fix the blasted
:>thing?
Loral, sold to Lock-Mart, spun off into L3 Communications. Lots of
legacy behind it: All I can do is *request* the change. May take
years. So what else can I do in the meantime? Seriously.
Seriously, nothing.
Well, that isn't true. If it is dynamically linked, you can try to
find some library call it makes around the point where you would want
to call setsockopt(..., SO_REUSEADDR, ...), which shouldn't be too hard
to do, then make your own library and trick the loader (i.e. with the
LD_* env variables) into using that instead. In your library, you would
simply call the function it wants and then do a setsockopt().
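A rough sketch of that trick in C, assuming the program creates its listening socket with socket() and that the loader supports LD_PRELOAD and dlsym(RTLD_NEXT, ...). Wrapping socket() is just one choice of call to intercept; nothing here is taken from the actual telemetry code, and it is untested against it.

```c
#define _GNU_SOURCE            /* glibc needs this for RTLD_NEXT */
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Interposed socket(): create the socket with the real socket(), then
 * switch on SO_REUSEADDR for stream sockets only. */
int socket(int domain, int type, int protocol)
{
    static int (*real_socket)(int, int, int);
    int fd, stype = 0, on = 1, saved_errno;
    socklen_t len = sizeof(stype);

    /* Look up the next definition of socket() after this library. */
    if (real_socket == NULL) {
        real_socket = (int (*)(int, int, int)) dlsym(RTLD_NEXT, "socket");
        if (real_socket == NULL) {
            errno = ENOSYS;
            return -1;
        }
    }

    fd = real_socket(domain, type, protocol);
    if (fd < 0)
        return fd;

    /* Best effort: a failure here must not break the caller's socket(). */
    saved_errno = errno;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &stype, &len) == 0 &&
        stype == SOCK_STREAM)
        (void) setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    errno = saved_errno;
    return fd;
}
```

You would build this as a shared library and name it in LD_PRELOAD when starting the server. The exact compiler flags depend on your system, and you may need to link against the library that provides dlsym().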
If it isn't dynamically linked, you can't really do much unless
you are up for editing the binary. Or unless you run on an OS where
you have kernel source or an option to force SO_REUSEADDR; some
kernels can be set to force keepalives... not sure of any offhand
that can force SO_REUSEADDR. If you had kernel source, you could
do that.
Or you do the ugly hack that you have done and lower the times.
It is not a good thing to do though, and 99% of the time people
want to do it they shouldn't. It does break fundamental
details of TCP and can cause you problems, especially on WANs.
> My problem is that after the client sends the terminate message ( and the
> server shuts down ), I cannot immediately start the server, I get the
> following error :
> "Address 8000 already in use"
You need to set the option on the socket to get the bind() call to allow
local address reuse:
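    /* needs <stdio.h>, <stdlib.h>, <sys/types.h>, <sys/socket.h> */
    int opt = 1;    /* any non-zero value turns the option on */

    /* set the option before calling bind() */
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR,
                   (char *) &opt, sizeof(opt)) < 0) {
        perror("setsockopt");
        exit(1);
    }
    if (bind(s, (struct sockaddr *), ...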
Hope this helps.
--
Dr Andrew Gay http://www.ssynth.co.uk/~gay/ no spasm please
Systems Synthesis Ltd, Bristol, England +44 117 923 8853
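
For anyone who wants the whole sequence in one place, here is a minimal sketch of a listener set up this way. The helper name make_listener and the backlog of 5 are assumptions for illustration, not part of the original server.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns a listening TCP socket bound to `port`
 * on all local addresses, or -1 on error. */
int make_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;

    /* Must happen before bind(), otherwise bind() can still fail with
     * "Address already in use" while old connections sit in TIME_WAIT. */
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR,
                   (char *) &on, sizeof(on)) < 0) {
        close(s);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(s, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(s, 5) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

The server in the original question would call make_listener(8000) at startup and then accept() on the returned descriptor. The TIME_WAIT entries still run their full course; they just no longer block the bind().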
: Loral, sold to Lock-Mart, spun off into L3 Communications. Lots of
: legacy behind it: All I can do is *request* the change. May take
: years. So what else can I do in the meantime? Seriously.
TIME_WAIT is part of TCP's correctness algorithms. It is there to
protect against the unwitting receipt and passing to the application
of data from prior connections of the same "name." A connection name
is the four-tuple of local and remote IP address and local/remote TCP
port number.
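
To make the "connection name" idea concrete, here is a small C sketch. The names conn_name and same_conn are made up for illustration and do not come from any real TCP stack; the sketch only shows that two connections share a name when all four values match.

```c
#include <stdint.h>

/* Hypothetical illustration: the four values that name a TCP connection. */
struct conn_name {
    uint32_t local_ip;
    uint32_t remote_ip;
    uint16_t local_port;
    uint16_t remote_port;
};

/* Returns 1 only if every field matches, i.e. both names refer to the
 * same connection as far as TCP is concerned; otherwise returns 0. */
int same_conn(const struct conn_name *a, const struct conn_name *b)
{
    return a->local_ip == b->local_ip &&
           a->remote_ip == b->remote_ip &&
           a->local_port == b->local_port &&
           a->remote_port == b->remote_port;
}
```

A client that reconnects to port 8000 from a different local port gets a different name, so its connection cannot be confused with the old one sitting in TIME_WAIT.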
TCP (and anyone using IP, or UDP) must assume that IP datagrams can be
bent, folded, spindled or mutilated. They can be duplicated and routed
through Pluto (Loral - hmm, maybe Pluto isn't that far-fetched :).
However, one can also assume that IP datagrams will eventually
"timeout" in the network (misnomer since it is a hop count but...)
So, to be "certain" (statistically I assume) that a new TCP connection
by the same name will not mistakenly receive datagrams left-over from
an old connection, TCP will leave one half of the connection (the side
that initiates the shutdown first) in TIME_WAIT. While in that state,
it can safely "absorb" those poor lost misguided segments and send them
to their ultimate salvation (/dev/null).
If you shorten the TIME_WAIT time, you increase the risk that those
old segments will not be bit-bucketed and will be accepted as "real"
(current) data instead.
That could be bad. Very bad.
All the advice to use SO_REUSEADDR and not short-circuit the TIME_WAIT
state is very, very good advice. It will allow a new server to create
a listen socket, but still keep the TIME_WAITs going. Ignore that
advice at your (better not be my...) peril and/or those who rely upon
the product.
Not only must the application do its part, and the system admin do his
part, but the OS must be written correctly wrt TIME_WAIT - it should
not re-use the resources of a TIME_WAIT connection until TIME_WAIT has
expired.
rick jones
these opinions are mine, all mine; HP might not want them anyway... :)