Checking connection status and recovering from failure

GilNajera

Mar 29, 2011, 11:38:44 AM

Hi everybody!

I have a Winsock application that uses a TCP socket. My problem is:

Sometimes, while the application is running, one of the sides (server
or client, it could be either) seems to receive no data for a few
seconds, even though it should receive data every 100 milliseconds.

What I want to do is check for these problems, flush the
connection (only if that is useful), and return to a stable state.

So far, I'm checking whether the application goes some fixed amount of
time without receiving data (let's say 3 seconds), but I don't know
what to do next to tell the other side "I don't care what you are
sending me, let's forget it and start fresh". Or how can I tell whether
there is a bottleneck in the network and find a way around it?
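
A minimal sketch of the "no data for 3 seconds" check described above. It
is written in Python for brevity, although the poster's application uses
Winsock; the class and method names are hypothetical, and the clock is
injectable only so that the logic can be exercised without real sockets.

```python
import time


class ReceiveWatchdog:
    """Tracks how long it has been since data last arrived on a connection."""

    def __init__(self, timeout_s=3.0, clock=time.monotonic):
        self.timeout_s = timeout_s   # silence threshold (3 s in the post)
        self._clock = clock          # injectable clock, an assumption for testing
        self._last_rx = clock()

    def on_data(self):
        """Call this whenever a receive returns at least one byte."""
        self._last_rx = self._clock()

    def silent_for(self):
        """Seconds elapsed since the last received data."""
        return self._clock() - self._last_rx

    def stalled(self):
        """True once the connection has been silent for timeout_s or longer."""
        return self.silent_for() >= self.timeout_s
```

The receive loop would call `on_data()` after every successful read and
poll `stalled()` from its timer; what to do once it returns True is the
open question of this thread.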

Sorry if my English is bad, it's my third language and I only speak
one ;).

Thanks in advance

Jean-Christophe

Mar 29, 2011, 12:08:53 PM

On Mar 29, 5:38 pm, GilNajera wrote:

Hi GilNajera,

> I have a Winsock application that uses a TCP socket. My problem is:
> Sometimes, while the application is running, one of the sides (server
> or client, it could be either) seems to receive no data for a few
> seconds, even though it should receive data every 100 milliseconds.

If your app sends data every 100 ms, isn't it
somehow going to overload the network ?
Maybe some other apps are sending too ?

> What I want to do is check for these problems, flush the
> connection (only if that is useful), and return to a stable state.
> So far, I'm checking whether the application goes some fixed amount of
> time without receiving data (let's say 3 seconds), but I don't know
> what to do next to tell the other side "I don't care what you are
> sending me, let's forget it and start fresh".

Why not send some data back to the other side, meaning "restart
fresh" ?
This could be done through the same TCP connection or by other means
like UDP.

Or send back an "acknowledged" message to tell the other side that the
data has been received ?
This way, if the other side does NOT receive an "acknowledged",
it will "restart fresh" by itself.

GilNajera

Mar 29, 2011, 3:21:22 PM

Hi Jean!

Thanks for your fast answer.

I'm afraid of sending back an acknowledgement because it could overload
the network; as you said, other apps may be sending.

For now, what I'm doing is pausing the app if it doesn't receive data
for 3 seconds and sending a "paused" message to the other end so that
it pauses too. While both ends are paused, they keep sending a small
sync message (every 500 ms); once they have received a certain number
of sync messages, I can assume the connection has recovered. The idea
is that when the pause lasts for more than, say, 20 seconds, I close
the connection and try to reconnect.

Am I right? Is there a more efficient (and easy) way to do this?

By the way, Nagle's algorithm is disabled and the maximum amount of
data I send each time (every 100 ms) is 500 bytes (to avoid
fragmentation).
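
For reference, Nagle's algorithm is switched per socket through the
TCP_NODELAY option, which Winsock's setsockopt also accepts. A small
Python sketch of turning it off and back on, so that both settings can be
compared; the two helper names are made up for illustration.

```python
import socket


def set_nagle(sock, enabled):
    """Enable or disable Nagle's algorithm on a TCP socket."""
    # TCP_NODELAY = 1 disables Nagle; 0 (the default) leaves it enabled.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0 if enabled else 1)


def nagle_enabled(sock):
    """Report whether Nagle's algorithm is currently active on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0
```

Calling `set_nagle(sock, False)` corresponds to the configuration
described in the post; `set_nagle(sock, True)` restores the default.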

Jean-Christophe

Mar 29, 2011, 3:56:51 PM

On Mar 29, 9:21 pm, GilNajera wrote:

> I'm afraid of sending back an acknowledgement because it could overload
> the network; as you said, other apps may be sending.

You should tell us more about what you are doing.
What network ?
Ethernet ? Small LAN ? Large LAN ?
Internet ?

> For now, what I'm doing is pausing the app if it doesn't
> receive data for 3 seconds and sending a "paused" message
> to the other end so that it pauses too.

Does it work ?

> While both ends are paused, they keep sending a small
> sync message (every 500 ms); once they have received a certain number
> of sync messages, I can assume the connection has recovered. The idea
> is that when the pause lasts for more than, say, 20 seconds, I close
> the connection and try to reconnect.
> Am I right?

I'm not an expert, but this sounds correct to me.

> Is there a more efficient (and easy) way to do this?

I guess so. TCP really should deliver your packets.
If it doesn't, then you should seriously investigate
what is really happening before trying to solve this.

GilNajera

Mar 29, 2011, 4:46:33 PM

On 29 mar, 13:56, Jean-Christophe <5...@free.fr> wrote:
> You should tell us more about what you are doing.
> What network ?
> Ethernet ? Small LAN ? Large LAN ?
> Internet ?
>

Ok, sorry. I'm working on an action game. First it must be playable on
a LAN over a peer-to-peer connection; after that it will be scaled to
the internet, using a socket server as a bridge between peers.

> > For now, what I'm doing is pausing the app if it doesn't
> > receive data for 3 seconds and sending a "paused" message
> > to the other end so that it pauses too.
>
> Does it work ?
>

Testing now, it looks like it works (at least up to the pause part).

Jean-Christophe

Mar 29, 2011, 5:28:46 PM

On Mar 29, 10:46 pm, GilNajera wrote:

> I'm working on an action game. First it must be playable
> on a LAN over a peer-to-peer connection; after that it
> will be scaled to the internet, using a socket server
> as a bridge between peers.

Ok.
Why did you choose to use TCP ?
If you *really* need to send packets
every 100 ms, have you considered UDP ?
Some UDP packets *could* be lost, but if you send
them every 100 ms it shouldn't be that important.
More to the point, it will keep a fast transfer
rate without needing a TCP connection.

> Testing now, it looks like it works
> (at least up to the pause part)

Did you run this test on a LAN or over the internet ?
On a LAN you should NOT have so many unexpected
disconnections or lost packets because of the network.

GilNajera

Mar 30, 2011, 2:05:32 PM

On 29 mar, 15:28, Jean-Christophe <5...@free.fr> wrote:
> Ok.
> Why did you choose to use TCP ?
> If you *really* need to send packets
> every 100 ms, have you considered UDP ?
> Some UDP packets *could* be lost, but if you send
> them every 100 ms it shouldn't be that important.
> More to the point, it will keep a fast transfer
> rate without needing a TCP connection.

I chose TCP because of its reliability; most of the
information I send can't be lost. Also (call me lazy) I didn't want to
implement my own reliable UDP, although I haven't ruled that out
completely. I've read, I don't remember exactly where, that a
well-configured TCP (no Nagle, proper packet size, reasonable send
rates, ...) can be as fast as a reliable UDP with less effort.

> Did you run this test on a LAN or over the internet ?
> On a LAN you should NOT have so many unexpected
> disconnections or lost packets because of the network.

I'm testing on a wireless LAN, intentionally chosen to be a little
unstable for testing purposes.

David Schwartz

Mar 30, 2011, 5:52:07 PM

On Mar 29, 12:21 pm, GilNajera <gilnaj...@gmail.com> wrote:

> By the way, Nagle's algorithm is disabled and the maximum amount of
> data I send each time (every 100 ms) is 500 bytes (to avoid
> fragmentation).

TCP has its own algorithms to avoid fragmentation. Sending smaller
messages only makes things worse.

DS

GilNajera

Mar 31, 2011, 10:35:17 AM

On 30 mar, 15:52, David Schwartz <dav...@webmaster.com> wrote:
> TCP has its own algorithms to avoid fragmentation. Sending smaller
> messages only makes things worse.
>
> DS

Ok David. That's interesting, could you tell me a little more about
that?

Thanks in advance.

Jean-Christophe

Apr 2, 2011, 7:20:08 AM

On Mar 30, 8:05 pm, GilNajera wrote:

> I chose TCP because of its reliability;
> most of the information I send can't be lost.

And yet you said that your TCP breaks down sometimes.

If a UDP packet is lost, the next one will update
the information you want to transmit, won't it ?
AND you won't need a complete TCP reconnection sequence,
so the overall UDP transmission speed won't be so bad after all.

> call me lazy

Why, because you're in Mexico ?
( just kidding )

> I didn't want to implement my own
> reliable UDP, although I haven't ruled that out completely.

Come on, Gil : it's not so hard to implement.
Maybe you don't even need to implement such a thing
over UDP, because even if a UDP packet is lost,
it doesn't mean that the connection itself has been broken :
the next UDP packet has a good chance of getting through.

> well-configured TCP can be as fast
> as a reliable UDP with less effort.

Unless and/or until the TCP connection breaks down,
which seems to be the problem you are experiencing.

Why don't you just implement UDP and check against TCP ?
It's not a big deal since UDP is far easier to program,
and this way you will be able to compare recovery
from connection breakdown - I bet UDP will do fine,
since you don't need to implement any recovery at all.

HTH

Peter Duniho

Apr 2, 2011, 1:06:01 PM

On 4/2/11 4:20 AM, Jean-Christophe wrote:
> [...]

>> I didn't want to implement my own
>> reliable UDP, although I haven't ruled that out completely.
>
> Come on, Gil : it's not so hard to implement.
> Maybe you don't even need to implement such a thing
> over UDP, because even if a UDP packet is lost,
> it doesn't mean that the connection itself has been broken :
> the next UDP packet has a good chance of getting through.

I agree with most of what you've written, in theory. However, I have
first-hand experience with connections where UDP is preferentially
sacrificed in favor of TCP during congestion.

In fact, the scenario in that case was a networked game, and I was able
to completely resolve all of our game synchronization/player reliability
issues by writing a simple UDP proxy that transmitted the UDP datagrams
via TCP over the unreliable segment of the network.
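
The post does not say how that proxy was built. One common way to carry
datagrams over a TCP byte stream is length-prefix framing, sketched below
in Python. The function names and the 2-byte header are assumptions made
for illustration, not a description of the actual proxy.

```python
import socket
import struct

# 2-byte big-endian length prefix; a UDP payload always fits in 16 bits.
_HEADER = struct.Struct("!H")


def send_datagram(stream, payload):
    """Write one datagram to the stream as <length><payload>."""
    stream.sendall(_HEADER.pack(len(payload)) + payload)


def _recv_exact(stream, n):
    """Read exactly n bytes; a single recv() may return fewer than asked."""
    chunks = []
    while n > 0:
        chunk = stream.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the stream mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_datagram(stream):
    """Read one framed datagram back out of the stream."""
    (length,) = _HEADER.unpack(_recv_exact(stream, _HEADER.size))
    return _recv_exact(stream, length)
```

With framing like this, message boundaries survive regardless of how TCP
splits or coalesces the bytes on the wire.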

Many game authors swear by UDP, claiming that TCP has too much overhead,
too much latency, etc., but it turns out for many types of games
(including the real-time action game I was dealing with above), the
amount of overhead and latency TCP can theoretically introduce is simply
not a problem.

>> well configured TCP could be as fast
>> as a reliable UDP with less effort.
>
> Unless and/or until the TCP connection breaks down,
> which seems to be the problem you are experiencing.

The problem is that we don't really know he's experiencing real problems
with TCP. Frankly, it's highly suspect that the network connection
could be otherwise reliable and yet there still be problems with TCP.
It _is_ possible that the network itself is unreliable, but if so,
switching protocols isn't going to help.

As David already pointed out, removing the code that disables the Nagle
algorithm might help improve things, since that would allow the network
to operate in a more efficient manner. I'd try that before anything else.

But ultimately the issue here is that the OP needs to figure out why his
network is unreliable and deal with that. If the network is unreliable,
then most likely any protocol he uses is going to have trouble.

It's also entirely possible that the unreliability seen with TCP is not
in the network at all, but rather in the implementation of either end of
the TCP connection.

> Why don't you just implement UDP and check against TCP ?
> It's not a big deal since UDP is far easier to program,
> and this way you will be able to compare recovery
> from connection breakdown - I bet UDP will do fine,
> since you don't need to implement any recovery at all.

UDP has its own requirements. If the higher-level program i/o code is
already well-suited for UDP, such that missing, duplicated, and
out-of-order datagrams are not a problem, then sure…UDP can be simpler
to code for. But code that's been using TCP is unlikely to be prepared
to deal with missing, duplicated, and out-of-order datagrams. So
switching to UDP is more than just ripping out the connection state
management portion of the network code. There's a bunch of other things
that would need to be fixed too.
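
As a rough illustration of the extra work involved, here is a Python
sketch of the simplest policy for state updates: tag every datagram with
a sequence number and keep only the newest. The class name is
hypothetical, and sequence-number wraparound is deliberately ignored.

```python
class LatestOnlyReceiver:
    """Accepts a datagram only if it is newer than everything seen so far."""

    def __init__(self):
        self.last_seq = None
        self.missing = 0     # sequence numbers skipped over (lost or late)
        self.discarded = 0   # duplicates and out-of-order arrivals dropped

    def accept(self, seq):
        """Return True if the datagram with this sequence number should be used."""
        if self.last_seq is not None and seq <= self.last_seq:
            # Duplicate or stale datagram: a newer state is already applied.
            self.discarded += 1
            return False
        if self.last_seq is not None:
            # Count the gap between the previous and the current number.
            self.missing += seq - self.last_seq - 1
        self.last_seq = seq
        return True
```

This only suits data where a newer update fully replaces an older one.
Anything that must not be lost still needs acknowledgement and
retransmission on top, which is the part TCP already provides.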

And after all that, no real guarantee that UDP would work better.

Pete

Jean-Christophe

Apr 3, 2011, 12:26:31 AM

On Apr 2, 7:06 pm, Peter Duniho wrote:

Interesting, Pete.
However there seems to be some misunderstanding.

I didn't say that UDP will always be better than TCP.
If I need to be sure that packets will reach
their destination then I don't use UDP, because some
clever guys already wrote efficient algorithms
to provide the pretty good reliability given by TCP,
so - in this case - I will use TCP.
Now, if I need to send packets every 100 ms
and if I know that once in a while the connection
will break for some reason, then I will implement
both TCP and UDP to compare performance
by counting how many packets are lost,
how much time it takes to recover from a breakdown,
etc., over a period of time that is long
enough to be statistically significant.
And yes, of course, in both cases I have to implement
what's needed by the upper layers of my software.
Until I've done that, I won't be able to
tell the 'best choice' between TCP and UDP.
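
A sketch of the kind of measurement meant here, in Python. It assumes
each test run logs a (sequence number, arrival time) pair for every
packet that arrives; the function name and the log format are
hypothetical.

```python
def summarize_run(arrivals, sent_count):
    """Summarize one test run.

    arrivals: list of (seq, arrival_time_s) for packets that got through.
    sent_count: how many distinct packets the sender transmitted.
    """
    unique = {seq for seq, _ in arrivals}          # ignore duplicates
    lost = sent_count - len(unique)
    times = sorted(t for _, t in arrivals)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "lost": lost,
        "loss_rate": lost / sent_count if sent_count else 0.0,
        # The longest silence approximates the worst stall in the run.
        "longest_gap_s": max(gaps) if gaps else 0.0,
    }
```

Running the same traffic pattern over each protocol and comparing these
summaries gives numbers to argue about instead of guesses.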

I suggested that GilNajera implement a UDP
version of his software so he could compare
the performance of TCP and UDP on a real basis
instead of making blind guesses.

Forgive me for not going into details, but as a matter
of fact I am not the one who has network problems,
and I am sure GilNajera will be happy to receive
good advice from someone who has already implemented
network games - which I have not.

GilNajera

Apr 4, 2011, 5:33:06 PM

Thanks to all of you guys!

I've achieved fairly stable behavior in my network service.

When one end goes some time (2 seconds) without receiving data, the
game pauses and sends a state message to the other end. If reception
is restored in less than 10 s (data reception rates recovered), the
game resumes; if not, the game ends, assuming the connection is lost.
The peer that receives the pause message copies the received game
state and waits for the resume message. During this pause, both sides
keep sending a small amount of data every 200 ms to check whether the
connection has been restored.
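
The scheme above can be sketched as a small state machine. This is a
Python illustration, not the actual game code: the names are
hypothetical, and the rule "three messages received while paused means
recovered" is an assumption, since the post does not say how recovery is
decided.

```python
import enum


class Link(enum.Enum):
    RUNNING = "running"
    PAUSED = "paused"
    ENDED = "ended"


class PauseResumeMonitor:
    def __init__(self, now, stall_s=2.0, give_up_s=10.0, probes_to_resume=3):
        self.stall_s = stall_s                    # silence before pausing (2 s)
        self.give_up_s = give_up_s                # pause length before ending (10 s)
        self.probes_to_resume = probes_to_resume  # assumption, not from the post
        self.state = Link.RUNNING
        self._last_rx = now
        self._paused_at = None
        self._probes = 0

    def on_receive(self, now):
        """Call for every message received (game data or probe)."""
        self._last_rx = now
        if self.state is Link.PAUSED:
            self._probes += 1
            if self._probes >= self.probes_to_resume:
                self.state = Link.RUNNING
                self._paused_at = None

    def tick(self, now):
        """Call periodically; applies the timeouts and returns the state."""
        if self.state is Link.RUNNING and now - self._last_rx >= self.stall_s:
            self.state = Link.PAUSED
            self._paused_at = now
            self._probes = 0
        elif self.state is Link.PAUSED and now - self._paused_at >= self.give_up_s:
            self.state = Link.ENDED
        return self.state
```

While `tick()` reports `Link.PAUSED`, the caller would send its small
probe message every 200 ms, as described in the post.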

On my wireless LAN I'm still experiencing some delays (nothing lost) in
data delivery, causing the LAN tests to pause once or at most twice in
a 10-minute game.

So, in the end, I didn't check the connection status or try to
re-establish or flush it; I just reduce the amount of data sent and
wait for the protocol to do its job, clearing the buffer by reading
pending data. That's working for this first LAN test.

I'm sure I'll have more questions when I move to the internet.

Gil