[ace-users] Socket failure detection


Altaf Aali

May 2, 2005, 9:31:13 PM

We are using ACE for our network application, in which we establish socket connections to external devices or hosts. We use the ACE connector framework (i.e. the ACE Reactor, socket handler classes, etc.). One problem we have found is that if we pull the Ethernet cable from the external device, or from the host running our application, it can take ages for the socket failure to be detected, and sometimes it is only detected when we put the cable back. Is there a way to detect socket failure in this scenario immediately, or are we mostly at the mercy of the OS? Any help would be appreciated.

 

Thanks.

Robert Iakobashvili

May 3, 2005, 1:01:57 AM

Hi Altaf,
 
Please always provide the PRF (problem report form).

Use the ifconfig command to check what happens to your network interfaces when you plug and unplug the cable.

Maybe your interface goes down when the cable is unplugged, in which case you would need to restart your application from some script, e.g. an eth0.up script.

Also check what happens to your socket using netstat, lsof, etc.

Sounds like a non-ACE problem.
 
Sincerely,
Robert Iakobashvili
coroberti at gmail dot com 



Matthew Gillen

May 3, 2005, 9:38:07 AM

Altaf,
I agree that it sounds like a non-ACE problem. But if you're on Linux,
you can look at the ifplugd/netplugd daemons. They take care of bringing
interfaces up and down when the cable is plugged in or unplugged. That
should make the failures happen faster.

--Matt



--
Matthew Gillen phone: 617-873-5263
BBN Technologies fax: 617-873-2616
10 Moulton Street e-mail: mgi...@bbn.com
Cambridge, MA 02138

Steve Huston

May 3, 2005, 10:37:11 AM

Hi Altaf,

As others have noted, this is a non-ACE problem. However, I encourage
you to look at the bigger picture.

TCP is designed to survive transient outages below it. These include
routing issues and cables being pulled out. Just because a cable is
unplugged doesn't mean the TCP connection should go away. There may be
any number of other paths that IP can use to reach the peer. This is a
feature of the protocol design, and a strong one, IMO.

If your application requires a very quick session shutdown when a
lapse at the TCP level occurs, you need to design that into your
application's protocol.

Sidebar 13 on page 61 of C++NPv2 (C++ Network Programming, Volume 2)
discusses this issue further.

Best regards,

-Steve

--
Steve Huston, Riverace Corporation
New! ACE Training - http://www.riverace.com/training.htm
ACE book info at http://www.riverace.com/acebooks/


David Hawkins

May 3, 2005, 1:08:02 PM

Hi Altaf,

I observed similar behaviour with a system I built. As others have
commented, this is 'standard' TCP/IP behaviour. In my application
I expected data every 0.5s from multiple connections (the connections
each send data relative to a GPS tick and NTP time) and had a timeout
set up to fire if data was not received within 100ms of a 0.5s
boundary. Since the data was being 'read', if a cable was disconnected
the timeout fired, but the connections were not detected as being
broken. I changed my code to allow 10 failures (5s with no data);
if that situation was detected, the connection was closed and
an attempt was made to reestablish connections with all the data servers.
This ensured that when the other servers really did come
up again, new connections were established. Otherwise the
old stale ones remained. Bottom line: you have to add a
'keep-alive' message if your communications protocol does
not have some other expected timing requirements.
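
The miss-counting logic described above might be sketched roughly as
follows. This is only an illustration, not the code from the system in
question: the class name MissCounter and its methods are hypothetical,
and it assumes the caller already has a receive timeout (for example a
reactor timer) that calls on_timeout() for each missed data window and
on_data() whenever data arrives on time.

```cpp
#include <cstddef>

// Hypothetical helper: counts consecutive missed data deadlines and
// reports when the connection should be treated as dead.
class MissCounter {
public:
    explicit MissCounter(std::size_t max_misses) : max_misses_(max_misses) {}

    // Call when data arrived within its window; clears the miss count.
    void on_data() { misses_ = 0; }

    // Call when the receive timeout fired with no data.
    // Returns true once the number of consecutive misses reaches the limit.
    bool on_timeout() { return ++misses_ >= max_misses_; }

    // Number of consecutive misses seen so far.
    std::size_t misses() const { return misses_; }

private:
    std::size_t max_misses_;
    std::size_t misses_ = 0;
};
```

With a limit of 10 and a 0.5s data interval, on_timeout() would first
return true after about 5s of silence, at which point the caller would
close the stale connection and try to reconnect.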

On the other end, I added send timeouts.

The system is now robust to server shutdowns (machine abruptly
turned off) and cable disconnects.

Cheers
Dave
