[Boost-users] boost asio: boost::asio::deadline_timer blocking ?

147 views
Skip to first unread message

Avi Bahra

unread,
Nov 9, 2012, 5:40:48 AM11/9/12
to asio-...@lists.sourceforge.net, boost...@lists.boost.org
Using boost asio 1.47, suse linux 11.3, gcc 4.5

I am using boost::asio::deadline_timer, where the timer fires every second.
When the timer expires, I check responses from the clients.
I use the same timer to check when I have reach the minute boundary,
so that server can do some other work.

Now 99.9% of the time this works perfectly. However occasionally
the timer does not fire every second, in fact it can delay/block for
several minutes!

This is what I use:

timer_.expires_from_now( boost::posix_time::seconds( 1 ) );
timer_.async_wait( server_->io_service_.wrap( boost::bind(
&NodeTreeTraverser::traverse,this,boost::asio::placeholders::error ) )
);

When I run strace on the process I see:

epoll_wait(4, {{EPOLLIN, {u32=15957244, u64=15957244}}}, 128, 4294967295) = 1
timerfd_settime(5, 0, {it_interval={0, 0}, it_value={16, 944525000}},
{it_interval={16048400, 15957984}, it_value={140733847085872,
6025966}}) = 0
timerfd_settime(5, 0, {it_interval={0, 0}, it_value={0, 999992000}},
{it_interval={15958092, 6011901}, it_value={17195917584,
140733847086024}}) = 0

I have several questions:

o Are there known problems with boost::asio::deadline_timer accuracy
I realise I need soft real time, with one minute resolution.
o Should I have two separate timers, one 1 second & for 1 minute
o Is using timer_.expires_at(...) more reliable, I assume this affects
how it configures timerfd_settime
o Now I suspect this issue may be related to slow disk system
Could a slow disk system affect timerfd_settime system call,
since it is based on file descriptors.
o Are there any well known asio based design patterns/idioms
for ensuring reliable timer resolutions

Any help appreciated.

Ta,
Avi
_______________________________________________
Boost-users mailing list
Boost...@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Igor R

unread,
Nov 9, 2012, 6:44:01 AM11/9/12
to boost...@lists.boost.org
> Using boost asio 1.47, suse linux 11.3, gcc 4.5
> I am using boost::asio::deadline_timer, where the timer fires every second.
> When the timer expires, I check responses from the clients.
> I use the same timer to check when I have reach the minute boundary,
> so that server can do some other work.
>
> Now 99.9% of the time this works perfectly. However occasionally
> the timer does not fire every second, in fact it can delay/block for
> several minutes!


FWIW, I'm using asio:deadline_timer's quite intensively with different
resolutions (from tens of ms to tens of sec), and never encountered
such an issue. Typically, in my application there're a few tens of
timers associated with several io_services (io_service-per-cpu
approach) with different deadlines, and all of them fire as expected.
Check that you never block any completion-handler (timed locks,
time-consuming processing etc).
Besides, try upgrading to the most recent asio (see revision history:
http://www.boost.org/doc/libs/1_52_0/doc/html/boost_asio/history.html)
By the way, is it possible that the above 0.1% happens when the system
time gets updated (eg. due to NTP synchronization)? If yes, see my
comments below.


> o Are there known problems with boost::asio::deadline_timer accuracy
> I realise I need soft real time, with one minute resolution.

At these resolutions there shouldn't be any problem. It has
monotonicity issues, which are solved in 1.49 by steady_timer, but if
you don't update the system-time it shouldn't affect you. (See the
following: http://permalink.gmane.org/gmane.comp.lib.boost.asio.user/5183)


> o Should I have two separate timers, one 1 second & for 1 minute

After the timer expires and the completion handler gets called, you
can re-set the same timer with any other deadline.


> o Is using timer_.expires_at(...) more reliable, I assume this affects
> how it configures timerfd_settime

As far as I can see, expires_from_now() is implemented in terms of
expires_at(), so there's no any difference.


Igor'.

Avi Bahra

unread,
Nov 14, 2012, 9:18:11 AM11/14/12
to asio-...@lists.sourceforge.net, boost...@lists.boost.org
Answers:

1. I doubt this is really an issue is with deadline_timer accuracy.

2. Are you double-scheduling the same timer? Or are you just doing some
extra work, every time a single expiry event crosses a minute boundary?
I wouldn't double-schedule the same timer, if that's what you're doing.

>> No. I record the next minute boundary, when the timer fires, I check the current
>> time against the minute boundary. So no double scheduling.

3. The timer may be susceptible to time changes on your system. It
needn't be, but don't assume that means it's not. It might be worth
looking at the implementation, to make sure.

>> The problem occurs independent of the time change.

4. Slow disk could be an issue, but only if your io_service thread(s)
are blocking on disk activity (either direct disk access or page
faults). The file descriptors you're talking about have nothing to do
with disk files, in this case.

>> The current server is single threaded, so my this was my first suspect.
>> I log every client request to a log file. I had suspected that a slow
>> file system, would block when logging the client request, and affect the timer.
>> I added extra instrumentation to time the logging. However this was not the cause

5. Depends on your accuracy needs and the load on the rest of the
system. On a system under moderate loads, I'd expect no problems using
ASIO to perform processing on a one-second granularity. Just be sure to
compute the next expiry based the current time, rather than sleeping for
1 second. If you want to make this resilient to time change, you can
use a monotonic timebase.


Matt

================================================================

The client is designed to cancel long running connection to the
server.. ( > 20seconds)
This timeout is based on boost asio example:
http://www.boost.org/doc/libs/1_47_0/doc/html/boost_asio/example/timeouts/async_tcp_client.cpp

What I have noticed is that the 1 second timer in the server
does not fire, *when* client request have been cancelled.
It appears that some of chain of async handlers
that are being called in the server, during this process, cause the
timer to hang ?

For example if the client cancels after 20 seconds, then something in the server
must been blocking, for that period ?
When the client cancels, the server::handle_write(..), closes
both end of the socket. So I see the following message logged:

shutdown: Transport endpoint is not connected

Here is the async' functions in the server:

void server::start_accept()
{
connection_ptr new_conn( new connection( io_service_ ) );
acceptor_.async_accept( new_conn->socket(),
boost::bind( &server::handle_accept, this,
boost::asio::placeholders::error,
new_conn ) );
}

void server::handle_accept( const boost::system::error_code& e,
connection_ptr conn )
{
if (!acceptor_.is_open()) {
if (serverEnv_.debug()) cout << " server::handle_accept:
acceptor is closed, returning\n";
return;
}

if ( !e ) {
conn->async_read( inbound_request_,
boost::bind( &server::handle_read, this,
boost::asio::placeholders::error,conn ) );
}
else {
if (e != boost::asio::error::operation_aborted) {
// An error occurred. Log it
LogToCout toCoutAsWell;
LOG(Log::ERR, " server::handle_accept error occurred " <<
e.message());
}
}
start_accept();
}

void server::handle_read( const boost::system::error_code&
e,connection_ptr conn )
{
if ( !e ) {
try {
// Service the in bound request, handling the request will
populate the outbound_response_
// Note:: Handle request will first authenticate
outbound_response_.set_cmd( inbound_request_.handleRequest( this ) );
}
catch (exception& e) {
outbound_response_.set_cmd( PreAllocatedReply::error_cmd( e.what() ));
}

// *Reply* back to the client:
conn->async_write( outbound_response_,
boost::bind(&server::handle_write,
this,
boost::asio::placeholders::error,
conn ) );
}
else {
LOG(Log::ERR, "server::handle_read error occurred : " << e.message());
}
}

void server::handle_write( const boost::system::error_code& e,
connection_ptr conn )
{
boost::system::error_code ec;
conn->socket().shutdown(boost::asio::ip::tcp::socket::shutdown_both,ec);
if (ec) {
ecf::LogToCout logToCout;
LOG(Log::ERR,"server::handle_write: socket shutdown both failed:
" << ec.message());
return;
}
}

I had noticed that I was not checking error code in the server::handle_write(..)
but I'm not sure if this would account, for why the timer does not
fire temporarily.


Ta,
Avi
Reply all
Reply to author
Forward
0 new messages