Hi Mohsin,
Thank you for using the PROBLEM_REPORT_FORM.
I believe that what you're seeing is a result of the epoll_wait() call never having to wait: your events are arriving too rapidly. If your code has other requirements (such as a thread never taking more than 10 seconds to do anything), I recommend you write a custom event loop, such as the code you supplied below. Just don't add it to ACE itself; you can either derive a new class from ACE_Reactor or implement a stand-alone event handling loop that fulfills your requirements.
Best regards,
-Steve
From: ace-bugs...@list.isis.vanderbilt.edu [mailto:ace-bugs...@list.isis.vanderbilt.edu] On Behalf Of Mohsin Zaidi
Sent: Monday, May 14, 2012 10:09 PM
To: ace-...@list.isis.vanderbilt.edu
Subject: [ace-bugs] Reactor: Loop in run_reactor_event_loop() not adhering to specified timeout
Hello,
Before I begin, I'd like to say that this is the first time I am submitting a PRF. I hope you'll understand if my submission isn't totally clear or if I've missed something obvious that would explain what I'm seeing. I can provide additional details as required.
Your help is very greatly appreciated as is the amazing piece of work that ACE is.
Thanks.
Regards,
Mohsin
PS: Before anyone says it - the code that I think is causing the behavior I'm seeing is also present in the latest ACE release.
====================
ACE VERSION:
6.0.1
HOST MACHINE and OPERATING SYSTEM:
Linux 2.6.18-274.18.1.el5
64-bit Intel, multi-processor system
TARGET MACHINE and OPERATING SYSTEM, if different from HOST:
NA
COMPILER NAME AND VERSION (AND PATCHLEVEL):
gcc version 4.1.2
THE $ACE_ROOT/ace/config.h FILE:
Attached. It is the config-linux.h file with a few additional defines.
THE $ACE_ROOT/include/makeinclude/platform_macros.GNU FILE:
Same as platform_linux.GNU
CONTENTS OF $ACE_ROOT/bin/MakeProjectCreator/config/default.features:
Does not exist
AREA/CLASS/EXAMPLE AFFECTED:
ACE_Reactor/ACE_Dev_Poll_Reactor
DOES THE PROBLEM AFFECT:
COMPILATION?
LINKING?
EXECUTION?
OTHER (please specify)?
Execution of Application is affected.
SYNOPSIS:
The documentation for the ACE_Reactor::run_reactor_event_loop(ACE_Time_Value&, REACTOR_EVENT_HOOK) routine says,
"Run the event loop until the ACE_Reactor::handle_events() or <ACE_Reactor::alertable_handle_events> methods returns -1,
the end_reactor_event_loop() method is invoked, or the ACE_Time_Value expires."
In my case, the loop in run_reactor_event_loop() keeps on running far beyond the timeout expiration.
DESCRIPTION:
I suspect that the following piece of code is not behaving as intended (or as I thought it should have been):
int
ACE_Reactor::run_reactor_event_loop (ACE_Time_Value &tv,
                                     REACTOR_EVENT_HOOK eh)
{
  ACE_TRACE ("ACE_Reactor::run_reactor_event_loop");
  if (this->reactor_event_loop_done ())
    return 0;
  while (1)
    {
      int result = this->implementation_->handle_events (tv); // <<<< Returns 1 if an event was dispatched successfully.
      if (eh != 0 && (*eh) (this))
        continue;
      else if (result == -1)
        {
          if (this->implementation_->deactivated ())
            result = 0;
          return result;
        }
      else if (result == 0)
        {
          // The <handle_events> method timed out without dispatching
          // anything. Because of rounding and conversion errors and
          // such, it could be that the wait loop (WFMO, select, etc.)
          // timed out, but the timer queue said it wasn't quite ready
          // to expire a timer. In this case, the ACE_Time_Value we
          // passed into handle_events won't have quite been reduced
          // to 0, and we need to go around again. If we are all the
          // way to 0, just return, as the entire time the caller
          // wanted to wait has been used up.
          if (tv.usec () > 0) // <<<< Only comes here if result was 0.
            continue;
          return 0;
        }
      // Else there were some events dispatched; go around again
    }
  ACE_NOTREACHED (return 0;)
}
I am generating several hundred thousand input events per second and am seeing the loop above take an inordinately long amount of time (up to 20 seconds, after which my program crashes as it relies on threads not EVER taking longer than 10 seconds to do anything). This is much longer than the 500 ms timeout that I am specifying.
My understanding of this issue is that although the timeout eventually reduces to zero, the loop only exits when result == 0, which happens only if epoll_wait() times out. At the rate at which I am generating events, this doesn't happen for a long time. Moreover, a timeout of 0 (which is what epoll_wait() will get once the timeout has been exhausted) only keeps epoll_wait() from blocking; it does not cause it to time out while events are ready.
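The zero-timeout behavior described above can be checked directly against epoll, independent of ACE. The following is a minimal, Linux-only sketch; the function name poll_pipe_with_zero_timeout and the use of a pipe as the event source are assumptions made for illustration and are not taken from the ACE sources.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <unistd.h>

// Hypothetical helper: registers the read end of a pipe with epoll and
// returns what epoll_wait() reports when called with a timeout of 0.
// Returns -1 if any setup step fails.
int poll_pipe_with_zero_timeout(bool make_readable)
{
  int fds[2];
  if (pipe(fds) != 0)
    return -1;

  int ready = -1;
  const int ep = epoll_create1(0);
  if (ep >= 0) {
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fds[0];

    bool ok = (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0);
    if (ok && make_readable) {
      const char byte = 'x';
      ok = (write(fds[1], &byte, 1) == 1);  // make the read end ready
    }
    if (ok) {
      epoll_event out{};
      // Timeout 0: return immediately with whatever is ready right now.
      ready = epoll_wait(ep, &out, 1, 0);
    }
    close(ep);
  }
  close(fds[0]);
  close(fds[1]);
  return ready;
}
```

With a byte waiting in the pipe the call reports one ready descriptor even though the timeout is 0; with an empty pipe it reports 0, which is the only case the reactor loop treats as a timeout.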
My questions here: Why do we go back round to poll again just because an event was successfully dispatched even if the timeout has now become 0? Would it make sense to modify this function to check the value of tv.usec() even if an event was successfully dispatched? Or would it be appropriate to have a new function? My two cents.
REPEAT BY:
Use Dev_Poll_Reactor to handle several hundred thousand input events per second.
SAMPLE FIX/WORKAROUND:
I'm using the following function I have written as a workaround. I do not really need to check the return value from handle_events() but I do need it to exit on time.
void runReactorEventLoop(ACE_Reactor* reactor)
{
    if (reactor->reactor_event_loop_done())
    {
        return;
    }
    ACE_Time_Value timeout(0, REACTOR_TIMEOUT);
    while (1)
    {
        if ((reactor->handle_events(timeout) == -1) ||
            (timeout.usec() == 0))
        {
            return;
        }
    }
}
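The idea behind this workaround, checking the remaining time after every pass whether or not an event was dispatched, can be modeled without ACE. The sketch below uses standard C++ only and is not ACE code: Micros, HandleEvents, run_bounded_loop and busy_source are hypothetical names, and the callback merely imitates how handle_events() reduces the time value passed to it.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

using Micros = std::chrono::microseconds;

// Hypothetical stand-in for handle_events(): returns the number of events
// dispatched (0 on timeout, -1 on error) and reduces `remaining` by the
// time it used, never below zero.
using HandleEvents = std::function<int(Micros&)>;

// Runs the loop until an error occurs or the time budget is used up,
// even if every pass dispatched an event. Returns the number of passes.
inline int run_bounded_loop(const HandleEvents& handle_events, Micros budget)
{
  assert(budget >= Micros::zero());
  Micros remaining = budget;
  int passes = 0;
  for (;;) {
    const int result = handle_events(remaining);
    ++passes;
    if (result == -1)
      break;  // error or deactivated reactor
    if (remaining <= Micros::zero())
      break;  // budget exhausted, regardless of dispatched events
    // Otherwise: timed out early or dispatched events with time left; go on.
  }
  return passes;
}

// Simulated source that always has an event ready and charges 100 us per pass.
inline int busy_source(Micros& remaining)
{
  const Micros cost{100};
  remaining = (remaining > cost) ? remaining - cost : Micros::zero();
  return 1;
}
```

In this model, run_bounded_loop(busy_source, Micros{500}) stops after five passes although the source never reports a timeout. Comparing the whole remaining duration against zero also avoids relying on the microseconds field alone.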
The behavior is consistent with the way ACE_Time_Value timeouts are used, and also with the behavior of 0-valued timeouts in the underlying event demultiplexers. A 0 timeout generally means to poll for events; if there are always events ready, the poll will never time out.
A clarification to that effect in the header file would be a good idea. If it wasn't clear to you (and I can certainly see why), it will also be unclear to others.
If you can send a patch, that would be great. Else I’ll add it to my list of things to do, but that list is very long ;-)
-Steve
From
"Run the event loop until the ACE_Reactor::handle_events() or <ACE_Reactor::alertable_handle_events> methods returns -1, the end_reactor_event_loop() method is invoked, or the ACE_Time_Value expires."
To
"Run the event loop until the ACE_Reactor::handle_events() or <ACE_Reactor::alertable_handle_events> methods returns -1, the end_reactor_event_loop() method is invoked, or events were not available before the ACE_Time_Value expired. Note that it is possible for events to continuously be available and for the event loop to thus keep running, potentially longer than the ACE_Time_Value."
Thanks again.
Looks good to me. It'll be in the next release.
Thanks,