
Re: [ace-bugs] Reactor: Loop in run_reactor_event_loop() not adhering to specified timeout


Steve Huston

May 15, 2012, 1:58:29 PM
to Mohsin Zaidi, ace-...@list.isis.vanderbilt.edu

Hi Mohsin,

 

Thank you for using the PROBLEM_REPORT_FORM.

 

I believe that what you're seeing is a result of the epoll_wait() call never having to wait: your events are arriving too rapidly for it to ever time out. If your code has other requirements (such as a thread never taking more than 10 seconds to do anything), I recommend you write a custom event loop, such as the code you supplied below. Just do not add it to ACE itself; you can either derive a new class from ACE_Reactor or implement a stand-alone event handling loop that fulfills your requirements.

 

Best regards,

-Steve

 

From: ace-bugs...@list.isis.vanderbilt.edu [mailto:ace-bugs...@list.isis.vanderbilt.edu] On Behalf Of Mohsin Zaidi
Sent: Monday, May 14, 2012 10:09 PM
To: ace-...@list.isis.vanderbilt.edu
Subject: [ace-bugs] Reactor: Loop in run_reactor_event_loop() not adhering to specified timeout

 

Hello,

 

Before I begin, I'd like to say that this is the first time I am submitting a PRF. I hope you'll understand if my submission isn't totally clear or if I've missed something obvious that would explain what I'm seeing. I can provide additional details as required.

 

Your help is very greatly appreciated as is the amazing piece of work that ACE is.

 

Thanks.

 

Regards,

Mohsin

 

PS: Before anyone says it - the code that I think is causing the behavior I'm seeing is also present in the latest ACE release.

 

====================

 

    ACE VERSION:

                  6.0.1

 

    HOST MACHINE and OPERATING SYSTEM:

                 Linux 2.6.18-274.18.1.el5

                      64-bit Intel, multi-processor system

 

    TARGET MACHINE and OPERATING SYSTEM, if different from HOST:

                  NA

 

    COMPILER NAME AND VERSION (AND PATCHLEVEL):

                  gcc version 4.1.2

 

    THE $ACE_ROOT/ace/config.h FILE:

                  Attached. It is the config-linux.h file with a few additional defines.

 

    THE $ACE_ROOT/include/makeinclude/platform_macros.GNU FILE:

                  Same as platform_linux.GNU

 

    CONTENTS OF $ACE_ROOT/bin/MakeProjectCreator/config/default.features:

                  Does not exist

 

    AREA/CLASS/EXAMPLE AFFECTED:

                  ACE_Reactor/ACE_Dev_Poll_Reactor

 

    DOES THE PROBLEM AFFECT:

        COMPILATION?

        LINKING?

        EXECUTION?

        OTHER (please specify)?

 

                  Execution of Application is affected.

 

    SYNOPSIS:

                  The documentation for the ACE_Reactor::run_reactor_event_loop(ACE_Time_Value&, REACTOR_EVENT_HOOK) routine says,

                  "Run the event loop until the ACE_Reactor::handle_events() or <ACE_Reactor::alertable_handle_events> methods returns -1,

                  the end_reactor_event_loop() method is invoked, or the ACE_Time_Value expires."

 

                  In my case, the loop in run_reactor_event_loop() keeps on running far beyond the timeout expiration.

 

    DESCRIPTION:

                  I suspect that the following piece of code is not behaving as intended (or as I thought it should have been):

 

int
ACE_Reactor::run_reactor_event_loop (ACE_Time_Value &tv,
                                     REACTOR_EVENT_HOOK eh)
{
  ACE_TRACE ("ACE_Reactor::run_reactor_event_loop");

  if (this->reactor_event_loop_done ())
    return 0;

  while (1)
    {
      int result = this->implementation_->handle_events (tv);  // <<<< Returns 1 if an event was dispatched successfully.

      if (eh != 0 && (*eh) (this))
        continue;
      else if (result == -1)
        {
          if (this->implementation_->deactivated ())
            result = 0;
          return result;
        }
      else if (result == 0)
        {
          // The <handle_events> method timed out without dispatching
          // anything.  Because of rounding and conversion errors and
          // such, it could be that the wait loop (WFMO, select, etc.)
          // timed out, but the timer queue said it wasn't quite ready
          // to expire a timer. In this case, the ACE_Time_Value we
          // passed into handle_events won't have quite been reduced
          // to 0, and we need to go around again. If we are all the
          // way to 0, just return, as the entire time the caller
          // wanted to wait has been used up.
          if (tv.usec () > 0)  // <<<< Only comes here if result was 0.
            continue;
          return 0;
        }
      // Else there were some events dispatched; go around again
    }

  ACE_NOTREACHED (return 0;)
}

 

I am generating several hundred thousand input events per second and am seeing the loop above take an inordinately long amount of time (up to 20 seconds, after which my program crashes as it relies on threads not EVER taking longer than 10 seconds to do anything). This is much longer than the 500 ms timeout that I am specifying.

 

My understanding of this issue is that although the timeout eventually reduces to zero, the loop only exits (apart from the error case) when result == 0, which happens only if epoll_wait() times out. At the rate at which I am generating events, this doesn't happen for a long time. Moreover, a timeout of 0 (which is what epoll_wait() gets once the timeout has been exhausted) only stops epoll_wait() from blocking; it does not make it report a timeout while events are ready.
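
To illustrate the zero-timeout behaviour described above, here is a minimal sketch that uses POSIX poll() on a pipe as a stand-in for epoll_wait(). That substitution is an assumption made for portability, and the helper names are hypothetical and not part of ACE. With a timeout of 0 the call returns immediately: it reports 0 when nothing is ready and a positive count whenever data is waiting, so under constant load it would never report a timeout.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper: ask poll() about one descriptor without blocking.
// Returns the number of ready descriptors (0 or 1 here), or -1 on error.
int poll_without_blocking(int fd)
{
    assert(fd >= 0);
    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLIN;
    return poll(&pfd, 1, 0);  // timeout 0: return immediately, never wait
}

// Returns true if poll() behaved as described for an empty pipe and then
// for the same pipe with one byte waiting to be read.
bool zero_timeout_demo()
{
    int fds[2];
    if (pipe(fds) != 0)
        return false;

    const int when_empty = poll_without_blocking(fds[0]);  // nothing to read yet

    const char byte = 'x';
    const bool wrote = (write(fds[1], &byte, 1) == 1);
    const int when_ready = poll_without_blocking(fds[0]);  // one byte is waiting

    close(fds[0]);
    close(fds[1]);
    return wrote && when_empty == 0 && when_ready == 1;
}
```

Calling zero_timeout_demo() exercises both cases. Only the empty-pipe case produces the 0 result that the reactor loop treats as a timeout.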

 

My questions here: Why do we go back round to poll again just because an event was successfully dispatched even if the timeout has now become 0? Would it make sense to modify this function to check the value of tv.usec() even if an event was successfully dispatched? Or would it be appropriate to have a new function? My two cents.
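
The difference between the two exit rules can be sketched with a small simulation. Everything here is hypothetical and independent of ACE: HandleEvents stands in for handle_events(), BusySource models a source on which every call costs a fixed number of milliseconds and events stay ready until a backlog is drained, and the event hook is left out for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Hypothetical stand-in for handle_events(): takes the remaining time budget
// in milliseconds, reduces it by the time "spent", and returns 1 if an event
// was dispatched, 0 if it timed out with nothing to dispatch, or -1 on error.
using HandleEvents = std::function<int(long&)>;

// Simulated event source: every call costs cost_ms, and events stay ready
// until pending_events reaches zero.
struct BusySource
{
    long pending_events;
    long cost_ms;

    int operator()(long& remaining_ms)
    {
        assert(cost_ms > 0);
        remaining_ms = std::max(0L, remaining_ms - cost_ms);
        if (pending_events > 0)
        {
            --pending_events;
            return 1;
        }
        remaining_ms = 0;  // nothing ready: model waiting out the rest of the budget
        return 0;
    }
};

// Exit rule of the quoted loop: stop on error, or on a timeout with no
// budget left. Returns the number of handle_events calls made.
long run_until_idle_timeout(HandleEvents handle_events, long budget_ms)
{
    long calls = 0;
    for (;;)
    {
        const int result = handle_events(budget_ms);
        ++calls;
        if (result == -1 || (result == 0 && budget_ms == 0))
            return calls;
    }
}

// Alternative rule: also stop once the budget is exhausted, even if the
// last call dispatched an event.
long run_until_budget_spent(HandleEvents handle_events, long budget_ms)
{
    long calls = 0;
    for (;;)
    {
        const int result = handle_events(budget_ms);
        ++calls;
        if (result == -1 || budget_ms == 0)
            return calls;
    }
}
```

With a backlog of 2000 events at 1 ms each and a 500 ms budget, run_until_idle_timeout(BusySource{2000, 1}, 500) makes 2001 calls, roughly four times the budget, while run_until_budget_spent(BusySource{2000, 1}, 500) stops after 500.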

 

    REPEAT BY:

                  Use Dev_Poll_Reactor to handle several hundred thousand input events per second.

 

    SAMPLE FIX/WORKAROUND:

 

I'm using the following function I have written as a workaround. I do not really need to check the return value from handle_events() but I do need it to exit on time.

 

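void runReactorEventLoop(ACE_Reactor* reactor)
{
    if (reactor->reactor_event_loop_done())
    {
        return;
    }

    // REACTOR_TIMEOUT is in microseconds and is assumed to be below one
    // second (500 ms in my case), so checking usec() alone is enough to
    // detect that the timeout has been used up.
    ACE_Time_Value timeout(0, REACTOR_TIMEOUT);

    while (1)
    {
        // handle_events() reduces timeout by the time it spent. Stop on
        // error or as soon as the timeout is exhausted, even if events
        // were dispatched.
        if ((reactor->handle_events(timeout) == -1) ||
            (timeout.usec() == 0))
        {
            return;
        }
    }
}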

 

Mohsin Zaidi

May 15, 2012, 3:06:17 PM
to Steve Huston, ace-...@list.isis.vanderbilt.edu
Thanks, Steve. I agree: epoll_wait() never having to wait seems to be what causes this behaviour. I only relied on the 10-second limit because I assumed that the "or the ACE_Time_Value expires" part of the API documentation holds true in all cases.

Thinking about it again, a reactor loop that takes a timeout but only times out if the underlying poll call times out, regardless of how long the loop has been running, seems a little odd. However, that might just be the way I'm looking at it.

If one does want a loop that times out though, is writing one the only solution? If it is, does the likelihood of such a use case warrant making it a part of ACE, or is custom code sufficient? We're pushing our ACE-based system to its limit and other people are bound to be doing so as well. It seems probable that something like this would come up again.

I was wondering whether it might make sense to add a clarification in the code and/or documentation so that the "or the ACE_Time_Value expires" statement doesn't cause confusion for other people like it did for me.

Thanks again for your reply!

Regards,
Mohsin

Steve Huston

May 16, 2012, 4:17:33 PM
to Mohsin Zaidi, ace-...@list.isis.vanderbilt.edu

The behavior is consistent with the way ACE_Time_Value timeouts are used, and also with the behavior of 0-valued timeouts to the underlying event demultiplexers. A 0 timeout generally means to poll for events; if there are always events ready, the poll will never time out.

 

A clarification to that effect in the header file would be a good idea – if it wasn’t clear to you (and I can certainly see why) it will also be unclear to others.

 

If you can send a patch, that would be great. Else I’ll add it to my list of things to do, but that list is very long ;-)

 

-Steve

Mohsin Zaidi

May 17, 2012, 10:44:40 AM
to Steve Huston, ace-...@list.isis.vanderbilt.edu
Thanks a lot for explaining, Steve. Is there a separate procedure for submitting a patch?

How about the following changes to the API documentation for "int ACE_Reactor::run_reactor_event_loop(ACE_Time_Value & tv, REACTOR_EVENT_HOOK eh = 0)"?

From

 

"Run the event loop until the ACE_Reactor::handle_events() or <ACE_Reactor::alertable_handle_events> methods returns -1, the end_reactor_event_loop() method is invoked, or the ACE_Time_Value expires."

 

To

 

"Run the event loop until the ACE_Reactor::handle_events() or <ACE_Reactor::alertable_handle_events> methods returns -1, the end_reactor_event_loop() method is invoked, or events were not available before the ACE_Time_Value expired. Note that it is possible for events to continuously be available and for the event loop to thus keep running, potentially longer than the ACE_Time_Value."


Thanks again.


Regards,
Mohsin

Mohsin Zaidi

May 17, 2012, 11:24:25 AM
to Steve Huston, ace-...@list.isis.vanderbilt.edu
A slight rephrase below. Sorry for the double mail.

  /**
   * Run the event loop until the ACE_Reactor::handle_events() or
   * <ACE_Reactor::alertable_handle_events> methods returns -1, the
   * end_reactor_event_loop() method is invoked, or the ACE_Time_Value
   * expires.
   */

becomes

  /**
   * Run the event loop until the ACE_Reactor::handle_events() or
   * <ACE_Reactor::alertable_handle_events> method returns -1, the
   * end_reactor_event_loop() method is invoked, or the ACE_Time_Value
   * expires while the event demultiplexer is waiting for events.
   * Note that it is possible for events to continuously be available
   * and for the event loop to thus keep running, potentially longer
   * than the ACE_Time_Value. In case the ACE_Time_Value has expired
   * but the event loop is still running, the loop will run until the
   * next time the event demultiplexer times out while waiting for events.
   */

Regards,
Mohsin

Steve Huston

May 17, 2012, 11:46:30 AM
to Mohsin Zaidi, ace-...@list.isis.vanderbilt.edu

Looks good to me – it’ll be in the next release.

Thanks,

Mohsin Zaidi

May 17, 2012, 11:48:05 AM
to Steve Huston, ace-...@list.isis.vanderbilt.edu
Great! Thanks, Steve!

Regards,
Mohsin