Why does LockSupport.parkNanos() work?

james droog

unread,

Aug 16, 2012, 8:22:54 PM8/16/12

to lmax-di...@googlegroups.com

Hi,

I'm yet another porting Disruptor to Windows in C++. I've been seeing starvation when using the BlockingWaitStrategy
and the SingleThreadClaimStrategy when the ringbuffer is of size 1. From what I can tell the Producer thread is
blocked waiting on the Condition.Await() in WaitForL method (with no timeout). The Consumer is then spinning in the
WaitForFreeSlotAt() method where parkNanos() is repeatedly called until *stuff* happens.

I initially used Window's SwitchToThread function to do what I though parkNanos does. It doesn't work. Using Sleep
seems to work but I'm not sure if that's because it allows *stuff* to happen or if it just reschedules things to hide an
underlying other problem.

My ReentrantLock implementation is merely a wrapper around a CRITICAL_SECTION and its APIs. My Condition is
Windows' CONDITION_VARIABLE.

From my reading of parkNanos() is seems that it's perfectly likely that this method returns immediately 'spuriously'.
In that case isn't it possible that the WaitFor method operates as a busy-spin, thereby exposing the same problem
I'm facing? Or is there something happening in the background that I'm not aware of? I've grepped and can't see any
calls to 'unpark' so it seems to me that this call is just used to schedule another thread, in which case I can't see why
using SwitchToThread() doesn't work.

I hope this isn't a daft question.

James

james droog

unread,

Aug 16, 2012, 11:59:19 PM8/16/12

to lmax-di...@googlegroups.com

I've been hacking away at this. I've tried using a Mutex in the ReentrantLock class along with the CRITICAL_SECTION but that made no difference. I'll add that I seem to need the CRITICAL_SECTION in order to use the CONDITION_VARIABLE with is/seems to be the equivalent of java.util.concurrency.Condition.

The only thing that makes the problem disappear is to remove the 'if (iNumWaiters != 0)' statement
from BlockingWaitStrategy::SignalAllWhenBlocking.

Stumped.

Michael Barker

unread,

Aug 17, 2012, 12:49:34 PM8/17/12

to lmax-di...@googlegroups.com

You may have uncovered a subtle race condition. I've had a quick
conversation with a colleague and I'm looking into a potential fix.

Mike.

james droog

unread,

Aug 18, 2012, 10:39:44 PM8/18/12

to lmax-di...@googlegroups.com

Can't beat dealing with a race condition on a Friday afternoon! Sorry about that!

Anyway, I've ported my code over to Linux and I get the same starvation. This
time the ReentrantLock uses pthread_mutex_t and the Condition is pthread_cond_t.
And again, removing the 'if (iNumWaiters != 0) line prevents the scenario.

James

Avinash Lakshman

unread,

Aug 19, 2012, 3:22:15 AM8/19/12

to lmax-di...@googlegroups.com

Is the patch http://code.google.com/p/disruptor/source/diff?path=/trunk/code/src/main/com/lmax/disruptor/BlockingWaitStrategy.java&format=side&r=581 a fix for this issue? Does this need to be part of 2.10.1? If it is I do not see it in the downloaded jar?

Avinash

Jeff Hain

unread,

Aug 19, 2012, 8:47:15 AM8/19/12

to lmax-di...@googlegroups.com

>You may have uncovered a subtle race condition. I've had a quick
>conversation with a colleague and I'm looking into a potential fix.

Is the race condition that (can exchange publisher and processor):
- publisher does a lazySet on sequence, then calls
signalAllWhenBlocking, and sees that numWaiters == 0,
so does not signal,
- processor enters waitFor and blocks forever
(no signal done)
(and that by removing (numWaiters != 0), publisher
would always makes its lazy set visible by calling
lock() in signalAllWhenBlocking)
?

In that case, a solution, when using lazy sets together with
the "numWaiters != 0" trick, is to only block for some time,
then wake-up to re-check the boolean condition.
For example, waiting (1 ns + 0.1 * ns elapsed since wait
beginning), i.e. if first wait chunk takes 100 microseconds, next
wait chunk will be targeted to wake up after 10 more microseconds.

While looking at this, I also found out you can use a
non-volatile "numWaiters_inLock" variable, incremented
and decremented in the lock, and use it to update the
volatile one, which should save two volatile reads
per blocking wait (but since these volatile reads
are close to the writes, I don't know if it helps much).

-Jeff

Michael Barker

unread,

Aug 19, 2012, 12:23:33 PM8/19/12

to lmax-di...@googlegroups.com

> In that case, a solution, when using lazy sets together with
> the "numWaiters != 0" trick, is to only block for some time,
> then wake-up to re-check the boolean condition.
> For example, waiting (1 ns + 0.1 * ns elapsed since wait
> beginning), i.e. if first wait chunk takes 100 microseconds, next
> wait chunk will be targeted to wake up after 10 more microseconds.

Initially I'm going for a brutally simple workaround of:

processorNotifyCondition.await(1, MILLISECONDS);

Then look if any more complicated solution is necessary. Has no
impact on the microbenchmarks and running with a buffer size of 1 is a
fairly unusual case.

Mike

james droog

unread,

Aug 19, 2012, 12:56:06 PM8/19/12

to lmax-di...@googlegroups.com, Jeff Hain

Hi Jeff,

I don't think the behaviour of lazySet has anything to do with it.
My interpretation of the AtomicLong class looks like this:

class AtomicLong
{
public:
    void LazySet(__int64 aNewValue)
    {
        iValue = aNewValue;
    }

    void Set(__int64 aNewValue)
    {
        __sync_val_compare_and_swap(&iValue, iValue, aNewValue);
    }

    volatile __int64 iValue;
}

If I use Set instead of LazySet the issue is still there.

As for the ringbuffer size being unusual... sure however it makes the condition easier to reproduce.
I've seen it with larger sizes (8 I think) too.

James

Michael Barker

unread,

Aug 19, 2012, 3:40:08 PM8/19/12

to lmax-di...@googlegroups.com

I'm assuming this is C++. I'm not sure the implementation of LazySet
is sufficient. If you are using C++ then I think you'll need to use
something like boost::atomic.store(xx, boost::memory_order_release) or
the C++11 equivalent.

If the volatile keyword works the same in C++ as it does in C then it
only guarantees ordering with respect to other volatile fields and GCC
is notorious for only implement barest minimum with regards to
ordering constraints.

Mike.

Reply all

Reply to author

Forward