Real cause of spurious wakeups

Vladimir Prus

Mar 31, 2005, 9:43:04 AM

Hello,

Can somebody explain the real reason why spurious wakeups are possible? That
is, why can a call to pthread_cond_wait return without a
pthread_cond_broadcast/pthread_cond_signal? I'm specifically talking about
spurious wakeups caused by the pthread implementation, not the ones caused by
loose predicate checking on the notifying side, or by a "stolen wakeup". I
looked in the archives for this group and in the FAQ, but found only vague
statements that on some SMP systems avoiding spurious wakeups is too hard.

I understand that pthread_cond_wait needs to be wrapped in a loop even if
spurious wakeups did not exist. I'm asking just because the wakeups are
mentioned everywhere, but I have never seen a complete explanation.

Thanks in advance,
Volodya
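
For reference, the predicate loop mentioned above looks roughly like this. It is a minimal sketch in C with POSIX threads; the names state_lock, state_cond, ready, payload, producer and wait_for_payload are made up for illustration and do not come from the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Shared state: the predicate ("ready") is protected by the mutex. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  state_cond = PTHREAD_COND_INITIALIZER;
static bool ready = false;
static int  payload = 0;

/* Notifier: change the predicate under the mutex, then signal. */
static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&state_lock);
    payload = 42;
    ready = true;
    pthread_mutex_unlock(&state_lock);
    pthread_cond_signal(&state_cond);
    return NULL;
}

/* Waiter: re-test the predicate after every return from pthread_cond_wait,
 * so a wakeup without a matching signal is harmless. */
int wait_for_payload(void)
{
    pthread_t t;
    int value;

    if (pthread_create(&t, NULL, producer, NULL) != 0)
        return -1;

    pthread_mutex_lock(&state_lock);
    while (!ready)
        pthread_cond_wait(&state_cond, &state_lock);
    assert(ready);            /* holds no matter why we woke up */
    value = payload;
    pthread_mutex_unlock(&state_lock);

    pthread_join(t, NULL);
    return value;
}
```

Because the waiter only trusts the predicate, the code stays correct whether pthread_cond_wait returns because of a signal, a broadcast, or for no visible reason.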

Joe Seigh

Mar 31, 2005, 12:13:34 PM

It allows flexibility in the implementation. A strict definition with
no spurious wakeups allowed could possibly require overly expensive
synchronization on some platforms. You get to make a tradeoff. The
occasional spurious wakeup for much better performance. You can
write your own condition variable that does not have spurious wakeups,
but it will not perform as well.

The Linux futex-based condition variables wake up spuriously on
signal interruptions. Not really for performance reasons, AFAICT. Just
because they can do it.

I've written a faster futex based condition variable that wouldn't be
possible if spurious wakeups were not allowed. So it is useful to have
condition variables defined that way.


--
Joe Seigh

Alexander Terekhov

Mar 31, 2005, 12:19:08 PM

Vladimir Prus wrote:
>
> Hello,
>
> Can somebody explain the real reason why spurious wakeups are possible? That
> is, why can a call to pthread_cond_wait return without a
> pthread_cond_broadcast/pthread_cond_signal?

Because apart from realtime scheduling,
pthread_cond_broadcast/pthread_cond_signal can be implemented as nop.

No kidding.

cond_wait: mutex::release_guard guard(mutex); sleep(random());

cond_signal: nop

cond_broadcast: nop


See also

http://terekhov.de/DESIGN-futex-CV.cpp
http://terekhov.de/DESIGN-futex-CV-with-async.cancelable-wait.txt

regards,
alexander.
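
The nop implementation above can be written out as a sketch in C. It only illustrates why a program that uses predicate loops cannot tell the difference (real-time scheduling aside, as noted in the post). The names nop_cond_wait, nop_cond_signal, nop_cond_broadcast and run_nop_condvar_demo are hypothetical, and the 1 to 1000 microsecond sleep range is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* "Wait": drop the mutex, sleep for a short random time, retake the mutex.
 * Every return is, in effect, a spurious wakeup. */
void nop_cond_wait(pthread_mutex_t *m)
{
    struct timespec nap;
    nap.tv_sec = 0;
    nap.tv_nsec = (rand() % 1000 + 1) * 1000L;   /* 1..1000 microseconds */

    pthread_mutex_unlock(m);
    nanosleep(&nap, NULL);
    pthread_mutex_lock(m);
}

void nop_cond_signal(void)    { /* nothing to do */ }
void nop_cond_broadcast(void) { /* nothing to do */ }

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static bool done = false;

static void *setter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    done = true;                    /* the predicate carries the information */
    pthread_mutex_unlock(&demo_lock);
    nop_cond_signal();              /* has no effect at all */
    return NULL;
}

/* Returns how many times the waiter woke up before it saw the predicate,
 * or -1 if the helper thread could not be created. */
int run_nop_condvar_demo(void)
{
    pthread_t t;
    int wakeups = 0;

    if (pthread_create(&t, NULL, setter, NULL) != 0)
        return -1;

    pthread_mutex_lock(&demo_lock);
    while (!done) {                 /* the usual predicate loop */
        nop_cond_wait(&demo_lock);
        wakeups++;
    }
    pthread_mutex_unlock(&demo_lock);

    pthread_join(t, NULL);
    return wakeups;
}
```

The waiter still terminates, because the loop re-tests the predicate after every wakeup; what is lost is efficiency, since the waiter polls instead of sleeping until notified.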

David Hopwood

Mar 31, 2005, 4:31:36 PM

Joe Seigh wrote:
> On Thu, 31 Mar 2005 18:43:04 +0400, Vladimir Prus <gh...@cs.msu.su> wrote:
>
>> Can somebody explain the real reason why spurious wakeups are
>> possible? That is, why can a call to pthread_cond_wait return without a
>> pthread_cond_broadcast/pthread_cond_signal? I'm specifically talking about
>> spurious wakeups caused by the pthread implementation, not the ones caused by
>> loose predicate checking on the notifying side, or by a "stolen wakeup". I
>> looked in the archives for this group and in the FAQ, but found only vague
>> statements that on some SMP systems avoiding spurious wakeups is too hard.
>>
>> I understand that pthread_cond_wait needs to be wrapped in a loop even if
>> spurious wakeups did not exist. I'm asking just because the wakeups are
>> mentioned everywhere, but I have never seen a complete explanation.
>
> It allows flexibility in the implementation. A strict definition with
> no spurious wakeups allowed could possibly require overly expensive
> synchronization on some platforms. You get to make a tradeoff. The
> occasional spurious wakeup for much better performance. You can
> write your own condition variable that does not have spurious wakeups,
> but it will not perform as well.

You've given a generic answer that could apply to any instance of a relaxed
specification whatsoever. How is this useful when the OP asked for a
specific reason for this particular relaxation?

> The Linux futex-based condition variables wake up spuriously on
> signal interruptions. Not really for performance reasons, AFAICT. Just
> because they can do it.
>
> I've written a faster futex based condition variable that wouldn't be
> possible if spurious wakeups were not allowed.

Then you're in a good position to give the OP a real answer to his question.
I've only ever heard "it turns out to be hard" and "it might be costly"
handwaving, without any analysis of the actual difficulty or cost.

The c.p.threads FAQ has a brief discussion of this
<http://www.lambdacs.com/cpt/FAQ.html#Q94>, and ends up concluding:

>> You know, I wonder if the designers of pthreads used logic like this:
>> users of condition variables have to check the condition on exit anyway,
>> so we will not be placing any additional burden on them if we allow
>> spurious wakeups; and since it is conceivable that allowing spurious
>> wakeups could make an implementation faster, it can only help if we
>> allow them.
>>
>> They may not have had any particular implementation in mind.
>
> You're actually not far off at all, except you didn't push it far enough.
>
> The intent was to force correct/robust code by requiring predicate loops.
> This was driven by the provably correct academic contingent among the
> "core threadies" in the working group, though I don't think anyone really
> disagreed with the intent once they understood what it meant.
>
> We followed that intent with several levels of justification. The first
> was that "religiously" using a loop protects the application against its
> own imperfect coding practices. The second was that it wasn't difficult to
> abstractly imagine machines and implementation code that could exploit
> this requirement to improve the performance of average condition wait
> operations through optimizing the synchronization mechanisms.

But note that "it isn't difficult to abstractly imagine machines and
implementation code that could exploit this requirement" isn't the same
as actually describing a sketch of such an implementation. If it isn't
difficult, I'd like to see one.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

David Hopwood

Mar 31, 2005, 4:55:31 PM

Alexander Terekhov wrote:

> Vladimir Prus wrote:
>
>>Can somebody explain the real reason why spurious wakeups are possible? That
>>is, why can a call to pthread_cond_wait return without a
>>pthread_cond_broadcast/pthread_cond_signal?
>
> Because apart from realtime scheduling,
> pthread_cond_broadcast/pthread_cond_signal can be implemented as nop.
>
> No kidding.
>
> cond_wait: mutex::release_guard guard(mutex); sleep(random());
>
> cond_signal: nop
>
> cond_broadcast: nop

This is a very inefficient implementation that would not be used in practice,
so it doesn't answer the OP's question.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Alexander Terekhov

Mar 31, 2005, 5:08:54 PM

David Hopwood wrote:

[... cond_wait: mutex::release_guard guard(mutex); sleep(random()) ...]

> This is a very inefficient implementation that would not be used in practice,
> so it doesn't answer the OP's question.

You seem to have missed the "see also" section.

regards,
alexander.

Joe Seigh

Mar 31, 2005, 8:53:07 PM

On Thu, 31 Mar 2005 21:31:36 GMT, David Hopwood <david.nosp...@blueyonder.co.uk> wrote:

The examples were probably some of the current implementations. I
don't think anyone wanted to admit on record that their implementation
would suck unless spurious wakeups were allowed. I think people
were smiling a lot when they spoke that particular statement. A possible
implementation might be the syscall not keeping enough state information
to make it restartable after an EINTR. This is not the case
for Linux which has enough information in the condvar and in the futex
to make futex syscalls restartable on EINTR if they wanted to.

The implementation I did is here http://atomic-ptr-plus.sourceforge.net/
though I doubt it's what they had in mind when they did the Posix spec.
IIRC, it runs about 10% faster than the NPTL implementation, 3X faster
if you "fix" condvar signals to not preempt.

--
Joe Seigh

Vladimir Prus

Apr 1, 2005, 2:55:29 AM

Joe Seigh wrote:

>> But note that "it isn't difficult to abstractly imagine machines and
>> implementation code that could exploit this requirement" isn't the same
>> as actually describing a sketch of such an implementation. If it isn't
>> difficult, I'd like to see one.
>
> The examples were probably some of the current implementations. I
> don't think anyone wanted to admit on record that their implementation
> would suck unless spurious wakeups were allowed. I think people
> were smiling a lot when they spoke that particular statement. A possible
> implementation might be the syscall not keeping enough state information
> to make it restartable after an EINTR. This is not the case
> for Linux which has enough information in the condvar and in the futex
> to make futex syscalls restartable on EINTR if they wanted to.

So,
- wait on futex returns EINTR on signal
- cond_wait implementation immediately returns on EINTR

Well, that's a reason. However, there are still two questions:

- why does a wait on a futex return on signals?
- as you indicate, cond_wait returns on EINTR just because it's allowed to.
It would be possible to wait on the futex again in that case. And wouldn't that
have no great effect on performance?

So, it looks like I still haven't seen an example where avoiding spurious wakeups
would cause a great performance loss for all condvar operations. And I'm still
interested to know ;-)

- Volodya

Vladimir Prus

Apr 1, 2005, 2:50:08 AM

Alexander Terekhov wrote:

> Vladimir Prus wrote:
>>
>> Hello,
>>
>> Can somebody explain the real reason why spurious wakeups are possible?
>> That is, why can a call to pthread_cond_wait return without a
>> pthread_cond_broadcast/pthread_cond_signal?
>
> Because apart from realtime scheduling,
> pthread_cond_broadcast/pthread_cond_signal can be implemented as nop.
>
> No kidding.

Yes, I understand. But I doubt there's any real implementation that does
it this way. I'm mostly interested in practical reasons.

And how are spurious wakeups possible in that implementation? Unfortunately
I can't immediately see that from 10K of code.

- Volodya

Alexander Terekhov

Apr 1, 2005, 3:40:42 AM

Vladimir Prus wrote:
[...]

> > http://terekhov.de/DESIGN-futex-CV.cpp
> > http://terekhov.de/DESIGN-futex-CV-with-async.cancelable-wait.txt
>
> And how are spurious wakeups possible in that implementation? Unfortunately
> I can't immediately see that from 10K of code.

See .txt above. cond_broadcast() in cond_wait_cleanup_handler2() and
"FUTEX WAKE (cv->futex, ALL)" in cond_wait_cleanup() will cause
spurious wakeups (same effect due to futex token changes and races
with respect to futex wait and futex wake aside for a moment).

regards,
alexander.

Ben Hutchings

Apr 1, 2005, 10:23:07 AM

Vladimir Prus wrote:
> Joe Seigh wrote:
>
>>> But note that "it isn't difficult to abstractly imagine machines and
>>> implementation code that could exploit this requirement" isn't the same
>>> as actually describing a sketch of such an implementation. If it isn't
>>> difficult, I'd like to see one.
>>
>> The examples were probably some of the current implementations. I
>> don't think anyone wanted to admit on record that their implementation
>> would suck unless spurious wakeups were allowed. I think people
>> were smiling a lot when they spoke that particular statement. A possible
>> implementation might be the syscall not keeping enough state information
>> to make it restartable after an EINTR. This is not the case
>> for Linux which has enough information in the condvar and in the futex
>> to make futex syscalls restartable on EINTR if they wanted to.
>
> So,
> - wait on futex returns EINTR on signal
> - cond_wait implementation immediately returns on EINTR
>
> Well, that's a reason. However, there are still two questions:
>
> - why does a wait on a futex return on signals?

The futex() caller may want to respond to the signal. Also signal
handling can involve calling user-space code, which may make it hard
to preserve the in-kernel context of the futex() call.

> - as you indicate, cond_wait returns on EINTR just because it's allowed to.
> It would be possible to wait on the futex again in that case.

<snip>

No, because a notification might be missed.

Ben.

--
Ben Hutchings
Q. Which is the greater problem in the world today, ignorance or apathy?
A. I don't know and I couldn't care less.
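
The missed notification can be made concrete with a toy model in C. This is a sketch under strong assumptions, not real futex semantics: the modelled wait queue keeps no state beyond "is the waiter queued right now", so a wake-up that finds nobody queued is simply lost. As noted earlier in the thread, Linux keeps enough state in the condvar and the futex that a restart would be possible there; the model deliberately leaves that out. All names (toy_condvar, toy_enqueue, toy_interrupt, toy_signal, waiter_makes_progress) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy wait queue: remembers only whether the single waiter is queued. */
struct toy_condvar {
    bool waiter_queued;   /* waiter is currently asleep on the queue */
    bool waiter_woken;    /* a wake-up actually reached the waiter   */
};

static void toy_enqueue(struct toy_condvar *cv)   { cv->waiter_queued = true; }

/* A signal handler runs: the wait fails with EINTR and the waiter leaves the queue. */
static void toy_interrupt(struct toy_condvar *cv) { cv->waiter_queued = false; }

static void toy_signal(struct toy_condvar *cv)
{
    if (cv->waiter_queued) {
        cv->waiter_queued = false;
        cv->waiter_woken = true;
    }
    /* else: nobody is queued, so the wake-up has no effect */
}

/* Replays one fixed interleaving. Returns true if the waiter ends up seeing
 * the predicate, false if it is left waiting with no further signal coming. */
bool waiter_makes_progress(bool retry_on_eintr)
{
    struct toy_condvar cv = { false, false };
    bool predicate = false;

    toy_enqueue(&cv);     /* waiter: cond_wait released the mutex and sleeps */
    toy_interrupt(&cv);   /* waiter: interrupted, wait returns EINTR         */
    predicate = true;     /* notifier: sets the predicate under the mutex    */
    toy_signal(&cv);      /* notifier: signals, but the queue is empty       */

    if (retry_on_eintr) {
        toy_enqueue(&cv);         /* waiter silently goes back to sleep */
        return cv.waiter_woken;   /* false: the only signal was missed  */
    }

    /* cond_wait returns instead (a spurious wakeup); the caller's
     * predicate loop re-tests under the mutex and sees the change. */
    return predicate;
}
```

In this model, returning on EINTR lets the caller's loop notice the changed predicate, while a blind retry leaves the waiter asleep after the only signal has already been sent.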

gh...@cs.msu.su

Apr 18, 2005, 8:00:31 AM

> > - why does a wait on a futex return on signals?

> The futex() caller may want to respond to the signal. Also signal
> handling can involve calling user-space code, which may make it hard
> to preserve the in-kernel context of the futex() call.

Ok, the second argument sounds reasonable.

> > - as you indicate, cond_wait returns on EINTR just because it's allowed to.
> > It would be possible to wait on the futex again in that case.

> No, because a notification might be missed.

Ehmm.. indeed!

Thanks for explaining. This is indeed a valid reason for spurious
wakeups. I'd still be interested to know whether that's the reason hinted
at everywhere, or whether there are other cases.

Thanks,
Volodya
