Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Win32 condition variables redux

44 views
Skip to first unread message

d|dq

unread,
Aug 3, 2005, 3:25:27 AM8/3/05
to
I have read countless emails from very serious people...Alexander
Terekhov, Joe Seigh, David Schwartz, Dave Butenhof, Chris Thomasson,
etc.

You all seem to know a little bit on the topic. ;)

I have looked at many implementations of Win32 POSIX-like condition
variables, including pthreads-w32, Austria C++, Boost, cygwin, Apache
Portable Runtime (APR), Pelt, the implementation Joe Seigh posted here,
Douglas C. Schmidt's recommended implementations, Alexander Terekhov
and Louis Thomas' various algorithms (one of them being the base of
pthreads-w32)...and a few others.

My head is still spinning. (maybe on a spinlock)

Anyway, I have come the to the conclusion that Pthreads semantics for
condition variables is way too confusing/sophisticated to be
implemented on top of Win32.

So, I would settle for a redux version of condition variables...

For instance, if one assumes the following requirements:

- Thread cancellation is not supported.
- The external mutex is held by any thread executing a wait, timed
wait, signal or broascast operation.
- The solution does not need to support versions of Windows prior to
Windows 2000.

- The solution should be based on the Windows API without making any
assumption beyond its documentation.
- The solution should not make assumptions about the underlying
hardware.

So what would such an implementation look like?

Many implementations out there simply do not work or do not work well.
Others are reputed to work, but they are so complicated that I can't
decide if I can trust them or not.

So, before I go on a limb and pick something that looks reasonable to
me (and might still be "braindead"TM), I would like to know what you,
fellow experts, think.

BTW, those Interlocked* APIs are often bashed for all sorts of reasons,
event objects also have their detractors, and I even found people to
complain about EnterCriticalSection/LeaveCriticalSection!
I am not interested in getting the maximum performance the underlying
hardware could offer. I just want something that works reliably and can
be read without headache. Therefore, using these APIs/facilities is OK
with me.

Any helpers/takers?

Joe Seigh

unread,
Aug 3, 2005, 8:58:37 AM8/3/05
to
d|dq wrote:
> Anyway, I have come the to the conclusion that Pthreads semantics for
> condition variables is way too confusing/sophisticated to be
> implemented on top of Win32.
>
> So, I would settle for a redux version of condition variables...
>
> For instance, if one assumes the following requirements:
>
> - Thread cancellation is not supported.
> - The external mutex is held by any thread executing a wait, timed
> wait, signal or broascast operation.
> - The solution does not need to support versions of Windows prior to
> Windows 2000.
>
> - The solution should be based on the Windows API without making any
> assumption beyond its documentation.
> - The solution should not make assumptions about the underlying
> hardware.
>
> So what would such an implementation look like?
>

Well if PulseEvent worked (it doesn't) then it would look like
a broadcast only solution using SignalAndWait. It would be easy
enough to fix in the windows kernel but I guess Microsoft doesn't
have the talent they think they have.

The problem is ResetEvent makes Events not usable in the way you'd
want them to be. You need an eventcount which is sort of a
reentrant event. Without that, you're going to have a messy
solution no matter what you do.


--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

Joe Seigh

unread,
Aug 3, 2005, 9:14:38 AM8/3/05
to
Joe Seigh wrote:
>
> Well if PulseEvent worked (it doesn't) then it would look like
> a broadcast only solution using SignalAndWait. It would be easy
> enough to fix in the windows kernel but I guess Microsoft doesn't
> have the talent they think they have.
>
Actually, if I understand it right, *all* of win32 Wait functions are
broken. PulseEvent is probably considered broken since it's a really
fast SetEvent/ResetEvent relative to a kernel APC. Or saying it the
other way around, SetEvent/ResetEvent is slow compared to kernel APC
so race conditions are somewhat rare. This is not likely to be as true
on a multiprocessor. And I suspect this is true for other synchronization
objects as well that don't always remain in a signaled state.

d|dq

unread,
Aug 3, 2005, 11:36:11 AM8/3/05
to
Joe,

Thanks for jumping in. I appreciate your insight.

You seem to say (even) "CV redux" can't be programmed easily using
Win32 primitives.

Implementations I have seen rely on:
- one or more critical sections
- one or more semaphores
- up to half a dozen counters
- a data structure such as a queue or a doubly-linked linked list
- a few Interlocked* this or Interlocked* that
- up to a 1,000 lines of C code

That's a lot of code to digest. In the end, it's very difficult to
determine if problems are hidden in such complex implementations. (And
I can't believe so much code and "logic" improves performance.)

Example: The Apache Portable Runtime (APR)
A Win32 implementation of condition variables has been part of APR for
several years. People thought it worked. a few days ago, 2 patches had
to be applied because it didn't work at all, even with 2 threads.

That's just one example. There are many others. None of them give our
industry a good name.

But, going back to "CV redux"...

Could it be done with:
- 1 auto-reset event object
(using WaitForSingleObject to wait and SetEvent to communicate a
broadcast to all waiting threads by repeatedly marking the event
signaled)
- 1 "broadcast" flag
- 1 "waiting" counter

?

PS: According to several Microsoft employees, PulseEvent should be
avoided like the plague. I even read that it remains in the Win32 API
for backward compatibility reasons only.

Joe Seigh

unread,
Aug 3, 2005, 12:12:58 PM8/3/05
to
d|dq wrote:
> Joe,
>
> Thanks for jumping in. I appreciate your insight.
>
> You seem to say (even) "CV redux" can't be programmed easily using
> Win32 primitives.
>
> Implementations I have seen rely on:
> - one or more critical sections
> - one or more semaphores
> - up to half a dozen counters
> - a data structure such as a queue or a doubly-linked linked list
> - a few Interlocked* this or Interlocked* that
> - up to a 1,000 lines of C code

Well, it can be complicated but some people may make it more
complicated than it has to be.


>
> That's a lot of code to digest. In the end, it's very difficult to
> determine if problems are hidden in such complex implementations. (And
> I can't believe so much code and "logic" improves performance.)

That's why low level synchronization primatives are important. If
the platform provider makes bad choices, this is what you get.


>
> Example: The Apache Portable Runtime (APR)
> A Win32 implementation of condition variables has been part of APR for
> several years. People thought it worked. a few days ago, 2 patches had
> to be applied because it didn't work at all, even with 2 threads.

We spotted it right away since we already knew win32 condition
variables were problematic.

>
> That's just one example. There are many others. None of them give our
> industry a good name.
>
> But, going back to "CV redux"...
>
> Could it be done with:
> - 1 auto-reset event object
> (using WaitForSingleObject to wait and SetEvent to communicate a
> broadcast to all waiting threads by repeatedly marking the event
> signaled)
> - 1 "broadcast" flag
> - 1 "waiting" counter
>
> ?

No. If there was a simple way, we'd all use that instead of
coming up with complicated solutions.

>
> PS: According to several Microsoft employees, PulseEvent should be
> avoided like the plague. I even read that it remains in the Win32 API
> for backward compatibility reasons only.
>

The documentation mentions that also.

d|dq

unread,
Aug 3, 2005, 3:17:11 PM8/3/05
to
OK, this is sad.

Oh well, could anyone point me to a set of programs that can be used to
test a Win32 CV implementation?

At least something that could quickly identify implementations that
clearly don't work on multi-CPU hardware with a handful of threads.

Chris Thomasson

unread,
Aug 3, 2005, 6:08:18 PM8/3/05
to
> - The solution should be based on the Windows API without making any
> assumption beyond its documentation.
> - The solution should not make assumptions about the underlying
> hardware.
>
> So what would such an implementation look like?
>
> Many implementations out there simply do not work or do not work well.
> Others are reputed to work, but they are so complicated that I can't
> decide if I can trust them or not.

A quick and dirty method for windows is to force the users of your condvar
to use SCHED_OTHER.

;(


An event object per-thread and linked-lists for waitsets can create a
"fairly efficient" and correct condvar for windows. The reason I say its
efficient is because it only uses a single critical section per-condvar and
a single event per-thread. You could extend it even further by removing the
condvar_t::mutex member and just hash the address of a condvar_t into an
index of a global array of CRITICAL_SECTION's...


Something like this:

<pseudo-code with timeout logic>


#define PER_THREAD_WAITING 1


typedef struct per_thread_
{
struct per_thread_ *next;
struct per_thread_ *prev;
LONG refs;
int state;
HANDLE waiter;

} per_thread_t;


typedef struct condvar_
{
per_thread_t *front;
per_thread_t *back;
LONG waiters;
LONG count;
CRITICAL_SECTION mutex;

} condvar_t;


void condvar_init( condvar_t *_this )
{
_this->front = 0;
_this->back = 0;
_this->waiters = 0;
_this->count = 0;
InitalizeCriticalSection( &_this->mutex );
}

int condvar_destroy( condvar_t *_this )
{
int err = 0;

EnterCriticalSection( &_this->mutex );
if ( _this->front ) { err = EBUSY; }
LeaveCriticalSection( &_this->mutex );

if ( ! err )
{
DeleteCriticalSection( &_this->mutex );
if ( _this->count ) { abort(); }
}

return err;
}

int condvar_timedwait( condvar_t *_this, mutex_t *mutex, DWORD timeout )
{
int err = 0;
DWORD wait_ret;
per_thread_t *thread = per_thread_self();

InterlockedIncrement( &thread->refs );
EnterCriticalSection( &_this->mutex );

if ( ! _this->front )
{
_this->front = thread;
}

else
{
_this->back->next = thread;
thread->prev = _this->back;
}

_this->back = thread;
++_this->count;
thread->state = PER_THREAD_WAITING;

InterlockedIncrement( &_this->waiters );
LeaveCriticalSection( &_this->mutex );

LeaveCriticalSection( mutex );

wait_ret = WaitForSingleObject
( thread->waiter,
timeout );

if ( wait_ret != WAIT_OBJECT_0 )
{
err = prv_condvar_timeout( _this, thread );
if ( wait_ret != WAIT_TIMEOUT ) { abort(); }
}

EnterCriticalSection( mutex );
return err;
}

void condvar_signal( condvar_t *_this )
{
register LONG cmp, old = _this->waiters;

do
{
cmp = old;
old = InterlockedCompareExchange
( &_this->waiters,
( old > 0 ) ? old - 1 : old,
cmp );
} while ( cmp != old );

if ( old )
{
/* slow-path */
per_thread_t *thread;

EnterCriticalSection( &_this->mutex );

thread = _this->front;

if ( thread )
{
prv_condvar_pop( _this, thread );
assert( thread->state == PER_THREAD_WAITING );
thread->state = 0;
}

LeaveCriticalSection( &_this->mutex );

if ( thread )
{
if ( ! SetEvent( thread->waiter ) ) { abort(); }
per_thread_release( thread );
}
}
}

void condvar_broadcast( condvar_t *_this )
{
if ( InterlockedExchange( &_this->waiters, 0 ) )
{
/* slow-path */
per_thread_t *thread, *next;

EnterCriticalSection( &_this->mutex );

thread = _this->front;

if ( thread )
{
per_thread_t *temp = thread;
_this->front = 0;
_this->back = 0;
_this->count = 0;

while ( thread )
{
assert( thread->state == PER_THREAD_WAITING );
thread->state = 0;
thread = thread->next;
}

thread = temp;
}

LeaveCriticalSection( &_this->mutex );

while ( thread )
{
next = thread->next;
assert( ! thread->state );
thread->next = 0;
thread->prev = 0;

if ( ! SetEvent( thread->waiter ) ) { abort(); }
per_thread_release( thread );

thread = next;
}
}
}


per_thread_t* per_thread_self( void )
{
per_thread_t *_this = pthread_getspecific( ... );

if ( ! _this )
{
_this = malloc( sizeof( *_this ) );
if ( ! _this ) { abort(); }
_this->state = 0;
_this->next = 0;
_this->prev = 0;
_this->refs = 1;
_this->waiter = CreateEvent( 0, FALSE, FALSE, 0 );
if ( ! _this->waiter ) { free( _this ); abort(); }
pthread_setspecific( ..., _this );
}

assert( ! _this->state && ! _this->next && ! _this->prev );
return _this;
}

void per_thread_release( per_thread_t *_this )
{
if ( ! InterlockedDecrement( &_this->refs ) )
{
if ( ! CloseHandle( _this->waitset ) ) { abort(); }
free( _this );
}
}


void prv_per_thread_tlsdtor( void *tls )
{
if ( tls ) { per_thread_release( tls ); }
}


void prv_condvar_pop( condvar_t *_this, per_thread_t *thread )
{
per_thread_t *next = thread->next;
per_thread_t *prev = thread->prev;

if ( ! prev && ! next )
{
_this->front = 0;
_this->back = 0;
}

else if ( ! prev && next )
{
next->prev = 0;
_this->front = next;
}

else if ( prev && ! next )
{
prev->next = 0;
_this->back = prev;
}

else
{
next->prev = next;
prev->next = prev;
}

--_this->count;
thread->next = 0;
thread->prev = 0;
}


int prv_condvar_timeout( condvar_t *_this, per_thread_t *thread )
{
int err = 0, wait = 0;
LONG count = 0;

EnterCriticalSection( &_this->mutex );

if ( thread->state )
{
prv_condvar_pop( _this, thread );
count = _this->count;
thread->state = 0;
err = ETIMEDOUT;
}

else { wait = 1; }

if ( err )
{
/* process cancels */
InterlockedExchange
( &_this->waiters,
count );
}

LeaveCriticalSection( &_this->mutex );

if ( wait )
{
/* we are signaled for sure, so we need to wait on that */
if ( WaitForSingleObject
( thread->waiter,
INFINITE ) != WAIT_OBJECT_0 )
{
abort();
}
}

if ( err ) { per_thread_release( thread ); }

return err;
}


d|dq

unread,
Aug 3, 2005, 4:53:26 PM8/3/05
to
Thanks, Chris. I will review your solution.
It's quite a hunk of code, though.

How did you test it?

Joe Seigh

unread,
Aug 3, 2005, 5:15:27 PM8/3/05
to
Probably not since the usual problem is race conditions that are hard
to detect.

Charles Bryant

unread,
Aug 3, 2005, 6:02:33 PM8/3/05
to
In article <1123053927.3...@g44g2000cwa.googlegroups.com>,

d|dq <digital_d...@yahoo.com> wrote:
>So, I would settle for a redux version of condition variables...
>
>For instance, if one assumes the following requirements:
>
>- Thread cancellation is not supported.
>- The external mutex is held by any thread executing a wait, timed
>wait, signal or broascast operation.
>- The solution does not need to support versions of Windows prior to
>Windows 2000.
>
>- The solution should be based on the Windows API without making any
>assumption beyond its documentation.
>- The solution should not make assumptions about the underlying
>hardware.
>
>So what would such an implementation look like?

The folowing works, subject to one major restriction:

A condition variable is a manual reset event.

A mutex is anything you like (critical section, event, mutex, etc).

cv_wait(cv, mx) looks like this:
ResetEvent(cv);
unlock(mx);
WaitForSingleObject(cv);
lock(mx);

cv_signal(cv) and cv_broadcast(cv) both look like this:
SetEvent(cv);

Here's how it works. Suppose you have a predicate P used with a condition
variable. P might be 'the queue is empty'. The use of a condition
variable will be like this:

somewhere:
lock(mx);
while (P) {
cv_wait(cv, mx);
}
do something which requires !P
unlock(mx)

somewhere else:
lock(mx);
do something which makes P false
unlock(mx);
cv_signal(cv);

Given some reasonable assumptions, we can prove that the cv_wait will
always wake up soon after P becomes false. These assumptions are:
* if P is true at the point of any lock(mx), then at the
corresponding unlock(mx) either P is true or a cv_signal(cv)
will soon be executed
* no code can change the value of P unless it holds the lock

From this it follows that we can guarantee that if no thread holds the
lock then either P is false (which means threads can safely sleep) or
some thread is about to perform SetEvent(), or the event is aleady
signalled.

In fact, you may even be able to convince yourself that this approach
can be proven correct.

Unfortunately, there is one important assumption it makes: that there
is only one predicate used with a given condition variable. For
example, a queue object might use a single condition variable to
signal both "queue is not full" and "queue is not empty". With proper
condition variables, this is quite reasonable, since you can't have
one thread waiting for the queue to be non-empty while another waits
for it to have space. However the implementation outlined above can
fail in this way:

Thread A attempts to dequeue an item, finds the queue is
empty, and calls cv_wait(), which resets the event, and
unlocks the mutex. At this point thread A gets pre-empted.

Thread B starts up and quickly fills the queue. It then
attempts to put another item in the queue, finds it full, and
calls cv_wait(), which resets the event, unlocks the mutex,
and blocks thread B in the WaitForSingleObject().

Thread A now resumes, gets into WaitForSingleObject() and goes
to sleep on the event.

Both threads end up asleep on an event which will never be signalled.

Charles Bryant

unread,
Aug 3, 2005, 6:38:54 PM8/3/05
to
In article <59GdnTUDgpK...@comcast.com>,

Chris Thomasson <_no_damn_spam_cristom@_no_damn_comcast.net_spam> wrote:
>
>An event object per-thread and linked-lists for waitsets can create a
>"fairly efficient" and correct condvar for windows.
..
>Something like this:
>
><pseudo-code with timeout logic>

skip down to:

>void condvar_broadcast( condvar_t *_this )

..


> LeaveCriticalSection( &_this->mutex );
>
> while ( thread )
> {
> next = thread->next;

Note that at this point we don't hold any mutex

while in:

>void prv_condvar_pop( condvar_t *_this, per_thread_t *thread )
>{

..


> thread->next = 0;
> thread->prev = 0;
>}

called from

>int prv_condvar_timeout( condvar_t *_this, per_thread_t *thread )
>{
> int err = 0, wait = 0;
> LONG count = 0;
> EnterCriticalSection( &_this->mutex );
> if ( thread->state ) {
> prv_condvar_pop( _this, thread );


So if a thread gets a timeout just as a condvar_broadcast() is running
through the list of threads, then the thread may corrupt the list of
threads to wake, leading to threads never being woken.

To test, use this scenario:


Modify the code to insert 'Sleep(10 seconds)' just after the
LeaveCriticalSection in condvar_broadcast()

Run the following:
Start N threads which do condvar_timedwait() with no timeout.
Sleep one second to ensure that they have gone asleep
Start a thread which does condvar_timedwait() with a timeout
of 4 seconds
wait 1 second
Start N threads which do condvar_timedwait() with no timeout.
wait 1 second
perform condvar_broadcast()
wait 15 seconds

check how many threads were woken. It should be 2N+1, but I
believe it will be N+1

d|dq

unread,
Aug 3, 2005, 8:33:45 PM8/3/05
to
Charles, are you sure this works?

d|dq

unread,
Aug 3, 2005, 8:37:49 PM8/3/05
to
:)

It's Hell to implement correctly and impossible to verify. It must be
software.

Really, your signature is an insult to lemonade, which is easy to
produce and easy to identify. ;)

Seriously, I am starting to understand why some people think threads
are a bad idea.

Joe Seigh

unread,
Aug 3, 2005, 9:20:08 PM8/3/05
to
Only if the api's were badly designed in the first place as in the case
of windows. What make's it worse is inadequate low level api's needed
to rectify the problem. But I wouldn't take the fact that working with
low level api's is difficult as an indication of thread programming in
general. There are easy to use high level api's such as pipes or fifo
queues which allow you to implement producer/consumer quite easily without
worrying about tricky details of threaded programming.

BTW, I checked out all of the windows synchronization objects and they're all
events that have to be reset by the waiting thread, not the signaler, in order
to work correctly. The only exception appears to be the directory change
notification handle which acts sort of like an eventcount, you'd signal by
modifying the directory it's set on. Not very practical for basing condition
variables on.

d|dq

unread,
Aug 3, 2005, 11:56:46 PM8/3/05
to
Joe,

Do you still like/use the implementation you posted in 2003?

http://groups-beta.google.com/group/comp.programming.threads/browse_thread/thread/dec4902c7f6cd24a/31803c3398658e06?q=%2BCondVar.h&rnum=1&hl=en#31803c3398658e06

I am starting to understand it better, even though it's not a simple
thing...it is still simpler than what exists in pthreads-w32.
Not that I have anything against the pthreads-w32 implementation, but I
am not smart enough to understand it.

How much more would I have to reduce the Win32 CV redux requirements to
reach an implementation that leaves no doubt to the reader?

How about these?

- Thread cancellation is not supported.
- The external mutex is held by any thread executing a wait, timed
wait, signal or broascast operation.

- Spurious wakeups are OK.
- Never more than 8 threads.
- Never more than 4 CPUs.


- The solution does not need to support versions of Windows prior to
Windows 2000.

- The solution is based on the Windows API without making any
assumption beyond its documentation.
- The solution does not make assumptions about the underlying
hardware.

Come one, come on, do we have a deal? ;)

Message has been deleted

Charles Bryant

unread,
Aug 4, 2005, 7:02:40 PM8/4/05
to
In article <1123115625....@g14g2000cwa.googlegroups.com>,

I think so. Why wouldn't it?

Alexander Terekhov

unread,
Aug 5, 2005, 12:34:15 PM8/5/05
to

d|dq wrote:
[...]

> Anyway, I have come the to the conclusion that Pthreads semantics for
> condition variables is way too confusing/sophisticated to be
> implemented on top of Win32.

It's really simple if you can afford per-thread blockers (auto-reset
event) and explicit queuing; the only (mild) complication is a race
with respect to timeouts (or cancellation in some impls) and condvar
destruction after broadcast/signal(s)/cancel(s).

regards,
alexander.

d|dq

unread,
Aug 5, 2005, 5:19:47 PM8/5/05
to
In any case, Win32 is no natural ground for condition variables as
defined by POSIX.

Therefore, it was wrong for me to try to lay a POSIX layer on top of
Win32.

My abstraction layer won't have condition variables, that's all.

You can't fit a square peg in a round hole...efficiently, that is.

Sean Kelly

unread,
Aug 5, 2005, 5:46:13 PM8/5/05
to
As a matter of academic interest, Windows Services for Unix includes a
POSIX subsystem. Though I've yet to try building anything on top of
it.

Sean

d|dq

unread,
Aug 5, 2005, 5:51:30 PM8/5/05
to
Interix?

Umm...I think they gave up on that. It was full of bugs and no one ever
really used it.

I need to get a book that explains what can actually been done in terms
of synchronization with Win32.

Any recommendations?

Is this one any good?

http://www.amazon.com/exec/obidos/tg/detail/-/0321256190/103-8531230-2719806?v=glance

d|dq

unread,
Aug 5, 2005, 6:00:03 PM8/5/05
to
Alexander,

Simplicity is in the eye of the beholder.

In my opinion, none of these implementations are simple.
In fact, it took months to develop each of them and yet, some of them
ended up having significant defects.

Alexander Terekhov

unread,
Aug 5, 2005, 8:43:27 PM8/5/05
to

Well, POSIX is successfully layered on top of (much, much more) archaic
S/360 ("z" stuff nowadays)... rather efficiently (judging from profits),
may I note. :-)

regards,
alexander.

Alexander Terekhov

unread,
Aug 5, 2005, 8:43:21 PM8/5/05
to

Well, POSIX is successfully layered on top of (much, much more) archaic

Alexander Terekhov

unread,
Aug 5, 2005, 8:44:09 PM8/5/05
to

Well, POSIX is successfully layered on top of (much, much more) archaic

Alexander Terekhov

unread,
Aug 5, 2005, 8:43:37 PM8/5/05
to

Well, POSIX is successfully layered on top of (much, much more) archaic

Alexander Terekhov

unread,
Aug 5, 2005, 8:49:49 PM8/5/05
to

Alexander Terekhov wrote: ... (four times)

Sorry. Arcor sucks.

regards,
alexander.

d|dq

unread,
Aug 5, 2005, 11:53:12 PM8/5/05
to
I thought senility was getting the better of you there for a moment. ;)

Chris Thomasson

unread,
Aug 30, 2005, 6:55:11 PM8/30/05
to
<Charles Bryant> wrote:

> skip down to:

>>void condvar_broadcast( condvar_t *_this )

..
>> LeaveCriticalSection( &_this->mutex );

>> while ( thread ) { next = thread->next;

>Note that at this point we don't hold any mutex

This iteration is completly private. More on this below...


>while in:


>>void prv_condvar_pop( condvar_t *_this, per_thread_t *thread ) {

..


>> thread->next = 0; thread->prev = 0; }


>called from


>>int prv_condvar_timeout( condvar_t *_this, per_thread_t *thread ) { int
>>err = 0, wait = 0; LONG count = 0; EnterCriticalSection( &_this->mutex );
>>if ( thread->state ) { prv_condvar_pop( _this, thread );

> So if a thread gets a timeout just as a condvar_broadcast() is running
> through the list of threads, then the thread may corrupt the list of
> threads to wake, leading to threads never being woken.

> To test, use this scenario:


> Modify the code to insert 'Sleep(10 seconds)' just after the
> LeaveCriticalSection in condvar_broadcast()


> Run the following: Start N threads which do condvar_timedwait() with no
> timeout. Sleep one second to ensure that they have gone asleep Start a
> thread which does condvar_timedwait() with a timeout
of 4 seconds
> wait 1 second Start N threads which do condvar_timedwait() with no
> timeout. wait 1 second perform condvar_broadcast() wait 15 seconds


> check how many threads were woken. It should be 2N+1, but I believe
> it will be N+1


The race-condition you describe can't happen because the broadcast function
pops the entire list of nodes, sets their per_thread_t::state variables to
0, and stores the front pointer locally under the protection of the mutex.
Notice how the per_thread_t::state variable is always protected by the mutex
and is used to indicate whether or not a thread is in the list? How can the
timeout function corrupt the list when it only pops threads that have their
per_thread_t::state variables set to PER_THREAD_WAITING? I think you need to
study the pseudo-code a some more...


Keep this in mind:

/* not in list */
per_thread_t::state == 0;


/* in list */
per_thread_t::state == PER_THREAD_WAITING;


;)


0 new messages