Thread synchronization issues

Phil McGachey

Jan 21, 2002, 1:11:40 PM

Hi

I'm writing a multithreaded app in C++, and am running into trouble
trying to wait and signal between threads. I have defined a Signal
wrapper class:

Signal::Signal(){
    signal = CreateEvent(NULL, TRUE, FALSE, NULL);
}

Signal::~Signal(){
}

void Signal::waitForSignal(){
    TRACE("%d waiting\n", signal);
    WaitForSingleObject(signal, INFINITE);
    TRACE("%d got signal\n", signal);
}

void Signal::sendSignal(){
    TRACE("%d signalling\n", signal);
    PulseEvent(signal);
}

The idea being that a thread would pause when calling waitForSignal,
and then continue once another thread called sendSignal. The output of
the program is:

160 waiting
144 waiting
144 signalling
160 signalling
160 got signal
160 signalling
160 got signal
144 signalling
160 signalling
160 got signal
160 signalling
160 got signal

Here, handle 160 works as intended, but the signal from 144 never
seems to wake the listening thread. The design of the program means
that there's never more than one thread waiting on any signal object.

Any advice?

Thanks

Phil McGachey

Pavel Lebedinsky

Jan 21, 2002, 4:33:31 PM

Does this happen only under a debugger? If so, this is the
most likely reason:

http://support.microsoft.com/default.aspx?scid=kb;EN-US;q173260

BTW you should call CloseHandle() in Signal::~Signal.

Daniel Herlitz

Jan 21, 2002, 3:19:21 PM

Phil McGachey wrote:

> Here, handle 160 works as intended,

Hm, does it? How can "160 got signal" be printed four times when "160
waiting" is only printed once? How can a thread return from WFSO four
times if it only enters it once?

> but the signal from 144 never
> seems to wake the listening thread. The design of the program means
> that there's never more than one thread waiting on any signal object.
>

There are many ifs here. How long a time span does your printout
represent? Does the thread calling WFSO(144) even get a chance to return
within this time? What does your TRACE do? How is variable signal defined?

/D

Phil McGachey

Jan 21, 2002, 4:54:50 PM

It works fine outside the debugger, so that'll probably be it.

Thanks

Phil

Slava M. Usov

Jan 22, 2002, 7:46:13 AM

"Phil McGachey" <pmcg...@hotmail.com> wrote in message
news:3c4c58b2...@news.lineone.net...

[...]

> void Signal::waitForSignal(){
> TRACE("%d waiting\n", signal);
> WaitForSingleObject(signal, INFINITE);
> TRACE("%d got signal\n", signal);

[...]

> Here, handle 160 works as intended, but the signal from 144 never
> seems to wake the listening thread.

[...]

As has already been pointed out, don't do that under a debugger. However,
I'm willing to generalize and say "don't you ever do it", at least using
PulseEvent(). There is no simple way, using PulseEvent(), to ensure that a
thread will satisfy its wait. Consider this scenario: the thread is already
somewhere inside a kernel-mode implementation of WaitForSingleObject() but
has not yet reached the point of _having_ been put on the wait list
associated with the event. Then, PulseEvent() will pulse the event, but
since the thread is not on its wait list, the thread will miss the signal,
possibly waiting forever.

> The design of the program means that there's never more than one
> thread waiting on any signal object.

In that case, why not use an auto-reset event and replace PulseEvent() with
SetEvent()?

S
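
To illustrate the difference, here is a minimal portable sketch of the auto-reset idea. It is not Win32 code: the class name LatchedSignal and the helper signalBeforeWaitIsDelivered are hypothetical, and std::mutex plus std::condition_variable stand in for an auto-reset event used with SetEvent(). The point it tries to show is that the signalled state is stored, so a signal sent before the waiter reaches its wait is still delivered, which a pulse cannot guarantee.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical portable stand-in for an auto-reset event used with SetEvent().
class LatchedSignal {
public:
    void sendSignal() {
        {
            std::lock_guard<std::mutex> lock(m_);
            signaled_ = true;  // state stays set until one waiter consumes it
        }
        cv_.notify_one();
    }

    void waitForSignal() {
        std::unique_lock<std::mutex> lock(m_);
        // The predicate makes the wait immune to early signals and
        // spurious wakeups.
        cv_.wait(lock, [this] { return signaled_; });
        signaled_ = false;  // "auto-reset": exactly one waiter is released
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Sends the signal before the waiting thread even starts, then reports
// whether the waiter was released anyway.
bool signalBeforeWaitIsDelivered() {
    LatchedSignal s;
    s.sendSignal();
    bool woke = false;
    std::thread waiter([&] {
        s.waitForSignal();
        woke = true;
    });
    waiter.join();
    return woke;
}
```

With a pulse-style signal the same ordering would leave the waiter blocked, because nothing remembers that the signal was sent.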

Phil McGachey

Jan 22, 2002, 12:28:35 PM

A very good point - thanks.

For some reason I assumed that kernel operations were atomic.

Phil

Phil McGachey

Jan 22, 2002, 12:51:37 PM

On Mon, 21 Jan 2002 20:19:21 GMT, Daniel Herlitz
<dan...@REMOVETHIS.telia.com> wrote:

>
>Hm, does it? How can "160 got signal" be printed four times when "160
>waiting" is only printed once? How can a thread return from WFSO four
>times if it only enters it once?

Sorry about that - I cut out various other outputs for clarity, and
clipped the "waiting" ones as well.

>
>> but the signal from 144 never
>> seems to wake the listening thread. The design of the program means
>> that there's never more than one thread waiting on any signal object.
>>
>
>
>There are many ifs here. How long a time span does your printout
>represent? Does the thread calling WFSO(144) even get a chance to return
>within this time? What does your TRACE do? How is variable signal defined?

The timespan was long enough for any returns, TRACE outputs to the
debugger, and signal is a HANDLE.

It would appear that the problem was to do with the debugger anyway.

Thanks

Phil

>
>/D
>

Pavel Lebedinsky

Jan 22, 2002, 3:01:03 PM

"Slava M. Usov" wrote:

> As has already been pointed out, don't do that under a debugger. However,
> I'm willing to generalize and say "don't you ever do it", at least using
> PulseEvent(). There is no simple way, using PulseEvent(), to ensure that a
> thread will satisfy its wait. Consider this scenario: the thread is already
> somewhere inside a kernel-mode implementation of WaitForSingleObject() but
> has not yet reached the point of _having_ been put on the wait list
> associated with the event. Then, PulseEvent() will pulse the event, but
> since the thread is not on its wait list, the thread will miss the signal,
> possibly waiting forever.

That's a little bit harsh, IMO. Except for not working reliably when
threads are suspended, PulseEvent is perfectly reasonable.

In your scenario, you can use it with a mutex/critical section so that
listener threads are properly synchronized with the thread that calls
PulseEvent.

This is very similar to the way monitor broadcast works in Java or
C#. Recipient of the event makes sure he doesn't miss it by acquiring
the associated lock before waiting.

eran

Jan 23, 2002, 8:13:54 AM

"Pavel Lebedinsky" <m_pll at hot mail com> wrote in message news:<3c4dc4ff$1...@news.microsoft.com>...

> "Slava M. Usov" wrote:
> This is very similar to the way monitor broadcast works in Java or
> C#. Recipient of the event makes sure he doesn't miss it by acquiring
> the associated lock before waiting.

You are just paving the way for deadlock!!! Avoid waiting on an event when a
lock is acquired unless you really know what you are doing.

Slava M. Usov

Jan 23, 2002, 1:25:12 PM

"Pavel Lebedinsky" <m_pll at hot mail com> wrote in message
news:3c4dc4ff$1...@news.microsoft.com...

> That's a little bit harsh, IMO. Except for not working reliably when
> threads are suspended, PulseEvent is perfectly reasonable.

Have I said anywhere that PulseEvent() was 'unreasonable', whatever that
might mean? I only said that in that particular design PulseEvent() would
result in a race condition.

> In your scenario, you can use it with a mutex/critical section so that
> listener threads are properly synchronized with the thread that calls
> PulseEvent.

I said there was no simple way to employ PulseEvent() for guaranteed
delivery. I never said it was not possible. Your suggestion to use a mutex
or a critical section does not cut it. There are some intricacies involved,
and whoever attempts to do it without prior exposure to implementing
synchronization mechanisms will likely end up having deadlocks or again race
conditions. Not to mention that brutally serializing everything results in
code that does not scale at all, which may or may not apply to the OP's
case, but is worth mentioning.

> This is very similar to the way monitor broadcast works in Java or
> C#. Recipient of the event makes sure he doesn't miss it by acquiring
> the associated lock before waiting.

Yeah, right. Try that Javaish broadcasting with multiple listeners on at
least a four-way SMP and then ask yourself why the performance sucks so
badly.

S

Pavel Lebedinsky

Jan 23, 2002, 3:52:17 PM

"eran" wrote:

> > This is very similar to the way monitor broadcast works in Java or
> > C#. Recipient of the event makes sure he doesn't miss it by acquiring
> > the associated lock before waiting.
>
> You are just paving the way for deadlock!!! Avoid waiting on an event
> when a lock is acquired unless you really know what you are doing.

Waiting on a monitor (or condition variable in POSIX) atomically
releases the lock.

In win32 you can do it with SignalObjectAndWait (if you use mutex
for locking). Unfortunately, there is no simple way that I know of
to do this with critical sections.
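
As a rough illustration of "waiting atomically releases the lock", here is a small portable sketch using std::condition_variable rather than SignalObjectAndWait. The names Mailbox, send, receive and roundTrip are made up for this example. The receiver holds the lock when it starts waiting, yet the sender can still take the same lock, because wait() gives it up while blocked and reacquires it before returning.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-slot mailbox protected by a lock and a condition variable.
struct Mailbox {
    std::mutex lock;
    std::condition_variable changed;
    bool hasValue = false;
    int value = 0;
};

// Receiver: acquires the lock first, then waits. wait() releases the lock
// atomically with blocking, so holding it here cannot deadlock the sender.
int receive(Mailbox& box) {
    std::unique_lock<std::mutex> guard(box.lock);
    box.changed.wait(guard, [&] { return box.hasValue; });
    box.hasValue = false;
    return box.value;
}

// Sender: takes the same lock to publish the value, then notifies.
void send(Mailbox& box, int v) {
    {
        std::lock_guard<std::mutex> guard(box.lock);
        box.value = v;
        box.hasValue = true;
    }
    box.changed.notify_all();
}

// Runs one receiver thread and sends it a single value.
int roundTrip(int v) {
    Mailbox box;
    int got = 0;
    std::thread receiver([&] { got = receive(box); });
    send(box, v);
    receiver.join();
    return got;
}
```

Because the predicate is checked under the lock, the result does not depend on whether the sender or the receiver runs first.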

Pavel Lebedinsky

Jan 23, 2002, 5:24:13 PM

"Slava M. Usov" wrote:

> Your suggestion to use a mutex or a critical section does not cut it.
> There are some intricacies involved, and whoever attempts to do
> it without prior exposure to implementing synchronization
> mechanisms will likely end up having deadlocks or again
> race conditions.

Let me attempt to do it:

Producer()
{
    WaitForSingleObject (hMutex, INFINITE);
    while (queue.IsFull())
    {
        SignalObjectAndWait (hMutex, hEvent, INFINITE, FALSE);
        WaitForSingleObject (hMutex, INFINITE);
    }

    queue.Push (item);
    ReleaseMutex (hMutex);
    PulseEvent (hEvent);
}

/* Consumer() implementation omitted for brevity */

What's wrong with this code? It uses PulseEvent, it's simple and
as far as I can tell it's correct. It can be easily modified to handle
different predicates.

I know there are better performing solutions. Most of them however
are rather difficult to get right unless you really know what you're
doing.

> Not to mention that brutally serializing everything results in code that
> does not scale at all, which may or may not apply to the OP's
> case, but is worth mentioning.

Why does using locks mean serializing "everything"? Serialize the sections
of code that need to be protected - that's what locks are for.

> Yeah, right. Try that Javaish broadcasting with multiple listeners on at
> least a four-way SMP and then ask yourself why the performance
> sucks so badly.

Of course it's hard to beat completion ports.

With Java/C#/pthreads style synchronization it's sometimes possible
to make things perform somewhat better by using signal instead
of broadcast.

Good implementations can also optimize away lots of unnecessary
context switches. See for example the discussion of "wait morphing"
here:

http://groups.google.com/groups?selm=3A3F59F2.E491ED9E%40compaq.com&output=gplain

With some help from the OS it could be done in Java or CLR,
in which case broadcast would be very efficient even with multiple
waiting threads.

Slava M. Usov

Jan 24, 2002, 12:23:04 PM

"Pavel Lebedinsky" <m_pll at hot mail com> wrote in message
news:3c4f380e$1...@news.microsoft.com...

> Let me attempt to do it:

[...]

My goal was to warn the OP that PulseEvent() has subtleties and that it
normally requires quite a bit of experience to produce robust and efficient
code that uses PulseEvent(). And that the resultant code is unlikely to be
simple. Your example just stresses my point. First, it is inefficient.
Second, even in that form it is too complicated for an uninitiated
person to understand, much less write and debug anything like that.
Especially debug, given that KB article. Besides, as the problem is
associated with SuspendThread(), which is available to applications other
than debuggers, it can easily be regarded as a fundamental robustness
problem.

[...]

> What's wrong with this code?

Lots of things are wrong with this code. But as you've admitted that you
know it is inefficient, let's skip its analysis. I'll only remark that by
using a semaphore with a separately kept count of waiting consumers, it is
possible to remove lots of context switches and iterations in situations
when there are many eager consumers and few producers [what you're doing now
is waking all of the consumers whenever there is a new item in the queue].
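
One possible reading of the semaphore idea, sketched in portable C++ rather than with Win32 semaphores: a consumer that finds the queue empty registers itself in a waiter count and blocks on a semaphore, and a producer releases the semaphore only when that count is non-zero, so each new item wakes at most one consumer. The names Semaphore, WorkQueue and sumThroughQueue are hypothetical, and the small hand-rolled Semaphore is there only to avoid requiring C++20.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore built from C++11 primitives.
class Semaphore {
public:
    void release() {
        { std::lock_guard<std::mutex> g(m_); ++count_; }
        cv_.notify_one();
    }
    void acquire() {
        std::unique_lock<std::mutex> g(m_);
        cv_.wait(g, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};

// Hypothetical unbounded queue: the lock guards the items and the waiter
// count; the semaphore is used only to wake registered waiters.
class WorkQueue {
public:
    void push(int item) {
        bool wakeOne = false;
        {
            std::lock_guard<std::mutex> g(lock_);
            items_.push_back(item);
            if (waiting_ > 0) { --waiting_; wakeOne = true; }
        }
        if (wakeOne) wake_.release();  // wake exactly one consumer
    }
    int pop() {
        for (;;) {
            {
                std::lock_guard<std::mutex> g(lock_);
                if (!items_.empty()) {
                    int item = items_.front();
                    items_.pop_front();
                    return item;
                }
                ++waiting_;  // register before blocking
            }
            wake_.acquire();  // a release sent in the gap is not lost
        }
    }
private:
    std::mutex lock_;
    std::deque<int> items_;
    int waiting_ = 0;
    Semaphore wake_;
};

// One producer pushes 1..(perConsumer * consumers); each consumer thread
// pops perConsumer items. Returns the sum of everything received.
long sumThroughQueue(int perConsumer, int consumers) {
    WorkQueue q;
    std::vector<long> partial(consumers, 0);
    std::vector<std::thread> threads;
    for (int c = 0; c < consumers; ++c)
        threads.emplace_back([&, c] {
            for (int i = 0; i < perConsumer; ++i) partial[c] += q.pop();
        });
    for (int v = 1; v <= perConsumer * consumers; ++v) q.push(v);
    for (auto& t : threads) t.join();
    long total = 0;
    for (long p : partial) total += p;
    return total;
}
```

A woken consumer rechecks the queue, because another consumer that never waited may have taken the item first; in that case it simply registers again.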

> Most of them however are rather difficult to get right unless
> you really know what you're doing.

The semaphore solution is similar in complexity, although, I'm afraid, I
lack the perspective of a novice in these matters.

> Why does using locks mean serializing "everything"? Serialize the sections
> of code that need to be protected - that's what locks are for.

Given your example, it is hard to argue with that. However, in the
signal/broadcasting scenarios it should not be necessary, in theory, to have
the semantics of monitors. More on that below.

> Of course it's hard to beat completion ports.

I've not mentioned those and am not going to.

[...]

> With some help from the OS it could be done in Java or CLR,
> in which case broadcast would be very efficient even with multiple
> waiting threads.

Unfortunately, no. The very semantics of Java monitors disallows that. It
requires that all of the waiters be in a synchronized section of code. While
they are waiting, this is not a problem. When, however, their wait is
satisfied, and they become runnable, they may only leave the synchronized
section one by one. If the JIT were smart enough it might be able to detect
wait/leave-monitor sequence in the byte code and call some
wait_leave_monitor() API, which, provided it is implemented in KM, might be
able to get rid of the context switches. As we're discussing MS Win32 OSes
right now, that's just wishful thinking and the fact remains that Java
monitors will perform sloppily on said OSes. And because monitors are the
only means of doing synchronization in Java, everything else must be built
on top of them. Hence my attitude towards the mix of serializing, Java
monitors, and high-performance concurrent programming [on SMP in
particular].

S