How to reason formally about syncronization in the following case?

Michael Podolsky

unread,

Sep 30, 2010, 2:38:30 PM9/30/10

to

I was in real trouble trying to find a correct subject for what I am
going to ask. Sorry about that, but you will need to read the
following text to understand the actual subject.

Suppose, we have an implementation of Win32 auto event (just as an
example, I am not interested in discussing this object or framework
particularly).

Now, if we clear this event and then wait on it in one thread:

1: ClearEvent(e);
2: WaitForSingleObject(e, infinite);

and after a "long while" signal it from the other thread and wait on
it there:
// another thread!!!
3: Sleep(1 week)
4: SetEvent(e);
5: WaitForSingleObject(e, infinite);

it is "nearly impossible" that wait on line 5 will succeed, while the
wait on line 2 will continue to hang.

Nearly impossible does not mean impossible at all, as thread 1 could
theoretically speaking, be preempted by other threads and not enter
its actual wait.

But, if it is not completely impossible, what if we implement our hand-
made auto event in a way it would be "more possible" or even "very
probable"? Will it break what we beleave a correct implementation of
Auto Event? Frankly, I do not know, but such an "event" would look
very strange for me if it would not wake up other threads, being
waiting for a long time, preferring instead to "eat" wake notification
itself IF it issues wait immediately after SetEvent.

Now my question is - how should we define formally a functionality of
event object in order to make a behaviour described above "nearly
impossible". Again, we can not make it impossible at all because we
can not control scheduler, but can we do at least ANYTHING formal
about definition of such an object to prohibit the behaviour described
above caused at least by wrong implementation? Or we just can not
formally distinguish between a win32 event implementation and another
implementation which enables (even encourages) threads to eat their
own SetEvent, even in case other threads has been waiting for the same
event for one day???

Anthony Williams

unread,

Oct 1, 2010, 3:30:10 AM10/1/10

to

Michael Podolsky <michael.p...@gmail.com> writes:

> I was in real trouble trying to find a correct subject for what I am
> going to ask. Sorry about that, but you will need to read the
> following text to understand the actual subject.
>
> Suppose, we have an implementation of Win32 auto event (just as an
> example, I am not interested in discussing this object or framework
> particularly).
>
> Now, if we clear this event and then wait on it in one thread:
>
> 1: ClearEvent(e);
> 2: WaitForSingleObject(e, infinite);
>
> and after a "long while" signal it from the other thread and wait on
> it there:
> // another thread!!!
> 3: Sleep(1 week)
> 4: SetEvent(e);
> 5: WaitForSingleObject(e, infinite);
>
> it is "nearly impossible" that wait on line 5 will succeed, while the
> wait on line 2 will continue to hang.

I don't think that's true. Consider a single-core machine, which
therefore only runs one thread at a time. The thread that calls SetEvent
could easily continue running into the Wait call before the first thread
gets another chance to run. Once both threads are in their Wait calls,
it is up to the scheduler which to wake. It is a perfectly viable
scheduling choice to say that the SetEvent thread gets to wake if it
still has some of its timeslice left, as its code and data is hot in the
cache, whereas the other thread's code and data might well have been
paged to disk.

> Now my question is - how should we define formally a functionality of
> event object in order to make a behaviour described above "nearly
> impossible". Again, we can not make it impossible at all because we
> can not control scheduler, but can we do at least ANYTHING formal
> about definition of such an object to prohibit the behaviour described
> above caused at least by wrong implementation? Or we just can not
> formally distinguish between a win32 event implementation and another
> implementation which enables (even encourages) threads to eat their
> own SetEvent, even in case other threads has been waiting for the same
> event for one day???

Condition variables work like that: only threads waiting on a condvar
when it is signalled can be woken.

Anthony
--
Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/
just::thread C++0x thread library http://www.stdthread.co.uk
Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

Michael Podolsky

unread,

Oct 1, 2010, 11:16:02 AM10/1/10

to

On Oct 1, 3:30 am, Anthony Williams <anthony....@gmail.com> wrote:

> Michael Podolsky <michael.podolsky...@gmail.com> writes:
> > Suppose, we have an implementation of Win32 auto event (just as an
> > example, I am not interested in discussing this object or framework
> > particularly).
>
> > Now, if we clear this event and then wait on it in one thread:
>
> > 1: ClearEvent(e);
> > 2: WaitForSingleObject(e, infinite);
>
> > and after a "long while" signal it from the other thread and wait on
> > it there:
> > // another thread!!!
> > 3: Sleep(1 week)
> > 4: SetEvent(e);
> > 5: WaitForSingleObject(e, infinite);
>
> > it is "nearly impossible" that wait on line 5 will succeed, while the
> > wait on line 2 will continue to hang.
>
> I don't think that's true. Consider a single-core machine, which
> therefore only runs one thread at a time. The thread that calls SetEvent
> could easily continue running into the Wait call before the first thread
> gets another chance to run. Once both threads are in their Wait calls,
> it is up to the scheduler which to wake. It is a perfectly viable
> scheduling choice to say that the SetEvent thread gets to wake if it
> still has some of its timeslice left, as its code and data is hot in the
> cache, whereas the other thread's code and data might well have been
> paged to disk.

Oh, I was under the impression that Windows's SetEvent immediately
releases a waiting thread, if one exists and enables it to "consume"
the SetEvent notification. Now as I am looking at the documentation, I
see nothing about this idea, so I was wrong in my root post bringing a
Win32 event object as an example.

From the other side, if we look at linux's semaphore specification, we
can read the following:

sem_post() increments (unlocks) the semaphore pointed to by
sem. If
the semaphore's value consequently becomes greater than
zero, then
another process or thread blocked in a sem_wait(3) call will
be woken
up and proceed to lock the semaphore.

The wording "another" and "proceed to lock the semaphore" implies that
for a semaphore with current value zero, a sem_post() can not be
consumed by a following sem_wait() issued by the same thread,
providing there is another thread waiting for the same semaphore. Am I
right with such an implication? I am not sure. May be, again, to
"proceed to lock the semaphore" does not mean to lock it. Ok, let's
suppose theoretically that we have an implementation which really
releases a waiting thread in such a situation, end enables it to exit
successfully from sem_wait and continue its execution. Can we?

Now as I found a proper object to reason about :-) let me recite my
previous question (some corrections are made)

Now my question is - how should we define formally a functionality of

linux's (or our theroretical) semaphore object in order to make a

behaviour described above "nearly
impossible". Again, we can not make it impossible at all because we
can not control scheduler, but can we do at least ANYTHING formal
about definition of such an object to prohibit the behaviour described
above caused at least by wrong implementation? Or we just can not

formally distinguish between a linux semaphore implementation and

another
implementation which enables (even encourages) threads to eat their

own sem_post, even in case other threads has been waiting for the same
semaphore for one day???

> Condition variables work like that: only threads waiting on a condvar
> when it is signalled can be woken.

Yes, but this is another story - we may know exactly that a thread had
started to wait for condvar if we entered a critical section before
signalling. We may base our reasoning on this.
With a semaphore we never know, we may only wait for a couple of
seconds (weeks) to be sure the thread is really (99.99%) blocked on
the semaphore.

Anthony Williams

unread,

Oct 1, 2010, 11:34:24 AM10/1/10

to

Michael Podolsky <michael.p...@gmail.com> writes:

Without additional synchronization, you never know if that other thread
actually blocked on the semaphore. Therefore following a post with a
wait on the same semaphore from the same thread may well eat its own
notification.

What are you trying to achieve? You may be better off with a pair of
semaphores/events, or a condition variable.

Michael Podolsky

unread,

Oct 1, 2010, 1:53:42 PM10/1/10

to

On Oct 1, 11:34 am, Anthony Williams <anthony....@gmail.com> wrote:

> Without additional synchronization, you never know if that other thread
> actually blocked on the semaphore. Therefore following a post with a
> wait on the same semaphore from the same thread may well eat its own
> notification.
>
> What are you trying to achieve? You may be better off with a pair of
> semaphores/events, or a condition variable.
>

My question is not about solving of some particular application
problem, but rather about
implementation of synchronization objects and reasoning about their
correctness.
I'll try to restate my questions and bring a better example (my
original was incorrect,
thanks to pointing me out) and I still feel that I should bring less
vague example
to explain what I mean and need. This will take some time, though.

Thanks, Michael

Balog Pal

unread,

Oct 2, 2010, 5:32:12 AM10/2/10

to

"Michael Podolsky" <michael.p...@gmail.com>

If you google after PulseEvent, there are articles explaining that it is
"broken" as in has the possibility to a legal waiter miss the pulse.

As I imagine the system your SetEvent scenario can be hit by the very same
mechanic, the waiter may be temporarily removed from the queue, thus not
immediately wake/cause task switch right before thread2 exits from line 4.
If that happens, as Anthony described, any of your waits may be the
successful one. Even on multicore.

Balog Pal

unread,

Oct 2, 2010, 5:55:55 AM10/2/10

to

"Michael Podolsky" <michael.p...@gmail.com>

>> Without additional synchronization, you never know if that other thread
>> actually blocked on the semaphore. Therefore following a post with a
>> wait on the same semaphore from the same thread may well eat its own
>> notification.
>
>> What are you trying to achieve? You may be better off with a pair of
>> semaphores/events, or a condition variable.

>My question is not about solving of some particular application
>problem, but rather about
>implementation of synchronization objects and reasoning about their
>correctness.

Is there such thing as 'correctness' without context? IMO you shall have a
use case (an goat to achieve), some candidate solution, and only then can
you start evaluating whether the solution actually match the problem.

The straight way to make sure some thread passed some point is, as ponted
above to use a pair of events/semaphores. As I recall MFC plays that very
game launching CWinThread.... I guess the only other approach would be to
query the scheduler's database. I almost said it is hard, but actually
debuggers report that very information.

Michael Podolsky

unread,

Oct 2, 2010, 9:07:54 AM10/2/10

to

On Oct 2, 5:32 am, "Balog Pal" <p...@lib.hu> wrote:
>
> If you google after PulseEvent, there are articles explaining that it is
> "broken" as in has the possibility to a legal waiter miss the pulse.
>
> As I imagine the system your SetEvent scenario can be hit by the very same
> mechanic, the waiter may be temporarily removed from the queue, thus not
> immediately wake/cause task switch right before thread2 exits from line 4.
> If that happens, as Anthony described, any of your waits may be the

> successful one. Even on multicore.- Hide quoted text -
>
> - Show quoted text -

I agree, my scenario could be affected by this bug as well, but I'd
rather consider the scenario without interference of OS malfunction.
After all, I don't need in context of this topic to write some correct
code running on particular OS. It's more theoretical. I'd rather
ignore OS bugs.