I see many examples of using Event objects (Win32, CreateEvent) to achieve
multithreaded synchronization.
As I am a beginner in this topic, could anyone give me some guidelines or a
procedure for using Event objects?
Many Thanks,
Rico.
Forget (or just ignore) MS-event silliness and learn pthreads.
Google can help.
regards,
alexander.
P.S. Okay,
http://www.boost.org/libs/thread/doc/faq.html#question7
(7. Why supply condition variables rather than event variables?)
http://article.gmane.org/gmane.comp.lib.boost.devel/20840
(Re: no semaphores in boost::thread)
"Rico" <tbp...@owdot.com> wrote in message
news:bikdo2$2h2e$1...@justice.itsc.cuhk.edu.hk...
> It is not the right place to ask this question: most of the people here are
> ignorant about anything but POSIX threads.
> Regards,
> Michael
Sorry, but I must protest here. Many people here prefer pthreads to MS
threads precisely because they know both of them.
Many of the people here have participated in building the pthreads port for
Windows, so they have studied MS threading for years.
I know the other great people around here, like Alexander Terekhov or David
Butenhof (just to name two), won't lose time replying to your mail, so I am
doing it not to defend them, or even me, but to drag you out of your
evident single-minded view (I don't mean to be offensive here, sorry if
this term sounds offensive to you; just remember I am not a native English
speaker, and that I can misuse terms like that).
If you see that 90% of the messages here are about pthreads, it is just
because 90% of the people here know enough about threads to choose pthreads.
Remember that pthreads itself is NOT a "primitive" library, but is wrapped
around lower-level, OS-dependent libraries, so people here could talk about
pthreads, LinuxThreads, Sun threads, and MS threads. If we talk mainly
about pthreads, and talk about the low-level thread interfaces usually only
where they are "peculiar" with respect to the POSIX standard, it is just
because we know threads well. And we know how to choose.
So, study more about threading, and you'll find out why event objects are
useful only in certain specific situations, which make up less than 5% of
real-life programs, while condition variables are general enough to cope
with every kind of synchronization. Some other algorithms (e.g. lock-free
lists) can be more efficient in particular circumstances; there isn't a BEST
thing here; even events can be better in very particular situations. The
problem is that MS provided ONLY that primitive, and that is clearly a
design flaw.
Or, if you want, I can explain it here; since you are writing to this group,
you are not supposed to be an all-knowing person. We are all users, there
is no shame in learning.
Bests,
Giancarlo Niccolai.
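For readers new to the topic, the "wait for a predicate" pattern that condition variables support can be sketched as follows. This is only an illustration, assuming C++11's std::condition_variable (which follows the same model as pthread_cond_wait) instead of the raw pthreads calls; WorkQueue and run_work_queue_demo are made-up names, not anything from the posts above.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical example type: consumers wait for the predicate
// "the queue is not empty".
class WorkQueue {
public:
    void push(int item) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            items_.push(item);        // change the shared state under the lock
        }
        not_empty_.notify_one();      // then wake one waiter, if there is one
    }

    int pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait() atomically releases the mutex and sleeps, and re-acquires
        // the mutex before returning. The loop re-checks the predicate, so
        // spurious or stale wakeups are harmless.
        while (items_.empty()) {
            not_empty_.wait(lock);
        }
        int item = items_.front();
        items_.pop();
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<int> items_;
};

// Runs one producer and one consumer; returns the sum the consumer saw.
int run_work_queue_demo() {
    WorkQueue queue;
    int sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < 3; ++i) sum += queue.pop();
    });
    for (int i = 1; i <= 3; ++i) queue.push(i);
    consumer.join();                  // join makes `sum` safe to read here
    assert(sum == 6);
    return sum;
}
```

The state lives in the queue itself, not in the condition variable, so it does not matter whether push() runs before or after the consumer starts waiting.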
Look at the following Windows functions:
NtCreateEventPair( ... )
NtSetHighEventPair( ... )
NtSetLowEventPair( ... )
NtSetHighWaitLowEventPair( ... )
NtSetLowWaitHighEventPair( ... )
NtWaitHighEventPair( ... )
NtWaitLowEventPair( ... )
NtQueryEvent( ... )
They are actually useful!
;)
However, they are not exposed as part of the API. Which is a design flaw.
;(
--
The designer of the experimental, SMP and HyperThread friendly, AppCore
library.
http://AppCore.home.comcast.net
SenderX wrote:
>
> > The problem
> > is that MS provided ONLY that primitive, and that is clearly a design
> > flaw.
>
> Look at the following Windows functions:
>
> NtCreateEventPair( ... )
>
> NtSetHighEventPair( ... )
>
> NtSetLowEventPair( ... )
>
> NtSetHighWaitLowEventPair( ... )
>
> NtSetLowWaitHighEventPair( ... )
>
> NtWaitHighEventPair( ... )
>
> NtWaitLowEventPair( ... )
>
> NtQueryEvent( ... )
>
> They are actually useful!
>
> ;)
>
> However, they are not exposed as part of the API. Which is a design flaw.
>
There does not appear to be any documentation for them, even unofficial. They
look like SignalObjectAndWait().
Joe Seigh
I see you guys use the state/no-state argument a lot in coming to the
conclusion that condition variables are less error prone than events.
My problem with this argument is that in order to implement a condition
variable wait function you have to put the calling thread to sleep. This
involves transitioning into the kernel. You always get a race between the
sleeping thread and the thread that will wake it up (unless of course you
always transition into the kernel with the mutex held etc).
You fix this race by having state in user mode that the kernel twiddles, or
you have it in the kernel with something like a pending wake of a thread. So
in general it looks to me that you always have this state in there somewhere.
Now if you're saying that condition variables have this state so you don't
have to have it anywhere else in your multi-threaded programs, that makes
more sense. To say that having state is bad seems a little sweeping to me.
Neill.
--
This posting is provided "AS IS" with no warranties, and confers no rights.
"Alexander Terekhov" <tere...@web.de> wrote in message
news:3F4DD5E0...@web.de...
Neill Clift wrote:
>
> I see you guys use the state/no-state argument a lot in coming to the
> conclusion that condition variables are less error prone than events.
>
> My problem with this argument is that in order to implement a condition
> variable wait function you have to put the calling thread to sleep. This
> involves transitioning into the kernel. You always get a race between the
> sleeping thread and the thread that will wake it up (unless of course you
> always transition into the kernel with the mutex held etc).
> You fix this race by having state in user mode that the kernel twiddles, or
> you have it in the kernel with something like a pending wake of a thread. So
> in general it looks to me that you always have this state in there somewhere.
>
> Now if you're saying that condition variables have this state so you don't
> have to have it anywhere else in your multi-threaded programs, that makes
> more sense. To say that having state is bad seems a little sweeping to me.
> Neill.
The "state" refers to the fact that condvars don't remember whether they have
been signaled, not to some internal state which is implementation dependent.
"Less error prone", I think, refers to their common usage, as opposed to using
events and trying to coordinate their states with that of the data without
running into race conditions. Events are perfectly safe as long as you know
what is safe and what is not safe usage for them.
As for implementation, yes, condvars have state. All synchronization objects
have internal state.
Joe Seigh
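The point that a condvar does not remember a signal can be illustrated with a small sketch. It assumes C++11's std::condition_variable as a stand-in for a pthreads condvar, and the function name is invented for the example: a notification sent while nobody waits is simply dropped, and the late waiter proceeds only because the predicate (ordinary data guarded by the mutex) carries the state.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Returns true if a waiter that arrives AFTER the notification still
// proceeds, because it checks the predicate before blocking.
bool late_waiter_sees_predicate() {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;               // the real state lives here

    {
        std::lock_guard<std::mutex> guard(m);
        ready = true;
    }
    cv.notify_all();                  // nobody is waiting: nothing is recorded

    std::unique_lock<std::mutex> lock(m);
    assert(lock.owns_lock());
    // The predicate overload tests `ready` first and only blocks while it is
    // false, so the dropped notification above does no harm. The timeout is
    // a safety net for the demo, not part of the pattern.
    return cv.wait_for(lock, std::chrono::milliseconds(100),
                       [&] { return ready; });
}
```

Had the waiter called a plain wait() without consulting `ready`, it would have slept until some later notification, which is the kind of coordination problem with event state that the post describes.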
Yes they are.
For instance, you can implement a 100% safe condvar with them.
You can signal/broadcast to the threads waiting on the condvar in virtually
any order you want, FIFO, LIFO whatever...
The trick is to use a list of events. You simply cannot use a single event
to wake threads in a condvar. You can use a semaphore, but you can't release
the threads in any particular order...
Each item in the list can only be touched by 2 threads: the thread that
issued the wait, and the thread that issued the signal. List manipulation is
protected by the condition's critical section.
The waiting thread's cancellation varies between the wait issuer and the
signaler. This is due to a 100% harmless race-condition, as I signal waiting
threads outside the condition's critical section. The signaler would pop the
waiting threads from the list inside the critical section, and signal the
popped waiters outside the section.
I will post some C++ code that will show the working algorithm. It is not
lock-free, and there are no fast-paths. But it works, with Events! The event
handles can easily be reused as well...
;)
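Until that code is posted, the list-of-events idea might look roughly like the sketch below. It is an assumption-laden illustration, not the poster's implementation: AutoResetEvent is a portable stand-in for a Win32 auto-reset event (real code would use CreateEvent, SetEvent and WaitForSingleObject), and EventListCondVar and run_event_list_demo are invented names.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for a Win32 auto-reset event. Unlike a pulse, set() is
// remembered until one wait() consumes it.
class AutoResetEvent {
public:
    void set() {
        std::lock_guard<std::mutex> guard(mutex_);
        signaled_ = true;
        cv_.notify_one();  // notify while locked: the waiter cannot return
                           // and destroy this object in the middle of set()
    }
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return signaled_; });
        signaled_ = false;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Condition variable built on a list of per-waiter events.
class EventListCondVar {
public:
    void wait(std::unique_lock<std::mutex>& lock) {
        assert(lock.owns_lock());
        AutoResetEvent my_event;            // touched by waiter and signaler only
        {
            std::lock_guard<std::mutex> guard(list_mutex_);
            waiters_.push_back(&my_event);  // enqueue before releasing `lock`
        }
        lock.unlock();
        my_event.wait();                    // an early set() is not lost
        lock.lock();
    }
    void signal() {
        AutoResetEvent* waiter = nullptr;
        {
            std::lock_guard<std::mutex> guard(list_mutex_);
            if (waiters_.empty()) return;
            waiter = waiters_.front();      // FIFO; back()/pop_back() gives LIFO
            waiters_.pop_front();
        }
        waiter->set();                      // signal outside the list lock
    }
    void broadcast() {
        std::deque<AutoResetEvent*> popped;
        {
            std::lock_guard<std::mutex> guard(list_mutex_);
            popped.swap(waiters_);
        }
        for (AutoResetEvent* waiter : popped) waiter->set();
    }
private:
    std::mutex list_mutex_;                 // "the condition's critical section"
    std::deque<AutoResetEvent*> waiters_;
};

// Three threads wait for `ready`; returns how many of them got through.
int run_event_list_demo() {
    std::mutex m;
    EventListCondVar cv;
    bool ready = false;
    int woken = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < 3; ++i) {
        threads.emplace_back([&] {
            std::unique_lock<std::mutex> lock(m);
            while (!ready) cv.wait(lock);
            ++woken;
        });
    }
    { std::lock_guard<std::mutex> guard(m); ready = true; }
    cv.broadcast();
    for (auto& t : threads) t.join();
    return woken;
}
```

Because a waiter enqueues its event while still holding the external mutex, a signaler that changed the predicate under that mutex always finds it on the list; the wake order is whatever order the signaler pops in.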
P.S.
This shows how a thread-safe list of event objects can be used to simulate a
semaphore that can release threads in an order provided by the programmer.
http://www.google.com/groups?selm=3CEA2764.3957A2C8%40web.de
(see "explicitly managed queue approach")
> You simply cannot use a single event to wake threads in a condvar.
It can be done... and the price is "a chained wakeup" (that's
in addition to "gating").
http://www.google.com/groups?selm=3AE92633.3243E30B%40web.de
(see "Algorithm 8c")
To Joe: "gating" is really needed. Two waitsets might not be
enough, I'm afraid. And, BTW, any "garbage collector" does
hurt performance/concurrence... life without garbage is much
better. ;-)
regards,
alexander.
Humm.
>The problem
>is that MS provided ONLY that primitive, and that is clearly a design flaw.
- Event
- Mutex
- Semaphore
- Critical Section
- Waitable timer
- Completion ports
- Interlocked operations
- Timer queues
- SLists
And probably others that I can't remember right now.
Ziv
Alexander Terekhov wrote:
>
> http://www.google.com/groups?selm=3AE92633.3243E30B%40web.de
> (see "Algorithm 8c")
>
> To Joe: "gating" is really needed. Two waitsets might not be
> enough, I'm afraid. And, BTW, any "garbage collector" does
> hurt performance/concurrence... life without garbage is much
> better. ;-)
>
You don't need to gate waiters, just the signalers. If the
signaler is holding the external mutex, then blocking the
signaler will block subsequent waiters ensuring proper
waitset membership. Two waitsets are sufficient. You could
use more if you wanted I suppose.
GC w/ recycling does not hurt performance. Reference counting
is just counting and you are doing a lot more counting than a
simple GC scheme would.
Joe Seigh
I know.
I was talking explicitly about Event synchronization. Semaphores are similar
but differ in the way they can be used; as for the rest (mutexes etc.), none
of them is what the condition variable is for pthreads.
I meant that they did not provide a signalable object like pthreads does;
pulse or manual reset is the only paradigm they provide for
wait-for-predicate constructs. Sorry for not being clear.
Giancarlo.
I don't think so.
> If the signaler is holding the external mutex,
But the signaler doesn't need to hold the external mutex.
Actually, it is a not-so-good idea to have the signaler
holding the external mutex, unless "predictable scheduling"
(realtime stuff) is deadly needed, of course.
> then blocking the signaler will block subsequent waiters
> ensuring proper waitset membership.
Blocking the signaler aside, that's what "gating" is all
about. You don't allow any new waiters to mess with the old
ones until the last signaled/broadcasted one has passed
through (moved off waiting queue).
> Two waitsets are sufficient. You could use more if you wanted
> I suppose.
Yeah. A waitset per waiter is sufficient. That's true. ;-)
>
> GC w/ recycling does not hurt performance.
http://www.cs.umd.edu/~pugh/java/memoryModel/archive/1234.html
> Reference counting is just counting and you are doing a lot
> more counting than a simple GC scheme would.
Sure. And recycling itself isn't free either.
regards,
alexander.
But no 'unlock and wait'. The problem is more that a vital primitive is
missing.
DS
M$ should have a condvar API... No doubt.
However, you can create stable condvars in windows...
If you use a lock-based or lock-free pool of event handles, and share that
pool with multi-condvars, you can dynamically setup a specific broadcast
and/or signal order per-condvar, and reuse event handles globally. The
number of events created is proportional to the number of concurrent waiting
threads, which should not be a lot. Tons of threads is a design flaw
anyway... ;)
Could be useful.
SignalObjectAndWait()
Joe Seigh
That thing isn't really useful. In MS world, threads can be suspended
and suspended threads don't really "wait"... so that they can miss a
"stateless" signal (MS-pulse thing). I don't see how that thing can be
useful for waiting on semas (or whatnot with "a signaled" state). A
vital primitive is missing indeed... brain-dead MS "kernel objects for
everything" aside for a moment.
regards,
alexander.
Suspended by whom? The OS or the application?
Kernel objects aren't a problem if they can be fast pathed (futexed).
Joe Seigh
A debugger, for example.
>
> Kernel objects aren't a problem if they can be fast pathed (futexed).
Yeah. Apart from "access right" validation. I mean their "security
information for a securable object" silliness for mutexes, events/
semas, and whatnot.
regards,
alexander.
This is useful, Alexander, for removing the need to gate the event to avoid
wakeup starvation. It is not necessary, but with it the algorithms for
implementing condition variables are simpler (and thus more efficient) by an
order of magnitude.
The problem is that, not being available on Windows 98 and lower, it must
not be relied upon, or at least there must be two versions of a library
using it.
Bests,
Giancarlo.
[... SignalObjectAndWait and Win32 condvars ...]
> The problem is that not being available on 98 and lower...
This is not the only problem.
http://groups.google.com/groups?selm=3AEAC433.1595FF4%40web.de
(Subject: Re: A theoretical question on synchronization)
"This means that if a thread is waiting on an auto-reset event
but is suspended when the PulseEvent occurs, it will miss the
notification because it was not really waiting on the event.
Debuggers suspend and resume threads all the time. This means
that single stepping a program while a signal occurs can very
easily cause any number of wake ups to be missed and deadlock
to occur. Because of this problem, this implementation is
basically unusable."
You'll have the same problem with 'waiting' on counting semas
(MS "auto-reset" event is nothing but a binary one; it's also
a sema... but without totally brain-dead max checking, I mean).
regards,
alexander.
Giancarlo Niccolai wrote:
[... SignalObjectAndWait and Win32 condvars ...]
> This is useful, Alexander, for removing the need to gate the event to avoid
> wakeup starving.
Pulsing aside, this would be true IFF a waiter would really
consume a signal if/when suspended while waiting on a sema.
I wouldn't bet even a penny on that. MS land is brain-dead.
regards,
alexander.
Yes; this function solves that problem with fewer instructions than the
ones you provided in that algorithm, but it can hardly be relied on:
1) because it is not portable across Windows systems.
2) because who knows how it is implemented internally? You are right, I
wouldn't trust it until I see the source code. It may pretend to be atomic,
and 2.1) even be atomic, but it could still lose some signal or starve
some wakeup.
Giancarlo.
Alexander Terekhov wrote:
>
> Joe Seigh wrote:
> [...]
> > Suspended by whom? The OS or the application?
>
> A debugger, for example.
So debuggers are broken for win32 threads. This isn't specific
to implementing condvars on windows (which in turn wasn't the OP)
so what's the problem?
Joe Seigh
And things like garbage collectors (they also do suspend threads,
AFAIK) are also broken for win32 threads. Apart from that problem
everything is just fine in Redmond. Right? Seriously, the problem
is that Win32 implementation should work in the environment with
arbitrary thread suspension that "kicks out" threads from waiting
queues (and puts them back once they got resumed). CV-"gating" is
really needed under brain-damaged Win32 (implementations with the
explicitly managed queue ala LinuxThreads aside for a moment).
regards,
alexander.
Alexander Terekhov wrote:
>
> > So debuggers are broken for win32 threads. This isn't specific
> > to implementing condvars on windows (which in turn wasn't the OP)
> > so what's the problem?
>
> And things like garbage collectors (they also do suspend threads,
> AFAIK) are also broken for win32 threads. Apart from that problem
> everything is just fine in Redmond. Right? Seriously, the problem
> is that Win32 implementation should work in the environment with
> arbitrary thread suspension that "kicks out" threads from waiting
> queues (and puts them back once they got resumed). CV-"gating" is
> really needed under brain-damaged Win32 (implementations with the
> explicitly managed queue ala LinuxThreads aside for a moment).
>
GCs don't suspend threads that way, since it wouldn't tell them what
references were in the thread's internal register state.
Can't really discuss the suspension problem since I don't know whether
it's a bug or "working as designed". Either way, it doesn't matter
because even if you had CVs that worked (and I don't know what you
mean by "gating" in this instance) everything else would be "broken".
Joe Seigh
"gating" means basically that we have a single "queue" sema AND
http://groups.google.com/groups?selm=3AEAC433.1595FF4%40web.de
(Subject: Re: A theoretical question on synchronization)
<quote>
A thread may not begin a wait on the semaphore until it is
guaranteed not to steal a signal.
</quote>
> everything else would be "broken".
Not quite. PulseEvent is definitely broken (and even MS does kinda
acknowledge*** that). As for SignalObjectAndWait, well as I said,
that thing is pretty much useless... but NOT broken, of course. ;-)
regards,
alexander.
<quote>
This function is unreliable and should not be used. It exists
mainly for backward compatibility. For more information, see
Remarks.
....
A thread waiting on a synchronization object can be momentarily
removed from the wait state by a kernel-mode APC, and then
returned to the wait state after the APC is complete. If the
call to PulseEvent occurs while the thread is returned to the
wait state, it will not be released because this function
releases only the threads that are waiting at the moment it is
called. Therefore, PulseEvent is unreliable and should not be
used by new applications.
</quote>
AFAIK, thread suspension is done by a special kernel-mode APC, but
MS might use it for other things as well, I guess. Got it now? ;-)
Alexander Terekhov wrote:
>
> Joe Seigh wrote:
> [...]
> > Can't really discuss the suspension problem since I don't know whether
> > it's a bug or "working as designed". Either way, it doesn't matter
> > because even if you had CVs that worked (and I don't know what you
> > mean by "gating" in this instance)
>
> "gating" means basically that we have a single "queue" sema AND
>
> http://groups.google.com/groups?selm=3AEAC433.1595FF4%40web.de
> (Subject: Re: A theoretical question on synchronization)
>
> <quote>
>
> A thread may not begin a wait on the semaphore until it is
> guaranteed not to steal a signal.
>
> </quote>
You posted that URL before and I thought I knew what you meant by
"gate" at one point. I'm not going to make repeated attempts. I
know what the issues are with using win32 events or semaphores in
implementing condvars. Basically the issues are with how you share
events and semaphores.
>
> > everything else would be "broken".
>
> Not quite. PulseEvent is definitely broken (and even MS does kinda
> acknowledge*** that). As for SignalObjectAndWait, well as I said,
> that thing is pretty much useless... but NOT broken, of course. ;-)
>
Ok. That's a somewhat recent change to the docs. It's not in my older
win2k docs.
Joe Seigh
Microsoft rightfully believes it has the absolute right to innovate. ;-)
regards,
alexander.
To be clear about what is going on here I'll describe it.
Inside the NT kernel we have a few software interrupt levels
that code can use and still block threads in. These are a couple
of flavors of what we call APCs (Asynchronous Procedure Calls).
If you wait on an object like an event you enter the kernel and get
added to a wait list for the event (if it's not set). You get removed
from this list if the event wait is satisfied or if the thread is the target
of APC delivery. When we deliver an APC the thread is removed from the wait
list, runs the APC code and then reenters the wait when it finishes.
APCs are used for thread suspension, as was previously mentioned, but also
for more common things like I/O completion (if you queued async I/O
from this thread earlier). File/registry change notification and debug
functions for getting thread context also use these.
As a consequence of this, things like PulseEvent could miss threads
that are waiting but happened to be running an APC. Of course
you could also miss threads if they haven't run as far as waiting yet.
This rewaiting also causes a reordering of the wait lists, and rewaiting
threads are reentered into the lists almost certainly in a different place.
--
This posting is provided "AS IS" with no warranties, and confers no rights.
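The missed-pulse scenario described above can be replayed with a toy, single-threaded model. This is purely illustrative and is not how the kernel is implemented: PulseModel and its members are hypothetical names, and re-entering the wait at the back of the list is an assumption made only to show the reordering.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model of an event's wait list and a stateless pulse.
struct PulseModel {
    std::vector<int> wait_list;   // ids of threads currently on the wait list
    std::vector<int> released;    // ids of threads released by a pulse

    void begin_wait(int tid) { wait_list.push_back(tid); }

    // APC delivery: the thread leaves the wait list to run the APC code.
    void deliver_apc(int tid) {
        wait_list.erase(std::remove(wait_list.begin(), wait_list.end(), tid),
                        wait_list.end());
    }

    // APC finished: the thread re-enters the wait (assumed: at the back).
    void finish_apc(int tid) { wait_list.push_back(tid); }

    // Pulse: release exactly the threads on the list right now and keep
    // no record of having been signaled.
    void pulse() {
        released.insert(released.end(), wait_list.begin(), wait_list.end());
        wait_list.clear();
    }
};

// Two threads wait; thread 2 is running an APC when the pulse happens.
// Returns how many threads the pulse released.
int released_when_apc_races_pulse() {
    PulseModel event;
    event.begin_wait(1);
    event.begin_wait(2);
    event.deliver_apc(2);   // thread 2 is momentarily off the wait list
    event.pulse();          // releases thread 1 only
    event.finish_apc(2);    // thread 2 waits again, but the pulse is gone
    assert(event.wait_list.size() == 1 && event.wait_list[0] == 2);
    return static_cast<int>(event.released.size());
}
```

In the model, thread 2 stays blocked although it was "waiting" from the application's point of view, which is the lost wakeup the posts attribute to PulseEvent.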
"Joe Seigh" <jsei...@xemaps.com> wrote in message
news:3F573C75...@xemaps.com...
Neill Clift wrote:
>
> To be clear about what is going on here I'll describe it.
(snip)
> As a consequence of this things like PulseEvent could miss threads
> that are waiting but happened to be running an APC. Of course
> you could also miss threads if they haven't run as far as waiting yet.
> This rewaiting also causes a reordering of the wait lists and rewaiting
> threads are reentered into the lists almost certainly in a different place.
>
It would be easy to fix PulseEvent so that it would work but then the
rest of us wouldn't have so much fun trying to create win32 condvars.
Joe Seigh
Bah, finally someone from Microsoft "core". ("AS IS" with no warranties ;-) )
http://google.com/groups?selm=c29b5e33.0201240132.3d78369f%40posting.google.com
(Subject: Re: Suspension of Linux threads. )
BTW, another Ruediger's piece highly deserving to be moved to MSDN
Archive*** (with added "cautions" in red) is this:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndllpro/html/msdn_osport.asp
(Emulating Operating System Synchronization in Win32 Applications)
regards,
alexander.
***) http://groups.google.com/groups?selm=3CF260F0.45381909%40web.de
(Subject: Re: The implementation of condition variables in pthreads-win32)
P.S. Hey, what are you going to do with SCO's "UNIX license" that you
bought recently? I have an idea.
http://groups.google.com/groups?selm=3EDC72F9.4847B371%40web.de
(Subject: Re: Upcoming ISO/IEC <thread>... and <pthread.h> -> <cthread> transition)
P.P.S.
http://www.microsoft.com/windows/sfu/productinfo/news/awards/linuxworld.asp
--
http://www.ibm.com/servers/eserver/linux/fun
http://www.ibm.com/e-business/doc/content/lp/prodigy.html
http://www.ibm.com/e-business/doc/content/ondemand/prodigy_transcript.html
Alexander Terekhov wrote:
>
> Neill Clift wrote: [... APCs are used for thread suspension... ]
>
> Bah, finally someone from Microsoft "core". ("AS IS" with no warranties ;-) )
>
(snip, snip, snip,...)
Well, I guess we won't see him again. What was that all about?
Joe Seigh
Maybe, or maybe not.
> What was that all about?
Follow the links.
regards,
alexander.
So there is no doubt. I have looked at condition variables. I wrote a few
simple programs to get the ideas straight and it seems like a clean
way to do synchronization. This is a model we will look at supporting
in the future.
As to PulseEvent, I think its usefulness is limited. I have seen very few
uses that made sense and didn't have subtle race conditions. This is why we
have notes to this effect in MSDN.
Neill.
"Alexander Terekhov" <tere...@web.de> wrote in message
news:3F5C78D2...@web.de...
That's my permanent state. Well, "half serious".
>
> So there is no doubt. I have looked at condition variables. I wrote a few
> simple programs to get the ideas straight and it seems like a clean
> way to do synchronization. This is a model we will look at supporting
> in the future.
Interesting. Why not the entire pthread? In any event, please don't
copy .Net's "pulsing monitors".
>
> As to PulseEvent I think its usefulness is limited.
Agreed.
> I have seen very few uses that made sense and didn't have subtle
> race conditions. This is why we have notes to this effect in MSDN.
Except that PulseEvent docu says nothing about race conditions (on
the application part).
regards,
alexander.
If you could explain what pthreads brings to the table then that would be
helpful. Reading the original Hoare paper and lots of stuff after this with
the concepts of spurious wakeups etc., it looks relatively self-contained.
My only worry is that condition variables are tied to specific exclusive
locks, as they are passed in as parameters. Some kind of general
scheme with callbacks might be in order, but that's not a simple
interface to program to. Presumably it's also possible to split it
down into queue waiter/release lock/sleep. That is also not easy to code
to, but easy to implement pthreads on top of.
I am unfamiliar with "pulsing monitors", so if you could explain that
then that would be great.
> >
> > As to PulseEvent I think its usefulness is limited.
>
> Agreed.
>
> > I have seen very few uses that made sense and didn't have subtle
> > race conditions. This is why we have notes to this effect in MSDN.
>
> Except that PulseEvent docu says nothing about race conditions (on
> the application part).
>
MSDN says this on PulseEvent:
A thread waiting on a synchronization object can be momentarily removed from
the wait state by a kernel-mode APC, and then returned to the wait state
after the APC is complete. If the call to PulseEvent occurs while the thread
is returned to the wait state, it will not be released because this function
releases only the threads that are waiting at the moment it is called.
Therefore, PulseEvent is unreliable and should not be used by new
applications.
I think that is pretty clear.
Thanks.
Pthread mostly works well, so to speak.
> Reading the original Hoare paper and lots of stuff after this with
> the concepts of spurious wakeups etc., it looks relatively self-contained.
> My only worry is that condition variables are tied to specific exclusive
> locks, as they are passed in as parameters. Some kind of general
> scheme with callbacks might be in order, but that's not a simple
> interface to program to. Presumably it's also possible to split it
> down into queue waiter/release lock/sleep. That is also not easy to code
> to, but easy to implement pthreads on top of.
>
> I am unfamiliar with "pulsing monitors", so if you could explain that
> then that would be great.
Can you see a race condition here?
>
> > >
> > > As to PulseEvent I think its usefulness is limited.
> >
> > Agreed.
> >
> > > I have seen very few uses that made sense and didn't have subtle
> > > race conditions. This is why we have notes to this effect in MSDN.
> >
> > Except that PulseEvent docu says nothing about race conditions (on
> > the application part).
> >
>
> MSDN says this on PulseEvent:
>
> A thread waiting on a synchronization object can be momentarily removed from
> the wait state by a kernel-mode APC, and then returned to the wait state
> after the APC is complete. If the call to PulseEvent occurs while the thread
> is returned to the wait state, it will not be released because this function
> releases only the threads that are waiting at the moment it is called.
> Therefore, PulseEvent is unreliable and should not be used by new
> applications.
> I think that is pretty clear.
It's pretty clear from this text that SignalObjectAndWait is pretty
useless. It says nothing about race conditions (on the application
part; other than missing signals due to not waiting until kernel-
mode APC is complete) with respect to PulseEvent, though. IOW, race
conditions surely make it unreliable (apart from coupling it with
NOT useless SignalObjectAndWait) even without your kernel-mode APC
"kick out". Right?
regards,
alexander.
Neill Clift wrote:
>
> If you could explain what pthreads brings to the table then that would be
> helpful. Reading the original Hoare paper and lots of stuff after this with
> the concepts of spurious wakeups etc., it looks relatively self-contained.
> My only worry is that condition variables are tied to specific exclusive
> locks, as they are passed in as parameters. Some kind of general
> scheme with callbacks might be in order, but that's not a simple
> interface to program to. Presumably it's also possible to split it
> down into queue waiter/release lock/sleep. That is also not easy to code
> to, but easy to implement pthreads on top of.
I don't think you necessarily need to drag in all of pthreads just to get
condition variables. As I mentioned before there is probably a simple
way to effect condition variables in windows and do it in a way that is
consistent with the current windows threading api. Are you guys serious
about this?
Joe Seigh
I am unfamiliar with this code so keep that in mind. Doesn't it depend on
the implementation? For example, if the Wait does an atomic lock release and
wait with respect to a Pulse done by a thread that holds the lock, then it
looks very much like a condition variable. Holding the lock around Pulse
doesn't add anything, as the lock ownership doesn't look like it's passed,
but they require this.
Maybe you're asking that I analyse the example code for race conditions?
Without knowing more about this area it's hard to say anything more.
This pulse doesn't have to have the issues that PulseEvent has, which is
what you seem to be implying.
I am not sure that SignalObjectAndWait is useless. I haven't seen it used
where I thought it necessary. I have seen it used in cases where the caller
wants to limit context swaps (wake a guy who we swap to, only to come back to
swap again for the wait). I have heard people refer to race conditions
this API closes but don't know specifics. I wouldn't be surprised if there
was something like this, as the waits etc. have side effects. I can look into
this if you're interested. Sure, if you're trying to build a condition
variable with it you probably don't gain anything.
I forgot to mention that binding is dynamic and you can rebind, if you
want.
<quote source=DRB-TC2>
When a thread waits on a condition variable, having specified a
particular mutex to either the pthread_cond_timedwait() or
pthread_cond_wait() operation, a dynamic binding is formed between
that mutex and condition variable that remains in effect as long as
at least one thread is blocked on the condition variable. During
this time, the effect of an attempt by any thread to wait on that
condition variable using a different mutex is undefined. Once all
waiting threads have been unblocked (as by the
pthread_cond_broadcast() operation), the next wait operation on that
condition variable shall form a new dynamic binding with the mutex
specified by that wait operation. Even though the dynamic binding
between condition variable and mutex may be removed or replaced
between the time a thread is unblocked from a wait on the condition
variable and the time that it returns to the caller or begins
cancellation cleanup, the unblocked thread shall always re-acquire
the mutex specified in the condition wait operation call from which
it is returning.
</quote>
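A minimal sketch of such rebinding follows. It is written against C++11's std::condition_variable, whose wait() carries a similar precondition (all threads waiting at the same time must use the same mutex), so treat it as an analogy to the POSIX text and not as a pthreads example; both function names are invented.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Blocks one thread on `cv` using mutex `m`, releases it, and returns once
// that thread has finished, so nobody is left waiting on `cv`.
void wait_and_release(std::condition_variable& cv, std::mutex& m) {
    bool go = false;
    std::thread waiter([&] {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return go; });  // cv is tied to `m` while blocked
    });
    {
        std::lock_guard<std::mutex> guard(m);
        go = true;
    }
    cv.notify_all();
    waiter.join();
}

// Uses the same condition variable with two different mutexes, one after
// the other. Returns the number of completed phases.
int run_rebinding_demo() {
    std::condition_variable cv;
    std::mutex first;
    std::mutex second;
    int phases = 0;
    wait_and_release(cv, first);    // waiters (if any) all use `first`
    ++phases;
    wait_and_release(cv, second);   // no waiters remain: a new mutex is fine
    ++phases;
    assert(phases == 2);
    return phases;
}
```

What stays undefined is two threads blocked on the same condition variable at the same time with different mutexes.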
> > scheme with callbacks might be in order but thats not a simple
> > interface to program to. Presumably its also possible to split it
> > down on the queue waiter/release lock/sleep. also not easy to code
> > to but easy to implement pthreads on top of.
>
> I don't think you necessarily need to drag in all of pthreads just to get
> condition variables. As I mentioned before there is probably a simple
> way to effect condition variables in windows and do it in a way that is
> consistent with the current windows threading api. Are you guys serious
> about this?
They should first add destructors to TLS (with no "auto clearing" of
slots for keys with dtors)... and provide release(), dispose(), and
reset(). They might well even beat POSIX on that. ;-) ;-)
regards,
alexander.
I think you miss my point or I fail to see yours.
I am talking about different locks here, not with respect to usage in a
program at different times but the actual call syntax. For example, a
CondWait-type function actually has a mutex as a parameter. That's fine if
you only have one kind of lock.
> > > scheme with callbacks might be in order but that's not a simple
> > > interface to program to. Presumably it's also possible to split it
> > > down on the queue waiter/release lock/sleep. also not easy to code
> > > to but easy to implement pthreads on top of.
> >
> > I don't think you necessarily need to drag in all of pthreads just to get
> > condition variables. As I mentioned before there is probably a simple
> > way to effect condition variables in windows and do it in a way that is
> > consistent with the current windows threading api. Are you guys serious
> > about this?
>
> They should first add destructors to TLS (with no "auto clearing" of
> slots for keys with dtors)... and provide release(), dispose(), and
> reset(). They might well even beat POSIX on that. ;-) ;-)
>
> regards,
> alexander.
Presumably you are talking here about the fact that we do cross-thread
clearing of a TLS slot when it's deleted by one thread. You're also asking
for some kind of cleanup routine for slots. I think you need to provide
more info. I have never heard anyone ask for something like this before.
There could be two interfaces for the API: one would take a
LPCRITICAL_SECTION, and the other would take a HANDLE to a mutex.
You could add a third interface that takes the lock as a void*, and calls
the application back when it needs to use the lock.
LPCONDVAR AllocCondVar_CriticalSection( LPCRITICAL_SECTION pLock, ... );
LPCONDVAR AllocCondVar_HANDLE( HANDLE pLock, ... );
LPCONDVAR AllocCondVar_Custom( void* pLock, ... );
Windows would benefit from a native condvar API.
P.S.
<ot>
How are you guys implementing the lock-free SList API on AMD64?
</ot>
--
The designer of the experimental, SMP and HyperThread friendly, AppCore
library.
http://AppCore.home.comcast.net
"Neill Clift [MSFT]" wrote:
>
> "Alexander Terekhov" <tere...@web.de> wrote in message
> news:3F5D1081...@web.de...
> >
> > I forgot to mention that binding is dynamic and you can rebind, if you
> > want.
> >
> > <quote source=DRB-TC2>
> >
> > When a thread waits on a condition variable, having specified a
> > particular mutex to either the pthread_cond_timedwait() or
> > pthread_cond_wait() operation, a dynamic binding is formed between
> > that mutex and condition variable that remains in effect as long as
> > at least one thread is blocked on the condition variable. During
> > this time, the effect of an attempt by any thread to wait on that
> > condition variable using a different mutex is undefined. Once all
> > waiting threads have been unblocked (as by the
> > pthread_cond_broadcast() operation), the next wait operation on that
> > condition variable shall form a new dynamic binding with the mutex
> > specified by that wait operation. Even though the dynamic binding
> > between condition variable and mutex may be removed or replaced
> > between the time a thread is unblocked from a wait on the condition
> > variable and the time that it returns to the caller or begins
> > cancellation cleanup, the unblocked thread shall always re-acquire
> > the mutex specified in the condition wait operation call from which
> > it is returning.
> >
> > </quote>
> >
>
> I think you miss my point or I fail to see yours.
You can ignore it. That's just Posix leaving semantic room for some
particular implementation. Even if you were implementing Posix
you could define whatever behavior you want since Posix is leaving
it undefined. It's just the application that cannot depend on any
particular defined behavior.
>
> I am talking about different locks here not wrt usage in a program at
> different times but actual call syntax. For example a CondWait type
> function actualy has a mutex as a parameter. Thats fine if you only have
> one kind of lock.
>
> > > > scheme with callbacks might be in order but thats not a simple
> > > > interface to program to. Presumably its also possible to split it
> > > > down on the queue waiter/release lock/sleep. also not easy to code
> > > > to but easy to implement pthreads on top of.
> > >
Callbacks work fine. I've used them before in a win32 condvar implementation.
Of course you may not need them. SignalObjectAndWait() seems to know what
types of synchronization objects it is dealing with. So, you may be able to
use that mechanism.
Joe Seigh
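For readers who want to see what the callback flavour discussed above might look like, here is a minimal sketch, not anyone's actual Win32 implementation. It is written in portable C++ so it can be run anywhere; the class name CallbackCondVar, the LockFn callback type and the demo function are all made up for illustration. The condvar never sees the lock's type: it only calls back into the application to release and re-acquire it. It assumes the predicate is only changed while holding the user lock, and, like any condvar, wait() may return spuriously.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical class: a condvar that touches the caller's lock only via callbacks.
class CallbackCondVar {
public:
    using LockFn = void (*)(void* lock);

    CallbackCondVar(void* lock, LockFn acquire, LockFn release)
        : lock_(lock), acquire_(acquire), release_(release) {}

    // Caller must hold the user lock. May wake spuriously: re-check the predicate.
    void wait() {
        std::unique_lock<std::mutex> guard(internal_);
        release_(lock_);   // released only after internal_ is held, so a
                           // signal sent after this point cannot be missed
        cv_.wait(guard);
        guard.unlock();
        acquire_(lock_);   // re-acquire the user lock before returning
    }

    void signal()    { std::lock_guard<std::mutex> g(internal_); cv_.notify_one(); }
    void broadcast() { std::lock_guard<std::mutex> g(internal_); cv_.notify_all(); }

private:
    void* lock_;
    LockFn acquire_;
    LockFn release_;
    std::mutex internal_;
    std::condition_variable cv_;
};

// Application-side callbacks; here the "custom" lock happens to be a std::mutex.
static void acquire_mutex(void* p) { static_cast<std::mutex*>(p)->lock(); }
static void release_mutex(void* p) { static_cast<std::mutex*>(p)->unlock(); }

int demo_callback_condvar() {
    std::mutex user_lock;
    CallbackCondVar cv(&user_lock, acquire_mutex, release_mutex);
    bool ready = false;
    int value = 0;

    std::thread consumer([&] {
        user_lock.lock();
        while (!ready) cv.wait();   // predicate loop, as with any condvar
        assert(ready);
        value += 1;
        user_lock.unlock();
    });

    user_lock.lock();
    value = 41;
    ready = true;                   // predicate changed under the user lock
    user_lock.unlock();
    cv.signal();

    consumer.join();
    return value;
}
```

The same class would work with a spinlock, a CRITICAL_SECTION wrapper or anything else, since only the two callbacks know what the lock really is. The price is an extra internal mutex on every wait and signal.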
[... CV rebinding ...]
> I am talking about different locks here not wrt usage in a program at
> different times but actual call syntax. For example a CondWait type
> function actualy has a mutex as a parameter. Thats fine if you only have
> one kind of lock.
Ah. Well, you might want to take a look at pthread_mutexattr_settype().
[... TLS ...]
> Presumably you are talking here about the fact we do cross thread clearing
> of tls slots when its deleted by one thread.
Yes, this (eager-vs-lazy-vs-versioning aside for a moment) is really
needed as long as you don't have keys ("indexes") with dtors for the
thread-specific data. You don't need to waste cycles on clearing for
keys with dtors because applications can easily do it themselves.
> Your asking for some kind of cleanup routine for slots also.
> I think you need to provide more info.
More info can be found in this thread:
http://www.opengroup.org/austin/mailarchives/austin-group-l/msg06004.html
regards,
alexander.
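To make the "keys with dtors" point concrete, this is roughly what POSIX offers today: pthread_key_create() takes a destructor that runs at thread exit for every non-NULL slot value, so nobody else has to clear the slot on the thread's behalf. A small sketch follows; the helper names (destroy_slot, worker, demo_tls_destructor) are invented for the example. Note that pthread_key_delete() does not itself invoke destructors.

```cpp
#include <pthread.h>
#include <atomic>
#include <cassert>

static pthread_key_t g_key;
static std::atomic<int> g_destroyed{0};

// Runs in the exiting thread, once per non-NULL value stored under g_key.
static void destroy_slot(void* p) {
    delete static_cast<int*>(p);
    ++g_destroyed;
}

static void* worker(void*) {
    // Store per-thread data; ownership passes to the key's destructor.
    int rc = pthread_setspecific(g_key, new int(7));
    assert(rc == 0);
    return nullptr;   // destroy_slot() is called as the thread exits
}

int demo_tls_destructor() {
    g_destroyed = 0;
    if (pthread_key_create(&g_key, destroy_slot) != 0) return -1;

    pthread_t t;
    if (pthread_create(&t, nullptr, worker, nullptr) != 0) return -1;
    pthread_join(t, nullptr);     // the thread, including its destructors, is done

    pthread_key_delete(g_key);    // frees the key only; calls no destructors
    return g_destroyed.load();
}
```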
It looks very much like Java monitor... but without spurious wakeups.
Hopefully, the next .Net version will "steal" the JSR-166 like CVars
as well. (And to the full extent, this time ;-) )
> Holding the lock around Pulse doesn't add anything as the
> lock ownership doesn't look like its passed but they require this.
Well, in POSIX, it adds "predictable scheduling" on uniprocessors (or
within "uni"-domains).
>
> Maybe your asking that I analyse the example code for race conditions?
Yup.
> Without knowing more about this are its hard to say anything more.
Try to imagine that the second thread would race ahead of the first
one.
regards,
alexander.
"SenderX" <x...@xxx.xxx> wrote in message
news:1x97b.290654$cF.89387@rwcrnsc53...
> > I am talking about different locks here not wrt usage in a program at
> > different times but actual call syntax. For example a CondWait type
> > function actualy has a mutex as a parameter. Thats fine if you only have
> > one kind of lock.
>
> Their could be two interfaces for the API, one would take a
> LPCRITICAL_SECTION, and the other would take a HANDLE to a mutex.
>
> You could add a third interface that takes the lock as a void*, and calls
> the application back when it needs to use the lock.
>
> LPCONDVAR AllocCondVar_CriticalSection( LPCRITICAL_SECTION pLock, ... );
>
> LPCONDVAR AllocCondVar_HANDLE( HANDLE pLock, ... );
>
> LPCONDVAR AllocCondVar_Custom( void* pLock, ... );
>
> Windows would benefit from a native condvar API.
>
This is what I was talking about but the replier missed my point.
>
>
> P.S.
>
> <ot>
> How are you guys implementing the lock-free SList API on AMD64?
> </ot>
>
It's part of the Win32 API, so yes, things like InterlockedPushEntrySList
should be there.
> --
> The designer of the experimental, SMP and HyperThread friendly, AppCore
> library.
>
> http://AppCore.home.comcast.net
>
>
> >
> > LPCONDVAR AllocCondVar_HANDLE( HANDLE pLock, ... );
^^^^^^^^^
> >
> > LPCONDVAR AllocCondVar_Custom( void* pLock, ... );
^^^^^^^^^
> >
> > Windows would benefit from a native condvar API.
> >
>
> This is what I was talking about but the replier missed my point.
Again, you might want to take a look at pthread_mutexattr_settype().
SenderXP, you might want to study JSR-166's Condition interface and
Condition objects that are "permanently" bound to locks (those are
things that implement the Lock interface). Different lock types
aside for a second, JSR-166 doesn't allow rebinding to a different
lock instance. POSIX does allow rebinding to mutexes of different
types (because pthread_mutex_t is basically the handle-body idiom,
aka the Bridge pattern, so to speak). The reason why JSR-166 does
NOT allow rebinding is that they simply didn't want to impose extra
checking (and, consequently, exception) to cover the "undefined
behavior" case (which is a "NO-NO" in Java). Finally, you should
know that boost.org folks just love compile time polymorphism and
just hate virtual calls (and things alike). The result is nothing
but bizillion "template <typename Lock> void wait(Lock & lock)"
based calls, timed wait aside for a moment.
regards,
alexander.
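As an illustration of the "template <typename Lock> void wait(Lock & lock)" style mentioned above, here is a sketch using std::condition_variable_any from the C++ standard library, which has that shape of interface. This is standard C++11, not the Boost.Thread code under discussion, and the helper names (wait_for_value, run_with, demo_generic_wait) are invented. The lock type is resolved at compile time, so the same waiting code works with a plain mutex and a recursive one without any virtual calls.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Generic over the lock type: anything with lock()/unlock() will do.
template <typename Lock>
int wait_for_value(std::condition_variable_any& cv, Lock& lock,
                   const bool& ready, const int& value) {
    while (!ready) cv.wait(lock);   // releases `lock` while blocked, re-locks after
    return value;
}

template <typename Mutex>
int run_with() {
    Mutex m;
    std::condition_variable_any cv;
    bool ready = false;
    int value = 0;
    int seen = -1;

    std::thread waiter([&] {
        std::unique_lock<Mutex> lk(m);
        seen = wait_for_value(cv, lk, ready, value);
    });

    {
        std::lock_guard<Mutex> g(m);
        value = 5;
        ready = true;
    }
    cv.notify_one();
    waiter.join();

    assert(seen == 5);
    return seen;
}

int demo_generic_wait() {
    // Same condvar code, two different mutex types.
    return run_with<std::mutex>() + run_with<std::recursive_mutex>();
}
```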
Alexander Terekhov wrote:
>
> "Neill Clift [MSFT]" wrote:
> > This is what I was talking about but the replier missed my point.
>
> Again, you might want to take a look at pthread_mutexattr_settype().
>
> SenderXP, you might want to study JSR-166's Condition interface and
> Condition objects that are "permanently" bound to locks (those are
> things that implement the Lock interface). Different lock types
> aside for a second, JSR-166 doesn't allow rebinding to a different
> lock instance. POSIX does allow rebinding to mutexes of different
> types (because pthread_mutex_t is basically the handle-body idiom,
> aka the Bridge pattern, so to speak). The reason why JSR-166 does
> NOT allow rebinding is that they simply didn't want to impose extra
> checking (and, consequently, exception) to cover the "undefined
> behavior" case (which is a "NO-NO" in Java). Finally, you should
> know that boost.org folks just love compile time polymorphism and
> just hate virtual calls (and things alike). The result is nothing
> but bizillion "template <typename Lock> void wait(Lock & lock)"
> based calls, timed wait aside for a moment.
>
There's nothing inherent in condvars that requires binding them to
a mutex. Why are you throwing in unnecessary requirements?
Joe Seigh
A condvar is sort of "bound" to a predicate (well, you can have
it "bound" to multiple predicates but you'll have to always use
broadcast, then).
> Why are you throwing in unnecessary requirements?
It makes implementor's life easier and doesn't really hurt
applications. For example, NPTL uses lll_futex_requeue(futex,
nr_wake, nr_move, mutex) that provides "wait morphing". If
binding wouldn't be required, futex impl would have to know all
those mutexes and that would create problems for process-shared
condvars, I guess.
regards,
alexander.
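A small sketch of the "multiple predicates, always broadcast" remark, using std::condition_variable purely for illustration (the function name demo_shared_condvar is made up). Two threads wait on the same condvar for different predicates. With notify_one() the single wakeup could be delivered to the thread whose predicate is still false, which would go back to sleep while the intended thread never wakes; notify_all() lets every waiter re-check its own predicate.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

int demo_shared_condvar() {
    std::mutex m;
    std::condition_variable cv;   // one condvar shared by two predicates
    bool a_ready = false;
    bool b_ready = false;
    int woke = 0;

    std::thread a([&] {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return a_ready; });   // waits for predicate A
        ++woke;
    });
    std::thread b([&] {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return b_ready; });   // waits for predicate B
        ++woke;
    });

    {
        std::lock_guard<std::mutex> g(m);
        b_ready = true;
    }
    cv.notify_all();   // broadcast: the signaler cannot tell who waits for what

    {
        std::lock_guard<std::mutex> g(m);
        a_ready = true;
    }
    cv.notify_all();

    a.join();
    b.join();
    assert(a_ready && b_ready);
    return woke;
}
```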
Alexander Terekhov wrote:
>
> Joe Seigh wrote:
> [...]
> > There's nothing inherent in condvars that require binding them to
> > a mutex.
>
> A condvar is sort of "bound" to a predicate (well, you can have
> it "bound" to multiple predicates but you'll have to always use
> broadcast, then).
>
Unless you use a callback or enclosure for the predicate. Has
to run in user context, though.
> > Why are you throwing in unnecessary requirements?
>
> It makes implementor's life easier and doesn't really hurt
> applications. For example, NPTL uses lll_futex_requeue(futex,
> nr_wake, nr_move, mutex) that provides "wait morphing". If
> binding wouldn't be required, futex impl would have to know all
> those mutexes and that would create problems for process-shared
> condvars, I guess.
>
What? You mean if the mutex was specified on the wait, you couldn't
figure out which mutex it was? You're kidding.
The problem with restricting the api that way is the api "architect"
isn't always that clever and thinks there is only one possible way to
implement a particular function, and so there is no problem hacking
the api to suit the one true implementation. As a result, we're
encumbered with a bunch of brain damaged, crippled apis that impose
unnecessary restrictions on the application.
Joe Seigh
I was talking about wait-morphing. Think of broadcast. It's surely
"easier" to re-queue (this is done on the signaling side), if you
have just one single "bound" mutex (as a re-queue target). Right?
regards,
alexander.
Right. Skip dispatching the thread if it's immediately going to wait
on a lock. Binding the mutex to the condvar does not affect the
applicability of this technique. It just affects where you are going
to store and look for the address of the mutex. For bound condvar, it
can be kept with the condvar. For unbound condvar, it has to be kept
with the waitset thread info since it can be different for each thread.
Joe Seigh
And the number of queues that you're going to touch, so to speak.
> For bound condvar, it can be kept with the condvar.
That raises an interesting question. The switch to futexes was partly
influenced by the ease of supporting process-shared sync. objects.
Now, you might want to take a look at NPTL sources... I can't
figure out how that is supposed to work given that in one case
(signal) they seem to wait-morph into *internal mutex* (address
is not a problem here, but "what for"), and in the other case
(broadcast) they seem to use a user-mode address of external
mutex that "might be different" in the signaler process. Hmmm...
regards,
alexander.
Lots of debate here on stuff that I expected you guys knew inside out.
Maybe these condition variables aren't the panacea you describe them as :-).
What you're talking about, if I understand correctly, is tying the
implementation of condition variables tightly to your mutex. The aim
presumably is that your condition variable signal can take thread wait
blocks and move them from the list of waiters of the condition variable
to the list of waiters of the mutex. To do this you need a way to find
the mutex from the condition variable, and this is what you're calling
binding.
Does this really pay off? For example, if you find the mutex unlocked you're
presumably not going to lock the mutex and pass on ownership. Doing that
creates convoys because lock hold time is expanded by the context swap. So
maybe you wake in this case and only queue in the contended case. Do
you know of any data supporting this as a good thing to do? Maybe just
waking the thread up and letting it contend for the lock when it runs makes
more sense. The context swap time is so big that by the time it runs
anything could have happened. Maybe it's good if your lock hold times are
large or contention is very high?
If you have any references to anything done in this area let me know.
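For anyone following along, the requeue idea described above can be modelled without any real threads. The sketch below is a single-threaded toy, not NPTL's or anyone's actual implementation: ToyMutex, ToyCondVar and broadcast_with_morphing are invented names, and thread ids are plain ints. It follows the policy suggested in this post: if the bound mutex is free, make one waiter runnable and let it compete for the lock itself (no ownership hand-off); every other waiter has its wait block moved straight onto the mutex's wait list instead of being woken just to block again.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy model only: queues of thread ids stand in for kernel wait lists.
struct ToyMutex {
    bool locked;
    std::deque<int> waiters;
};

struct ToyCondVar {
    ToyMutex* bound;            // the mutex this condvar is "bound" to
    std::deque<int> waiters;
};

// Returns the ids of threads that are actually made runnable.
std::vector<int> broadcast_with_morphing(ToyCondVar& cv) {
    std::vector<int> made_runnable;
    ToyMutex& m = *cv.bound;

    // Mutex free: wake one waiter and let it acquire the mutex on its own.
    if (!m.locked && !cv.waiters.empty()) {
        made_runnable.push_back(cv.waiters.front());
        cv.waiters.pop_front();
    }
    // Everyone else would only block on the mutex, so requeue them directly.
    while (!cv.waiters.empty()) {
        m.waiters.push_back(cv.waiters.front());
        cv.waiters.pop_front();
    }
    return made_runnable;
}

bool demo_wait_morphing() {
    ToyMutex free_mutex{false, {}};
    ToyCondVar cv1{&free_mutex, {1, 2, 3}};
    std::vector<int> r1 = broadcast_with_morphing(cv1);

    ToyMutex held_mutex{true, {}};
    ToyCondVar cv2{&held_mutex, {4, 5, 6}};
    std::vector<int> r2 = broadcast_with_morphing(cv2);

    // Free mutex: one context switch, two requeues. Held mutex: three requeues.
    return r1.size() == 1 && r1[0] == 1 && free_mutex.waiters.size() == 2 &&
           r2.empty() && held_mutex.waiters.size() == 3;
}
```

In both cases a broadcast to three waiters costs at most one immediate context switch; without morphing all three would be dispatched and up to two would go straight back to sleep on the mutex.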
Lock ownership hand-off (across context switch) is, of course, a bad
thing. BTW, the only "fully reliable" POSIX CV impl for Win32 (that
I'm aware of) *does* suffer from this problem. As for the rest, try
<http://groups.google.com/groups?selm=3D8F3F3F.9A1ADFE0%40web.de>.
And also this:
http://groups.google.com/groups?selm=379C79ED.2CE07AB2%40zko.dec.com
http://groups.google.com/groups?selm=37B0300D.20237419%40zko.dec.com
http://groups.google.com/groups?selm=3A3F59F2.E491ED9E%40compaq.com
regards,
alexander.
> Their could be two interfaces for the API, one would take a
> LPCRITICAL_SECTION, and the other would take a HANDLE to a mutex.
>
> You could add a third interface that takes the lock as a void*, and calls
> the application back when it needs to use the lock.
>
> LPCONDVAR AllocCondVar_CriticalSection( LPCRITICAL_SECTION pLock, ... );
>
> LPCONDVAR AllocCondVar_HANDLE( HANDLE pLock, ... );
>
> LPCONDVAR AllocCondVar_Custom( void* pLock, ... );
>
> Windows would benefit from a native condvar API.
What I'd like to see is a Windows kernel-mode driver that natively
supports as much of the pthreads API as possible.
DS
"Neill Clift [MSFT]" wrote:
>
> Does this really pay off? For example if you find the mutex unlocked your
> presumably not going to lock the mutex and pass on ownership. Doing that
> creates convoys because lock hold time is expanded by context swap. So
> maybe you wake in this case and only queue in the contented case. Do
> you know any data supporting this to be a good thing to do? Maybe just
> waking the thread up and letting it contend for the lock when it runs makes
> more sense. The context swap time is so big that by the time it runs
> anything could have happened? Maybe its good if you lock hold times are
> large or contention is very high?
>
Good question. I suspect the answer is no, ie. the futex optimization
in this case doesn't matter.
Joe Seigh
> "Neill Clift [MSFT]" wrote:
>>
>> Does this really pay off? For example if you find the mutex unlocked your
>> presumably not going to lock the mutex and pass on ownership. Doing that
>> creates convoys because lock hold time is expanded by context swap.
Right. Ownership handoff is bad. It forces you to "cold start" stale threads
(that have been blocked) rather than making forward progress with "warm" or
"hot" threads that are contending for the same resource. Proponents of
handoff are generally looking for "fairness", without any concept of how
much throughput fairness costs, and how easy it generally is to improve the
application code by discarding expectations of "fairness". Handoff also
violates realtime scheduling goals, since a higher priority lock contender
might become runnable between handoff and when the "previously highest
priority contender" was given ownership. While you might still be able to
"take it back" at that point, it would add too much complication and
overhead to be worthwhile.
>> So maybe you wake in this case and only queue in the conten[d]ed case.
Absolutely.
>> Do you know any data supporting this to be a good thing to do? Maybe just
>> waking the thread up and letting it contend for the lock when it runs
>> makes more sense. The context swap time is so big that by the time it
>> runs anything could have happened? Maybe its good if you lock hold times
>> are large or contention is very high?
Depending on "anything can happen" being good doesn't sound to me like the
most firm footing on which to begin designing a scheduler. ;-)
And the fact that context switch time is generally large relative to
uncontended locking is exactly why wait morphing makes sense -- without it,
you waste a relatively large number of processor cycles (along with the
attendant cache misses, page faults, and so forth) merely to find out
whether the lock is now available. If that happens only a few times in the
application, you're not dealing with real world coding. (Or you're dealing
with a well constructed concurrent application that just doesn't have much
contention -- which is great, but all too rare.)
> Good question. I suspect the answer is no, ie. the futex optimization
> in this case doesn't matter.
"Fast path" optimizations, for UNcontended lock operations, are to improve
the performance of "GOOD" code -- which may use a lot of locking, but for
isolated and short critical sections, with little contention.
However, "slow (blocking) path" optimizations, like wait morphing, are for
the benefit of the vast majority of poorly written applications; as well as
a lot of library code that cannot control how it's used in poorly
constructed applications. Many many people DO hold locks too often, and for
too long. When that happens even "a fair amount", much less as often as is
frequently seen "in the wild", wait morphing can present enormous benefits.
And it "costs" virtually nothing, because it's not particularly complicated or
difficult, and well constructed code rarely hits the blocking code paths
anyway.
So, yes; in the real world it DOES pay off. I've been doing wait morphing
for at least 10 years, and I've seen a wide variety of important commercial
packages where it makes a substantial difference.
--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/
David Butenhof wrote:
>
> > Good question. I suspect the answer is no, ie. the futex optimization
> > in this case doesn't matter.
>
> "Fast path" optimizations, for UNcontended lock operations, are to improve
> the performance of "GOOD" code -- which may use a lot of locking, but for
> isolated and short critical sections, with little contention.
>
> However, "slow (blocking) path" optimizations, like wait morphing, are for
> the benefit of the vast majority of poorly written applications; as well as
> a lot of library code that cannot control how it's used in poorly
> constructed applications. Many many people DO hold locks too often, and for
> too long. When that happens even "a fair amount", much less as often as is
> frequently seen "in the wild", wait morphing can present enormous benefits.
>
> And it "costs" virtually nothing, because it's not particular complicated or
> difficult, and well constructed code rarely hits the blocking code paths
> anyway.
>
> So, yes; in the real world it DOES pay off. I've been doing wait morphing
> for at least 10 years, and I've seen a wide variety of important commercial
> packages where it makes a substantial difference.
>
Yes, that's basically it. I've been hitting that problem in trying to come
up with a testcase for win32 condvars that is not a self fulfilling prophecy.
Plus the fact that things tend to devolve into coroutines on a uniprocessor.
(I need to get a multiprocessor pc sometime but that will have to wait).
I think I know how to do wait morphing for condvars on windows so I could test
it assuming testcases, both "good" and "bad", were available. For a
uniprocessor of course. :)
Joe Seigh