Programming with lock that might fail

Rani Sharoni

Dec 14, 2003, 8:16:21 AM

Hello,

This discussion started on the Boost list, and I thought it would be more
appropriate to continue it over here.

Alexander Terekhov wrote:
> Rani Sharoni wrote:
> [...]
>> Sorry but I personally can't program with lock that might fail and
>> handle the failure. This will definitely change the non failure
>> guarantee of many functions/scopes such as cleanup functions and
>> non-mutating functions. Sure that there are cases in which the lock
>> will not change the failure guaranty.

I should mention that I'm not referring to trylock, which should also be a
non-failing operation with a different, well-defined behavior that is not
related to failures.
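
A minimal sketch of that distinction, assuming a modern std::mutex (which
did not exist when this thread was written); the names try_increment and
increment are hypothetical. The "busy" result of try_lock is an ordinary
outcome reported to the caller, while the blocking path is written as if
acquiring the lock cannot fail:

```cpp
#include <mutex>

// try_lock reports contention through its return value. "Busy" is a normal,
// well-defined outcome, not a failure of the locking facility itself.
inline bool try_increment(std::mutex& m, int& counter) {
    if (!m.try_lock())
        return false;              // lock not acquired; nothing went wrong
    ++counter;
    m.unlock();
    return true;
}

// The blocking path is declared noexcept: the caller treats lock() as a
// no-fail operation. If lock() did throw, the program would terminate.
inline void increment(std::mutex& m, int& counter) noexcept {
    std::lock_guard<std::mutex> guard(m);
    ++counter;
}
```

A caller of try_increment has to handle a false return, but only as "try
again later", never as an error path that needs cleanup.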

> http://groups.google.com/groups?selm=3FA61F96.223FDE85%40web.de

If I understand this correctly, it demonstrates a heroic effort to
implement a non-failing lock (although the new in DCSI might fail ...).

I know that there are examples of real-life locks that might fail (e.g. old
implementations of EnterCriticalSection), but I consider this a flaw in the
OS, which must provide the no-fail guarantee even if special optimizations
(e.g. lazy resource allocation) are performed. It is just like the no-fail
guarantee of a memory access that might *transparently* involve loading a
swapped-out page.

I hope that I'm not being too naive about this issue.

> [...]
>>> shared_ptr, i.e. a programming error.
>>
>> I'm surprised to see [... "to throw or not to throw" ...]
>
> He he. (Never mind )

I'd be happy if you would elaborate on what you had in mind, because I'd be
surprised to learn that you think differently from me (i.e. that programming
errors should *not* result in a well-defined error and should *not* be handled
within the scope of the programming language).

Thanks,
Rani

Alexander Terekhov

Dec 15, 2003, 6:02:19 AM

Rani Sharoni wrote:
[...]

> > http://groups.google.com/groups?selm=3FA61F96.223FDE85%40web.de
>
> If I understand this correctly then it demonstrates heroic effort to
> implement non-failing lock (although the new in DCSI might fail ...).

This was meant to be a thing illustrating "pretty safe" lazy init
for a lock.

>
> I know that there are examples of real life locks that might fail (e.g. old
> implementations of EnterCriticalSection) but I consider this as flaw in the
> OS that must provide the no fail guarantee even if special optimizations
> (e.g. lazy resources allocations) are preformed. Just like the no-fail
> guarantee of memory access that might *transparently* involve loading of
> swapped out page.
>
> I hope that I'm not being too naive about this issue.

You try to compare a bit different things, I think.

[...]

> >> I'm surprised to see [... "to throw or not to throw" ...]
> >
> > He he. (Never mind )
>

> I will be happy if you'll elaborate on what was in your mind ...

http://lists.boost.org/MailArchives/boost/msg07196.php
http://lists.boost.org/MailArchives/boost/msg27452.php
http://groups.google.com/groups?th=94e63c7613727eec
http://groups.google.com/groups?th=236c96ebdd0891c3

regards,
alexander.

Rani Sharoni

Dec 15, 2003, 2:37:42 PM

Alexander Terekhov wrote:
> Rani Sharoni wrote:
> [...]
>>> http://groups.google.com/groups?selm=3FA61F96.223FDE85%40web.de
>>
>> If I understand this correctly then it demonstrates heroic effort to
>> implement non-failing lock (although the new in DCSI might fail ...).
>
> This was meant to be a thing illustrating "pretty safe" lazy init
> for a lock.

OK. I rushed into thinking that the illustrative code was an improvement
over an implementation that supplies a lock that might fail, but after taking
a better look (read lock) I figured out that it's a clever illustration of an
optimization that optionally changes the no-fail guarantee of the
underlying implementation. As I mentioned in the Boost post, I can't program
with a lock that might fail. I prefer that the OS provide this
synchronization facility with a no-fail guarantee regardless of the underlying
(and welcome) optimizations. I see some resemblance to demand paging, since
it's also a lazy resource allocation that doesn't change the no-fail guarantee
(of memory access). It's obvious that a set of no-fail operations is needed in
order to write correct programs, and I think that a non-failing lock is one of
them.
I'd be happy to see an improvement of your implementation that provides the
no-fail guarantee for the lock operation.

>> I know that there are examples of real life locks that might fail
>> (e.g. old implementations of EnterCriticalSection) but I consider
>> this as flaw in the OS that must provide the no fail guarantee even
>> if special optimizations (e.g. lazy resources allocations) are
>> preformed. Just like the no-fail guarantee of memory access that
>> might *transparently* involve loading of swapped out page.
>>
>> I hope that I'm not being too naive about this issue.
>
> You try to compare a bit different things, I think.
>
> [...]
>>>> I'm surprised to see [... "to throw or not to throw" ...]
>>>
>>> He he. (Never mind )
>>
>> I will be happy if you'll elaborate on what was in your mind ...
>
> http://lists.boost.org/MailArchives/boost/msg07196.php
> http://lists.boost.org/MailArchives/boost/msg27452.php
> http://groups.google.com/groups?th=94e63c7613727eec
> http://groups.google.com/groups?th=236c96ebdd0891c3

Thanks for the links, the Jack Reeves article and the logic_error paragraph
were hilarious.

Rani

Alexander Terekhov

Dec 15, 2003, 2:59:07 PM

Rani Sharoni wrote:
[...]

> As I mentioned in the boost post, I can't program with lock that
> might fail.

What's your problem with a lock that might fail (std::bad_alloc)
on the *first* attempt (concurrent calls serialized internally) to
acquire it (failed attempts do not count)? An illustration, please.
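
A minimal sketch of the kind of lock being described, assuming modern C++
atomics and std::mutex; the type lazy_mutex is hypothetical and is not the
code from the linked post. The underlying mutex is allocated on the first
lock() call, so the allocation is the one step that can run out of memory,
and once any call has succeeded no later call allocates again:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <new>

// Hypothetical lazily initialized lock. The object itself is constant-
// initialized (a null pointer), so a static instance needs no run-time setup.
struct lazy_mutex {
    std::atomic<std::mutex*> impl{nullptr};   // null until the first successful lock()

    void lock() {
        std::mutex* m = impl.load(std::memory_order_acquire);
        if (m == nullptr) {
            std::mutex* fresh = new std::mutex;   // may throw std::bad_alloc
            if (impl.compare_exchange_strong(m, fresh)) {
                m = fresh;                        // this thread installed the mutex
            } else {
                delete fresh;                     // lost the race; m now points at the winner's mutex
            }
        }
        m->lock();
    }

    void unlock() noexcept {
        std::mutex* m = impl.load(std::memory_order_acquire);
        assert(m != nullptr && "unlock() without a prior successful lock()");
        m->unlock();
    }
};
```

If the allocation throws, impl stays null and the next lock() simply tries
again, which matches "failed attempts do not count". The sketch never frees
the mutex, which is acceptable only for objects that live for the whole
program.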

regards,
alexander.

Rani Sharoni

Dec 15, 2003, 4:26:04 PM

I don't understand what advantage you gain from this optimization (and
probably from the initial state of swap_based_mutex_for_windows). The only way
I can use such a facility is to call lock and unlock immediately after
its construction, and therefore I'd prefer that the lock initialization
simply be in the constructor.

BTW - I hope that retry_event.wait() can't fail.

Rani

Alexander Terekhov

Dec 15, 2003, 5:08:08 PM

Rani Sharoni wrote:
[...]

> I prefer that the lock initialization will simply be in the constructor.

But PODs don't have constructors. ;-)

// PODs
typedef std::aligned_storage<std::mutex> pthread_mutex_t;
typedef std::aligned_storage<std::mutexattr_t> pthread_mutexattr_t;

static pthread_mutex_t mutex; // zeroed

extern "C" int pthread_mutex_init(pthread_mutex_t * mutex_storage,
    const pthread_mutexattr_t * attr_storage) throw() {
  try {
    attr_storage ? new (mutex_storage->place())
                     std::mutex(attr_storage->object()) :
                   new (mutex_storage->place())
                     std::mutex();
  }
  catch(...) { // see ES of mutex::mutex(/*...*/)
    return translate_exception_to_error();
  }
  return 0;
}

extern "C" int pthread_mutex_lock(pthread_mutex_t * m) throw() {
  try {
    m->object().acquire();
  }
  catch(...) { // see ES of mutex::acquire()
    return translate_exception_to_error();
  }
  return 0;
}

extern "C" int pthread_mutex_destroy(pthread_mutex_t * m) throw() {
  m->object().~mutex();
  return 0;
}

Or something like that.

>
> BTW - I hope that retry_event.wait() can't fail.

I hope too. But it really CAN throw stack overflow. ;-) ;-)

regards,
alexander.

Alexander Terekhov

Dec 15, 2003, 5:12:59 PM

Alexander Terekhov wrote:
>
> Rani Sharoni wrote:
> [...]
> > I prefer that the lock initialization will simply be in the constructor.
>
> But PODs don't have constructors. ;-)
>
> // PODs
> typedef std::aligned_storage<std::mutex> pthread_mutex_t;
> typedef std::aligned_storage<std::mutexattr_t> pthread_mutexattr_t;
>
> static pthread_mutex_t mutex; // zeroed

I meant

#define PTHREAD_MUTEX_INITIALIZER { 0 }
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

with no "magic", of course.

Rani Sharoni

Dec 16, 2003, 5:09:04 AM

Alexander Terekhov wrote:
> Rani Sharoni wrote:
> [...]
>> I prefer that the lock initialization will simply be in the
>> constructor.
>
> But PODs don't have constructors. ;-)

I guess that atomic<T> has a default constructor, and therefore its containing
class doesn't have the trivial default constructor that POD requires.

Ask your favorite C++ compiler:
template<typename T>
void assert_pod(bool b = true)
{
    if (b) goto skip_pod;

    T this_is_not_pod;
    (void)this_is_not_pod;

skip_pod:;
}

struct A { A(); };
struct B { A a; };

int main() {
    assert_pod<B>();
}
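
With a C++11 or later compiler the same question can be asked directly
through <type_traits> instead of the goto trick. A small sketch, assuming
the A and B from the example above plus a hypothetical plain struct C for
contrast:

```cpp
#include <type_traits>

struct A { A(); };    // user-provided default constructor (never called here)
struct B { A a; };    // contains an A, so its default constructor is not trivial
struct C { int x; };  // hypothetical plain struct for comparison

// B inherits A's non-trivial default construction, so it cannot be a POD.
static_assert(!std::is_trivially_default_constructible<A>::value, "A is not trivial");
static_assert(!std::is_trivially_default_constructible<B>::value, "B is not trivial");

// C is both trivial and standard-layout, which is what POD amounts to.
static_assert(std::is_trivial<C>::value && std::is_standard_layout<C>::value,
              "C is a POD-like type");
```

Because the checks are static_asserts, a wrong assumption shows up as a
compile error rather than as a diagnostic about jumping past a declaration.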

> extern "C" int pthread_mutex_lock(pthread_mutex_t * m) throw() {
> try {
> m->object().acquire();
> }
> catch(...) { // see ES of mutex::acquire()
> return translate_exception_to_error();
> }
> return 0;
> }

void my_pthread_mutex_lock(pthread_mutex_t * m) throw() {
    int err = pthread_mutex_lock(m);
    assert(!err);
}

> extern "C" int pthread_mutex_destroy(pthread_mutex_t * m) throw() {
> m->object().~mutex();
> return 0;
> }

void my_pthread_mutex_destroy(pthread_mutex_t * m) throw() {
    int err = pthread_mutex_destroy(m);
    assert(!err);
}
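
One caveat with assert-based wrappers like these is that assert() is
compiled out when NDEBUG is defined, so a release build would silently
ignore the error. A minimal sketch of a variant that aborts in every build,
assuming POSIX threads; the names must_succeed and scoped_lock_nofail are
hypothetical:

```cpp
#include <pthread.h>
#include <cstdio>
#include <cstdlib>

// Treat any error from a "must not fail" call as a bug: report it and abort,
// regardless of whether NDEBUG is defined.
inline void must_succeed(int err, const char* what) noexcept {
    if (err != 0) {
        std::fprintf(stderr, "%s failed: error %d\n", what, err);
        std::abort();
    }
}

// RAII guard built on the no-fail wrapper: client code sees no error path.
class scoped_lock_nofail {
public:
    explicit scoped_lock_nofail(pthread_mutex_t& m) noexcept : m_(m) {
        must_succeed(pthread_mutex_lock(&m_), "pthread_mutex_lock");
    }
    ~scoped_lock_nofail() {
        must_succeed(pthread_mutex_unlock(&m_), "pthread_mutex_unlock");
    }
    scoped_lock_nofail(const scoped_lock_nofail&) = delete;
    scoped_lock_nofail& operator=(const scoped_lock_nofail&) = delete;

private:
    pthread_mutex_t& m_;
};
```

This does not make the underlying lock any more reliable. It only turns
every reported error into immediate termination instead of a condition that
callers have to propagate.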

>> BTW - I hope that retry_event.wait() can't fail.
>
> I hope too. But it really CAN throw stack overflow. ;-) ;-)

Which is a bug, not a failure.

Rani

Alexander Terekhov

Dec 16, 2003, 6:43:31 AM

Rani Sharoni wrote:
[...]

> I guess that atomic<T> has default constructor and therefore its containing
> class don't have trivial default constructor which is required by POD.

Whatever. But aligned_storage<mutex> (pthread_mutex_t) is POD.

[...]

> > extern "C" int pthread_mutex_lock(pthread_mutex_t * m) throw() {
> > try {
> > m->object().acquire();
> > }
> > catch(...) { // see ES of mutex::acquire()
> > return translate_exception_to_error();
> > }
> > return 0;
> > }
>
> void my_pthread_mutex_lock(pthread_mutex_t * m) throw() {
> int err = pthread_mutex_lock(m);
> assert(!err);
> }

Fine. The point is that you can end up with ENOMEM here -- on a
mutex that does some of the init lazily (e.g. statically allocated
pthread_mutex_t mutex under something like pthreads-win32 hack).

regards,
alexander.

Rani Sharoni

Dec 16, 2003, 7:51:13 AM

Alexander Terekhov wrote:
> Rani Sharoni wrote:
> [...]
>> I guess that atomic<T> has default constructor and therefore its
>> containing class don't have trivial default constructor which is
>> required by POD.
>
> Whatever. But aligned_storage<mutex> (pthread_mutex_t) is POD.

What is the importance of the POD requirement?

>> void my_pthread_mutex_lock(pthread_mutex_t * m) throw() {
>> int err = pthread_mutex_lock(m);
>> assert(!err);
>> }
>
> Fine. The point is that you can end up with ENOMEM here -- on a
> mutex that does some of the init lazily (e.g. statically allocated
> pthread_mutex_t mutex under something like pthreads-win32 hack).

Finally we are aligned, cheers.
This is exactly my problem. I don't want the underlying optimization to
change the no-failure guarantee of the lock operation. I don't care what
heroic effort the facility provider has to make in order to supply a
non-failing lock if the optimization is indeed important (and I believe that
it is). Again, it's just like demand paging, which doesn't change the
no-fail guarantee of an operation like "++x", a guarantee that is obviously
needed in order to write correct programs.

Rani

Alexander Terekhov

Dec 16, 2003, 8:28:20 AM

Rani Sharoni wrote:
[...]

> > Whatever. But aligned_storage<mutex> (pthread_mutex_t) is POD.
>
> What is the importance of the POD requirement?

Well,

http://google.com/groups?selm=3F718BB5.B3E5E627%40web.de
http://google.com/groups?selm=3F735F24.24C0681B%40web.de
http://google.com/groups?selm=d6652001.0309260622.6e1143a7%40posting.google.com

>
> >> void my_pthread_mutex_lock(pthread_mutex_t * m) throw() {
> >> int err = pthread_mutex_lock(m);
> >> assert(!err);
> >> }
> >
> > Fine. The point is that you can end up with ENOMEM here -- on a
> > mutex that does some of the init lazily (e.g. statically allocated
> > pthread_mutex_t mutex under something like pthreads-win32 hack).
>
> Finally we are aligned, cheers.

Not quite.

> This is exactly my problem. I don't want the underlying optimization to
> change the no-failure guarantee of the lock operation. I don't care what

Heck. It is *NOT* an optimization. It is there because I just can't
create a Windows event (binary sema) for a statically initialized
pthread_mutex_t mutex before the first pthread_mutex_lock(&mutex) call.

> heroic effort the facility provider have to do in order to supply
> non-failing lock if the optimization is indeed important (and I believe that
> it is). Again, just like on demand paging that doesn't change the
> non-failing guarantee of operation like "++x" which is obviously needed
> guarantee in order to write correct programs.

You compare apples and oranges.

regards,
alexander.

Rani Sharoni

Dec 16, 2003, 9:31:05 AM

Alexander Terekhov wrote:
> Rani Sharoni wrote:
> [...]
>>> Whatever. But aligned_storage<mutex> (pthread_mutex_t) is POD.
>>
>> What is the importance of the POD requirement?
>
> Well,
>
> http://google.com/groups?selm=3F718BB5.B3E5E627%40web.de
> http://google.com/groups?selm=3F735F24.24C0681B%40web.de
> http://google.com/groups?selm=d6652001.0309260622.6e1143a7%40posting.google.com

s/POD/Aggregate/g
Thanks for reminding me of this subtle issue.

>>>> void my_pthread_mutex_lock(pthread_mutex_t * m) throw() {
>>>> int err = pthread_mutex_lock(m);
>>>> assert(!err);
>>>> }
>>>
>>> Fine. The point is that you can end up with ENOMEM here -- on a
>>> mutex that does some of the init lazily (e.g. statically allocated
>>> pthread_mutex_t mutex under something like pthreads-win32 hack).
>>
>> Finally we are aligned, cheers.
>
> Not quite.
>
>> This is exactly my problem. I don't want the underlying optimization
>> to change the no-failure guarantee of the lock operation. I don't
>> care what
>
> Heck. It is *NOT* optimization. It is there because I just can't
> create windows event (binary sema) for statically initialized
> pthread_mutex_t mutex before first pthread_mutex_lock(&mutex) call.

I'm interested to see code that assumes that a lock might fail, and I won't be
surprised to find out that it isn't current. I'm sure that you can point me
to such code.

>> heroic effort the facility provider have to do in order to supply
>> non-failing lock if the optimization is indeed important (and I
>> believe that it is). Again, just like on demand paging that doesn't
>> change the non-failing guarantee of operation like "++x" which is
>> obviously needed guarantee in order to write correct programs.
>
> You compare apples and oranges.

Not from the perspective of failure guarantees.

Rani

Alexander Terekhov

Dec 16, 2003, 9:51:51 AM

Rani Sharoni wrote:
[...]

> I'm interested to see code that assumes that lock might fail and I'll not be
> surprise finding out that it isn't current. I'm sure that you can point me
> for such code.

http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.c

really_bad(status) is meant to treat ENOMEM as NOT really bad status
(the expectation is that it can fail with ENOMEM on initial call(s)
only). Uhmm. Yeah, it isn't current. The "current" is this:

http://www.terekhov.de/pthread_refcount_t/experimental/refcount.cpp

;-)

regards,
alexander.

Rani Sharoni

Dec 16, 2003, 10:17:05 AM

Thanks for the links.

Can you point me to some code (even a sample) that uses the above facilities?
I wonder what code that uses the throwing arithmetic refcount methods looks
like. Remember that since it's well-defined behavior, you are bound to it
and to potential abuses of it.

Thanks,
Rani

Rani Sharoni

Dec 17, 2003, 5:53:08 AM

I'll elaborate more on my concern.

My concern is about the huge complication imposed on the poor client code,
which cares about program correctness, by the handling of lock failures.
I know that you have invented synchronization facilities with challenging
requirements, and I wonder if you can also provide a non-failing lock.

Thanks,
Rani

Alexander Terekhov

Dec 17, 2003, 1:14:46 PM

Rani Sharoni wrote:
[...]

> > Can you point me to a code (even sample) that uses the above
> > facilities.

Try <http://groups.google.com/groups?selm=3EC0F1F1.B78AA0DA%40web.de>.

> > I wonder how code that uses the throwing arithmetic
> > refcount methods looks like. Remember that since it's well defined
> > behavior you are bounded to it and potential abuses of it.

Well, do a search on overflow and underflow in, say, TC++PL. What
potential abuses?

>
> I'll elaborate more on my concern.
>
> My concern is about the huge complication effect on the poor client code,
> which cares about program correctness, due to the handling of lock failures.

Please show some illustration.

> I know that you invented synchronizations facilities with challenging
> requirements and I wonder if you can also provide non-failing lock.

I still don't see your problem. Are you saying that you just can't
live with "a bit relaxed failure guarantee" (with respect to lazy
init) for a statically allocated and statically initialized mutex
(using PTHREAD_MUTEX_INITIALIZER)?

regards,
alexander.

Rani Sharoni

Dec 17, 2003, 3:40:17 PM

Alexander Terekhov wrote:
> Rani Sharoni wrote:
> [...]
>>> Can you point me to a code (even sample) that uses the above
>>> facilities.
>
> Try <http://groups.google.com/groups?selm=3EC0F1F1.B78AA0DA%40web.de>.

Thank you for the link. It demonstrates exactly what I was looking for.
IMO pthread_refcount_getvalue should have the no-fail guarantee in order to
make it easy to write correct programs. I was not surprised to find the
following code:

size_t IntAtomicGet( pthread_refcount_t& refs ) {
    size_t result;
    pthread_refcount_getvalue( &refs, &result ); // Oops might fail
    return result;
}

inline String::~String() {
    if ( 2 > IntAtomicGet( data_->refs ) || // Oops destructor might fail
         1 > IntAtomicDecrement( data_->refs ) )
        delete data_; // consequence is just leak?
}

pthread_refcount_getvalue is used by no-fail code and therefore
affects the correctness of the program. This is exactly why I think that
such operations should be part of the required set of non-failing operations,
which is a necessary condition for writing correct programs without heroic
efforts.
If pthread_refcount_getvalue were a throwing function, the
program might be terminated under low-memory conditions.
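
For comparison, a minimal sketch of a reference-counted handle whose
destructor has no failure path, assuming std::atomic from modern C++ rather
than pthread_refcount_t; the types shared_data and handle are hypothetical
stand-ins for the String and its data_ member, and the live counter exists
only for the demonstration:

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical shared payload with an atomic reference count.
struct shared_data {
    std::atomic<std::size_t> refs{1};
    static int live;                  // number of live objects, demo only
    shared_data() { ++live; }
    ~shared_data() { --live; }
};
int shared_data::live = 0;

class handle {
public:
    handle() : data_(new shared_data) {}          // construction may throw
    handle(const handle& other) noexcept : data_(other.data_) {
        data_->refs.fetch_add(1, std::memory_order_relaxed);
    }
    handle& operator=(const handle&) = delete;

    // The decrement returns the previous value and cannot report an error,
    // so the destructor needs no error handling at all.
    ~handle() {
        if (data_->refs.fetch_sub(1, std::memory_order_acq_rel) == 1)
            delete data_;
    }

private:
    shared_data* data_;
};
```

The only fallible operation left is the allocation in the default
constructor, which is where a caller can reasonably handle it.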

>>> I wonder how code that uses the throwing arithmetic
>>> refcount methods looks like. Remember that since it's well defined
>>> behavior you are bounded to it and potential abuses of it.
>
> Well, do a search on overflow and underflow in, say, TC++PL. What
> potential abuses?
>
>>
>> I'll elaborate more on my concern.
>>
>> My concern is about the huge complication effect on the poor client
>> code, which cares about program correctness, due to the handling of
>> lock failures.
>
> Please show some illustration.
>
>> I know that you invented synchronizations facilities with challenging
>> requirements and I wonder if you can also provide non-failing lock.
>
> I still don't see your problem. Are you saying that you just cant
> live with "a bit relaxed failure guarantee" (with respect to lazy
> init) for a statically allocated and statically initialized mutex
> (using PTHREAD_MUTEX_INITIALIZER)?

I hope that now you see my problem.

Thanks,
Rani

Alexander Terekhov

Dec 18, 2003, 6:18:04 AM

Rani Sharoni wrote:
[...]

> Thank you for the link, It demonstrates exactly what I was looking for.
> IMO pthread_refcount_getvalue should have the no-fail guarantee in other to
> easily write correct programs. I was not surprise to find out the following
> code:
>
> size_t IntAtomicGet( pthread_refcount_t& refs ) {
> size_t result;
> pthread_refcount_getvalue( &refs, &result ); // Oops might fail

^^^^^^^^^^^^^^^

Only if pthread_refcount_t refs is static (and initialized statically
using PTHREAD_REFCOUNT_INITIALIZER) *and* no successful operation on
that refs object had been performed prior to calling this function.

Okay now?

regards,
alexander.

P.S. http://google.com/groups?selm=3EF2CED4.90291FF8%40web.de

Rani Sharoni

Dec 18, 2003, 9:57:33 AM

Alexander Terekhov wrote:
> Rani Sharoni wrote:
> [...]
>> Thank you for the link, It demonstrates exactly what I was looking
>> for. IMO pthread_refcount_getvalue should have the no-fail guarantee
>> in other to easily write correct programs. I was not surprise to
>> find out the following code:
>>
>> size_t IntAtomicGet( pthread_refcount_t& refs ) {
>> size_t result;
>> pthread_refcount_getvalue( &refs, &result ); // Oops might fail
> ^^^^^^^^^^^^^^^
>
> Only if pthread_refcount_t refs is static (and initialized statically
> using PTHREAD_REFCOUNT_INITIALIZER) *and* no successful operation on
> that refs object had been performed prior to calling this function.
>
> Okay now?

And how is the poor client of such facilities supposed to survive such
subtleties?
If the cases in which the lock actually might fail are so rare, then why
can't I get more functions and types that reflect the common no-fail
guarantee case?

> P.S. http://google.com/groups?selm=3EF2CED4.90291FF8%40web.de

I don't know why the negative attitude toward the win32 critical section is
needed. EnterCriticalSection has a *documented* bug under low-memory
conditions in some versions of Windows due to an optimization, and there is a
documented workaround for this bug. In Windows XP+ this bug was fixed without
defeating the optimization.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/entercriticalsection.asp

Thanks,
Rani

Alexander Terekhov

Dec 18, 2003, 10:24:37 AM

Rani Sharoni wrote:
[...]

> And how does the poor client of such facilities suppose to survive such
> subtleties?

Through outsourcing, I guess. ;-)

> If the cases in which the lock actually might fail are so rare then why
> can't I get more functions and types that reflects the common no-fail
> guarantee case.

Yeah. That would be "cleaner", of course. But the fact is that
pthread_mutex_lock() does have a "shall fail" error -- "[EINVAL]
The mutex was created with the protocol attribute having the value
PTHREAD_PRIO_PROTECT and the calling thread's priority is higher
than the mutex's current priority ceiling." So, it isn't throw(),
so to speak.

>
> > P.S. http://google.com/groups?selm=3EF2CED4.90291FF8%40web.de
>
> I don't know why the negative attitude toward win32 critical section is
> needed. EnterCriticalSection has low memory conditions *documented* bug in
> some versions of windows due to an optimization and there is documented
> workaround for this bug. In windows XP+ this bug was fixed without defeating
> the optimization.

And how did they do it? Details, please.

regards,
alexander.

Rani Sharoni

Dec 18, 2003, 10:59:36 AM

Alexander Terekhov wrote:
> Rani Sharoni wrote:
> [...]
>> And how does the poor client of such facilities suppose to survive
>> such subtleties?
>
> Through outsourcing, I guess. ;-)
>
>> If the cases in which the lock actually might fail are so rare then
>> why can't I get more functions and types that reflects the common
>> no-fail guarantee case.
>
> Yeah. That would be "cleaner", of course. But the fact is that
> pthread_mutex_lock() does have a "shall fail" error -- "[EINVAL]
> The mutex was created with the protocol attribute having the value
> PTHREAD_PRIO_PROTECT and the calling thread's priority is higher
> than the mutex's current priority ceiling." So, it isn't throw(),
> so to speak.

Nevertheless, I was pleased to see that the pthread_mutex_lock specification
says that the behavior is undefined in many cases, which means that the
programmer can't really mistakenly abuse those cases and get away with it.
Anyway, my pthread_* wrappers for such obviously no-fail operations return
void and have the no-fail guarantee.

>>> P.S. http://google.com/groups?selm=3EF2CED4.90291FF8%40web.de
>>
>> I don't know why the negative attitude toward win32 critical section
>> is needed. EnterCriticalSection has low memory conditions
>> *documented* bug in some versions of windows due to an optimization
>> and there is documented workaround for this bug. In windows XP+ this
>> bug was fixed without defeating the optimization.
>
> And how did they do it? Details, please.

Although I work at Microsoft, I don't really have internal knowledge of the
specific implementation details, but I heard that some guy outside of
Microsoft investigated this using a debugger and found that the trick is
something like falling back to a pre-allocated event handle in case of failure.

Rani

Neill Clift [MSFT]

Dec 18, 2003, 11:22:46 AM

"Alexander Terekhov" <tere...@web.de> wrote in message
news:3FE1C6B5...@web.de...

I did this work quite some time ago. This was a very big source of issues
for us with internal stress testing.

EnterCriticalSection and LeaveCriticalSection could raise in low memory.
The error path was failure to allocate the event used for synchronization.
InitializeCriticalSection could raise because a memory allocation failed
for some internal debug information.

I considered it impossible to code to Enter/Leave raising. To make matters
worse the internal structure of the lock was damaged by the raise such
that it might be unusable after the event.

InitializeCriticalSection raising I thought was ok given that you could
handle the condition and do something sensible. It's an area we will probably
address though as C++ programmers have a hard time handling failures in their
constructors and they love to make this call in there.

So for XP we created a single global object that allowed threads to wait
with a particular key value (just a pointer). Threads may wake other threads
in their process using the global object and a matching key value.
Attempting to wake a thread that's not waiting yet causes the waker to wait
until the waiting thread enters the kernel (a rare event but this case is a
little strange). Waiting threads are chained together using existing fields
in the thread into one list.
We don't care about performance in this path just correctness.

We only use this object if we fail to create the events in the contended
case.
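
A minimal sketch of only the fallback selection described above, not of the
keyed wait itself and not of the actual Windows implementation. All names
(wait_object, g_fallback, contended_lock, simulate_oom) are hypothetical,
and the simulate_oom flag exists purely so the failure path can be
exercised in a demonstration:

```cpp
#include <mutex>
#include <new>

// Hypothetical per-lock wait resource, standing in for the event.
struct wait_object {
    std::mutex m;
    bool signaled = false;
};

// One process-wide object, created before any lock can be contended,
// so obtaining it never requires an allocation.
wait_object g_fallback;

struct contended_lock {
    wait_object* own = nullptr;   // allocated lazily, on first contention

    // Returns the object to wait on. This cannot fail: if the private
    // allocation does not succeed, the shared global object is used instead.
    // (Not thread-safe as written; a real lock would publish `own` atomically.)
    wait_object& waiter(bool simulate_oom = false) noexcept {
        if (own == nullptr && !simulate_oom)
            own = new (std::nothrow) wait_object;   // null on failure, no exception
        return own != nullptr ? *own : g_fallback;
    }

    ~contended_lock() { delete own; }
};
```

The slow path trades performance for the guarantee: every lock that falls
back shares one object, but the caller of the lock never sees an
out-of-resource error.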

Having your lock and unlock routines raise is a bad idea in my opinion. I
had to work like hell to change it after it had been built this way.
Neill.

> regards,
> alexander.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

Alexander Terekhov

Dec 19, 2003, 7:03:41 AM

"Neill Clift [MSFT]" wrote:

[... I did this work quite some time ago ...]

Thanks for the info. Once I have some time I'll try to cook
something ala "swap_based_mutex_for_windows" but with "on demand"
event alloc/init and fallback to "something global" in the case
of out-of-resource error. Let's see then which one is better. ;-)

regards,
alexander.

Neill Clift [MSFT]

Dec 19, 2003, 12:27:13 PM

"Alexander Terekhov" <tere...@web.de> wrote in message

news:3FE2E91D...@web.de...

Our critical section implementation doesn't perform well if there is
contention because we pass on ownership. This is something I hope to
change in the future.

Sean Kelly

Dec 23, 2003, 7:33:55 PM

"Neill Clift [MSFT]" <nei...@microsoft.com> wrote in message news:<3fe1d456$1...@news.microsoft.com>...

>
> Having your lock and unlock routines raise is a bad idea in my opinion. I
> had to work like hell to change it after it had been built this way.

That's something I've been meaning to ask about... could
WaitForSingleObject ever return WAIT_FAILED when waiting on a mutex?
This is the one lock/unlock instance where my code might currently
raise an exception and I'd love if I could get rid of that
possibility.

Sean

Neill Clift [MSFT]

Dec 30, 2003, 3:53:51 PM

"Sean Kelly" <ken...@pacbell.net> wrote in message
news:721ff0b.03122...@posting.google.com...

Waits don't allocate resources unless you're using WaitForMultipleObjects with
more than THREAD_WAIT_OBJECTS (3). So you won't get resource errors
from these waits. Obviously you can get bad parameter/handle errors and
other issues like abandoned etc.

CriticalSections rely on this to not have an error path for example.

> Sean

Sean Kelly

Jan 2, 2004, 10:22:55 PM

Since this is getting pretty platform-specific I thought I'd take it
offline, but I was wondering if you could provide additional insight.
I've since asked on one of the public MS newsgroups but didn't get a
useful response.

Basically, I've got a simple object that allocates a handle for an
unnamed mutex on creation, locks/unlocks it on demand, and destroys the
handle upon deletion. The lock function calls WaitForSingleObject to
obtain the mutex lock. I know the parameters will be valid because of
encapsulation. With this in mind, do you know the conditions where
WaitForSingleObject might return WAIT_FAILED? I'm hoping that I can
build the object in such a way that I can avoid checking for WAIT_FAILED
(and risk throwing an exception).

You mention parameter/handle errors and I think I've got those covered.
And I don't care about WAIT_ABANDONED since AFAIK the lock is still
acquired in this case. Is there anything else? I suppose worst case I
could drop the mutex object and only offer critical sections under
Windows but I'd like to have both as an option if at all possible.

Sean Kelly

Neill Clift [MSFT]

Jan 3, 2004, 1:49:42 PM

"Sean Kelly" <ken...@pacbell.net> wrote in message

news:jAqJb.4820$Ym2....@newssvr27.news.prodigy.com...

If you lock the MUTEX 4 billion odd times you get an error for that.
So beyond these errors there shouldn't be anything else. If your code is
sound then there really isn't an error path.

I also look at some of the MS groups but I may have missed your post.
> Sean Kelly