possible to time out a pthread_mutex_lock() or otherwise detect deadlocks?

fred anger

Apr 29, 2000, 3:00:00 AM

Hi. This would seem like a FAQ - if so, please gimme a link.
Searching deja doesn't turn much up...

What I'm trying to do is port a mutex class that someone wrote
originally for OS/2, then ported to NT, to Linux. They have a method
called obtain_ownership(), in which they use a WaitForSingleObject()
with a 10 second timeout to lock a mutex, wrapped in a loop that writes
some diagnostics to a stream after every 10 seconds. If it doesn't get
the lock after 5 minutes, it decides there's a deadlock and proceeds to
dump its core.

I've seen other people recommend using a pthread_cond_timedwait, but
that seems to implement a completely different function, as the mutex
associated with the condition variable must be locked prior to
pthread_cond_timedwait(). Doesn't help me. I need a timed wait on
attempting to acquire the lock on the mutex - like a
pthread_mutex_trylock() with a timeout. I could poll the lock with
pthread_mutex_trylock(), but I'd think that'd risk starvation, not to
mention waste a lot of CPU time.

Any help would be greatly appreciated.

--
fred anger
BRING BACK DEJANEWS.COM
'RATE THIS' SUCKS!


Kaz Kylheku

Apr 29, 2000, 3:00:00 AM

On Sat, 29 Apr 2000 20:17:37 GMT, fred anger <fred...@my-deja.com> wrote:
>Hi. This would seem like a FAQ - if so, please gimme a link.
>Searching deja doesn't turn much up...
>
>What I'm trying to do is port a mutex class that someone wrote
>originally for OS/2, then ported to NT, to Linux. They have a method
>called obtain_ownership(), in which they use a WaitForSingleObject()
>with a 10 second timeout to lock a mutex, wrapped in a loop that writes
>some diagnostics to a stream after every 10 seconds. If it doesn't get
>the lock after 5 minutes, it decides there's a deadlock and proceeds to
>dump its core.

Note that Win32 mutexes do not correspond neatly to POSIX mutexes.
The approximate Win32 equivalent of the POSIX mutex is the CRITICAL_SECTION
object. Win32 does not support timed out waits on critical sections.

Timed out mutex waits have been recently added to the POSIX draft.
The new function takes an absolute timespec argument, not unlike
pthread_cond_timedwait:

int pthread_mutex_timedlock(pthread_mutex_t *, const struct timespec *);

>I've seen other people recommend using a pthread_cond_timedwait, but
>that seems to implement a completely different function, as the mutex
>associated with the condition variable must be locked prior to
>pthread_cond_timedwait(). Doesn't help me.

It does. Using a mutex, a boolean variable and a condition variable, you can
create a lock abstraction that supports timed out waits. Here is some untested
code I just banged out right into the article:

#include <errno.h>      /* ETIMEDOUT */
#include <pthread.h>
#include <time.h>       /* struct timespec */

typedef struct {
    pthread_mutex_t mutex;  /* protects 'locked'; only ever held briefly */
    pthread_cond_t cond;    /* signalled when 'locked' goes back to 0 */
    int locked;             /* nonzero while some thread owns the longlock */
} longlock_t;

#define LONGLOCK_INITIALIZER { \
    PTHREAD_MUTEX_INITIALIZER, \
    PTHREAD_COND_INITIALIZER, \
    0 \
}

void longlock_init(longlock_t *ll)
{
    pthread_mutex_init(&ll->mutex, 0);
    pthread_cond_init(&ll->cond, 0);
    ll->locked = 0;
}

void longlock_destroy(longlock_t *ll)
{
    pthread_mutex_destroy(&ll->mutex);
    pthread_cond_destroy(&ll->cond);
}

/* If thread is cancelled in pthread_cond_{timed}wait,
   it executes this cleanup handler to release the mutex. */

static void cancel_cleanup(void *arg)
{
    longlock_t *ll = arg;
    pthread_mutex_unlock(&ll->mutex);
}

void longlock_lock(longlock_t *ll)
{
    pthread_mutex_lock(&ll->mutex);

    pthread_cleanup_push(cancel_cleanup, ll);

    /* Wait until the current owner clears the flag. */
    while (ll->locked)
        pthread_cond_wait(&ll->cond, &ll->mutex);

    ll->locked = 1;
    pthread_cleanup_pop(1);     /* runs the handler, releasing ll->mutex */
}

void longlock_unlock(longlock_t *ll)
{
    pthread_mutex_lock(&ll->mutex);
    ll->locked = 0;
    pthread_mutex_unlock(&ll->mutex);
    pthread_cond_signal(&ll->cond);
}

/* abstime is an absolute time, as for pthread_cond_timedwait.
   Returns 0 if the lock was obtained, ETIMEDOUT otherwise. */
int longlock_timedlock(longlock_t *ll, struct timespec *abstime)
{
    int gotit = 0;

    pthread_mutex_lock(&ll->mutex);

    pthread_cleanup_push(cancel_cleanup, ll);

    while (ll->locked) {
        if (pthread_cond_timedwait(&ll->cond, &ll->mutex, abstime) == ETIMEDOUT)
            break;
    }

    /* The lock may have been released just as the timeout expired. */
    if (!ll->locked)
        ll->locked = gotit = 1;

    pthread_cleanup_pop(1);     /* runs the handler, releasing ll->mutex */

    return (gotit) ? 0 : ETIMEDOUT;
}

It's easy enough to make all kinds of primitives using conditions and mutexes.

--
#exclude <windows.h>

fred anger

May 1, 2000, 3:00:00 AM

In article <slrn8gmlf...@ashi.FootPrints.net>,
k...@ashi.footprints.net wrote:
> Timed out mutex waits have been recently added to the POSIX draft.
> The new function takes an absolute timespec argument, not unlike
> pthread_cond_timedwait:
>
> int pthread_mutex_timedlock(pthread_mutex_t *, const struct timespec *);

Cool! Can't wait...literally.

> >I've seen other people recommend using a pthread_cond_timedwait, but
> >that seems to implement a completely different function, as the mutex
> >associated with the condition variable must be locked prior to
> >pthread_cond_timedwait(). Doesn't help me.
>
> It does. Using a mutex, a boolean variable and a condition variable, you can
> create a lock abstraction that supports timed out waits. Here is some untested
> code I just banged out right into the article:

[snip]

> int longlock_timedlock(longlock_t *ll, struct timespec *abstime)
> {
> int gotit = 0;
>
> pthread_mutex_lock(&ll->mutex);
>
> pthread_cleanup_push(cancel_cleanup, ll);
>
> while (ll->locked) {
> if (pthread_cond_timedwait(&ll->cond, &ll->mutex, abstime) == ETIMEDOUT)
> break;
> }
>
> if (!ll->locked)
> ll->locked = gotit = 1;
>
> pthread_cleanup_pop(1);
>
> return (gotit) ? 0 : ETIMEDOUT;
> }
>
> It's easy enough to make all kinds of primitives using conditions and mutexes.

I started down this path as well, but realized that it doesn't solve my
problem (as far as I can tell). My problem is that a mutex can be locked
indefinitely by some thread, so other threads calling
pthread_mutex_lock() will block indefinitely - they won't even make it to
the pthread_cond_timedwait(). I need to be able to detect deadlocks -
perhaps I need a watchdog thread in some combination with a condition
variable...

In addition, I don't see that waiting on a condition variable will solve
my problem, given that the first thread to try to grab the lock (and all
subsequent threads) will be forever waiting for the condition to be
signalled - but it never will be. Any threads calling
longlock_timedlock() will never get the lock, because the condition
hasn't been signalled, and never will because none of the threads will
ever make it to an unlock call. Perhaps I'm missing something...

Carl Mailloux

May 1, 2000, 3:00:00 AM

fred anger wrote:

> In article <slrn8gmlf...@ashi.FootPrints.net>,
> k...@ashi.footprints.net wrote:
> > Timed out mutex waits have been recently added to the POSIX draft.
> > The new function takes an absolute timespec argument, not unlike
> > pthread_cond_timedwait:
> >
> > int pthread_mutex_timedlock(pthread_mutex_t *, const struct timespec *);
>
> Cool! Can't wait...literally.
>

> I started down this path as well, but realized that it doesn't solve my
> problem (that I can tell). My problem is that a mutex can be locked
> indefinitely by some thread, so other threads calling
> pthread_mutex_lock() will pend indefinitely - they won't even make it to
> the pthread_cond_timedwait(). I need to be able to detect deadlocks -
> perhaps I need a watchdog thread in some combination with a condition
> variable...
>

At this time, you have 3 choices:

1. Write your own function that works with any version of the LinuxThreads library.
2. Wait for a LinuxThreads release that implements the new standard function
called pthread_mutex_timedwait.
3. Hack the LinuxThreads library: write the function pthread_mutex_timedwait
and submit it to the glibc maintainers.

--
Carl Mailloux
ca...@oricom.ca

Carl Mailloux

May 1, 2000, 3:00:00 AM

Carl Mailloux wrote:

Sorry, the new function name is not pthread_mutex_timedwait. The name is
pthread_mutex_timedlock.

--
Carl Mailloux
ca...@oricom.ca

Kaz Kylheku

May 2, 2000, 3:00:00 AM

On Mon, 01 May 2000 16:37:08 GMT, fred anger <fred...@my-deja.com> wrote:
>> int longlock_timedlock(longlock_t *ll, struct timespec *abstime)
>> {
>> int gotit = 0;
>>
>> pthread_mutex_lock(&ll->mutex);
>>
>> pthread_cleanup_push(cancel_cleanup, ll);
>>
>> while (ll->locked) {
>> if (pthread_cond_timedwait(&ll->cond, &ll->mutex, abstime) == ETIMEDOUT)
>> break;
>> }
>>
>> if (!ll->locked)
>> ll->locked = gotit = 1;
>>
>> pthread_cleanup_pop(1);
>>
>> return (gotit) ? 0 : ETIMEDOUT;
>> }
>>
>> It's easy enough to make all kinds of primitives using conditions and mutexes.
>

>I started down this path as well, but realized that it doesn't solve my
>problem (that I can tell). My problem is that a mutex can be locked
>indefinitely by some thread, so other threads calling
>pthread_mutex_lock() will pend indefinitely - they won't even make it to
>the pthread_cond_timedwait(). I need to be able to detect deadlocks -
>perhaps I need a watchdog thread in some combination with a condition
>variable...

The longlock implementation ensures that the internal mutex is properly
released in each function. So it cannot deadlock on a pthread_mutex_lock
unless something is horribly wrong with your CPU or motherboard. ;)

>In addition, I don't see that waiting on a condition variable will solve
>my problem, given that the first thread to try to grab the lock (and all
>subsequent threads) will be forever waiting for the condition to be
>signalled - but it never will be. Any threads calling

The first thread to grab the lock will find the ll->locked variable to
be zero, and thus will assert ownership of the lock and release ll->mutex.
Subsequent calls will enter the mutex, see that ll->locked is non-zero
and wait.

>longlock_timedlock() will never get the lock, because the condition
>hasn't been signalled, and never will because none of the threads will
>ever make it to an unlock call. Perhaps I'm missing something...

You might be missing that the pthread_cond_timedwait function releases the
mutex while the thread waits, and reacquires it before returning.

Also the pthread_cleanup_pop(1) causes the mutex to be released because it
calls the handler that was pushed with the corresponding pthread_cleanup_push.

Using these handlers is just a way to dot your i's and cross your t's in POSIX
thread programming; failure to do so means that your library is not robust
against thread cancellation. (To be entirely thorough, you also have
to handle forking with pthread_atfork).
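
For the fork case, a common idiom is to take the library's internal mutex in the prepare handler and release it again in both the parent and the child, so the child never inherits it in a locked state. The sketch below shows only the shape of that idiom: lib_mutex stands in for something like ll->mutex, and every name in it is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a library's internal mutex (hypothetical). */
static pthread_mutex_t lib_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork(): quiesce the library. */
static void fork_prepare(void)
{
    int rc = pthread_mutex_lock(&lib_mutex);
    assert(rc == 0);
    (void) rc;
}

/* Run after fork() in the parent and in the child respectively. */
static void fork_parent(void) { pthread_mutex_unlock(&lib_mutex); }
static void fork_child(void)  { pthread_mutex_unlock(&lib_mutex); }

/* Registers the handlers; returns 0 on success. Call once at startup. */
int lib_install_fork_handlers(void)
{
    return pthread_atfork(fork_prepare, fork_parent, fork_child);
}

/* Forks and reports whether the child found lib_mutex unlocked:
   1 if so, 0 if not, -1 if fork or waitpid failed. */
int child_can_lock_after_fork(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: the mutex must be usable here. */
        int ok = pthread_mutex_trylock(&lib_mutex) == 0;
        _exit(ok ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

For a longlock, handlers of this kind would cover only the internal mutex. What the child should do about the locked flag itself is a separate design decision that this sketch does not address.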

--
#exclude <windows.h>