POSIX process-shared synchronization variables on Solaris 2.6

krob...@srtc.com

Mar 30, 1998, 3:00:00 AM

Development Environment: Solaris 2.6, C, POSIX threads.

Question:
I am developing a small, special purpose database to be used by several Unix
processes, each of which may contain more than one thread. To synchronize data
accesses between the various processes and threads I have embedded
process-shared mutexes and condition variables in the database file which is
mapped into shared memory quite similar to the example in chapter 10 of
Programming with Threads by Kleiman, Shah and Smaalders. I am curious to know
if anyone has solved the problem of detecting and handling the case when a
thread or a process is killed while holding a process-shared mutex lock?
Since the mutex lock can only be released by the thread that owns it, I assume
the mutex must first be destroyed and then reinitialized, but how do I even
determine if the owner of the mutex is dead or merely holding the lock for a
long time?
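
For readers who have not seen this arrangement, here is a minimal sketch of the setup described above. It is not the code from the book or from the poster's database. The struct layout and the names db_header, db_create and db_increment are hypothetical, and error handling is reduced to returning NULL.

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout: the lock sits at the start of the mapped file. */
struct db_header {
    pthread_mutex_t lock;      /* process-shared mutex embedded in the file */
    long            counter;   /* stand-in for the real database contents */
};

/*
 * Create the file, map it MAP_SHARED, and initialize the mutex with the
 * PTHREAD_PROCESS_SHARED attribute. Only the creating process should do
 * the initialization; other processes would just open and map the file.
 * Returns NULL on failure.
 */
static struct db_header *db_create(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, sizeof(struct db_header)) != 0) {
        close(fd);
        return NULL;
    }
    struct db_header *db = mmap(NULL, sizeof *db, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping stays valid after close */
    if (db == MAP_FAILED)
        return NULL;

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    int rc = pthread_mutex_init(&db->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) {
        munmap(db, sizeof *db);
        return NULL;
    }
    db->counter = 0;
    return db;
}

/* Every access to the shared data goes through the embedded lock. */
static void db_increment(struct db_header *db, int times)
{
    for (int i = 0; i < times; i++) {
        pthread_mutex_lock(&db->lock);
        db->counter++;
        pthread_mutex_unlock(&db->lock);
    }
}
```

If a process is killed between pthread_mutex_lock and pthread_mutex_unlock in db_increment, every other process that maps the file blocks on the lock, which is the situation the question is about.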


Andrew Gabriel

Mar 31, 1998, 3:00:00 AM

In article <6focbh$f7e$1...@nnrp1.dejanews.com>,
krob...@srtc.com writes:
>Development Environment: Solaris 2.6, C, POSIX threads.
>
>Question:
> I am developing a small, special purpose database to be used by several Unix
>processes, each of which may contain more than one thread. To synchronize data
>accesses between the various processes and threads I have embedded
>process-shared mutexes and condition variables in the database file which is
>mapped into shared memory quite similar to the example in chapter 10 of
>Programming with Threads by Kleiman, Shah and Smaalders. I am curious to know
>if anyone has solved the problem of detecting and handling the case when a
>thread or a process is killed while holding a process-shared mutex lock?
>Since the mutex lock can only be released by the thread that owns it, I assume
>the mutex must first be destroyed and then reinitialized, but how do I even
>determine if the owner of the mutex is dead or merely holding the lock for a
>long time?

Think about why your thread/process is holding the lock in
the first place. The lock is protecting access to an area
of data, such that it cannot be accessed during updates
where its state is inconsistent. Now if a thread stops
partway through just such an update because, for example, it
crashes or is killed, the data is left inconsistent. If you now
want to break the mutex, you are effectively saying that it
is OK for other threads/processes to access this inconsistent
data, so why did you bother protecting it with a mutex in
the first place?

This area's a minefield. Destroying and recreating the mutex
isn't really the problem; recovering the inconsistent data
that the mutex protects is. If you have, say, a parent process
which can do this, then you might use a counting semaphore
instead, so that the parent can post the semaphore once it has
repaired the inconsistent data. Unlike a mutex, a semaphore
need not be released by the process that acquired it.

--
Andrew Gabriel
Consultant Software Engineer

Bart Smaalders

Apr 13, 1998, 3:00:00 AM

> >Question:
> > I am developing a small, special purpose database to be used by several Unix
> >processes, each of which may contain more than one thread. To synchronize data
> >accesses between the various processes and threads I have embedded
> >process-shared mutexes and condition variables in the database file which is
> >mapped into shared memory quite similar to the example in chapter 10 of
> >Programming with Threads by Kleiman, Shah and Smaalders. I am curious to know
> >if anyone has solved the problem of detecting and handling the case when a
> >thread or a process is killed while holding a process-shared mutex lock?
> >Since the mutex lock can only be released by the thread that owns it, I assume
> >the mutex must first be destroyed and then reinitialized, but how do I even
> >determine if the owner of the mutex is dead or merely holding the lock for a
> >long time?
>

> This area's a minefield; destroying and recreating the mutex
> isn't really the problem, recovering the inconsistent data
> which the mutex protects is rather more the problem. If you
> have, say, a parent process which can do this, then you might
> use a counting semaphore instead, so that the parent can
> update the semaphore having repaired the inconsistent data.

Given all the complications inherent in trying to prevent problems with
processes dying at awkward times with this scheme, I'd suggest an
alternative approach: make the database portion of the code into a door
server. This will add about 20-30 microseconds per call into the database
code, but will isolate the database portion from failures in other portions
of the applications and give defined failure semantics when clients
crash in the middle of a call. In addition, you don't need
process-shared mutexes, so the code runs faster with the adaptive
mutexes in 2.6. If needed, different users can have different permissions
in the database without the need for set[ug]id applications, since the
database portion can check the caller's credentials with door_cred...

--
Bart Smaalders Solaris Clustering SunSoft
ba...@cyber.eng.sun.com (650) 786-5335 MS UMPK17-201
http://playground.sun.com/~barts 901 San Antonio Road
Palo Alto, CA
94303
