[boost] [interprocess] leaked named mutexes

Eric Niebler

Mar 6, 2013, 5:51:32 PM
to boost@lists.boost.org List
I recently discovered that a process can very easily leave a named mutex
dangling. Consider the following:

#include <cstdlib>
#include <iostream>
#include <boost/interprocess/sync/named_mutex.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>
namespace ip = boost::interprocess;

char const *name = "mynamedmutex";

int main(int argc, char* argv[])
{
    ip::named_mutex mtx(ip::open_or_create, name);
    std::cout << "acquiring named mutex" << std::endl;
    ip::scoped_lock<ip::named_mutex> lock(mtx);
    std::cout << "acquired" << std::endl;
    exit(EXIT_FAILURE); // whoops
}

On my Linux box, this runs fine the first time, but the second time it
hangs waiting to acquire the mutex. I have to manually delete the
semaphore in /dev/shm/.

This will happen whenever the process exits without calling destructors
of locals; for instance:

- std::exit
- std::quick_exit
- std::abort
- std::terminate
- a crash
- assert failure
- an unhandled exception
- etc.

I find I can handle *some* of this by registering a terminate handler,
an exit handler (and on C++11, a quick_exit handler) that calls
boost::interprocess::named_mutex::remove. This raises a few questions,
though...

- Is there a better way?
- Is it safe to `remove` the same named mutex multiple times?
- Does this clean up only this process's use of the named_mutex, or does
it nuke it from the system, even if another process is using it? (The
docs suggest the latter, which is not what I want, is it?)
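
A minimal sketch of the handler registration described above, assuming only the C++11 standard library. The names `cleanup_named_mutex`, `on_terminate` and `install_cleanup_handlers` are hypothetical, and the counter stands in for the real cleanup: with Boost.Interprocess, the body of `cleanup_named_mutex` is where `boost::interprocess::named_mutex::remove(name)` would be called.

```cpp
#include <cstdlib>
#include <exception>

// Hypothetical stand-in for the real cleanup. With Boost.Interprocess this
// is where ip::named_mutex::remove(name) would be called.
static int cleanup_calls = 0;

void cleanup_named_mutex() noexcept
{
    ++cleanup_calls;
}

// A terminate handler must not return, so clean up and then abort.
[[noreturn]] void on_terminate() noexcept
{
    cleanup_named_mutex();
    std::abort();
}

// Registers the cleanup for std::exit, std::quick_exit and std::terminate.
// Returns true if both exit-time registrations succeeded.
bool install_cleanup_handlers()
{
    bool ok = std::atexit(cleanup_named_mutex) == 0;
    ok = (std::at_quick_exit(cleanup_named_mutex) == 0) && ok;
    std::set_terminate(on_terminate);
    return ok;
}
```

Calling `install_cleanup_handlers()` once at the top of `main` would cover the first, second and fourth items in the list. It does nothing for a direct `std::abort`, a crash, or a kill signal, since none of those run exit handlers.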

Thanks,
--
Eric Niebler
Boost.org

Belcourt, Kenneth

Mar 6, 2013, 6:02:22 PM
to <boost@lists.boost.org>
Hi Eric,

If ipcs lists your named entity, ipcrm should remove it. I'm not sure how boost::ip objects are created so these commands may not help you.

-- Noel

Bjorn Reese

Mar 7, 2013, 6:02:16 AM
to bo...@lists.boost.org
On 2013-03-06 23:51, Eric Niebler wrote:

> On my Linux box, this runs fine the first time, but the second time it
> hangs waiting to acquire the mutex. I have to manually delete the
> semaphore in /dev/shm/.

I do not know the specifics of named_mutex, but this is a general
problem with Unix shared memory, semaphores, and message queues. The
basic problem is that these IPC facilities are designed to be used
between processes, and sometimes you actually want them to survive a
crash, so it is difficult to make a garbage collection for them.

> - Does this clean up only this process's use of the named_mutex, or does
> it nuke it from the system, even if another process is using it? (The
> docs suggest the latter, which is not what I want, is it?)

Probably the latter. The workaround that I have been using is to get
the PID of the last process that has accessed the IPC, and check if it
is still running. If not, I remove the IPC. I was doing this with a
script, so I used the ipcs -p, ps, and ipcrm commands.
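
The "is it still running" step can also be done in-process rather than from a script. The following is a sketch only, assuming a POSIX system; `process_is_alive` is a hypothetical name, and how the owning PID gets recorded alongside the IPC object is left out.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>

// Returns true if a process with the given PID currently exists.
// kill() with signal 0 performs the existence and permission checks
// without actually delivering a signal.
bool process_is_alive(pid_t pid)
{
    if (pid <= 0)
        return false;        // 0 and negative values address process groups
    if (::kill(pid, 0) == 0)
        return true;
    return errno == EPERM;   // the process exists but belongs to another user
}
```

One caveat with any PID-based check: PIDs are recycled, so an unrelated process that happens to reuse the number would be reported as alive.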

Kim Barrett

Mar 7, 2013, 12:50:23 PM
to bo...@lists.boost.org
On Mar 6, 2013, at 5:51 PM, Eric Niebler wrote:
> I recently discovered that a process can very easily leave a named mutex
> dangling. Consider the following:

To deal with a mutex that was held by a thread whose process has died, one needs to use a "robust" mutex. There's a POSIX mutex construction attribute for making robust mutexes, and it's supported on Linux starting circa 2.6.18(?). I think (some versions of?) Windows provide this mechanism in the native mutexes too. There's a little protocol around mutex lock attempts, where an error return code indicates the earlier owner died, so that as part of lock acquisition you've now also acquired responsibility for dealing with any cleanup.

Dealing with robust mutexes is tricky. With exception safety one relies on no-throw operations as basic primitives. The nearest cognate in the robust mutex / cross process world is (true, not emulated) atomic operations. You can probably guess what that does to complexity.
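
For reference, the lock protocol described above looks roughly like this with the raw POSIX calls. This is a sketch under assumptions: it targets Linux/glibc, the names `lock_status`, `init_robust_mutex` and `lock_robust` are hypothetical, and the `PTHREAD_PROCESS_SHARED` attribute and shared-memory placement that real cross-process use would need are omitted for brevity.

```cpp
#include <cerrno>
#include <pthread.h>

enum class lock_status { locked, owner_died, failed };

// Initializes *m as a robust mutex. For real cross-process use the mutex
// would live in shared memory and also need PTHREAD_PROCESS_SHARED.
bool init_robust_mutex(pthread_mutex_t* m)
{
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0)
        return false;
    bool ok = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST) == 0
           && pthread_mutex_init(m, &attr) == 0;
    pthread_mutexattr_destroy(&attr);
    return ok;
}

// Locks *m. On owner_died the caller holds the lock and is responsible for
// repairing the protected state, then calling pthread_mutex_consistent(m)
// before unlocking.
lock_status lock_robust(pthread_mutex_t* m)
{
    switch (pthread_mutex_lock(m)) {
    case 0:          return lock_status::locked;
    case EOWNERDEAD: return lock_status::owner_died;
    default:         return lock_status::failed;
    }
}
```

If the new owner unlocks after `owner_died` without first calling `pthread_mutex_consistent`, later lock attempts fail with `ENOTRECOVERABLE`.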

Jonathan Wakely

Mar 7, 2013, 1:21:21 PM
to bo...@lists.boost.org
On 7 March 2013 17:50, Kim Barrett wrote:
> On Mar 6, 2013, at 5:51 PM, Eric Niebler wrote:
>> I recently discovered that a process can very easily leave a named mutex
>> dangling. Consider the following:
>
> To deal with a mutex that was held by a thread whose process has died, one needs to use a "robust" mutex. There's a POSIX mutex construction attribute for making robust mutexes, and it's supported on Linux starting circa 2.6.18(?). I think (some versions of?) Windows provide this mechanism in the native mutexes too. There's a little protocol around mutex lock attempts, where an error return code indicates the earlier owner died, so that as part of lock acquisition you've now also acquired responsibility for dealing with any cleanup.
>
> Dealing with robust mutexes is tricky. With exception safety one relies on no-throw operations as basic primitives. The nearest cognate in the robust mutex / cross process world is (true, not emulated) atomic operations. You can probably guess what that does to complexity.

Tricky indeed, I've been trying to work out how to map robust mutexes
to the C++11 Mutex concepts, and have so far decided that calling
robust_mutex.lock() or robust_mutex.try_lock() should throw an
exception (with errc::state_not_recoverable) if the owner has died,
even though C++11 says try_lock() is non-throwing. Handling the
EOWNERDEAD case has to be requested explicitly by the user by passing
a special value of type robust_t:

auto result = robust_mutex.lock(robust);
if (result == robust_lock_result::locked)
{
    // got the lock
}
else // result == robust_lock_result::inconsistent
{
    // we have the lock, but state is unknown
    // attempt to recover state and either
    robust_mutex.recover();
    // or
    robust_mutex.unlock(); // mark mutex as unusable
}

Has anyone else looked at trying to fit robust mutexes into the C++11
concepts or into Boost?
I'd be interested in working with anyone looking into it.

Kim Barrett

Mar 7, 2013, 2:16:47 PM
to bo...@lists.boost.org
On Mar 7, 2013, at 1:21 PM, Jonathan Wakely wrote:
> Tricky indeed, I've been trying to work out how to map robust mutexes
> to the C++11 Mutex concepts, and have so far decided that calling
> robust_mutex.lock() or robust_mutex.try_lock() should throw an
> exception (with errc::state_not_recoverable) if the owner has died,
> even though C++11 says try_lock() is non-throwing.

I think that demonstrates that robust mutexes are not models of the C++11 Mutex concepts.

> Handling the
> EOWNERDEAD case has to be requested explicitly by the user by passing
> a special value of type robust_t:

My approach to using robust mutexes does not attempt to treat them as models of the C++11 Mutex concept. Instead, there are now locking operations associated with them which *require* a handler function that gets invoked on EOWNERDEAD. This approach filtered up to condition variables too; can't use C++11 / boost.thread condition variables with these robust mutexes, because of the requirement for a handler for EOWNERDEAD.

I thought about providing C++11 Mutex-like operations to allow these robust mutexes to be used like ordinary mutexes, but ultimately decided there was no real use-case for that, since the whole point of a robust mutex is to support EOWNERDEAD handling.
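
To make the handler-based idea concrete, here is one possible shape for such a locking operation over a raw POSIX robust mutex. It is a guess at the general pattern, not the code described in this message: `lock_with_handler` is a hypothetical name, the handler is assumed to return `true` when it has repaired the protected state, and it is assumed not to throw.

```cpp
#include <cerrno>
#include <pthread.h>
#include <system_error>

// Locks a robust pthread mutex. If the previous owner died, on_owner_dead
// is invoked while the lock is held. It should repair the protected state
// and return true on success (assumption: it does not throw).
template <typename Handler>
void lock_with_handler(pthread_mutex_t& m, Handler&& on_owner_dead)
{
    int rc = pthread_mutex_lock(&m);
    if (rc == EOWNERDEAD) {
        bool repaired = on_owner_dead();
        if (repaired && pthread_mutex_consistent(&m) == 0)
            return;  // lock held, state repaired, mutex usable again
        // Unlocking without marking the mutex consistent leaves it
        // permanently unusable for every later locker.
        pthread_mutex_unlock(&m);
        rc = ENOTRECOVERABLE;
    }
    if (rc != 0)
        throw std::system_error(rc, std::generic_category(),
                                "lock_with_handler");
}
```

Because the handler is a required argument, a caller cannot take the lock without saying what should happen on EOWNERDEAD, which is also why such a type does not fit the C++11 Mutex requirements.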