In my kernel I have implemented an interruptible sleeping mutex. As a
sleeping mutex, it is similar to ones you would find in popular OSes
where the task that could not acquire the lock goes to sleep until the
lock becomes available.
I have real-time priority concerns and would like to be able to
interrupt a sleeping task, whatever the reason it is sleeping.
In that respect, I have implemented the mutexes as interruptible, such
that a thread that sleeps on a lock can be woken up for two reasons:
either the lock has become available, or it has been interrupted by
someone (e.g. its superior parent, or a high-priority task).
With this mechanism, a thread that has already taken two such locks
and is sleeping on a third can be interrupted, and upon interruption
it finishes gracefully: it releases all its locks, leaves all the data
structures in a good state, and leaves the kernel, to retry the
operation later.
The problem is that *every* mutex in the system has now become
conditional, i.e. I need to check the return value of every
mutex_lock() operation to see whether it was interrupted. Is this the
best way to handle things, or do you have a better suggestion?
In a nutshell the core of my problem is to be able to take control
over a data structure or a sleeping task, in case a higher priority
event occurs that relates to one of those entities.
Thanks,
Bahadir
> In a nutshell the core of my problem is to be able to take control
> over a data structure or a sleeping task, in case a higher priority
> event occurs that relates to one of those entities.
I guess I don't understand what you're talking about. Consider:
Thread T1 acquires mutex M1, which protects data structure D1.
Thread T1 dirties data structure D1.
Thread T1 blocks for mutex M2, which protects data structure D2.
Thread T2, at higher priority, wants mutex M1 to access data structure
D1.
Are you saying that you want to interrupt thread T1's blocking on M2
and command it to release M1? That would require the thread "backing
out" whatever changes it made to D1.
If so, why put that in the mutex? Why not just code thread T1 and T2
to do exactly that in user-space with the existing threading
primitives? As you've noticed, trying to put it in the mutex
implementation creates all kinds of problems.
DS
Balban wrote:
> Either the lock is available, or it has been interrupted by someone
> (e.g. its superior parent, or a high priority task)
>
> With this mechanism, a thread that has already taken two such locks
> and is sleeping on a third can be interrupted, and upon interruption
> it finishes gracefully: it releases all its locks, leaves all the data
> structures in a good state, and leaves the kernel, to retry the
> operation later.
Hmm, waiting for a mutex at real-time priority is tough. Waiting for
multiple mutexes, even more so.
> The problem is that *every* mutex in the system has now become
> conditional, i.e. I need to check the return value of every
> mutex_lock() operation to see whether it was interrupted. Is this the
> best way to handle things, or do you have a better suggestion?
Use an OO language and throw an interrupted exception in case you want
to cancel a request. It is then up to the application code to catch the
exception at the appropriate level of stack unwinding. Of course, your
application code has to be exception-safe for that, to avoid resource
leaks and undefined behavior. But that is good advice anyway.
> In a nutshell the core of my problem is to be able to take control
> over a data structure or a sleeping task, in case a higher priority
> event occurs that relates to one of those entities.
A classical priority inversion problem. I don't think that your
suggested solution is good advice. But of course, I don't know your
requirements.
Have a look at solutions to priority inversion problems.
Sometimes using lock-free data structures and algorithms is an option too.
You should avoid shared data and too many mutexes in real-time
applications. Use controller threads with state machines and local data.
The different engines can communicate over command queues, which can be
lock-free primitive objects. If well designed, the payload never has to
be passed through the command queues at all.
If this is not adequate, use condition variables to gate certain
actions, and access these condition variables partly atomically to
avoid mutex acquisitions at critical code locations.
Marcel
On Nov 1, 5:26 pm, David Schwartz <dav...@webmaster.com> wrote:
> I guess I don't understand what you're talking about. Consider:
>
> Thread T1 acquires mutex M1, which protects data structure D1.
> Thread T1 dirties data structure D1.
> Thread T1 blocks for mutex M2, which protects data structure D2.
> Thread T2, at higher priority, wants mutex M1 to access data structure
> D1.
Yes.
> Are you saying that you want to interrupt thread T1's blocking on M2
> and command it to release M1? That would require the thread "backing
> out" whatever changes it made to D1.
Yes. It seems what I did is not very far-fetched. I checked the Linux
Device Drivers book, and here's an excerpt from it:
[Excerpt start]
There are three versions of down:

void down(struct semaphore *sem);
int down_interruptible(struct semaphore *sem);
int down_trylock(struct semaphore *sem);

down decrements the value of the semaphore and waits as long as need
be. down_interruptible does the same, but the operation is
interruptible. The interruptible version is almost always the one you
will want; it allows a user-space process that is waiting on a
semaphore to be interrupted by the user. You do not, as a general
rule, want to use noninterruptible operations unless there truly is no
alternative. Noninterruptible operations are a good way to create
unkillable processes (the dreaded "D state" seen in ps), and annoy
your users. Using down_interruptible requires some extra care,
however: if the operation is interrupted, the function returns a
nonzero value, and the caller does not hold the semaphore. Proper use
of down_interruptible requires always checking the return value and
responding accordingly.
[Excerpt end]
So I guess my mutexes are like an interruptible semaphore, and I need
to live with checking the return values. You are right about taking
multiple locks: the more locks you take and the more structures you
operate on under them, the more difficult it becomes to recover and
restart.
> If so, why put that in the mutex? Why not just code thread T1 and T2
> to do exactly that in user-space with the existing threading
> primitives? As you've noticed, trying to put it in the mutex
> implementation creates all kinds of problems.
I don't see what you mean here. I am writing a microkernel, so it is
going to be those few mutexes that create the environment for threads
to run in userspace.
Thanks,
Bahadir
I think your suggestion would solve the aesthetic problem well, but
I am using C.
> > In a nutshell the core of my problem is to be able to take control
> > over a data structure or a sleeping task, in case a higher priority
> > event occurs that relates to one of those entities.
>
> A classical priority inversion problem. I don't think that your
> suggested solution is good advice. But of course, I don't know your
> requirements.
You are right about accessing the data structure. It becomes a
priority inversion problem. I guess priority inheritance would solve
that one. But instead of the structure, what if you want to manipulate
the thread that is holding the lock (e.g. suspend or destroy it)? In
that case my solution might make sense, or perhaps priority
inheritance might also make sense, if you can afford to wait for the
thread to finish its last job before deleting it.
> You should avoid shared data and too many mutexes in real-time
> applications. Use controller threads with state machines and local data.
> The different engines can communicate over command queues, which can be
> lock-free primitive objects. If well designed, the payload never has to
> be passed through the command queues at all.
Is what you are suggesting a CSP-like model? I.e., use synchronous IPC
and keep otherwise-shared data in per-thread storage.
I think that is very useful, at least for correct concurrency,
although I am only implementing the underlying mechanism that will
make this possible in the upper layers.
I guess the bottom line is that one must carefully design the system
when real-time requirements are involved in the concurrency. But my
solution might be of use if you are manipulating the threads
themselves.
Thanks,
Bahadir
> I don't see what you mean here. I am writing a microkernel, so it is
> going to be those few mutexes that create the environment for threads
> to run in userspace.
I'm not 100% sure I follow you here. Are we talking about the mutexes
your kernel will use for itself or the ones you implement for user-
space to use?
DS
> You are right about accessing the data structure. It becomes a
> priority inversion problem. I guess priority inheritance would solve
> that one. But instead of the structure, what if you want to manipulate
> the thread that is holding the lock (e.g. suspend or destroy it)? In
> that case my solution might make sense, or perhaps priority
> inheritance might also make sense, if you can afford to wait for the
> thread to finish its last job before deleting it.
Suspending or destroying the thread would be counter-productive. That
would simply prevent it from restoring the invariants it has broken,
delaying or removing your ability to unlock the mutex. You only need
to do something if the thread already holds a mutex you need to get it
to release. But presumably, the reason it's holding that mutex is
because it's in the middle of doing something. The thread will need to
sanely terminate what it was doing, so suspending or terminating it is
not a realistic option.
DS
Poor man. Error handling in C is twice as much work and sometimes
blows the code up to the point of unreadability. And OO languages like
C++ can do significantly more compile-time checks. Besides that, the
differences are mainly aesthetic, as you said.
> You are right about accessing the data structure. It becomes a
> priority inversion problem. I guess priority inheritance would solve
> that one.
Only if the low-priority thread exclusively calls non-blocking
functions while holding the mutex. Even a logging call will break this.
> But instead of the structure, what if you want to manipulate
> the thread that is holding the lock (e.g. suspend or destroy it)?
As David told you, this is not an option without a transaction
framework that would roll the structure's contents back to their value
from before the lock was taken.
>> You should avoid shared data and too many mutexes in real-time
>> applications. Use controller threads with state machines and local data.
>> The different engines can communicate over command queues, which can be
>> lock-free primitive objects. If well designed, the payload never has to
>> be passed through the command queues at all.
>
> Is what you are suggesting a CSP-like model? E.g. use synchronous IPC
> and keep otherwise shared data in per-thread storage.
I don't know what CSP stands for, but it sounds similar.
> I think that is very useful at least for correct concurrency.
Absolutely. I remember a project in the '90s where I had several dozen
threads communicating over channels, most of them with only two pages
of code. In that project concurrency problems like deadlocks were
rare, and race conditions were nearly nonexistent.
But the drawback is the number of threads. Fortunately, on the T805
starting a thread is a machine instruction that takes only a few clock
cycles. The advantage was, besides the reliability, the smooth
reaction even in unexpected situations.
> I guess the bottom line is one must carefully design his system if
> there is real-time involved in the concurrency.
Yes, partially. You have to separate the distinct components very
strictly, with few dependencies. And if your OS has only a
priority-based scheduler, you usually have to adjust the priorities
dynamically.
> But my solution might
> be of use if you are manipulating threads themselves.
There are only rare conditions under which this is a reliable
solution. In general you have to keep track of all shared resources
that the thread can use, via an independent supervisor. Therefore you
have to hook many API calls, like memory allocation and, of course,
any synchronization API, and so on.
Marcel