
fundamental semaphore question


eddie

Sep 14, 2004, 1:10:43 AM

Hi group,

I'm doing OS as a subject and one of the pracs is to implement a
semaphore. I don't need any help with that :) What I do need help
with is a sort-of theoretical question. We have been advised to
implement a counting semaphore along the lines of:

|semaphore_wait( semaphore *sem, virtual_machine *mch )
|{
|    sem->counter--;
|    if( sem->counter < 0 )
|    {
|        // add currently running thread to blocked queue.
|        // mark currently running thread as not schedulable
|    }
|}

and

|semaphore_signal( semaphore *sem, virtual_machine *mch )
|{
|    sem->counter++;
|    if( sem->counter <= 0 )
|    {
|        // awaken the first thread on the top of the blocked thread
|        // stack and set it as schedulable again.
|    }
|}

My question is: why should only the top thread be awakened?
Suppose that A and B are threads blocked on a semaphore, in that
order. The semaphore is signalled, i.e. another thread can now
access the critical section. If both threads were runnable, the
scheduler might, for whatever reason, schedule thread B before
thread A. The way we have been told to implement it ensures that
the first thread to request the lock is the first thread to obtain
it. It's a pretty fundamental question, i.e. I'm asking why the
definition is done that way. I was hoping someone could justify
why the method above is used.

thanks,
ed.

Alexander Terekhov

Sep 14, 2004, 5:18:20 AM

eddie wrote:
[...]

> I was hoping someone could justify why the method above is used.

Your method above is basically nothing but archaic and inefficient
semantics meant to fight "starvation" and give a false sense of
"fairness". Unfortunately, SUS/POSIX's definition of semaphore
operations[1] is also sorta outdated.

Go MESA. Modern (ultra modern ;-) ) and efficient sema is this:

sema_lock:

WHILE // CAS/LL-SC
    !atomic_decrement_if_binand_7FFFFFFF_is_not_zero_ccacq(&lock)
        lock_queue_wait(&lock, 0) // wait if sem.value is zero

sema_unlock:

uintptr_t lock_queue;
IF atomic_increment_rel(lock_queue = &lock) > 0x80000000
THEN lock_queue_wake(lock_queue, 1)

(try/timed operations omitted for brevity)

"lock queue" is a parking interface (allowing spurious wakes) with
"waiters bit" maintained by its implementation in the lock status.
uintptr_t helper is needded because by the time lock_queue_wake()
is called, &lock may become indeterminate. Implementation of "lock
queue" shall cope with that.

BTW, for mutex:

mutex_lock:

WHILE
    atomic_bit_test_set_ccacq(&lock, 1)
        lock_queue_wait(&lock, 1) // wait if locked bit is set

mutex_unlock:

uintptr_t lock_queue;
IF atomic_decrement_rel(lock_queue = &lock)
THEN lock_queue_wake(lock_queue, 1)

"ccacq" stands for acquire-but-with-"control condition"-hint
(to avoid redundant memory barrier(s) -- arch specific stuff).

"rel" is a classic release.

regards,
alexander.

P.S. Stay away from semas (if you have a choice). Use condvars and
mutexes.

[1] www.opengroup.org/onlinepubs/009695399/basedefs/contents.html


"Semaphore Lock Operation

An operation that is applied to a semaphore. If, prior to the
operation, the value of the semaphore is zero, the semaphore
lock operation shall cause the calling thread to be blocked
and added to the set of threads awaiting the semaphore;
otherwise, the value shall be decremented.

Semaphore Unlock Operation

An operation that is applied to a semaphore. If, prior to the
operation, there are any threads in the set of threads awaiting
the semaphore, then some thread from that set shall be removed
from the set and becomes unblocked; otherwise, the semaphore
value shall be incremented."

Joe Seigh

Sep 14, 2004, 5:46:06 AM


It's not clear from the intermixed use of the terms queue and stack
whether they are referring to a FIFO or LIFO implementation. Probably
FIFO, so if you used the semaphore as a lock, it would be a FIFO lock.

The service order of a lock or semaphore isn't directly observable,
so it's a meaningless thing to specify. If you want to mess up
whoever assigned this, ask them how you would write a program that
could determine whether a lock or semaphore was FIFO or not.

Joe Seigh

Ed Skinner

Sep 14, 2004, 12:00:19 PM

Hey, from one "Ed" to another ("Ed"):
The assigned problem is pretty basic and ignores some facets that, in
real life, you would probably also have to handle. The teacher is doing
you a favor.
Your question, however, shows you are thinking well into "real life",
and that's good.
So, you might want to consider doing this assignment twice: once to
satisfy the simplified requirements in the original problem, and then a
second time to start dealing with real-life necessities.
Here's some discussion to consider.
The "classic" semaphore signal operation only awakens one task, but
there are more complex objects that have the behaviour you describe. A
"condition variable", for example, behaves a lot like a semaphore (in some
of what it does) but has two "wake up" APIs -- one is called "signal" and,
like the semaphore, it awakens a single task (or thread, or process
depending on what you want to call it). The other API does a "broadcast"
and that API wakes up *all* tasks (or whatever) that are waiting.
Classic semaphores are kept simple because they have a number of uses
beyond this simple wait/signal mechanism. Consider, for example, a laser
printer. Laser printers work by receiving a "picture" of an entire page
(actually, they receive data that allows them to draw the picture
internally), and then dump out that picture onto the page. As such, a
laser printer can only be used by one program at a time. We could use a
semaphore to serialize (enforce "one at a time" usage of) the laser
printer. In that environment, we might have several tasks waiting for the
laser printer to become available and, as the task that is actually using
it finishes, it "signals" the semaphore to let the next task begin. (In
practice, the semaphore is typically buried inside the print subsystem and
the using tasks are unaware of the serialization. Also typically, the
"printing" is actually done to a queue and then that backlogged work is
processed one job at a time ... but I've wandered well away from the
current discussion.)
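To make the signal-one behaviour concrete, here is a rough pthread sketch
of that laser-printer serialization done with a mutex and a condition
variable (all names invented for the example). pthread_cond_signal() wakes
a single waiter, pthread_cond_broadcast() would wake all of them, and
either is safe here because every waiter re-checks the predicate before
proceeding:

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t printer_mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  printer_cv   = PTHREAD_COND_INITIALIZER;
static bool            printer_free = true;   /* hypothetical shared state */

void acquire_printer(void)
{
    pthread_mutex_lock(&printer_mtx);
    while (!printer_free)                 /* re-check guards against spurious wakes */
        pthread_cond_wait(&printer_cv, &printer_mtx);
    printer_free = false;
    pthread_mutex_unlock(&printer_mtx);
}

void release_printer(void)
{
    pthread_mutex_lock(&printer_mtx);
    printer_free = true;
    pthread_mutex_unlock(&printer_mtx);
    pthread_cond_signal(&printer_cv);     /* one waiter is enough here;
                                             pthread_cond_broadcast() would wake all */
}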
In real-life situations, those waiting tasks may have different
execution priorities so it may be important to have the signal wake up
either the highest priority waiting task, or to simply wake them up in
FIFO order. Some OSs (such as pSOS+) allowed the creator of the semaphore
to specify which kind of behavior to use with a given semaphore.
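For what it's worth, here is a minimal user-level pthread sketch of the
FIFO flavour (every name is invented; this is not the assignment's
virtual-machine interface): each blocked thread gets its own condition
variable, so signal() decides exactly which thread runs next instead of
leaving it to the scheduler.

#include <pthread.h>

typedef struct waiter {
    pthread_cond_t  cv;
    int             released;
    struct waiter  *next;
} waiter;

typedef struct {
    pthread_mutex_t lock;          /* PTHREAD_MUTEX_INITIALIZER */
    int             counter;       /* initial semaphore value */
    waiter         *head, *tail;   /* FIFO queue of blocked threads */
} fifo_sem;

void fifo_sem_wait(fifo_sem *s)
{
    pthread_mutex_lock(&s->lock);
    if (--s->counter < 0) {
        waiter w = { .released = 0, .next = NULL };
        pthread_cond_init(&w.cv, NULL);
        if (s->tail) s->tail->next = &w; else s->head = &w;
        s->tail = &w;
        while (!w.released)                 /* only signal() sets this flag */
            pthread_cond_wait(&w.cv, &s->lock);
        pthread_cond_destroy(&w.cv);
    }
    pthread_mutex_unlock(&s->lock);
}

void fifo_sem_signal(fifo_sem *s)
{
    pthread_mutex_lock(&s->lock);
    if (s->counter++ < 0) {
        waiter *w = s->head;                /* the oldest waiter, and only that one */
        s->head = w->next;
        if (!s->head) s->tail = NULL;
        w->released = 1;
        pthread_cond_signal(&w->cv);
    }
    pthread_mutex_unlock(&s->lock);
}

A priority-ordered variant would simply insert waiters into the queue by
priority instead of at the tail.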
A deeper problem is alluded to in one of the other postings: the use of
"atomic" operations. In a multi-tasking OS, and especially in one that
supports a multiprocessor architecture, these issues become very
important. For the scope of your assignment, however, they appear to go
far beyond what you need to worry about.
A good engineering proverb to remember is, "Solve only the problem at
hand. Other problems will present themselves in due time."
Good Luck!

SenderX

Sep 14, 2004, 5:23:50 PM

> atomic_bit_test_set_ccacq(&lock, 1)
> lock_queue_wait(&lock, 1) // wait if locked bit is set
  ^^^^^^^^^^^^^^^^^^^^^^^^^

This looks like a slow path. How exactly does lock_queue_wait atomically
check the lock state? Does it need to acquire "any" locks, or is it 100%
fast-pathed unless the waiters bit is set? Humm...

The "parking interface" implementation would maintain a list of waiting
threads (protected by hashed global mutexes of some sort), and have the
ability to wake any one of them? The parking interface would need to be
synchronized with the waiter-bit check and any wake requests? Would all
algorithms that used the lock_queue need to cope with a spurious wake now
and then?

Sorry for all the questions, Alex, I'm trying to understand the semantics
of your ultra modern semas...

:)


Alexander Terekhov

Sep 14, 2004, 6:12:06 PM

SenderX wrote:
>
> > atomic_bit_test_set_ccacq(&lock, 1)
>
> > lock_queue_wait(&lock, 1) // wait if locked bit is set
> ^^^^^^^^^^^^^^^^^^^^^^^^^
>
> This looks like a slow-path. How exactly does lock_queue_wait atomically
> check lock state, does it need to acquire "any" locks,

It can be slow and may acquire some locks. Well, see RTNPTL (but
note that its "robust" stuff is quite a fallacy).

> or is it 100%
> fast-pathed unless waiters bit is set? Humm...

WHILE
    atomic_bit_test_set_ccacq(&lock, 1)
        lock_queue_wait(&lock, 1) // wait if locked bit is set

>

> The "parking interface" implementation would maintain a list of waiting
> threads ( protected by hashed global mutex's of some sort ), and have the
> ability to wake any one of them?

Yes.

> The parking interface would need to be
> synchronized with the waiter bit check, and any wake requests?

If "realtime" is not an issue,

IF atomic_swap_rel(lock_queue = &lock, 0) < 0 // or > 1
THEN lock_queue_wake(lock_queue, 1)

is even better (I mean mutex). lock_queue_wake() (actually wake exit
path in lock_queue_wait()) would have to set waiters bit, not clear
it (that's the case with decrement).

> Would all
> algorithms that used the lock_queue need to cope with a spurious wake now
> and then?

Yup. "all" == mutexes + semas.

regards,
alexander.

John Taylor

Sep 14, 2004, 8:21:52 PM

Alexander Terekhov <tere...@web.de> wrote in message news:<4146B75C...@web.de>...

> P.S. Stay away from semas (if you have a choice). Use condvars and
> mutexes.
Okay, semaphores are bad, but why? In my experience (which is mostly
RTOSes with a single CPU and a single memory space, i.e. threads but
no processes), I have not seen any inherent disadvantages/flaws to
using semaphores. I also don't understand how a binary semaphore is
"bad" vs. a mutex. In fact I do not see any implementation differences
between a binary semaphore and a non-recursive mutex.

Alexander Terekhov

Sep 14, 2004, 9:56:18 PM

John Taylor wrote:
[...]

> using semaphores. I also don't understand how a binary semaphore is
> "bad" vs. a mutex. ...

Semaphore (lock operation) is a mandatory cancellation point, may
return EINTR, doesn't support mutex priority protocols (lacks the
concept of ownership), and can't be used together with condvars
to build robust and efficient synchronization protocols.
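
For example (a tiny sketch, not from the post): portable code generally
has to wrap sem_wait() in a retry loop, which pthread_mutex_lock() never
needs.

#include <semaphore.h>
#include <errno.h>

/* sem_wait() is a cancellation point and may fail with EINTR when a
   signal is delivered, so callers usually retry it. */
int sema_wait_retry(sem_t *sem)
{
    int rc;
    while ((rc = sem_wait(sem)) == -1 && errno == EINTR)
        continue;
    return rc;
}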

regards,
alexander.

PS. http://google.com/groups?selm=3CED3306.DF6DB829%40web.de

SenderX

Sep 15, 2004, 3:36:56 PM

> It can be slow and may acquire some locks. Well, see RTNPTL (but
> note that its "robust" stuff is quite a fallacy).


>> or is it 100%
>> fast-pathed unless waiters bit is set? Humm...
>
> WHILE
> atomic_bit_test_set_ccacq(&lock, 1)
> lock_queue_wait(&lock, 1) // wait if locked bit is set

I was wondering how the lock_queue_wait checks lock state...

int lock_queue_wait( int *plock, int cmp )
{
    /* Get the lock-queue for the futex? */
    lock_queue_impl *pImpl = lock_queue_self( plock );

    /* Could you set the bit here? */
a1: if ( ! atomic_cas( plock, cmp, cmp & 0x80000000 ) )
    {
        return EWOULDBLOCK;
    }

    /* Enqueue ourselves, and wait in the lock-queue */
a2: return lock_queue_push_and_wait
        ( pImpl, plock, cmp, pthread_self() );
}


If you could atomically set the waiters bit "and" acquire the lock_queue's
mutex... That would be something...

>
> is even better (I mean mutex). lock_queue_wake() (actually wake exit
> path in lock_queue_wait()) would have to set waiters bit, not clear
> it (that's the case with decrement).

Ok, that seems to mimic the logic in your fast mutex for Windows? You do a
swap, then your wait loop always swings the lock into the waiters state
before it returns to the user.


Alexander Terekhov

Sep 16, 2004, 5:34:02 AM

SenderX wrote:
[...]

> Ok, that seems to mimic the logic in your fast mutex for windows? You do a
> swap, then your wait loop always swings the lock into the waiters state
> before it returns to the user.

Not always. Only if the wait queue is not empty.

regards,
alexander.

Alexander Terekhov

Nov 4, 2004, 5:07:36 PM

Alexander Terekhov wrote:
[...]

> sema_lock:
>
> WHILE // CAS/LL-SC
> !atomic_decrement_if_binand_7FFFFFFF_is_not_zero_ccacq(&lock)
> lock_queue_wait(&lock, 0) // wait if sem.value is zero
>
> sema_unlock:
>
> uintptr_t lock_queue;
> IF atomic_increment_rel(lock_queue = &lock) > 0x80000000
> THEN lock_queue_wake(lock_queue, 1)

sema_getvalue:

RETURN atomic_load_acq(&lock) & 7FFFFFFF

DCSI-SEMAS: (double-checked serialized initialization with semas)

Given:

sema sema_flag(0)
sema sema_lock(1)

Solution:

once_init() {
    // 1st check
    if (!sema_flag.getvalue()) {
        // Lock
        sema::lock_guard guard(sema_lock);
        // 2nd check
        if (!sema_flag.getvalue()) {
            <init>
            sema_flag.post();
        }
    }
}
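
A rough POSIX rendering of the same idea, assuming sema_flag was created
with sem_init(&sema_flag, 0, 0) and sema_lock with sem_init(&sema_lock, 0, 1);
whether sem_getvalue() gives enough ordering for the first check is exactly
the memory-model point taken up in the follow-ups:

#include <semaphore.h>
#include <errno.h>

static sem_t sema_flag;   /* initial value 0: "not yet initialized" */
static sem_t sema_lock;   /* initial value 1: binary lock */

void once_init(void)
{
    int done;

    sem_getvalue(&sema_flag, &done);                 /* 1st check */
    if (!done) {
        while (sem_wait(&sema_lock) == -1 && errno == EINTR)
            continue;                                /* lock */
        sem_getvalue(&sema_flag, &done);             /* 2nd check */
        if (!done) {
            /* <init> goes here */
            sem_post(&sema_flag);                    /* publish "initialized" */
        }
        sem_post(&sema_lock);                        /* unlock */
    }
}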

regards,
alexander.

Alexander Terekhov

Nov 6, 2004, 10:45:12 AM

Everybody and his dog oughta know that semas suck.

Alexander Terekhov wrote:
[...]
> > sema_lock:
> >
> > WHILE // CAS/LL-SC
> > !atomic_decrement_if_binand_7FFFFFFF_is_not_zero_ccacq(&lock)

Err. !atomic_decrement_if_binand_7FFFFFFF_is_not_zero_ccacqrel(&lock)

It must be fully fenced, because...

> > lock_queue_wait(&lock, 0) // wait if sem.value is zero
> >
> > sema_unlock:
> >
> > uintptr_t lock_queue;
> > IF atomic_increment_rel(lock_queue = &lock) > 0x80000000
> > THEN lock_queue_wake(lock_queue, 1)
>
> sema_getvalue:
>
> RETURN atomic_load_acq(&lock) & 7FFFFFFF
>
> DCSI-SEMAS: (double-checked serialized initialization with semas)
>
> Given:
>
> sema sema_flag(0)
> sema sema_lock(1)
>
> Solution:
>
> once_init() {
> // 1st check
> if (!sema_flag.getvalue()) {
> // Lock
> sema::lock_guard guard(sema_lock);
> // 2nd check
> if (!sema_flag.getvalue()) {
> <init>
> sema_flag.post();
> }
> }
> }


... the {upcoming} POSIX-C/C++ memory model also allows

DCSI-SEMAS-ALL-ONE:

Given:

sema sema_flag(1)
sema sema_lock(1)

Solution:

once_init() {
    // 1st check
    if (sema_flag.getvalue()) {
        // Lock
        sema::lock_guard guard(sema_lock);
        // 2nd check
        if (sema_flag.getvalue()) {
            <init>
            sema_flag.wait(); // decrement
        }
    }
}

IOW, a semaphore can be used as (quoting DRB) "atomic/coherent
counter".

So let it be.

regards,
alexander.

Alexander Terekhov

Nov 6, 2004, 2:47:09 PM

Alexander Terekhov wrote:
[...]

> the {upcoming} POSIX-C/C++ memory model

atomic<> and word-tearing-free programming (I mean "isolated<>"***)
aside for a moment, see

http://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-review-l&id=1876

regards,
alexander.

***) http://groups.google.com/groups?selm=418AAF67.99C5F579%40web.de
