Ordering of sem_wait waiters

Noob

unread,

Apr 25, 2012, 11:17:09 AM4/25/12

to

Hello everyone,

The (proprietary) OS I've been working with provides two "types"
of semaphores: fifo or prio. They differ in how callers of the
wait function are handled.

When the semaphore count is 0, callers of wait on a fifo semaphore
are stored in FIFO order in the wait queue, without consideration
for the thread's priority.

Callers of wait on a prio semaphore are stored in decreasing
priority order, thus higher priority thread will be signaled
before lower priority threads, even if they were late to the
party.

Looking at POSIX semaphores, and sem_wait specifically, I'm
wondering what kind of strategy is implemented on typical
POSIX platforms, and on recent Linux kernels?

Since POSIX doesn't specify this, I suppose implementations
are free to decide whatever strategy they prefer?

http://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_wait.html

This statement might be relevant:

Applications using these functions may be subject to priority
inversion, as discussed in XBD Priority Inversion.

3.285 Priority Inversion

A condition in which a thread that is not voluntarily suspended
(waiting for an event or time delay) is not running while a lower
priority thread is running. Such blocking of the higher priority
thread is often caused by contention for a shared resource.

AFAICT, priority inversion could happen on either fifo or prio
semaphores, since the problem comes not from how waiters are
handled, but from a third unrelated medium-priority thread.

Regards.

Noob

unread,

Apr 26, 2012, 4:53:59 AM4/26/12

to

Noob wrote:

> The (proprietary) OS I've been working with provides two "types"
> of semaphores: fifo or prio. They differ in how callers of the
> wait function are handled.
>
> When the semaphore count is 0, callers of wait on a fifo semaphore
> are stored in FIFO order in the wait queue, without consideration
> for the thread's priority.
>
> Callers of wait on a prio semaphore are stored in decreasing
> priority order, thus higher priority thread will be signaled
> before lower priority threads, even if they were late to the
> party.
>
> Looking at POSIX semaphores, and sem_wait specifically, I'm
> wondering what kind of strategy is implemented on typical
> POSIX platforms, and on recent Linux kernels?

AFAIU, on Linux, NPTL's sem_wait seems to be implemented using
a "futex" (Fast Userspace muTex).

http://sourceware.org/cgi-bin/cvsweb.cgi/~checkout~/libc/nptl/sysdeps/unix/sysv/linux/sem_wait.c?cvsroot=glibc
https://www.kernel.org/doc/man-pages/online/pages/man7/futex.7.html
https://www.kernel.org/doc/man-pages/online/pages/man2/futex.2.html

So the question becomes: how does Linux handle threads that call
FUTEX_WAIT?

Noob

unread,

Apr 26, 2012, 5:47:57 AM4/26/12

to

Noob wrote:

> Looking at POSIX semaphores, and sem_wait specifically, I'm
> wondering what kind of strategy is implemented on typical
> POSIX platforms, and on recent Linux kernels?
>
> AFAIU, on Linux, NPTL's sem_wait seems to be implemented using
> a "futex" (Fast Userspace muTex).
>
> http://sourceware.org/cgi-bin/cvsweb.cgi/~checkout~/libc/nptl/sysdeps/unix/sysv/linux/sem_wait.c?cvsroot=glibc
> https://www.kernel.org/doc/man-pages/online/pages/man7/futex.7.html
> https://www.kernel.org/doc/man-pages/online/pages/man2/futex.2.html

Additional references:

http://en.wikipedia.org/wiki/Futex
http://lxr.linux.no/linux/Documentation/pi-futex.txt

> So the question becomes: how does Linux handle threads that call
> FUTEX_WAIT?

Looking at the source code, http://lxr.linux.no/linux/kernel/futex.c

In struct futex_q, list is a "priority-sorted list of tasks
waiting on this futex" and futex_top_waiter "returns the
highest priority waiter on a futex".

Thus, I conclude that NPTL semaphores are prio semaphores.
(Please correct me if I am mistaken.)

Regards.

Noob

unread,

May 2, 2012, 4:44:21 AM5/2/12

to

Would anyone care to comment?

Regards.

Rainer Weikusat

unread,

May 2, 2012, 7:09:46 AM5/2/12

to

Noob <ro...@127.0.0.1> writes:
> Noob wrote:
>
>> The (proprietary) OS I've been working with provides two "types"
>> of semaphores: fifo or prio. They differ in how callers of the
>> wait function are handled.
>>
>> When the semaphore count is 0, callers of wait on a fifo semaphore
>> are stored in FIFO order in the wait queue, without consideration
>> for the thread's priority.

[...]

>> Looking at POSIX semaphores, and sem_wait specifically, I'm
>> wondering what kind of strategy is implemented on typical
>> POSIX platforms, and on recent Linux kernels?
>

[...]

> Would anyone care to comment?

Hmm ... what kind of comment do you expect? I could offer a general
one, namely, trying to solve the problem of resource over-utilization
by priorisation is a misguided approach which reminds of something
like 'assume you're in a space ship and have to rescue yourself using
a really small emergency capsule, so you'd have to cut of one of your
limbs to be able to fit into it, which one would you miss least?',
IOW, this approach may serve as emergency workaround in critical
situations when other options aren't available but 'the solution' is
not to declare that right hands (I'm left-handed) are generally
disposable but to come up with a design/ implementation where 'pick
the lesser evil' descisions don't need to be made.

Noob

unread,

May 3, 2012, 6:50:45 AM5/3/12

to

Rainer Weikusat wrote:
> Noob <ro...@127.0.0.1> writes:
>> Noob wrote:
>>
>>> The (proprietary) OS I've been working with provides two "types"
>>> of semaphores: fifo or prio. They differ in how callers of the
>>> wait function are handled.
>>>
>>> When the semaphore count is 0, callers of wait on a fifo semaphore
>>> are stored in FIFO order in the wait queue, without consideration
>>> for the thread's priority.
>
> [...]
>
>>> Looking at POSIX semaphores, and sem_wait specifically, I'm
>>> wondering what kind of strategy is implemented on typical
>>> POSIX platforms, and on recent Linux kernels?
>>
>
> [...]
>
>> Would anyone care to comment?
>
> Hmm ... what kind of comment do you expect?

I was expecting something along the lines of: "POSIX indeed
leaves the behavior of sem_wait up to the implementation" and
"Yes, recent Linux kernels do implement a prio semaphore".

(Or the opposite.)

Regards.

Message has been deleted

Kaz Kylheku

unread,

Feb 23, 2017, 10:20:10 PM2/23/17

to

On 2017-02-24, zhan...@conew.com <zhan...@conew.com> wrote:
> 在 2012年4月25日星期三 UTC+8下午11:17:09，Noob写道：
^^^^^

>> Hello everyone,
>>
>> The (proprietary) OS I've been working with provides two "types"
>> of semaphores: fifo or prio.
>

> This discussion is useful.

It's also almost five years old and not going on any more.

Tseng ZH

unread,

Mar 9, 2022, 10:27:50 PM3/9/22

to

Kaz Kylheku 在 2017年2月24日星期五上午11:20:10 [UTC+8] 的信中寫道：

According to below link, I think it doesn't guarantee any order.
https://man7.org/linux/man-pages/man2/futex.2.html

FUTEX_WAKE (since Linux 2.6.0)
This operation wakes at most val of the waiters that are
waiting (e.g., inside FUTEX_WAIT) on the futex word at the
address uaddr. Most commonly, val is specified as either
1 (wake up a single waiter) or INT_MAX (wake up all
waiters). No guarantee is provided about which waiters
are awoken (e.g., a waiter with a higher scheduling
priority is not guaranteed to be awoken in preference to a
waiter with a lower priority).