
basic question about concurrency


elvira

Mar 4, 2010, 11:19:10
Hi,
while reading a textbook about concurrency, I found that the process
code is assumed to be:

repeat
    entry section
    critical section
    exit section
    remainder section
forever

I don't understand why the loop appears in the above code. Does it
imply more than one process running the critical section? Even if that
were the case, they could take out the loop and simply note that more
than one process has a similar critical section and that these can run
concurrently.

David Schwartz

Mar 4, 2010, 16:29:11

Suppose while in the 'remainder section', you discovered that there
was more work to be done. If not for the loop, when would that work
get done?
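
A minimal pthreads sketch of that pattern (hypothetical names, error
checking omitted; the textbook's "entry section" / "exit section" become
a mutex lock/unlock here), showing why the loop matters: each iteration
re-enters the critical section to handle the next piece of work.

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {                              /* the "repeat ... forever" loop */
        pthread_mutex_lock(&lock);          /* entry section */
        ++shared_counter;                   /* critical section */
        pthread_mutex_unlock(&lock);        /* exit section */
        /* remainder section: non-critical work, which may discover
           more to do; the next iteration then handles it */
    }
    return NULL;
}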

DS

elvira

Mar 5, 2010, 20:21:20

Thanks :). Another question, if you don't mind... does Pthreads
implement Hoare-style or Mesa-style condition variables?

David Schwartz

Mar 6, 2010, 20:17:14
On Mar 5, 5:21 pm, elvira <elvira_w...@yahoo.com> wrote:

> Thanks :). Another question, if you don't mind... does Pthreads
> implement Hoare-style or Mesa-style condition variables?

Mesa style. A thread may signal a POSIX condition variable whether or
not it holds the associated mutex and a signaling operation never
blocks the signaling thread. This follows the POSIX philosophy of
implementing the simplest possible synchronization primitives and
leaving it to the application to implement more complex ones if it
desires.
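
In practical terms, Mesa semantics (plus the possibility of spurious
wakeups) means a waiter must always re-check its predicate in a loop; a
signal is only a hint that the predicate may now hold. A minimal sketch
with hypothetical names, error checking omitted:

#include <pthread.h>

static pthread_mutex_t mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;                      /* the predicate the condvar stands for */

void wait_until_ready(void)
{
    pthread_mutex_lock(&mtx);
    while (!ready)                         /* Mesa style: re-check after every wakeup */
        pthread_cond_wait(&cond, &mtx);    /* atomically unlocks mtx and sleeps */
    /* predicate holds here, with mtx held */
    pthread_mutex_unlock(&mtx);
}

void set_ready(void)
{
    pthread_mutex_lock(&mtx);
    ready = 1;
    pthread_mutex_unlock(&mtx);
    pthread_cond_signal(&cond);            /* legal with or without the mutex held */
}

Under Hoare semantics an "if" would suffice, because the signaled
condition is handed to the waiter atomically; under Mesa semantics the
"while" is mandatory.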

DS

Ersek, Laszlo

Mar 7, 2010, 14:45:05
In article
<3542c642-b53b-4a59...@t31g2000prh.googlegroups.com>,
David Schwartz <dav...@webmaster.com> writes:

> A thread may signal a POSIX condition variable whether or
> not it holds the associated mutex [...]

Yes; "however, if predictable scheduling behaviour is required, then
that mutex is locked by the thread calling pthread_cond_signal() or
pthread_cond_broadcast()."

http://www.opengroup.org/onlinepubs/007908775/xsh/pthread_cond_signal.html
http://www.opengroup.org/onlinepubs/000095399/functions/pthread_cond_broadcast.html
http://www.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_broadcast.html

(I know you know this, but the OP might not.)

Cheers,
lacos

David Schwartz

Mar 7, 2010, 15:50:39
On Mar 7, 11:45 am, la...@ludens.elte.hu (Ersek, Laszlo) wrote:

> Yes; "however, if predictable scheduling behaviour is required, then
> that mutex is locked by the thread calling pthread_cond_signal() or
> pthread_cond_broadcast()."

> (I know you know this, but the OP might not.)

Actually, I have always wondered what that meant. I don't see how the
scheduling behavior is any more or less predictable in either case
with pthread_cond_broadcast. I can see a weak, almost-convincing
argument that with pthread_cond_signal you are guaranteed not to wake a
thread that chose to block, based on the state of the condition
variable, after you chose to signal.

I'm not sure what "predictable scheduling behavior" even means in
that context, and I wonder if anyone does.

DS

Ersek, Laszlo

Mar 7, 2010, 17:13:16
In article <023a70a9-50b9-4350...@p3g2000pra.googlegroups.com>, David Schwartz <dav...@webmaster.com> writes:

Here's my take on it. Let's consider a trivial (and consequently,
probably buggy) semaphore implementation (error checking ignored etc
etc):


struct sema
{
    unsigned val;
    pthread_cond_t positive;
    pthread_mutex_t mutex;
};


void
sema_post(struct sema *sema)
{
    unsigned prev_val;

    (void)pthread_mutex_lock(&sema->mutex);
    prev_val = sema->val++; /* P1 */

#ifdef IN
    if (0u == prev_val) {
        (void)pthread_cond_broadcast(&sema->positive);
    }
#endif

    (void)pthread_mutex_unlock(&sema->mutex);

#ifndef IN
    if (0u == prev_val) {
        (void)pthread_cond_broadcast(&sema->positive);
    }
#endif
}


void
sema_wait(struct sema *sema)
{
    (void)pthread_mutex_lock(&sema->mutex); /* P2 */
    while (0u == sema->val) {
        (void)pthread_cond_wait(&sema->positive, &sema->mutex); /* P3 */
    }
    --sema->val;
    (void)pthread_mutex_unlock(&sema->mutex);
}


Three threads: T1, T2, T3. Three positions: P1, P2, P3.

T1 has just finished executing the line at P1, so it is in the critical
section that protects sema->val.

T2 is blocked at P2.

T3 has previously examined sema->val, but it was not positive, so it
went to sleep on the condvar: T3 is sleeping at P3.

Suppose IN is defined (which is what the SUS tells us to do if
"predictable scheduling behaviour is required"). When the broadcast call
returns with the mutex held by T1, neither T2 nor T3 can advance, but T3
can immediately become blocked on the mutex (instead of sleeping on the
condvar), and so when T1 leaves the critical section, T2 and T3 contend
for the newly available mutex with equal chances (or, hopefully, the
implementation can favor T3, since it knows T3 is trying to relock the
mutex after sleeping). That is, the implementation can move T3's
blocking obstacle from the condvar to the mutex synchronously with the
broadcast call, thus promoting all condvar-waiting threads to the class
of mutex-waiting threads.

Suppose IN is *not* defined. T1 first unlocks the mutex, then
broadcasts. There is a nonzero chance that, as soon as T1 unlocks the
mutex, T2 will enter the critical section (acquiring the mutex at P2),
fetch the value just posted to the semaphore, and exit the section.
Then, some time later, after T1 gets around to broadcasting, T3 will
wake up at P3, relock the mutex (possibly waiting while T2 steals the
value), see that there is nothing to consume, and go back to sleep on
the condvar. Even though T3 had been waiting for a long time, it cannot
consume the posted value first. The implementation cannot effectively
promote the condvar-waiting thread to the class of mutex-waiting
threads, because at the time of this promotion (ie. the broadcast call
in T1), there is no longer any mutex-waiting thread.

In short, the first is FIFO- (queue-)like, the second is LIFO- (stack-)
like, and "predictable scheduling behavior" is intuitively queue-like.

At least, that's my take on it :)

Cheers,
lacos

Ersek, Laszlo

Mar 7, 2010, 17:25:32
In article <1V8QPi3j1qPh@ludens>, la...@ludens.elte.hu (Ersek, Laszlo) writes:

> void
> sema_post(struct sema *sema)
> {
>     unsigned prev_val;
>
>     (void)pthread_mutex_lock(&sema->mutex);
>     prev_val = sema->val++; /* P1 */
>
> #ifdef IN
>     if (0u == prev_val) {
>         (void)pthread_cond_broadcast(&sema->positive);
>     }
> #endif
>
>     (void)pthread_mutex_unlock(&sema->mutex);
>
> #ifndef IN
>     if (0u == prev_val) {
>         (void)pthread_cond_broadcast(&sema->positive);
>     }
> #endif
> }

Small addition: if IN is *not* defined, then at the time of
broadcasting, the calling thread actually has no idea whether or not the
logical predicate associated with the condition variable holds (that is,
whether "the stored value is positive" or not), because it no longer
owns the lock that protects the shared resource.

For this reason alone, even without the SUS' remark on scheduling
behaviour, I prefer the form with IN defined, as I like to keep the
number of spurious broadcasts minimal.

Cheers,
lacos

David Schwartz

Mar 8, 2010, 00:50:23
On Mar 7, 2:25 pm, la...@ludens.elte.hu (Ersek, Laszlo) wrote:

> For this reason alone, even without the SUS' remark on scheduling
> behaviour, I prefer the form with IN defined, as I like to keep the
> number of spurious broadcasts minimal.

Why though? All that happens on a spurious broadcast is that the thread
goes back to sleep immediately. If you broadcast while holding the
mutex, you have the same issue: the threads cannot wake when you tell
them to. It sounds like you're trying to prevent a rare performance
problem by introducing a common performance problem.

In the vast majority of cases, the condvar is associated with only a
single logical predicate. So you don't particularly care which threads
you wake. If you're broadcasting, you want to wake all of them that
are sleeping regardless of when they went to sleep. If you're
signaling, you only want to wake one, and you don't care which.

If you signal while holding the lock, you will *never* wake a thread
that can make forward progress. If you broadcast while holding the
lock, none of your threads can make forward progress. It sounds like
you are deliberately forcing the worst case to avoid a slight
inefficiency.

(But I agree, you are dead on if those assumptions don't hold. For
example, if you signal the same condvar for two different predicates
such as 'queue empty' and 'queue full'. You may wake lots of 'wrong
threads' in that case.)

DS

Ersek, Laszlo

Mar 8, 2010, 06:39:58
In article <85142b66-b6a8-47bb...@b5g2000prd.googlegroups.com>, David Schwartz <dav...@webmaster.com> writes:

> On Mar 7, 2:25 pm, la...@ludens.elte.hu (Ersek, Laszlo) wrote:
>
>> For this reason alone, even without the SUS' remark on scheduling
>> behaviour, I prefer the form with IN defined, as I like to keep the
>> number of spurious broadcasts minimal.
>
> Why though? All that happens on a spurious broadcast is the thread
> goes back to sleep immediately. If you broadcast while holding the
> mutex, you have the same issue, the threads cannot wake when you tell
> them to.

(I'm probably misunderstanding you.)

1) If a thread gets woken up, checks some shared variables, then goes
back to sleep, that is not the same as if the thread is never woken up
(scheduled) in the first place. Context switches and whatnot. At least,
the thread might advance from sleeping on the condvar to blocking on the
mutex in kernel space, without entering the user code. -- I know you
know more about this than I do, so I reckon you'll debunk this.

2) By saving a few spurious broadcasts I didn't mean *how* a spurious
broadcast ripples through the system; rather, I meant that I like to
avoid sending broadcasts if possible.

I have posted this link to an article of mine to these groups many
times before; sorry for repeating it again:

"How to design condition variables (or monitors), with an eye towards C"

http://www.reddit.com/r/programming/comments/9ynxv/utter_verbiage_how_to_design_condition_variables/
http://www.reddit.com/r/programming/comments/9ynxv/utter_verbiage_how_to_design_condition_variables/c0f1mna

In short, I don't send a broadcast if I can prove that there was no
false->true *change* in the advancement predicate of the sleeping
threads. To determine if there was such a change, one must evaluate the
predicate both before and after modifying the shared state. (This is
done in my sema_post() example by checking prev_val against 0u.) This
approach doesn't work if I release the lock before sending the
broadcast, because unlocking the mutex immediately makes the
after-evaluation of the predicate stale. This doesn't invalidate the
broadcast that comes after releasing the mutex, but some cases of the
broadcast placed that way could have been saved by placing it in the
critical section.

... I think my deeper incentive for this is striving for elegance. Of
course one can argue that the approach doesn't make sense at all, and my
sense of elegance is flawed. In the end, you can unconditionally
broadcast in sema_post(), independently of the previous value of the
counter, outside of the protected region, and everything will work.


> In the vast majority of cases, the condvar is associated with only a
> single logical predicate. So you don't particularly care which threads
> you wake. If you're broadcasting, you want to wake all of them that
> are sleeping regardless of when they went to sleep. If you're
> signaling, you only want to wake one, and you don't care which.
>
> If you signal while holding the lock, you will *never* wake a thread
> that can make forward progress. If you broadcast while holding the
> lock, none of your threads can make forward progress. It sounds like
> you are deliberately forcing the worst case to avoid a slight
> inefficiency.

If you mean by "worst case" that no thread can advance until I release
the lock after broadcasting, and that at that point they will all
thunder on the mutex, while in the other case, the thundering herd is
smaller, because threads not trying to re-lock from within the wait
could advance as soon as I release the mutex, then I have to agree.

Thanks for making me aware of this. I still find that the
broadcast-within-section approach creates a "leveler playing field" for
all consumer threads, but I admit that this may cause bigger spikes, and
also that -- perhaps -- a level playing field in this sense is not that
important.

I'm curious what effect this would have on performance, if the consumer
threads were bound to specific processor cores.


> (But I agree, you are dead on if those assumptions don't hold. For
> example, if you signal the same condvar for two different predicates
> such as 'queue empty' and 'queue full'. You may wake lots of 'wrong
> threads' in that case.)

I think I would never do such a thing (knowingly, at least). I believe
all threads waiting on the same condvar must be "homogenous". The
condvar can stand for "P1 || P2 ... || Pn", but any thread woken up must
be able to handle all of the 2**n-1 variations satisfying the predicate.


Thanks,
lacos

David Schwartz

Mar 8, 2010, 07:13:50
On Mar 8, 3:39 am, la...@ludens.elte.hu (Ersek, Laszlo) wrote:

> 1) If a thread gets woken up, checks some shared variables, then goes
> back to sleep, that is not the same as if the thread is never woken up
> (scheduled) in the first place. Context switches and whatnot. At least,
> the thread might advance from sleeping on the condvar to blocking on the
> mutex in kernel space, without entering the user code. -- I know you
> know more about this than I do, so I reckon you'll debunk this.

If you get a spurious wakeup from a race like this, it's also possible
that by the time the thread runs, the state will be what it was
waiting for anyway, and the spurious wakeup will turn out to be
(later) correct. Also, you can't assume that a condvar has wait
morphing (the ability to move a thread from waiting on the condvar to
waiting on the mutex without waking it).

> In short, I don't send a broadcast if I can prove that there was no
> false->true *change* in the advancement predicate of the sleeping
> threads. To determine if there was such a change, one must evaluate the
> predicate both before and after modifying the shared state. (This is
> done in my sema_post() example by checking prev_val against 0u.) This
> approach doesn't work if I release the lock before sending the
> broadcast, because unlocking the mutex immediately makes the
> after-evaluation of the predicate stale. This doesn't invalidate the
> broadcast that comes after releasing the mutex, but some cases of the
> broadcast placed that way could have been saved by placing it in the
> critical section.

Right, but in exchange you are waking threads that you know can make
no forward progress (because you hold the mutex). Again, to save a
possible sub-optimal condition, you are forcing a certain sub-optimal
condition.

> ... I think my deeper incentive for this is striving for elegance. Of
> course one can argue that the approach doesn't make sense at all, and my
> sense of elegance is flawed. In the end, you can unconditionally
> broadcast in sema_post(), independently of the previous value of the
> counter, outside of the protected region, and everything will work.

You can also broadcast conditionally outside of the protected region.
So long as some thread broadcasts some time after it changes the
predicate, you will never get stuck threads. If the predicate changed,
some thread must have changed it. That thread is guaranteed to
broadcast after it releases the mutex. So no thread that blocked
before the predicate change can still be blocked after that thread
broadcasts.

> > If you signal while holding the lock, you will *never* wake a thread
> > that can make forward progress. If you broadcast while holding the
> > lock, none of your threads can make forward progress. It sounds like
> > you are deliberately forcing the worst case to avoid a slight
> > inefficiency.

> If you mean by "worst case" that no thread can advance until I release
> the lock after broadcasting, and that at that point they will all
> thunder on the mutex, while in the other case, the thundering herd is
> smaller, because threads not trying to re-lock from within the wait
> could advance as soon as I release the mutex, then I have to agree.

Exactly. And in some implementations, threads will wake up from the
condvar just to go to sleep on the mutex.

> Thanks for making me aware of this. I still find that the
> broadcast-within-section approach creates a "leveler playing field" for
> all consumer threads, but I admit that this may cause bigger spikes, and
> also that -- perhaps -- a level playing field in this sense is not that
> important.

I don't think it even actually does that.

> > (But I agree, you are dead on if those assumptions don't hold. For
> > example, if you signal the same condvar for two different predicates
> > such as 'queue empty' and 'queue full'. You may wake lots of 'wrong
> > threads' in that case.)

> I think I would never do such a thing (knowingly, at least). I believe
> all threads waiting on the same condvar must be "homogenous". The
> condvar can stand for "P1 || P2 ... || Pn", but any thread woken up must
> be able to handle all of the 2**n-1 variations satisfying the predicate.

You can do this, but it creates all kinds of hazards. You can even do
this and use pthread_cond_signal if you're really, really careful. But
the possibility of race conditions is massive and it's not worth the
effort to validate such a hideous beast. (And this is the one case
where signaling outside the mutex can actually mess you up massively.)
However, if you always use pthread_cond_broadcast, it's not too
terribly hard to use the same condvar to signal two different
predicates (such as 'queue empty' and 'queue full').

1) It is always safe to move a pthread_cond_broadcast (whether
conditional or not) outside of the protected area.

2) It is safe to move a pthread_cond_signal (whether conditional or
not) outside of the protected area if:
A) The condvar only ever signals one predicate state, AND
B) Any thread blocked on the condvar can service the predicate you
are signaling. (Regardless of which state it saw the predicate in when
it chose to block on the condvar).

Note that even if 2A and 2B are false, it may still be safe to move a
pthread_cond_signal outside the protected area. But proving it's safe
is more trouble than it's worth.
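
As a concrete illustration of the broadcast-only, two-predicate case
mentioned above: a bounded queue where one condvar stands for "the queue
state changed", covering both "not full" (producers) and "not empty"
(consumers). This is only a minimal sketch with hypothetical names,
error checking omitted as in the earlier examples; per rule 1, the
unconditional broadcast sits outside the protected region.

#include <pthread.h>

#define QCAP 16

struct bq {
    int             buf[QCAP];
    unsigned        head, count;
    pthread_mutex_t mtx;
    pthread_cond_t  change;                     /* "the queue state changed" */
};

void bq_init(struct bq *q)
{
    q->head = 0;
    q->count = 0;
    (void)pthread_mutex_init(&q->mtx, NULL);
    (void)pthread_cond_init(&q->change, NULL);
}

void bq_push(struct bq *q, int v)
{
    (void)pthread_mutex_lock(&q->mtx);
    while (q->count == QCAP)                    /* predicate: queue full */
        (void)pthread_cond_wait(&q->change, &q->mtx);
    q->buf[(q->head + q->count) % QCAP] = v;
    q->count++;
    (void)pthread_mutex_unlock(&q->mtx);
    (void)pthread_cond_broadcast(&q->change);   /* wakes producers and consumers alike */
}

int bq_pop(struct bq *q)
{
    int v;

    (void)pthread_mutex_lock(&q->mtx);
    while (q->count == 0)                       /* predicate: queue empty */
        (void)pthread_cond_wait(&q->change, &q->mtx);
    v = q->buf[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    (void)pthread_mutex_unlock(&q->mtx);
    (void)pthread_cond_broadcast(&q->change);
    return v;
}

Every waiter re-checks only its own predicate, so waking "wrong threads"
costs some churn but never correctness; with pthread_cond_signal instead
of broadcast here, a wakeup could be consumed by a thread waiting on the
other predicate and effectively get lost.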

DS

Ersek, Laszlo

Mar 8, 2010, 08:28:11
In article
<7c0b3b8a-3ba4-41ad...@x1g2000prb.googlegroups.com>,
David Schwartz <dav...@webmaster.com> writes:

> On Mar 8, 3:39 am, la...@ludens.elte.hu (Ersek, Laszlo) wrote:

>> If you mean by "worst case" that no thread can advance until I release
>> the lock after broadcasting, and that at that point they will all
>> thunder on the mutex, while in the other case, the thundering herd is
>> smaller, because threads not trying to re-lock from within the wait
>> could advance as soon as I release the mutex, then I have to agree.
>

> Exactly. And in some implementations, threads will wake up from the
> condvar just to go to sleep on the mutex.
>

>> Thanks for making me aware of this. I still find that the
>> broadcast-within-section approach creates a "leveler playing field" for
>> all consumer threads, but I admit that this may cause bigger spikes, and
>> also that -- perhaps -- a level playing field in this sense is not that
>> important.
>

> I don't think it even actually does that.

Okay, you've sunk my argument made in Message-ID:
<iv+DNMw6xuEN@ludens>, ie. the one in which I state "even without the
SUS' remark on scheduling behaviour, I prefer the form with IN
defined".

However, since the standard, under my perception, seems to associate a
positive tone (however vague) with broadcasting/signalling in the
critical section, I would still put the corresponding call there:

----v----


if predictable scheduling behaviour is required, then that mutex is
locked by the thread calling pthread_cond_signal() or
pthread_cond_broadcast()

----^----

You mention in Message-ID:
<023a70a9-50b9-4350...@p3g2000pra.googlegroups.com>:

----v----


Actually, I have always wondered what that meant. I don't see how the
scheduling behavior is any more or less predictable in either case with
pthread_cond_broadcast.

----^----

Given that the latest version (v4) of the SUS continues to contain
equivalent normative wording and no related rationale, and that you seem
to be a member of the Austin CSRG, would you please consider bringing up
this topic on the list? You can certainly argue better than I ever could
that this passage is confusing and/or meaningless. Thanks.

--o--

(Slight topic change.)

> 2) It is safe to move a pthread_cond_signal (whether conditional or
> not) outside of the protected area if:
> A) The condvar only ever signals one predicate state, AND
> B) Any thread blocked on the condvar can service the predicate you
> are signaling. (Regardless of which state it saw the predicate in when
> it chose to block on the condvar).

If you don't mind, let's return to the sema_post() / sema_wait() example
from earlier, with IN undefined, and pthread_cond_broadcast() replaced
with pthread_cond_signal(). I leave the condition for signalling in
place, but it is not important for me now (we can remove "prev_val"
altogether).


struct sema
{
    unsigned val;
    pthread_cond_t positive;
    pthread_mutex_t mutex;
};

void
sema_post(struct sema *sema)
{
    unsigned prev_val;

    (void)pthread_mutex_lock(&sema->mutex);
    prev_val = sema->val++;

    (void)pthread_mutex_unlock(&sema->mutex);

    if (0u == prev_val) {
        (void)pthread_cond_signal(&sema->positive);
    }
}


void
sema_wait(struct sema *sema)
{
    (void)pthread_mutex_lock(&sema->mutex);

    while (0u == sema->val) {
        (void)pthread_cond_wait(&sema->positive, &sema->mutex);
    }

    --sema->val;
    (void)pthread_mutex_unlock(&sema->mutex);
}


This seems to satisfy all requirements and so it is safe, but I'd claim
this is sub-optimal. To paraphrase what I wrote in my reddit post:

----v----
# Why not use pthread_cond_signal() with multiple consumer threads, if
we know that the produced data, which switches P from false to true, is
enough for a single consumer? Because if the producer gets scheduled N
times in a row, and sends out a single wakeup signal, then only one
consumer will wake up, and N-1 items will idle in the queue while other
consumers sleep.

# Why not use pthread_cond_signal() with multiple consumer threads, if
we call it each time we produce something, independently from the
previous value of P? Because the producer might get scheduled N times in
a row, sending out N single wakeups, which might all be routed to the
same sleeping consumer. One consumer will wake up, take one chunk to
process, other consumers will stay sleeping, even though the queue holds
N-1 items.
----^----

What do you think?

Thank you very much,
lacos

Dave Butenhof

Mar 9, 2010, 11:20:09

Remember that realtime programming is about being predictable first;
while "mainstream" thread programming is more about being fast at the
deliberate expense of predictability. Not that realtime doesn't like
fast, too; it's just that losing predictability is an unacceptable cost.

The statement comes from realtime concerns about the advice put in (by
the "threadie" side of the virtual aisle) that signaling or broadcasting
"outside" the mutex makes for more efficient parallel processing.

The passage may seem confusing or even meaningless in certain contexts
(and even most pure realtime programmers would say that a program with
this concern is poorly designed), but it was put there specifically to
defuse a specific concern during balloting of the standard. It's perhaps
important to remember here that we (the sub working group that was
trying to push threads into POSIX) were working (for convenience, though
it wasn't always "convenient" as such) within the context of the
REALTIME working group (1003.4). Though of little interest to a vast
majority of thread programmers out there, (including, unfortunately,
many of the members of the working group who were there only for
threads), the realtime aspects of POSIX threads were absolutely critical
to the initial success of the standard document and working group.

Really, when it comes down to the "brass tacks", this is a political
statement, not a technical statement. One might easily argue that it
belonged more properly in rationale rather than normative text; but
that's a bit complicated by standards history. Under TOG, it's perfectly
reasonable to object to misleading or poorly written rationale; but
within 1003.4, objections to rationale were quietly ignored. It wasn't
part of the standard, and once it got in there it was there to stay; and
everyone knew it was meaningless. Sometimes, we could settle arguments
by allowing the loser to write up some rationale griping about it. In
other cases, everyone knew that was too easy; and some normative olive
branch was required. Though I don't recall the specific name or time,
I'm virtually certain this was such an instance.

Here's my reply to the query on the austin mailing list, which I noticed
before I got caught up here:

--------------------------------------------------

Basically, it was there due to some concerns between the real time and
threads "gangs" in the original 1003.4 working group that developed
POSIX threads.

The problem (from the realtime side) with condition variables is that
if you can signal/broadcast without holding the mutex, and any thread
currently running can acquire an unlocked mutex and check a predicate
without reference to the condition variable, then you can have an
indirect priority inversion.

That is, high priority consumer thread A blocks on a condition variable
because there are no entries on the queue; while lower priority consumer
thread B continues processing some request. Producer thread C now adds
an entry to the queue, unlocks the mutex, and signals the associated
condition variable to unblock the highest priority waiter. But before
the thread can be unblocked, scheduled, and contend for the mutex,
thread B completes its work, acquires the mutex (which has no waiters),
and removes the next work item. Higher priority thread A, when it is
scheduled, either finds the mutex still locked by thread B and needs to
reblock, or finds the mutex unlocked and "its" queue item gone. (On top
of this, it may have preempted thread B to find this out, so that no
work is getting done while we do the scheduling dance.) Even on a strict
realtime scheduling uniprocessor this is perfectly reasonable; there
might be a timeslice (e.g., producer and consumer B are SCHED_RR of
equal priority) between the unlock and the signal or broadcast; the
higher priority thread remains blocked and can't be considered by the
scheduler. (Many realtime designers would consider this bad design; and
in many ways it certainly is; but take it as just a simplified
conceptual sketch. ;-) )

Now, in "high performance thread" terms, this whole concept is generally
bad design; if the threads were symmetric and priority was associated
with the work (that is, higher priority work was simply at the head of
the queue to be handled by whatever consumer gets to it next), it
wouldn't matter that "the wrong thread" had the work item. (There is no
"wrong thread", only "wrong queued work"; and that can easily be managed
at the application level.) Still, we can imagine cases where this might
not meet the intent of the application designer. (Almost everyone would
agree "thread per client" is a bad model; yet it's amazingly pervasive
and generates an enormous amount of confusion and most of the poorly
performing code purporting to demonstrate that threads don't work.)

The quoted statement simply indicates that if the producer continues to
hold the mutex while waking consumer thread A, it will be able to run
and block on the mutex in priority order before consumer thread B can
acquire the mutex. (With wait morphing, the condition variable operation
immediately transfers the highest priority waiter to the mutex with
minimal overhead; but on any strict realtime implementation the act of
unblocking a high priority thread will immediately preempt a lower
priority thread and allow it to block on the mutex immediately.) Now,
consumer thread B may not contend for the mutex before the unlock, in
which case the "right thread" goes next, or it may contend anywhere
during this sequence and be sorted into the proper priority order along
with consumer thread A; in either case, the higher priority thread will
have the first chance at the queue.

That's what "predictable scheduling behavior" means in this context. In
more pragmatic terms, I'm pretty sure this means that someone on the
realtime side of the virtual aisle thought it was a bad idea to
recommend signaling "outside" the mutex, and the working group
compromised by adding that statement, which was sufficient to ward off
(or possibly to remove) a formal objection to the draft standard. This
is how a lot of the text in the standard grew.

Ersek, Laszlo

Mar 9, 2010, 13:37:46
(Sorry for keeping a long context!)

In article <hn5sbd$end$1...@usenet01.boi.hp.com>, Dave Butenhof <david.b...@hp.com> writes:
> On 3/8/2010 08:28, Ersek, Laszlo wrote:
>> In article
>> <7c0b3b8a-3ba4-41ad...@x1g2000prb.googlegroups.com>,
>> David Schwartz <dav...@webmaster.com> writes:
>>
>>> On Mar 8, 3:39 am, la...@ludens.elte.hu (Ersek, Laszlo) wrote:
>>

>>>> If you mean by "worst case" that no thread can advance until I release
>>>> the lock after broadcasting, and that at that point they will all
>>>> thunder on the mutex, while in the other case, the thundering herd is
>>>> smaller, because threads not trying to re-lock from within the wait
>>>> could advance as soon as I release the mutex, then I have to agree.
>>>

>>> Exactly. And in some implementations, threads will wake up from the
>>> condvar just to go to sleep on the mutex.
>>>

>>>> Thanks for making me aware of this. I still find that the
>>>> broadcast-within-section approach creates a "leveler playing field" for
>>>> all consumer threads, but I admit that this may cause bigger spikes, and
>>>> also that -- perhaps -- a level playing field in this sense is not that
>>>> important.
>>>

>>> I don't think it even actually does that.
>>
>> Okay, you've sunk my argument made in Message-ID:
>> <iv+DNMw6xuEN@ludens>, ie. the one in which I state "even without the
>> SUS' remark on scheduling behaviour, I prefer the form with IN
>> defined".
>>
>> However, since the standard, under my perception, seems to associate a
>> positive tone (however vague) with broadcasting/signalling in the
>> critical section, I would still put the corresponding call there:
>>
>> ----v----

>> if predictable scheduling behaviour is required, then that mutex is
>> locked by the thread calling pthread_cond_signal() or

>> pthread_cond_broadcast()
>> ----^----
>>
>> You mention in Message-ID:
>> <023a70a9-50b9-4350...@p3g2000pra.googlegroups.com>:
>>
>> ----v----

>> Actually, I have always wondered what that meant. I don't see how the
>> scheduling behavior is any more or less predictable in either case with
>> pthread_cond_broadcast.

>> ----^----
>>
>> Given that the latest version (v4) of the SUS continues to contain
>> equivalent normative wording and no related rationale, and that you seem
>> to be a member of the Austin CSRG, would you please consider bringing up
>> this topic on the list? You can certainly argue better than I ever could
>> that this passage is confusing and/or meaningless. Thanks.
>

> [snip]


>
> The statement comes from realtime concerns about the advice put in (by
> the "threadie" side of the virtual aisle) that signaling or broadcasting
> "outside" the mutex makes for more efficient parallel processing.

Thanks for describing the back-and-forth involved in the standardization
process.


> [snip]


>
> we (the sub working group that was
> trying to push threads into POSIX)

Please accept my gratitude for pushing threads into SUSv2 / UNIX 98.


> [snip]


>
> the realtime aspects of POSIX threads were absolutely critical
> to the initial success of the standard document and working group.

I think this is reasonable -- a standardized threads interface that
turns out to preclude realtime needs can be rightfully considered a
trap, in my opinion, one which defeats the very purpose of a standard
(ie. one has to look for a different standard, and the landscape of
implementations remains fragmented).


> [snip]

Outstanding! Thank you so much for explaining this!

Correct me if I'm wrong (and if you care what I think :)), but I think
this is the exact situation I described in

X-News: ludens comp.programming.threads:57685
From: la...@ludens.elte.hu (Ersek, Laszlo)
Subject: Re: basic question about concurrency
Date: 7 Mar 2010 23:13:16 +0100
Message-ID: <1V8QPi3j1qPh@ludens>

with the "small" exception that I didn't consider either realtime
scheduling or thread priorities at all. As it turns out, inhomogeneous
thread priorities play a crucial role here.


> Now, in "high performance thread" terms, this whole concept is generally
> bad design; if the threads were symmetric and priority was associated
> with the work (that is, higher priority work was simply at the head of
> the queue to be handled by whatever consumer gets to it next), it
> wouldn't matter that "the wrong thread" had the work item. (There is no
> "wrong thread", only "wrong queued work"; and that can easily be managed
> at the application level.)

(I remark in parentheses that the MWD (multiple workers decompressor) of
lbzip2 has a work queue with two priority bands: decompressing
reconstructed blocks is urgent, while reconstructing blocks for
decompression is not so urgent. A worker thread picks up a
reconstruction (scanning) task or continues scanning into the head of
the next input chunk only if there is no decompression task available.)
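
Not lbzip2's actual code, but a minimal sketch of the two-band idea
under one mutex and one condvar, with hypothetical names: workers drain
the urgent band first and fall back to the other band only when it is
empty.

#include <pthread.h>
#include <stddef.h>

struct task {
    struct task *next;
    /* ... payload ... */
};

struct wq {
    struct task    *urgent;                 /* e.g. decompression of reconstructed blocks */
    struct task    *normal;                 /* e.g. reconstruction / scanning work */
    pthread_mutex_t mtx;
    pthread_cond_t  nonempty;               /* predicate: some task is available */
};

void wq_init(struct wq *q)
{
    q->urgent = NULL;
    q->normal = NULL;
    (void)pthread_mutex_init(&q->mtx, NULL);
    (void)pthread_cond_init(&q->nonempty, NULL);
}

void wq_put(struct wq *q, struct task *t, int is_urgent)
{
    struct task **band;

    (void)pthread_mutex_lock(&q->mtx);
    band = is_urgent ? &q->urgent : &q->normal;
    t->next = *band;                        /* ordering within a band is not the point here */
    *band = t;
    (void)pthread_mutex_unlock(&q->mtx);
    (void)pthread_cond_signal(&q->nonempty);
}

struct task *wq_get(struct wq *q)
{
    struct task *t;

    (void)pthread_mutex_lock(&q->mtx);
    while (q->urgent == NULL && q->normal == NULL)
        (void)pthread_cond_wait(&q->nonempty, &q->mtx);
    if (q->urgent != NULL) {                /* urgent band first */
        t = q->urgent;
        q->urgent = t->next;
    } else {
        t = q->normal;
        q->normal = t->next;
    }
    (void)pthread_mutex_unlock(&q->mtx);
    return t;
}

There is a single predicate ("some task is available") and any worker
can service any task, so pthread_cond_signal suffices here: each posted
task wakes at most one worker.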


> [snip]


>
> The quoted statement simply indicates that if the producer continues to
> hold the mutex while waking consumer thread A, it will be able to run
> and block on the mutex in priority order before consumer thread B can
> acquire the mutex. (With wait morphing, the condition variable operation
> immediately transfers the highest priority waiter to the mutex with
> minimal overhead; but on any strict realtime implementation the act of
> unblocking a high priority thread will immediately preempt a lower
> priority thread and allow it to block on the mutex immediately.) Now,
> consumer thread B may not contend for the mutex before the unlock, in
> which case the "right thread" goes next, or it may contend anywhere
> during this sequence and be sorted into the proper priority order along
> with consumer thread A; in either case, the higher priority thread will
> have the first chance at the queue.

Here you also prove that wait morphing is not essential in the (strict)
realtime case, ie. for when the quoted statement was intended in the
first place.


> [snip]

Again, thank you very much for explaining this! Now I'll be able to
follow, in the future, David Schwartz' advice for the usual case of no
different priorities, but I'll also know about the important exception,
realtime scheduling with different thread priorities.

Cheers,
lacos

Chris Friesen

Mar 9, 2010, 17:56:29 – to Dave Butenhof
On 03/09/2010 10:20 AM, Dave Butenhof wrote:
> <snip>

> That's what "predictable scheduling behavior" means in this context.

Dave,

That last post is just the latest in a long string of informative (and
interesting) contributions to this newsgroup. I appreciate hearing
about this sort of thing from someone who was actually there.

Thanks for that,

Chris Friesen
