Porting from POSIX to Win32 threads


Jack Luh

Jun 24, 1999, 3:00:00 AM

Hi,

I am trying to port some stuff over from Solaris to NT and am having some
problems finding a good mapping for thr_yield() and POSIX condition
variables on NT. Does anyone have any suggestions?

Thanks...

Jack

Jeff Denham

Jun 25, 1999, 3:00:00 AM

Jack Luh wrote:

Yield is easy: Sleep(0).

I'll let someone else tell you about the Cygnus Pthreads on
Win32 project. If you check through DejaNews, you'll find
tons of discussion here in the last year or two
about POSIX CVs vs. NT Events.

__________________________________________________
Jeff Denham (jde...@allaire.com)

Allaire Corp: Resource-management software
for building and managing fast, reliable web sites
See us at http://www.allaire.com and http://www.brighttiger.com

125 Nagog Park
Acton, MA 01720
Phone: (978) 263-5455 x177
Fax: (978) 263-5547

Patrick TJ McPhee

Jun 26, 1999, 3:00:00 AM

In article <37737E17...@allaire.com>,
Jeff Denham <jde...@allaire.com> wrote:

% Yield is easy: Sleep(0).

Or even better:
#define pthread_yield()

--

Patrick TJ McPhee
East York Canada
pt...@interlog.com

John E. Bossom

Jul 7, 1999, 3:00:00 AM

That's fine on NT... not so fine on Win95... you need the Sleep(0) there
as a hook to perform the cooperative multitasking...

John.

David Schwartz

Jul 20, 1999, 3:00:00 AM


Patrick TJ McPhee wrote:
>
> In article <37737E17...@allaire.com>,
> Jeff Denham <jde...@allaire.com> wrote:
>
> % Yield is easy: Sleep(0).
>
> Or even better:
> #define pthread_yield()

That can cause poor performance or even crashes on platforms that rely
on the semantics of pthread_yield. Consider the following pseudo-code:

Release_Lock();
pthread_yield();
Delete_Lock();

It should be safe to delete the lock, as any threads waiting on that
lock were released, and they should have run by now since we yielded,
right?

But if you make 'pthread_yield' a no-op, we're now deleting a lock
while some threads are waiting on that same lock. What's going to happen
when they wake up?

DS

Patrick TJ McPhee

Jul 21, 1999, 3:00:00 AM

In article <37953BE9...@webmaster.com>,
David Schwartz <dav...@webmaster.com> wrote:

% Patrick TJ McPhee wrote [ages and ages ago :-]:
% >
% > In article <37737E17...@allaire.com>,
% > Jeff Denham <jde...@allaire.com> wrote:
% >
% > % Yield is easy: Sleep(0).
% >
% > Or even better:
% > #define pthread_yield()
%
% That can cause poor performance or even crashes on platforms that rely

It can only cause crashes when the code is written incorrectly.

% on the semantics of pthread_yield. Consider the following pseudo-code:

Let's.

%
% Release_Lock();
% pthread_yield();
% Delete_Lock();

If the pthread_yield() is supposed to provide some kind of golden guarantee
that it's safe to assume every other thread has finished every action it
could possibly have taken, its documentation needs to be updated.

This code is simply incorrect. I hope it does crash your program, and
I hope it happens during a demo just before your performance appraisal.

% It should be safe to delete the lock, as any threads waiting on that
% lock were released, and they should have run by now since we yielded,
% right?

Of course not. What a stupid thing to even suggest. There's no reason on
earth to assume that pthread_yield() is anything but a no-op. If you
have enough CPUs, then there'll be nothing to yield to.

Sean Burke

Jul 21, 1999, 3:00:00 AM


I've found that the problem of determining when a shared object can
safely be destroyed is a tricky one. I've been successful with the
following strategy:

The object is accessed indirectly via a handle. To dereference the
handle, you perform a "lookup" operation which atomically increments
a reference count. Each such dereference must be balanced by a "discard"
operation which decrements the reference count. To destroy an object,
mark it for destruction and discard your reference. The discard operation
destroys the object when the destroy flag is set and the reference count
is zero.

This technique demands strenuous coding discipline in C, but can be made
more foolproof in C++.

-SEan


David Schwartz

Jul 21, 1999, 3:00:00 AM


I think you missed my point entirely. That was an example of why
implementing thr_yield as a no op can affect performance. Obviously,
code should not rely on that kind of sequencing to prevent it from
crashing.

However, code can (and does) assume that kind of sequencing, and falls
back to more expensive methods when it fails. By implementing thr_yield
as a no op, you will force that code into its least efficient path all
the time.

Solaris does seem to implement sched_yield as a no op, at least for
POSIX threads and it defines _POSIX_PRIORITY_SCHEDULING anyway. Some
software that I wrote suffered a pretty horrible performance penalty on
Solaris because it had to reschedule itself when the yield failed to
yield. All I had to do was tell my code not to use sched_yield, and the
problem went away.

The point is, if someone is calling sched_yield or thr_yield, or
whatever, they are doing so for some reason.

DS

David Schwartz

Jul 21, 1999, 3:00:00 AM


For objects that need to be deleted 'often', reference counting is a
good approach. For objects with long lifetimes that rarely need to be
deleted, I prefer to simply remove them from the tables that threads use
to find them and then set a timer to delete them 60 seconds later. This
eliminates the overhead of managing the reference count.

DS

Dave Butenhof

Jul 26, 1999, 3:00:00 AM

David Schwartz wrote:

> I think you missed my point entirely. That was an example of why
> implementing thr_yield as a no op can affect performance. Obviously,
> code should not rely on that kind of sequencing to prevent it from
> crashing.
>
> However, code can (and does) assume that kind of sequencing, and falls
> back to more expensive methods when it fails. By implementing thr_yield
> as a no op, you will force that code into its least efficient path all
> the time.

Or, by designing code that relies on incorrect assumptions about the scheduling
functions, such as sched_yield, you penalize your application by forcing it into
the least efficient path all the time. You do have the right to do this. Operating
system vendors certainly have the right to anticipate that, and, if they think
folks with the same misconceptions constitute a large and economically viable
segment of the market, they have the right to code non-standard implementations,
e.g. of sched_yield, to support your design. (But I wouldn't recommend it.)

> Solaris does seem to implement sched_yield as a no op, at least for
> POSIX threads and it defines _POSIX_PRIORITY_SCHEDULING anyway. Some
> software that I wrote suffered a pretty horrible performance penalty on
> Solaris because it had to reschedule itself when the yield failed to
> yield. All I had to do was tell my code not to use sched_yield, and the
> problem went away.

First, unless (maybe) you're using Solaris 7, Solaris doesn't implement
_POSIX_PRIORITY_SCHEDULING. Yes, I know it defines the symbol; but it doesn't
implement the option. So, even if making sched_yield a no-op were in violation of
the standard, that'd really be somewhat irrelevant, wouldn't it?

More importantly, though, sched_yield() has defined behavior on POSIX only when all
threads involved have realtime scheduling policy (SCHED_FIFO or SCHED_RR, which
aren't supported by Solaris, at least before Solaris 7), AND on a uniprocessor. And
my guess is that Solaris 7 would probably do sched_yield correctly on a
uniprocessor with SCHED_FIFO or SCHED_RR threads. (And if not, that's a Solaris
support issue, not a philosophical thread design issue.)

If you're talking about non-realtime threads, or if any thread involved is
non-realtime (e.g., if a SCHED_FIFO thread calls sched_yield while a
SCHED_OTHER thread is ready), or if you're on a multiprocessor, POSIX says
absolutely nothing about the behavior of sched_yield.

And how do you know that "the yield failed to yield"? Presumably you're using some
form of reliable synchronization to do this. So why bother yielding? Sounds to me
like your "workaround" for the yield nop is the correct design.

> The point is, if someone is calling sched_yield or thr_yield, or
> whatever, they are doing so for some reason.

Yes, all things are done for a reason. Many of these reasons, however, (including
nearly all reasons for calling sched_yield) are wrong and irrelevant. That's life.
;-)

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation David.B...@compaq.com |
| 110 Spit Brook Rd ZKO2-3/Q18 http://members.aol.com/drbutenhof |
| Nashua NH 03062-2698 http://www.awl.com/cseng/titles/0-201-63392-2/ |
\-----------------[ Better Living Through Concurrency ]----------------/


David Schwartz

Jul 28, 1999, 3:00:00 AM


> First, unless (maybe) you're using Solaris 7, Solaris doesn't implement
> _POSIX_PRIORITY_SCHEDULING. Yes, I know it defines the symbol; but it doesn't
> implement the option. So, even if making sched_yield a no-op were in violation of
> the standard, that'd really be somewhat irrelevant, wouldn't it?

Exactly my point.

> And how do you know that "the yield failed to yield"? Presumably you're using some
> form of reliable synchronization to do this. So why bother yielding? Sounds to me
> like your "workaround" for the yield nop is the correct design.

No, I know the yield failed to yield because the yield function
returned an error.

> > The point is, if someone is calling sched_yield or thr_yield, or
> > whatever, they are doing so for some reason.
>
> Yes, all things are done for a reason. Many of these reasons, however, (including
> nearly all reasons for calling sched_yield) are wrong and irrelevant. That's life.
> ;-)

I don't agree. There are many cases (especially uniprocessor) where
sched_yield can save you a lot of trouble.

DS

Dave Butenhof

Jul 29, 1999, 3:00:00 AM

David Schwartz wrote:

> > First, unless (maybe) you're using Solaris 7, Solaris doesn't implement
> > _POSIX_PRIORITY_SCHEDULING. Yes, I know it defines the symbol; but it doesn't
> > implement the option. So, even if making sched_yield a no-op were in violation of
> > the standard, that'd really be somewhat irrelevant, wouldn't it?
>
> Exactly my point.

Your point is that your point has no point? I don't see the point. (Sorry 'bout that.
;-) )

> > And how do you know that "the yield failed to yield"? Presumably you're using some
> > form of reliable synchronization to do this. So why bother yielding? Sounds to me
> > like your "workaround" for the yield nop is the correct design.
>
> No, I know the yield failed to yield because the yield function
> returned an error.

OK, that's a hoot. POSIX and UNIX98 define no errors for sched_yield, nor can I imagine
any reason it might reasonably fail. (Nor is there any reason for it to fail, since
returning success without doing anything is perfectly correct and reasonable.)

Still, "returning an error" isn't quite the same as your original statement that it was
"a no op", though, now, I understand what you meant.

Now, what's all this got to do with the fact that you've coded an algorithm that uses
sched_yield in an inappropriate and non-conforming way that performs badly with
reasonable and correct implementations of sched_yield? I thought I'd had some idea of
what you were talking about, but now I'm not so sure.

> > > The point is, if someone is calling sched_yield or thr_yield, or
> > > whatever, they are doing so for some reason.
> >
> > Yes, all things are done for a reason. Many of these reasons, however, (including
> > nearly all reasons for calling sched_yield) are wrong and irrelevant. That's life.
> > ;-)
>
> I don't agree. There are many cases (especially uniprocessor) where
> sched_yield can save you a lot of trouble.

Yes, and any code using sched_yield to resolve those uniprocessor problems is
unsuitable for use on a multiprocessor. (Or at least won't behave as you want.) Because
a multiprocessor is the "true home" of any threaded program, such an approach is simply
bad (i.e., self-limiting) design.

David Schwartz

Jul 30, 1999, 3:00:00 AM

> > No, I know the yield failed to yield because the yield function
> > returned an error.
>
> OK, that's a hoot. POSIX and UNIX98 define no errors for sched_yield, nor can I imagine
> any reason it might reasonably fail. (Nor is there any reason for it to fail, since
> returning success without doing anything is perfectly correct and reasonable.)

The man page on Solaris 6 clearly states that ENOSYS can be returned.

> Still, "returning an error" isn't quite the same as your original statement that it was
> "a no op", though, now, I understand what you meant.

The thing is, since no errors are defined, it never occurred to me to
check for an error. It wasn't until I looked at the Solaris man page
that it occurred to me that anything that braindamaged could occur. So
now I do the rough equivalent of:

if (sched_yield()!=0) Sleep(0);

Where 'Sleep' is an internal yield function that uses usleep,
nanosleep, select, or some other mechanism to sleep momentarily.

> Now, what's all this got to do with the fact that you've coded an algorithm that uses
> sched_yield in an inappropriate and non-conforming way that performs badly with
> reasonable and correct implementations of sched_yield? I thought I'd had some idea of
> what you were talking about, but now I'm not so sure.

The point is, if you do something that causes other threads to stop
referencing an object, then call sched_yield, there's a good chance that
the object will not be referenced when you return. That allows you to
clean it up via a fast path, rather than a slow one.

Turning sched_yield into a no op would break that logic and force the
slow path every time. Knowing that sched_yield doesn't work, or it
returning an error, would cause my code to sleep, once again probably
allowing the fast path.

> > > > The point is, if someone is calling sched_yield or thr_yield, or
> > > > whatever, they are doing so for some reason.
> > >
> > > Yes, all things are done for a reason. Many of these reasons, however, (including
> > > nearly all reasons for calling sched_yield) are wrong and irrelevant. That's life.
> > > ;-)

This is the sentiment with which I disagree. It's incredibly useful for
trying to avoid delaying destroying shared objects.

> > I don't agree. There are many cases (especially uniprocessor) where
> > sched_yield can save you a lot of trouble.
>
> Yes, and any code using sched_yield to resolve those uniprocessor problems is
> unsuitable for use on a multiprocessor. (Or at least won't behave as you want.) Because
> a multiprocessor is the "true home" of any threaded program, such an approach is simply
> bad (i.e., self-limiting) design.

Umm, right, whatever. Yes, we shouldn't make our programs work better
on uniprocessor systems, even though they are the majority case, because
they're not our "true home". Okay.

DS

Patrick TJ McPhee

Jul 31, 1999, 3:00:00 AM

In article <37A1D878...@webmaster.com>,
David Schwartz <dav...@webmaster.com> wrote:

% The point is, if you do something that causes other threads to stop
% referencing an object, then call sched_yield, there's a good chance that
% the object will not be referenced when you return. That allows you to
% clean it up via a fast path, rather than a slow one.

This is where I came in before. There is absolutely no chance that this
will work reliably on any platform. Do not write code like this if there's
a chance it might affect me, please.

Making sched_yield() a no-op doesn't break this logic. The logic is inherently
broken. It assumes that sched_yield() will do something. This is not a
valid assumption. It assumes that sched_yield() doing something means that
all threads which might be accessing some shared object will no longer
access it. This is not a valid assumption.

Code that does this is broken. It is broken on uni-processors as well as
SMP machines. It is broken on every system, with every thread implementation.
That anyone is stupid enough to even think of doing it is an argument for
removing sched_yield() and similar functions from all APIs.

All sched_yield() is good for is to tell the system that there isn't going
to be a logical place to pre-empt a thread for a while, so it might as well
do it now if it seems like there's any point. A better way to get the same
effect might be to have a CPU_BOUND thread attribute, which would tell the
system to try and stick the thread on its own CPU, or to pre-empt it if
some I/O-bound thread was ready to run. On most systems, sched_yield() works
best the way I'd suggested:
#define sched_yield()

David Schwartz

Aug 1, 1999, 3:00:00 AM


Patrick TJ McPhee wrote:
>
> In article <37A1D878...@webmaster.com>,
> David Schwartz <dav...@webmaster.com> wrote:
>
> % The point is, if you do something that causes other threads to stop
> % referencing an object, then call sched_yield, there's a good chance that
> % the object will not be referenced when you return. That allows you to
> % clean it up via a fast path, rather than a slow one.
>
> This is where I came in before. There is absolutely no chance that this
> will work reliably on any platform. Do not write code like this if there's
> a chance it might affect me, please.

I suppose it depends upon your definition of 'reliably'. It cannot
cause a crash or problem, since the object is never freed unless its
reference count is zero. All it can do is allow the fast path to occur
more frequently.

> Making sched_yield() a no-op doesn't break this logic. The logic is inherently
> broken. It assumes that sched_yield() will do something.

No, it makes no such assumption. It simply _hopes_ that sched_yield
will allow it to save some extra work that it would otherwise have to do. It's very
much like a spinlock optimization.

> This is not a
> valid assumption. It assumes that sched_yield() doing something means that
> all threads which might be accessing some shared object will no longer
> access it. This is not a valid assumption.

It does not assume that. It hopes that. If that doesn't happen, oh
well, we've lost nothing. If it does happen, we save having to create an
object to command another thread to delete the object later.

> Code that does this is broken. It is broken on uni-processors as well as
> SMP machines. It is broken on every system, with every thread implementation.
> That anyone is stupid enough to even think of doing it is an argument for
> removing sched_yield() and similar functions from all APIs.

I think you're just assuming that I do something bad if in fact the
object is still referenced. I don't. I just have to go to the trouble of
setting up for the object to be deleted later. That's a pain, and I'd
prefer not to.

Again, it's very much like a spinlock for a fast lock.

> All sched_yield() is good for is to tell the system that there isn't going
> to be a logical place to pre-empt a thread for a while, so it might as well
> do it now if it seems like there's any point.

I'm glad that this is all you've found it good for. But it's actually
good for all sorts of things. One other big thing it's good for is
increasing fairness. If a thread just released a resource that there is
contention for, calling sched_yield before grabbing it again may
increase the chance that another thread will grab it first. If not, what
has been lost?

> A better way to get the same
> effect might be to have a CPU_BOUND thread attribute, which would tell the
> system to try and stick the thread on its own CPU, or to pre-empt it if
> some I/O-bound thread was ready to run. On most systems, sched_yield() works
> best the way I'd suggested:
> #define sched_yield()

Wow, I can't imagine I'm encountering such strong style prejudice!

DS

Patrick TJ McPhee

Aug 2, 1999, 3:00:00 AM

In article <37A4ADA1...@webmaster.com>,
David Schwartz <dav...@webmaster.com> wrote:

% Wow, I can't imagine I'm encountering such strong style prejudice!

I think you're giving very bad advice. I don't think it's a matter of
style. When I see people giving what I consider to be very bad advice,
my inclination is to jump on it. This is the value of public discussion:
bad advice will not go unchallenged.

Everybody's entitled to their opinion. _My_ opinion is that if yield()
makes your code run faster, either your code is written wrong or your
thread scheduler is crap. That's just an opinion. Yours may differ, and
that's fine (although you're wrong). My impression is that yield() is a
no-op on almost every platform. Your impression is different, but this
is something which could be verified if anyone cared to, so there's no
point discussing it. At least in POSIX, yield() doesn't _have_ to do
anything. This is a fact, and if you suggest that an implementation in
which yield() does nothing breaks something, then I will vehemently
point out that this is not true.

Stylistically, I tend to get a bit snarky when I see people doing a lot
of weird shit because they think it's an optimisation. I've spent so
much of my working life fixing bugs introduced by people who feel like
they can take a fast path if only yield() allows an object to become
unreferenced or access to integers is atomic, that it makes me queasy.

Kaz Kylheku

Aug 3, 1999, 3:00:00 AM

On Sun, 01 Aug 1999 13:27:13 -0700, David Schwartz <dav...@webmaster.com> wrote:
>> This is not a
>> valid assumption. It assumes that sched_yield() doing something means that
>> all threads which might be accessing some shared object will no longer
>> access it. This is not a valid assumption.
>
> It does not assume that. It hopes that. If that doesn't happen, oh
>well, we've lost nothing. If it does happen, we save having to create an
>object to command another thread to delete the object later.

I don't understand why you are going through all these ``rain dances''
to delete shared objects.

If you want to delete a shared object, you must have some assurance that no
thread is using it any longer. You are barking up the wrong tree if you think
that sched_yield() needs to be involved in this in any manner.

The application logic will determine that the object is no longer shared.
One way to do that is reference counting. There are other ways, of course.
The necessary synchronization can be done using mutexes and conditions.

Good software should be founded upon more than just ``hope''.

>> All sched_yield() is good for is to tell the system that there isn't going
>> to be a logical place to pre-empt a thread for a while, so it might as well
>> do it now if it seems like there's any point.
>
> I'm glad that this is all you've found it good for. But it's actually
>good for all sorts of things. One other big thing it's good for is
>increasing fairness. If a thread just released a resource that there is

Second-guessing your scheduler is a bad idea. By calling the scheduler when
there is no need to delay the execution of the thread, all you are doing is
making wasteful system calls that introduce overhead, and possibly confuse the
scheduling algorithms.

Even if sched_yield() actually calls the scheduler, it might do nothing anyway
if the calling thread has a higher priority than other threads that are
competing for the resource. The sched_yield() function can't be depended on to
actually pass control to another thread.

If a thread needs to delay its execution until another thread does
something, it should wait on an appropriate synchronization object. If
it doesn't need to delay its execution, then it should just keep on
executing!

Leave the black magic to the people who implement the synchronization objects.
They have a rationale for working with very system-dependent assumptions.

>contention for, calling sched_yield before grabbing it again may
>increase the chance that another thread will grab it first. If not, what
>has been lost?

Cycles. Anyway, what kind of resource do you have in mind that is used this
way? That is, it's acquired and released frequently, and a thread that
gives it up may re-acquire it immediately?

This sort of thing smacks of bad design. I doubt that bad design can be
smoothed over by a liberal sprinkling of sched_yield() calls.

If you can't eliminate the contention for the resource, why can't you impose
some discipline that will reliably ensure fairness? Threads wanting to acquire
the resource could be made to wait on a condition variable, which would ensure
fairness in a way that takes into account thread priority.

David Schwartz

Aug 3, 1999, 3:00:00 AM


> >contention for, calling sched_yield before grabbing it again may
> >increase the chance that another thread will grab it first. If not, what
> >has been lost?
>
> Cycles. Anyway, what kind of resource do you have in mind that is used this
> way? That is, it's acquired and released frequently, and a thread that
> gives it up may re-acquire it immediately?
> [snip]

>
> If you can't eliminate the contention for the resource, why can't you impose
> some discipline that will reliably ensure fairness? Threads wanting to acquire
> the resource could be made to wait on a condition variable, which would ensure
> fairness in a way that that takes into account thread priority.

Cycles.

DS

Dave Butenhof

Aug 10, 1999, 3:00:00 AM

David Schwartz wrote:

> Patrick TJ McPhee wrote:
> >
> > Code that does this is broken. It is broken on uni-processors as well as
> > SMP machines. It is broken on every system, with every thread implementation.
> > That anyone is stupid enough to even think of doing it is an argument for
> > removing sched_yield() and similar functions from all APIs.
>
> I think you're just assuming that I do something bad if in fact the
> object is still referenced. I don't. I just have to go to the trouble of
> setting up for the object to be deleted later. That's a pain, and I'd
> prefer not to.
>
> Again, it's very much like a spinlock for a fast lock.

No, it's rather like a CPU-bound loop that accumulates arbitrary and
unpredictable amounts of CPU time doing unspecified things. That's not a
"spinlock", and, if it's ever "fast", that's sheer accident.

Yes, there are "spin-and-back-off" locks that spin for a bit and then yield (or
block for a small amount of time). That can sometimes be valid in cases where you
don't control scheduling. But you need to at least know enough about scheduling to
understand what your yield will accomplish. The code's simply not portable unless
it's portable. Under POSIX, that means that all threads involved are in SCHED_FIFO
and/or SCHED_RR scheduling policy. If your threads are, then the behavior of
sched_yield() is well-defined. Otherwise, the behavior is unspecified,
unpredictable, and nonportable. None of those words are a good basis for a
portable application design.

> > All sched_yield() is good for is to tell the system that there isn't going
> > to be a logical place to pre-empt a thread for a while, so it might as well
> > do it now if it seems like there's any point.
>
> I'm glad that this is all you've found it good for. But it's actually
> good for all sorts of things. One other big thing it's good for is
> increasing fairness. If a thread just released a resource that there is

> contention for, calling sched_yield before grabbing it again may
> increase the chance that another thread will grab it first. If not, what
> has been lost?

No, it's not any good for "increasing fairness". Unless, by "fair", you mean
unspecified, unpredictable, and nonportable. That's an odd definition, to say the
least.

You go on to describe one example of exactly what Patrick was explaining. A call
to sched_yield() is, at best, a hint to the scheduler that "if there's any chance
that you might desire to preempt me, now is a good time". You might, indeed,
generally do this immediately before acquiring a shared resource (such as a mutex,
especially when you know you may have recently unlocked the mutex). But that
doesn't provoke "fairness". On a uniprocessor with thread timeslicing, it may
decrease the chances of a timeslice occurring while you hold the mutex, because, if
you really yield, you'll acquire the mutex on return, with a fresh scheduling
quantum. Assuming you hold the mutex for a substantial portion of the timeslice
quantum (which is bad design anyway), this may improve throughput by decreasing
the chances of other threads blocking on the mutex while the owner waits its next
turn. On a multiprocessor, the effect of the timeslice is less significant because
other threads may be running simultaneously, and will still need to block while
you hold the mutex. The solution is to reduce the time you hold the mutex.

Another thing to remember: even when you CAN achieve fairness, you should avoid
it. Fairness is unsupported by the POSIX standard for an important reason: it's
extremely expensive at all levels of the system. You pay in context switches,
scheduling overhead, and cache misses by forcing "thread a" to run when "thread b"
is ready to go. "Fairness" is often thought to be desirable for server systems, so
that one client connection doesn't lock out others. That usually means the server
follows a simplistic "thread per client" style of design, and that's rarely an
efficient allocation of resources. Threads are not your data, they're engines to
help you process your data more efficiently. You want data fairness, not thread
(scheduling) fairness. You can usually achieve that better by something like a
work queue model where client connection requests are queued (in your desired
order) for processing by a smaller number of server threads.

> > A better way to get the same
> > effect might be to have a CPU_BOUND thread attribute, which would tell the
> > system to try and stick the thread on its own CPU, or to pre-empt it if
> > some I/O-bound thread was ready to run. On most systems, sched_yield() works
> > best the way I'd suggested:
> > #define sched_yield()
>

> Wow, I can't imagine I'm encountering such strong style prejudice!

The Internet does not require review and jury approval for publication. Readers
are responsible for evaluating postings and correcting errors. Some of us have
informally and unofficially accepted a sort of responsibility for correcting
potentially harmful misinformation disseminated in this newsgroup, and you're
feeling the pressure. You're allowed to disagree, and you're certainly allowed to
voice your own opinions. With luck, most readers will listen more carefully to
Patrick, and to me, than to you, and fewer erroneous programs will be written as a
result. If not, well, we've done our best.

David Schwartz

Aug 10, 1999, 3:00:00 AM


> No, it's not any good for "increasing fairness". Unless, by "fair", you mean
> unspecified, unpredictable, and nonportable. That's an odd definition, to say the
> least.

Hmm, I thought you were the one who pointed out that one of the
advantages of signalling a cv while you still hold the mutex is that it
might increase fairness.

> You go on to describe one example of exactly what Patrick was explaining. A call
> to sched_yield() is, at best, a hint to the scheduler that "if there's any chance
> that you might desire to preempt me, now is a good time". You might, indeed,
> generally do this immediately before acquiring a shared resource (such as a mutex,
> especially when you know you may have recently unlocked the mutex). But that
> doesn't provoke "fairness". On a uniprocessor with thread timeslicing, it may
> decrease the chances of a timeslice occurring while you hold the mutex, because, if
> you really yield, you'll acquire the mutex on return, with a fresh scheduling
> quantum. Assuming you hold the mutex for a substantial portion of the timeslice
> quantum (which is bad design anyway), this may improve throughput by decreasing
> the chances of other threads blocking on the mutex while the owner waits its next
> turn. On a multiprocessor, the effect of the timeslice is less significant because
> other threads may be running simultaneously, and will still need to block while
> you hold the mutex. The solution is to reduce the time you hold the mutex.

The problem is, whether the next scheduler timeslice occurs when you
hold the mutex or when you don't hold the mutex is entirely random. In
one case, another thread is likely to get the mutex, in the other case
it's not. (Of course, the response should be 'if you care, don't use a
(raw) mutex'.)
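
Concretely, the unlock/yield/relock pattern described in the quoted paragraph can be
sketched like this (an illustrative sketch only; sched_yield() is just a hint to the
scheduler, and nothing about what runs next is guaranteed):

```c
#include <pthread.h>
#include <sched.h>

/* Illustrative sketch: do work in short locked bursts, yielding between
 * them so that any contenders get a chance to run.  sched_yield() is
 * only a hint ("now is a good time to preempt me"); it guarantees
 * nothing, and on some platforms it may be a no-op. */
void do_work_in_chunks(pthread_mutex_t *m, int chunks)
{
    for (int i = 0; i < chunks; i++) {
        pthread_mutex_lock(m);
        /* ... touch the shared state briefly ... */
        pthread_mutex_unlock(m);
        sched_yield();   /* hint only; behavior is unspecified */
    }
}
```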

The odd thing is, most platforms that I have tested on seem to behave
as if you had a 'sched_yield' after releasing the mutex anyway. For
example, suppose I write a program that creates two threads, each in the
following loop on the same mutex:

while (1)
{
    pthread_mutex_lock(&mutex);
    sleep(1);
    pthread_mutex_unlock(&mutex);
}

You might think that the same thread will keep cycling while one keeps
blocking. The logic being that the thread in 'sleep' awakens with a full
timeslice, and surely can get back to 'sleep' without using it all up.
Hence you'd expect one thread to cycle once a second and the other to
get stuck.

Yet when I test this degenerate case, the two threads seem to get the
mutex in alternation. This is what you might expect if there was a
'sched_yield' after unlocking the mutex.

The funny thing is, in most (real world) cases, I'd prefer the same
thread kept running. (See below, I usually want maximum unfairness.)

> Another thing to remember: even when you CAN achieve fairness, you should avoid
> it. Fairness is unsupported by the POSIX standard for an important reason: it's
> extremely expensive at all levels of the system. You pay in context switches,
> scheduling overhead, and cache misses by forcing "thread a" to run when "thread b"
> is ready to go. "Fairness" is often thought to be desirable for server systems, so
> that one client connection doesn't lock out others. That usually means the server
> follows a simplistic "thread per client" style of design, and that's rarely an
> efficient allocation of resources. Threads are not your data, they're engines to
> help you process your data more efficiently. You want data fairness, not thread
> (scheduling) fairness. You can usually achieve that better by something like a
> work queue model where client connection requests are queued (in your desired
> order) for processing by a smaller number of server threads.

I agree. In fact, I generally try to design for maximum unfairness. I
want the same thread to keep running as long as possible, doing as many
jobs as possible. This has cache benefits, scheduler benefits, and
allows the unused threads to 'expire' (if coded to do so). This is why
I'm so puzzled by why mutexes appear to behave so 'fairly'.

> The Internet does not require review and jury approval for publication. Readers
> are responsible for evaluating postings and correcting errors. Some of us have
> informally and unofficially accepted a sort of responsibility for correcting
> potentially harmful misinformation disseminated in this newsgroup, and you're
> feeling the pressure. You're allowed to disagree, and you're certainly allowed to
> voice your own opinions. With luck, most readers will listen more carefully to
> Patrick, and to me, than to you, and fewer erroneous programs will be written as a
> result. If not, well, we've done our best.

I'm trying to figure out whether the paragraph above is polite or not.
;)

DS

Kaz Kylheku

unread,
Aug 11, 1999, 3:00:00 AM8/11/99
to
On Tue, 10 Aug 1999 21:06:26 -0700, David Schwartz <dav...@webmaster.com> wrote:
>> feeling the pressure. You're allowed to disagree, and you're certainly allowed to
>> voice your own opinions. With luck, most readers will listen more carefully to
>> Patrick, and to me, than to you, and fewer erroneous programs will be written as a
>> result. If not, well, we've done our best.
>
> I'm trying to figure out whether the paragraph above is polite or not.
>;)

He's just throwing his weight around. I warned you.

---
Better living through cooki^H^H^H^H^Hconcurrency.

Dave Butenhof

unread,
Aug 11, 1999, 3:00:00 AM8/11/99
to
David Schwartz wrote:

> > No, it's not any good for "increasing fairness". Unless, by "fair", you mean
> > unspecified, unpredictable, and nonportable. That's an odd definition, to say the
> > least.
>

> Hmm, I thought you were the one who pointed out that one of the
> advantages of signalling a cv while you still hold the mutex is that it
> might increase fairness.

Nope, not me. I don't believe in fairness. "Fairness" is relative, subjective, and
usually isn't actually what anyone wants when they ask for it. ("It's not fair" usually
means that one hasn't been given as much more than one's equal share as one might wish.
;-) )

Signalling under the mutex may yield more PREDICTABLE behavior among realtime scheduled
threads by increasing the likelihood that the awakened waiter(s) and simultaneous mutex
contenders are forced to queue (and wake) in order from the mutex. Realtime scheduling
isn't fair, (at least in the normal conversational sense of "equal share") and is in
fact explicitly designed to be the opposite. The intent of realtime scheduling is to
specify exactly which threads are to be the pigs. ("All animals are equal, but some are
more equal than others".)

A realtime programmer may indeed consider such moderate improvements in predictability
to be "fair". However, I try to use loose/subjective terms such as "fair" only in the
typical conversational English sense. (This isn't always possible, so it's "only fair"
to admit that I may once or even twice have let an inappropriate use of "fair" slip
into some posting.)

> The problem is, whether the next scheduler timeslice occurs when you
> hold the mutex or when you don't hold the mutex is entirely random. In
> one case, another thread is likely to get the mutex, in the other case
> it's not. (Of course, the response should be 'if you care, don't use a
> (raw) mutex'.)

Except in a really degenerate case, the resulting "scheduler thrashing" is unlikely to
be worse (and will usually be far better) than that you might generate by calling
sched_yield(). Timeslicing should be a "fairly" infrequent event (an interval many
thousands of processor cycles long), and mutexes should be held for a substantially
smaller period of time.

> The odd thing is, most platforms that I have tested on seem to behave
> as if you had a 'sched_yield' after releasing the mutex anyway. For
> example, suppose I write a program that creates two threads, each in the
> following loop on the same mutex:
>
> while (1)
> {
>     pthread_mutex_lock(&mutex);
>     sleep(1);
>     pthread_mutex_unlock(&mutex);
> }
>
> You might think that the same thread will keep cycling while one keeps
> blocking. The logic being that the thread in 'sleep' awakens with a full
> timeslice, and surely can get back to 'sleep' without using it all up.
> Hence you'd expect one thread to cycle once a second and the other to
> get stuck.
>
> Yet when I test this degenerate case, the two threads seem to get the
> mutex in alternation. This is what you might expect if there was a
> 'sched_yield' after unlocking the mutex.

I'm afraid I no longer remember (if you mentioned it at all) which platform you're
using. On some systems, "unblock" may entail a scheduler boost for normal "timeshare"
policy threads. On such a system, the thread released by unlocking the mutex would gain
a temporary advantage over the thread that had released it, just as if you were using
realtime scheduling and the waiter had a higher static priority. You'd get a
preemption, (on a uniprocessor), and the previously-blocked thread would be able to
acquire the mutex, and sleep, before the unlocking thread would have a chance to loop.
I'd expect to see this on Linux, for example, where threads are scheduled as normal O/S
processes. I wouldn't expect to see it on Solaris or Tru64 UNIX, unless, perhaps,
you're using System Contention Scope. (Where the threads are directly scheduled by the
O/S, and may also be subject to priority boost behavior.)

> The funny thing is, in most (real world) cases, I'd prefer the same
> thread kept running. (See below, I usually want maximum unfairness.)

Good. I'm glad to hear that.

> > Another thing to remember: even when you CAN achieve fairness, you should avoid
> > it. Fairness is unsupported by the POSIX standard for an important reason: it's
> > extremely expensive at all levels of the system. You pay in context switches,
> > scheduling overhead, and cache misses by forcing "thread a" to run when "thread b"
> > is ready to go. "Fairness" is often thought to be desirable for server systems, so
> > that one client connection doesn't lock out others. That usually means the server
> > follows a simplistic "thread per client" style of design, and that's rarely an
> > efficient allocation of resources. Threads are not your data, they're engines to
> > help you process your data more efficiently. You want data fairness, not thread
> > (scheduling) fairness. You can usually achieve that better by something like a
> > work queue model where client connection requests are queued (in your desired
> > order) for processing by a smaller number of server threads.
>

> I agree. In fact, I generally try to design for maximum unfairness. I
> want the same thread to keep running as long as possible, doing as many
> jobs as possible. This has cache benefits, scheduler benefits, and
> allows the unused threads to 'expire' (if coded to do so). This is why
> I'm so puzzled by why mutexes appear to behave so 'fairly'.

So set up a picket line in front of the headquarters of the vendor in question: "<Name
of company here> NOT unfair to threads!"

(Sorry, I couldn't resist. Well, OK, I could have. But I didn't. So there.)

And if you see this with non-realtime threads on Tru64 UNIX, let me know, because I
would consider fairness a serious bug. (But I have a nasty suspicion that it may happen
with System Contention Scope threads, and I'm not sure if we can fix that.)

> > The Internet does not require review and jury approval for publication. Readers
> > are responsible for evaluating postings and correcting errors. Some of us have
> > informally and unofficially accepted a sort of responsibility for correcting
> > potentially harmful misinformation disseminated in this newsgroup, and you're
> > feeling the pressure. You're allowed to disagree, and you're certainly allowed to
> > voice your own opinions. With luck, most readers will listen more carefully to
> > Patrick, and to me, than to you, and fewer erroneous programs will be written as a
> > result. If not, well, we've done our best.
>

> I'm trying to figure out whether the paragraph above is polite or not.
> ;)

But of course. The dagger was neatly wrapped in elegant cloth-of-gold. (Stained only
slightly red near the point.) ;-)

David Schwartz

unread,
Aug 11, 1999, 3:00:00 AM8/11/99
to
> > The problem is, whether the next scheduler timeslice occurs when you
> > hold the mutex or when you don't hold the mutex is entirely random. In
> > one case, another thread is likely to get the mutex, in the other case
> > it's not. (Of course, the response should be 'if you care, don't use a
> > (raw) mutex'.)
>
> Except in a really degenerate case, the resulting "scheduler thrashing" is unlikely to
> be worse (and will usually be far better) than that you might generate by calling
> sched_yield(). Timeslicing should be a "fairly" infrequent event (e.g., thousands of
> times the processor cycle frequency), and mutexes should be held for a substantially
> smaller period of time.

One would certainly hope so.

> > The odd thing is, most platforms that I have tested on seem to behave
> > as if you had a 'sched_yield' after releasing the mutex anyway. For
> > example, suppose I write a program that creates two threads, each in the
> > following loop on the same mutex:
>

> I'm afraid I no longer remember (if you mentioned it at all) which platform you're
> using.

I tested both LinuxThreads and FreeBSD's user-space threads
implementation. I was _quite_ surprised at the results. I'll test a few
more implementations available to me (Solaris, OSF1, and NT's events).

> On some systems, "unblock" may entail a scheduler boost for normal "timeshare"
> policy threads. On such a system, the thread released by unlocking the mutex would gain
> a temporary advantage over the thread that had released it, just as if you were using
> realtime scheduling and the waiter had a higher static priority. You'd get a
> preemption, (on a uniprocessor), and the previously-blocked thread would be able to
> acquire the mutex, and sleep, before the unlocking thread would have a chance to loop.
> I'd expect to see this on Linux, for example, where threads are scheduled as normal O/S
> processes. I wouldn't expect to see it on Solaris or Tru64 UNIX, unless, perhaps, if
> you're using System Contention Scope. (Where the threads are directly scheduled by the
> O/S, and may also be subject to priority boost behavior.)

Even with the boost, on uniprocessor, unless the CPU is actually
yielded to another process, the thread should finish out its timeslice
and make it back to the pthread_mutex_lock line. It really did seem, at
least on LinuxThreads, that pthread_mutex_unlock tended to yield the
processor immediately to a thread blocked on pthread_mutex_lock!

This is so astonishing that I think I'll retest. Maybe I tested wrong.
It certainly seems like an efficiency killer in any realistic use of
mutexes.

DS

David Schwartz

unread,
Aug 11, 1999, 3:00:00 AM8/11/99
to
>
> Even with the boost, on uniprocessor, unless the CPU is actually
> yielded to another process, the thread should finish out its timeslace
> and make it back to the pthread_mutex_lock line. It really did seem, at
> least on LinuxThreads, that pthread_mutex_unlock tended to yield the
> processor immediately to a thread blocked on pthread_mutex_lock!
>
> This is so astonishing that I think I'll retest. Maybe I tested wrong.
> It certainly seems like an efficiency killer in any realistic use of
> mutexes.
>

Yes, effectively, LinuxThreads yields on pthread_mutex_unlock.
Shouldn't this only occur if the unlock woke a thread at a higher
priority?

DS

Kaz Kylheku

unread,
Aug 12, 1999, 3:00:00 AM8/12/99
to

In LinuxThreads, pthread_mutex_unlock does nothing (other than mark the mutex
released) if no thread is waiting on the wait queue of that mutex. Otherwise it
signals that thread in order to wake it up. This is done by delivering
a signal using the kill() system call.

In the system call, if the process that is given the signal is found
to be in an interruptible sleep, then it is woken up via the
wake_up_process() function.

Whether or not this will preempt the caller is based on subtle decisions, and
the code differs markedly between SMP and non-SMP. The decision is made
underneath the reschedule_idle() function.
