
volatile guarantees?


Tristan Wibberley
Aug 29, 2000, 3:00:00 AM
Hi,

I'm having trouble with the meaning of the C/C++ keyword volatile. I know you
declare a variable volatile wherever it may be changed externally to the
flow of logic that the compiler is processing and optimising. This makes the
compiler read from the underlying storage every time the variable is accessed
(or so I believed).

I have seen a discussion in one of the comp.lang.c* groups where it is
suggested that the compiler does not always have to avoid optimising away
memory accesses. This seems logical - since a thread which alters the value
of a variable might not get scheduled, the value of the variable may not
change for some time (many times round a busy loop), so the compiler can use
a cached value for many loops without changing the guarantees made by the
machine abstraction defined in the standards (*good* for performance). That
then renders volatile practically undefinable, since a thread may legally
*never* be scheduled. And when volatile is used for a flag changed by hardware,
the hardware doesn't necessarily change memory (a CPU register may be what
changes).
Volatile behaviour seems best implemented with a function call which uses
some guaranteed behaviour internally (in assembler or other language).

Has anyone hashed this out before and come to any conclusion on whether to
trust volatile, here or in other languages? Because it's doing my head in.
(I ask this here because it concerns concurrent programming and the people
here probably have experience of this problem.)

--
Tristan Wibberley


Kaz Kylheku
Aug 29, 2000, 3:00:00 AM
On Tue, 29 Aug 2000 19:41:50 +0100, Tristan Wibberley <blo...@cus.org.uk>
wrote:

>I have seen a discussion in one of the comp.lang.c* groups where it is
>suggested that the compiler does not always have to avoid optimising away
>memory accesses. This seems logical - since a thread which alters the value
>of a variable might not get scheduled, the value of the variable may not

The C language does not define any support for threading. It's possible for a
conforming implementation to ignore volatile, so long as it ensures that any
required constraint violation diagnostics related to volatile are produced
(e.g. stripping the volatile qualifier in an assignment without a cast). Also,
the implementation has to ensure that a signal handler can safely store a value
to a static object of type volatile sig_atomic_t.
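
A minimal sketch of that sig_atomic_t guarantee (the signal chosen and the names
are only an example):

#include <signal.h>
#include <stdio.h>

static volatile sig_atomic_t got_signal = 0;

static void handler (int sig)
{
    (void) sig;
    got_signal = 1;     /* the one store the C standard guarantees is safe here */
}

int main (void)
{
    signal (SIGINT, handler);
    while (!got_signal)
        ;               /* volatile forces a real read of got_signal each time */
    puts ("caught it");
    return 0;
}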

>Has anyone hashed this out before and come to any conclusion whether to
>trust volatile or spread to other languages?

Whether or not you can trust volatile, or even whether or not you need it,
depends on the compiler you are using.

Under POSIX threads, you don't need volatile so long as you use the locking
mechanism supplied by the interface.

--
Any hyperlinks appearing in this article were inserted by the unscrupulous
operators of a Usenet-to-web gateway, without obtaining the proper permission
of the author, who does not endorse any of the linked-to products or services.

Tristan Wibberley
Aug 29, 2000, 3:00:00 AM

Kaz Kylheku wrote in message ...

>On Tue, 29 Aug 2000 19:41:50 +0100, Tristan Wibberley <blo...@cus.org.uk>
>wrote:

>Under POSIX threads, you don't need volatile so long as you use the locking
>mechanism supplied by the interface.


so
/*****************/
static int a;
printf ("%d\n", a);
/* another thread happens to increment a here */
printf ("%d\n", a);
/*****************/
writes:
0
0
or
0
1
to the console, but
/*****************/
static int a;
int b;
printf ("%d\n", a);
/* another thread happens to increment a here */
pthread_mutex_lock (&m); /* m is an initialised mutex */
b = a;
pthread_mutex_unlock (&m);
printf ("%d\n", b);
/*****************/
writes
0
1
to the console (so long as the other thread increments a where stated -
ignoring issues of ensuring that)?

That doesn't seem right so I must misunderstand you. How does using POSIX
threads change how the "C" compiler translates "b = a" to machine code?

--
Tristan Wibberley

David Schwartz
Aug 29, 2000, 3:00:00 AM

Tristan Wibberley wrote:

> That doesn't seem right so I must misunderstand you. How does using POSIX
> threads change how the "C" compiler translates "b = a" to machine code?

The compiler treats the mutex function as an unknown. This forces it to
write the variable to memory beforehand and read it back afterwards.
Consider:

void foo(void)
{
        i=3;
        if(i==3) Do_Something();
}

In this case, there is no guarantee that 'i=3' is written to memory
before 'i==3' is evaluated. There is also no guarantee that 'i==3' reads
from memory rather than a register. Now consider:

void foo(void)
{
        i=3;
        SomeFunction();
        if(i==3) Do_Something();
}

If the compiler doesn't know what 'SomeFunction' does, it has to write
'i' back to memory before calling SomeFunction. After all, if
'SomeFunction' reads 'i', it had better get 3! Similarly, 'i==3' now has
to read from memory. After all, 'SomeFunction' might contain the line
'i=4'.

So external functions the compiler cannot peer into change the assembly
code generated around them. This is not a special property of the
pthread lock functions on most platforms. Of course, the standard
doesn't specify how this has to happen, just that it has to. The funny
thing is that most compilers do this automatically, and all that's needed
to help them are memory barriers (to make sure that the writes/reads
become visible correctly on other processors).

DS

Kaz Kylheku
Aug 29, 2000, 3:00:00 AM
On Tue, 29 Aug 2000 12:30:14 -0700, David Schwartz <dav...@webmaster.com> wrote:
> So external functions the compiler cannot peer into change the assembly
>code generated around them. This is not a special property of the
>pthread lock functions on most platforms. Of course, the standard
>doesn't specify how this has to happen, just that it has to. The funny

Do you know where this is specified? Everyone says that it is, but I can't find
the relevant text in the fairly recent 200X POSIX draft that I have.
I've been taking everyone's word for it.

Tristan Wibberley
Aug 29, 2000, 3:00:00 AM

David Schwartz wrote in message <39AC0F46...@webmaster.com>...

>
>Tristan Wibberley wrote:
>
>> That doesn't seem right so I must misunderstand you. How does using POSIX
>> threads change how the "C" compiler translates "b = a" to machine code?
>
> The compiler treats the mutex function as an unknown. This forces it to
>write the variable to memory beforehand and read it back afterwards.


I see: since the compiler cannot be certain that the pthread library will
not have a pointer to 'a' from elsewhere, it cannot assume that the call won't
change the value of a. But what about when static int a is *not* declared at
global scope, and no pointer to it is returned or passed as an argument
to a function? Then the compiler "knows" that a will not be changed from
underneath its optimisations. Must we simply avoid that situation?


--
Tristan Wibberley

David Schwartz
Aug 29, 2000, 3:00:00 AM

Tristan Wibberley wrote:

> > The compiler treats the mutex function as an unknown. This forces it to
> >write the variable to memory beforehand and read it back afterwards.

> I see, since the compiler cannot be certain that the pthread library will
> not have a pointer to a from elsewhere, it cannot assume that it won't
> change the value of a. But what about when static int a is *not* declared in
> the global scope, and no pointer to it is returned or passed as an argument
> to a function? Then the compiler "knows" that a will not be changed from
> underneath it's optimisations. Must we simply avoid that situation?

POSIX requires that the mutex functions work no matter what. In
practice, compilers don't make assumptions. Even if something has local
scope, there are still ways for pointers to it to get smuggled around.

Look at it this way -- if another thread has a way to modify the
variable, so does an unknown function. So if it's really impossible for
an unknown function to modify the variable, it should also be impossible
for another thread to do so.

If no pointer to the variable exists anywhere but in that local
function, no other thread could modify the variable anyway. Any attempt
to pass the pointer to another thread could also pass the variable to an
unknown function. So there should never be a problem.

DS

David Schwartz
Aug 29, 2000, 3:00:00 AM

Kaz Kylheku wrote:
>
> On Tue, 29 Aug 2000 12:30:14 -0700, David Schwartz <dav...@webmaster.com> wrote:
> > So external functions the compiler cannot peer into change the assembly
> >code generated around them. This is not a special property of the
> >pthread lock functions on most platforms. Of course, the standard
> >doesn't specify how this has to happen, just that it has to. The funny
>
> Do you know where this is specified? Everyone says that it is, but I can't find
> the relevant text in the fairly recent 200X POSIX draft that I have.
> I've been taking everyone's word for it.

POSIX doesn't require this. This is just the way the POSIX requirements
are usually met. POSIX requires that mutexes work but doesn't specify
how. The usually 'how' is by forcing the compiler to flush to/from
memory and putting memory barriers inside the mutex functions.

DS

Kaz Kylheku
Aug 29, 2000, 3:00:00 AM
On Tue, 29 Aug 2000 21:46:59 +0100, Tristan Wibberley <blo...@cus.org.uk>
wrote:

>change the value of a. But what about when static int a is *not* declared in
>the global scope, and no pointer to it is returned or passed as an argument
>to a function? Then the compiler "knows" that a will not be changed from
>underneath it's optimisations.

How can it know that the foreign translation unit will not call back into
an external function in the current translation unit which does modify
the object? You don't need a pointer to an object to modify it, you just need
to be able to reach some code which modifies it.
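
To sketch that point (the names here are made up): 'a' below is file-scope and
its address is never taken, yet a foreign library can still reach code that
modifies it.

/* this translation unit */
extern void some_library_call (void);  /* defined in a foreign translation unit */

static int a;

void bump_a (void)          /* external linkage: the foreign code can call this */
{
    a++;
}

int poll_a (void)
{
    some_library_call ();   /* might, directly or indirectly, call bump_a() */
    return a;               /* so the compiler cannot assume a is unchanged */
}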

Tristan Wibberley
Aug 29, 2000, 3:00:00 AM

David Schwartz wrote in message <39AC1FA7...@webmaster.com>...

>
> Look at it this way -- if another thread has a way to modify the
>variable, so does an unknown function.

I don't agree (unless I don't understand static local variables properly),
see below.

> If no pointer to the variable exists anywhere but in that local
>function, no other thread could modify the variable anyway. Any attempt
>to pass the pointer to another thread could also pass the variable to an
>unknown function. So there should never be a problem.


So posix threads don't share local static variables then?

int daft_function (void)
{
    static int a;
    int b, condition;

    /* do something useful with some other data (but do nothing with a) */

    if (condition) {
        pthread_mutex_lock (&m); /* m is declared elsewhere */
        a++;
        pthread_mutex_unlock (&m);
    }

    pthread_mutex_lock (&m);
    b = a;
    pthread_mutex_unlock (&m);

    return b;
}

Two threads can be executing this function and they share "a" (don't
they?). But the compiler knows that no called functions touch "a" (nor
anything else). But one thread can find "condition" to be true so it
increments "a", then the other thread runs and finds "condition" is now true
so it increments "a" again, the first thread runs again and uses a copy of
"a" from a register (the compiler does this because it knows nothing of
threads). Thread 1 returns 1 when it should return 2, thread 2 returns 2
correctly. Okay, it's a race condition and shouldn't be there, but one could
replace the mutexes with wait conditions.

--
Tristan Wibberley

Kaz Kylheku
Aug 29, 2000, 3:00:00 AM
On Tue, 29 Aug 2000 22:37:56 +0100, Tristan Wibberley <blo...@cus.org.uk>
wrote:
>

>> If no pointer to the variable exists anywhere but in that local
>>function, no other thread could modify the variable anyway. Any attempt
>>to pass the pointer to another thread could also pass the variable to an
>>unknown function. So there should never be a problem.
>
>
>So posix threads don't share local static variables then?
>
>int daft_function (void)
>{
> static int a;
> int b, condition;
>
> /* do something useful with some other data (but do nothing with a) */
>
> if (condition) {
> pthread_mutex_lock (&m); /* m is declared elsewhere */
> a++;
> pthread_mutex_unlock (&m);
> }
>
> pthread_mutex_lock (&m);
> b = a;
> pthread_mutex_unlock (&m);
>
> return b;
>}
>
>Two threads can be executing this function and there share "a" (don't
>they?). But the compiler knows that no called functions touch "a" (nor
>anything else).

That's right; the compiler *could* know here that pthread_mutex_lock() is a
standard library function which cannot possibly call back into daft_function(),
and daft_function() is the only piece of code in the system which can possibly
alter the value of a. So why should it suspect that a++ might be altered in
surprising ways?

I can't find the chapter and verse in my February 2000 POSIX draft which would
unequivocally say that in the above function, the accesses and modifications of
variable ``a'' cannot be moved outside of the critical region. I've been
relying on the unverified assurances of others.

Doug Hockin
Aug 29, 2000, 3:00:00 AM
Volatile existed long before people were doing threads. It
is used when you have interrupt handlers and background code
that share data structures and also when accessing hardware
registers. The kind of stuff that's done in device drivers
and firmware.

In the case of interrupt handlers the use of volatile is
mated with enabling and disabling interrupts to control
concurrent access.

And in systems with memory caches there are special hardware features to
prevent hardware registers/memory from being cached in addition
to the volatile keyword in C.

From the "C++ Programming Language - Second Edition" by Bjarne
Stroustrup:

"There are no implementation-independent semantics for volatile objects;
volatile is a hint to the compiler to avoid aggressive optimization
involving the object because the value of the object may be changed
by means undetectable by a compiler."

-- Doug

David Schwartz
Aug 29, 2000, 3:00:00 AM

Tristan Wibberley wrote:

> > If no pointer to the variable exists anywhere but in that local
> >function, no other thread could modify the variable anyway. Any attempt
> >to pass the pointer to another thread could also pass the variable to an
> >unknown function. So there should never be a problem.
>
> So posix threads don't share local static variables then?

Sure they do.



> int daft_function (void)
> {
> static int a;
> int b, condition;
>
> /* do something useful with some other data (but do nothing with a) */
>
> if (condition) {
> pthread_mutex_lock (&m); /* m is declared elsewhere */
> a++;
> pthread_mutex_unlock (&m);
> }
>
> pthread_mutex_lock (&m);
> b = a;
> pthread_mutex_unlock (&m);
>
> return b;
> }
>
> Two threads can be executing this function and there share "a" (don't
> they?). But the compiler knows that no called functions touch "a" (nor
> anything else).

Nonsense! How does the compiler know that 'pthread_mutex_lock' doesn't
call 'daft_function'?

DS

David Schwartz
Aug 29, 2000, 3:00:00 AM

Kaz Kylheku wrote:

>
> On Tue, 29 Aug 2000 16:55:00 -0700, David Schwartz <dav...@webmaster.com>
> wrote:
> >> Two threads can be executing this function and there share "a" (don't
> >> they?). But the compiler knows that no called functions touch "a" (nor
> >> anything else).
> >
> > Nonsense! How does the compiler know that 'pthread_mutex_lock' doesn't
> >call 'daft_function'?
>
> Because, as part of a POSIX implementation, the compiler can be endowed with
> POSIX API knowledge. The pthread_mutex_lock() function would not be conforming
> if it linked to an application-supplied daft_function().

The question was how can these functions work without special compiler
knowledge. That was the question I was answering.

DS

Charles Bryant
Aug 29, 2000, 9:52:01 PM
In article <slrn8qobr...@ashi.FootPrints.net>,
Kaz Kylheku <k...@ashi.footprints.net> wrote:
... optimising daft_function() with static data in local scope...

>That's right; the compiler *could* know here that pthread_mutex_lock() is a
>standard library function which cannot possibly call back into daft_function(),

If the compiler has a list of such standard library functions it is
seriously broken if the mutex functions appear on the list, since
that would totally defeat the purpose of mutexes.

--
Eppur si muove

Charles Bryant
Aug 29, 2000, 9:46:35 PM
In article <bUVq5.7709$SR1.1...@news6-win.server.ntlworld.com>,
Tristan Wibberley <blo...@cus.org.uk> wrote:
...

>So posix threads don't share local static variables then?

They do.

>int daft_function (void)
>{
> static int a;
> int b, condition;
>
> /* do something useful with some other data (but do nothing with a) */
>
> if (condition) {
> pthread_mutex_lock (&m); /* m is declared elsewhere */
> a++;
> pthread_mutex_unlock (&m);
> }
>
> pthread_mutex_lock (&m);
> b = a;
> pthread_mutex_unlock (&m);
>
> return b;
>}
>

>Two threads can be executing this function and there share "a" (don't
>they?).

Yes.

>But the compiler knows that no called functions touch "a" (nor
>anything else).

It cannot possibly know that, since it's not true. Since
'daft_function' is global, how does the compiler know that
pthread_mutex_lock() doesn't call it?

If you change it to be declared static then, if it can be run in two or
more threads simultaneously, it must be reachable from a thread start
function (i.e. a function whose pointer has been passed to
pthread_create()). This means that the compiler must assume that
pthread_create() might have saved the function pointer somewhere and
pthread_mutex_lock() might call it through this saved pointer. Since
the same save-a-pointer-then-call-it-later pattern occurs in single-threaded
programs too (e.g. state machines and suchlike), no compiler can optimise away
the references to 'a'.
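
A sketch of why even a static function gets no special treatment (names are
invented, error checking omitted):

#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static int a;

static void *worker (void *arg)       /* static, but its address escapes below */
{
    (void) arg;
    pthread_mutex_lock (&m);
    a++;
    pthread_mutex_unlock (&m);
    return 0;
}

void start_worker (void)
{
    pthread_t tid;
    /* pthread_create() stores the pointer to worker; as far as the compiler
       knows, any later opaque call could reach worker through that pointer. */
    pthread_create (&tid, 0, worker, 0);
}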

--
Eppur si muove

Kaz Kylheku
Aug 29, 2000, 11:26:08 PM
On Tue, 29 Aug 2000 16:55:00 -0700, David Schwartz <dav...@webmaster.com>
wrote:
>> Two threads can be executing this function and there share "a" (don't
>> they?). But the compiler knows that no called functions touch "a" (nor
>> anything else).
>
> Nonsense! How does the compiler know that 'pthread_mutex_lock' doesn't
>call 'daft_function'?

Because, as part of a POSIX implementation, the compiler can be endowed with
POSIX API knowledge. The pthread_mutex_lock() function would not be conforming
if it linked to an application-supplied daft_function().

David Jones
Aug 30, 2000, 12:39:04 AM
In message <slrn8qovu...@ashi.FootPrints.net>

k...@ashi.footprints.net (Kaz Kylheku) writes:
>On Tue, 29 Aug 2000 16:55:00 -0700, David Schwartz <dav...@webmaster.com>
>wrote:
>> Nonsense! How does the compiler know that 'pthread_mutex_lock' doesn't
>>call 'daft_function'?
>
>Because, as part of a POSIX implementation, the compiler can be endowed with
>POSIX API knowledge. The pthread_mutex_lock() function would not be conforming
>if it linked to an application-supplied daft_function().

If the compiler has special knowledge of pthread_mutex_lock(), then that
knowledge would include that it can call the thread scheduler. Since the
scheduler is originally what called daft_function or one of its ancestors on
the call stack, the compiler has to recognize the possibility of
re-entering daft_function and flush the static variables.


David L. Jones | Phone: (614) 292-6929
Ohio State University | Internet:
140 W. 19th St. Rm. 231a | jon...@er6s1.eng.ohio-state.edu
Columbus, OH 43210 | vm...@osu.edu

Disclaimer: Dogs can't tell it's not bacon.

Tristan Wibberley
Aug 30, 2000, 3:00:00 AM

David Schwartz wrote in message <39AC4D54...@webmaster.com>...

>
> Nonsense! How does the compiler know that 'pthread_mutex_lock' doesn't
>call 'daft_function'?


D'oh!

--
Tristan Wibberley

Tristan Wibberley
Aug 30, 2000, 3:00:00 AM

Charles Bryant wrote in message <2000-08-3...@chch.demon.co.uk>...


Ah yes, C has the compile *then* link thing. The linker knows that
pthread_mutex_lock won't call the global function (because the symbol is not
defined in the library in which pthread_mutex_lock is), but the compiler
doesn't.

pthread_mutex_lock must *not* be inlined *ever* then (unless it has a
non-inlined function call itself).


This is great, now I can trust my multi-threaded programs (the C ones) a
little more.

--
Tristan Wibberley

David Schwartz
Aug 30, 2000, 3:00:00 AM

Tristan Wibberley wrote:

> pthread_mutex_lock must *not* be inlined *ever* then (unless it has a
> non-inlined function call itself).

One could imagine a system which allowed you to inline
pthread_mutex_lock. GCC, for example, has a way of saying "don't keep
anything cached in registers across this line of code". It's usually
used to implement inline assembly, but could be used to create an
inlined version of pthread_mutex_lock.

The interesting thing is that in the usual case, you don't have to do
anything special. However, if you do try to do something special, like
make pthread_mutex_lock inline for performance reasons, you may have to
do other special things to prevent breakage.

You can feel safe though, the POSIX standard requires the mutex
functions to work. How that's implemented is not something you really
need to worry about. It is, however, interesting that it's actually
quite easy to meet the requirements.

DS

Joerg Faschingbauer
Aug 30, 2000, 3:00:00 AM
k...@ashi.footprints.net (Kaz Kylheku) writes:

> Under POSIX threads, you don't need volatile so long as you use the
> locking mechanism supplied by the interface.

How does pthread_mutex_(un)lock manage to get the registers flushed?

You are making several assumptions which need not hold necessarily.

1. pthread_mutex_(un)lock() are functions. Does POSIX require that?
(Honestly, I don't know.) What if they were macros? The call
wouldn't be a call then, so no compiler would be forced to
flush registers.

2. No compiler does interprocedural optimization. Of course, even if
one did, it would be hard for it to coordinate the allocated
registers of the calling function with those of the pthread
functions. But I don't think it is written anywhere that this must
not be done, so your assumption is wrong generally.

So, to make sure the code is correct in all possible environments, I'd
say volatile is mandatory.

Joerg

Richard Brodie
Aug 30, 2000, 3:00:00 AM

"Joerg Faschingbauer" <jfa...@hyperwave.com> wrote in message
news:861yz71...@hwiw01.hyperwave.com...

> k...@ashi.footprints.net (Kaz Kylheku) writes:
>
> > Under POSIX threads, you don't need volatile so long as you use the
> > locking mechanism supplied by the interface.

> So, to make sure the code is correct in all possible environments, I'd
> say volatile is mandatory.

You are arguing backwards. The POSIX specification defines what the
code may do. If on hypothetical implementation 'A' it does something
different, then that implementation is broken.

Even if it were impossible to build a real implementation, it wouldn't
change how a conformant implementation should behave.

Joerg Faschingbauer
Aug 30, 2000, 3:00:00 AM
"Richard Brodie" <R.Br...@rl.ac.uk> writes:

Pardon? Somehow we misunderstand each other. What do you mean when you say
"implementation"?

When I was saying "environment" I actually meant the compiler and the
linker (which are not targeted by any of the relevant pthread standards
we are talking about).

Joerg

Drazen Kacar
Aug 30, 2000, 3:00:00 AM
Joerg Faschingbauer wrote:

> 2. No compiler does interprocedural optimization. Of course, even if

Yes, they do, and it's not even very uncommon.

> one would do, it would be hard for it to coordinate the allocated
> registers of the calling function with those of the pthread
> functions. But I don't think it is written anywhere that this must
> not be done, so your assumption is wrong generally.

The oldest one I can think of is Ultrix cc, where -O3 works only if
you give all of your source files to the compiler in a single invocation.
Some modern compilers also have that option (-xcrossfile with Sun C, or -O4
and higher with Apogee C). Here's a quote from the Apogee C man page:

Caution must be used in handling object files produced
by -O4 and -O5. In these modes, when multiple files
are passed to the compiler, interprocedural
optimization across files occurs, so the resultant
object files are dependent on each other for correct
execution. If a change is made in one of these source
files, all of the related files must be recompiled.

--
.-. .-. I don't work for my employer.
(_ \ / _)
| da...@srce.hr
| da...@fly.srk.fer.hr

Richard Brodie
Aug 30, 2000, 3:00:00 AM

"Joerg Faschingbauer" <jfa...@hyperwave.com> wrote in message
news:86zolvy...@hwiw01.hyperwave.com...


> Pardon? Somehow we misunderstood. What do you mean when you say
> "implementation"?

Something that takes a piece of source code including POSIX threads
calls and executes it. This will include compiler, O/S and library support.

> When I was saying "environment" I actually meant the compiler and the
> linker (which is not targeted by any of the relevant pthread standards
> we are talking about).

I see the distinction but I don't think it is a valid one. The normal way of
thinking of an API is the interface a set of libraries provide. However,
from a standards point of view you should be able to take a black box
approach: you provide the system with source that is well defined, it
gives you the right answer. If you look at it that way, you can't factor
out the compiler.


Ian Collins
Aug 30, 2000, 3:00:00 AM
Joerg Faschingbauer wrote:

> k...@ashi.footprints.net (Kaz Kylheku) writes:
>
> > Under POSIX threads, you don't need volatile so long as you use the
> > locking mechanism supplied by the interface.
>

> How does pthread_mutex_(un)lock manage to get the registers flushed?
>

It depends on the hardware to do this.

>
> You are making several assumptions which need not hold necessarily.
>
> 1. pthread_mutex_(un)lock() are functions. Does POSIX require that?
> (Honestly, I don't know.) What if they were macros? The call
> wouldn't be a call then, so no compiler would be forced by no means to
> flush registers.
>

> 2. No compiler does interprocedural optimization. Of course, even if

> one would do, it would be hard for it to coordinate the allocated
> registers of the calling function with those of the pthread
> functions. But I don't think it is written anywhere that this must
> not be done, so your assumption is wrong generally.
>

Think of a RISC machine with a register wheel (Sparc, for example). Here
one function's output registers are another's input ones.


>
> So, to make sure the code is correct in all possible environments, I'd
> say volatile is mandatory.
>

It's not. That's what memory barriers are for.

Ian

>
> Joerg


Joerg Faschingbauer
Aug 30, 2000, 3:00:00 AM
Ian Collins <it...@imerge.co.uk> writes:

> Joerg Faschingbauer wrote:
>
> > k...@ashi.footprints.net (Kaz Kylheku) writes:
> >
> > > Under POSIX threads, you don't need volatile so long as you use the
> > > locking mechanism supplied by the interface.
> >
> > How does pthread_mutex_(un)lock manage to get the registers flushed?
> >
>
> It depends on the hardware to do this.

Registers are generally flushed by store instructions. Store
instructions are generated by the compiler when it decides that
registers have to be flushed. A compiler decides that registers have
to be flushed based on (compiler) implementation dependent
criteria. One common criterion is an intervening function call. Not
that much to do with hardware. What you mean is processor cache
coherency.


> >
> > You are making several assumptions which need not hold necessarily.
> >
> > 1. pthread_mutex_(un)lock() are functions. Does POSIX require that?
> > (Honestly, I don't know.) What if they were macros? The call
> > wouldn't be a call then, so no compiler would be forced by no means to
> > flush registers.
> >
> > 2. No compiler does interprocedural optimization. Of course, even if
> > one would do, it would be hard for it to coordinate the allocated
> > registers of the calling function with those of the pthread
> > functions. But I don't think it is written anywhere that this must
> > not be done, so your assumption is wrong generally.
> >
>
> Think of a RISC machine with a register wheel (Sparc, for example). Here
> one function's output registers are anothers input ones.
>
>
> >
> > So, to make sure the code is correct in all possible environments, I'd
> > say volatile is mandatory.
> >
>
> It's not. That's what memory barriers are for.

A barrier won't write registers to memory. For sure.

Joerg

Eric Sosman
Aug 30, 2000, 3:00:00 AM
Tristan Wibberley wrote:
>
> Ah yes, C has the compile *then* link thing. The linker knows that
> pthread_mutex_lock won't call the global function (because the symbol is not
> defined in the library in which pthread_mutex_lock is), but the compiler
> doesn't.

If the linker "knows" this, the linker is broken. Consider
passing a function pointer as an argument to a library function;
the library has no reference to the pointed-to function, yet the
library can call that function anyhow.

--
Eric....@east.sun.com

David Schwartz
Aug 30, 2000, 3:00:00 AM

Joerg Faschingbauer wrote:
>
> k...@ashi.footprints.net (Kaz Kylheku) writes:
>
> > Under POSIX threads, you don't need volatile so long as you use the
> > locking mechanism supplied by the interface.
>
> How does pthread_mutex_(un)lock manage to get the registers flushed?

Who cares, it just does. POSIX requires it.



> You are making several assumptions which need not hold necessarily.

No, they are not assumptions, they are guarantees provided in the
standard.



> 1. pthread_mutex_(un)lock() are functions. Does POSIX require that?

Nope. Why should POSIX tell an implementation HOW to make things work?

> (Honestly, I don't know.) What if they were macros? The call
> wouldn't be a call then, so no compiler would be forced by no means to
> flush registers.

IF pthread_mutex_(un)lock were a macro, and the implementation were
still to conform to POSIX's requirements, it would need to use some form
of 'memory invalidation' statement to warn the compiler not to stash
information in registers across the statement. GCC, for example, has
just such a statement. It's commonly used for inline assembly.



> 2. No compiler does interprocedural optimization. Of course, even if
> one would do, it would be hard for it to coordinate the allocated
> registers of the calling function with those of the pthread
> functions. But I don't think it is written anywhere that this must
> not be done, so your assumption is wrong generally.

I have no idea what this is supposed to mean.



> So, to make sure the code is correct in all possible environments, I'd
> say volatile is mandatory.

You'd say wrong. Read the standard.

DS

Kaz Kylheku
Aug 30, 2000, 3:00:00 AM
On 30 Aug 2000 04:39:04 GMT, David Jones <JON...@er6.eng.ohio-state.edu> wrote:
>In message <slrn8qovu...@ashi.FootPrints.net>
> k...@ashi.footprints.net (Kaz Kylheku) writes:
>>On Tue, 29 Aug 2000 16:55:00 -0700, David Schwartz <dav...@webmaster.com>
>>wrote:
>>> Nonsense! How does the compiler know that 'pthread_mutex_lock' doesn't
>>>call 'daft_function'?
>>
>>Because, as part of a POSIX implementation, the compiler can be endowed with
>>POSIX API knowledge. The pthread_mutex_lock() function would not be conforming
>>if it linked to an application-supplied daft_function().
>
>If the compiler has special knowledge of pthread_mutex_lock(), then that
>knowledge would include that it can call the thread scheduler.

Under preemptive threading, the execution can be suspended *at any point* to
invoke the scheduler; pthread_mutex_lock is not special in that regard. Yet
compilers clearly do not treat each instruction with the same suspicion that
pthread_mutex_lock deserves.

The standard is flawed because it doesn't mention that calls to
pthread_mutex_lock and pthread_mutex_unlock must be treated specially.
We all know how we want POSIX mutexes to work, and how they do work in
practice, but it should also be codified in the standard, even though
it may be painfully obvious.

How about the following paragraph?

The values of all objects shall be made stable immediately
prior to the call to pthread_mutex_lock, pthread_mutex_unlock,
pthread_mutex_trylock and pthread_mutex_timedlock. The
first abstract access to any object after a call to one of the
locking functions shall be an actual access; any cached copy of
an object that is accessed shall be invalidated. Moreover, the
values of objects made stable prior to a pthread_mutex_unlock by
one thread, shall appear stable to another thread which subsequently
acquires the mutex. The only exceptions to these rules are objects which
are not shared by threads. (1)
-----
1. Objects of storage class auto whose address is never taken obviously
have this property. If the language implementation is able to prove this
property for other objects, it may apply optimizations to them that would
otherwise be forbidden.

Kaz Kylheku
Aug 30, 2000, 3:00:00 AM
On 30 Aug 2000 17:24:19 +0200, Joerg Faschingbauer <jfa...@hyperwave.com> wrote:

>Ian Collins <it...@imerge.co.uk> writes:
>
>> Joerg Faschingbauer wrote:
>>
>> > k...@ashi.footprints.net (Kaz Kylheku) writes:
>> >
>> > > Under POSIX threads, you don't need volatile so long as you use the
>> > > locking mechanism supplied by the interface.
>> >
>> > How does pthread_mutex_(un)lock manage to get the registers flushed?
>> >
>>
>> It depends on the hardware to do this.
>
>Registers are generally flushed by store instructions. Store

That's right. A good term for this is ``spill'', which is harder to confuse
with cache flushing.

Kaz Kylheku
Aug 30, 2000, 3:00:00 AM
On Wed, 30 Aug 2000 11:28:46 -0700, David Schwartz <dav...@webmaster.com> wrote:
>
>Joerg Faschingbauer wrote:
>>
>> k...@ashi.footprints.net (Kaz Kylheku) writes:
>>
>> > Under POSIX threads, you don't need volatile so long as you use the
>> > locking mechanism supplied by the interface.
>>
>> How does pthread_mutex_(un)lock manage to get the registers flushed?
>
> Who cares, it just does. POSIX requires it.

Everyone is saying that, but I've never seen a chapter and verse quote.
I'm not saying that I don't believe it or that it's not existing practice;
but just that maybe it's not adequately codified in the document.

To answer the question: how can it manage to get the registers flushed?
Whether or not the requirement is codified in the standard, it can be met
in a number of ways. An easy way to meet the requirement is to spill
registers at each external function call.

Barring that, the pthread_mutex_lock functions could be specially recognized by
the compiler. They could be, for instance, implemented as inline functions
which contain special compiler directives which tell the compiler to avoid
caching.

The GNU compiler has such a directive, for instance:

__asm__ __volatile__ ("" : : : "memory");

The "memory" part takes care of defeating caching, and the __volatile__
prevents code motion of the inlined code itself.

Of course, GCC doesn't need this in the context we are discussing, because
it will do ``the right thing'' with external function calls.

I've only used the above as a workaround to GCC optimization bugs.
It can also be used as the basis for inserting a memory barrier instruction:

#define mb() __asm__ __volatile__ \
("<insert barrier opcode here>" : : : "memory");

It's a good idea to do it like this so that the compiler's optimizations do
not make the memory barrier useless, by squirreling away data in registers
or moving the instruction around in the generated code.

This would be typically used in the implementation of a mutex function,
not in its interface, to ensure that internal accesses to the mutex object
itself are conducted properly.
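
As a rough sketch (the test-and-set primitive here is a made-up stand-in for
whatever atomic instruction the platform provides, and a real lock would also
need the hardware barrier discussed above), an inlined lock built on that
directive might look like:

typedef struct { volatile int held; } toy_mutex_t;

extern int toy_test_and_set (volatile int *p);  /* returns previous value;
                                                   assumed atomic, e.g. written
                                                   in assembler per platform */

static __inline__ void toy_mutex_lock (toy_mutex_t *mtx)
{
    while (toy_test_and_set (&mtx->held))
        ;                                       /* spin until the lock is ours */
    __asm__ __volatile__ ("" : : : "memory");   /* compiler barrier: nothing is
                                                   kept in registers across it */
}

static __inline__ void toy_mutex_unlock (toy_mutex_t *mtx)
{
    __asm__ __volatile__ ("" : : : "memory");   /* force cached values out first */
    mtx->held = 0;
}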

Tristan Wibberley
Aug 30, 2000, 3:00:00 AM

Kaz Kylheku wrote in message ...

> The values of all objects shall be made stable immediately
> prior to the call to pthread_mutex_lock, pthread_mutex_unlock,
> pthread_mutex_trylock and pthread_mutex_timedlock. The
> first abstract access to any object after a call to one of the
> locking functions shall be an actual access; any cached copy of
> an object that is accessed shall be invalidated. Moreover, the
> values of objects made stable prior to a pthread_mutex_unlock by
> one thread, shall appear stable to another thread which subsequently
> acquires the mutex. The only exception to these rules are objects which
> are not shared by threads. (1)


From elsewhere in this thread, I believe that C will do this if
pthread_mutex_lock is not defined in the current translation unit (so POSIX
implementations which do not bother with special support must simply not
define pthread_mutex_(un)lock in the current translation unit).

If a static function is never ever used, the optimisations can be done;
otherwise not. In that case the function can be optimised out of existence.

--
Tristan Wibberley

Ian Collins
Aug 30, 2000, 3:00:00 AM
Joerg Faschingbauer wrote:
>
> Ian Collins <it...@imerge.co.uk> writes:
>
> > Joerg Faschingbauer wrote:
> >
> > > k...@ashi.footprints.net (Kaz Kylheku) writes:
> > >
> > > > Under POSIX threads, you don't need volatile so long as you use the
> > > > locking mechanism supplied by the interface.
> > >
> > > How does pthread_mutex_(un)lock manage to get the registers flushed?
> > >
> >
> > It depends on the hardware to do this.
>
> Registers are generally flushed by store instructions. Store
> instructions are generated by the compiler when it decides that
> registers have to be flushed. A compiler decides that registers have
> to be flushed based on (compiler) implementation dependent
> criteria. One common criterion is an intermittent function call. Not
> that much to do with hardware. What you mean is processor cache
> coherency.
>
> > >
> > > You are making several assumptions which need not hold necessarily.
> > >
> > > 1. pthread_mutex_(un)lock() are functions. Does POSIX require that?
> > > (Honestly, I don't know.) What if they were macros? The call
> > > wouldn't be a call then, so no compiler would be forced by no means to
> > > flush registers.
> > >
> > > 2. No compiler does interprocedural optimization. Of course, even if
> > > one would do, it would be hard for it to coordinate the allocated
> > > registers of the calling function with those of the pthread
> > > functions. But I don't think it is written anywhere that this must
> > > not be done, so your assumption is wrong generally.
> > >
> >
> > Think of a RISC machine with a register wheel (Sparc, for example). Here
> > one function's output registers are anothers input ones.
> >
> >
> > >
> > > So, to make sure the code is correct in all possible environments, I'd
> > > say volatile is mandatory.
> > >
> >
> > It's not. That's what memory barriers are for.
>
> A barrier won't write registers to memory. For sure.
>
> Joerg

All true, sorry for talking bollocks.

Ian

David Schwartz
Aug 30, 2000, 3:00:00 AM

Kaz Kylheku wrote:

> Under preemptive threading, the execution can be suspended *at any point* to
> invoke the scheduler; pthread_mutex_lock is not special in that regard. Yet
> compilers clearly do not treat each instruction with the same suspicion that
> pthread_mutex_lock deserves.

POSIX doesn't require preemptive threading.



> The standard is flawed because it doesn't mention that calls to
> pthread_mutex_lock and pthread_mutex_unlock must be treated specially.

Because they don't need to be.

> We all know how we want POSIX mutexes to work, and how they do work in
> practice, but it should also be codified in the standard, even though
> it may be painfully obvious.

Impossible, since the standard makes no assumptions about the
underlying hardware.



> How about the following paragraph?
>

> The values of all objects shall be made stable immediately
> prior to the call to pthread_mutex_lock, pthread_mutex_unlock,
> pthread_mutex_trylock and pthread_mutex_timedlock.

What does "made stable" mean?

> The
> first abstract access to any object after a call to one of the
> locking functions shall be an actual access; any cached copy of
> an object that is accessed shall be invalidated.

Why? Cache coherency on most platforms makes it safe to use a cached
copy. Unless you mean something else by "cached".

> Moreover, the
> values of objects made stable prior to a pthread_mutex_unlock by
> one thread, shall appear stable to another thread which subsequently
> acquires the mutex.

I really don't know what you're trying to say here. What does "appear
stable" mean? Note that there is no requirement in the POSIX standard
that a CPU implementing it even have registers.

DS

Kaz Kylheku
Aug 31, 2000, 1:14:09 AM
On Wed, 30 Aug 2000 19:57:20 -0700, David Schwartz <dav...@webmaster.com> wrote:
>
>Kaz Kylheku wrote:
>
>> Under preemptive threading, the execution can be suspended *at any point* to
>> invoke the scheduler; pthread_mutex_lock is not special in that regard. Yet
>> compilers clearly do not treat each instruction with the same suspicion that
>> pthread_mutex_lock deserves.
>
> POSIX doesn't require preemptive threading.

But it doesn't prohibit it.

>> The standard is flawed because it doesn't mention that calls to
>> pthread_mutex_lock and pthread_mutex_unlock must be treated specially.
>
> Because they don't need to be.
>
>> We all know how we want POSIX mutexes to work, and how they do work in
>> practice, but it should also be codified in the standard, even though
>> it may be painfully obvious.
>
> Impossible, since the standard makes no assumptions about the
>underlying hardware.
>
>> How about the following paragraph?
>>
>> The values of all objects shall be made stable immediately
>> prior to the call to pthread_mutex_lock, pthread_mutex_unlock,
>> pthread_mutex_trylock and pthread_mutex_timedlock.
>
> What does "made stable" mean?

This usage is borrowed from the C standard, which says

``at sequence points, volatile objects are stable in the sense
that previous evaluations are complete and subsequent
evaluations have not yet occurred.''

In effect, what the proposed requirement asks is that some objects be treated as
volatile when crossing into the critical region.

>> The
>> first abstract access to any object after a call to one of the
>> locking functions shall be an actual access; any cached copy of
>> an object that is accessed shall be invalidated.
>
> Why? Cache coherency on most platforms makes it safe to use a cached
>copy. Unless you mean something else by "cached".

By cached I mean held in some fast storage in the data processing unit,
under the control of the language implementation. E.g. caching variables
in a register. I could not think of a nice way to say the above without
making use of the term cache.

>> Moreover, the
>> values of objects made stable prior to a pthread_mutex_unlock by
>> one thread, shall appear stable to another thread which subsequently
>> acquires the mutex.
>
> I really don't know what you're trying to say here. What does "appear
>stable" mean? Note that there is no requirement in the POSIX standard
>that a CPU implementing it even have registers.

That's right; so we want wording which doesn't make an implementation
impossible on machines that don't have registers. We don't want to use
terms like ``memory barrier'' and whatnot, only to say that it should work
right from the C programmer's point of view.

The wording is abstract. All it means is that if you make a change to a shared
variable inside a mutex lock, then when another thread acquires the mutex lock,
it will see the full result of that change. Stable means the value of the
object has stabilized; the update has completed and a new update has not
yet begun.

For example, suppose that object i has the value X and the statement
i = Y is executed. The object i is stable when it can be observed to
have the value Y. If i = Y is executed within a mutex lock, and the
mutex is unlocked and then acquired by some other thread, then
the requirement for stability means that the other thread is guaranteed
to see the value Y.

This doesn't assume that you have registers, but it does put some constraints
on what you can do with registers if you have them.

Charles Bryant
Aug 30, 2000, 9:25:05 PM
In article <slrn8qqm2...@ashi.FootPrints.net>,

Kaz Kylheku <k...@ashi.footprints.net> wrote:
>Under preemptive threading, the execution can be suspended *at any point* to
>invoke the scheduler; pthread_mutex_lock is not special in that regard. Yet
>compilers clearly do not treat each instruction with the same suspicion that
>pthread_mutex_lock deserves.

pthread_mutex_lock() is special. While threads may be pre-empted at
any point, they are not permitted to access shared data, so the order
in which the operations are performed is irrelevant. By calling
pthread_mutex_lock() a thread gains permission to access shared data,
so at that point the thread needs to update any local copies of that
data. Similarly, by calling pthread_mutex_unlock() a thread
relinquishes this permission, so it must have updated the shared data
from any local copies. Between these two calls, it is the only thread
which is permitted to access the shared data, so it can safely cache
as it likes.

>How about the following paragraph?
>
> The values of all objects shall be made stable immediately
> prior to the call to pthread_mutex_lock, pthread_mutex_unlock,

> pthread_mutex_trylock and pthread_mutex_timedlock. The


> first abstract access to any object after a call to one of the
> locking functions shall be an actual access; any cached copy of
> an object that is accessed shall be invalidated.

That is unnecessarily restrictive. Suppose we have a buffering scheme
which uses this code:
	for (;;) {
		lock
		while (!items_ready) wait(data)
		if (!--items_ready) signal(space)
		unlock
		use buffer[tail]
		tail = (tail + 1) % BUFLEN
	}

Analysis of the other uses of 'items_ready' may indicate that it can
be optimised into:
	local_ready = 0
	lastbatch = 0
	for (;;) {
		if (!local_ready) {
			lock
			items_ready -= lastbatch
			if (!items_ready) signal(space)
			while (!items_ready) wait(data)
			local_ready = lastbatch = items_ready
			unlock
		} else local_ready--
		use buffer[tail]
		tail = (tail + 1) % BUFLEN
	}

Of course the analysis required to determine that this is a valid
optimisation is not simple, and I would not expect to find it in
current compilers, but I don't think the standard should prohibit it.
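
For reference, the unoptimised loop above written with real pthreads calls might
look roughly like this sketch (the declarations are invented, and the producer
side and error checking are omitted):

#include <pthread.h>

enum { BUFLEN = 16 };                 /* arbitrary */

extern pthread_mutex_t mtx;
extern pthread_cond_t data;           /* signalled when items_ready becomes non-zero */
extern pthread_cond_t space;          /* signalled when the buffer drains */
extern int items_ready;
extern int buffer[BUFLEN];
extern void use (int item);

void consumer (void)
{
    int tail = 0;

    for (;;) {
        pthread_mutex_lock (&mtx);
        while (!items_ready)
            pthread_cond_wait (&data, &mtx);
        if (!--items_ready)
            pthread_cond_signal (&space);
        pthread_mutex_unlock (&mtx);
        use (buffer[tail]);           /* outside the lock, as in the original */
        tail = (tail + 1) % BUFLEN;
    }
}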

--
Eppur si muove

Kaz Kylheku
Aug 31, 2000, 7:22:48 AM
On 31 Aug 2000 01:25:05 -0000, Charles Bryant <n79561...@chch.demon.co.uk>
wrote:

>In article <slrn8qqm2...@ashi.FootPrints.net>,
>Kaz Kylheku <k...@ashi.footprints.net> wrote:
>>Under preemptive threading, the execution can be suspended *at any point* to
>>invoke the scheduler; pthread_mutex_lock is not special in that regard. Yet
>>compilers clearly do not treat each instruction with the same suspicion that
>>pthread_mutex_lock deserves.
>
>pthread_mutex_lock() is special. While threads may be pre-empted at
>any point, they are not permitted to access shared data, so the order
>in which the operations are performed is irrelevant. By calling
>pthread_mutex_lock() a thread gains permission to access shared data,
>so at that point the thread needs to update any local copies of that
>data.

Again, I can't find any wording like this in POSIX. But it does reflect how I
wish mutexes would be formally required to work. :)

>> The values of all objects shall be made stable immediately
>> prior to the call to pthread_mutex_lock, pthread_mutex_unlock,
>> pthread_mutex_trylock and pthread_mutex_timedlock. The
>> first abstract access to any object after a call to one of the
>> locking functions shall be an actual access; any cached copy of
>> an object that is accessed shall be invalidated.
>
>That is unnecessarily restrictive. Suppose we have a buffering scheme
>which uses this code:
> for (;;) {
> lock
> while (!items_ready) wait(data)
> if (!--items_ready) signal(space)
> unlock
> use buffer[tail]
> tail = (tail + 1) % BUFLEN
> }
>
>Analysis of the other uses of 'items_ready' may indicate that it can
>be optimised into:
> local_ready = 0
> lastbatch = 0
> for (;;) {
> if (!local_ready) {
> lock
> items_ready -= lastbatch

This assumes that some other thread did not decrement items_ready
in the meanwhile. The lastbatch variable is a cached local which
could be out of date at this point; it could overestimate how many
items are actually ready and thus make items_ready negative.

> if (!items_ready) signal(space)
> while (!items_ready) wait(data)
> local_ready = lastbatch = items_ready

You really need to zero out items_ready here to indicate that the items are
consumed before you give up the lock.

Now maybe the assumption is that there is only one consumer; but the original
code works with more than one and this one doesn't, so they are not equivalent.

But I understand what you are getting at; the compiler could turn the
inefficient use of the lock into a better algorithm which acquires the
items in batches, and then processes them in batches outside of the
lock, rather than getting one item at a time out of the lock.

The rules that I wrote down were not intended to rule out useful optimizations.
Rather, they describe the behavior of the abstract machine. Any optimizations
which produce the same end result as would the abstract machine are fine!

This is in keeping with how the C language is defined. For example, the C
standard says that in the abstract machine, the values of objects are stable
at each sequence point. But in the actual semantics this need not be so.
E.g. a program which multiplies two vectors together to form the result 42 on
standard output could just be replaced by a main function which contains
puts("42");

The rules for POSIX mutexes should be stated according to the same
principles. That is, it is sufficient to describe what the mutex behavior is
like in the abstract machine, and allow implementors to deduce from that
what optimizations are possible.

Thus the revised text can be condensed to:

In the abstract machine, the value most recently stored in an object by a
thread which owns a lock shall be available to any other thread
which subsequently acquires that lock.

That's it! Actually a rule to cover synchronization
objects in general might be better:

The value most recently stored in an object by a thread which
subsequently signals a synchronization object shall be available to any
other thread which proceeds or resumes as a result of that signal.

That actually covers the mutex case as well, if you consider
pthread_mutex_unlock to be such a signal.

Dave Butenhof
Aug 31, 2000, 9:30:10 AM
Kaz Kylheku wrote:

> The standard is flawed because it doesn't mention that calls to
> pthread_mutex_lock and pthread_mutex_unlock must be treated specially.

> We all know how we want POSIX mutexes to work, and how they do work in
> practice, but it should also be codified in the standard, even though
> it may be painfully obvious.

The standard requires memory coherency between threads based on the POSIX
synchronization operations. It does NOT specifically dictate the compiler or system
behavior necessary to achieve that coherency, because it has no power over the C
language nor over the hardware. Besides, it really doesn't matter how the
requirements are achieved, nor by whom.

An implementation (thread library, compiler, linker, OS, hardware, etc.) that
doesn't make memory behave correctly with respect to POSIX synchronization
operations simply does not conform to POSIX. This means, in particular, (because
POSIX does not require use of volatile), that any system that doesn't work without
volatile is not POSIX. Can such a system be built? Certainly; but it's not POSIX.
(It's also not particularly usable, which may be even more important to some
people.)

OK, you want chapter and verse? Sure, here we go. POSIX 1003.1-1996, page 32:

2.3.8 memory synchronization: Applications shall ensure that access to any memory
location by more than one thread of control (threads or processes) is restricted
such that no thread of control can read or modify a memory location while another
thread of control may be modifying it. Such access is restricted using functions
that synchronize thread execution and also synchronize memory with respect to other
threads. The following functions synchronize memory with respect to other threads:

fork()                    pthread_mutex_unlock()      sem_post()
pthread_create()          pthread_cond_wait()         sem_trywait()
pthread_join()            pthread_cond_timedwait()    sem_wait()
pthread_mutex_lock()      pthread_cond_signal()       wait()
pthread_mutex_trylock()   pthread_cond_broadcast()    waitpid()

In other words, the application is responsible for relying only on explicit memory
synchronization based on the listed POSIX functions. The implementation is
responsible for ensuring that correct code will see synchronized memory. "Whatever
it takes."

Normally, the compiler doesn't need to do anything it wouldn't normally do for a
routine call to achieve this. A particularly aggressive global optimizer, or an
implementation that "inlines" mutex operations, might need additional compiler
support to meet the requirements, but that's all beyond the scope of the standard.
The requirements must be met, and if they are, application and library developers
who use threads just don't need to worry. Unless of course you choose to try to
create your own memory synchronization without using the POSIX functions, in which
case no current standard will help you and you're entirely on your own on each
platform.

/------------------[ David.B...@compaq.com ]------------------\
| Compaq Computer Corporation http://members.aol.com/drbutenhof |
| 110 Spit Brook Rd ZKO2-3/Q18, Nashua NH 03062-2698 |
\--------[ http://www.awl.com/cseng/titles/0-201-63392-2/ ]-------/

Charles Bryant
Aug 31, 2000, 9:07:46 PM
In article <slrn8qsg8...@ashi.FootPrints.net>,

Kaz Kylheku <k...@ashi.footprints.net> wrote:
>On 31 Aug 2000 01:25:05 -0000, Charles Bryant <n79561...@chch.demon.co.uk>
>wrote:
>> Suppose we have a buffering scheme
>>which uses this code:
>> for (;;) {
>> lock
>> while (!items_ready) wait(data)
>> if (!--items_ready) signal(space)
>> unlock
>> use buffer[tail]
>> tail = (tail + 1) % BUFLEN
>> }
>>
>>Analysis of the other uses of 'items_ready' may indicate that it can
>>be optimised into:
>> local_ready = 0
>> lastbatch = 0
>> for (;;) {
>> if (!local_ready) {
>> lock
>> items_ready -= lastbatch
>
>This assumes that some other thread did not decrement items_ready
>in the meanwhile. The lastbatch variable is a cached local which
>could be out of date at this point; it could overestimate how many
>items are actually ready and thus make items_ready negative.

True. It is only equivalent if it is possible to determine that there
is only one consumer.

>> if (!items_ready) signal(space)
>> while (!items_ready) wait(data)
>> local_ready = lastbatch = items_ready
>
>You really need to zero out items_ready here to indicate that the items are
>consumed before you give up the lock.

That makes it worse. The producer can then overwrite the entries
which this thread has not yet used. To allow for multiple consumers
you need to add 'space_ready' which keeps track of how much space is
available for re-use. However if there are multiple consumers it's
probably not a good idea for one of them to grab all the items.

--
Eppur si muove

Kaz Kylheku
Sep 1, 2000, 12:35:43 PM
On Thu, 31 Aug 2000 09:30:10 -0400, Dave Butenhof <David.B...@compaq.com>
wrote:

>OK, you want chapter and verse? Sure, here we go. POSIX 1003.1-1996, page 32:
>
>2.3.8 memory synchronization: Applications shall ensure that access to any
>memory location by more than one thread of control (threads or processes) is
>restricted such that no thread of control can read or modify a memory location
>while another thread of control may be modifying it. Such access is restricted
>using functions that synchronize thread execution and also synchronize memory
>with respect to other threads. The following functions synchronize memory with
>respect to other threads:

Alas, my copy of a 200x draft doesn't have this text, or anything resembling
it. Perhaps it has been removed, without a suitable replacement having been
found yet?

Dave Butenhof
Sep 5, 2000, 8:11:31 AM
Kaz Kylheku wrote:

You're just looking in the wrong place.

"UNIX 98" (and the merged -2001 family) has a different structure than
1003.1-1996. In particular, the "definitions" are found in the XBD document rather
than in XSH where the "interfaces" live (and which most people think of, not quite
correctly, as the equivalent to 1003.1).

The relevant definition is on page 123 (in my PDF copy), section 4.8 "Memory
Synchronization". Thanks, though, for making me look, because it's somewhat
broken. It hasn't had the 1003.1d and 1003.1j synchronization functions added
(including our nasty friend pthread_mutex_timedlock, though, I suppose, some would
consider that lack a "feature").
