Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Talking about volatile and threads synchronization...

207 views

Carlos Moreno

Oct 8, 2002, 6:38:27 PM

I'm having a hard time with one detail related to this
issue of whether or not to use volatile...

I keep hearing that volatile is useless from the
multithreading point of view... Though I completely
agree that the use of volatile cannot be sufficient
to guarantee thread-safety, my problem is that I
think it may be necessary in some cases (in addition
to proper synchronization).

What prevents the optimizer from assuming whatever it
wants to assume about the variable "shared" in the
following program:

int main()
{
....

int shared = 1;

...

lots of code that doesn't touch shared

// at this point, the compiler may assume that
// shared is 1 -- however, with or without proper
// synchronization, shared may have been changed
// by another thread (a thread that may have been
// created and handed a pointer or reference to
// shared)

...
}

Am I missing something? The way I see it, shared would
have to be declared volatile -- that won't be sufficient,
but it seems necessary to guarantee thread-safety.

Why is it that I always hear that volatile and thread-
safety are two completely unrelated things?? (I do
understand that the naive eye may tend to believe that
volatile alone can guarantee thread-safety, and that
is a misconception ... But the typical reaction to
such claim seems also exaggerated... Again, unless
I'm missing something?)

Thanks for any comments,

Carlos
--

Phil Frisbie, Jr.

Oct 8, 2002, 7:15:18 PM
Carlos Moreno wrote:
>
> I'm having a hard time with one detail related to this
> issue of whether or not using volatile...

I use a volatile atomic variable when I need to change it infrequently, but I
need to read it frequently. I still wrap the variable in a mutex to protect
changing the variable.


Phil Frisbie, Jr.
Hawk Software
http://www.hawksoft.com

Geoff Hale

Oct 8, 2002, 8:32:37 PM
As far as I have been able to tell, declaring a variable "volatile" simply
turns off any optimizations on that variable. The original reason for using
volatile that I was told was for when the hardware modified a particular
value behind the software's back. It makes sense to me that some
optimizations would have problems with this. In terms of threading, though, I
think the only reason to use volatile is because of mistakes that
optimizations could make ... probably around multiple threads using the
variable at the same time. Generally, though, I'd think the compiler should
be smart enough not to make these mistakes, and the only time you'd need to
use volatile is when there is a bug in the compiler's optimizations. Is that
a correct assumption?

Also, in your example your shared variable was local to main (located on the
stack). Did you mean to make it global (in static storage)?

Anyway, later.
-geoff


Hillel Y. Sims

Oct 8, 2002, 11:33:26 PM


"Carlos Moreno" <moreno_at_mo...@xx.xxx> wrote in message
news:3DA35E63...@xx.xxx...


>
> int main()
> {
> ....
>
> int shared = 1;
>
> ...
>
> lots of code that don't touch shared
>
> // at this point, the compiler may assume that
> // shared is 1 -- however, with or without proper
> // synchronization, shared may have been changed
> // by another thread (a thread that may have been

^^^^^^^^^^^^^^^^^^^^^^^^^^^^


> // created and handed a pointer or reference to

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> // shared)
^^^^^^^^^^^^^^^^^^


>
> ...
> }
>
> Am I missing something? The way I see it, shared would
> have to be declared volatile -- that won't be sufficient,
> but it seems necessary to guarantee thread-safety.
>

I don't believe a value may remain cached in a register once you have taken
its address.

hys

--
Hillel Y. Sims
FactSet Research Systems
hsims AT factset.com


Steve Watt

Oct 9, 2002, 1:34:46 AM
In article <3DA35E63...@xx.xxx>,

Carlos Moreno <moreno_at_mo...@xx.xxx> wrote:
>
>I'm having a hard time with one detail related to this
>issue of whether or not using volatile...
>
>I keep hearing that volatile is useless from the
>multithreading point of view... Though I completely
>agree that the use of volatile can not be sufficient
>to guarantee thread-safety, my problem is that I
>think it may be necessary in some cases (in addition
>to proper synchronization).
>
>What prevents the optimizer to assume whatever it
>wants to assume about the variable "shared" in the
>following program:

In order for another thread to have access to the variable, its
address must be taken, since it has automatic storage class.

If another thread then modifies it through that pointer, if the
program is to work correctly, it must be accessed with some
mutual exclusion taken to ensure memory visibility. The act
of locking a mutex (among other things) will, on a correct
POSIX implementation, flush any cached registers to memory.

>Am I missing something? The way I see it, shared would
>have to be declared volatile -- that won't be sufficient,
>but it seems necessary to guarantee thread-safety.
>
>Why is it that I always hear that volatile and thread-
>safety are two completely unrelated things?? (I do
>understand that the naive eye may tend to believe that
>volatile alone can guarantee thread-safety, and that
>is a misconception ... But the typical reaction to
>such claim seems also exaggerated... Again, unless
>I'm missing something?)

Volatile can not guarantee thread safety. It is also not required
for thread safety. Correct adherence by the application to the
POSIX memory visibility rules is *all* that is required for thread
safety.
--
Steve Watt KD6GGD PP-ASEL-IA ICBM: 121W 56' 57.8" / 37N 20' 14.9"
Internet: steve @ Watt.COM Whois: SW32
Free time? There's no such thing. It just comes in varying prices...

Momchil Velikov

Oct 9, 2002, 3:19:12 AM
Carlos Moreno <moreno_at_mo...@xx.xxx> wrote in message news:<3DA35E63...@xx.xxx>...
> I'm having a hard time with one detail related to this
> issue of whether or not using volatile...
>
> I keep hearing that volatile is useless from the
> multithreading point of view... Though I completely

Not true. (IMHO)

> What prevents the optimizer to assume whatever it
> wants to assume about the variable "shared" in the
> following program:

When you pass the address as a parameter to some function (e.g.
pthread_create) the compiler _will_ assume that the value of
``shared'' may have changed (if it doesn't know anything about that
other function or does no inter-procedural optimization).

> Am I missing something? The way I see it, shared would
> have to be declared volatile -- that won't be sufficient,
> but it seems necessary to guarantee thread-safety.

The fact is that ``shared'' may be modified in ways unknown to the
program (in this particular case by another thread). Then it _must_
be declared volatile.

> Why is it that I always hear that volatile and thread-
> safety are two completely unrelated things?? (I do

One can hear all sorts of things :) Multithreading is one way to
obtain volatile behavior - thus volatile and thread safety are not
unrelated.

(by "thread-safety" I mean "correct execution" for the various values
of "correct").

"Thread-safety" does not equal "mutual exclusion".

Think about optimistic concurrency control algorithms, which perform
an unlocked (atomic) read of some variable and if the value is "right"
lock the variable and read it again. One would want to avoid the
compiler caching the value in register (or in other more "convenient"
place other than the variable's home location).

Another application - in reality most compilers (all?) would perform
volatile variable accesses in the order specified in the source, i.e.
would not reorder them, which, coupled with the appropriate memory
barrier operations, opens the way to yet another class of algorithms.

~velco

Joshua Jones

Oct 9, 2002, 9:29:26 AM
Momchil Velikov <ve...@fadata.bg> wrote:
>
> The fact is that ``shared'' may be modified in ways unknown to the
> program (in this particular case by another thread). Then it _must_
> be declared volatile.
>

I've written tons of multithreaded programs, all of which worked
as expected, without ever using volatile.

--
josh(at)intmain.net | http://intmain.net
CS @ College of Computing, Georgia Institute of Technology, Atlanta
532604 local keystrokes since last reboot, 37 days ago.

Mark Johnson

Oct 9, 2002, 1:14:35 PM
Carlos Moreno wrote:
>
> I'm having a hard time with one detail related to this
> issue of whether or not using volatile...
>
> I keep hearing that volatile is useless from the
> multithreading point of view... Though I completely
> agree that the use of volatile can not be sufficient
> to guarantee thread-safety, my problem is that I
> think it may be necessary in some cases (in addition
> to proper synchronization).
>
This is correct.

> What prevents the optimizer to assume whatever it
> wants to assume about the variable "shared" in the
> following program:
>
> int main()
> {
> ....
>
> int shared = 1;
>
> ...
>
> lots of code that don't touch shared
>
> // at this point, the compiler may assume that
> // shared is 1 -- however, with or without proper
> // synchronization, shared may have been changed
> // by another thread (a thread that may have been
> // created and handed a pointer or reference to
> // shared)
>
> ...
> }
>

Nothing prevents the optimizer from making that assumption. Not a C
example, but in Ada with GNAT you might even get a compiler warning
that "shared" could be declared constant if the scope were right. I
really *like* that warning since it sometimes points out bugs in the
code.

> Am I missing something? The way I see it, shared would
> have to be declared volatile -- that won't be sufficient,
> but it seems necessary to guarantee thread-safety.
>

Nope. You have a far better view of this than most readers.

> Why is it that I always hear that volatile and thread-
> safety are two completely unrelated things?? (I do
> understand that the naive eye may tend to believe that
> volatile alone can guarantee thread-safety, and that
> is a misconception ... But the typical reaction to
> such claim seems also exaggerated... Again, unless
> I'm missing something?)
>

Volatile is not enough for thread safety for a variety of reasons:
- the system may in some cases write values "out of order". Some
versions of the Alpha have this behavior. So if you update a buffer &
then the update index, you need a memory barrier between the two
instructions for safe operation. (to prevent the CPU from reordering
those two writes)
- other caches may exist that don't get flushed properly. I have a
special case in a system where caches on a card have to be flushed prior
to some read / write operations. In this case, the threads are on two
separate systems using memory "shared" across this special interface.
Again, these are cases where the hardware design allows for some
"strange behavior" to occur so it can get the maximum performance out of
the system. Over 99% of the code works just fine in this environment -
the < 1% that does not must get fixed.

Other people have stated that they can get away without volatile. They
are correct for the specific system / compiler combination (and even
compiler switch setting) they are using, but not for the general case.
--Mark

David Butenhof

Oct 9, 2002, 1:31:28 PM
Momchil Velikov wrote:

> (by "thread-safety" I mean "correct execution" for the various values
> of "correct").
>
> "Thread-safety" does not equal "mutual exclusion".
>
> Think about optimistic concurrency control algorithms, which perform
> an unlocked (atomic) read of some variable and if the value is "right"
> lock the variable and read it again. One would want to avoid the
> compiler caching the value in register (or in other more "convenient"
> place other than the variable's home location).
>
> Another applicability - in reality most compilers (all?) would perform
> volatile variable accesses in the order specified in the source, i.e.
> would not reorder them, which coupled with the appropriate memory
> barrier operations open the way to yet another class of algorithms.

The definition of "volatile" is essentially as you describe, with some
limitations.

The C language defines a series of "sequence points" in the "abstract
language model" at which variable values must be consistent with language
rules. An optimizer is allowed substantial leeway in reordering or
eliminating sequence points to minimize loads and stores or other
computation. EXCEPT that operations involving a "volatile" variable must
conform to the sequence points defined in the abstract model: there is no
leeway for optimization or other modifications. Thus, all changes
previously made must be visible at each sequence point, and no subsequent
modifications may be visible at that point. (In other words, as C99 points
out explicitly, if a compiler exactly implements the language abstract
semantics at all sequence points then "volatile" is redundant.)

On a multiprocessor (which C does not recognize), "sequence points" can only
be reasonably interpreted to refer to the view of memory from that
particular processor. (Otherwise the abstract model becomes too expensive
to be useful.) Therefore, volatile may say nothing at all about the
interaction between two threads running in parallel on a multiprocessor.

On a high-performance modern SMP system, memory transactions are effectively
pipelined. A memory barrier does not "flush to memory", but rather inserts
barriers against reordering of operations in the memory pipeline. For this
to have any meaning across processors there must be a critical sequence on
EACH end of a transaction that's protected by appropriate memory barriers.
This protocol has no possible meaning for an isolated volatile variable,
and therefore cannot be applied.

The protocol can only be employed to protect the relationship between two
items; e.g., "if I assert this flag then this data has been written" paired
with "if I can see the flag is asserted, then I know the data is valid".

That's how a mutex works. The mutex is a "flag" with builtin barriers
designed to enforce the visibility (and exclusion) contract with data
manipulations that occur while holding the mutex. Making the data volatile
contributes nothing to this protocol, but inhibits possibly valuable
compiler optimizations within the code that holds the mutex, reducing
program efficiency to no (positive) end.

If you have a way to generate inline barriers (or on a machine that doesn't
require barriers), and you wish to build your own low-level protocol that
doesn't rely on synchronization (e.g., a mutex), then your compiler might
require that you use volatile -- but this is unspecified by either ANSI C
or POSIX. (That is, ANSI C doesn't recognize parallelism and therefore
doesn't apply, while POSIX applies no specific additional semantics to
"volatile".) So IF you need volatile, your code is inherently nonportable.

A corollary is that if you wish to write portable code, you have no need for
volatile. (Or at least, if you think you do have a need, it won't help you
any.)

In your case, trying to share (for unsynchronized read) a "volatile"
counter... OK. Fine. The use of volatile, portably, doesn't help; but as
long as you're not doing anything but "ticking" the counter, (not a lot of
room for optimization) it probably won't hurt. IF your variable is of a
size and alignment that the hardware can modify atomically, and IF the
compiler chooses the right instructions (this may be more likely with
volatile, statistically, but again is by no means required by any
standard), then the worst that can happen is that you'll read a stale
value. (Potentially an extremely stale value, unless there's some
synchronization that ensures memory visibility between the threads at some
regular interval.) If the above conditions are NOT true, then you may read
"corrupted" values through word tearing and related effects.

If that's acceptable, you're probably OK... but volatile isn't helping you.

In summary, you're right, you don't need SYNCHRONIZATION here. But you
probably do expect some level of VISIBILITY, while you're doing nothing to
portably ensure any visibility. What you get occurs by accident, either
because your machine has a sequential memory system or because you're
gaining the "accidental" benefit of something else on the system, such as
clock tick interrupts (which will tend to eventually synchronize memory
visibility across the processors).

--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/

Alexander Terekhov

Oct 9, 2002, 1:56:05 PM

Mark Johnson wrote:
[...]

> Other people have stated that they can get away without volatile. They
> are correct for the specific system / compiler combination (and even
> compiler switch setting) they are using, but not for the general case.

I think you got it kinda backwards... And, BTW, the C standard says:

"What constitutes an access to an object that has volatile-qualified
type is implementation-defined"

regards,
alexander.

David Schwartz

Oct 9, 2002, 1:58:34 PM
Momchil Velikov wrote:

> The fact is that ``shared'' may be modified in ways unknown to the
> program (in this particular case by another thread). Then it _must_
> be declared volatile.

You are absolutely 100% right, unless you're talking about POSIX
threads, in which case you are absolutely 100% wrong. If you're talking
about POSIX threads, the use of POSIX synchronization functions is both
necessary and sufficient.

DS

Patrick TJ McPhee

Oct 9, 2002, 4:16:14 PM
In article <3DA463FB...@raytheon.com>,
Mark Johnson <mark_h_...@raytheon.com> wrote:

% Other people have stated that they can get away without volatile. They
% are correct for the specific system / compiler combination (and even
% compiler switch setting) they are using, but not for the general case.

To reiterate, it is definitely not required for any POSIX system. For
any other system, it really depends on what the system's stated
requirements are. You can't depend on the C standard because threading
is always an extension to that.
--

Patrick TJ McPhee
East York Canada
pt...@interlog.com

Momchil Velikov

Oct 9, 2002, 5:54:18 PM
Joshua Jones <jo...@intmain.net> wrote in message news:<ao1avm$fuq$2...@solaria.cc.gatech.edu>...

> Momchil Velikov <ve...@fadata.bg> wrote:
> >
> > The fact is that ``shared'' may be modified in ways unknown to the
> > program (in this particular case by another thread). Then it _must_
> > be declared volatile.
> >
>
> I've written tons of multithreaded programs, all of which worked
> as expected, without ever using volatile.

So ?

Maybe you compiled without optimizations, maybe there were not enough
registers (as on IA32), maybe you used global variables, which rarely get
registers ...

I've written tons of multithreaded programs, which didn't work without
using volatile.

From neither of these statements should it be inferred that
 a) there does not exist a multithreaded program which requires
volatile, nor
 b) every multithreaded program requires volatile

but rather:
 a) not every multithreaded program requires volatile
 b) there exists a multithreaded program which requires volatile

Trivial examples:

 a) optimistic unlocked reads - read, and if the value is right, lock and
read again. The volatile qualifier is needed so the read is actually
performed; otherwise the compiler can just load the variable into a
register and use that register throughout the function.

 b) a "sensor" variable - a variable providing a stream of values,
where it is ok to read some slightly stale data, but it is important
that the new data is eventually read also - again, volatile ensures the
actual read is performed.

 c) memory ordering - the CPU can reorder memory accesses, which is
prevented by memory barriers, but the compiler can reorder memory
accesses too, which is prevented by volatile.

Of course the above ones assume that

"volatile int x; y = x;"

constitutes "an access to an object that has volatile-qualified type"

(ISO/IEC 9899:1999).

~velco

Momchil Velikov

Oct 9, 2002, 6:11:30 PM
Joshua Jones <jo...@intmain.net> wrote in message news:<ao1avm$fuq$2...@solaria.cc.gatech.edu>...
> Momchil Velikov <ve...@fadata.bg> wrote:
> >
> > The fact is that ``shared'' may be modified in ways unknown to the
> > program (in this particular case by another thread). Then it _must_
> > be declared volatile.
> >
>
> I've written tons of multithreaded programs, all of which worked
> as expected, without ever using volatile.

As an example, think what would happen if the ``q_size'' variable in
mthttpd were allocated in a register.

(I don't think the standard forbids it, and you can even force GCC
to do it by declaring it ``int q_size asm ("esi");''. Note that while
the asm construct is a GCC extension, it merely forces the compiler to do
something not forbidden by the standard, thus one can pretend the
compiler did it all by itself :)

~velco

Steve Watt

Oct 9, 2002, 10:23:15 PM
In article <87bded37.02100...@posting.google.com>,

Momchil Velikov <ve...@fadata.bg> wrote:
>Joshua Jones <jo...@intmain.net> wrote in message
>news:<ao1avm$fuq$2...@solaria.cc.gatech.edu>...
>> Momchil Velikov <ve...@fadata.bg> wrote:
>> >
>> > The fact is that ``shared'' may be modified in ways unknown to the
>> > program (in this particular case by another thread). Then it _must_
>> > be declared volatile.
>> >
>>
>> I've written tons of multithreaded programs, all of which worked
>> as expected, without ever using volatile.
>
>So ?
>
>Maybe you compiled without optimizations, maybe the registers were not
>enough (like IA32), maybe you used global variables, which rarely get
>registers ...

No, he was following the memory visibility rules for whatever platform(s)
he was working on. I am not aware of any platform that requires volatile
for thread safety.

Likewise, I have written several hundred non-trivial threaded programs,
on architectures ranging from microcontrollers to 64 bit processors,
numerous different compilers, and the highest optimization levels
possible.

Not once have I needed volatile for threading reasons.

>I've written tons of multithreaded programs, which didn't work without
>using volatile.

Then you were not on POSIX or Win32 platforms.

>From neither of these statements should be implied that
> a) there does not exist a multithreaded program, which requires
>volatile, nor
> b) any multithreaded program requires volatile
>
>but rather:
> a) not every multithreaded program requires volatile
> b) there exists multithreaded program, which requires volatile

>Trivial examples:
>
> a) optimistic unlocked reads - read, if the values is right lock and
>read again. The volatile qualifier is needed so the read is actually
>performed or otherwise the compiler can just load the variable in a
>register and use that register throughout the function.

This will not work on systems with weak memory ordering.

> b) a "sensor" variable - a variable providing a stream of values,
>where it is ok to read some slightly stale data, but it is important
>that the new data is eventually read also - again volatile enusres the
>actual read is performed.

If the variable's storage is in a device, it must be volatile, but that
has nothing to do with threads. If the variable is in another thread
that does nothing but compute new values for that variable, then you need
to obey the memory visibility rules that your platform sets out.

> c) memory ordering - CPU can reorder memory accesses, which is
>prevented by memory barriers, but the compiler can reorder memory
>accesses too, which is prevented by volatile.

This is not an example. Using volatile in an attempt to achieve thread
safety *WILL NOT WORK* on all platforms. Period. Further, there is
always a way to do the same job (thread safety of data) without using
volatile.

Steve Watt

Oct 9, 2002, 10:35:32 PM
In article <87bded37.02100...@posting.google.com>,
Momchil Velikov <ve...@fadata.bg> wrote:
>Joshua Jones <jo...@intmain.net> wrote in message
>news:<ao1avm$fuq$2...@solaria.cc.gatech.edu>...
>> Momchil Velikov <ve...@fadata.bg> wrote:
>> >
>> > The fact is that ``shared'' may be modified in ways unknown to the
>> > program (in this particular case by another thread). Then it _must_
>> > be declared volatile.
>> >
>>
>> I've written tons of multithreaded programs, all of which worked
>> as expected, without ever using volatile.
>
>As an example, think what would happen if the ``q_size'' variable in
>mthttpd is allocated in a register.

False example.

>(I don't think the standard forbids it and you can even force the GCC
>to do it by declaring it ``int q_size asm ("esi");''. Note that which
>the asm construct is GCC extension it merely forces the compiler to do
>something not forbidded by the standard, thus one can pretend the
>compiler did it all by itself :)

The compiler could put it into a register, if it knew how to make that
register available to all threads. mthttpd correctly follows the
memory visibility rules: All accesses to q_size are done while holding
a mutex.

The C language standard does not address threads, and thus things are
possible in it that are not possible under POSIX. POSIX requires
(roughly speaking -- read the standard for the gory details) that
modifications made to a memory object while holding a mutex will be
visible to any other thread that acquires that mutex afterward.

The compiler is free to keep q_size in a register up until the call to
pthread_mutex_unlock() or pthread_cond_wait(), in this particular example,
and that may well be a useful optimization. (But it's unimportant in
this exact bit of code.)

Drazen Kacar

Oct 9, 2002, 11:00:11 PM
Steve Watt wrote:
> In article <87bded37.02100...@posting.google.com>,
> Momchil Velikov <ve...@fadata.bg> wrote:

> >I've written tons of multithreaded programs, which didn't work without
> >using volatile.
>
> Then you were not on POSIX or Win32 platforms.

It might not have anything to do with threads. I think the only "usefully
defined" use for volatile in C is for variables modified in signal
handlers. You have to declare them as volatile, threads or no threads.

> >Trivial examples:
> >
> > a) optimistic unlocked reads - read, if the values is right lock and
> >read again. The volatile qualifier is needed so the read is actually
> >performed or otherwise the compiler can just load the variable in a
> >register and use that register throughout the function.
>
> This will not work on systems with weak memory ordering.

Why not? I've inherited a certain program which uses those constructs a
lot and I'd like to prove it wrong, but I haven't been able to see a
problem, apart from the fact that it used int instead of sig_atomic_t.

The code goes like this:

int foo(...)
{
    static volatile sig_atomic_t initialized = 0;

    if (!initialized)
    {
        pthread_mutex_lock(...);
        if (!initialized)
        {
            initialize_me();
            initialized = 1;
        }
        pthread_mutex_unlock(...);
    }
    ...
}

Unless... initialize_me function could store some values in memory. After
that we store one in initialized. When the next thread starts executing
this code, it might read one from initialized, but it won't necessarily
see all the values stored by initialize_me function, because there's no
memory barrier in its code path.

--
.-. .-. I don't work here. I'm a consultant.
(_ \ / _)
| da...@willfork.com
|

Carlos Moreno

Oct 10, 2002, 12:02:48 AM

Wow! This has been a great discussion! (well, for me
anyway). I appreciate your comments and thoughts!

However, I do have another doubt (well, kind of the same
doubt, but now in a more specific context).

Several people have pointed out that when using POSIX
threads, volatile is never necessary, and that proper
synchronization using the right POSIX threads facilities
is always sufficient.

Now, how can the compiler know that? Is the compiler
aware that when using pthreads it must disable any
optimizations surrounding shared variables access? As
I understand it, POSIX threads are kinda multiplatform
(there are pthreads libraries for Unix/Linux, but also
for Windows -- maybe for MAC and other OS's too?). So,
how can POSIX threads alone provide any guarantee about
something that seems completely in the compiler's hands?

I understand that when passing a pointer to another
function, the compiler must know that such function
*could* modify the value, so any assumption about its
value would be dropped. But that only applies to the
call to pthread_create, where the pointer is passed
such that the other thread is given access to "shared".
But in a situation like:

int shared = 0;

pthread_create ( ...... , &shared);

pthread_mutex_lock ( some_mutex .... );
shared = 2;
pthread_mutex_unlock ( ...... );

lots of code that does NOT modify shared
(but in the mean time, the other thread
could have modified it, of course)

// at this point -- why is it that the calls
// to mutex_lock and unlock guarantee that
// the compiler will not do something wrong
// because it assumed that shared is 2?

// Or is that a wrong synchronization mechanism
// for this case?


Again, thanks for this great discussion! And thanks
in advance for any further comments on this POSIX
threads question.

Cheers,

Carlos
--


Joshua Jones

Oct 10, 2002, 12:36:12 AM
Carlos Moreno <mor...@mochima.com> wrote:
>
> // at this point -- why is it that the calls
> // to mutex_lock and unlock guarantee that
> // the compiler will not do something wrong
> // because it assumed that shared is 2?
>

At this point, if 'shared' is accessed, it should be protected
with a mutex, since the data is shared and could have been written
to by another thread. Remember, you as a programmer are responsible
for this level of synchronization... when you add synchronization,
you're adding cancelation points, and the compiler then 'knows'
not to assume anything.

I'll let the other, more frequent thread programmers out there
do a better job of explaining than I just did :-)

--
josh(at)intmain.net | http://intmain.net
CS @ College of Computing, Georgia Institute of Technology, Atlanta

532704 local keystrokes since last reboot, 38 days ago.

Patrick TJ McPhee

Oct 10, 2002, 12:25:30 AM
In article <3DA4FBE8...@mochima.com>,
Carlos Moreno <mor...@mochima.com> wrote:

% Several people have pointed out that when using POSIX
% threads, volatile is never necessary, and that proper
% synchronization using the right POSIX threads facilities
% is always sufficient.
%
% Now, how can the compiler know that? Is the compiler
% aware that when using pthreads it must disable any
% optimizations surrounding shared variables access? As

It doesn't have to disable all optimisations, but the short
answer is that the compiler is part of the POSIX system, and
it has to do whatever it has to do in order to satisfy the
requirements.

t...@cs.ucr.edu

Oct 10, 2002, 12:49:50 AM
Momchil Velikov <ve...@fadata.bg> wrote:
[...]
+ The fact is that ``shared'' may be modified in ways unknown to the
+ program (in this particular case by another thread). Then it _must_
+ be declared volatile.
[...]
+ One can hear all sorts of things :) Multithreading is one way to
+ obtain volatile behavior - thus volatile and thread safety are not
+ unrelated.

+ (by "thread-safety" I mean "correct execution" for the various values
+ of "correct").

Thread-shared objects must be protected by mutexes whose operations
must
1) behave as super sequence points for thread-shared variables and
2) must do more (i.e., invoke hardware-level barriers to reordering).
If all thread-shared variables have volatile-qualified types, we pay
a huge penalty in overhead, but #1 becomes redundant. Sigh.

However, the mutexes themselves involve thread-shared objects, i.e.,
the objects that represent the state of those mutexes. Obviously,
those state objects require the same treatment that C/C++ requires for
objects of volatile-qualified types, i.e., you can't lock a mutex and
then keep its dirty state variable in a register --- no other thread
will know that the mutex is locked. (Note, however, that mutexes
cannot be implemented in standard C/C++.)

Tom Payne

Jim Rogers

10 Oct 2002, 2:09:24 AM
Carlos Moreno wrote:

>
> Wow! This has been a great discussion! (well, for me
> anyway). I appreciate your comments and thoughts!
>
> However, I do have another doubt (well, kind of the same
> doubt, but now in a more specific context).
>
> Several people have pointed out that when using POSIX
> threads, volatile is never necessary, and that proper
> synchronization using the right POSIX threads facilities
> is always sufficient.
>
> Now, how can the compiler know that? Is the compiler
> aware that when using pthreads it must disable any
> optimizations surrounding shared variables access? As
> I understand it, POSIX threads are kinda multiplatform
> (there are pthreads libraries for Unix/Linux, but also
> for Windows -- maybe for MAC and other OS's too?). So,
> how can POSIX threads alone provide any guarantee about
> something that seems completely in the compiler's hands?


A C or C++ compiler cannot know that, because the language does not
address threading. The Posix libraries can know that. If you want
a compiler to know about threading issues you must use a language
with threading built into the syntax. The three languages that
most frequently come to my mind for this purpose are Java, C#,
and Ada.


Of those three, the most robust and complete threading model is
provided by Ada. Ada has robust locking mechanisms as well as a
couple of useful pragmas.

Pragma Atomic is used to specify that access (reads and writes)
must be indivisible. This can only be applied to objects no larger
than a "word" on the current hardware.

Pragma Volatile specifies that all accesses must be direct, not
through local copies of a variable.

These two pragmas are useful for shared data as long as you can
ensure there is no possible race condition between reader threads
and writer threads. In general this is very difficult to ensure without
some form of locking.
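
Translated into C with pthreads (the language of the rest of this
thread), the reader/writer race warned about here is avoided by putting
both the flag and the data under one lock; a hedged sketch with invented
names:

```c
#include <pthread.h>

/* Sketch: the writer publishes a record and raises a flag.  With only
 * an atomic or volatile flag, a reader could see the flag set yet read
 * a stale record on weakly ordered hardware, so both live under one
 * mutex here.  All names are invented for illustration. */
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static int record = 0;
static int ready  = 0;

void publish(int value)
{
    pthread_mutex_lock(&m);
    record = value;
    ready = 1;
    pthread_mutex_unlock(&m);
}

int consume(int *out)                /* returns 1 if a record was taken */
{
    pthread_mutex_lock(&m);
    int ok = ready;
    if (ok) {
        *out = record;
        ready = 0;
    }
    pthread_mutex_unlock(&m);
    return ok;
}
```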

Jim Rogers

Momchil Velikov

10 Oct 2002, 3:04:20 AM
st...@nospam.Watt.COM (Steve Watt) wrote in message news:<H3quM...@Watt.COM>...

> In article <87bded37.02100...@posting.google.com>,
> Momchil Velikov <ve...@fadata.bg> wrote:
> >Joshua Jones <jo...@intmain.net> wrote in message
> >news:<ao1avm$fuq$2...@solaria.cc.gatech.edu>...
> >> Momchil Velikov <ve...@fadata.bg> wrote:
> >> >
> >> > The fact is that ``shared'' may be modified in ways unknown to the
> >> > program (in this particular case by another thread). Then it _must_
> >> > be declared volatile.
> >> >
> >>
> >> I've written tons of multithreaded programs, all of which worked
> >> as expected, without ever using volatile.
> >
> >So ?
> >
> >Maybe you compiled without optimizations, maybe the registers were not
> >enough (like IA32), maybe you used global variables, which rarely get
> >registers ...
>
> No, he was following the memory visibility rules for whatever platform(s)
> he was working on. I am not aware of any platform that requires volatile
> for thread safety.
>
> Likewise, I have written several hundred non-trivial threaded programs,
> on architectures ranging from microcontrollers to 64 bit processors,
> numerous different compilers, and the highest optimization levels
> possible.
>
> Not once have I needed volatile for threading reasons.

Who claimed you _always_ need volatile for threading reasons ? Care
to read my message to the end ?

> >I've written tons of multithreaded programs, which didn't work without
> >using volatile.
>
> Then you were not on POSIX or Win32 platforms.

Well, I was both on POSIX and Win32 platforms.

>
> >From neither of these statements should be implied that
> > a) there does not exist a multithreaded program, which requires
> >volatile, nor
> > b) any multithreaded program requires volatile
> >
> >but rather:
> > a) not every multithreaded program requires volatile
> > b) there exists multithreaded program, which requires volatile
>
> >Trivial examples:
> >
> > a) optimistic unlocked reads - read, if the values is right lock and
> >read again. The volatile qualifier is needed so the read is actually
> >performed or otherwise the compiler can just load the variable in a
> >register and use that register throughout the function.
>
> This will not work on systems with weak memory ordering.

Not true. Memory ordering is ensured by the implementations of the
lock/unlock functions. The point is that no memory ordering can help
you if the compiler does _not_ perform the read.
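
The optimistic-read idiom from example (a) might look like this in C; a
sketch under the assumption that int loads are atomic on the platform,
with invented names:

```c
#include <pthread.h>

/* Sketch of an optimistic unlocked read: `generation` is checked
 * without the lock; only if it changed do we pay for the mutex and
 * re-read the real data under it.  The volatile qualifier is what
 * forces the compiler to perform a real load of `generation` on each
 * call rather than reusing a value cached in a register.  Platform
 * atomicity of int loads is assumed. */
static volatile int generation = 0;
static int protected_data = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void update(int value)               /* writer side */
{
    pthread_mutex_lock(&lock);
    protected_data = value;
    generation++;
    pthread_mutex_unlock(&lock);
}

int read_if_changed(int last_seen, int *out)
{
    if (generation == last_seen)     /* unlocked, possibly stale read */
        return last_seen;            /* nothing new: no lock taken */
    pthread_mutex_lock(&lock);       /* changed: lock and read again */
    *out = protected_data;
    int g = generation;
    pthread_mutex_unlock(&lock);
    return g;
}
```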

> > b) a "sensor" variable - a variable providing a stream of values,
> >where it is ok to read some slightly stale data, but it is important
> >that the new data is eventually read also - again volatile ensures the
> >actual read is performed.
>
> If the variable's storage is in a device, it must be volatile, but that
> has nothing to do with threads. If the variable is in another thread
> that does nothing but compute new values for that variable, then you need
> to obey the memory visibility rules that your platform sets out.

Define "memory visibility" ?

On every platform I'm aware of, if a CPU performs an (atomic) memory
write, the value written is _eventually_ visible to other CPUs.

Moreover, on every platform I'm aware of, the sequence of values read
from a single memory location is a (not necessarily proper)
subsequence of the sequence of values written, i.e. no CPU can observe
values occurring in the opposite order.

These, along with atomicity of reads/writes, are sufficient for the
above examples to work. Note that I don't claim (and have _never_
claimed) they are necessary.
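
Example (b) above, as a C sketch (again assuming platform-atomic int
stores; names invented): the volatile qualifier keeps the compiler from
hoisting the load out of the polling loop.

```c
/* Sketch of the "sensor" idiom: one thread keeps storing fresh
 * samples, other threads poll.  Slightly stale values are acceptable,
 * but without volatile the compiler could legally load `sensor` once
 * and spin on a register copy forever.  As discussed elsewhere in
 * this thread, POSIX does not bless this pattern. */
static volatile int sensor = 0;

void sensor_store(int sample)          /* writer thread */
{
    sensor = sample;
}

int sensor_poll_until(int threshold)   /* reader thread */
{
    int v;
    while ((v = sensor) < threshold)   /* a real load each iteration */
        ;                              /* spin (sketch only) */
    return v;
}
```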

> > c) memory ordering - CPU can reorder memory accesses, which is
> >prevented by memory barriers, but the compiler can reorder memory
> >accesses too, which is prevented by volatile.
>
> This is not an example.

This is not an argument.

> Using volatile in an attempt to achieve thread
> safety *WILL NOT WORK* on all platforms. Period.

This is not an argument either.

> Further, there is
> always a way to do the same job (thread safety of data) without using
> volatile.

Most probably. So, what?

Correctness first, but after all we want performance too, don't we ?
On modern SMP architectures every write to concurrently accessed memory
is a potential bottleneck - including things like
pthread_mutex_lock/unlock.
Note that the same applies to pthread_rwlock_rdlock too, as it performs
a memory write too. Your best bet, after making all the efforts to
have no concurrently accessed locations, is to perform mostly reads
there and avoid like hell the pthread_ synchronization functions (or
any sync functions for that matter).

(Well, maybe I misunderstood the topic of the newsgroup, maybe it is
about bitching about standards instead of multithreaded programming in
the real world, so I'll go find the FAQ).

~velco

Momchil Velikov

10 Oct 2002, 3:08:59 AM
pt...@interlog.com (Patrick TJ McPhee) wrote in message news:<i80p9.14468$qh1.1...@news.ca.inter.net>...

> You can't depend on the C standard because threading
> is always an extension to that.

This is not a valid argument. You _can_ depend on the C standard for
the things specified in the C standard. The compiler is unaware of
threads, but this does not mean it will perform random violations of
the standard.

~velco

Momchil Velikov

10 Oct 2002, 3:36:21 AM
st...@nospam.Watt.COM (Steve Watt) wrote in message news:<H3qv7...@Watt.COM>...

> In article <87bded37.02100...@posting.google.com>,
> Momchil Velikov <ve...@fadata.bg> wrote:
> >Joshua Jones <jo...@intmain.net> wrote in message
> >news:<ao1avm$fuq$2...@solaria.cc.gatech.edu>...
> >> Momchil Velikov <ve...@fadata.bg> wrote:
> >> >
> >> > The fact is that ``shared'' may be modified in ways unknown to the
> >> > program (in this particular case by another thread). Then it _must_
> >> > be declared volatile.
> >> >
> >>
> >> I've written tons of multithreaded programs, all of which worked
> >> as expected, without ever using volatile.
> >
> >As an example, think what would happen if the ``q_size'' variable in
> >mthttpd is allocated in a register.
>
> False example.

No. :)

>
> >(I don't think the standard forbids it and you can even force the GCC
> >to do it by declaring it ``int q_size asm ("esi");''. Note that while
> >the asm construct is a GCC extension, it merely forces the compiler to do
> >something not forbidden by the standard, thus one can pretend the
> >compiler did it all by itself :)
>
> The compiler could put it into a register, if it knew how to make that
> register available to all threads.

The compiler is not entitled to reasoning about threads.

> mthttpd correctly follows the
> memory visibility rules: All accesses to q_size are done while holding
> a mutex.

Which will not work if q_size is in a register, and it _is_ possible,
if the compiler determines that pthread_mutex_lock cannot modify it.

> The C language standard does not address threads, and thus things are
> possible in it that are not possible under POSIX.

POSIX does not invalidate the C standard. Features of the C standard
not specified, explicitly modified or forbidden by POSIX are by no
means invalidated.

> POSIX requires
> (roughly speaking -- read the standard for the gory details) that
> modifications made to a memory object while holding a mutex will be
> visible to any other thread that acquires that mutex afterward.

Yes. That POSIX requirement (IEEE Std. 1003.1-2001 [4.10 Memory
Synchronization]) indeed renders all of my examples non-conforming.

It is yet another topic for discussion whether this point should have
been included at all in the standard. I'm yet to see a single system
where this is the _only_ way to obtain safe access to shared
variables. On the contrary, on the _majority_ of systems out there
this requirement is unnecessarily restrictive and severely damages
performance.

> The compiler is free to keep q_size in a register up until the call to
> pthread_mutex_unlock() or pthread_cond_wait(), in this particular example,
> and that may well be a useful optimization. (But it's unimportant in
> this exact bit of code.)

Why is it not allowed to keep it in a register across calls to
pthread_mutex_lock/unlock?

~velco

Momchil Velikov

10 Oct 2002, 3:50:44 AM
Drazen Kacar <da...@willfork.com> wrote in message news:<slrnaq9r9...@willfork.com>...

> int foo(...)
> {
> static volatile sig_atomic_t initialized = 0;
>
> if (!initialized)
> {
> pthread_mutex_lock(...);
> if (!initialized)
> {
> initialize_me();
> initialized = 1;
> }
> pthread_mutex_unlock(...);
> }
> ...
> }
>
> Unless... initialize_me function could store some values in memory. After
> that we store one in initialized. When the next thread starts executing
> this code, it might read one from initialized, but it won't necessarily
> see all the values stored by initialize_me function, because there's no
> memory barrier in its code path.

It will see them. There will be a memory barrier inside
pthread_mutex_unlock, said memory barrier ensuring that the writes in
``initialize_me()'' and to ``initialized'' are ordered before the
write, which makes the mutex unlocked, i.e. no other CPU/thread can
see the mutex unlocked and read stale values.

~velco

Alexander Terekhov

10 Oct 2002, 5:15:40 AM

Momchil Velikov wrote:
[...]
> Define "memory visibility" ?

Try this:

http://www.crhc.uiuc.edu/ece412/papers/models_tutorial.pdf

http://www.primenet.com/~jakubik/mpsafe/MultiprocessorSafe.pdf

regards,
alexander.

Alexander Terekhov

10 Oct 2002, 5:16:59 AM

Momchil Velikov wrote:
>
> Drazen Kacar <da...@willfork.com> wrote in message news:<slrnaq9r9...@willfork.com>...
> > int foo(...)
> > {
> > static volatile sig_atomic_t initialized = 0;
> >
> > if (!initialized)
> > {
> > pthread_mutex_lock(...);
> > if (!initialized)
> > {
> > initialize_me();
> > initialized = 1;
> > }
> > pthread_mutex_unlock(...);
> > }
> > ...
> > }
> >
> > Unless... initialize_me function could store some values in memory. After
> > that we store one in initialized. When the next thread starts executing
> > this code, it might read one from initialized, but it won't necessarily
> > see all the values stored by initialize_me function, because there's no
> > memory barrier in its code path.
>
> It will see them. ...

Stop silly arguing and read {trying to understand} the stuff
you're pointed to.

http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
(The "Double-Checked Locking is Broken" Declaration)

http://groups.google.com/groups?selm=G3Q49.14%24Fr2.86101%40news.cpqcorp.net
(Subject: Re: DCL and code optimization and etc.)

regards,
alexander.

Alexander Terekhov

10 Oct 2002, 5:18:59 AM

Drazen Kacar wrote:
[...]

> It might not have anything to do with threads. I think the only "usefully
> defined" use for volatile in C is for variables modified in signal
> handlers. You have to declare them as volatile, threads or no threads.
^^^^^^^^^^^^^^^^^^^^^

http://groups.google.com/groups?selm=3D81DFCF.D8FF9B50%40web.de

<copy&paste>

White Wolf wrote:
>
> Alexander Terekhov wrote:
> > Attila Feher wrote:
> >
> >>Alexander Terekhov wrote:
> >>[SNIP]
> >>
> >>>>That _is_ wrong. bool has to be sig_atomic_t to work - as long as only
> >>>>assignment and read is done. (Only those are atomic on sig_atomic_t.)
> >>>
> >>>http://groups.google.com/groups?threadm=3D4A8DDD.5A935D95%40web.de
> >>>(Subject: Re: "memory location")
> >>
> >>OK. Now I am puzzled. So far I was assured by several, that
> >>sig_atomic_t is "the stuff" which is A OK and safe to write from one
> >>thread and read from the other. I mean Thread A _only_ writes (OK,
> >>maybe reads but what for) and Thread B _only_ reads. Do you mean that
> >>this doesn't work?
> >
> >
> > Yes. And even if it WOULD work (i.e. atomicity), I'd still have
> > the problem of visibility w.r.t. dependent {mutable} data (if any).
>
> Now I am even more puzzled: why is it called sig_atomic_t if it isn't?

It (i.e. *static volatile sig_atomic_t*) IS "atomic" (and even
thread-safe ;-) ) with respect to ONE SINGLE thread that reads
and writes it AND signal handler(s) "interrupting" THAT thread.
IOW, it has really nothing to do with threads.

http://www.lysator.liu.se/c/rat/b.html#2-2-3
("2.2.3 Signals and interrupts", ANSI C89 Rationale)

> Why do "big old names" say it _is_ safe to use it (well, from interrupt
> routines, but in an MT environment that IT can come whoknowswhatway).
>
> So one thing. If read and write _is_ atomic for this type, what is the
> problem? AFAIK (OK, did not read it) the standard asks for it. Then
> why cannot I use for (for example) a dirty flag? I look at my copy of
> sth (reading this flag) and if it is non-zero (integral type it is) then
> I know I have (when I want) update my copy. Which will certainly
> involve some sort of locking - again if needed. Got it? Why cannot I
> use this type in MT environment for this? One thread writes (only!) the
> data, others may read it. Where can it go wrong? Please do not post
> links to scattered lengthy discussion - I am a simple man and I get
> confused easily.
>
> I have worked in Intel (OK, only 8086) assembly, still have the books
> about Z80 and some more HW design so let's cut to the chase! What does
> make sig_atomic_t non-safe. I do not care about POSIX, that is
> something what is made by pretty clever in a way that I will not
> understand (standard). However I understand the concept of system bus,
> somewhat the cache (cash even more :-), control signals and the like.
> So where the heck can it go wrong, if the operation is atomic?

http://rsim.cs.uiuc.edu/~sadve/Publications/models_tutorial.ps

"5.2.1 Cache Coherence and Sequential Consistency
Several definitions for cache coherence (also referred to
as cache consistency) exist in the literature. The strongest
definitions treat the term virtually as a synonym for
sequential consistency. Other definitions impose
extremely relaxed ordering guarantees.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

What does the programmer expect from the memory system to
ensure correct execution of this program fragment? One
important requirement is that the value read from the
data field within a dequeued record should be the same
as that written by P1 in that record. However, in many
commercial shared memory systems, it is possible for
processors to observe the old value of the data field
(i.e., the value prior to P1's write of the field),
leading to behavior different from the programmer's
expectations.

http://groups.google.com/groups?selm=c29b5e33.0202150632.6d579f24%40posting.google.com
(Subject: Re: Can multiple threads set a global variable simultaneously?)

regards,
alexander.

--
POSIX

"Applications shall ensure that access to any memory
location by more than one thread of control (threads
or processes) is restricted such that no thread of
control can read or modify a memory location while
another thread of control may be modifying it. Such
access is restricted using functions that synchronize
thread execution and also synchronize memory with
respect to other threads."

JAVA (revised Java volatiles aside)

"If two threads access a normal variable, and one
of those accesses is a write, then the program should
be synchronized so that the first access is visible to
the second access. When a thread T 1 acquires a lock
on/enters a monitor m that was previously held by
another thread T 2, all actions that were visible to
T 2 at the time it released the lock on m become
visible to T 1"

Drazen Kacar

10 Oct 2002, 5:38:43 AM
Alexander Terekhov wrote:
>
> Drazen Kacar wrote:
> [...]
> > It might not have anything to do with threads. I think the only "usefully
> > defined" use for volatile in C is for variables modified in signal
> > handlers. You have to declare them as volatile, threads or no threads.
> ^^^^^^^^^^^^^^^^^^^^^

[...]

> It (i.e. *static volatile sig_atomic_t*) IS "atomic" (and even
> thread-safe ;-) ) with respect to ONE SINGLE thread that reads
> and writes it AND signal handler(s) "interrupting" THAT thread.
> IOW, it has really nothing to do with threads.

OK, you've forced me to go and read the C standard and I'm not thankful.

Let me modify the above:

Although C claims that an object of type sig_atomic_t "can be accessed as
an atomic entity, even in the presence of asynchronous interrupts", I've
seen compiler documentation which asks that variables modified from
signal handlers be declared as volatile if the user wants to use certain
level of optimization.

Alexander Terekhov

10 Oct 2002, 5:49:00 AM

My point was: <copy&paste>

well, i was under impression that "sig_atomic_t" alone
does not guarantee thread (or even signal) safety..
only the combination of _static_storage_duration_,
_volatile_ and _sig_atomic_t makes it safe.. and only
for signal handlers.. i could imagine an impl. which
would just disable signal delivery while accessing
"static volatile sig_atomic_t" variable (allocated
in some special storage region - for static volatiles
sig_atomic_t's only) or would do something else which
would NOT work with respect to threads.

or am i missing something?
---

> : What is needed is something similar to the Java memory model requirement
> : that values cannot "come out of thin air"

I don't think so. http://groups.google.com/groups?selm=3C9236F3.49C68326%40web.de

> (i.e. roughly speaking, a value
> : read from any variable must have been previously written to that variable,
> : with some additional ordering constraints). This has little or nothing to do
> : with the semantics of sig_atomic_t (or volatile), which the C99 Standard
> : only defines for single-threaded programs.
>
> Moreover, the standard only guarantees atomicity of writes by signal
> handlers to data

static data

> of type sig_atomic_t, and only when the object is
> also declared to be volatile. Objects of type sig_atomic_t are not
> guaranteed to be atomic in any other context.

AFAICS, it's even worse than that... in a multithreaded application that
happens to use asynchronous signals [vs. sigwait and/or SIGEV_THREAD delivery]
with static volatile sig_atomic_t vars you'd have to ensure that such signals
could only be "delivered" to a corresponding ONE SINGLE thread -- the one that
reads/writes a particular static volatile sig_atomic_t variable(s). You just
can't have such signal(s) delivered to any other thread.

regards,
alexander.

Carlos Moreno

10 Oct 2002, 7:50:49 AM

Joshua Jones wrote:

> Carlos Moreno <mor...@mochima.com> wrote:
>
>> // at this point -- why is it that the calls
>> // to mutex_lock and unlock guarantee that
>> // the compiler will not do something wrong
>> // because it assumed that shared is 2?
>>
>>
>
> At this point, if 'shared' is accessed, it should be protected
> with a mutex, since the data is shared and could have been written
> to by another thread. Remember, you as a programmer are responsible
> for this level of synchronization... when you add synchronization,
> you're adding cancelation points, and the compiler then 'knows'
> not to assume anything.


My doubt was (is?): how does the compiler know not to assume
anything?? If I need to access "shared", I would do:

pthread_mutex_lock ( the mutex );
int a = shared;
pthread_mutex_unlock ( the mutex );

Since the calls to lock and unlock do not involve taking the
address of "shared", how would the compiler know not to assume
that shared is 2? (well, whatever value was assigned originally
and never changed in this thread)

The next reply in this branch answers this question, saying
that the compiler itself is part of the POSIX system (that
is something that wasn't very clear in my mind -- I kept
thinking that you were simply talking about a threading
library called "POSIX threads", as opposed to a complete
system specification)... But then, the following reply
seems to suggest that C or C++ are not necessarily part
of that specification. As you can understand, my poor
brain is about to explode with so many details! :-)

So, I'll turn my doubt into a concrete question: I'm using
Linux (RedHat 7.2, or 7.3, or soon it'll be 8.0), with g++
(I normally use C++, but in some cases I might end up using
C as well). In that scenario, is my system compliant with
the POSIX and POSIX threads specification? Or might I
need to use volatile in cases like my example with the
shared variable? I guess *if* I need to use volatile, it
would be only with auto storage variables, right?

Thanks!

Carlos
--

Alexander Terekhov

10 Oct 2002, 8:28:35 AM

Carlos Moreno wrote:
>
> Joshua Jones wrote:
>
> > Carlos Moreno <mor...@mochima.com> wrote:
> >
> >> // at this point -- why is it that the calls
> >> // to mutex_lock and unlock guarantee that
> >> // the compiler will not do something wrong
> >> // because it assumed that shared is 2?
> >>
> >>
> >
> > At this point, if 'shared' is accessed, it should be protected
> > with a mutex, since the data is shared and could have been written
> > to by another thread. Remember, you as a programmer are responsible
> > for this level of synchronization... when you add synchronization,
> > you're adding cancelation points,

To begin with, POSIX mutexes aren't cancelation points. See
POSIX threads rationale if you want to know/understand why.

> > and the compiler then 'knows' not to assume anything.

Yes it knows, but that has really nothing to do with thread
cancelation, AFAIK.

> My doubt was (is?): how does the compiler know not to assume
> anything?? If I need to access "shared", I would do:
>
> pthread_mutex_lock ( the mutex );
> int a = shared;
> pthread_mutex_unlock ( the mutex );
>
> Since the calls to lock and unlock do not involve taking the
> address of "shared", how would the compiler know not to assume
> that shared is 2? (well, whatever value was assigned originally

> and never changed in this thread) [.... RedHat ....]

Quoting James Kanze: < Newsgroups: comp.lang.c++.moderated,
Subject: Re: volatile -- what does it mean in relation to
member functions? >

<quote>

> Without volatile, the compiler might decide that it already knows the
> value that it will be using a few lines down anyway and keep it in a
> register instead of writing it back to memory and then reading it back
> from memory. This is a portable behaviour of volatile.

As we've been trying to explain, guaranteeing that the compiler will not
use a value which it explicitly cached in a register simply doesn't buy
you anything.

> Will a mutex force the compiler to generate memory reads/writes?

There's no such thing as a mutex in C++, so it obviously depends on the
system definition of mutex.

In Posix (IEEE Std 1003.1, Base Definitions, General Concepts, Memory
Synchronization): "The following functions synchronize memory with
respect to other threads: [...]". Both pthread_mutex_lock and
pthread_mutex_unlock are in the list.

> It's only even possible if it is aware that you *are* using a mutex.

If I call pthread_mutex_lock, I guess that the compiler can suppose that
I am using a mutex.

Posix makes certain requirements. (I suppose that Windows threads offer
similar guarantees, and make similar requirements.) If my program
conforms to those requirements, and the system claims Posix compliance,
then it is the compiler's or the system's problem to make my program
work. It's none of my business how they do it.

With regards to code motion of the compilers, there are two relatively
simple solutions:

- The compiler knows about the system calls, and knows that it cannot
move reads or writes around across them, or

- The compiler doesn't know about them, and treats them just as any
other external function call. In this case, of course, it had
better ensure that the necessary reads and writes have taken place,
since it cannot assume that the called code doesn't make use of or
modify the variables in question. (Any object accessible from
^^^^^^^^^^^^^^^^^^^^^^^^^^
another thread would also be accessible from an external function
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
with unknown semantics.)
^^^^^^^^^^^^^^^^^^^^^^^^

Most compilers currently use the second strategy, at least partially
because they have to implement it anyway -- I can call functions written
in assembler from C++, and there is no way that the C++ compiler can
know their semantics, so all that is needed is that the C++ compiler
treat pthread_mutex_lock et al. as if they were unknown functions
written in assembler (which is often the case in fact anyway).

</quote>
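
The second strategy in the quote above can be illustrated with a sketch
(names invented): because the compiler cannot see into
pthread_mutex_lock, and `shared` has external linkage so the callee
could reference it, the variable must be written back to memory before
the call and re-loaded after it.

```c
#include <pthread.h>

int shared = 2;   /* external linkage: an opaque callee could touch it */
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

/* The calls below are external functions whose bodies the compiler
 * cannot see, so it may not keep `shared` cached in a register across
 * them: it must re-load the variable after pthread_mutex_lock returns. */
int observe(void)
{
    pthread_mutex_lock(&m);
    int copy = shared;        /* re-loaded from memory, not a register */
    pthread_mutex_unlock(&m);
    return copy;
}
```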

regards,
alexander.

Alexander Terekhov

10 Oct 2002, 8:51:32 AM

Carlos Moreno wrote:
[...]

> I guess *if* I need to use volatile, it would be only
> with auto storage variables, right?

Heck, here's my final copy&paste.

regards,
alexander. < ``over and out.'' >

Subject: Re: stl deques and "volatile"
Newsgroups: comp.lang.c++
Date: 2002-08-23 05:47:59 PST

Gerhard Prilmeier wrote:
[...]
> Opposed to that, Andrei Alexandrescu wrote a lengthy article about using
> volatile in multhithreaded programs:
> http://www.cuj.com/experts/1902/alexandr.htm
>
> You see me rather confused, unfortunately.
> Who is wrong? Are both right?

Ha! Yeah, that was rather funny and noisy stuff, indeed. ;-)

http://groups.google.com/groups?selm=94ccng%24m2t%241%40nnrp1.deja.com

"Please, Andrei, not 'volatile'!"

http://groups.google.com/groups?selm=3A66E4A1.54EE37AD%40compaq.com

<quote>

Andrei Alexandrescu wrote:

> In my opinion the use of volatile that the article suggests is fully in
> keeping with its original intent. Volatile means "this data can be changed
> beyond compiler's ability to figure it out" and this is exactly what happens
> to data that's shared between threads. Once you lock the synchronization
> object, the code's semantics become single-threaded so it cannot be changed
> beyond compiler's ability to comprehend and so you can cast volatile away.
> What's so wicked in that? Looks very kosher to me :o).

After the number of responses in this thread, you can still say that? Amazing.

Original intent? The intent of the "volatile" attribute is to change the code
generated by the compiler on references to memory tagged with that attribute.
You are using the syntactic tag while defeating the intent of that tag, never
allowing the compiler to generate code in the way required by that tag. Abuse,
I'm afraid, is often in the eye of the beholder, but it's hard to see how
anyone could refuse to admit that this is, at best, "pretty close to the edge".

In fact, though you've said you weren't intending to advocate direct access to
any "volatile" variable without applying synchronization and casting away
"volatile", your first Gadget::Wait example does precisely that, and is wildly
incorrect and dangerously misleading. Compiler volatile semantics are not
sufficient when sharing flag_ between threads, because the hardware, as well as
the compiler, may reorder memory accesses arbitrarily, even with volatile. (Nor
would a compiler implementation that issued memory barriers at each sequence
point for volatile variables be sufficient, unless ALL data was volatile, which
is impractical and unreasonably expensive.)

Memory barriers must be applied where necessary on many architectures, and
there is no standard or portable way to generate them. There is no excuse for a
compiler to require both volatile AND memory barriers, because there's no
excuse for a compiler to reorder memory access around its own memory barrier
construct. (Usually either a compiler builtin such as Compaq C/C++ "__MB()" or
an asm("mb") "pseudocall".) The standard and portable way to ensure memory
consistency is to rely on the POSIX memory model, which is based solely on
specific POSIX API calls rather than expensive and inappropriately defined
language keywords or nonportable hardware instructions. A system or compiler
that does not provide the proper memory model (without volatile) with proper
use of the portable POSIX API calls does not conform to POSIX, and cannot be
considered for serious threading work. Volatile is irrelevant.

Entirely aside from the language issues, my point was simply that "volatile",
and especially its association with threaded programming, has been an extremely
confusing issue for many people. Simply using them together is going to cause
even more confusion. The illusory promise of volatile will lead novices into
trouble.

In contradiction to your absurd statement that "writing multithread programs
becomes impossible" without volatile, the intended C and C++ semantics
associated with volatile are neither useful nor sufficient for threaded code.
And it is WITH volatile, not without, that "the compiler wastes vast
optimization opportunities", especially as the expense of meeting the volatile
"contract" is of no benefit to threaded code.

With all that said, I wish there was a language keyword intended to be used in
the manner you're (ab)using volatile. Though I think your method falls far
short of your promises to detect all race conditions at compile time (unless
applying such a limited and unusual definition of "race" that the term becomes
essentially meaningless), it does have value. What you've done is, in some
ways, one step beyond the Java "synchronized" keyword. It provides not only
syntax to require that access be synchronized, but your type magic allows the
compiler to determine whether, in the current scope, the data is already
synchronized. (This might allow avoiding the Java dependency on expensive
recursive mutexes. Though I'm not entirely convinced your method would survive
a complicated application with multilevel lock hierarchies, I'm not entirely
convinced it wouldn't, either.)

Still, if you're willing to point out that applying volatile to tag temporaries
would be "abuse", recognize that others might reasonably draw the line a bit
differently.

</quote>

http://groups.google.com/groups?selm=3A684272.EC191FD%40compaq.com

<quote>

Andrei Alexandrescu wrote:

> "Dave Butenhof" <David.B...@compaq.com> wrote in message
>
> > In fact, though you've said you weren't intending to advocate direct access to
> > any "volatile" variable without applying synchronization and casting away
> > "volatile", your first Gadget::Wait example does precisely that, and is wildly
> > incorrect and dangerously misleading. Compiler volatile semantics are not
> > sufficient when sharing flag_ between threads, because the hardware, as well as
> > the compiler, may reorder memory accesses arbitrarily, even with volatile. (Nor
> > would a compiler implementation that issued memory barriers at each sequence
> > point for volatile variables be sufficient, unless ALL data was volatile, which
> > is impractical and unreasonably expensive.)
>
> Yeah, I learned to hate the Gadget example. Where's that chrononaut to go
> back in time and remove it.

You can, at least, write a followup article to correct and clarify. It won't
reach everyone who ought to see it, but it's better than nothing.

> > In contradiction to your absurd statement that "writing multithread programs
> > becomes impossible" without volatile, the intended C and C++ semantics
> > associated with volatile are neither useful nor sufficient for threaded
> code.
>
> I agree. Boy this is hard :o).

Yeah, well, you certainly got a lot of attention for your paper. As the saying
goes, "I don't care what they say about me as long as they get my name right."
(Or, "all advertising is good advertising.")

You're right; it is hard to play around between the cracks as you're doing.
There's not a lot of wiggle room. Sounds like you'll be more careful in the
future, and that's good. Now your job is to try to help anyone you confused the
first time around. ;-)

> > And it is WITH volatile, not without, that "the compiler wastes vast
> > optimization opportunities", especially as the expense of meeting the volatile
> > "contract" is of no benefit to threaded code.
>
> What I meant was that the compiler would waste optimization opportunities if
> it treated all variables as if they were volatile. But anyway, given that
> volatile is not really of a lot of help...

Ah. Yes, eliminating all optimization would make threaded programming
impractical, at best. After all, most (though not all) applications use threads
to improve performance. While it's true that in some cases parallelized but
unoptimized code might outperform optimized unthreaded code, I wouldn't want to
bet my job on it happening a lot.

> I am glad I'm not the only one who felt there is something cool here.
> Perhaps the most important point of the article is the importance of type
> modifiers in programming languages, and how one can define/use such
> modifiers to help with multithreaded programs.

Oh yes, it's cool. In principle. It's also fairly simple, and may prove
applicable only to relatively simple programs (e.g., that never hold more than
one mutex at a time, as we'll get into below).

Perhaps this is an interesting opportunity for the language folks; to build a
language (or maybe a new C++ version) that allows something like an
"attributedef" statement, defining properties of an attribute keyword that can
be applied to classes and typedefs. You, for example, could have used a
"locked" keyword instead of confusingly overloading "volatile". I'll bet such a
keyword, which could be added or cast away at need, would enable all sorts of
interesting extensions of the compiler's type checking... including perhaps
that thing about detecting temporaries.

> The projects to which I've applied the idiom are "classic" multithreaded
> applications. The technique is easy to explain and is field tested, and not
> only by me - programmers who are not MT saviors have caught it up in no time
> and loved it, and this is is an important reason for which I believe the
> idiom is valuable. Indeed, I don't know what would happen on special
> threading models. Could you please tell what multilevel lock hierarchies
> are?

There are many cases in complicated threaded applications where a region of
code must hold more than one lock at the same time. Such code must always have
DANGER signs posted at the entrances, and you need to be really careful. Still,
there are well established ways to deal with the risks (just as, foolish though
it may be, we often drive our cars onto highways without bothering to consider
that we might die there).

The risk is deadlock, or "deadly embrace" -- the good ol' Dining Philosophers
problem. One thread owns Mutex A, and waits for Mutex B; while another thread
owns Mutex B and waits for Mutex A. The most common and "well structured"
solution to this problem is to design a strict "mutex hierarchy" defining the
"level" of each mutex. That is, if one needs both the mutex on the head of a
queue and on an element of the queue, one must always first lock the head and
only then lock the element. There is no risk of deadlock, because the element
cannot be locked unless the head is also locked.
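The queue example above can be sketched in code. This is a minimal illustration assuming pthreads; the names (Queue, Element, read_first) are invented for the sketch, not taken from any real codebase:

```cpp
#include <pthread.h>

// Hypothetical two-level hierarchy: the head mutex is level 1, element
// mutexes are level 2.  The rule: never lock an element unless the head
// is already held, so no two threads can ever wait on each other's
// locks in opposite order.
struct Element {
    pthread_mutex_t lock;        // level 2
    int value;
};

struct Queue {
    pthread_mutex_t head_lock;   // level 1
    Element *first;
};

int read_first(Queue *q) {
    pthread_mutex_lock(&q->head_lock);   // level 1 first...
    Element *e = q->first;
    pthread_mutex_lock(&e->lock);        // ...then level 2
    int v = e->value;
    pthread_mutex_unlock(&e->lock);
    pthread_mutex_unlock(&q->head_lock);
    return v;
}
```

Unlock order doesn't matter for deadlock avoidance; only the acquisition order does.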

Your technique doesn't make it impossible or even more difficult to manage
mutex hierarchies: but it doesn't make it any easier, either. Furthermore, the
"advertised power" of the technique (as currently structured) is somewhat
weakened when an object needs to be protected by multiple mutexes: locking the
element would provide a non-volatile pointer, even though correct use of that
pointer actually requires a second mutex (the header). Could you reasonably
extend the model to deal syntactically with mutex hierarchies? Would the
complacency suggested by reliance on the model prove disastrous in an
application that required hierarchies?

</quote>

Well, to be fair:

http://www.informit.com/isapi/product_id~%7BE3967A89-4E20-425B-BCFF-B84B6DEED6CA%7D/element_id~%7B1872DFB1-6031-4CB0-876D-9533C4A23FC9%7D/st~3FAD3499-20A6-4782-9A96-05825F8E6E5B/content/articlex.asp
(Multithreading and the C++ Type System
FEB 08, 2002 By Andrei Alexandrescu. Article is provided courtesy of Addison Wesley.)

<quote>

An article of mine[4] describes practical compile-time race condition
detection in C++. The method exploits the volatile keyword not as a
semantic vehicle, but only as a participant to the C++ type system.
The programmer qualifies the shared user-defined objects with volatile.
Those objects can be manipulated only by applying a const_cast. Finally,
a helper object can ensure that the const_cast is performed only in
conjunction with locking a synchronization object. Effectively, an
object's type (volatile-qualified or not) depends on whether its
corresponding synchronization object is locked or not. The main
caveat of the technique is that the use of the obscure volatile
qualifier might appear confusing to the unwitting maintainer.

</quote>
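The mechanism described in the quote can be sketched roughly as follows. This is a reconstruction of the idea, not the article's actual code; the names (LockingPtr, Counter) are invented here:

```cpp
#include <pthread.h>

// Shared data is declared volatile purely as a type-system tag.
// Because non-volatile member functions cannot be called on a
// volatile-qualified object, ordinary use of the object forces the
// programmer through LockingPtr, which casts volatile away only
// while it holds the associated mutex.
struct Counter {
    int n;
    void bump() { ++n; }          // not callable through a volatile Counter
    int get() const { return n; }
};

template <typename T>
class LockingPtr {
public:
    LockingPtr(volatile T &obj, pthread_mutex_t &mtx)
        : obj_(const_cast<T *>(&obj)), mtx_(&mtx) {
        pthread_mutex_lock(mtx_);    // lock and cast come as a package
    }
    ~LockingPtr() { pthread_mutex_unlock(mtx_); }
    T *operator->() { return obj_; }
    T &operator*()  { return *obj_; }
private:
    T *obj_;
    pthread_mutex_t *mtx_;
};

volatile Counter shared_counter = { 0 };
pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;

void increment() {
    // shared_counter.bump() would not compile here; access requires
    // going through LockingPtr, i.e. taking the mutex.
    LockingPtr<Counter> p(shared_counter, counter_mutex);
    p->bump();
}
```

Note that, per Butenhof's objection above, this checks the access *discipline* at compile time; the actual memory consistency still comes entirely from the mutex, not from volatile.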

4. Andrei Alexandrescu, "volatile: Multithreaded Programmer's Best
Friend," C/C++ Users Journal, February 2001.

regards,
alexander.

David Butenhof

unread,
Oct 10, 2002, 10:16:43 AM
to
Momchil Velikov wrote:

>> If the variable's storage is in a device, it must be volatile, but that
>> has nothing to do with threads. If the variable is in another thread
>> that does nothing but compute new values for that variable, then you need
>> to obey the memory visibility rules that your platform sets out.
>
> Define "memory visibility" ?
>
> On every platform I'm aware of, if a CPU performs an (atomic) memory
> write, the value written is _eventually_ visible to other CPUs.

Yes. For some definition of "eventually". ;-)

But that says nothing of the SEQUENCE, and often that's more important...

> Moreover, on every platform I'm aware of, the sequence of values read
> from a single memory location is a (not necessarily proper)
> subsequence of the sequence of values written, i.e. no CPU can observe
> values occurring in the opposite order.

This is simply wrong. X86 doesn't reorder anything. SPARC (normally)
reorders writes but not reads (so a barrier on the writer side is enough).
But it's not true for Alpha and it's not true for IPF. You need a barrier
(or fence) on BOTH sides of the transaction. That's the essential bug in
the double-checked initialization. All the mutexing and volatility and
everything else in the writer does absolutely no good for the poor thread
who comes along later and reads "initialization done" before it can see all
the initialized data.

The ONLY correct and portable solution short of proper POSIX synchronization
is a barrier/fence between reading the "initialized" flag and any access to
data the presence of which the flag is intended to indicate.

"DCL" initialization code is inherently nonportable. Period. If your
definition of "correct" is proper operation on the architectures on which
it operates properly, then fine... it's "correct but nonportable". If you
don't like that tautology, then you have to accept that it's "incorrect".

> These, along with atomicity of reads/writes, are sufficient for the
> above examples to work. Note that I don't claim (and have _never_
> claimed) they are necessary.
>
>> > c) memory ordering - CPU can reorder memory accesses, which is
>> >prevented by memory barriers, but the compiler can reorder memory
>> >accesses too, which is prevented by volatile.
>>
>> This is not an example.
>
> This is not an argument.

"Yes it is." (If you're not a fan of Monty Python, or don't know who they
are, just forget I said that...)

>> Using volatile in an attempt to achieve thread
>> safety *WILL NOT WORK* on all platforms. Period.
>
> This is not an argument either.
>
>> Further, there is
>> always a way to do the same job (thread safety of data) without using
>> volatile.
>
> Most probably. So, what?
>
> Correctness first, but after all we want performance too, don't we ?

How do you define "correct"? If you violate the POSIX memory model (as your
double-checked initialization variable does), then your code MAY be correct
on some particular processor models, and it may have the best performance
on those processors; but it is not portable. In terms of the standard, and
to many people writing here, nonportable is not correct. This dichotomy can
lead to endless and pointless arguments. ("... No it can't!"; "Yes it
can!")

> On modern SMP architectures every write to concurently accessed memory
> is a potential bottleneck - including things like
> pthread_mutex_lock/unlock.
> Note that the same applies to pthread_rwlock_rdlock too as it performs
> a memory write too. Your best bet after making all the efforts to
> have no concurrently accessed locations is to perform mostly reads
> there and avoid as hell pthread_ synchronization functions (or any
> sync functions for that matter).

People still tend to count instruction cycles to measure "performance"; but
you're right... in most modern systems all that really matters is memory
references, and particularly writes. Write conflicts can hash up everything
by filling the cache coherency channels -- and because those are oriented
towards cache lines rather than "variables", and even with multi-way
associativity you can still get widely spaced data competing for a single
cache line, the effects are difficult to predict or control.

Still, you avoid synchronization only as much as possible, and no more. One
of the inputs to your decision must be the consideration of how much you're
willing to rewrite/redesign your code when porting to a platform that's
more aggressive. If you stick with synchronization for memory visibility and
sequence, you're safe on any conforming platform. If you skip "the rules"
and roll your own architecture-specific code, you're going to miss
something when you port the code, and it's going to blow up in weird ways
that will waste inordinate amounts of time.

Write it correct and portable first. Then ANALYZE what's really happening
and optimize only where needed... and where the loss of portability is
worth the payback.

--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/

Drazen Kacar

unread,
Oct 10, 2002, 10:24:04 AM
to
David Butenhof wrote:

> This is simply wrong. X86 doesn't reorder anything. SPARC (normally)
> reorders writes but not reads (so a barrier on the writer side is enough).

Even with those or similar processors... Wouldn't it be possible that a
largish machine has a bus which reorders memory accesses independently of
what the CPU is capable of? Or the memory interface has to keep the same
promise as the CPU?

David Butenhof

unread,
Oct 10, 2002, 10:36:31 AM
to
Patrick TJ McPhee wrote:

It's not quite as bad as it might seem, really.

Following the C/C++ rules will generally lead even a fairly aggressively
optimizing compiler to "do the right thing" automatically without any
explicit awareness of POSIX synchronization or memory barrier builtins.
It's HARD to optimize across function calls using the variable visibility
rules. I've dealt with some highly aggressive optimizing compilers, and I've
never yet heard of a compiler that could figure out how to break the POSIX
memory rules in a conforming POSIX threaded program. (Or even an "extended"
program using direct memory barriers.) That, of course, isn't meant to be a
guarantee that such compilers can't (or even don't) exist.

However, as Patrick says, any compiler that COULD do this sort of
optimization in a way that could break POSIX semantics, and was intended to
support a POSIX conforming threading environment, would simply have to do
whatever was necessary to make sure the generated code would work. That
might be just as difficult as breaking the rules in the first place, but it
doesn't matter -- it's an unyielding requirement.

Mark Johnson

unread,
Oct 10, 2002, 11:27:46 AM
to
Alexander Terekhov wrote:
>
> Mark Johnson wrote:
> [...]
> > Other people have stated that they can get away without volatile. They
> > are correct for the specific system / compiler combination (and even
> > compiler switch setting) they are using, but not for the general case.
>
> I think you got it kinda backwards... And, BTW, the C standard says:
>
I don't think so - reread what I stated - leaving out volatile works in
some cases but not all. How can you disagree with that statement? See
the notes below for more detail on why I make that statement. The other
reply to my comment emphasizes this point - though perhaps not in the way
he meant. Perhaps you assume that everybody uses a POSIX-compliant
system?

> "What constitutes an access to an object that has volatile-qualified
> type is implementation-defined"
>

Actually, a quick look at "Standard C: Programmer's Quick Reference" by
Plauger & Brodie brings up the following points about volatile...
- you specify volatile qualified types for data objects accessed or
altered by signal handlers, by concurrently executing programs, or by
special hardware (such as memory mapped I/O control register). Page 37.
- you specify both const and volatile qualified types for data objects
that the program does not alter, but that other agencies can alter (such
as a memory mapped interval timer). Page 37.
- side effects occur when the program ... accesses a value from a data
object of volatile qualified type .... Page 99.
- a comment on page 137 in describing longjmp noting the effects of
possible value changes since setjmp was called.
- [perhaps what you are referring to] As part of the portability
description, page 182 describes... When the program accesses or stores a
value in a volatile data object, each implementation defines the number
and nature of the accesses and stores. Three possibilities exist:
multiple accesses to different bytes; multiple accesses to the same
byte; no accesses at all, in some cases. You cannot write a program that
produces the same pattern of accesses across multiple implementations.
- another portability note about signals & the use of volatile
sig_atomic_t typed variables on page 183.

I would hope that Plauger and Bowie know what they are talking about :-)
and would not deliberately mislead the reader. Based on that I would say
that the C standard (and this description of it) actually has a *lot* to
say about the implementation of volatile. The part that is left to each
implementation does not affect the rest of the effects.
--Mark

Alexander Terekhov

unread,
Oct 10, 2002, 12:20:12 PM
to

Mark Johnson wrote:
>
> Alexander Terekhov wrote:
> >
> > Mark Johnson wrote:
> > [...]
> > > Other people have stated that they can get away without volatile. They
> > > are correct for the specific system / compiler combination (and even
> > > compiler switch setting) they are using, but not for the general case.
> >
> > I think you got it kinda backwards... And, BTW, the C standard says:
> >
> I don't think so - reread what I stated - leaving out volatile works in
> some cases but not all. How can you disagree with that statement?

I agree with the following statement(s):

http://groups.google.com/groups?selm=3D906223.4D97597B%40web.de

----
< somewhere, sometime, usenet, Mr.B >

"....
> - when the 'volatile' keyword must be used in
> multithreaded programming?

Never in PORTABLE threaded programs. The semantics of the C and
C++ "volatile" keyword are too loose, and insufficient, to have
any particular value with threads. You don't need it if you're
using portable synchronization (like a POSIX mutex or semaphore)
because the semantics of the synchronization object provide the
consistency you need between threads.

The only use for "volatile" is in certain non-portable
"optimizations" to synchronize at (possibly) lower cost in
certain specialized circumstances. That depends on knowing and
understanding the specific semantics of "volatile" under your
particular compiler, and what other machine-specific steps you
might need to take. (For example, using "memory barrier"
builtins or assembly code.)

In general, you're best sticking with POSIX synchronization, in
which case you've got no use at all for "volatile". That is,
unless you have some existing use for the feature having nothing
to do with threads, such as to allow access to a variable after
longjmp(), or in an asynchronous signal handler, or when
accessing hardware device registers.
...."
----

[...]


> Actually, a quick look a "Standard C: Programmer's Quick Reference" by
> Plauger & Brodie brings up the following points about volatile...

^^^^^^^

[...]


> I would hope that Plauger and Bowie know what they are talking about :-)

^^^^^^^

http://groups.google.com/groups?threadm=3D8EEE36.C164536D%40web.de
("Subject: "PJP doesn't seem to care" ;-)")

regards,
alexander.

--
http://aspn.activestate.com/ASPN/Mail/Message/boost/1375874
(Fair enough. If/when you'll see PJP, please drop these links "on him"...)

Momchil Velikov

unread,
Oct 10, 2002, 1:15:46 PM
to
Alexander Terekhov <tere...@web.de> wrote in message news:<3DA5458B...@web.de>...

> Momchil Velikov wrote:
> >
> > Drazen Kacar <da...@willfork.com> wrote in message news:<slrnaq9r9...@willfork.com>...
> > > int foo(...)
> > > {
> > > static volatile sig_atomic_t initialized = 0;
> > >
> > > if (!initialized)
> > > {
> > > pthread_mutex_lock(...);
> > > if (!initialized)
> > > {
> > > initialize_me();
> > > initialized = 1;
> > > }
> > > pthread_mutex_unlock(...);
> > > }
> ...
> > > }
> > >
> > > Unless... initialize_me function could store some values in memory. After
> > > that we store one in initialized. When the next thread starts executing
> > > this code, it might read one from initialized, but it won't necessarily
> > > see all the values stored by initialize_me function, because there's no
> > > memory barrier in its code path.
> >
> > It will see them. ...
>
> Stop silly arguing and read {trying to understand} the stuff
> you're pointed to.

What looks silly to you was indeed quite a useful discussion for me
...

> http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
> (The "Double-Checked Locking is Broken" Declaration)

Thanks, this is _very_ interesting.

I can see where I'm wrong.

Indeed, in POSIX environment, volatile adds nothing of value to the
solutions of synchronization problems.

~velco

Alexander Terekhov

unread,
Oct 10, 2002, 1:34:34 PM
to

Momchil Velikov wrote:
[...]

> > Stop silly arguing and read {trying to understand} the stuff
> > you're pointed to.
>
> What looks silly to you was indeed quite a useful discussion for me

Sure. ``Aahz's law: The best way to get information on usenet is
not to ask a question, but to post the wrong information.'' ;-) ;-)

regards,
alexander.

Patrick TJ McPhee

unread,
Oct 10, 2002, 2:15:23 PM
to
In article <87bded37.02100...@posting.google.com>,
Momchil Velikov <ve...@fadata.bg> wrote:
% st...@nospam.Watt.COM (Steve Watt) wrote in message
% news:<H3qv7...@Watt.COM>...

% > The compiler could put it into a register, if it knew how to make that
% > register available to all threads.
%
% The compiler is not entitled to reasoning about threads.

No, but each thread must be able to get at the variable. It's got to be
static or global, or you've got to take its address. If a compiler wants
to put such a variable into a register, it's got to be able to allow
other functions to get the same register from the static/global/pointer.

Patrick TJ McPhee

unread,
Oct 10, 2002, 2:20:43 PM
to
In article <87bded37.02100...@posting.google.com>,
Momchil Velikov <ve...@fadata.bg> wrote:
% pt...@interlog.com (Patrick TJ McPhee) wrote in message
% news:<i80p9.14468$qh1.1...@news.ca.inter.net>...
% > You can't depend on the C standard because threading
% > is always an extension to that.
%
% This is not a valid argument. You _can_ depend on the C standard for
% the things specified in the C standard.

OK, but the C standard says nothing about whether volatile is required
for threading, does it?

David Schwartz

unread,
Oct 10, 2002, 2:58:07 PM
to
Momchil Velikov wrote:

> This is not a valid argument. You _can_ depend on the C standard for
> the things specified in the C standard. The compiler is unaware of
> threads, but this does not mean it will perform randon violations of
> the standard.
>
> ~velco

The compiler is not required to be unaware of threads. And if the
compiler is shipped as part of a package that claims POSIX compliance,
it had better have as much awareness of threads as it needs.

As for:

>Why it is not allowed to keep it in a register across calls to
>pthread_mutex_lock/unlock ?

The answer should really be obvious. The compiler has no idea what
pthread_mutex_lock/unlock do. So it has to assume that those functions
could modify the global variable, which would break if the global
variable were cached in a register.

DS

David Schwartz

unread,
Oct 10, 2002, 2:59:23 PM
to
Momchil Velikov wrote:

> I've written tons of multithreaded programs, which didn't work without
> using volatile.
>

> From neither of these statements should be implied that
> a) there does not exist a multithreaded program, which requires
> volatile, nor
> b) any multithreaded program requires volatile
>
> but rather:
> a) not every multithreaded program requires volatile
> b) there exists multithreaded program, which requires volatile

So long as it's clear that you are NOT talking about POSIX threads, I
have no disagreement with you. If you claim that what you're talking
about has anything to do with writing programs that are POSIX-compliant,
then that's another story.

DS

Momchil Velikov

unread,
Oct 10, 2002, 3:23:13 PM
to
David Butenhof <David.B...@compaq.com> wrote in message news:<fZfp9.19$3S3.3...@news.cpqcorp.net>...

> Momchil Velikov wrote:
> > On every platform I'm aware of, if a CPU performs an (atomic) memory
> > write, the value written is _eventually_ visible to other CPUs.
>
> Yes. For some definition of "eventually". ;-)
>
> But that says nothing of the SEQUENCE, and often that's more important...
>
> > Moreover, on every platform I'm aware of, the sequence of values read
> > from a single memory location is a (not necessarily proper)
> > subsequence of the sequence of values written, i.e. no CPU can observe
> > values occuring in the opposite order.
>
> This is simply wrong. X86 doesn't reorder anything. SPARC (normally)
> reorders writes but not reads (so a barrier on the writer side is enough).
> But it's not true for Alpha and it's not true for IPF. You need a barrier

Thanks, Alpha memory ordering was quite a surprise to me.

> (or fence) on BOTH sides of the transaction. That's the essential bug in

I was thinking one can get away with no memory barrier on the read
side.



> The ONLY correct and portable solution short of proper POSIX synchronization
> is a barrier/fence between reading the "initialized" flag and any access to
> data the presence of which the flag is intended to indicate.

I see.

> "DCL" initialization code is inherently nonportable. Period. If your
> definition of "correct" is proper operation on the architectures on which
> it operates properly, then fine... it's "correct but nonportable". If you

"Correct, but not quite portable" is the preferred way I like to write
programs. Of course, one has to have a clear picture of the platform
dependencies employed in a given solution so he is able to avoid them
on another platform (or alternatively, repeat the same if the same
assertions hold on the new platform).

So, the DCL case will look like this, right?

void foo ()
{
    static int initialized;
    int i;

    i = initialized;

    /* Read memory barrier, so the above read of ``initialized''
       is issued before reads of memory, initialized by ``do_stuff''.
       We may not need this on some architectures.  Which ones? */
    rmb ();
    if (i == 0)
    {
        lock ();
        if (initialized == 0)
        {
            do_stuff ();

            /* Write memory barrier - prevent CPU from reordering
               writes across this point.  Necessary (but not
               necessarily sufficient) to ensure the reader sees
               the new values written in ``do_stuff''
               whenever it sees ``initialized'' being nonzero. */
            wmb ();
            initialized = 1;
        }
        unlock ();
    }

    do_more_stuff ();
}

Please add more comments as necessary, as these are incomplete (and
maybe wrong?). Note that I have deliberately not used ``pthread_''
functions, so as not to unnecessarily limit the discussion to the POSIX
standard. (You know, there's POSIX, Win32, Ada, nITRON, DSPBIOS,
VxWorks, pSOS and all sorts of tiny multitasking executives.)

> Write it correct and portable first. Then ANALYZE what's really happening
> and optimize only where needed... and where the loss of portability is
> worth the payback.

Yes. I do tend to "extend" the standards only with memory barrier
inlines and "volatile", both readily available and usually well worth
the effort.

(Well, I would have failed on Alpha (or maybe not, because I would
have read the handbook :)).

Thanks again. Your and Alexander's comments are greatly appreciated.

~velco

Steve Watt

unread,
Oct 10, 2002, 5:17:18 PM
to
In article <3DA5194B...@worldnet.att.net>,

Jim Rogers <jimmaure...@worldnet.att.net> wrote:
>Carlos Moreno wrote:
>
>> Wow! This has been a great discussion! (well, for me
>> anyway). I appreciate your comments and thoughts!

Glad you're enjoying it.

>> However, I do have another doubt (well, kind of the same
>> doubt, but now in a more specific context).
>>
>> Several people have pointed out that when using POSIX
>> threads, volatile is never necessary, and that proper
>> synchronization using the right POSIX threads facilities
>> is always sufficient.
>>
>> Now, how can the compiler know that? Is the compiler
>> aware that when using pthreads it must disable any
>> optimizations surrounding shared variables access? As
>> I understand it, POSIX threads are kinda multiplatform
>> (there are pthreads libraries for Unix/Linux, but also
>> for Windows -- maybe for MAC and other OS's too?). So,
>> how can POSIX threads alone provide any guarantee about
>> something that seems completely in the compiler's hands?

The compiler only needs to know that there are external functions
named pthread_mutex_lock() (etc.). It can not know what those
functions do. With those two simple constraints, any C or C++
compiler can generate correct code...

>A C or C++ compiler cannot know that, because the language does not
>address threading. The Posix libraries can know that. If you want
>a compiler to know about threading issues you must use a language
>with threading built into the syntax. The three languages that
>most frequently come to my mind for this purpose are Java, C#,
>and Ada.

Yes, but the language also guarantees that if function a sets a
global variable to 15, and then calls function b, function b must
see that global variable as 15. If the function is in another
translation unit (read: source file) then it is impossible for
the compiler to cache a global value in registers while it makes
such a function call, because the basic semantics of the language
won't work. Threading is not involved.
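The point above can be illustrated with a sketch (single file here for brevity; imagine g() compiled in a separate source file, so its body is invisible when a() is compiled):

```cpp
// Because the compiler cannot, in general, see g()'s body, it must
// assume g() may read shared_global, so the store of 15 has to reach
// memory before the call.  Plain language semantics -- no threads.
int shared_global = 0;
int observed = -1;

void g() { observed = shared_global; }   // stands in for code in another file

void a() {
    shared_global = 15;   // cannot stay cached in a register past the call
    g();                  // g() is guaranteed to see 15
}
```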

[ snip remainder of advertisement for Ada ]

I like a lot of Ada's concepts, but it's not necessary for this
problem.
--
Steve Watt KD6GGD PP-ASEL-IA ICBM: 121W 56' 57.8" / 37N 20' 14.9"
Internet: steve @ Watt.COM Whois: SW32
Free time? There's no such thing. It just comes in varying prices...

Carlos Moreno

unread,
Oct 10, 2002, 5:31:53 PM
to

David Schwartz wrote:

>
> As for:
>
>>Why it is not allowed to keep it in a register across calls to
>>pthread_mutex_lock/unlock ?
>
> The answer should really be obvious. The compiler has no idea what
> pthread_mutex_lock/unlock do. So it has to assume that those functions
> could modify the global variable, which would break if the global
> variable were cached in a register.


Though I didn't ask that question, it goes more or less with
my original doubt (I'm the OP, BTW).

It was pretty obvious for me that with global variables there
wouldn't be a problem (as you say, facing an "unknown" function,
the compiler *must* assume the possibility that such function
could modify the global variable)... But what was not clear
to me was, if the shared variable was local, how would the
compiler know? The calls to lock or unlock don't receive the
address of the shared variable, so how can the compiler know?
Only a previous call to thread_create did (so, how would
the compiler establish the link between those two apparently
unrelated actions, if it's not aware of what thread_create
or mutex_lock do?)

This has already been answered, at least for POSIX compliant
systems; I just wanted to clarify that the doubt was in a
situation less trivial than this one :-)

Carlos
--

Arnold Hendriks

10 Oct 2002, 6:10:27 PM
Carlos Moreno <moreno_at_mo...@xx.xxx> wrote:
> compiler know? The calls to lock or unlock don't receive the
> address of the shared variable, so how can the compiler know?
> Only a previous call to thread_create did (so, how would
> the compiler establish the link between those two apparently
> unrelated actions, if it's not aware of what thread_create
> or mutex_lock do?)
That has nothing to do with threads. In the original example, the address of
the local variable was taken and passed to an external function. Later on,
another external function (mutex lock) is called. The compiler must assume
that the 2nd external function could have received the address of the local
variable from the 1st external function, so it may not make any assumptions
about the local variable's contents after an external function.

Basically, the local variable can be considered a global variable as soon
as its address 'escapes' from the code the compiler can see.

Carlos Moreno

10 Oct 2002, 8:52:28 PM

Arnold Hendriks wrote:


Duh!!! You're right, my question was very trivial after all! :-(

Cheers,

Carlos
--

Joshua Jones

11 Oct 2002, 12:26:56 AM
Momchil Velikov <ve...@fadata.bg> wrote:
>
> Define "memory visibility" ?
>

Go check out Tanenbaum and Van Steen's "Distributed Systems." It
has a great section on memory visibility and different models
that can be followed. Very interesting.

>
> (Well, maybe I misunderstood the topic of the newsgroup, maybe it is
> about bitching about standards instead of multithreaded programming in
> the real world, so I'll go find the FAQ).
>

After all the clear explanations given, I fail to see how you
still argue your "volatile" point. If this is not what you're
arguing, I apologize--I've gotten lost wading in all the posts.

--
josh(at)intmain.net | http://intmain.net
CS @ College of Computing, Georgia Institute of Technology, Atlanta
532704 local keystrokes since last reboot, 39 days ago.

Momchil Velikov

11 Oct 2002, 3:02:27 AM
David Schwartz <dav...@webmaster.com> wrote in message news:<3DA5CDBF...@webmaster.com>...

> Momchil Velikov wrote:
>
> > This is not a valid argument. You _can_ depend on the C standard for
> > the things specified in the C standard. The compiler is unaware of
> threads, but this does not mean it will perform random violations of
> > the standard.
> >
> > ~velco
>
> The compiler is not required to be unaware of threads. And if the

I think it is required, unless the particular thread-aware feature is
implementation-defined in the sense of the standard. Thread awareness
in other parts of the standard would introduce implementation
dependencies in places where they aren't allowed, effectively making
the compiled language not C, no ?

> compiler is shipped as part of a package that claims POSIX compliance,
> it had better have as much awareness of threads as it needs.

I'd rather hope POSIX does not require its own language. As much as I
like POSIX (pre 2001 :) there are other environments out there.

> As for:
>
> >Why it is not allowed to keep it in a register across calls to
> >pthread_mutex_lock/unlock ?
>
> The answer should really be obvious. The compiler has no idea what
> pthread_mutex_lock/unlock do. So it has to assume that those functions
> could modify the global variable, which would break if the global
> variable were cached in a register.

Yes. Absolutely. Faced with a lack of knowledge, the compiler will err
on the conservative side, i.e. it'll assume that the variable might be
changed.

But ... (playing devil's advocate here)

The FACT is that lock/unlock functions DO NOT modify the variable.
Isn't it possible for this information to be conveyed (by some
implementation specific means) to the compiler ? That certainly
doesn't contradict to the standard as the standard has nothing to do
with registers (besides the denounced "register" keyword).

(As an example of implementation specific "aid" to the compiler,
consider
GCC -fno-exceptions or MSVC /Oa)

~velco

Ziv Caspi

11 Oct 2002, 7:04:34 AM
On 10 Oct 2002 22:10:27 GMT, Arnold Hendriks <a.hen...@b-lex.com>
wrote:

>That has nothing to do with threads. In the original example, the address of
>the local variable was taken and passed to an external function. Later on,
>another external function (mutex lock) is called. The compiler must assume
>that the 2nd external function could have received the address of the local
>variable from the 1st external function, so it may not make any assumptions
>about the local variable's contents after an external function.
>
>Basically, the local variable can be considered a global variable as soon
>as its address 'escapes' from the code the compiler can see.

This is bogus reasoning.

Let's assume the most difficult case: the 2nd external function is
dynamically loaded at run time (so it is unknown at compile or link
time).

A conforming compiler/runtime could theoretically disassemble that
function when it is loaded into memory, determine that it doesn't get
the address of the local variable after all, and then start
optimizing.

Such a conforming compiler would break your assumptions.

Ziv

Arnold Hendriks

11 Oct 2002, 7:36:54 AM
Ziv Caspi <zi...@netvision.net.il> wrote:
>>that the 2nd external function could have received the address of the local
>>variable from the 1st external function, so it may not make any assumptions
>>about the local variable's contents after an external function.
>>
> This is bogus reasoning.
[...]

> A conforming compiler/runtime could theoretically disassemble that
> function when it is loaded into memory, determine that it doesn't get
> the address of the local variable after all, and then start
> optimizing.

The discussion was not about what a conforming compiler can or cannot do.
It was about how a compiler figures out that caching a local variable
across an external function call might be unsafe.

--
Arnold Hendriks <a.hen...@b-lex.com>
B-Lex Information Technologies, http://www.b-lex.com/

Carlos Moreno

11 Oct 2002, 8:18:33 AM

Ziv Caspi wrote:

>>
>>Basically, the local variable can be considered a global variable as soon
>>as its address 'escapes' from the code the compiler can see.
>
> This is bogus reasoning.
>
> Let's assume the most difficult case: the 2nd external function is
> dynamically loaded at run time (so it is unknown at compile or link
> time).
>
> A conforming compiler/runtime could theoretically disassemble that
> function when it is loaded into memory, determine that it doesn't get
> the address of the local variable after all, and then start
> optimizing.
>
> Such a conforming compiler would break your assumptions.


Actually, I think the *only* thing we know a conforming compiler
must do, is do the right thing -- IOW, we could *almost* argue
that a conforming compiler *must* read from memory... Anything
beyond that comes from the optimizer determining that *with
absolute certainty* there is no possibility that the variable
has a different value than the value the compiler thinks it
should have -- then the compiler decides not to read it from
memory, since it can *predict* with absolute certainty what is
the value that is going to be read...

But then, Arnold's reasoning is accurate: the compiler must
drop any assumption given the slight possibility that suggests
the contrary -- BTW, if the compiler is so smart to analyze
code *at run-time* (which I *really* doubt any compiler does
or *ever* does -- what kind of optimization would that be???
To optimize away the reading *one* variable it has to reproduce
*the entire execution of the program*??? I wouldn't call that
an optimization :-))... I was saying that if the compiler is
*that* smart in analyzing what happens, then it should know
that when calling pthread_create passing the address of a
local variable, then that local variable should automatically
be treated as volatile storage...

Cheers,

Carlos
--

Arnold Hendriks

11 Oct 2002, 8:29:09 AM
Carlos Moreno <moreno_at_mo...@xx.xxx> wrote:
> the contrary -- BTW, if the compiler is so smart to analyze
> code *at run-time* (which I *really* doubt any compiler does
> or *ever* does -- what kind of optimization would that be???
> To optimize away the reading *one* variable it has to reproduce
> *the entire execution of the program*??? I wouldn't call that
> an optimization :-))... I was saying that if the compiler is
Well, perhaps the entire program code is stored in local memory, and the
local variable is stored on a memory node located on Pluto, so that
analysis of the program code is faster than rechecking memory :-)

David Butenhof

11 Oct 2002, 9:54:57 AM
Momchil Velikov wrote:

> So, the DCL case will look like this, right ?

The important thing to remember with true "memory barrier" systems (e.g.,
Alpha and IPF) is that the instruction doesn't touch memory; it's entirely
within the CPU's memory logic. After all, going to the main memory system
is expensive!

All you can control is the ORDER of issued memory operations.

> void foo ()
> {
> static int initialized;
> int i;
>
> i = initialized;
>
> /* Read memory barrier, so the above read of ``initialized''
> is issued before reads of memory, initialized by ``do_stuff''.
> We may not need this on some architectures. Which ones ? */
> rmb ();

Alpha doesn't have a "read barrier" -- only "memory barriers" (read and
write), and "write barriers".

Depending on the nature of the initialized data, and how it was used in the
two threads, you should be able to get away with a read barrier, but I'd
just stick with a full barrier.

Also, a barrier usually isn't free, and can be fairly expensive. You really
don't need this one in Thread A, so I'd put it later, in the 'else' clause
on the outer if statement. (The inner 'if' will provide memory visibility
thanks to the mutex unlock.)

> if (i == 0)
> {
> lock ();
> if (initialized == 0)
> {
> do_stuff ();
>
> /* Write memory barrier - prevent CPU from reordering
> writes across this point. Necessary (but not
> necessarily sufficient) to ensure the reader sees
> the new values written in ``do_stuff''
> whenever it sees ``initialized'' being nonzero. */
> wmb ();
> initialized = 1;
> }
> unlock ();
> }

else
mb ();


>
> do_more_stuff ();
> }
>
> Please as more comments as necessary, as these are incomplete (and
> maybe wrong ?). Note, that I have deliberately not used ``pthread_''
> functions to not unnecessarily limit the discussion to the POSIX
> standard. (You know, there's POSIX, Win32, ADA, nITRON, DSPBIOS,
> VxWorks, pSOS and all sorts of tiny multitasking executives).
>
>> Write it correct and portable first. Then ANALYZE what's really happening
>> and optimize only where needed... and where the loss of portability is
>> worth the payback.
>
> Yes. I do tend to "extend" the standards only with memory barrier
> inlines and "volatile", both readily available and usually well worth
> the effort.
>
> (Well, I would have failed on Alpha (or maybe not, because I would
> have read the handbook :)).

The problem with relying on "figure it out later when porting" is that
unless you've left really good porting documentation (and knew what to
expect!), someone who follows you later in maintaining/porting the code is
probably going to miss the "nearly portable" assumptions you've made.
That's a project risk that should be considered up front. Are you really
getting enough performance benefit to justify that cost? I'm not saying
that you won't, or that the risk isn't worthwhile for your project -- only
that this shouldn't be taken lightly.

David Schwartz

11 Oct 2002, 1:21:31 PM
Momchil Velikov wrote:

> David Schwartz <dav...@webmaster.com> wrote in message news:<3DA5CDBF...@webmaster.com>...
> > Momchil Velikov wrote:

> > > This is not a valid argument. You _can_ depend on the C standard for
> > > the things specified in the C standard. The compiler is unaware of
> > > threads, but this does not mean it will perform random violations of
> > > the standard.

> > The compiler is not required to be unaware of threads. And if the



> I think it is required, unless the particular thread-aware feature is
> implementation-defined in the sense of the standard. Thread awareness
> in other parts of the standard would introduce implementation
> dependencies in places where they aren't allowed, effectively making
> the compiled language not C, no ?

Of course not, this argument is ridiculous. It is perfectly legal and
sensible to implement part (or even all!) of POSIX threads in the
compiler itself. The C standard, for example, doesn't require that there
be a header file called 'stdio.h', you could build all that
functionality into the compiler itself, so long as '#include <stdio.h>'
causes it to work. The standards don't care where and how you implement
functionality, just that you do.



> > compiler is shipped as part of a package that claims POSIX compliance,
> > it had better have as much awareness of threads as it needs.

> I'd rather hope POSIX does not require its own language. As much as I
> like POSIX (pre 2001 :) there are other environments out there.

This doesn't make any sense. The C library is part of the C language
and is specified by the same standard. You are imagining a border that
does not exist in the standard. Nothing says that the compiler is the
language.

> > As for:

> > >Why it is not allowed to keep it in a register across calls to
> > >pthread_mutex_lock/unlock ?

> > The answer should really be obvious. The compiler has no idea what
> > pthread_mutex_lock/unlock do. So it has to assume that those functions
> > could modify the global variable, which would break if the global
> > variable were cached in a register.

> Yes. Absolutely. Faced with a lack of knowledge, the compiler will err
> on the conservative side, i.e. it'll assume that the variable might be
> changed.

No, the people who make the threads library will ensure that the
compiler cannot possibly have this knowledge. Or, if they have some
input into the compiler too, they will give it the additional knowledge
that these calls are part of the thread safety mechanism.



> But ... (playing devil's advocate here)
>
> The FACT is that lock/unlock functions DO NOT modify the variable.

No, the fact is that they could, albeit indirectly.

> Isn't it possible for this information to be conveyed (by some
> implementation specific means) to the compiler ? That certainly
> doesn't contradict to the standard as the standard has nothing to do
> with registers (besides the denounced "register" keyword).

Sure, this can be conveyed to the compiler. However, a POSIX-compliant
platform would also have to convey to the compiler that these calls are
part of the thread-safety mechanism and it must take that into account.

Either approach will work, total ignorance or full knowledge, and
either approach is equally compliant with both the C and POSIX
standards. There's no way a program that is compliant with either or
both standards could tell which approach is used.



> (As an example of implementation specific "aid" to the compiler,
> consider
> GCC -fno-exceptions or MSVC /Oa)

Again, either the compiler would have to understand some portion of the
threading library or certain compiler options would be prohibited on
multithreaded code or the compiler would have to be made aware of what
functions must ignore which options or any of a dozen approaches would
have to be taken. It doesn't matter how it's done, but it must be done
somehow or the POSIX standard is not being met.

DS

David Schwartz

11 Oct 2002, 1:23:47 PM
Ziv Caspi wrote:

> Let's assume the most difficult case: the 2nd external function is
> dynamically loaded at run time (so it is unknown at compile or link
> time).

> A conforming compiler/runtime could theoretically disassemble that
> function when it is loaded into memory, determine that it doesn't get
> the address of the local variable after all, and then start
> optimizing.

> Such a conforming compiler would break your assumptions.

Such a compiler would not conform to the POSIX threads standard, so it
is not a "conforming compiler" with respect to that standard.

DS

David Schwartz

11 Oct 2002, 5:46:08 PM
Carlos Moreno wrote:

> It was pretty obvious for me that with global variables there
> wouldn't be a problem (as you say, facing an "unknown" function,
> the compiler *must* assume the possibility that such function
> could modify the global variable)... But what was not clear
> to me was, if the shared variable was local, how would the
> compiler know? The calls to lock or unlock don't receive the
> address of the shared variable, so how can the compiler know?
> Only a previous call to thread_create did (so, how would
> the compiler establish the link between those two apparently
> unrelated actions, if it's not aware of what thread_create
> or mutex_lock do?)

The answer to this question is fairly simple, by the way. Any way
another thread could get the address of that local variable, the
pthread_mutex_[un]lock function could get it.

Consider:

int a;

pthread_create(...some_function, &a);

....

pthread_mutex_lock(foo);

...

How does the compiler know that 'pthread_create' doesn't store the
address of 'a' somewhere and pthread_mutex_lock use it?

DS

Ziv Caspi

11 Oct 2002, 6:37:54 PM

Exactly. It conforms to the C standard, but not to POSIX. The post I
commented on made it look as if any compiler must do so with any
external function -- this is untrue. POSIX compilers are so
constrained, when the external function is one of a limited set of
functions marked by the standard.

Ziv

Ziv Caspi

11 Oct 2002, 6:37:52 PM
On 11 Oct 2002 11:36:54 GMT, Arnold Hendriks <a.hen...@b-lex.com>
wrote:

>The discussion was not about what a conforming compiler can or cannot do.


>It was about how a compiler figures out that caching a local variable
>across an external function call might be unsafe.

The use of "a compiler...." "must" and "may not" in the original post
contradicts what you're saying.

Ziv

Momchil Velikov

12 Oct 2002, 5:32:42 AM
David Schwartz <dav...@webmaster.com> wrote in message news:<3DA7089B...@webmaster.com>...

> Momchil Velikov wrote:
>
> > David Schwartz <dav...@webmaster.com> wrote in message news:<3DA5CDBF...@webmaster.com>...
> > > Momchil Velikov wrote:
>
> > > > This is not a valid argument. You _can_ depend on the C standard for
> > > > the things specified in the C standard. The compiler is unaware of
> > > > threads, but this does not mean it will perform random violations of
> > > > the standard.
>
> > > The compiler is not required to be unaware of threads. And if the
>
> > I think it is required, unless the particular thread-aware feature is
> > implementation-defined in the sense of the standard. Thread awareness
> > in other parts of the standard would introduce implementation
> > dependencies in places where they aren't allowed, effectively making
> > the compiled language not C, no ?
>
> Of course not, this argument is ridiculous. It is perfectly legal and
> sensible to implement part (or even all!) of POSIX threads in the

A compiler which implements (parts of) the POSIX standard
a) is not a C compiler
b) is not a compiler at all, but an _operating system_ :)

> compiler itself. The C standard, for example, doesn't require that there
> be a header file called 'stdio.h', you could build all that
> functionality into the compiler itself, so long as '#include <stdio.h>'
> causes it to work. The standards don't care where and how you implement
> functionality, just that you do.

Probably, but this is irrelevant to the discussion.

> > > compiler is shipped as part of a package that claims POSIX compliance,
> > > it had better have as much awareness of threads as it needs.
>
> > I'd rather hope POSIX does not require its own language. As much as I
> > like POSIX (pre 2001 :) there are other environments out there.
>
> This doesn't make any sense. The C library is part of the C language
> and is specified by the same standard. You are imagining a border that
> does not exist in the standard.

I think you're confusing the C library (the one specified by ISO/IEC
9899:1999) with, e.g., ``libc.a'', which usually contains both C and
POSIX functions. I'm not imagining the distinction between ISO/IEC
9899:1999 and IEEE Std. 1003.1-2001 at all.

> Nothing says that the compiler is the
> language.

When a compiler contradicts a language standard, it is an
implementation of some other language. If an OS interface standard
requires such a compiler, it follows that the OS interface standard
requires its own language.

>
> > > As for:
>
> > > >Why it is not allowed to keep it in a register across calls to
> > > >pthread_mutex_lock/unlock ?
>
> > > The answer should really be obvious. The compiler has no idea what
> > > pthread_mutex_lock/unlock do. So it has to assume that those functions
> > > could modify the global variable, which would break if the global
> > > variable were cached in a register.
>
> > Yes. Absolutely. Faced with a lack of knowledge, the compiler will err
> > on the conservative side, i.e. it'll assume that the variable might be
> > changed.
>
> No, the people who make the threads library will ensure that the
> compiler cannot possibly have this knowledge. Or, if they have some
> input into the compiler too, they will give it the additional knowledge
> that these calls are part of the thread safety mechanism.
>
> > But ... (playing devil's advocate here)
> >
> > The FACT is that lock/unlock functions DO NOT modify the variable.
>
> No, the fact is that they could, albeit indirectly.

Oops, "they could, albeit indirectly". What does that "indirectly"
mean ?
Maybe "by some means unknown to the program" ? But that means the
variable _is_ "volatile" by the sense of ISO/IEC 9899:1999, no ?

> > Isn't it possible for this information to be conveyed (by some
> > implementation specific means) to the compiler ? That certainly
> > doesn't contradict to the standard as the standard has nothing to do
> > with registers (besides the denounced "register" keyword).
>
> Sure, this can be conveyed to the compiler. However, a POSIX-compliant
> platform would also have to convey to the compiler that these calls are
> part of the thread-safety mechanism and it must take that into account.
>
> Either approach will work, total ignorance or full knowledge, and
> either approach is equally compliant with both the C and POSIX
> standards. There's no way a program that is compliant with either or
> both standards could tell which approach is used.
>
> > (As an example of implementation specific "aid" to the compiler,
> > consider
> > GCC -fno-exceptions or MSVC /Oa)
>
> Again, either the compiler would have to understand some portion of the
> threading library or certain compiler options would be prohibited on
> multithreaded code or the compiler would have to be made aware of what
> functions must ignore which options or any of a dozen approaches would
> have to be taken. It doesn't matter how it's done, but it must be done
> somehow or the POSIX standard is not being met.

Again: the C compiler should compile the C language.

Otherwise one introduces implementation-specific dependencies in the
program, i.e., while striving to make it POSIX conformant one makes it
not C conformant.

~velco

Alexander Terekhov

12 Oct 2002, 6:55:30 AM

Momchil Velikov wrote:
[...]

> Oops, "they could, albeit indirectly". What does that "indirectly"
> mean ? Maybe "by some means unknown to the program" ? But that means
> the variable _is_ "volatile" by the sense of ISO/IEC 9899:1999, no ?

Forget C/C++ volatiles, Momchil. Java/RTJ folks did it "right":

http://www.rtj.org/rtsj-V1.0.pdf

"....
Raw Memory Access

An instance of RawMemoryAccess models a range of physical memory as a
fixed sequence of bytes. A full complement of accessor methods allow
the contents of the physical area to be accessed through offsets from
the base, interpreted as byte, short, int, or long data values or as
arrays of these types. Whether the offset addresses the high-order or
low-order byte is based on the value of the BYTE_ORDER static boolean
variable in class RealtimeSystem. The RawMemoryAccess class allows a
real-time program to implement device drivers, memory-mapped I/O,
flash memory, battery-backed RAM, and similar low-level software.

A raw memory area cannot contain references to Java objects. Such a
capability would be unsafe (since it could be used to defeat Java's
type checking) and error-prone (since it is sensitive to the specific
representational choices made by the Java compiler).
...."

And as for C/C++, well, consider: < note 1991! >

http://groups.google.com/groups?selm=1991Sep12.170305.6639%40zoo.toronto.edu

">have been more sensible to have a "device" declarator with completely
>implementation-defined semantics rather than trying to overload volatile
>and cross our fingers and hope.

"volatile" was invented for device registers. The mistake was to overload
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

it for setjmp and signal handlers. A wise compiler writer, writing for
the systems-programming market, will know this and do things right."

regards,
alexander.

Patrick TJ McPhee

12 Oct 2002, 1:16:45 PM
In article <3DA746A0...@webmaster.com>,
David Schwartz <dav...@webmaster.com> wrote:

% The answer to this question is fairly simple, by the way. Any way
% another thread could get the address of that local variable, the
% pthread_mutex_[un]lock function could get it.

If it's a static local variable, you never need to take its address,
since different threads can get at it simply by calling the function
which declares the variable. A sophisticated optimiser might look at the
call graph and determine that pthread_mutex_lock() never results in the
function being called.

An optimiser which does this would result in broken code if the call
graph of pthread_mutex_lock(), or any other function for that matter,
were to change such that f_declaring_v() does get called. I would expect
that sort of analysis to be disabled for functions in shared objects,
as well as for multi-threaded compilation.
