
Interesting GCC optimization, but is it legal if POSIX thread support is claimed?


David Schwartz

Oct 25, 2007, 12:34:53 AM

Tomash Brecko, on the GCC mailing list, has pointed out that GCC may
optimize code like this:

if (set_v) v = 1;

to this:

cmpl $1, %eax ; test set_v
movl v, %edx ; load
adcl $0, %edx ; maybe add 1
movl %edx, v ; store

This is a pretty well known optimization, converting a conditional
branch to a conditional move. It's a win for some CPUs, since a
conditional move cannot be mispredicted. But consider:

int trylock()
{
    int res;
    res = pthread_mutex_trylock(&mutex);
    if (res == 0)
        ++acquires_count;
    return res;
}

Crap. This can race. If another thread acquires the mutex (perhaps
recursively) and increments 'acquires_count' while we try and fail to
lock it, that increment may be lost, and that's fatal.
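
At the C level, the transformed function behaves roughly like this (an
illustrative sketch of the effect of the optimization, not GCC's actual
output):

int trylock()
{
    int res = pthread_mutex_trylock(&mutex);
    int tmp = acquires_count;          /* unconditional load              */
    tmp = (res == 0) ? tmp + 1 : tmp;  /* conditional move, in a register */
    acquires_count = tmp;              /* unconditional store, even when
                                          the trylock failed              */
    return res;
}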

So the question is, is GCC violating POSIX when it makes this
optimization? If not, how do you implement a 'trylock' type function
and how much invalid code is out there?

DS

David Schwartz

Oct 25, 2007, 12:35:45 AM

The optimized code should be:

cmpl $0, 8(%ebp)
movl $1, %eax
cmove v, %eax ; load (maybe)
movl %eax, v ; store (always)

Sorry. Saw that a second after I pushed 'post'.

DS

Chris Friesen

Oct 25, 2007, 1:59:42 AM
David Schwartz wrote:

> int trylock()
> {
> int res;
> res = pthread_mutex_trylock(&mutex);
> if (res == 0)
> ++acquires_count;
> return res;
> }
>
> Crap. This can race. If another thread recursively acquires the mutex
> again while we try and fail to lock it, then an increment to
> 'acquires_count' may be lost and that's fatal.

I don't think the optimization is legal in threaded programs.
Essentially it could amount to the following code running while not
holding the mutex:

tmp = acquires_count;
tmp2 = tmp + 1;
if (res == 0) tmp = tmp2; (using cmove)
acquires_count = tmp;

This essentially reduces to writing to a variable while not holding the
associated mutex, and as such it should be invalid.

Chris

Zeljko Vrba

Oct 25, 2007, 2:33:38 AM
On 2007-10-25, David Schwartz <dav...@webmaster.com> wrote:
>
> So the question is, is GCC violating POSIX when it makes this
> optimization? If not, how do you implement a 'trylock' type function
> and how much invalid code is out there?
>
IIRC, concurrent stores to a memory location w/o holding an associated
mutex are invalid according to POSIX.

David Schwartz

Oct 25, 2007, 5:25:20 PM
On Oct 24, 11:33 pm, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
wrote:

> IIRC, concurrent stores to a memory location w/o holding an associated
> mutex are invalid according to POSIX.

That doesn't answer the question. The program only does concurrent
stores to a memory location w/o holding an associated mutex because of
the optimization. Without the optimization, it doesn't. The question
is whether the optimization is legal. If it's legal, how do you code a
trylock type function?

Or, to put it another way, if a POSIX-compliant compiler may add:

int j=i;
i=j;

randomly to your code, since it causes no harm in a single thread --
how can you ever ensure you don't write to a variable without holding
the associated mutex?

To make POSIX-compliant code even possible, surely optimizations that
add writes to variables must be prohibited. That is -- if POSIX
prohibits writing to a variable in certain cases only the programmer
can detect, then a POSIX-compliant compiler cannot write to a variable
except where explicitly told to do so. Any optimization that *adds* a
write to a variable that would not otherwise occur *must* be
prohibited.

But does this mean turning a conditional jump into a conditional move
must always be prohibited? That's kind of painful.

DS

David Schwartz

Oct 25, 2007, 5:26:44 PM
On Oct 24, 10:59 pm, Chris Friesen <cbf...@mail.usask.ca> wrote:

> I don't think the optimization is legal in threaded programs.
> Essentially it could amount to the following code running while not
> holding the mutex:
>
> tmp = acquires_count;
> tmp2 = tmp + 1;
> if (res == 0) tmp = tmp2; (using cmove)
> aquires_count = tmp;
>
> This essentially reduces to writing to a variable while not holding the
> associated mutex, and as such it should be invalid.

The code complies with POSIX if the optimization is allowed and
violates POSIX if the optimization is prohibited. So the question is
-- is the optimization allowed? For my reasoning why it shouldn't be,
see my reply to Zeljko Vrba's post.

DS

Chris Friesen

Oct 25, 2007, 5:57:46 PM
David Schwartz wrote:

> The code complies with POSIX if the optimization is allowed and
> violates POSIX if the optimization is prohibited.

I respectfully disagree with the above. The optimised code doesn't
comply with POSIX, because it ends up writing to a variable while not
holding the mutex. This is forbidden by POSIX.

Alternately, any compiler that does such an optimization is not POSIX
compliant.

Generally I agree with your reply to Zeljko Vrba.

Chris

Chris Friesen

Oct 25, 2007, 6:02:43 PM
David Schwartz wrote:

> But does this mean turning a conditional jump into a conditional move
> must always be prohibited? That's kind of painful.

As was mentioned on lkml, if the store was made to be conditional then
the write would only occur if the lock is held and everything would be fine.
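
Roughly, in the style of the assembly in the first post, the safe code
generation keeps a branch around the store (an illustrative sketch, not
actual compiler output):

cmpl $0, %eax              ; res == 0 ?
jne 1f                     ; trylock failed: skip the update entirely
movl acquires_count, %edx  ; load
addl $1, %edx              ; add
movl %edx, acquires_count  ; store happens only while the mutex is held
1: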

Chris

David Schwartz

Oct 25, 2007, 6:21:41 PM
On Oct 25, 2:57 pm, Chris Friesen <cbf...@mail.usask.ca> wrote:
> David Schwartz wrote:
> > The code complies with POSIX if the optimization is allowed and
> > violates POSIX if the optimization is prohibited.
>
> I respectfully disagree with the above. The optimised code doesn't
> comply with POSIX, because it ends up writing to a variable while not
> holding the mutex. This is forbidden by POSIX.

Sorry, I got it backwards. The question of whether the original code
is legal depends upon whether the optimization is allowed or not. If
the optimization is allowed, the code violates POSIX. However, if the
optimization is allowed, almost all code violates POSIX.

How can you avoid concurrent writes if the compiler is free to add:
int temp=i; i=temp;
randomly to your code anywhere it feels like it?

DS

llothar

Oct 25, 2007, 7:58:03 PM

Sorry, I don't get it. Can you please write/disassemble a complete code
snippet?
This code looks as broken as your first version. For example, what is
in the stack variable at [EBP+8]?

Zeljko Vrba

Oct 26, 2007, 1:44:29 AM
On 2007-10-25, David Schwartz <dav...@webmaster.com> wrote:
>
> But does this mean turning a conditional jump into a conditional move
> must always be prohibited? That's kind of painful.
>
Well, you don't have to prohibit it for non-memory operands. It's still
useful for calculating various intermediate results that either don't
get stored to memory or get stored into compiler's auxiliary data (i.e.
temporary data on the stack, not visible to the programmer). Applying
this optimization to (programmer-generated) local variables would require
the compiler to *prove* that no reference to that local variable has leaked
out of the function. [which might not even be difficult in the vast
majority of cases].

As for inserting int j = i; i = j; throughout the code.. The i=j write is
not considered as a "side-effect" _unless_ i is declared volatile. So as
far the C standard is concerned, I believe that this random insertion _is_
allowed. I'm not sure about POSIX, but.. are there any parts of POSIX that
*invalidate* the C standard, i.e. say something opposite?

David Schwartz

Oct 26, 2007, 1:46:50 AM

It's just a conversion of a conditional store to a conditional move.

DS

David Schwartz

Oct 26, 2007, 1:48:58 AM
On Oct 25, 10:44 pm, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
wrote:

> As for inserting int j = i; i = j; throughout the code.. The i=j write is
> not considered as a "side-effect" _unless_ i is declared volatile. So as
> far the C standard is concerned, I believe that this random insertion _is_
> allowed. I'm not sure about POSIX, but.. are there any parts of POSIX that
> *invalidate* the C standard, i.e. say something opposite?

What if the memory that 'i' is stored in is not always writable? What
if the conditional makes sure the pointer is valid?

Consider:

if (x_is_writable) x=0;

This is not the same as:

x=(x_is_writable) ? 0 : x;

Because if x is not writable, writing to x will cause a fault.

Another case is:

if (ValidPtr(j)) *j=0;

This is not the same as:

*j=(ValidPtr(j)) ? 0 : *j;

Because it is not safe to dereference 'j' if 'j' is not a valid
pointer.

Even ignoring pthreads issues, this optimization seems bogus to me on
a system that supports memory protection.

DS

Zeljko Vrba

Oct 26, 2007, 2:21:58 AM
On 2007-10-26, David Schwartz <dav...@webmaster.com> wrote:
>
> What if the memory that 'i' is stored in is not always writable? What
> if the conditional makes sure the pointer is valid?
>
Hmm, I think that this is again an issue for "volatile". To turn the
topic around a bit, Sun's C compiler has 5 optimization levels. Up to
a certain optimization level (-xO2), it assumes that all memory accesses
are "volatile", and this (presumably) would disable such
optimizations in these cases. Furthermore, the manual explicitly states
that code that relies on implicit volatile might break at higher
optimization levels.

Both of your examples contain side-effects, and I believe that you
should declare

>
> if (x_is_writable) x=0;
>

the x variable to be volatile in this case (the possible side-effect
being the triggering of memory access violation).

David Schwartz

Oct 26, 2007, 5:44:10 AM
On Oct 25, 11:21 pm, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
wrote:

> Both of your examples contain side-effects, and I believe that you
> should declare
>
> > if (x_is_writable) x=0;
>
> the x variable to be volatile in this case (the possible side-effect
> being the triggering of memory access violation).

That would be an utterly self-defeating thing to do. It makes no sense
to make a major pessimization to preserve a minor optimization.

If the compiler can insert writes to variables where the unoptimized
code flow would not do so, all potentially shared variables would have
to be volatile. That would slow a significant amount of code by an
order of magnitude across the board.

DS

Greg Herlihy

Nov 1, 2007, 12:17:34 AM
On Oct 26, 2:44 am, David Schwartz <dav...@webmaster.com> wrote:
> On Oct 25, 11:21 pm, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
> wrote:
>
> > Both of your examples contain side-effects, and I believe that you
> > should declare
>
> > > if (x_is_writable) x=0;
>
> > the x variable to be volatile in this case (the possible side-effect
> > being the triggering of memory access violation).
>
> That would be an utterly self-defeating thing to do. It makes no sense
> to make a major pessimization to preserve a minor optimization.

Declaring an object "volatile" is a pessimization only if that
particular object is not, in fact, volatile. Otherwise, not declaring
a volatile object "volatile" is an error. Because - unless a volatile
object is identified as such to the compiler - the compiler is free to
apply all kinds of optimizations (such as conditional moves) with that
object - optimizations that are safe only for nonvolatile objects. The
original example (shown below) clearly demonstrates one such
optimization that the compiler would not have generated - had the
compiler known that the value of acquires_count is volatile.

> If the compiler can insert writes to variables where the unoptimized
> code flow would not do so, all potentially shared variables would have
> to be volatile. That would slow a significant amount of code by an
> order of magnitude across the board.

If a function writes to a global variable along any one of its
execution paths, then that function is free to write to that same
variable along any other of its execution paths (because a caller
knows only that calling the function might write to the global
variable, the caller - to be on the safe side - has to assume that the
function will write to the global variable).

Therefore, in the original example, trylock() is free to write to
acquires_count - even when the write operation might not be strictly
necessary. Or to put it another way, no caller of trylock() should be
surprised if trylock() writes to acquires_count:

int acquires_count;

int trylock()
{
    int res;
    res = pthread_mutex_trylock(&mutex);
    if (res == 0)
        ++acquires_count;
    return res;
}

The bug in this program is that the acquires_count is a volatile
object - but not declared as such. Recall that a volatile variable is
one whose value can change asynchronously to the program's execution.
So even though acquires_count's value changes synchronously to the
execution of the thread that holds the mutex (and increments its
value), the value of acquires_count nonetheless changes asynchronously
with regard to the execution of other, suspended threads (and which
might be executing this same function.)

Therefore "acquires_count" is a volatile object - whether or not it is
declared "volatile" by the programmer. In other words, declaring an
object "volatile" does not confer volatility upon an object - instead
a "volatile" qualifier simply identifies those objects that are
volatile - to the compiler. So yes, most shared variables, being
volatile objects, have to be declared volatile in order to prevent
unsafe optimizations being applied upon them. And if declaring a
previously undeclared volatile object as "volatile" - has any effect
on existing code, then the existing code must be unsafe. (Otherwise
the compiler would not need to generate different code - now that it
knows that the object is volatile).

Greg


Dave Butenhof

Nov 1, 2007, 7:55:28 AM
Greg Herlihy wrote:

> int acquires_count;
>
> int trylock()
> {
> int res;
> res = pthread_mutex_trylock(&mutex);
> if (res == 0)
> ++acquires_count;
> return res;
> }
>
> The bug in this program is that the acquires_count is a volatile
> object - but not declared as such. Recall that a volatile variable is
> one whose value can change asynchronously to the program's execution.
> So even though acquires_count's value changes synchronously to the
> execution of the thread that holds the mutex (and increments its
> value), the value of acquires_count nonetheless changes asynchronously
> with regard to the execution of other, suspended threads (and which
> might be executing this same function.)

No, it's NOT volatile -- it's protected by a mutex.

This is an argument that's appeared before in this newsgroup. I know
what ISO C says, and it's a red herring in this context. What ISO C
allows simply doesn't matter.

The use of pthread_mutex_trylock() makes this a POSIX application, not
an ISO C application. As such, the platform, INCLUDING the compiler, is
subject to POSIX rules, not merely ISO C rules.

POSIX rules are a superset of ISO C, and in particular this code
fragment SHALL operate correctly without volatile under POSIX rules.

If the compiler is willing to make the "optimizations" under discussion
here, then it may well be a conforming ISO C compiler -- but it is NOT
part of a conforming POSIX system. (POSIX doesn't specifically speak of
a "POSIX conforming compiler", so I'll avoid that term... but it amounts
to the same thing.) This is no different from an OS that chooses to
implement pthread_mutex_trylock() without providing memory
synchronization. It's simply broken. Whether it would be broken in a
DIFFERENT context, without the mutex, is irrelevant. (Or, conversely, if
the use of an ISO C compiler and ISO-conforming optimization is
appropriate, then the use of POSIX mutex operations is inappropriate.)

Note that such optimizations may be made for non-threaded code; but code
built to run in a threaded environment will generally need to disable them
completely (unless the compiler can reliably perform the analysis necessary
to determine it's safe), perhaps based on the standard "thread environment"
compile options (-pthread, -mt, $(getconf _CS_POSIX_V7_THREADS_CFLAGS)).

POSIX is not just a set of C99 APIs -- it's an environment. That
environment must be consistent. If it isn't, then the environment isn't
POSIX and trying to write POSIX code is a waste of time.

David Schwartz

Nov 2, 2007, 12:01:27 PM
On Oct 31, 9:17 pm, Greg Herlihy <gre...@mac.com> wrote:

> Declaring an object "volatile" is a pessimization only if that
> particular object is not, in fact, volatile. Otherwise, not declaring
> a volatile object "volatile" is an error. Because - unless a volatile
> object is identified as such to the compiler - the compiler is free to
> apply all kinds of optimizations (such as conditional moves) with that
> object - optimizations that are safe only for nonvolatile objects. The
> original example (shown below) clearly demonstrates one such
> optimization that the compiler would not have generated - had the
> compiler known that the value of acquires_count is volatile

This is utterly and completely wrong. It is so wrong, almost all you
can do is just shudder at it.

As a simple thought experiment to see why this is wrong, consider the
case where the compiler documentation states, "if all accesses to a
variable used by multiple threads are protected by a mutex, there is
no need to declare the variable volatile"? Would you still insist that
it's an error to not declare it volatile?

> Therefore "acquires_count" is a volatile object - whether or not it is
> declared "volatile" by the programmer.

So what? The "volatile" keyword doesn't belong on every volatile
object any more than the "integer" keyword belongs on every variable
that stores an integer.

DS

Zeljko Vrba

Nov 2, 2007, 4:29:11 PM
On 2007-11-01, Dave Butenhof <david.b...@hp.com> wrote:
>
> The use of pthread_mutex_trylock() makes this a POSIX application, not
> an ISO C application. As such, the platform, INCLUDING the compiler, is
> subject to POSIX rules, not merely ISO C rules.
>

Hmm, a reasoning followed by a question.

Since mutexes are _not_ part of the C language (like e.g. Java's synchronized{}
blocks are), they have dynamic scope over shared variables. In other words,
given the following simple scenario:

int some_global;
pthread_mutex_t mtx;

void f(void)
{
    ++some_global;
}

void g(void)
{
    f();
}

void h(void)
{
    pthread_mutex_lock(&mtx);
    f();
    pthread_mutex_unlock(&mtx);
}

and given the existence of function pointers and general hardness of
constructing an accurate control flow graph of a C program (not made
easier by separate compilation), how is the compiler supposed to
know whether the some_global variable is protected by a mutex at the
moment when it is generating the code for function f() ?

How did the POSIX committee envision this to actually work?

I don't see a way to generate POSIX-compliant code unless the compiler assumes
"volatile" semantics for all variables unless it can be *proven* that they are
not shared.

Dave Butenhof

Nov 2, 2007, 5:40:35 PM

It certainly does NOT need to assume "volatile" or anything of the sort.

However it MAY need to disable some of the more aggressive code migration
optimizations when building code for a threaded environment.

OR, the compiler/linker can figure out ways to acquire sufficient
information to be intelligent about this sort of thing -- and, yes, that
requires a lot of knowledge and analysis. That's life.

David Schwartz

Nov 2, 2007, 5:56:52 PM
On Nov 2, 1:29 pm, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
wrote:

> Since mutexes are _not_ part of the C language (like e.g. Java's synchronized{}
> blocks are), they have dynamic scope over shared variables. In other words,
> given the following simple scenario:
>
> int some_global;
> pthread_mutex_t mtx;
>
> void f(void)
> {
>     ++some_global;
> }
>
> void g(void)
> {
>     f();
> }
>
> void h(void)
> {
>     pthread_mutex_lock(&mtx);
>     f();
>     pthread_mutex_unlock(&mtx);
> }

> and given the existence of function pointers and general hardness of
> constructing an accurate control flow graph of a C program (not made
> easier by separate compilation), how is the compiler supposed to
> know whether the some_global variable is protected by a mutex at the
> moment when it is generating the code for function f() ?

It doesn't know, and so cannot assume either that it is or that it is
not. It has to be able to generate code that works correctly in either
case. If it doesn't, that compiler cannot support the POSIX standard.

> How did the POSIX committee envision this to actually work?

I imagine the committee envisioned it would work the way it actually
does work. A very small class of optimizations that are legal in
single-threaded code or code that doesn't use memory protection are
prohibited in code that is multi-threaded or does use memory
protection.

> I don't see a way to generate POSIX-compliant code unless the compiler assumes
> "volatile" semantics for all variables unless it can be *proven* that they are
> not shared.

That's a real lack of imagination on your part. All that's needed are
a few rules such as that reads and writes may not be moved across
function boundaries if the function does not inline.

DS

Chris Friesen

Nov 3, 2007, 1:08:40 AM
David Schwartz wrote:

> So what? The "volatile" keyword doesn't belong on every volatile
> object any more than the "integer" keyword belongs on every variable
> that stores an integer.

I suspect that part of the problem is that the term "volatile" is
overloaded and can mean different things depending on who is talking.

I'm tempted to define a "volatile variable" as simply a variable that
has been declared using the "volatile" keyword.

By this definition, the "volatile" keyword does belong on every volatile
variable...because that is what it means to _be_ a volatile variable.

I wouldn't consider regular variables that happen to be accessed by
multiple threads or processes as volatile just because they're not
limited to single threads.

Chris

David Schwartz

Nov 3, 2007, 7:02:40 PM
On Nov 2, 10:08 pm, Chris Friesen <cbf...@mail.usask.ca> wrote:

> I wouldn't consider regular variables that happen to be accessed by
> multiple threads or processes as volatile just because they're not
> limited to single threads.

They're 'volatile' in the sense that they can change in ways that are
not visible to the compiler and therefore compiler optimizations that
are otherwise sensible may cause the program not to behave as
intended.

The problem is cases like this one (thanks to Hans-J. Boehm):

for(int i=0; i<100; i++)
{
    if (y) pthread_mutex_lock(&mutex);
    x++;
    if (y) pthread_mutex_unlock(&mutex);
}

Suppose compiler heuristics indicate that 'y' is almost always false.
The compiler has no idea what the pthread_mutex functions do, so it
must spill 'x' before calling them. So the compiler optimizes the code
as follows:

register=x;
for(int i=0; i<100; i++)
{
    if (y) { x=register; pthread_mutex_lock(&mutex); register=x; }
    register++;
    if (y) { x=register; pthread_mutex_unlock(&mutex); register=x; }
}
x=register;

Ack. This is a disaster. The mutex will not protect 'x' as intended.

Now, a compiler that supports multi-threaded code cannot do this
optimization. It will obviously break reasonable code in horrible
ways. The question is, how do you get a compiler not to make this
optimization?

There are only three choices:

1) Disable this optimization entirely and all others like it. At
least, do so for multi-threaded code and other types of code broken by
optimizations like this.

2) Require the programmer to mark shared variables in some way and
disable the optimization for variables so marked.

3) Use some kind of heuristic to figure out when this optimization
must be prohibited that never errs on the side that breaks code.

Note that, as I mentioned above, this optimization can break other
types of code too. For example, consider:

if(x_is_protected) mprotect(&x, ..., PROT_READ|PROT_WRITE);
x++;
if(x_is_protected) mprotect(&x, ..., PROT_NONE);

Spilling 'x' before calling 'mprotect' definitely won't work.

DS

Chris Friesen

Nov 3, 2007, 7:14:37 PM
David Schwartz wrote:

> So the compiler optimizes the code
> as follows:

> register=x;
> for(int i=0; i<100; i++)
> {
> if (y) { x=register; pthread_mutex_lock(&mutex); register=x; }
> register++;
> if (y) { x=register; pthread_mutex_unlock(&mutex); register=x; }
> }
> x=register;
>
> Ack. This is a disaster. The mutex will not protect 'x' as intended.

A sufficiently smart compiler could understand that for certain types of
code this is an invalid optimization. Thus, I would argue this is not a
volatile variable.

On the other hand, memory-mapping a hardware register is something that
it would not make sense to try and teach the compiler about, and thus it
makes sense that it would be labeled volatile.

> There are only three choices:
> 1) Disable this optimization entirely and all others like it. At
> least, do so for multi-threaded code and other types of code broken by
> optimizations like this.
>
> 2) Require the programmer to mark shared variables in some way and
> disable the optimization for variables so marked.
>
> 3) Use some kind of heuristic to figure out when this optimization
> must be prohibited that never errs on the side that breaks code.

Option 2 is a non-starter. We're never going to get all existing code
updated.

Would it be enough to simply not hoist speculative reads before memory
barriers or functions that may contain memory barriers, and not move
writes after such barriers/functions?

Chris

Greg Herlihy

Nov 6, 2007, 1:34:09 PM
On Nov 2, 8:01 am, David Schwartz <dav...@webmaster.com> wrote:
> On Oct 31, 9:17 pm, Greg Herlihy <gre...@mac.com> wrote:
>
> > Declaring an object "volatile" is a pessimization only if that
> > particular object is not, in fact, volatile. Otherwise, not declaring
> > a volatile object "volatile" is an error. Because - unless a volatile
> > object is identified as such to the compiler - the compiler is free to
> > apply all kinds of optimizations (such as conditional moves) with that
> > object - optimizations that are safe only for nonvolatile objects. The
> > original example (shown below) clearly demonstrates one such
> > optimization that the compiler would not have generated - had the
> > compiler known that the value of acquires_count is volatile
>
> This is utterly and completely wrong. It is so wrong, almost all you
> can do is just shudder at it.

The lack of any kind of threading model in C and C++ may be a moral
wrong, but the description above is correct with regard to the C (and
C++) program and memory models. In fact, the race condition in the
original program is a perfect example of what can go wrong when a
volatile object is not identified as such to an optimizing C or C++
compiler.

> As a simple thought experiment to see why this is wrong, consider the
> case where the compiler documentation states, "if all accesses to a
> variable used by multiple threads are protected by a mutex, there is
> no need to declare the variable volatile"? Would you still insist that
> it's an error to not declare it volatile?

In that case, the compiler would be implicitly declaring the shared
variable "volatile" - so whether or not the programmer explicitly
declared the variable as such would make no difference. So, the
answer in that case would be "no": there would be no bug - because the
shared object is still being declared a volatile object (only this
time the declaration is implicit - and is being supplied by the
compiler).

But barring such documentation, a missing "volatile" qualifier for a
volatile object is a programming error. In the original example, all
accesses to the acquires_count variable were protected by a mutex.
Nevertheless, the program still had a race condition - because
acquires_count was not explicitly declared volatile (and there was no
evidence to suggest that the compiler in question would - on its own -
implicitly declare acquires_count volatile).

> > Therefore "acquires_count" is a volatile object - whether or not it is
> > declared "volatile" by the programmer.
>
> So what? The "volatile" keyword doesn't belong on every volatile
> object any more than the "integer" keyword belongs on every variable
> that stores an integer.

"Integer" is not a keyword in C or C++, but "int" is. And as far as I
know, every integral type in those languages does have "int" in its
name. Besides, if only certain volatile objects need a "volatile"
qualifier, while others do not, how is a programmer to tell the two
apart?

Greg


Zeljko Vrba

Nov 6, 2007, 2:50:03 PM
On 2007-11-06, Greg Herlihy <gre...@mac.com> wrote:
>
> "Integer" is not a keyword in C or C++, but "int" is. And as far as I
> know, every integral type in those languages does have "int" in its
> name.
>
While I agree with the rest of your post, small nitpicking here: 'char' is
also an integer type :)

David Schwartz

Nov 6, 2007, 9:11:19 PM
On Nov 6, 10:34 am, Greg Herlihy <gre...@mac.com> wrote:

> In that case, the compiler would be implicitly declaring the shared
> variable "volatile"- so whether or not the programmer explicitly
> declared the variable as such - would make no difference. So, the
> answer is that case would be "no", there would be no bug - because the
> shared object is still being declared a volatile object (only this
> time the declaration is implicit - and is being supplied by the
> compiler).

> But barring such documentation,

In other words, in your fantasy world.

> a missing "volatile" qualifier for a
> volatile object is a programming error. In the original example, all
> accesses to the acquires_count variable were protected by a mutex.
> Nevertheless, the program still had a race condition - because
> acquires_count was not explicitly declared volatile (and there was no
> evidence to suggest that the compiler in question would - on its own -
> implicitly declare acquires_count volatile).

Nothing you say is applicable to the real world, for precisely the
reasons I explained.

DS

David Schwartz

Nov 6, 2007, 9:11:38 PM
On Nov 6, 11:50 am, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
wrote:

And, of course "unsigned long".

DS

David Schwartz

Nov 6, 2007, 9:23:40 PM
On Nov 3, 3:14 pm, Chris Friesen <cbf...@mail.usask.ca> wrote:

> Would it be enough to simply not hoist speculative reads before memory
> barriers or functions that may contain memory barriers, and not move
> writes after such barriers/functions?

You need to eliminate speculative writes. With that addition, I think
that covers everything. And note that not all speculative reads
must be exempt from hoisting, only those of objects whose address
could be known to the program.

DS

Logan Shaw

Nov 6, 2007, 10:04:30 PM

And just plain "unsigned", as in:

unsigned foo(unsigned x) {
    return x / 2;
}

And "short":

short bar(short x) {
    return x + 1;
}

Also, you can make a reasonable argument that C++'s "bool" is an
integral type that just doesn't happen to have a very wide range
of possible values.

- Logan

Frank Cusack

Nov 14, 2007, 3:24:24 PM
So what is the takeaway from this thread? Does gcc do bad things with
threaded code? Do I need to workaround this by disabling optimization
(ouch) or adding 'volatile' to shared global data (ugh)?

-frank

Chris Thomasson

Nov 14, 2007, 3:49:33 PM

"Frank Cusack" <fcu...@fcusack.com> wrote in message
news:m2ir44r...@sucksless.local...

> So what is the takeaway from this thread. Does gcc do bad things with
> threaded code? Do I need to workaround this by disabling optimization
> (ouch) or adding 'volatile' to shared global data (ugh)?
>

If GCC performs the optimization that David Schwartz pointed out, you're
basically screwed. AFAICT, GCC is totally busted if it allows stores to
escape a critical-section. This is a race-condition waiting to happen. I am
surprised if GCC would do such a thing!

Zeljko Vrba

Nov 15, 2007, 2:27:17 AM
On 2007-11-14, Chris Thomasson <cri...@comcast.net> wrote:
>
> If GCC performs the optimization that David Schwartz pointed out, you're
> basically screwed. AFAICT, GCC is totally busted if it allows stores to
> escape a critical-section. This is a race-condition waiting to happen. I am
>
How is the compiler supposed to know where a CS begins and ends? Should
it have knowledge of every imaginable official and unofficial API?

Ian Collins

Nov 15, 2007, 3:19:53 AM

In order to back that claim, can you point us to the section on critical
sections in the C standard?

--
Ian Collins.

Alexander Terekhov

Nov 15, 2007, 6:08:38 AM

http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/user-faq.html

"Given that I don't have access to a C++0x implementation, how much of
this applies?

[...]

There is no clear prohibition against "spurious" compiler-introduced
stores of the old value back into a global variable x, when such an
update is not called for by the source. This may cause a concurrent
update to x to be lost, somewhat unpredictably. Many compilers can
introduce such updates, though they rarely do so in cases in which it
turns out to matter. (See Boehm, Threads Cannot be Implemented as a
Library, PLDI 2005 for details.)

[...]

So how do I deal with potential spurious stores in current compilers?

The good news appears to be that these are introduced primarily in cases
in which they do not introduce a bug. If count is a potentially shared
variable, many compilers would compile the loop

for (p = q; p != 0; p = p->next)
    if (p -> data > 0) count++;

by unconditionally loading count into a register at the beginning of the
loop, and unconditionally writing it back at the end, even if the loop
happened to never update count because all the data was negative. This
can introduce a race and lose an update to count if it is being
concurrently modified by another thread, presumably because the
programmer is counting on the knowledge that none of the data is
positive.

Needless to say, such correctness arguments should be avoided.

Although it is common for compilers to generate technically incorrect
code (by proposed C++0x standards) for this code, the actual failure scenario
seems fairly far-fetched.

More dangerous miscompilations, such as the example described in Boehm,
Threads Cannot be Implemented as a Library, PLDI 2005 are fortunately
rare, and it is unclear that there are effective counter-measures, other
than modifying compilers to comply to the C++0x rules; significant
reduction of optimization level; using non-standard directives or
compiler switches that prevent such behavior for specified accesses,
variables, or compilation units; or manually checking the resulting
assembly code.

Explicit optimization-suppression directives in many current compilers
may allow this type of code to be safely written. Of course, all such
usage is non-standard and very likely also non-portable. "

HTH. HAND. :-)

regards,
alexander.

Chris Thomasson

Nov 15, 2007, 2:25:23 PM
"Ian Collins" <ian-...@hotmail.com> wrote in message
news:5q2dpbF...@mid.individual.net...

There is none. I thought that GCC could identify POSIX calls. Linux would
have a problem if GCC did something like this... Perhaps I am missing
something here.

Chris Thomasson

Nov 15, 2007, 2:26:44 PM
"Zeljko Vrba" <zvrba....@ieee-sb1.cc.fer.hr> wrote in message
news:slrnfjnt6l...@ieee-sb1.cc.fer.hr...

I was under the impression that POSIX puts some restrictions on compilers.
Humm... I can't really remember where I heard that right now, but I sure
think I did. Humm...

Chris Friesen

Nov 15, 2007, 3:28:13 PM
Zeljko Vrba wrote:
> On 2007-11-14, Chris Thomasson <cri...@comcast.net> wrote:
>
>>If GCC performs the optimization that David Schwartz pointed out, you're
>>basically screwed. AFAICT, GCC is totally busted if it allows stores to
>>escape a critical-section. This is a race-condition waiting to happen. I am
>>
>
> How is the compiler supposed to know where a CS begins and ends?

Usually it would involve some sort of barrier.

In glibc for instance the locking primitives add "memory" to the list of
clobbered registers in the inline assembly to tell gcc to not move stuff
across that chunk of code.
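
For reference, a minimal sketch of that kind of barrier as one might write
it by hand with GCC inline assembly (the empty asm with a "memory" clobber
tells gcc that any memory may have changed, so it cannot cache shared
values in registers across that point or move loads and stores past it):

#define compiler_barrier() __asm__ __volatile__("" ::: "memory")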

Chris

David Schwartz

Nov 15, 2007, 4:30:34 PM
On Nov 15, 12:28 pm, Chris Friesen <cbf...@mail.usask.ca> wrote:

> > How is the compiler supposed to know where a CS begins and ends?
>
> Usually it would involve some sort of barrier.
>
> In glibc for instance the locking primitives add "memory" to the list of
> clobbered registers in the inline assembly to tell gcc to not move stuff
> across that chunk of code.

Sadly, a barrier doesn't work. Look again at the example. All of
memory can be spilled before the barrier and reloaded after it, and
you're still screwed.

DS

David Schwartz

Nov 15, 2007, 4:32:23 PM
On Nov 15, 3:08 am, Alexander Terekhov <terek...@web.de> wrote:

> The good news appears to be that these are introduced primarily in cases
> in which they do not introduce a bug. If count is a potentially shared
> variable, many compilers would compile the loop
>
> for (p = q; p != 0; p = p->next)
> if (p -> data > 0) count++;
>
> by unconditionally loading count into a register at the beginning of the
> loop, and unconditionally writing it back at the end, even if the loop
> happened to never update count because all the data was negative.

Then how does the compiler even know that "count" is a legal,
accessible object? What if "if (p->data>0)" really means "if
(count_is_writable)"?

This seems to be an invalid optimization on any platform that supports
memory protection.

DS

Chris Friesen

Nov 15, 2007, 5:39:56 PM
David Schwartz wrote:

> Sadly, a barrier doesn't work. Look again at the example. All of
> memory can be spilled before the barrier and reloaded after it, and
> you're still screwed.

Hmm...right.

I think we're back to your earlier suggestion that the compiler should
not be allowed to add speculative writes, and may not speculatively read
objects whose address could be known to the program.

Chris

Dave Butenhof

Nov 15, 2007, 6:12:14 PM

The point is that POSIX puts restrictions on the behavior of a
conforming system. That includes library, kernel, and compiler. If the
RESULT doesn't behave like POSIX, then it's not POSIX.

A compiler that's part of a conforming POSIX system environment can't
generate code that breaks synchronization. How it and the rest of the
system accomplish that is unspecified.

Often, it means simply not performing risky optimizations. But if they
are enabled, then the system needs to be able to detect and avoid
performing them in "dangerous" areas of code. (A complicated problem,
but nothing's impossible.)

Marcin ‘Qrczak’ Kowalczyk

Nov 15, 2007, 7:17:22 PM
On Thu, 15-11-2007 at 07:27 +0000, Zeljko Vrba wrote:

> > If GCC performs the optimization that David Schwartz pointed out, you're
> > basically screwed. AFAICT, GCC is totally busted if it allows stores to
> > escape a critical-section. This is a race-condition waiting to happen. I am
>
> How is the compiler supposed to know where a CS begins and ends? Should
> it have knowledge of every imaginable official and unofficial API?

It is not supposed to be able to know for sure. It is supposed to
perform such optimizations only when it knows for sure that there
is *no* synchronization, or that variables involved are not shared
with other threads.

--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

David Schwartz

Nov 15, 2007, 7:25:41 PM
On Nov 15, 2:39 pm, Chris Friesen <cbf...@mail.usask.ca> wrote:

> I think we're back to your earlier suggestion that the compiler should
> not be allowed to add speculative writes, and may not speculatively read
> objects whose address could be known to the program.

It seems that GCC and the relevant standards will be moving in this
direction.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html

DS

David Schwartz

Nov 15, 2007, 7:27:09 PM
On Nov 15, 12:19 am, Ian Collins <ian-n...@hotmail.com> wrote:

> In order to back that claim, can you point us to the section on critical
> sections in the C standard?

Why would that matter? The C standard is not the only standard GCC
claims to support. For example, my man page for GCC says:

-fopenmp
    Enable handling of OpenMP directives "#pragma omp" in C/C++ and
    "!$omp" in Fortran. When -fopenmp is specified, the compiler
    generates parallel code according to the OpenMP Application
    Program Interface v2.5 <http://www.openmp.org/>.


So wouldn't a cite to the OpenMP standard do just as well? How about
POSIX?

DS

Frank Cusack

Nov 15, 2007, 8:35:27 PM

In the meantime, do I need to "protect" my shared data with 'volatile'?

-frank

Zeljko Vrba

Nov 16, 2007, 4:05:14 AM
On 2007-11-15, David Schwartz <dav...@webmaster.com> wrote:
>
> Then how does the compiler even know that "count" is a legal,
> accessible object? What if "if (p->data>0)" really means "if
> (count_is_writable)"?
>
Would not that qualify "count" as a volatile variable? Yes, we're returning
to the same old story, but I haven't yet seen a clear yes/no answer together
with an explanation _why_.

Alexander Terekhov

Nov 16, 2007, 5:23:48 AM

David Schwartz wrote:
>
> On Nov 15, 3:08 am, Alexander Terekhov <terek...@web.de> wrote:
>
> > The good news appears to be that these are introduced primarily in cases
> > in which they do not introduce a bug. If count is a potentially shared
> > variable, many compilers would compile the loop
> >
> > for (p = q; p != 0; p = p->next)
> > if (p -> data > 0) count++;
> >
> > by unconditionally loading count into a register at the beginning of the
> > loop, and unconditionally writing it back at the end, even if the loop
> > happened to never update count because all the data was negative.
>
> Then how does the compiler even know that "count" is a legal,
> accessible object? What if "if (p->data>0)" really means "if
> (count_is_writable)"?

Eh? Because it was defined as a legal accessible object. And writable as
well. That's apart from the fact that implementations are absolutely
free to turn "const" into "mutable" for objects with standard storage
duration (not some external stuff; and sig_atomic_t aside for a moment).

regards,
alexander.

David Schwartz

Nov 16, 2007, 1:56:59 PM
On Nov 15, 5:35 pm, Frank Cusack <fcus...@fcusack.com> wrote:

> > It seems that GCC and the relevant standards will be moving in this
> > direction.
> >http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html
>
> In the meantime, do I need to "protect" my shared data with 'volatile'?

I'm not sure that question can be answered. It depends upon exactly
what you're doing, what version of GCC you're using, and what your
requirements are.

I'm not sure such a grossly offensive workaround is needed for such an
obscure, improbable bug.

DS

David Schwartz

Nov 16, 2007, 1:58:40 PM
On Nov 16, 1:05 am, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
wrote:

> Would not that qualify "count" as a volatile variable?

It depends upon your definition of "volatile".

> Yes, we're returning
> to the same old story, but I haven't yet seen a clear yes/no answer together
> with an explenation _why_.

Of why what? Why you shouldn't use the keyword "volatile" in this
case? Because that keyword only has defined semantics for cases that
are nothing like this case. (For example, it has defined semantics
with signals and with longjmp.)

We don't want to throw some code at the problem so that it happens to
work. We want to use appropriate methods so that it is guaranteed to
work.

DS

David Schwartz

Nov 16, 2007, 4:57:41 PM
On Nov 16, 2:23 am, Alexander Terekhov <terek...@web.de> wrote:

> Eh? Because it was defined as a legal accessible object. And writable as
> well.

So it seems that there's no sense in having 'mprotect' given that a
compiler, if ever allowed to read or write a memory location, may move
or duplicate that read or write any place it pleases.

DS

Alexander Terekhov

Nov 17, 2007, 11:28:37 AM

Eh? What does mprotect() have to do with objects of standard storage
duration to begin with?

Keep in mind that "The behavior of this function is unspecified if the
mapping was not established by a call to mmap()."

regards,
alexander.

David Thompson

Nov 18, 2007, 9:52:49 PM
On Tue, 06 Nov 2007 21:04:30 -0600, Logan Shaw
<lshaw-...@austin.rr.com> wrote:

> David Schwartz wrote:
> > On Nov 6, 11:50 am, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
> > wrote:
> >> On 2007-11-06, Greg Herlihy <gre...@mac.com> wrote:
> >>
> >>> "Integer" is not a keyword in C or C++, but "int" is. And as far as I
> >>> know, every integral type in those languages does have "int" in its
> >>> name.
>
> >> While I agree with the rest of your post, small nitpicking here: 'char' is
> >> also an integer type :)
>

More precisely, (plain) char, signed char, and unsigned char are three
distinct integer types never named (nor declared) with 'int'.

> > And, of course "unsigned long".
>

> And just plain "unsigned", as in: <snip>
> And "short": <snip>

Those are _declarations_ (or more precisely the type-specifier
portion(s) of declarations) that don't include 'int', but specify a
type whose canonical name does.

> Also, you can make a reasonable argument that C++'s "bool" is an
> integral type that just doesn't happen to have a very wide range
> of possible values.
>

I'd call it more than a reasonable argument.
3.9.1[basic.fundamental]p7: Types bool, char, wchar_t, and the signed
and unsigned integer types are collectively called _integral_ types.
[where italicization, which I render as _ _, means definition]

And C99's _Bool similarly, except that it is categorized under
unsigned-integer rather than in the third sign-agnostic zone.
(C90 and C99, and C++, all have plain char in that third zone.)

C99 allows implementation-defined additional 'extended' integer types,
whose names (chosen by the implementor) might or might not use 'int'.

And in C (both C90 and C99) enum types are 'compatible' with one of
the (builtin) integer types; it's angels-on-a-pinhead whether this
makes them additional types that happen to be the same, or just
additional names for the same types. While in C++ they are distinct
types in the type system (e.g. for overload resolution) that are
really integers 'under the covers' and can silently be converted _to_
integer, but converted _from_ (or between) only with a cast.

- formerly david.thompson1 || achar(64) || worldnet.att.net

David Schwartz

Nov 19, 2007, 2:24:03 PM
On Nov 17, 8:28 am, Alexander Terekhov <terek...@web.de> wrote:

> > So it seems that there's no sense in having 'mprotect' given that a
> > compiler, if ever allowed to read or write a memory location, may move
> > or duplicate that read or write any place it pleases.

> Eh? What does mprotect() have to do with objects of standard storage
> duration to begin with?

The compiler does not know how the object was created and allocated.
It could be allocated by another module.

> Keep in mind that "The behavior of this function is unspecified if the
> mapping was not established by a call to mmap()."

The allocator could certainly use 'mmap' if it wanted to. Consider a C++
program where 'operator new' is modified to 'mmap' its own page and
make objects read-only or read-write as needed. The references in my
code could just as well be "foo.bar" and the same rules would apply.
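
A minimal sketch of that kind of arrangement (hypothetical helper names,
error handling omitted; mmap() returns page-aligned memory, which is what
mprotect() requires):

#include <stddef.h>
#include <sys/mman.h>

/* Give each protectable object its own mapping so its protection can be
   changed later with mprotect(). */
void *alloc_protectable(size_t size)
{
    return mmap(NULL, size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

void make_read_only(void *p, size_t size) { mprotect(p, size, PROT_READ); }
void make_writable(void *p, size_t size)  { mprotect(p, size, PROT_READ | PROT_WRITE); }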

DS

Alexander Terekhov

Nov 21, 2007, 9:02:33 AM

David Schwartz wrote:
>
> On Nov 17, 8:28 am, Alexander Terekhov <terek...@web.de> wrote:
>
> > > So it seems that there's no sense in having 'mprotect' given that a
> > > compiler, if ever allowed to read or write a memory location, may move
> > > or duplicate that read or write any place it pleases.
>
> > Eh? What does mprotect() have to do with objects of standard storage
> > duration to begin with?
>
> The compiler does not know how the object was created and allocated.
> It could be allocated by another module.

Revisit loop example and think of a "count" being an object of static
storage duration.

regards,
alexander.

David Schwartz

Nov 21, 2007, 7:00:42 PM
On Nov 21, 6:02 am, Alexander Terekhov <terek...@web.de> wrote:

> > The compiler does not know how the object was created and allocated.
> > It could be allocated by another module.

> Revisit loop example and think of a "count" being an object of static
> storage duration.

So long as the address of the object could never have been taken, the
optimization is safe. If the address has ever been taken, it could
have, at least in theory, been passed to 'mprotect'.

Certainly a member object of a C++ class passed in from another
function could be protected. Certainly a member of a structure passed
in from another function could be.

DS

Alexander Terekhov

Nov 22, 2007, 12:20:02 PM

David Schwartz wrote:
>
> On Nov 21, 6:02 am, Alexander Terekhov <terek...@web.de> wrote:
>
> > > The compiler does not know how the object was created and allocated.
> > > It could be allocated by another module.
>
> > Revisit loop example and think of a "count" being an object of static
> > storage duration.
>
> So long as the address of the object could never have been taken, the
> optimization is safe. If the address has ever been taken, it could
> have, at least in theory, been passed to 'mprotect'.

DS, mprotect() is about changing the memory protections initially
established by mmap(). And addresses of objects of static storage
duration have absolutely nothing to do with mmap()'s mappings.

regards,
alexander.

Chris Friesen

Nov 22, 2007, 3:44:54 PM
Alexander Terekhov wrote:

> DS, mprotect() is about changing the memory protections initially
> established by mmap(). And addresses of objects of static storage
> duration have absolutely nothing to do with mmap()'s mappings.

Consider the case where you allocate memory using mmap(), then set it up
as an alternate signal stack using sigaltstack(). It would be legal to
use mprotect() on it, and could also contain objects of static storage
duration.
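
A sketch of that scenario (assumed setup, error checking omitted):

#include <stddef.h>
#include <signal.h>
#include <sys/mman.h>

void setup_altstack(void)
{
    size_t len = SIGSTKSZ;
    stack_t ss;

    /* The alternate signal stack lives in an mmap()'d region, so calling
       mprotect() on it later is well-defined per POSIX. */
    ss.ss_sp = mmap(NULL, len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    ss.ss_size = len;
    ss.ss_flags = 0;
    sigaltstack(&ss, NULL);
}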

Chris

Alexander Terekhov

Nov 23, 2007, 3:45:05 AM

I suspect that you mean automatic storage duration, not static. And
good luck trying to run with write-protected stack. :-)

regards,
alexander.

Chris Friesen

Nov 23, 2007, 10:46:22 AM
Alexander Terekhov wrote:
> Chris Friesen wrote:

>>Consider the case where you allocate memory using mmap(), then set it up
>>as an alternate signal stack using sigaltstack(). It would be legal to
>>use mprotect() on it, and could also contain objects of static storage
>>duration.

> I suspect that you mean automatic storage duration, not static. And
> good luck trying to run with write-protected stack. :-)

Whoops. Yes, automatic of course.

As for running with a write-protected stack, it might be useful to
write-protect stack frames other than the currently active one in order
to test for memory tramplers.

Chris

Alexander Terekhov

Nov 26, 2007, 5:02:11 AM

That may well be so, but as far as standard is concerned, 'stack frames'
may be 'active' all the time because there is no constraint on what and
when implementation can write to 'stack frames'.

regards,
alexander.

Zeljko Vrba

Nov 26, 2007, 8:21:04 AM
On 2007-11-23, Chris Friesen <cbf...@mail.usask.ca> wrote:
>
> As for running with a write-protected stack, it might be useful to
> write-protect stack frames other than the currently active one in order
> to test for memory tramplers.
>
That would invalidate perfectly valid (non-malicious!) C code:

void g(int *x);
void f(void)
{
    int x;
    g(&x); // <- f's stack frame is WP at this point, per your suggestion
}

void g(int *x)
{
    ++*x;
}

David Schwartz

Nov 26, 2007, 12:24:46 PM
On Nov 26, 2:02 am, Alexander Terekhov <terek...@web.de> wrote:

> That may well be so, but as far as standard is concerned, 'stack frames'
> may be 'active' all the time because there is no constraint on what and
> when implementation can write to 'stack frames'.

There is a constraint, but it does not come from the C standard. It
comes from the existence of 'mprotect' on the platform. If you are
writing a compiler for a platform that supports memory protection, you
cannot insert uncoded reads or writes of memory that might be
protected or you render 'mprotect' impossible to use.

DS

David Schwartz

Nov 26, 2007, 12:25:39 PM
On Nov 26, 5:21 am, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
wrote:

> On 2007-11-23, Chris Friesen <cbf...@mail.usask.ca> wrote:
>
> > As for running with a write-protected stack, it might be useful to
> > write-protect stack frames other than the currently active one in order
> > to test for memory tramplers.
>
> That would invalidate perfectly valid (non-malicious!) C code:

So what?

> void g(int *x);
> void f(void)
> {
>     int x;
>     g(&x); // <- f's stack frame is WP at this point, per your suggestion
> }
>
> void g(int *x)
> {
>     ++*x;
> }

What's your point? That it's possible to use the technique he
suggested to write buggy code? Name any legitimate coding technique
for which that's not true.

DS

Zeljko Vrba

Nov 26, 2007, 3:03:14 PM
On 2007-11-26, David Schwartz <dav...@webmaster.com> wrote:
> On Nov 26, 5:21 am, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
> wrote:
>> On 2007-11-23, Chris Friesen <cbf...@mail.usask.ca> wrote:
>>
>> > As for running with a write-protected stack, it might be useful to
>> > write-protect stack frames other than the currently active one in order
>> > to test for memory tramplers.
>>
>> That would invalidate perfectly valid (non-malicious!) C code:
>
> So what?
>
So the platform is not even ANSI C, let alone POSIX.

>
> What's your point? That it's possible to use the technique he
> suggested to write buggy code? Name any legitimate coding technique
> for which that's not true.
>

What's buggy in the above code? The point is that "inactive" stack frames
may not be write-protected (as he suggested), otherwise perfectly valid
(both according to C standard and non-buggy) code ceases to function. Unless
his "test for memory tramplers" includes elaborate code in the signal handler
to perform "allowed" writes.

Chris Friesen

Nov 26, 2007, 5:09:47 PM
Zeljko Vrba wrote:

> What's buggy in the above code? The point is that "inactive" stack frames
> may not be write-protected (as he suggested), otherwise perfectly valid
> (both according to C standard and non-buggy) code ceases to function. Unless
> his "test for memory tramplers" includes elaborate code in the signal handler
> to perform "allowed" writes.

Or maybe I know in advance that for my particular code nothing should be
writing to previous stack frames, so I can mark them read-only. Only
the compiler does the optimization and does write to the stack,
triggering an exception.

The point is that it is possible to come up with a POSIX-legal case
where portions of the stack could be read-only. Therefore the compiler
would not be allowed to use the optimization in that case.

Chris

David Schwartz

Nov 27, 2007, 12:23:59 AM
On Nov 26, 12:03 pm, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr>
wrote:

> > What's your point? That it's possible to use the technique he
> > suggested to write buggy code? Name any legitimate coding technique
> > for which that's not true.

> What's buggy in the above code?

Umm, duh, it sets some memory write protected and then explicitly
writes to it. It's carefully constructed to fail, and it does fail, as
expected.

> The point is that "inactive" stack frames
> may not be write-protected (as he suggested), otherwise perfectly valid
> (both according to C standard and non-buggy) code ceases to function.

How is code that sets a chunk of memory to read-only and then writes
to it "perfectly valid" or "non-buggy"?!

> Unless
> his "test for memory tramplers" includes elaborate code in the signal handler
> to perform "allowed" writes.

Or he simply never performs writes of that kind.

Zeljko Vrba

Nov 27, 2007, 1:34:19 AM
On 2007-11-27, David Schwartz <dav...@webmaster.com> wrote:
>
> How is code that sets a chunk of memory to read-only and then writes
> to it "perfectly valid" or "non-buggy"?!
>
It seems we have a misunderstanding here. I understood his post to mean
that he wanted to make previous stack frames read-only for any kind of
program (i.e. as a general feature of the compilation/execution environment).

Alexander Terekhov

Nov 27, 2007, 8:28:09 AM

This is going in circles, DS. Here's my last try: how are you going to
'mprotect' an object of static storage duration (write-protected stack
aside for a moment) without triggering 'unspecified' behavior? Recall
that "behavior of [mprotect()] function is unspecified if the mapping
was not established by a call to mmap()."

regards,
alexander.

David Schwartz

Nov 27, 2007, 1:27:51 PM
On Nov 27, 5:28 am, Alexander Terekhov <terek...@web.de> wrote:

> This is going in circles, DS. Here's my last try: how are you going to
> 'mprotect' an object of static storage duration (write-protected stack
> aside for a moment) without triggering 'unspecified' behavior? Recall
> that "behavior of [mprotect()] function is unspecified if the mapping
> was not established by a call to mmap()."

I don't understand why the object has to be of static storage
duration. Why do you think that matters?

DS

Hallvard B Furuseth

Nov 30, 2007, 3:03:06 AM
Chris Thomasson writes:
>>> If GCC performs the optimization that David Schwartz pointed out, you're
>>> basically screwed. AFAICT, GCC is totally busted if it allows stores to
>>> escape a critical-section. This is a race-condition waiting to happen. I
>>> am surprised if GCC would do such a thing!
>>
>> In order to back that claim, can you point us to the section on critical
>> sections in the C standard?
>
> There is none. I thought that GCC could identify POSIX calls. Linux
> would have a problem if GCC did something like this... Perhaps I am
> missing something here.

No good in this case. You can write wrappers around functions like
pthread_mutex_trylock() and call the wrappers; gcc would then need to
identify them as well.

--
Hallvard
