standard C question

jpont

Jul 14, 2002, 2:33:50 AM

A lot of graphics engines and libraries say they are made in C but standard
C doesn't have functions for graphics and colors and stuff. So what do they
mean when they say that? They can't make it in C. I think they would have
to do a lot of it in Assembly.

Ben Pfaff

Jul 14, 2002, 2:39:52 AM

"jpont" <jpon...@attbi.com> writes:

Please don't multi-post. Cross-post if you feel a great need,
but multi-posting is antisocial.
--
"You know, they probably have special dorms for people like us."
--American Pie

Douglas A. Gwyn

Jul 15, 2002, 4:12:11 AM

jpont wrote:
> They can't make it in C.

Sure they can; just not using code that is guaranteed to work
unchanged on every platform.

Dan Pop

Jul 15, 2002, 6:33:10 AM

In <3D328415...@null.net> "Douglas A. Gwyn" <DAG...@null.net> writes:

>jpont wrote:
>> They can't make it in C.
>
>Sure they can;

It depends on what you mean by making it in C.

>just not using code that is guaranteed to work
>unchanged on every platform.

There is no way to do graphics with a C program whose behaviour is well
defined by the C standard.

You can do it with a C program that invokes undefined behaviour by either
calling functions it doesn't define or by manipulating invalid pointers
(pointers that don't point into any correctly allocated C object).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan...@ifh.de

James Kuyper

Jul 15, 2002, 9:31:19 AM

Dan Pop wrote:
...

> You can do it with a C program that invokes undefined behaviour by either
> calling functions it doesn't define or by manipulating invalid pointers
> (pointers that don't point into any correctly allocated C object).

"pointers that don't point into any correctly allocated C object" are
not necessarily "invalid pointers". They can be perfectly valid
pointers; their validity is simply not guaranteed by the C standard.
Their validity could be guaranteed by some other document, such as an
implementation's documentation of its extensions to the C standard
library.

Dan Pop

Jul 15, 2002, 10:49:52 AM

In <3D32CEA7...@gscmail.gsfc.nasa.gov> James Kuyper <kuy...@gscmail.gsfc.nasa.gov> writes:

>Dan Pop wrote:
>...
>> You can do it with a C program that invokes undefined behaviour by either
>> calling functions it doesn't define or by manipulating invalid pointers
>> (pointers that don't point into any correctly allocated C object).
>
>"pointers that don't point into any correctly allocated C object" are
>not necessarily "invalid pointers". They can be perfectly valid
>pointers; their validity is simply not guaranteed by the C standard.

What happens to a C program that tries to access a byte which is not
part of any properly allocated C object? Chapter and verse, please.

>Their validity could be guaranteed by some other document, such as an
>implementation's documentation of its extensions to the C standard
>library.

Are you sure you've read the subject line? The OP was obviously talking
about implementing graphics in the framework of the C standard, not
outside it.

Anything can be done in C, if we allow implementation extensions into the
picture. But it doesn't make much sense to talk about "standard C" in
this case. The only thing the C standard guarantees about it is that it
invokes undefined behaviour, i.e. nothing at all.

James Kuyper

Jul 15, 2002, 12:24:30 PM

Dan Pop wrote:
>
> In <3D32CEA7...@gscmail.gsfc.nasa.gov> James Kuyper <kuy...@gscmail.gsfc.nasa.gov> writes:

...

> >"pointers that don't point into any correctly allocated C object" are
> >not necessarily "invalid pointers". They can be perfectly valid
> >pointers; their validity is simply not guaranteed by the C standard.
>
> What happens to a C program that tries to access a byte which is not
> part of any properly allocated C object? Chapter and verse, please.

The behavior is undefined by the C standard, of course. That doesn't
make the pointer invalid. If the address of that byte is provided by an
implementation-defined source, I'd expect the results of accessing it to
be documented in the same location; of course, they might not be -
there are always QoI issues. If an implementation's documentation claims
that the pointer points at an object of the specified C type with
specified contents, then the C standard says precisely what that claim
means, in terms of what should happen when the pointer is actually used.

...

> >Their validity could be guaranteed by some other document, such as an
> >implementation's documentation of its extensions to the C standard
> >library.
>
> Are you sure you've read the subject line? The OP was obviously talking
> about implementing graphics in the framework of the C standard, not
> outside it.

I was commenting on the narrow issue of the validity of pointers that
are not guaranteed valid by the C standard itself. I was not commenting
on the more general topic brought up by the subject line.

Dan Pop

Jul 16, 2002, 4:48:23 AM

In <3D32F73E...@gscmail.gsfc.nasa.gov> James Kuyper <kuy...@gscmail.gsfc.nasa.gov> writes:

>Dan Pop wrote:
>>
>> In <3D32CEA7...@gscmail.gsfc.nasa.gov> James Kuyper <kuy...@gscmail.gsfc.nasa.gov> writes:
>...
>> >"pointers that don't point into any correctly allocated C object" are
>> >not necessarily "invalid pointers". They can be perfectly valid
>> >pointers; their validity is simply not guaranteed by the C standard.
>>
>> What happens to a C program that tries to access a byte which is not
>> part of any properly allocated C object? Chapter and verse, please.
>
>The behavior is undefined by the C standard, of course.

So, you can't do it in standard C, which is the point of this thread.

James Kuyper

Jul 16, 2002, 10:11:41 AM

Dan Pop wrote:
>
> In <3D32F73E...@gscmail.gsfc.nasa.gov> James Kuyper <kuy...@gscmail.gsfc.nasa.gov> writes:

...

> >The behavior is undefined by the C standard, of course.
>
> So, you can't do it in standard C, which is the point of this thread.

Messages are only supposed to address the original point of a thread,
and are never supposed to wander into related issues, much less
unrelated ones? That would make usenet much less interesting than it
actually is.

Dan Pop

Jul 16, 2002, 11:09:19 AM

In <3D34299D...@gscmail.gsfc.nasa.gov> James Kuyper <kuy...@gscmail.gsfc.nasa.gov> writes:

>Dan Pop wrote:
>>
>> In <3D32F73E...@gscmail.gsfc.nasa.gov> James Kuyper <kuy...@gscmail.gsfc.nasa.gov> writes:
>...
>> >The behavior is undefined by the C standard, of course.
>>
>> So, you can't do it in standard C, which is the point of this thread.
>
>Messages are only supposed to address the original point of a thread,
>and are never supposed to wander into related issues, much less
>unrelated ones?

When doing that, it should be *explicitly* mentioned, to avoid
confusions. Furthermore, there is little point in doing it if it
adds no value to the discussion. I doubt there is a single reader of
this newsgroup who isn't aware that an implementation may define things
left undefined by the standard, so I fail to see the point of your
contributions to this thread.

James Kuyper

Jul 16, 2002, 11:44:10 AM

Dan Pop wrote:
...

> When doing that, it should be *explicitly* mentioned, to avoid
> confusions. Furthermore, there is little point in doing it if it
> adds no value to the discussion. I doubt there is a single reader of
> this newsgroup who isn't aware that an implementation may define things
> left undefined by the standard, so I fail to see the point of your
> contributions to this thread.

I recognize your failure to see my point; I wish I had some idea how to
correct it.

Douglas A. Gwyn

Jul 16, 2002, 1:02:09 PM

Dan Pop wrote:
> So, you can't do it in standard C, which is the point of this thread.

No, the OP guessed that assembler had to be used since C couldn't
be used. He was wrong on both counts. Your pedantry obscures
the real issue that the OP misunderstood.

Dan Pop

Jul 17, 2002, 4:31:17 AM

The OP was confused, indeed, but your original reply could only increase
his confusion.

The main problem (in context) is that C has no low level I/O primitives.
If the I/O address space is distinct from the memory address space
(i.e. it cannot be accessed via C pointers *at all*), at least a couple
of functions that CANNOT be implemented in C (I/O port read and I/O port
write) are needed to create the infrastructure for a GUI.

Alexander Terekhov

Jul 17, 2002, 5:23:01 AM

Dan Pop wrote:
[...]

> The main problem (in context) is that C has no low level I/O primitives.

volatiles/memory mapped I/O [to begin with], stupid.

regards,
alexander.

Dan Pop

Jul 17, 2002, 7:41:46 AM

In <3D353775...@web.de> Alexander Terekhov <tere...@web.de> writes:

>Dan Pop wrote:
>[...]
>> The main problem (in context) is that C has no low level I/O primitives.
>
>volatiles/memory mapped I/O [to begin with], stupid.

How do you do memory mapped I/O in C if the underlying platform doesn't
support it?

Alexander Terekhov

Jul 17, 2002, 9:25:22 AM

Dan Pop wrote:
[...]

> How do you do memory mapped I/O in C if the underlying platform doesn't
> support it?

Uhmm. Do you have any suggestion for making it possible to do
[low level, volatile-semantics-based -- i.e. implementation-
defined anyway, well mostly (and static sig_atomic_t's aside,
of course)] memory-mapped I/O -- hardware registers, "ports",
whatever-you-call-it... in C/C++, if the underlying platform
doesn't support it?

What is your point?

--- "someone" wrote ---
"....
It's obvious that such code is by its very nature unportable;
none of my current processors has an 8051 at address 0xff24,
and even if they did, I certainly couldn't access it in this
way from a user application. On the other hand, it isn't
unreasonable to imagine a more or less portable implementation
of driver code for a memory mapped 8051, which would receive
the port addresses as arguments, and could be compiled by
different compilers. While the standard doesn't require
semantics of volatile that would make this work, it is
expressly the intent (at least in C90).

Whether volatile is the perfect solution or not is not the
question. It is the standard C/C++ solution. It works,
in practice."
^^^^^^^^^^^
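
A minimal sketch of the kind of driver code the quoted text describes, assuming a hypothetical device with a one-byte status register followed by a one-byte data register. The names `uart_regs`, `uart_putc`, `uart_puts` and the `TX_READY` bit are invented for illustration; no real 8051 layout is implied. The routines receive the register block's address as an argument and touch it only through `volatile` members, so the same source can be handed an ordinary object for testing or, on real hardware, an address whose meaning the implementation documents.

```c
#include <stddef.h>

/* Hypothetical register block; the layout is an assumption for illustration. */
struct uart_regs {
    volatile unsigned char status; /* bit 0 set: transmitter ready (assumed) */
    volatile unsigned char data;   /* write a byte here to transmit it */
};

#define TX_READY 0x01u

/* Send one byte if the device reports ready.
   Returns 1 on success, 0 if the device was busy. */
int uart_putc(struct uart_regs *dev, unsigned char c)
{
    if ((dev->status & TX_READY) == 0)
        return 0;
    dev->data = c; /* volatile store: the compiler may not drop it */
    return 1;
}

/* Send a string, returning the number of bytes the device accepted. */
size_t uart_puts(struct uart_regs *dev, const char *s)
{
    size_t n = 0;

    while (s[n] != '\0' && uart_putc(dev, (unsigned char)s[n]))
        n++;
    return n;
}
```

Passing the address of a plain `struct uart_regs` object keeps every access well defined by the standard; only the step of obtaining a pointer to real hardware depends on the implementation.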

regards,
alexander.

Dik T. Winter

Jul 17, 2002, 9:59:57 AM

In article <3D357042...@web.de> tere...@web.de writes:
>
> Dan Pop wrote:
> [...]
> > How do you do memory mapped I/O in C if the underlying platform doesn't
> > support it?
>
> Uhmm. Do you have any suggestion for making it possible to do

...

> doesn't support it?
>
> What is your point?

You suggested it:

>> The main problem (in context) is that C has no low level I/O primitives.
>
>volatiles/memory mapped I/O [to begin with], stupid.

--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Dan Pop

Jul 17, 2002, 10:40:18 AM

In <3D357042...@web.de> Alexander Terekhov <tere...@web.de> writes:

>Dan Pop wrote:
>[...]
>> How do you do memory mapped I/O in C if the underlying platform doesn't
>> support it?
>
>Uhmm. Do you have any suggestion for making it possible to do
>[low level, volatile-semantics-based -- i.e. implementation-
>defined anyway, well mostly (and static sig_atomic_t's aside,
>of course)] memory-mapped I/O -- hardware registers, "ports",
>whatever-you-call-it... in C/C++, if the underlying platform
>doesn't support it?
>
>What is your point?

Read again this thread, from the very beginning, and you may get my
point (yes, I know I'm overoptimistic here :-)

>--- "someone" wrote ---
>"....
> It's obvious that such code is by its very nature unportable;
> none of my current processors has an 8051 at address 0xff24,
> and even if they did, I certainly couldn't access it in this
> way from a user application. On the other hand, it isn't
> unreasonable to imagine a more or less portable implementation
> of driver code for a memory mapped 8051, which would receive
> the port addresses as arguments, and could be compiled by
> different compilers. While the standard doesn't require
> semantics of volatile that would make this work, it is
> expressly the intent (at least in C90).
>
> Whether volatile is the perfect solution or not is not the
> question. It is the standard C/C++ solution. It works,
> in practice."
> ^^^^^^^^^^^

Since I haven't seen this text posted in *this* thread, I can hardly
see how it fits *in context*. We were NOT debating the merits of
volatile in this discussion, but rather the possibility to do graphics
using (or even abusing) only the standard C features.

Then again, one who can't tell the difference between different
newsgroups can't be expected to be able to tell the difference between
different discussions on different topics.

Douglas A. Gwyn

Jul 17, 2002, 10:21:01 AM

Dan Pop wrote:
> If the I/O address space is distinct from the memory address space
> (i.e. it cannot be accessed via C pointers *at all*), at least a couple
> of functions that CANNOT be implemented in C (I/O port read and I/O port
> write) are needed to create the infrastructure for a GUI.

They still *can* be implemented in C, using platform-specific
extensions within a *conforming* implementation. Indeed, we've
been discussing such hardware I/O extensions for years in the
C standards committee (without arriving on a consensus spec).
Using such an API would be no different in principle from using
<stdio.h>, except for level of standardization.

Alexander Terekhov

Jul 17, 2002, 12:24:51 PM

Dan Pop wrote:
[...]

> Read again this thread, from the very beginning,

Done.

> and you may get my point

I'm still somewhat "uncertain".

> (yes, I know I'm overoptimistic here :-)

Indeed.

[...volatiles/memory-mapped I/O...]

> Since I haven't seen this text posted in *this* thread, I can hardly
> see how it fits *in context*.

The context is this:

: The main problem (in context) is that C has no low level I/O primitives.

Wrong.

: If the I/O address space is distinct from the memory address space
: (i.e. it cannot be accessed via C pointers *at all*),

Again, you've missed "volatile" here [I personally never heard
of doing (in standard C/C++) memory-mapped I/O without using
volatiles].

: at least a couple
: of functions that CANNOT be implemented in C (I/O port read and I/O port
: write) are needed to create the infrastructure for a GUI.

Still, to me, it doesn't mean that:

: So, you can't do it in standard C, which is the point of this thread.

Well, addressing "the point of this thread" you also wrote this:

: There is no way to do graphics with a C program whose
: behaviour is well defined by the C standard.
:
: You can do it with a C program that invokes undefined
: behaviour by either calling functions it doesn't define
: or by manipulating invalid pointers (pointers that don't
: point into any correctly allocated C object).

but others have already addressed this brilliant "answer",
so I'll refrain.

regards,
alexander.

Zack Weinberg

Jul 17, 2002, 1:20:16 PM

Alexander Terekhov <tere...@web.de> writes:
>: If the I/O address space is distinct from the memory address space
>: (i.e. it cannot be accessed via C pointers *at all*),
>
>Again, you've missed "volatile" here [I personally never heard of doing
>(in standard C/C++) memory-mapped I/O without using volatiles].

Dan is talking about the case where the I/O is NOT memory mapped, but
done with special hardware instructions. There is indeed no way within
standard C to do that kind of I/O.

There's another example of this problem, which comes up in just about
every hosted implementation of Standard C itself. To implement fopen()
on a modern hosted system, at some point control needs to transfer
across a hardware-enforced privilege boundary. This is invariably done
with special hardware instructions. Again, there is no way to write
that in standard C.

The two most common solutions to this problem are:

- The implementation can offer an extension to the language which lets
the programmer inject raw assembly instructions into a function
otherwise written in C.

- The implementation can offer an extension to the language which lets
the programmer write an entire translation unit in assembly language,
then use functions defined there from other translation units.

(Yeah, it's a little weird to describe the presence of a user-accessible
assembler as an extension to C, but that is indeed what it is from the
C standard's perspective.)
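
Both approaches come down to the same shape: a couple of primitives are supplied from outside the language, and everything built on them is ordinary C. A minimal sketch of that split, assuming a hypothetical `struct port_ops` interface and invented port numbers (`INDEX_PORT`, `DATA_PORT`). On real hardware the two function pointers would be backed by inline assembly or a separate assembly translation unit; here a simulated backend written in plain C stands in for them so the sketch stays self-contained.

```c
#include <stddef.h>

/* The two primitives that cannot be written in standard C on a machine with
   a separate I/O address space. A real backend would come from an inline-asm
   extension or an assembly translation unit; neither is shown here. */
struct port_ops {
    void (*write)(unsigned port, unsigned char value);
    unsigned char (*read)(unsigned port);
};

/* Hypothetical port numbers, invented for illustration. */
enum { INDEX_PORT = 0x10, DATA_PORT = 0x11 };

/* Ordinary C built on the primitives: set one palette entry. */
void set_palette(const struct port_ops *io, unsigned char index,
                 unsigned char r, unsigned char g, unsigned char b)
{
    io->write(INDEX_PORT, index);
    io->write(DATA_PORT, r);
    io->write(DATA_PORT, g);
    io->write(DATA_PORT, b);
}

/* Simulated backend standing in for the hardware: it records written values. */
static unsigned char sim_log[16];
static size_t sim_count;

static void sim_write(unsigned port, unsigned char value)
{
    (void)port;
    if (sim_count < sizeof sim_log)
        sim_log[sim_count++] = value;
}

static unsigned char sim_read(unsigned port)
{
    (void)port;
    return 0; /* the simulated device has nothing to report */
}

const struct port_ops sim_ops = { sim_write, sim_read };
```

Only the backend changes between platforms; `set_palette` and anything layered on it can be compiled unchanged by any conforming compiler.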

zw

Francis Glassborow

Jul 17, 2002, 2:16:30 PM

In article <3D359A53...@web.de>, Alexander Terekhov
<tere...@web.de> writes

>Again, you've missed "volatile" here [I personally never heard
>of doing (in standard C/C++) memory-mapped I/O without using
>volatiles].

You brought up memory-mapped I/O; that is NOT what was being referred to.
What happens if your I/O device has no address in the linear address
space being used by C (which is the way that the C abstract machine
treats memory)? I realise that you are not a native English speaker but
that does not excuse your repeated failure to understand what you are
commenting on.

--
Francis Glassborow ACCU
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

Alexander Terekhov

Jul 17, 2002, 2:40:15 PM

Zack Weinberg wrote:
[...]

> Dan is talking about the case where the I/O is NOT memory mapped, but
> done with special hardware instructions. There is indeed no way within
> standard C to do that kind of I/O.

Well, for example, I guess that some perfectly conforming C/C++ implementation
could "easily" [performance aside, using memory protection to trap writes and
reads to/from certain areas of "illegal memory"]... translate standard C/C++
memory-mapped I/O [at runtime] to *I/O instructions* -- the stuff "Dan is
talking about"... and I'm totally unaware of.

regards,
alexander.

Zack Weinberg

Jul 17, 2002, 4:06:25 PM

Alexander Terekhov <tere...@web.de> writes:
>Zack Weinberg wrote:
>[...]
>> Dan is talking about the case where the I/O is NOT memory mapped,
>> but done with special hardware instructions. There is indeed no way
>> within standard C to do that kind of I/O.
>
>Well, for example, I guess that some perfectly conforming C/C++
>implementation could "easily" [performance aside, using memory
>protection to trap writes and reads to/from certain areas of
>"illegal memory"]... translate standard C/C++ memory-mapped I/O [at
>runtime] to *I/O instructions* -- the stuff "Dan is talking about"...
>and I'm totally unaware of.

Sure, and such things have indeed been done, but that is still an
extension from the standard's point of view - the implementation is
defining behavior left undefined by the standard.

zw

those who know me have no need of my name

Jul 18, 2002, 1:51:41 AM

in comp.std.c i read:

>There is no way to do graphics with a C program whose behaviour is well
>defined by the C standard.
>
>You can do it with a C program that invokes undefined behaviour by either
>calling functions it doesn't define or by manipulating invalid pointers
>(pointers that don't point into any correctly allocated C object).

this is not strictly true for all cases. one can write regis, naplps or
other `data stream' oriented graphics without any non-standard code at all.
leaving it to the underlying system to move the instructions to the device,
whether that's memory mapped, a serial port or something else. in general
this sort of graphics processing isn't done, in some cases because it's
never been attempted, in others because the devices are now passe, but
there's no impediment to doing so.
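
A minimal sketch of the data-stream idea, swapping in the plain-text PPM (P3) image format for ReGIS or NAPLPS because its syntax is short enough to show in full. The function name `write_gradient_ppm` is invented for illustration. The program only writes text through `<stdio.h>`; what happens to that text afterwards (a viewer, a converter, a device) is left to the surrounding system.

```c
#include <stdio.h>

/* Write a width x height horizontal red gradient as a plain-text PPM (P3)
   image to the given stream. Returns 0 on success, -1 on bad arguments or
   a write error. */
int write_gradient_ppm(FILE *out, int width, int height)
{
    int x, y;

    if (width < 1 || height < 1)
        return -1;
    /* header: magic number, dimensions, maximum sample value */
    if (fprintf(out, "P3\n%d %d\n255\n", width, height) < 0)
        return -1;
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            /* red ramps from 0 to 255 across the row; green and blue stay 0 */
            int red = (width > 1) ? (x * 255) / (width - 1) : 0;

            if (fprintf(out, "%d 0 0\n", red) < 0)
                return -1;
        }
    }
    return ferror(out) ? -1 : 0;
}
```

Calling `write_gradient_ppm(stdout, 256, 64)` and redirecting the output to a file should give an image that a PPM-aware viewer can display, without the program itself using anything outside the standard library.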

--
bringing you boring signatures for 17 years

Alexander Terekhov

Jul 18, 2002, 11:48:44 AM

Apropos...

Francis Glassborow wrote:

> ...the way that the C abstract machine treats memory...

The IEEE Std 1003.1-2001 (POSIX.1) standard says:

http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap04.html#tag_04_10

"Memory Synchronization

Applications shall ensure that access to any memory location by more
than one thread of control (threads or processes) is restricted such
that no thread of control can read or modify a memory location while
another thread of control may be modifying it. Such access is
restricted using functions that synchronize thread execution and
also synchronize memory with respect to other threads. ...."

AFAIK, the POSIX standard itself doesn't define the term "memory location".
POSIX is "nothing" but C99 extension(s). So, I guess, the answer is
awaiting me somewhere inside ANSI+ISO+IEC+9899-1999.pdf and/or
ISO+IEC+9899+Cor1-2001.pdf [BTW, is there anything newer than that?].
The problem is that I can't find it. Can someone here, in this newsgroup
[*comp.lang.c++.moderated*, just-in-case ;-)], help me? TIA.

regards,
alexander.

Barry Margolin

Jul 18, 2002, 2:01:48 PM

In article <3D36E35C...@web.de>,

It sounds like "memory location" is being used to refer to the same thing
that the C standard calls "object".

--
Barry Margolin, bar...@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

Douglas A. Gwyn

Jul 18, 2002, 1:47:00 PM

Alexander Terekhov wrote:
> ... Can someone here, in this newsgroup ..., help me?

Please clearly state the issue or question.

Alexander Terekhov

Jul 19, 2002, 6:46:56 AM

The issue is also known as "word-tearing"; "false-sharing"
aside for a moment. The question is what the term
"memory location" means in the sense of writing *portable*
programs. AFAICS, the C standard partially addresses the "word-tearing"
problem -- for signal handling [static volatile sig_atomic_t],
but it is unclear to me what the intended meaning [if any]
of the "memory location" term is from the C standards committee's
POV.
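
For reference, a minimal sketch of the signal-handling case mentioned above: a `volatile sig_atomic_t` flag written from a handler. The names `got_signal`, `on_signal` and `demo_signal_flag` are invented for illustration. The signal is delivered with `raise`, so the handler runs synchronously and the result does not depend on timing.

```c
#include <signal.h>

/* A flag of the one type the standard promises can be accessed as a whole
   even in the presence of asynchronous interrupts. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1; /* a single store; it cannot be observed half-done */
}

/* Install the handler, raise SIGINT synchronously, and report the flag.
   Returns 1 if the handler ran, -1 if installing or raising failed. */
int demo_signal_flag(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    got_signal = 0;
    if (raise(SIGINT) != 0)
        return -1;
    return (int)got_signal;
}
```

This covers only a single flag shared with a handler; it says nothing about two threads updating adjacent bytes, which is the case the question is really about.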

regards,
alexander.

Alexander Terekhov

Jul 19, 2002, 7:07:16 AM

Barry Margolin wrote:
[...]

> It sounds like "memory location" is being used to refer to the same thing
> that the C standard calls "object".

struct object_t {
    char is_this_an_object_question_mark;
    char is_this_yet_another_object_question_mark;
};

char how_many_objects_do_we_have_here_question_mark[2]; // 3? 2? 1?

regards,
alexander.

t...@cs.ucr.edu

Jul 19, 2002, 11:21:11 AM

Alexander Terekhov <tere...@web.de> wrote:

Mathematicians never really define what a number is. Rather they give
axioms regarding how numbers work, i.e., properties of the operations
on numbers. I believe that the only way to understand the term
"memory location" is to take a similarly axiomatic view of the read
and write operations on memory locations, e.g., that non-volatile
memory locations are supposed to remember what was last written
to them, etc.

Tom Payne

t...@cs.ucr.edu

Jul 19, 2002, 11:22:57 AM

Alexander Terekhov <tere...@web.de> wrote:

Strictly speaking, we have none. Rather, we have the declaration of a
struct type, each instance of which is an object having two member
objects.

Tom Payne

James Kuyper

Jul 19, 2002, 11:15:01 AM

Alexander Terekhov wrote:
>
> Barry Margolin wrote:
> [...]
> > It sounds like "memory location" is being used to refer to the same thing
> > that the C standard calls "object".
>
> struct object_t {
> char is_this_an_object_question_mark;

Not as such. It is a name that can be used to refer to an object, but
until an object of 'struct object_t' type has been defined, there is no
sub-object named 'is_this_an_object_question_mark'.

> char is_this_yet_another_object_question_mark;

It's the name of a different sub-object, but it's not the name of an actual
sub-object yet.

> };
>
> char how_many_objects_do_we_have_here_question_mark[2]; // 3? 2? 1?

That's one aggregate object, with two sub-objects. You could say that
this makes a total of 3 overlapping objects; but it would be misleading
to say so without the adjective "overlapping".

If you added the following code:

struct object_t my_object;

Then 'my_object' would be another aggregate object, containing two
distinct sub-objects.
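
A minimal sketch that checks these statements from inside a program, using shorter member names than the original (`first`, `second`) and an array called `pair`; the two helper functions are invented for illustration. It relies only on guarantees the standard gives: the first member of a struct sits at offset 0, members appear in declaration order, and array elements are adjacent.

```c
#include <stddef.h>

struct object_t {
    char first;
    char second;
};

/* Each definition creates one aggregate object with two char sub-objects. */
struct object_t my_object;
char pair[2];

/* Returns 1 if both members are distinct and lie inside my_object's storage. */
int members_lie_inside_struct(void)
{
    const char *base = (const char *)&my_object;
    const char *a = &my_object.first;
    const char *b = &my_object.second;

    return a == base                    /* first member at offset 0 */
        && b > a                        /* distinct, in declaration order */
        && b < base + sizeof my_object; /* still inside the aggregate */
}

/* Returns 1 if the two array elements are adjacent and fill the array. */
int elements_are_adjacent(void)
{
    return &pair[1] == &pair[0] + 1 && sizeof pair == 2 * sizeof pair[0];
}
```

Both functions return 1 on a conforming implementation, which is the "overlapping objects" picture in code: the sub-objects occupy storage that also belongs to the aggregate.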

James Kuyper

Jul 19, 2002, 12:12:14 PM

That's true of the struct declaration, but the array declaration also
happens to be a definition, so it creates an array object with two
'char' sub-objects.

Alexander Terekhov

Jul 19, 2002, 12:34:06 PM

James Kuyper wrote:
[...]

> > char how_many_objects_do_we_have_here_question_mark[2]; // 3? 2? 1?
>
> That's one aggregate object, with two sub-objects. You could say that
> this makes a total of 3 overlapping objects; but it would be misleading
> to say so without the adjective "overlapping".
>
> If you added the following code:
>
> struct object_t my_object;
>
> Then 'my_object' would be another aggregate object, containing two
> distinct sub-objects.

Let's now count the number of "memory locations" [a.k.a. memory
"granules"]... anyone?

regards,
alexander.

James Kuyper

Jul 19, 2002, 1:10:50 PM

If I understand you correctly, you're wondering whether the POSIX
wording about memory synchronization between threads applies to my_object
as a whole, or only to the two sub-objects separately? That's a good
question; for a POSIX newsgroup. C doesn't use the concept of "memory
location", and it's only an educated guess that it might mean the same
as the C concept of an "object". It could easily instead mean "the
single word of memory at a specified location" i.e. sig_atomic_t.

I'd recommend checking with comp.std.unix.

t...@cs.ucr.edu

Jul 19, 2002, 1:19:55 PM

James Kuyper <kuy...@gscmail.gsfc.nasa.gov> wrote:

Oops. Thanks. I missed the array declaration.

Tom Payne

t...@cs.ucr.edu

Jul 19, 2002, 1:43:25 PM

Alexander Terekhov <tere...@web.de> wrote:

I don't find the term "memory location" in the index of the standard,
but given a collection of objects I would define the number of
locations to be the sum of the sizes of those objects that aren't
subobjects of others. Wouldn't you?

Tom Payne

Alexander Terekhov

Jul 19, 2002, 2:09:25 PM

James Kuyper wrote:
[...]

> I'd recommend checking with comp.std.unix.

comp.std.unix is an 'almost-dead' newsgroup, probably
because it's moderated. ;-)

Well, actually I've submitted an Aardvark "comment"
directly to the Austin Group folks. Ha! Folks there have
a REALLY good sense of humor... here is the response:

http://www.opengroup.org/austin/docs/austin_107.txt

"....
Our advice is as follows.

Hardware that does not allow atomic accesses cannot have
a POSIX implementation on it.

We propose no changes to the standard.
...."

I've replied:

Excuse me, but does this mean that the following
(piece of informal memory model semantics) applies
to POSIX "memory model" as well:

http://www.cs.umd.edu/~pugh/java/memoryModel/semantics.pdf

"....
The fact that two variables may be stored in ad-
jacent bytes (e.g., in a byte array) is immaterial.
Two variables can be simultaneously updated by
different threads without needing to use synchro-
nization to account for the fact that they are
'adjacent'. Any word-tearing must be invisible
to the programmer.
...."

<?>

And/or, perhaps, what is actually meant by the "advice"
above is that the *undefined* term "memory location"
is actually a substitute of *defined* POSIX "Byte" term
(C-"char"/CHAR_BIT==8 required) or something like that?

And got [thus far], billion+1-billion-1 answers.

regards,
alexander.

Alexander Terekhov

Jul 19, 2002, 2:25:00 PM

t...@cs.ucr.edu wrote:
[...]

> I don't find the term "memory location" in the index of the standard,
> but given a collection of objects I would define the number of
> locations to be the sum of the sizes of those objects that aren't
> subobjects of others. Wouldn't you?

http://tru64unix.compaq.com/docs/base_doc/DOCUMENTATION/V51A_HTML/ARH9RBTE/DOCU0007.HTM#gran_sec
(3.7 Granularity Considerations)

"Granularity refers to the smallest unit of storage (that is, bytes,
words, longwords, or quadwords) that a host computer can load or
store in one machine instruction. Granularity considerations can
affect the correctness of a program in which concurrent or
asynchronous access can occur to separate pieces of data stored
in the same memory granule. This can occur in a multithreaded
program, where different threads access the data, or in any
program that has any of the following characteristics:

- Accesses data in memory that is shared with other processes

- Accesses data that can be accessed by asynchronous device
drivers, signal handlers (on Tru64 UNIX), or ASTs (on OpenVMS)

- Accesses data objects that can be accessed by a continuable
exception handler

The subsections that follow explain the granularity concept,
why it can affect the correctness of a multithreaded program,
and techniques the programmer can use to prevent the granularity-
related race condition known as word tearing. ...."
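
A minimal sketch of word tearing, written as a deterministic single-threaded simulation rather than with real threads. It assumes a hypothetical machine whose smallest load or store is two bytes wide; `granule_load`, `granule_store` and `simulate_word_tearing` are invented names. The unlucky interleaving is spelled out by hand, so the result does not depend on scheduling.

```c
#include <string.h>

/* Shared storage: "thread A" owns data[0], "thread B" owns data[1]. */
static unsigned char data[2];

/* Model of a machine that can only move the whole 2-byte granule at once. */
static void granule_load(unsigned char copy[2])
{
    memcpy(copy, data, 2);
}

static void granule_store(const unsigned char copy[2])
{
    memcpy(data, copy, 2);
}

/* Play out one unlucky interleaving by hand.
   Returns the final value of data[1], which B tried to set to 7. */
unsigned char simulate_word_tearing(void)
{
    unsigned char a_copy[2], b_copy[2];

    data[0] = 0;
    data[1] = 0;

    granule_load(a_copy);  /* A starts a read-modify-write of data[0] */
    granule_load(b_copy);  /* B starts a read-modify-write of data[1] */

    b_copy[1] = 7;
    granule_store(b_copy); /* B finishes: data is now {0, 7} */

    a_copy[0] = 1;
    granule_store(a_copy); /* A finishes with a stale byte 1: data is {1, 0} */

    return data[1];        /* B's update has been overwritten */
}
```

The function returns 0, not 7: each "thread" touched only its own byte in the source, yet B's store was lost because both bytes share one granule.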

regards,
alexander.

t...@cs.ucr.edu

Jul 20, 2002, 6:42:57 AM

Zack Weinberg <za...@panix.com> wrote:

The purpose of the Standard is to say what externally observable
behavior is acceptable from a given program in a given situation.
The Standard provides for two situations where a program's behavior is
externally observable (i.e., has an external effect):

* accessing volatile variables (which are to some extent abstractions
of I/O registers)

* accessing files via functions defined in the standard library.

Such behavior is well-defined, but the Standard does not
specify/define what will be the external effect when the program
engages in these behaviors, e.g., what pixel of what screen will light
up at what intensity of what color. One can imagine a strictly
conforming program that has one effect on screens of one system and a
different effect on the screens of another.

Tom Payne

Douglas A. Gwyn

Jul 21, 2002, 11:05:53 PM

Alexander Terekhov wrote:
> "Douglas A. Gwyn" wrote:

> > Please clearly state the issue or question.
> The issue is also known as "word-tearing"; "false-sharing"
> aside for a moment. The question is what does the term
> "memory location" mean in the sense of writing *portable*
> programs. AFAICS, C standard partially addresses "word-tearing"
> problem -- for signal handling [static volatile sig_atomic_t],
> but it is unclear to me what is the intended meaning [if any]
> of the "memory location" term from the C standards committee
> POV.

That still doesn't clearly state an issue. Do you have sample
code that will illustrate a specific problem?

What part of the C standard uses the term "memory location"?
I just searched the PDF version of ISO/IEC 9899:1999 and didn't
find that phrase used in the text.

Alexander Terekhov

Jul 24, 2002, 6:18:12 AM

< c.p.t. added >

"Douglas A. Gwyn" wrote:
[...]

> > > Please clearly state the issue or question.

> > The issue is also known as "word-tearing"; "false-sharing"
> > aside for a moment. The question is what does the term
> > "memory location" mean in the sense of writing *portable*
> > programs. AFAICS, C standard partially addresses "word-tearing"
> > problem -- for signal handling [static volatile sig_atomic_t],
> > but it is unclear to me what is the intended meaning [if any]
> > of the "memory location" term from the C standards committee
> > POV.
>
> That still doesn't clearly state an issue. Do you have sample
> code that will illustrate a specific problem?

< non-standard C/SEH exception handling with resumption aside >

char data[2]; // thread/process A reads/writes data[0]
// thread/process B reads/writes data[1]

> What part of the C standard uses the term "memory location"?
> I just searched the PDF version of ISO/IEC 9899:1999 and didn't
> find that phrase used in the text.

Yeah, I know. Consider: <copy&paste>

----

Newsgroups: comp.lang.c++.moderated, comp.programming.threads
Subject: Re: MT design advice
Date: 23 Jul 2002 18:49:05 -0400

Alexander Terekhov wrote:

> Excuse me, but does this mean that the following
> (piece of informal memory model semantics) applies
> to POSIX "memory model" as well:
>
> http://www.cs.umd.edu/~pugh/java/memoryModel/semantics.pdf
>
> "....
> The fact that two variables may be stored in ad-
> jacent bytes (e.g., in a byte array) is immaterial.
> Two variables can be simultaneously updated by
> different threads without needing to use synchro-
> nization to account for the fact that they are
> 'adjacent'. Any word-tearing must be invisible
> to the programmer.
> ...."

No. There's a problem here. The Java memory model is a LANGUAGE rule. "By
whatever means necessary, a proper implementation of the language must
achieve this behavior." Fine. A programmer can count on that as long as the
Java implementation claims conformance.

POSIX rules are trickier. The memory model imposes restrictions on the
hardware and on the API. However, in general you're not accessing those two
bytes through the API, nor using hardware data access instructions. You're
writing C (or C++) language code that a compiler will translate, at its
discretion, into hardware instructions that meet the C/C++ language
definitions.

The C and C++ languages do not have such rules for normal accesses. The
compiler is allowed to generate 64-bit wide access instructions for your
"char c; ... c++;" sequence. And if it does, neither the API nor the
hardware rules do you any good. Java doesn't have that problem. IF the C
and/or C++ languages choose to address multiprocessing, this is something
that they could require. But it will have a cost, because on some machines
the conforming code will be suboptimal. (This is equally true for Java, of
course, and C++ could decide to make the same tradeoffs.)

--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\-------------[ http://homepage.mac.com/~dbutenhof ]--------------/

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]

----

Well, Butenhof also wrote this [there was a rather long thread
"Re: POSIX threads and word tearing" spanning over even comp.std.c
about a year or so ago]:

< annotations are mine ;-) >

"You don't run into problems except with adjacent data of relatively
small size that have different "sharing scopes". This is much like
saying you don't run into problems as long as you don't assume that
a byte is exactly 8 bits. It's not hard to do, but it is a constraint,
and code that doesn't accept the constraint isn't portable. ..."

Ha! Note that POSIX does state that "a byte is exactly 8 bits"! ;-)

"...
While it would have been nice if POSIX could have specified "small"
and "close" such that one could portably use data that is "big enough"
and/or "sufficiently far". This might even have been generally acceptable
for "long", though probably not for "int", and I can't imagine trying to
get consensus on an acceptable definition of "far" for "small" data.

Of course, it really did do this, though not directly or in a manner
quite as generally useful as you might wish. The definition of the type
sig_atomic_t is sufficient for the needs. ..."

I don't think so. >>static volatile sig_atomic_t<< is meant for async.
signals ONLY [async.access from the programmer's point of view, but in
no relation to threading whatsoever... and THAT asynchrony could even
be easily SUPPRESSED by the implementation (masking signals simply to
"protect" access to static volatile sig_atomic_t variables), to begin
with], in my school-of-thoughts.
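
For reference, the one use the C standard itself describes for this type can be sketched as below. The handler and function names are hypothetical. No second thread is involved: raise() does not return until the handler has run in the calling thread.

```c
#include <signal.h>

/* The only kind of static object a handler may portably write to. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises the signal, and reports the flag.
   Returns -1 if the handler could not be installed or raised. */
int raise_and_check(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    return (int)got_signal;
}
```

The guarantee covers a handler interrupting the code that reads the flag. It says nothing about another thread or processor writing to a neighbouring byte.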

"...Therefore, by POSIX, data of
type sig_atomic_t is safe, and all "nearby" data with different sharing
scope must, to be fully portable, be of type sig_atomic_t. If you're
willing to make reasonable pragmatic inferences that will be safe on
all likely "mainstream" systems, you could extend that guarantee to
data the size of sig_atomic_t, or placed so that each smaller item is
not within sizeof(sig_atomic_t) bytes of either the previous or following
item. This is, in fact, what most portable threaded code does. ..."
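
A rough sketch of that spacing rule, under the assumption that sizeof(sig_atomic_t) really is the relevant granule on the target. The type and function names are hypothetical, and this is a pragmatic convention rather than anything POSIX or the C standard promises.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical wrapper: a char that occupies at least
   sizeof(sig_atomic_t) bytes and shares its alignment. */
typedef union {
    char value;
    sig_atomic_t spacer;   /* never accessed; only forces size and alignment */
} spaced_char;

struct shared_flags {
    spaced_char a;   /* written only by thread A */
    spaced_char b;   /* written only by thread B */
};

/* Distance in bytes between the two small items. */
size_t flag_distance(void)
{
    return offsetof(struct shared_flags, b) - offsetof(struct shared_flags, a);
}

/* Checks the spacing convention described in the quoted text. */
void check_flag_spacing(void)
{
    assert(flag_distance() >= sizeof(sig_atomic_t));
}
```

The cost is space: every small item grows to the size of the union.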

I don't think that this would be *truly portable*, though.

"...
To do much better than this requires an ABI (binary interface for a
particular machine) or an abstract machine model (like Java, which has
its own set of problems). Language standards like C and C++, if they
chose to recognize the existence of threads, could easily place constraints
on the language that, for example, data of "int" or "long", or aggregate
data of at least that size, must be protected from corruption by concurrent
writes to adjacent data of similar characteristic."

But THAT would be in the scope of comp.std.c{++}, I guess. ;-)

regards,
alexander.

t...@cs.ucr.edu

Jul 24, 2002, 7:58:10 AM

In comp.std.c Alexander Terekhov <tere...@web.de> wrote:
: < c.p.t. added >

: "Douglas A. Gwyn" wrote:
: [...]

:> > > Please clearly state the issue or question.

[... long and helpful reply that included quotes from Dave Butenhof ...]

Thanks. I think I finally see what the question is.

It seems to me that threads can reasonably be expected to restrict
their attention to:
- automatic objects, where false sharing seems not to be an issue,
- members of structs to which they've acquired exclusive access
in a monitor-like fashion,
- dynamically allocated objects, i.e., segments returned by malloc.
If I'm looking at things correctly, it would seem to suffice to require
that objects of the last two categories be word aligned. Right?

Tom Payne

t...@cs.ucr.edu

Jul 24, 2002, 8:02:33 AM

In comp.std.c t...@cs.ucr.edu wrote:
[...]
: It seems to me that threads can reasonably be expected to restrict
: their attention to:
: - automatic objects, where false sharing seems not to be an issue,
: - members of structs to which they've acquired exclusive access
: in a monitor-like fashion,
: - dynamically allocated objects, i.e., segments returned by malloc.
: If I'm looking at things correctly, it would seem to suffice to require
: that objects of the last two categories be word aligned. Right?

Oops. For the second category, I meant that the structs themselves
(rather than the individual members) should be word aligned.

Tom Payne

Alexander Terekhov

Jul 24, 2002, 11:14:57 AM

Well, standard library aside, I'd probably {still ;-)} want
something along the lines of: <copy&paste>

Newsgroups: comp.programming.threads, comp.std.c
Subject: Re: Most conforming POSIX threads implementation
Date: Wed, 18 Jul 2001 20:58:42 +0200
Message-ID: <3B55DC62...@web.de>

James Kuyper wrote:

[...]
> Then I'm confused. I traced your discussion back before sending that
> message, and came away with the impression that you were arguing for
> different members of an array to be stored in different blocks of
> memory.

it seems that there is no portable way to fight word tearing race
condition.. how about yet another 'granularizer' ;-) qualifier:

/* distinct */ char byte1; // should be word tearing safe
/* distinct */ char byte2; // should be word tearing safe
distinct char byteArr[] = { 'a','b' }; // should be word tearing safe
distinct char* bytePtr = byteArr; // should be word tearing safe
struct { distinct char a,b; } ab = { 'a','b' }; // should be word tearing safe
char _byteArr[] = { 'a','b' }; // could be word tearing unsafe
char* _bytePtr = _byteArr; // could be word tearing unsafe
bytePtr = _byteArr; // COMPILE ERROR!!
_bytePtr = byteArr; // COMPILE ERROR!!
bytePtr = _bytePtr; // COMPILE ERROR!!
_bytePtr = bytePtr; // COMPILE ERROR!!
// sizeof( byteArr ) >= sizeof( _byteArr ) // extra space could be added!

btw, that is actually an 'existing practice' already.
well, sort of..

Compaq uses 'volatile' qualifier to ensure word tearing
safe programming (basically switching over to single
byte granularity which could require software emulation
on older processors):

http://tru64unix.compaq.com/faqs/publications/base_doc/DOCUMENTATION/V51_HTML/ARH9RBTE/DOCU0007.HTM#gran_sec
http://tru64unix.compaq.com/faqs/publications/base_doc/DOCUMENTATION/V51_HTML/ARH9RBTE/DOCU0008.HTM

"(On OpenVMS Alpha or OpenVMS VAX) Compile all application
modules for byte actual granularity. Doing so automatically
prevents word-tearing race conditions for structure or union
members and array elements of size byte or larger that are
accessed concurrently by different threads. No other program
modification is required. This may have a performance penalty
on Alpha EV4 and EV5 processors.
Or,
(On Tru64 UNIX systems) For arrays, add the C language
volatile storage qualifier to the definition of the entire
array; for structures, add volatile to the declaration of
only those members that share the pertinent memory granule.
You must also compile the application's modules using the
Compaq C or Compaq C++ compiler's -strong-volatile switch.
Doing so causes the compiler to produce code that forces
all accesses to those members to occur as atomic operations.
See the description of the -strong-volatile switch in the
Compaq C or Compaq C++ documentation and on the cc reference
page. This may also have a severe performance penalty. "

next step... :) 'very distinct' for fighting cache thrashing :) :)

regards,
alexander.

Witless

Jul 24, 2002, 7:56:02 PM

t...@cs.ucr.edu wrote:

Threads may also use simple static variables where a single writer thread
publishes state information for one or more audience threads.

Steve Watt

Jul 24, 2002, 7:53:43 PM

But it would be reasonable[1] for an application to have two separate
bits of a structure protected by different mutices. For example, say
you've got:

struct header_pt1 {
... blah ...
unsigned char flag;
};
struct other_hdr {
unsigned char status;
... foo ...
};

struct image {
    struct header_pt1 hp1;
    struct other_hdr oh;
};

Are there size restrictions on the last element of struct header_pt1?
The first element of struct other_hdr? Is the compiler allowed to pack
hp1.flag and oh.status into the same word? I would argue that the C
standard, today, says yes. And now you have word tearing issues in
a multithreaded system. One thread could manipulate hp1 (and, indeed
have it locked), another thread could manipulate oh, and the two can,
while still obeying all of the POSIX rules, get in deep trouble.

And if you say the structs must be word aligned, then just recast the
structure thus:

struct im2 {
    mutex lock1, lock2;
    char data1[6]; // locked by lock1
    char data2[6]; // locked by lock2
};

and then, potentially, the last few elements of data1 and the first few
of data2 (on 32 bit architectures) share a tearing region. Oops.
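
The overlap can be checked with a little arithmetic. The sketch below drops the two locks, assumes a 4-byte tearing granule and a word-aligned struct with no padding between the arrays (true of common ABIs, though the standard does not promise it), and uses hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

#define WORD_BYTES 4   /* assumed tearing granule: one 32-bit word */
#define DATA_LEN   6

struct im2_data {
    char data1[DATA_LEN];   /* locked by lock1 */
    char data2[DATA_LEN];   /* locked by lock2 */
};

/* Index of the granule holding the byte at this offset, counted from
   the start of the (assumed word-aligned) struct. */
static size_t granule_of(size_t byte_offset)
{
    return byte_offset / WORD_BYTES;
}

/* Returns 1 if the last byte of data1 and the first byte of data2
   fall into the same granule, 0 otherwise. */
int arrays_share_granule(void)
{
    size_t last_of_data1  = offsetof(struct im2_data, data1) + DATA_LEN - 1;
    size_t first_of_data2 = offsetof(struct im2_data, data2);

    assert(first_of_data2 > last_of_data1);
    return granule_of(last_of_data1) == granule_of(first_of_data2);
}
```

With offsets 0 and 6, bytes 4 and 5 of data1 sit in the same word as bytes 0 and 1 of data2, so a word-wide store made under lock1 can overwrite data guarded by lock2.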

[1] For some value thereof.
--
Steve Watt KD6GGD PP-ASEL-IA ICBM: 121W 56' 57.8" / 37N 20' 14.9"
Internet: steve @ Watt.COM Whois: SW32
Free time? There's no such thing. It just comes in varying prices...

Fergus Henderson

Jul 24, 2002, 10:06:21 PM

t...@cs.ucr.edu writes:

>It seems to me that threads can reasonably be expected to restrict
>their attention to:
> - automatic objects, where false sharing seems not to be an issue,

Not true. A thread can pass the address of an automatic object
to another thread. In particular, when doing a fork-join operation,
it is quite common to pass the address of automatic objects to
newly created threads.

> - members of structs to which they've acquired exclusive access
> in a monitor-like fashion,
> - dynamically allocated objects, i.e., segments returned by malloc.

You forgot statically allocated objects.

>If I'm looking at things correctly, it would seem to suffice to require
>that objects of the last two categories be word aligned. Right?

No, that's not sufficient.

--
Fergus Henderson <f...@cs.mu.oz.au> | "I have always known that the pursuit
The University of Melbourne | of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh> | -- the last words of T. S. Garp.

t...@cs.ucr.edu

Jul 25, 2002, 1:20:21 AM

In comp.std.c Fergus Henderson <f...@cs.mu.oz.au> wrote:
: t...@cs.ucr.edu writes:

:>It seems to me that threads can reasonably be expected to restrict
:>their attention to:
:> - automatic objects, where false sharing seems not to be an issue,

: Not true. A thread can pass the address of an automatic object
: to another thread. In particular, when doing a fork-join operation,
: it is quite common to pass the address of automatic objects to
: newly created threads.

Good point. Return values for base functions of threads would need to
be handled as a special case. The common function-local automatics
should not be accessed by other threads.

:> - members of structs to which they've acquired exclusive access
:> in a monitor-like fashion,
:> - dynamically allocated objects, i.e., segments returned by malloc.

: You forgot statically allocated objects.

I didn't forget about them. Rather, my point/conjecture is that there
is no need to worry about such objects that are not wrapped inside a
(lock protected) struct.

:>If I'm looking at things correctly, it would seem to suffice to require
:>that objects of the last two categories be word aligned. Right?

: No, that's not sufficient.

Perhaps not. I'm looking for a minor-but-sufficient modification to
the standard that would eliminate the sorts of problems that Alexander
has drawn attention to. For instance: "if you don't want it to get
falsely shared, wrap it in a struct". I'm willing to believe that
there is no such solution, but the threaded programming that I've done
confines all of its statically shared data to monitors and doesn't
concurrently share dynamic (with the exception of return values) or
automatic data.

I have an instinct that something fairly reasonable can be found.

Tom Payne

t...@cs.ucr.edu

Jul 25, 2002, 8:33:23 AM

In comp.std.c Witless <wit...@attbi.com> wrote:
: t...@cs.ucr.edu wrote:
[...]
:> It seems to me that threads can reasonably be expected to restrict
:> their attention to:
:> - automatic objects, where false sharing seems not to be an issue,
:> - members of structs to which they've acquired exclusive access
:> in a monitor-like fashion,
:> - dynamically allocated objects, i.e., segments returned by malloc.
:> If I'm looking at things correctly, it would seem to suffice to require
:> that objects of the last two categories be word aligned. Right?

: Threads may also use simple static variables where a single writer thread
: publishes state information for one or more audience threads.

Since there is no portable way to guarantee that this simple variable
is atomic, it needs to be protected by a lock. I'm suggesting that
all lock-protected statics should be inside of structs, per the
monitor paradigm. The hope/conjecture is that, if MT programs can be
counted on to follow that (reasonable?) restriction, a fairly weak
alignment restriction would suffice to eliminate the word-tearing
issue.

Tom Payne

Alexander Terekhov

Jul 25, 2002, 9:25:17 AM

Check out this "GRANULARIZE(X)" {sub-}thread:

http://groups.google.com/groups?threadm=Xgl57.661%24h8.34523%40news1.rdc1.bc.home.com

I personally think that anti-word-tearing qualifier ["special" type
specifier, sort of] is the only [reasonable; C/C++] way to "make it
work".

regards,
alexander.

Konrad Schwarz

Jul 29, 2002, 1:40:18 PM

<t...@cs.ucr.edu> schrieb im Newsbeitrag news:aho1ql$c6i$1...@glue.ucr.edu...

>
> Perhaps not. I'm looking for a minor-but-sufficient modification to
> the standard that would eliminate the sorts of problems that Alexander
> has drawn attention to. For instance: "if you don't want it to get
> falsely shared, wrap it in a struct". I'm willing to believe that
> there is no such solution, but the threaded programming that I've done
> confines all of its statically shared data to monitors and doesn't
> concurrently share dynamic (with the exception of return values) or
> automatic data.
>
> I have an instinct that something fairly reasonable can be found.

Word tearing is not an issue for current machines, to
the best of my knowledge. Even for
old machines, code can be generated for arbitrary
data packings that does not suffer from
word tearing problems. The fact that
these sequences are slower than direct
loads/stores of bytes is one reason why these
old machines have been superseded.

From a standards point of view, no
allowances need to be made for
word-tearing machines: they don't
have to word tear and they
are obsolete.

Alexander Terekhov

Jul 30, 2002, 5:16:51 AM

http://groups.google.com/groups?selm=qXU47.813%24rc5.60656%40news.cpqcorp.net
(Subject: Re: Most conforming POSIX threads implementation)

"Norman Black wrote:

> Generally speaking, if a processor supports a data type directly in
> load/store instructions then that instruction will operate atomically with
> respect to other processors.
>
> The SPARC, MIPS, PowerPC/POWER, IA-64 architectures all support 8 and
> 16-bit writes. This is NOT true for the Alpha however.

That hasn't been true since the EV56, which is now pretty old. However, the
compiler will still generally try to write code that can be executed by any
old Alpha unless you use the -arch switch. (Ironically, the addition of
byte and word instructions was mostly a concession to the economic
realities of porting NT device drivers... which may have already become
irrelevant by the time EV56 shipped, but certainly soon after.)

Still, more importantly, the C and C++ languages don't guarantee what
operations will be used to read or write your data, even when the hardware
happens to support an operation that would do the job atomically. So
whether the hardware can do it is really irrelevant unless you're writing
in assembler."

regards,
alexander.

t...@cs.ucr.edu

Jul 30, 2002, 7:50:32 AM

In comp.std.c Konrad Schwarz <konradDO...@mchpdotsiemens.de> wrote:

: <t...@cs.ucr.edu> schrieb im Newsbeitrag news:aho1ql$c6i$1...@glue.ucr.edu...

:>
:> Perhaps not. I'm looking for a minor-but-sufficient modification to
:> the standard that would eliminate the sorts of problems that Alexander
:> has drawn attention to. For instance: "if you don't want it to get
:> falsely shared, wrap it in a struct". I'm willing to believe that
:> there is no such solution, but the threaded programming that I've done
:> confines all of its statically shared data to monitors and doesn't
:> concurrently share dynamic (with the exception of return values) or
:> automatic data.
:>
:> I have an instinct that something fairly reasonable can be found.

: Word tearing is not an issue for current machines, to
: the best of my knowledge. Even for
: old machines, code can be generated for arbitrary
: data packings that does not suffer from
: word tearing problems. The fact that
: these sequences are slower than direct
: loads/stores of bytes is one reason why these
: old machines have been superseded.

Good news.

: From a standards point of view, no
: allowances need to be made for
: word-tearing machines: they don't
: have to word tear and they
: are obsolete.

Hmmmmm. On modern hardware it is easy to generate efficient code that
doesn't tear words, but I suspect that it's possible to generate
word-tearing code. So, my no-false-sharing-of-structs doesn't go far
enough and involves a needless distinction. But, the standard should
specify that a conforming implementation is not allowed to generate
code that falsely shares portions of objects.

Tom Payne

David Butenhof

Jul 30, 2002, 9:11:04 AM

t...@cs.ucr.edu wrote:

> Hmmmmm. On modern hardware it is easy to generate efficient code that
> doesn't tear words, but I suspect that it's possible to generate
> word-tearing code. So, my no-false-sharing-of-structs doesn't go far
> enough and involves a needless distinction. But, the standard should
> specify that a conforming implementation is not allowed to generate
> code that falsely shares portions of objects.

This is really the crux of the matter. Either the compiler needs to know
what is shared (an unreasonable and complicated provision), or else it
needs to presume that EVERYTHING may be shared. Longer data accesses are
often more efficient, even when atomic accesses to smaller data are
possible. And, for example, if your structure is "struct {char a, b, c;}
*bar;" and you have "foo1 = bar->a; foo2 = bar->c;", a smart compiler can
often far more efficiently read the aligned int, long, (or even long long
if that's different) and pick the bits out of a register than issue two
separate byte reads. Without something like a new (incompatible) and
cumbersome "shared" attribute, no compiler under the new rules would ever
be able to do this. That's as bad as the current tendency to misuse
"volatile".
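
A sketch of the optimisation being described, written out by hand. It assumes a fourth member so that the struct fills exactly four bytes, and the function name is hypothetical; a real compiler would do the equivalent with one load plus shifts and masks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fourth member added (an assumption) so the struct is one 32-bit word. */
struct quad { char a, b, c, d; };

/* Fetches members a and c with a single wide read of the struct. */
void read_a_and_c(const struct quad *bar, char *foo1, char *foo2)
{
    uint32_t word;
    unsigned char lanes[sizeof word];

    assert(sizeof *bar == sizeof word);

    memcpy(&word, bar, sizeof word);     /* the single wide load */
    memcpy(lanes, &word, sizeof word);   /* stands in for shifts and masks */

    *foo1 = (char)lanes[offsetof(struct quad, a)];
    *foo2 = (char)lanes[offsetof(struct quad, c)];
}
```

Reads done this way are harmless on their own. The trouble starts only when the same wide access is used to store a modified word back.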

And note that this is really all "false sharing", not "word tearing". The
former (at least in this sense) comes when the compiler uses access widths
wider than the data, and you can force it to narrow the width without fear
of placing impossible restrictions on modern machines.

I'm not so sure about dealing with word tearing, which is the opposite; when
the access width is narrower than the data (so that more than one "atomic"
access is needed). Some machines can't atomically access unaligned data
(e.g., an int or long pointer with odd low bit). Although such machines
usually align data by default, there are usually ways to override it, and
usually some reason why that's important. But you may not be able to force
the machine to do a single atomic unaligned access. (Nor do I think that
would be a reasonable requirement.) This gets a lot messier. You might be
able to address this in the language by applying atomicity rules only for
data aligned to implementation rules, if the language people feel they can
do that. I don't recall seeing any recognition of alignment distinctions
(except possibly in dealing with casts between pointer types?) I suspect
it'd also have to be explicitly restricted to scalar data types, since no
machine can atomically access an arbitrary structure.
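
The opposite case, where the access is narrower than the data, can be replayed deterministically. The sketch below models a 32-bit value that can only be stored as two 16-bit halves and hard-codes one unlucky interleaving in a single thread; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit value held as two 16-bit halves, modelling a machine (or an
   unaligned placement) that cannot access the value in one operation. */
struct split32 { uint16_t lo, hi; };

static uint32_t load_split(const struct split32 *s)
{
    return ((uint32_t)s->hi << 16) | s->lo;
}

/* Replays one bad interleaving: the reader runs between the writer's
   two half-stores. Returns the value the reader observes. */
uint32_t torn_read_demo(uint32_t old_value, uint32_t new_value)
{
    struct split32 s = { (uint16_t)(old_value & 0xFFFFu),
                         (uint16_t)(old_value >> 16) };
    uint32_t seen;

    s.lo = (uint16_t)(new_value & 0xFFFFu);   /* writer: first half-store  */
    seen = load_split(&s);                    /* reader runs here          */
    s.hi = (uint16_t)(new_value >> 16);       /* writer: second half-store */

    assert(load_split(&s) == new_value);      /* the final state is fine   */
    return seen;
}
```

Going from 0x0000FFFF to 0x00010000, the reader sees 0, a value that was never stored.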

Alexander Terekhov

Jul 30, 2002, 10:28:00 AM

David Butenhof wrote:
[...]

> And note that this is really all "false sharing", not "word tearing". The
> former (at least in this sense) comes when the compiler uses access widths
> wider than the data, and you can force it to narrow the width without fear
> of placing impossible restrictions on modern machines. ....

Uhmm, to me:

A) "word tearing" -- a memory location/granule/whatever is shared/occupied
by multiple C/C++ {sub-}objects used in different sharing scopes; clear
violation of 4.10 ``rules.'' ;-)

B) "false sharing" -- a somewhat similar problem but with respect to cache
lines; MP performance impact [cache thrashing] only.
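
For case B, the usual remedy is to give each independently updated item its own cache line. This is a sketch only: it relies on the C11 _Alignas specifier (newer than this thread), the 64-byte line size is an assumption about the target, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed line size; check the target CPU's manual */

struct counters {
    _Alignas(CACHE_LINE) long hits;     /* updated by thread A */
    _Alignas(CACHE_LINE) long misses;   /* updated by thread B */
};

/* Returns 1 if the two members start in different cache lines,
   counted from the start of the (line-aligned) struct. */
int counters_on_separate_lines(void)
{
    assert(sizeof(struct counters) % CACHE_LINE == 0);
    return offsetof(struct counters, hits) / CACHE_LINE
        != offsetof(struct counters, misses) / CACHE_LINE;
}
```

This addresses performance only. Whether concurrent updates are correct is the separate question raised under A.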

regards,
alexander.

t...@cs.ucr.edu

Jul 30, 2002, 4:57:22 PM

In comp.std.c David Butenhof <David.B...@compaq.com> wrote:
[...]
: Either the compiler needs to know
: what is shared (an unreasonable and complicated provision), or else it
: needs to presume that EVERYTHING may be shared. Longer data accesses are
: often more efficient, even when atomic accesses to smaller data are
: possible. And, for example, if your structure is "struct {char a, b, c;}
: *bar;" and you have "foo1 = bar->a; foo2 = bar->c;", a smart compiler can
: often far more efficiently read the aligned int, long, (or even long long
: if that's different) and pick the bits out of a register than issue two
: separate byte reads. Without something like a new (incompatible) and
: cumbersome "shared" attribute, no compiler under the new rules would ever
: be able to do this. That's as bad as the current tendency to misuse
: "volatile".

I don't see any problem with reads per se. Rather, when updating an
object, a stale version of neighboring data might get written back. I
presume that a hardware-supported partial-word write involves an
atomic read-modify-write to make sure the neighboring data is
preserved, but obviously an atomic read-modify-write costs more than a
simple write. Of course, an implementation must use partial word
writes wherever there is adjacent data that shares a word and an
intervening update of that adjacent data cannot be ruled out.

: And note that this is really all "false sharing", not "word tearing". The
: former (at least in this sense) comes when the compiler uses access widths
: wider than the data, and you can force it to narrow the width without fear
: of placing impossible restrictions on modern machines.

Agreed.

: I'm not so sure about dealing with word tearing, which is the opposite; when
: the access width is narrower than the data (so that more than one "atomic"
: access is needed). Some machines can't atomically access unaligned data
: (e.g., an int or long pointer with odd low bit). Although such machines
: usually align data by default, there are usually ways to override it, and
: usually some reason why that's important. But you may not be able to force
: the machine to do a single atomic unaligned access. (Nor do I think that
: would be a reasonable requirement.) This gets a lot messier. You might be
: able to address this in the language by applying atomicity rules only for
: data aligned to implementation rules, if the language people feel they can
: do that. I don't recall seeing any recognition of alignment distinctions
: (except possibly in dealing with casts between pointer types?) I suspect
: it'd also have to be explicitly restricted to scalar data types, since no
: machine can atomically access an arbitrary structure.

The above paragraph seems to boil down to the fact that data whose
writes are non-atomic require exclusive access and must be lock/mutex
protected. What am I missing?

Tom Payne

David Hopwood

Jul 30, 2002, 5:53:37 PM

-----BEGIN PGP SIGNED MESSAGE-----

More precisely, the *POSIX* standard should specify this. The C Standard
doesn't need to because it doesn't address threading (or shared memory).

- --
David Hopwood <david....@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBPUcKiDkCAxeYt5gVAQF93ggAz8rUHGg6VOowU2ROsoj9XxMZfQxi6Gno
CR4zU/6hUYg+8ZxD2sDcPm2tDHQTNzkW4xgepgUnG58FxtJjoZQPv9sxj/paCQH0
lc5eDRl3PCYHVmFJN/UAlY3BOtoI8kkZ4GJgMbQtM1briFh/HBCoGfJR3ONdmJkr
nPCXH/k4pnIeaEfLgKwDD5EiN9KIZq7V64EZUHZCNYgQSYL1zEKbBq8tiklGaQpT
EBSUumbhhvM4xdOasLE68D3I38ZIPrvwTj1aPO9rPdDx22r0p3bX+pnyFuSboRHf
ecat3X0nDOAJ0DilSBysqS3m39RC6j6gmjZktkcJWKOE3gUn+z5dFA==
=x386
-----END PGP SIGNATURE-----

t...@cs.ucr.edu

Jul 30, 2002, 5:23:26 PM

In comp.std.c David Hopwood <david....@zetnet.co.uk> wrote:
: -----BEGIN PGP SIGNED MESSAGE-----

: t...@cs.ucr.edu wrote:
:> In comp.std.c Konrad Schwarz <konradDO...@mchpdotsiemens.de> wrote:
:> : From a standards point of view, no
:> : allowances need to be made for
:> : word-tearing machines: they don't
:> : have to word tear and they
:> : are obsolete.
:>
:> Hmmmmm. On modern hardware it is easy to generate efficient code that
:> doesn't tear words, but I suspect that it's possible to generate
:> word-tearing code. So, my no-false-sharing-of-structs doesn't go far
:> enough and involves a needless distinction. But, the standard should
:> specify that a conforming implementation is not allowed to generate
:> code that falsely shares portions of objects.

: More precisely, the *POSIX* standard should specify this. The C Standard
: doesn't need to because it doesn't address threading (or shared memory).

It's my (ill-informed) understanding that POSIX tries to be an API
standard, and hence doesn't address such matters as how the language
implementation should interact with memory.

Tom Payne

David Hopwood

Jul 30, 2002, 6:42:06 PM

-----BEGIN PGP SIGNED MESSAGE-----

t...@cs.ucr.edu wrote:

Anything that purports to standardise threading or shared memory facilities
that are accessible from at least one high-level language, *must* specify a
memory model (assuming that the language standard doesn't already have one
that is adequate for that purpose). POSIX already imposes some requirements
that constrain how C is to be implemented on POSIX platforms; it is not just
an API standard.

In any case, the C Standard can't specify this unless it becomes a standard
for an (optionally) multithreaded programming language - which I think is
unlikely for standards-politics reasons. As a practical matter, there is
nowhere else for it to be specified but POSIX.

- --
David Hopwood <david....@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBPUcWFDkCAxeYt5gVAQET5ggAuaF98loQgkpWs51NDLloaa4fCXht9SY0
uS8HEy/Gu9mGRGiRkdaimC/xLF6fsMIxAYRLPtC+b5QYPZa/qagOjuiZ21g/mz3a
XZl8BZveGlPMukIzZ4JOiotGT+vtthz6uOo3l9C4GXpK26cvlAxGNvtmNdFJm95w
RJNDDgn9Jei6k63lhqNw4kTH/QBjvKIEz9uGHTBSgS3h8h+eusVqedPElQCrwHKM
T0lx1+5xOxKwYCxjWvejsRDZFekZUxFs+VU3XnrO7LMu/zRgi3U4YYRd/o0C7xRG
lb1HkYPO0H2avbDqJRLRI4ZaRX0dRW2eja0Z1yS/tTNgh25FgimfjA==
=/WMb
-----END PGP SIGNATURE-----

James Kuyper

Jul 30, 2002, 5:39:20 PM

t...@cs.ucr.edu wrote:
>
> In comp.std.c David Hopwood <david....@zetnet.co.uk> wrote:
> : -----BEGIN PGP SIGNED MESSAGE-----

...

> : More precisely, the *POSIX* standard should specify this. The C Standard
> : doesn't need to because it doesn't address threading (or shared memory).
>
> It's my (ill-informed) understanding that POSIX tries to be an API
> standard, and hence doesn't address such matters as how the language
> implementation should interact with memory.

It really shouldn't define the threading and memory sharing APIs,
without defining what the consequences of using them are. Those
consequences are entirely outside the scope of the C standard. If POSIX
doesn't define them, who should?

Douglas A. Gwyn

Jul 30, 2002, 10:40:34 PM

t...@cs.ucr.edu wrote:
> But, the standard should
> specify that a conforming implementation is not allowed to generate
> code that falsely shares portions of objects.

See if you can come up with wording that correctly specifies what
you want *and* does not impose an undue burden on C implementations
on several reasonable existing platforms. My inclination is to
couple such requirements to volatile qualification.

t...@cs.ucr.edu

Jul 31, 2002, 2:55:07 AM

In comp.std.c Douglas A. Gwyn <DAG...@null.net> wrote:

The volatile qualification on an object's declaration suspends the
normal guarantees that the object will remember its value until it is
modified by some kind of write and that modifications of its value are
not externally visible. That suspension blocks many optimizations
that we consider normal and forces a lot of fetching and write-backs at
sequence points.

False sharing occurs when there is occupied left-over space on the
word at one end of an object or the other. Writing the object
requires writing the appropriate bits to that occupied left-over
space, which requires that we read the bits that are originally there
and then write them back. But if that other object gets modified in
the interim, we have a lost-update problem.

To me these seem like totally distinct notions.
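
The lost-update problem described above can be replayed deterministically. What follows is only a sketch under stated assumptions: the 32-bit word, the four byte "lanes", and every function name are hypothetical, and the two writers are interleaved by hand in a single thread rather than run concurrently.

```c
#include <stdint.h>

/* Hypothetical model: memory that can only be read or written one
   32-bit word at a time, with four 8-bit objects packed into it. */
static uint32_t word;

/* Step 1 of a byte store: fetch the enclosing word. */
static uint32_t load_word(void) { return word; }

/* Step 2: merge the new byte into a private copy of the word. */
static uint32_t insert_byte(uint32_t w, unsigned lane, uint8_t value)
{
    unsigned shift = 8u * lane;
    return (w & ~((uint32_t)0xFF << shift)) | ((uint32_t)value << shift);
}

/* Step 3: write the whole word back, neighbours included. */
static void store_word(uint32_t w) { word = w; }

static uint8_t read_byte(unsigned lane)
{
    return (uint8_t)(word >> (8u * lane));
}

/* Replays one unlucky interleaving of writer A (lane 0) and
   writer B (lane 1); returns what lane 1 holds afterwards. */
static uint8_t replay_lost_update(void)
{
    word = 0;
    uint32_t a_copy = load_word();             /* A reads the word        */
    uint32_t b_copy = load_word();             /* B reads the word        */
    store_word(insert_byte(b_copy, 1, 0xBB));  /* B updates lane 1        */
    store_word(insert_byte(a_copy, 0, 0xAA));  /* A writes a stale lane 1 */
    return read_byte(1);
}
```

In this replay lane 1 ends up holding 0 rather than 0xBB: B's update is lost even though A never named B's object. Nothing in the sequence depends on volatile qualification.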

Tom Payne

Mark Williams

Jul 31, 2002, 9:32:05 AM

t...@cs.ucr.edu wrote in message news:<ai81kb$hv7$2...@glue.ucr.edu>...

But if that other object can get modified in the interim, it must be
declared volatile anyway... see your first paragraph...

Mark Williams

t...@cs.ucr.edu

Jul 31, 2002, 10:31:22 AM

In comp.std.c Mark Williams <kmar...@yahoo.com> wrote:
: t...@cs.ucr.edu wrote in message news:<ai81kb$hv7$2...@glue.ucr.edu>...

IMHO, declaring that other object to be volatile imposes a great deal
of overhead and is neither necessary nor sufficient for correct
operation.

The overhead that implementations must incur to cope with volatility is
well known so let's skip that part.

Unnecessary: We can cure the problem by using partial-word writes.

Insufficiency: To cope with volatility (behave as if we) re-read the
object's value after each sequence point. But all that does is to
make sure that we have up-to-date corrupted data. If the correct
value gets clobbered via a write to a neighbor, it gets clobbered.
Re-reading it doesn't restore the correct value. (Keep in mind
that a volatile object is more or less an abstraction of an
I/O register, in which case the current value is by definition
the right value.)

Tom Payne

Alexander Terekhov

Jul 31, 2002, 10:27:22 AM

Mark Williams wrote:
[...]

> But if that other object can get modified in the interim, it must be
> declared volatile anyway... see your first paragraph...

Gee, please forget volatiles [unless you really want to {re-}invent
something similar to Java's revised volatiles with its load-acquire
and store-release memory synchronization semantics] w.r.t. threading.
And, BTW, I'd like to urge folks at comp.std.c to consider *FIXING*
the C99 rationale as well...

"Rationale for
International Standard
Programming Languages
C
Revision 2
20 October 1999
WG14/N897 J11/99-032"
....
Such behavior is, of course, too loose for hardware-oriented
applications such as device drivers and memory-mapped I/O. The
following loop looks almost identical to the previous example,
but the specification of volatile ensures that each assignment
to *ttyport takes place in the same sequence, and with the same
values, as the abstract machine would have done.

    volatile short *ttyport;
    /* ... */
    for (i = 0; i < N; ++i)
        *ttyport = a[i]; "

"A static volatile object is an appropriate model for a
memory-mapped I/O register. Implementors of C translators
^^^^^^^^^^^^^^^^^^^^^^^^^^

should take into account relevant hardware details on the
target systems when implementing accesses to volatile objects.
For instance, the hardware logic of a system may require that
a two-byte memory-mapped register not be accessed with byte
operations; and a compiler for such a system would have to
assure that no such instructions were generated, even if the
source code only accesses one byte of the register. Whether
read-modify-write instructions can be used on such device
registers must also be considered. Whatever decisions are
adopted on such issues must be documented, as volatile access
is implementation-defined. A volatile object is also an
^^^^^^^^^^^^^^^^^^^^^^^^^

appropriate model for a variable shared among multiple processes. "
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>!!!!ATTN WRONG ATTN WRONG ATTN WRONG ATTN WRONG ATTN WRONG!!!!<<

"A static const volatile object appropriately models a memory-mapped
input port, such as a real-time clock. Similarly, a const volatile
object models a variable which can be altered by another process"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>!!!!ATTN WRONG ATTN WRONG ATTN WRONG ATTN WRONG ATTN WRONG!!!!<<

"but not by this one.

"Signals are difficult to specify in a system-independent
way. The C89 Committee concluded that about the only thing
a strictly conforming program can do in a signal handler
is to assign a value to a volatile static variable which
can be written uninterruptedly and promptly return. "

http://groups.google.com/groups?as_umsgid=1991Sep12.170305.6639%40zoo.toronto.edu

">have been more sensible to have a "device" declarator with completely
>implementation-defined semantics rather than trying to overload volatile
>and cross our fingers and hope.

"volatile" was invented for device registers. The mistake was to overload
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

it for setjmp and signal handlers. A wise compiler writer, writing for
the systems-programming market, will know this and do things right."

regards,
alexander.

Alexander Terekhov

Jul 31, 2002, 11:15:39 AM

t...@cs.ucr.edu wrote:
[...]

> To me these seem like totally distinct notions.
                                ^^^^^^^^

struct something { distinct char read_write_thread_process_A,
                   ^^^^^^^^      read_write_thread_process_B; };

distinct char data[2]; // thread/process A reads/writes data[0]
^^^^^^^^               // thread/process B reads/writes data[1]

and it's up to the implementation to either properly granularize
"distinct" stuff or use LL-SC/CAS/whatever techniques "emulating"
proper granularity instead...
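
One way an implementation might "emulate" byte granularity on word-only hardware is a compare-and-swap retry loop on the enclosing word. The sketch below uses C11 <stdatomic.h>, which postdates this thread; the lane layout and the names store_byte_cas and load_byte are assumptions made purely for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared word holding four independently owned bytes. */
static _Atomic uint32_t shared_word;

/* Store one byte without disturbing its neighbours: retry the
   read-modify-write until no other writer got in between. */
static void store_byte_cas(unsigned lane, uint8_t value)
{
    unsigned shift = 8u * lane;
    uint32_t mask = (uint32_t)0xFF << shift;
    uint32_t old = atomic_load(&shared_word);
    uint32_t desired;
    do {
        desired = (old & ~mask) | ((uint32_t)value << shift);
        /* On failure, old is refreshed with the current contents. */
    } while (!atomic_compare_exchange_weak(&shared_word, &old, desired));
}

static uint8_t load_byte(unsigned lane)
{
    return (uint8_t)(atomic_load(&shared_word) >> (8u * lane));
}
```

The loop only ever commits a word whose other lanes match what is in memory at the moment of the exchange, so a concurrent update to a neighbouring lane forces a retry instead of being overwritten.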

regards,
alexander.

t...@cs.ucr.edu

Aug 1, 2002, 2:00:53 AM

In comp.std.c Alexander Terekhov <tere...@web.de> wrote:

: t...@cs.ucr.edu wrote:
: [...]
:> To me these seem like totally distinct notions.
:                                ^^^^^^^^

The pun was unintentional. ;-)

: struct something { distinct char read_write_thread_process_A,
:                    ^^^^^^^^      read_write_thread_process_B; };

: distinct char data[2]; // thread/process A reads/writes data[0]
: ^^^^^^^^               // thread/process B reads/writes data[1]

: and it's up to the implementation to either properly granularize
: "distinct" stuff or use LL-SC/CAS/whatever techniques "emulating"
: proper granularity instead...

Good point. I had hoped that a new keyword would not be required,
e.g., that we could simply make all structs "distinct", without
significantly impairing performance and/or program size.
Unfortunately, I have no empirical basis for believing in that hope.

Tom Payne

Jim Rogers

Aug 1, 2002, 2:36:27 AM

Alexander Terekhov wrote:

> < c.p.t. added >
>
> "Douglas A. Gwyn" wrote:
> [...]
>
>
>>>>Please clearly state the issue or question.
>>>>
>

>>>The issue is also known as "word-tearing"; "false-sharing"
>>>aside for a moment. The question is what does the term
>>>"memory location" mean in the sense of writing *portable*
>>>programs. AFAICS, C standard partially addresses "word-tearing"
>>>problem -- for signal handling [static volatile sig_atomic_t],
>>>but it is unclear to me what is the intended meaning [if any]
>>>of the "memory location" term from the C standards committee
>>>POV.
>>>
>>That still doesn't clearly state an issue. Do you have sample
>>code that will illustrate a specific problem?
>>
>
> < non-standard C/SEH exception handling with resumption aside >

>
> char data[2]; // thread/process A reads/writes data[0]

> // thread/process B reads/writes data[1]
>
>

>>What part of the C standard uses the term "memory location"?
>>I just searched the PDF version of ISO/IEC 9899:1999 and didn't
>>find that phrase used in the text.
>>
>
> Yeah, I know. Consider: <copy&paste>
>
> ----
>
> Newsgroups: comp.lang.c++.moderated, comp.programming.threads
> Subject: Re: MT design advice
> Date: 23 Jul 2002 18:49:05 -0400
>
> Alexander Terekhov wrote:
>
> > Excuse me, but does this mean that the following
> > (piece of informal memory model semantics) applies
> > to POSIX "memory model" as well:
> >
> > http://www.cs.umd.edu/~pugh/java/memoryModel/semantics.pdf
> >
> > "....
> > The fact that two variables may be stored in ad-
> > jacent bytes (e.g., in a byte array) is immaterial.
> > Two variables can be simultaneously updated by
> > different threads without needing to use synchro-
> > nization to account for the fact that they are
> > 'adjacent'. Any word-tearing must be invisible
> > to the programmer.
> > ...."
>
> No. There's a problem here. The Java memory model is a LANGUAGE rule. "By
> whatever means necessary, a proper implementation of the language must
> achieve this behavior." Fine. A programmer can count on that as long as the
> Java implementation claims conformance.
>
> POSIX rules are trickier. The memory model imposes restrictions on the
> hardware and on the API. However, in general you're not accessing those two
> bytes through the API, nor using hardware data access instructions. You're
> writing C (or C++) language code that a compiler will translate, at its
> discretion, into hardware instructions that meet the C/C++ language
> definitions.
>
> The C and C++ languages do not have such rules for normal accesses. The
> compiler is allowed to generate 64-bit wide access instructions for your
> "char c; ... c++;" sequence. And if it does, neither the API nor the
> hardware rules do you any good. Java doesn't have that problem. IF the C
> and/or C++ languages choose to address multiprocessing, this is something
> that they could require. But it will have a cost, because on some machines
> the conforming code will be suboptimal. (This is equally true for Java, of
> course, and C++ could decide to make the same tradeoffs.)
>
>

When considering implementation options concerning this problem you might
want to also consider the Ada solution.

Ada provides two pragmas to deal with atomicity relating to concurrent
access.

pragma Atomic (name);

Takes either the name of a variable or the name of a type as a pragma
argument. It stipulates two things about the named variable, or about
every variable of the named type:

* The variable is to be examined and updated atomically. That is, it must be
impossible for one task to observe such a variable in a state in which it is
only partially updated by another task.

* No temporary copies of the variable are allowed. Every examination of the
variable must examine the variable itself, and every modification of the
variable must immediately modify the variable itself.

The second pragma is:

pragma Atomic_Components (name);

which takes the name of a one-of-a-kind array, belonging to an anonymous
array type, or of any array type as a pragma argument. It stipulates the
same things as pragma Atomic, but applies individually to every component
of the named array, or to every component of every array of the named type.

The advantage of the Ada approach is that there is no overhead in
data alignment or copy optimization paid by any component not explicitly
identified by either of the two pragmas. This supports the concept that
you do not pay for a feature you do not use.

The Ada approach does not promise best performance across all hardware
architectures. It does allow you as a programmer to specify what you
want done, independent of a particular architecture. Use of such pragmas
may not always produce portable code. Atomic access is normally restricted
to data items that fit into a single register. The size of registers varies
across hardware architectures. Thus, a program that works properly on a
64 bit machine may fail if compiled and run on an 8 bit machine.

Java's solution to this problem is to ignore actual hardware architecture
and require the VM to control Java threads as though atomicity and
word size were constants. C lives in the world of real hardware, just as
Ada does. C needs to devise a solution that recognizes the realities of
such an environment when dealing with atomic access to data.

Jim Rogers

David Butenhof

Aug 1, 2002, 10:24:46 AM

Jim Rogers wrote:

> When considering implementation options concerning this problem you might
> want to also consider the Ada solution.
>
> Ada provides two pragmas to deal with atomicity relating to concurrent
> access.
>
> pragma Atomic (name);
>
> Takes either the name of a variable or the name of a type as a pragma
> argument. It stipulates two things about the named variable, or about
> every variable of the named type:
>
> * The variable is to be examined and updated atomically. That is, it must be
> impossible for one task to observe such a variable in a state in which it
> is only partially updated by another task.
>
> * No temporary copies of the variable are allowed. Every examination of
> the variable must examine the variable itself, and every modification of
> the variable must immediately modify the variable itself.

Neither attribute is what you want for data controlled by explicit
synchronization. Atomic access can require substantial overhead that's
mostly pointless if the data is already synchronized; and you WANT the
compiler to be able to cache the data in a register as long as the data is
externally protected (e.g., by a mutex).

A different solution might be something like an "isolate" attribute. The
real problem is that there's no control over the code sequences generated
by the compiler; even if the data is susceptible to safe hardware access,
the compiler may legitimately decide it can be more efficiently accessed
with a wider instruction, for example to aggregate two shorter fields in a
single read or write.

The "isolate" attribute would simply tell the compiler that THIS data cannot
be aggregated because there's sharing going on about which the compiler is
unaware.

isolate char foo;
char bar;
int plugh;

would prevent the compiler from aggregating foo and bar (on behalf of
either), but it could choose to aggregate bar and plugh as long as that
aggregation didn't involve foo. It could do this by generating some form of
atomic instruction sequences (so aggregation might occur, but would be
externally transparent), or (better and simpler) by inserting padding
between foo and bar so that bar becomes aligned at a sufficient boundary to
allow convenient independent access. (For example instead of generating
byte instructions for foo and bar on Alpha, which would be slow for
pre-EV56 chips, the compiler could simply align foo and bar to 4-byte
granularity and use 32-bit access.)

Array elements couldn't be isolated by alignment, so it'd either need to use
byte access or atomic instruction sequences. (You could make sizeof(isolate
char) be 4 instead of 1... but that'd probably break way too much code to
be feasible.)
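
The padding variant of "isolate" can be approximated today with explicit alignment. This is a sketch, not the proposed attribute: it uses C11 _Alignas, the 4-byte ISOLATION_GRANULE is a hypothetical figure standing in for whatever the target's smallest independent access is, and it only helps if the compiler never accesses foo or bar with anything wider than that granule.

```c
#include <stddef.h>

/* Hypothetical: smallest unit the target can store independently. */
#define ISOLATION_GRANULE 4

/* Ordinary layout: foo and bar are adjacent bytes and may share a
   word with each other (and with whatever precedes them). */
struct packed_fields {
    char foo;
    char bar;
    int  plugh;
};

/* Hand-written stand-in for "isolate char foo;": foo gets a granule
   to itself because bar is pushed to the next granule boundary. */
struct isolated_fields {
    _Alignas(ISOLATION_GRANULE) char foo;
    _Alignas(ISOLATION_GRANULE) char bar;
    int  plugh;
};
```

With these declarations offsetof(struct isolated_fields, bar) is one full granule past foo, whereas in the packed layout the two members are one byte apart. As noted above, the same trick cannot be applied to array elements without changing sizeof.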

Mark Williams

Aug 1, 2002, 10:25:12 AM

t...@cs.ucr.edu wrote in message news:<ai8sbq$p8r$3...@glue.ucr.edu>...

I'm not sure what you're saying here... certainly you can cure the
problem of false sharing; you cannot cure the problem that when you
want to access the other word (and you presumably do want to do that),
the compiler may not actually do so (because it thinks it has the
value cached somewhere).

>
> Insufficiency: To cope with volatility (behave as if we) re-read the
> object's value after each sequence point. But all that does is to
> make sure that we have up-to-date corrupted data. If the correct
> value gets clobbered via a write to a neighbor, it gets clobbered.
> Re-reading it doesn't restore the correct value.

Wrong, because if the other object is declared volatile, the standard
forbids the compiler from writing it except as specified by the
program; thus the value cannot get clobbered via a write to a
neighbor...

I fully understand the deficiencies of volatile, but it is certainly
sufficient to prevent false sharing as described in this thread, and
also necessary in the case that an object changes its value behind the
programs back (failure to declare such an object volatile results in
UB if I recall correctly).

Mark Williams

Fergus Henderson

Aug 1, 2002, 10:57:06 AM

Alexander Terekhov <tere...@web.de> writes:

>Mark Williams wrote:
>[...]
>> But if that other object can get modified in the interim, it must be
>> declared volatile anyway... see your first paragraph...
>
>Gee, please forget volatiles

Why?

>[unless you really want to {re-}invent
>something similar to Java's revised volatiles with its load-acquire
>and store-release memory synchronization semantics] w.r.t. threading.

What would be wrong with that?

> "volatile" was invented for device registers. The mistake was to overload
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> it for setjmp and signal handlers.

Why was that a mistake?

Alexander Terekhov

Aug 1, 2002, 10:57:47 AM

Jim Rogers wrote:

[...Ada/pragma Atomic/Atomic_Components...]

Yeah, also, consider [especially "at least three independent issues" and
"The situation would be far better if a future version of ANSI C (and C++)
*did* explicitly..." below]:

< Butenhof, 1998, c.p.t.*** [but I personally don't share his opinion (?at
that time only -- perhaps it's changed in the meantime?) with respect to
sort of the ``universally-applicable-magic'' of sig_atomic_t >

----

David Holmes wrote:

I really wasn't going to step into this one, because I've gone through this
many times before. Unfortunately, the discussion has deteriorated, and I feel
an obligation. (Oddly, someone dropped by my office to ask about memory
barriers while I was writing this.)

> Bil Lewis <B...@LambdaCS.com> wrote in article
> <34F22F...@LambdaCS.com>...
> > Felix,
> > 1: Writing a byte on most modern machines is non-atomic. It goes:
> > read word, mask out byte, add in new byte, write word.
>
> Most modern machines? Surely many modern machines are byte addressable and
> thus the individual byte can be written independent of the rest of the
> word?

Yes, "most" "modern" machines are byte-addressable. (Both words in quotes
because the concepts are ill-defined and probably meaningless, but I'm not in
a mood to worry about that now.)

However, the previous posting claiming that byte-addressable machines
necessarily provide atomic byte access was simply in error. "Byte
addressable" means that each byte has an address. That doesn't help. SOME
machines may have atomic byte access, but this has nothing to do with whether
the byte is addressable. For example, a machine where each address was
attached to a 48 byte word could provide an instruction to atomically update
an arbitrary bit-field within that word -- while a byte-addressable machine
need not necessarily provide atomic access.

There are (depending on how one breaks these things down) at least three
independent issues here:

1. Atomic access granularity
2. Read/write ordering
3. Memory coherency

This discussion has been trying to address the three as if they were the
same. They're not even connected. (Or, at best, only very loosely connected.)

Most modern machines (yes, I should probably use quotes again) are designed
for fast and efficient multiprocessing. The memory interconnect is the big
bottleneck, and a lot of work has gone into streamlining the memory access
and cache protocols. Memory accesses are usually made in "chunks" not
necessarily related to the size of the data type. In general, a machine has a
few "memory access types", very likely a smaller set than the set of
instruction set types. Usually, only "memory access types" are atomic. On an
Alpha EV4 and EV5, for example, the memory access types are 32-bit "words"
and 64-bit "longwords". Smaller accesses require loading and masking a word
or longword. While some machines might choose to hide the distinction between
"memory types" and "instruction types", Alpha doesn't. To read a byte, you
must load the enclosing word/longword, (I'll use "word" for convenience from
now on), and use special instructions to mask/shift into the desired
position.

That's all fine, except when you get into concurrently executing entities
(threads, or processes using shared memory), and one tries to STORE into the
non-atomic field. To store such a field, you fetch the current word, shift &
mask (field insert) your new data, and then write the word back. If another
entity is simultaneously setting a different field in the same word, only ONE
of the two will supply a new value for the entire word, including the other
thread's field. Worse, a non-atomic field (e.g., a 16-bit short rather than a
byte) might be split between two words. If it does, you can get into trouble
even reading, because you have to read both words and combine them to create
the short value you want. Some other thread might have changed one of the
words between your two reads. That's "word tearing", and it means you've
gotten the wrong data. Of course word tearing can happen on writes, too, in
addition to the normal field access problems. Messy!
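
The torn read of a split 16-bit field can be replayed step by step. In this sketch everything is hypothetical: two 32-bit words stand in for memory, the short's low byte sits in the top byte of the first word and its high byte in the bottom byte of the second, and the reader and writer are interleaved by hand in one thread.

```c
#include <stdint.h>

/* Hypothetical memory: two adjacent 32-bit words. */
static uint32_t mem[2];

/* Writer: update both halves of the split 16-bit field. */
static void write_split(uint16_t v)
{
    mem[0] = (mem[0] & 0x00FFFFFFu) | ((uint32_t)(v & 0xFFu) << 24);
    mem[1] = (mem[1] & 0xFFFFFF00u) | (uint32_t)(v >> 8);
}

/* Reader: rebuild the 16-bit value from two separately loaded words. */
static uint16_t combine(uint32_t w0, uint32_t w1)
{
    return (uint16_t)((w0 >> 24) | ((w1 & 0xFFu) << 8));
}

/* The field goes from 0x00FF to 0x0100 between the reader's two loads. */
static uint16_t replay_torn_read(void)
{
    mem[0] = mem[1] = 0;
    write_split(0x00FF);
    uint32_t w0 = mem[0];   /* reader loads the first word   */
    write_split(0x0100);    /* writer updates both words     */
    uint32_t w1 = mem[1];   /* reader loads the second word  */
    return combine(w0, w1);
}
```

The reader ends up with 0x01FF, a value that was never stored: the old low byte paired with the new high byte.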

Attempting to share information between concurrently executing instruction
streams (that may be on separate processors) also requires read/write
ordering. That is, if you set a flag to signal some change in state (e.g.,
adding an item to a queue), you must be able to know that seeing the change
in the flag means you can see the new queue item. Modern SMP memory systems
frequently allow reordering of operations in the CPU to memory controller
pipeline, for all sorts of reasons (including cache synchronization issues,
speculative execution, etc.) So you may queue an item and then set a flag (or
fill in the fields of a structure and then queue it), but have the data
become visible to another processor in a different order. Unless you're
communicating between concurrently executing entities, reordering doesn't
affect you -- so it's a great performance tradeoff. But it means that when
you require ordering, you need to do something extra. One common way to force
ordering is a "memory barrier" instruction. On Alpha, for example, MB
prevents memory operation reordering across the instruction. (One could
consider a stream of memory requests in a pipe between the CPU and memory,
which can be arbitrarily reordered for implementation convenience; but the
reordering agent can't move anything past an "MB token".)

And then we've got memory coherency. A "write-through" cache may invalidate
other caches, and update main memory. But a "write-back" cache may not write
into main memory for some time. Even if other caches are invalidated, the
processors won't see the new value until it's written. That's OK, though, as
long as both processors make proper use of memory barriers. The writer puts a
memory barrier between writing the data and writing the flag (or pointer),
and the reader puts a memory barrier between reading the flag/pointer and
reading the data. Now, whenever the flag/pointer appears in memory, you know
that the data to which it points is valid -- because you can't have read it
before the flag, and the writer can't have written it after the flag.
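
The writer/reader barrier pairing described above maps fairly directly onto C11 fences. This is a hedged sketch: <stdatomic.h> did not exist when this was written, the names payload, ready, publish and try_consume are invented for the example, and the functions are meant to be called from different threads even though nothing here creates one.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* ordinary data being published           */
static atomic_bool ready;  /* flag announcing that payload is valid   */

/* Writer: data first, barrier, then the flag. */
static void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader: flag first, barrier, then the data. */
static bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return true;
}
```

The release fence plays the role of the writer's memory barrier and the acquire fence the reader's; dropping either one removes the guarantee that a reader who sees the flag also sees the data.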

For more information without getting too deep into processor implementation
details, see the section "Memory visibility between threads" in my book
(Programming with POSIX Threads, web link in my .sig). Curt Schimmel's
UNIX Systems for Modern Architectures (Addison-Wesley) has a section called
"Other Memory Models" that describes the SPARC architecture's "Partial Store
Ordering" (loose read/write ordering with memory barriers), though it doesn't
address word tearing.

----

Tim Beckmann wrote:

> Dave Butenhof wrote:
> > > David,
> > >
> > > My thoughts exactly!
> > >
> > > Does anyone know of a mainstream architecture that does this sort of
> > > thing?
> >
> > Oh, absolutely. SPARC, MIPS, and Alpha, for starters. I'll bet most other RISC
> > systems do it, too, because it substantially simplifies the memory subsystem
> > logic. And, after all, the whole point of RISC is that simplicity means speed.
>
> MIPS I know :) The latest MIPS processors R10K and R5K are byte addressable.
> The whole point of RISC is simplicity of hardware, but if it makes the software
> more complex it isn't worth it :)

The whole idea of RISC is *exactly* to make software more complex. That is, by
simplifying the hardware, hardware designers can produce more stable designs that
can be produced more quickly and with more advanced technology to result in faster
hardware. The cost of this is more complicated software. Most of the complexity is
hidden by the compiler -- but you can't necessarily hide everything. Remember that
POSIX took advantage of some loopholes in the ANSI C specification around external
calls to proclaim that you can do threaded programming in C without requiring
expensive and awkward hacks like "volatile". Still, the interpretation of ANSI C
semantics is stretched to the limit. The situation would be far better if a future
version of ANSI C (and C++) *did* explicitly recognize the requirements of threaded
programming.

> > If you stick to int or long, you'll probably be safe. If you use anything
> > smaller, be sure they're not allocated next to each other unless they're under
> > the same lock.
>
> Actually, you can be pretty sure that a compiler will split two declarations
> like:
> char dataA;
> char dataB;
> to be in two separate natural machine words. It is much faster and easier for
> those RISC processors to digest. However if you declare something as:

While that's certainly possible, that's just a compiler optimization strategy. You
shouldn't rely on it unless you know FOR SURE that YOUR compiler does this.

> char data[2]; /* or more than 2 */
> you have to be VERY concerned with the effects of word tearing since the
> compiler will certainly pack them into a single word.

Yes, this packing is required. You've declared an array of "char" sized data, so
each array element had better be allocated exactly 1 char.

> > I wrote a long post on most of the issues brought up in this thread, which
> > appears somewhere down the list due to the whims of news feeds, but I got
> > interrupted and forgot to address this issue.
>
> Yep, I saw it. It was helpful. So was the later post by someone else who
> included a link to a DEC alpha document that explained what a memory barrier
> was in this context. I've seen three different definitions over the years.
> The definition you described in your previous post agreed with the DEC alpha
> description... That a memory barrier basically doesn't allow out of order
> memory accesses to cross the barrier. A very important issue if you are
> implementing mutexes or semaphores :)[...]
>
> However, I really believe that dataA and dataB should both be declared as
> "volatile" to prevent the compiler from being too aggressive in its
> optimization. The mutex still doesn't guarantee that the compiler hasn't
> cached the data in an internal register across a function call. My memory
> isn't perfect, but I do think this bit me on IRIX.

The existence of the mutex doesn't require this, but the semantics of POSIX and
ANSI C do require it. Remember that you lock a mutex by calling a function, passing
an address. While an extraordinarily aggressive C compiler with a global analyzer
might be able to determine reliably that there's no way that call could access the
data you're trying to protect, such a compiler is unlikely -- and, if it existed,
it would simply violate POSIX 1003.1-1996, failing to support threads.

You do NOT need volatile for threaded programming. You do need it when you share
data between "main code" and signal handlers, or when sharing hardware registers
with a device. In certain restricted situations, it MIGHT help when sharing
unsynchronized data between threads (but don't count on it -- the semantics of
"volatile" are too fuzzy). If you need volatile to share data, protected by POSIX
synchronization objects, between threads, then your implementation is busted.
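
For the signal-handler case, the pattern the C standard does support looks roughly like this. It is a minimal sketch using only standard <signal.h> calls; the names got_signal, on_signal and deliver_and_check are made up for the example, and SIGINT is an arbitrary choice.

```c
#include <signal.h>

/* Shared between main code and the handler, hence volatile and
   of the one type the standard promises can be written atomically. */
static volatile sig_atomic_t got_signal;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;   /* the only thing the handler does */
}

/* Installs the handler, raises the signal, and reports the flag.
   Returns -1 if the handler could not be installed or raised. */
static int deliver_and_check(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    signal(SIGINT, SIG_DFL);   /* restore default handling */
    return got_signal;
}
```

Because raise does not return until the handler has run, the flag is already set when deliver_and_check reads it. None of this says anything about sharing the flag between threads.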

> > There are, of course, no absolute guarantees. If you want to be safe and
> > portable, you might do well to have a config header that typedefs
> > "smallest_safe_data_unit_t" to whatever's appropriate for the platform. Then
> > it's just a quick trip to the hardware reference manual when you start a port.
> > On a CISC, you can probably use "char". On most RISC systems, you should use
> > "int" or "long".
>
> There never are guarantees are there :)

To reiterate again one more time, ( ;-) ), the correct (ANSI C) portable type for
atomic access is sig_atomic_t.

> > Yes, this is one more complication to the process of threading old code. But
> > then, it's nothing compared to figuring out which data is shared and which is
> > private, and then getting the locking protocols right.
>
> But what fun would it be if it wasn't a challenge :)

Well, yeah. That's my definition of "fun". But not everyone's. Sometimes "boring
and predictable" can be quite comforting.

> However, I would like to revisit the original topic of whether it is "safe" to
> change a single byte without a mutex. Although, instead of "byte" I'd like to
> say "natural machine word" to eliminate the word tearing and non-atomic memory
> access concerns. I'm not sure it's safe to go back to the original topic, but
> what the heck ;)

sig_atomic_t.

> If you stick to a "natural machine word" that is declared as "volatile",
> you do not absolutely need a mutex (in fact I've done it). Of course, there are
> only certain cases where this works and shouldn't be done unless you really know
> your hardware architecture and what you're doing! If you have a machine with a
> lot of processors, unnecessarily locking mutexes can really kill parallelism.
>
> I'll give one example where this might be used:
>
> volatile int stop_flag = 0; /* assuming an int is atomic */
>
> thread_1
> {
> /* bunch of code */
>
> if some condition exists such that we wish to stop thread_2
> stop_flag = 1;
>
> /* more code - or not :) */
> }
>
> thread_2
> {
> while(1)
> {
> /* check if thread should stop */
> if (stop_flag)
> break;
>
> /* do whatever is going on in this loop */
> }
> }
>
> Of course, this assumes the hardware has some sort of cache coherency
> mechanism. But I don't believe POSIX mutexes or memory barriers (as
> defined for the DEC alpha) have any impact on cache coherency.

If a machine has a cache, and has no mechanism for cache coherency, then it can't
work as a multiprocessor.

> The example is simplistic, but it should work on a vast majority of
> systems. In fact the stop_flag could just as easily be a counter
> of some sort as long as only one thread is modifying the counter...

In some cases, yes, you can do this. But, especially with your "stop_flag",
remember that, if you fail to use a mutex (or other POSIX-guaranteed memory
coherence operation), a thread seeing stop_flag set CANNOT assume anything about
other program state. Nor can you ensure that any thread will see the changed value
of stop_flag in any particular bounded time -- because you've done nothing to
ensure memory ordering, or coherency.

And remember very carefully that bit about "as long as only one thread is
modifying". You cannot assume that "volatile" will ever help you if two threads
might modify the counter at the same time. On a RISC machine, "modify" still means
load, modify, and store, and that's not atomic. You need special instructions to
protect atomicity across that sequence (e.g., load-lock/store-conditional, or
compare-and-swap).
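
For a counter that more than one thread may modify, the load/modify/store sequence has to be made atomic as a whole. A sketch with C11 atomics follows; these postdate this discussion, and the names counter and increment are hypothetical.

```c
#include <stdatomic.h>

static atomic_int counter;

/* Increment via compare-and-swap: the store only succeeds if the
   counter still holds the value that was loaded. Returns the new value. */
static int increment(void)
{
    int old = atomic_load(&counter);
    while (!atomic_compare_exchange_weak(&counter, &old, old + 1))
        ;   /* old was refreshed with the current value; retry */
    return old + 1;
}
```

A plain "volatile int" incremented with ++ would still compile to separate load and store steps, which is exactly the window the retry loop closes.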

Am I trying to scare you? Yeah, sure, why not? If you really feel the need to do
something like this, do yourself (and your project) the courtesy of being EXTREMELY
frightened about it. Document it in extreme and deadly detail, and write that
documentation as if you were competing with Stephen King for "best horror story of
the year". I mean to the point that if someone takes over the project from you, and
doesn't COMPLETELY understand the implications, they'll be so terrified of the risk
that they'll rip out your optimizations and use real synchronization. Because this
is just too dangerous to use without full understanding.

There are ways to ensure memory ordering and coherency without using any POSIX
synchronization mechanisms, on any machine that's capable of supporting POSIX
semantics. It's just that you need to be really, really careful, and you need to be
aware that you're writing extremely machine-specific (and therefore inherently
non-portable) code. Some of this is "more portable" than others, but even the
"fairly portable" variants (like your stop_flag) are subject to a wide range of
risks. You need to be aware of them, and willing to accept them. Those who aren't
willing to accept those risks, or don't feel inclined to study and fully understand
the implications of each new platform to which they might wish to port, should
stick with mutexes.
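
One modern way to spell such ordering is C11 release/acquire atomics. This is a sketch under the assumption that a C11 compiler is available (nothing like it existed when this was written); payload, ready, publish and try_consume are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;            /* ordinary data, written before the flag */
static atomic_int ready = 0;   /* flag that other threads poll */

/* Writer: store the data, then set the flag with release ordering. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader: returns 1 and copies the data to *out once the flag is seen. */
int try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return 0;
    *out = payload;            /* ordered after the acquire load */
    return 1;
}
```

A reader that sees the flag set through the acquire load may rely on the payload written before the release store; a plain int flag gives no such assurance.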

----

mma...@dazel.com wrote:

> Dave Butenhof <bute...@zko.dec.com> wrote:
> > There are, of course, no absolute guarantees. If you want to be safe and
> > portable, you might do well to have a config header that typedefs
> > "smallest_safe_data_unit_t" to whatever's appropriate for the platform. Then
> > it's just a quick trip to the hardware reference manual when you start a
> > port.
> > On a CISC, you can probably use "char". On most RISC systems, you should use
> > "int" or "long".
>
> If I'm not mistaken, isn't that spelled:
>
> #include <signal.h>
>
> typedef sig_atomic_t smallest_safe_data_unit_t;

You are not mistaken, and thank you very much for pointing that out. While I'd
been aware at some point of the existence of that type, it was far from the top
of my mind.

If you have data that you intend to share without explicit synchronization, you
should be safe in using sig_atomic_t. Additionally, using sig_atomic_t will
protect you against word tearing in adjacent data protected by separate mutexes.

There are additional performance considerations, such as "false sharing" effects
in cache systems, that might dictate larger separations between two shared pieces
of data: but those won't affect program CORRECTNESS, and are therefore more a
matter of tuning for optimal performance on some particular platform.
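
A sketch of that kind of separation, using C11 alignas. The 64-byte line size is an assumption that has to be checked for each platform, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64   /* assumption: check the hardware manual */

/* Each counter is aligned to, and therefore padded out to, a full line. */
struct padded_counter {
    alignas(CACHE_LINE_SIZE) long value;
};

/* Two counters intended to be updated by two different threads. */
struct per_thread_counters {
    struct padded_counter a;
    struct padded_counter b;
};

static_assert(sizeof(struct padded_counter) % CACHE_LINE_SIZE == 0,
              "padded_counter should fill whole cache lines");

/* Distance in bytes between the two counters. */
size_t counter_separation(void)
{
    return offsetof(struct per_thread_counters, b)
         - offsetof(struct per_thread_counters, a);
}
```

This trades memory for speed; it changes nothing about correctness.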

----

David Holmes wrote:

> Dave Butenhof <bute...@zko.dec.com> wrote in article
> <34FD6950...@zko.dec.com>...
> > If you've got
> >
> > pthread_mutex_t mutexA = PTHREAD_MUTEX_INITIALIZER;
> > pthread_mutex_t mutexB = PTHREAD_MUTEX_INITIALIZER;
> >
> > char dataA;
> > char dataB;
> >
> > And one thread locks mutexA and writes dataA while another locks mutexB and
> > writes dataB, you risk word tearing, and incorrect results. That's a "platform
> > issue", that, as someone else commented, POSIX doesn't (and can't)
> > address.
>
> That's a pretty serious impediment to threaded programming. How do you know
> how things have been allocated? Isn't the compiler free to re-arrange data
> declarations to 'optimise' alignment etc? Putting the data and mutex
> together in a struct might work for C, but what about C++ where the data
> and mutex are already within the object? Do you need to put structs within
> the class as well?

You need to avoid word-tearing. You should avoid sharing "potentially
adjacent" data of a size smaller than sig_atomic_t. Is that a serious
impediment to threaded programming? Sure, if you're dependent on using
arbitrary data sizes in arbitrary arrangements, and want your code to be
portable. On the other hand, if that's your most serious concern in porting
between differing machine architectures, then your code is probably pretty
clean -- dealing with this shouldn't be a big deal, and it'll serve to improve
your portability even further. (Surely a worthwhile goal?)
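
A minimal sketch of that advice: keep each datum next to its own mutex and make it at least as wide as sig_atomic_t. No standard promises that this prevents word tearing, so it remains a platform assumption. The typedef name comes from the earlier suggestion in this thread; guarded_cell, cell_a, cell_b and the accessors are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* One place to change when porting, as suggested earlier in the thread. */
typedef sig_atomic_t smallest_safe_data_unit_t;

/* Data and the mutex that protects it travel together. */
struct guarded_cell {
    pthread_mutex_t lock;
    smallest_safe_data_unit_t data;
};

static struct guarded_cell cell_a = { PTHREAD_MUTEX_INITIALIZER, 0 };
static struct guarded_cell cell_b = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Store v into the cell while holding its own lock. */
void guarded_set(struct guarded_cell *g, smallest_safe_data_unit_t v)
{
    assert(g != NULL);
    pthread_mutex_lock(&g->lock);
    g->data = v;
    pthread_mutex_unlock(&g->lock);
}

/* Read the cell while holding its own lock. */
smallest_safe_data_unit_t guarded_get(struct guarded_cell *g)
{
    assert(g != NULL);
    pthread_mutex_lock(&g->lock);
    smallest_safe_data_unit_t v = g->data;
    pthread_mutex_unlock(&g->lock);
    return v;
}
```

Because each datum is a full sig_atomic_t, a store to cell_a.data is not expected to need a read-modify-write of storage belonging to cell_b on the platforms discussed here.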

> When your average programmer sits down at their SparcWorkstation to write a
> neat little POSIX pthreads program, how on earth are they supposed to know
> about this? This isn't mentioned in your book Dave, nor in others that
> discuss using threads on Solaris.

It's in my section on "Memory visibility between threads". Or at least, I
discuss what "word tearing" means, and why it's of concern. You are, however,
correct in pointing out that I failed to warn the reader that it could be a
problem even WITH mutexes, if the separate data that share "memory access
logic" are protected by separate mutexes. Well, sorry. Even I'm not perfect
all the time! (As if anyone really needed proof of that statement.) I'll make
a note of this for the next time I update the text. (I'll also add a mention
of Mike Martin's fine advice to make use of the ANSI C "atomic data type",
sig_atomic_t, which will avoid word tearing.)

regards,
alexander.

[***] http://groups.google.com/groups?selm=34FC0286.6A889AF5%40zko.dec.com
http://groups.google.com/groups?selm=3503EB30.68D23EE6%40zko.dec.com
http://groups.google.com/groups?selm=3503E02C.180387F5%40zko.dec.com
http://groups.google.com/groups?selm=3503E333.8147C81B%40zko.dec.com

Fergus Henderson

Aug 1, 2002, 11:24:42 AM

t...@cs.ucr.edu writes:

>Mark Williams <kmar...@yahoo.com> wrote:
>: t...@cs.ucr.edu wrote in message news:<ai81kb$hv7$2...@glue.ucr.edu>...

>:> The volatile qualification on an object's declaration suspends the
>:> normal guarantees that the object will remember its value until it is
>:> modified by some kind of write and that modifications of its value are
>:> not externally visible. That suspension blocks many optimizations
>:> that we consider normal and forces a lot of fetching and write-backs at
>:> sequence points.
>:>
>:> False sharing occurs when there is occupied left-over space on the
>:> word at one end of an object or the other. Writing the object
>:> requires writing the appropriate bits to that occupied left-over
>:> space, which requires that we read the bits that are originally there
>:> and then write them back. But if that other object gets modified in
>:> the interim, we have a lost-update problem.
>
>: But if that other object can get modified in the interim, it must be
>: declared volatile anyway... see your first paragraph...
>
>IMHO, declaring that other object to be volatile imposes a great deal
>of overhead and is neither necessary nor sufficient for correct
>operation.

Have you measured the overhead?

>The overhead that implementations must incur to cope with volatility is
>well known so let's skip that part.
>
>Unnecessary: We can cure the problem by using partial-word writes.

OK. But I'm not convinced that this is going to be significantly more
efficient than `volatile'. Maybe it is, but personally I doubt it.

>Insufficiency: To cope with volatility (behave as if we) re-read the
>object's value after each sequence point. But all that does is to
>make sure that we have up-to-date corrupted data. If the correct
>value gets clobbered via a write to a neighbor, it gets clobbered.

How would the correct value get clobbered?
For the correct value to get clobbered, the thread which
clobbered it would have to be ignoring `volatile', wouldn't it?

In which case, well, don't use compilers that ignore `volatile'.

Alexander Terekhov

Aug 1, 2002, 12:17:31 PM

David Butenhof wrote:
[...]

> Array elements couldn't be isolated by alignment,

Why not treat "isolated" elements of ALL aggregates completely as beasts
of a somewhat "special" type... sizeof( isolated whatever ) >= sizeof(
whatever ), but "isolated_whatever <-copy/assignment-> whatever" should
be OK [value wise]?

Non-aggregated objects [static, dynamic, auto] need NOT be "isolated"
explicitly [that shouldn't even be allowed to be specified for
non-aggregated stuff, I think] and ALL THAT ought to be clearly spelled
out in future C/C++/POSIX standard(s)... otherwise that would definitely
break way too much code to be feasible, I believe. ;-)

> so it'd either need to use
> byte access or atomic instruction sequences. (You could make sizeof(isolate
> char) be 4 instead of 1... but that'd probably break way too much code to
> be feasible.)

I don't think so. Such code isn't "portable" today... since it is
vulnerable w.r.t. word-tearing [unless it's designated using something
like "DEC/Compaq/NewHP-volatile-with-strong-volatile-switch"*** instead
of hypothetical future standard {and much less ugly than misuse of
standard C/C++ volatile} "isolated" type qualifier ;-)] to begin with.
Or am I missing something?

regards,
alexander.

[***]

http://www.tru64unix.compaq.com/docs/base_doc/DOCUMENTATION/V51_HTML/ARH9RBTE/DOCU0008.HTM

"....
3.7.5.2 Maintaining the Composite Data Object's Layout

If you cannot change the organization or layout of the
composite data object's definition, you should do one
of the following:

(On OpenVMS Alpha or OpenVMS VAX) Compile all application
modules for byte actual granularity. Doing so automatically
prevents word-tearing race conditions for structure or
union members and array elements of size byte or
larger that are accessed concurrently by different threads.
No other program modification is required. This may have
a performance penalty on Alpha EV4 and EV5 processors.

Or,

(On Tru64 UNIX systems) For arrays, add the C language
volatile storage qualifier to the definition of the
entire array; for structures, add volatile to the
declaration of only those members that share the
pertinent memory granule. You must also compile the
application's modules using the Compaq C or Compaq C++
compiler's -strong-volatile switch. Doing so causes
the compiler to produce code that forces all accesses
to those members to occur as atomic operations. See
the description of the -strong-volatile switch
in the Compaq C or Compaq C++ documentation
and on the cc reference page.
This may also have a severe performance penalty.
...."

http://www.tru64unix.compaq.com/dtk/misc_docs/man/cc.1.html

"....
-strong_volatile
Affects the generation of code for assignments to objects that are less
than or equal to 16 bits in size (for instance char, short) that have
been declared as volatile. The generated code includes a load-locked
instruction for the enclosing longword or quadword, an insertion of the
new value of the object, and a store-conditional instruction for the
enclosing longword or quadword. By using this locked instruction
sequence for byte and word stores, the -strong_volatile option allows
byte and word access of data at byte granularity. This means that
assignments to adjacent volatile small objects by different threads in
a multithreaded program will not cause one of the objects to receive an
incorrect value.
...."
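
The sequence described in that excerpt (load the enclosing word, insert the new byte, store conditionally) can be approximated in portable C11 with a compare-and-swap loop. This is only a sketch of the idea, not what the Compaq compiler emits; the function names are hypothetical, and lanes are numbered by value (lane 0 is the low 8 bits), so nothing here depends on byte order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Store one byte into lane 0..3 of a 32-bit word without disturbing
 * the other three lanes, even if other threads update them meanwhile. */
void store_byte_in_word(_Atomic uint32_t *word, unsigned lane, uint8_t value)
{
    assert(word != NULL && lane < 4);
    unsigned shift = lane * 8u;
    uint32_t mask = (uint32_t)0xFFu << shift;
    uint32_t old = atomic_load(word);
    uint32_t desired;
    do {
        /* Rebuild the word from the freshest copy on every retry. */
        desired = (old & ~mask) | ((uint32_t)value << shift);
    } while (!atomic_compare_exchange_weak(word, &old, desired));
}

/* Read one lane back out of the word. */
uint8_t load_byte_from_word(_Atomic uint32_t *word, unsigned lane)
{
    assert(word != NULL && lane < 4);
    return (uint8_t)(atomic_load(word) >> (lane * 8u));
}
```

If another thread changes a neighbouring lane between the load and the exchange, the exchange fails and the loop retries with the updated word, so neither update is lost.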

Alexander Terekhov

Aug 1, 2002, 1:19:01 PM

Fergus Henderson wrote:
>
> Alexander Terekhov <tere...@web.de> writes:
>
> >Mark Williams wrote:
> >[...]
> >> But if that other object can get modified in the interim, it must be
> >> declared volatile anyway... see your first paragraph...
> >
> >Gee, please forget volatiles
>
> Why?

Because "What constitutes an access to an object that has volatile-qualified
type is implementation-defined" [ISO/IEC 9899:1999 (E), Pg. 109], to begin with.

> >[unless you really want to {re-}invent
> >something similar to Java's revised volatiles with its load-acquire
> >and store-release memory synchronization semantics] w.r.t. threading.
>
> What would be wrong with that?

Nothing wrong [also, please agree and provide me a new portable interface
for non-blocking/lock-less reference counting with optimizations for immutable
objects, please], I guess. But to me, *C/C++ volatiles*, which were meant to
provide support for "sharing" a hardware register/port*** with some device, are
completely different from sharing memory between threads (or processes).

> > "volatile" was invented for device registers. The mistake was to overload
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > it for setjmp and signal handlers.
>
> Why was that a mistake?

Ask Mr. Henry Spencer (he...@zoo.toronto.edu). Well, I agree because,
to my way of thinking, jumps and signals have really nothing to do with
"sharing" a hardware register/port with some device and the specification
that says {for really good reasons} that: "What constitutes an access to
an object that has volatile-qualified type is implementation-defined".

regards,
alexander.

[***] Oh, BTW, consider: http://www.rtj.org/rtsj-V1.0.pdf

"....
Raw Memory Access

An instance of RawMemoryAccess models a range of physical memory as a
fixed sequence of bytes. A full complement of accessor methods allow
the contents of the physical area to be accessed through offsets from
the base, interpreted as byte, short, int, or long data values or as
arrays of these types. Whether the offset addresses the high-order or
low-order byte is based on the value of the BYTE_ORDER static boolean
variable in class RealtimeSystem. The RawMemoryAccess class allows a
real-time program to implement device drivers, memory-mapped I/O, flash
memory, battery-backed RAM, and similar lowlevel software.

A raw memory area cannot contain references to Java objects. Such a
capability would be unsafe (since it could be used to defeat Java's
type checking) and errorprone (since it is sensitive to the specific
representational choices made by the Java compiler).
...."

Wojtek Lerch

Aug 1, 2002, 1:53:09 PM

kmar...@yahoo.com (Mark Williams) wrote in message news:<7adff5d1.02073...@posting.google.com>...

> t...@cs.ucr.edu wrote in message news:<ai81kb$hv7$2...@glue.ucr.edu>...

> > The volatile qualification on an object's declaration suspends the
> > normal guarantees that the object will remember its value until it is
> > modified by some kind of write and that modifications of its value are
> > not externally visible. That suspension blocks many optimizations
> > that we consider normal and forces a lot of fetching and write-backs at
> > sequence points.
> >
> > False sharing occurs when there is occupied left-over space on the
> > word at one end of an object or the other. Writing the object
> > requires writing the appropriate bits to that occupied left-over
> > space, which requires that we read the bits that are originally there
> > and then write them back. But if that other object gets modified in
> > the interim, we have a lost-update problem.
>
> But if that other object can get modified in the interim, it must be
> declared volatile anyway... see your first paragraph...

Wouldn't that mean that all static variables in a multithreaded
program must be declared volatile, just in case a compiler picks two
of them at random and allocates adjacent storage for them?

Actually, since the C Standard doesn't guarantee that two *automatic*
variables in two different functions can't occupy adjacent storage,
doesn't that mean that I have to make *all* my variables volatile?

Wouldn't that make the "volatile" qualifier pretty much meaningless in
a multithreaded program?

Wouldn't it be better to admit that not all C implementations are
suitable for compiling multithreaded programs? I don't think false
sharing is the only problem that you can possibly have if your
compiler wasn't designed with multithreading in mind. Imagine, for
instance, a compiler that uses an internal global variable to store
the address of the current stack frame. There's no reason such a
compiler couldn't be a conforming C implementation, is there?

t...@cs.ucr.edu

Aug 1, 2002, 5:02:41 PM

In comp.std.c Fergus Henderson <f...@cs.mu.oz.au> wrote:
: t...@cs.ucr.edu writes:

: >Mark Williams <kmar...@yahoo.com> wrote:
: >: t...@cs.ucr.edu wrote in message news:<ai81kb$hv7$2...@glue.ucr.edu>...
: >:> The volatile qualification on an object's declaration suspends the
: >:> normal guarantees that the object will remember its value until it is
: >:> modified by some kind of write and that modifications of its value are
: >:> not externally visible. That suspension blocks many optimizations
: >:> that we consider normal and forces a lot of fetching and write-backs at
: >:> sequence points.
: >:>
: >:> False sharing occurs when there is occupied left-over space on the
: >:> word at one end of an object or the other. Writing the object
: >:> requires writing the appropriate bits to that occupied left-over
: >:> space, which requires that we read the bits that are originally there
: >:> and then write them back. But if that other object gets modified in
: >:> the interim, we have a lost-update problem.
: >
: >: But if that other object can get modified in the interim, it must be
: >: declared volatile anyway... see your first paragraph...
: >
: >IMHO, declaring that other object to be volatile imposes a great deal
: >of overhead and is neither necessary nor sufficient for correct
: >operation.

: Have you measured the overhead?

Yes -- a simple test loop that increments three variables took 25% longer
when those variables were volatile.

: >The overhead that implementations must incur to cope with volatility is
: >well known so let's skip that part.
: >
: >Unnecessary: We can cure the problem by using partial-word writes.

: OK. But I'm not convinced that this is going to be significantly more
: efficient than `volatile'. Maybe it is, but personally I doubt it.

Since the read/modify/write operation is handled automatically in
hardware, I expect that the overhead would be far less than the
additional read and write instructions associated with volatility.
I agree however that this conjecture deserves to be tested.

: >Insufficiency: To cope with volatility (behave as if we) re-read the
: >object's value after each sequence point. But all that does is to
: >make sure that we have up-to-date corrupted data. If the correct
: >value gets clobbered via a write to a neighbor, it gets clobbered.

: How would the correct value get clobbered?
: For the correct value to get clobbered, the thread which
: clobbered it would have to be ignoring `volatile', wouldn't it?

: In which case, well, don't use compilers that ignore `volatile'.

Hmmmm. AFAIK, for a thread to honor the volatility of an object:

* Each time it modifies the object that thread must write out
the new value prior to the next sequence point (of that thread).

* That thread must never use a value of that object that is older
than the most recent sequence point (of that thread).

Obviously, you have something different in mind.

Tom Payne

t...@cs.ucr.edu

Aug 1, 2002, 6:39:37 PM

In comp.std.c Mark Williams <kmar...@yahoo.com> wrote:

: t...@cs.ucr.edu wrote in message news:<ai8sbq$p8r$3...@glue.ucr.edu>...

:> In comp.std.c Mark Williams <kmar...@yahoo.com> wrote:

[...]
:> : But if that other object can get modified in the interim, it must be
:> : declared volatile anyway... see your first paragraph...
:>
:> IMHO, declaring that other object to be volatile imposes a great deal
:> of overhead and is neither necessary nor sufficient for correct
:> operation.
:>
:> The overhead that implementations must incur to cope with volatility is
:> well known so let's skip that part.
:>
:> Unnecessary: We can cure the problem by using partial-word writes.

: I'm not sure what you're saying here... certainly you can cure the
: problem of false sharing; you cannot cure the problem that when you
: want to access the other word (and you presumably do want to do that),
: the compiler may not actually do so (because it thinks it has the
: value cached somewhere).

If I understand correctly, we are considering two threads, t1 and t2,
that access two adjacent objects, o1 and o2 respectively, where o1 and
o2 share a word in common. t1 reads o1 and writes a new value to o1
using an atomic partial-word write operation that leaves the portion
of o2 that is on their common word unchanged. Now, you say that t2
may not read that portion of o2, "because it thinks that it has the
value cached somewhere." How would t2 have any idea of what t1 has
cached? These are independent threads, running on different
processors. They should each mind their own business. (I'm presuming
that the hardware architects can implement coherent caches.)
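
For contrast, the lost update that a non-atomic sub-word store risks can be replayed deterministically in a single thread. This sketch assumes a hypothetical machine whose only store is a full 32-bit word; t1 owns lane 0 (o1) and t2 owns lane 1 (o2). All names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First half of a non-atomic sub-word store: fetch the whole word. */
uint32_t rmw_load(const uint32_t *word)
{
    assert(word != NULL);
    return *word;
}

/* Second half: merge the new byte into the (possibly stale) copy that
 * was loaded earlier, then write the whole word back. */
void rmw_store(uint32_t *word, uint32_t loaded, unsigned lane, uint8_t value)
{
    assert(word != NULL && lane < 4);
    unsigned shift = lane * 8u;
    uint32_t mask = (uint32_t)0xFFu << shift;
    *word = (loaded & ~mask) | ((uint32_t)value << shift);
}

/* Replays the interleaving: t1 loads, t2 loads, t1 stores, t2 stores. */
uint32_t replay_lost_update(void)
{
    uint32_t word = 0;
    uint32_t t1_copy = rmw_load(&word);
    uint32_t t2_copy = rmw_load(&word);
    rmw_store(&word, t1_copy, 0, 0xAA);  /* t1 writes o1 */
    rmw_store(&word, t2_copy, 1, 0xBB);  /* t2 writes o2 from a stale copy */
    return word;
}
```

The replay ends with 0x0000BB00 rather than 0x0000BBAA: t1's write to o1 is overwritten by t2's stale copy of the shared word. With a true partial-word write, t2 would never touch lane 0 at all.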

:> Insufficiency: To cope with volatility (behave as if we) re-read the
:> object's value after each sequence point. But all that does is to
:> make sure that we have up-to-date corrupted data. If the correct
:> value gets clobbered via a write to a neighbor, it gets clobbered.
:> Re-reading it doesn't restore the correct value.

: Wrong, because if the other object is declared volatile, the standard
: forbids the compiler from writing it except as specified by the
: program; thus the value cannot get clobbered via a write to a
: neighbor...

IIRC, the standard says that writes to volatiles cannot be delayed past
the first subsequent sequence point and that the program cannot count
on volatiles retaining their value.

: I fully understand the deficiencies of volatile, but it is certainly
: sufficient to prevent false sharing as described in this thread, and
: also necessary in the case that an object changes its value behind the
: program's back (failure to declare such an object volatile results in
: UB if I recall correctly).

IIRC, it's the other way around. Non-volatile objects are required to
retain their values. Volatiles might be input registers and are not
required to remember.

Tom Payne

Fergus Henderson

Aug 1, 2002, 10:00:48 PM

t...@cs.ucr.edu writes:

>In comp.std.c Mark Williams <kmar...@yahoo.com> wrote:
>:> Insufficiency: To cope with volatility (behave as if we) re-read the
>:> object's value after each sequence point. But all that does is to
>:> make sure that we have up-to-date corrupted data. If the correct
>:> value gets clobbered via a write to a neighbor, it gets clobbered.
>:> Re-reading it doesn't restore the correct value.
>
>: Wrong, because if the other object is declared volatile, the standard
>: forbids the compiler from writing it except as specified by the
>: program; thus the value cannot get clobbered via a write to a
>: neighbor...
>
>IIRC, the standard says that writes to volatiles cannot be delayed past
>the first subsequent sequence point

Yes...

>and that the program cannot count on volatiles retaining their value.

The language in the standard is a bit ambiguous, and it is often
misinterpreted that way, but IMHO it means that the *implementation*
cannot count on volatiles retaining their value.

>: I fully understand the deficiencies of volatile, but it is certainly
>: sufficient to prevent false sharing as described in this thread, and
>: also necessary in the case that an object changes its value behind the
>: program's back (failure to declare such an object volatile results in
>: UB if I recall correctly).

Yes.

>IIRC, it's the other way around. Non-volatile objects are required to
>retain their values. Volatiles might be input registers and are not
>required to remember.

Nope. All variables are required to retain their last-stored value.
It's just that for volatile variables, the last store may not be
explicit in the program (e.g. it can be done by another thread, or
by hardware, etc.). See C99 6.2.4 [#2].

Fergus Henderson

Aug 1, 2002, 10:19:01 PM

Wojt...@yahoo.ca (Wojtek Lerch) writes:

>kmar...@yahoo.com (Mark Williams) wrote in message news:<7adff5d1.02073...@posting.google.com>...
>> t...@cs.ucr.edu wrote in message news:<ai81kb$hv7$2...@glue.ucr.edu>...
>> > The volatile qualification on an object's declaration suspends the
>> > normal guarantees that the object will remember its value until it is
>> > modified by some kind of write and that modifications of its value are
>> > not externally visible. That suspension blocks many optimizations
>> > that we consider normal and forces a lot of fetching and write-backs at
>> > sequence points.
>> >
>> > False sharing occurs when there is occupied left-over space on the
>> > word at one end of an object or the other. Writing the object
>> > requires writing the appropriate bits to that occupied left-over
>> > space, which requires that we read the bits that are originally there
>> > and then write them back. But if that other object gets modified in
>> > the interim, we have a lost-update problem.
>>
>> But if that other object can get modified in the interim, it must be
>> declared volatile anyway... see your first paragraph...
>
>Wouldn't that mean that all static variables in a multithreaded
>program must be declared volatile, just in case a compiler picks two
>of them at random and allocates adjacent storage for them?
>
>Actually, since the C Standard doesn't guarantee that two *automatic*
>variables in two different functions can't occupy adjacent storage,
>doesn't that mean that I have to make *all* my variables volatile?

Hmm. Yes.

>Wouldn't that make the "volatile" qualifier pretty much meaningless in
>a multithreaded program?
>
>Wouldn't it be better to admit that not all C implementations are
>suitable for compiling multithreaded programs? I don't think false
>sharing is the only problem that you can possibly have if your
>compiler wasn't designed with multithreading in mind. Imagine, for
>instance, a compiler that uses an internal global variable to store
>the address of the current stack frame. There's no reason such a
>compiler couldn't be a conforming C implementation, is there?

Certainly.

The question is not whether all possible conforming C implementations
are suitable for multithreaded programming -- clearly they are not.
The question is what additional properties, other than those guaranteed
by the C standard, is it reasonable to rely on?

Clearly it is reasonable to rely on there being no false sharing of
distinct automatic objects, since not doing so leads to the
unreasonable conclusion that all variables must be declared volatile.

I guess it would be reasonable to also rely on there being no false
sharing of distinct static objects, since a compiler which avoids false
sharing of distinct automatic objects should be able to also prevent
false sharing of static objects -- the additional burden on the
implementor would be small.

t...@cs.ucr.edu

Aug 1, 2002, 11:57:12 PM

In comp.std.c Fergus Henderson <f...@cs.mu.oz.au> wrote:
: t...@cs.ucr.edu writes:

[...]
:>IIRC, it's the other way around. Non-volatile objects are required to
:>retain their values. Volatiles might be input registers and are not
:>required to remember.

: Nope. All variables are required to retain their last-stored value.
: It's just that for volatile variables, the last store may not be
: explicit in the program (e.g. it can be done by another thread, or
: by hardware, etc.). See C99 6.2.4 [#2].

Sounds to me like a distinction in search of a difference. What's the
difference between saying that an object need not remember its last
stored value and saying that its last stored value might have resulted
from a gamma ray from Alpha Centauri and/or from some other arbitrary
event? An input register is an input register. The values stored in
input registers change arbitrarily and capriciously. They don't
remember their values. What more can one say?

Tom Payne

David Hopwood

Aug 1, 2002, 10:29:46 PM

-----BEGIN PGP SIGNED MESSAGE-----

Alexander Terekhov wrote (quoting Butenhof?):
> [...] make use of the ANSI C "atomic data type", sig_atomic_t, which will
> avoid word tearing.

Since when is there any guarantee that declaring variables as sig_atomic_t
will avoid word tearing for those variables? Maybe it will in practice, but
there's no spec that says so.

What is needed is something similar to the Java memory model requirement
that values cannot "come out of thin air" (i.e. roughly speaking, a value
read from any variable must have been previously written to that variable,
with some additional ordering constraints). This has little or nothing to do
with the semantics of sig_atomic_t (or volatile), which the C99 Standard
only defines for single-threaded programs.

- --
David Hopwood <david....@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBPUnuezkCAxeYt5gVAQGwRQgAvPUdpKHEjk66dqr89HsmdP/G3YXoGHoZ
aL/lc5Lf284Uxv1l+Cuq/a3GkJ0t77X7VDuaZdG3UtMOsF3Xgue49MOY3Z5Rtt6B
CDec67e7tsguysKVS7nA52Iptoq/tINhS4Pjny8TqcDTIEqq2cSCCVSvwI7FLNs/
NHNHq5iwj596AxyM9baFx9zz3FDuxur6NFaVCzG3vv1H+jsjl7b5ueOKUqSDO+sM
SituYXaUhy8LN8uXFEhTdRRtx182RzEwHVd/6VSe4k4AgZaIRPPRJA1CSboqBbC8
429wYzBeJiltHnQzgrCTEGG1YvO95HkIOIZj6cTX9wqyc4SjqmfVbw==
=6oxb
-----END PGP SIGNATURE-----

David Hopwood

Aug 1, 2002, 10:03:41 PM

-----BEGIN PGP SIGNED MESSAGE-----

Jim Rogers wrote:
> Java's solution to this problem is to ignore actual hardware architecture
> and require the VM to control Java threads as though atomicity and
> word size were constants. C lives in the world of real hardware, just as
> Ada does. C needs to devise a solution that recognizes the realities of
> such an environment when dealing with atomic access to data.

I'd like to correct any misconception that Java [i.e. the language and VM
design] doesn't "live in the world of real hardware".

The approach that the Java and JVM specifications take is that references and
all primitive types of 32 bits or less must be accessed atomically, but 64-bit
types need not be (by default). In practice, this requirement can be implemented
with no loss of efficiency on a wide range of hardware, including essentially
all architectures that it would be feasible to implement the rest of the
Java specification on.

Note that atomic access to pointers [*] is absolutely required in order to
ensure type safety in a multithreaded, GC'd language. Since at least a 32-bit
address space is pretty much essential to implement Java (it isn't implementable
in 64Kbytes), there is no penalty in specifying that all types of 32 bits or less
must be accessed atomically. The fact that the 64-bit 'long' and 'double' types
do not have to be accessed atomically is very much a concession to "real hardware"
(since these are not pointer types, there is no resulting problem with type
safety).

In any case, an architecture that did not support atomic accesses to pointer
variables (regardless of size), would have to be considered severely deficient
and unsuitable for running multithreaded programs. The designers of Java weren't
inclined to compromise the guarantees made to application programmers just on
the basis of portability to platforms that could theoretically exist, but
actually don't, and are not likely to exist in future.

[*] a reference is normally implemented as a pointer. If it isn't, that
doesn't really affect the argument: the Java implementation must
provide atomicity for references, and to do that efficiently it needs
the underlying hardware to provide atomicity for pointers, almost
regardless of how a reference is represented.

- --
David Hopwood <david....@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBPUnoMDkCAxeYt5gVAQHF4Qf9G/Uqi28gB3ckxUPKio8yK9nQ/1X98MV4
zsBx4K5tm1arIte8/ap9nHCNBZlAc46AucCGChzB+OrHqSsuJC0aW9C2slGxSPlP
3IoiI61HZnj9KGBpJ4Cpoul3QsCUNZF05Hq0izLrxO6CYQNWMzIt+4Nf0ifIL0Ev
mVFLKfePwrSRxfZTavGfsL5f7pmihke5+I+QOE6c3SYBLt9mpHD8cptKOFPxEOMS
J1DdE2qco9EwNfWO2/E4QAGsHxfBbp0w9VXYezrZRX8XNnIHGD9SQvEyIgSvkdcq
V3ERk4BEgV5JyVRLnDiMV+5S2i+wt/BpRBWTiSjv/YtMvoQGiSslWw==
=naB+
-----END PGP SIGNATURE-----

t...@cs.ucr.edu

Aug 2, 2002, 7:15:40 AM

In comp.std.c David Hopwood <david....@zetnet.co.uk> wrote:

: -----BEGIN PGP SIGNED MESSAGE-----

: Alexander Terekhov wrote (quoting Butenhof?):
:> [...] make use of the ANSI C "atomic data type", sig_atomic_t, which will
:> avoid word tearing.

: Since when is there any guarantee that declaring variables as sig_atomic_t
: will avoid word tearing for those variables? Maybe it will in practice, but
: there's no spec that says so.

: What is needed is something similar to the Java memory model requirement
: that values cannot "come out of thin air" (i.e. roughly speaking, a value
: read from any variable must have been previously written to that variable,
: with some additional ordering constraints). This has little or nothing to do
: with the semantics of sig_atomic_t (or volatile), which the C99 Standard
: only defines for single-threaded programs.

Moreover, the standard only guarantees atomicity of writes by signal
handlers to data of type sig_atomic_t, and only when the object is
also declared to be volatile. Objects of type sig_atomic_t are not
guaranteed to be atomic in any other context.
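
A minimal sketch of the one usage that is covered, assuming a hosted implementation: a handler that writes a static `volatile sig_atomic_t` flag. The helper name `raise_and_check` is hypothetical, and `raise` is used only so the example is deterministic; nothing here says anything about threads.

```c
#include <signal.h>

/* Static storage duration, volatile-qualified, type sig_atomic_t:
   the combination a signal handler may safely write. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Hypothetical helper: installs the handler, raises SIGINT,
   restores the previous handler, and returns the flag
   (or -1 if the handler could not be installed). */
int raise_and_check(void)
{
    void (*previous)(int) = signal(SIGINT, on_signal);
    if (previous == SIG_ERR)
        return -1;
    got_signal = 0;
    raise(SIGINT);
    signal(SIGINT, previous);
    return (int)got_signal;
}
```

Drop any one of `static`, `volatile` or `sig_atomic_t` from the flag and the standard no longer makes the promise.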

Tom Payne

Alexander Terekhov

Aug 2, 2002, 9:49:17 AM

t...@cs.ucr.edu wrote:
>
> In comp.std.c David Hopwood <david....@zetnet.co.uk> wrote:
> : -----BEGIN PGP SIGNED MESSAGE-----
>
> : Alexander Terekhov wrote (quoting Butenhof?):
                                      ^^^^^^^^^

I wrote:

"....
Butenhof, 1998, c.p.t.***
....

[***] http://groups.google.com/groups?selm=34FC0286.6A889AF5%40zko.dec.com
http://groups.google.com/groups?selm=3503EB30.68D23EE6%40zko.dec.com
http://groups.google.com/groups?selm=3503E02C.180387F5%40zko.dec.com
http://groups.google.com/groups?selm=3503E333.8147C81B%40zko.dec.com"

> :> [...] make use of the ANSI C "atomic data type", sig_atomic_t, which will
> :> avoid word tearing.
>
> : Since when is there any guarantee that declaring variables as sig_atomic_t
> : will avoid word tearing for those variables? Maybe it will in practice, but
> : there's no spec that says so.

Yeah, and I'm actually still waiting for a reply to the following rather old
post of mine:

http://groups.google.com/groups?selm=3B0E5122.A574104D%40web.de

---
Dave Butenhof wrote:

[...]
> What's lacking, perhaps, is a requirement that some particular C data size is "safe"; that
> all possible machines & compilers must support and maintain a definition of "a memory
> location" no larger than some type, such as "int" or "long". The logical candidate here
> would be sig_atomic_t. Though the requirement isn't spelled out, this must have the required
> characteristics in order to be safe for signal handler access as specified.

well, i was under impression that "sig_atomic_t" alone
does not guarantee thread (or even signal) safety..
only the combination of _static_storage_duration_,
_volatile_ and _sig_atomic_t_ makes it safe.. and only
for signal handlers.. i could imagine an impl. which
would just disable signal delivery while accessing
"static volatile sig_atomic_t" variable (allocated
in some special storage region - for static volatiles
sig_atomic_t's only) or would do something else which
would NOT work with respect to threads.

or am i missing something?
---

> : What is needed is something similar to the Java memory model requirement
> : that values cannot "come out of thin air"

I don't think so. http://groups.google.com/groups?selm=3C9236F3.49C68326%40web.de

> (i.e. roughly speaking, a value
> : read from any variable must have been previously written to that variable,
> : with some additional ordering constraints). This has little or nothing to do
> : with the semantics of sig_atomic_t (or volatile), which the C99 Standard
> : only defines for single-threaded programs.
>
> Moreover, the standard only guarantees atomicity of writes by signal
> handlers to data

static data

> of type sig_atomic_t, and only when the object is
> also declared to be volatile. Objects of type sig_atomic_t are not
> guaranteed to be atomic in any other context.

AFAICS, it's even worse than that... in a multithreaded application that
happens to use asynchronous signals [vs. sigwait and/or SIGEV_THREAD delivery]
with static volatile sig_atomic_t vars you'd have to ensure that such signals
could only be "delivered" to a corresponding ONE SINGLE thread -- the one that
reads/writes a particular static volatile sig_atomic_t variable(s). You just
can't have such signal(s) delivered to any other thread.

regards,
alexander.

David Butenhof

Aug 2, 2002, 12:58:41 PM

Alexander Terekhov wrote:

> Yeah, and I'm actually still waiting for a reply to the following rather
> old post of mine:
>
> http://groups.google.com/groups?selm=3B0E5122.A574104D%40web.de

You're STILL WAITING? Wow. ;-)

Fergus Henderson

Aug 2, 2002, 7:17:12 PM

t...@cs.ucr.edu writes:

>In comp.std.c Fergus Henderson <f...@cs.mu.oz.au> wrote:
>: t...@cs.ucr.edu writes:
>[...]
>:>IIRC, its the other way around. Non-volatile objects are required to
>:>retain their values. Volatiles might be input registers and are not
>:>required to remember.
>
>: Nope. All variables are required to retain their last-stored value.
>: It's just that for volatile variables, the last store may not be
>: explicit in the program (e.g. it can be done by another thread, or
>: by hardware, etc.). See C99 6.2.4 [#2].
>
>Sounds to me like a distinction in search of a difference.

The point is that a C implementation isn't allowed to keep a variable
in an input register that does not retain its last-stored value merely
because the programmer happened to declare it volatile.

>What the
>difference between saying that an object need not remember its last
>stored value and saying that its last stored value might have resulted
>from a gamma ray from Alpha Centauri and/or from some other arbitrary
>event?

The point is that the C implementation isn't allowed to fire those gamma
rays itself. If the programmer or some other external event causes
the value to change, fine. But the implementation isn't allowed to
change the value.

Alexander Terekhov

Aug 2, 2002, 7:11:29 PM

David Butenhof wrote:
>
> Alexander Terekhov wrote:
>
> > Yeah, and I'm actually still waiting for a reply to the following rather
> > old post of mine:
> >
> > http://groups.google.com/groups?selm=3B0E5122.A574104D%40web.de
>
> You're STILL WAITING? Wow. ;-)

Well, nope. Not anymore. ;-) Forget that old stuff.

Beginning "2002-08-01 09:27:59 PST" I'm now waiting for a reply to
the following [yet another "what-am-I-missing-again"] post of mine:

http://groups.google.com/groups?selm=3D495F1B.60DEA4C1%40web.de

regards,
alexander.

David Hopwood

Aug 2, 2002, 11:13:27 PM

-----BEGIN PGP SIGNED MESSAGE-----

Fergus Henderson wrote:
> t...@cs.ucr.edu writes:
> >In comp.std.c Fergus Henderson <f...@cs.mu.oz.au> wrote:
> >: t...@cs.ucr.edu writes:
> >[...]
> >:>IIRC, its the other way around. Non-volatile objects are required to
> >:>retain their values. Volatiles might be input registers and are not
> >:>required to remember.
> >
> >: Nope. All variables are required to retain their last-stored value.
> >: It's just that for volatile variables, the last store may not be
> >: explicit in the program (e.g. it can be done by another thread, or
> >: by hardware, etc.). See C99 6.2.4 [#2].
> >
> >Sounds to me like a distinction in search of a difference.
>
> The point is that a C implementation isn't allowed to keep a variable
> in an input register that does not retain its last-stored value merely
> because the programmer happened to declare it volatile.

Not *merely* because it is volatile, but this is allowed if the variable
corresponds to memory-mapped I/O, since accessing memory-mapped I/O is
undefined behaviour anyway.

- --
David Hopwood <david....@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBPUtKPjkCAxeYt5gVAQE0Awf/ZmBISQxEoUw3YgGxTav2Bs1E00MYOzuN
pHfINOFGRarlAtjt6Emul9ARCQMPMazP8Jajkvwb2Rps57LQdomIyqjlqOfF9bfE
s0jvJ41oo1L4XCFhaGfftaRMHJTrCiUgkED/Xh19oB3x35W+ZTR7xrcVMeH0aSJ7
Egs6RkLEMMQIIqCSVznNLq2YgKXmjUJQ8zMU0lvyWbi9qqzJjX/XTeizMR9fd0TT
StrKtB8KdvCjbplxDjWpFjqkWit5gmKGtYpjqFXWFU4u3yFEe4Z/oEXUnb7NtE66
9SRQ4v45G6Zf4qwZ8JYQ5FZbeJThjYq27yd1u/y4mZ9XPSWfH2hSxg==
=eXMX
-----END PGP SIGNATURE-----

Gabriel Dos Reis

Aug 3, 2002, 9:38:58 AM

f...@cs.mu.oz.au (Fergus Henderson) writes:

| t...@cs.ucr.edu writes:
|
| >In comp.std.c Fergus Henderson <f...@cs.mu.oz.au> wrote:
| >: t...@cs.ucr.edu writes:
| >[...]
| >:>IIRC, its the other way around. Non-volatile objects are required to
| >:>retain their values. Volatiles might be input registers and are not
| >:>required to remember.
| >
| >: Nope. All variables are required to retain their last-stored value.
| >: It's just that for volatile variables, the last store may not be
| >: explicit in the program (e.g. it can be done by another thread, or
| >: by hardware, etc.). See C99 6.2.4 [#2].
| >
| >Sounds to me like a distinction in search of a difference.
|
| The point is that a C implementation isn't allowed to keep a variable
| in an input register that does not retain its last-stored value merely
| because the programmer happened to declare it volatile.

How can a conforming program tell?

| >What the
| >difference between saying that an object need not remember its last
| >stored value and saying that its last stored value might have resulted
| >from a gamma ray from Alpha Centauri and/or from some other arbitrary
| >event?
|
| The point is that the C implementation isn't allowed to fire those gamma
| rays itself.

How can a conforming program detect that the gamma ray was fired
by the implementation? The implementation can pretend it knew nothing
about itself firing gamma rays; it won't cease being conforming for
that sole reason.

| If the programmer or some other external event causes
| the value to change, fine. But the implementation isn't allowed to
| change the value.

But then how can a conforming program tell the difference?

-- Gaby

Alexander Terekhov

Aug 3, 2002, 9:50:30 AM

Fergus Henderson wrote:
>
> t...@cs.ucr.edu writes:
>
> >In comp.std.c Fergus Henderson <f...@cs.mu.oz.au> wrote:
> >: t...@cs.ucr.edu writes:
> >[...]
> >:>IIRC, its the other way around. Non-volatile objects are required to
> >:>retain their values. Volatiles might be input registers and are not
> >:>required to remember.
> >
> >: Nope. All variables are required to retain their last-stored value.
> >: It's just that for volatile variables, the last store may not be
> >: explicit in the program (e.g. it can be done by another thread, or
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Only if the access to shared data is >>properly synchronized<<; standard C/C++
volatiles aren't meant to provide memory synchronization [and atomicity with
respect to itself] like revised Java volatiles [and proposed atomic classes
from JSR-166]. POSIX defines API calls that synchronize memory [and execution,
of course]... if you use them [and you SHALL use them] then volatiles aren't
needed at all for data sharing ["implementation-defined" semantics aside]. If
you don't use synchronization protocols... well, then you are clearly violating
4.10 rules [>>undefined<< "memory locations" aside] with or without volatiles.

> >: by hardware, etc.). See C99 6.2.4 [#2].
> >
> >Sounds to me like a distinction in search of a difference.
>
> The point is that a C implementation isn't allowed to keep a variable
> in an input register that does not retain its last-stored value merely
> because the programmer happened to declare it volatile.

Uhmm. My English definitely sucks... can someone here translate it [for me-
stupid] to Russian or German? TIA. ;-)

regards,
alexander.

P.S. I DO understand this: [C99, 6.2.4/2] "The lifetime of an object is the
portion of program execution during which storage is guaranteed to be reserved
for it. An object exists, has a constant address,25) and retains its
last-stored value throughout its lifetime.26) ... 26) In the case of a volatile
object, the last store need not be explicit in the program."

Ross Ridge

Aug 3, 2002, 7:19:45 PM

David Hopwood <david....@zetnet.co.uk> wrote:
>Note that atomic access to pointers [*] is absolutely required in
>order to ensure type safety in a multithreaded, GC'd language. Since at
>least a 32-bit address space is pretty much essential to implement Java
>(it isn't implementable in 64Kbytes), there is no penalty in specifying
>that all types of 32 bits or less must be accessed atomically.

You're working under the false assumption that if an architecture can
access 32-bit values in memory atomically, it can also access smaller
16-bit and 8-bit values atomically.
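
To see why, consider a hypothetical machine that has only 32-bit loads and stores, so that a byte store has to be done as load word, patch one byte, store word. The sketch below models that in plain single-threaded C; the names `with_byte` and `interleaved_byte_stores` and the particular interleaving are assumptions made for illustration, not a description of any specific architecture.

```c
#include <stdint.h>

/* Returns `word` with byte number `index` (0 = least significant)
   replaced by `value`; this is the "patch" step of a byte store. */
static uint32_t with_byte(uint32_t word, unsigned index, uint8_t value)
{
    unsigned shift = 8u * (index & 3u);
    return (word & ~(UINT32_C(0xFF) << shift)) | ((uint32_t)value << shift);
}

/* Models two threads, A and B, each storing a different byte of the
   same word, interleaved so that both load before either stores. */
uint32_t interleaved_byte_stores(uint32_t word)
{
    uint32_t seen_by_a = word;              /* A loads the word      */
    uint32_t seen_by_b = word;              /* B loads the same word */
    word = with_byte(seen_by_a, 0, 0xAA);   /* A stores its result   */
    word = with_byte(seen_by_b, 1, 0xBB);   /* B overwrites A's byte */
    return word;
}
```

Starting from 0, the result is 0x0000BB00: A's store to byte 0 is lost even though every 32-bit access was atomic and the two threads touched different bytes.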

Ross Ridge

--
l/ // Ross Ridge -- The Great HTMU
[oo][oo] rri...@csclub.uwaterloo.ca
-()-/()/ http://www.csclub.uwaterloo.ca/u/rridge/
db //

Fergus Henderson

Aug 4, 2002, 8:19:36 AM

Gabriel Dos Reis <g...@soliton.integrable-solutions.net> writes:

>f...@cs.mu.oz.au (Fergus Henderson) writes:
>
>| t...@cs.ucr.edu writes:
>|
>| >In comp.std.c Fergus Henderson <f...@cs.mu.oz.au> wrote:
>| >: t...@cs.ucr.edu writes:
>| >[...]
>| >:>IIRC, its the other way around. Non-volatile objects are required to
>| >:>retain their values. Volatiles might be input registers and are not
>| >:>required to remember.
>| >
>| >: Nope. All variables are required to retain their last-stored value.
>| >: It's just that for volatile variables, the last store may not be
>| >: explicit in the program (e.g. it can be done by another thread, or
>| >: by hardware, etc.). See C99 6.2.4 [#2].
>| >
>| >Sounds to me like a distinction in search of a difference.
>|
>| The point is that a C implementation isn't allowed to keep a variable
>| in an input register that does not retain its last-stored value merely
>| because the programmer happened to declare it volatile.
>
>How can a conforming program tell?

A strictly conforming program probably can't tell.
But the programmer who translates and executes the program
with a particular C implementation can tell.

If the following program is translated and executed,

#include <stdio.h>
int main() {
volatile int x = 0;
if (x == 0) printf("x retained its value\n");
return 0;
}

and the programmer does not do anything to cause the value of x to be
modified (such as running the program under a debugger and explicitly
modifying its value), and the program does not output the message "x
retained its value", then the implementation is not conforming.

(Modulo the usual stuff about the size of the program exceeding the
implementation's capacity, and the possibility of printf() failing.)

>| >What the
>| >difference between saying that an object need not remember its last
>| >stored value and saying that its last stored value might have resulted
>| >from a gamma ray from Alpha Centauri and/or from some other arbitrary
>| >event?
>|
>| The point is that the C implementation isn't allowed to fire those gamma
>| rays itself.
>
>How can a conforming program detect that the gamma ray was fired
>by the implementation? The implementation can pretend it knew nothing
>about itself fireing gamma rays, it won't cease being conforming for
>that sole reason.

I strongly disagree. Such an implementation would violate the standard.
Volatile objects can only be modified in ways *unknown to the implementation*.
Apart from that, they have to obey the rules of the abstract machine.
(C99 section 6.7.3 [#6].)

If the implementation *pretends* it doesn't know about the way in which
the variable was modified, but actually this is known to the implementation,
then the implementation isn't conforming. If the modification is
truly unknown to the implementation, then the implementation can be
conforming.

>| If the programmer or some other external event causes
>| the value to change, fine. But the implementation isn't allowed to
>| change the value.
>
>But then how can a conforming program tell the difference ?

Who cares?
Even if a conforming program can't tell the difference, that doesn't
give implementations the right to ignore 6.7.3 [#6] and 5.1.2.3 [#5]
of the standard.

The "as if" rule, 5.1.2.3, allows an implementation to ignore certain
parts of the standard in certain situations, but this isn't one of them.

Gabriel Dos Reis

Aug 4, 2002, 9:07:28 AM

f...@cs.mu.oz.au (Fergus Henderson) writes:

[...]

How? Certainly the standard is of no help here.

| If the following program is translated and executed,
|
| #include <stdio.h>
| int main() {
| volatile int x = 0;
| if (x == 0) printf("x retained its value\n");
| return 0;
| }
|
| and the programmer does not do anything to caused the value of x to be
| modified (such as running the program under a debugger and explicitly
| modifying its value), and the program does not output the message "x
| retained its value", then the implementation is not conforming.

On what basis? Citation, please? The way the value of x could be
changed is certainly *not* limited to explicit action of the programmer.
It could be that a ray fired from Alpha Centauri changed the value of x.

[...]

| >| >What the
| >| >difference between saying that an object need not remember its last
| >| >stored value and saying that its last stored value might have resulted
| >| >from a gamma ray from Alpha Centauri and/or from some other arbitrary
| >| >event?
| >|
| >| The point is that the C implementation isn't allowed to fire those gamma
| >| rays itself.
| >
| >How can a conforming program detect that the gamma ray was fired
| >by the implementation? The implementation can pretend it knew nothing
| >about itself fireing gamma rays, it won't cease being conforming for
| >that sole reason.
|
| I strongly disagree.

You're welcome.

| Such an implementation would violate the standard.
| Volatile objects can only be modified in ways *unknown to the
                       ^^^^
| implementation*.

No. The standard doesn't say that. Here is exactly what it says:

[#6] An object that has volatile-qualified type may be
modified in ways unknown to the implementation or have other
unknown side effects. Therefore any expression referring to
such an object shall be evaluated strictly according to the
rules of the abstract machine, as described in 5.1.2.3.
Furthermore, at every sequence point the value last stored
in the object shall agree with that prescribed by the
abstract machine, except as modified by the unknown factors
mentioned previously.114) What constitutes an access to an
object that has volatile-qualified type is implementation-
defined.

Spontaneously changing its own value, for example, falls under "ways
unknown to the implementation or have other unknown side effects".

| Apart from that, they have to obey the rules of the abstract machine.
| (C99 section 6.7.3 [#6].)
|
| If the implementation *pretends* it doesn't know about the way in which
| the variable was modified, but actually this is known to the implementation,
| then the implementation isn't conforming.

The point is: How can you or a conforming program determine that that
was known to the implementation whereas it documents knowing nothing
about it?

| If the modification is
| truly unknown to the implementation, then the implementation can be
| conforming.

Yes, but if the implementation documents it knows nothing about it,
how can you or a conforming program determine it knew it?

| >| If the programmer or some other external event causes
| >| the value to change, fine. But the implementation isn't allowed to
| >| change the value.
| >
| >But then how can a conforming program tell the difference ?
|
| Who cares?

If you can't tell the difference, then the implementation won't cease
being conforming.

| Even if a conforming program can't tell the difference, that doesn't
| give implementations the right to ignore 6.7.3 [#6] and 5.1.2.3 [#5]
| of the standard.

Certainly, but 6.7.3/6 doesn't say what you think it says (see above).
5.1.2.3/5 says:

[#5] A strictly conforming program shall use only those
features of the language and library specified in this
International Standard.2) It shall not produce output
dependent on any unspecified, undefined, or implementation-
defined behavior, and shall not exceed any minimum
implementation limit.

Since what is an access to a volatile object is implementation-defined,
5.1.2.3/5 is irrelevant for this discussion.

| The "as if" rule, 5.1.2.3, allows an implementation to ignore certain
| parts of the standard in certain cituations, but this isn't one of them.

I'm not talking of the "as if" rule. I'm talking of 6.7.3/6.

-- Gaby

Fergus Henderson

Aug 4, 2002, 11:01:21 AM

Alexander Terekhov <tere...@web.de> writes:

>Fergus Henderson wrote:
>>
>> t...@cs.ucr.edu writes:
>>
>> >In comp.std.c Fergus Henderson <f...@cs.mu.oz.au> wrote:

>> >: All variables are required to retain their last-stored value.

>> >: It's just that for volatile variables, the last store may not be
>> >: explicit in the program (e.g. it can be done by another thread, or
>                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>Only if the access to shared data is >>properly synchronized<<;

Not necessarily. You are assuming POSIX threads, but in the paragraph
quoted above I am just talking about what the C standard guarantees.
The thread in question need not be a POSIX thread.
Or it might be a POSIX thread running on an implementation
that makes additional guarantees beyond what POSIX mandates.

>standard C/C++
>volatiles aren't meant to provide memory synchronization [and atomicity with
>respect to itself] like revised Java volatiles [and proposed atomic classes
>from JSR-166].

Standard C/C++ volatiles aren't guaranteed to provide memory
synchronization by the C, C++ or POSIX standards. However, they can be
used for that purpose, and some C implementations do provide that. The
question of whether they are "meant" for this purpose is a rather
difficult one, but IMHO using them for that purpose is consistent
with the C committee's original intent as I understand it, though
perhaps not consistent with the POSIX committee's intent.

Fergus Henderson

Aug 4, 2002, 4:47:16 PM

Gabriel Dos Reis <g...@soliton.integrable-solutions.net> writes:

>f...@cs.mu.oz.au (Fergus Henderson) writes:
>
>[...]
>
>| >| The point is that a C implementation isn't allowed to keep a variable
>| >| in an input register that does not retain its last-stored value merely
>| >| because the programmer happened to declare it volatile.
>| >
>| >How can a conforming program tell?
>|
>| A strictly conforming program probably can't tell.
>| But the programmer who translates and executes the program
>| with a particular C implementation can tell.
>
>How? Certainly the standard is of no help here.
>
>| If the following program is translated and executed,
>|
>| #include <stdio.h>
>| int main() {
>| volatile int x = 0;
>| if (x == 0) printf("x retained its value\n");
>| return 0;
>| }
>|
>| and the programmer does not do anything to caused the value of x to be
>| modified (such as running the program under a debugger and explicitly
>| modifying its value), and the program does not output the message "x
>| retained its value", then the implementation is not conforming.
>
>On which basis? Citation, please? The way the value of x could be
>changed is *not* certainly limited to explicit action of the programmer.
>It could be that a ray fired from Alpha Century changed the value of x.

I disagree. Although the bare wording of the standard is not very clear,
your interpretation would lead to the contradictory conclusion that
almost any use of `volatile' would lead to undefined behaviour,
which would in turn imply that any use of `setjmp' or `sig_atomic_t'
that required the use of `volatile' would also lead to undefined
behaviour. This clearly contradicts the intent of the committee,
who wouldn't have bothered with specifying all those complicated rules
about `setjmp' and the use of `sig_atomic_t' in signal handlers if any
use of them was going to have undefined behaviour.
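
For reference, a minimal sketch of the kind of `setjmp' rule being referred to, assuming nothing beyond <setjmp.h>; the function name `count_after_longjmp` is made up for the example. An automatic variable that is modified between the `setjmp' call and the `longjmp' has an indeterminate value afterwards unless it is volatile-qualified:

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical example: modifies a local between setjmp and longjmp,
   then reads it after control returns through setjmp a second time. */
int count_after_longjmp(void)
{
    volatile int count = 0;   /* volatile keeps the value determinate */

    if (setjmp(env) == 0) {
        count = 1;            /* changed after setjmp...              */
        longjmp(env, 1);      /* ...and before the jump back          */
    }
    return count;             /* 1, because count is volatile         */
}
```

Without the `volatile' qualifier the standard would make the final value of `count' indeterminate, which is exactly the sort of rule that would be pointless if every use of `volatile' were undefined.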

So IMHO the only sensible interpretation of the wording which says that
volatile objects "may be modified in ways unknown to the implementation
or have other unknown side effects" is that this is giving a permission
to the programmer, and does not give the implementation license to
misbehave on the grounds of hypothetical rays fired from Alpha Centauri
or anywhere else.

>| >| >What the
>| >| >difference between saying that an object need not remember its last
>| >| >stored value and saying that its last stored value might have resulted
>| >| >from a gamma ray from Alpha Centauri and/or from some other arbitrary
>| >| >event?
>| >|
>| >| The point is that the C implementation isn't allowed to fire those gamma
>| >| rays itself.
>| >
>| >How can a conforming program detect that the gamma ray was fired
>| >by the implementation? The implementation can pretend it knew nothing
>| >about itself fireing gamma rays, it won't cease being conforming for
>| >that sole reason.

>| Such an implementation would violate the standard.
>| Volatile objects can only be modified in ways *unknown to the
>                       ^^^^
>| implementation*.
>
>No. The standard doesn't say that. Here is exactly what it says:
>
> [#6] An object that has volatile-qualified type may be
> modified in ways unknown to the implementation or have other
> unknown side effects. Therefore any expression referring to
> such an object shall be evaluated strictly according to the
> rules of the abstract machine, as described in 5.1.2.3.
> Furthermore, at every sequence point the value last stored
> in the object shall agree with that prescribed by the
> abstract machine, except as modified by the unknown factors
> mentioned previously.114) What constitutes an access to an
> object that has volatile-qualified type is implementation-
> defined.

That amounts to the same thing.
It says that volatile objects may be modified "in ways unknown ...",
and that the last stored value shall agree with that prescribed
by the abstract machine, except as modified by the unknown factors.
The latter implies that they can't be spontaneously modified
by anything other than the unknown factors.

If you interpret these "unknown factors" as just being unknown to
the authors of the standard or to the programmer, rather than
unknown to the implementation, then this section becomes on the face
of it ridiculous. The clause would not impose any requirements on
implementations. The authors might as well have written

: Furthermore, at every sequence point the value last stored
: in the object shall agree with that prescribed by the
: abstract machine, except when they don't.

which is of course vacuous!

>| Apart from that, they have to obey the rules of the abstract machine.
>| (C99 section 6.7.3 [#6].)
>|
>| If the implementation *pretends* it doesn't know about the way in which
>| the variable was modified, but actually this is known to the implementation,
>| then the implementation isn't conforming.
>
>The point is: How can you or a conforming program determine that that
>was known to the implementation whereas it documents knowing nothing
>about it?

Well, if the implementation generates code which allocates the variable
in an I/O register, then obviously the implementation is at fault
because the implementation knows which registers it allocates variables in.
This can for example be determined by the programmer when reading the
generated code.

In general you may not always be able to determine whether or not
an implementation is conforming. Likewise, you can't always determine
whether or not a program will halt. Such is life.

However, if the implementation is not conforming, and the non-conformance
causes people significant trouble, then the implementation will most likely
get caught out eventually.

If it came down to a court battle over whether or not the compiler was
conforming, then you could subpoena the implementor's email records,
revision control logs, and so forth, and prove it (at least to the
standards of proof required in a court of law) using these as evidence ;-).

>| Even if a conforming program can't tell the difference, that doesn't
>| give implementations the right to ignore 6.7.3 [#6] and 5.1.2.3 [#5]
>| of the standard.
>
>Certainly, but 6.7.3/6 doesn't say what you think it says (see above).
>5.1.2.3/5 says:
>
> [#5] A strictly conforming program shall use only those
> features of the language and library specified in this
> International Standard.2) It shall not produce output
> dependent on any unspecified, undefined, or implementation-
> defined behavior, and shall not exceed any minimum
> implementation limit.

No, you must have cut-and-pasted the wrong section there -- that's 4/5
(paragraph 5 in section 4), not 5.1.2.3/5.

The part of 5.1.2.3/5 that I'm referring to is the second of these
paragraphs:

| -- At sequence points, volatile objects are stable in the
| sense that previous accesses are complete and
| subsequent accesses have not yet occurred.
|
| -- At program termination, all data written into files
| shall be identical to the result that execution of the
| program according to the abstract semantics would have
| produced.

>Since what is an access to a volatile objet is implementation-defined,
>5.1.2.3/5 is irrelevant for this discussion.

The first paragraph of the two quoted immediately above is irrelevant,
but the second paragraph about data written into files is relevant.
That is the one which requires the execution of the program
to output the message "x retained its value".

>| The "as if" rule, 5.1.2.3, allows an implementation to ignore certain
>| parts of the standard in certain cituations, but this isn't one of them.
>
>I'm not talking of the "as if" rule. I'm talking of 6.7.3/6.

Well, as I've argued above, your interpretation of 6.7.3/6 leads
to large chunks of it being entirely vacuous, and also IMHO clearly
contradicts the intent of several other parts of the standard.
So I don't think your interpretation is reasonable.

Gabriel Dos Reis

Aug 4, 2002, 6:12:41 PM

f...@cs.mu.oz.au (Fergus Henderson) writes:

[...]

| >| If the following program is translated and executed,

| >|
| >| #include <stdio.h>
| >| int main() {
| >| volatile int x = 0;
| >| if (x == 0) printf("x retained its value\n");
| >| return 0;
| >| }
| >|
| >| and the programmer does not do anything to caused the value of x to be
| >| modified (such as running the program under a debugger and explicitly
| >| modifying its value), and the program does not output the message "x
| >| retained its value", then the implementation is not conforming.
| >
| >On which basis? Citation, please? The way the value of x could be
| >changed is *not* certainly limited to explicit action of the programmer.
| >It could be that a ray fired from Alpha Century changed the value of x.
|
| I disagree.

Well, it doesn't suffice to disagree. You need to back up your
disagreement with chapters and verses. I asked for citations; you didn't
provide one that supports your disagreement, so I'm just left with your
disagreement. I'm afraid that doesn't suffice :-)

| Although the bare wording of the standard is not very clear,
| your interpretation would lead to contradictory conclusion that
| almost any use of `volatile' would lead to undefined behaviour,
| which would in turn imply that any use of `setjmp' or `sig_atomic_t'
| that required the use of `volatile' would also lead to undefined
| behaviour.

Being afraid of the conclusion does not by itself make the
interpretation invalid. It is just a symptom of the fact that the
standard's wording doesn't say exactly what you expect it to say.

| This clearly contradicts the intent of the committee,

I cannot say for sure exactly what every single member of the C
committee had in mind when they voted on that wording (when I voted, I
certainly didn't have what you said in mind) but the *fact* is that
"spontaneous mutation" or "gamma ray fired from Alpha Centauri"
are "ways unknown to the implementation".

| who wouldn't have bothered with specifying all those complicated rules
| about `setjmp' and the use of `sig_atomic_t' in signal handlers if any
| use of them was going to have undefined behaviour.

No, that doesn't follow. If you think that what the standard says isn't
what the committee intended (which I don't doubt), then I would say
it is the wording that has to be changed; but I can't accept "if they
bothered then it can't be that" as a reason to reject an interpretation.

| So IMHO the only sensible interpretation of the wording which says that
| volatile objects "may be modified in ways unknown to the implementation
| or have other unknown side effects" is that this is giving a permission
| to the programmer, and does not give the implementation license to
| misbehave on the grounds of hypothetical rays fired from Alpha Centauri
| or anywhere else.

Well, the hard fact remains: Can you provide us with a citation of
chapter and verse that limits "may be modified in ways unknown to
the implementation or have other unknown side effects" only to explicit
actions of the programmer?

No, I'm afraid that is not the case. There is nothing in that wording
that limits "have other unknown side effects" only to explicit actions
on the part of the programmer.

[...]

| If you interpret these "unknown factors" as just being unknown to
| the authors of the standard or to the programmer, rather than
| unknown to the implementation,

I'm NOT interpreting all "unknown factors" as "just being unknown to
the authors of the standard or to the programmer". No.
However, there is nothing in that wording that says that *all* of the
"unknown ways" and all of the "unknown side effects" are all known to
the authors of the standard or to the programmers.
There is no explicit description of the behaviour for
"unknown side effects".

| then this section becomes on the face of it ridiculous.

Yes, that is my main point: The wording is defective and allows more
than it should, IMHO.

| The clause would not impose any requirements on
| implementations. The authors might as well have written
|
| : Furthermore, at every sequence point the value last stored
| : in the object shall agree with that prescribed by the
| : abstract machine, except when they don't.
|
| which is of course vacuous!

Agreed. And in effect, that is what they wrote :-)

But then, that assumes that the implementation offers generated code
readable by the programmer, an assumption that has no foundation in
the scope of the C standards.

| In general you may not always be able to determine whether or not
| an implementation is conforming. Likewise, you can't always determine
| whether or not a program will halt. Such is life.

Yeah, c'est la vie.

| However, if the implementation is not conforming, and the non-conformance
| causes people significant trouble, then the implementation will most likely
| get caught out eventually.

If conformance is discussed as described by the C standards, then
the means by which people determine the implementation's non-conformance
will be questioned and examined :-)

| If it came down to a court battle over whether or not the compiler was
| conforming, then you could subpoena the implementor's email records,
| revision control logs, and so forth, and prove it (at least to the
| standards of proof required in a court of law) using these as evidence ;-).

That is no longer conformance discussed according to the C standards :-)

Ouch, you're right. I'm sorry. I cut-and-pasted the wrong section
indeed. Thanks for the correction.

| The part of 5.1.2.3/5 that I'm referring to is the second of these
| paragraphs:
|
| | -- At sequence points, volatile objects are stable in the
| | sense that previous accesses are complete and
| | subsequent accesses have not yet occurred.
| |
| | -- At program termination, all data written into files
| | shall be identical to the result that execution of the
| | program according to the abstract semantics would have
| | produced.
|
| >Since what is an access to a volatile object is implementation-defined,
| >5.1.2.3/5 is irrelevant for this discussion.
|
| The first paragraph of the two quoted immediately above is irrelevant,
| but the second paragraph about data written into files is relevant.
| That is the one which requires the execution of the program
| to output the message "x retained its value".

No, that doesn't follow, because:

(1) What is an access to a volatile object is implementation-defined.
And the abstract machine doesn't specify one. So there is no
telling that the abstract machine would have output the message
"x retained its value".

(2) There is still the issue of the "unknown side effects" produced
by the volatile object x.

| >| The "as if" rule, 5.1.2.3, allows an implementation to ignore certain
| >| parts of the standard in certain situations, but this isn't one of them.
| >
| >I'm not talking of the "as if" rule. I'm talking of 6.7.3/6.
|
| Well, as I've argued above, your interpretation of 6.7.3/6 leads
| to large chunks of it being entirely vacuous, and also IMHO clearly
| contradicts the intent of several other parts of the standard.
| So I don't think your interpretation is reasonable.

Maybe it contradicts the intent of some parts of the standard, but
then the standard is not known to be non-contradictory or bug-free.
Before rejecting my interpretation as unreasonable because it
may contradict some parts of the standard, you would first have to
prove that the standard is consistent or free of contradictions.

-- Gaby

Alexander Terekhov

Aug 5, 2002, 6:11:06 AM8/5/02

to

Fergus Henderson wrote:
[...]

> I disagree. Although the bare wording of the standard is not very clear,
> your interpretation would lead to the contradictory conclusion that
> almost any use of `volatile' would lead to undefined behaviour,
> which would in turn imply that any use of `setjmp' or `sig_atomic_t'
> that required the use of `volatile' would also lead to undefined
> behaviour.

Bingo! ;-)

> This clearly contradicts the intent of the committee,
> who wouldn't have bothered with specifying all those complicated rules
> about `setjmp' and the use of `sig_atomic_t' in signal handlers if any
> use of them was going to have undefined behaviour.

http://groups.google.com/groups?selm=3D496D85.5AC9CE91%40web.de

"....

> > "volatile" was invented for device registers. The mistake was to overload
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > it for setjmp and signal handlers.
>
> Why was that a mistake?

Ask Mr. Henry Spencer (he...@zoo.toronto.edu). Well, I agree because,
to my way of thinking, jumps and signals have really nothing to do with
"sharing" a hardware register/port with some device and the specification
that says {for really good reasons} that: "What constitutes an access to
an object that has volatile-qualified type is implementation-defined".

...."

regards,
alexander.