What are these kinds of objects good for?
Why this (as far as I can see) "forced" parallelism between "volatile" and
"const"?
Tito
volatile means that the object's value can be changed by means outside
of the program (e.g. a variable representing the current environment
temperature). volatile tells the compiler to avoid certain optimisations
on that variable so as to ensure valid behaviour.
--
Ioannis
* Ioannis Vranos
* Programming pages: http://www.noicys.freeurl.com
* Alternative URL: http://run.to/noicys
Yes, what you say is true when talking about objects of primitive types
("ordinary" variables).
When I talk about "objects" I'm referring to instances of user defined types
(structs or classes).
Tito
It's the same, I guess. The compiler avoids optimisations regarding
their state.
Then making an object volatile is the same as if all the attributes of its
class were volatile?
But one question remains: why is it only permitted to call volatile member
functions on volatile objects?
Tito
The aim of C/C++ volatile keyword is simply to confuse people...
resulting in a worldwide annual loss of productivity measured in
quite a few $10^6 ["probably"]. And it [i.e. loss] is growing
constantly.
regards,
alexander.
--
http://groups.google.com/groups?selm=3DBFF494.5FB11101%40web.de
Well, as was said, volatile means that an object can be changed by
outside means. So if you have
volatile int x=0;
cout<<x<<endl;
x can be something other than 0 (like 3).
A volatile object means that its state can be changed by outside
means. I guess a volatile member function prompts the compiler to not
apply specific optimisations to the code of that function. So we
declare the member function volatile to help the compiler; probably it
was considered difficult for the average compiler to protect volatile
class members across all member functions, so a little help was
required.
Because the code you generate to deal with volatile data often has to
be different from the code you generate to deal with non-volatile
data. Consider something like this:
class whatever {
    int a;
    int b;
    int c;
public:
    void add()          { a += c; b += c; a += c; }
    void add() volatile { a += c; b += c; a += c; }
};
Now, even though the code _I've_ written in the two member functions
looks identical, what the compiler generates for them may be entirely
different. In particular, in the non-volatile member function, the
compiler will normally generate code that only loads c into a
register once, and then adds the contents of that register to both a
and b. In the volatile version, it can't do that: it HAS to re-load
c from memory each time it's used. Likewise, in the non-volatile
version, it can see that b+=c has no effect on the two a+=c
statements. As such, in that version it will typically produce code
that's essentially equivalent to a+=c<<1; Again, in the volatile
version, it can't do that: the order and number of writes to volatile
objects may be meaningful in and of itself, so it can't combine the
code for different statements like this.
--
Later,
Jerry.
The universe is a figment of its own imagination.
The purpose of volatile qualification is to provide a method
for the programmer to instruct the compiler that the access
pattern for a specific object needs to be exactly as the
programmer has written it (i.e. according to the C abstract
machine), thus the compiler must not change that pattern in
the process of optimizing the generated machine code.
> The aim of C/C++ volatile keyword is simply to confuse people...
Maybe that's *your* aim.
I have to agree that the effect has been something along those lines,
but the aim has been rather clear: to provide a mechanism by which the
programmer can relieve certain burdens on the implementation and
replace them with other burdens:
relieved:
* the burden that the object needs to remember the value
last stored to it by the program
added:
* the burden that all computations involving the value of a
volatile object must use a very fresh copy.
* the burden that all modification of a volatile object must
be committed by the next sequence point.
In other words, objects of volatile-qualified types are viewed as
I/O registers.
Okay, so that's the aim. The problem with C's notion of volatility is
that its granularity is not sufficiently fine, i.e., the notion that
one size fits all:
1. Input registers are not output registers and vice versa.
2. Neither are objects whose values are written by signal handlers.
3. Neither are automatic objects, local to functions that contain a
call to setjmp, whose updates should be remembered.
Why do any of these three kinds of objects need to have their values
stored to memory by the first sequence point following a modification
of that object?
Tom Payne
No. I've explained this before, but you persist in giving
misinformation and drawing incorrect inferences from it.
When I use the term "viewed as", I'm making an analogy. This
particular analogy is supported by C89, 6.5.3, footnote 67:
A volatile declaration may be used to describe an object
corresponding to a memory-mapped input/output port ...
The point of the analogy is that a conforming implementation must
treat objects of volatile-qualified types as though they are:
1) subject to spontaneous changes in value, just like input
registers
2) subject to observation, just like output registers, i.e.
their values at sequence points are part of the program's
behavior.
Point #1 above implies that, for an object of volatile-qualified type,
an implementation should not use possibly stale values that have been
cached in registers. Point #2 implies that, for an object of
volatile-qualified type, an implementation should promptly commit any
newly assigned value to the object's location (rather than simply
caching it in a register). I don't see either of those points as being
incompatible with your comment that:
The purpose of volatile qualification is to provide a method
for the programmer to instruct the compiler that the access
pattern for a specific object needs to be exactly as the
programmer has written it (i.e. according to the C abstract
machine), thus the compiler must not change that pattern in
the process of optimizing the generated machine code.
Have I missed something here?
Tom Payne
Then probably you should have used "like". Also, drawing an analogy
between a specific technical aspect and other, irrelevant technical
aspects can only cause confusion.
If your point is that I/O ports are irrelevant to volatility, I
couldn't agree less. IMHO, memory-mapped I/O ports are the
fundamental paradigm for volatile objects. In fact, if a program
attempts to access a memory-mapped I/O port via an lvalue whose type
is not volatile-qualified, the resulting behavior is undefined.
Tom Payne
Yeah, ``how nice.'' <http://tinyurl.com/2r53> ("Forget C/C++
volatiles, Momchil. ...")
> > The aim of C/C++ volatile keyword is simply to confuse people...
>
> Maybe that's *your* aim.
Nope. My aim is simply to get C/C++ volatile and *jmp deprecated,
replace async. signals with threads... uhhm, and merge C and C++
and POSIX.1... making exceptions work in "C" language core and
with "C" bindings of POSIX.1++ (in addition to "C++" bindings).
That's it. ;-)
regards,
alexander.
No; they're the most obvious *application* for this
facility. There are other, quite different, reasons
for using volatile qualification as have been
discussed here and in other newsgroups (such as
comp.os.plan9). If you want a *paradigm*, try the
original Ritchie PDP-11 C compiler which performed
minimal optimizations and thus could be treated
almost as a high-level assembler. (However, there
were still other access issues in certain
circumstances, but "volatile" wouldn't address
those other than through the requirement that the
implementation document access semantics.)
which contains several erroneous claims.
Ah. Okay.
http://groups.google.com/groups?selm=3DD47620.6040909%40null.net
(comp.os.plan9; Re: [9fans] how to avoid a memset() optimization)
<quote>
> The intent of volatile was to capture appropriate
> behavior of memory-mapped registers and similar
> things (like a clock in user space updated by the
> OS.) So, things like
> *p = 0;
> *p = 0;
> should generate two stores if p is volatile *int.
Yes, the C standard requires that, with the correction
that it is the int that must be volatile-qualified,
not the pointer. I.e., volatile int* if we're using C
abstract types. It is still up to the implementation
to determine whether the store involves a read also
and how wide the access is (e.g., if int is 32 bits on
a 64-bit word bus, the store would necessitate fetch
of 64 bits, modification of 32 of them, and write-back
of 64 bits). There doesn't seem to be any point in
trying to let the programmer specify such details,
since they're normally built into the hardware. But
volatile as it is specified at least lets the programmer
control the *compiler* (code generator), which is
partial control and quite often good enough.
</quote>
regards,
alexander.
``such as...'' ? ;-)
regards,
alexander.
I'm meaning paradigm in the sense of an example that serves as a
pattern or model, i.e., one that captures all of the essential aspects
of the concept.
+ There are other, quite different, reasons
+ for using volatile qualification as have been
+ discussed here and in other newsgroups (such as
+ comp.os.plan9).
Fine. Then please furnish just one specific example of an object that
needs to have volatile-qualified type for a reason that is
fundamentally different from the reasons that memory-mapped I/O ports
must have volatile-qualified types.
Tom Payne
Like what? Please be specific!
Tom Payne
I believe that the C standard tried to make that point in general via
the sentence:
What constitutes an access to an object that has volatile-qualified
type is implementation defined.
[C89, 6.5.3]
The term "constitutes" is unfortunately ambiguous here, e.g.,
This letter does not constitute an offer of employment.
versus
A father, a mother and their children constitute a family.
I have to go along with Doug's claim that the standard doesn't make
sense if we take the first interpretation of constitute. Rather, the
only way for the standard to make sense is to interpret it as saying
that the implementation is free to add constituents of the sort you
mentioned above to reads and writes of volatile objects.
Tom Payne
I just did.
Two examples were auto variables whose values you
want to rely on after longjmp()ing back, and a buffer
that you really want cleared with no other use made
of the zero fill. The former is necessary for
reasons having to do with allowable code optimization
while the latter is necessary because some external
agent beyond the scope of the C standard will be
examining the contents (sometimes volatile
qualification can help with run-time debugging for a
similar reason). Memory-mapped device registers used
for input have values that are potentially changed
from their last-stored (by the program) value, due to
action of external agents. These are three different
situations, although of course they are connected due
to being cases where volatile qualification is useful.
No one of them captures all the relevant aspects.
In the case of the zero-filled object, the zero-filling must not be
optimized away because the object might be externally observable,
i.e., it might behave like an output register.
In the case of the automatics that are local to a function that
invokes setjmp, there are potentially two continuations from the call
to setjmp:
- the first one begins with the normal return from setjmp()
- the second begins when setjmp returns as a result of a longjmp()
That second continuation must read values written to local automatics
by the first continuation. In demanding that such local automatics
have volatile-qualified type, the standard is requiring the first
continuation to treat such objects as output ports and the second
continuation to treat them as input ports, which suffices to
achieve reliable post-longjmp values.
However, it's overkill to require that such variables be written out
at each sequence point -- for purposes under discussion, they need
only be saved at function calls. What the Standard requires of the
handling of objects of volatile-qualified type is determined by what's
needed in handling I/O registers. These automatics don't exactly fit
that paradigm. And, in requiring that they have volatile-qualified
types, the standard has imposed an unfortunate burden of inefficiency
on them.
Tom Payne
That's rather far-fetched.
> However, it's overkill to require that such variables be written out
> at each sequence point --
Volatile qualification is a simple mechanism with multiple
uses. It isn't specifically tailored for just one of them.
In news://twIA9.23319$nB.2761@sccrnsc03 an OP questioned why the
following snippet resulted in 100098 rather than 100099.
float fTemp = 1000.99f;
fTemp *= 100;
unsigned un = fTemp;
The problem was due to different rounding behaviour of an 80 bit
floating point register and a 32 bit float variable. Making fTemp
volatile forces the write to un to be from the variable rather than the
register. Unfortunately, the implementation does not obey the
instruction. (There are other implementation-dependent ways to force the
desired behaviour.)
I care less for efficiency than that an implementation does as told.
--
Walter Briscoe
I count three that are mentioned in the standard:
1) I/O registers and things that behave like them.
2) Local automatics of functions that invoke setjmp.
3) Static objects that are written by signal handlers.
+ It isn't specifically tailored for just one of them.
The requirements that the standard places on volatiles are (almost)
correct for case #1. As I just pointed out they are overkill for case
#2. They are insufficient for case #3 and have to be supplemented
with requirements of atomicity.
Tom Payne
What about objects shared with other processes?
-Mike
It is ``brain-dead!'', so to speak. <http://tinyurl.com/2s3e>,
<http://tinyurl.com/2s38>,
<http://tinyurl.com/2s2q>.
> and a buffer that you really want cleared with no other
> use made of the zero fill.
``Undefined'' AND ``brain-dead!''. <http://tinyurl.com/2s34>,
<http://tinyurl.com/2s2s>,
<http://tinyurl.com/2s2u>
regards,
alexander.
Hi Mike, contributing to the "quite a few $10^6" {growing} amount or what?
regards,
alexander.
I have no idea what you're asking me. English please.
-Mike
A) http://groups.google.com/groups?selm=3DD4E119.CB51D2D6%40web.de
(Subject: Re: Volatile declared objects)
B) http://groups.google.com/groups?selm=3D47F3CA.E5B4239%40web.de
(Subject: Re: "memory location")
"....
And, BTW, I'd like to urge folks at comp.std.c to consider *FIXING*
the C99 rationale as well..."
C) http://groups.google.com/groups?selm=3DC9930A.10CB5BE%40web.de
(Subject: Re: When to use the 'volatile' keyword ?)
regards,
alexander.
The requirements that the Standard places on the handling of objects
of volatile qualified types were designed for things that act like I/O
registers. The authors of the Standard, however, found that by
imposing those same requirements they could solve some shared-access
problems involving setjmp and longjmp and other shared-access problems
involving signal handlers. The fit wasn't all that good, but it
worked, at the expense of significant overhead. That success led a
lot of people to believe that shared-access issues in multithreading
can be solved by the silver bullet of requiring volatile-qualified
types. Dave Butenhof has worked very hard to point out that
volatile-qualified types are neither necessary nor sufficient to
guarantee coherence among thread-shared objects. Most of Dave's
postings on the matter have appeared in comp.programming.threads, but
some of them have also appeared in this group.
Tom Payne
Just because somebody makes a bogus argument doesn't
mean there is actually a problem with the specification.
So you're quoting yourself. I've already read those threads.
You've just effectively told me "Alexander is correct because
Alexander says so." Huh?
-Mike
Read again those threads.
regards,
alexander.
I don't see ANY arguments in the text you've quoted here.
> doesn't mean there is actually a problem with the specification.
Open your eyes [phrases in "emotionally loaded language" aside].
Really.
regards,
alexander.
[ ... ]
> So you're quoting yourself. I've already read those threads.
> You've just effectively told me "Alexander is correct because
> Alexander says so." Huh?
Eventually you'll learn that even though Alexander is a fairly
intelligent person, it's best to just plonk him and be done with it.
--
Later,
Jerry.
The universe is a figment of its own imagination.
http://groups.google.com/groups?selm=3CA25853.2B30E9BC%40web.de
(Subject: Re: Q: use CreateThread() and TerminateThread() and message procedure)
regards,
alexander.
Done. (Result of another thread)
-Mike
Since neither C nor C++ supports other processes or shared objects,
what about them?
--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
Yeah. What a terrible loss. I'll never be bothered by a reply of yours
again. Excellent.
> (Result of another thread)
It's now here as well.
< Forward Inline >
-------- Original Message --------
Message-ID: <3DD88CD7...@web.de>
Newsgroups: comp.lang.c++
Subject: Re: Temporaries and const-ness
Mike Wahler wrote:
[...]
> <shrug>
> *PLONK*
Ouch.
regards,
alexander. < almost dead but smiling nevertheless >
Sure they do. Just not portably, and not always
in a way that one would desire.
> > Since neither C nor C++ supports other processes
> > or shared objects, what about them?
>
> Sure they do. Just not portably,
Yep, PORTABLE [POSIX(R) "THR" option <http://tinyurl.com/2sbj>]
threads aside, of course.
> and not always in a way that one would desire.
Yup, < Forward Inline >
-------- Original Message --------
Message-ID: <3D0DCB20...@web.de>
Date: Mon, 17 Jun 2002 13:42:24 +0200
Newsgroups: comp.lang.c++
Subject: Re: Multithreading, synchronization and variables
josh wrote:
>
> On Sat, 15 Jun 2002 23:05:50 +0200, Alexander Terekhov
> <tere...@web.de> wrote:
> > josh wrote:
> > > On Sat, 15 Jun 2002 19:20:28 +0200, Bernd Fuhrmann
> > > <Silver...@gmx.de> wrote:
> > > > Suppose I've got a variable and two concurrent threads
> > > > that use (i.e. change) it. Would it be neccessary to make
> > > > that variable volatile
> >
> > > No. Volatile's got nothing to do with threads.
> >
> > Standard C/C++ POSIX Threads. But there are C/C+
> > implementations which provide non standard volatile semantics
> > that ARE relevant w.r.t. threading -- e.g. to ensure the BYTE
> > memory granularity to fight 'word-tearing' race condition,
> > etc. Also, in Java, REVISED volatiles are quite relevant
> > w.r.t. threading.
> What's "w.r.t"?
'with respect to'
> On topic: this is interesting, any more detail
> on that? Hopefully in just a few plain words <g>.
Nah, in just one single web link:
http://groups.google.com/groups?selm=c29b5e33.0205241457.24f12178%40posting.google.com
(Subject: Re: Parallel Programming in C++; see "B) Your 'volatile'-bullshit")
As for granularity, try this:
http://www.tru64unix.compaq.com/docs/base_doc/DOCUMENTATION/V51_HTML/ARH9RBTE/DOCU0007.HTM#gran_sec
("3.7 Granularity Considerations....")
regards,
alexander.
-------- Original Message --------
Message-ID: <3D0DCB4E...@web.de>
Date: Mon, 17 Jun 2002 13:43:10 +0200
Newsgroups: comp.lang.c++
Subject: Re: Multithreading, synchronization and variables
Jack Klein wrote:
[...]
> But POSIX threads, and anything at all concerning threading, is
> off-topic here.
Nothing is off-topic here (this group has no CHARTER to begin with).
> The meaning of the volatile keyword, as defined by
> both ISO C and ISO C++, has nothing at all to do with threads, since
> neither ISO C nor ISO C++ define or support any sort of threading.
I personally don't care what is supported or not supported. Also, you
may want to fire a search on 'thread' here:
http://std.dkuug.dk/JTC1/SC22/WG21/docs/papers/2002/n1361.html
> josh was 100% exactly right in the context of this group,
Show me please your c.l.c++-Judge certificate or something
like that. Well, Y'know, I actually DO accept complaints on
topicality submitted via:
http://www.slack.net/~shiva/complain.txt
So please take a few spare minutes and submit the form...
regards,
alexander.
Well, I must clarify that, while I was totally convinced that
"volatile" is not required in a POSIX environment, existing
implementations have quite compatible notions of what constitutes a
"volatile object access" and notably similar behaviour (not putting
volatiles in registers, not reordering volatile accesses across
sequence points (as mandated by the standard)), etc., so volatile is
quite usable, especially together with non-standard primitives (a
POSIX implementation is rarely implemented on top of a POSIX
implementation, right?)
> Nope. My aim is simply to get C/C++ volatile and *jmp deprecated,
> replace async. signals with threads...
Oh, no ... we already have Visual Basic, don't we ?
> uhhm, and merge C and C++ and POSIX.1
Uh-oh, I imagine monstrosity of such a beast ...
>... making exceptions work in "C" language core and
Amen. But first namespaces, eh ?
~velco
There are extensions to C/C++ that support threads, and the question
of whether thread-shared objects must (or should) have
volatile-qualified types occurs frequently. The discussion quickly gets to
issues of what the standards require of volatile objects and why, at
which point (IMHO) the discussion in this group becomes an appropriate
forum.
Also, there have been a number of suggestions that some form of
threading be added to the standard and discussions of what such a
specification might look like, which again is appropriate for this
group.
Tom Payne
Objects that represent the states of coordination mechanisms, e.g.,
mutexes, must be treated as volatiles --- it does no good to lock a
mutex and keep it cached in a register.
+ so volatile is
+ quite usable, especially together with non-standard primitives (a
+ POSIX implementation is rarely implemented on top of a POSIX
+ implementation right ?)
It is often claimed that one of C's early goals was to replace
assembly language. I'd like to be able to implement concurrency
packages like Pthreads portably in C, without the need to resort to
assembly code.
Tom Payne
Nah, please solve {accept one of "proposed"/find another solution,
reach consensus/legislate/blah-blah} THIS
http://groups.google.com/groups?selm=3DC944D5.EC2EF029%40web.de
(Subject: Re: Memory isolation)
*first*.
regards,
alexander.
I'd recommend that you try being a little less terse. Your point isn't
clear; and quoting yourself as you did further down doesn't make your
point any clearer.
The point was that the Std. C Rationale with respect to volatile and
threading is wrong and, IMO, simply confuses people. The quote was used
as a pointer to the relevant context within the referenced message:
"....
And, BTW, I'd like to urge folks at comp.std.c to consider *FIXING*
the C99 rationale as well..." ---> 2 x ">>!!!!ATTN WRONG ATTN WRONG
ATTN WRONG ATTN WRONG ATTN WRONG!!!!<<"
regards,
alexander.
Thanks. For some reason my browser got hung up at the second level of
indirection. But I think that your point is that the necessary
extensions/adjustments to the C Standard are not likely to be
forthcoming. I realize that. But, so far, they've not been
proposed. Nor am I sure of exactly what they should be.
Tom Payne
Keep in mind that the "folks at comp.std.c" have no authority to fix it.
That's an issue you need to bring up with the Committee, not with this
newsgroup. There's only a small overlap between this newsgroup and the
Committee, and you shouldn't rely on that overlap for actually getting
anything done.
> the C99 rationale as well..." ---> 2 x ">>!!!!ATTN WRONG ATTN WRONG
> ATTN WRONG ATTN WRONG ATTN WRONG!!!!<<"
Yes, that's your general point. I was more concerned about your specific
point - the one you were making in your response to Mike Wahler's
question:
> What about objects shared with other processes?
Your response to that particular question was extremely cryptic. I have
no idea what it was intended to mean. And apparently, neither did Mike.
And the follow up messages only served to raise the temperature of the
discussion, without clarifying your response to that particular
question.
<copy&paste>
G'Day,
FYI...
A) http://www.opengroup.org/austin/docs/austin_107.txt
(Defect in XBD 4.10 Memory Synchronization (rdvk# 26), Rationale
for rejected or partial changes)
"Our advice is as follows.
Hardware that does not allow atomic accesses cannot have
a POSIX implementation on it.
We propose no changes to the standard.
Please note that the committee is not required to give advice,
this sort of topic may be better to be discussed initially on the group
reflector prior to any aardvark submission."
B)
http://groups.google.com/groups?threadm=h0Ss9.855%24HI1.63365%40newsfep1-win.server.ntli.net
-------- Original Message --------
From: "Garry Lancaster" <glanc...@ntlworld.com>
Newsgroups: comp.programming.threads
Subject: Memory isolation
Message-ID: <h0Ss9.855$HI1....@newsfep1-win.server.ntli.net>
Date: Mon, 21 Oct 2002 13:01:55 +0100
Hi All
Say I have two global char variables:
char c1 = 0;
char c2 = 0;
and two mutexes:
Mutex m1;
Mutex m2;
(Assume Mutex is a C++ class with the obvious Lock and
Unlock functions, wrapping a mutex API such as the
pthread_mutex_init/destroy/lock/unlock functions.)
I have two functions, f1 and f2.
void f1() {
m1.Lock();
c1 = 1; // Line X.
if (1 != c1) abort(); // Line Y.
c1 = 0;
m1.Unlock();
}
void f2() {
m2.Lock();
c2 = 1; // Line Z.
if (1 != c2) abort();
c2 = 0;
m2.Unlock();
}
The critical sections in f1 and f2 may run concurrently because
they use different mutexes.
I spawn several threads. Some run f1, others f2.
It is my understanding that, even though I have protected my
variables using mutexes, the two variable values may interfere
with one another. For example, on a platform with only word-
sized memory access (and, naturally, more than a single 1 byte
char per word), if the two globals reside in adjacent memory
locations the write to c2 at line Z may generate a word
read including both c1 and c2, followed by a word write of the
same. If line X is run in between this read and write, it will
effectively be ignored, since the new value will be overwritten
by the old, and the program will abort at line Y.
In other words:
State: c1 = 0 c2 = 0
Action: Line Z word read.
State: c1 = 0 c2 = 0
Action: Line X word read
State: c1 = 0 c2 = 0
Action: Line X word write
State: c1 = 1 c2 = 0
Action: Line Z word write
State: c1 = 0 c2 = 1
Action: Line Y, condition false so abort.
Even though each variable is protected by its own mutex,
since they are not using the *same* mutex, they still
interfere.
I know the avoidance of this behaviour is part of what is
meant by atomicity. But, since it is not the whole of what
is meant (it does not address interruptibility), I am currently
using a different term: isolation. (I hope someone will
correct me if there is a standard term for this.)
Is the scenario I post possible under pthreads (or any other
threading system for that matter) or have I missed something
that means the problem will not occur?
If lack of isolation *is* a problem, what is the most portable
solution?
Thanks in advance.
Kind regards
Garry Lancaster
-------- Original Message --------
From: David Butenhof <David.B...@compaq.com>
Subject: Re: Memory isolation
Newsgroups: comp.programming.threads
Message-ID: <BKTs9.20$ZG7.4...@news.cpqcorp.net>
Date: Mon, 21 Oct 2002 13:58:57 GMT
Garry Lancaster wrote:
> Say I have two global char variables:
>
> char c1 = 0;
> char c2 = 0;
>
> and two mutexes:
>
> Mutex m1;
> Mutex m2;
>
> (Assume Mutex is a C++ class with the obvious Lock and
> Unlock functions, wrapping a mutex API such as the
> pthread_mutex_init/destroy/lock/unlock functions.)
>
> I have two functions, f1 and f2.
>
> void f1() {
> m1.Lock();
> c1 = 1; // Line X.
> if (1 != c1) abort(); // Line Y.
> c1 = 0;
> m1.Unlock();
> }
>
> void f2() {
> m2.Lock();
> c2 = 1; // Line Z.
> if (1 != c2) abort();
> c2 = 0;
> m2.Unlock();
> }
>
> The critical sections in f1 and f2 may run concurrently because
> they use different mutexes.
>
> I spawn several threads. Some run f1, others f2.
>
> It is my understanding that, even though I have protected my
> variables using mutexes, the two variable values may interfere
> with one another. For example, on a platform with only word-
> sized memory access (and, naturally, more than a single 1 byte
> char per word), if the two globals reside in adjacent memory
> locations the write to c2 at line Z may generate a word
> read including both c1 and c2, followed by a word write of the
> same. If line X is run in between this read and write, it will
> effectively be ignored, since the new value will be overwritten
> by the old, and the program will abort at line Y.
>
> In other words:
>
> State: c1 = 0 c2 = 0
> Action: Line Z word read.
> State: c1 = 0 c2 = 0
> Action: Line X word read
> State: c1 = 0 c2 = 0
> Action: Line X word write
> State: c1 = 1 c2 = 0
> Action: Line Z word write
> State: c1 = 0 c2 = 1
> Action: Line Y, condition false so abort.
>
> Even though each variable is protected by its own mutex,
> since they are not using the *same* mutex, they still
> interfere.
>
> I know the avoidance of this behaviour is part of what is
> meant by atomicity. But, since it is not the whole of what
> is meant (it does not address interruptibility), I am currently
> using a different term: isolation. (I hope someone will
> correct me if there is a standard term for this.)
>
> Is the scenario I post possible under pthreads (or any other
> threading system for that matter) or have I missed something
> that means the problem will not occur?
>
> If lack of isolation *is* a problem, what is the most portable
> solution?
There is no completely portable solution, because standards do not
provide means to control the exact layout of data in memory, nor
the instructions generated by a compiler to access them. (Even
"volatile" provides only very loose constraints on the
instructions used, and they're not useful here.)
Note that aside from problems that can destroy data, like "word
tearing", there are performance problems such as "false sharing".
False sharing won't hurt your final data (or even intermediate
data), but can drastically affect your performance when multiple
threads (running on separate CPUs) concurrently write to
non-adjacent data in the same cache line(s). (Because of cache
invalidate thrashing in the memory system.)
Your best bet to avoid the functional problems and minimize the
performance risks is to avoid declaring shared data as you've
shown. That is, instead of:
char c1 = 0;
char c2 = 0;
Mutex m1;
Mutex m2;
That not only places the shared data adjacent to each other (in
most implementations), but actually interleaves the shared data
to guarantee you'll get cache conflicts. (c1 is separated from
m1; while c1 and c2, and m1 and m2, are pushed together.)
Instead, use:
char c1 = 0;
Mutex m1;
char c2 = 0;
Mutex m2;
This still doesn't guarantee cache isolation, though at least
you know that the machine is far less likely to have atomicity
problems accessing c1 and c2 with respect to each other. For one
thing, on most machines without atomic access to the char data
type, the compiler will generate padding between the char and
the Mutex (which most likely has wider data, such as int or
long or pointer).
Or even better,
typedef struct {char c; Mutex m;} Data;
Data *d1;
Data *d2;
d1 = malloc (sizeof (Data));
d2 = malloc (sizeof (Data));
Now you're letting the heap manager buy you some reasonable
minimal data alignment, as well as a high likelihood (though
still not a guarantee) that the two allocations will be in
separate cache lines. For further assurance, you could easily
pad the allocations to some reasonable size; 64 bytes is a
common cache line size.
"Mounting" your data into a structure comes as close as you
can in C to controlling the actual layout of data in memory.
While this is a little less trivially simple than the original,
it's not horrendously complicated, either. It'll buy you a lot
of flexibility to adapt to various architectures, as well as a
fair level of builtin basic protection.
--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/
-------- Original Message --------
From: David Butenhof <David.B...@compaq.com>
Subject: Re: Memory isolation
Newsgroups: comp.programming.threads
Message-ID: <PpRt9.10$Nx2.3...@news.cpqcorp.net>
Date: Thu, 24 Oct 2002 12:09:19 GMT
Alexander Terekhov wrote:
> Max Khesin wrote:
>>
>> why not something like this:
>>
>> char sharedData[sizeof(int)+1];
>>
>> char& c1 = sharedData[0];
>> char& c2 = sharedData[sizeof(int)];
>>
>> this would seem (assuming "int" is the largest size read at once) to
>> sufficiently separate the data.
>
> assert( sizeof( char ) == sizeof( int ) );
Again, the real problem with this alternative (as already pointed
out by several) is that a char array has no required alignment
and element 0 need not have (int*) alignment either. Now you need
to do bit masking to align the address of c1 as well as c2.
> [...]
>> > Instead, use:
>> >
>> > char c1 = 0;
>> > Mutex m1;
>> > char c2 = 0;
>> > Mutex m2;
>
> assert( sizeof( char ) == sizeof( Mutex ) );
This would almost certainly still be better than having c1 and c2
in adjacent bytes.
However, in general, yes, you've successfully described ONE (of
many) of the possibilities that caused me to say that these
strategies will often help but provide no real guarantees. I don't
really see why you bothered. (Or, even worse, why I'm bothering to
respond.)
>> > This still doesn't guarantee cache isolation, though at least
>> > you know that the machine is far less likely to have atomicity
>> > problems accessing c1 and c2 with respect to each other. For one
>> > thing, on most machines without atomic access to the char data
>> > type, the compiler will generate padding between the char and
>> > the Mutex (which most likely has wider data, such as int or long
>> > or pointer).
>> >
>> > Or even better,
>> >
>> > typedef struct {char c; Mutex m;} Data;
>> > Data *d1;
>> > Data *d2;
>> >
>> > d1 = malloc (sizeof (Data));
>> > d2 = malloc (sizeof (Data));
>
> assert( 2*sizeof( Data ) <= sizeof( pthread_memory_granule_np_t ) );
Again, and more directly this time: "exactly: and so what"?
> http://groups.google.com/groups?selm=yahiuqxlycg.fsf%40berling.diku.dk
>
> "....
> However, I still think that malloc(1) on an implementation where all
> pointers have the same representation may still return a pointer that
> is not "aligned" in the everyday, non-standardese meaning of the term.
> Because it is undefined anyway what happens when one tries to use the
> pointer to access an object bigger than the size I asked malloc() for."
On a machine with no address alignment requirements, there are no
alignment requirements on malloc(). But address alignment rules aren't
the same as atomic access rules, and this can complicate "isolation".
On a machine like Alpha that requires natural data alignment, a (short*)
MUST have the low address bit clear, (int*) must have 2 low address bits
clear, and so forth. Therefore an implementation of malloc() that did
not return a value with the maximum number of cleared low address bits
would be erroneous. (Yes, 'malloc(1)' could return an unaligned address,
'malloc(2)' could return an address with a single cleared low bit, and
so forth, though this is an unlikely implementation. Certainly
'malloc(2)' cannot return a value with the low address bit set, because
it cannot legally presume the storage will be mapped to 'char[2]' rather
than 'int' even though that's all the information it has.)
It's possible, (though I know of no examples except one subtly broken
model of the VAX family), that a machine without address alignment
rules could have restrictions on atomic access to unaligned data. In
such an implementation, malloc(8) might return an address with the low
bit set, restricting atomic access to that data. Possible, but unlikely.
Except for very early VAX models, unaligned data access may have been
LEGAL, but was extremely inefficient (it meant locking the memory bus,
doing multiple atomic ALIGNED fetches, unlocking the memory bus, and
gluing the data together) -- and of course every VAX data access was
required to be atomic, so there was no way to skip that overhead. No
rational implementation of malloc() would ever return unaligned
addresses even though it might be "legal".
The only real solution to this has to be at the language level, an area
where POSIX and SUS can't tread. There must be language syntax, and it
must be general and simple. I don't recall the context of discussions
cited regarding an "isolated" keyword, but I doubt that'd be practical
or usable except in, uh, "isolated" instances.
Better might be a general compiler option, perhaps a standard #pragma,
to force all "discrete" data allocations to be sufficiently isolated
for atomic access on the target hardware. At the simplest (and most
easily usable) level it would appear in a header file (perhaps
<pthread.h>?) to cause all externs, statics, and allocated return
values (e.g., from malloc()) to be sufficiently separated to ensure
atomicity with respect to other values so allocated.
But... what about 'char foo[2];'? Clearly the address "&foo" must be
"aligned". But what about "&foo[1]"? If it IS, then you really need
to force the compiler to change the definition of sizeof(char) in
that compilation scope or break many patterns in previously portable
code. For example, "char *bar = foo; bar[1] = 0;". (One could
construct nastier examples that would be harder to detect and fix.)
What about structures? Is each field in the structure expanded?
Essentially what we're saying is that if the machine can access
'long', but not 'int', 'short', or 'char', atomically, then we
really allocate nothing smaller than 'long'. Is that acceptable?
How does it impact application code (and data sizes)?
The best strategy would probably be to say that AN array or A
structure is an "atomicity unit". You don't, by default, gain any
guaranteed atomic access to members of the unit. (This could be
provided for by an additional pragma, or by something like
'isolated'; though the pragma would probably be cleaner.)
Often we want a larger alignment than strictly needed, for
efficiency. The best unit here is almost always the machine's cache
line size -- a value not commonly communicated to application code.
This has proven particularly critical in designing data structures
for NUMA environments, but compiler support tends to be pretty bad.
Perhaps something like "#pragma align_all ({cache|atomicity})"
(Where "cache" is required to subsume "atomicity", just to remove
ambiguity.)
I'm not entirely sure that'd be sufficient, either, but it's another
idea to consider.
--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/
-------- Original Message --------
From: "Garry Lancaster" <glanc...@ntlworld.com>
Newsgroups: comp.programming.threads
Subject: Re: Memory isolation
Message-ID: <mVTt9.7507$Af5.2...@newsfep2-win.server.ntli.net>
Date: Thu, 24 Oct 2002 16:00:07 +0100
[snip]
David Butenhof:
> The only real solution to this has to be at the language level,
> an area where POSIX and SUS can't tread.
I think POSIX *could* do it, but the languages *should*
do it. But then I also think that a lot of what POSIX
currently does would, in an ideal world, be done by
the languages. At the moment this has somehow
fallen through the cracks because it's quite subtle.
> There must be language syntax, and it must
> be general and simple. I don't recall the context of discussions
> cited regarding an "isolated" keyword, but I doubt that'd be
> practical or usable except in, uh, "isolated" instances.
>
> Better might be a general compiler option, perhaps a standard
> #pragma, to force all "discrete" data allocations to be
> sufficiently isolated for atomic access on the target hardware.
> At the simplest (and most easily usable) level it would appear
> in a header file (perhaps <pthread.h>?) to cause all externs,
> statics, and allocated return values (e.g., from malloc()) to
> be sufficiently separated to ensure atomicity with respect to
> other values so allocated.
We have to be careful not to confuse atomicity with
isolation. Depending on exactly how you define it,
atomicity is probably sufficient for isolation, but isolation
is not sufficient for atomicity e.g. a machine with only
byte access to memory can isolate multi-byte words,
but cannot access them atomically (at least not without
a global system lock or some other extra form of
synchronisation.)
If I can take it that you're actually talking about isolation
rather than full atomicity, I tend to agree with most of
what you write. Anyway, I make that assumption in my
subsequent comments...
You write that we need to isolate globals and dynamics.
As Alexander pointed out earlier, you also need
to isolate "thread private" objects. This includes
automatics (a.k.a. stack dwellers). In very many cases
the compiler has to do nothing special at all in order to
isolate automatics (specifically, where it can prove that
all objects existing within a given natural word/isolation
boundary are only accessed by the same single thread),
so it shouldn't waste much space, but requiring automatics
also to be isolated spells out that those cases where
special action is necessary must be dealt with correctly
by the compiler.
> But... what about 'char foo[2];'? Clearly the address "&foo"
> must be "aligned". But what about "&foo[1]"? If it IS, then
> you really need to force the compiler to change the definition
> of sizeof(char) in that compilation scope or break many
> patterns in previously portable code. For example, "char
> *bar = foo; bar[1] = 0;". (One could construct nastier
> examples that would be harder to detect and fix.)
Changing sizeof(char) is a no-no. This, and the
rules for sizing arrays, provide a good reason why
the Java-esque default of having everything isolated
from everything else is not tenable for C and C++.
> What about structures? Is each field in the structure expanded?
> Essentially what we're saying is that if the machine can access
> 'long', but not 'int', 'short', or 'char', atomically, then we
> really allocate nothing smaller than 'long'. Is that acceptable?
> How does it impact application code (and data sizes)?
Right: this wouldn't be acceptable.
> The best strategy would probably be to say that AN array or A
> structure is an "atomicity unit". You don't, by default, gain
> any guaranteed atomic access to members of the unit. (This could
> be provided for by an additional pragma, or by something like
> 'isolated'; though the pragma would probably be cleaner.)
Yes, and/but:
- For the reasons stated above, I prefer "isolation unit".
- For what are properly known in C and C++ as arrays
of arrays, but which are often termed multi-dimensional
arrays, only the topmost array is an isolation unit (for one
thing because the language rules insist that T a[n][n] always
has to be n times the size of T b[n]). In contrast, structs
(and classes and unions) are allowed extra internal byte
padding, so they can always be isolation units.
- Objects that are not members of a struct/class/union
nor elements of an array should be in their own isolation
unit. An entirely non-isolated object is a dangerous
thing in a multi-threaded program: the languages would
do no service to their users by permitting it.
- The idea of a standard #pragma is, at least currently,
a contradiction in terms: they are specified to be used
for *implementation-defined* purposes. There
is always the possibility of changing this by introducing
the first ever standard #pragma, but I think it would be
difficult to sell this as better than a new keyword. (Plus
there is general dislike of the pre-processor amongst
the C++ standards people: they are unlikely to go for
anything that extends its role.) You don't need a pragma
anyway: when you need additional isolation units, just
refactor into multiple structs. For example,
// Members not guaranteed isolated from each other.
struct a {
char b;
char c;
};
// Members guaranteed isolated from each other.
struct ia {
struct { char b; } bb;
struct { char c; } cc;
};
Admittedly an anonymous-struct would be a nice
extension here, but for single member isolation
we can, in C++ at least, use an anonymous union
to permit the access syntax to remain unchanged.
// Members guaranteed isolated from each other.
// Anonymous-union syntax is C++ only.
struct ia2 {
union { char b; };
union { char c; };
};
> Often we want a larger alignment than strictly needed, for
> efficiency. The best unit here is almost always the machine's
> cache line size -- a value not commonly communicated to
> application code. This has proven particularly critical in
> designing data structures for NUMA environments, but compiler
> support tends to be pretty bad.
>
> Perhaps something like "#pragma align_all ({cache|atomicity})"
> (Where "cache" is required to subsume "atomicity", just to
> remove ambiguity.)
>
> I'm not entirely sure that'd be sufficient, either, but it's
> another idea to consider.
Aligning to cache lines *is* something that is suitable
for a #pragma: an environment-specific efficiency tweak.
This wouldn't be something that a language standard
would specify.
Kind regards
Garry Lancaster
Codemill Ltd
Visit our web site at http://www.codemill.net
-------- Original Message --------
From: "Garry Lancaster" <glanc...@ntlworld.com>
Newsgroups: comp.programming.threads
Subject: Re: Memory isolation
Message-ID: <EN7u9.8900$Af5.3...@newsfep2-win.server.ntli.net>
Alexander Terekhov:
> After spending some time [thanks to "mainframe schedulers: if you
> don't have cpu utilization at 99+%, something is seriously wrong"]
> trying to digest messages from Garry Lancaster and David Butenhof,
> I'm now thinking in the following direction:
>
> 1. std::thread_allocator<T>, thread_new/thread_delete, operator
> thread_new/operator thread_delete, thread_malloc()/thread_free(),
> etc. -- thread specific memory allocation facilities that would
> allow slightly more optimized/less expensive operations with
> respect to synchronization and isolation for data that is *NOT*
> meant to be thread-shared.
Alexander's later correction:
> Well, "*NOT* meant to be thread-shared" was probably confusing.
> The allocator AND all its allocated objects COULD be accessed by
> different threads, but serialized/synchronized -- with precluded
> asynchrony on some "higher" level. Sort of "dynamic segmented
> stack model" where the entire stack can be passed from thread to
> thread [if needed].
If I understand your correction correctly, all objects allocated
using these per-thread techniques are isolated except that
those allocated on the same thread need not be isolated
from each other. Makes sense.
I can think of two reasons why you might want thread-specific
allocation as part of a language standard:
1. You can reduce the padding between adjacent allocations
if you know they are only going to be used by the same
thread since isolation with respect to each other is not an
issue. This is sound in theory, however, the smallest allocation
chunks in most language library allocation routines are already
at or beyond the granularity of the natural isolation/word boundary.
If you only ask for 1 byte, you probably get 8 or 16 in many cases.
This happens because general purpose allocators need to
supply memory aligned to the maximum alignment requirement
of any type in the system and because of the bookkeeping space
overhead of small allocations. Your type-specific
std::thread_allocator<T> could get around the alignment issue,
but it is likely that a relatively simple user-defined allocator
tailored for a specific purpose could out-perform it, so why
bother supplying this half-way house as standard?
2. Per-thread allocators can avoid the need for global
synchronization during each allocation and deallocation by
maintaining per-thread allocation and free lists etc. But
this doesn't require a special interface - thread local
storage is just as available to the current allocation
interfaces as it would be to your newly suggested ones.
I'm guessing things like the Hoard allocator do this.
So there is no advantage over what we have now. At
first glance you might think that using std::thread_allocator<T>
could get around the need to use TLS to implement
this, but the standard allocator interface doesn't work
like that: any state must be shared between objects.
(Bizarrely, all allocators of the same type must be able
to free each other's allocations. Don't ask me why it is
that way. It just is.)
> 2. "isolation" scopes [might also be nested; possibly] for defs of
> objects of static storage duration and non-static class members:
>
> isolated {
>
> static char a;
> static char b;
> static mutex m1;
>
> }
>
> isolated {
>
> static char c;
> static char d;
> static mutex m2;
>
> }
For ease of comparison I'll re-write your examples as I would
write them if the isolation rules I proposed were in place.
(Just to show that we don't *need* a new keyword.)
static struct {
char a;
char b;
mutex m1;
} e;
static struct {
char c;
char d;
mutex m2;
} f;
or
static char a;
static char b;
static mutex m1;
static char c;
static char d;
static mutex m2;
That last is "over-isolated" compared to the others,
but given the relatively small amount of static data
in most programs any extra padding is likely to be
negligible (and the mutexes will most likely already be
aligned and padded to avoid cross-thread
interference in any case).
(Corrections applied to following.)
> struct something {
>
> isolated {
>
> char a;
> char b;
> mutex m1;
>
> }
>
> isolated {
>
> char c;
> char d;
> mutex m2;
>
> }
>
> } s; // isolated by default -- see below
struct something {
struct internal {
char a;
char b;
mutex m1;
};
struct internal c;
struct internal d;
} s;
> This would allow one to clearly express isolation boundaries.
Your use of the "isolated" keyword is sufficient, but it's
not necessary.
> By default, definitions of objects of static storage duration
> shall be treated as being isolated from each other:
>
> static char a; // isolated { static char a; }
> static char b; // isolated { static char b; }
Yes, I agree, and so do the rules I posted.
> Objects of automatic storage duration need NOT be isolated
> [the isolation of the entire thread stack aside] unless an
> address/ref is taken and it can't be proven that access to
> it from some other thread is impossible.
I agree with your intent, but you don't need to say that.
Just say they "shall be isolated" and let the implementations
figure out what they actually have to do to achieve it for
each object. If that's nothing and they can easily deduce
that during compilation, they will do.
> 3. Array elements can be made isolated ONLY using class type
> with "isolated" member(s):
>
> char c_array[2]; // no isolation with respect to elems
This is the same with my model.
(Corrections applied to following.)
> struct isolated_char {
>
> isolated { char c; }
>
> } ic_array[2]; // fully isolated ic_array[0].c
> // and ic_array[1].c
struct ichar { char c; } ic_array[2];
I think the two sets of rules are the same except that
in your model sub-objects or array elements of
class-type are not guaranteed to be isolated from
their "sibling" sub-objects or elements, and in mine
they are. Both models work.
In other words your "isolated" has the same semantics
with respect to isolation as sub-object structs/classes/
unions in mine.
I don't think there is any real difference in the
isolation boundaries achievable with the two sets
of rules: they just differ in their defaults and how
you control them.
So, the default amount of isolation in your model
is slightly less than in mine, which means you are
forced to hand-tweak the isolation boundaries
slightly more often to ensure isolation safety. In
favour of your rules you will undoubtedly save a
few bytes here and there in many programs. Since
ideally we would just be standardising current
practice, it would be interesting to know what current
compilers do with respect to isolation units (if they
even consider them).
The other main difference is the addition of the
keyword. Why are new keywords a bad thing?
- Any programs that already use the identifier
"isolated" (e.g. for a variable or type) will break. If
you choose the uglier "__isolated" instead you would
avoid breaking standard-conforming programs
provided no compiler vendor had already used this
as an extension. (The language standards say that
names containing double underscores are reserved
for implementations.)
- You create a mismatch between pre- and post-
isolated-aware code. Any single use of the new keyword
means the program will not compile on a pre-isolated
compiler.
> 4. Introduce something ala offsetof-"magic" with respect to
> alignment/padding that would provide the means to write
> thread-shared *AND* thread-private allocators entirely in
> standard C/C++.
What problems are there at the moment that wouldn't
be fixed by either set of suggested isolation rules?
> 5. In the single threaded "mode", isolation scopes can simply
> be ignored.
Again, you are right, but you do not need to say so
explicitly: implementations can figure that out for
themselves, provided they can tell the difference
between a single- and a multi-threaded build.
I bet that most of them will choose to keep the sizes
of all types the same across the different build models
though. Doing otherwise is not wrong but is likely to
break code that works but assumes more than it
should about structure layouts. (Some people think it
is a good thing for compilers to go out of their way to
break non-conforming code, though. Maybe I'm too
soft ;-)
Kind regards
Garry Lancaster
Codemill Ltd
Visit our web site at http://www.codemill.net
-------- Original Message --------
From: Alexander Terekhov <tere...@web.de>
Newsgroups: comp.programming.threads
Subject: Re: Memory isolation
Date: Fri, 25 Oct 2002 20:31:09 +0200
Message-ID: <3DB98DED...@web.de>
Garry Lancaster wrote:
>
> Alexander Terekhov:
> > After spending some time [thanks to "mainframe schedulers: if you
> > don't have cpu utilization at 99+%, something is seriously wrong"]
> > trying to digest messages from Garry Lancaster and David Butenhof,
> > I'm now thinking in the following direction:
> >
> > 1. std::thread_allocator<T>, thread_new/thread_delete, operator
> > thread_new/operator thread_delete, thread_malloc()/thread_free(),
> > etc. -- thread specific memory allocation facilities that would
> > allow slightly more optimized/less expensive operations with
> > respect to synchronization and isolation for data that is *NOT*
> > meant to be thread-shared.
>
> Alexander's later correction:
> > Well, "*NOT* meant to be thread-shared" was probably confusing.
> > The allocator AND all its allocated objects COULD be accessed by
> > different threads, but serialized/synchronized -- with precluded
> > asynchrony on some "higher" level. Sort of "dynamic segmented
> > stack model" where the entire stack can be passed from thread to
> > thread [if needed].
>
> If I understand your correction correctly, all objects allocated
> using these per-thread techniques are isolated except that
> those allocated on the same thread need not be isolated
> from each other.
Well, I'd say that all objects allocated by the same allocator
are isolated from all other objects allocated by some other
allocator(s) but aren't necessarily isolated with respect to
each other. This would mean that the allocator and all its
allocated objects shall be accessed by only one thread at any
time, but the "ownership" can be "transferred" from thread to
thread [optionally; if needed/wanted].
> Makes sense.
>
> I can think of two reasons why you might want thread-specific
> allocation as part of a language standard:
>
> 1. You can reduce the padding between adjacent allocations
> if you know they are only going to be used by the same
> thread since isolation with respect to each other is not an
> issue.
Yes.
> This is sound in theory, however, the smallest allocation
> chunks in most language library allocation routines are already
> at or beyond the granularity of the natural isolation/word boundary.
> If you only ask for 1 byte, you probably get 8 or 16 in many cases.
> This happens because general purpose allocators need to
> supply memory aligned to the maximum alignment requirement
> of any type in the system and because of the bookkeeping space
> overhead of small allocations.
Well, yes. And even the "buckets"-things like
http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixbman/prftungd/2365c35.htm#HDRI45811
(see MALLOCBUCKETS...)
http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixprggd/genprogc/malloc_buckets.htm
("Malloc Buckets")
have some "restrictions" w.r.t. sizing/alignment:
"The bucket sizing factor must be a multiple of 8 for 32-bit
implementations and a multiple of 16 for 64-bit implementations
in order to guarantee that addresses returned from malloc
subsystem functions are properly aligned for all data types."
> Your type-specific
> std::thread_allocator<T> could get around the alignment issue,
I'm not sure how one would "get around the alignment issue"...
> but it is likely that a relatively simple user-defined allocator
> tailored for a specific purpose could out-perform it, so why
> bother supplying this half-way house as standard?
Well, yes. I've played a bit with "user-defined allocator
tailored for a specific purpose" myself. You might want to
take a look at the following: [that's rather old stuff, but
it illustrates some ideas -- modulo bugs ;-) ]
http://www.terekhov.de/hsamemal.hpp
http://www.terekhov.de/hsamemal.inl
http://www.terekhov.de/hsamemal.cpp
http://www.terekhov.de/hsamemal.c
But I'd really prefer to use something "Standard" instead.
[...]
> > 2. "isolation" scopes [might also be nested; possibly] for defs of
> > objects of static storage duration and non-static class members:
> >
> > isolated {
> >
> > static char a;
> > static char b;
> > static mutex m1;
> >
> > }
> >
> > isolated {
> >
> > static char c;
> > static char d;
> > static mutex m2;
> >
> > }
>
> For ease of comparison I'll re-write your examples as I would
> write them if the isolation rules I proposed were in place.
> (Just to show that we don't *need* a new keyword.)
>
> static struct {
> char a;
> char b;
> mutex m1;
> } e;
> static struct {
> char c;
> char d;
> mutex m2;
> } f;
>
> or
>
> static char a;
> static char b;
> static mutex m1;
> static char c;
> static char d;
> static mutex m2;
>
> That last is "over-isolated" compared to the others,
> but given the relatively small amount of static data
> in most programs any extra padding is likely to be
> negligible (and the mutexes will most likely already be
> aligned and padded to avoid cross-thread
> interference in any case).
>
> (Corrections applied to following.)
> > struct something {
> >
> > isolated {
> >
> > char a;
> > char b;
> > mutex m1;
> >
> > }
> >
> > isolated {
> >
> > char c;
> > char d;
> > mutex m2;
> >
> > }
> >
> > } s; // isolated by default -- see below
>
> struct something {
> struct internal {
> char a;
> char b;
> mutex m1;
> };
> struct internal c;
> struct internal d;
> } s;
>
> > This would allow one to clearly express isolation boundaries.
>
> Your use of the "isolated" keyword is sufficient, but it's
> not necessary.
Apart from the problem of "over-isolation". ;-)
> > By default, definitions of objects of static storage duration
> > shall be treated as being isolated from each other:
> >
> > static char a; // isolated { static char a; }
> > static char b; // isolated { static char b; }
>
> Yes, I agree, and so do the rules I posted.
>
> > Objects of automatic storage duration need NOT be isolated
> > [the isolation of the entire thread stack aside] unless an
> > address/ref is taken and it can't be proven that access to
> > it from some other thread is impossible.
>
> I agree with your intent, but you don't need to say that.
> Just say they "shall be isolated" and let the implementations
> figure out what they actually have to do to achieve it for
> each object. If that's nothing and they can easily deduce
> that during compilation, they will do.
>
> > 3. Array elements can be made isolated ONLY using class type
> > with "isolated" member(s):
> >
> > char c_array[2]; // no isolation with respect to elems
>
> This is the same with my model.
Yes, you've convinced me that introduction of yet another type
qualifier [e.g. "isolated char", where sizeof( isolated char )
>= sizeof( char )] would be rather messy.
> (Corrections applied to following.)
> > struct isolated_char {
> >
> > isolated { char c; }
> >
> > } ic_array[2]; // fully isolated ic_array[0].c
> > // and ic_array[1].c
>
> struct ichar { char c; } ic_array[2];
>
> I think the two sets of rules are the same except that
> in your model sub-objects or array elements of
> class-type are not guaranteed to be isolated from
> their "sibling" sub-objects or elements, and in mine
> they are. Both models work.
Yes, I'm just fearing a bit the problem/overhead of "over-
isolation". Imagine how much memory could be wasted when
isolation is done at a granularity of around 64 bytes [the cache
line size]...
> In other words your "isolated" has the same semantics
> with respect to isolation as sub-object structs/classes/
> unions in mine.
>
> I don't think there is any real difference in the
> isolation boundaries achievable with the two sets
> of rules: they just differ in their defaults and how
> you control them.
>
> So, the default amount of isolation in your model
> is slightly less than in mine, which means you are
> forced to hand-tweak the isolation boundaries
> slightly more often to ensure isolation safety. In
> favour of your rules you will undoubtedly save a
> few bytes here and there in many programs. Since
> ideally we would just be standardising current
> practice, it would be interesting to know what current
> compilers do with respect to isolation units (if they
> even consider them).
>
> The other main difference is the addition of the
> keyword. Why are new keywords a bad thing?
They are NOT Good Things, for sure. ;-)
> - Any programs that already use the identifier
> "isolated" (e.g. for a variable or type) will break. If
> you choose the uglier "__isolated" instead you would
> avoid breaking standard-conforming programs
> provided no compiler vendor had already used this
> as an extension. (The language standards say that
> names containing double underscores are reserved
> for implementations.)
>
> - You create a mismatch between pre- and post-
> isolated-aware code. Any single use of the new keyword
> means the program will not compile on a pre-isolated
> compiler.
Yes, that's a problem. However, consider that I'm sort
of dreaming to have even more new keywords/constructs...
http://groups.google.com/groups?selm=3DA6C62A.AB8FF3D3%40web.de
(Subject: Re: local statics and TLS objects)
So, few keywords less here and there... ``big deal.'' ;-) ;-)
> > 4. Introduce something ala offsetof-"magic" with respect to
> > alignment/padding that would provide the means to write
> > thread-shared *AND* thread-private allocators entirely in
> > standard C/C++.
>
> What problems are there at the moment that wouldn't
> be fixed by either set of suggested isolation rules?
First off, under your rules, struct Char { char c; }
could be way too big [and with no gains whatsoever] for
purely "thread-specific"/"intra-thread" stuff. I really
don't like it. Under "my rules", I'd probably need to
know how much extra space needs to be added to make my
custom user allocator "inter-thread" safe.
regards,
alexander.
-------- Original Message --------
From: "Garry Lancaster" <glanc...@ntlworld.com>
Newsgroups: comp.programming.threads
Subject: Re: Memory isolation
Message-ID: <bx7v9.135$P55....@newsfep1-win.server.ntli.net>
Date: Mon, 28 Oct 2002 09:35:32 -0000
[snip]
Garry Lancaster:
> > This is sound in theory, however, the smallest allocation
> > chunks in most language library allocation routines are already
> > at or beyond the granularity of the natural isolation/word
> > boundary. If you only ask for 1 byte, you probably get 8 or 16
> > in many cases. This happens because general purpose allocators
> > need to supply memory aligned to the maximum alignment
> > requirement of any type in the system and because of the
> > bookkeeping space overhead of small allocations.
[snip]
> > Your type-specific
> > std::thread_allocator<T> could get around the alignment issue,
Alexander Terekhov:
> I'm not sure how one would "get around the alignment issue"...
I simply mean that when T is known at compile time
an allocator can be designed that only satisfies T's
alignment requirements, rather than having to satisfy
the most conservative alignment requirements of all
types.
If a concrete example helps, think of std::thread_allocator<char>
implemented as a simple array-based allocator.
Maybe this allocator discussion would be better as a separate
thread. It is only tenuously related to the main subject.
[snip]
> > Your use of the "isolated" keyword is sufficient, but it's
> > not necessary.
> Apart from the problem of "over-isolation". ;-)
It's not necessary to avoid over-isolation either.
Any data layout you can achieve with the isolated
keyword can also be achieved without it by my
suggested rules.
[snip]
> Yes, I'm just fearing a bit the problem/overhead of "over-
> isolation". Imagine how much memory could be wasted when
> isolation is done on something around 64 bytes [cache
> line size]...
Aren't you overstating your case a bit here? Aren't most
platforms' natural isolation boundaries the same or less
than their natural word size, so typically 8 bytes or less
rather than 64 bytes?
I understood from previous comments that cache size
alignment was an efficiency issue, not an isolation
issue (recall I defined isolation as what is necessary
to avoid word-tearing).
If I misunderstood then I would agree that some rethink
is required.
[snip]
> > The other main difference is the addition of the
> > keyword. Why are new keywords a bad thing?
>
> They are NOT Good Things, for sure. ;-)
>
> > - Any programs that already use the identifier
> > "isolated" (e.g. for a variable or type) will break. If
> > you choose the uglier "__isolated" instead you would
> > avoid breaking standard-conforming programs
> > provided no compiler vendor had already used this
> > as an extension. (The language standards say that
> > names containing double underscores are reserved
> > for implementations.)
> >
> > - You create a mismatch between pre- and post-
> > isolated-aware code. Any single use of the new keyword
> > means the program will not compile on a pre-isolated
> > compiler.
>
> Yes, that's a problem. However, consider that I'm sort
> of dreaming to have even more new keywords/constructs...
>
> http://groups.google.com/groups?selm=3DA6C62A.AB8FF3D3%40web.de
> (Subject: Re: local statics and TLS objects)
>
> So, few keywords less here and there... ``big deal.'' ;-) ;-)
Smileys noted, but I don't think the arguments against new
keywords are quite so easily dismissed. Particularly when
there is a counter-proposal without any new keywords.
> > > 4. Introduce something ala offsetof-"magic" with respect to
> > > alignment/padding that would provide the means to write
> > > thread-shared *AND* thread-private allocators entirely in
> > > standard C/C++.
> >
> > What problems are there at the moment that wouldn't
> > be fixed by either set of suggested isolation rules?
>
> First off, under your rules, struct Char { char c; }
> could be way too big [and with no gains whatsoever] for
> purely "thread-specific"/"intra-thread" stuff.
> I really don't like it.
(Ignoring isolation I think most people would write the above
as
typedef char Char;
anyway. But I take your general point even if I quibble over
your exact example.)
Same with (admittedly even more unlikely)
struct Char { isolated { char c; } };
under your rules.
Someone who understood isolation would tend to avoid
over-isolation in any case, so we are mostly talking about
supporting those who are not familiar with the concept.
Fair enough, that is a large enough chunk of people at the
moment (and until recently I was one of them!), and I wouldn't
expect that the standard tackling isolation would
improve matters that much.
This is a hard decision, but at the moment I prefer a
solution that offers more of a safety net for these people:
one that more often produces code that is over-isolated
and wastes a few extra bytes than code that is under-
isolated and doesn't work. Not that even my suggestion
is totally safe for these people: we already rejected the
Java rules as too inefficient. The best solution in any case
is programmer education, but is that realistic here?
[snip]
Kind regards
Garry Lancaster
Codemill Ltd
Visit our web site at http://www.codemill.net
I mean is 'volatile' a reasonable way to deal with them,
as it is with 'i/o ports'.
-Mike
Yeah. Note that Mike STILL doesn't get it (given his latest message).
>
> Your response to that particular question was ....
< Heck, it took me almost *1.5* cigarette. >
2002-11-16 10:40:01 PST [http://tinyurl.com/2st3]
<http://tinyurl.com/2r53> was posted by me to this thread pointing
to the "Talking about volatile and threads synchronization..."
c.p.t. discussion.
2002-11-17 16:49:14 PST [http://tinyurl.com/2st5]
Mike posts the question "What about objects shared with other processes."
2002-11-17 17:24:20 PST [http://tinyurl.com/2st7]
I reply "Hi Mike, contributing to the "quite a few $10^6" {growing}
amount or what?"
2002-11-17 18:35:34 PST [http://tinyurl.com/2st9]
http://groups.google.com/groups?selm=3DC9930A.10CB5BE%40web.de was
posted by me to this thread {AGAIN} pointing to the "Talking about
volatile and threads synchronization..." c.p.t. discussion.
2002-11-17 21:14:49 PST [http://tinyurl.com/2sta]
Mike replies: "I've already read those threads. You've just effectively
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
told me "Alexander is correct because Alexander says so." Huh?". And
a bit later joins Jerry in my collection of "plonkers" (I consider it
as sort of "trophies", so to speak).
Well, http://groups.google.com/groups?selm=3CA25853.2B30E9BC%40web.de
(but this time it's Mike/"above", not Jerry/"below").
regards,
alexander.
I suggested at the recent C standards meeting that we needed
to be working on thread support, but the general response was
to the effect that other people are already working on it.
I can only hope that they do a better job than I fear they
might.
I don't understand this. Could you please elaborate?
With some example(s), if possible. TIA.
regards,
alexander.
Me too.
There are lots of efforts directed toward specification of threading
libraries -- pthreads is a noteworthy example. I would like to extend
C just enough that those libraries could be written in C. Such
support could be based on minor adaptations of:
- setjmp/longjmp
- volatility
- sig_atomic_t
plus
- a library function that behaves as a barrier.
- some way of guaranteeing isolation, e.g., requiring that dynamically
allocated structs be isolated or adding an alternate version of malloc()
that provides such a guarantee.
Tom Payne
#include <extensions.h>
typedef union {
ext_cache_line_aligned_t dummy;
uint_least32_t data;
} cla_ul32;
cla_ul32 *p = malloc(sizeof(cla_ul32));
*p->data = 42; // whatever
Of course there are ways to clean up the cosmetics.
Oops, get the extra * off there and fix it up any other
way you think appropriate. The point about the union
with a special type should be clear, though.
volatile-qualification of type can be useful for the state objects of
locks, etc. For the lock-protected, thread-shared data objects it is
neither necessary nor sufficient. Besides that, it imposes a large
and unnecessary performance penalty.
Tom Payne
Clever idea.
Tom Payne
Does this mean that in addition to cache line alignment, sizeof
such union is ALWAYS guaranteed to be a multiple of "cache line"
(if sizeof(data) >= cache line)?
> + cla_ul32 *p = malloc(sizeof(cla_ul32));
Do you mean that this version of malloc will cache align all
allocations of "N * sizeof( ext_cache_line_aligned_t )" size?
I mean: what about malloc(1) and the implementation where all
pointers have the same representation?
> + *p->data = 42; // whatever
> +
> + Of course there are ways to clean up the cosmetics.
Such as
typedef union {
isolated { uint_least32_t data; }
} cla_ul32;
typedef struct {
isolated { uint_least32_t data1; }
isolated { uint_least32_t data2; }
} cla_ul32s;
<?> ;-)
regards,
alexander.
>
>t...@cs.ucr.edu wrote:
>>
>> In comp.std.c Douglas A. Gwyn <DAG...@null.net> wrote:
>> + Alexander Terekhov wrote:
>> +> "Douglas A. Gwyn" wrote:
>> +> > There is nothing to prevent introduction of some new
>> +> > data (C object) type with any access property you
>> +> > think all implementations can provide, and having
>> +> > programmers alias their other types against it using
>> +> > unions.
>> +> I don't understand this. Could you please elaborate?
>> +> With some example(s), if possible. TIA.
>> +
>> + #include <extensions.h>
>> + typedef union {
>> + ext_cache_line_aligned_t dummy;
>> + uint_least32_t data;
>> + } cla_ul32;
>
>Does this mean that in addition to cache line alignment, sizeof
>such union is ALWAYS guaranteed to be a multiple of "cache line"
>(if sizeof(data) >= cache line)?
sizeof a union is always a multiple of the lowest common multiple of
the alignment requirements of the members of the union. Otherwise
arrays couldn't exist, because array[1] wouldn't be correctly aligned.
>> + cla_ul32 *p = malloc(sizeof(cla_ul32));
>
>Do you mean that this version of malloc will cache align all
>allocations of "N * sizeof( ext_cache_line_aligned_t )" size?
>I mean: what about malloc(1) and the implementation where all
>pointers have the same representation?
malloc has to return a pointer suitably aligned for *all* types, so no
change to malloc is required.
Tom
Well, ``I know.'' Now, what if array[]/ptr arithmetic isn't used
ANYWHERE in the application?
> >> + cla_ul32 *p = malloc(sizeof(cla_ul32));
> >
> >Do you mean that this version of malloc will cache align all
> >allocations of "N * sizeof( ext_cache_line_aligned_t )" size?
> >I mean: what about malloc(1) and the implementation where all
> >pointers have the same representation?
>
> malloc has to return a pointer suitably aligned for *all* types,
> so no change to malloc is required.
A) "suitably aligned" might be a rather "tricky" thing -- open to
various interpretations, I'm afraid (see the 1354-line message).
B) If you're saying that each and every dynamic allocation shall
consume a multiple of cache line on MP (AFAIK, 64 bytes is
quite common; currently), then well, it's "OK".
regards,
alexander.
>> sizeof a union is always a multiple of the lowest common multiple of
>> the alignment requirements of the members of the union. Otherwise
>> arrays couldn't exist, because array[1] wouldn't be correctly aligned.
>
>Well, ``I know.'' Now, what if array[]/ptr arithmetic isn't used
>ANYWHERE in the application?
This isn't a practical problem worth discussing.
>> >> + cla_ul32 *p = malloc(sizeof(cla_ul32));
>> >
>> >Do you mean that this version of malloc will cache align all
>> >allocations of "N * sizeof( ext_cache_line_aligned_t )" size?
>> >I mean: what about malloc(1) and the implementation where all
>> >pointers have the same representation?
>>
>> malloc has to return a pointer suitably aligned for *all* types,
>> so no change to malloc is required.
>
>A) "suitably aligned" might be a rather "tricky" thing -- open to
> various interpretations, I'm afraid (see the 1354-line message).
>B) If you're saying that each and every dynamic allocation shall
> consume a multiple of cache line on MP (AFAIK, 64 bytes is
> quite common; currently), then well, it's "OK".
If this cache line type thing exists at all (which it may not on
current implementations - I don't know of a type with a 64 byte
alignment requirement!), then I believe that it is OK, since malloc
will have to align allocations suitably for it, on 64-byte boundaries
(making memory allocation rather wasteful!).
But I think the relevant infrastructure for this should be in POSIX,
not C and C++. Perhaps a new allocation function called ialloc
(isolated alloc) could be added, to prevent having to make malloc so
wasteful. Perhaps you have a better suggestion?
Tom
Well,
void f() {
union LOCAL {
ext_cache_line_aligned_t dummy;
X data; // sizeof(X) == cache line + 1
} cla_X;
char c;
/* ... */
// &cla_X.data is passed to some other thread
// and f() does something with its char c and
// NO array[]/ptr arithmetic (for cla_X) is
// used here. Of course, f()'s author SHOULD
// synchronize with other threads (cla_X's
// lifetime). But what about c's and cla_X's/
// .data's ISOLATION with respect to each
// other?
}
[...]
> Perhaps you have a better suggestion?
All that I currently have can be found in the "1354-line"
message and [some "extra" stuff] in this c.p.t. thread:
<http://tinyurl.com/2tua>
regards,
alexander.
I would presume that, if the size of ext_cache_line_aligned_t is as
great as that of a cache line, cla_X and c are isolated. Otherwise,
we will have an case of false sharing. If we have strong cache
coherence, the consequence of that false sharing will be performance
degradation. Otherwise, ...
+ [...]
+> Perhaps you have a better suggestion?
+
+ All that I currently have can be found in the "1354-line"
+ message and [some "extra" stuff] in this c.p.t. thread:
+
+ <http://tinyurl.com/2tua>
By the way, thanks for that 1354-line posting -- I'm still studying it.
Tom Payne
No, but the union must be properly aligned for each of its
component types.
> Do you mean that this version of malloc will cache align all
> allocations of "N * sizeof( ext_cache_line_aligned_t )" size?
Well, probably I shouldn't have used malloc but rather
have simply declared the variable.
Note that the same issue arises no matter how you want
to specify such alignment.