Memory isolation

Garry Lancaster

Oct 21, 2002, 8:01:55 AM

Hi All

Say I have two global char variables:

char c1 = 0;
char c2 = 0;

and two mutexes:

Mutex m1;
Mutex m2;

(Assume Mutex is a C++ class with the obvious Lock and
Unlock functions, wrapping a mutex API such as the
pthread_mutex_init/destroy/lock/unlock functions.)

I have two functions, f1 and f2.

void f1() {
    m1.Lock();
    c1 = 1;                  // Line X.
    if (1 != c1) abort();    // Line Y.
    c1 = 0;
    m1.Unlock();
}

void f2() {
    m2.Lock();
    c2 = 1;                  // Line Z.
    if (1 != c2) abort();
    c2 = 0;
    m2.Unlock();
}

The critical sections in f1 and f2 may run concurrently because
they use different mutexes.

I spawn several threads. Some run f1, others f2.

It is my understanding that, even though I have protected my
variables using mutexes, the two variable values may interfere
with one another. For example, on a platform with only word-
sized memory access (and, naturally, more than a single 1 byte
char per word), if the two globals reside in adjacent memory
locations the write to c2 at line Z may generate a word
read including both c1 and c2, followed by a word write of the
same. If line X is run in between this read and write, it will
effectively be ignored, since the new value will be overwritten
by the old, and the program will abort at line Y.

In other words:

State: c1 = 0 c2 = 0
Action: Line Z word read.
State: c1 = 0 c2 = 0
Action: Line X word read
State: c1 = 0 c2 = 0
Action: Line X word write
State: c1 = 1 c2 = 0
Action: Line Z word write
State: c1 = 0 c2 = 1
Action: Line Y reads c1 == 0, so (1 != c1) is true and the program aborts.

Even though each variable is protected by its own mutex,
since they are not using the *same* mutex, they still
interfere.
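
The interleaving above can be replayed deterministically. The following is a minimal single-threaded sketch, assuming a hypothetical machine that only loads and stores whole 32-bit words, with c1 in byte 0 and c2 in byte 1 of the same word. The helper names (byte_of, with_byte, replay_lost_update) are illustrative and do not come from the thread; the constexpr function bodies need C++14 or later.

```cpp
#include <cstdint>

// Extracts byte `index` (0 = c1, 1 = c2) from a 32-bit word.
constexpr std::uint8_t byte_of(std::uint32_t word, unsigned index) {
    return static_cast<std::uint8_t>((word >> (index * 8)) & 0xFFu);
}

// Returns a copy of `word` with byte `index` replaced by `value`.
// This is the "modify" step of a read-modify-write on a word-only machine.
constexpr std::uint32_t with_byte(std::uint32_t word, unsigned index,
                                  std::uint8_t value) {
    return (word & ~(0xFFu << (index * 8))) |
           (static_cast<std::uint32_t>(value) << (index * 8));
}

// Replays the interleaving from the post and returns the value of c1
// that line Y would observe.
constexpr std::uint8_t replay_lost_update() {
    std::uint32_t memory = 0;              // State: c1 = 0, c2 = 0
    const std::uint32_t z_copy = memory;   // Line Z: word read
    const std::uint32_t x_copy = memory;   // Line X: word read
    memory = with_byte(x_copy, 0, 1);      // Line X: word write, c1 = 1
    memory = with_byte(z_copy, 1, 1);      // Line Z: word write, stale c1
    return byte_of(memory, 0);             // Line Y reads c1
}

static_assert(replay_lost_update() == 0,
              "line X's update to c1 is lost, so line Y aborts");
```

Line Z writes back the copy of the word it read before line X ran, so c1 reverts to 0 even though neither function ever touches the other's variable by name.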

I know the avoidance of this behaviour is part of what is
meant by atomicity. But, since it is not the whole of what
is meant (it does not address interruptibility), I am currently
using a different term: isolation. (I hope someone will
correct me if there is a standard term for this.)

Is the scenario I post possible under pthreads (or any other
threading system for that matter) or have I missed something
that means the problem will not occur?

If lack of isolation *is* a problem, what is the most portable
solution?

Thanks in advance.

Kind regards

Garry Lancaster

Alexander Terekhov

Oct 21, 2002, 8:46:29 AM

Garry Lancaster wrote:
[...]

> If lack of isolation *is* a problem, what is the most portable
> solution?

Volatile! Ok, ok. Sort of kidding [DEC/Compaq/NewHP's "strong-volatile" aside].

< Forward Inline >

-------- Original Message --------
Message-ID: <3DB3CF9B...@web.de>
Date: Mon, 21 Oct 2002 11:57:47 +0200
Newsgroups: gmane.comp.lib.boost.devel
Subject: Re: "Thread-safe", etc. definitions [was: Re: Threads Review]

Joe Gottman wrote:
>
> David Abrahams wrote:
> [...]
> > * "Thread-safe": this is a term which is well-defined for programs. It
> > is /not/ well-defined for classes or functions, though the
> > documentation uses it that way repeatedly. ....
>
> The SGI STL implementation gives a good definition of the minimal acceptable
> level of thread safety in a class that may be used in multi-threaded
> programs (see http://www.sgi.com/tech/stl/thread_safety.html ). To

Well, the POSIX version is here [missing pthread_once() and *erroneously*
mentioned pthread_cond_signal()/ _broadcast() aside]:

http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap04.html#tag_04_10
(General Concepts, Memory Synchronization)

> paraphrase, it says that simultaneous accesses to distinct objects are
> safe, and simultaneous read accesses to shared objects are safe. If a class
> obeys these two rules, then it should be possible to make a shared object of
> that class thread safe by putting a mutex, critical section, etc. around
> any write accesses to that object.

Unfortunately, this is NOT true because, on the one hand, POSIX doesn't address
the problem of word-tearing with respect to "adjacent" objects in either
C or C++ and, on the other hand, neither C nor C++ acknowledges the
existence of threads or of shared memory access by threads/processes.
You might want to take a look at this thread:

http://groups.google.com/groups?threadm=3D36E35C.D8C511AE%40web.de
(comp.std.c, Subject: "memory location")

regards,
alexander.

David Butenhof

Oct 21, 2002, 9:58:57 AM

Garry Lancaster wrote:

There is no completely portable solution, because standards do not provide
means to control the exact layout of data in memory, nor the instructions
generated by a compiler to access them. (Even "volatile" provides only very
loose constraints on the instructions used, and they're not useful here.)

Note that aside from problems that can destroy data, like "word tearing",
there are performance problems such as "false sharing". False sharing won't
hurt your final data (or even intermediate data), but can drastically
affect your performance when multiple threads (running on separate CPUs)
concurrently write to non-adjacent data in the same cache line(s). (Because
of cache invalidate thrashing in the memory system.)

Your best bet to avoid the functional problems and minimize the performance
risks is to avoid declaring shared data as you've shown. That is, instead
of:

char c1 = 0;
char c2 = 0;

Mutex m1;
Mutex m2;

That not only places the shared data adjacent to each other (in most
implementations), but actually interleaves the shared data to guarantee
you'll get cache conflicts. (c1 is separated from m1; while c1 and c2, and
m1 and m2, are pushed together.)

Instead, use:

char c1 = 0;
Mutex m1;
char c2 = 0;
Mutex m2;

This still doesn't guarantee cache isolation, though at least you know that
the machine is far less likely to have atomicity problems accessing c1 and
c2 with respect to each other. For one thing, on most machines without
atomic access to the char data type, the compiler will generate padding
between the char and the Mutex (which most likely has wider data, such as
int or long or pointer).

Or even better,

typedef struct { char c; Mutex m; } Data;
Data *d1;
Data *d2;

d1 = malloc(sizeof(Data));
d2 = malloc(sizeof(Data));

Now you're letting the heap manager buy you some reasonable minimal data
alignment, as well as a high likelihood (though still not a guarantee) that
the two allocations will be in separate cache lines. For further assurance,
you could easily pad the allocations to some reasonable size; 64 bytes is a
common cache line size.

"Mounting" your data into a structure comes as close as you can in C to
controlling the actual layout of data in memory.

While this is a little less trivially simple than the original, it's not
horrendously complicated, either. It'll buy you a lot of flexibility to
adapt to various architectures, as well as a fair level of builtin basic
protection.
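
The padding idea can be written down directly in later C++. This is a minimal sketch using C++11 alignas, which postdates this thread. The 64-byte figure is the "common cache line size" mentioned above and is an assumption, not a portable constant; std::mutex stands in for the Mutex class, and touch is an illustrative name.

```cpp
#include <cstddef>
#include <mutex>

// Assumed cache line size; 64 bytes is common but not universal.
constexpr std::size_t kAssumedCacheLine = 64;

// Each Data object starts on an assumed cache line boundary. Because
// sizeof is always a multiple of alignof, it also ends on one, so two
// Data objects never share an assumed cache line.
struct alignas(kAssumedCacheLine) Data {
    char c = 0;
    std::mutex m;  // stand-in for the thread's Mutex class
};

static_assert(alignof(Data) == kAssumedCacheLine,
              "Data must start on a line boundary");
static_assert(sizeof(Data) % kAssumedCacheLine == 0,
              "Data must occupy whole lines");

// Updates d.c under its own lock, mirroring f1/f2 from the original post.
inline void touch(Data& d) {
    std::lock_guard<std::mutex> lock(d.m);
    d.c = 1;
    d.c = 0;
}
```

Objects with static or automatic storage get the requested alignment from the compiler. For heap objects, plain new only honors over-aligned types from C++17 onward, so older code would need a platform-specific aligned allocator.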

--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/

Max Khesin

Oct 21, 2002, 3:25:43 PM

why not something like this:

char sharedData[sizeof(int)+1];

char& c1 = sharedData[0];
char& c2 = sharedData[sizeof(int)];

this would seem (assuming "int" is the largest size read at once) to
sufficiently separate the data.
?

--
========================================
Max Khesin, software developer -
m...@cNvOiSsPiAoMntech.com
[check out our image compression software at www.cvisiontech.com, JBIG2-PDF
compression @
www.cvisiontech.com/cvistapdf.html]

"David Butenhof" <David.B...@compaq.com> wrote in message
news:BKTs9.20$ZG7.4...@news.cpqcorp.net...

Alexander Terekhov

Oct 21, 2002, 4:09:49 PM

Max Khesin wrote:
>
> why not something like this:
>
> char sharedData[sizeof(int)+1];
>
> char& c1 = sharedData[0];
> char& c2 = sharedData[sizeof(int)];
>
> this would seem (assuming "int" is the largest size read at once) to
> sufficiently separate the data.

assert( sizeof( char ) == sizeof( int ) );

[...]

> > Instead, use:
> >
> > char c1 = 0;
> > Mutex m1;
> > char c2 = 0;
> > Mutex m2;

assert( sizeof( char ) == sizeof( Mutex ) );

> > This still doesn't guarantee cache isolation, though at least you know
> that
> > the machine is far less likely to have atomicity problems accessing c1 and
> > c2 with respect to each other. For one thing, on most machines without
> > atomic access to the char data type, the compiler will generate padding
> > between the char and the Mutex (which most likely has wider data, such as
> > int or long or pointer).
> >
> > Or even better,
> >
> > typedef struct {char c; Mutex m} Data;
> > Data *d1;
> > Data *d2;
> >
> > d1 = malloc (sizeof Data);
> > d2 = malloc (sizeof Data);

assert( 2*sizeof( Data ) <= sizeof( pthread_memory_granule_np_t ) );

http://groups.google.com/groups?selm=yahiuqxlycg.fsf%40berling.diku.dk

"....
However, I still think that malloc(1) on an implementation where all
pointers have the same representation may still return a pointer that
is not "aligned" in the everyday, non-standardese meaning of the term.
Because it is undefined anyway what happens when one tries to use the
pointer to access an object bigger than the size I asked malloc() for."

regards,
alexander.

Garry Lancaster

Oct 22, 2002, 5:02:06 AM

Max Khesin:

> why not something like this:
>
> char sharedData[sizeof(int)+1];
>
> char& c1 = sharedData[0];
> char& c2 = sharedData[sizeof(int)];
>
> this would seem (assuming "int" is the largest size read at once) to
> sufficiently separate the data.
> ?

There are two reasons why this might not provide
isolation.

The first is that the assumption may not hold.
Although it is typically assumed that int == natural
word size, and indeed, both C and C++ standards
deliver strong hints that this is the intention, there
is no absolute requirement that this is so. Indeed, I think
there are at least some compilers for 64 bit machines
where int is 32 bits (although I admit I don't know if
they have *only* 64 bit access to memory: whatever,
this arrangement is theoretically possible and would
comply with the C, C++ and POSIX threading standards).

The second is more subtle. Even if int padding is
sufficient, the first char of the array isn't necessarily the
first byte of an int-sized word, so may not be isolated from
a variable prior to it in memory (maybe one in a different
translation unit or even a library for which you don't
have the source). We would need padding before
as well as after.
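
Both objections can be checked with a little address arithmetic. This is a minimal sketch, assuming the hardware accesses memory in aligned blocks of some fixed size (called a "granule" here). The addresses, the 4-byte int, and the name same_granule are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// True when the bytes at addresses a and b fall inside the same aligned
// block of `granule` bytes. `granule` must be nonzero.
constexpr bool same_granule(std::uintptr_t a, std::uintptr_t b,
                            std::size_t granule) {
    return a / granule == b / granule;
}

// The proposed layout: c1 at offset 0 and c2 at offset sizeof(int),
// here with a hypothetical 4-byte int and an array that happens to
// start at the unaligned address 0x1003.
constexpr std::uintptr_t kArrayStart = 0x1003;
constexpr std::uintptr_t kC1 = kArrayStart;
constexpr std::uintptr_t kC2 = kArrayStart + 4;
constexpr std::uintptr_t kNeighbour = kArrayStart - 1;  // unrelated variable

// If the granule really is 4 bytes, c1 and c2 are separated.
static_assert(!same_granule(kC1, kC2, 4), "4-byte granule separates c1, c2");
// First objection: with an 8-byte granule they share a word again.
static_assert(same_granule(kC1, kC2, 8), "8-byte granule joins c1, c2");
// Second objection: c1 still shares a word with whatever precedes the array.
static_assert(same_granule(kNeighbour, kC1, 4), "c1 shares with neighbour");
```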

For what it's worth, I think David Butenhof's suggestion
for including our data inside a struct (or, of course, in
C++, a class) together with at least one synchronization
primitive is the most portable solution. But it is still not
100% guaranteed by C, C++ or POSIX standards (nor
by any other threading standard of which I am aware), and
that bothers me a little. For example, one can imagine an
implementation where synchronization primitives were
protected from cross-thread interference purely by some
kind of global system flush-and-lock during the thread
API calls rather than by virtue of their alignment and size.

I really think one - at least one - of the relevant standards
needs to address isolation in the future, so we don't have
to rely on these non-portable assumptions.

I had thought of an "isolated" keyword and I see in
the newsgroups that David Butenhof suggested
"isolate" a long time ago. But, for a number of slightly
hazy reasons, I'm not sure yet that this is the way to go.
Maybe it would be better just to standardize the
assumptions underlying today's "most portable"
approach. In other words, say that any object
containing, directly or indirectly, a synchronization primitive
sub-object is guaranteed to be isolated from all other
objects. (Note: that "indirect" refers to layering, not to
containment by pointer or reference.)

Kind regards

Garry Lancaster
Codemill Ltd
Visit our web site at http://www.codemill.net

Sergey P. Derevyago

Oct 21, 2002, 1:05:17 PM

David Butenhof wrote:
> Or even better,
>
> typedef struct {char c; Mutex m} Data;
> Data *d1;
> Data *d2;
>
> d1 = malloc (sizeof Data);
> d2 = malloc (sizeof Data);
>

IMHO it makes sense to issue these calls to malloc from different threads,
say from the owners of the appropriate pointers.
--
With all respect, Sergey. http://cpp3.virtualave.net/
mailto : ders at skeptik.net

Alexander Terekhov

Oct 22, 2002, 6:09:49 AM

Garry Lancaster wrote:
>
> Max Khesin:
> > why not something like this:
> >
> > char sharedData[sizeof(int)+1];
> >
> > char& c1 = sharedData[0];
> > char& c2 = sharedData[sizeof(int)];
> >
> > this would seem (assuming "int" is the largest size read at once) to
> > sufficiently separate the data.
> > ?
>
> There are two reasons why this might not provide
> isolation.
>
> The first is that the assumption may not hold.
> Although it is typically assumed that int == natural
> word size, and indeed, both C and C++ standards
> deliver strong hints that this is the intention, there
> is no absolute requirement that this is so. Indeed, I think
> there are at least some compilers for 64 bit machines
> where int is 32 bits (although I admit I don't know if
> they have *only* 64 bit access to memory: whatever,
> this arrangement is theoretically possible and would
> comply with the C, C++ and POSIX threading standards)

http://tru64unix.compaq.com/docs/base_doc/DOCUMENTATION/V51A_HTML/ARH9RBTE/DOCU0007.HTM#gran_sec
(3.7 Granularity Considerations)

> The second is more subtle. Even if int padding is
> sufficient, the first char of the array isn't necessarily the
> first byte of an int-sized word, so may not be isolated from
> a variable prior to it in memory (maybe one in a different
> translation unit or even a library for which you don't
> have the source). We would need padding before
> as well as after.

http://groups.google.com/groups?selm=3B57DFDE.7AC868E3%40web.de
http://groups.google.com/groups?selm=3B55DC62.2CCE3B26%40web.de
(Subject: Re: Most conforming POSIX threads implementation)

[...]

> I really think one - at least one - of the relevant standards
> needs to address isolation in the future, so we don't have
> to rely on these non-portable assumptions.

Dead accurate. Unfortunately, this isn't easy... consider:

http://www.opengroup.org/austin/docs/austin_107.txt
(Defect in XBD 4.10 Memory Synchronization (rdvk# 26), Rationale
for rejected or partial changes)

"Our advice is as follows.

Hardware that does not allow atomic accesses cannot have
a POSIX implementation on it.

We propose no changes to the standard.

Please note that the committee is not required to give advice,
this sort of topic may be better to be discussed initially on the group
reflector prior to any aardvark submission.

_____________________________________________________________________________
Page: 100 Line: 3115 Section: 4.10

Problem:

Defect code : 3. Clarification required

d...@dvv.ru (Dima Volodin) wrote:
....
The standard doesn't provide any definition on memory location [POSIX is
a C API, so it must be done in C terms?]. Also, as per standard C rules,
access to one memory location [byte?] shouldn't have any effect on a
different memory location. POSIX doesn't seem to address this issue, so
the assumption is that the usual C rules apply to multi-threaded
programs. On the other hand, the established industry practices are such
that there is no guarantee of integrity of certain memory locations when
modification of some "closely residing" memory locations is performed.
The standard either has to clarify that access to distinct memory
locations doesn't have to be locked [which, I hope, we all understand,
is not a feasible solution] or incorporate current practices in its
wording providing users with means to guarantee data integrity of
distinct memory locations. "Please advise."

---

http://groups.google.com/groups?hl=en&selm=3B0CEA34.845E7AFF%40compaq.com

Dave Butenhof (David.B...@compaq.com) wrote:
....
POSIX says you cannot have multiple threads using "a memory location"
without explicit synchronization. POSIX does not claim to know, nor
try to specify, what constitutes "a memory location" or access to it,
across all possible system architectures. On systems that don't use
atomic byte access instructions, your program is in violation of the
rules.
...."

> I had thought of an "isolated" keyword and I see in
> the newsgroups that David Butenhof suggested
> "isolate" a long time ago.

Do you have a link? Or do you mean the recent "memory location"
thread?

> But, for a number of slightly hazy reasons, I'm not sure yet
> that this is the way to go.

I'm sure it is.

> Maybe it would be better just to standardize the
> assumptions underlying today's "most portable"
> approach. In other words, say that any object
> containing, directly or indirectly, a synchronization primitive
> sub-object is guaranteed to be isolated from all other
> objects. (Note: that "indirect" refers to layering, not to
> containment by pointer or reference.)

Thread "private" data needs to be isolated too... (and adding
synchronization primitives to it would be rather silly, IMHO)

regards,
alexander.

Garry Lancaster

Oct 22, 2002, 9:31:48 AM

[snip]

Garry Lancaster:

> > I had thought of an "isolated" keyword and I see in
> > the newsgroups that David Butenhof suggested
> > "isolate" a long time ago.

Alexander Terekhov:

> Do you have a link? Or do you mean the recent "memory location"
> thread?

Yes, the "memory location" thread - actually not so long
ago, now that I check the date.

> > But, for a number of slightly hazy reasons, I'm not sure yet
> > that this is the way to go.
>
> I'm sure it is.
>
> > Maybe it would be better just to standardize the
> > assumptions underlying today's "most portable"
> > approach. In other words, say that any object
> > containing, directly or indirectly, a synchronization primitive
> > sub-object is guaranteed to be isolated from all other
> > objects. (Note: that "indirect" refers to layering, not to
> > containment by pointer or reference.)
>
> Thread "private" data needs to be isolated too... (and adding
> synchronization primitives to it would be rather silly, IMHO)

Good point. I agree this means the synchronization
primitive rule is unworkable. But I don't think this
necessarily supports "isolated". I think it just means
we need to define the existing sensible assumptions
better than I have done above. I will try to articulate
some of my concerns over "isolated" then have another
go:

1. By finally defining what is isolated and what is not
compiler writers have the chance to be more aggressive
in their packing of non-isolated data. If they avail
themselves of this opportunity, code that is written with
the current assumptions will likely break. If they don't,
what is the benefit of the new keyword over and above
standardizing existing practice? (Probably, some will
and some won't.)

2. Should isolated be part of an object's type (like const
or volatile or even, in C99, restrict) or just a storage directive
governing alignment and external padding?

If it is part of the object's type we will introduce programmers
to the concept of "isolated-correctness" which, just like
"const-correctness", will involve additional code on occasion.
(Don't get me wrong - I like const, but I'm not sure how
many more type qualifiers can be comfortably supported
by C or C++: there are already more permutations than I
would like.) The fact that isolated char will be allowed to
have different alignment requirements to char is not
compatible with the rules for the existing qualifiers (C++
standard section 3.9.3/1). The advantage of making isolated
part of the type is that the compiler may choose wider,
quicker accesses for non-isolated objects and narrower,
slightly slower accesses for isolated objects, so reducing
the alignment and padding requirements of isolated objects.

If we don't consider isolated as part of an object's type, an
external function passed a pointer or reference to a type will
not know whether it is isolated or not. Either the padding
and alignment of isolated values will have to be sufficient
to ensure any access is isolated or the type of accesses
generated by the compiler for accesses through pointers
or references will be restricted. Or maybe a bit of both.
For example, a compiler for certain versions of the Alpha
processor, which can access 32 or 64 bits at once, might
choose between external padding of 32 bit data or restricting
accesses to it to 32 bits, even when 64 bit access would
be faster. Then again, if isolated is not part of the type, how
would one declare arrays where each item is isolated from
the others?

So, instead, what about this (for C++):

"Implementations may permit (or rather, suffer from ;-)
'word-tearing', which is the modification of one object's
memory as viewed from one thread due to the modification
of a different object by a different thread. When an object A
is guaranteed not to suffer word-tearing due to the
modifications of an object B, A is said to be 'isolated' from
B. All objects shall be isolated from all others, except that:

- Sub-objects which are not of class type, or which are
base class sub-objects, need not be isolated from other
sub-objects of the same object.

- Array elements which are not of class type need not
be isolated from other elements of the same array.

[Implementation note: Implementations typically achieve
isolation using alignment and/or padding bytes. If internal
padding bytes are used, affecting the size of an object,
they must be used consistently across all objects of
that type, for both objects that are required to be isolated
and those that need not be isolated. In other words, the
isolation requirements do not permit different objects of
the same type to have different internal layouts.]"

(Notes:
- "Class type" includes structs as well as classes.
- "Object" means any variable, not just those of class type.
- "Sub-object" means an object that is a direct member of,
or direct base class of, an object of class type.)

This may not be rigorous standardese, but I hope it
communicates the intent.

Alexander Terekhov

Oct 23, 2002, 9:26:23 AM

Garry Lancaster wrote:
[...]

> 1. By finally defining what is isolated and what is not
> compiler writers have the chance to be more aggressive
> in their packing of non-isolated data. If they avail
> themselves of this opportunity, code that is written with
> the current assumptions will likely break. If they don't,
> what is the benefit of the new keyword over and above
> standardizing existing practice? (Probably, some will
> and some won't.)

The only way to avoid introduction of "isolation" is
to declare/legislate something along the lines of Java's

"A variable refers to a static variable of a loaded
class, a field of an allocated object, or element of
an allocated array. The system must maintain the
following properties with regards to variables and the
memory manager:
....
The fact that two variables may be stored in adjacent
bytes (e.g., in a byte array) is immaterial.
Two variables can be simultaneously updated by
different threads without needing to use synchronization
to account for the fact that they are ``adjacent''. "

I'm afraid. And I don't think that this can/should be done.

> 2. Should isolated be part of an object's type (like const
> or volatile or even, in C99, restrict) or just a storage directive
> governing alignment and external padding?

Non-aggregated C/C++ objects should be "isolated" by default
using alignment/padding, I believe.

Yeah, aggregated C/C++ {sub-}objects would really need an extra
"isolated" type qualifier to make it work with respect to
avoiding word-tearing, I think.

> So, instead, what about this (for C++):
>
> "Implementations may permit (or rather, suffer from ;-)
> 'word-tearing', which is the modification of one object's
> memory as viewed from one thread due to the modification
> of a different object by a different thread. When an object A
> is guaranteed not to suffer word-tearing due to the
> modifications of an object B, A is said to be 'isolated' from
> B. All objects shall be isolated from all others, except that:
>
> - Sub-objects which are not of class type, or which are
> base class sub-objects, need not be isolated from other
> sub-objects of the same object.
>
> - Array elements which are not of class type need not
> be isolated from other elements of the same array.
>

I personally don't like this classes-vs-builtins distinction.

regards,
alexander.

Garry Lancaster

Oct 23, 2002, 12:18:16 PM

Alexander Terekhov:

> The only way to avoid introduction of "isolation" is
> to declare/legislate something along the lines of Java's
>
> "A variable refers to a static variable of a loaded
> class, a field of an allocated object, or element of
> an allocated array. The system must maintain the
> following properties with regards to variables and the
> memory manager:
> ....
> The fact that two variables may be stored in adjacent
> bytes (e.g., in a byte array) is immaterial.
> Two variables can be simultaneously updated by
> different threads without needing to use synchronization
> to account for the fact that they are ``adjacent''. "
>
> I'm afraid. And I don't think that this can/should be done.

I agree that Java's approach is inappropriate for languages
such as C and C++ that need to be closer to the realities
of a wide variety of hardware. But if the premise is that
Java's way or adding an "isolated" keyword in some way
are the only two ways to sort this out, then I am not
convinced it is valid.

> > 2. Should isolated be part of an object's type (like const
> > or volatile or even, in C99, restrict) or just a storage directive
> > governing alignment and external padding?
>
> Non-aggregated C/C++ objects should be "isolated" by default
> using alignment/padding, I believe.

I'm not 100% sure what you mean by "non-aggregated" [1].
If you just mean objects that are not sub-objects of a
class or elements of an array [1], then I agree with you
that they should be isolated and, what's more, so do the
proposed isolation rules. *How* they are isolated is
an implementation decision: even for "difficult" platforms
like the Alpha, many objects can be isolated without any
extra alignment or padding, for example.

It occurs to me that thinking about isolated vs. non-isolated
actually doesn't help us that much, unless we go the Java
route and say everything is isolated. We need to think about
isolation *boundaries*. Notice in the suggested isolation
rules even those objects that aren't totally isolated are only
non-isolated with respect to their "sibling" objects in the
class or array. They are still isolated from all other objects.
As you pointed out with respect to "thread private" objects,
an entirely non-isolated object is a dangerous thing in a
multi-threaded program: its value may be corrupted by a
write to any other object on any other thread, which is why
there aren't any in the suggested rules.

[snip]

> Yeah, aggregated C/C++ {sub-}objects would really need extra
> "isolated" type qualifier to make it work with respect to
> avoiding word-tearing, I think.

With the suggested isolation rules:

struct ichar { char c; };
char a[10];  // Elements not guaranteed isolated from each other.
ichar b[10]; // Elements guaranteed isolated from each other.

No new type qualifier is required.
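
Under the suggested rules the compiler would be responsible for isolating the elements of b. Without such a rule, a programmer can approximate the effect by hand. The sketch below uses C++11 alignas, which postdates this thread, and kAssumedGranule is a hypothetical access granularity rather than a value taken from any platform's documentation. The array names are illustrative.

```cpp
#include <cstddef>

// Hypothetical width of the narrowest memory access the hardware performs.
constexpr std::size_t kAssumedGranule = 8;

// A char wrapped in a class type and aligned to the assumed granule.
// The alignment forces sizeof(ichar) up to a full granule, so each
// array element occupies a granule of its own.
struct alignas(kAssumedGranule) ichar {
    char c;
};

char plain[10];      // Elements packed into adjacent bytes.
ichar isolated[10];  // One element per assumed granule.

static_assert(sizeof(plain) == 10, "plain chars are packed");
static_assert(sizeof(ichar) == kAssumedGranule, "one element per granule");
static_assert(sizeof(isolated) == 10 * kAssumedGranule,
              "array stride equals the granule");
```

The cost is visible in the sizes: the isolated array is kAssumedGranule times larger than the packed one, which is the trade-off the rules leave to the programmer by distinguishing class types from built-in types.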

> > "Implementations may permit (or rather, suffer from ;-)
> > 'word-tearing', which is the modification of one object's
> > memory as viewed from one thread due to the modification
> > of a different object by a different thread. When an object A
> > is guaranteed not to suffer word-tearing due to the
> > modifications of an object B, A is said to be 'isolated' from
> > B. All objects shall be isolated from all others, except that:
> >
> > - Sub-objects which are not of class type, or which are
> > base class sub-objects, need not be isolated from other
> > sub-objects of the same object.
> >
> > - Array elements which are not of class type need not
> > be isolated from other elements of the same array.
> >
>
> I personally don't like this classes-vs-builtins distinction.

Well, your not liking it doesn't help very much. Pull it
apart, find the problems, then post them so we can
see if they're fixable. Even if you're still convinced that
we need an "isolated" keyword, we need to do something
like this analysis anyway in order to find sensible
semantics for the keyword, in particular, sensible
defaults (and neither everything isolated nor everything
non-isolated make sensible defaults, for what I think
are now obvious reasons).

Kind regards

Garry Lancaster
Codemill Ltd
Visit our web site at http://www.codemill.net

Notes:

1. In the C++ standard "aggregate" doesn't mean
simply a type that is a class or array, it means [8.5.1/1]
"an array or a class (clause 9) with no user-declared
constructors (12.1), no private or protected non-static
data members (clause 11), no base classes (clause 10),
and no virtual functions (10.3)." So, what an aggregated
object is seems a bit of a gray area. But I hope I grasped
your meaning correctly.

David Butenhof

Oct 24, 2002, 7:29:22 AM

Sergey P. Derevyago wrote:

> David Butenhof wrote:
>> Or even better,
>>
>> typedef struct {char c; Mutex m} Data;
>> Data *d1;
>> Data *d2;
>>
>> d1 = malloc (sizeof Data);
>> d2 = malloc (sizeof Data);
>>
> IMHO it makes sense to issue these calls to malloc from different threads,
> say from the owners of the appropriate pointers.

We're arguing subtle linguistic semantics here, but the distinction could be
important, so I'll proceed. ;-)

While there's nothing to prevent the application from considering 'd1' and
'd2' to be "owned" by different threads, that's a secondary consideration
of data lifetime. Both are implicitly SHARED data, by their association
with a mutex, and that means the most important application definition of
"ownership" is "the thread that holds the mutex". There's no point in
having the mutex unless more than one thread at any time will have access
to the address of the data.

So it MAY make sense, depending on the circumstances leading to creation of
these data, to allocate them in different threads; but there's absolutely
no reason to presume this will be the case. "In some fashion" they will be
allocated, and "in some fashion" the respective addresses shall be made
available to a collection of threads within the application.

David Butenhof

Oct 24, 2002, 8:09:19 AM

Alexander Terekhov wrote:

> Max Khesin wrote:
>>
>> why not something like this:
>>
>> char sharedData[sizeof(int)+1];
>>
>> char& c1 = sharedData[0];
>> char& c2 = sharedData[sizeof(int)];
>>
>> this would seem (assuming "int" is the largest size read at once) to
>> sufficiently separate the data.
>
> assert( sizeof( char ) == sizeof( int ) );

Again, the real problem with this alternative (as already pointed out by
several) is that a char array has no required alignment and element 0 need
not have (int*) alignment either. Now you need to do bit masking to align
the address of c1 as well as c2.
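
The bit masking in question is usually a round-up to the next multiple of a power-of-two alignment. This is a minimal sketch; align_up and first_aligned are illustrative names, and converting pointers through std::uintptr_t is implementation-defined, although it behaves as shown on mainstream flat-address platforms.

```cpp
#include <cstddef>
#include <cstdint>

// Rounds `addr` up to the next multiple of `alignment`.
// `alignment` must be a power of two.
constexpr std::uintptr_t align_up(std::uintptr_t addr, std::size_t alignment) {
    return (addr + (alignment - 1)) &
           ~static_cast<std::uintptr_t>(alignment - 1);
}

// Returns the first suitably aligned byte inside `buffer`. The caller
// must have over-allocated the buffer by at least alignment - 1 bytes.
inline char* first_aligned(char* buffer, std::size_t alignment) {
    return reinterpret_cast<char*>(
        align_up(reinterpret_cast<std::uintptr_t>(buffer), alignment));
}

static_assert(align_up(0x1003, 4) == 0x1004, "rounds up to the next word");
static_assert(align_up(0x1004, 4) == 0x1004, "aligned input is unchanged");
```

To use this for the sharedData idea, the array would have to grow by alignment - 1 bytes in front of c1, with c2 placed one full alignment unit after c1 and padded to the end of its own unit.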

> [...]
>> > Instead, use:
>> >
>> > char c1 = 0;
>> > Mutex m1;
>> > char c2 = 0;
>> > Mutex m2;
>
> assert( sizeof( char ) == sizeof( Mutex ) );

This would almost certainly still be better than having c1 and c2 in
adjacent bytes.

However, in general, yes, you've successfully described ONE (of many) of the
possibilities that caused me to say that these strategies will often help
but provide no real guarantees. I don't really see why you bothered. (Or,
even worse, why I'm bothering to respond.)

>> > This still doesn't guarantee cache isolation, though at least you know that
>> > the machine is far less likely to have atomicity problems accessing c1
>> > and c2 with respect to each other. For one thing, on most machines
>> > without atomic access to the char data type, the compiler will generate
>> > padding between the char and the Mutex (which most likely has wider
>> > data, such as int or long or pointer).
>> >
>> > Or even better,
>> >
>> > typedef struct {char c; Mutex m;} Data;
>> > Data *d1;
>> > Data *d2;
>> >
>> > d1 = malloc (sizeof (Data));
>> > d2 = malloc (sizeof (Data));
>
> assert( 2*sizeof( Data ) <= sizeof( pthread_memory_granule_np_t ) );

Again, and more directly this time: "exactly: and so what"?

> http://groups.google.com/groups?selm=yahiuqxlycg.fsf%40berling.diku.dk
>
> "....
> However, I still think that malloc(1) on an implementation where all
> pointers have the same representation may still return a pointer that
> is not "aligned" in the everyday, non-standardese meaning of the term.
> Because it is undefined anyway what happens when one tries to use the
> pointer to access an object bigger than the size I asked malloc() for."

On a machine with no address alignment requirements, there are no alignment
requirements on malloc(). But address alignment rules aren't the same as
atomic access rules, and this can complicate "isolation". On a machine like
Alpha that requires natural data alignment, a (short*) MUST have the low
address bit clear, (int*) must have 2 low address bits clear, and so forth.
Therefore an implementation of malloc() that did not return a value with
the maximum number of cleared low address bits would be erroneous. (Yes,
'malloc(1)' could return an unaligned address, 'malloc(2)' could return an
address with a single cleared low bit, and so forth, though this is an
unlikely implementation. Certainly 'malloc(2)' cannot return a value with
the low address bit set, because it cannot legally presume the storage will
be mapped to 'char[2]' rather than 'int' even though that's all the
information it has.)
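
The alignment claim can be checked empirically. The sketch below assumes a C++11 compiler (alignof postdates this thread), and the helper names is_aligned and malloc_aligned_for are invented for illustration. It only observes what one particular malloc() does for a few request sizes; it proves nothing about atomic access or isolation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// True if p is a multiple of align (address bits read via std::uintptr_t).
inline bool is_aligned(const void* p, std::size_t align) {
    return reinterpret_cast<std::uintptr_t>(p) % align == 0;
}

// Ask malloc for exactly sizeof(T) bytes and report whether the result
// could legally hold a T, i.e. whether it meets alignof(T).
template <class T>
bool malloc_aligned_for() {
    void* p = std::malloc(sizeof(T));
    bool ok = (p != nullptr) && is_aligned(p, alignof(T));
    std::free(p);
    return ok;
}
```

Calling malloc_aligned_for<short>() corresponds to the 'malloc(2)' case above: the allocator cannot know the two bytes will hold 'char[2]' rather than a short.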

It's possible (though I know of no examples except one subtly broken model
of the VAX family) that a machine without address alignment rules could
have restrictions on atomic access to unaligned data. In such an
implementation, malloc(8) might return an address with the low bit set,
restricting atomic access to that data. Possible, but unlikely. Except for
very early VAX models, unaligned data access may have been LEGAL, but was
extremely inefficient (it meant locking the memory bus, doing multiple
atomic ALIGNED fetches, unlocking the memory bus, and gluing the data
together) -- and of course every VAX data access was required to be atomic,
so there was no way to skip that overhead. No rational implementation of
malloc() would ever return unaligned addresses even though it might be
"legal".

The only real solution to this has to be at the language level, an area
where POSIX and SUS can't tread. There must be language syntax, and it must
be general and simple. I don't recall the context of discussions cited
regarding an "isolated" keyword, but I doubt that'd be practical or usable
except in, uh, "isolated" instances.

Better might be a general compiler option, perhaps a standard #pragma, to
force all "discrete" data allocations to be sufficiently isolated for
atomic access on the target hardware. At the simplest (and most easily
usable) level it would appear in a header file (perhaps <pthread.h>?) to
cause all externs, statics, and allocated return values (e.g., from
malloc()) to be sufficiently separated to ensure atomicity with respect to
other values so allocated.

But... what about 'char foo[2];'? Clearly the address "&foo" must be
"aligned". But what about "&foo[1]"? If it IS, then you really need to
force the compiler to change the definition of sizeof(char) in that
compilation scope or break many patterns in previously portable code. For
example, "char *bar = &foo[0]; bar[1] = 0;". (One could construct nastier
examples that would be harder to detect and fix.)

What about structures? Is each field in the structure expanded? Essentially
what we're saying is that if the machine can access 'long', but not 'int',
'short', or 'char', atomically, then we really allocate nothing smaller
than 'long'. Is that acceptable? How does it impact application code (and
data sizes)?

The best strategy would probably be to say that AN array or A structure is
an "atomicity unit". You don't, by default, gain any guaranteed atomic
access to members of the unit. (This could be provided for by an additional
pragma, or by something like 'isolated'; though the pragma would probably
be cleaner.)

Often we want a larger alignment than strictly needed, for efficiency. The
best unit here is almost always the machine's cache line size -- a value
not commonly communicated to application code. This has proven particularly
critical in designing data structures for NUMA environments, but compiler
support tends to be pretty bad.

Perhaps something like "#pragma align_all ({cache|atomicity})" (where
"cache" is required to subsume "atomicity", just to remove ambiguity).

I'm not entirely sure that'd be sufficient, either, but it's another idea to
consider.
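
For comparison, a sketch of how cache-line separation can be requested per type in later C++, assuming a C++11 compiler with alignas (which postdates this thread) and a 64-byte line. The constant kAssumedCacheLine and the names IsolatedChar, g_c1, g_c2, and same_line are hypothetical; the real line size is platform-specific and has to be supplied by the programmer.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Adjust for the actual target.
constexpr std::size_t kAssumedCacheLine = 64;

// Over-aligning the type also rounds its size up to a whole line,
// so no other object can share the line with 'value'.
struct alignas(kAssumedCacheLine) IsolatedChar {
    char value;
};
static_assert(alignof(IsolatedChar) == kAssumedCacheLine, "type is line-aligned");
static_assert(sizeof(IsolatedChar) == kAssumedCacheLine, "type fills one line");

IsolatedChar g_c1;
IsolatedChar g_c2;

// True if both addresses fall within the same assumed cache line.
inline bool same_line(const void* a, const void* b) {
    return reinterpret_cast<std::uintptr_t>(a) / kAssumedCacheLine ==
           reinterpret_cast<std::uintptr_t>(b) / kAssumedCacheLine;
}
```

This is opt-in per type rather than a blanket setting like the pragma idea above, and it costs 63 bytes of padding per char under the stated assumption.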

Garry Lancaster

Oct 24, 2002, 11:00:07 AM

[snip]

David Butenhof:

> The only real solution to this has to be at the language level, an area
> where POSIX and SUS can't tread.

I think POSIX *could* do it, but the languages *should*
do it. But then I also think that a lot of what POSIX
currently does would, in an ideal world, be done by
the languages. At the moment this has somehow
fallen through the cracks because it's quite subtle.

> There must be language syntax, and it must
> be general and simple. I don't recall the context of discussions cited
> regarding an "isolated" keyword, but I doubt that'd be practical or usable
> except in, uh, "isolated" instances.
>
> Better might be a general compiler option, perhaps a standard #pragma, to
> force all "discrete" data allocations to be sufficiently isolated for
> atomic access on the target hardware. At the simplest (and most easily
> usable) level it would appear in a header file (perhaps <pthread.h>?) to
> cause all externs, statics, and allocated return values (e.g., from
> malloc()) to be sufficiently separated to ensure atomicity with respect to
> other values so allocated.

We have to be careful not to confuse atomicity with
isolation. Depending on exactly how you define it,
atomicity is probably sufficient for isolation, but isolation
is not sufficient for atomicity e.g. a machine with only
byte access to memory can isolate multi-byte words,
but cannot access them atomically (at least not without
a global system lock or some other extra form of
synchronisation.)

If I can take it that you're actually talking about isolation
rather than full atomicity, I tend to agree with most of
what you write. Anyway, I make that assumption in my
subsequent comments...

You write that we need to isolate globals and dynamics.
As Alexander pointed out earlier, you also need
to isolate "thread private" objects. This includes
automatics (a.k.a. stack dwellers). In very many cases
the compiler has to do nothing special at all in order to
isolate automatics (specifically, where it can prove that
all objects existing within a given natural word/isolation
boundary are only accessed by the same single thread),
so it shouldn't waste much space, but requiring automatics
also to be isolated spells out that those cases where
special action is necessary must be dealt with correctly
by the compiler.

> But... what about 'char foo[2];'? Clearly the address "&foo" must be
> "aligned". But what about "&foo[1]"? If it IS, then you really need to
> force the compiler to change the definition of sizeof(char) in that
> compilation scope or break many patterns in previously portable code. For
> example, "char *bar = &foo[0]; bar[1] = 0;". (One could construct nastier
> examples that would be harder to detect and fix.)

Changing sizeof(char) is a no-no. This, and the
rules for sizing arrays, provide a good reason why
the Java-esque default of having everything isolated
from everything else is not tenable for C and C++.

> What about structures? Is each field in the structure expanded? Essentially
> what we're saying is that if the machine can access 'long', but not 'int',
> 'short', or 'char', atomically, then we really allocate nothing smaller
> than 'long'. Is that acceptable? How does it impact application code (and
> data sizes)?

Right: this wouldn't be acceptable.

> The best strategy would probably be to say that AN array or A structure is
> an "atomicity unit". You don't, by default, gain any guaranteed atomic
> access to members of the unit. (This could be provided for by an additional
> pragma, or by something like 'isolated'; though the pragma would probably
> be cleaner.)

Yes, and/but:

- For the reasons stated above, I prefer "isolation unit".

- For what are properly known in C and C++ as arrays
of arrays, but which are often termed multi-dimensional
arrays, only the topmost array is an isolation unit (for one
thing because the language rules insist that T a[n][n] has
always to be n times the size of T b[n]). In contrast, structs
(and classes and unions) are allowed extra internal byte
padding, so they can always be isolation units.

- Objects that are not members of a struct/class/union
nor elements of an array should be in their own isolation
unit. An entirely non-isolated object is a dangerous
thing in a multi-threaded program: the languages would
do no service to their users by permitting it.

- The idea of a standard #pragma is, at least currently,
a contradiction in terms: they are specified to be used
for *implementation-defined* purposes. There
is always the possibility of changing this by introducing
the first ever standard #pragma, but I think it would be
difficult to sell this as better than a new keyword. (Plus
there is general dislike of the pre-processor amongst
the C++ standards people: they are unlikely to go for
anything that extends its role.) You don't need a pragma
anyway: when you need additional isolation units, just
refactor into multiple structs. For example,

// Members not guaranteed isolated from each other.
struct a {
char b;
char c;
};

// Members guaranteed isolated from each other.
struct ia {
struct { char b; } bb;
struct { char c; } cc;
};

Admittedly an anonymous-struct would be a nice
extension here, but for single member isolation
we can, in C++ at least, use an anonymous union
to permit the access syntax to remain unchanged.

// Members guaranteed isolated from each other.
// Anonymous-union syntax is C++ only.
struct ia2 {
union { char b; };
union { char c; };
};

> Often we want a larger alignment than strictly needed, for efficiency. The
> best unit here is almost always the machine's cache line size -- a value
> not commonly communicated to application code. This has proven particularly
> critical in designing data structures for NUMA environments, but compiler
> support tends to be pretty bad.
>
> Perhaps something like "#pragma align_all ({cache|atomicity})" (Where
> "cache" is required to subsume "atomicity", just to remove ambiguity.)
>
> I'm not entirely sure that'd be sufficient, either, but it's another idea to
> consider.

Aligning to cache lines *is* something that is suitable
for a #pragma: an environment-specific efficiency tweak.
This wouldn't be something that a language standard
would specify.

Alexander Terekhov

Oct 24, 2002, 12:19:20 PM

After spending some time [thanks to "mainframe schedulers: if you
don't have cpu utilization at 99+%, something is seriously wrong"]
trying to digest messages from Garry Lancaster and David Butenhof,
I'm now thinking in the following direction:

1. std::thread_allocator<T>, thread_new/thread_delete, operator
thread_new/operator thread_delete, thread_malloc()/thread_free(),
etc. -- thread specific memory allocation facilities that would
allow slightly more optimized/less expensive operations with
respect to synchronization and isolation for data that is *NOT*
meant to be thread-shared.

2. "isolation" scopes [might also be nested; possibly] for defs of
objects of static storage duration and non-static class members:

isolated {

static char a;
static char b;
static mutex m1;

}

isolated {

static char c;
static char d;
static mutex m2;

}

struct something {

isolated {

char a;
char b;
mutex m1;

}

isolated {

char a;
char b;
mutex m1;

}

} s; // isolated by default -- see below

This would allow one to clearly express isolation boundaries.

By default, definitions of objects of static storage duration
shall be treated as being isolated from each other:

static char a; // isolated { static char a; }
static char b; // isolated { static char b; }

Objects of automatic storage duration need NOT be isolated
[the isolation of the entire thread stack aside] unless an
address/ref is taken and it can't be proven that access to
it from some other thread is impossible.

3. Array elements can be made isolated ONLY using class type
with "isolated" member(s):

char c_array[2]; // no isolation with respect to elems

struct isolated_char {

isolated { char c; }

} ic_array2[2]; // fully isolated ic_array[0].c
// and ic_array[1].c

4. Introduce something ala offsetof-"magic" with respect to
alignment/padding that would provide the means to write
thread-shared *AND* thread-private allocators entirely in
standard C/C++.

5. In the single threaded "mode", isolation scopes can simply
be ignored.
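
On point 4, part of that "magic" later became standard: C++11 added alignof and alignas (C11 has _Alignof and _Alignas). A small sketch, assuming a C++11 compiler; the struct Data uses a long as a stand-in for the thread's Mutex, and padding_for is a hypothetical helper name.

```cpp
#include <cstddef>

// 'long' stands in for the Mutex type discussed in the thread.
struct Data {
    char c;
    long m;
};

static_assert(alignof(char) == 1, "char has the weakest alignment");
static_assert(alignof(Data) >= alignof(long), "a struct is at least as aligned as its members");
static_assert(sizeof(Data) % alignof(Data) == 0, "size is a multiple of alignment, so arrays work");

// Bytes an allocator must skip at 'offset' so that a T can start there.
template <class T>
constexpr std::size_t padding_for(std::size_t offset) {
    return (alignof(T) - offset % alignof(T)) % alignof(T);
}
```

Note that alignof reports what the type system requires, not the hardware's isolation granule, so an allocator that needs inter-thread isolation still has to get that second number from somewhere else.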

regards,
alexander.

Alexander Terekhov

Oct 24, 2002, 1:22:20 PM

< corrections >

Alexander Terekhov wrote:
[...]

> 1. std::thread_allocator<T>, thread_new/thread_delete, operator
> thread_new/operator thread_delete, thread_malloc()/thread_free(),
> etc. -- thread specific memory allocation facilities that would
> allow slightly more optimized/less expensive operations with
> respect to synchronization and isolation for data that is *NOT*
> meant to be thread-shared.

Well, "*NOT* meant to be thread-shared" was probably confusing. The
allocator AND all its allocated objects COULD be accessed by different
threads, but serialized/synchronized -- with precluded asynchrony on
some "higher" level. Sort of "dynamic segmented stack model" where
the entire stack can be passed from thread to thread [if needed].

[...]

> struct something {
>
> isolated {
>
> char a;
> char b;
> mutex m1;
>
> }
>
> isolated {
>
> char a;
> char b;
> mutex m1;
>
> }
>
> } s; // isolated by default -- see below

struct something {

isolated {

char a;
char b;
mutex m1;

}

isolated {

char c;
char d;
mutex m2;

}

} s; // isolated by default -- see below

[...]

> char c_array[2]; // no isolation with respect to elems
>
> struct isolated_char {
>
> isolated { char c; }
>
> } ic_array2[2]; // fully isolated ic_array[0].c
> // and ic_array[1].c

char c_array[2]; // no isolation with respect to elems

struct isolated_char {

isolated { char c; }

} ic_array[2]; // fully isolated ic_array[0].c
// and ic_array[1].c

regards,
alexander.

Garry Lancaster

Oct 25, 2002, 5:04:06 AM

Alexander Terekhov:

> After spending some time [thanks to "mainframe schedulers: if you
> don't have cpu utilization at 99+%, something is seriously wrong"]
> trying to digest messages from Garry Lancaster and David Butenhof,
> I'm now thinking in the following direction:
>
> 1. std::thread_allocator<T>, thread_new/thread_delete, operator
> thread_new/operator thread_delete, thread_malloc()/thread_free(),
> etc. -- thread specific memory allocation facilities that would
> allow slightly more optimized/less expensive operations with
> respect to synchronization and isolation for data that is *NOT*
> meant to be thread-shared.

Alexander's later correction:

> Well, "*NOT* meant to be thread-shared" was probably confusing. The
> allocator AND all its allocated objects COULD be accessed by different
> threads, but serialized/synchronized -- with precluded asynchrony on
> some "higher" level. Sort of "dynamic segmented stack model" where
> the entire stack can be passed from thread to thread [if needed].

If I understand your correction correctly, all objects allocated
using these per-thread techniques are isolated except that
those allocated on the same thread need not be isolated
from each other. Makes sense.

I can think of two reasons why you might want thread-specific
allocation as part of a language standard:

1. You can reduce the padding between adjacent allocations
if you know they are only going to be used by the same
thread since isolation with respect to each other is not an
issue. This is sound in theory, however, the smallest allocation
chunks in most language library allocation routines are already
at or beyond the granularity of the natural isolation/word boundary.
If you only ask for 1 byte, you probably get 8 or 16 in many cases.
This happens because general purpose allocators need to
supply memory aligned to the maximum alignment requirement
of any type in the system and because of the bookkeeping space
overhead of small allocations. Your type-specific
std::thread_allocator<T> could get around the alignment issue,
but it is likely that a relatively simple user-defined allocator
tailored for a specific purpose could out-perform it, so why
bother supplying this half-way house as standard?

2. Per-thread allocators can avoid the need for global
synchronization during each allocation and deallocation by
maintaining per-thread allocation and free lists etc. But
this doesn't require a special interface - thread local
storage is just as available to the current allocation
interfaces as it would be to your newly suggested ones.
I'm guessing things like the Hoard allocator do this.
So there is no advantage over what we have now. At
first glance you might think that using std::thread_allocator<T>
could get around the need to use TLS to implement
this, but the standard allocator interface doesn't work
like that: any state must be shared between objects.
(Bizarrely, all allocators of the same type must be able
to free each other's allocations. Don't ask me why it is
that way. It just is.)
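
To illustrate point 2, here is a rough sketch of a per-thread free list hidden behind an ordinary allocate/free pair. It assumes a C++11 compiler (thread_local postdates this thread) and fixed-size blocks; kBlockSize, FreeNode, tl_alloc, and tl_free are made-up names. It is not how Hoard or any other real allocator works.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical fixed block size served by this toy allocator.
constexpr std::size_t kBlockSize = 64;

// A freed block is reused as a link in the owning thread's list.
struct FreeNode {
    FreeNode* next;
};
static_assert(sizeof(FreeNode) <= kBlockSize, "a block must be able to hold a link");

// One list head per thread, so list operations need no lock.
thread_local FreeNode* tls_free_list = nullptr;

void* tl_alloc() {
    if (FreeNode* n = tls_free_list) {  // fast path: reuse a cached block
        tls_free_list = n->next;
        return n;
    }
    return std::malloc(kBlockSize);     // slow path: shared allocator
}

void tl_free(void* p) {
    if (p == nullptr) return;
    // Push onto the calling thread's list, whichever thread allocated p.
    tls_free_list = new (p) FreeNode{tls_free_list};
}
```

A block freed on a different thread simply migrates to that thread's list, and cached blocks are never handed back to malloc() in this sketch, so a real implementation would need a policy for both.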

> 2. "isolation" scopes [might also be nested; possibly] for defs of
> objects of static storage duration and non-static class members:
>
> isolated {
>
> static char a;
> static char b;
> static mutex m1;
>
> }
>
> isolated {
>
> static char c;
> static char d;
> static mutex m2;
>
> }

For ease of comparison I'll re-write your examples as I would
write them if the isolation rules I proposed were in place.
(Just to show that we don't *need* a new keyword.)

static struct {

char a;
char b;
mutex m1;

} e;
static struct {

char c;
char d;
mutex m2;

} f;

static char a;
static char b;
static mutex m1;

static char c;
static char d;
static mutex m2;

That last is "over-isolated" compared to the others,
but given the relatively small amount of static data
in most programs any extra padding is likely to be
negligible (and the mutexes will most likely already be
aligned and padded to avoid cross-thread
interference in any case).

(Corrections applied to following.)

> struct something {
>
> isolated {
>
> char a;
> char b;
> mutex m1;
>
> }
>
> isolated {
>
> char c;
> char d;
> mutex m2;
>
> }
>
> } s; // isolated by default -- see below

struct something {
struct internal {

char a;
char b;
mutex m1;

};
struct internal c;
struct internal d;
} s;

> This would allow one to clearly express isolation boundaries.

Your use of the "isolated" keyword is sufficient, but it's
not necessary.

> By default, definitions of objects of static storage duration
> shall be treated as being isolated from each other:
>
> static char a; // isolated { static char a; }
> static char b; // isolated { static char b; }

Yes, I agree, and so do the rules I posted.

> Objects of automatic storage duration need NOT be isolated
> [the isolation of the entire thread stack aside] unless an
> address/ref is taken and it can't be proven that access to
> it from some other thread is impossible.

I agree with your intent, but you don't need to say that.
Just say they "shall be isolated" and let the implementations
figure out what they actually have to do to achieve it for
each object. If that's nothing and they can easily deduce
that during compilation, they will do.

> 3. Array elements can be made isolated ONLY using class type
> with "isolated" member(s):
>
> char c_array[2]; // no isolation with respect to elems

This is the same with my model.

(Corrections applied to following.)

> struct isolated_char {
>
> isolated { char c; }
>
> } ic_array[2]; // fully isolated ic_array[0].c
> // and ic_array[1].c

struct ichar { char c; } ic_array[2];

I think the two sets of rules are the same except that
in your model sub-objects or array elements of
class-type are not guaranteed to be isolated from
their "sibling" sub-objects or elements, and in mine
they are. Both models work.

In other words your "isolated" has the same semantics
with respect to isolation as sub-object structs/classes/
unions in mine.

I don't think there is any real difference in the
isolation boundaries achievable with the two sets
of rules: they just differ in their defaults and how
you control them.

So, the default amount of isolation in your model
is slightly less than in mine, which means you are
forced to hand-tweak the isolation boundaries
slightly more often to ensure isolation safety. In
favour of your rules you will undoubtedly save a
few bytes here and there in many programs. Since
ideally we would just be standardising current
practice, it would be interesting to know what current
compilers do with respect to isolation units (if they
even consider them).
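
One way to find out what a given compiler does is to ask it. The sketch below uses offsetof and sizeof to report how far apart the relevant sub-objects land; Plain, Nested, and IChar mirror the a, ia, and ichar examples from this thread, and the other names are invented here. On mainstream desktop compilers each figure commonly comes out as 1 byte, meaning no isolation padding is inserted, but the values are implementation-specific.

```cpp
#include <cstddef>
#include <cstdio>

struct Plain  { char b; char c; };                               // like 'a'
struct Nested { struct { char b; } bb; struct { char c; } cc; }; // like 'ia'
struct IChar  { char c; };                                       // like 'ichar'

// Distance in bytes between the two sibling members or array elements.
constexpr std::size_t plain_gap    = offsetof(Plain, c) - offsetof(Plain, b);
constexpr std::size_t nested_gap   = offsetof(Nested, cc) - offsetof(Nested, bb);
constexpr std::size_t array_stride = sizeof(IChar);

void print_isolation_report() {
    std::printf("plain members:     %zu byte(s) apart\n", plain_gap);
    std::printf("nested structs:    %zu byte(s) apart\n", nested_gap);
    std::printf("IChar array elems: %zu byte(s) apart\n", array_stride);
}
```

If nested_gap or array_stride is smaller than the platform's isolation granule, the compiler is not treating sub-object structs as isolation units in the sense proposed here.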

The other main difference is the addition of the
keyword. Why are new keywords a bad thing?

- Any programs that already use the identifier
"isolated" (e.g. for a variable or type) will break. If
you choose the uglier "__isolated" instead you would
avoid breaking standard-conforming programs
provided no compiler vendor had already used this
as an extension. (The language standards say that
names containing double underscores are reserved
for implementations.)

- You create a mismatch between pre- and post-
isolated-aware code. Any single use of the new keyword
means the program will not compile on a pre-isolated
compiler.

> 4. Introduce something ala offsetof-"magic" with respect to
> alignment/padding that would provide the means to write
> thread-shared *AND* thread-private allocators entirely in
> standard C/C++.

What problems are there at the moment that wouldn't
be fixed by either set of suggested isolation rules?

> 5. In the single threaded "mode", isolation scopes can simply
> be ignored.

Again, you are right, but you do not need to say so
explicitly: implementations can figure that out for
themselves, provided they can tell the difference
between a single- and a multi-threaded build.

I bet that most of them will choose to keep the sizes
of all types the same across the different build models
though. Doing otherwise is not wrong but is likely to
break code that works but assumes more than it
should about structure layouts. (Some people think it
is a good thing for compilers to go out of their way to
break non-conforming code, though. Maybe I'm too
soft ;-)

Sergey P. Derevyago

Oct 24, 2002, 1:03:37 PM

David Butenhof wrote:
> >> Or even better,
> >>
> >> typedef struct {char c; Mutex m;} Data;
> >> Data *d1;
> >> Data *d2;
> >>
> >> d1 = malloc (sizeof (Data));
> >> d2 = malloc (sizeof (Data));
> >>
> > IMHO it makes sense to issue these calls to malloc from different threads,
> > say from the owners of the appropriate pointers.
>
> We're arguing subtle linguistic semantics here, but the distinction could be
> important, so I'll proceed. ;-)

The point is to call malloc from different threads just to increase the
chance that both d1 and d2 will live on different cache lines.

PS Yes, I agree that the "owner thread" is rather tricky concept :)

Alexander Terekhov

Oct 25, 2002, 2:31:09 PM

Garry Lancaster wrote:
>
> Alexander Terekhov:
> > After spending some time [thanks to "mainframe schedulers: if you
> > don't have cpu utilization at 99+%, something is seriously wrong"]
> > trying to digest messages from Garry Lancaster and David Butenhof,
> > I'm now thinking in the following direction:
> >
> > 1. std::thread_allocator<T>, thread_new/thread_delete, operator
> > thread_new/operator thread_delete, thread_malloc()/thread_free(),
> > etc. -- thread specific memory allocation facilities that would
> > allow slightly more optimized/less expensive operations with
> > respect to synchronization and isolation for data that is *NOT*
> > meant to be thread-shared.
>
> Alexander's later correction:
> > Well, "*NOT* meant to be thread-shared" was probably confusing. The
> > allocator AND all its allocated objects COULD be accessed by different
> > threads, but serialized/synchronized -- with precluded asynchrony on
> > some "higher" level. Sort of "dynamic segmented stack model" where
> > the entire stack can be passed from thread to thread [if needed].
>
> If I understand your correction correctly, all objects allocated
> using these per-thread techniques are isolated except that
> those allocated on the same thread need not be isolated
> from each other.

Well, I'd say that all objects allocated by the same allocator
are isolated from all other objects allocated by some other
allocator(s) but aren't necessarily isolated with respect to
each other. This would mean that the allocator and all its
allocated objects shall be accessed by only one thread at any
time, but the "ownership" can be "transferred" from thread to
thread [optionally; if needed/wanted].
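
A rough sketch of such an allocator, assuming a C++11 compiler and a 64-byte isolation granule (both assumptions; kAssumedGranule and Arena are hypothetical names). The arena as a whole occupies whole granules, so it is isolated from everything outside it, while objects inside are packed with no isolation padding between them.

```cpp
#include <cstddef>

// Assumption: 64 bytes is enough to isolate data on the target machine.
constexpr std::size_t kAssumedGranule = 64;

template <std::size_t N>
class Arena {
    static_assert(N > 0 && N % kAssumedGranule == 0,
                  "capacity must be a whole number of granules");
public:
    // Bump allocation. 'align' must be a power of two no larger than
    // kAssumedGranule. Returns nullptr when the arena is exhausted.
    void* allocate(std::size_t size, std::size_t align) {
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > N || size > N - start) return nullptr;
        used_ = start + size;
        return buf_ + start;
    }

private:
    // Granule-aligned storage; the class inherits this alignment, so
    // sizeof(Arena) is also rounded up to a multiple of the granule.
    alignas(kAssumedGranule) unsigned char buf_[N];
    std::size_t used_ = 0;
};
```

The arena does no locking of its own. Handing it from one thread to another would need external synchronization at the point of transfer, which matches the "ownership" model described above.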

> Makes sense.
>
> I can think of two reasons why you might want thread-specific
> allocation as part of a language standard:
>
> 1. You can reduce the padding between adjacent allocations
> if you know they are only going to be used by the same
> thread since isolation with respect to each other is not an
> issue.

Yes.

> This is sound in theory, however, the smallest allocation
> chunks in most language library allocation routines are already
> at or beyond the granularity of the natural isolation/word boundary.
> If you only ask for 1 byte, you probably get 8 or 16 in many cases.
> This happens because general purpose allocators need to
> supply memory aligned to the maximum alignment requirement
> of any type in the system and because of the bookkeeping space
> overhead of small allocations.

Well, yes. And even the "buckets"-things like

http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixbman/prftungd/2365c35.htm#HDRI45811
(see MALLOCBUCKETS...)

http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixprggd/genprogc/malloc_buckets.htm
("Malloc Buckets")

have some "restrictions" w.r.t. sizing/alignment:

"The bucket sizing factor must be a multiple of 8 for 32-bit
implementations and a multiple of 16 for 64-bit implementations
in order to guarantee that addresses returned from malloc
subsystem functions are properly aligned for all data types."

> Your type-specific
> std::thread_allocator<T> could get around the alignment issue,

I'm not sure how one would "get around the alignment issue"...

> but it is likely that a relatively simple user-defined allocator
> tailored for a specific purpose could out-perform it, so why
> bother supplying this half-way house as standard?

Well, yes. I've played a bit with "user-defined allocator
tailored for a specific purpose" myself. You might want to
take a look at the following: [that's rather old stuff, but
it illustrates some ideas -- modulo bugs ;-) ]

http://www.terekhov.de/hsamemal.hpp
http://www.terekhov.de/hsamemal.inl
http://www.terekhov.de/hsamemal.cpp
http://www.terekhov.de/hsamemal.c

But I'd really prefer to use something "Standard" instead.

[...]

Apart from the problem of "over-isolation". ;-)

> > By default, definitions of objects of static storage duration
> > shall be treated as being isolated from each other:
> >
> > static char a; // isolated { static char a; }
> > static char b; // isolated { static char b; }
>
> Yes, I agree, and so do the rules I posted.
>
> > Objects of automatic storage duration need NOT be isolated
> > [the isolation of the entire thread stack aside] unless an
> > address/ref is taken and it can't be proven that access to
> > it from some other thread is impossible.
>
> I agree with your intent, but you don't need to say that.
> Just say they "shall be isolated" and let the implementations
> figure out what they actually have to do to achieve it for
> each object. If that's nothing and they can easily deduce
> that during compilation, they will do.
>
> > 3. Array elements can be made isolated ONLY using class type
> > with "isolated" member(s):
> >
> > char c_array[2]; // no isolation with respect to elems
>
> This is the same with my model.

Yes, you've convinced me that introduction of yet another type
qualifier [e.g. "isolated char", where sizeof( isolated char )
>= sizeof( char )] would be rather messy.

> (Corrections applied to following.)
> > struct isolated_char {
> >
> > isolated { char c; }
> >
> > } ic_array[2]; // fully isolated ic_array[0].c
> > // and ic_array[1].c
>
> struct ichar { char c; } ic_array[2];
>
> I think the two sets of rules are the same except that
> in your model sub-objects or array elements of
> class-type are not guaranteed to be isolated from
> their "sibling" sub-objects or elements, and in mine
> they are. Both models work.

Yes, I'm just fearing a bit the problem/overhead of "over-
isolation". Imagine how much memory could be wasted when
isolation is done on something around 64 bytes [cache
line size]...

> In other words your "isolated" has the same semantics
> with respect to isolation as sub-object structs/classes/
> unions in mine.
>
> I don't think there is any real difference in the
> isolation boundaries achievable with the two sets
> of rules: they just differ in their defaults and how
> you control them.
>
> So, the default amount of isolation in your model
> is slightly less than in mine, which means you are
> forced to hand-tweak the isolation boundaries
> slightly more often to ensure isolation safety. In
> favour of your rules you will undoubtedly save a
> few bytes here and there in many programs. Since
> ideally we would just be standardising current
> practice, it would be interesting to know what current
> compilers do with respect to isolation units (if they
> even consider them).
>
> The other main difference is the addition of the
> keyword. Why are new keywords a bad thing?

They are NOT Good Things, for sure. ;-)

> - Any programs that already use the identifier
> "isolated" (e.g. for a variable or type) will break. If
> you choose the uglier "__isolated" instead you would
> avoid breaking standard-conforming programs
> provided no compiler vendor had already used this
> as an extension. (The language standards say that
> names containing double underscores are reserved
> for implementations.)
>
> - You create a mismatch between pre- and post-
> isolated-aware code. Any single use of the new keyword
> means the program will not compile on a pre-isolated
> compiler.

Yes, that's a problem. However, consider that I'm sort
of dreaming of having even more new keywords/constructs...

http://groups.google.com/groups?selm=3DA6C62A.AB8FF3D3%40web.de
(Subject: Re: local statics and TLS objects)

So, few keywords less here and there... ``big deal.'' ;-) ;-)

> > 4. Introduce something ala offsetof-"magic" with respect to
> > alignment/padding that would provide the means to write
> > thread-shared *AND* thread-private allocators entirely in
> > standard C/C++.
>
> What problems are there at the moment that wouldn't
> be fixed by either set of suggested isolation rules?

First off, under your rules, struct Char { char c; }
could be way too big [and with no gains whatsoever] for
purely "thread-specific"/"intra-thread" stuff. I really
don't like it. Under "my rules", I'd probably need to
know how much extra space needs to be added to make my
custom user allocator "inter-thread" safe.

regards,
alexander.

David Thompson

Oct 27, 2002, 7:53:37 PM

Garry Lancaster <glanc...@ntlworld.com> wrote :
...

> - The idea of a standard #pragma is, at least currently,
> a contradiction in terms: they are specified to be used
> for *implementation-defined* purposes. There
> is always the possibility of changing this by introducing
> the first ever standard #pragma, but I think it would be
> difficult to sell this as better than a new keyword.

C99 added a few standard #pragma's, for its new
more-explicit almost-IEEE/60559 floating-point stuff
(and reserved the introducer STDC for possibly more).
Adding a new keyword without a leading underscore,
thus breaking existing code where it was a legal identifier,
seems to be a near-taboo for the standards committees;
between _keyword and #pragma (or in C++ only key__word
with 2 underscores?!) it's hard to say which is uglier.

> (Plus
> there is general dislike of the pre-processor amongst
> the C++ standards people: they are unlikely to go for
> anything that extends its role.) ...

Well, C99 also adds an "operator", _Pragma( "string" ),
with the same effect; maybe that could sneak past. :-)

--
- David.Thompson 1 now at worldnet.att.net

Garry Lancaster

Oct 28, 2002, 4:35:32 AM

[snip]

Garry Lancaster:

> > This is sound in theory, however, the smallest allocation
> > chunks in most language library allocation routines are already
> > at or beyond the granularity of the natural isolation/word boundary.
> > If you only ask for 1 byte, you probably get 8 or 16 in many cases.
> > This happens because general purpose allocators need to
> > supply memory aligned to the maximum alignment requirement
> > of any type in the system and because of the bookkeeping space
> > overhead of small allocations.

[snip]

> > Your type-specific
> > std::thread_allocator<T> could get around the alignment issue,

Alexander Terekhov:

> I'm not sure how would one "get around the alignment issue"...

I simply mean that when T is known at compile time
an allocator can be designed that only satisfies T's
alignment requirements, rather than having to satisfy
the most conservative alignment requirements of all
types.

If a concrete example helps, think of std::thread_allocator<char>
implemented as a simple array-based allocator.
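
As a rough illustration only, such an array-based allocator for char might look like the sketch below. The class name CharArena, its fixed capacity and its reset-only deallocation are all assumptions made for the example; none of them is a proposed std::thread_allocator interface.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity bump allocator that serves char only.
// Because the element type is known to be char (alignment 1), no padding
// is inserted between allocations and no per-block header is stored.
class CharArena {
public:
    // Returns n contiguous chars, or nullptr when the arena is exhausted.
    char* allocate(std::size_t n) {
        if (n > kCapacity - used_) return nullptr;
        char* p = buffer_ + used_;
        used_ += n;
        return p;
    }

    // Releases everything at once; individual frees are not supported.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    static constexpr std::size_t kCapacity = 1024;  // assumed size
    char buffer_[kCapacity];
    std::size_t used_ = 0;
};

// Usage: two 1-byte requests really do cost one byte each.
inline void char_arena_demo() {
    CharArena arena;
    char* a = arena.allocate(1);
    char* b = arena.allocate(1);
    assert(a != nullptr && b == a + 1);
    assert(arena.used() == 2);
}
```

Note that the two chars handed out above sit in adjacent bytes, so an allocator like this is only appropriate for memory used by a single thread unless some isolation padding is added.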

Maybe this allocator discussion would be better as a separate
thread. It is only tenuously related to the main subject.

[snip]

> > Your use of the "isolated" keyword is sufficient, but it's
> > not necessary.

> Apart from the problem of "over-isolation". ;-)

It's not necessary to avoid over-isolation either.
Any data layout you can achieve with the isolated
keyword can also be achieved without it by my
suggested rules.

[snip]

> Yes, I'm just fearing a bit the problem/overhead of "over-
> isolation". Imagine how much memory could be wasted when
> isolation is done on something around 64 bytes [cache
> line size]...

Aren't you overstating your case a bit here? Aren't most
platforms' natural isolation boundaries the same or less
than their natural word size, so typically 8 bytes or less
rather than 64 bytes?

I understood from previous comments that cache size
alignment was an efficiency issue, not an isolation
issue (recall I defined isolation as what is necessary
to avoid word-tearing).

If I misunderstood then I would agree that some rethink
is required.

[snip]

> > The other main difference is the addition of the
> > keyword. Why are new keywords a bad thing?
>
> They are NOT Good Things, for sure. ;-)
>
> > - Any programs that already use the identifier
> > "isolated" (e.g. for a variable or type) will break. If
> > you choose the uglier "__isolated" instead you would
> > avoid breaking standard-conforming programs
> > provided no compiler vendor had already used this
> > as an extension. (The language standards say that
> > names containing double underscores are reserved
> > for implementations.)
> >
> > - You create a mismatch between pre- and post-
> > isolated-aware code. Any single use of the new keyword
> > means the program will not compile on a pre-isolated
> > compiler.
>
> Yes, that's a problem. However, consider that I'm sort
> of dreaming to have even more new keywords/constructs...
>
> http://groups.google.com/groups?selm=3DA6C62A.AB8FF3D3%40web.de
> (Subject: Re: local statics and TLS objects)
>
> So, few keywords less here and there... ``big deal.'' ;-) ;-)

Smileys noted, but I don't think the arguments against new
keywords are quite so easily dismissed. Particularly when
there is a counter-proposal without any new keywords.

> > > 4. Introduce something ala offsetof-"magic" with respect to
> > > alignment/padding that would provide the means to write
> > > thread-shared *AND* thread-private allocators entirely in
> > > standard C/C++.
> >
> > What problems are there at the moment that wouldn't
> > be fixed by either set of suggested isolation rules?
>
> First off, under your rules, struct Char { char c; }
> could be way too big [and with no gains whatsoever] for
> purely "thread-specific"/"intra-thread" stuff.
> I really don't like it.

(Ignoring isolation I think most people would write the above
as

typedef char Char;

anyway. But I take your general point even if I quibble over
your exact example.)

Same with (admittedly even more unlikely)

struct Char { isolated { char c; } };

under your rules.

Someone who understood isolation would tend to avoid
over-isolation in any case, so we are mostly talking about
supporting those who are not familiar with the concept.
Fair enough, that is a large enough chunk of people at the
moment (and until recently I was one of them!) and I wouldn't
expect isolation being tackled by the standard would
improve matters that much.

This is a hard decision, but at the moment I prefer a
solution that offers more of a safety net for these people:
one that more often produces code that is over-isolated
and wastes a few extra bytes than code that is under-
isolated and doesn't work. Not that even my suggestion
is totally safe for these people: we already rejected the
Java rules as too inefficient. The best solution in any case
is programmer education, but is that realistic here?

[snip]

Alexander Terekhov

Oct 28, 2002, 6:05:46 AM

Garry Lancaster wrote:
[...]

> Someone who understood isolation would tend to avoid
> over-isolation in any case, so we are mostly talking about
> supporting those who are not familiar with the concept.
> Fair enough, that is a large enough chunk of people at the
> moment (and until recently I was one of them!) and I wouldn't
> expect isolation being tackled by the standard would
> improve matters that much.

I disagree. Actually, an introduction of a new keyword would
be really helpful with respect to "education"/awareness, I
believe.

> This is a hard decision, but at the moment I prefer a
> solution that offers more of a safety net for these people:
> one that more often produces code that is over-isolated
> and wastes a few extra bytes than code that is under-
> isolated and doesn't work. Not that even my suggestion
> is totally safe for these people: we already rejected the
> Java rules as too inefficient. The best solution in any case
> is programmer education, but is that realistic here?

I'll post copies of selected messages from this thread to the
OG/AG reflector... Let's see what system implementors [and
WG14/WG21 liaison ;-) ] folks think about all this.

regards,
alexander.

Alexander Terekhov

Oct 28, 2002, 6:14:21 AM

Alexander Terekhov wrote:
[...]

> I'll post copies of selected messages from this thread to the
> OG/AG reflector... Let's see what system implementors [and
> WG14/WG21 liaison ;-) ] folks think about all this.

http://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-group-l&id=4743
(Subject: Memory isolation)

regards,
alexander.

Garry Lancaster

Oct 28, 2002, 5:09:22 AM

Garry Lancaster:

> > - The idea of a standard #pragma is, at least currently,
> > a contradiction in terms: they are specified to be used
> > for *implementation-defined* purposes. There
> > is always the possibility of changing this by introducing
> > the first ever standard #pragma, but I think it would be
> > difficult to sell this as better than a new keyword.

David Thompson:

> C99 added a few standard #pragma's, for its new
> more-explicit almost-IEEE/60559 floating-point stuff
> (and reserved the introducer STDC for possibly more).

True. Thanks for pointing out my oversight.

> Adding a new keyword without a leading underscore,
> thus breaking existing code where it was a legal identifier,
> seems to be a near-taboo for the standards committees;

This is my impression too, although they do sometimes
overcome their inhibitions e.g. restrict.

> between _keyword and #pragma (or in C++ only key__word
> with 2 underscores?!) it's hard to say which is uglier.
>
> > (Plus
> > there is general dislike of the pre-processor amongst
> > the C++ standards people: they are unlikely to go for
> > anything that extends its role.) ...
>
> Well, C99 also adds an "operator", _Pragma( "string" ),
> with the same effect; maybe that could sneak past. :-)

Still not fashionable enough. Can it be made to use
templates somehow? ;-)

Konrad Schwarz

Oct 31, 2002, 11:04:41 AM

David Butenhof schrieb:

> Perhaps something like "#pragma align_all ({cache|atomicity})" (Where
> "cache" is required to subsume "atomicity", just to remove ambiguity.)

The compiler might not know the cache-line size of the target machine.
This would then require extensive dynamic (re-) linking.

David Butenhof

Oct 31, 2002, 1:23:48 PM

Konrad Schwarz wrote:

Cache line size is pretty basic for optimizing data access. You're
generating machine code for a machine -- either a particular member of a
family, or something that's "reasonable" for all members of a family.

If you can't count on any architectural information without knowing the
particular family member on which the code will run, then you'll need to
know that anyway. Most architectural families have defined minimum and
maximum allowed variations for important parameters like cache line size,
(and page size), and the compiler can do something that'll work on all
members.

Garry Lancaster

Nov 6, 2002, 7:37:33 AM

> Alexander Terekhov wrote:
> > I'll post copies of selected messages from this thread to the
> > OG/AG reflector... Let's see what system implementors [and
> > WG14/WG21 liaison ;-) ] folks think about all this.
>
>
> http://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-group-l&id=4743
> (Subject: Memory isolation)

This link still shows only your original post (of selected messages
from this newsgroup thread).

Will the existing replies appear on this page, and there just
aren't any yet?

Or do we need another link to see the replies?

It would probably be easiest for everyone else if you were to
post a summary of any interesting responses back here, if
you are willing and able.

Alexander Terekhov

Nov 6, 2002, 11:35:33 AM

Garry Lancaster wrote:
>
> > Alexander Terekhov wrote:
> > > I'll post copies of selected messages from this thread to the
> > > OG/AG reflector... Let's see what system implementors [and
> > > WG14/WG21 liaison ;-) ] folks think about all this.
> >
> >
> > http://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-group-l&id=4743
> > (Subject: Memory isolation)
>
> This link still shows only your original post (of selected messages
> from this newsgroup thread).
>
> Will the existing replies appear on this page,

Nope. They will appear on this page:

http://www.opengroup.org/sophocles/show_archive.tpl?source=L&listname=austin-group-l&first=1&pagesize=80&searchstring=Memory+isolation&zone=G

http://tinyurl.com/2hco

</TinyURL>

> and there just aren't any yet?

There aren't any [public] replies yet.

>
> Or do we need another link to see the replies?

Well, the best would be to subscribe to the OG/AG forum(s). This can
be done here:

http://www.opengroup.org/austin/lists.html
(Mailing Lists & Archives)

> It would probably be easiest for everyone else if you were to
> post a summary of any interesting responses back here, if
> you are willing and able.

OK. Uhmm. Summary [thus far]: folks don't seem to care about this
problem at all. :-(

regards,
alexander.

Alexander Terekhov

Nov 21, 2002, 10:33:48 AM

Alexander Terekhov wrote:
[...]

> OK. Uhmm. Summary [thus far]: folks don't seem to care about this
> problem at all. :-(

Except that Mr. Douglas A. Gwyn suggested the following:

<quote>

Alexander Terekhov wrote:
> "Douglas A. Gwyn" wrote:
> > There is nothing to prevent introduction of some new
> > data (C object) type with any access property you
> > think all implementations can provide, and having
> > programmers alias their other types against it using
> > unions.
> I don't understand this. Could you please elaborate?
> With some example(s), if possible. TIA.

#include <extensions.h>
typedef union {
ext_cache_line_aligned_t dummy;
uint_least32_t data;
} cla_ul32;
cla_ul32 *p = malloc(sizeof(cla_ul32));

</quote>
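
For readers who want to try the union idea, here is a minimal compilable sketch. It assumes a 64-byte cache line and invents its own ext_cache_line_aligned_t using the alignas specifier from later C++ standards; the <extensions.h> header in the quote is hypothetical and no such standard header exists.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64 bytes stands in for the cache line size. The real value
// is platform-specific and would have to come from the implementation.
constexpr std::size_t kAssumedLine = 64;

// Hypothetical stand-in for the ext_cache_line_aligned_t in the quote.
struct alignas(kAssumedLine) ext_cache_line_aligned_t {
    unsigned char pad[kAssumedLine];
};

// A union takes the strictest alignment and the largest size of its members.
union cla_ul32 {
    ext_cache_line_aligned_t dummy;
    std::uint_least32_t data;
};

static_assert(alignof(cla_ul32) == kAssumedLine, "union takes the dummy's alignment");
static_assert(sizeof(cla_ul32) == kAssumedLine, "union fills one whole assumed line");

// True when p is a multiple of the assumed line size (flat addresses assumed).
inline bool starts_on_line(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kAssumedLine == 0;
}

// Usage: an object with automatic storage honours the alignment.
inline void cla_demo() {
    cla_ul32 u;
    u.data = 7;
    assert(starts_on_line(&u));
    assert(u.data == 7);
}
```

One caveat with the quoted malloc call: malloc is only required to return storage suitable for fundamental alignments, so an over-aligned type like this one would need an aligned allocation function instead.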

For further details click here:

http://groups.google.com/groups?threadm=3DD91BAA.F351B0E%40web.de
(Newsgroups: comp.lang.c++, comp.std.c; "Subject: Memory isolation
[was: Re: Volatile declared objects]")

regards,
alexander.

Garry Lancaster

Nov 28, 2002, 4:55:39 AM

> Alexander Terekhov:

> [...]
> > OK. Uhmm. Summary [thus far]: folks don't seem to care about this
> > problem at all. :-(

Alexander Terekhov:

> Except that Mr. Douglas A. Gwyn suggested the following:
>
> <quote>
>
> Alexander Terekhov wrote:
> > "Douglas A. Gwyn" wrote:
> > > There is nothing to prevent introduction of some new
> > > data (C object) type with any access property you
> > > think all implementations can provide, and having
> > > programmers alias their other types against it using
> > > unions.
> > I don't understand this. Could you please elaborate?
> > With some example(s), if possible. TIA.
>
> #include <extensions.h>
> typedef union {
> ext_cache_line_aligned_t dummy;
> uint_least32_t data;
> } cla_ul32;
> cla_ul32 *p = malloc(sizeof(cla_ul32));
>
> </quote>

Interesting idea.

First, I note that the discussion has again mutated into avoiding
sharing of cache lines rather than just achieving isolation.
All the platforms I've examined seemed on casual analysis
(I am not a hardware expert) *not* to require avoidance of
cache line sharing in order to achieve isolation. Cache
line sharing was undoubtedly a real performance issue,
but it wasn't a correctness issue since they all implemented
some form of cache coherency. The isolation issues arose
from the non-availability of instructions addressing part-
words.

I am really interested to see if this casual conclusion
holds for the vast majority of platforms. On the principle
that the surest way of getting good information on
Usenet is to post an incorrect statement and wait for
the corrections I put it bluntly:

There is no existing platform that requires cache line
sized alignment for isolation. A variable is always
isolated from every other variable provided it does
not share any words with any other variable, where
a word just means the natural data word size for the
platform.

(I wrote something similar, but less strong, earlier
in the thread and no-one contradicted me.)
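
To make the word-sharing criterion concrete, the sketch below tests whether two objects fall in the same aligned word. It assumes that the natural word is as wide as std::uintptr_t and that pointers convert to flat integer addresses; both are assumptions for illustration, not guarantees of the language.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: the platform's natural word is as wide as std::uintptr_t.
using word_t = std::uintptr_t;

// True when the bytes at a and b fall inside the same aligned word.
inline bool shares_word(const void* a, const void* b) {
    const std::uintptr_t wa = reinterpret_cast<std::uintptr_t>(a) / sizeof(word_t);
    const std::uintptr_t wb = reinterpret_cast<std::uintptr_t>(b) / sizeof(word_t);
    return wa == wb;
}

// Two chars packed next to each other, like c1 and c2 in the first post.
struct Packed {
    char c1;
    char c2;
};

// The same two chars, each forced onto its own word boundary.
struct Padded {
    alignas(sizeof(word_t)) char c1;
    alignas(sizeof(word_t)) char c2;
};

static_assert(sizeof(Padded) == 2 * sizeof(word_t), "one word per member");

inline void isolation_demo() {
    alignas(sizeof(word_t)) Packed p{};  // start on a word boundary
    Padded q{};
    assert(shares_word(&p.c1, &p.c2));   // adjacent bytes, same word
    assert(!shares_word(&q.c1, &q.c2));  // each member has its own word
}
```

Under the blunt statement above, Padded's members would be isolated from each other on any platform whose word is no wider than the assumed one, at a cost of two words instead of two bytes.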

Second, using unions in this way is problematic in
C++ because [C++ standard 9.5/1]:

"...An object of a class with a non-trivial constructor (12.1),
a non-trivial copy constructor (12.8), a non-trivial destructor
(12.4), or a non-trivial copy assignment operator (13.5.3,
12.8) cannot be a member of a union, nor can an array of
such objects..."

Classes that have one or more of these forbidden
characteristics are of course very common in C++
programming. So, it is not a general solution.

Third, how are we supposed to align a variable that
might be larger than a cache line? If we do,

typedef union {
ext_cache_line_aligned_t dummy;
struct large_struct data;
} cla_large_struct;

it won't necessarily work, since large_struct could
be 1.5 cache lines in size and thus not guaranteed
isolated from variables placed after it in memory.
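
The arithmetic behind that concern can be sketched as follows, taking a 64-byte line and a 96-byte struct (1.5 lines) as assumed figures. The helper names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Number of whole lines that an object of `size` bytes touches when it
// starts on a line boundary.
constexpr std::size_t lines_spanned(std::size_t size, std::size_t line) {
    return (size + line - 1) / line;
}

// Storage needed so that nothing else can land in the object's last line.
constexpr std::size_t padded_size(std::size_t size, std::size_t line) {
    return lines_spanned(size, line) * line;
}

// Bytes of the last line left over for a neighbour if no padding is added.
constexpr std::size_t shared_tail(std::size_t size, std::size_t line) {
    return padded_size(size, line) - size;
}

// A 96-byte struct on assumed 64-byte lines touches two lines and leaves
// 32 bytes of the second line open to whatever is placed after it.
static_assert(lines_spanned(96, 64) == 2, "1.5 lines rounds up to 2");
static_assert(padded_size(96, 64) == 128, "two full lines");
static_assert(shared_tail(96, 64) == 32, "half a line is shared");

inline void tail_demo() {
    assert(shared_tail(64, 64) == 0);  // exactly one line: nothing shared
    assert(shared_tail(1, 64) == 63);  // a lone char leaves most of the line
}
```

So the union only helps if the storage reserved for it is rounded up to padded_size rather than to the raw size of the struct.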

If we have to create a separate CLA union for every
fundamental type member of a struct when we only
want the struct to be isolated as a whole (i.e. it
doesn't matter if members of the struct are not
isolated from each other), this will be very wasteful
of memory. (And of course we are assuming that
the size of each fundamental type is <= the size
of a cache line.)

Lastly, even in C, if this is the sole way of achieving
isolation, it is either used for the declarations of every
variable in the program or some variables will not be
guaranteed, portably, to be isolated. As noted before,
a non-isolated variable is a dangerous thing, which is
why I believe that a future language standard should
make isolation guarantees for all variables. Not that
this means every variable should be guaranteed
isolated from every other variable, for example for a
struct member it could be as weak as "this variable
is isolated from all variables except for those that are
members of the same struct".

For your information, this subject was discussed at
the last meeting of the British C++ standards panel,
and the consensus seemed to be that it was an issue
worth pursuing further. This doesn't mean it will be
dealt with by the next C++ standard, but it is one step
closer. Whatever happens, I will keep this group
informed.

Kind regards

Garry Lancaster

Alexander Terekhov

Nov 28, 2002, 10:08:21 AM

Garry Lancaster wrote:
[...]

> Second, using unions in this way is problematic in
> C++ because [C++ standard 9.5/1]:
>
> "...An object of a class with a non-trivial constructor (12.1),
> a non-trivial copy constructor (12.8), a non-trivial destructor
> (12.4), or a non-trivial copy assignment operator (13.5.3,
> 12.8) cannot be a member of a union, nor can an array of
> such objects..."
>
> Classes that have one or more of these forbidden
> characteristics are of course very common in C++
> programming. So, it is not a general solution.

Yep. That's sort of almost-the-same problem as with
"simple and neat"(*) alignment<T>:

http://groups.google.com/groups?selm=3CA37848.5202315C%40web.de
(Subject: Re: offsetof(type, member constant expression?)

>
> Third, how are we supposed to align a variable that
> might be larger than a cache line? If we do,
>
> typedef union {
> ext_cache_line_aligned_t dummy;
> struct large_struct data;
> } cla_large_struct;
>
> it won't necessarily work, since large_struct could
> be 1.5 cache lines in size and thus not guaranteed
> isolated from variables placed after it in memory.

Yup. And a sort of "example" illustrating this problem
can be found here:

http://groups.google.com/groups?selm=3DDA68FF.5A0DBA22%40web.de
(Subject: Re: Memory isolation [was: Re: Volatile declared objects])

[...]

> For your information, this subject was discussed at
> the last meeting of the British C++ standards panel,
> and the consensus seemed to be that it was an issue
> worth pursuing further. This doesn't mean it will be
> dealt with by the next C++ standard, but it is one step
> closer. Whatever happens, I will keep this group
> informed.

Great! Uhmmm...

http://groups.google.com/groups?selm=3CC58AB8.690D614A%40web.de
(P.S. If/when you see the folks from that "Panel"...)

regards,
alexander.

(*) ``For every complex problem, there is a solution that is simple,
neat, and wrong.'' -- H. L. Mencken

--
http://technetcast.ddj.com/tnc_play_stream.html?stream_id=560

Alexander Terekhov

Nov 28, 2002, 3:13:07 PM

Let's see whether the "Aahz's Law" works on comp.arch... ;-)

> I am really interested to see if this casual conclusion
> holds for the vast majority of platforms. On the principle
> that the surest way of getting good information on
> Usenet is to post an incorrect statement and wait for
> the corrections I put it bluntly:
>
> There is no existing platform that requires cache line
> sized alignment for isolation. A variable is always
> isolated from every other variable provided it does
> not share any words with any other variable, where
> a word just means the natural data word size for the
> platform.

AGREED!

http://groups.google.com/groups?selm=3DD91BAA.F351B0E%40web.de
(Subject: Memory isolation [was: Re: Volatile declared objects])

regards,
alexander.
