
false sharing confusion!!


mynickmynick

Jul 18, 2008, 1:35:21 PM
I don't understand the problem of false sharing very well.
Questions:
(1) With 2 processors, each with its own cache, can false sharing
lead ONLY to inefficiency BUT NOT to bugs (unwanted race conditions
on adjacent data that are not really shared)?
(2) Is there false sharing on Core 2 Intels with two cores but one
shared cache? Might this lead to BUGS?!
(3) Is there false sharing on a core duo with 2 caches (if they
exist), which might be similar to case (1)?
(4) In a C pthread program, what directives or whatever can I use to
prevent this false sharing entirely?
(5) Please give me interesting links!

Thank you for support

David Schwartz

Jul 19, 2008, 4:00:14 AM

You're trying to solve a problem you don't have.

Generally, you can avoid false sharing with three simple rules:

1) Don't try to write your own low-level primitives. Use ones that
have been developed and optimized by experts for your platform.

2) Don't share sub-objects. If you need to share something between
threads, make it a standalone object allocated from the system memory
allocator.

3) Write code that is guaranteed to work by the threading standard you
are using. Avoid structures that "just happen to work" on the machine
you tried them on.

Otherwise, fix performance problems once they're proven to be actual
problems.
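Rule 2 in code terms might look something like this (a hypothetical sketch; the names are mine, and note that malloc alone does not guarantee cache-line alignment):

```c
#include <stdlib.h>

/* Risky: handing a worker thread a pointer to one field of a larger
   object means unrelated, frequently written fields can ride on the
   same cache lines. */
struct app_state {
    int ui_counter;
    int worker_counter;   /* adjacent to ui_counter */
};

/* Rule 2: give shared state its own standalone allocation from the
   system allocator instead of embedding it in something bigger. */
static long *
make_shared_counter(void) {
    long *c = malloc(sizeof *c);
    if (c)
        *c = 0;
    return c;
}
```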

DS

Chris Thomasson

Jul 19, 2008, 5:40:16 AM
"mynickmynick" <mynick...@yahoo.com> wrote in message
news:834f500e-94d1-4d9a...@34g2000hsh.googlegroups.com...

>I know very badly the problem of false sharing.
> Questions:
> (1) There is a false sharing of 2 processors with 2 cache which can
> lead ONLY to inefficiency BUT NOT to bugs (unwanted rush conditions on
> not really shared adjacent data)?

Word-tearing aside for a moment, I have not seen any bugs caused by
false-sharing, but it does simply destroy performance by constantly
invalidating cached data.
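For illustration, a minimal pthread sketch (my own, not from the thread) of the classic case: two threads incrementing adjacent counters. The counts come out exact; only throughput suffers.

```c
#include <pthread.h>

#define ITERS 1000000L

/* Two counters that almost certainly share a cache line: each thread
   writes only its own counter, yet every store invalidates the line
   in the other core's cache. */
static struct { volatile long a; volatile long b; } shared;

static void *bump_a(void *arg) {
    (void)arg;
    for (long i = 0; i < ITERS; ++i)
        shared.a++;
    return NULL;
}

static void *bump_b(void *arg) {
    (void)arg;
    for (long i = 0; i < ITERS; ++i)
        shared.b++;
    return NULL;
}

/* Run both threads to completion; the counts come out exact, so the
   cost of the false sharing is time, not correctness. */
static void run_false_sharing_demo(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump_a, NULL);
    pthread_create(&t2, NULL, bump_b, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
}
```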


> (2) There is a false sharing of core-2 Intels with two cores but one
> cache? This might lead to BUGS?!
> (3) There is a false sharing of core duo with 2 cash (if they exist)
> which might be similar to case (1) ?
> (4) In a C pthread program what directives or whatever can I use to
> prevent at all this f..ng false sharing??
> (5) please give me interesting links!
>
> Thank you for support

http://groups.google.com/group/comp.programming.threads/msg/8036dc8a380718ad

If you want to avoid false sharing, you really need to find out the L2
cache-line size of the architecture(s) you are targeting. Once you have
that, you need to make sure to pad all of your shared data structures up
to a multiple of that size. Then you also need to explicitly align them
on a boundary which is again a multiple of the L2 cache-line size. Here
is some sample alignment code that can help with the latter:

http://groups.google.com/group/comp.lang.c/msg/be7e0d0e97c5e1d9
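The linked alignment code is not reproduced here; an ALIGN macro in the spirit of what the post describes (an assumption on my part, not necessarily the linked code) rounds a pointer up to the next power-of-two boundary:

```c
#include <stdint.h>

/* Round `ptr' up to the next multiple of `bound' (which must be a
   power of two) and cast the result to `type'. */
#define ALIGN(ptr, type, bound) \
    ((type)((((uintptr_t)(ptr)) + ((uintptr_t)(bound) - 1)) & \
            ~((uintptr_t)(bound) - 1)))
```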

For the padding, well, the "simplest" way to achieve that is to use a union;
something like:


#define L2_CACHELINE_SIZE 128

struct your_shared_data {
    int whatever;
    /* ... */
};

union your_shared_data_pad {
    struct your_shared_data data;
    char padding[L2_CACHELINE_SIZE];
};

Using the alignment code from above, you can allocate and align objects of
type `your_shared_data_pad' using something like:


struct your_shared_data*
your_shared_data_allocate(
    void** pbasemem
) {
    union your_shared_data_pad* _this;
    /* over-allocate by a cache-line so we can round up to a boundary */
    if ((*pbasemem = malloc(sizeof(*_this) + L2_CACHELINE_SIZE - 1))) {
        _this = ALIGN(*pbasemem, union your_shared_data_pad*, L2_CACHELINE_SIZE);
        return &_this->data;
    }
    return NULL;
}


You can use the above function like:

{
    void* basemem;
    struct your_shared_data* _this = your_shared_data_allocate(&basemem);
    if (_this) {
        /* use `_this' */

        /* we are done, free the `basemem' */
        free(basemem);
    }
}
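As an aside, on POSIX systems one could let the allocator do the aligning with posix_memalign (a sketch; this was not part of the original post, and the struct body here is a stand-in):

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>
#include <stdint.h>

#define L2_CACHELINE_SIZE 128

struct your_shared_data {
    int whatever;
};

union your_shared_data_pad {
    struct your_shared_data data;
    char padding[L2_CACHELINE_SIZE];
};

/* posix_memalign hands back memory already on the requested boundary,
   so no base pointer needs to be carried around; since `data' is the
   union's first member, the returned pointer can be passed straight
   to free(). */
static struct your_shared_data *
your_shared_data_allocate_posix(void) {
    void *mem;
    if (posix_memalign(&mem, L2_CACHELINE_SIZE,
                       sizeof(union your_shared_data_pad)) != 0)
        return NULL;
    return &((union your_shared_data_pad *)mem)->data;
}
```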

Hope that helps!

;^)

Chris Thomasson

Jul 19, 2008, 5:44:38 AM
"David Schwartz" <dav...@webmaster.com> wrote in message
news:db17fd18-9c21-4383...@r35g2000prm.googlegroups.com...

On Jul 18, 10:35 am, mynickmynick <mynickmyn...@yahoo.com> wrote:
> > I know very badly the problem of false sharing.
> > Questions:
> > (1) There is a false sharing of 2 processors with 2 cache which can
> > lead ONLY to inefficiency BUT NOT to bugs (unwanted rush conditions on
> > not really shared adjacent data)?
> > (2) There is a false sharing of core-2 Intels with two cores but one
> > cache? This might lead to BUGS?!
> > (3) There is a false sharing of core duo with 2 cash (if they exist)
> > which might be similar to case (1) ?
> > (4) In a C pthread program what directives or whatever can I use to
> > prevent at all this f..ng false sharing??
> > (5) please give me interesting links!

> You're trying to solve a problem you don't have.

> Generally, you can avoid false sharing with three simple rules:

> 1) Don't try to write your own low-level primitives. Use ones that
> have been developed and optimized by experts for your platform.

Sadly, this does not avoid the possibility of false sharing wrt an
application's shared data. Also, keep this in mind:

http://groups.google.com/group/comp.programming.threads/msg/88bd832858072802

http://groups.google.com/group/comp.programming.threads/msg/9701b98be6ed6e7f

There is false-sharing among CRITICAL_SECTIONS and the data they protect,
unless special care is taken...
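In pthread terms, that "special care" might look like padding each lock and its data out to their own cache lines (a sketch; the 128-byte figure and the union layout are assumptions to verify per target):

```c
#include <pthread.h>

#define L2_CACHELINE_SIZE 128

/* Pad the lock out to (at least) a full cache line, so contention on
   the mutex never falsely invalidates the data it protects. */
union padded_mutex {
    pthread_mutex_t m;
    char pad[L2_CACHELINE_SIZE];
};

union padded_counter {
    long value;
    char pad[L2_CACHELINE_SIZE];
};

/* A lock/data pair in which each member sits in its own cache line(s). */
struct padded_pair {
    union padded_mutex   lock;
    union padded_counter counter;
};
```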


> 2) Don't share sub-objects. If you need to share something between
> threads, make it a standalone object allocated from the system memory
> allocator.

This also does not avoid the possibility of false sharing wrt an
application's shared data. You NEED to pad your objects to the L2
cache-line size and align them on L2 cache-line boundaries. Only then can
you be 100% sure that false sharing is totally eliminated within said
objects.


> 3) Write code that is guaranteed to work by the threading standard you
> are using. Avoid structures that "just happen to work" on the machine
> you tried them on.

> Otherwise, fix performance problems once they're proven to be actual
> problems.

Agreed.

David Schwartz

Jul 19, 2008, 6:24:48 AM
On Jul 19, 2:44 am, "Chris Thomasson" <x...@xxx.xxx> wrote:

> There is false-sharing among CRITICAL_SECTIONS and the data they protect,
> unless special care is taken...

That's a "bug" in that specific implementation. I put "bug" in quotes
because it's a trade-off, but one that I think was made very badly.
Keeping critical sections small saves memory when you have a lot of
them and can improve performance if multiple "nearby" critical
sections are used by the same thread many times (less memory
bandwidth, smaller cache footprint). The cost is false sharing.

> > 2) Don't share sub-objects. If you need to share something between
> > threads, make it a standalone object allocated from the system memory
> > allocator.

> This also does not avoid the possibility of false-sharing wrt an
> applications
> shared-data. You NEED to pad your objects to L2-cache and align them on
> L2-cache boundaries. Only then can you be 100% sure that false-sharing is
> totally eliminated within said objects.

That has costs as well as benefits. If nearby objects are frequently
accessed by a single thread, increasing the L2 cache footprint has net
cost rather than benefit. Unless you really know what you're doing, I put
this in the "fix it once you *know* it's a problem, lest you make things
worse" category. (Of course, with enough experience, you can know when
it's going to be a problem, maybe.)

DS

Chris Thomasson

Jul 19, 2008, 9:46:46 AM
"David Schwartz" <dav...@webmaster.com> wrote in message
news:7291ef64-6ffb-485e...@v26g2000prm.googlegroups.com...

On Jul 19, 2:44 am, "Chris Thomasson" <x...@xxx.xxx> wrote:

> > There is false-sharing among CRITICAL_SECTIONS and the data they
> > protect,
> > unless special care is taken...

> That's a "bug" in that specific implementation. I put "bug" in quotes
> because it's a trade-off, but one that I think was made very badly.
> Keeping critical sections small saves memory when you have a lot of
> them and can improve performance if multiple "nearby" critical
> sections are used by the same thread many times (less memory
> bandwidth, smaller cache footprint). The cost is false sharing.

Agreed.


> > > 2) Don't share sub-objects. If you need to share something between
> > > threads, make it a standalone object allocated from the system memory
> > > allocator.

> > This also does not avoid the possibility of false-sharing wrt an
> > applications
> > shared-data. You NEED to pad your objects to L2-cache and align them on
> > L2-cache boundaries. Only then can you be 100% sure that false-sharing
> > is
> > totally eliminated within said objects.

> That has costs as well as benefits. If nearby objects are commonly-
> accessed frequently by a single thread, increasing the L2 cache
> footprint has net cost rather than benefit.

Indeed! If the objects are "read-mostly", then it could be beneficial
to pack several of them into a single L2 cache-line, which would aid
certain cache-blocking techniques and allow an initial cache-line load to
bring in several objects at once; that's good. On the other hand, if the
objects are "write-mostly", then keeping a single object per cache-line
tends to make more sense.

If your application has a mix of these access patterns, then it could
possibly be a good idea to allocate two separate memory regions such
that stores into region A will not affect stores into region B, and
vice versa. That way you can keep "write-mostly" objects segregated from
"read-mostly" objects. I mean, it would be bad for stores into write
objects to interfere with loads from read objects. These memory regions
would be a multiple of the L2 cache-line size, say 4096 cache-lines per
region, and be aligned on a sufficient boundary. IMHO, this scheme can be
made to work fairly well...
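A rough sketch of such segregated regions (the posix_memalign call and the region size are my assumptions, not part of the suggestion above):

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>
#include <stdint.h>

#define L2_CACHELINE_SIZE 128
#define LINES_PER_REGION  4096

/* Two independently aligned regions: "write-mostly" objects live in
   one, "read-mostly" objects in the other, so stores into the write
   region never invalidate cache lines holding read-mostly data. */
struct regions {
    void *write_mostly;
    void *read_mostly;
};

static int
regions_init(struct regions *r) {
    size_t bytes = (size_t)LINES_PER_REGION * L2_CACHELINE_SIZE;
    if (posix_memalign(&r->write_mostly, L2_CACHELINE_SIZE, bytes) != 0)
        return -1;
    if (posix_memalign(&r->read_mostly, L2_CACHELINE_SIZE, bytes) != 0) {
        free(r->write_mostly);
        return -1;
    }
    return 0;
}

static void
regions_destroy(struct regions *r) {
    free(r->write_mostly);
    free(r->read_mostly);
}
```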


> Unless you really know
> what you're doing, I put this in the "fix it once you *know* it's a
> problem, lest you make things worse". (Of course, with enough
> experience, you can know when it's going to be a problem, maybe.)

Agreed.
